Data Analysis


School of Medicine Research Computing can help with accessing, preparing, visualizing and analyzing data. We can assist you by implementing your analysis strategy in an appropriate computing language, including R, Python, C/C++, and Mathematica. We will work with you to prepare scripts that are as reproducible, flexible and legible as possible for you and your research team. Our consultation services include wrangling, manipulating or otherwise cleaning datasets to prepare them for analysis. We also have expertise in visual representation of data with both static and interactive plotting tools.

Manipulation

Data analysis generally involves a significant effort to transform, aggregate, subset or otherwise prepare a dataset. That could include dealing with missing values as well as merging or joining multiple datasets. We can help you wrangle your data into a "tidy" format in order to analyze the features and observations relevant to your question.
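As a rough illustration of this kind of preparation, the sketch below joins two small tables on a shared key, skips missing values, and reshapes the result so that each row holds one observation. It uses only the Python standard library. The data, column names and helper functions (join_on_id, to_tidy) are hypothetical and invented for this example; in practice a library such as pandas or the R tidyverse would usually do this work.

```python
# Hypothetical example data: all names and values are made up for illustration.
patients = [
    {"id": 1, "group": "control"},
    {"id": 2, "group": "treated"},
    {"id": 3, "group": "treated"},
]
measurements = [
    {"id": 1, "week0": 5.1, "week4": 4.8},
    {"id": 2, "week0": 6.0, "week4": None},  # missing value
    {"id": 3, "week0": 5.5, "week4": 3.9},
]


def join_on_id(left, right):
    """Inner-join two lists of dicts on the 'id' key."""
    right_by_id = {row["id"]: row for row in right}
    return [
        {**row, **right_by_id[row["id"]]}
        for row in left
        if row["id"] in right_by_id
    ]


def to_tidy(rows, id_cols, value_cols):
    """Reshape wide rows into one observation per row, skipping missing values."""
    tidy_rows = []
    for row in rows:
        for col in value_cols:
            if row[col] is None:
                continue  # drop missing measurements instead of imputing them
            record = {key: row[key] for key in id_cols}
            record["timepoint"] = col
            record["value"] = row[col]
            tidy_rows.append(record)
    return tidy_rows


merged = join_on_id(patients, measurements)
tidy = to_tidy(merged, id_cols=["id", "group"], value_cols=["week0", "week4"])
for record in tidy:
    print(record)
```

Dropping missing values is only one option. Whether to drop, impute or flag them depends on the analysis question.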

Inference

Testing assumptions about a dataset is critical to arriving at a scientific conclusion. Once you've identified a hypothesis test appropriate for your data, we can help you translate that procedure into code.
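To give a sense of what translating a test into code can look like, here is a minimal sketch of a two-sided permutation test for a difference in group means, written with the Python standard library. The choice of test, the function name permutation_test and the sample values are assumptions made for illustration only. The appropriate test for your data may be quite different.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns an approximate p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1

    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical measurements for two groups.
control = [1, 2, 3, 4, 5, 6, 7, 8]
treated = [11, 12, 13, 14, 15, 16, 17, 18]
p_value = permutation_test(control, treated)
print(f"approximate p-value: {p_value:.4f}")
```

A permutation test is used here because it needs few distributional assumptions and is easy to read in code. It is not a recommendation over a t-test or any other procedure.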

Modeling

Developing algorithms and predictive models can be a fruitful data analysis strategy. Whether you're interested in using regression, network analysis or general machine learning techniques, we can help compare, optimize and interpret models. Depending on the size of the dataset and the method used, these approaches can be quite computationally intensive. Our expertise includes implementation of modeling in high-performance computing environments.
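As a small example of the regression case, the sketch below fits a simple linear regression by ordinary least squares and reports the coefficient of determination, using only the Python standard library. The function name fit_line and the data are hypothetical. Real projects would normally rely on an established package (for example statsmodels or scikit-learn in Python, or lm in R) and on larger datasets.

```python
from statistics import mean


def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar

    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical data lying exactly on the line y = 2x + 1.
dose = [0, 1, 2, 3, 4, 5]
response = [1, 3, 5, 7, 9, 11]
slope, intercept, r_squared = fit_line(dose, response)
print(slope, intercept, r_squared)
```

Comparing models usually goes beyond a single fit statistic, for instance by evaluating predictions on held-out data.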

Visualization

From exploratory plots to publication-ready visualizations, we are available to help you prepare strong visual representations of your data. Examples of visualizations we can help with include heat maps, clustering, box plots, and Kaplan-Meier plots. In some cases, interactive plotting tools are helpful for communicating features, particularly when dealing with high-dimensional datasets.
Boxplot Example
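For readers curious about what a box plot actually summarizes, the following sketch computes the quantities a typical box plot draws: the quartiles, the whisker ends and any points flagged as outliers. It assumes the common convention of placing whisker limits 1.5 times the interquartile range beyond the quartiles, and the "inclusive" quantile method from Python's statistics module. Plotting libraries may use slightly different quantile definitions. The function name box_plot_stats and the sample values are hypothetical.

```python
from statistics import quantiles


def box_plot_stats(values, whisker=1.5):
    """Compute the summary statistics drawn in a conventional box plot."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr

    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),   # lowest point within the fences
        "whisker_high": max(inside),  # highest point within the fences
        "outliers": outliers,
    }


# Hypothetical sample with one extreme value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
stats = box_plot_stats(sample)
print(stats)
```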