School of Medicine Research Computing (SOMRC) provides state-of-the-art resources and expertise in handling and analyzing genomics and metagenomics data. Bioinformatics is a quickly evolving field with new biological and computational techniques being formalized and adopted at a fast pace. Hence, the following is only a brief cross-section of the ways researchers can use SOMRC’s expertise and computing resources for their bioinformatics research.
Next-generation sequence data analysis
- Genome assembly, reference-based and/or de-novo
- Whole-Genome/Exome sequence analysis for variant calling/annotation
- RNA-Seq data analysis to quantify, discover and profile RNAs
- Microbiome data analysis, including 16S rRNA surveys, OTU clustering, microbial profiling, taxonomic and functional analysis from whole shotgun metagenomic/metatranscriptomic datasets
- Epigenetic analysis from BSAS/ChIP-Seq/ATAC-Seq
In addition to the standard analyses listed above, SOMRC can work with you to provide customized bioinformatics solutions for specific research goals.
We can advise and collaborate at every experimental stage, from experimental design through data processing, analysis, visualization, and exploration to downstream statistical modeling for biological insights.
UVA has two local HPC facilities available to researchers: Rivanna and Ivy. In addition, cloud-based services offer computing environments for running flexible, scalable, on-demand applications. SOMRC can work with your team to determine the computing platform best suited for your research project.
High-performance Computing Cluster
All faculty, research staff, and graduate students of UVA have access to Rivanna, the university's high-performance computing system with 290+ compute nodes (6500+ cores) for high-throughput multithreaded jobs, parallel jobs, and memory-intensive large-scale data analyses. The architecture is specifically suited for large-scale distributed genomic data analysis, with 100+ bioinformatics software packages installed and ready to use.
We can explore the possibility of using cloud infrastructure (AWS/GCP) for your bioinformatics analysis and data storage. For certain applications, the 'elasticity' of cloud computing may prove beneficial for saving time and reducing the costs of data analysis and sharing. The SOMRC team is available for consultation on your project needs.
Bioinformatics analyses invariably involve chaining a series of tools, processes, and functions across many input samples to go from raw data to biologically interpretable results. Using a workflow management system to set up, execute, and monitor pipelines makes it simpler to put together such complex scientific workflows. SOMRC uses WDL (pronounced "widdle"), a workflow definition language for describing tasks and workflows, and Cromwell, the execution engine that can run WDL scripts locally or on the cloud. Cromwell provides an abstraction layer between a pipeline’s logic and its execution, so that the pipeline can be executed on multiple platforms with minimal configuration changes. We are using the built-in scatter-gather parallelism features to develop variant calling WDL workflows that adhere to GATK Best Practices, and we execute them on Rivanna using Cromwell. In the future, we also plan to transition these pipelines to Google Cloud Platform for more cost-effective execution.
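To make the scatter-gather idea concrete, here is a minimal Python sketch of the pattern. It is not WDL and not one of SOMRC's pipelines. The toy sequences and the function names (`make_intervals`, `call_variants_on_interval`, `scatter_gather`) are hypothetical, and the per-interval "caller" simply reports mismatches against a toy reference rather than doing anything resembling GATK.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy data standing in for a reference genome and a sequenced sample.
# Both strings are made up for illustration.
REFERENCE = "ACGTACGTACGTACGTACGT"
SAMPLE = "ACGTACCTACGTACGAACGT"


def make_intervals(length, size):
    """Split [0, length) into consecutive half-open intervals of at most `size` bases."""
    return [(start, min(start + size, length)) for start in range(0, length, size)]


def call_variants_on_interval(interval):
    """Scatter step: process one interval independently of all others.

    Returns (position, ref_base, alt_base) for each mismatch in the interval.
    This is a stand-in for a real per-interval variant caller.
    """
    start, end = interval
    return [
        (pos, REFERENCE[pos], SAMPLE[pos])
        for pos in range(start, end)
        if REFERENCE[pos] != SAMPLE[pos]
    ]


def scatter_gather(intervals, workers=4):
    """Run the per-interval step concurrently, then merge the shards."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        shards = list(pool.map(call_variants_on_interval, intervals))
    # Gather step: flatten the per-interval results into one sorted call set.
    return sorted(variant for shard in shards for variant in shard)


if __name__ == "__main__":
    intervals = make_intervals(len(REFERENCE), 5)
    for pos, ref, alt in scatter_gather(intervals):
        print(pos, ref, alt)
```

In a workflow engine, each scattered interval would typically run as a separate job on its own compute resources. The thread pool here only mimics that structure on a single machine.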
We warmly welcome long-term collaborations with experimentalists and computational biologists. Working with biologists, we can chart out the experimental design, choose controls, and plan statistical and computational analyses so that the experiment extracts the most value from the data. We can also build and maintain code and pipelines, as well as test new algorithms on multi-core and multi-node architectures.
If you have a bioinformatics project and would like to discuss potential solutions and their implementation, locally or on the cloud, SOMRC is available for consultation.
SOMRC offers interactive workshops that focus on various aspects of bioinformatics. We typically make use of Rivanna to teach participants how to analyze large amounts of high-throughput sequencing data. To learn more and register for workshops, please visit the CADRE Academy education platform.