Cloud computing is ideal for running flexible, scalable applications on demand, in periodic bursts, or for fixed periods of time. UVA SOMRC works alongside researchers to design research applications and datasets and deploy them to Amazon Web Services, the leader among public cloud vendors. This means that server, storage, and database needs do not have to be estimated or purchased beforehand – they can be scaled up and down with your needs, or programmed to scale dynamically with your application.
Researchers Using the Cloud
|Serverless Web||SoM faculty and researchers can share data, findings, tools and other resources from static HTML content published to object storage. This simple method for publishing can cost only a few dollars a month and requires no server management.|
|Data Lakes||A new paradigm in data storage and processing, data lakes help researchers by providing a central repository for both structured and unstructured data, of any type or size. These data can then be siphoned off for processing, either in real-time streams or in queues for later analysis.|
|Services Supporting HPC||Users of HPC usually have more than enough computing power to run their jobs. But what if you need a relational or NoSQL database, a messaging service, or offsite storage? Researchers have begun integrating the cloud into their HPC jobs to create, use, and manage external services like these.|
|HIPAA-Compliant Computing||Researchers working on clinical datasets use Ivy, our private virtualized platform, to perform HIPAA-compliant analytics and compute jobs. This platform offers virtual machines, an R/Python data analytics tool, and Hadoop/Spark for larger analytics projects. Many Ivy users work with EPIC clinical data alongside other highly sensitive datasets for their investigations.|
|Workflows & Pipeline Management||Researchers need flexibility in where they run their data pipelines -- it might be on a personal computer, a lab server, an HPC cluster, or a cloud instance. We are working with faculty to extend some commonly used pipeline tools so that they can create and push jobs to cloud-based resources, regardless of the cloud vendor.|
|Long-term Cold Storage||AWS Glacier and Google Nearline/Coldline offer researchers "cold" offsite storage for long-term backups of infrequently-accessed data. Genomics researchers use Glacier to store terabytes of source data as required by grants and federal research projects.|
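As a rough illustration of the Serverless Web approach in the table above, the sketch below walks a local folder of static HTML content and works out, for each file, the object key and Content-Type it would need when published to object storage. The function name `build_upload_manifest` and the manifest layout are hypothetical and chosen only for this example; the upload step itself (through a vendor SDK or command-line tool) is deliberately left out.

```python
import mimetypes
from pathlib import Path


def build_upload_manifest(site_root):
    """Map every file under site_root to an object key and a Content-Type.

    Hypothetical helper for illustration: it only plans the upload and
    never contacts a cloud service.
    """
    root = Path(site_root)
    manifest = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # directories do not become objects
        # Object keys use forward slashes regardless of the local OS.
        key = path.relative_to(root).as_posix()
        # Browsers rely on Content-Type to render pages instead of downloading them.
        content_type, _encoding = mimetypes.guess_type(path.name)
        manifest.append(
            {
                "key": key,
                "content_type": content_type or "application/octet-stream",
            }
        )
    return manifest
```

Each manifest entry can then be handed to whatever upload mechanism your storage service provides. Setting the Content-Type explicitly matters because object storage generally serves back whatever type was recorded at upload time.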
Other Common Use Cases
Proofs of concept - To verify that a system or design works, or to benchmark processing speeds, we may use short-lived instances and learn from them before building a production system.
Test / Development environments - For installing test packages, trying new ideas, and testing design patterns.
Dynamic / flexible / scaling application stacks - When future traffic or load cannot be determined beforehand, deploying into a dynamic environment means the infrastructure is not locked into any set type of CPU/RAM or scale.
Short-term or fast deployment projects - For almost immediate computing needs, existing users can create new instances as needed.
Container deployments - Run microservices (such as Docker containers) in an environment that can load-balance their traffic and maintain container health.
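To make the idea of a dynamically scaling application stack more concrete, here is a minimal sketch of a proportional scaling rule: given the current number of instances and an observed load metric, it computes how many instances would bring the per-instance load back to a target value. This is a simplified illustration, not the algorithm used by AWS or any other vendor; the function name, parameters, and default limits are all assumptions made for the example.

```python
import math


def desired_instances(current, observed_load, target_load,
                      min_instances=1, max_instances=10):
    """Return an instance count that moves per-instance load toward target_load.

    Illustrative only: observed_load and target_load are in the same
    (hypothetical) unit, for example average CPU percent per instance.
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    if target_load <= 0:
        raise ValueError("target_load must be positive")
    # Scale the fleet in proportion to how far the load is from the target,
    # rounding up so the fleet is never undersized.
    raw = math.ceil(current * observed_load / target_load)
    # Clamp to the configured bounds so costs and capacity stay predictable.
    return max(min_instances, min(max_instances, raw))


print(desired_instances(4, 90, 60))   # load above target: scale out to 6
print(desired_instances(4, 15, 60))   # load below target: scale in to 1
```

The upper bound is the part that protects a budget: without it, a traffic spike could grow the fleet, and the bill, without limit.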
Service Oriented Architecture
A key advantage of the cloud is that for many services you do not need to build or maintain the servers that support the service – you simply use it.
Here are some of the building blocks available using cloud infrastructure:
- Containers / Docker
- Analytics / Data Management
- Continuous Integration
- Sensor / IoT Data Streaming
- Messaging Queues
- SMS / Push Integration
- Alexa Skills / Speech Integration
- Serverless Computing
- Code Build / Validation
Cloud Services at UVA
As an Internet2 institution, the University of Virginia has access to AWS accounts through a reseller, DLT. This program offers a few key advantages for researchers:
- First, it allows for billing through purchase orders (P.O.’s) rather than credit cards;
- Second, it gives a slight (~3%) discount on services; and
- Third, it removes the required minimum costs for AWS support. Read more about the Internet2/AWS program.
Requesting an Account
Researchers or labs who would like to use AWS for their computing infrastructure should contact us to help set up an account through DLT. In order to set up your account you will need a “standing” annual P.O. from the UVA Procurement office equal to or greater than your estimated annual costs. For example, if you estimate your costs will be $300 per month ($3,600 per year), you might want to request a $4000 standing P.O. to leave some margin. Your monthly AWS bills are then charged against that P.O. for the year. Note that these purchase orders must be renewed each year that you continue to use AWS.
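The arithmetic behind the example above can be sketched as a small helper. The 10% buffer and the rounding up to the nearest $500 are assumptions chosen so that the result matches the $300 per month example; they are not a UVA Procurement rule, and the function name is hypothetical.

```python
import math


def standing_po_amount(monthly_estimate, buffer_percent=10, round_to=500):
    """Suggest a standing P.O. amount from an estimated monthly AWS cost.

    buffer_percent and round_to are illustrative assumptions, not policy.
    """
    annual = monthly_estimate * 12                    # 12 monthly bills per P.O. year
    padded = annual * (100 + buffer_percent) / 100    # headroom for unexpected usage
    # Round up so the P.O. is never smaller than the padded estimate.
    return math.ceil(padded / round_to) * round_to


print(standing_po_amount(300))  # → 4000
```

Here $300 per month is $3,600 per year; adding 10% gives $3,960, which rounds up to $4,000.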
Training & Implementation
With an AWS account in hand, you will need training. We offer regular, free workshops on cloud computing. In addition, we have weekly availability to answer your questions during our office hours, or we can schedule an in-person, hands-on training with your research group or lab.
If you need help in designing your infrastructure in a cloud environment, or thinking through how to migrate your existing projects, contact us for a consultation.
Sensitive Data in the Cloud
If your cloud-based project involves any sensitive data (HIPAA, PHI, etc.) you must request approval from the Information Security office at UVA. You will be required to verify that your application, infrastructure, and staff can meet all minimum requirements for the secure transfer and handling of sensitive data.
To get an idea of how AWS is used in real-world and research scenarios, visit the AWS Architecture Center or review some reference deployments below. These examples are drawn from AWS.
Build auto-scalable batch processing systems like video/image/datastream processing pipelines (PDF)
Large Scale Processing and Huge Data sets
Build high-performance computing systems that involve Big Data (PDF)
Time Series Processing
Build elastic systems that process time series data (PDF)
Solution Architecture & Consulting
We have experience designing and delivering solutions to the public cloud using industry best practices. If you have a project and would like to discuss options, pricing, design, or implementation, we are available for consultation. Our staff includes an AWS Certified Solutions Architect, and the entire SOMRC team uses AWS for our own internal systems and development.
We also offer in-person, hands-on workshops and sessions on working with the cloud. Workshops cover a number of topics, from creating object storage buckets and simple compute instances to more complex data-driven workflows and Docker containers. If you have an idea for a workshop or would like to schedule training for your lab or group, please contact us.
|October 24, 2018||Exploring Microbial Community||Hardik I Parikh|
|October 25, 2018||Writing Functions in R||VP Nagraj|
|November 01, 2018||Data Visualization with Python||Hardik I Parikh|
|November 07, 2018||RNA-Seq Data Analysis||Hardik I Parikh|
|November 08, 2018||Using Jupyter Notebooks||Jacalyn Huband|
|November 28, 2018||WDL/Cromwell on Rivanna||Hardik I Parikh|