UVA School of Medicine

Research Computing

Enabling scientific breakthroughs at scale with advanced computing

/tag/emr

  • Hadoop / Spark

    Hadoop is a framework that supports the processing and storage of extremely large data sets in a distributed computing environment. It is part of the Apache project sponsored by the Apache Software Foundation. In Amazon Web Services, Hadoop is available as a service named “Elastic MapReduce” (EMR). Hadoop Hadoop is frequently used by researchers for processing either massive data files (larger than you might be able to store on your local workstation) or a great quantity of files.