Ivy Secure Environment


Ivy is a secure computing environment for researchers consisting of virtual machines (Linux and Windows), Domino Data Lab, and the Apache Spark environment. Researchers can use Ivy to process and store sensitive data with the confidence that the environment is secure and meets HIPAA requirements.

Overview

Ivy consists of three separate computing environments: virtual machines, Domino Data Lab, and Apache Spark. Access to one environment does not automatically grant access to the others.


Requesting Access

Access to Ivy resources is project-based, limited to PIs and their designees, and requires approval. Costs for Ivy resources and storage must be funded by the PI. Once a project is approved, the PI and each of their researchers must sign a RUDA (one per researcher per project).


Pricing

Ivy resources will be provided without a fee for approved projects. Please note that the pricing model is still under evaluation. A valid PTAO is required as part of the account request process, although no charges will be made without advance notice to the PI.

Connecting and Signing In

1 Authentication

You will sign in to all Ivy resources using your UVA computing ID and Eservices password. Because of Ivy's high security requirements, your Eservices password must be changed every 60 days.

Need help resetting your Eservices password?

If you are working from a secure Health Systems workstation, you are ready to connect. If you are outside the secure HS network, you also need an identity token and a JointVPN connection, as described in the following steps.

2 Identity Token

To connect to the Ivy environment over VPN, you need a physical USB identity token issued to you by the ISPRO Access Management Office. Tokens must be requested and approved, and delivery may take 1-2 weeks. You must pick up and activate your token in person, with proof of identification. The token has its own password, which you must enter to use it.

3 JointVPN

With your UVA computing ID, Eservices password, and USB identity token in hand, you must run the Cisco AnyConnect software to start a JointVPN connection every time you use any Ivy resource. AnyConnect will authenticate to the UVA network using a digital certificate installed on your workstation.

More information on VPN from ITS:


Virtual Machines

A virtual machine (VM) is a computing instance dedicated to your project. Multiple users can sign into a single VM.

Virtual machines come in two platforms, CentOS 7 Linux and Windows Server 2012 R2. Each platform is available in three instance types. Refer to the table below for specifics.

Type     CPU        Memory
Small    4 cores    16 GB
Medium   8 cores    32 GB
Large    16 cores   128 GB
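
As a rough aid for choosing among the instance types above, the following sketch prints the smallest type that covers a requested core count and memory size. The function name `ivy_pick_instance` is hypothetical and not part of any Ivy tooling; the thresholds are copied from the table.

```shell
# Hypothetical helper: print the smallest listed instance type that
# provides at least CORES cores and MEMORY_GB gigabytes of memory.
# Usage: ivy_pick_instance CORES MEMORY_GB
ivy_pick_instance() {
    cores=$1
    mem_gb=$2
    if [ "$cores" -le 4 ] && [ "$mem_gb" -le 16 ]; then
        echo "Small"
    elif [ "$cores" -le 8 ] && [ "$mem_gb" -le 32 ]; then
        echo "Medium"
    elif [ "$cores" -le 16 ] && [ "$mem_gb" -le 128 ]; then
        echo "Large"
    else
        # Nothing in the table is big enough for this request.
        echo "No listed instance type is large enough" >&2
        return 1
    fi
}
```

For example, `ivy_pick_instance 6 16` prints `Medium`, because the request exceeds the 4 cores of a Small instance.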

Once created, your instance will be assigned a private IP address (in the format 10.xx.xx.xx) that you will use to connect to it. VMs exist in a private, secure network and cannot reach outside resources on the Internet. Most inbound and outbound data transfer is managed through the Data Transfer Node (see below).
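
Before trying to connect, it can help to confirm that the address you were given really has the 10.xx.xx.xx form. The check below is a minimal sketch under that assumption; `ivy_is_private_ip` is a hypothetical name, and the function only inspects the text of the address without contacting the VM.

```shell
# Hypothetical helper: succeed only if the argument is a dotted IPv4
# address whose first field is 10 and whose other fields are 0-255.
ivy_is_private_ip() {
    ip=$1
    # Reject characters other than digits and dots, and empty fields.
    case $ip in
        *[!0-9.]*|*..*|.*|*.) return 1 ;;
    esac
    # Split the address on dots into the positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $ip
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    [ "$1" = "10" ] || return 1
    for octet in "$2" "$3" "$4"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}
```

A call such as `ivy_is_private_ip 10.1.2.3 && echo "looks like an Ivy address"` prints the message, while an address like 192.168.1.1 or 10.300.1.1 is rejected.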

Connecting to your VM

To connect to your VM, you need either an SSH client, which gives you command-line access (CentOS VMs only), or remote desktop software, which gives you the VM's desktop GUI. These options are outlined below.

MacOSX Users:

  • Terminal (for SSH; built in, found under Applications -> Utilities -> Terminal)
  • x2goclient (for remote desktop to CentOS VMs, download here)
  • Other RDP clients (for remote desktop to Windows VMs, download here)

Windows Users:

Linux Users:

  • Terminal / Command (for SSH, built-in)
  • x2goclient (for remote desktop to CentOS VMs, download here)
  • Other remote desktop clients can be used (for Windows VMs)

To connect to Ivy, follow the platform-specific steps below:

CentOS 7 Linux
  • Open your JointVPN connection
  • Reference the IP address of your Ivy VM.
  • For SSH access:
      ssh uva-id@ip-address
  • For Remote Desktop access: Start the x2goclient to the IP address of your VM and sign in.
Windows
  • Open your JointVPN connection
  • Reference the IP address of your Ivy VM.
  • For Remote Desktop access: Start an RDP client to the IP address of your VM and sign in.
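
If you connect to a CentOS VM over SSH regularly, an entry in your SSH client configuration saves retyping the ID and IP address. The sketch below prints such an entry using the standard `Host`, `HostName`, and `User` keywords of OpenSSH; the function name `ivy_ssh_config_entry`, the alias `ivy-vm`, the computing ID `mst3k`, and the address `10.1.2.3` are all placeholders, not values issued by Ivy.

```shell
# Hypothetical helper: print an OpenSSH client configuration entry for an Ivy VM.
# Usage: ivy_ssh_config_entry ALIAS UVA_ID IP_ADDRESS
ivy_ssh_config_entry() {
    if [ "$#" -ne 3 ]; then
        echo "usage: ivy_ssh_config_entry ALIAS UVA_ID IP_ADDRESS" >&2
        return 2
    fi
    # Host is the short alias; HostName is the VM's private IP address.
    printf 'Host %s\n    HostName %s\n    User %s\n' "$1" "$3" "$2"
}
```

After appending the output to your configuration, for example with `ivy_ssh_config_entry ivy-vm mst3k 10.1.2.3 >> ~/.ssh/config`, running `ssh ivy-vm` is equivalent to `ssh mst3k@10.1.2.3`. The JointVPN connection must still be open first.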

Software

Every virtual machine (Linux or Windows) comes with a base installation of software by default. These help researchers by providing the basic tools for data processing and manipulation. Additional software packages are pre-approved and available for installation upon request. See the lists below for options.

If you require additional software not listed, you must submit a request. Requests are reviewed by the UVA ISPRO office for security and regulatory compliance and, if approved, will be installed for you.

Python/R Packages - Anaconda Python and R packages are available to users through the usual pip, conda, and CRAN installation methods.

PREINSTALLED Linux Software

PREINSTALLED Windows Software

ADDITIONAL Linux Groups

ADDITIONAL Windows Groups

Storage

Ivy VM has a pool of over 2 petabytes of network-attached storage shared among users. The PI specifies how much storage space they need when requesting access to Ivy. Virtual machines do not come with any significant disk storage of their own.



Domino Data Lab

Domino Data Lab (DDL) provides a central environment for data science projects including project management, collaboration with team members, and setting up hardware configuration for a project.

Access

DDL is entirely browser-based and does not require any setup on your workstation. Once connected via JointVPN, point your browser to:

https://domino.hpc.virginia.edu/

You will be prompted for Domino login credentials, which correspond to your UVA computing ID and Eservices password. Please remember that in order to maintain access to any platform on Ivy (including DDL), you will need to change your Eservices password every 60 days.

Storage

Each DDL node comes with 500 gigabytes of storage. Central storage is not visible to DDL nodes.

Features

DDL is organized in a project structure, which is ideal for collaborative data analyses. Scripts written in Python and R can be edited, scheduled, and run from within the web interface, both inside and outside of interactive notebook sessions (e.g. RStudio or Jupyter).

For specifics about these features and more, refer to the Ivy DDL User Guide.



Apache Spark

Ivy Spark is an environment for distributed map/reduce computational analyses for Big Data applications.

Access

The Apache Spark installation on Ivy is under active development. If you have questions about access to this resource, please email ivy-support@virginia.edu with information about your use case.

Software

The platform comes with Cloudera Hadoop, Spark, YARN, Hive, Impala, Pig, ZooKeeper, and Oozie.

Storage

Ivy Spark has 480 terabytes of HDFS storage shared amongst users. Each node has 500 gigabytes of local disk storage. Data can be uploaded through the Hue web interface.


Data Transfer In/Out of Ivy

Moving data into or out of Ivy requires passing it through the Data Transfer Node (DTN). This server has 100 TB of storage and can be accessed through a web interface or from the command line via SFTP, SCP, or FTPS.
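
For command-line transfers of several files, SFTP can read its commands from a batch file. The sketch below generates such a batch; it assumes an OpenSSH `sftp` client, and the function name `ivy_sftp_batch`, the remote directory `incoming`, and the host placeholder `DTN_HOST` are hypothetical, since the DTN's address and directory layout are not given here.

```shell
# Hypothetical helper: print sftp batch commands that upload each
# given file into REMOTE_DIR on the server.
# Usage: ivy_sftp_batch REMOTE_DIR FILE...
ivy_sftp_batch() {
    remote_dir=$1
    shift
    # Quote paths so names containing spaces survive sftp's parsing.
    printf 'cd "%s"\n' "$remote_dir"
    for f in "$@"; do
        printf 'put "%s"\n' "$f"
    done
}
```

The output can be piped straight into the client, for example `ivy_sftp_batch incoming data.csv notes.txt | sftp -b - mst3k@DTN_HOST`, where `-b -` tells `sftp` to read the batch from standard input. Batch mode cannot prompt for a password, so this form assumes non-interactive (key-based) authentication, which the DTN may not offer.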



HIPAA Compliance

The Ivy platform is HIPAA compliant by default.