Domino Data Lab (DDL) provides a central environment for data science projects, with features for project management, collaboration with team members, and hardware configuration for each project.
Access to DDL on Ivy is managed through the Ivy account request process. Accounts are issued on a per-project basis, with PIs (and any project members) being granted individual accounts to log into the DDL platform.
Once the request has been approved and all associated members have completed the necessary documentation, each project member can sign into DDL with their UVa Eservices user name and password.
DDL is organized into projects, which automatically provision a folder hierarchy to store code, data, and output. Users can create new projects, and can also invite other Ivy DDL users to collaboratively view, edit, or run files in an existing project.
Collaborators can “fork” (copy the contents of) projects, leave comments, and use built-in version control utilities to store or revert changes to files as necessary.
To upload a script, dataset or other file, users can navigate to a project and select the “files” menu item. DDL includes a drag-and-drop interface for uploading files less than 550 MB.
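Before uploading a folder of files, it can help to know which ones will not fit through the drag-and-drop interface. The following is a minimal sketch, not part of DDL itself: the function name `files_over_limit` is hypothetical, and the code assumes that "550 MB" means 550 × 10^6 bytes, which may differ from how the platform counts.

```python
from pathlib import Path

# Assumption: "550 MB" is read as 550 * 10**6 bytes; the platform may count differently.
DRAG_AND_DROP_LIMIT_BYTES = 550 * 10**6


def files_over_limit(folder, limit=DRAG_AND_DROP_LIMIT_BYTES):
    """Return a sorted list of files under `folder` whose size is at least `limit` bytes."""
    return sorted(
        path
        for path in Path(folder).rglob("*")
        if path.is_file() and path.stat().st_size >= limit
    )
```

Running `files_over_limit("my_local_folder")` on a local machine lists the files that would need an alternative upload method.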
To upload files larger than 550 MB, refer to the following:
The DDL platform allows users to run Python and R scripts that have been either uploaded to a project or written in one of the DDL editors or notebooks. To issue a run, navigate to the file you would like to execute and click Run. Alternatively, you can use the Runs window to start a run by entering the filename.
Note that if code is associated with data, it should be written relative to the location of that dataset in the project directory.
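As an illustration of relative paths, here is a minimal sketch of a script that reads a dataset and writes output without hard-coding an absolute location. The layout (`data/measurements.csv`, `results/`) and the function names are hypothetical, and the sketch assumes the run starts in the project's root directory.

```python
import csv
from pathlib import Path

# Hypothetical project layout: <project>/data/measurements.csv and <project>/results/
DATA_FILE = Path("data") / "measurements.csv"
OUTPUT_DIR = Path("results")


def count_rows(csv_path):
    """Count the data rows (excluding the header line) in a CSV file."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row, if any
        return sum(1 for _ in reader)


def summarize(project_root=Path(".")):
    """Count rows in the dataset and write the count to results/row_count.txt."""
    n_rows = count_rows(project_root / DATA_FILE)
    output_dir = project_root / OUTPUT_DIR
    output_dir.mkdir(exist_ok=True)
    (output_dir / "row_count.txt").write_text(f"{n_rows}\n")
    return n_rows
```

Calling `summarize()` at the end of the script resolves both paths against the current working directory, so the same file works for any collaborator who forks the project.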
In addition to scheduling and executing scripts that have been uploaded, DDL provides an interactive notebook session feature.
Available notebooks include:
Scripts and data generated in an interactive notebook session can be saved (“synced”) to the DDL project from which they were initiated.
Choosing a Base Environment
Depending on which analysis tools (e.g. R or Python) you plan to use, you may need to adjust the default computing environment. For example, if your code is written to run with Python 2.x (and not Python 3.x), you can choose a base environment that uses that version. Note that these configurations are set on a per-project basis, and can be managed by visiting Settings >> Compute environment.
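To confirm that a script is running under the interpreter it was written for, a short guard can be placed at the top of the file. This is a minimal sketch using only the standard library; the function names `interpreter_matches` and `require_major` are hypothetical, and the code is written to run under both Python 2.x and 3.x.

```python
import sys


def interpreter_matches(expected_major, version_info=None):
    """Return True if the interpreter's major version equals expected_major."""
    if version_info is None:
        version_info = sys.version_info
    return version_info[0] == expected_major


def require_major(expected_major):
    """Raise RuntimeError if the script runs under a different major version."""
    if not interpreter_matches(expected_major):
        raise RuntimeError(
            "Expected Python %d.x but running %d.%d"
            % (expected_major, sys.version_info[0], sys.version_info[1])
        )
```

Calling `require_major(2)` in a Python 2 script makes a run fail immediately with a clear message if the project's compute environment provides Python 3 instead.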
Each compute environment comes with a number of popular packages / modules pre-installed. Users are able to install additional packages as necessary via standard package management tools.
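For Python, the standard tool is `pip`. The sketch below checks whether a module is already available and, if not, installs it for the current user. The helper names are hypothetical, and two assumptions apply: the environment must be able to reach a package index, and whether packages installed this way persist between runs depends on platform configuration not described here.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


def pip_install_command(package, user=True):
    """Build the pip command for the interpreter that is running this script."""
    command = [sys.executable, "-m", "pip", "install"]
    if user:
        command.append("--user")  # install into the user's site-packages
    command.append(package)
    return command


def ensure_installed(module_name, package=None):
    """Install `package` (default: module_name) if the module is missing.

    Returns True if an installation was attempted, False otherwise.
    """
    if is_installed(module_name):
        return False
    subprocess.check_call(pip_install_command(package or module_name))
    return True
```

Invoking pip through `sys.executable -m pip` ensures the package is installed for the same interpreter that runs the script, which matters when a compute environment contains more than one Python version.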
For specifics related to R or Python package installation refer to the following:
Selecting Hardware Tiers
Ivy DDL currently has a single hardware tier available, which is selected by default.
Additional tiers may be added in the future; to choose the tier you would like to use, navigate to Settings >> Hardware tier.
DDL includes a number of additional features. Use the resources below to find specific topics:
Please note that because Ivy is designed to be a secure environment, certain DDL features are not available in Ivy DDL. Examples include the following:
- GitHub integration
- Email notifications