Data Cruncher allows you to enter and execute Python, R or Julia code to perform further analyses on your data on the Seven Bridges Platform. This page explains how to access Data Cruncher from a project on the Platform, set up an analysis, and execute code within that analysis.
- Open the desired project on the Platform.
This project should contain the data that you want to analyze further using Data Cruncher.
- From the project's dashboard, click the Interactive Analysis tab.
The list of available interactive analysis tools opens.
- On the Data Cruncher card, click Open.
This takes you to the Data Cruncher home page. If you have previous analyses, they will be listed on this page.
- In the top-right corner, click Create analysis.
The Create new analysis wizard is displayed.
- On the first screen, name your analysis in the Analysis name field.
- Select JupyterLab or RStudio as the analysis environment.
- Click Next.
- Select the instance for the analysis.
The Instance type list displays available instances along with their disk size, number of vCPUs and memory (shown in brackets). The default instance is c3.2xlarge, which has 160 GB of SSD storage, 8 vCPUs and 15 GB of RAM.
Suspend time is the period of analysis inactivity after which the instance is stopped automatically. Inactivity implies that:
- No files have been modified or created in the analysis (in the /sbgenomics/workspace directory) (JupyterLab and RStudio).
- There are no running kernels (JupyterLab).
When the suspend time elapses, the Platform stops the instance, stops the analysis, and saves all analysis files and output files. The minimum suspend time is 15 minutes.
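The inactivity rule described above can be illustrated with a minimal Python sketch. This is not the Platform's implementation. The function names (latest_modification, is_inactive) and the running_kernels parameter are hypothetical, and the sketch assumes that an analysis with no files at all counts as inactive. It only shows how the two conditions and the 15-minute minimum combine.

```python
import os
import time

MIN_SUSPEND_MINUTES = 15  # minimum suspend time stated in the documentation


def latest_modification(workspace):
    """Return the newest modification time (epoch seconds) of any file
    under `workspace`, or None if the directory contains no files."""
    latest = None
    for root, _dirs, files in os.walk(workspace):
        for name in files:
            try:
                mtime = os.path.getmtime(os.path.join(root, name))
            except OSError:
                continue  # file disappeared while scanning
            if latest is None or mtime > latest:
                latest = mtime
    return latest


def is_inactive(workspace, suspend_minutes, running_kernels=0, now=None):
    """Hypothetical check: True when no kernel is running and no file under
    `workspace` was created or modified within the suspend time."""
    if suspend_minutes < MIN_SUSPEND_MINUTES:
        raise ValueError("suspend time must be at least 15 minutes")
    if running_kernels > 0:
        return False  # a running kernel counts as activity (JupyterLab)
    now = time.time() if now is None else now
    latest = latest_modification(workspace)
    if latest is None:
        return True  # assumption: an empty workspace counts as inactive
    return (now - latest) >= suspend_minutes * 60
```

For example, is_inactive("/sbgenomics/workspace", 30) would report whether the workspace has been idle for at least 30 minutes, assuming no kernels are running.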
- Click Start the analysis.
The Platform will start acquiring a suitable instance for your analysis, which may take a few minutes. Analysis initialization goes through the following stages:
- Allocating the instance for your analysis - Obtain an instance from the cloud infrastructure provider.
- Preparing the allocated instance - Load the required software onto the instance.
- Doing the final setup of the analysis environment - Perform final settings and initialize the analysis environment.
Once an instance is ready, you will be notified.
If you don't have execute permissions in the project where the analysis is being created, the button is labelled Create the analysis. This allows you to create the analysis in draft state with the defined settings, but not execute it.
Once the Platform has acquired an instance for your analysis, you will be able to open the editor and run your analysis by clicking Open in editor.
The options you are presented with depend on the selected editing environment. In JupyterLab, the following options are available:
Notebook - select whether to create a Python 2, Python 3, R or Julia notebook. A notebook is the central element of a Data Cruncher analysis in JupyterLab, where you can enter your code, but also store equations, visualizations and explanatory text.
Console - select any of the Python 2, Python 3, R or Julia options if you prefer to run your code directly in the console.
Other - this section offers the following options:
- Text Editor - used to create any text-based file that you want to have or use during your analysis. For example, if you need to add a JSON file to your analysis files, you can select this option, enter or paste the JSON content and save the file with a .json extension.
- Terminal - a familiar way of interacting with the system by bringing the functionality of a Linux shell into the Data Cruncher analysis environment.
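As an illustration of the Text Editor use case, a JSON file saved among your analysis files can be read back from a Python 3 notebook or console. The sketch below is a minimal example. The helper names (save_settings, load_settings), the file name settings.json and its contents are hypothetical and are not part of Data Cruncher.

```python
import json
from pathlib import Path


def save_settings(settings, path):
    """Write a dictionary to `path` as indented JSON (what you would
    otherwise type or paste into the Text Editor)."""
    with Path(path).open("w", encoding="utf-8") as handle:
        json.dump(settings, handle, indent=2)


def load_settings(path):
    """Read a JSON file and return its content as Python objects."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    import tempfile

    # A temporary directory stands in for the analysis files here.
    with tempfile.TemporaryDirectory() as workdir:
        target = Path(workdir) / "settings.json"  # hypothetical file name
        save_settings({"min_quality": 30, "samples": ["A", "B"]}, target)
        print(load_settings(target))
```

In an actual analysis, you would pass the path of the file you saved from the Text Editor to load_settings instead of using a temporary directory.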
The RStudio editing environment opens right away, giving you direct access to the R console. To enter your R code as an R script or notebook instead of typing it directly into the console, select File > New File from the main menu bar and choose the appropriate file type.