API Quickstart

Overview

To introduce you to the capabilities of the Seven Bridges API, this Quickstart walks you through a Whole Exome Sequencing analysis. The tutorial is a good way to familiarize yourself with the main Seven Bridges API requests.

If you wish to learn the basics behind APIs, please see our Introduction to the API tutorial.

Alternatively, to use the Seven Bridges Python API library, consult our Jupyter notebook Quickstart tutorial.

Prerequisites

You will need an account on the Seven Bridges Platform in order to obtain your authentication token. Almost all API requests require your Seven Bridges authentication token. This acts as a security measure regulating your access to your projects. Learn more about obtaining your authentication token.

Procedure

We'll start by creating a project and populating it with files. Then, we'll use one of the Seven Bridges Whole Exome Sequencing workflows, Whole Exome Sequencing GATK 2.3.9.-lite, to carry out the analysis. Finally, we'll examine our results.

All necessary tools and data will be available on the Platform.

Note that the HTTP requests on this page use the base API path for the Amazon Web Services US (AWS US) deployment of the Platform, https://api.sbgenomics.com/v2.

Find your billing group

To start an analysis, we must first create a project. To do this, we need two pieces of information: your authentication token and a billing group ID.

Your authentication token acts as a security measure so only you can access your projects and resources on the Platform. The billing group ID designates which funding resource to charge for the analyses you run in the project you're about to create. Learn more about billing groups on the Platform.

Use the API request to list your billing groups, as shown in the HTTP request below. Be sure to insert your authentication token for X-SBG-Auth-Token.

GET /v2/billing/groups HTTP/1.1
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74

This request returns a list of the billing groups you are part of, as shown below:

{
  "href": "https://api.sbgenomics.com/v2/billing/groups/",
  "items": [
    {
      "id": "ec1dc1e3-12a3-4b56-789c-e3f2dca0c6f7",
      "href": "https://api.sbgenomics.com/v2/billing/groups/ec1dc1e3-12a3-4b56-789c-e3f2dca0c6f7",
      "name": "RFranklin Billing Group"
    }
  ],
  "links": []
}

Copy the value for id (in this case ec1dc1e3-12a3-4b56-789c-e3f2dca0c6f7) to your clipboard. We will use this in the next step when creating a project.
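
As a rough illustration, the request above could be prepared from Python using only the standard library. This is a sketch rather than the official client: API_BASE, build_request, and billing_group_ids are names invented for this example, and the token is a placeholder.

```python
import json
import urllib.request

API_BASE = "https://api.sbgenomics.com/v2"  # AWS US base path used on this page


def build_request(path, token, method="GET", body=None):
    """Prepare (but do not send) an authenticated API request."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(API_BASE + path, data=data, method=method)
    request.add_header("X-SBG-Auth-Token", token)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


def billing_group_ids(response_body):
    """Collect the id of every billing group in a parsed response."""
    return [group["id"] for group in response_body["items"]]


request = build_request("/billing/groups", "<your-auth-token>")
# Sending it would look like:
#   with urllib.request.urlopen(request) as reply:
#       body = json.load(reply)
# Here we use a parsed copy of the sample response instead.
body = {"items": [{"id": "ec1dc1e3-12a3-4b56-789c-e3f2dca0c6f7",
                   "name": "RFranklin Billing Group"}]}
print(billing_group_ids(body))
```

Keeping request construction separate from sending makes it easy to inspect the URL and headers before anything goes over the network.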

Create a project

Projects are the core building blocks of the Platform. Each project corresponds to a distinct scientific investigation, serving as a container for its data, analysis tools, results, and team of collaborators.

To create a project, make the API request to create a new project, as shown in the HTTP request below. Be sure to paste in your authentication token for the X-SBG-Auth-Token key.

This request also requires a request body. Provide a name for your project and an optional description. Here, you should also paste in the billing_group id you obtained in the previous step.

POST /v2/projects HTTP/1.1
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74
Content-Type: application/json

{
    "name":"API Quickstart project",
    "description":"project for the API",
    "billing_group":"ec1dc1e3-12a3-4b56-789c-e3f2dca0c6f7"
}

You'll see a response body, as shown below, containing the name of your project, its URL (href), your project id, and your project's billing_group.

Note down the project id. We will use this throughout the tutorial to designate our project. The project id consists of two parts: your username followed by your project's short name.

{
  "href": "https://api.sbgenomics.com/v2/projects/rfranklin/api-quickstart-project",
  "id": "rfranklin/api-quickstart-project",
  "name": "API Quickstart project",
  "type": "v2",
  "description": "project for the API",
  "tags": [],
  "billing_group": "ec1dc1e3-12a3-4b56-789c-e3f2dca0c6f7"
}
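
For readers scripting this step, the request body and the two-part project id can be handled as in the sketch below. The helper names project_payload and split_project_id are hypothetical, and the values are the sample ones from this page.

```python
import json


def project_payload(name, billing_group, description=""):
    """Return the JSON body for a create-project request as a string."""
    return json.dumps({
        "name": name,
        "description": description,
        "billing_group": billing_group,
    })


def split_project_id(project_id):
    """Split 'username/short-name' into its two parts."""
    owner, short_name = project_id.split("/", 1)
    return owner, short_name


payload = project_payload(
    "API Quickstart project",
    "ec1dc1e3-12a3-4b56-789c-e3f2dca0c6f7",
    description="project for the API",
)
owner, short_name = split_project_id("rfranklin/api-quickstart-project")
print(payload)
print(owner, short_name)
```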

Now that you've successfully created a project, we can add data to the project for analysis.

Add files to your project

In this tutorial, we'll analyze publicly available data that is hosted in the Public Reference Files repository on the Seven Bridges Platform. This repository contains reference files, datasets, and other frequently-used genomic data that you might find useful.

For this analysis, we want to use two paired-end files that contain Whole Exome Sequencing data.

Find your files

To find these files, we will make the API request to list all files in a project, as shown below.

We'll need to pass along two query parameters to locate the files. First, we have to specify the project containing the files. In this case, the Seven Bridges Public Reference Files repository is specified in the same way as a project on the Platform by an id of admin/sbg-public-data. Following the path, you can pass this query parameter using project=admin/sbg-public-data.

Then, we want to filter the results by metadata. Learn more about the API keys for metadata fields on the Platform. In this instance, we want to find two files produced by an experimental strategy of Whole Exome Sequencing. We can append this query parameter to our previous parameter using an & followed by metadata.experimental_strategy=WXS.

The entire HTTP request is shown below:

GET /v2/files?project=admin/sbg-public-data&metadata.experimental_strategy=WXS HTTP/1.1
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74

In the response returned, you will see a list of files along with their id, name, and the project to which they belong, as shown below.

For this tutorial, let's choose the two paired-end files, C835.HCC1143_BL.4.converted.pe_1.fastq and C835.HCC1143_BL.4.converted.pe_2.fastq. Copy the id for each file to your clipboard. We will use these ids in the next step when copying the files.

{
  "href": "https://api.sbgenomics.com/v2/files?metadata.experimental_strategy=WXS&offset=0&limit=18&project=admin/sbg-public-data",
  "items": [
    {
      "href": "https://api.sbgenomics.com/v2/files/567890abc9b0307bc0414164",
      "id": "567890abc9b0307bc0414164",
      "name": "merged-normal.bam",
      "project": "admin/sbg-public-data"
    },
    {
      "href": "https://api.sbgenomics.com/v2/files/567890abc1e5339df0414123",
      "id": "567890abc1e5339df0414123",
      "name": "C835.HCC1143_BL.4.converted.pe_1.fastq",
      "project": "admin/sbg-public-data"
    },
    {
      "href": "https://api.sbgenomics.com/v2/files/567890abc4f3066bc3750174",
      "id": "567890abc4f3066bc3750174",
      "name": "C835.HCC1143_BL.4.converted.pe_2.fastq",
      "project": "admin/sbg-public-data"
    },
    <snip>
  ],
  "links": []
}

For brevity, we have omitted some of the returned files.
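
The query string for the file search can also be assembled programmatically, and the ids of the wanted files picked out of the parsed listing. The following is a minimal sketch using Python's standard library; ids_by_name is a hypothetical helper and the listing is a trimmed copy of the sample response.

```python
from urllib.parse import urlencode

# Query parameters described in the text; safe="/" leaves the slash in the
# project id unescaped so the path reads like the request on this page.
query = urlencode(
    {"project": "admin/sbg-public-data", "metadata.experimental_strategy": "WXS"},
    safe="/",
)
path = "/v2/files?" + query


def ids_by_name(response_body, wanted_names):
    """Map each wanted file name to its id, using a parsed file listing."""
    return {item["name"]: item["id"]
            for item in response_body["items"]
            if item["name"] in wanted_names}


listing = {"items": [
    {"id": "567890abc9b0307bc0414164", "name": "merged-normal.bam"},
    {"id": "567890abc1e5339df0414123",
     "name": "C835.HCC1143_BL.4.converted.pe_1.fastq"},
    {"id": "567890abc4f3066bc3750174",
     "name": "C835.HCC1143_BL.4.converted.pe_2.fastq"},
]}
fastq_ids = ids_by_name(listing, {
    "C835.HCC1143_BL.4.converted.pe_1.fastq",
    "C835.HCC1143_BL.4.converted.pe_2.fastq",
})
print(path)
print(fastq_ids)
```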

Once we've obtained the file ids, we can copy these files into our project.

Copy files to a project

To copy files into a project, make the API request to batch copy files, as shown below.

In the body of the request, you can specify the target project for the copied files. In this tutorial, we want to copy the files to the project we created above, whose project id consists of your username followed by the project's short name. In the example request below, we've input the project id rfranklin/api-quickstart-project as the relevant value for the project key.

We also want to pass along the file ids you obtained in the step above in the body of the request. We can input the ids as a list of values for the file_ids key, as shown below.

POST /v2/action/files/copy HTTP/1.1
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74
Content-Type: application/json

{
    "project":"rfranklin/api-quickstart-project",
    "file_ids": ["567890abc1e5339df0414123","567890abc4f3066bc3750174"]
}

The response body, as shown below, indicates whether your request was successful. It is keyed by the original id of each copied file and reports a status for each one.

The response body also contains two other fields for each file: new_file_id and new_file_name. These indicate the new id and the new name assigned to the copy of the file within your project. You can use this id in future API requests to refer to the copy of the file within your project as opposed to the original file.

{
  "567890abc1e5339df0414123": {
    "status": "OK",
    "new_file_id": "567890abc4b08370afe7e21f",
    "new_file_name": "C835.HCC1143_BL.4.converted.pe_1.fastq"
  },
  "567890abc4f3066bc3750174": {
    "status": "OK",
    "new_file_id": "567890abc4b08370afe7e221",
    "new_file_name": "C835.HCC1143_BL.4.converted.pe_2.fastq"
  }
}
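
Because later requests need the new ids rather than the original ones, it can help to turn the copy response into a lookup table. The sketch below assumes only the response shape shown on this page; new_ids is a hypothetical helper, the ids are made up, and "NOT-OK" merely stands in for whatever status a failed copy reports.

```python
def new_ids(copy_response):
    """Split a batch-copy response into {original id: new id} and failures."""
    copied, failed = {}, []
    for original_id, result in copy_response.items():
        if result.get("status") == "OK":
            copied[original_id] = result["new_file_id"]
        else:
            failed.append(original_id)
    return copied, failed


# Hypothetical ids and names, for illustration only.
response = {
    "original-id-1": {"status": "OK", "new_file_id": "new-id-1",
                      "new_file_name": "sample.pe_1.fastq"},
    "original-id-2": {"status": "NOT-OK"},
}
copied, failed = new_ids(response)
print(copied, failed)
```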

The files have been successfully copied to your project. However, before we can use the files in an analysis, we should annotate them with metadata.

Modify file metadata

Metadata makes files easier to manage. File metadata covers three categories: File (e.g. experimental strategy and library ID), Sample (e.g. sample ID), and General (e.g. investigation and species). For more information on the metadata fields used on the Platform, please see the documentation on file metadata. On that page, you can also obtain the keys used to identify metadata in API requests.

We will set the platform_unit_id metadata field to 1 in this tutorial. This metadata will inform tools that these files come from the same sample, were produced by the same library, and have been sequenced on the same lane.

To change a file's metadata, make the API request to modify a file's metadata, as shown below. Note that we can only modify one file's metadata at a time. You can pass the file id in the path of the request in the format of https://api.sbgenomics.com/v2/files/{file_id}/metadata. Be sure you use the new_file_id from the step above. This ensures you modify the metadata for the file in your project.

Enter the metadata field you wish to modify in the body of the request.

PATCH /v2/files/567890abc4b08370afe7e21f/metadata HTTP/1.1
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74
Content-Type: application/json

{
  "platform_unit_id": "1"
}

In the response body, you'll see the file's metadata. The metadata field we just changed, platform_unit_id, is listed as a metadata field in the response.

{
  "platform_unit_id": "1",
  "reference_genome": "HG19_Broad_variant",
  "species": "Homo sapiens",
  "sample_id": "HCC1143BL",
  "case_id": "CCLE-HCC1143BL",
  "investigation": "CCLE-BRCA",
  "paired_end": "1",
  "sample_type": "EBV Immortalized Normal",
  "experimental_strategy": "WXS",
  "platform": "Illumina"
}

Now that we've modified the metadata for the first file we copied, we need to repeat this process for the second file.

Adding metadata for multiple files

While this process is manageable for a smaller number of files, it grows cumbersome if you are processing numerous files. In this case, you can use a programming language like Python, R, or bash to write a for loop around this process.

Our API Quickstart tutorial using the sevenbridges-python library introduces this method.
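
As a minimal sketch of such a loop, the snippet below prepares one PATCH request per file with Python's standard library. It is not the sevenbridges-python approach; metadata_requests is a hypothetical helper, and the file ids and token are placeholders.

```python
import json
import urllib.request

API_BASE = "https://api.sbgenomics.com/v2"


def metadata_requests(file_ids, metadata, token):
    """Build one PATCH request per file; the endpoint takes a single file id."""
    body = json.dumps(metadata).encode("utf-8")
    requests = []
    for file_id in file_ids:
        request = urllib.request.Request(
            f"{API_BASE}/files/{file_id}/metadata", data=body, method="PATCH")
        request.add_header("X-SBG-Auth-Token", token)
        request.add_header("Content-Type", "application/json")
        requests.append(request)
    return requests


# Placeholder ids; use the new_file_id values of the files in your project.
reqs = metadata_requests(["new-id-1", "new-id-2"],
                         {"platform_unit_id": "1"}, "<your-auth-token>")
for req in reqs:
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req) would send the request.
```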

Next, we will add reference files to our project.

Add reference files to your project

Many bioinformatics tools require certain data, such as reference genomes or annotation files, to execute properly. The Whole Exome Sequencing GATK 2.3.9.-lite workflow uses a reference genome and annotation files (known indels and SNPs) to map Exome-seq reads. We'll need to have these reference files in our Quickstart project to be able to use them while setting up our task.

For this analysis, we need to supply the workflow with the following six reference files:

  • dbsnp_137.b37.vcf
  • 1000G_phase1.indels.b37.vcf
  • Mills_and_1000G_gold_standard.indels.b37.sites.vcf
  • exome_targets.b37.bed
  • snpEff_v3_6_GRCh37.75.zip
  • human_g1k_v37_decoy.fasta

See the table below for more information about each file.

| API key | Input files | File type |
|---|---|---|
| Known_SNPs | dbsnp_137.b37.vcf | VCF files contain databases of the known genetic variants - SNPs and indels. |
| Known_Indels | Mills_and_1000G_gold_standard.indels.b37.sites.vcf, 1000G_phase1.indels.b37.vcf | VCF files contain databases of the known genetic variants - SNPs and indels. |
| Target_BED | exome_targets.b37.bed | BED files contain all target regions which are relevant for our analysis - in this case exomes. It points to the relevant locations of the FASTA file we are using for the analysis. |
| database | snpEff_v3_6_GRCh37.75.zip | ZIP file (snpEff) is a specific build of the snpEff database which contains annotations of the genetic variants and their supposed effects. |
| input_tar_with_reference | human_g1k_v37_decoy.fasta | FASTA file is a reference genome which we will use for the alignment of the FASTQ files. |

Find reference files

To find these files, we will make the API request to list all files in a project, as shown below.

This process is similar to finding the data files above: in this query, we'll need to pass along the query parameter that designates the Public Reference Files repository, project=admin/sbg-public-data.

Then, we want to filter the results by the name query parameter. We can append this query parameter to our previous parameter using an & followed by the name of the file, such as dbsnp_137.b37.vcf.

You can search for multiple files by their names with the same API request by including the field name multiple times. When filtering on any resource, including the same field several times with different filtering criteria results in an implicit OR operation for that field and the different criteria.

The entire HTTP request is shown below:

GET /v2/files?project=admin/sbg-public-data&name=dbsnp_137.b37.vcf&name=1000G_phase1.indels.b37.vcf&name=Mills_and_1000G_gold_standard.indels.b37.sites.vcf&name=exome_targets.b37.bed&name=snpEff_v3_6_GRCh37.75.zip&name=human_g1k_v37_decoy.fasta HTTP/1.1 
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74

In the response, you will see information about each file as well as the file's id. Copy each id (for example, 567890abc07c1752674486d8) to your clipboard. We will use these ids in the next step.

{
  "href": "https://api.sbgenomics.com/v2/files?offset=0&name=dbsnp_137.b37.vcf&name=1000G_phase1.indels.b37.vcf&name=Mills_and_1000G_gold_standard.indels.b37.sites.vcf&name=exome_targets.b37.bed&name=snpEff_v3_6_GRCh37.75.zip&name=human_g1k_v37_decoy.fasta&limit=6&project=admin/sbg-public-data",
  "items": [
    {
      "href": "https://api.sbgenomics.com/v2/files/567890abc07c1752674486d8",
      "id": "567890abc07c1752674486d8",
      "name": "dbsnp_137.b37.vcf",
      "project": "admin/sbg-public-data"
    },
    {
      "href": "https://api.sbgenomics.com/v2/files/567890abc07c1752674486e6",
      "id": "567890abc07c1752674486e6",
      "name": "human_g1k_v37_decoy.fasta",
      "project": "admin/sbg-public-data"
    },
    {
      "href": "https://api.sbgenomics.com/v2/files/567890abc07c17681a3117dc",
      "id": "567890abc07c17681a3117dc",
      "name": "exome_targets.b37.bed",
      "project": "admin/sbg-public-data"
    },
    {
      "href": "https://api.sbgenomics.com/v2/files/567890abc07c17681a3117ce",
      "id": "567890abc07c17681a3117ce",
      "name": "1000G_phase1.indels.b37.vcf",
      "project": "admin/sbg-public-data"
    },
    {
      "href": "https://api.sbgenomics.com/v2/files/567890abc07c1752674486d4",
      "id": "567890abc07c1752674486d4",
      "name": "Mills_and_1000G_gold_standard.indels.b37.sites.vcf",
      "project": "admin/sbg-public-data"
    },
    {
      "href": "https://api.sbgenomics.com/v2/files/567890abc07c1752674486ed",
      "id": "567890abc07c1752674486ed",
      "name": "snpEff_v3_6_GRCh37.75.zip",
      "project": "admin/sbg-public-data"
    }
  ],
  "links": []
}

Pro-tip

To display only the id and name fields in the response, you can specify fields as a query parameter by using fields=id,name.
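
Repeating the name parameter by hand is error-prone with six files, so the query can be generated from a list instead. The sketch below uses urlencode with a list of pairs, which allows a key to appear several times; REFERENCE_NAMES and missing_names are names chosen for this example, and the sample listing is deliberately incomplete to show the check.

```python
from urllib.parse import urlencode

REFERENCE_NAMES = [
    "dbsnp_137.b37.vcf",
    "1000G_phase1.indels.b37.vcf",
    "Mills_and_1000G_gold_standard.indels.b37.sites.vcf",
    "exome_targets.b37.bed",
    "snpEff_v3_6_GRCh37.75.zip",
    "human_g1k_v37_decoy.fasta",
]

# A list of (key, value) pairs lets the same key appear more than once.
params = [("project", "admin/sbg-public-data")]
params += [("name", name) for name in REFERENCE_NAMES]
params.append(("fields", "id,name"))
path = "/v2/files?" + urlencode(params, safe="/,")


def missing_names(response_body, wanted_names):
    """Return the wanted names that the listing did not contain."""
    found = {item["name"] for item in response_body["items"]}
    return sorted(set(wanted_names) - found)


# Trimmed, partial listing used only to exercise the check.
listing = {"items": [
    {"id": "567890abc07c1752674486d8", "name": "dbsnp_137.b37.vcf"},
    {"id": "567890abc07c17681a3117dc", "name": "exome_targets.b37.bed"},
]}
print(path)
print(missing_names(listing, REFERENCE_NAMES))
```

Note that the response does not necessarily list files in the order they were requested, so matching on name is safer than relying on position.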

Copy reference files to your project

To copy files into a project, make the API request to batch copy files, as shown below. This is the same method we used to copy our data files into our project.

In the body of the request, you can specify the target project for the copied files, such as rfranklin/api-quickstart-project, as a value for the project key.

We also want to pass along the file ids you obtained in the step above in the body of the request. We can input the ids as a list of values for the file_ids key, as shown below.

POST /v2/action/files/copy HTTP/1.1
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74
Content-Type: application/json

{
    "project":"rfranklin/api-quickstart-project",
    "file_ids": [
        "567890abc07c1752674486d8",
        "567890abc07c17681a3117ce",
        "567890abc07c1752674486d4",
        "567890abc07c17681a3117dc",
        "567890abc07c1752674486ed",
        "567890abc07c1752674486e6"
    ]
}

The response body contains the new_file_id for each of the copied reference files. This is the new id assigned to the copy of the file within your project. You can use this id in future API requests to refer to the copy of the file within your project as opposed to the original file.

{
  "567890abc07c1752674486ed": {
    "status": "OK",
    "new_file_id": "567890abc4b08370afe7e541",
    "new_file_name": "snpEff_v3_6_GRCh37.75.zip"
  },
  "567890abc07c1752674486d8": {
    "status": "OK",
    "new_file_id": "567890abc4b08370afe7e539",
    "new_file_name": "dbsnp_137.b37.vcf"
  },
  "567890abc07c1752674486e6": {
    "status": "OK",
    "new_file_id": "567890abc4b08370afe7e543",
    "new_file_name": "human_g1k_v37_decoy.fasta"
  },
  "567890abc07c17681a3117dc": {
    "status": "OK",
    "new_file_id": "567890abc4b08370afe7e53f",
    "new_file_name": "exome_targets.b37.bed"
  },
  "567890abc07c17681a3117ce": {
    "status": "OK",
    "new_file_id": "567890abc4b08370afe7e53b",
    "new_file_name": "1000G_phase1.indels.b37.vcf"
  },
  "567890abc07c1752674486d4": {
    "status": "OK",
    "new_file_id": "567890abc4b08370afe7e53d",
    "new_file_name": "Mills_and_1000G_gold_standard.indels.b37.sites.vcf"
  }
}

We have populated our project with the requisite data and reference files. Now, we can add a workflow to our project.

Add a public workflow to your project

We'll use the workflow, Whole Exome Sequencing GATK 2.3.9.-lite, which is based on the free version of the GATK tool developed by the Broad Institute.

This workflow is one of Seven Bridges' many open source workflows available to all users on the Platform. These workflows have been tested to run efficiently in the cloud environment by the Seven Bridges bioinformatics team.

Find a public workflow

To find a public workflow on the Platform, make the API request to list all apps (i.e. tools and workflows) available to you, as shown below.

You can filter for public workflows by adding the parameter visibility=public. Since this will return many results, we want to see as many results as we can on one page. To set the pagination, we use the query parameter limit=100 to display 100 results per page. The maximum allowable limit per page is 100.

GET /v2/apps?visibility=public&limit=100 HTTP/1.1
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74

This query returns the following response:

{
  "href": "https://api.sbgenomics.com/v2/apps?visibility=public&offset=0&limit=100",
  "items": [
    {
      "href": "https://api.sbgenomics.com/v2/apps/admin/sbg-public-data/sbg-split-bed/3",
      "id": "admin/sbg-public-data/sbg-split-bed/3",
      "project": "admin/sbg-public-data",
      "name": "SBG Split BED"
    },
    {
      "href": "https://api.sbgenomics.com/v2/apps/admin/sbg-public-data/sbg-untar-fasta/8",
      "id": "admin/sbg-public-data/sbg-untar-fasta/8",
      "project": "admin/sbg-public-data",
      "name": "SBG Untar fasta"
    },
    <snip>
  ],
  "links": [
    {
      "href": "https://api.sbgenomics.com/v2/apps?visibility=public&offset=100&limit=100",
      "rel": "next",
      "method": "GET"
    }
  ]
}

For brevity, we have omitted some of the returned apps.

Scrolling through this list of apps, you'll see that Whole Exome Sequencing GATK 2.3.9.-lite isn't among the first 100 results. To page through to the next 100 results, use the query parameter offset=100, which starts the listing from the 101st result, as shown below.

GET /v2/apps?visibility=public&limit=100&offset=100 HTTP/1.1
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74

You'll see the following response. Locate and copy the id of the Whole Exome Sequencing GATK 2.3.9.-lite workflow. We'll use this in the next step.

{
  "href": "https://api.sbgenomics.com/v2/apps?visibility=public&offset=100&limit=100",
  "items": [
    {
      "href": "https://api.sbgenomics.com/v2/apps/admin/sbg-public-data/chimera-1-12-0/4",
      "id": "admin/sbg-public-data/chimera-1-12-0/4",
      "project": "admin/sbg-public-data",
      "name": "Chimera"
    },
    <snip>
    {
      "href": "https://api.sbgenomics.com/v2/apps/admin/sbg-public-data/whole-exome-sequencing-gatk-2-3-9-lite/56",
      "id": "admin/sbg-public-data/whole-exome-sequencing-gatk-2-3-9-lite/56",
      "project": "admin/sbg-public-data",
      "name": "Whole Exome Sequencing GATK 2.3.9.-lite"
    },
    <snip>
  ],
  "links": [
    {
      "href": "https://api.sbgenomics.com/v2/apps?visibility=public&offset=0&limit=100",
      "rel": "prev",
      "method": "GET"
    }
  ]
}
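
Instead of adjusting offset by hand, a script can follow the "next" link that each listing includes until the app is found. The sketch below assumes only the links structure shown in the responses on this page; next_href, iter_items, and find_app are hypothetical helpers, and fetch stands for any function that takes a URL and returns the parsed JSON body. Two tiny fake pages replace real network calls.

```python
def next_href(page):
    """Return the href of the 'next' link in a listing, or None on the last page."""
    for link in page.get("links", []):
        if link.get("rel") == "next":
            return link["href"]
    return None


def iter_items(fetch, first_href):
    """Yield the items of every page, following 'next' links in turn."""
    href = first_href
    while href is not None:
        page = fetch(href)
        yield from page["items"]
        href = next_href(page)


def find_app(fetch, first_href, name):
    """Return the id of the first app with the given name, or None."""
    for app in iter_items(fetch, first_href):
        if app["name"] == name:
            return app["id"]
    return None


# Fake pages keyed by URL, standing in for real API responses.
pages = {
    "page-1": {"items": [{"id": "admin/sbg-public-data/sbg-split-bed/3",
                          "name": "SBG Split BED"}],
               "links": [{"rel": "next", "href": "page-2", "method": "GET"}]},
    "page-2": {"items": [{"id": "admin/sbg-public-data/whole-exome-sequencing-gatk-2-3-9-lite/56",
                          "name": "Whole Exome Sequencing GATK 2.3.9.-lite"}],
               "links": [{"rel": "prev", "href": "page-1", "method": "GET"}]},
}
app_id = find_app(pages.__getitem__, "page-1",
                  "Whole Exome Sequencing GATK 2.3.9.-lite")
print(app_id)
```

Because iter_items is a generator, the search stops requesting pages as soon as the app is found.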

We've located the id for the Whole Exome workflow, and now we can copy it into our project.

Copy a public app into a project

We can use the id we obtained above to copy the workflow into our project.

To copy a workflow, make the API request to copy an app, as shown below. Be sure to pass the app's id in the path of the request.

In the body of the request, include the id of the project you want to copy the public app into, such as rfranklin/api-quickstart-project.

POST /v2/apps/admin/sbg-public-data/whole-exome-sequencing-gatk-2-3-9-lite/56/actions/copy HTTP/1.1
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74
Content-Type: application/json

{
    "project":"rfranklin/api-quickstart-project"
}

This call returns the name and the id of the app within your project. Copy this id as we'll need it in the next step.

The response body also contains the full Common Workflow Language description of the copied app. This is typically a lengthy JSON object (raw), which we have omitted in part below for brevity. Keep this JSON handy for the next step, as it includes information about setting up inputs for the workflow.

{
  "href": "https://api.sbgenomics.com/v2/apps/rfranklin/api-quickstart-project/Whole Exome Sequencing GATK 2.3.9.-lite/0",
  "id": "rfranklin/api-quickstart-project/whole-exome-sequencing-gatk-2-3-9-lite/0",
  "project": "rfranklin/api-quickstart-project",
  "name": "Whole Exome Sequencing GATK 2.3.9.-lite",
  "revision": 0,
  <snip>
  "inputs": [
      {
        "sbg:suggestedValue": [
          {
            "path": "567890abc07c17681a3117ce",
            "class": "File",
            "name": "1000G_phase1.indels.b37.vcf"
          },
          {
            "path": "567890abc07c1752674486d4",
            "class": "File",
            "name": "Mills_and_1000G_gold_standard.indels.b37.sites.vcf"
          }
        ],
        "sbg:y": 938.3340420723234,
        "sbg:x": 276.4287550222375,
        "sbg:includeInPorts": true,
        "label": "Known Indels",
        "type": [
          "null",
          {
            "type": "array",
            "items": "File"
          }
        ],
        "id": "#Known_Indels"
      }
    ]
<snip>
}

If you ever need to obtain the CWL description of an app, you can make the API request to get details of an app.
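
To see at a glance which inputs a workflow expects, the inputs list of the CWL description can be summarized. This sketch is based only on the excerpt shown on this page: it assumes each input has an id beginning with "#", an optional label, and a type that may include an array of files. input_summary is a hypothetical helper; pass it whichever object in the response holds the inputs list.

```python
def input_summary(cwl):
    """List (input key, label, accepts a list of files) for each app input."""
    summary = []
    for port in cwl.get("inputs", []):
        key = port["id"].lstrip("#")  # "#Known_Indels" -> "Known_Indels"
        types = port["type"] if isinstance(port["type"], list) else [port["type"]]
        is_list = any(isinstance(t, dict) and t.get("type") == "array"
                      for t in types)
        summary.append((key, port.get("label"), is_list))
    return summary


# Trimmed copy of the input shown in the sample response.
cwl = {"inputs": [{
    "label": "Known Indels",
    "type": ["null", {"type": "array", "items": "File"}],
    "id": "#Known_Indels",
}]}
print(input_summary(cwl))
```

An input that accepts a list is supplied as a JSON array of file objects in the task body, while a single-file input is supplied as one object.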

We're now ready to set up a task.

Create a draft task

An app execution is called a task. Each task is associated with a set of input files and chosen settings for the tool(s) in the app. The first step to executing a task is to set up a draft task. In this step, you specify the inputs for your task.

To set up a draft task, make the API request to create a draft task, as shown below.

In the body of the request, provide a name for your task. Then, specify the workflow you're running by including its id, which we obtained in the step above. Referencing the inputs in the step above, add the inputs for your workflow, including your reference and data files.

POST /v2/tasks HTTP/1.1
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74
Content-Type: application/json

{
  "description": "an api task to demonstrate the quickstart",
  "name": "api task WES",
  "app": "rfranklin/api-quickstart-project/whole-exome-sequencing-gatk-2-3-9-lite/0",
  "project": "rfranklin/api-quickstart-project",
  "inputs": {
    "Known_SNPs": [
      {
        "path": "567890abc4b08370afe7e539",
        "class": "File",
        "name": "dbsnp_137.b37.vcf"
      }
    ],
    "Known_Indels": [
      {
        "path": "567890abc4b08370afe7e53b",
        "class": "File",
        "name": "1000G_phase1.indels.b37.vcf"
      },
      {
        "path": "567890abc4b08370afe7e53d",
        "class": "File",
        "name": "Mills_and_1000G_gold_standard.indels.b37.sites.vcf"
      }
    ],
    "database": {
      "path": "567890abc4b08370afe7e541",
      "class": "File",
      "name": "snpEff_v3_6_GRCh37.75.zip"
    },
    "Target_BED": {
      "path": "567890abc4b08370afe7e53f",
      "class": "File",
      "name": "exome_targets.b37.bed"
    },
    "FASTQ": [
      {
        "path": "567890abc4b08370afe7e21f",
        "class": "File",
        "name": "C835.HCC1143_BL.4.converted.pe_1.fastq"
      },
      {
        "path": "567890abc4b08370afe7e221",
        "class": "File",
        "name": "C835.HCC1143_BL.4.converted.pe_2.fastq"
      }
    ],
    "input_tar_with_reference": {
      "path": "567890abc4b08370afe7e543",
      "class": "File",
      "name": "human_g1k_v37_decoy.fasta"
    }
  }
}

The response body will indicate if your draft task was successfully created. You'll also see the id for your draft task. Copy this to your clipboard, as we'll use it in the next step.

Note that you'll also see error messages if you've made a mistake in entering your inputs.
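
Typing the inputs object by hand is where most mistakes creep in, so it may be worth generating it. The following sketch builds part of the request body from file ids and names; file_ref and draft_task_body are hypothetical helpers, and only two of the workflow's inputs are filled in to keep the example short.

```python
import json


def file_ref(file_id, name):
    """Describe one project file in the form used for task inputs."""
    return {"class": "File", "path": file_id, "name": name}


def draft_task_body(name, app, project, inputs, description=""):
    """Assemble the body of a create-draft-task request."""
    return {
        "name": name,
        "description": description,
        "app": app,
        "project": project,
        "inputs": inputs,
    }


inputs = {
    # An input that takes a list of files is given as a list of references.
    "Known_SNPs": [file_ref("567890abc4b08370afe7e539", "dbsnp_137.b37.vcf")],
    # A single-file input is given as one reference.
    "Target_BED": file_ref("567890abc4b08370afe7e53f", "exome_targets.b37.bed"),
}
body = draft_task_body(
    "api task WES",
    "rfranklin/api-quickstart-project/whole-exome-sequencing-gatk-2-3-9-lite/0",
    "rfranklin/api-quickstart-project",
    inputs,
    description="an api task to demonstrate the quickstart",
)
print(json.dumps(body, indent=2))
```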

{
  "href": "https://api.sbgenomics.com/v2/tasks/48f79ccf-12b3-45b6-789c-b1e8d88dabcd",
  "id": "48f79ccf-12b3-45b6-789c-b1e8d88dabcd",
  "name": "api task WES",
  "description": "an api task to demonstrate the quickstart",
  "status": "DRAFT",
  "project": "rfranklin/api-quickstart-project",
  "app": "rfranklin/api-quickstart-project/whole-exome-sequencing-gatk-2-3-9-lite/0",
  "type": "v2",
  "created_by": "rfranklin",
  "start_time": "2016-08-09T15:50:51Z",
  "batch": false,
<snip>
}

Now, we're ready to run the task.

Run a task

To run a task on the Platform, you'll need your draft task's id, obtained in the step above. Then, make the API request to run a task, as shown below.

POST /v2/tasks/48f79ccf-12b3-45b6-789c-b1e8d88dabcd/actions/run HTTP/1.1
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74

Your response body will contain information about your task as well as its status. When your task is first started, the status displays as QUEUED. Learn more about what happens when you run a task.

{
  "href": "https://api.sbgenomics.com/v2/tasks/48f79ccf-12b3-45b6-789c-b1e8d88dabcd",
  "id": "48f79ccf-12b3-45b6-789c-b1e8d88dabcd",
  "name": "api task WES",
  "description": "an api task to demonstrate the quickstart",
  "status": "QUEUED",
  "project": "rfranklin/api-quickstart-project",
  "app": "rfranklin/api-quickstart-project/whole-exome-sequencing-gatk-2-3-9-lite/0",
  "type": "v2",
  "created_by": "rfranklin",
  "executed_by": "rfranklin",
  "start_time": "2016-08-09T15:17:01Z",
  "batch": false,
  "execution_status": {
    "message": "In queue"
  },
  "errors": [],
  "warnings": [],
  "inputs": {
    "Known_SNPs": [
      {
        "path": "567890abc4b08370afe7e539",
        "class": "File",
        "name": "dbsnp_137.b37.vcf",
        "size": 8436107210
      }
    ],
    <snip>
  },
  "outputs": {
    "summary": null,
    "summary_metrics": null,
    "raw_vcf": null,
    "summary_text": null,
    "plot_pdf": null,
    "recalibrated_bam": null,
    "b64html": null,
    "annotated": null
  }
}

Check the status of your task

You can check the status of your task by making the API request to get execution details, as shown below. Be sure to include your task's id in the path.

GET /v2/tasks/48f79ccf-12b3-45b6-789c-b1e8d88dabcd/execution_details HTTP/1.1
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74

The response body, as shown below, details your task's status, such as RUNNING, COMPLETED, or FAILED. It also lists the individual jobs that make up the task, each marked with its own status, such as RUNNING or COMPLETED.

{
  "href": "https://api.sbgenomics.com/v2/tasks/48f79ccf-12b3-45b6-789c-b1e8d88dabcd/execution_details",
  "start_time": "2016-08-09T16:06:15Z",
  "status": "RUNNING",
  "message": "Running",
  "jobs": [
    {
      "name": "SBG_Html2b64_1_s",
      "start_time": "2016-08-09T16:15:08Z",
      "end_time": "2016-08-09T16:15:40Z",
      "status": "COMPLETED",
      "command_line": "python /opt/sbg_html_to_b64.py --input /sbgenomics/Projects/351df000-e176-48e9-9f10-8de87fe5ef3d/66a9a568-d8c5-4c43-b54c-361ee4f8d6b7/whole-exome-sequencing-gatk-2-3-9-lite_FastQC_1_s/C835.HCC1143_BL.4.converted.pe_2_fastqc.zip",
      "instance": {
        "id": "i-da0a2044",
        "type": "c3.2xlarge",
        "provider": "AWS"
      },
      "logs": {
        "cmd.log": "https://api.sbgenomics.com/v2/files/567890abc9b0307bc0414164/download_info",
        "job.err.log": "https://api.sbgenomics.com/v2/files/567890abc4f3066bc3750174/download_info",
        "stderr": "https://api.sbgenomics.com/v2/files/567890abc8a5639cc6722063/download_info"
      },
      "docker": {
        "checksum": "a2852b992f9b7673a151aa25fe98e1ae4f18703fb53b177b59a42dfef011c340"
      }
    },
    <snip>
    {
      "name": "BWA_INDEX",
      "start_time": "2016-08-09T16:10:56Z",
      "status": "RUNNING",
      "command_line": "/opt/bwa-0.7.13/bwa index human_g1k_v37_decoy.fasta ; tar -cf human_g1k_v37_decoy.fasta.tar human_g1k_v37_decoy.fasta *.amb *.ann *.bwt *.pac *.sa",
      "instance": {
        "id": "i-da0a2044",
        "type": "c3.2xlarge",
        "provider": "AWS"
      },
      "logs": {
        "stderr": null
      },
      "docker": {
        "checksum": "4b06d40825c9c2f2ffe3beba9b8f6267c2c4543984915593deac1a003f7b8e5b"
      }
    }
  ]
}

You'll be notified by email once your task has completed.
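
If you would rather have a script wait for the task than watch for the email, the execution details can be polled. The sketch below treats QUEUED and RUNNING as "still in progress" and anything else as finished, which is an assumption based only on the statuses mentioned on this page. wait_for_task and job_status_counts are hypothetical helpers, and fetch_details stands for any function that returns the parsed execution details; canned replies are used here in place of real requests.

```python
import time
from collections import Counter

IN_PROGRESS = {"QUEUED", "RUNNING"}


def job_status_counts(details):
    """Count the jobs of a task by status."""
    return Counter(job["status"] for job in details.get("jobs", []))


def wait_for_task(fetch_details, interval=60, max_polls=1000, sleep=time.sleep):
    """Poll until the task leaves the in-progress statuses, then return the details."""
    for _ in range(max_polls):
        details = fetch_details()
        if details["status"] not in IN_PROGRESS:
            return details
        sleep(interval)
    raise TimeoutError("task still in progress after max_polls checks")


# Canned replies standing in for successive API responses.
replies = iter([
    {"status": "RUNNING",
     "jobs": [{"status": "COMPLETED"}, {"status": "RUNNING"}]},
    {"status": "COMPLETED",
     "jobs": [{"status": "COMPLETED"}, {"status": "COMPLETED"}]},
])
final = wait_for_task(lambda: next(replies), sleep=lambda seconds: None)
print(final["status"], dict(job_status_counts(final)))
```

Passing sleep as a parameter keeps the loop testable without real waiting; in normal use the default of time.sleep applies.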

Get task outputs

Once your task has completed, you can get your task outputs by making the API request to get details of a task, as shown below. This call differs from the one to get execution details in that it provides details about the task rather than the process of its execution.

We can add the parameter fields=outputs to filter the response body to only display the outputs.

GET /v2/tasks/48f79ccf-12b3-45b6-789c-b1e8d88dabcd?fields=outputs HTTP/1.1
Host: api.sbgenomics.com
X-SBG-Auth-Token: 3210a98c1db9304ea9d9873156740f74

The response body returns the outputs of your task, including file ids (path) in case you wish to perform further analyses on these files.

{
  "outputs": {
    "summary_metrics": {
      "path": "567890abc4b051911c736364",
      "name": "C835.HCC1143.2.converted.realigned.base_recalibrated.summary_metrics.txt",
      "secondaryFiles": [],
      "class": "File"
    },
    "raw_vcf": {
      "path": "567890abc4b07c1599e31c9c",
      "name": "C835.HCC1143.2.converted.realigned.base_recalibrated.vcf",
      "secondaryFiles": [
        {
          "path": "567890abc4b051911c73e01d",
          "size": 998322,
          "name": "C835.HCC1143.2.converted.realigned.base_recalibrated.vcf.idx",
          "class": "File"
        }
      ],
      "class": "File"
    },
    "summary_text": {
      "path": "567890abc4b07c1599e31d2d",
      "name": "Sample_HCC1143.Library_Unknown.Platform_Unit_Unknown.combined.snpEff_summary.genes.txt",
      "class": "File"
    },
    "plot_pdf": {
      "path": "567890abc4b051911c72647b",
      "name": "C835.HCC1143.2.converted.realigned.pdf",
      "class": "File"
    },
    "recalibrated_bam": {
      "path": "567890abc4b0788896befc0b",
      "name": "C835.HCC1143.2.converted.realigned.base_recalibrated.bam",
      "secondaryFiles": [
        {
          "path": "567890abc4b0788896befc0d",
          "size": 5605128,
          "name": "C835.HCC1143.2.converted.realigned.base_recalibrated.bam.bai",
          "class": "File"
        }
      ],
      "class": "File"
    },
    "b64html": [
      {
        "path": "567890abc4b0c889884fddf2",
        "name": "_1_C835.HCC1143.2.converted.pe_1_fastqc.b64html",
        "class": "File"
      },
      {
        "path": "567890abc4b09e5c738a9cd7",
        "name": "_1_C835.HCC1143.2.converted.pe_2_fastqc.b64html",
        "class": "File"
      }
    ],
    "b64html_1": {
      "path": "567890abc4b051911c73e0f8",
      "name": "Sample_HCC1143.Library_Unknown.Platform_Unit_Unknown.combined.snpEff_summary.b64html",
      "class": "File"
    },
    "annotated": {
      "path": "567890abc4b0e14f48b05ac9",
      "name": "Sample_HCC1143.Library_Unknown.Platform_Unit_Unknown.combined.snpEff_annotated.vcf",
      "class": "File"
    }
  }
}
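
The outputs object mixes single files, lists of files, and secondary files, so a small helper can flatten it into one list of names and ids for further processing. This sketch relies only on the structure of the sample response; output_files is a hypothetical helper, and it also skips outputs that are null, as they are before the task has finished.

```python
def output_files(outputs):
    """Yield (output key, file name, file id) for every file in a task's outputs."""
    for key, value in outputs.items():
        if value is None:  # output not produced (yet)
            continue
        entries = value if isinstance(value, list) else [value]
        for entry in entries:
            yield key, entry["name"], entry["path"]
            for secondary in entry.get("secondaryFiles") or []:
                yield key, secondary["name"], secondary["path"]


# Trimmed sample with made-up names and ids, covering each shape of value.
outputs = {
    "raw_vcf": {"path": "id-vcf", "name": "sample.vcf", "class": "File",
                "secondaryFiles": [{"path": "id-idx", "name": "sample.vcf.idx",
                                    "class": "File"}]},
    "b64html": [{"path": "id-h1", "name": "pe_1_fastqc.b64html", "class": "File"},
                {"path": "id-h2", "name": "pe_2_fastqc.b64html", "class": "File"}],
    "summary": None,
}
files = list(output_files(outputs))
for key, name, file_id in files:
    print(key, name, file_id)
```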

That's it! We've executed a data analysis and obtained some results.