The Seven Bridges Knowledge Center


SGDP data

Overview

The Simons Genome Diversity Project (SGDP) dataset is made possible by the Simons Foundation. The dataset contains complete genome sequences from more than one hundred diverse human populations. It is the largest dataset of diverse, high-quality human genome sequences ever reported. To represent as much anthropological, linguistic, and cultural diversity as possible, the dataset includes many deeply divergent human populations that are not well represented in other datasets.

Distribution of the data

The SGDP public project contains Open Access whole genome sequencing data for 279 samples.

By geographic region, the SGDP dataset comprises 44 Africans, 22 Native Americans, 27 Central Asians or Siberians, 47 East Asians, 25 Oceanians, 39 South Asians and 75 West Eurasians.
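
As a quick consistency check, the regional counts listed above can be summed and compared with the 279 samples in the public project. The snippet below is only a minimal sketch: the dictionary name and the percentage calculation are illustrative and are not part of the Seven Bridges Platform or its API.

```python
# Regional sample counts exactly as listed in the text above.
region_counts = {
    "Africans": 44,
    "Native Americans": 22,
    "Central Asians or Siberians": 27,
    "East Asians": 47,
    "Oceanians": 25,
    "South Asians": 39,
    "West Eurasians": 75,
}

# Total number of samples across all regions.
total = sum(region_counts.values())

# Share of each region, as a percentage of the total (illustrative only).
shares = {region: 100 * count / total for region, count in region_counts.items()}

print(f"Total samples: {total}")
for region, pct in shares.items():
    print(f"{region}: {region_counts[region]} samples ({pct:.1f}%)")
```

The total matches the 279 Open Access samples stated for the SGDP public project.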

SGDP metadata

Learn more about SGDP metadata:

  1. Access the Nature article about SGDP.
  2. Look under Excel files.
  3. Select Supplementary Table 1. This downloads a local copy of the spreadsheet.
  4. Open your local copy of the spreadsheet and filter for X in Column G. This displays all the Open Access data in the SGDP that Seven Bridges has made available in its Simons Genome Diversity Project (SGDP) public project.
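
The filtering step above can also be done in a script rather than in a spreadsheet application. The following is a minimal sketch under several assumptions that are not stated in the source: that you have exported the downloaded spreadsheet to CSV, that its first row is a header row, and that Open Access rows carry exactly the value X in Column G. The function name `open_access_rows` and the sample rows are hypothetical placeholders, not real SGDP metadata.

```python
import csv
import io


def open_access_rows(csv_text, column_index=6, marker="X"):
    """Return (header, rows) where rows are the data rows whose cell in
    the given column equals the marker.

    column_index=6 corresponds to Column G (Column A is index 0).
    Assumes the first row of the CSV is a header row.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = [
        row
        for row in reader
        if len(row) > column_index and row[column_index].strip() == marker
    ]
    return header, rows


# Placeholder content standing in for a CSV export of the spreadsheet.
# The column names and values here are invented for illustration only.
SAMPLE_CSV = (
    "col_A,col_B,col_C,col_D,col_E,col_F,col_G\n"
    "placeholder_1,,,,,,X\n"
    "placeholder_2,,,,,,\n"
    "placeholder_3,,,,,,X\n"
)

header, rows = open_access_rows(SAMPLE_CSV)
for row in rows:
    print(row[0])
```

To apply this to the real file, read the text of your own CSV export and pass it in place of `SAMPLE_CSV`. If the actual spreadsheet has extra title rows above the header, or marks Open Access samples differently, adjust the function accordingly.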

Access SGDP data

Access a repository of SGDP files via the SGDP public project.

Note that you cannot currently query the SGDP dataset via the Data Browser.

