The Seven Bridges Knowledge Center

TCIA data

Overview

The Cancer Imaging Archive (TCIA) contains radiological imaging data from The Cancer Genome Atlas (TCGA). It is part of an effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects. TCIA includes radiological images representing 21 types of cancer detailed in TCGA. All images are accessible for public use. The images are de-identified so that they are free of protected health information (PHI), and they are stored in the standard DICOM format.

Distribution of the data

The table below lists the number of subjects and the image modalities (such as MRI or CT) for each cancer type (“Collection”) in the TCIA public project. See a full list of cancer type abbreviations and a full list of DICOM image modality abbreviations.

| Collection | Subjects | Modalities |
|---|---|---|
| TCGA-KIRC | 267 | CT, MR, CR |
| TCGA-GBM | 262 | MR, CT, DX |
| TCGA-LGG | 199 | MR, CT |
| TCGA-HNSC | 192 | CT, MR, PT, RTSTRUCT, RTPLAN, RTDOSE |
| TCGA-OV | 143 | CT, MR |
| TCGA-BRCA | 139 | MR, MG |
| TCGA-BLCA | 97 | CT, CR, MR, PT |
| TCGA-LIHC | 97 | MR, CT, PT |
| TCGA-LUAD | 69 | CT, PT, NM |
| TCGA-UCEC | 58 | CT, CR, MR, PT |
| TCGA-CESC | 54 | MR |
| TCGA-STAD | 46 | CT |
| TCGA-LUSC | 37 | CT, NM, PT |
| TCGA-KIRP | 33 | CT, MR, PT |
| TCGA-COAD | 25 | CT |
| TCGA-ESCA | 16 | CT |
| TCGA-KICH | 15 | CT, MR |
| TCGA-PRAD | 14 | CT, PT, MR |
| TCGA-THCA | 6 | CT, PT |
| TCGA-SARC | 5 | CT, MR |
| TCGA-READ | 3 | CT, MR |
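
When planning an analysis, it can help to query the distribution above programmatically, for example to find which collections include a given modality. The following is a minimal sketch, not part of the Platform: the `COLLECTIONS` dictionary and the helper function names are illustrative, and the values are copied by hand from the listing above.

```python
# Subject counts and modalities copied from the distribution listing above.
# The dictionary layout and helper names are illustrative, not a Platform API.
COLLECTIONS = {
    "TCGA-KIRC": (267, ["CT", "MR", "CR"]),
    "TCGA-GBM": (262, ["MR", "CT", "DX"]),
    "TCGA-LGG": (199, ["MR", "CT"]),
    "TCGA-HNSC": (192, ["CT", "MR", "PT", "RTSTRUCT", "RTPLAN", "RTDOSE"]),
    "TCGA-OV": (143, ["CT", "MR"]),
    "TCGA-BRCA": (139, ["MR", "MG"]),
    "TCGA-BLCA": (97, ["CT", "CR", "MR", "PT"]),
    "TCGA-LIHC": (97, ["MR", "CT", "PT"]),
    "TCGA-LUAD": (69, ["CT", "PT", "NM"]),
    "TCGA-UCEC": (58, ["CT", "CR", "MR", "PT"]),
    "TCGA-CESC": (54, ["MR"]),
    "TCGA-STAD": (46, ["CT"]),
    "TCGA-LUSC": (37, ["CT", "NM", "PT"]),
    "TCGA-KIRP": (33, ["CT", "MR", "PT"]),
    "TCGA-COAD": (25, ["CT"]),
    "TCGA-ESCA": (16, ["CT"]),
    "TCGA-KICH": (15, ["CT", "MR"]),
    "TCGA-PRAD": (14, ["CT", "PT", "MR"]),
    "TCGA-THCA": (6, ["CT", "PT"]),
    "TCGA-SARC": (5, ["CT", "MR"]),
    "TCGA-READ": (3, ["CT", "MR"]),
}


def collections_with_modality(modality):
    """Return the names of collections whose modalities include `modality`."""
    return [name for name, (_, mods) in COLLECTIONS.items() if modality in mods]


def total_subjects():
    """Sum the per-collection subject counts."""
    return sum(count for count, _ in COLLECTIONS.values())


if __name__ == "__main__":
    print("Collections:", len(COLLECTIONS))
    print("Total subjects:", total_subjects())
    print("Collections with PET (PT):", collections_with_modality("PT"))
```

Keeping the counts in one dictionary makes it easy to check that the listing covers 21 collections and to shortlist collections before opening the public project.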

TCIA Metadata

Each TCIA file on the Platform contains a set of images acquired during the same scanning mode in a compressed file format. The following metadata are also set for each file when available:

| Property | Description |
|---|---|
| Case UUID | A Universally Unique Identifier (UUID) for the sample or files of a case. |
| Case ID | A human-readable identifier, such as a number or a string, that may contain metadata information. This identifier is often referred to as the submitter ID. |
| Ethnicity | A socially defined category of people based on common ancestral, cultural, biological, and social factors. See NCI Thesaurus Code: C29933. |
| Gender | The collection of behaviors and attitudes that distinguish people on the basis of the societal roles expected for the two sexes. See NCI Thesaurus Code: C17357. |
| Race | A classification of humans characterized by certain heritable traits, common history, nationality, or geographic distribution. See NCI Thesaurus Code: C17049. |
| Investigation | A value denoting the project or study that generated the data. See NCI Thesaurus Code: C41198. |
| Age at diagnosis | The age in years of the case at the initial pathological diagnosis of disease or cancer. See NCI Thesaurus Code: C15220. |
| Primary site | The anatomical site where the primary tumor is located in the organism. See NCI Thesaurus Code: C43761. |
| Disease type | The type of the disease or condition studied. See NCI Thesaurus Code: C2991. |
| Vital status | The state of being living or deceased for cases that are part of the investigation. See NCI Thesaurus Code: C25717. |
| Days to death | The number of days from the date of the initial pathological diagnosis to the date of death for the case in the investigation. |
| Series date | Date the Series was acquired. |
| Manufacturer | Manufacturer's name of the equipment that produced the composite instances. |
| Body part examined | Text description of the part of the body examined. |
| Modality | Type of equipment that originally acquired the data. |
| Protocol name | User-defined description of the conditions under which the Series was performed. |
| Manufacturer model name | Manufacturer's model name of the equipment that produced the composite instances. |
| Series description | User-provided description of the Series. |
| Software versions | Manufacturer's designation of the software version of the equipment that produced the composite instances. |
| Image count | Number of images in this series. |
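
Because metadata are set only when available, code that works with these properties should tolerate missing values. The sketch below illustrates one way to filter and summarize series records. It rests on several assumptions: the records are plain dictionaries, the key names (`case_id`, `modality`, `body_part_examined`, `image_count`) are hypothetical stand-ins for the properties listed above, and the sample values are invented placeholders rather than real TCIA data.

```python
from collections import Counter

# Hypothetical records, one per file (series). Key names and values are
# placeholders for illustration; they are not real TCIA data or Platform keys.
records = [
    {"case_id": "example-case-1", "modality": "CT",
     "body_part_examined": "KIDNEY", "image_count": 120},
    {"case_id": "example-case-1", "modality": "MR",
     "body_part_examined": "KIDNEY", "image_count": 40},
    # Body part not available for this series.
    {"case_id": "example-case-2", "modality": "CT", "image_count": 85},
    # Modality and image count not available for this series.
    {"case_id": "example-case-3", "body_part_examined": "BRAIN"},
]


def filter_series(records, **criteria):
    """Keep records whose metadata match every key/value pair in `criteria`.

    A record that lacks a requested key never matches, since metadata are
    only set when available.
    """
    return [
        r for r in records
        if all(key in r and r[key] == value for key, value in criteria.items())
    ]


def images_per_modality(records):
    """Total the image counts per modality, treating missing values as absent."""
    counts = Counter()
    for r in records:
        counts[r.get("modality", "unknown")] += r.get("image_count", 0)
    return counts


if __name__ == "__main__":
    for r in filter_series(records, modality="CT"):
        print(r["case_id"], r.get("body_part_examined", "(not set)"))
    print(dict(images_per_modality(records)))
```

Using `dict.get` with a default, rather than direct indexing, keeps the summary from failing on series where a property was not set.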

Access TCIA data

Access a repository of TCIA files via the TCIA public project.

Note that you cannot currently query the TCIA dataset via the Data Browser.
