The Seven Bridges Knowledge Center

CPTAC data

Overview

The Clinical Proteomic Tumor Analysis Consortium (CPTAC) is a comprehensive and coordinated effort to accelerate understanding of the molecular basis of cancer through the application of robust, quantitative, proteomic technologies and workflows.

The CPTAC analyzes cancer biospecimens from genomics initiatives such as The Cancer Genome Atlas (TCGA) by mass spectrometry to characterize and quantify their constituent proteins, or “proteome”. These mass spectrometry data are available in four file formats: raw, mzML, psm, and mzid. Raw files contain raw mass spectrometry spectra in vendor-specific file formats corresponding to the mass spectrometers used to acquire the spectra. The mzML files are generated by converting these raw files to a HUPO Proteomics Standards Initiative (PSI)-compliant format. The psm files report the peptide spectrum match (PSM) data obtained by processing the mzML files. The mzid files are generated by converting the psm files to the HUPO PSI-compliant mzIdentML format.
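
The derivation chain described above (raw, then mzML, then psm, then mzid) can be sketched as a small lookup. This is only an illustration of how the four formats relate to one another; the names `DERIVED_FROM` and `lineage` are hypothetical and are not part of the Platform or of any CPTAC tooling.

```python
# Each format maps to the format it is derived from, as described in the text.
# None marks the starting point: vendor-specific raw spectra.
# (Illustrative names only; not a Platform API.)
DERIVED_FROM = {
    "raw": None,
    "mzML": "raw",
    "psm": "mzML",
    "mzid": "psm",
}


def lineage(fmt):
    """Return the chain of formats leading from raw spectra up to `fmt`."""
    if fmt not in DERIVED_FROM:
        raise ValueError(f"unknown CPTAC file format: {fmt!r}")
    chain = []
    current = fmt
    # Walk backwards through the derivation chain until we reach raw data.
    while current is not None:
        chain.append(current)
        current = DERIVED_FROM[current]
    return list(reversed(chain))


if __name__ == "__main__":
    print(lineage("mzid"))
```

A lookup like this can help when deciding which file format to query: a psm or mzid file already reflects processing of the mzML spectra, while raw and mzML files hold the spectra themselves.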

Learn more about the metadata associated with CPTAC data on the Platform.

Distribution of the data

Mass spectrometry (MS) enables the highly specific identification of proteins and proteoforms, accurate relative quantitation of protein abundance in contrasting biospecimens, and the localization of post-translational protein modifications (such as phosphorylation) on a protein’s sequence. The CPTAC dataset includes MS data from four TCGA cancer types: TCGA-OV, TCGA-BRCA, TCGA-COAD, and TCGA-READ.

See below for an overview of the number of samples, type of analytics, and experiment strategies available for each cancer type.

| Collection | Samples | Analytics | Experiments |
|---|---|---|---|
| TCGA-OV | 174 | Proteome, Phosphoproteome | 4-plex iTRAQ MS |
| TCGA-BRCA | 105 | Proteome, Phosphoproteome | 4-plex iTRAQ MS |
| TCGA-COAD | 64 | Proteome | MS |
| TCGA-READ | 31 | Proteome | MS |
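
As a rough illustration, the figures in the overview above can be transcribed into a small data structure and summarized. The values are copied by hand from the overview; the names `CPTAC_COLLECTIONS`, `total_samples`, and `collections_with` are hypothetical, and nothing here queries the Platform.

```python
# Per-collection figures transcribed from the overview above.
# (Hand-copied values and illustrative names; not a Platform API.)
CPTAC_COLLECTIONS = [
    {"collection": "TCGA-OV", "samples": 174,
     "analytics": ["Proteome", "Phosphoproteome"], "experiments": "4-plex iTRAQ MS"},
    {"collection": "TCGA-BRCA", "samples": 105,
     "analytics": ["Proteome", "Phosphoproteome"], "experiments": "4-plex iTRAQ MS"},
    {"collection": "TCGA-COAD", "samples": 64,
     "analytics": ["Proteome"], "experiments": "MS"},
    {"collection": "TCGA-READ", "samples": 31,
     "analytics": ["Proteome"], "experiments": "MS"},
]


def total_samples(rows):
    """Sum the sample counts across all collections."""
    return sum(row["samples"] for row in rows)


def collections_with(rows, analytic):
    """List the collections that include the given type of analytics."""
    return [row["collection"] for row in rows if analytic in row["analytics"]]


if __name__ == "__main__":
    print(total_samples(CPTAC_COLLECTIONS))
    print(collections_with(CPTAC_COLLECTIONS, "Phosphoproteome"))
```

Run as a script, this prints the total number of samples across the four collections and the collections that include phosphoproteome data.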

Access CPTAC data

Access a repository of CPTAC files via the Data Browser.