The Seven Bridges Knowledge Center

Metadata manifest file format

The Seven Bridges Platform uses two different metadata manifest file formats. Which one to use depends on the feature you are using the manifest with.

  • Command Line Uploader - use this format to apply metadata when uploading files to the Platform.
  • Editing metadata via the visual interface - use this format to edit metadata for files that are already in one of your projects on the Platform.

Metadata manifest format for the Command Line Uploader

When uploading files to the Platform using the Command Line Uploader, a metadata manifest file can be used to define metadata values for those files. The supported format for this manifest file is CSV (comma-separated values): a text file containing a number of rows whose columns are separated with commas.

The following rules apply:

| Rule | Description |
| --- | --- |
| Line separation | Lines are separated with a line break, and columns are separated with a comma. |
| First row | The first row (the manifest file header) must contain the column names, which are treated as metadata keys (e.g. “sample”, “library”). Within the header, the first column must be File name, and the remaining columns are metadata keys. Keep in mind that metadata keys can be metadata schema keys or custom metadata keys. |
| Must include File name | File name is mandatory as the first column in the header. |
| First column | The first column must contain the names of the files which will be uploaded. If the files are not in the same directory as the metadata manifest file, also include a path to the files (e.g. ../filename.fastq). |
| Subsequent columns | All subsequent columns should contain the metadata fields which will be assigned to the specified files. |
| Case sensitivity | Keys and values are case sensitive. |
| Escape character | To use a comma in a metadata key or value, enclose that field in double quotation marks (e.g. to set a key to md_value1,2 use "md_value1,2"). To use a double quote character in a metadata key or value, enclose that field in double quotation marks and write each double quote as two double quotation marks (e.g. to set a key to md_value"1" use "md_value""1"""). |
| Maximum size | The maximum size of the metadata manifest file is 5 GiB. |
| Maximum number of key-value pairs | The maximum number of key-value pairs per file is 1000, including null-value keys. |
| Keys and values encoding | Keys and values are UTF-8 encoded strings. |
| Maximum key length | 100 bytes (UTF-8 encoding) |
| Maximum value length | 300 bytes (UTF-8 encoding) |
| Metadata for files only | There is no metadata for folders on the Platform. Any folder specified in the manifest file is skipped and no metadata is set for it. |
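The size limits listed above can be checked locally before a manifest is used. The following is a minimal sketch rather than anything the Command Line Uploader provides: the function name `check_metadata_limits` and the dictionary-based input are assumptions made for illustration. Lengths are measured in UTF-8 bytes, as the limits are stated.

```python
# Limits taken from the rules above.
MAX_KEY_BYTES = 100
MAX_VALUE_BYTES = 300
MAX_PAIRS_PER_FILE = 1000


def check_metadata_limits(metadata):
    """Return a list of problems found in one file's metadata.

    `metadata` is a hypothetical dict mapping metadata keys to string
    values (or None for a null-value key).
    """
    problems = []
    # Null-value keys count towards the limit, so count every key.
    if len(metadata) > MAX_PAIRS_PER_FILE:
        problems.append(f"too many key-value pairs: {len(metadata)}")
    for key, value in metadata.items():
        # The limits are in bytes, so encode before measuring.
        if len(key.encode("utf-8")) > MAX_KEY_BYTES:
            problems.append(f"key too long: {key!r}")
        if value is not None and len(value.encode("utf-8")) > MAX_VALUE_BYTES:
            problems.append(f"value too long for key {key!r}")
    return problems
```

Note that a key made of non-ASCII characters can exceed 100 bytes well before it reaches 100 characters, which is why the sketch encodes each string first.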

The following example shows the content of the metadata manifest for three files with three metadata fields.

| File name | sample | library | paired_end |
| --- | --- | --- | --- |
| file1.fastq | sample1 | examplelibrary1 | 1 |
| file2.fastq | sample1 | examplelibrary1 | 2 |
| file3.fastq | sample2 | examplelibrary2 | 1 |

Below is the same example in a comma-separated format.

File name,sample,library,paired_end
file1.fastq,sample1,examplelibrary1,1
file2.fastq,sample1,examplelibrary1,2
file3.fastq,sample2,examplelibrary2,1
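A manifest in this format can also be generated programmatically. The sketch below uses Python's standard `csv` module, whose default quoting matches the escape rule described above (fields containing a comma or a double quote are enclosed in double quotes, and embedded double quotes are doubled). The helper name `build_manifest` and the `note` metadata key are hypothetical and used only for illustration.

```python
import csv
import io


def build_manifest(rows, keys):
    """Render (file_name, metadata_dict) pairs as CSV manifest text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # "File name" must be the first header column; the rest are metadata keys.
    writer.writerow(["File name", *keys])
    for file_name, metadata in rows:
        writer.writerow([file_name, *(metadata[key] for key in keys)])
    return buffer.getvalue()


manifest = build_manifest(
    [
        # A value containing a comma is quoted automatically.
        ("file1.fastq", {"sample": "sample1", "note": "md_value1,2"}),
        # A value containing double quotes is quoted and the quotes doubled.
        ("../file2.fastq", {"sample": "sample1", "note": 'md_value"1"'}),
    ],
    keys=["sample", "note"],
)
print(manifest)
```

The second row also shows a relative path in the first column, for a file that is not in the same directory as the manifest.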

Metadata manifest format for modifying metadata via the visual interface

Use this manifest file format when modifying metadata for your project files via the visual interface.

The supported file formats for this manifest file are CSV and TSV. A CSV file contains a number of rows with columns which are separated with a comma, while the TSV file separates them with a tab.

The following rules apply:

| Rule | Description |
| --- | --- |
| Line separation | Lines are separated with a line break. |
| Columns separator | Columns are separated with a comma (CSV) or a tab (TSV). |
| First row | The first row (the manifest file header) must contain the column names, which are treated as metadata keys (e.g. “sample”, “library”). Within the header, the first column must be either id or name. Next, there can be project and size columns, which are system metadata fields and are treated as read-only (these fields are present if the manifest file was generated using the Export metadata to a manifest action). The remaining columns are metadata keys, which can be metadata schema or custom metadata keys. Keep in mind that the order of columns is important, e.g. the id column (if present) must be the first one. |
| Must include id or name (along with project path) | Either the id or the name column is mandatory, and both may be present. If both id and name columns are present, id is used to identify the file and name is ignored. |
| Name field | This field should also include the file path within the project for files which are not in the project root, so it is effectively path + name. If the "id" column is not present in the metadata manifest file, this field is used to identify the files whose metadata should be edited. This makes it possible to edit metadata by providing only file names (along with the path, for files stored in a folder rather than the project root) instead of having to fetch file IDs. If "id" is present, this field is ignored, meaning it is not possible to change a file name or file path using the "Import metadata manifest" feature (this is the current behavior, although it might change in the future). |
| Project field | It is possible to have a “project” column (e.g. if you used the Export metadata manifest feature, edited the generated metadata manifest file and submitted it for import). If present, it is treated as a read-only field (i.e. it is not possible to move or copy files from one project to another using this feature), but it is validated: if the specified “project” value differs from the current project (i.e. the project in which the Import metadata manifest feature is used), the file is not edited and is counted among the files that failed as part of this action. The “project” field is optional in the manifest file. |
| Size field | It is possible to have a “size” column (e.g. if you used the Export metadata manifest feature, edited the generated metadata manifest file and submitted it for import). If present, it is treated as a read-only field (i.e. it is not possible to change the file size using this feature, or any other feature). No validation is performed on this field. |
| Metadata schema fields and custom metadata fields | Following the aforementioned fields (there can be at a minimum one, and at the maximum four read-only fields), you can specify the metadata fields which should be edited. At least one metadata column must be specified in the manifest file (otherwise the action fails because there is no metadata field to edit). The metadata schema fields are specified according to the documented metadata schema. |
| Empty rows | Empty rows are allowed in the manifest file. An empty row is skipped during manifest file processing. |
| Case sensitivity | The manifest file is case-sensitive. |
| Escape character | To use a comma in a metadata key or value, enclose that field in double quotation marks (e.g. to set a key to md_value1,2 use "md_value1,2"). To use a double quote character in a metadata key or value, enclose that field in double quotation marks and write each double quote as two double quotation marks (e.g. to set a key to md_value"1" use "md_value""1"""). |
| Maximum size | The maximum size of the metadata manifest file is 5 GiB. |
| Maximum number of rows | The maximum number of rows in the metadata manifest file is 40,001, which corresponds to the maximum number of files in a single project on the Platform (plus the header row). |
| Maximum number of key-value pairs | The maximum number of key-value pairs per file is 1000, including null-value keys. |
| Keys and values encoding | Keys and values are UTF-8 encoded strings. |
| Maximum key length | 100 bytes (UTF-8 encoding) |
| Maximum value length | 300 bytes (UTF-8 encoding) |

Note: Columns which are specified after id, name, size and project are treated as metadata fields.
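The header rules above can be summarized in a short check. This is a minimal sketch under stated assumptions, not the Platform's own validation: the function name `split_header` is hypothetical, and the sketch does not enforce any particular order between the project and size columns, since only the position of id is spelled out above.

```python
SYSTEM_COLUMNS = ("id", "name", "project", "size")


def split_header(header):
    """Split a header row into (system_columns, metadata_keys).

    Raises ValueError if the header breaks one of the rules above.
    Comparison is case-sensitive, so "ID" is not accepted as "id".
    """
    if not header or header[0] not in ("id", "name"):
        raise ValueError("first column must be 'id' or 'name'")
    # Leading system columns; every column after them is a metadata key.
    count = 0
    while count < len(header) and header[count] in SYSTEM_COLUMNS:
        count += 1
    system, metadata_keys = header[:count], header[count:]
    if "id" in system and system[0] != "id":
        raise ValueError("'id' must be the first column when present")
    if not metadata_keys:
        raise ValueError("at least one metadata column is required")
    return system, metadata_keys
```

For example, `split_header(["name", "project", "size", "case_id"])` returns `(["name", "project", "size"], ["case_id"])`, while a header consisting only of `id` and `name` is rejected because it leaves no metadata field to edit.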

The following example shows a metadata manifest file which contains both id and name columns along with two metadata schema fields (quality_scale, paired_end) and one custom metadata field (Read length). Please note that in this case the id field is used to uniquely identify each file within the project, while the name field is ignored.

| id | name | quality_scale | paired_end | Read length |
| --- | --- | --- | --- | --- |
| 581b298d20946e087b2ce503 | file1.fastq | illumina18 | 1 | 26 |
| 581b298d20946e087b2ce51f | file2.fastq | illumina18 | 2 | 26 |
| 581b298d20946e087b2ce50b | file3.fastq | solexa | 1 | 98 |

Below is the same example in a comma-separated format.

id,name,quality_scale,paired_end,Read length
581b298d20946e087b2ce503,file1.fastq,illumina18,1,26
581b298d20946e087b2ce51f,file2.fastq,illumina18,2,26
581b298d20946e087b2ce50b,file3.fastq,solexa,1,98
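To illustrate how such a manifest is interpreted, the sketch below reads it with Python's standard `csv` module and reports, for each row, which field identifies the file and which key-value pairs would be edited. It is an illustration of the rules on this page, not the Platform's importer: the function name `read_edits` is hypothetical, and system columns are filtered by name here as a simplification.

```python
import csv
import io

SYSTEM_COLUMNS = {"id", "name", "project", "size"}


def read_edits(text, delimiter=","):
    """Yield (identifier_kind, identifier, metadata) for each data row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    # If an id column exists it identifies the file and name is ignored.
    kind = "id" if "id" in reader.fieldnames else "name"
    for row in reader:  # csv.DictReader skips completely empty rows
        metadata = {k: v for k, v in row.items() if k not in SYSTEM_COLUMNS}
        yield kind, row[kind], metadata


manifest = """id,name,quality_scale,paired_end,Read length
581b298d20946e087b2ce503,file1.fastq,illumina18,1,26
581b298d20946e087b2ce51f,file2.fastq,illumina18,2,26

581b298d20946e087b2ce50b,file3.fastq,solexa,1,98
"""
edits = list(read_edits(manifest))
for kind, identifier, metadata in edits:
    print(kind, identifier, metadata)
```

The blank line in the sample text is deliberate: it is skipped, so three edits are produced, each identified by id.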

The following example shows a metadata manifest file which contains the name column, one metadata schema field (case_id) and one custom metadata field (Donor ID). Please note that in this case the name field (containing file path along with the name of the file) will be used to uniquely identify the file within the project.

| name | case_id | Donor ID |
| --- | --- | --- |
| StudyYYZ/file1.bam | cid173 | 1197 |
| StudyYYZ/file2.bam | cid365 | 1198 |
| StudyYYZ/file3.bam | cid882 | 1199 |

Below is the same example in a comma-separated format.

name,case_id,Donor ID
StudyYYZ/file1.bam,cid173,1197
StudyYYZ/file2.bam,cid365,1198
StudyYYZ/file3.bam,cid882,1199
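Because TSV is also a supported format for this manifest, a comma-separated manifest can be rewritten with tabs as the column separator. The following is a small sketch using Python's standard `csv` module; the function name `csv_to_tsv` is hypothetical, and it assumes that quoting in TSV files follows the same double-quote convention as in CSV, which this page does not state explicitly.

```python
import csv
import io


def csv_to_tsv(csv_text):
    """Re-emit comma-separated manifest text with tab-separated columns."""
    rows = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    # Same rows, different delimiter; line breaks still separate rows.
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
    return out.getvalue()


tsv_text = csv_to_tsv(
    "name,case_id,Donor ID\n"
    "StudyYYZ/file1.bam,cid173,1197\n"
)
print(tsv_text)
```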

Updated 11 days ago
