Query syntax

The Advanced Search query syntax is designed to give you fine-grained, comprehensive control over searches, especially when working with a large number of individual files. Advanced Search queries use the following general format:

[id AS field] IN folder1 [, folder2, ...] [WHERE condition] [ORDER BY id [ASC/DESC]] [LIMIT num] [OFFSET num] [TOKEN token] [RECURSIVE] [STALE]

The sections below explain each part of the query syntax in detail.

Search locations (IN) mandatory

IN project1 [, project2, ...]

The IN clause is used to specify the exact locations where the search will take place and is the only mandatory part of the query expression.

The location is specified as a string argument (for example, a path under /Projects/ or /Datasets/). The query syntax supports searching multiple projects simultaneously. At most 10 projects can be searched at the same time.

Examples:

IN "/Projects/example-project-id" WHERE type = "FILE"
IN "/Datasets/dataset-1", "/Datasets/dataset-2" WHERE type = "FILE"
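The rules for the IN clause (at least one location, at most 10, each given as a string) can be checked before a query is sent. The following is a minimal Python sketch, not part of the Platform: the helper name build_in_clause is hypothetical, and it assumes location paths contain no double quotes.

```python
MAX_LOCATIONS = 10  # limit stated in the text above


def build_in_clause(locations):
    """Return an IN clause for the given location paths (hypothetical helper)."""
    if not locations:
        raise ValueError("the IN clause is mandatory: give at least one location")
    if len(locations) > MAX_LOCATIONS:
        raise ValueError(f"at most {MAX_LOCATIONS} locations can be searched at once")
    # Each location is written as a double-quoted string, separated by commas.
    quoted = ", ".join('"' + path + '"' for path in locations)
    return "IN " + quoted


query = build_in_clause(["/Datasets/dataset-1", "/Datasets/dataset-2"]) + ' WHERE type = "FILE"'
print(query)
```

The printed query has the same shape as the second example above.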

Filtering values (IN)

field IN (value1, value2, ...)

When used inside a condition, IN specifies the value or values to filter by. The filtering values are given as strings.

Example:

file_extension IN ("FAI", "FASTA")
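When the values come from a program rather than being typed by hand, the filter can be assembled from a list. The sketch below is illustrative only: build_value_filter is a hypothetical helper, the limit of 100 values is the one given for the IN operator under Search operations, and values containing double quotes are rejected because this document does not describe how to escape them.

```python
MAX_IN_VALUES = 100  # limit given for the IN operator under "Search operations"


def build_value_filter(field, values):
    """Return 'field IN ("v1", "v2", ...)' for a list of string values."""
    values = list(values)
    if not values:
        raise ValueError("give at least one value")
    if len(values) > MAX_IN_VALUES:
        raise ValueError(f"IN accepts at most {MAX_IN_VALUES} values")
    for value in values:
        if '"' in value:
            # Escaping rules are not covered here, so refuse instead of guessing.
            raise ValueError("double quotes inside values are not handled by this sketch")
    quoted = ", ".join('"' + value + '"' for value in values)
    return f"{field} IN ({quoted})"


print(build_value_filter("file_extension", ["FAI", "FASTA"]))
```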

Search filter (WHERE)

The WHERE clause contains filtering rules. The filtering rules can be applied to:

  • File tags
  • System file properties
  • System file metadata
  • Custom metadata

WHERE condition

The tables below provide a detailed overview of each of the fields and the syntax that should be used in the search queries.

File tags

Tags are used for organizing and identifying files more efficiently. Use this filter to search for files based on the specified tags.

Parameter | Field name | Example
Tags | tags | tags IN ("RNA_SEQ", "test")

System file properties

Field | Field name, as used in the query | Description | Example
Downloadable | download_tag | Indicates whether download of the file is allowed. Possible values: Yes, No. | Selects only downloadable files: download_tag IN ("YES")
Extension | file_extension | File extension. | Selects files with BAI and BAM extensions: file_extension IN ("BAI", "BAM")
File status | state | Current file status on the Platform. Possible values: AVAILABLE, ARCHIVED, ARCHIVING, RESTORING, UPLOADING. | Selects files that are available for use on the Platform, as opposed to, for example, archived files: state IN ("AVAILABLE")
Task ID | produced_by_task | ID of the task that produced the file as one of its outputs. Available only for task output files. | Selects only files produced by the specified task: produced_by_task IN ("0d123456-9876-5432-19ab-3f0df6666d09")
Type | type | Item type. Possible values: FILE, DIRECTORY. | Selects only files (use "DIRECTORY" to select only directories): type IN ("FILE")

System file metadata

Use this filter to search for files based on specified system metadata.

Each example below selects files for which the field is empty or not set.

Field | Field name, as used in the query | Example
Batch number | metadata.batch_number | metadata.batch_number = '' OR NOT EXISTS(metadata.batch_number)
Case ID | metadata.case_id | metadata.case_id = '' OR NOT EXISTS(metadata.case_id)
Experimental strategy | metadata.experimental_strategy | metadata.experimental_strategy = '' OR NOT EXISTS(metadata.experimental_strategy)
File segment number | metadata.file_segment_number | metadata.file_segment_number = '' OR NOT EXISTS(metadata.file_segment_number)
Library ID | metadata.library_id | metadata.library_id = '' OR NOT EXISTS(metadata.library_id)
Paired-end | metadata.paired_end | metadata.paired_end = '' OR NOT EXISTS(metadata.paired_end)
Platform | metadata.platform | metadata.platform = '' OR NOT EXISTS(metadata.platform)
Platform unit ID | metadata.platform_unit_id | metadata.platform_unit_id = '' OR NOT EXISTS(metadata.platform_unit_id)
Quality scale | metadata.quality_scale | metadata.quality_scale = '' OR NOT EXISTS(metadata.quality_scale)
Reference genome | metadata.reference_genome | metadata.reference_genome = '' OR NOT EXISTS(metadata.reference_genome)
Sample ID | metadata.sample_id | metadata.sample_id = '' OR NOT EXISTS(metadata.sample_id)
Sample type | metadata.sample_type | metadata.sample_type = '' OR NOT EXISTS(metadata.sample_type)
Species | metadata.species | metadata.species = '' OR NOT EXISTS(metadata.species)
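Every example in this table follows one pattern: the field is compared with an empty string, or it is tested with NOT EXISTS. Generating the condition from the key avoids copy-and-paste slips between rows. This is a small sketch; missing_metadata_condition is a hypothetical helper and the key is passed through verbatim, so it must already be a valid field name.

```python
def missing_metadata_condition(key):
    """Return a condition matching files whose metadata field is empty or absent."""
    field = f"metadata.{key}"
    # Same shape as the table examples: empty string OR field does not exist.
    return f"{field} = '' OR NOT EXISTS({field})"


condition = missing_metadata_condition("case_id")
print(f'IN "/Projects/example-project-id" WHERE {condition}')
```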

Search operations

The following operations can be used in a condition:

Name | Usage | Description | Example
Comparison | =, !=, <, <=, >=, > | Operands can be numerical. String operands are supported for = and != only. | String: IN "/Projects/example" WHERE state = "AVAILABLE". Numerical: IN "/Projects/example" WHERE size > 1000
Wildcard | ~ | Matches values against a wildcard pattern. Supported wildcards: ? matches any single character; * matches zero or more characters (including none). | IN "/Projects/example" WHERE name ~ "*something*"
Case insensitive | i=, i~ | Case-insensitive variants of the = and ~ operators. Supported only for a limited set of properties. | IN "/Projects/example" WHERE name i~ "*sOmeThinG*"
Logical | AND, OR, NOT | Combine multiple sub-conditions into one condition. A condition in one query expression can have up to 50 sub-conditions (separated by the logical operators). | IN "/Projects/example" WHERE state = "AVAILABLE" AND size > 1000
Exists | EXISTS() | Checks whether a node property exists. Returns true if a property with the provided key exists and its value is not null. | IN "/Projects/example" WHERE EXISTS(metadata)
In | IN() | Syntactic sugar for a repeated OR operator. Accepts at most 100 values. | IN "/Projects/example" WHERE state IN("AVAILABLE", "PROCESSING")
Conversion | DATE() | Conversion function that accepts a string in an accepted date format and returns a number: the number of seconds since the start of the Unix epoch, which is the format in which dates are stored in the file system database. | IN "/Projects/example" WHERE time_created > DATE("2021-05-12 14:00")
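The wildcard and DATE() operations can be tried out locally before running a query. The Python sketch below only imitates the behaviour described in the table and is not the Platform's implementation. It assumes that a wildcard pattern must match the whole value, that DATE() accepts the "YYYY-MM-DD HH:MM" format used in the example, and that the date is interpreted as UTC. None of these three points is confirmed by this document, and both function names are hypothetical.

```python
import re
from datetime import datetime, timezone


def wildcard_matches(value, pattern, ignore_case=False):
    """Return True if value matches a pattern that uses only * and ? wildcards."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # zero or more characters
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # every other character is literal
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    # Assumption: the pattern has to cover the whole value, not a substring.
    return re.fullmatch("".join(parts), value, flags) is not None


def date_to_epoch_seconds(text):
    """Imitate DATE(): parse 'YYYY-MM-DD HH:MM' (assumed UTC) into epoch seconds."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


print(wildcard_matches("my_something.bam", "*something*"))                    # True
print(wildcard_matches("my_SomeThing.bam", "*something*"))                    # False
print(wildcard_matches("my_SomeThing.bam", "*something*", ignore_case=True))  # True
print(wildcard_matches("file10.txt", "file?.txt"))                            # False
print(date_to_epoch_seconds("2021-05-12 14:00"))
```

The third call corresponds to the i~ operator: the same pattern, compared without regard to case.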

Operator precedence:

  1. Conversion; EXISTS()
  2. Comparison; IN(); wildcard; case insensitive
  3. Logical. Precedence of the logical operators is:
    a. NOT
    b. AND
    c. OR
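Python's not, and, and or happen to bind in the same relative order as NOT, AND, and OR in the list above, so the grouping of a mixed condition can be checked locally. The sketch below assumes the list runs from the tightest-binding operator to the loosest. The file record and its values are made up for illustration.

```python
# Hypothetical file record; the values are invented for this example.
file_record = {"state": "ARCHIVED", "size": 5000, "metadata": {"case_id": "C-1"}}

has_case_id = file_record["metadata"].get("case_id") is not None  # EXISTS(metadata.case_id)
is_available = file_record["state"] == "AVAILABLE"                # state = "AVAILABLE"
is_large = file_record["size"] > 1000                             # size > 1000

# NOT EXISTS(metadata.case_id) AND state = "AVAILABLE" OR size > 1000
ungrouped = not has_case_id and is_available or is_large

# Grouping implied by the precedence list: NOT first, then AND, then OR.
grouped = ((not has_case_id) and is_available) or is_large

# A different grouping that a reader might wrongly expect.
other = (not has_case_id) and (is_available or is_large)

print(ungrouped, grouped, other)  # True True False
```

The record matches under the implied grouping only because the size test is combined last with OR. Under the other grouping the same record would be rejected.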

Recursive search (RECURSIVE)

RECURSIVE specifies whether the search is performed only in the root folder(s) specified by the IN clause or is extended to their subfolders as well.

RECURSIVE

For example, in the following node structure:

Folder1
   |-File1
   |-File2
   +-Folder2
      |-File3

Queries with and without RECURSIVE would return different result sets.

The following query would include results that are located only in Folder1 (excluding subfolders):

IN "/Folder1" WHERE type = "FILE"

By contrast, the following query uses RECURSIVE and therefore includes results from Folder1 and all of its subfolders:

IN "/Folder1" WHERE type = "FILE" RECURSIVE
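The difference between the two queries can be reproduced on the example node structure with a short Python sketch. It models folders as nested dictionaries and files as None. This representation is purely illustrative and does not reflect how the Platform stores files.

```python
# The node structure from the example above: folders are dicts, files are None.
TREE = {
    "Folder1": {
        "File1": None,
        "File2": None,
        "Folder2": {"File3": None},
    }
}


def find_files(folder, recursive=False):
    """List file names in a folder, descending into subfolders only if recursive."""
    found = []
    for name, child in folder.items():
        if child is None:
            found.append(name)  # a file directly in this folder
        elif recursive:
            found.extend(find_files(child, recursive=True))  # descend into subfolder
    return found


print(find_files(TREE["Folder1"]))                  # ['File1', 'File2']
print(find_files(TREE["Folder1"], recursive=True))  # ['File1', 'File2', 'File3']
```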