Important Points
- In nextflow.config, always set docker.enabled = false. This ensures compatibility with the platform’s execution environment and avoids unexpected failures caused by Docker execution logic.
- Workflows cannot send emails or ping external URLs.
- Code packages should not exceed 10 GB.
  - GUI / sbpack_nf CLI uploads: max 100 MB.
  - Avoid bundling large references inside packages. Provide them as external inputs.
- Inputs defined as folders prevent memoization and may reduce efficiency.
- Workflows may fail if a process generates too many output files (deep listing not supported).
- Avoid single-instance executions due to performance and compatibility issues.
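The Docker setting from the first point above can be written in nextflow.config as shown in this minimal fragment. It only touches the docker scope, so other settings in the file are unaffected.

```groovy
// nextflow.config
// Turn off Nextflow's own Docker execution logic, as the platform requires.
docker.enabled = false
```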
Guidelines for defining and optimizing resources for Nextflow workflows
Computational resource configuration for Nextflow apps is defined in config files, which can be loaded via user-selected profiles. Each config specifies CPU, RAM, and time requirements for workflow steps. In addition to these, instance hints can be provided for platform execution in the following ways:
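As a rough sketch of the CPU, RAM, and time settings described above, a config file with two selectable profiles might look like the following. The profile names (small_run, large_run) and all resource values are assumptions chosen for illustration. Only the process_high label appears elsewhere on this page.

```groovy
// nextflow.config (illustrative sketch; profile names and values are assumptions)
profiles {
    small_run {
        process {
            // Defaults applied to every process under this profile
            cpus   = 2
            memory = '8 GB'
            time   = '4h'
        }
    }
    large_run {
        process {
            cpus   = 4
            memory = '16 GB'
            time   = '12h'

            // Heavier settings for processes carrying the process_high label
            withLabel: process_high {
                cpus   = 16
                memory = '64 GB'
                time   = '24h'
            }
        }
    }
}
```

Keeping defaults at the top of each process scope and overriding only labeled steps keeps the file short and makes the largest requirement easy to spot.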
Defining Instance Hints
- By process name:
  process {
      withName: STAR_ALIGN {
          $sbgAWSInstanceHint = 'r5.8xlarge;ebs-gp2;1024'
      }
  }
- By process label:
  process {
      withLabel: process_high {
          $sbgAWSInstanceHint = 'r5.8xlarge;ebs-gp2;1024'
      }
  }

Use instance hints carefully. If the selected instance has fewer resources than a process requires, the task will fail.
Instance Selection Hierarchy
When instance hints are defined in multiple places, the following order of precedence applies:
- Execution Settings (set at runtime in the draft-task page). If set to default, the next level applies.
- sb_nextflow_schema.yaml instance hint.
- Config file instance hint.
- main.nf instance hint (discouraged).
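The order of precedence above amounts to "the first level with a hint set wins". The plain Groovy sketch below illustrates that logic only. The function resolveInstanceHint and the map keys are hypothetical names, not part of the platform, and treating 'default' as unset at every level is a simplification.

```groovy
// Hypothetical helper: returns the first hint that is set, in precedence order.
// Neither the function nor the map keys exist on the platform; they only
// illustrate the hierarchy described above.
def resolveInstanceHint(Map hints) {
    def order = ['executionSettings', 'schemaYaml', 'configFile', 'mainNf']
    for (level in order) {
        def value = hints[level]
        // A missing value or 'default' means "fall through to the next level"
        if (value != null && value != 'default') {
            return value
        }
    }
    return null  // no hint defined anywhere
}

// Execution Settings left at default, so the config file hint applies
def hint = resolveInstanceHint([
    executionSettings: 'default',
    configFile       : 'r5.8xlarge;ebs-gp2;1024',
    mainNf           : 'c5.4xlarge;ebs-gp2;512'
])
println hint
```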
Platform Instance Limits
- 3000 instances per platform.
- 25 parallel instances per user.
- 10 parallel instances per task.
- Multiple jobs can share one instance if it is large enough to accommodate their combined resource requirements.
Best Practices
- Avoid assigning a different instance to every process.
- Instead, define one instance at the task level that can handle the largest resource requirement of the workflow.
- This approach generally improves cost efficiency and runtime performance.