Upload via the command line
Overview
The Seven Bridges CLI Uploader is the recommended tool for uploading data to the Seven Bridges Platform. It is integrated into the Seven Bridges CLI.
The Command Line Uploader is recommended for large-scale uploads. For smaller uploads, we recommend using the Web uploader.
The maximum number of files you can submit for upload is 250,000.
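As a pre-flight check against this limit, the files under the paths you plan to upload can be counted before starting a job. The following is a minimal sketch, not part of the Seven Bridges CLI: the helper names `count_files` and `within_limit` are hypothetical, and it assumes that files inside submitted folders count toward the 250,000 limit.

```shell
# Hypothetical helpers (not part of the sb CLI).
# MAX_FILES mirrors the documented limit of 250,000 files per upload.
MAX_FILES=250000

# Print the number of regular files found under the given paths.
# One NUL byte is emitted per file, so unusual file names are counted once.
count_files() {
  find "$@" -type f -print0 | tr -dc '\0' | wc -c | tr -d ' '
}

# Succeed only if the total file count does not exceed MAX_FILES.
within_limit() {
  [ "$(count_files "$@")" -le "$MAX_FILES" ]
}
```

For example, `within_limit dir1 dir2 || echo "too many files for one upload job"` could be run before `sb upload start`.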
COMMAND LINE OPTIONS
Syntax:
sb [global-parameters] upload <upload-subcommand> [command-parameters]
The supported global parameters are:
Option | Short parameter | Description |
---|---|---|
--config <string> | | Configuration file to use instead of the default one. |
--help | -h | Display help. |
--profile <string> | | Configuration profile to use from the credentials file (default: "default"). This parameter applies only to the "start" subcommand. The "resume" and "delete" subcommands use the same profile that was used to start the upload job, and the parameter does not apply to the "status" subcommand. |
--debug | | Run the command with debug information in the output. |
Subcommands
Subcommand | Description |
---|---|
start | Start the upload job |
status | Check the upload status |
resume | Resume the upload job |
delete | Delete an upload job. Please note that only jobs that are completed or paused can be deleted. To pause an upload job, use CTRL+C. |
Read below for detailed information and instructions for all of the commands.
Start the upload
This command initializes a new upload job. Each job is identified by a unique name, which can be assigned manually (using the corresponding command parameter) or automatically. The job name can be used to track the upload status and to resume the job if it is paused.
The upload runs in the foreground of the current command shell session. The session must remain open while the upload command is executing; if it is closed, the upload is paused and can be resumed later. While the upload is running, the session does not accept any other input.
You can perform multiple uploads (i.e. start multiple upload jobs) on a single machine. The following are the statuses for an upload job:
- Initializing - the list of items (files and folders) to be uploaded is being prepared.
- Running - the items are being uploaded.
- Paused - the upload job has been interrupted, either by terminating the active command shell or by pressing CTRL+C.
- Completed - all items have been processed and the upload job is done; this does not necessarily mean that all items were uploaded successfully. The log file contains more detailed information about the upload job, including failed items. The status command provides an overview of the upload job, including the number of uploaded, failed, skipped, and remaining items.
sb [global-parameters] upload start [<input> | --manifest-file <manifest-file>] --destination <path> [command-parameters]
Required parameters
Name | Description |
---|---|
--destination <string> | Upload destination, which can be either project root or a specific folder inside a Platform project. The destination can be specified either by project name or ID (e.g. "sbguser/sbgproject" or "sbguser/sbgproject/directory1" or "5dc01c9ae4b01e2d090a700a"). The project name should be specified by combining the information about the project owner and the name {project_owner}/{project}, where {project_owner} is the username of the user who created the project and {project} is the project slug (e.g. "rfranklin/my-project"). |
Optional parameters
Name | Description |
---|---|
<input> | The item (file or folder) or list of items to be uploaded. By default, the source folder structure is preserved on the Platform. The <input> and --manifest-file parameters are mutually exclusive, and one of them must be included in the start command. |
--manifest-file <manifest-file> | The manifest file which contains the list of items that will be uploaded as well as the accompanying metadata (only applies to files). By default, the source folder structure is preserved on the Platform after the upload. The <input> and --manifest-file parameters are mutually exclusive, and one of them must be included within the start command. Learn about the manifest file format. |
--name <string> | The upload job name, which must be unique. The allowed characters are A-Za-z0-9_-. The maximum number of characters is 20. If omitted, the name is assigned automatically using the format "DAY_DDMMMYYYY_HHMMSS" (for example, Thu_27Feb2020_231455). |
--autorename | If a file with the same name already exists at the upload destination, an underscore followed by a serial number will be added as a prefix. In case of a folder, the contents of both folders will be merged. If a file and a folder bear the same name, the upload will be skipped. The --autorename and --overwrite parameters are mutually exclusive. If both are omitted, SKIP is used as the default method. |
--overwrite | If a file with the same name already exists at the upload destination, it will be overwritten by the file that is uploaded. In case of a folder, the contents of both folders will be merged. If a file and a folder bear the same name, the upload will be skipped. The --autorename and --overwrite parameters are mutually exclusive. If both are omitted, SKIP is used as the default method. |
--tag <string> | Tag which will be set for each file once its upload is complete. Please note that tagging is not applicable to folders on the Seven Bridges Platform. You can set up to 32 tags, each up to 64 characters long. |
--chunk-size <int> | Preferred size of the upload part in bytes. If omitted, the default value of 64 MiB is used. The minimum value is 5243000. If you have an unstable network connection, we recommend a smaller value, for example --chunk-size 8000000. |
--parallel <int> | Maximum number of parallel file uploads. The allowed range for this value is 1 to 8. The default value is 8. If you have a low upload speed or a low filesystem read speed (e.g. magnetic disks), or if you want to limit the system resources used by the uploader, we recommend using --parallel 4 or --parallel 2. |
--speed-limit <int> | Maximum allowed network bandwidth for the process that executes this upload job. Should be specified in kbps (kilobits per second). If omitted, the maximum possible bandwidth will be used. |
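The limits in the table above can be checked in a wrapper script before a job is started. The following is a minimal sketch under stated assumptions: all function names are hypothetical and are not part of the sb CLI, the limits are copied from the table, and the date format string is only an approximation of the documented "DAY_DDMMMYYYY_HHMMSS" pattern.

```shell
# Hypothetical helpers (not part of the sb CLI); limits copied from the table above.
MIN_CHUNK_BYTES=5243000

# --name: 1 to 20 characters from A-Za-z0-9_-
valid_job_name() {
  [ -n "$1" ] && [ "${#1}" -le 20 ] || return 1
  case "$1" in
    *[!A-Za-z0-9_-]*) return 1 ;;
  esac
  return 0
}

# --parallel: a single integer from 1 to 8
valid_parallel() {
  case "$1" in
    [1-8]) return 0 ;;
    *) return 1 ;;
  esac
}

# --chunk-size: a whole number of bytes, not below the documented minimum
valid_chunk_size() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -ge "$MIN_CHUNK_BYTES" ]
}

# --speed-limit is given in kbps; convert from Mbps (1 Mbps = 1000 kbps)
mbps_to_kbps() {
  echo $(( $1 * 1000 ))
}

# A name shaped like the automatic one, e.g. Thu_27Feb2020_231455
default_style_name() {
  LC_ALL=C date +%a_%d%b%Y_%H%M%S
}
```

For example, `valid_chunk_size 8000000 && valid_parallel 2` succeeds, so those values could be passed to `--chunk-size` and `--parallel`, and `mbps_to_kbps 10` prints a value suitable for `--speed-limit`.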
The following table illustrates the entire naming conflict resolution mechanism:
Check the upload status
Use the status command to check the upload status. You can check the status of all upload jobs by omitting the name parameter.
sb [global-parameters] upload status [<name>]
This will return the following information about each of the upload jobs:
- Upload job name
- Status
- Processed (percent completed, bytes uploaded / total bytes)
- Average upload speed (total bytes uploaded / total time spent in "Running" status)
- ETA (Estimated Time of Arrival - expected remaining time for job completion)
To check a specific upload job, specify the name of the upload job. This will return more detailed information about the upload job:
- Upload job name
- Status
- Log file path
- Time submitted
- Upload command (note: if a wildcard is used in the upload command, the expanded command will be displayed)
- Total number of submitted files
- Number of uploaded files (successfully uploaded to the Platform)
- Number of skipped files
- Number of failed files
- Number of remaining files (files in "Queued" and "Uploading" state)
- Processed (percent completed, bytes uploaded / total bytes)
- Average upload speed (total bytes uploaded / total time spent in "Running" status)
- ETA (Estimated Time of Arrival - expected remaining time for job completion)
Information about a job is kept in the job history for at least 1 month after the job has been completed or paused, unless the job is deleted (using the sb upload delete command, see below). After this time, completed and paused jobs are removed from the list of upload jobs.
Optional parameters
Name | Description |
---|---|
<name> | The name of the upload job. Use this parameter to obtain more detailed information about a specified job. |
Resume an upload job
Use this command to resume a previously paused upload job. The upload will resume its execution from where it had been paused.
sb [global-parameters] upload resume [<name>]
Optional parameters
Name | Description |
---|---|
<name> | Specify the name of the paused job you wish to resume. This parameter is required unless there's only one paused job. |
Delete an upload job
Use this command to delete a specified upload job or all upload jobs. Only jobs that have been completed or paused can be deleted. To pause the job, use CTRL+C.
sb [global-parameters] upload delete [<name> | --all ]
Please keep in mind that an upload job is kept in the job history for (at least) 1 month after it has been completed or paused and is automatically deleted from the list after that time.
Optional parameters
Name | Description |
---|---|
<name> | Upload job name, required if the --all parameter is not used. The <name> and --all parameters are mutually exclusive. One or the other is required. |
--all | Deletes all upload jobs which are COMPLETED or PAUSED. The <name> and --all parameters are mutually exclusive. One or the other is required. |
Examples
This section lists several examples of how to upload files. The destination project used in all of the examples is rfranklin/my-project.
Please note that tags will not be shown in the examples.
Initial state
The following snippet shows the example of a directory tree on a local computer (from the local path where sb commands are executed):
├── dir1
├── dir2
│ ├── dir2-1
│ │ └── file4.bam
│ └── dir2-2
├── file1.bam
├── file2.bam
└── file3.bam
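To follow the examples on your own machine, the layout above can be recreated in a scratch directory. This is a minimal sketch: the function name `make_example_tree` is hypothetical, and the files it writes are small text placeholders rather than real BAM files.

```shell
# Hypothetical helper: recreate the example directory tree under the given root.
make_example_tree() {
  root=$1
  # dir1 and dir2/dir2-2 are empty folders in the example layout.
  mkdir -p "$root/dir1" "$root/dir2/dir2-1" "$root/dir2/dir2-2"
  # Placeholder content only; these are not valid BAM files.
  for f in file1.bam file2.bam file3.bam dir2/dir2-1/file4.bam; do
    printf 'placeholder\n' > "$root/$f"
  done
}
```

Calling `make_example_tree ./upload-demo` and then changing into `./upload-demo` gives a working directory that matches the tree shown above.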
Example 1
Upload a folder and 2 files into the project root and tag the uploaded files.
Command
sb upload start dir1/ file1.bam file2.bam --destination rfranklin/my-project --tag upload1
Terminal output
upload job name: Thu_08Apr2021_182030
COMPLETE
Successfully uploaded 2 of 2 files
Result on the Platform
├── dir1
├── file1.bam
└── file2.bam
Example 2
Upload a folder (with underlying folder structure and one file included) and 3 files into an existing folder within the project, while:
- assigning a name to the upload job (for easier status tracking)
- tagging uploaded files
- and specifying a custom setting for the number of parallel file uploads (covering the low upload speed case).
Command
sb upload start dir2 file1.bam file2.bam file3.bam --destination rfranklin/my-project/dir1 --name upload2 --tag upload2 --parallel 2
Terminal output
upload job name: upload2
COMPLETE
Successfully uploaded 4 of 4 files
Result on the Platform
The following will be the result on the Platform, assuming that Example 1 (see above) has already been executed and that its result remains intact:
├── dir1
│ ├── dir2
│ │ ├── dir2-1
│ │ │ └── file4.bam
│ │ └── dir2-2
│ ├── file1.bam
│ ├── file2.bam
│ └── file3.bam
├── file1.bam
└── file2.bam
Example 3
Upload 2 folders (with underlying folder structure and one file included) and 1 file into the project root, while:
- choosing auto-rename as a method for resolving name conflict
- tagging uploaded files
- and specifying a custom setting for the chunk size (covering the unstable network connection case).
Command
sb upload start dir1/ dir2/ file1.bam --destination rfranklin/my-project --autorename --tag upload3 --chunk-size 8000000
Terminal output
upload job name: Thu_08Apr2021_182550
COMPLETE
Successfully uploaded 2 of 2 files
Result on the Platform
The following will be the result on the Platform, assuming that the two previous examples have already been executed:
├── _1_file1.bam
├── dir1
│ ├── dir2
│ │ ├── dir2-1
│ │ │ └── file4.bam
│ │ └── dir2-2
│ ├── file1.bam
│ ├── file2.bam
│ └── file3.bam
├── dir2
│ ├── dir2-1
│ │ └── file4.bam
│ └── dir2-2
├── file1.bam
└── file2.bam
Example 4
Check the status for all upload jobs.
Command
sb upload status
Terminal output
Upload job name Status Processed Average speed Estimated time
Thu_08Apr2021_182030 COMPLETED 100% (85.00/85.00B) 85.00 Bps N/A
upload2 COMPLETED 100% (114.00/114.00B) 38.00 Bps N/A
Thu_08Apr2021_182550 COMPLETED 100% (68.00/68.00B) 68.00 Bps N/A
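A listing like the one above can be filtered in a script, for instance to find jobs that still need attention. The sketch below is not part of the sb CLI: the function name `unfinished_jobs` is hypothetical, and it assumes that the first line of the output is the header and that each following row is whitespace-separated with the job name first and the status second, as in the sample output.

```shell
# Hypothetical helper: read the summary listing on standard input and print
# the names of jobs whose status is anything other than COMPLETED.
unfinished_jobs() {
  # NR > 1 skips the header row; $1 is the job name, $2 the status.
  awk 'NR > 1 && NF >= 2 && $2 != "COMPLETED" { print $1 }'
}

# Intended usage (requires the sb CLI):
#   sb upload status | unfinished_jobs
```

With the output shown in this example, the function prints nothing, because all three jobs are COMPLETED.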
Example 5
Get the detailed status for an upload job.
Command
sb upload status upload2
Terminal output
Upload job name: upload2
Status: COMPLETED
Log file path: /home/nikola/.sevenbridges/sb/logs/sb.log
Time submitted: 08.04.2021 18:23
Command: sb upload start dir2 file1.bam file2.bam file3.bam --destination rfranklin/my-project/dir1 --name upload2 --tag upload2 --parallel 2
Total files: 4
Total size: 114.00 B
# uploaded: 4
# skipped: 0
# failed: 0
# remaining: 0
% processed: 100%
Average upload speed: 38.00 Bps
ETA N/A
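Because a COMPLETED job can still contain failed items, a script may want to check the failed count in the detailed report as well as the status. The following is a minimal sketch, assuming the "Label: value" layout shown above; the function name `upload_succeeded` is hypothetical and not part of the sb CLI.

```shell
# Hypothetical helper: read a detailed status report on standard input and
# succeed only if the status is COMPLETED and the failed count is 0.
upload_succeeded() {
  awk -F': *' '
    { sub(/[ \t\r]+$/, "", $2) }            # drop trailing whitespace from the value
    $1 == "Status"   { status = $2 }
    $1 == "# failed" { failed = $2 }
    END { exit !(status == "COMPLETED" && failed == "0") }
  '
}

# Intended usage (requires the sb CLI):
#   sb upload status upload2 | upload_succeeded || echo "check the log file"
```

If the check fails, the log file path printed in the report is the place to look for the failed items.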
Example 6
Delete a single upload job and check the status.
Commands
sb upload delete upload2
sb upload status
Terminal output
Upload job name Status Processed Average speed Estimated time
Thu_08Apr2021_182030 COMPLETED 100% (85.00/85.00B) 85.00 Bps N/A
Thu_08Apr2021_182550 COMPLETED 100% (68.00/68.00B) 68.00 Bps N/A
Example 7
Delete all upload jobs and check the status.
Commands
sb upload delete --all
sb upload status
Terminal output
Upload job name Status Processed Average speed Estimated time