To accommodate the enormous variety in syntax and semantics for input, runtime environment, invocation, and output of arbitrary programs, a CommandLineTool defines an "input binding" that describes how to translate abstract input parameters to a concrete program invocation, and an "output binding" that describes how to generate output parameters from program output.
The tool command line is built by applying command line bindings to the input object. Bindings are listed either as part of an input parameter using the inputBinding field, or separately using the arguments field of the CommandLineTool.
The algorithm to build the command line is as follows. In this algorithm, the sort key is a list consisting of one or more numeric or string elements. Strings are sorted lexicographically based on UTF-8 encoding.
- Collect CommandLineBinding objects from arguments. Assign a sorting key [position, i] where position is CommandLineBinding.position and i is the index in the arguments list.
- Collect CommandLineBinding objects from the inputs schema and associate them with values from the input object. Where the input type is a record, array, or map, recursively walk the schema and input object, collecting nested CommandLineBinding objects and associating them with values from the input object.
- Create a sorting key by taking the value of the position field at each level leading to each leaf binding object. If position is not specified, it is not added to the sorting key. For bindings on arrays and maps, the sorting key must include the array index or map key following the position. If and only if two bindings have the same sort key, the tie must be broken using the ordering of the field or parameter name immediately containing the leaf binding.
- Sort elements using the assigned sorting keys. Numeric entries sort before strings.
- In the sorted order, apply the rules defined in CommandLineBinding to convert bindings to actual command line elements.
- Insert elements from baseCommand at the beginning of the command line.
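The sorting and assembly steps above can be sketched in Python. This is an illustrative sketch, not a reference implementation: the function names are hypothetical, each binding is assumed to have already been converted to a single command line string, and sorting keys are assumed to be complete (the tie-breaking rule on field or parameter names is not implemented).

```python
def _element_key(element):
    # Numeric entries sort before strings; strings compare by UTF-8 bytes.
    if isinstance(element, str):
        return (1, 0, element.encode("utf-8"))
    return (0, element, b"")


def sort_key(key):
    # Convert a mixed numeric/string sorting key into a comparable list.
    return [_element_key(element) for element in key]


def build_command_line(base_command, bindings):
    # bindings: list of (sorting_key, rendered_element) pairs (assumed shape).
    ordered = sorted(bindings, key=lambda pair: sort_key(pair[0]))
    # Elements from baseCommand go at the beginning of the command line.
    return list(base_command) + [element for _, element in ordered]


bindings = [
    ([2, "b"], "--beta"),
    ([1, 0], "-x"),
    ([2, 1], "--one"),
    ([2, "a"], "--alpha"),
]
print(build_command_line(["tool"], bindings))
```

Wrapping each key element in a tuple avoids comparing numbers with strings directly, which Python does not allow.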
All files listed in the input object must be made available in the runtime environment. The implementation may use a shared or distributed file system or transfer files via explicit download to the host. Implementations may choose not to provide access to files not explicitly specified in the input object or process requirements.
Output files produced by tool execution must be written to the designated output directory. The initial current working directory when executing the tool must be the designated output directory. The designated output directory should be empty, except for files or directories specified using InitialWorkDirRequirement.
Files may also be written to the designated temporary directory. This directory must be isolated and not shared with other processes. Any files written to the designated temporary directory may be automatically deleted by the workflow platform immediately after the tool terminates.
For compatibility, files may be written to the system temporary directory, which must be located at /tmp. Because the system temporary directory may be shared with other processes on the system, files placed in the system temporary directory are not guaranteed to be deleted automatically. A tool must not use the system temporary directory as a back channel for communication with other tools. It is valid for the system temporary directory to be the same as the designated temporary directory.
When executing the tool, the tool must execute in a new, empty environment with only the environment variables described below; the child process must not inherit environment variables from the parent process except as specified or at user option.
- HOME must be set to the designated output directory.
- TMPDIR must be set to the designated temporary directory.
- PATH may be inherited from the parent process, except when run in a container that provides its own PATH.
- Variables defined by EnvVarRequirement
- The default environment of the container, such as when using DockerRequirement
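The environment rules above can be sketched as follows. The function and parameter names are hypothetical, and the order in which the sources are applied (container defaults, then EnvVarRequirement, then HOME and TMPDIR) is an assumption made for this sketch; the text above does not specify a precedence.

```python
import os


def build_tool_environment(outdir, tmpdir, env_var_requirement=None,
                           container_env=None, parent_env=None):
    # Start from an empty environment; nothing is inherited implicitly.
    parent_env = os.environ if parent_env is None else parent_env
    env = {}
    # Default environment of the container, if one is used.
    if container_env:
        env.update(container_env)
    # PATH may be inherited unless the container provides its own.
    if "PATH" not in env and "PATH" in parent_env:
        env["PATH"] = parent_env["PATH"]
    # Variables defined by EnvVarRequirement.
    env.update(env_var_requirement or {})
    # Assumed precedence: HOME and TMPDIR are set last.
    env["HOME"] = outdir
    env["TMPDIR"] = tmpdir
    return env


env = build_tool_environment(
    "/out", "/tmp/job",
    env_var_requirement={"FOO": "1"},
    parent_env={"PATH": "/bin", "SECRET": "s"},
)
print(sorted(env))
```

The resulting dictionary could be passed as the env argument of subprocess.run so that the child process does not inherit the parent's variables.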
An implementation may forbid the tool from writing to any location in the runtime environment file system other than the designated temporary directory, system temporary directory, and designated output directory. An implementation may provide read-only input files, and disallow in-place update of input files. The designated temporary directory, system temporary directory and designated output directory may each reside on different mount points on different file systems.
An implementation may forbid the tool from directly accessing network resources. Correct tools must not assume any network access unless they have the 'networkAccess' field of a 'NetworkAccess' requirement set to true, but even then this does not imply a publicly routable IP address or the ability to accept inbound connections.
The runtime section available in parameter references and expressions contains the following fields. As noted earlier, an implementation may perform deferred resolution of runtime fields by providing opaque strings for any or all of the following fields; parameter references and expressions may only use the literal string value of the field and must not perform computation on the contents.
- runtime.outdir: an absolute path to the designated output directory
- runtime.tmpdir: an absolute path to the designated temporary directory
- runtime.cores: number of CPU cores reserved for the tool process
- runtime.ram: amount of RAM in mebibytes (2**20) reserved for the tool process
- runtime.outdirSize: reserved storage space available in the designated output directory
- runtime.tmpdirSize: reserved storage space available in the designated temporary directory
For cores, ram, outdirSize and tmpdirSize, if an implementation can't provide the actual number of reserved resources at expression evaluation time, it should report back the minimal requested amount.
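The fallback rule for resource fields might look like the sketch below. The function name and the two dictionaries (reserved and requested_min) are hypothetical stand-ins for whatever an implementation tracks internally.

```python
RESOURCE_FIELDS = ("cores", "ram", "outdirSize", "tmpdirSize")


def build_runtime(outdir, tmpdir, reserved, requested_min):
    # reserved: actual reservations known at evaluation time (may be partial).
    # requested_min: minimal requested amount for each resource field.
    runtime = {"outdir": outdir, "tmpdir": tmpdir}
    for field in RESOURCE_FIELDS:
        value = reserved.get(field)
        # Fall back to the minimal requested amount when the actual
        # reservation is not yet known.
        runtime[field] = requested_min[field] if value is None else value
    return runtime


runtime = build_runtime(
    "/out", "/tmp/job",
    reserved={"cores": 4},
    requested_min={"cores": 1, "ram": 256, "outdirSize": 1024, "tmpdirSize": 1024},
)
print(runtime["cores"], runtime["ram"])
```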
See ResourceRequirement for details on how to describe the hardware resources required by a tool.
The standard input stream, the standard output stream, and/or the standard error stream may be redirected as described in the stdin, stdout, and stderr fields.
Once the command line is built and the runtime environment is created, the actual tool is executed.
The standard error stream and standard output stream may be captured by platform logging facilities for storage and reporting.
Tools may be multithreaded or spawn child processes; however, when the parent process exits, the tool is considered finished regardless of whether any detached child processes are still running. Tools must not require any kind of console, GUI, or web-based user interaction in order to start and run to completion.
The exit code of the process indicates if the process completed successfully. By convention, an exit code of zero is treated as success and non-zero exit codes are treated as failure. This may be customized by providing the fields successCodes, temporaryFailCodes, and permanentFailCodes. An implementation may choose to default unspecified non-zero exit codes to either temporaryFailure or permanentFailure.
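One way to express the exit code rules is sketched below. The function name and the "success" return label are hypothetical, and checking the explicit lists before applying the zero-is-success convention is an assumption of this sketch rather than something the text above mandates.

```python
def classify_exit_code(code, success_codes=(), temporary_fail_codes=(),
                       permanent_fail_codes=(),
                       default_failure="permanentFailure"):
    # Explicitly listed codes are checked first (assumed precedence).
    if code in success_codes:
        return "success"
    if code in temporary_fail_codes:
        return "temporaryFailure"
    if code in permanent_fail_codes:
        return "permanentFailure"
    # By convention, zero is success.
    if code == 0:
        return "success"
    # Unspecified non-zero codes default to an implementation choice.
    return default_failure


print(classify_exit_code(0))
print(classify_exit_code(75, temporary_fail_codes=[75]))
```

The default_failure parameter models the implementation's freedom to treat unspecified non-zero exit codes as either temporaryFailure or permanentFailure.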
The exit code of the process is available to expressions in outputEval as runtime.exitCode.
If the output directory contains a file named "cwl.output.json", that file must be loaded and used as the output object. Otherwise, the output object must be generated by walking the parameters listed in outputs and applying output bindings to the tool output. Output bindings are associated with output parameters using the outputBinding field. See CommandOutputBinding for details.
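The choice between the two sources of the output object can be sketched as follows. The function name and the apply_binding callback are hypothetical; the callback stands in for the CommandOutputBinding rules, which are not reproduced here.

```python
import json
from pathlib import Path


def collect_outputs(outdir, output_names, apply_binding):
    # If cwl.output.json exists, it is loaded and used as the output object.
    candidate = Path(outdir) / "cwl.output.json"
    if candidate.is_file():
        with candidate.open(encoding="utf-8") as handle:
            return json.load(handle)
    # Otherwise, walk the declared outputs and apply each output binding.
    return {name: apply_binding(name, outdir) for name in output_names}
```

A caller would pass the designated output directory, the names of the parameters listed in outputs, and a function that evaluates the output binding for one parameter.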