cwl wip notes
https://www.research.manchester.ac.uk/portal/files/57032699/cwl_1.0_intro.pdf
gather/scatter: http://www.commonwl.org/v1.0/Workflow.html#WorkflowStep
- providing name of docker image as input parameter
- wanted to provide it as an input, to connect generating the image with running the tests in one workflow, but it looks like this is not possible: https://www.biostars.org/p/263163/
- but what about just having one docker image name? If I scatter from the very beginning, would these images be created in the same "system", or would they be independent?
- example of workflow:
- the main cwl file, cwl_workflow.cwl:
    #!/usr/bin/env cwl-runner
    cwlVersion: v1.0
    class: Workflow
    inputs:
      script_workf: File
      input_workf: File
      script_test: File
      data_ref: File
      report_txt: string
    outputs:
      # workflowout0:
      #   type: File
      #   outputSource: workflow/output_files_0
      # workflowout1:
      #   type: File
      #   outputSource: workflow/output_files_1
      testout:
        type: File
        outputSource: test/output_files_report
    steps:
      workflow:
        run: cwl_mycode.cwl
        in:
          script: script_workf
          input_files_0: input_workf
        out: [output_files_0, output_files_1]
      test:
        run: cwl_test.cwl
        in:
          script: script_test
          input_files_out: workflow/output_files_0
          input_files_ref: data_ref
          input_files_report: report_txt
        out: [output_files_report]
- first step, cwl_mycode.cwl:
    #!/usr/bin/env cwl-runner
    cwlVersion: v1.0
    class: CommandLineTool
    baseCommand: python
    hints:
      DockerRequirement:
        dockerPull: repronim/regtests:2dcf653549f1dd740c2a2b9b4d2ed0d548576e67
    inputs:
      script:
        type: File
        inputBinding:
          position: 1
      input_files_0:
        type: File
        inputBinding:
          position: 2
          prefix: -f
    outputs:
      output_files_0:
        type: File
        outputBinding:
          glob: list_sorted.json
      output_files_1:
        type: File
        outputBinding:
          glob: sum_list.json
- second step, cwl_test.cwl:
    #!/usr/bin/env cwl-runner
    cwlVersion: v1.0
    class: CommandLineTool
    baseCommand: python
    inputs:
      script:
        type: File
        inputBinding:
          position: 1
      input_files_out:
        type: File
        inputBinding:
          position: 2
          prefix: -out
      input_files_ref:
        type: File
        inputBinding:
          position: 3
          prefix: -ref
      input_files_report:
        #type: File
        type: string
        inputBinding:
          position: 4
          prefix: -report
    outputs:
      output_files_report:
        type: File
        outputBinding:
          glob: $(inputs.input_files_report)
- input file, input_workflow.yml:
    script_workf:
      class: File
      path: /Users/dorota/regtests/workflows4regtests/basic_examples/sorting_list/workflow/sorting.py
    input_workf:
      class: File
      path: /Users/dorota/regtests/workflows4regtests/basic_examples/sorting_list/data_input/list2sort.json
    script_test:
      class: File
      path: /Users/dorota/regtests/testing_functions/test_obj_eq.py
    data_ref:
      class: File
      path: /Users/dorota/regtests/workflows4regtests/basic_examples/sorting_list/data_ref/list_sorted.json
    report_txt: raport_test.txt  # not in the original listing; inferred from the output file name
- to run:
    cwl-runner cwl_workflow.cwl input_workflow.yml
  as an output you'll get raport_test.txt
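On the gather/scatter question from the top of these notes: a minimal sketch of how the first step might be scattered over several input files. The file name cwl_workflow_scatter.cwl and the output name sorted are made up for illustration, input_workf is assumed to become an array, and the test step is left out. As far as I understand, each scattered job is started as its own container from the same pulled image, so the jobs should be independent of each other, but this is untested here.

```yaml
#!/usr/bin/env cwl-runner
# cwl_workflow_scatter.cwl (hypothetical file name)
cwlVersion: v1.0
class: Workflow
requirements:
  ScatterFeatureRequirement: {}   # required for the "scatter" keyword on a step
inputs:
  script_workf: File
  input_workf: File[]             # assumption: a list of files instead of a single one
outputs:
  sorted:                         # hypothetical output name
    type: File[]                  # one list_sorted.json per scattered job
    outputSource: workflow/output_files_0
steps:
  workflow:
    run: cwl_mycode.cwl
    scatter: input_files_0        # run the step once per element of input_workf
    in:
      script: script_workf
      input_files_0: input_workf
    out: [output_files_0, output_files_1]
```

In the job file, input_workf would then be a YAML list of `class: File` entries.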
- I don't know how to use input_files_report as a File in cwl_test.cwl, so that I can take an input file, add some lines and return it as a modified output. For now I only provide the name (a string) as an input and the workflow creates a new file. That is probably the better (more cwl) way of doing it.
- if you want the input to be in the working directory you can add a requirements part with Javascript expressions:
    requirements:
      - class: InlineJavascriptRequirement
      - class: InitialWorkDirRequirement
        listing:
          - $(inputs.data_input)
    inputs:
      data_input:
        type: File
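Building on the InitialWorkDirRequirement idea, the earlier question about passing input_files_report as a File and returning it modified might be handled by staging the file as a writable copy. This is only a sketch: it assumes the test script appends to the file given by -report (not checked), it omits the input_files_out and input_files_ref inputs for brevity, and the workflow input report_txt would have to become a File as well.

```yaml
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
baseCommand: python
requirements:
  InitialWorkDirRequirement:
    listing:
      - entry: $(inputs.input_files_report)
        writable: true            # stage a private, modifiable copy in the working directory
inputs:
  script:
    type: File
    inputBinding:
      position: 1
  # input_files_out and input_files_ref would stay as in cwl_test.cwl
  input_files_report:
    type: File                    # a File now, instead of a string
    inputBinding:
      position: 4
      prefix: -report
outputs:
  output_files_report:
    type: File
    outputBinding:
      # the staged copy keeps its original file name
      glob: $(inputs.input_files_report.basename)
```

The original file outside the working directory should stay untouched; only the staged copy is modified and collected as output.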
- it didn't work because FSLDIR was not set properly; the bash script works if --preserve-entire-environment is used:
    cwl-runner --preserve-entire-environment --tmp-outdir-prefix=/tmp/tmp cwl_bash_test.cwl cwl_bash_test_input.yml
- I was assuming that the bash script sets everything itself; ask Satra whether this is the proper behaviour
- the problem was that the default cwl docker command takes the user/uid from the local machine, i.e. --user=503:20, and that uid can't be found in the container (I have no idea why the error appears while running the workflow and not earlier). We can ask it not to use the local user:
    cwl-runner --no-match-user cwl_docker_test.cwl cwl_docker_test_input.yml
  and it works
- still have no idea how to mount a directory with data.
Within /Users/dorota/simple_workflow/simple_workflow/scripts/ I'm testing cwl by running the simple workflow with two approaches:
- cwl-runner cwl_test.cwl cwl_test_input.yml, where cwl_test.cwl:
    #!/usr/bin/env cwl-runner
    cwlVersion: v1.0
    class: CommandLineTool
    baseCommand: python
    inputs:
      script_py:
        type: File
        inputBinding:
          position: 1
      key_str:
        type: string
        inputBinding:
          position: 2
          prefix: --key
      n_str:
        type: int
        inputBinding:
          position: 3
          prefix: -n
    outputs: []
    requirements:
      EnvVarRequirement:
        envDef:
          PATH: /Users/dorota/simple_workflow/simple_workflow/miniconda/envs/bh_demo/bin:/Users/dorota/anaconda/envs/cwl_py3/bin:/usr/local/fsl/bin:/Users/dorota/anaconda/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/opt/X11/bin:/Library/TeX/texbin:/Applications/git-annex.app/Contents/MacOS
          FSLOUTPUTTYPE: NIFTI_GZ
  and cwl_test_input.yml:
    script_py:
      class: File
      path: run_demo_workflow.py
    key_str: 11an55u9t2TAf0EV2pHN0vOd8Ww2Gie-tHp9xGULh_dA
    n_str: 1
- running bash script:
    cwl-runner --tmp-outdir-prefix=/Users/dorota/tmp cwl_bash_test.cwl cwl_bash_test_input.yml
  (--tmp-outdir-prefix is needed due to some conda problems when file names are too long; unfortunately it has to be in the home directory), where cwl_bash_test.cwl:
    #!/usr/bin/env cwl-runner
    cwlVersion: v1.0
    class: CommandLineTool
    baseCommand: bash
    inputs:
      script_py:
        type: File
        inputBinding:
          position: 1
      opt_str:
        type: string
        inputBinding:
          position: 2
    outputs: []
  and cwl_bash_test_input.yml:
    script_py:
      class: File
      path: Simple_Prep.sh
    opt_str: test
Errors
Both "work" in the sense that they start executing the workflow, but it fails:
    FileNotFoundError: File/Directory '/Users/dorota/tmp26xdz6bi/simple_workflow/scripts/output/metaflow/AnnArbor_sub16960/reorient_brain/scan_mprage_anonymized_reoriented.nii' not found for Reorient2Std output 'out_file'. Interface Reorient2Std failed to run.
The file indeed doesn't exist. Reorient2Std is not the first node to execute; download_url runs without problems. If I execute without cwl I get a nii.gz file, but I don't think that is the problem.
OTHER Problems
don't know how to add a path to the existing PATH (for now PATH is just provided as a full string) in cwl_test.cwl
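One possible workaround for the PATH question, as an untested sketch: let a shell do the prepending via ShellCommandRequirement instead of hard-coding the whole PATH in EnvVarRequirement. The directory /extra/bin is a placeholder, and the approach relies on the runner handing some PATH to the job, which the CWL spec allows but does not guarantee.

```yaml
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
requirements:
  ShellCommandRequirement: {}     # the command line is built as one shell string
arguments:
  # not shell-quoted, so $PATH is expanded by the shell at run time;
  # /extra/bin is a placeholder directory
  - valueFrom: 'export PATH=/extra/bin:$PATH && python'
    shellQuote: false
inputs:
  script_py:
    type: File
    inputBinding:
      position: 1
  key_str:
    type: string
    inputBinding:
      position: 2
      prefix: --key
  n_str:
    type: int
    inputBinding:
      position: 3
      prefix: -n
outputs: []
```

The inputs are still quoted as usual and are appended after python, since arguments without an explicit position sort before inputs at positions 1 to 3.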
- cwl-runner cwl_docker_test.cwl cwl_docker_test_input.yml, where cwl_docker_test.cwl:
    #!/usr/bin/env cwl-runner
    cwlVersion: v1.0
    class: CommandLineTool
    #baseCommand: python #docker image already starts with python
    hints:
      DockerRequirement:
        dockerPull: repronim/simple_workflow:latest
        dockerOutputDirectory: /other #dj testing (i guess i don't need it)
    inputs:
      script_py:
        type: File
        inputBinding:
          position: 1
      key_str:
        type: string
        inputBinding:
          position: 2
          prefix: --key
      n_str:
        type: int
        inputBinding:
          position: 3
          prefix: -n
    outputs: []
  and cwl_docker_test_input.yml:
    script_py:
      class: File
      path: run_demo_workflow.py
    key_str: 11an55u9t2TAf0EV2pHN0vOd8Ww2Gie-tHp9xGULh_dA
    n_str: 1
ERROR It also starts running the workflow, but gives an error (in download_url):
    KeyError: 'getpwuid(): uid not found: 503'
OTHER PROBLEMS don't know how to provide an input file to the docker container, or how to mount an additional directory (some directories are mounted automatically)
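A possible direction for the mounting question, as a sketch only: in CWL v1.0 an input can be declared with type Directory, and the runner is then responsible for making that directory visible inside the container, the same way it already does for script_py. The input name data_dir and the --data flag are hypothetical; run_demo_workflow.py would need to accept such an option.

```yaml
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
hints:
  DockerRequirement:
    dockerPull: repronim/simple_workflow:latest
inputs:
  script_py:
    type: File
    inputBinding:
      position: 1
  data_dir:                       # hypothetical input name
    type: Directory               # staged into the container by the runner
    inputBinding:
      position: 2
      prefix: --data              # hypothetical flag, not a known option of the script
outputs: []

# Matching job-file entry (placeholder path):
# data_dir:
#   class: Directory
#   path: /path/to/data
```

The path the script receives is the location inside the container, not the host path, so the script should not assume the original directory name.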