cwl wip notes

Dorota Jarecka edited this page Feb 13, 2018 · 8 revisions

READING

https://www.research.manchester.ac.uk/portal/files/57032699/cwl_1.0_intro.pdf

gather/scatter: http://www.commonwl.org/v1.0/Workflow.html#WorkflowStep

Feb 10th

  • providing the name of the docker image as an input parameter
  • I wanted to provide it as an input, to connect generating the image with running the tests in one workflow, but it looks like this is not possible: https://www.biostars.org/p/263163/
  • but what about having just one docker image name? If I scatter from the very beginning, would these images be created in the same "system", or would they be independent?
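
For reference, a minimal sketch of what scattering a step looks like in CWL v1.0. It reuses cwl_mycode.cwl from the Dec 9th notes; the input name input_workf_list and the output name sorted_lists are hypothetical. It only shows the scatter mechanics and does not by itself settle whether the docker images end up shared between the scattered jobs.

```yaml
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: Workflow

requirements:
  ScatterFeatureRequirement: {}   # needed for any step that uses "scatter"

inputs:
  script_workf: File
  input_workf_list: File[]        # hypothetical: one input file per scattered job

outputs:
  sorted_lists:                   # hypothetical name; scattered outputs become arrays
    type: File[]
    outputSource: workflow/output_files_0

steps:
  workflow:
    run: cwl_mycode.cwl
    scatter: input_files_0        # run the tool once per element of input_workf_list
    in:
      script: script_workf
      input_files_0: input_workf_list
    out: [output_files_0, output_files_1]
```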

Dec 9th

  • example of workflow:
  • the main cwl file, cwl_workflow.cwl:
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: Workflow

inputs:
  script_workf: File
  input_workf: File
  script_test: File
  data_ref: File
  report_txt: string

outputs:
#  workflowout0:
#    type: File
#    outputSource: workflow/output_files_0
#  workflowout1:
#    type: File
#    outputSource: workflow/output_files_1
  testout:
    type: File
    outputSource: test/output_files_report

steps:
  workflow:
    run: cwl_mycode.cwl
    in:
      script: script_workf
      input_files_0: input_workf
    out: [output_files_0, output_files_1]

  test:
    run: cwl_test.cwl
    in:
      script: script_test
      input_files_out: workflow/output_files_0
      input_files_ref: data_ref
      input_files_report: report_txt
    out: [output_files_report]
  • first step, cwl_mycode.cwl:
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
baseCommand: python
hints:
  DockerRequirement:
    dockerPull: repronim/regtests:2dcf653549f1dd740c2a2b9b4d2ed0d548576e67
inputs:
  script:
    type: File
    inputBinding:
      position: 1

  input_files_0:
    type: File
    inputBinding:
      position: 2
      prefix: -f
outputs:
  output_files_0:
    type: File
    outputBinding:
      glob: list_sorted.json
  output_files_1:
    type: File
    outputBinding:
      glob: sum_list.json
  • second step, cwl_test.cwl
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
baseCommand: python

inputs:
  script:
    type: File
    inputBinding:
      position: 1
  input_files_out:
    type: File
    inputBinding:
      position: 2
      prefix: -out
  input_files_ref:
    type: File
    inputBinding:
      position: 3
      prefix: -ref
  input_files_report:
    #type: File
    type: string
    inputBinding:
      position: 4
      prefix: -report

outputs:
  output_files_report:
    type: File
    outputBinding:
      glob: $(inputs.input_files_report)
  • input file, input_workflow.yml:
script_workf:
  class: File
  path: /Users/dorota/regtests/workflows4regtests/basic_examples/sorting_list/workflow/sorting.py
input_workf:
  class: File
  path: /Users/dorota/regtests/workflows4regtests/basic_examples/sorting_list/data_input/list2sort.json
script_test:
  class: File
  path: /Users/dorota/regtests/testing_functions/test_obj_eq.py
data_ref:
  class: File
  path: /Users/dorota/regtests/workflows4regtests/basic_examples/sorting_list/data_ref/list_sorted.json
report_txt: raport_test.txt
  • to run:
cwl-runner cwl_workflow.cwl input_workflow.yml

The output is raport_test.txt, the file whose name is passed in as report_txt.

issues:

  • I don't know how to use input_files_report as a File in cwl_test.cwl, so that I can take an input file, add some lines, and return it as a modified output. For now I only provide the name (a string) as an input and the workflow creates a new file. There is probably a better (more CWL-like) way of doing it.
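
One possible way to do this, as a sketch rather than a tested solution: stage the report file into the working directory as a writable copy with InitialWorkDirRequirement, let the script modify it, and collect it by its basename. This assumes the test script writes to the path it receives via -report; the workflow input report_txt would also have to become a File.

```yaml
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
baseCommand: python

requirements:
  InitialWorkDirRequirement:
    listing:
      - entry: $(inputs.input_files_report)
        writable: true            # stage a copy the script is allowed to modify

inputs:
  script:
    type: File
    inputBinding:
      position: 1
  input_files_out:
    type: File
    inputBinding:
      position: 2
      prefix: -out
  input_files_ref:
    type: File
    inputBinding:
      position: 3
      prefix: -ref
  input_files_report:
    type: File                    # now a File instead of a string
    inputBinding:
      position: 4
      prefix: -report

outputs:
  output_files_report:
    type: File
    outputBinding:
      # the staged copy keeps the original file name
      glob: $(inputs.input_files_report.basename)
```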

Dec 6th

  • if you want an input to be staged in the working directory, you can add a requirements section that lists it, using an expression, under InitialWorkDirRequirement:
requirements:
  - class: InlineJavascriptRequirement
  - class: InitialWorkDirRequirement
    listing:
      - $(inputs.data_input)
inputs:
  data_input:
    type: File

Oct 2nd (after meeting with Jakub)

Running without docker:

  • it didn't work because FSLDIR was not set properly; the bash script works if --preserve-entire-environment is used: cwl-runner --preserve-entire-environment --tmp-outdir-prefix=/tmp/tmp cwl_bash_test.cwl cwl_bash_test_input.yml

  • I was assuming that the bash script sets everything; ask Satra if this is the proper behaviour
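
An alternative to passing the whole environment might be to declare only the variables the script needs with EnvVarRequirement, as in cwl_test.cwl. A minimal sketch for cwl_bash_test.cwl follows; the FSLDIR value is an assumption, guessed from the /usr/local/fsl/bin entry in the PATH used in cwl_test.cwl.

```yaml
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
baseCommand: bash

requirements:
  EnvVarRequirement:
    envDef:
      # assumption: FSL is installed under /usr/local/fsl on this machine
      FSLDIR: /usr/local/fsl
      FSLOUTPUTTYPE: NIFTI_GZ

inputs:
  script_py:
    type: File
    inputBinding:
      position: 1
  opt_str:
    type: string
    inputBinding:
      position: 2

outputs: []
```

cwltool also has a --preserve-environment option that takes a single variable name (for example --preserve-environment FSLDIR), which may be enough when only one or two variables are missing.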

Running with docker:

  • the problem was that the default cwl docker command takes the user/uid from the local machine, i.e. --user=503:20, and the container can't find it (no idea why the error appears while running the workflow and not earlier). We can ask it not to use the local user: cwl-runner --no-match-user cwl_docker_test.cwl cwl_docker_test_input.yml, and it works

  • still have no idea how to mount a directory with data.
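
One approach that may help with the data directory: declare it as an input of type Directory. File and Directory inputs are staged for the tool, and when docker is used the runner mounts them into the container and puts the in-container path on the command line. A minimal sketch, where data_dir and the --data-dir option are hypothetical (the script would need to accept such an option):

```yaml
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
hints:
  DockerRequirement:
    dockerPull: repronim/simple_workflow:latest

inputs:
  script_py:
    type: File
    inputBinding:
      position: 1
  data_dir:                       # hypothetical input name
    type: Directory
    inputBinding:
      position: 2
      prefix: --data-dir          # hypothetical option of the script

outputs: []
```

The matching entry in the input file would look like this, with the path as a placeholder:

```yaml
data_dir:
  class: Directory
  path: /path/to/data             # placeholder, replace with the real data directory
```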

Sept 28th

Within /Users/dorota/simple_workflow/simple_workflow/scripts/ I'm testing cwl by running the simple workflow with two approaches:

  • cwl-runner cwl_test.cwl cwl_test_input.yml where cwl_test.cwl:
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
baseCommand: python


inputs:
  script_py:
    type: File
    inputBinding:
      position: 1
  key_str:
    type: string
    inputBinding:
      position: 2
      prefix: --key
  n_str:
    type: int
    inputBinding:
      position: 3
      prefix: -n


outputs: []

requirements:
  EnvVarRequirement:
    envDef:
      PATH: /Users/dorota/simple_workflow/simple_workflow/miniconda/envs/bh_demo/bin:/Users/dorota/anaconda/envs/cwl_py3/bin:/usr/local/fsl/bin:/Users/dorota/anaconda/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/opt/X11/bin:/Library/TeX/texbin:/Applications/git-annex.app/Contents/MacOS
      FSLOUTPUTTYPE: NIFTI_GZ

and cwl_test_input.yml:

script_py:
  class: File
  path: run_demo_workflow.py

key_str: 11an55u9t2TAf0EV2pHN0vOd8Ww2Gie-tHp9xGULh_dA
n_str: 1
  • running a bash script: cwl-runner --tmp-outdir-prefix=/Users/dorota/tmp cwl_bash_test.cwl cwl_bash_test_input.yml (--tmp-outdir-prefix is needed due to some conda problems when file names are too long; unfortunately it has to be within the home directory), where cwl_bash_test.cwl:
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
baseCommand: bash

inputs:
  script_py:
    type: File
    inputBinding:
      position: 1
  opt_str:
    type: string
    inputBinding:
      position: 2

outputs: []

and cwl_bash_test_input.yml

script_py:
  class: File
  path: Simple_Prep.sh
opt_str: test

Errors

Both "work" in the sense that they start executing the workflow, but it fails:

FileNotFoundError: File/Directory '/Users/dorota/tmp26xdz6bi/simple_workflow/scripts/output/metaflow/AnnArbor_sub16960/reorient_brain/scan_mprage_anonymized_reoriented.nii' not found for Reorient2Std output 'out_file'. Interface Reorient2Std failed to run.

The file indeed doesn't exist. This is not the first node to execute: download_url runs without problems. If I execute without cwl I get a nii.gz file, but I don't think that is the problem.

OTHER Problems

I don't know how to add a path to the existing PATH in cwl_test.cwl (for now the whole PATH is just provided as a string)

Within /Users/dorota/simple_workflow I'm testing cwl+docker:

  • cwl-runner cwl_docker_test.cwl cwl_docker_test_input.yml where cwl_docker_test.cwl:
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
#baseCommand: python #docker image already starts with python
hints:
  DockerRequirement:
    dockerPull: repronim/simple_workflow:latest
    dockerOutputDirectory: /other #dj testing (i guess i don't need it)

inputs:
  script_py:
    type: File
    inputBinding:
      position: 1
  key_str:
    type: string
    inputBinding:
      position: 2
      prefix: --key
  n_str:
    type: int
    inputBinding:
      position: 3
      prefix: -n

outputs: []

and cwl_docker_test_input.yml:

script_py:
  class: File
  path: run_demo_workflow.py

key_str: 11an55u9t2TAf0EV2pHN0vOd8Ww2Gie-tHp9xGULh_dA
n_str: 1

ERROR: it also starts running the workflow, but gives an error (in download_url): KeyError: 'getpwuid(): uid not found: 503'

OTHER PROBLEMS: I don't know how to provide an input file to the docker container, or how to mount an additional directory (some directories are mounted automatically)