Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CWL Model, Encoding & Decoding #421

Merged
merged 49 commits into from
Oct 21, 2024
Merged

CWL Model, Encoding & Decoding #421

merged 49 commits into from
Oct 21, 2024

Conversation

caroott
Copy link
Member

@caroott caroott commented Jul 19, 2024

This PR aims to add the F# CWL Data Model, as well as encoding and decoding functionality to interact with CWL files. It addresses issue #420

@caroott caroott self-assigned this Jul 19, 2024
@kMutagene
Copy link
Member

what's the status on this @caroott ?

@caroott
Copy link
Member Author

caroott commented Sep 10, 2024

still in progress :/
I hope to get it done soon™

@caroott caroott force-pushed the CWL branch 2 times, most recently from a60e33d to 78e09de Compare October 14, 2024 10:40
@caroott caroott marked this pull request as ready for review October 15, 2024 18:13
@caroott caroott requested a review from HLWeil October 15, 2024 18:13
Copy link
Member

@kMutagene kMutagene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cannot say too much about the implementation, it looks good to me, but i really miss some xml docs on your types and functions. Ideally, also add a usage example to the docs folder

Decode.object (fun get ->
Dirent
{
// BUG: Entry Requires an Entryname to be present when it's an expression
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this, are you pushing code you know having a bug? 😆

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that it is theoretically possible, but don't know how it would look/work then.

Copy link
Member

@HLWeil HLWeil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your efforts. I've got a few general requests (which I partly also put directly onto the files):

  • For Fable-compatability:
    • Use [AttachMembers] attributes
    • Use ResizeArrays as collections
  • Don't put top-level types into nested module structure. Using ARCtrl.CWL should be enoguh to make these types directly accessible. If necessary, adjust names to go with this change
  • For the test objects: Adjust names in a way to make clear that these were the CWL-FIle contents and not any datamodel oder intermediate representations. (I suggested to add "file", but maybe "fileContent" or something would be better.
  • Test file structure: Please adjust according to the other testfiles:
    • Put a "main" object at the end with a TestList of the name of the class to be tested.
    • Then rename the TestList containing the Tests to the behaviour of the class you want to test. E.g. in most of your cases you would have a testList "Decode"

@@ -55,7 +55,7 @@ module ARCAux =
|> FileSystem.create
fs.Union(tree)

let updateFSByCWL (cwl : CWL.CWL option) (fs : FileSystem) =
let updateFSByCWL (cwl : CWL.CWLProcessingUnits.CWLToolDescription option) (fs : FileSystem) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Undo here, as it's the wrong type anways and might lead to confusion later. Use unit instead?

@@ -65,7 +65,7 @@ module ARCAux =


[<AttachMembers>]
type ARC(?isa : ArcInvestigation, ?cwl : CWL.CWL, ?fs : FileSystem.FileSystem) =
type ARC(?isa : ArcInvestigation, ?cwl : CWL.CWLProcessingUnits.CWLToolDescription, ?fs : FileSystem.FileSystem) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Undo here, as it's the wrong type anways and might lead to confusion later. Use unit instead?

<Compile Include="Encode.fs" />
</ItemGroup>
<ItemGroup>
<None Include="../../build/logo.png" Pack="true" PackagePath="\" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop this ItemGroup and Item

src/CWL/CWL.fs Outdated
open Outputs
open WorkflowSteps

module CWLProcessingUnits =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these types (the top-level ones) should definitely be directly in the ARCtrl namespace. Maybe take a look at ARCtrl.Core, where ArcAssay and such are directly in the ARCtrl namespace.

src/CWL/CWL.fs Outdated
module CWLProcessingUnits =

type CWLToolDescription (
cwlVersion: string,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How vastly differing are the cwl versions? Do we have a default version in the specs? If not, we should pick one and set this field here to optional with a default value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We ask for version 1.2 (current one) or higher. The version is a mandatory field in cwl files and always present, so it can always be read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but this type is not exclusively for parsing but also for being created, edited and such. I don't think we should have this as a mandatory field in the constructor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if you want to create or edit cwl, you need this field for it to be valid. So making it optional wouldn't make sense imo. A creator which automatically sets it to 1.2 could be an option.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, having it as an optional parameter in the constructor with an internal default value is exactly what I meant.

@@ -0,0 +1,108 @@
module TestObjects.CWL.CommandLineToolMetadata

let cwl ="""cwlVersion: v1.2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to "cwlFile"

@@ -0,0 +1,39 @@
module TestObjects.CWL.CommandLineTool

let cwl ="""cwlVersion: v1.2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to "cwlFile"

|> Decode.read
|> Decode.stepsDecoder

let testWorkflowStep =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reorganize with "main" testList at the bottom

let requirements = requirementsDecoder yamlCWL
let hints = hintsDecoder yamlCWL
let steps = stepsDecoder yamlCWL
printfn "%A" steps
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debugging prints?

Copy link
Member Author

@caroott caroott Oct 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe 👀
It was telling me yesterday evening that it was empty, so i had to check^^

open WorkflowSteps
open DynamicObj

module Decode =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remind to me to recheck this file when the structural change requests are implemented :D

Copy link
Member

@HLWeil HLWeil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some consideration: Please omit all modules, i.e. put all type definitions in the top-level ARCtrl.CWL namespace.
Makes usage more straightforward and is consistent with the other types defined in ARCtrl namespace.
For this would be good to rename classes Input to CWLInput and Output to CWLOutput to minimize confusion.

Copy link
Member

@HLWeil HLWeil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the adjustments. Looks good to me now 👍

@HLWeil HLWeil merged commit edb22da into main Oct 21, 2024
2 checks passed
@kMutagene kMutagene linked an issue Oct 23, 2024 that may be closed by this pull request
@caroott caroott deleted the CWL branch November 18, 2024 13:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[Feature Request] Add ARC CWL Data Model
3 participants