Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some WDL Maps are not handled #77

Open
dinithiW opened this issue Aug 24, 2021 · 8 comments
Open

Some WDL Maps are not handled #77

dinithiW opened this issue Aug 24, 2021 · 8 comments

Comments

@dinithiW
Copy link
Contributor

dinithiW commented Aug 24, 2021

Currently, the WDL input compound type Map[] is not handled by the translator. (https://github.com/openwdl/wdl/blob/main/versions/1.1/SPEC.md#mapp-y)

A workflow using Maps is given in this workflow (https://github.com/biowdl/tasks/blob/bc1bacf11498d2d30b85591cfccdcf71ef0966a5/star.wdl#L144)

edit by @mr-c , we now correctly translate non-input variables of type Map as demonstrated by #189 but I can't find any examples of real-world WDL workflows with inputs of type Map ; currently https://github.com/broadinstitute/warp/blob/b09880a71e3d3e42fa4b544d03aea23c0a246efc/pipelines/broad/dna_seq/germline/joint_genotyping/by_chromosome/JointGenotypingByChromosomePartOne.wdl#L46 does not translate, but that is probably more to due with the use of the Set type

@mr-c mr-c added the good first issue Good for newcomers label Sep 24, 2021
@RaOneG
Copy link
Contributor

RaOneG commented Oct 19, 2021

I am looking at this, and I am not exactly sure what I should do. Is it just to create a translation from WDL to CWL like here or am I missing something obvious?

@RaOneG
Copy link
Contributor

RaOneG commented Oct 19, 2021

Oh I should create a class for map, is that it?

@mr-c
Copy link
Member

mr-c commented Oct 21, 2021

@RaOneG Pardon my delay

In this instance,

     Map[String, String] samOutputNames = {"BAM SortedByCoordinate": "sortedByCoord.out.bam"}

is not in the WDLinputs section, it is a private declaration.

So to convert this example we would need to figure out how to represent the only other place samOutputNames is used. Here it is in the outputs section, to calculate the name of the output file:

   File bamFile = outFileNamePrefix + "Aligned." +  samOutputNames[outSAMtype]

So the strategy for Map used outside the inputs section (and not used as the type of one of the outputs) is to

  1. Convert that private declaration to Javascript. In this example that would be
var samOutputNames = Map([ ["BAM SortedByCoordinate", "sortedByCoord.out.bam"] ];
  1. Then continue as normal.
    For our example, that would be producing
${
var samOutputNames = new Map([ ["BAM SortedByCoordinate", "sortedByCoord.out.bam"] ]);
return inputs.outFileNamePrefix + "Aligned." + samOutputNames.get(inputs.outSAMtype);
}

for the output_glob at

outputBinding=cwl.CommandOutputBinding(glob=output_glob),

However, the Javascript Map type is not part of ECMAScript 5.1, which is the version of Javascript that CWL v1.x Expressions use

While in this instance we can rely on the Map like qualities of object in Javascript because all the keys in samOutputNames are strings, there may be regular uses of WDL's Map type that use other types of keys, like numbers.

So for now we can use a Javascript Object instead:

${
var samOutputNames = {"BAM SortedByCoordinate": "sortedByCoord.out.bam"};`
return inputs.outFileNamePrefix + "Aligned." + samOutputNames[inputs.outSAMtype];
}

But eventually we will have to find another solution like adapting this polyfill that brings the ECMAScript 6 Map to ECMAscript 5: https://github.com/zloirock/core-js/blob/v2.6.12/modules/es6.map.js

@RaOneG
Copy link
Contributor

RaOneG commented Oct 23, 2021

Hi @mr-c,
I am sorry but this is a little difficult for me to understand in addition to I have no prior knowledge of Javascript. Is there something I can learn/read to get some more in-depth knowledge about what you mentioned above apart from the Javascript part?

@mr-c
Copy link
Member

mr-c commented Oct 25, 2021

Hello @RaOneG
A very basic level of JavaScript understanding will be necessary. I personally know very little Javascript, instead I constantly do web searches to remind myself of the correct syntax. I literally Google searched for "if then else javascript" the other week!

As for my comments on this issue, you can ignore everything I wrote about "polyfill"s and ECMAScript Map.

@mr-c mr-c changed the title WDL Maps are not handled Some WDL Maps are not handled Mar 10, 2022
@mr-c
Copy link
Member

mr-c commented Mar 10, 2022

I asked in the OpenWDL slack for a real-world example of a Map input https://openwdl.slack.com/archives/CT0QEK1V0/p1646937620784419 ; pending a real-world usage or user request, I think this is now a low-priority issue

@mr-c mr-c removed the good first issue Good for newcomers label Mar 10, 2022
@mr-c
Copy link
Member

mr-c commented Mar 11, 2022

@dpark01 writes

so from a high level, the reason I find myself using read_json or Map structures in general in WDL is when I'm

  • scattering on X
  • need to track/coordinate metadata on a per-X basis, either post-gather or even inside the scatter block

https://github.com/broadinstitute/viral-pipelines/blob/b4010419ac2710b73e6c70be491e43778b1625f8/pipes/WDL/tasks/tasks_ncbi_tools.wdl#L433-L519

this particular example task is an interesting one where I take a bunch of same-length Array inputs and basically zip them together and then deduplicate them based on one of the strings: that is, it's assumed that the biosamples input array may contain non-unique values, so the output of this task produces an output Array of unique biosamples and then a bunch of Maps that let the workflow code query metadata inside a scatter block for each biosample.

https://github.com/broadinstitute/viral-pipelines/blob/bd0d777617229e61dd9d659054f99e568820295b/pipes/WDL/workflows/sarscov2_illumina_full.wdl#L105-L111

in an unrelated example, here's a workflow that calls other tasks that emit Map outputs and then looks up values in that Map in the middle of a scatter block in order to know what to do differently with each sample:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants