Allofus - Adapt targene for use on the All of Us Researcher Workbench #174

Open · wants to merge 24 commits into base: main

Conversation

@roskamsh (Collaborator) commented May 10, 2024

  [x] add a profile that allows execution of TarGene on the Researcher Workbench, which includes the subsequent steps (see the profile sketch below):
  [x] add containers which route through Google Container Registry for each container hosted on DockerHub
  [x] specify executor-specific instructions (Google Life Sciences API as well as resources)
  [x] run an end-to-end test using a test configuration (data in the test/ directory)
  [ ] run on All of Us data for the FTO variant and BMI trait
  [ ] update docs

Resources:
https://support.researchallofus.org/hc/en-us/articles/21179878475028-Using-Docker-Images-on-the-Workbench
https://workbench.researchallofus.org/workspaces/aou-rw-5b81a011/howtousenextflowintheresearcherworkbenchv7/data
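
[Editorial sketch, not code from this PR: one shape such a profile could take in nextflow.config. The profile name, Google project ID, bucket, and image paths are hypothetical placeholders.]

profiles {
    allofus {
        process.executor  = 'google-lifesciences'
        workDir           = 'gs://my-workspace-bucket/work'       // GCS bucket for task staging
        google.project    = 'my-terra-project-id'
        google.region     = 'us-central1'
        // route images through Google Container Registry rather than DockerHub
        process.container = 'gcr.io/my-terra-project-id/targene:latest'
        includeConfig 'conf/allofus_container.config'             // executor resources (this PR)
    }
}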

@roskamsh roskamsh self-assigned this Sep 18, 2024
@roskamsh roskamsh added the enhancement New feature or request label Sep 18, 2024
@roskamsh roskamsh linked an issue Sep 18, 2024 that may be closed by this pull request
@olivierlabayle (Member) left a comment

Thank you Breeshey.

On the two currently non-ticked boxes:

  • Did you manage to run the pipeline entirely?
  • What is the format of the input data, and does it require an additional mode besides the "custom" one? It might be good to explicitly state this in the docs, even if it can use the cohort mode.

conf/allofus_container.config (resolved)
@@ -0,0 +1,23 @@
process {
    memory = { 6.GB * task.attempt }
@olivierlabayle (Member):

On all memory/cpu requirements, why not simply include base.config? It would be a good opportunity to homogenise these configurations.
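
[Editorial sketch of this suggestion, assuming a base.config sits next to this file; not code from the PR.]

includeConfig 'base.config'

process {
    // override only what the All of Us Researcher Workbench requires
    memory = { 6.GB * task.attempt }
}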

@olivierlabayle (Member):

Similarly, max_retries is only 3 on Eddie.

@roskamsh (Collaborator, Author):

The only thing is that the resource allocation needs to be slightly different on the AOU RW, given that it has coupled memory-CPU constraints, which is why these had to be changed here for runs on the AOU RW. I can update max_retries, but there are additional error codes that need to be added when running on the Google Life Sciences API.

@olivierlabayle (Member):

Right, I remember. What is the exact coupling, is it a factor? If this is the case we could probably let the user choose memory only and infer cpu based on it, e.g. cpu = memory / 4 or something similar?

@roskamsh (Collaborator, Author):

It seems to be about 6 GB per 1 cpu. So we would have to make the memory a function of the number of CPUs.

For example:

cpu = 1 * task.attempt
memory = 6.GB * cpu

Or something like this?
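
[Editorial sketch of that coupling with the real Nextflow directive names (cpus, task.cpus); values are illustrative.]

process {
    cpus   = { 1 * task.attempt }
    memory = { 6.GB * task.cpus }   // ~6 GB per CPU, as estimated above
}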

@roskamsh (Collaborator, Author):

Additionally, when the number of CPUs requested is > 2, you must request an even number of cores, i.e. 4, 6, 8, etc. Another odd feature that I found when running the end-to-end test.
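
[Editorial sketch of one way to honour the even-core rule inside a directive closure; hypothetical, not from the PR.]

cpus = {
    int c = task.attempt + 1            // e.g. 2, 3, 4, ... across retries
    (c > 2 && c % 2 != 0) ? c + 1 : c   // round odd counts above 2 up to the next even number
}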

@roskamsh (Collaborator, Author):

I guess if you wanted the user to choose memory, you could just do the inverse operation. But CPUs would have to be an integer, so something like:

cpu = ceiling(memory/6)

Unsure exactly how this can be implemented at the nextflow configuration level. Would probably have to use the Math() operators.
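
[Editorial sketch, assuming a closure can be defined in nextflow.config: Math.ceil provides the ceiling, and Nextflow's MemoryUnit exposes toGiga() for whole gigabytes. The helper name is hypothetical.]

def memoryToCpus = { mem -> Math.ceil(mem.toGiga() / 6) as int }   // ~6 GB per CPU

process {
    memory = { 12.GB * task.attempt }
    cpus   = { memoryToCpus(task.memory) }   // e.g. 12 GB -> 2 CPUs
}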

@olivierlabayle (Member):

> It seems to be about 6 GB per 1 cpu. So we would have to make the memory a function of the number of CPUs.
>
> For example:
>
> cpu = 1 * task.attempt
> memory = 6.GB * cpu
>
> Or something like this?

Then the multithreaded label does not respect this factor, does it? I see 8GB for 2 CPUs; should it not be 12GB? Any chance you can get this info from the All of Us support team?
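
[Editorial sketch of a multithreaded label that would respect the 6 GB-per-CPU factor; the label name comes from the discussion, the values are illustrative.]

process {
    withLabel: multithreaded {
        cpus   = { 2 * task.attempt }
        memory = { 6.GB * task.cpus }   // 12 GB for 2 CPUs rather than 8 GB
    }
}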

@olivierlabayle (Member):

> I guess if you wanted the user to choose memory, you could just do the inverse operation. But CPUs would have to be an integer, so something like:
>
> cpu = ceiling(memory/6)
>
> Unsure exactly how this can be implemented at the nextflow configuration level. Would probably have to use the Math() operators.

Fair enough!

conf/allofus_container.config (outdated, resolved)
test.config (outdated, resolved)
@olivierlabayle (Member):

Have you successfully built this image? The problem is that at the moment I would be the only one able to push it to the current Docker tag, since it is not built within the CI process.

@roskamsh (Collaborator, Author):

Yes, I've built it and pushed it to my DockerHub. It is here: https://hub.docker.com/r/roskamsh/flashpca

Can we not just keep it as a stale file, as it only needs to be done once and not updated again?

@roskamsh (Collaborator, Author) commented Sep 24, 2024:

I've already built it, so there is no need to re-build. It is pointed at in conf/allofus_container.config. I can update container.config to point at the new Docker container.
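
[Editorial sketch of the shape such an override could take in conf/allofus_container.config; the process name is hypothetical, the image is the one linked above, tag omitted.]

process {
    withName: 'FlashPCA' {
        container = 'roskamsh/flashpca'   // defaults to the :latest tag
    }
}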

@olivierlabayle (Member) commented Sep 24, 2024:

As you say, the file would be unused, so that is not ideal. Ideally the repo should be kept as minimal as possible, with only useful code. It is probably best to use a container for which there is already a build process within the organisation. That's also one less download at runtime; all we need to do is add a label to the process that is missing one.

@olivierlabayle (Member):

In the long run I am happy to have some Dockerfiles within the pipeline repo, as long as they can be built within the CI process.

@roskamsh (Collaborator, Author):

I just thought it would make sense to have the Dockerfile stored here so that people knew how it was built and we had a record of it. Where else would this file be stored then?

@olivierlabayle (Member):

It is definitely the right place for it, but as part of another branch --> PR, if this is not addressed fully here.

@roskamsh (Collaborator, Author) commented:

> Thank you Breeshey.
>
> On the two currently non-ticked boxes:
>
> • Did you manage to run the pipeline entirely?
> • What is the format of the input data, and does it require an additional mode besides the "custom" one? It might be good to explicitly state this in the docs, even if it can use the cohort mode.

Currently I haven't run it with AOU data yet. That is the plan for this week! Then I should be able to add additional changes (if required) for a new COHORT mode. Once this is complete, I will update the docs. At this point, I have managed to run an end-to-end test using test.config (which is basically just https://github.com/TARGENE/targene-pipeline/blob/main/test/configs/custom_cohort_flat.config).

}

// Set a Google-appropriate error strategy
errorStrategy = { task.exitStatus in [143,137,104,134,139,14] ? 'retry' : 'finish' }
@joshua-slaughter (Collaborator):

Are the error codes the same as on Eddie? Just curious. And is there a place where we can view the docs for the platform's errors?
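
[Editorial note, not a reply from the thread: 143 and 137 are the 128+signal codes for SIGTERM and SIGKILL (the latter often the OOM killer), 134 is SIGABRT, 139 is SIGSEGV, 104 commonly indicates a connection reset, and 14 is the exit status Nextflow reports when a preemptible Google Cloud VM is reclaimed, per the Nextflow Google Cloud documentation.]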

@joshua-slaughter (Collaborator) left a comment:

Looks good to me!

Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

how to run this piepline in allofus workbench?
3 participants