Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[build] --cpus isn't passed to thru when using --exec #272

Open
corneliusroemer opened this issue Apr 12, 2023 · 5 comments
Open

[build] --cpus isn't passed to thru when using --exec #272

corneliusroemer opened this issue Apr 12, 2023 · 5 comments
Labels
bug Something isn't working

Comments

@corneliusroemer
Copy link
Member

Current Behavior

Invoking nextstrain build --docker or (--aws-batch) does not pass --cores option to snakemake causing error:

Error: you need to specify the maximum number of CPU cores to be used at the same time. If you want to use N cores, say --cores N or -cN. For all cores on your system (be sure that this is appropriate) use --cores all. For no parallelization use --cores 1 or -c1. <_io.TextIOWrapper name='<stderr>' mode='w' encoding='utf-8'>

Expected behavior

Nextstrain cli passes that argument to snakemake so that the error doesn't happen

How to reproduce

Steps to reproduce the current behavior:

  1. gh repo clone nextstrain/rsv
  2. cd rsv
  3. Run
nextstrain build \
    --docker \
    --detach \
    --no-download \
    --cpus 5 \ 
--exec env \
    . \
      snakemake \
        --configfiles config/configfile.yaml  \
        --printshellcmds

Environment

  • Operating system: macOS 13.3.1 on M1
  • shell: zsh
  • version: nextstrain.cli 6.2.1

Additional context

This error could be behind most of the snakemake workflow errors we've been getting since the snakemake upgrade in docker.

@corneliusroemer corneliusroemer added the bug Something isn't working label Apr 12, 2023
@corneliusroemer
Copy link
Member Author

corneliusroemer commented Apr 12, 2023

The issue appears iff the cli build command contains the snakemake magic word.

This alternative way, without --exec and without env and snakemake works:

nextstrain build \
    --docker \
    --detach \
    --no-download \
    --cpus 5 \
    . \
    --configfiles config/configfile.yaml  \
        --printshellcmds

I'm confused about the --exec option, it is not documented in nextstrain build --help

I see! It's in --help-all under development options
--exec <prog> Program to run inside the runtime (default: snakemake)

So the way we invoke things in the actions is not how we usually document how to run nextstrain builds (without --exec, using the default snakemake.

@tsibley
Copy link
Member

tsibley commented Apr 12, 2023

Ah, yes, that'd be an issue. Nextstrain CLI only passes thru --cpus as Snakemake's --cores when the program its running (--exec) is snakemake:

# Automatically pass thru appropriate resource options to Snakemake to
# avoid the user having to repeat themselves (once for us, once for
# snakemake).
if opts.exec == "snakemake":

@tsibley
Copy link
Member

tsibley commented Apr 12, 2023

In those GitHub Actions workflows which launch our pathogen workflows, we use env purely to keep the rest of the command (envdir … snakemake …) together, IIRC. We could drop --exec env, but then we'd have to replace it with --exec envdir in order to use the env dir we create.

Since 2018, I've wanted to integrate env var management for nextstrain build via first-class env dir support (so we don't have to do this hacky thing), but it's never floated to the top of the list. This is maybe a good reason to prioritize it (but there's also so many other things to prioritize).

@tsibley
Copy link
Member

tsibley commented Apr 12, 2023

(Summary notes on env var management for Nextstrain CLI and our pathogen workflows.)

@tsibley tsibley changed the title CLI doesn't pass --cpus to snakemake, required since v7, causing Error: you need to specify the maximum number of CPU cores --cpus isn't passed to thru when using --exec Apr 12, 2023
@tsibley tsibley changed the title --cpus isn't passed to thru when using --exec [build] --cpus isn't passed to thru when using --exec Apr 12, 2023
corneliusroemer added a commit to nextstrain/mpox that referenced this issue Apr 12, 2023
This is required for the action to be compatible with
snakemake version >5.11.0 which started to require the
`--cores` option
For more context see:
- nextstrain/cli#272
-https://bedfordlab.slack.com/archives/C01LCTT7JNN/p1681324279363569?thread_ts=1681293140.124269&cid=C01LCTT7JNN
corneliusroemer added a commit to nextstrain/rsv that referenced this issue Apr 12, 2023
This is required for the action to be compatible with snakemake version >5.11.0
which started to require the `--cores` option

For more context see:
- nextstrain/cli#272 
- https://bedfordlab.slack.com/archives/C01LCTT7JNN/p1681324279363569?thread_ts=1681293140.124269&cid=C01LCTT7JNN
corneliusroemer added a commit to nextstrain/forecasts-ncov that referenced this issue Apr 12, 2023
This is required for the action to be compatible with snakemake version >5.11.0
which started to require the `--cores` option

For more context see:
- nextstrain/cli#272 
- https://bedfordlab.slack.com/archives/C01LCTT7JNN/p1681324279363569?thread_ts=1681293140.124269&cid=C01LCTT7JNN
@jameshadfield
Copy link
Member

jameshadfield commented Apr 12, 2023

Since 2018, I've wanted to integrate env var management for nextstrain build via first-class env dir support (so we don't have to do this hacky thing)

Beyond the immediate PRs which will (probably?) close this issue, I think this would be a great thing to prioritise. I remember having to dive into the CLI code to write some of these actions, doubly complicated by my lack of familiarity with envdir (summarised in this slack thread). Doing this would have two wins: it removes the need for --exec and thus makes the GitHub actions syntax for nextstrain build more familiar (helps with debugging, readablitiy and lack of gotchas); secondly we become used to explicitly passing through env variables rather than having to know about this list.

@joverlee521 joverlee521 moved this from New to Prioritized in Nextstrain planning (archived) Apr 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
No open projects
Status: Prioritized
Development

No branches or pull requests

3 participants