Expansion of the resume_from function #2582

Open
malininae opened this issue Nov 20, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@malininae
Contributor

At the November 2024 workshop, the topic of expanding the resume_from function came up.

Here are my proposals:

  1. Loosen the requirement that the recipes be absolutely identical. For example, allow resuming if a few keywords in a diagnostic have changed. Ideally, it would also be possible to resume after adding a new diagnostic and/or variable group.
  2. If the processing of a variable group didn't finish, still use the files that were processed. There were a few instances where part of a variable group got processed but the whole group had to restart, even though, say, 169 out of 170 files had been processed successfully.
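To make the second proposal concrete, here is a minimal sketch of how partially finished output could be split into files to reuse and files still to process. The helper name `split_reusable` and the "exists and is non-empty" check are assumptions for illustration only, not how resume_from currently works.

```python
from pathlib import Path


def split_reusable(expected_names, previous_dir):
    """Partition expected output files into reusable and still-to-process.

    Hypothetical helper: a file counts as reusable if it exists in the
    previous run's directory and is not empty.
    """
    previous_dir = Path(previous_dir)
    reusable, pending = [], []
    for name in expected_names:
        candidate = previous_dir / name
        if candidate.is_file() and candidate.stat().st_size > 0:
            reusable.append(candidate)  # keep the full path to the old file
        else:
            pending.append(name)  # missing or empty: must be processed again
    return reusable, pending
```

In the 169-out-of-170 case, only the one missing file would end up in `pending`. A non-empty file can still be truncated if the run was interrupted mid-write, so a real implementation would probably need a stronger completeness check.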

Happy to elaborate; other suggestions are welcome. I know @k-a-webb had some suggestions.

@malininae malininae added the enhancement New feature or request label Nov 20, 2024
@k-a-webb

k-a-webb commented Nov 20, 2024

My use case mainly concerns model benchmarking, where I re-run the same recipe except for an edit to the datasets list.
A very convenient feature of the current resume_from tool is that it reuses the already preprocessed data.
Currently, resumable recipes need to be exactly (or nearly exactly) the same as the previous run, which excludes editing the datasets list.

I would propose relaxing the requirement of having an identical dataset list, which is to say I support @malininae's proposals!
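As a rough illustration of the relaxed check, the sketch below compares two recipes (already loaded into dictionaries) while ignoring the top-level `datasets` list. The function name `same_except_datasets` is hypothetical, and the sketch assumes the datasets live under a top-level `datasets` key; it does not handle datasets listed elsewhere in the recipe, such as `additional_datasets` entries.

```python
import copy


def same_except_datasets(old_recipe, new_recipe):
    """Return True if two recipe dicts differ at most in their `datasets` list.

    Hypothetical check: only the top-level `datasets` key is ignored.
    """

    def strip(recipe):
        stripped = copy.deepcopy(recipe)  # do not modify the caller's dict
        stripped.pop("datasets", None)
        return stripped

    return strip(old_recipe) == strip(new_recipe)
```

For the benchmarking use case, a check like this would pass when only the datasets list was edited and fail as soon as a diagnostic or preprocessor changed.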

@bouweandela
Member

Editing the dataset list will only be possible if the recipe uses no preprocessor functions that take all datasets as input, i.e.:

MULTI_MODEL_FUNCTIONS = {
    "bias",
    "distance_metric",
    "ensemble_statistics",
    "multi_model_statistics",
    "mask_multimodel",
    "mask_fillvalues",
}
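A possible way to apply this restriction is to scan the preprocessor profiles of a recipe for any of the functions listed above. The sketch below is only an illustration: `blocking_steps` is a hypothetical name, and it assumes the `preprocessors` section has been loaded as a dictionary mapping each profile name to a dictionary of step names and settings.

```python
MULTI_MODEL_FUNCTIONS = {
    "bias",
    "distance_metric",
    "ensemble_statistics",
    "multi_model_statistics",
    "mask_multimodel",
    "mask_fillvalues",
}


def blocking_steps(preprocessors):
    """Map each preprocessor profile to the multi-dataset steps it uses.

    Hypothetical helper: profiles that use none of the functions in
    MULTI_MODEL_FUNCTIONS are left out of the result.
    """
    result = {}
    for name, steps in preprocessors.items():
        used = sorted(MULTI_MODEL_FUNCTIONS & set(steps or {}))
        if used:
            result[name] = used
    return result
```

An empty result would mean that, under these assumptions, the dataset list could be edited without invalidating the preprocessed output of the remaining datasets.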
