Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

offline plotting of prediction and ground truth #15

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

clechartre
Copy link
Collaborator

Purpose

Offline plotting function to handle the plotting of the prediction output as a numpy array against a single .zarr archive. Useful to verify the accuracy of the inference

Code changes:

offline.py: created a module for this purpose
slum_offline.sh: automation of the offline.py module through batch system

Checklist

Before submitting this PR, please make sure:

  • You have followed the coding standards guidelines established at Code Review Checklist.
  • Docstrings and type hints are added to new and updated routines, as appropriate
  • All relevant documentation has been updated or added (e.g. README)

Review

For the review process follow the guidelines at Checklist

@clechartre clechartre requested review from twicki and sadamov April 22, 2024 08:30
@sadamov
Copy link
Collaborator

sadamov commented Apr 22, 2024

Hi @clechartre and thanks for the PR. I have some high-level comments before a deeper code-review. We have a new zarr-archive on Tsa that will be the default MeteoSwiss archive after tomorrow (when creation is done). I have already merged and updated this PR-branch where possible with latest main-branch.

  • Path: /scratch/cosuna/neural-lam/zarr/cosmo_ml_data.zarr2
  • The new lat/lon don't require unrotation anymore and that module was removed from the repo, you can remove it as well

One more question: Do we still need cli_plotting.py or can we combine these two workflows?

Copy link
Collaborator

@sadamov sadamov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have merged this branch with #16, added some prints to the log file and some smaller comments. Before merge we should:

offline.py Outdated
vmin = target_all.min().values
vmax = target_all.max().values

for i in range(22):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for i in range(22):
for i in range(time_range):

The length of the prediction should define the plotting range, no?

offline.py Outdated
start_time = pd.to_datetime(start_time_str, format="%Y%m%d%H")

# Output the prediction time range
time_range = len(predictions[1, :, 1, 1]) # Number of time steps
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
time_range = len(predictions[1, :, 1, 1]) # Number of time steps
time_range = len(predictions[0, :, 0, 0]) # Number of time steps

use first indices to avoid issues with edge cases

offline.py Outdated
parser.add_argument(
"--variable_to_plot",
type=str,
default="TQV",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
default="TQV",
default="T_2M",

Use more common variable as default. Future checkpoints might not contain TQV

@sadamov
Copy link
Collaborator

sadamov commented Apr 26, 2024

  • I have added a template of the latest zarr archive in data/cosmo/templates, to make sure we are using the latest set of variables and also the unrotated grid. For that reason I removed the unrotate functionality from offline.py file.
  • Further, a template for a inference.npy files is also available now in data/cosmo/templates
  • After addressing small comments above we can merge. Note that the scripts are not backward compatible with older input datasets that will be deleted soon anyways.
  • Note that the latest checkpoint is hi_lam, we need to re-train the model and create one checkpoint for each model soon
  • Since I merged this PR branch with latest main-branch you should now again only see changes relevant to offline-plotting (which was the goal of the whole exercise).

--> Maybe add a few words in the README.md about the new data/ folder structure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants