Merge pull request #136 from NCAR/djgagne
Version changes
djgagne authored Dec 10, 2024
2 parents fc2f8d1 + 3f375cc commit a326a6c
Showing 8 changed files with 520 additions and 35 deletions.
24 changes: 16 additions & 8 deletions README.md
@@ -1,12 +1,13 @@
-# NSF NCAR MILES Community Runnable Earth Digital Intelligence Twin (CREDIT)
+# NSF NCAR MILES Community Research Earth Digital Intelligence Twin (CREDIT)

## About
-CREDIT is a package to train and run neural networks
-that can emulate full NWP models by predicting
-the next state of the atmosphere given the current state.
+CREDIT is a research platform to train and run neural networks that can emulate full NWP models by predicting
+the next state of the atmosphere given the current state. The platform is still under very active development.
+If you are interested in using or contributing to CREDIT, please reach out to David John Gagne (dgagne@ucar.edu).


## NSF-NCAR Derecho Installation
-Currently, the framework for running miles-credit in parallel is centered around NSF-NCAR's Derecho HPC. Derecho requires building several of miles-credit's dependencies locally, including PyTorch, to enable a correct MPI configuration. To begin, create a clone of the pre-built miles-credit environment, which contains compatible versions of torch, torchvision, numpy, and others.
+Currently, the framework for running miles-credit in parallel is centered around NSF NCAR's Derecho HPC. Derecho requires building several of miles-credit's dependencies locally, including PyTorch, to enable a correct MPI configuration. To begin, create a clone of the pre-built miles-credit environment, which contains compatible versions of torch, torchvision, numpy, and others.

```bash
module purge
@@ -72,14 +73,14 @@ python applications/train.py -c config/vit.yml
Or use a fancier [variation](https://github.com/lucidrains/vit-pytorch/blob/main/vit_pytorch/rvt.py)

```bash
-python applications/train.py -c config/rvt.yml
+python applications/train.py -c config/wxformer_1dg_test.yml
```

## Launch with PBS on Casper or Derecho

Adjust the PBS settings in a configuration file for either casper or derecho. Then, submit the job via
```bash
-python applications/train.py -c config/vit.yml -l 1
+python applications/train.py -c config/wxformer_1dg_test.yml -l 1
```
The launch script may be found in the save location that you set in the configuration file. The automatic launch script generation will take care of MPI calls and other complexities if you are using more than one GPU.
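
For reference, the generated launch script is an ordinary PBS batch script. The sketch below shows its general shape only; the account code, queue, environment name, and resource line are illustrative placeholders, not values this commit produces.

```bash
#!/bin/bash
#PBS -N credit-train
#PBS -A <project_code>             # placeholder: your Derecho/Casper allocation
#PBS -q main                       # placeholder queue name
#PBS -l select=1:ncpus=64:ngpus=4  # placeholder: one 4-GPU Derecho node
#PBS -l walltime=12:00:00

module load conda
conda activate credit              # placeholder environment name

# Multi-GPU runs go through MPI; train.py -l 1 writes the exact launcher
# call for you. One rank per GPU is shown schematically here.
mpiexec -n 4 python applications/train.py -c config/wxformer_1dg_test.yml
```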

@@ -88,5 +89,12 @@ The launch script may be found in the save location that you set in the configuration file.
The predict field in the config file allows one to specify start and end dates to roll out a trained model. To generate a forecast,

```bash
-python applications/rollout_to_netcdf.py -c config/vit.yml
+python applications/rollout_to_netcdf.py -c config/wxformer_1dg_test.yml
```
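
For illustration, the predict block of a configuration file might resemble the sketch below. Every field name here is a hypothetical stand-in; consult the bundled config/wxformer_1dg_test.yml for the real schema.

```yaml
# Hypothetical roll-out settings; field names are illustrative, not the
# documented CREDIT schema.
predict:
  start_date: "2020-01-01 00:00"  # first forecast initialization time
  end_date: "2020-01-05 00:00"    # last forecast initialization time
  forecast_len: 40                # autoregressive steps per initialization
  save_loc: "/glade/derecho/scratch/${USER}/credit_forecasts"
```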

+# Support
+This software is based upon work supported by the NSF National Center for Atmospheric Research, a major facility sponsored by the
+U.S. National Science Foundation under Cooperative Agreement No. 1852977 and managed by the University Corporation for Atmospheric Research. Any opinions, findings, and conclusions or recommendations
+expressed in this material do not necessarily reflect the views of NSF. Additional support for development was provided by
+the NSF AI Institute for Research on Trustworthy AI for Weather, Climate, and Coastal Oceanography (AI2ES) under grant
+number RISE-2019758.
2 changes: 1 addition & 1 deletion credit/VERSION
@@ -1 +1 @@
-2023.1.0
+2024.1.0
77 changes: 54 additions & 23 deletions credit/models/crossformer.py
@@ -345,31 +345,62 @@ def forward(self, x):
class CrossFormer(BaseModel):
def __init__(
self,
-        image_height=640,
-        patch_height=1,
-        image_width=1280,
-        patch_width=1,
-        frames=2,
-        channels=4,
-        surface_channels=7,
-        input_only_channels=3,
-        output_only_channels=0,
-        levels=15,
-        dim=(64, 128, 256, 512),
-        depth=(2, 2, 8, 2),
-        dim_head=32,
-        global_window_size=(5, 5, 2, 1),
-        local_window_size=10,
-        cross_embed_kernel_sizes=((4, 8, 16, 32), (2, 4), (2, 4), (2, 4)),
-        cross_embed_strides=(4, 2, 2, 2),
-        attn_dropout=0.0,
-        ff_dropout=0.0,
-        use_spectral_norm=True,
-        interp=True,
-        padding_conf=None,
-        post_conf=None,
+        image_height: int = 640,
+        patch_height: int = 1,
+        image_width: int = 1280,
+        patch_width: int = 1,
+        frames: int = 2,
+        channels: int = 4,
+        surface_channels: int = 7,
+        input_only_channels: int = 3,
+        output_only_channels: int = 0,
+        levels: int = 15,
+        dim: tuple = (64, 128, 256, 512),
+        depth: tuple = (2, 2, 8, 2),
+        dim_head: int = 32,
+        global_window_size: tuple = (5, 5, 2, 1),
+        local_window_size: int = 10,
+        cross_embed_kernel_sizes: tuple = ((4, 8, 16, 32), (2, 4), (2, 4), (2, 4)),
+        cross_embed_strides: tuple = (4, 2, 2, 2),
+        attn_dropout: float = 0.0,
+        ff_dropout: float = 0.0,
+        use_spectral_norm: bool = True,
+        interp: bool = True,
+        padding_conf: dict = None,
+        post_conf: dict = None,
+        **kwargs,
):
"""
CrossFormer is the base architecture for the WXFormer model. It uses convolutions and long and short distance
attention layers in the encoder layer and then uses strided transpose convolution blocks for the decoder
layer.
Args:
image_height (int): number of grid cells in the south-north direction.
patch_height (int): number of grid cells within each patch in the south-north direction.
image_width (int): number of grid cells in the west-east direction.
patch_width (int): number of grid cells within each patch in the west-east direction.
frames (int): number of time steps being used as input
channels (int): number of 3D variables. Default is 4 for our ERA5 configuration (U, V, T, and Q)
surface_channels (int): number of surface (single-level) variables.
input_only_channels (int): number of variables only used as input to the ML model (e.g., forcing variables)
output_only_channels (int):number of variables that are only output by the model (e.g., diagnostic variables).
levels (int): number of vertical levels for each 3D variable (should be the same across frames)
dim (tuple): output dimensions of hidden state of each conv/transformer block in the encoder
depth (tuple): number of attention blocks per encoder layer
dim_head (int): dimension of each attention head.
global_window_size (tuple): number of grid cells between cells in long range attention
local_window_size (tuple): number of grid cells between cells in short range attention
cross_embed_kernel_sizes (tuple): width of the cross embed kernels in each layer
cross_embed_strides (tuple): stride of convolutions in each block
attn_dropout (float): dropout rate for attention layout
ff_dropout (float): dropout rate for feedforward layers.
use_spectral_norm (bool): whether to use spectral normalization
interp (bool): whether to use interpolation
padding_conf (dict): padding configuration
post_conf (dict): configuration for postblock processing
**kwargs:
"""
super().__init__()

dim = tuple(dim)
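
As a quick illustration of how these defaults fit together, here is a hypothetical usage sketch. The 5-D input layout and the channel arithmetic are assumptions inferred from the constructor arguments above, not something this commit documents; check forward() in credit/models/crossformer.py before relying on them.

```python
# Hypothetical sketch: build a CrossFormer with the defaults above and run one
# forward pass on random data. Input layout (batch, variables, frames, height,
# width) and channel counts are assumptions, not documented behavior.
import torch

from credit.models.crossformer import CrossFormer

model = CrossFormer()  # defaults: 4 x 15-level 3D vars, 7 surface, 3 input-only

n_in = 4 * 15 + 7 + 3  # assumed input variables: channels*levels + surface + input-only
x = torch.randn(1, n_in, 2, 640, 1280)  # (batch, vars, frames, lat, lon)

with torch.no_grad():
    y = model(x)

# Expected output variables: channels*levels + surface + output-only = 67.
print(y.shape)
```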
Binary file added docs/source/_static/credit_logo.graffle
Binary file not shown.
Binary file added docs/source/_static/credit_logo.png
Binary file not shown.
