Releases: lRomul/argus

Maintenance release, new guides and updated docs

25 Apr 00:26

Fix

  • Fix AverageMeter for cases where n > 1 (a sketch of the update follows).
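
A minimal sketch of the kind of update this fix concerns, assuming the usual sum/count bookkeeping rather than argus's exact implementation:

    class AverageMeter:
        def __init__(self):
            self.sum = 0.0
            self.count = 0

        def update(self, value, n=1):
            # value is the mean over n samples, so it must be weighted by n
            self.sum += value * n
            self.count += n

        @property
        def average(self):
            return self.sum / self.count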

Breaking Changes

  • Delete the batch object after the iteration completes.
  • Don't store the data loader in the engine state.

New Features

  • Return metrics from the fit method the same way as from validate (see the sketch below).
  • Use the constructor from BuildModel so that a user can pass build_order.
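
A hedged sketch of the new fit return value; model, train_loader, and val_loader are assumed to exist, and the exact metric keys depend on your setup:

    # fit now returns metrics, just like validate
    metrics = model.fit(train_loader,
                        val_loader=val_loader,
                        num_epochs=10)
    print(metrics)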

Docs

New guides on:

  • Custom metrics (a sketch follows this list).
  • Partial weights loading and manipulation.
  • Model export.
  • Custom callbacks.
  • LR schedulers.
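
For instance, a hedged sketch of a custom metric, assuming the Metric interface from the guides (name, better, reset, update, compute); the step_output keys used here are assumptions:

    from argus.metrics import Metric

    class Accuracy(Metric):
        name = 'accuracy'  # logged as train_accuracy / val_accuracy
        better = 'max'     # higher is better for monitoring callbacks

        def reset(self):
            self.correct = 0
            self.count = 0

        def update(self, step_output: dict):
            # 'prediction' and 'target' keys are assumptions about step_output
            pred = step_output['prediction'].argmax(dim=1)
            target = step_output['target']
            self.correct += (pred == target).sum().item()
            self.count += target.size(0)

        def compute(self):
            return self.correct / self.count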

Other improvements:

  • Add new competition solutions to the examples.
  • Improve docstrings in many places.

Chore

  • Use pyproject.toml.
  • Update GitHub Actions versions.
  • Update dependencies.
  • Use the ruff linter.

Full Changelog: v1.0.0...v1.1.0

Argus 1.0.0

09 Nov 20:18

Docs

  • Add guides that provide an in-depth overview of how the framework works (link).
  • Fix minor typos in docstrings.

Examples

  • New example with sequential LR scheduler (link).
  • Transition from torch.distributed.launch to torchrun in the cifar_advanced example.

Chore

  • Add __all__ for all modules.
  • Update CUDA to 11.3.1.
  • Update PyTorch to 1.10.0.

Logo, pydata sphinx theme, custom state loading, share train and val states

21 Mar 20:40

New Features

  • Share train and val states between phases with the phase_states attribute of state.

     import argus.callbacks
     import argus.engine

     @argus.callbacks.on_epoch_complete
     def some_validation_callback(state: argus.engine.State):
         train_step_output = state.phase_states['train'].step_output
         ...
  • Option to use a custom state load function for argus.load_model.

    import pathlib

    import torch
    from argus import load_model

    def state_load_from_dir(dir_path):
        file_path = pathlib.Path(dir_path) / 'some_model_name.pth'
        return torch.load(file_path)

    model = load_model(path_to_dir_with_model,
                       state_load_func=state_load_from_dir)

Docs

  • New logo.
  • Use the pydata sphinx theme for the documentation.

Fix

  • Fix sdist package installation by adding MANIFEST.in with requirements.txt.

Examples

  • Use torch.cuda.amp instead of Apex in the advanced CIFAR example.
  • Add a solution for the RANZCR CLiP - Catheter and Line Position Challenge as an example.

Chore

  • Add setup.cfg with pytest and flake8 settings.
  • Check code style with flake8 in CI.
  • Run tests on macOS and Windows.
  • Update Dockerfile and tests to PyTorch 1.8.0.
  • Update Dockerfile to CUDA 11.1.

Save optimizer state, improve docs and typing

15 Dec 10:07

New Features

  • Add saving of optimizer state for argus.Model and checkpoint callbacks.
    from argus.callbacks import Checkpoint

    model.save('models/model.pth', optimizer_state=True)

    checkpoint = Checkpoint(dir_path='models/', optimizer_state=True)
  • Add get_device method to argus.Model.
  • Add typing and fix most mypy errors.

Fix

  • Remove torch.optim._multi_tensor optimizers from defaults (torch >= 1.7.0).

Docs

  • Add argus.engine section.
  • Add argus.metrics section.
  • Add argus.utils section covering the deep conversion functions (a sketch follows this list).
  • Add docs for decorator callbacks.
  • Add docs for argus.Model methods: __init__, set_device, get_device, get_nn_module.
  • Update examples section.
  • Proofread and improve docs. Many small docstring fixes.
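
A hedged sketch of the deep conversion helpers; the nested batch structure here is illustrative:

    import torch
    from argus.utils import deep_to, deep_detach

    # deep_* functions walk nested dicts/lists/tuples and apply the op to every tensor
    batch = {
        'input': torch.rand(4, 3),
        'meta': [torch.tensor(1.0), torch.tensor(2.0)],
    }
    batch = deep_to(batch, 'cpu', non_blocking=True)  # move all tensors
    batch = deep_detach(batch)                        # detach all tensors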

Internal changes

  • Use abstract container classes from collections.abc.
  • Now Engine and State only work with argus.Model methods as a step_method. The phase name is taken from the method name.
  • Simplify default logging.

Breaking Changes

  • Change optimizer state handling in argus.load_model. Now change_state_dict_func takes two arguments, nn_state_dict and optimizer_state_dict (example; see also the sketch below).
  • Remove handler_kwargs_dict from the attach method of argus.callbacks.Callback.
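
A hedged sketch of the new change_state_dict_func signature; model_path and the deleted keys are hypothetical:

    from argus import load_model

    model_path = 'models/model.pth'  # hypothetical path

    def change_state_dict(nn_state_dict, optimizer_state_dict):
        # e.g. drop the final layer weights before loading (hypothetical keys)
        nn_state_dict.pop('fc.weight', None)
        nn_state_dict.pop('fc.bias', None)
        return nn_state_dict, optimizer_state_dict

    model = load_model(model_path, change_state_dict_func=change_state_dict)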

Tests, replace params while model loading, custom events

09 Sep 09:39

New Features

  • Tests, 100% coverage (codecov).
  • Mechanism for replacing params while loading a model (example).
    # change optimizer params 
    model = load_model(model_path, optimizer=('AdamW', {'lr': 0.001}))
    # load model without optimizer and loss  
    model = load_model(model_path, optimizer=None, loss=None)
  • Custom events for callbacks (example).
    import argus
    from argus.engine import EventEnum
    
    class CustomEvents(EventEnum):
        BACKWARD_START = 'backward_start'
        BACKWARD_COMPLETE = 'backward_complete'
    
    @argus.callbacks.on_event(CustomEvents.BACKWARD_START)
    def before_backward(state):
        ...
    
    class CustomEventModel(argus.Model):
        ...
        def train_step(self, batch, state):
            ...
            state.engine.raise_event(CustomEvents.BACKWARD_START)
            loss.backward()
            state.engine.raise_event(CustomEvents.BACKWARD_COMPLETE)
            ...
  • Typing.
  • Raise exceptions instead of asserts.
  • Set up a unique logger for each instance of argus.Model.
  • Check that params is picklable at model construction.
  • Add create_dir parameter to argus.callbacks.logging.LoggingToCSV.
  • Use an instance of argus.utils.Identity as the default prediction_transform instead of lambda x: x (see the note below).
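
A short note on why Identity is used here; that picklability is the motivation is an assumption, tied to the params check above:

    import pickle

    from argus.utils import Identity

    identity = Identity()
    assert identity(42) == 42  # returns its argument unchanged
    pickle.dumps(identity)     # picklable, unlike lambda x: x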

Fix

  • Correctly save checkpoints with the save_after_exception argument of argus.callbacks.checkpoints.

Breaking Changes

  • Change the default append argument value to False for argus.callbacks.logging.LoggingToFile (see the note below).
  • Rename attribute _scheduler of argus.callbacks.lr_schedulers.LRScheduler to scheduler.
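
A hedged illustration of the new default; the file path is illustrative:

    from argus.callbacks import LoggingToFile

    # append now defaults to False, so the log file is rewritten each run;
    # pass append=True to keep the previous behavior
    LoggingToFile('train.log', append=True)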

Custom build methods, more examples

24 Jul 15:43

Features

  • New mechanics for building attributes. It allows customizing the creation of model parts. Example here.
  • CIFAR example with Distributed Data Parallel, mixed precision, and gradient accumulation cifar_advanced.py.
  • Add save_model method to argus.callbacks.checkpoints. It allows customizing checkpoint saving.
  • Add logging time and LR to argus.callbacks.logging.LoggingToCSV.
  • Add argus.utils.deep_chunk, similar to the scatter function in PyTorch DataParallel (a sketch follows this list).
  • Add Dockerfile and Makefile for development.
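
A hedged sketch of deep_chunk on a nested batch; the structure and sizes are illustrative:

    import torch
    from argus.utils import deep_chunk

    batch = {'input': torch.rand(8, 3), 'target': torch.rand(8)}
    # split every tensor in the structure into 2 chunks along dim 0,
    # e.g. for gradient accumulation
    chunks = deep_chunk(batch, 2)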

Breaking Changes

  • Use the argus.utils.deep_to function instead of the argus.Model.prepare_batch method. argus.Model.prepare_batch is removed, so if you use a custom val_step or train_step you should replace
    input, target = self.prepare_batch(batch, self.device)
    with
    input, target = deep_to(batch, self.device, non_blocking=True)
  • Rename max_epochs to num_epochs in the argus.Model.fit method.
    model.fit(train_loader,
              val_loader=val_loader,
              num_epochs=1000)
  • Remove copy_last parameter from argus.callbacks.checkpoints.
  • Remove period parameter from argus.callbacks.checkpoints.MonitorCheckpoint.

Documentation, LR scheduler step on iteration, new LR schedulers

24 May 18:37

New Features

  • Documentation: https://pytorch-argus.readthedocs.io
  • Add step on iteration option for LR schedulers.
    from argus.callbacks import CosineAnnealingLR
    
    CosineAnnealingLR(10000, step_on_iteration=True)
  • New LR schedulers (a sketch follows this list):
    • argus.callbacks.lr_schedulers.MultiplicativeLR: Multiply learning rate by the factor given in the specified function.
    • argus.callbacks.lr_schedulers.OneCycleLR: One Cycle learning rate policy.
  • Make LR schedulers step on epoch complete instead of epoch start.
  • Compute metric scores under torch.no_grad.
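
A hedged sketch of the new schedulers; both are assumed to forward their arguments to the matching torch.optim.lr_scheduler classes:

    from argus.callbacks import MultiplicativeLR, OneCycleLR

    # multiply the learning rate by 0.95 after every epoch
    MultiplicativeLR(lr_lambda=lambda epoch: 0.95)

    # one-cycle policy over a fixed number of steps
    OneCycleLR(max_lr=0.01, total_steps=10000)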

Fix

  • Fix LR logging with several parameter groups in the optimizer.
  • Fix a key error in the metric redefinition warning.

Breaking Changes

  • PyTorch requirement: torch>=1.1.0.

New LR schedulers, csv logger, state in step functions

21 Aug 16:02

New Features

  • CyclicLR and CosineAnnealingWarmRestarts LR schedulers.

    • argus.callbacks.lr_schedulers.CyclicLR: Support for Cyclical Learning Rate and Momentum.
    • argus.callbacks.lr_schedulers.CosineAnnealingWarmRestarts: Stochastic Gradient Descent with Warm Restarts.
  • argus.callbacks.logging.LoggingToCSV: add a CSV logger callback.

    from argus.callbacks import LoggingToCSV
    
    LoggingToCSV('path/to/log.csv', separator=',', append=False)
  • Add train and eval mode methods to argus.Model.

    • model.train() sets the nn_module in training mode.
    • model.eval() sets the nn_module in evaluation mode.
  • Set step_output of State to None after each iteration to save GPU memory.

Breaking Changes

  • Pass state to train and val step functions:

    Before:

    def train_step(self, batch):
        ...

    Now:

    def train_step(self, batch, state: State):
        print(state.epoch)
        ...
  • Scheduler step on epoch start; train epochs run from 0 to max_epochs - 1. The scheduler callback uses the epoch param of the scheduler step function, so it now works like in 20124.

  • Remove deprecated to_device and detach_tensors utils functions.

Data parallel

20 Aug 13:46

Data parallel for multi-GPU training.

Select a GPU with device indexing:

model = load_model(model_path, device="cuda:1")
model.set_device("cuda:0")

For multi-GPU training, you can use a list of devices:

params = {
    ...,
    'device': ['cuda:0', 'cuda:1']
}
model = CnnFinetune(params)

model = load_model(model_path, device=["cuda:1", "cuda:0"])
model.set_device(["cuda:0", "cuda:1"])

Batch tensors will be scattered along dim 0. The first device in the list is the location of the output.

By default, device "cuda" means single-GPU training on torch.cuda.current_device().