Releases: marovira/helios-ml

1.2.4

01 Dec 19:21

Breaking Changes

  • helios.plugins.optuna.OptunaPlugin.enqueue_failed_trials has been removed in favour of helios.plugins.optuna.resume_study. See the notes below for the new system.

Feature Changes

  • Provides full support for resuming stopped or interrupted Optuna studies. The new resume_study function resumes a study by creating a new DB, copying the existing trials over, and enqueuing any trials that are considered failures. Please see the documentation for more info.
  • Adds a new function to checkpoint and restore samplers when using Optuna. This now provides full reproducibility (subject to the usual constraints from Optuna and PyTorch).
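
The resume behaviour described above can be pictured with a small, library-free sketch. It is not the Helios or Optuna implementation: the State and Trial types and the split_for_resume helper are hypothetical names, and the rule that interrupted (still running) trials count as failures is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class State(Enum):
    """Hypothetical trial states, loosely modelled on a tuning study."""
    COMPLETE = auto()
    PRUNED = auto()
    FAIL = auto()
    RUNNING = auto()  # a trial that was interrupted mid-run


@dataclass(frozen=True)
class Trial:
    params: dict
    state: State
    value: Optional[float] = None


def split_for_resume(trials):
    """Split old trials into those copied as-is and those to run again.

    Assumption: anything that did not finish (FAIL or RUNNING) is treated
    as a failure and its parameters are queued for another attempt.
    """
    finished = {State.COMPLETE, State.PRUNED}
    copied = [t for t in trials if t.state in finished]
    requeued = [t.params for t in trials if t.state not in finished]
    return copied, requeued


old_trials = [
    Trial({"lr": 0.1}, State.COMPLETE, 0.5),
    Trial({"lr": 0.01}, State.FAIL),
    Trial({"lr": 0.001}, State.RUNNING),
]
copied, requeued = split_for_resume(old_trials)
```

Here the completed trial would be carried into the new study unchanged, while the two unfinished parameter sets would be enqueued so they run again.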

Release Notes

1.2.3...1.2.4

1.2.3

15 Nov 01:32

Emergency Patch

  • Fixes a critical issue with helios.metrics.functional.calculate_mae_torch that was causing it to return incorrect values.
  • Unit tests have been updated to ensure this doesn't happen again.
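
For anyone who wants to sanity-check MAE values after this patch, a plain-Python reference can help. The sketch below assumes the standard definition of mean absolute error (the mean of the element-wise absolute differences) over flat sequences of numbers. The mean_absolute_error name is hypothetical, and Helios's own functions may accept other input types or extra parameters that are not modelled here.

```python
def mean_absolute_error(pred, target):
    """Reference MAE over two equal-length sequences of numbers."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if len(pred) == 0:
        raise ValueError("inputs must not be empty")
    # Mean of |pred_i - target_i| over all elements.
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


# Absolute differences are 0, 2 and 3, so the mean is 5 / 3.
reference = mean_absolute_error([1.0, 2.0, 3.0], [1.0, 4.0, 0.0])
```

Comparing a value like this against the library's result on the same flattened data is a quick way to confirm the fix on your own inputs.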

Release Notes

1.2.2...1.2.3

1.2.2

14 Nov 20:35

Breaking Changes

  • The Optuna plug-in no longer reports metrics automatically. Please call report_metrics whenever the metrics are ready to be sent to the trial.

Bug Fixes

  • Fixes an issue where the plug-in attempted to report metrics before on_validation_end had been called. As a result, the reported metrics were one validation cycle behind the current cycle.
  • Updated the docs to ensure they are consistent with the code.
  • Fixed an issue where online docs did not show source code.

Release Notes

1.2.1...1.2.2

1.2.1

08 Nov 00:10

Breaking Changes

  • The Optuna plug-in no longer takes in num_trials as an argument. Please see the feature changes for more details.

Bug Fixes

  • The Optuna plug-in now has functionality to correctly resume trials that failed due to user intervention or some other errors. Please see the documentation on the new enqueue_failed_trials function for more information.

Full Changelog

1.2.0...1.2.1

1.2.0

05 Nov 23:30

Breaking Changes

  • helios.data.functional.load_image now uses OpenCV instead of PIL as its back-end. If you call the function with default arguments, no changes are needed. However, if you call it as load_image(path, "RGB"), change the call to load_image(path, cv2.IMREAD_COLOR) to get the equivalent behaviour. If you require PIL, use helios.data.functional.load_image_pil.
  • Helios now does exception handling internally. As a result, any exception that is not registered in the training or testing exception lists will be handled internally and swallowed by Helios.
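
The practical effect of the exception-handling change can be illustrated with a generic pattern. This is only a sketch of "swallow unless registered", not Helios's code: run_guarded and its passthrough parameter are hypothetical names, and the real lists are configured through the trainer as described in the Helios documentation.

```python
import logging

logger = logging.getLogger("exception_sketch")


def run_guarded(fn, passthrough=()):
    """Call fn(); log and swallow exceptions unless their type is registered.

    passthrough is a hypothetical stand-in for a list of exception types
    that the caller wants to receive instead of having them swallowed.
    """
    try:
        return fn()
    except Exception as exc:
        if isinstance(exc, tuple(passthrough)):
            raise  # registered type: let the calling code handle it
        logger.exception("Swallowed %s", type(exc).__name__)
        return None


def failing_step():
    raise ValueError("bad batch")


# Not registered: the error is logged and the call returns None.
swallowed = run_guarded(failing_step)

# Registered: the error reaches the caller.
try:
    run_guarded(failing_step, passthrough=[ValueError])
    propagated = False
except ValueError:
    propagated = True
```

If your code relied on exceptions escaping from training or testing, the second form is the behaviour you need to opt into.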

Feature Changes

  • Replaced PIL in helios.data.functional.load_image with OpenCV to allow more flexibility in what types of images are loaded. In order to maintain compatibility with PyTorch, helios.data.functional.load_image_pil has been added so images can be loaded through PIL.
  • ToImageTensor is now type-hinted correctly with all the possible types it accepts.
  • Exception handling has been improved. The new system correctly logs exceptions in both single and distributed training cases. The side-effect of this is that Helios will now swallow exceptions unless explicitly told otherwise. See the documentation for the Trainer.fit and Trainer.test functions for more details.
  • The Optuna plug-in now has a system to stop optimisation whenever the given number of trials has been reached. This ensures that the number of trials is reached regardless of the number of runs it takes to get there.

Bug Fixes

  • The Model._val_scores and Model._test_scores tables now accept any type as their values.

Full Changelog

1.1.0...1.2.0

1.1.0

11 Sep 19:03

Breaking Changes

  • Dependencies have been updated. Please see the README for more information.
  • Helios now requires a minimum NumPy version of 2.0.0.
  • The TrainingState struct was previously saved in checkpoints as a dictionary. This has now been changed to save the struct itself, so you must migrate your checkpoints to the new system.

Feature Changes

  • Introduces a new plug-in system to extend the functionality of Helios.
  • Introduces a new safe_torch_load function that wraps torch.load with weights_only set to True. This addresses the warnings coming from PyTorch starting with 2.4.0.
  • Introduces a way to have the trainer ignore certain exception types when training so they can be caught by the calling code.
  • Adds a multi-processing queue to the trainer (available only in distributed mode) that allows data to be passed back to the main process.
  • Adds native integration with Optuna through the new OptunaPlugin.
  • Adds a new CUDAPlugin that automatically moves batches to the set GPU device.

Bug Fixes

  • When setting both CPU and GPU for the trainer, an exception is now raised instead of silently ignoring the CPU flag.
  • Unit tests are now expanded to cover all supported versions of Python.
  • Protobuf is no longer fixed to be less than 5.0.0.

1.0.0

18 Jul 22:38

First official release of Helios

Updates

  • Adds new unit tests to ensure device and map locations are correct.
  • Adds a way to add text to the default Helios banner.
  • Adds a tool to migrate checkpoints created by previous versions of Helios.
  • Cleans up and updates all documentation.

Breaking Changes

  • Checkpoints created with prior versions of Helios will no longer work. You may migrate them to the latest version using python -m helios.chkpt_migrator <chkpt-root>

Full Changelog

0.3.0...1.0.0

0.3.0

21 Jun 00:17
Pre-release

Updates

  • Adds a new set of callbacks to the Model class that are called at the start/end of each epoch.
  • Adds a way to set a custom collate_fn for the dataloader.
  • The Model no longer contains abstract methods.
  • Changes the call site of model.on_training_start so print statements don't interfere with the progress bar.
  • Extends the list of optimizers and schedulers so all the ones provided by PyTorch are registered by default.
  • Extends the should_training_stop functionality to allow breaking out of the loop after a training step.
  • Adds documentation with Sphinx.
  • Allows __version__ to be directly imported from the helios package.

Breaking Changes

  • ToTensor has been renamed as ToImageTensor in order to be more explicit about what the class does.

Full Changelog

0.2.0...0.3.0

0.2.0

09 May 21:14
Pre-release

Updates

  • Fixes the way epochs are numbered. This should ensure that all epoch counts are now consistent with each other regardless of training type.
  • Fixes an issue where training with iterations and gradient accumulation resulted in half iterations being run after training should've stopped.
  • Removes F1, recall, and precision metrics. The implementations were not generic enough to be shipped with Helios.
  • Refactors the MAE implementation to make it more generic in terms of the types of tensors it accepts.
  • Adds a numpy version of MAE.

Full Changelog

0.1.9...0.2.0

0.1.9

07 May 21:56
Pre-release

Updates

  • Adds a flag to disable the printing of the banner.

Full Changelog

0.1.8...0.1.9