
v0.0.1.dev14

Pre-release
@nmichlo nmichlo released this 04 Jun 23:08

Overview

This release consists mostly of a large set of refactors, along with reproducibility improvements for seeds and datasets.

Notable Changes

  • Data now relies on disent.data.datafile.DataFiles, which are deterministic, hash- and cache-based file generators that can fetch or pre-process data.
  • Added XYSquaresMinimalData, a minimal, faster version of XYSquaresData without any configuration options. With default parameters, data from XYSquaresData should equal data from XYSquaresMinimalData.
  • Added PickleH5pyFile that can pickle an hdf5 file and dataset. This is intended to be used with torch DataLoaders or multiprocessing.
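The reason a picklable file wrapper helps with DataLoaders and multiprocessing is that open file handles generally cannot be sent to worker processes. A common pattern is to pickle only the path and reopen the file lazily in each process. The sketch below illustrates that pattern with a plain binary file so that it runs without h5py. The class name, attributes and methods are hypothetical and are not disent's actual PickleH5pyFile implementation.

```python
import pickle


class PicklableFileSketch:
    """Hypothetical illustration of the pickle-by-path pattern (not disent's code)."""

    def __init__(self, path):
        self._path = path
        self._handle = None  # opened lazily, at most once per process

    @property
    def handle(self):
        # Open the file on first access, e.g. inside a worker process.
        if self._handle is None:
            self._handle = open(self._path, "rb")
        return self._handle

    def read_all(self):
        self.handle.seek(0)
        return self.handle.read()

    def close(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None

    def __getstate__(self):
        # Only the path is pickled; the open handle is deliberately dropped.
        return {"_path": self._path}

    def __setstate__(self, state):
        # The unpickled copy starts without a handle and reopens on demand.
        self._path = state["_path"]
        self._handle = None


def roundtrip(obj):
    """Simulate sending the object to another process via pickle."""
    return pickle.loads(pickle.dumps(obj))
```

With this design every process ends up with its own independent handle, which avoids sharing file state across workers.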

Definitely Breaking Changes

  • renamed classes:

    • renamed AugmentableDataset to DisentDataset
    • renamed BaseFramework to DisentFramework
    • renamed BaseEncoderModule to DisentEncoder
    • renamed BaseDecoderModule to DisentDecoder
  • consolidated maths and helper functions into new submodule disent.nn

    • disent.nn.weights initialisation functions, originally from disent.model.init
    • disent.nn.modules basic modules from various locations including DisentModule, DisentLightningModule, BatchView, Unsqueeze3D, Flatten3D
    • disent.nn.transform transform and augment functions and classes from disent.transform; these still need to be cleaned up in future releases.
    • disent.nn.loss various loss functions from other places including triplet, kl, softsort and reduction modules
    • disent.nn.functional various differentiable torch helper functions mostly from disent.util.math, including functions for computing the Covariance, Correlation, Generalised Mean, PCA, DCT, Channel-Wise convolutions and more! Some functions such as kernel generation need to be moved out of here.
  • split up and consolidated utilities:

    • disent.util.cache caching utilities including the stalefile decorator that only runs the wrapped function if the specified file is stale (hash does not match, or file does not exist)
    • disent.util.colors ANSI escape codes
    • disent.util.function wrapper, decorator and inspect utilities
    • disent.util.hashing compute the full hash of a file, or a fast hash based on the algorithm described in the imohash README.
    • disent.util.in_out originally from disent.data.util for handling file retrieval/downloading/copying and saving
    • disent.util.iters general iterators or map functions, including iter_chunks and iter_rechunk
    • disent.util.paths path handling and file or directory management
    • disent.util.profiling timers & memory usage
    • disent.util.seeds seed management contexts and functions
    • disent.util.strings string formatting helper functions
  • removed and cleaned up functions from:

    • disent.data.hdf5
    • disent.dataset.__init__
    • disent.util.__init__
    • disent.schedule.lerp renamed activate to scale_ratio and removed other functions.
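The staleness check described for the stalefile decorator in disent.util.cache can be sketched as follows. This is a minimal illustration only: the decorator name, its arguments and the choice of SHA-256 are assumptions and may differ from what disent actually does.

```python
import hashlib
import os
from functools import wraps


def file_sha256(path, chunk_size=65536):
    """Compute the full SHA-256 hash of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def stalefile_sketch(path, expected_hash):
    """Hypothetical decorator (not disent's signature).

    The wrapped function is called with `path` as its first argument, but only
    if the file is stale: it does not exist, or its hash does not match.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            is_stale = (not os.path.isfile(path)) or (file_sha256(path) != expected_hash)
            if is_stale:
                func(path, *args, **kwargs)
            return path
        return wrapper
    return decorator
```

Calling a decorated generator function twice in a row should run its body only once, as long as the file it writes matches the expected hash.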

Other Changes

  • Replaced GroundTruthData specialisations with general loading from DataFiles.
  • StateSpace now stores factor_names instead of GroundTruthData, in preparation for a rewrite of datasets to use dependency injection and samplers.

Experiment Config & Runner Changes

  • Many config fixes for refactors
  • Experiments can now be seeded
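As a rough idea of what a seed management context can look like, the sketch below temporarily seeds Python's standard random module and restores the previous state afterwards. The name temp_seed_sketch is hypothetical and is not taken from disent.util.seeds. A real experiment runner would presumably also need to seed numpy and torch, which this stdlib-only example leaves out.

```python
import random
from contextlib import contextmanager


@contextmanager
def temp_seed_sketch(seed):
    """Hypothetical context manager: seed `random` inside the block only."""
    state = random.getstate()  # remember the global RNG state
    random.seed(seed)
    try:
        yield
    finally:
        random.setstate(state)  # restore it, even if the block raises


def sample_with_seed(seed, n=3):
    """Draw n reproducible floats without disturbing the global RNG state."""
    with temp_seed_sketch(seed):
        return [random.random() for _ in range(n)]
```

Two calls to sample_with_seed with the same seed return the same values, while code outside the context continues from its own random sequence.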

New Tests

  • test PickleH5pyFile multiprocessing support
  • test XYSquaresData and XYSquaresMinimalData similarity