v0.3.0

@nmichlo nmichlo released this 11 Nov 09:12
· 366 commits to main since this release

This release touches most of the codebase.

Major Additions

  • added XYObjectShadedData dataset, which is exactly the same as XYObjectData but the ground truth factors differ. This might be useful for testing how metrics are affected by the ground truth representation of factors. Note that XYObjectData differs from previous versions due to this.
  • added DSpritesImagenetData dataset that is the same as DSpritesData but masks the background or foreground, depending on the mode, and replaces the masked content with deterministic data from tiny-imagenet
  • added disent.framework.vae.AdaGVaeMinimal which is a minimal implementation of AdaVae configured to run in gvae mode
  • added disent.util.lightning.callbacks.VaeGtDistsLoggingCallback which logs various distance matrices computed from averaged ground truth factor traversals.
  • Updated experiment files to use hydra 1.1
    • can now switch between train and prepare_data modes with the defaults group run_action=train

Other Additions

  • added shallow_copy to disent.dataset.DisentDataset enabling a shallow copy of the dataset but overriding specific properties such as the transform
  • added new disent.dataset.transform including ToImgTensorF32 (was ToStandardisedTensor) and ToImgTensorU8
  • additions to H5Builder
    • add_dataset_from_array that constructs and fills a dataset in the hdf5 file from an array
    • converted into context manager instead of manually opening the hdf5 file
  • additions to StateSpace (and ground truth dataset child classes)
    • normalise_factor_idx converts the name of a ground truth factor into its numerical value
    • normalise_factor_idxs converts a name, an idx, a list of names, or a list of idxs into the numerical values of the ground truth factors.
  • disent.dataset.util.stats added compute_data_mean_std(data) to compute the mean and std of datasets
  • added disent.schedule.SingleSchedule
  • improved disent.util.deprecate.deprecated, now prints the stack trace for the call location of the deprecated function by default. This can be disabled.
  • added restart method to disent.util.profiling.Timer for easy use within a loop
  • added disent.util.vizualize.plot which contains various matplotlib helper code used throughout the library and PyTorch lightning callbacks.
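
The factor index normalisation described for StateSpace above can be pictured with a small standalone sketch. This is not disent's implementation: the function names mirror the release notes, but the bodies, the explicit factor_names argument, the error handling, and the example factor names are all assumptions made for illustration.

```python
from typing import Sequence, Tuple, Union

FactorRef = Union[str, int]


def normalise_factor_idx(factor_names: Sequence[str], factor: FactorRef) -> int:
    """Map a factor name or index to its integer index (hypothetical helper)."""
    if isinstance(factor, str):
        if factor not in factor_names:
            raise KeyError(f"unknown factor name: {factor!r}")
        return list(factor_names).index(factor)
    idx = int(factor)
    if not 0 <= idx < len(factor_names):
        raise IndexError(f"factor index out of range: {idx}")
    return idx


def normalise_factor_idxs(
    factor_names: Sequence[str],
    factors: Union[FactorRef, Sequence[FactorRef]],
) -> Tuple[int, ...]:
    """Accept a name, an idx, or a mixed list of both, and return indices."""
    # A single name or idx is treated as a list of one.
    if isinstance(factors, (str, int)):
        factors = [factors]
    return tuple(normalise_factor_idx(factor_names, f) for f in factors)


# Made-up factor names, only for this example.
names = ("x", "y", "scale")
single = normalise_factor_idxs(names, "scale")
mixed = normalise_factor_idxs(names, ["y", 0])
print(single, mixed)
```

Accepting both names and indices in one entry point means calling code can refer to factors by whichever form is more readable, while everything downstream works with plain integers.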

Breaking Changes

  • removed confusing observation_shape and obs_shape properties from GroundTruthData and any child classes. Any methods across disent that required these properties had their names updated too. For example, the ArrayGroundTruthData class now takes x_shape.
    • observation_shape (H, W, C) should be replaced with img_shape, you will need to update your overrides in child classes
    • obs_shape (C, H, W) should be replaced with x_shape
  • XYObjectData default parameters updated for XYObjectShadedData; the dataset and colour palettes differ slightly from previous versions.
  • moved module disent.nn.transform to disent.dataset.transform
    • renamed ToStandardisedTensor to ToImgTensorF32
  • H5Builder converted into context manager, similar API to open or h5py.File
  • ReconLossHandlerMse changed to not scale or centre the output. We now normalise the data instead, which is more correct
  • AdaVae and inheriting classes have various functions renamed for clarity
  • disent.metrics functions have ground_truth_dataset parameter renamed to dataset
  • disent.model.ae renamed DecoderTest and EncoderTest to DecoderLinear and EncoderLinear
  • disent.registry updated to use a new, simpler class structure and format. Some variables have been renamed, and registry names have been changed to plurals, eg. OPTIMIZER is now OPTIMIZERS
  • disent.schedule cleaned up
    • renamed various variables and parameters min_step -> start_step, max_step -> end_step
    • removed disent.schedule.lerp.scale() function, as it is the same as lerp just not clipped
  • disent.util.lightning.callbacks.VaeDisentanglementLoggingCallback renamed to VaeMetricLoggingCallback
  • docs.examples updated to use new XYObjectData version and ToImgTensorF32 transform
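
The removal of scale() rests on it being the same as lerp without clipping. A minimal sketch of that relationship follows, assuming ratio-based interpolation between two values. The signatures and the clip flag are illustrative assumptions and may not match the functions in disent.schedule.lerp; only the start_step and end_step parameter names come from the notes above.

```python
def lerp(ratio: float, start: float, end: float, clip: bool = True) -> float:
    """Linearly interpolate from start to end.

    With clip=True the ratio is clamped to [0, 1], so the result never
    leaves the range between start and end.
    """
    if clip:
        ratio = min(max(ratio, 0.0), 1.0)
    return start + ratio * (end - start)


def lerp_step(step: int, start_step: int, end_step: int,
              start: float, end: float) -> float:
    """Interpolate by training step using start_step / end_step bounds."""
    ratio = (step - start_step) / (end_step - start_step)
    return lerp(ratio, start, end)


clipped = lerp(1.5, 0.0, 10.0)                # ratio clamped to 1.0
unclipped = lerp(1.5, 0.0, 10.0, clip=False)  # what an unclipped variant returns
quarter = lerp_step(50, 0, 200, 0.0, 1.0)     # a quarter of the way through
print(clipped, unclipped, quarter)
```

Since the two variants differ only in the clamp, keeping a single function avoids two near-identical code paths.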

Deprecations

  • deprecated the ground_truth_data property on DisentDataset; it should be replaced with the shorter gt_data property. References to ground_truth_data have been replaced throughout disent.
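
A renamed property like this is commonly kept alive as a warning alias. The sketch below shows that general pattern with the standard warnings module. The class is a hypothetical stand-in, not DisentDataset, and disent's own deprecation helper may behave differently.

```python
import warnings


class ExampleDataset:
    """Hypothetical stand-in for a dataset wrapper; not disent's class."""

    def __init__(self, gt_data):
        self._gt_data = gt_data

    @property
    def gt_data(self):
        # The new, shorter name.
        return self._gt_data

    @property
    def ground_truth_data(self):
        # The old name still works but warns. stacklevel=2 attributes the
        # warning to the caller's line rather than to this property.
        warnings.warn(
            "ground_truth_data is deprecated, use gt_data instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.gt_data


dataset = ExampleDataset(gt_data=[1, 2, 3])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_value = dataset.ground_truth_data
print(old_value, [str(w.message) for w in caught])
```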

Fixes

  • Fixed Mpi3dData datasets, and added file hashes
  • Updated requirements
  • Many minor fixes, usability and error message improvements

Hydra Experiment Changes

Hydra Config has finally been updated from version 1.0 to 1.1, adding support for recursive defaults and recursive instantiation. This allows us to remove all of our custom & hacky hydra helper code that previously enabled these features.

  • hydra now supports recursive instantiation
  • value based specialisation can now be done with recursive defaults using dummy groups

Updating hydra was a good opportunity to re-structure the configuration format.

  • All settings defined in the root config that are referenced elsewhere are now in the settings key.
  • Default settings defined in various subgroups that are referenced elsewhere are often placed in the dsettings key.
  • Keys for various objects were renamed for clarity, eg. augment.transform was renamed to augment.augment_cls
  • All datasets now require the meta.vis_mean and meta.vis_std keys, which are used both to normalise the dataset and to re-scale it to [0, 1] for visualisation during training.
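
To make the role of a stored mean and std concrete, here is a rough sketch of normalising values and then mapping them back to [0, 1] for display. It is a single-channel simplification in plain Python. The function names are hypothetical, and inverting the normalisation and then clipping is an assumption about how the re-scaling could work, not a description of disent's code.

```python
from statistics import fmean, pstdev


def compute_mean_std(values):
    """Population mean and std of a flat list of pixel values."""
    return fmean(values), pstdev(values)


def normalise(values, mean, std):
    """Centre and scale values, as done before feeding them to a model."""
    return [(v - mean) / std for v in values]


def to_visual_range(values, mean, std):
    """Undo the normalisation, then clip to [0, 1] for display."""
    return [min(max(v * std + mean, 0.0), 1.0) for v in values]


# Made-up pixel intensities in [0, 1], only for this example.
pixels = [0.0, 0.25, 0.5, 1.0]
vis_mean, vis_std = compute_mean_std(pixels)
normed = normalise(pixels, vis_mean, vis_std)
restored = to_visual_range(normed, vis_mean, vis_std)
print(vis_mean, restored)
```

Storing the statistics alongside the dataset config means the same two numbers drive both directions, so visualisations stay consistent with what the model was trained on.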

Every config file has been touched, so the best approach is probably to look at the new system. The general structure remains the same, but the recursive defaults from Hydra 1.1 allow us to implement various things more cleanly.

  • new defaults group run_launcher to easily swap between slurm and local
  • defaults group run_location only specifies machine resources and paths
  • new defaults group sampling specifies details and the sampling strategy to be used by the frameworks
  • new defaults group run_action to switch between training (train) and downloading and installing datasets (prepare_data)