Cordage: Computational Research Data Management

Parameterize experiments using dataclasses and use cordage to easily parse configuration files and command line options.



Repository | Documentation | Package


Cordage is in a very early stage. Currently, it lacks a lot of documentation and a wider range of features. If you think it could be useful for you, try it out and leave suggestions, complaints, and improvement ideas as GitHub issues.

Check out the roadmap for an outline of the next steps that are planned for the development of this package.


Motivation

In many cases, we want to execute and parameterize a main function. Since experiments can quickly become more complex and may use an increasing number of parameters, it often makes sense to store these parameters in a dataclass.

Cordage makes it easy to load configuration files or configure the experiment via the command line.

Quick Start

For more detailed information, check out the documentation.

Installation

In an environment of your choice (python>=3.8), run:

pip install cordage

Example

from dataclasses import dataclass
import cordage


@dataclass
class Config:
    lr: float = 5e-5
    name: str = "MNIST"


def train(config: Config):
    """Help text which will be shown."""
    print(config)


if __name__ == "__main__":
    cordage.run(train)

To use cordage, you need a main function (e.g. train in the example above) which takes a dataclass configuration object as an argument. Use cordage.run(train) to execute this function with arguments passed via the command line. Cordage parses the configuration and creates an output directory; if the function accepts an output_dir argument, the directory is passed to it through that argument.
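To give a feel for what is being automated here, the following is a minimal standard-library sketch of turning dataclass fields into command-line options. It is not cordage's implementation; the helper parse_config is hypothetical and only handles simple field types such as float and str.

```python
import argparse
from dataclasses import dataclass, fields


@dataclass
class Config:
    lr: float = 5e-5
    name: str = "MNIST"


def parse_config(cls, argv):
    """Hypothetical helper: build one --option per dataclass field."""
    parser = argparse.ArgumentParser()
    for f in fields(cls):
        # f.type is the field annotation; this sketch assumes it is a
        # plain callable such as float or str.
        parser.add_argument(f"--{f.name}", type=f.type, default=f.default)
    # Turn the parsed namespace back into a dataclass instance.
    return cls(**vars(parser.parse_args(argv)))


config = parse_config(Config, ["--lr", "0.001"])
print(config)  # Config(lr=0.001, name='MNIST')
```

Fields that are not given on the command line keep their dataclass defaults, which is why the dataclass can serve as the single place where parameters are declared.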

See the examples in the examples directory for more details.

Features

The main purpose of cordage is to manage configurations to make configuring reproducible experiments easy. Cordage automatically generates a command-line interface which can be used to parse configuration files and/or set specific configuration fields via CLI options (run the experiment with the --help option to get an overview of the available configuration fields).
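The combination of a configuration file with individual CLI options can be pictured as a base configuration plus overrides. The sketch below illustrates that idea with the standard library only; the JSON file format, the helper name load_config, and the rule that overrides win over file values are assumptions made for illustration, not a description of cordage's behaviour.

```python
import json
from dataclasses import dataclass, replace


@dataclass
class Config:
    lr: float = 5e-5
    name: str = "MNIST"


def load_config(file_text, overrides):
    """Hypothetical helper: file values first, then per-field overrides."""
    # Fields missing from the file fall back to the dataclass defaults.
    base = Config(**json.loads(file_text))
    # Assumption: explicitly set options take precedence over the file.
    return replace(base, **overrides)


file_text = '{"name": "CIFAR10"}'
config = load_config(file_text, {"lr": 1e-3})
print(config)  # Config(lr=0.001, name='CIFAR10')
```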

By using the __series__ key, it is possible to invoke multiple repetitions of an experiment using the same base configuration but varying some of the configuration fields. The resulting trial runs are (by default) saved in a common series-level directory.
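Conceptually, a series expands one base configuration into several trial configurations. The sketch below shows that expansion on plain dictionaries. The layout used here (a list of per-trial override dictionaries under __series__) is an assumption for illustration; check the documentation for the formats cordage actually accepts. The helper expand_series is hypothetical.

```python
def expand_series(raw):
    """Hypothetical helper: one config dict per trial in the series."""
    # Everything except the series key forms the shared base configuration.
    base = {k: v for k, v in raw.items() if k != "__series__"}
    # Assumed layout: a list of dicts, each overriding some base fields.
    # Without a series key, the result is a single trial equal to the base.
    variations = raw.get("__series__", [{}])
    return [{**base, **overrides} for overrides in variations]


raw = {
    "name": "MNIST",
    "lr": 5e-5,
    "__series__": [{"lr": 1e-3}, {"lr": 1e-4}],
}
for trial in expand_series(raw):
    print(trial)
```

Each resulting dictionary keeps the shared name field and differs only in lr, which mirrors the "same base configuration, varying fields" idea described above.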

Additionally, cordage can provide an output directory (via the output_dir argument), where it stores the configuration that was used as well as some experiment metadata.
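As a rough picture of why this helps reproducibility, the sketch below writes a configuration and a small metadata record next to each other in a run directory. The file names config.json and metadata.json, the started_at field, and the helper save_run_info are all hypothetical; the files cordage actually writes may differ.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class Config:
    lr: float = 5e-5
    name: str = "MNIST"


def save_run_info(config, output_dir):
    """Hypothetical helper: store the config and basic metadata."""
    output_dir.mkdir(parents=True, exist_ok=True)
    # The exact configuration used, so the run can be repeated later.
    (output_dir / "config.json").write_text(json.dumps(asdict(config), indent=2))
    # Minimal metadata; here only the start time in UTC.
    metadata = {"started_at": datetime.now(timezone.utc).isoformat()}
    (output_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return output_dir


with tempfile.TemporaryDirectory() as tmp:
    out = save_run_info(Config(), Path(tmp) / "run")
    print(sorted(p.name for p in out.iterdir()))  # ['config.json', 'metadata.json']
```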