From 9176ca60fa9f1fd20dc452bd8bfdabeb72e080b8 Mon Sep 17 00:00:00 2001
From: "C. Benjamins" <75323339+benjamc@users.noreply.github.com>
Date: Mon, 2 Dec 2024 15:09:10 +0100
Subject: [PATCH] 1085 runhistory documentation (#1175)

* Content draft of the documentation.

* Fix typos

* Fix typos

* Add important info to Facade and Scenario Documentation

* adapt syntax

* Update runhistory describtion

* refactor(8_logging): add .

---------

Co-authored-by: Lukas Fehring
---
 docs/3_getting_started.md        |   7 ++
 docs/advanced_usage/8_logging.md | 107 +++++++++++++++++++++++++++++++
 2 files changed, 114 insertions(+)

diff --git a/docs/3_getting_started.md b/docs/3_getting_started.md
index 3237db587..79bbc7c53 100644
--- a/docs/3_getting_started.md
+++ b/docs/3_getting_started.md
@@ -75,6 +75,7 @@ from smac import Scenario
 
 scenario = Scenario(
     configspace=cs,
+    name="experiment_name",
     output_directory=Path("your_output_directory")
     walltime_limit=120,  # Limit to two minutes
     n_trials=500,  # Evaluated max 500 trials
@@ -83,9 +84,15 @@ scenario = Scenario(
 )
 ```
 
+!!! note
+    If no `name` is given, a hash of the experiment is used. Running the same experiment again later produces exactly the same hash. This matters because the optimization warmstarts on the preexisting evaluations unless specified otherwise in the [Facade][smac.facade.abstract_facade].
+
 ## Facade
 
+!!! warning
+    By default, facades try to warmstart on preexisting logs. This behavior can be changed with the `overwrite` parameter.
+
 A [facade][smac.facade.abstract_facade] is the entry point to SMAC, which constructs a default optimization pipeline for you. SMAC offers various facades, which satisfy many common use cases and are crucial to achieving peak performance.
 The idea behind the facades is to provide a simple interface to all of SMAC's components,
diff --git a/docs/advanced_usage/8_logging.md b/docs/advanced_usage/8_logging.md
index 7d24e84e2..18e8d7496 100644
--- a/docs/advanced_usage/8_logging.md
+++ b/docs/advanced_usage/8_logging.md
@@ -27,6 +27,113 @@ The table shows you the specific levels:
 | 40 | ERROR |
 | 50 | CRITICAL |
 
+## Standard Logging Files
+
+By default, SMAC generates several files to document the optimization process. These files are stored in the directory structure `./output_directory/name/seed`, where `name` is replaced by a hash if no name is explicitly provided. This behavior can be customized through the [Scenario][smac.scenario] configuration, as shown in the example below:
+```python
+Scenario(
+    configspace = some_configspace,
+    name = 'experiment_name',
+    output_directory = Path('some_directory'),
+    ...
+)
+```
+Notably, if an output already exists at `./some_directory/experiment_name/seed`, the behavior is determined by the `overwrite` parameter in the [facade's][smac.facade.abstract_facade] settings. This parameter specifies whether to continue the previous run (default) or start a new run.
+
+The output is split into four different log files and a copy of the utilized [Configuration Space of the ConfigSpace library](https://automl.github.io/ConfigSpace/latest/).
+
+### intensifier.json
+The [intensification][Intensification] is logged in `intensifier.json` and has the following structure:
+
+```json
+{
+  "incumbent_ids": [
+    65
+  ],
+  "rejected_config_ids": [
+    1
+  ],
+  "incumbents_changed": 2,
+  "trajectory": [
+    {
+      "config_ids": [
+        1
+      ],
+      "costs": [
+        0.45706284046173096
+      ],
+      "trial": 1,
+      "walltime": 0.029736042022705078
+    },
+    #...
+  ],
+  "state": {
+    "tracker": {},
+    "next_bracket": 0
+  }
+}
+```
+
+### optimization.json
+The state of the optimization process is recorded in `optimization.json` with the following structure:
+
+```json
+{
+  "used_walltime": 184.87366724014282,
+  "used_target_function_walltime": 20.229533672332764,
+  "last_update": 1732703596.5609574,
+  "finished": false
+}
+```
+### runhistory.json
+The `runhistory.json` file is split into four parts: `stats`, `data`, `configs`, and `config_origins`.
+`stats` contains aggregate statistics on the evaluated configurations:
+```json
+  "stats": {
+    "submitted": 73,
+    "finished": 73,
+    "running": 0
+  },
+```
+
+`data` contains a list of entries, one for each configuration:
+```json
+  "data": [
+    [
+      1, # config_id
+      null, # instance or None
+      209652396, # seed or None
+      null, # budget or None
+      5.4345623938566385, # cost
+      6.699562072753906e-05, # time
+      6.299999999992423e-05, # cpu_time
+      1, # status
+      1733133181.2144582, # start_time
+      1733133181.21695, # end_time
+      {} # additional_info
+    ],
+    ...
+  ]
+```
+
+`configs` is a human-readable dictionary of configurations, where the keys are the one-based `config_id`. It is important to note that in `runhistory.json`, the indexing is zero-based.
+```json
+  "configs": {
+    "1": {
+      "x": -2.3312147893012
+    },
+```
+
+Lastly, `config_origins` specifies the source of a configuration, indicating whether it stems from the initial design or results from the maximization of an acquisition function.
+```json
+  "config_origins": {
+    "1": "Initial Design: Sobol",
+    ...
+  }
+```
+
+### scenario.json
+The `scenario.json` file contains the overall state of the [Scenario][smac.scenario], serialized as JSON.
 ## Custom File