1174 replace hard coded indices in runhistory #1180

Open. Wants to merge 5 commits into base: development
2 changes: 1 addition & 1 deletion docs/1_installation.md
@@ -1,5 +1,5 @@
# Installation
-
+TODO UPDATE THIS TO INCLUDE MIN_TRIALS. DO NOT ACCEPT A PUSH WITH THIS
## Requirements

SMAC is written in python3 and therefore requires an environment with python>=3.8.
31 changes: 15 additions & 16 deletions docs/advanced_usage/8_logging.md
@@ -96,24 +96,23 @@ The runhistory.json in split into four parts. `stats`, `data`, `configs`, and `c
},
```

-`data` contains a list of entries, one for each configuration.
+`data` contains a list of entries, one for each configuration where the keys are the one-based `config_id`.
```json
-"data": [
-[
-1, # config_id
-null, # instance or None
-209652396, # seed or None
-null, # budget or None
-5.4345623938566385, # cost
-6.699562072753906e-05, # time
-6.299999999992423e-05, # cpu_time
-1, # status
-1733133181.2144582, # start_time
-1733133181.21695, # end_time
-{} # additional_info
-],
+"data": {

Collaborator:
AFAIK the config id identifies a unique configuration, independent of the budget and the instance. If this is the case, we need a different data structure, because currently we would overwrite evaluated trials that share a config but differ in budget or instance. I would propose a list of dictionaries.

Collaborator Author:
I will check that and get back to you.

Collaborator Author:
I will adapt this accordingly.

+"1": {
+"instance": null,
+"seed": 398764591,
+"budget": null,
+"cost": 16916.0,
+"time": 4.0531158447265625e-06,
+"cpu_time": 3.000000006636583e-06,
+"status": 1,
+"starttime": 1733155597.639732,
+"endtime": 1733155597.64017,
+"additional_info": {}
+},
...
-]
+}
```
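
The reviewer's concern about overwritten trials can be illustrated with a minimal sketch. It assumes two hypothetical trials of the same configuration at different budgets; the values and the variable names `trials`, `keyed`, and `as_list` are made up for illustration and are not taken from SMAC.

```python
import json

# Two hypothetical trials of the same configuration (config_id 1),
# evaluated at different budgets. Values are illustrative only.
trials = [
    {"config_id": 1, "instance": None, "seed": 0, "budget": 10.0, "cost": 0.8},
    {"config_id": 1, "instance": None, "seed": 0, "budget": 50.0, "cost": 0.3},
]

# Layout keyed by config_id: the second trial silently replaces the first.
keyed = {}
for trial in trials:
    fields = {k: v for k, v in trial.items() if k != "config_id"}
    keyed[str(trial["config_id"])] = fields

# Layout as a list of dictionaries: every trial survives a JSON round trip.
as_list = json.loads(json.dumps({"data": trials}))["data"]

print(len(keyed))    # 1
print(len(as_list))  # 2
```

With the keyed layout only the trial at budget 50.0 remains, which is why a list of dictionaries carrying `config_id` as an ordinary field looks like the safer choice here.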

`configs` is a human-readable dictionary of configurations, where the keys are the one-based `config_id`. It is important to note that in `runhistory.json`, the indexing is zero-based.
2 changes: 1 addition & 1 deletion examples/2_multi_fidelity/1_mlp_epochs.py
@@ -140,7 +140,7 @@ def plot_trajectory(facades: list[AbstractFacade]) -> None:
mlp = MLP()

facades: list[AbstractFacade] = []
-for intensifier_object in [SuccessiveHalving, Hyperband]:
+for intensifier_object in [ Hyperband]:
# Define our environment variables
scenario = Scenario(
mlp.configspace,
64 changes: 31 additions & 33 deletions smac/runhistory/runhistory.py
@@ -768,25 +768,25 @@ def save(self, filename: str | Path = "runhistory.json") -> None:
----------
filename : str | Path, defaults to "runhistory.json"
"""
-data = []
+data = list()
for k, v in self._data.items():
-data += [
-(
-int(k.config_id),
-str(k.instance) if k.instance is not None else None,
-int(k.seed) if k.seed is not None else None,
-float(k.budget) if k.budget is not None else None,
-v.cost,
-v.time,
-v.cpu_time,
-v.status,
-v.starttime,
-v.endtime,
-v.additional_info,
-)
-]
+data.append(
+{
+"config_id": int(k.config_id),
+"instance": str(k.instance) if k.instance is not None else None,
+"seed": int(k.seed) if k.seed is not None else None,
+"budget": float(k.budget) if k.budget is not None else None,
+"cost": v.cost,
+"time": v.time,
+"cpu_time": v.cpu_time,
+"status": v.status,
+"starttime": v.starttime,
+"endtime": v.endtime,
+"additional_info": v.additional_info,
+}
+)

-config_ids_to_serialize = set([entry[0] for entry in data])
+config_ids_to_serialize = set([entry["config_id"] for entry in data])
configs = {}
config_origins = {}
for id_, config in self._ids_config.items():
@@ -858,31 +858,29 @@ def load(self, filename: str | Path, configspace: ConfigurationSpace) -> None:
# Important to use add method to use all data structure correctly
# NOTE: These hardcoded indices can easily lead to trouble
for entry in data["data"]:
# Set n_objectives first
if self._n_objectives == -1:
-if isinstance(entry[4], (float, int)):
+if isinstance(entry["cost"], (float, int)):
self._n_objectives = 1
else:
-self._n_objectives = len(entry[4])
+self._n_objectives = len(entry["cost"])

cost: list[float] | float
if self._n_objectives == 1:
-cost = float(entry[4])
+cost = float(entry["cost"])
else:
-cost = [float(x) for x in entry[4]]
-
+cost = [float(x) for x in entry["cost"]]
self.add(
-config=self._ids_config[int(entry[0])],
+config=self._ids_config[int(entry["config_id"])],
cost=cost,
-time=float(entry[5]),
-cpu_time=float(entry[6]),
-status=StatusType(entry[7]),
-instance=entry[1],
-seed=entry[2],
-budget=entry[3],
-starttime=entry[8],
-endtime=entry[9],
-additional_info=entry[10],
+time=entry["time"],
+cpu_time=entry["cpu_time"],
+status=StatusType(entry["status"]),
+instance=entry["instance"],
+seed=entry["seed"],
+budget=entry["budget"],
+starttime=entry["starttime"],
+endtime=entry["endtime"],
+additional_info=entry["additional_info"],
)

# Although adding trials should give us the same stats, the trajectory might be different
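
The keyed access in `load` can be sketched outside of SMAC as follows. This is only an illustration under stated assumptions: `FIELDS`, `normalize_entry`, and `parse_cost` are hypothetical helpers that do not exist in the PR, and accepting the old positional format alongside the new one is an assumption about what a loader might want, not something this diff implements.

```python
# Field order of the old positional format, using the key names from the new save().
FIELDS = [
    "config_id", "instance", "seed", "budget", "cost", "time",
    "cpu_time", "status", "starttime", "endtime", "additional_info",
]


def normalize_entry(entry):
    """Return a keyed entry from either an old list entry or a new dict entry."""
    if isinstance(entry, dict):
        return dict(entry)
    if len(entry) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(entry)}")
    return dict(zip(FIELDS, entry))


def parse_cost(raw):
    """Mirror the single- versus multi-objective cost handling shown in the diff."""
    if isinstance(raw, (float, int)):
        return float(raw)
    return [float(x) for x in raw]


# Old-style entry, copied from the documentation example in this PR.
old_entry = [
    1, None, 209652396, None, 5.4345623938566385,
    6.699562072753906e-05, 6.299999999992423e-05, 1,
    1733133181.2144582, 1733133181.21695, {},
]

entry = normalize_entry(old_entry)
cost = parse_cost(entry["cost"])
```

Once every entry is keyed, the call site reads fields by name (`entry["cost"]`, `entry["budget"]`), so reordering or adding fields in `save` no longer shifts positions in `load`.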