1174 replace hard coded indices in runhistory #1180

Open
wants to merge 5 commits into
base: development
Changes from 3 commits
31 changes: 15 additions & 16 deletions docs/advanced_usage/8_logging.md
@@ -96,24 +96,23 @@ The runhistory.json in split into four parts. `stats`, `data`, `configs`, and `c
},
```

`data` contains a list of entries, one for each configuration.
`data` contains a list of entries, one for each configuration where the keys are the one-based `config_id`.
```json
"data": [
[
1, # config_id
null, # instance or None
209652396, # seed or None
null, # budget or None
5.4345623938566385, # cost
6.699562072753906e-05, # time
6.299999999992423e-05, # cpu_time
1, # status
1733133181.2144582, # start_time
1733133181.21695, # end_time
{} # additional_info
],
"data": {
Collaborator

AFAIK, a config id identifies a unique configuration, independent of budget and instance. If that is the case, we need a different data structure: keying the entries by config id would overwrite evaluated trials that share a config but differ in budget or instance. I would propose a list of dictionaries.

Collaborator Author

I will check that and get back to you.

Collaborator Author

I will adapt this accordingly.

"1": {
"instance": null,
"seed": 398764591,
"budget": null,
"cost": 16916.0,
"time": 4.0531158447265625e-06,
"cpu_time": 3.000000006636583e-06,
"status": 1,
"starttime": 1733155597.639732,
"endtime": 1733155597.64017,
"additional_info": {}
},
...
]
}
```
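
The overwrite concern raised in the review comment on this hunk can be illustrated with a small, self-contained sketch. The trial tuples and the field subset below are made up for illustration and are not taken from the PR; the point is only that keying by `config_id` keeps one entry per configuration, while a list of dictionaries keeps one entry per trial.

```python
# Hypothetical trial records: (config_id, instance, seed, budget, cost).
trials = [
    (1, None, 0, 10.0, 0.9),
    (1, None, 0, 100.0, 0.4),  # same config, larger budget
    (2, None, 0, 10.0, 0.7),
]

# Keyed by config_id alone: the second trial of config 1 replaces the first.
by_config_id = {}
for config_id, instance, seed, budget, cost in trials:
    by_config_id[config_id] = {
        "instance": instance,
        "seed": seed,
        "budget": budget,
        "cost": cost,
    }

# List of dictionaries: every trial is kept, and fields are still read by name.
as_list = [
    {
        "config_id": config_id,
        "instance": instance,
        "seed": seed,
        "budget": budget,
        "cost": cost,
    }
    for config_id, instance, seed, budget, cost in trials
]

print(len(by_config_id), len(as_list))
```

With the dictionary keyed by `config_id`, only the budget-100 trial of config 1 survives; the list form retains all three trials.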

`configs` is a human-readable dictionary of configurations, where the keys are the one-based `config_id`. It is important to note that in `runhistory.json`, the indexing is zero-based.
65 changes: 30 additions & 35 deletions smac/runhistory/runhistory.py
@@ -768,25 +768,22 @@ def save(self, filename: str | Path = "runhistory.json") -> None:
----------
filename : str | Path, defaults to "runhistory.json"
"""
data = []
data = dict()
for k, v in self._data.items():
data += [
(
int(k.config_id),
str(k.instance) if k.instance is not None else None,
int(k.seed) if k.seed is not None else None,
float(k.budget) if k.budget is not None else None,
v.cost,
v.time,
v.cpu_time,
v.status,
v.starttime,
v.endtime,
v.additional_info,
)
]

config_ids_to_serialize = set([entry[0] for entry in data])
data[k.config_id] = {
Collaborator

This might need to be adapted.

"instance": k.instance if k.instance is not None else None,
"seed": k.seed if k.seed is not None else None,
"budget": k.budget if k.budget is not None else None,
"cost": v.cost,
"time": v.time,
"cpu_time": v.cpu_time,
"status": v.status,
"starttime": v.starttime,
"endtime": v.endtime,
"additional_info": v.additional_info
}

config_ids_to_serialize = set(data.keys())
configs = {}
config_origins = {}
for id_, config in self._ids_config.items():
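
One possible shape for the serialization step, following the list-of-dictionaries proposal from the review, is sketched below. `TrialKey` and `TrialValue` here are simplified stand-ins defined only for this example, and `serialize_trials` is a hypothetical helper; none of this is the code that was eventually merged. The sketch keeps the type coercions from the original tuple-based version.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Union


@dataclass(frozen=True)
class TrialKey:  # simplified stand-in, not the library's class
    config_id: int
    instance: Optional[str] = None
    seed: Optional[int] = None
    budget: Optional[float] = None


@dataclass
class TrialValue:  # simplified stand-in, not the library's class
    cost: Union[float, List[float]]
    time: float = 0.0
    cpu_time: float = 0.0
    status: int = 1
    starttime: float = 0.0
    endtime: float = 0.0
    additional_info: Dict[str, Any] = field(default_factory=dict)


def serialize_trials(trials: Dict[TrialKey, TrialValue]) -> List[Dict[str, Any]]:
    """Return one JSON-ready dictionary per trial, with config_id as a field."""
    entries = []
    for k, v in trials.items():
        entries.append(
            {
                "config_id": int(k.config_id),
                "instance": str(k.instance) if k.instance is not None else None,
                "seed": int(k.seed) if k.seed is not None else None,
                "budget": float(k.budget) if k.budget is not None else None,
                "cost": v.cost,
                "time": v.time,
                "cpu_time": v.cpu_time,
                "status": v.status,
                "starttime": v.starttime,
                "endtime": v.endtime,
                "additional_info": v.additional_info,
            }
        )
    return entries


trials = {
    TrialKey(config_id=1, budget=10.0): TrialValue(cost=0.9),
    TrialKey(config_id=1, budget=100.0): TrialValue(cost=0.4),
}
entries = serialize_trials(trials)
config_ids_to_serialize = {entry["config_id"] for entry in entries}
payload = json.dumps({"data": entries}, indent=2)
```

Keeping `config_id` as a field also avoids relying on JSON object keys, which `json.dumps` always writes as strings, so no `int(key)` conversion is needed when reading the file back.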
@@ -857,32 +854,30 @@ def load(self, filename: str | Path, configspace: ConfigurationSpace) -> None:

# Important to use add method to use all data structure correctly
# NOTE: These hardcoded indices can easily lead to trouble
for entry in data["data"]:
# Set n_objectives first
for key, value in data["data"].items():
if self._n_objectives == -1:
if isinstance(entry[4], (float, int)):
if isinstance(value["cost"], (float, int)):
self._n_objectives = 1
else:
self._n_objectives = len(entry[4])
self._n_objectives = len(value["cost"])

cost: list[float] | float
if self._n_objectives == 1:
cost = float(entry[4])
cost = float(value["cost"])
else:
cost = [float(x) for x in entry[4]]

cost = [float(x) for x in value["cost"]]
self.add(
config=self._ids_config[int(entry[0])],
config=self._ids_config[int(key)],
Collaborator

This might need to be adapted.

cost=cost,
time=float(entry[5]),
cpu_time=float(entry[6]),
status=StatusType(entry[7]),
instance=entry[1],
seed=entry[2],
budget=entry[3],
starttime=entry[8],
endtime=entry[9],
additional_info=entry[10],
time=value["time"],
cpu_time=value["cpu_time"],
status=StatusType(value["status"]),
instance=value["instance"],
seed=value["seed"],
budget=value["budget"],
starttime=value["starttime"],
endtime=value["endtime"],
additional_info=value["additional_info"],
)

# Although adding trials should give us the same stats, the trajectory might be different
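
The loading side could read named fields from a list of dictionaries in much the same way as the hunk above reads them from `value`. The sketch below is a minimal illustration under that assumption: `normalize_entries` is a hypothetical helper, and it only reproduces the cost handling (inferring the number of objectives from the first entry, then coercing to `float` or a list of floats) rather than calling the real `add` method.

```python
from typing import Any, Dict, List


def normalize_entries(entries: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Coerce config_id to int and cost to float or list of floats."""
    n_objectives = -1
    normalized = []
    for entry in entries:
        raw_cost = entry["cost"]
        # Infer the number of objectives from the first entry, as the diff does.
        if n_objectives == -1:
            if isinstance(raw_cost, (float, int)):
                n_objectives = 1
            else:
                n_objectives = len(raw_cost)

        if n_objectives == 1:
            cost = float(raw_cost)
        else:
            cost = [float(x) for x in raw_cost]

        # Copy the entry so the caller's data is left untouched.
        normalized.append({**entry, "config_id": int(entry["config_id"]), "cost": cost})
    return normalized


entries = [
    {"config_id": 1, "budget": 10.0, "cost": 1},
    {"config_id": 1, "budget": 100.0, "cost": 0.4},
]
loaded = normalize_entries(entries)
```

Because each entry carries its own `config_id`, two trials of the same configuration with different budgets both survive the round trip.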