Was it your intention to recreate wandb tables in iterator? #76

huskydoge · 2024-04-04T15:24:30Z

https://github.com/eric-mitchell/direct-preference-optimization/blob/f8b8c0f49dc92a430bae41585f9d467d3618fe2f/trainers.py#L297C1-L302C1

if self.config.sample_during_eval:
    all_policy_samples, all_reference_samples = [], []
    policy_text_table = wandb.Table(columns=["step", "prompt", "sample"])
    if self.config.loss.name in {'dpo', 'ipo'}:
        reference_text_table = wandb.Table(columns=["step", "prompt", "sample"])

I just want to make sure this is not a bug. Since there is a "step" column, I assume the table is meant to record samples during eval across the whole training run. However, because the tables are recreated on every eval, the wandb UI only shows eval_batch_size rows of policy/reference samples from the first eval, and the table is never updated after that.

Separately, regarding updating a wandb table: there is a known issue in wandb, still unresolved, that affects the logging code here.
https://github.com/eric-mitchell/direct-preference-optimization/blob/f8b8c0f49dc92a430bae41585f9d467d3618fe2f/trainers.py#L341C1-L345C1

if self.config.sample_during_eval:
    wandb.log({"policy_samples": policy_text_table}, step=self.example_counter)
    if self.config.loss.name in {'dpo', 'ipo'}:
        wandb.log({"reference_samples": reference_text_table}, step=self.example_counter)

Logging the same table key again like this won't update the table in the wandb UI.

Here is a possible solution: "[App] Table not updating it at each call of log", wandb/wandb#2981 (comment)
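To make the idea concrete, here is a minimal sketch of one way to get a table that grows across evals. It may or may not match what the linked comment proposes. It assumes that the rows are kept outside the eval loop and that a brand-new table object is built from all accumulated rows on every log call. `SampleLog`, `make_table`, and `log_fn` are hypothetical names introduced for illustration and are not part of trainers.py or wandb; in real use `make_table` would be `wandb.Table` and `log_fn` would be `wandb.log`.

```python
from typing import Any, Callable, List, Sequence

COLUMNS = ["step", "prompt", "sample"]


class SampleLog:
    """Accumulates sample rows across evals (hypothetical helper, not in the repo)."""

    def __init__(self, columns: Sequence[str] = COLUMNS) -> None:
        self.columns: List[str] = list(columns)
        self.rows: List[List[Any]] = []  # lives for the whole training run

    def add(self, step: int, prompt: str, sample: str) -> None:
        # Append one row; earlier rows from previous evals are kept.
        self.rows.append([step, prompt, sample])

    def log(self, key: str, step: int,
            make_table: Callable[..., Any],
            log_fn: Callable[..., Any]) -> Any:
        # Build a fresh table object from a copy of all rows so far,
        # instead of re-logging a table object that was logged before.
        table = make_table(columns=list(self.columns),
                           data=[list(row) for row in self.rows])
        log_fn({key: table}, step=step)
        return table


# Assumed usage with wandb (not executed here):
#   import wandb
#   policy_samples = SampleLog()            # created once, before training
#   policy_samples.add(step, prompt, sample)  # inside each eval
#   policy_samples.log("policy_samples", step, wandb.Table, wandb.log)
```

Passing the table constructor and the log function as arguments is only a design choice to keep the sketch runnable without a wandb session; whether the UI then shows all rows depends on the wandb behavior discussed in the linked issue.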
