Parsing Codalab #112
-
First, I want to thank you for creating WILDS; it is a great addition to the landscape of datasets for ML. I would like to make use of the models and data in CodaLab, but I am a little lost. Is there a document that describes the directories and files in CodaLab? What are all the different files? How can I reproduce the results? For example, I don't yet quite understand what ERM is in this context. Does it mean SGD on an approximation to the average error? Thanks,
Replies: 1 comment
-
Hi Yoav,
Thank you for your kind words and your interest in WILDS!
Here are the relevant file outputs from each experiment:
- {split}_eval.csv: Reports model performance on the specified data split at the end of each epoch. The reported metrics include the official evaluation metrics for each dataset (e.g., macro F1 for the iWildCam dataset).
- {split}_algo.csv: Reports additional metrics used to monitor training (e.g., loss).
- best_model.pth: Saved weights for the model with the best validation performance, as measured by the official evaluation metrics for each dataset. More specifically…
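For instance, a quick way to skim these outputs might look like the sketch below. The log directory, the "val" split name, and the "epoch" column are assumptions for illustration; check the actual file paths and CSV headers from your own run.

```python
# Sketch: inspect the outputs of one run. Paths and column names are
# illustrative assumptions -- adjust them to your own log directory.
import pandas as pd
import torch

val_eval = pd.read_csv("logs/val_eval.csv")      # per-epoch validation metrics
print(val_eval.columns.tolist())                 # see which metrics were logged
print(val_eval.sort_values("epoch").tail())      # metrics for the last few epochs

# best_model.pth holds the checkpoint with the best validation performance;
# inspect it to see how it is organized before loading it into a model
# with load_state_dict.
checkpoint = torch.load("logs/best_model.pth", map_location="cpu")
print(type(checkpoint))
```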
We provide a set of scripts that can be used to reproduce the benchmark results reported in the WILDS papers. You can find the scripts at https://github.com/p-lambda/wilds, and the README has additional instructions (e.g., https://github.com/p-lambda/wilds#using-the-example-scripts). Once you find the exact commands and hyperparameters on the CodaLab worksheets, you can run them with these scripts.
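As a rough illustration, a reproduction run might be launched as below. The dataset, algorithm, and flags mirror the README's example invocation and are assumptions here; the exact commands and hyperparameters behind each reported number are the ones on the CodaLab worksheets.

```python
# Sketch: launch the example training script from the wilds repo.
# Flags follow the README's example; substitute the command and
# hyperparameters listed on the CodaLab worksheet you want to reproduce.
import subprocess

subprocess.run(
    [
        "python", "examples/run_expt.py",
        "--dataset", "iwildcam",   # any WILDS dataset name
        "--algorithm", "ERM",      # baseline training procedure
        "--root_dir", "data",      # where the datasets are stored/downloaded
    ],
    check=True,
)
```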
By ERM (empirical risk minimization), we refer to the standard training procedure where we minimize the average loss over the training data. The optimizer used depends on the dataset.
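In code, ERM is just the familiar training loop; the generic PyTorch-style sketch below is for illustration only and is not the WILDS training code.

```python
# Generic ERM sketch (not the WILDS implementation): minimize the average
# loss over the training examples with a standard optimizer such as SGD.
import torch

def train_erm(model, train_loader, num_epochs=1, lr=1e-3):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()   # averages the per-example loss
    model.train()
    for _ in range(num_epochs):
        for x, y in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)     # average loss on this minibatch
            loss.backward()                 # gradient of the average loss
            optimizer.step()                # SGD step
    return model
```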
Hope this helps, and let us know if you have any additional questions!
Best,