
Issues reproducing for Bios #26

Open
AntoineGourru opened this issue Mar 16, 2023 · 5 comments

Comments

@AntoineGourru

Dear Xudong,

First, a great thanks for your work; this is of high value for people working in fair classification. Kudos!

Second, I have some issues reproducing the results for the Bios dataset.

I used your code to download and preprocess the data: datasets.prepare_dataset("bios", "data/bios")
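
For reference, a minimal sketch of that preparation step, assuming datasets is imported from fairlib as in the project README:

from fairlib import datasets

# Download and preprocess the Bios dataset into data/bios
datasets.prepare_dataset("bios", "data/bios")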

After that, in src/dataloaders/loaders/Bios.py, I had to modify the condition (the original line is commented out below):

if self.args.protected_task in ["economy", "both"] and self.args.full_label:
#if self.args.protected_task in ["gender", "economy", "both", "intersection"] and self.args.full_label:

Otherwise the dataloader could not be built (because the data built with prepare_dataset does not contain economy_label).

Finally, I ran this code:

##############
import fairlib

# Training configuration for the Bios gender task
args = {
    "dataset": "Bios_gender",
    "emb_size": 768,
    "num_classes": 28,
    "batch_size": 16,
    "data_dir": "data/bios",
    "device_id": 0,
    "exp_id": "fcl",
}

debias_options = fairlib.BaseOptions()
debias_state = debias_options.get_state(args=args, silence=True)

fairlib.utils.seed_everything(2022)

debias_model = fairlib.networks.get_main_model(debias_state)

debias_model.train_self()
##############

Everything runs well, except that the model gets random results and the loss does not improve over the epochs. Do you have a clue about what is happening?

For Moji, it works perfectly.

Best regards, and thank you again for your work,

Antoine

@HanXudong
Member

Hi Antoine,

Thanks for reaching out!

Regarding the Bios dataset, the augmented Bios dataset with economy labels was released recently, and I will revise the preprocessing script to add it soon.

For Bios experiments, I noticed that the batch size is set to 16 ("batch_size": 16), which might be too small given the default learning rate ("lr": 0.003). Could you please test with larger batch sizes or smaller learning rates? Hopefully this will help. Otherwise, feel free to share your code; I am more than happy to help!
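
A minimal sketch of such an override, reusing the args dict from the snippet above; the larger batch size and smaller learning rate below are illustrative values, not the settings reported in the papers:

# Illustrative override of the defaults; tune these values for your setup.
args = {
    "dataset": "Bios_gender",
    "emb_size": 768,
    "num_classes": 28,
    "batch_size": 128,   # larger than the original 16 (illustrative)
    "lr": 1e-4,          # smaller than the 0.003 default (illustrative)
    "data_dir": "data/bios",
    "device_id": 0,
    "exp_id": "fcl",
}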

Best,
Xudong

@AntoineGourru
Author

I reduced the lr and increased the batch size; it seems to work much better, thank you very much.

Many thanks for this work and for your kind answer,

Antoine

@AntoineGourru
Author

Dear Xudong,

Thanks again for your kind and prompt answer :-). We still cannot quite reach the results you report in your articles (we get close, but not exactly).

For example, for the CE baseline, we reach 79.05 as the maximum accuracy on the BiasInBios dataset.

Could you possibly share the parameters you used for BiasInBios and Moji (the optimal ones leading to the results in your papers)? Similarly for the other methods?

Best regards,

Antoine and Thibaud (@LetenoThibaud)

@HanXudong
Member

HanXudong commented May 30, 2023

Hi Antoine and Thibaud,

Once again, thanks for reaching out!

Please be aware that we used fixed encoder models (e.g. BERT) in our previous experiments, and only trained an MLP to make predictions. In our recent experiments, we tried fine-tuning the whole model to further improve the results. To fine-tune the whole BERT model, could you please do the following (sketched in code after the list):

  1. set n_freezed_layers = 0 in your BERT model class (or in https://github.com/HanXudong/fairlib/blob/909f95237e26ed41d15f5777ba54ce4863e1f0c8/fairlib/src/networks/classifier.py#L147)
  2. set batch_size = 32
  3. set learning rate lr = 5e-6
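
A minimal sketch of these settings, assuming batch_size and lr can be passed through the args dict as in the earlier snippets, and that n_freezed_layers is exposed as an option (if it is not, set it directly in classifier.py as noted in step 1):

import fairlib

# Sketch of fine-tuning the full BERT encoder on Bios.
# Passing n_freezed_layers through args is an assumption; if it is not
# supported, edit classifier.py (step 1) instead.
args = {
    "dataset": "Bios_gender",
    "emb_size": 768,
    "num_classes": 28,
    "data_dir": "data/bios",
    "device_id": 0,
    "exp_id": "bert_finetune",   # hypothetical experiment id
    "n_freezed_layers": 0,       # step 1 (assumption: exposed as an option)
    "batch_size": 32,            # step 2
    "lr": 5e-6,                  # step 3
}

options = fairlib.BaseOptions()
state = options.get_state(args=args, silence=True)
fairlib.utils.seed_everything(2022)
model = fairlib.networks.get_main_model(state)
model.train_self()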

In terms of the hyperparameters of each debiasing method, we used the same batch size and learning rate as the vanilla method, and only searched for the best trade-off hyperparameters for each debiasing method. The corresponding results can be downloaded. I have attached a Jupyter notebook to demonstrate the process, which can be run in Google Colab. Please have a look and feel free to message me for any further information.

Reproduce_Results.zip
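
As an illustration of that trade-off search, a minimal sketch of a grid over a single debiasing strength; the key adv_lambda is a placeholder and should be replaced by the actual trade-off hyperparameter of the debiasing method being tuned:

import fairlib

# Hypothetical grid search over one trade-off hyperparameter, keeping the
# batch size and learning rate of the vanilla model fixed.
base_args = {
    "dataset": "Bios_gender",
    "emb_size": 768,
    "num_classes": 28,
    "data_dir": "data/bios",
    "device_id": 0,
    "batch_size": 32,
    "lr": 5e-6,
}

for i, trade_off in enumerate([0.01, 0.1, 1.0, 10.0]):
    # "adv_lambda" is a placeholder option name for the debiasing strength.
    args = dict(base_args, exp_id=f"debias_run_{i}", adv_lambda=trade_off)
    options = fairlib.BaseOptions()
    state = options.get_state(args=args, silence=True)
    model = fairlib.networks.get_main_model(state)
    model.train_self()  # each run is saved under its own exp_id (assumption)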

Best,
Xudong

@LetenoThibaud

Hi Xudong,

Thank you for your quick answer,

Based on your code and the data downloaded via the notebook you sent, we managed to reproduce your vanilla results.

This will be very helpful for our work, thanks again.

Best regards,

Thibaud
