GEST Dataset

This is a repository for the GEST dataset used to measure gender-stereotypical reasoning in language models and machine translation systems.

Paper: Women Are Beautiful, Men Are Leaders: Gender Stereotypes in Machine Translation and Language Modeling
The dataset is also available at HuggingFace Datasets.

Changelog

December 6th 2024 - data/gest_1.1.csv was added. This is a new version of the dataset in which 244 typos and other errors are fixed.

Data

The data folder contains several versions of the GEST dataset, along with the predictions of the models that were evaluated on them.

`gest.csv`

This is the simple canonical version of the GEST dataset. It contains only the filtered samples. Each line contains the text of the sample and the stereotype ID.

sentence - Text of the sample.
stereotype - Stereotype ID [1,16].
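
As an illustration of the two-column layout described above, here is a minimal sketch for reading the file with the Python standard library. It assumes that the CSV has a header row with the column names `sentence` and `stereotype`; the function names `read_gest` and `samples_per_stereotype` are hypothetical and not part of this repository.

```python
import csv
from collections import Counter


def read_gest(lines):
    """Parse (sentence, stereotype) pairs from an iterable of CSV lines.

    Assumption: the first line is a header naming the `sentence` and
    `stereotype` columns.
    """
    samples = []
    for row in csv.DictReader(lines):
        stereotype = int(row["stereotype"])
        # Stereotype IDs are documented to lie in the range [1, 16].
        if not 1 <= stereotype <= 16:
            raise ValueError(f"stereotype ID out of range: {stereotype}")
        samples.append((row["sentence"], stereotype))
    return samples


def samples_per_stereotype(samples):
    """Count how many samples carry each stereotype ID."""
    return Counter(stereotype for _, stereotype in samples)
```

With an open file handle, e.g. `open("data/gest.csv", newline="", encoding="utf-8")`, the result of `read_gest` can be passed to `samples_per_stereotype` to check how the samples are distributed over the 16 stereotypes.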

`annotations.csv`

These are all the samples we collected from the data creators before filtering, along with all our annotations.

sentence - Sentence proposed by the data creator.
stereotype - Stereotype ID [1,16] assigned to the sentence.
gender_strong_female - This and the following four columns (five in total) show the annotation given by the first annotator to the sentence without seeing the stereotype. The question was: Stereotypically, is the sentence female or male?
gender_female
gender_neutral
gender_male
gender_strong_male
match_strong_agree - This and the following four columns (five in total) show the annotation given by the first annotator to the sentence after seeing the stereotype. The question was: Does the stereotype match the sentence?
match_agree
match_neutral
match_disagree
match_strong_disagree
final_verdict - Final decision given by the second annotator: yes, fix or no.
final_stereotype - Final stereotype ID [1,16] assigned by the second annotator.
comment - Comment written by the first annotator.
fix - Fixed sentence proposed by either annotator.
first_annotator - Initials of the first data annotator.
second_annotator - Initials of the second data annotator.
creator - Initials of the data creator.
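
The `final_verdict`, `fix`, and `final_stereotype` columns suggest how an annotated row could be turned into a final sample. The sketch below is one possible reading, not the filtering procedure actually used for gest.csv: it assumes that `yes` keeps the sentence as written, `fix` replaces it with the text in the `fix` column, and `no` drops the sample. The function name `resolve_sample` is hypothetical.

```python
def resolve_sample(row):
    """Return (sentence, stereotype) for an accepted row, or None if rejected.

    `row` is a dict keyed by the column names of annotations.csv.
    Assumption: "yes" keeps `sentence`, "fix" uses `fix`, "no" drops the row.
    """
    verdict = row["final_verdict"].strip().lower()
    if verdict == "no":
        return None
    if verdict == "yes":
        text = row["sentence"]
    elif verdict == "fix":
        text = row["fix"]
    else:
        raise ValueError(f"unexpected final_verdict: {verdict!r}")
    # Assumption: the ID may be stored as "4" or "4.0", so parse via float.
    return text, int(float(row["final_stereotype"]))
```

Rows read with `csv.DictReader` can be passed to this function directly; discarding the `None` results leaves only the accepted samples.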

`gender_variants.csv`

Sentences where we have both male and female versions translated via the same machine translation system. See Section 4.3.2 in the paper for more details. Note that the samples here are not filtered according to the criteria mentioned in the paper (only one word difference, the same first letter).

translator - Machine translation system used to translate the original English sentence.
language - Target language of the male and female translations.
original - Original English sentence.
stereotype - Stereotype ID.
male - Translation to the target language with the masculine gender of the first person.
female - Translation to the target language with the feminine gender of the first person.
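
Because the samples in this file are unfiltered, the two criteria named above (only one word difference, the same first letter) have to be applied by the user. The following is a rough sketch of such a filter. It assumes whitespace tokenization and a case-insensitive comparison of first letters; the exact rules used in the paper may differ, and the function name `passes_filter` is hypothetical.

```python
def passes_filter(male, female):
    """Check whether two translations differ in exactly one word whose
    two variants start with the same letter.

    Assumptions: words are separated by whitespace, and the first letters
    are compared case-insensitively.
    """
    male_words, female_words = male.split(), female.split()
    if len(male_words) != len(female_words):
        return False
    # Collect the positions where the two translations disagree.
    diffs = [(m, f) for m, f in zip(male_words, female_words) if m != f]
    if len(diffs) != 1:
        return False
    m, f = diffs[0]
    return m[0].lower() == f[0].lower()
```

For example, a made-up Slovak pair such as "Som unavený." and "Som unavená." would pass, while a pair that differs in two words would not.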

`data_guidelines.pdf`

This file contains the instructions given to the human data creators who wrote the GEST samples. It includes both the Slovak version that was actually used and an English version that makes the guidelines more accessible.

`translations/`

translation.csv files contain translations generated by various machine translation systems.

from - Original English sentence.
to - Translation to the target language.

`predictions/english_mlm/`

Scores computed for individual samples by various English MLMs with various templates. See Section 4.2 of the paper for more details. The model and template are indicated in each file name. Each line is the score for one sample; the order matches the order of samples in gest.csv.

`predictions/slavic_mlm/`

Scores computed for individual samples by various multilingual MLMs. See Section 4.3 of the paper for more details. The model is indicated in each file name. Each line is the score for one sample; the order matches the order of samples in gender_variants.csv.
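
Since the prediction files carry no sample identifiers, scores have to be matched to samples purely by line order. Here is a minimal sketch of that alignment, assuming each prediction file holds one floating-point number per line. The function name `pair_scores` is hypothetical.

```python
def pair_scores(samples, score_lines):
    """Pair each sample with the score on the same line of a prediction file.

    Assumption: `score_lines` yields one floating-point number per line;
    blank lines (e.g. a trailing newline) are ignored.
    """
    scores = [float(line) for line in score_lines if line.strip()]
    if len(scores) != len(samples):
        raise ValueError(
            f"{len(scores)} scores do not line up with {len(samples)} samples"
        )
    return list(zip(samples, scores))
```

The length check is there because a silent mismatch would shift every score onto the wrong sample.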

Code

Notebooks

Notebooks _mt.ipynb, _english_mlm.ipynb, and _slavic_mlm.ipynb were used to analyze the results and visualize the data. They correspond to Sections 4.1, 4.2, and 4.3 of the paper, respectively.

_inference.ipynb was used to generate translations, parses, and other predictions.

`Dockerfile`

The recommended way to run the code is to build and run the environment from the included Dockerfile, e.g.:

docker build . -t gest
docker run --gpus all -p 8888:8888 -v ${PWD}:/labs -it gest

The --gpus all flag is optional.

Translators

The code works with several paid machine translation services. To enable them, add the appropriate auth files to the config directory:

aws_access_key and aws_secret_key for Amazon Translate.
chatgpt_auth with the OpenAI auth key.
deepl_auth with the DeepL auth key.
gcp.json file with the Google Cloud Platform service account key.
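
Before running the translation code, it can help to check which services have their auth files in place. The sketch below only tests whether the files listed above exist; it does not validate their contents and is not part of the repository. The names `AUTH_FILES` and `available_services` are hypothetical, and the default path `config` assumes the script runs from the repository root.

```python
from pathlib import Path

# Auth files expected for each service, as listed in this README.
AUTH_FILES = {
    "Amazon Translate": ["aws_access_key", "aws_secret_key"],
    "OpenAI": ["chatgpt_auth"],
    "DeepL": ["deepl_auth"],
    "Google Cloud Platform": ["gcp.json"],
}


def available_services(config_dir="config"):
    """List the services whose auth files are all present in `config_dir`."""
    config = Path(config_dir)
    return [
        service
        for service, files in AUTH_FILES.items()
        if all((config / name).is_file() for name in files)
    ]
```

A service that needs two files, such as Amazon Translate, is reported only when both are present.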
