Fairness evaluation #17

Open
Jeanselme opened this issue Sep 20, 2024 · 3 comments

@Jeanselme

Here are some thoughts concerning fairness evaluation:

Protected attributes

Extraction: As methodologies may implement pre-, in-, or post-processing to enhance algorithmic fairness, I think we need to extract sensitive attributes at the task-extraction stage, not only at evaluation time.

Definition: Sensitive attributes may differ across datasets (for instance, a patient's recorded ethnicity can change across visits in the MIMIC dataset; Canada and France do not record ethnicity; and datasets may differ in the granularity they provide). As an initial step, we could focus on age and sex.

Save: Attributes could be saved in the target file and extracted at the same time. Alternatively, we could gather them in a separate file generated by a dedicated script; a sketch of both options follows.
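For concreteness, a minimal sketch of the two storage options; the file names and columns (patient_id, age, sex) are hypothetical, not existing pipeline conventions:

```python
import pandas as pd

# Hypothetical extraction outputs; all names are illustrative.
targets = pd.DataFrame({"patient_id": [1, 2, 3], "label": [0, 1, 0]})
attributes = pd.DataFrame({"patient_id": [1, 2, 3],
                           "age": [64, 71, 58],
                           "sex": ["F", "M", "F"]})

# Option A: store attributes in the target file itself, so they are
# extracted at the same time as the labels.
targets.merge(attributes, on="patient_id").to_csv("targets.csv", index=False)

# Option B: keep attributes in a separate file written by its own script.
targets.to_csv("targets.csv", index=False)
attributes.to_csv("attributes.csv", index=False)
```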

Metrics

We can distinguish three fairness definitions:

  • Group fairness: a model is fair if its performance is equal across the groups defined by protected attributes.
  • Causal fairness: a model is fair if its prediction remains unchanged when an individual's protected-group membership is (counterfactually) changed.
  • Individual fairness: a model is fair if similar individuals receive similar predictions.

Estimating causal fairness requires the causal graph, and individual fairness requires a meaningful distance between individuals. As a first step, I think group fairness is the simplest to implement (and it is widely used in medical ML): we could stratify performance by the identified protected groups and compute the difference as a measure of fairness, as sketched below.
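A minimal sketch of that stratification, assuming numpy arrays y_true, y_score, and a group array of protected-attribute labels (all names are illustrative, not part of the codebase):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def stratified_gap(y_true, y_score, group, metric=roc_auc_score):
    """Compute `metric` within each protected group and return the
    per-group scores plus the max-min gap (0 means perfectly equal)."""
    per_group = {
        g: metric(y_true[group == g], y_score[group == g])
        for g in np.unique(group)
    }
    gap = max(per_group.values()) - min(per_group.values())
    return per_group, gap

# e.g. per_group, gap = stratified_gap(y_true, y_score, sex)
```

One caveat: roc_auc_score raises an error when a group contains only one outcome class, so very small groups would need to be merged or skipped.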

Let me know what you think!

@Jeanselme
Author

Adding @kamilest

@kamilest
Collaborator

  • Update dataset.md (#31): the "new dataset" template should request fairness-related information and the relevant/supported group definitions.

@Jeanselme
Author

Following our conversation, we could start with the following fairness metrics for binary tasks (a sketch follows the list):

  • Group-specific AUC (simply stratifying over the different groups)
  • Demographic parity: measures whether the positive prediction rate is the same across groups
  • Equalized odds: the same, but conditional on the true outcome
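A minimal sketch of the two parity metrics, assuming hard binary predictions y_pred in {0, 1}; group-specific AUC can reuse the stratification helper sketched earlier, and all names here are illustrative:

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Max-min difference in positive prediction rates across groups;
    0 means demographic parity holds exactly."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def equalized_odds_gap(y_true, y_pred, group):
    """Largest gap across groups in TPR (among y_true == 1) or FPR
    (among y_true == 0); 0 means equalized odds holds exactly."""
    gaps = []
    for outcome in (0, 1):
        mask = y_true == outcome
        rates = [y_pred[mask & (group == g)].mean() for g in np.unique(group)]
        gaps.append(max(rates) - min(rates))
    return max(gaps)
```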
