Improvements and Bug Fixes for Probabilistic Fairness #27

Merged
merged 29 commits
Jan 25, 2024
Changes from 25 commits
Commits
29 commits
76dade3
Add test for get_all_scores
mthielbar Sep 7, 2023
2c399b8
Bug fix. EqualOpportunity should be included in get_all_scores.
mthielbar Sep 7, 2023
6073318
Small updates to test_utils_proba.py
mthielbar Sep 21, 2023
7f8ecde
Rearrange simulation into its own class.
mthielbar Sep 22, 2023
da13e4f
Simulator is its own class. Simulator unit tests running clean.
mthielbar Sep 22, 2023
04d0c63
Small edits to test_utils_proba.py
mthielbar Sep 26, 2023
25fc7fd
Fix small bug that occurs in summarizer when membership_df has a surr…
mthielbar Sep 26, 2023
fa8f8bc
Add tests for summarizer.
mthielbar Sep 26, 2023
5ee2d0b
Incorporate fixes to summarizer.
mthielbar Sep 26, 2023
2ea6660
Merge branch 'summarizer_bug' into prob_membership_updates
mthielbar Sep 26, 2023
899c747
Cleanup code after merging changes to fix summarizer bug.
mthielbar Sep 26, 2023
6e0a826
run_bootstrap was using incorrect class label function call.
mthielbar Sep 27, 2023
4157bd4
Merge branch 'prob_membership_updates' into update_simulation
mthielbar Oct 13, 2023
15395ac
Clean up print statements in is_one_dimensional.
mthielbar Oct 13, 2023
325d123
Clean up deprecation warning caused by cvx.Variable returning a one-d…
mthielbar Oct 13, 2023
9f195d6
Turn off user warnings where possible in test_utils_proba.py. Warning…
mthielbar Oct 13, 2023
c05ae6e
Update to utils_proba.py
mthielbar Dec 21, 2023
c2401d9
Edit comments in simulator.
mthielbar Dec 21, 2023
5a18d04
Merge code for simulator class with fixes.
mthielbar Dec 21, 2023
721cd3e
Update minimum weight to 5 rows, according to results from simulation…
mthielbar Dec 26, 2023
13baf31
Make simulation dataframe large enough so values are not unstable and…
mthielbar Dec 26, 2023
50ff34e
Add simulation scripts and readme.md for probabilistic fairness.
mthielbar Dec 26, 2023
5971028
Update comments and readme.md
mthielbar Dec 26, 2023
83f1782
Add descriptions and citations to readme
mthielbar Dec 26, 2023
377756a
Add input data for simulations and supporting notebooks to create out…
mthielbar Dec 27, 2023
1adfb48
update
skadio Jan 24, 2024
4830f9b
update
skadio Jan 24, 2024
02f0497
update
skadio Jan 24, 2024
8897d66
update
skadio Jan 25, 2024
3,589 changes: 3,589 additions & 0 deletions examples/probabilistic_fairness/input_data/sampled_surrogate_inputs.csv

Large diffs are not rendered by default.

33,185 changes: 33,185 additions & 0 deletions examples/probabilistic_fairness/input_data/surrogate_inputs.csv
Contributor comment:
Quick question on the data: where is this data coming from? Is there an original version that we borrow from somewhere else (hence, copyright?), or is this data generated by us?

Large diffs are not rendered by default.

1,909 changes: 1,909 additions & 0 deletions examples/probabilistic_fairness/notebooks/analyze_prob_vs_model.ipynb

Large diffs are not rendered by default.

5,209 changes: 5,209 additions & 0 deletions examples/probabilistic_fairness/notebooks/analyze_sample_size_sim.ipynb

Large diffs are not rendered by default.

62 changes: 62 additions & 0 deletions examples/probabilistic_fairness/python_scripts/simulation.py
@@ -0,0 +1,62 @@
# Simulations: compare oracle, probabilistic, and argmax fairness estimates across several unfairness scenarios.
import pandas as pd
import numpy as np
import math
import sys
sys.path.append('../../jurity/tests')
sys.path.append('../../jurity/jurity')
from jurity.fairness import BinaryFairnessMetrics as bfm
from test_utils_proba import UtilsProbaSimulator

output_path='~/Documents/data/jurity_tests/simulations/'

testing_simulation=False
n_runs=30
avg_counts=[30,50]
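# Scenario parameters, interpreting the keys by name: pct_positive is the share of positive
# outcomes and fnr/fpr are the simulated model's false negative / false positive rates for
# each class; the wider the gap between the two classes, the less fair the simulated model.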
fair_sim=UtilsProbaSimulator({'not_protected': {'pct_positive': 0.2, 'fnr': 0.1, 'fpr': 0.2},'protected': {'pct_positive': 0.2, 'fnr': 0.1, 'fpr': 0.2}},surrogate_name="ZIP")
slightly_unfair_sim=UtilsProbaSimulator({'not_protected': {'pct_positive': 0.2, 'fnr': 0.1, 'fpr': 0.2}, 'protected': {'pct_positive': 0.1, 'fnr': 0.35, 'fpr': 0.1}},surrogate_name="ZIP")
moderately_unfair_sim=UtilsProbaSimulator({'not_protected': {'pct_positive': 0.3, 'fnr': 0.1, 'fpr': 0.3}, 'protected': {'pct_positive': 0.1, 'fnr': 0.45, 'fpr': 0.1}},surrogate_name="ZIP")
very_unfair_sim =UtilsProbaSimulator({'not_protected': {'pct_positive': 0.4, 'fnr': 0.1, 'fpr': 0.3}, 'protected': {'pct_positive': 0.10, 'fnr': 0.65, 'fpr': 0.1}},surrogate_name="ZIP")
extremely_unfair_sim =UtilsProbaSimulator({'not_protected': {'pct_positive': 0.5, 'fnr': 0.1, 'fpr': 0.2}, 'protected': {'pct_positive': 0.10, 'fnr': 0.65, 'fpr': 0.05}},surrogate_name="ZIP")

scenarios={"fair":fair_sim,
"slightly_unfair":slightly_unfair_sim,
"moderately_unfair":moderately_unfair_sim,
"very_unfair":very_unfair_sim,
"extremely_unfair":extremely_unfair_sim}
surrogates=pd.read_csv('../input_data/surrogate_inputs.csv')
if testing_simulation:
    output_string = output_path+'{0}_simulation_count_{1}_surrogates_{2}_test.csv'
else:
    output_string = output_path+'{0}_simulation_count_{1}_surrogates_{2}.csv'

def run_one_sim(simulator, membership_df,count_mean,rng=np.random.default_rng()):
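    # Draw a Poisson number of rows per surrogate (ZIP), expand membership_df to one row per
    # individual, then score the exploded data three ways: with the true simulated class
    # (oracle_value), with surrogate-level membership probabilities (probabilistic_estimate),
    # and with row-level membership probabilities passed directly (argmax_estimate).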
membership_df["count"]=pd.Series(rng.poisson(lam=count_mean,size=membership_df.shape[0]))
test_data=simulator.explode_dataframe(membership_df)
oracle_metrics=bfm.get_all_scores(test_data["label"].values,test_data["prediction"].values,
(test_data["class"]=="protected").astype(int).values).rename(columns={"Value":"oracle_value"})
prob_metrics=bfm.get_all_scores(test_data["label"],test_data["prediction"],
membership_df.set_index("ZIP")[["not_protected","protected"]],
test_data["ZIP"],[1]).rename(columns={"Value":"probabilistic_estimate"})
predicted_class=test_data[["not_protected","protected"]].values.tolist()
argmax_metrics=bfm.get_all_scores(test_data["label"].values,test_data["prediction"].values,
predicted_class).rename(columns={"Value":"argmax_estimate"})
return pd.concat([oracle_metrics["oracle_value"],prob_metrics["probabilistic_estimate"], argmax_metrics["argmax_estimate"]], axis=1)

if __name__=="__main__":
    n_surrogates=surrogates.shape[0]
    for sim_label,simulator in scenarios.items():
        for c in avg_counts:
            all_results=[]
            for i in range(0, n_runs):
                if testing_simulation:
                    output_df = run_one_sim(simulator, surrogates.head(10), c)
                else:
                    output_df = run_one_sim(simulator, surrogates, c)
                output_df["run_id"] = i
                all_results.append(output_df)
            all_output=pd.concat(all_results)
            all_output["average_count"] = c
            all_output["simulation"] = sim_label
            all_output["n_surrogates"] = n_surrogates
            all_output[~(all_output["probabilistic_estimate"].apply(np.isnan))].to_csv(output_string.format(sim_label, c, n_surrogates))
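For readers skimming the diff, a minimal sketch of the probabilistic call this script exercises; the toy membership table and sizes below are hypothetical and only illustrate the argument shapes used above (the PR raises the minimum rows per surrogate, so real runs should use more data than this toy example):

import numpy as np
import pandas as pd
from jurity.fairness import BinaryFairnessMetrics as bfm

# Hypothetical surrogate-level membership probabilities for two ZIP surrogates.
membership_df = pd.DataFrame({"ZIP": [10001, 10002],
                              "not_protected": [0.7, 0.2],
                              "protected": [0.3, 0.8]})

rng = np.random.default_rng(0)
n = 200
labels = pd.Series(rng.integers(0, 2, size=n))        # hypothetical ground truth
predictions = pd.Series(rng.integers(0, 2, size=n))   # hypothetical model output
surrogate_ids = pd.Series(rng.choice([10001, 10002], size=n))

# Probabilistic fairness: memberships are supplied per surrogate, not per individual.
scores = bfm.get_all_scores(labels, predictions,
                            membership_df.set_index("ZIP")[["not_protected", "protected"]],
                            surrogate_ids, [1])
print(scores)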
@@ -0,0 +1,144 @@
# Simulated data: model-based assignment to protected class vs. probabilistic fairness.
# One of the claims in the paper is that model-based fairness metrics are biased,
# and that the degree of bias is a function of the PPV (positive predictive value/precision)
# and NPV (negative predictive value) of the model that predicts protected status.
# This simulation demonstrates the difference between probabilistic estimates and
# model-based estimates for a given input data file (read below from ../input_data/sampled_surrogate_inputs.csv).

import pandas as pd
import numpy as np
import math
import sys
sys.path.append('../../tests')
sys.path.append('../../jurity')
from jurity.fairness import BinaryFairnessMetrics as bfm
from constants import Constants
from sklearn.metrics import confusion_matrix
from test_utils_proba import UtilsProbaSimulator

def performance_measures(ground_truth: np.ndarray,
predictions: np.ndarray) -> dict:
"""Compute various performance measures, optionally conditioned on protected attribute.
Assume that positive label is encoded as 1 and negative label as 0.

Parameters
---------
ground_truth: np.ndarray
Ground truth labels (1/0).
predictions: np.ndarray
Predicted values.
group_idx: Union[np.ndarray, List]
Indices of the group to consider. Optional.
group_membership: bool
Restrict performance measures to members of a certain group.
If None, the whole population is used.
Default value is False.

Returns
---------
Dictionary with performance measure identifiers as keys and their corresponding values.
"""
tn, fp, fn, tp = confusion_matrix(ground_truth, predictions).ravel()

p = np.sum(ground_truth == 1)
n = np.sum(ground_truth == 0)

return {Constants.TPR: tp / p,
Constants.TNR: tn / n,
Constants.FPR: fp / n,
Constants.FNR: fn / p,
Constants.PPV: tp / (tp + fp) if (tp + fp) > 0.0 else Constants.float_null,
Constants.NPV: tn / (tn + fn) if (tn + fn) > 0.0 else Constants.float_null,
Constants.FDR: fp / (fp + tp) if (fp + tp) > 0.0 else Constants.float_null,
Constants.FOR: fn / (fn + tn) if (fn + tn) > 0.0 else Constants.float_null,
Constants.ACC: (tp + tn) / (p + n) if (p + n) > 0.0 else Constants.float_null}
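# Illustrative call (hypothetical values): performance_measures returns a dict keyed by the
# Constants identifiers above, e.g.
#   pm = performance_measures(np.array([1, 1, 0, 0]), np.array([1, 0, 0, 1]))
#   pm[Constants.TPR] == 0.5 and pm[Constants.FPR] == 0.5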

#If true, only simulate a small dataframe. Used to test simulation syntax.
testing_simulation=False
n_runs=30

# The test_utils_proba.py test file in jurity/tests contains a class called
# UtilsProbaSimulator, which can simulate the confusion matrix of an unfair model for different classes.
# The simulation setup is described in the readme.md added under examples/probabilistic_fairness.

fair_sim=UtilsProbaSimulator({'not_protected': {'pct_positive': 0.2, 'fnr': 0.1, 'fpr': 0.2},'protected': {'pct_positive': 0.2, 'fnr': 0.1, 'fpr': 0.2}},surrogate_name="ZIP")
slightly_unfair_sim=UtilsProbaSimulator({'not_protected': {'pct_positive': 0.2, 'fnr': 0.1, 'fpr': 0.2}, 'protected': {'pct_positive': 0.1, 'fnr': 0.35, 'fpr': 0.1}},surrogate_name="ZIP")
moderately_unfair_sim=UtilsProbaSimulator({'not_protected': {'pct_positive': 0.3, 'fnr': 0.1, 'fpr': 0.3}, 'protected': {'pct_positive': 0.1, 'fnr': 0.45, 'fpr': 0.1}},surrogate_name="ZIP")
very_unfair_sim =UtilsProbaSimulator({'not_protected': {'pct_positive': 0.4, 'fnr': 0.1, 'fpr': 0.3}, 'protected': {'pct_positive': 0.10, 'fnr': 0.65, 'fpr': 0.1}},surrogate_name="ZIP")
extremely_unfair_sim =UtilsProbaSimulator({'not_protected': {'pct_positive': 0.5, 'fnr': 0.1, 'fpr': 0.2}, 'protected': {'pct_positive': 0.10, 'fnr': 0.65, 'fpr': 0.05}},surrogate_name="ZIP")
if testing_simulation:
    scenarios = {"moderately_unfair": moderately_unfair_sim,
                 "very_unfair": very_unfair_sim}
else:
    scenarios={"fair":fair_sim,
               "slightly_unfair":slightly_unfair_sim,
               "moderately_unfair":moderately_unfair_sim,
               "very_unfair":very_unfair_sim,
               "extremely_unfair":extremely_unfair_sim}
#Location of input and output files
surrogates=pd.read_csv('../input_data/sampled_surrogate_inputs.csv')
if testing_simulation:
    prob_output_string = '~/Documents/data/jurity_tests/simulations/model_v_prob/{0}_prob_simulation_{1}_surrogates_{2}_count_test.csv'
    model_output_string = '~/Documents/data/jurity_tests/simulations/model_v_prob/{0}_model_simulation_{1}_surrogates_{2}_count_test.csv'
else:
    prob_output_string = '~/Documents/data/jurity_tests/simulations/model_v_prob/{0}_prob_simulation_{1}_surrogates_{2}_count.csv'
    model_output_string = '~/Documents/data/jurity_tests/simulations/model_v_prob/{0}_model_simulation_{1}_surrogates_{2}_count.csv'

def generate_test_data(simulator, membership_df,count_mean,rng=np.random.default_rng()):
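    # Draw a Poisson count of individuals per surrogate, then expand to one row per individual.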
membership_df["count"]=pd.Series(rng.poisson(lam=count_mean,size=membership_df.shape[0]))
return simulator.explode_dataframe(membership_df)

def calc_prob_estimate(test_data,membership_df):
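    # Oracle scores use the true simulated class per row; probabilistic scores see only the
    # surrogate-level membership probabilities keyed by ZIP.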
    oracle_metrics=bfm.get_all_scores(test_data["label"].values,test_data["prediction"].values,
                                      (test_data["class"]=="protected").astype(int).values).rename(columns={"Value":"oracle_value"})
    prob_metrics=bfm.get_all_scores(test_data["label"],test_data["prediction"],
                                    membership_df.set_index("ZIP")[["not_protected","protected"]],
                                    test_data["ZIP"],[1]).rename(columns={"Value":"probabilistic_estimate"})
    return pd.concat([oracle_metrics["oracle_value"],prob_metrics["probabilistic_estimate"]], axis=1)

def calc_model_estimate(df,rng=np.random.default_rng()):
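    # Sweep over accuracy levels for a hypothetical model that predicts protected status:
    #   p_given_p   = P(predicted protected     | truly protected)
    #   np_given_np = P(predicted not protected | truly not protected)
    # For each pair, build a noisy class prediction, recompute the fairness scores with it,
    # and record the PPV/NPV/TPR of the class model itself.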
    out_dfs=[]
    for s in [[0.99, 0.99], [0.9, 0.99], [0.8, 0.9], [0.7, 0.8]]:
        p_given_p = s[0]
        np_given_np = s[1]
        prediction_p=rng.choice([0,1],p=[1-p_given_p,p_given_p],size=df.shape[0])
        prediction_np=rng.choice([0,1],p=[np_given_np,1-np_given_np],size=df.shape[0])
        class_vec_p=(df["class"]=="protected").astype(int).values
        class_vec_np=(df["class"]=="not_protected").astype(int).values
        class_pred=np.multiply(class_vec_p,prediction_p)+np.multiply(class_vec_np,prediction_np)
        scores=bfm.get_all_scores(df["label"].values,df["prediction"].values,class_pred).rename(columns={"Value":"model_estimate"})
        scores["p_given_p"]=p_given_p
        scores["np_given_np"]=np_given_np
        class_model_performance=performance_measures(class_vec_p,class_pred)
        scores["class_PPV"]=class_model_performance[Constants.PPV]
        scores["class_NPV"]=class_model_performance[Constants.NPV]
        scores["class_TPR"]=class_model_performance[Constants.TPR]
        scores["class_BR"]=np.sum(class_vec_p)
        out_dfs.append(scores.reset_index()[["Metric","model_estimate","class_PPV","class_NPV","class_TPR","p_given_p","np_given_np"]])
    return pd.concat(out_dfs,axis=0)

if __name__=="__main__":
    n_surrogates=surrogates.shape[0]
    generator=np.random.default_rng()
    for sim_label,simulator in scenarios.items():
        prob_results=[]
        model_results=[]
        for i in range(0, n_runs):
            if testing_simulation:
                # Testing mode: simulate only a small subset of surrogates.
                test_df = generate_test_data(simulator, surrogates.head(10), 50, generator)
            else:
                test_df = generate_test_data(simulator, surrogates, 50, generator)
            p=calc_prob_estimate(test_df,surrogates)
            p["run_id"]=i
            prob_results.append(p)
            m=calc_model_estimate(test_df, generator)
            m["run_id"]=i
            model_results.append(m)
        all_prob_results=pd.concat(prob_results,axis=0)
        all_prob_results["simulation"]=sim_label
        all_model_results=pd.concat(model_results,axis=0)
        all_model_results["simulation"]=sim_label
        all_prob_results.to_csv(prob_output_string.format(sim_label,50,surrogates.shape[0]))
        all_model_results.to_csv(model_output_string.format(sim_label,50,surrogates.shape[0]),index=False)
@@ -0,0 +1,98 @@
#Simulation inspecting probabilistic fairness performance for different sample sizes.
import pandas as pd
import numpy as np
import math
import sys
sys.path.append('../../jurity/tests')
sys.path.append('../../jurity/jurity')
from jurity.fairness import BinaryFairnessMetrics as bfm
from test_utils_proba import UtilsProbaSimulator
output_path='~/Documents/data/jurity_tests/simulations/sample_size/min_weight_0/'
testing_simulation=False
n_runs=30
avg_counts=[5,10,20,30,40]
num_surrogates=[50,100,300,400,500,1000]
fair_sim=UtilsProbaSimulator({'not_protected': {'pct_positive': 0.2, 'fnr': 0.1, 'fpr': 0.2},'protected': {'pct_positive': 0.2, 'fnr': 0.1, 'fpr': 0.2}},surrogate_name="ZIP")
slightly_unfair_sim=UtilsProbaSimulator({'not_protected': {'pct_positive': 0.2, 'fnr': 0.1, 'fpr': 0.2}, 'protected': {'pct_positive': 0.1, 'fnr': 0.35, 'fpr': 0.1}},surrogate_name="ZIP")
moderately_unfair_sim=UtilsProbaSimulator({'not_protected': {'pct_positive': 0.3, 'fnr': 0.1, 'fpr': 0.3}, 'protected': {'pct_positive': 0.1, 'fnr': 0.45, 'fpr': 0.1}},surrogate_name="ZIP")
very_unfair_sim =UtilsProbaSimulator({'not_protected': {'pct_positive': 0.4, 'fnr': 0.1, 'fpr': 0.3}, 'protected': {'pct_positive': 0.10, 'fnr': 0.65, 'fpr': 0.1}},surrogate_name="ZIP")
extremely_unfair_sim =UtilsProbaSimulator({'not_protected': {'pct_positive': 0.5, 'fnr': 0.1, 'fpr': 0.2}, 'protected': {'pct_positive': 0.10, 'fnr': 0.65, 'fpr': 0.05}},surrogate_name="ZIP")

scenarios={"fair":fair_sim,
"slightly_unfair":slightly_unfair_sim,
"moderately_unfair":moderately_unfair_sim,
"very_unfair":very_unfair_sim,
"extremely_unfair":extremely_unfair_sim}
surrogates=pd.read_csv('./supporting_data/surrogate_inputs.csv')
surrogates["ZIP"]=surrogates["ZIP"].astype(int)
if testing_simulation:
    output_string = output_path+'{0}_simulation_count_{1}_test_surrogates_{2}.csv'
else:
    output_string = output_path+'{0}_simulation_count_{1}_surrogates_{2}.csv'

def run_one_sim(test_data,membership_df):
    #Sometimes the sub-sampling leads to data errors.
    #Return a dataframe that is all nans in this case.
    #Keep track--if there are too many of these, stop the simulation.
    global n_errors
    metric_index=['Average Odds', 'Disparate Impact', 'Equal Opportunity',
                  'FNR difference', 'FOR difference', 'Generalized Entropy Index',
                  'Predictive Equality', 'Statistical Parity', 'Theil Index']
    try:
        oracle_metrics=bfm.get_all_scores(test_data["label"].values,test_data["prediction"].values,
                                          (test_data["class"]=="protected").astype(int).values).rename(columns={"Value":"oracle_value"})
    except Exception:
        oracle_metrics=pd.DataFrame({"oracle_value":[np.nan]*9},index=metric_index)
        n_errors=n_errors+1
    try:
        prob_metrics=bfm.get_all_scores(test_data["label"],test_data["prediction"],
                                        membership_df.set_index("ZIP")[["not_protected","protected"]],
                                        test_data["ZIP"],[1]).rename(columns={"Value":"probabilistic_estimate"})
    except Exception:
        prob_metrics=pd.DataFrame({"probabilistic_estimate":[np.nan]*9},index=metric_index)
        n_errors=n_errors+1
    return pd.concat([oracle_metrics["oracle_value"],prob_metrics["probabilistic_estimate"]], axis=1)

if __name__=="__main__":
    n_errors=0
    rng=np.random.default_rng()
    for sim_label,simulator in scenarios.items():
        for c in avg_counts:
            surrogates["count"]=pd.Series(rng.poisson(lam=c,size=surrogates.shape[0]))
            if testing_simulation:
                test_data=simulator.explode_dataframe(surrogates.head(10))
            else:
                test_data=simulator.explode_dataframe(surrogates)
            print("The shape of the test data is: ",test_data.shape)
            for n_surrogates in num_surrogates:
                all_results = []
                for i in range(0, n_runs):
                    #Sample surrogate classes from the dataframe.
                    #Take a sample stratified by p(protected) to get a spread
                    #along the x axis for the regression.
                    if testing_simulation:
                        sampled_surrogates=surrogates.head(10)["ZIP"].values
                    else:
                        sampled_surrogates=surrogates.groupby("bin").sample(frac=(n_surrogates/surrogates.shape[0]),
                                                                            replace=True)["ZIP"].values
                    #Only feed sampled surrogate classes into the simulation.
                    a=test_data["ZIP"].isin(sampled_surrogates).values
                    b=surrogates["ZIP"].isin(sampled_surrogates).values
                    input_data=test_data.iloc[a].copy(deep=True)
                    input_surrogates=surrogates.iloc[b].copy(deep=True)
                    output_df=run_one_sim(input_data,input_surrogates)
                    if n_errors>30:
                        print("Error limit reached. Stopping simulation.")
                        break
                    output_df["run_id"] = i
                    all_results.append(output_df)
                all_output=pd.concat(all_results)
                all_output["average_count"] = c
                all_output["n_surrogates"] = n_surrogates
                all_output["simulation"] = sim_label
                all_output[~(all_output["probabilistic_estimate"].apply(np.isnan))].to_csv(output_string.format(sim_label, c, n_surrogates))
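The analysis notebooks listed above are not rendered here; as a rough, hypothetical example of how the emitted CSVs could be summarized (column names follow the scripts above; the aggregation choice is an illustration, not taken from the notebooks):

import pandas as pd

# Hypothetical output file produced by one of the simulation scripts above.
df = pd.read_csv("fair_simulation_count_30_surrogates_500.csv")

# Gap between the probabilistic estimate and the oracle value, per fairness metric.
df["abs_error"] = (df["probabilistic_estimate"] - df["oracle_value"]).abs()
print(df.groupby("Metric")["abs_error"].agg(["mean", "std", "count"]))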