propensity score estimate #178

arita37 · 2022-09-23T15:13:25Z

Hello,

In the input dataset, propensity scores need to be provided.
Do the propensity scores need to be calibrated?

What is the impact of a wrong propensity score? (vs. reward level...)

usaito · 2022-09-26T18:57:49Z

@arita37 Calibrating your pscore estimate might help, but it depends on the situation. The following papers explore the effect of calibration in OPE, so they might be of interest to you:

Aniruddh Raghu, Omer Gottesman, Yao Liu, Matthieu Komorowski, Aldo Faisal, Finale Doshi-Velez, Emma Brunskill. Behaviour Policy Estimation in Off-Policy Policy Evaluation: Calibration Matters. https://arxiv.org/abs/1807.01066
Yuta Saito, Takuma Udagawa, Haruka Kiyohara, Kazuki Mogi, Yusuke Narita, Kei Tateno. Evaluating the Robustness of Off-Policy Evaluation. https://arxiv.org/abs/2108.13703

If you're working on a particular application, you might want to evaluate the effect of calibrated vs. non-calibrated pscore estimates on OPE accuracy using synthetic data that mimics your real data. This should be easy to implement with OBP.
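As a rough illustration of that kind of check, here is a minimal sketch in plain Python (standard library only, not OBP's API). It assumes a context-free bandit with three actions, and it mimics a miscalibrated pscore by applying a temperature to the behavior policy's logits. All names (`softmax`, `ipw_estimate`, `simulate`) and all numbers (logits, reward means, target policy) are made up for the example.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Softmax over logits; temperature != 1 distorts the probabilities."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ipw_estimate(actions, rewards, pscore, pi_e):
    """Inverse propensity weighting estimate of the target policy value."""
    weighted = [pi_e[a] / pscore[a] * r for a, r in zip(actions, rewards)]
    return sum(weighted) / len(weighted)


def simulate(n_rounds=100_000, seed=0):
    """Compare IPW under the true pscore and two miscalibrated versions."""
    rng = random.Random(seed)
    logits = [2.0, 1.0, 0.0]   # hypothetical behavior-policy logits
    q = [0.2, 0.5, 0.8]        # hypothetical expected reward per action
    pi_e = [0.1, 0.2, 0.7]     # hypothetical target (evaluation) policy
    pi_b = softmax(logits)     # true behavior policy (temperature 1)

    # Log synthetic bandit feedback under the behavior policy.
    actions = rng.choices(range(len(pi_b)), weights=pi_b, k=n_rounds)
    rewards = [1.0 if rng.random() < q[a] else 0.0 for a in actions]

    results = {"true_value": sum(p * r for p, r in zip(pi_e, q))}
    settings = [
        ("calibrated", 1.0),      # pscore equals the true behavior policy
        ("overconfident", 0.5),   # pscore too peaked
        ("underconfident", 2.0),  # pscore too flat
    ]
    for name, temperature in settings:
        pscore = softmax(logits, temperature=temperature)
        results[name] = ipw_estimate(actions, rewards, pscore, pi_e)
    return results


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>15}: {value:.4f}")
```

In this toy setting the estimate with the true pscore should land close to the true value, while both distorted pscores shift it noticeably. A check closer to a real application would replace the hard-coded policies and rewards with synthetic data that resembles the real logs.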

arita37 · 2022-10-11T09:08:30Z

OK. But if you do the math, a non-calibrated probability can create significant bias. Different calibration strategies can be used, with different temperatures.
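For reference, the math behind that claim can be sketched for the IPW estimator as follows. The notation is an assumption, not taken from the thread: $\pi_b$ is the true behavior policy, $\hat{\pi}_b$ the pscore estimate (treated as fixed and strictly positive), $\pi_e$ the evaluation policy, and $q(x,a)$ the expected reward.

```latex
\begin{align*}
V(\pi_e) &= \mathbb{E}_{x}\Big[\sum_{a} \pi_e(a \mid x)\, q(x,a)\Big], \\
\mathbb{E}\big[\hat{V}_{\mathrm{IPW}}\big]
  &= \mathbb{E}_{x}\Big[\sum_{a} \pi_b(a \mid x)\,
     \frac{\pi_e(a \mid x)}{\hat{\pi}_b(a \mid x)}\, q(x,a)\Big], \\
\mathrm{Bias}
  &= \mathbb{E}\big[\hat{V}_{\mathrm{IPW}}\big] - V(\pi_e)
   = \mathbb{E}_{x}\Big[\sum_{a} \pi_e(a \mid x)\, q(x,a)
     \Big(\frac{\pi_b(a \mid x)}{\hat{\pi}_b(a \mid x)} - 1\Big)\Big].
\end{align*}
```

The bias vanishes when $\hat{\pi}_b = \pi_b$. Otherwise each term is scaled by both $\pi_e(a \mid x)$ and $q(x,a)$, so an underestimated pscore on an action that the evaluation policy favors and that has a high reward can dominate the error.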

