Optimal Off-Policy Evaluation from Multiple Logging Policies

Overview

This repository contains the code to replicate the experiments in the paper "Optimal Off-Policy Evaluation from Multiple Logging Policies" (ICML 2021, proceedings.mlr.press/v139/kallus21a.html).

If you find this code useful in your research, please cite:

@inproceedings{kallus2021optimal,
  title={Optimal Off-Policy Evaluation from Multiple Logging Policies},
  author={Kallus, Nathan and Saito, Yuta and Uehara, Masatoshi},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages={5247-5256},
  year={2021},
  volume = {139},
  publisher={PMLR},
}

Dependencies

  • python==3.7.3
  • numpy==1.18.1
  • pandas==0.25.1
  • scikit-learn==0.23.1
  • tensorflow==1.15.4
  • pyyaml==5.1
  • seaborn==0.10.1
  • matplotlib==3.2.2

Running the code

To run the simulations with the multi-class classification datasets, run the following commands in the ./src/ directory:

for data in optdigits pendigits
do
    python run_sims.py --num_sims 200 --data $data --is_estimate_pi_b
done

Note that the configurations used in the experiments can be found in ./conf/policy_params.yaml. Once the simulations have finished running, the summarized results can be found in the ../log/{data} directory for each dataset.
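On systems without a POSIX shell, the loop above can be reproduced from Python. The following is a minimal sketch, not part of the repository: the helper names `build_command` and `run_all` are hypothetical, and the dataset names and flags are copied from the shell commands above.

```python
import subprocess
import sys

# Dataset names and flags are copied from the shell loop in this README.
DATASETS = ["optdigits", "pendigits"]


def build_command(data, num_sims=200):
    """Return the argv list for one run_sims.py invocation (hypothetical helper)."""
    return [
        sys.executable,  # reuse the interpreter that runs this script
        "run_sims.py",
        "--num_sims", str(num_sims),
        "--data", data,
        "--is_estimate_pi_b",
    ]


def run_all(datasets=DATASETS, src_dir="."):
    """Run the simulations one dataset at a time (hypothetical helper).

    src_dir should point at the ./src/ directory. check=True raises
    CalledProcessError on the first run that exits with a non-zero status.
    """
    for data in datasets:
        subprocess.run(build_command(data), cwd=src_dir, check=True)
```

Calling `run_all()` from inside ./src/ should be equivalent to the shell loop; from the repository root, `run_all(src_dir="src")` would be the assumed call.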