Multi-armed bandit (MAB) problem under delayed feedback: numerical experiments

A framework for numerical experiments that simulate the multi-armed bandit problem in a stochastic stationary environment with delayed feedback.

Beta Upper Confidence Bound Policy for the Design of Clinical Trials, 2023

Evaluation of delay-adapted policies on the publicly available International Stroke Trial dataset. See the notebook Beta-Upper-Confidence-Bound-Policy-for-the-Design-of-Clinical-Trials.ipynb for the analysis and simulation.

Bernoulli multi-armed bandit problem under delayed feedback, 2021

Provides the framework for numerical experiments that simulate the multi-armed bandit problem in a stochastic stationary environment with delays. It accompanies the paper "Bernoulli multi-armed bandit problem under delayed feedback" (Journal).
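
To make the setting concrete, here is a minimal sketch of a Bernoulli bandit whose rewards become visible to the policy only after a delay. It is not the code from `delayed_bandit`: the function name, its parameters, the constant delay, and the epsilon-greedy rule are all assumptions chosen for illustration.

```python
import random


def simulate_delayed_bandit(means, horizon, delay, epsilon=0.1, seed=0):
    """Hypothetical helper: epsilon-greedy on a Bernoulli bandit with a constant delay.

    A reward earned in round t is observed at the end of round t + delay,
    so delay=0 corresponds to immediate feedback.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    observed_pulls = [0] * n_arms      # pulls whose feedback has arrived
    observed_successes = [0] * n_arms  # observed rewards equal to 1
    pending = []                       # (arrival_round, arm, reward)
    total_reward = 0

    def empirical_mean(arm):
        if observed_pulls[arm] == 0:
            return 0.0
        return observed_successes[arm] / observed_pulls[arm]

    for t in range(horizon):
        # Decide using only the feedback that has already arrived.
        if sum(observed_pulls) == 0 or rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=empirical_mean)

        # The environment pays out now, but the policy sees it later.
        reward = 1 if rng.random() < means[arm] else 0
        total_reward += reward
        pending.append((t + delay, arm, reward))

        # Deliver every piece of feedback whose delay has elapsed.
        arrived = [p for p in pending if p[0] <= t]
        pending = [p for p in pending if p[0] > t]
        for _, a, r in arrived:
            observed_pulls[a] += 1
            observed_successes[a] += r

    return total_reward, observed_pulls
```

With a delay as long as the horizon, no feedback ever arrives and the sketch degenerates to uniform random play, which is a useful sanity check when comparing policies.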

Structure of the project and currently implemented algorithms:

Component	Files
Environments	Protocol
	Bernoulli MAB
Policies	Protocol
	Uniform Random
	Explore-First
	Epsilon-Greedy
	Upper Confidence Bound
	Thompson Sampling (Beta distribution)
Experiments	Bernoulli MAB under delayed feedback
Tests	Test module
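
As an illustration of one of the listed policies, the following is a minimal sketch of Thompson Sampling with a Beta posterior for Bernoulli rewards. The class name and the `choose`/`feed` methods are hypothetical and are not taken from the project's policy protocol; the uniform Beta(1, 1) prior is also an assumption.

```python
import random


class BetaThompsonSampling:
    """Hypothetical sketch: Thompson Sampling with Beta(1, 1) priors per arm."""

    def __init__(self, n_arms, seed=0):
        self._rng = random.Random(seed)
        self.alpha = [1.0] * n_arms  # 1 + observed successes
        self.beta = [1.0] * n_arms   # 1 + observed failures

    def choose(self):
        # Draw one sample from each arm's posterior and play the largest.
        samples = [
            self._rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def feed(self, arm, reward):
        # Under delayed feedback this is called when the reward arrives,
        # which may be many rounds after the arm was played.
        if reward:
            self.alpha[arm] += 1
        else:
            self.beta[arm] += 1
```

Because the posterior is updated only in `feed`, the same class works with or without delays; with delays the policy simply keeps sampling from a posterior that lags behind the pulls it has made.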

To run experiments on the Bernoulli MAB, see the available options:

python delayed_bandit/experiments.py --help

One might want to run a large number of experiments and aggregate the results by removing outliers and averaging. The sampled delays can be kept fixed over the horizon.
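
One possible way to aggregate repeated runs is a trimmed mean, sketched below. The README does not say which outlier rule is used, so the function name, the symmetric trimming, and the default fraction are assumptions.

```python
def trimmed_mean(values, trim_fraction=0.1):
    """Hypothetical helper: average after dropping the lowest and highest runs.

    `trim_fraction` is the share of values removed from EACH end.
    """
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # runs dropped per side
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Example: final regret from five runs, one of which is an outlier.
final_regrets = [1, 2, 3, 4, 100]
print(trimmed_mean(final_regrets, trim_fraction=0.2))
```

The same helper can be applied per round to a list of regret curves if a full averaged curve is needed rather than a single number.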

Development

python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
./pychecks.sh

MIT License
