To use it within your reinforcement learning context, you need:
- https://github.com/Abzinger/BROJA_2PID (BROJA)
- `import dit` (Finn-Lizier)
After that, use the function `intrinsic_reward(n, piT, piX_T, piY_T)`, which takes as arguments:
- `n`: the number of actions
- `piT`: some array-ish object such that `piT[t]` is the probability that T takes action t
- `piX_T`: such that `piX_T[x,t]` is the probability, conditioned on T taking action t, that X takes action x
- `piY_T`: such that `piY_T[y,t]` is the probability, conditioned on T taking action t, that Y takes action y
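
As an illustration of the argument layout described above, here is a minimal sketch that builds the three inputs as NumPy arrays and checks them before use. It assumes that T, X and Y each have `n` actions and that NumPy arrays count as "array-ish"; the helper `check_policies` is hypothetical and not part of this package, and the import path for `intrinsic_reward` is not stated here, so the call is only shown as a comment.

```python
import numpy as np

n = 3  # number of actions (illustrative value)

# piT[t]: probability that T takes action t.
piT = np.array([0.5, 0.3, 0.2])

# piX_T[x, t]: probability that X takes action x, given that T takes action t.
# Each column t is a distribution over x, so every column must sum to 1.
piX_T = np.array([
    [0.8, 0.1, 0.3],
    [0.1, 0.8, 0.3],
    [0.1, 0.1, 0.4],
])

# piY_T[y, t]: same layout for Y; here Y acts uniformly regardless of T.
piY_T = np.full((n, n), 1.0 / n)


def check_policies(n, piT, piX_T, piY_T):
    """Hypothetical helper: validate shapes, non-negativity and normalization."""
    piT, piX_T, piY_T = (np.asarray(a, dtype=float) for a in (piT, piX_T, piY_T))
    ok_shapes = piT.shape == (n,) and piX_T.shape == (n, n) and piY_T.shape == (n, n)
    ok_nonneg = all((a >= 0).all() for a in (piT, piX_T, piY_T))
    ok_sums = (np.isclose(piT.sum(), 1.0)
               and np.allclose(piX_T.sum(axis=0), 1.0)
               and np.allclose(piY_T.sum(axis=0), 1.0))
    return bool(ok_shapes and ok_nonneg and ok_sums)


if check_policies(n, piT, piX_T, piY_T):
    # reward = intrinsic_reward(n, piT, piX_T, piY_T)  # import path not given here
    pass
```

Note the index order: the first index is the action of X (or Y) and the second is the action of T, so a transposed array would silently describe a different policy unless it is checked.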
The function returns a single floating point number normalized to [-1,+1].
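
Because the return value is bounded, one plausible way to use it in a reinforcement learning loop is as a weighted bonus on top of the environment reward. The sketch below is only an assumption about usage, not something this package prescribes; `shaped_reward` and the weight `beta` are hypothetical names.

```python
def shaped_reward(extrinsic, intrinsic, beta=0.1):
    """Hypothetical helper: add a weighted intrinsic bonus to the environment reward.

    `intrinsic` is expected to be the value returned by intrinsic_reward,
    i.e. a float in [-1, +1]; `beta` controls how strongly it counts.
    """
    if not -1.0 <= intrinsic <= 1.0:
        raise ValueError("intrinsic reward expected in [-1, +1]")
    return extrinsic + beta * intrinsic


# Example: environment reward 1.0, intrinsic reward 0.5, weight 0.2.
total = shaped_reward(1.0, 0.5, beta=0.2)
```

Since the intrinsic term lies in [-1, +1], the bonus can shift the total reward by at most `beta` in either direction, which makes the weight easy to tune against the scale of the extrinsic reward.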
To install this package just do:
pip install git+https://github.com/dojt/re-in-pid