
Query on Handling Offline Data with scope-rl #25

Open
jupitersh opened this issue Jan 21, 2024 · 6 comments

Comments

@jupitersh
I am working in the field of reinforcement learning research, particularly in medical applications.

My inquiry is about using pre-collected offline data (encompassing state, action, next state, and reward) for constructing a logged_dataset in scope-rl. I noticed that the documentation mostly focuses on simulated data. However, my dataset is offline and pre-collected, and I'm unsure about the correct approach to define pscore in the logged_dataset for such data.

Could you provide guidance or share best practices on how to manage pscore for offline datasets within scope-rl? Your input would be highly valuable and greatly assist my research in the medical domain.

@pmoran3
pmoran3 commented Jan 24, 2024

I am also struggling with setting the pscores using real-world data. More tutorials and/or documentation for these cases would be greatly appreciated!

@aiueola
Collaborator

aiueola commented Jan 30, 2024

Hi @jupitersh and @pmoran3,

Thank you for the question.

When the "pscore" is not recorded in the logged data, I recommend using Marginal OPE estimators (e.g., Uehara et al., 2020). These estimators first estimate the marginal visitation probability of each state-action pair, and then apply importance sampling using the estimated marginal probability.

The marginal probability is calculated in the CreateOPEInput class by calling "obtain_whole_inputs" (and by setting "pscore" in the "logged_dataset" to None). Please set "require_weight_prediction=True" and specify the method used to estimate the marginal importance weight via the "w_function_method" argument.

For general instructions and formatting requirements when using real-world data, please also refer to the documentation on handling real-world data.

I hope this information will be helpful to you.
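For intuition, the marginal importance sampling idea described above can be sketched in plain numpy. This is a toy illustration, not the scope-rl API: `w_hat` stands in for the marginal importance weights d^π(s, a) / d^π_b(s, a) that a learned weight function (e.g., a "dice"-style method) would produce.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: in practice, w_hat comes from a learned weight-function
# model that estimates the marginal density ratio d^pi(s, a) / d^pi_b(s, a).
w_hat = rng.uniform(0.5, 2.0, size=1000)
rewards = rng.normal(loc=1.0, scale=0.1, size=1000)

# Self-normalized marginal importance sampling estimate of the policy value:
# each logged reward is reweighted by the estimated marginal density ratio,
# so no per-step behavior-policy "pscore" is required.
v_hat = np.sum(w_hat * rewards) / np.sum(w_hat)
print(v_hat)
```

Since the toy rewards are drawn around 1.0, the estimate lands near 1.0; the point is only that the weights replace the per-step propensity scores inside the importance sampling sum.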

@pmoran3
pmoran3 commented Jan 30, 2024

@aiueola Thank you for this information. Does this also apply to continuous real-world data?

When I set "pscore" to None and call "obtain_whole_inputs," I get an error:

/usr/local/lib/python3.10/dist-packages/scope_rl/ope/input.py in _register_logged_dataset(self, logged_dataset)
    388         )
    389         self.reward_2d = self.reward.reshape((-1, self.step_per_trajectory))
--> 390         self.pscore_2d = self.pscore.reshape((-1, self.step_per_trajectory))
    391         self.done_2d = self.done.reshape((-1, self.step_per_trajectory))
    392         self.terminal_2d = self.terminal.reshape((-1, self.step_per_trajectory))

AttributeError: 'NoneType' object has no attribute 'reshape'

This is how I am initializing the logged_dataset dict and calling obtain_whole_inputs:

test_logged_dataset = {
    "size": 100000,
    "step_per_trajectory": 10,
    "n_trajectories": 10000,
    "action": actions_test,
    "state": observations_test,
    "reward": rewards_test,
    "action_type": "continuous",
    "n_actions": None,
    "action_meaning": None,
    "state_dim": 3,
    "done": terminals,
    "terminal": terminals,
    "random_state": random_state,
    "action_dim": 2,
    "behavior_policy": None,
    "dataset_id": 0,
    "pscore": None,
    "info": None,
}
prep = CreateOPEInput()

input_dict = prep.obtain_whole_inputs(
    logged_dataset=test_logged_dataset,
    evaluation_policies=evaluation_policies,
    require_weight_prediction=True,
    require_value_prediction=True,
    w_function_method="dice",
    n_trajectories_on_policy_evaluation=100,
    random_state=random_state,
)
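The traceback above appears to come from `_register_logged_dataset` unconditionally reshaping every flat per-step array, including `pscore`, to `(n_trajectories, step_per_trajectory)` before any weight prediction happens, so a literal `None` fails on `.reshape`. A toy numpy check of the flat-array layout the reshape implies (hypothetical shapes, not scope-rl code; the NaN placeholder only illustrates the expected shape and is not a recommended workaround):

```python
import numpy as np

# Hypothetical toy layout mirroring the logged_dataset fields above:
# each per-step field is a flat array of length size = n_trajectories * step_per_trajectory.
n_trajectories, step_per_trajectory = 10, 10
size = n_trajectories * step_per_trajectory

reward = np.zeros(size)
pscore = np.full(size, np.nan)  # placeholder array instead of None

# The registration step reshapes every flat per-step array like this,
# which is why a literal None raises AttributeError on .reshape:
reward_2d = reward.reshape((-1, step_per_trajectory))
pscore_2d = pscore.reshape((-1, step_per_trajectory))
print(reward_2d.shape, pscore_2d.shape)
```

Whether downstream estimators tolerate a placeholder pscore is untested here; this only shows why `pscore=None` cannot reach the weight-prediction path.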

@ericyue
ericyue commented Feb 2, 2024

@aiueola could you provide a more detailed Jupyter notebook showing how to load custom logged data (without pscore) to train a BCQ (or other) model? It would be very helpful!

@pmoran3
pmoran3 commented Feb 6, 2024

@jupitersh Were you able to calculate the pscores properly for your problem? I am still having issues.

@ericyue
ericyue commented Feb 19, 2024

@jupitersh have you solved this problem? Any ideas would be helpful.
