More precise logics for n_actions in Dataset and Simulator #45

yoongi0428 · 2021-01-14T04:51:41Z

Possible Issue

In the bandit feedback, n_actions is set to int(self.action.max() + 1). This doesn't raise any error in the current code,
as long as the logs generated by the policy cover all possible actions.

However, to be more precise, I think n_actions should be given explicitly rather than extracted from the log data.
If that assumption does not hold, the current code might raise an error.
For example, if there are 1000 possible actions but only actions 0~998 exist in bandit_feedback, and the policy somehow selects action 999,
this might raise an out-of-index error.
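
To make the scenario concrete, here is a minimal sketch of the failure mode. The helper `one_hot_action_dist` is a hypothetical stand-in written for illustration, not the library's `convert_to_action_dist`; it only assumes that the action distribution is an array indexed by action along an axis of size n_actions.

```python
import numpy as np


def one_hot_action_dist(n_actions: int, selected_actions: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in: one-hot encode the action selected in each round."""
    n_rounds = selected_actions.shape[0]
    dist = np.zeros((n_rounds, n_actions))
    dist[np.arange(n_rounds), selected_actions] = 1.0
    return dist


# The log covers actions 0..998 only; action 999 exists but was never logged.
logged_actions = np.arange(999)
inferred_n_actions = int(logged_actions.max() + 1)  # 999, although 1000 actions exist

# A new policy happens to select the unlogged action 999.
selected = np.array([0, 5, 999])

try:
    one_hot_action_dist(inferred_n_actions, selected)
    raised = False
except IndexError:
    # Index 999 is out of bounds for an axis of size 999.
    raised = True

# With n_actions given explicitly, the same selection is fine.
explicit = one_hot_action_dist(1000, selected)
```

With the inferred value the call fails, while the explicit value of 1000 handles the same selected actions without error.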

Idea

The BanditFeedback data should be given n_actions explicitly,
rather than inferring it from the logged actions:

zr-obp/obp/dataset/real.py

Lines 78 to 81 in 55ab57e

    
    @property
    def n_actions(self) -> int:
        """Number of actions."""
        return int(self.action.max() + 1)

Use that n_actions directly in convert_to_action_dist,
rather than recomputing it from the log:

zr-obp/obp/simulator/simulator.py

Lines 75 to 78 in 55ab57e

    
    action_dist = convert_to_action_dist(
        n_actions=bandit_feedback["action"].max() + 1,
        selected_actions=np.array(selected_actions_list),
    )
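
One way the idea could look is sketched below. `LoggedFeedback` is a hypothetical container invented for this example (it is not the library's dataset class); it assumes n_actions is supplied by the caller and only checks that the logged actions are consistent with it.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class LoggedFeedback:
    """Hypothetical container: n_actions is given explicitly, not inferred."""

    action: np.ndarray
    n_actions: int

    def __post_init__(self) -> None:
        if self.n_actions <= 0:
            raise ValueError("n_actions must be a positive integer")
        # Logged actions may cover only a subset, but must stay inside the range.
        if self.action.size and (
            self.action.min() < 0 or self.action.max() >= self.n_actions
        ):
            raise ValueError("logged actions must lie in [0, n_actions)")


# Action 999 is never logged, yet n_actions still reflects the full action set.
feedback = LoggedFeedback(action=np.array([0, 3, 998]), n_actions=1000)
```

Downstream code such as the simulator could then read `feedback.n_actions` instead of recomputing `action.max() + 1`, so a policy that selects an unlogged action stays within bounds.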


yoongi0428 changed the title ~~Possible out of index error in simulator~~ More precise logics for n_actions in Dataset and Simulator Jan 14, 2021
