RL-THOR is a lightweight and customizable reinforcement learning environment based on AI2-THOR.
Thanks to the AI2-THOR simulation environment, AI agents can explore realistic 3D household environments, interact with a wide variety of objects, navigate complex scenes, and complete meaningful tasks. The environment is designed to support high-level embodied reasoning and complex interactions, enabling agents to learn and generalize from diverse, scalable tasks.
- Installation
- Getting Started
- Running Headless
- Environment Configuration
- Creating new tasks - [WIP]
- The Benchmark
- Citation
- License
- Contributing
## Installation

- Create virtual environment

  We recommend you use a conda virtual environment:

  ```bash
  # We require python>=3.12
  conda create -n rl_thor python=3.12
  conda activate rl_thor
  ```
- Install RL-THOR and its dependencies

  To install and customize the environment locally:

  ```bash
  git clone https://github.com/Kajiih/rl_thor.git
  pip install -r requirements/dev.txt
  ```
RL-THOR requires Python 3.12 or later. In addition, it shares the same requirements as AI2-THOR. Notably, Windows systems are not natively supported. For detailed requirements, refer to the AI2-THOR requirements.
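To quickly confirm that the installation resolved correctly, you can try importing the package and its AI2-THOR dependency. This is only an illustrative check, assuming both were installed by the steps above:

```python
# Illustrative installation check: both imports should succeed without raising ImportError.
import ai2thor

from rl_thor.envs import ITHOREnv

print("ai2thor and rl_thor imported successfully")
```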
## Getting Started

RL-THOR uses the Gymnasium API, so you can use it as simply as any other Gym/Gymnasium environment.
This short script runs an ITHOR environment with the basic configuration and random actions:
```python
from rl_thor.envs import ITHOREnv

env = ITHOREnv()
obs, info = env.reset()
terminated, truncated = False, False
while not terminated and not truncated:
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
env.close()
```
Note that the first time you instantiate the environment, AI2-THOR will download the 3D simulator resources locally to `~/.ai2thor` (~500 MB).
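Building on the script above, the sketch below runs a few episodes with random actions and tracks the episode returns. It only uses the Gymnasium API already shown; the number of episodes and the logging are illustrative:

```python
from rl_thor.envs import ITHOREnv

env = ITHOREnv()
for episode in range(3):  # illustrative number of episodes
    obs, info = env.reset()
    terminated, truncated = False, False
    episode_return = 0.0
    while not terminated and not truncated:
        action = env.action_space.sample()  # random policy, as above
        obs, reward, terminated, truncated, info = env.step(action)
        episode_return += reward
    print(f"Episode {episode}: return = {episode_return:.2f}")
env.close()
```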
More examples and training scripts can be found in the `examples` folder of this repository.
To go further, we recommend getting familiar with the concepts of the ITHOR simulation environment and our documentation [create actual documentation] to understand how tasks are defined.
## Environment Configuration

The environment can be configured through a YAML file or with a dictionary when instantiating the environment.

For a complete enumeration of the configuration options, see Configuration.

If unspecified, the environment configuration is equivalent to the file `configurations/environment_config.yaml` of this repository:
```yaml
# === General ===
seed: 1
max_episode_steps: 1000

# === Simulator ===
controller_parameters:
  platform: null # set to "CloudRendering" for headless cloud rendering
  visibility_distance: 1.5
  # Camera properties
  frame_width: 300
  frame_height: 300
  field_of_view: 90

scene_randomization:
  random_agent_spawn: False # If True, the agent will spawn at a random location and rotation at the beginning of each episode
  random_object_spawn: False # If True, pickupable objects will spawn at random locations at the beginning of each episode
  random_object_materials: False # If True, the materials of the objects will be randomized at the beginning of each episode
  random_object_colors: False # If True, the colors of the objects will be randomized at the beginning of each episode # Note: Not realistic
  random_lighting: False # If True, the lighting conditions will be randomized at the beginning of each episode

# === Actions ===
action_groups:
  # === Navigation actions ===
  movement_actions: True
  rotation_actions: True
  head_movement_actions: True
  crouch_actions: False
  # === Object manipulation actions ===
  pickup_put_actions: True
  drop_actions: False
  throw_actions: False
  push_pull_actions: False
  hand_control_actions: False
  # === Object interaction actions ===
  open_close_actions: True
  toggle_actions: True
  slice_actions: False
  use_up_actions: False
  liquid_manipulation_actions: False
  break_actions: False

action_modifiers:
  discrete_actions: True # If True, all actions requiring a parameter will be discretized and use their discrete value
  target_closest_object: True # If True, the closest operable object to the agent will be used as the target for object interaction actions (e.g. pickup, open, etc.)
  simple_movement_actions: False # Only keep the MoveAhead action (no MoveBack, MoveRight and MoveLeft actions); should at least be used with body_rotation_actions
  static_pickup: False # Picked up objects don't teleport to the hand
  stationary_placement: True # If False, a placed object will use the physics engine to resolve its final position (no deterministic placement)
  partial_openness: False # If True, objects can be opened partially with a parameter (only if open_close_actions is already enabled and the environment is continuous) -> Adds partial_open_object_action from the "special" action category and removes open_object_action and close_object_action

action_discrete_param_values: # Used in discrete mode
  movement_magnitude: 0.25
  rotation_degrees: 45
  head_movement_degrees: 30
  throw_strength: 50
  push_pull_strength: 100

# === Tasks ===
tasks:
  globally_excluded_scenes: [] # List of scene names to exclude for all tasks (only full names like "FloorPlan1", "FloorPlan201", ...)
  task_blueprints: []
```
When instantiating the environment, you can set the relative path to the configuration file:

```python
import gymnasium as gym

env = gym.make("rl_thor/ITHOREnv-v0.1", config_path="path/to/config.yaml")
```

By default, this path is set to `config/environment_config.yaml`.
For convenience, you can also override specific values of the configuration with the `config_override` parameter.

Example: if you only want to change the maximum number of steps per episode to 200 and enable randomization of the agent's spawn location and object materials, you can do it like this:
```python
config_override = {
    "max_episode_steps": 200,
    "scene_randomization": {
        "random_agent_spawn": True,
        "random_object_materials": True,
    },
}
env = gym.make(
    "rl_thor/ITHOREnv-v0.1",
    config_folder_path="config/environment_config.yaml",  # Default value
    config_override=config_override,
)
```
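The same override mechanism can also be used to run the simulator headless (e.g., on a remote server without a display), following the `platform: CloudRendering` hint in the default configuration above. This is a minimal sketch assuming the override is merged into `controller_parameters` exactly like in the previous example:

```python
import gymnasium as gym

# Assumption: overriding controller_parameters.platform enables AI2-THOR's CloudRendering backend.
config_override = {
    "controller_parameters": {
        "platform": "CloudRendering",
    },
}
env = gym.make("rl_thor/ITHOREnv-v0.1", config_override=config_override)
obs, info = env.reset()
env.close()
```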
## Creating new tasks

In RL-THOR, we use a specific task description format called Graph Task.

Thanks to graph tasks, you can define a new task by describing its adjacency list. In practice, it is as simple as creating a Python dictionary describing the task items, their properties and their relations:
```python
task_description_dict = {
    "plate_receptacle": {
        "properties": {"objectType": "Plate"},
    },
    "hot_apple": {
        "properties": {"objectType": "Apple", "temperature": "Hot"},
        "relations": {"plate_receptacle": ["contained_in"]},
    },
}
```
This code defines a new task consisting of putting a hot apple in a plate. `hot_apple` and `plate_receptacle` are identifiers of the items, used to define relations, and each available property and relation can be found here.

This is enough to automatically create the reward function associated with the graph task.
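As a second illustration, here is a hedged sketch of another graph task that reuses only the property and relation shown above (`objectType` and `contained_in`); the item identifiers and object types are illustrative:

```python
# Illustrative graph task: put an apple inside a fridge.
task_description_dict = {
    "fridge_receptacle": {
        "properties": {"objectType": "Fridge"},
    },
    "stored_apple": {
        "properties": {"objectType": "Apple"},
        "relations": {"fridge_receptacle": ["contained_in"]},
    },
}
```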
For more details about how to define new tasks, item properties or relations, see the dedicated part of the documentation.
## The Benchmark

The training script and commands to reproduce the results of the baselines are available in the `examples/benchmark` folder.
## Citation

Not available yet.
## License

| Component | License |
|---|---|
| Codebase (this repo) | MIT License |
| AI2-THOR | Apache License Version 2.0 |
## Contributing

Not available yet.