Bug description

AssertionError: Tuple observations are not supported.
Steps to reproduce

```python
import gymnasium as gym
import numpy as np
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy
from stable_baselines3.ppo import MlpPolicy

from imitation.algorithms import bc
from imitation.data import rollout
from imitation.data.wrappers import RolloutInfoWrapper
from imitation.policies.serialize import load_policy
from imitation.util.util import make_vec_env

rng = np.random.default_rng(0)
env = gym.make("Pendulum-v1")
env = RolloutInfoWrapper(env)


def train_expert():
    print("Training a expert.")
    expert = PPO(
        policy=MlpPolicy,
        env=env,
        seed=0,
        batch_size=64,
        ent_coef=0.0,
        learning_rate=0.0003,
        n_epochs=10,
        n_steps=64,
    )
    expert.learn(10)  # Note: change this to 100_000 to train a decent expert.
    return expert


def sample_expert_transitions():
    expert = train_expert()

    print("Sampling expert transitions.")
    rollouts = rollout.rollout(
        expert,
        env,
        rollout.make_sample_until(min_timesteps=None, min_episodes=50),
        rng=rng,
    )
    return rollout.flatten_trajectories(rollouts)
```
Environment

pip freeze --all