Gymnasium¶

Gymnasium is a community-driven toolkit for deep reinforcement learning (DRL), developed by the Farama Foundation as an enhanced and actively maintained fork of OpenAI’s Gym. It provides a standardized interface for building and benchmarking DRL algorithms while addressing limitations of the original Gym. Gymnasium retains backward compatibility with Gym while introducing significant improvements to modernize the toolkit.

Official documentation: https://gymnasium.farama.org/.

Note

Gymnasium is a community-driven fork of OpenAI’s Gym, actively maintained by the Farama Foundation. It offers enhanced APIs, richer info outputs, clear termination criteria, and modern Python support. Unlike Gym, whose updates have slowed, Gymnasium stays compatible with new RL libraries and provides improved documentation and ongoing support. The original Gym documentation is available at https://www.gymlibrary.dev/

Classic Control¶

Overview¶

Cart Pole

Mountain Car Continuous

Mountain Car

Acrobot

Pendulum

Features¶

The Classic Control environment contains five scenarios: Cart Pole, Mountain Car Continuous, Mountain Car, Acrobot, and Pendulum. These tasks are commonly used for preliminary verification of a DRL algorithm. The key features of each scenario are summarized in the table below:

Env-id	Observation Space	Action Space
CartPole-v1	`Box([-4.8 -inf -0.41887903 -inf], [4.8 inf 0.41887903 inf], (4,), float32)`	`Discrete(2)`
MountainCarContinuous-v0	`Box([-1.2 -0.07], [0.6 0.07], (2,), float32)`	`Box(-1.0, 1.0, (1,), float32)`
MountainCar-v0	`Box([-1.2 -0.07], [0.6 0.07], (2,), float32)`	`Discrete(3)`
Acrobot-v1	`Box([ -1. -1. -1. -1. -12.566371 -28.274334], [ 1. 1. 1. 1. 12.566371 28.274334], (6,), float32)`	`Discrete(3)`
Pendulum-v1	`Box([-1. -1. -8.], [1. 1. 8.], (3,), float32)`	`Box(-2.0, 2.0, (1,), float32)`

Arguments¶

In XuanCe, the arguments for running the Classic Control environment are listed below.

Arguments	Value/Description
`env_name`	“Classic Control”
`env_id`	The env-id.
`vectorize`	Choose the method to vectorize the environment. Choices: “DummyVecEnv”, “SubprocVecEnv”.
`parallels`	The number of environments that run in parallel.
`env_seed`	The env-seed for the first environment of vectorized environments.
`render_mode`	The render mode to visualize the environment, default is “human”. Choices: “human”, “rgb_array”.
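
As a rough illustration of how these arguments fit together, the sketch below builds an `argparse.Namespace` with the fields from the table and checks the documented choices. The helper `check_env_args` is hypothetical and is not part of XuanCe; it only mirrors the constraints stated above.

```python
from argparse import Namespace

# Choices documented in the arguments table.
VECTORIZE_CHOICES = ("DummyVecEnv", "SubprocVecEnv")
RENDER_MODE_CHOICES = ("human", "rgb_array")


def check_env_args(args: Namespace) -> Namespace:
    """Hypothetical helper: validate the documented argument choices."""
    if args.vectorize not in VECTORIZE_CHOICES:
        raise ValueError(f"vectorize must be one of {VECTORIZE_CHOICES}")
    if args.render_mode not in RENDER_MODE_CHOICES:
        raise ValueError(f"render_mode must be one of {RENDER_MODE_CHOICES}")
    if args.parallels < 1:
        raise ValueError("parallels must be a positive integer")
    return args


# A configuration for a single, non-parallel CartPole environment.
args = check_env_args(Namespace(env_name="Classic Control",
                                env_id="CartPole-v1",
                                vectorize="DummyVecEnv",
                                parallels=1,
                                env_seed=1,
                                render_mode="rgb_array"))
print(args.env_id, args.vectorize)
```

Validating the configuration before creating the environments surfaces typos such as a misspelled vectorizer name early, instead of deep inside environment construction.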

Run in XuanCe¶

In XuanCe, if you want to run the Classic Control environment, you can specify the arguments in your config file. For example:

env_name: "Classic Control"  # The name of classic control environment.
env_id: "CartPole-v1"  # The env-id of the tasks in classic control environment.
env_seed: 1  # The random seed of the task.
vectorize: "SubprocVecEnv"  # Choose the method to vectorize the environment.
parallels: 10  # The number of environments that run in parallel.
render_mode: "rgb_array"  # The render mode.

You can also create the environments in a Python console for a quick test:

from xuance import make_envs
from argparse import Namespace
envs = make_envs(Namespace(env_name="Classic Control", 
                           env_id="CartPole-v1", 
                           vectorize="DummyVecEnv", 
                           parallels=1, 
                           env_seed=1))
envs.reset()                        

To run a DRL demo with the Classic Control environment, see the Quick Start.

Box2D¶

Overview¶

Bipedal Walker

Car Racing

Lunar Lander

Features¶

The Box2D environment is built using box2d for physics control. It contains three different scenarios: Bipedal Walker, Car Racing, Lunar Lander. The key features of each scenario are summarized in the table below:

Env-id	Observation Space	Action Space
BipedalWalker-v3	`Box([-3.1415927 -5. -5. -5. -3.1415927 -5. -3.1415927 -5. -0. -3.1415927 -5. -3.1415927 -5. -0. -1. -1. -1. -1. -1. -1. -1. -1. -1. -1. ], [3.1415927 5. 5. 5. 3.1415927 5. 3.1415927 5. 5. 3.1415927 5. 3.1415927 5. 5. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. ], (24,), float32)`	`Box(-1.0, 1.0, (4,), float32)`
CarRacing-v2	`Box(0, 255, (96, 96, 3), uint8)`	`Discrete(5)` or `Box([-1. 0. 0.], 1.0, (3,), float32)`
LunarLander-v2	`Box([ -2.5 -2.5 -10. -10. -6.2831855 -10. -0. -0. ], [ 2.5 2.5 10. 10. 6.2831855 10. 1. 1. ], (8,), float32)`	`Discrete(4)` or `Box(-1, +1, (2,), dtype=np.float32)`

Installation¶

The Box2D environment is not included with the installation of XuanCe. As an external package, it needs to be installed separately.

pip install swig
pip install gymnasium[box2d]

Note

If you’re using macOS and encounter the following error:

zsh: no matches found: gymnasium[box2d]

You can resolve it by typing the command as follows:

pip install 'gymnasium[box2d]'

Arguments¶

In XuanCe, the arguments for running the Box2D environment are listed below.

Arguments	Value/Description
`env_name`	“Box2D”
`env_id`	The env-id.
`vectorize`	Choose the method to vectorize the environment. Choices: “DummyVecEnv”, “SubprocVecEnv”.
`parallels`	The number of environments that run in parallel.
`env_seed`	The env-seed for the first environment of vectorized environments.
`render_mode`	The render mode to visualize the environment, default is “human”. Choices: “human”, “rgb_array”.
`continuous`	Determines if discrete or continuous actions will be used. (Only CarRacing-v2 and LunarLander-v2 have this argument.)
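
To make the effect of `continuous` concrete, the sketch below looks up the action space that the Box2D feature table lists for each combination of env-id and flag. The lookup table and the helper `expected_action_space` are illustrative only and are not part of XuanCe or Gymnasium; the space descriptions are copied as strings from the feature table.

```python
# Action spaces copied from the Box2D feature table; keys are (env_id, continuous).
ACTION_SPACES = {
    ("CarRacing-v2", False): "Discrete(5)",
    ("CarRacing-v2", True): "Box([-1. 0. 0.], 1.0, (3,), float32)",
    ("LunarLander-v2", False): "Discrete(4)",
    ("LunarLander-v2", True): "Box(-1, +1, (2,), dtype=np.float32)",
}


def expected_action_space(env_id: str, continuous: bool) -> str:
    """Hypothetical helper: describe the action space for a Box2D task."""
    if env_id == "BipedalWalker-v3":
        # BipedalWalker has no `continuous` argument; its actions are always continuous.
        return "Box(-1.0, 1.0, (4,), float32)"
    return ACTION_SPACES[(env_id, continuous)]


for env_id in ("BipedalWalker-v3", "CarRacing-v2", "LunarLander-v2"):
    for continuous in (False, True):
        print(env_id, continuous, expected_action_space(env_id, continuous))
```

The choice matters when picking an algorithm: a discrete action space suits value-based methods such as DQN, while a `Box` action space calls for a continuous-control method.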

Run in XuanCe¶

In XuanCe, if you want to run the Box2D environment, you can specify the arguments in your config file. For example:

env_name: "Box2D"  # The name of the Box2D environment.
env_id: "BipedalWalker-v3"  # The env-id of the task in the Box2D environment.
env_seed: 1  # The random seed of the task.
vectorize: "SubprocVecEnv"  # Choose the method to vectorize the environment.
parallels: 10  # The number of environments that run in parallel.
render_mode: "rgb_array"  # The render mode.

You can also create the environments in a Python console for a quick test:

from xuance import make_envs
from argparse import Namespace
envs = make_envs(Namespace(env_name="Box2D", 
                           env_id="BipedalWalker-v3", 
                           vectorize="DummyVecEnv", 
                           parallels=1, 
                           env_seed=1))
envs.reset()                        

To run a DRL demo with the Box2D environment, refer to the Quick Start.

MuJoCo¶

Overview¶

Inverted Double Pendulum

MuJoCo stands for Multi-Joint dynamics with Contact. It is a physics engine that supports research and development in robotics, biomechanics, graphics and animation, and other areas that need fast and accurate simulation. Because the robots are in physical contact with their environment, MuJoCo aims to simulate the resulting contact dynamics realistically, balancing physical accuracy with computational efficiency.

Learn more about this environment here.

Features¶

The key features of each scenario are summarized in the table below:

Env-id	Observation Space	Action Space
Ant-v4	`Box(-inf, inf, (105,), float64)`	`Box(-1.0, 1.0, (8,), float32)`
HalfCheetah-v4	`Box(-inf, inf, (17,), float64)`	`Box(-1.0, 1.0, (6,), float32)`
Hopper-v4	`Box(-inf, inf, (11,), float64)`	`Box(-1.0, 1.0, (3,), float32)`
HumanoidStandup-v4	`Box(-inf, inf, (348,), float64)`	`Box(-0.4, 0.4, (17,), float32)`
Humanoid-v4	`Box(-inf, inf, (348,), float64)`	`Box(-0.4, 0.4, (17,), float32)`
InvertedDoublePendulum-v4	`Box(-inf, inf, (9,), float64)`	`Box(-1.0, 1.0, (1,), float32)`
InvertedPendulum-v4	`Box(-inf, inf, (4,), float64)`	`Box(-3.0, 3.0, (1,), float32)`
Pusher-v4	`Box(-inf, inf, (23,), float64)`	`Box(-2.0, 2.0, (7,), float32)`
Reacher-v4	`Box(-inf, inf, (10,), float64)`	`Box(-1.0, 1.0, (2,), float32)`
Swimmer-v4	`Box(-inf, inf, (8,), float64)`	`Box(-1.0, 1.0, (2,), float32)`
Walker2d-v4	`Box(-inf, inf, (17,), float64)`	`Box(-1.0, 1.0, (6,), float32)`

Installation¶

The MuJoCo environment is not included with the installation of XuanCe. As an external package, it needs to be installed separately.

pip install gymnasium[mujoco]

Note

If you’re using macOS and encounter the following error:

zsh: no matches found: gymnasium[mujoco]

You can resolve it by typing the command as follows:

pip install 'gymnasium[mujoco]'

Arguments¶

In XuanCe, the arguments for running the MuJoCo environment are listed below.

Arguments	Value/Description
`env_name`	“MuJoCo”
`env_id`	The env-id.
`vectorize`	Choose the method to vectorize the environment. Choices: “DummyVecEnv”, “SubprocVecEnv”.
`parallels`	The number of environments that run in parallel.
`env_seed`	The env-seed for the first environment of vectorized environments.
`render_mode`	The render mode to visualize the environment, default is “human”. Choices: “human”, “rgb_array”.
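
The `env_seed` entry only fixes the seed of the first environment. A minimal sketch of one common convention for deriving the remaining seeds is shown below: each parallel environment receives the base seed plus its index. Both this offset rule and the helper name `derive_env_seeds` are assumptions made for illustration; XuanCe's own seeding may differ.

```python
from argparse import Namespace


def derive_env_seeds(args: Namespace) -> list:
    """Hypothetical helper: one distinct seed per parallel environment.

    Assumes seeds are offset by the environment index. This is a common
    convention, but it is not confirmed by the XuanCe documentation.
    """
    return [args.env_seed + i for i in range(args.parallels)]


args = Namespace(env_name="MuJoCo", env_id="Ant-v4",
                 vectorize="SubprocVecEnv", parallels=4, env_seed=1)
print(derive_env_seeds(args))  # [1, 2, 3, 4]
```

Giving each copy a different seed keeps the parallel environments from producing identical trajectories, while the run as a whole stays reproducible from a single `env_seed`.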

Run in XuanCe¶

In XuanCe, if you want to run the MuJoCo environment, you can specify the arguments in your config file. For example:

env_name: "MuJoCo"  # The name of the MuJoCo environment.
env_id: "Ant-v4"  # The env-id of the task in the MuJoCo environment.
env_seed: 1  # The random seed of the task.
vectorize: "SubprocVecEnv"  # Choose the method to vectorize the environment.
parallels: 10  # The number of environments that run in parallel.
render_mode: "rgb_array"  # The render mode.

You can also create the environments in a Python console for a quick test:

from xuance import make_envs
from argparse import Namespace
envs = make_envs(Namespace(env_name="MuJoCo", 
                           env_id="Ant-v4", 
                           vectorize="DummyVecEnv", 
                           parallels=1, 
                           env_seed=1))
envs.reset()                        

To run a DRL demo with the MuJoCo environment, refer to the Quick Start.

Atari¶

Overview¶

Features¶

The Atari environment contains 62 different tasks, which are simulated via the Arcade Learning Environment (ALE).

Action space:

The complete Atari action space contains 18 discrete actions, listed in the table below, which cover everything that can be performed on an Atari 2600. If you set full_action_space=False, only the reduced set of actions that are meaningful in the chosen game is available, which lowers the complexity of training. This is also the default setting in XuanCe.

Num	Action	Num	Action	Num	Action
0	Noop	6	UpRight	12	LeftFire
1	Fire	7	UpLeft	13	DownFire
2	Up	8	DownRight	14	UpRightFire
3	Right	9	DownLeft	15	UpLeftFire
4	Left	10	UpFire	16	DownRightFire
5	Down	11	RightFire	17	DownLeftFire
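
The table above can be written down as an enumeration, which makes the difference between the full and the reduced action space easier to see. The sketch below is illustrative: the names `AtariAction`, `BREAKOUT_REDUCED`, and `to_full_action` are hypothetical, and the reduced set shown for Breakout (Noop, Fire, Right, Left) is an assumption based on the ALE documentation rather than something stated on this page.

```python
from enum import IntEnum


class AtariAction(IntEnum):
    """The 18 actions of the full Atari action space, numbered as in the table."""
    NOOP = 0
    FIRE = 1
    UP = 2
    RIGHT = 3
    LEFT = 4
    DOWN = 5
    UPRIGHT = 6
    UPLEFT = 7
    DOWNRIGHT = 8
    DOWNLEFT = 9
    UPFIRE = 10
    RIGHTFIRE = 11
    LEFTFIRE = 12
    DOWNFIRE = 13
    UPRIGHTFIRE = 14
    UPLEFTFIRE = 15
    DOWNRIGHTFIRE = 16
    DOWNLEFTFIRE = 17


# Assumed reduced action set for Breakout when full_action_space=False.
BREAKOUT_REDUCED = [AtariAction.NOOP, AtariAction.FIRE,
                    AtariAction.RIGHT, AtariAction.LEFT]


def to_full_action(reduced_index: int, reduced_set: list) -> AtariAction:
    """Map an index in the reduced action space to the full-space action."""
    return reduced_set[reduced_index]


print(len(AtariAction))                          # 18
print(to_full_action(2, BREAKOUT_REDUCED).name)  # RIGHT
```

With the reduced space, a policy for Breakout chooses among 4 outputs instead of 18, which is the reduction in training complexity mentioned above.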

Observation space:

The observation space can be specified by the obs_type argument in XuanCe’s config file.

`obs_type`	Observation space
“rgb”	Box(0, 255, (210, 160, 3), np.uint8)
“grayscale”	Box(0, 255, (210, 160), np.uint8)
“ram”	Box(0, 255, (128,), np.uint8)
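
To compare the three observation types, the sketch below records the shapes from the table and computes the size of a single raw observation. The names `OBS_SHAPES` and `observation_bytes` are hypothetical helpers written for this example; the sizes follow directly from the shapes, since each `uint8` element occupies one byte.

```python
from math import prod

# Shapes copied from the obs_type table; all spaces use dtype uint8.
OBS_SHAPES = {
    "rgb": (210, 160, 3),
    "grayscale": (210, 160),
    "ram": (128,),
}


def observation_bytes(obs_type: str) -> int:
    """Hypothetical helper: size in bytes of one raw observation."""
    if obs_type not in OBS_SHAPES:
        raise ValueError(f"obs_type must be one of {sorted(OBS_SHAPES)}")
    return prod(OBS_SHAPES[obs_type])  # one byte per uint8 element


for obs_type in OBS_SHAPES:
    print(obs_type, OBS_SHAPES[obs_type], observation_bytes(obs_type))
```

A grayscale frame is one third the size of an RGB frame, which is one reason image-based agents are often trained on grayscale observations.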

Installation¶

The Atari environment is not included with the installation of XuanCe. As an external package, it needs to be installed separately.

After installing XuanCe, you need to install Atari dependencies via the following command:

pip install gymnasium[accept-rom-license] gymnasium[atari]
pip install atari-py==0.2.9 ale-py==0.7.5

Note

If you’re using macOS and encounter the following error:

zsh: no matches found: gymnasium[accept-rom-license]

You can resolve it by typing the command as follows:

pip install 'gymnasium[accept-rom-license]' 'gymnasium[atari]'

And then, reinstall atari-py and ale-py:

pip install atari-py==0.2.9 ale-py==0.7.5

Arguments¶

In XuanCe, the arguments for running the Atari environment are listed below.

Arguments	Value/Description
`env_name`	“Atari”
`env_id`	The env-id. (For example, “ALE/Breakout-v5”)
`vectorize`	Choose the method to vectorize the environment. Choices: “Dummy_Atari”, “Subproc_Atari”.
`parallels`	The number of environments that run in parallel.
`env_seed`	The env-seed for the first environment of vectorized environments.
`render_mode`	The render mode to visualize the environment, default is “human”. Choices: “human”, “rgb_array”.
`obs_type`	The observation type. Choices: “rgb”, “grayscale”, “ram”.
`frame_skip`	The number of frames to skip at each step.
`full_action_space`	Whether to use the full action space, default is False.
`img_size`	The observed image size, default is [210, 160].
`num_stack`	The number of consecutive frames stacked into one observation (frame stack trick).
`noop_max`	Perform the Noop action for a random number of steps in [1, noop_max] after reset.
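
The three preprocessing arguments `frame_skip`, `num_stack`, and `noop_max` can be illustrated on a toy environment. The sketch below is not XuanCe's implementation: `CountingEnv` and `PreprocessedEnv` are hypothetical classes, the step method returns a simplified `(frame, reward, done)` triple instead of the Gymnasium five-tuple, and the order in which the tricks are applied is an assumption.

```python
import random
from collections import deque


class CountingEnv:
    """Toy stand-in for an Atari game: each frame is an increasing integer."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = 1.0                 # one unit of reward per emulated frame
        done = self.t >= 1000
        return self.t, reward, done


class PreprocessedEnv:
    """Illustrative wrapper applying noop_max, frame_skip, and num_stack."""

    def __init__(self, env, frame_skip=4, num_stack=4, noop_max=30, seed=1):
        self.env = env
        self.frame_skip = frame_skip
        self.noop_max = noop_max
        self.frames = deque(maxlen=num_stack)
        self.rng = random.Random(seed)

    def reset(self):
        frame = self.env.reset()
        # Take a random number of no-op steps in [1, noop_max] (action 0 is Noop).
        for _ in range(self.rng.randint(1, self.noop_max)):
            frame, _, _ = self.env.step(0)
        # Fill the stack with copies of the first observed frame.
        for _ in range(self.frames.maxlen):
            self.frames.append(frame)
        return list(self.frames)

    def step(self, action):
        total_reward, done = 0.0, False
        # Repeat the action for frame_skip frames and sum the rewards.
        for _ in range(self.frame_skip):
            frame, reward, done = self.env.step(action)
            total_reward += reward
            if done:
                break
        # Only the last frame of the skip window enters the stack.
        self.frames.append(frame)
        return list(self.frames), total_reward, done


env = PreprocessedEnv(CountingEnv())
first_obs = env.reset()
obs, reward, done = env.step(1)
print(len(obs), reward, obs[-1] - obs[-2])  # 4 4.0 4
```

Each agent step here advances the emulator by four frames, and the observation always holds the four most recent kept frames, so the agent can infer motion from a single observation.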

Run in XuanCe¶

In XuanCe, if you want to run the Atari environments, you can specify the arguments in your config file. For example:

env_name: "Atari"  # The name of the Atari environment.
env_id: "ALE/Breakout-v5"  # The env-id of the task in the Atari environment.
env_seed: 1  # The random seed of the task.
vectorize: "Dummy_Atari"  # Choose the method to vectorize the environment.
parallels: 10  # The number of environments that run in parallel.
render_mode: "rgb_array"  # The render mode.
obs_type: "grayscale"  # choice for Atari env: ram, rgb, grayscale
img_size: [84, 84]  # default is [210, 160].
num_stack: 4  # frame stack trick
frame_skip: 4  # frame skip trick
noop_max: 30  # Do no-op action for a number of steps in [1, noop_max].

You can also create the environments in a Python console for a quick test:

from xuance import make_envs
from argparse import Namespace
envs = make_envs(Namespace(env_name="Atari", 
                           env_id="ALE/Breakout-v5", 
                           vectorize="Dummy_Atari", 
                           parallels=1, 
                           env_seed=1,
                           obs_type="grayscale",
                           img_size=[84, 84],
                           num_stack=4,
                           frame_skip=4,
                           noop_max=30))
envs.reset()                        

To run a DRL demo with the Atari environment, refer to the Quick Start.

APIs¶

class xuance.environment.single_agent_env.gym.Gym_Env(*args: Any, **kwargs: Any)[source]¶

Bases: Wrapper

Parameters:

env_id (str) – The environment id of Atari, such as “Breakout-v5”, “Pong-v5”, etc.
env_seed (int) – The random seed to set the environment.
render_mode (str) – “rgb_array”, “human”

render(*args)[source]¶

reset()[source]¶

step(actions)[source]¶

class xuance.environment.single_agent_env.gym.LazyFrames(frames)[source]¶

Bases: object

This object ensures that frames shared between observations are stored only once. It exists purely to optimize memory usage, which can be huge for DQN’s replay buffers of 1M frames. It should be converted to a NumPy array only just before being passed to the model.
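
The memory saving can be seen in a small stand-alone sketch. The class `LazyFramesSketch` below is a simplified, hypothetical illustration of the idea and not the actual `LazyFrames` code: it keeps references to frame buffers and concatenates them only on request. Plain `bytes` objects stand in for image arrays so that the example needs no third-party packages.

```python
class LazyFramesSketch:
    """Simplified illustration: hold references to frames, copy only on demand."""

    def __init__(self, frames):
        # Store references only; the frame buffers themselves are not copied.
        self._frames = list(frames)

    def __len__(self):
        return len(self._frames)

    def materialize(self) -> bytes:
        """Concatenate the frames into one buffer, e.g. right before the model."""
        return b"".join(self._frames)


# Five consecutive 84x84 grayscale frames, each a separate bytes buffer.
frames = [bytes([i]) * (84 * 84) for i in range(5)]

obs_t = LazyFramesSketch(frames[0:4])    # stacked frames 0..3
obs_t1 = LazyFramesSketch(frames[1:5])   # stacked frames 1..4

# The three overlapping frames are the very same objects in both observations.
shared = sum(a is b for a, b in zip(obs_t._frames[1:], obs_t1._frames[:3]))
print(shared)                      # 3
print(len(obs_t.materialize()))    # 28224
```

Two consecutive stacked observations share three of their four frames, so storing references instead of copies cuts the replay buffer's frame storage to roughly a quarter in this setting.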

class xuance.environment.single_agent_env.gym.MountainCar(*args: Any, **kwargs: Any)[source]¶

Bases: Gym_Env

reset()[source]¶

step(actions)[source]¶