memory_tools
- class xuance.common.memory_tools.Buffer(observation_space: gymnasium.Space, action_space: gymnasium.Space, auxiliary_info_shape: dict | None)[source]
Bases: ABC
Basic buffer for single-agent DRL algorithms.
- Parameters:
observation_space – the space for observation data.
action_space – the space for action data.
auxiliary_info_shape – the shape for auxiliary data if needed.
- class xuance.common.memory_tools.DummyOffPolicyBuffer(observation_space: gymnasium.Space, action_space: gymnasium.Space, auxiliary_shape: dict | None, n_envs: int, buffer_size: int, batch_size: int)[source]
Bases: Buffer
Replay buffer for off-policy DRL algorithms.
- Parameters:
observation_space – the observation space of the environment.
action_space – the action space of the environment.
auxiliary_shape – the data shape of auxiliary information, if any.
n_envs – number of parallel environments.
buffer_size – the total size of the replay buffer.
batch_size – the number of transitions in each sampled batch.
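To make the roles of n_envs, buffer_size, and batch_size concrete, here is a minimal sketch of a uniform replay buffer with circular overwrite. It is not the xuance implementation: the class name RingReplaySketch, its store and sample methods, the seed argument, and the choice to split buffer_size evenly across environments are all assumptions made for illustration.

```python
import numpy as np


class RingReplaySketch:
    """Hypothetical uniform replay buffer; NOT xuance's DummyOffPolicyBuffer."""

    def __init__(self, obs_shape, n_envs, buffer_size, batch_size, seed=0):
        self.n_envs = n_envs
        self.batch_size = batch_size
        # Assumption: the total capacity is split evenly across environments.
        self.n_size = buffer_size // n_envs
        self.obs = np.zeros((n_envs, self.n_size) + obs_shape, dtype=np.float32)
        self.rewards = np.zeros((n_envs, self.n_size), dtype=np.float32)
        self.ptr = 0   # next slot to write
        self.size = 0  # number of filled slots per environment
        self.rng = np.random.default_rng(seed)

    def store(self, obs, reward):
        # Write one step for every environment, overwriting the oldest slot when full.
        self.obs[:, self.ptr] = obs
        self.rewards[:, self.ptr] = reward
        self.ptr = (self.ptr + 1) % self.n_size
        self.size = min(self.size + 1, self.n_size)

    def sample(self):
        # Draw batch_size (environment, step) pairs uniformly from the filled region.
        env_idx = self.rng.integers(0, self.n_envs, size=self.batch_size)
        step_idx = self.rng.integers(0, self.size, size=self.batch_size)
        return self.obs[env_idx, step_idx], self.rewards[env_idx, step_idx]


buf = RingReplaySketch(obs_shape=(3,), n_envs=2, buffer_size=8, batch_size=5)
for t in range(6):
    buf.store(np.full((2, 3), t, dtype=np.float32), np.full(2, t, dtype=np.float32))
obs_batch, rew_batch = buf.sample()
```

In this sketch each environment gets four slots, so the six stored steps wrap around once and the two oldest steps are overwritten.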
- class xuance.common.memory_tools.DummyOffPolicyBuffer_Atari(observation_space: gymnasium.Space, action_space: gymnasium.Space, auxiliary_shape: dict | None, n_envs: int, buffer_size: int, batch_size: int)[source]
-
Replay buffer for off-policy DRL algorithms and Atari tasks.
- Parameters:
observation_space – the observation space of the environment.
action_space – the action space of the environment.
auxiliary_shape – the data shape of auxiliary information, if any.
n_envs – number of parallel environments.
buffer_size – the total size of the replay buffer.
batch_size – the number of transitions in each sampled batch.
- class xuance.common.memory_tools.DummyOnPolicyBuffer(observation_space: gymnasium.Space, action_space: gymnasium.Space, auxiliary_shape: dict | None, n_envs: int, horizon_size: int, use_gae: bool = True, use_advnorm: bool = True, gamma: float = 0.99, gae_lam: float = 0.95)[source]
Bases: Buffer
Replay buffer for on-policy DRL algorithms.
- Parameters:
observation_space – the observation space of the environment.
action_space – the action space of the environment.
auxiliary_shape – the data shape of auxiliary information, if any.
n_envs – number of parallel environments.
horizon_size – the maximum number of steps stored for one environment.
use_gae – whether to use generalized advantage estimation (GAE).
use_advnorm – whether to use advantage normalization.
gamma – the discount factor.
gae_lam – the GAE lambda parameter.
- property full
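The gamma, gae_lam, and use_advnorm parameters correspond to quantities in generalized advantage estimation. The following is a minimal sketch of the standard GAE recursion for a single environment, followed by an optional normalization step. The function name gae_sketch, its argument names, and the convention that dones[t] marks an episode ending at step t are assumptions for illustration and may differ from what the buffer does internally.

```python
import numpy as np


def gae_sketch(rewards, values, last_value, dones, gamma=0.99, gae_lam=0.95):
    """Hypothetical GAE helper for one environment; not xuance's code.

    delta_t = r_t + gamma * V_{t+1} * (1 - done_t) - V_t
    A_t     = delta_t + gamma * gae_lam * (1 - done_t) * A_{t+1}
    """
    horizon = len(rewards)
    advantages = np.zeros(horizon, dtype=np.float64)
    next_value, next_adv = last_value, 0.0
    # Walk backwards so each step can reuse the advantage of the step after it.
    for t in reversed(range(horizon)):
        not_done = 1.0 - dones[t]
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        next_adv = delta + gamma * gae_lam * not_done * next_adv
        advantages[t] = next_adv
        next_value = values[t]
    returns = advantages + values
    return returns, advantages


rewards = np.array([1.0, 1.0, 1.0])
values = np.zeros(3)
dones = np.zeros(3)
returns, advantages = gae_sketch(rewards, values, last_value=0.0, dones=dones,
                                 gamma=1.0, gae_lam=1.0)

# Advantage normalization: zero mean and unit scale across the stored steps.
normalized = (advantages - advantages.mean()) / (advantages.std() + 1e-8)
```

With gamma and gae_lam both set to 1 and a zero value estimate, the advantages reduce to the undiscounted reward-to-go, which makes the recursion easy to check by hand.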
- class xuance.common.memory_tools.DummyOnPolicyBuffer_Atari(observation_space: gymnasium.Space, action_space: gymnasium.Space, auxiliary_shape: dict | None, n_envs: int, horizon_size: int, use_gae: bool = True, use_advnorm: bool = True, gamma: float = 0.99, gae_lam: float = 0.95)[source]
-
Replay buffer for on-policy DRL algorithms and Atari tasks.
- Parameters:
observation_space – the observation space of the environment.
action_space – the action space of the environment.
auxiliary_shape – the data shape of auxiliary information, if any.
n_envs – number of parallel environments.
horizon_size – the maximum number of steps stored for one environment.
use_gae – whether to use generalized advantage estimation (GAE).
use_advnorm – whether to use advantage normalization.
gamma – the discount factor.
gae_lam – the GAE lambda parameter.
- class xuance.common.memory_tools.PerOffPolicyBuffer(observation_space: gymnasium.Space, action_space: gymnasium.Space, auxiliary_shape: dict | None, n_envs: int, buffer_size: int, batch_size: int, alpha: float = 0.6)[source]
Bases: Buffer
Prioritized replay buffer.
- Parameters:
observation_space – the observation space of the environment.
action_space – the action space of the environment.
auxiliary_shape – the data shape of auxiliary information, if any.
n_envs – number of parallel environments.
buffer_size – the total size of the replay buffer.
batch_size – the number of transitions in each sampled batch.
alpha – the prioritization factor.
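As a rough illustration of what alpha controls, the sketch below assumes the standard prioritized experience replay formulation, in which transition i is sampled with probability proportional to its priority raised to the power alpha. The helper name sampling_probabilities and the example priorities are hypothetical, and a real implementation would typically use a more efficient data structure than a dense probability vector.

```python
import numpy as np


def sampling_probabilities(priorities, alpha=0.6):
    """Hypothetical helper: P(i) = p_i**alpha / sum_k p_k**alpha."""
    scaled = np.asarray(priorities, dtype=np.float64) ** alpha
    return scaled / scaled.sum()


priorities = [1.0, 2.0, 4.0]

# alpha = 1 samples in direct proportion to priority.
proportional = sampling_probabilities(priorities, alpha=1.0)

# alpha = 0 ignores priorities and falls back to uniform sampling.
uniform = sampling_probabilities(priorities, alpha=0.0)

# Draw a batch of transition indices under the default exponent.
rng = np.random.default_rng(0)
probs = sampling_probabilities(priorities)
batch_idx = rng.choice(len(probs), size=4, p=probs)
```

Values of alpha between 0 and 1 interpolate between uniform and fully proportional sampling under this assumption.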
- class xuance.common.memory_tools.RecurrentOffPolicyBuffer(observation_space: gymnasium.Space, action_space: gymnasium.Space, auxiliary_shape: dict | None, n_envs: int, buffer_size: int, batch_size: int, episode_length: int, lookup_length: int)[source]
Bases: Buffer
Replay buffer for DRQN-based algorithms.
- Parameters:
observation_space – the observation space of the environment.
action_space – the action space of the environment.
auxiliary_shape – the data shape of auxiliary information, if any.
n_envs – number of parallel environments.
buffer_size – the size of the replay buffer, which stores episodes of data.
batch_size – the number of transitions in each sampled batch.
episode_length – the data length of one episode.
lookup_length – the length of the history data.
- property full
- class xuance.common.memory_tools.SequentialReplayBuffer(observation_space: gymnasium.Space, action_space: gymnasium.Space, auxiliary_shape: dict | None, n_envs: int, buffer_size: int, batch_size: int)[source]
Bases: Buffer
Sequential replay buffer for DreamerV3.
- Parameters:
observation_space – the observation space of the environment.
action_space – the action space of the environment.
auxiliary_shape – the data shape of auxiliary information, if any.
n_envs – number of parallel environments.
buffer_size – the total size of the replay buffer.
batch_size – the number of transitions in each sampled batch.
- xuance.common.memory_tools.create_memory(shape: tuple | dict | None, n_envs: int, n_size: int, dtype: type = <class 'numpy.float32'>)[source]
Create a NumPy array for memory data.
- Parameters:
shape – the data shape.
n_envs – number of parallel environments.
n_size – the length of the data sequence for each environment.
dtype – the NumPy data type.
- Returns:
An empty memory space to store data (initialized with numpy.zeros()).
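The behavior described above can be sketched as follows. This is a stand-in written for illustration, not the library's source: the name create_memory_sketch, the (n_envs, n_size, *shape) array layout, the handling of a None shape, and the per-key treatment of dict shapes are all assumptions.

```python
import numpy as np


def create_memory_sketch(shape, n_envs, n_size, dtype=np.float32):
    """Hypothetical stand-in for create_memory; the array layout is assumed."""
    if shape is None:
        # Assumption: no shape means there is nothing to allocate.
        return None
    if isinstance(shape, dict):
        # Assumption: one zero-filled array per key; a None value means scalar data.
        return {
            key: create_memory_sketch(tuple(value or ()), n_envs, n_size, dtype)
            for key, value in shape.items()
        }
    if isinstance(shape, tuple):
        return np.zeros((n_envs, n_size) + shape, dtype=dtype)
    raise NotImplementedError(f"unsupported shape type: {type(shape)!r}")


obs_memory = create_memory_sketch((4,), n_envs=2, n_size=8)
aux_memory = create_memory_sketch({"mask": (3,), "flag": None}, n_envs=2, n_size=8)
no_memory = create_memory_sketch(None, n_envs=2, n_size=8)
```

Under these assumptions, obs_memory holds eight steps of 4-dimensional data for each of two environments, and aux_memory holds one array per auxiliary key.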