Deep Reinforcement Learning
Value-based:
DQN: Deep Q-Network (DQN).
DuelDQN: Dueling Deep Q-Network (Dueling DQN).
NoisyDQN: DQN with Noisy Layers (Noisy DQN).
C51: Categorical 51 DQN (C51).
DRQN: Deep Recurrent Q-Network (DRQN).
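As a rough illustration of how two of the value-based entries above differ, the sketch below shows the one-step TD target commonly used by DQN and the value/advantage aggregation commonly used by Dueling DQN. The function names `td_target` and `dueling_q_values` are hypothetical and chosen only for this example; they are not taken from any library, and the code is a minimal sketch rather than a definitive implementation.

```python
def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step TD target: r + gamma * max_a' Q(s', a').

    `next_q_values` is a list of Q-value estimates for the next state
    (hypothetical input; in practice it would come from a target network).
    """
    if done:
        # No bootstrapping past a terminal state.
        return reward
    return reward + gamma * max(next_q_values)


def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).

    Subtracting the mean advantage keeps V and A identifiable.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


if __name__ == "__main__":
    print(td_target(1.0, [0.5, 2.0], gamma=0.5))
    print(dueling_q_values(1.0, [1.0, 2.0, 3.0]))
```

The other entries change different parts of this picture: for example, C51 predicts a distribution over returns instead of a single Q-value, and DRQN adds a recurrent layer so the estimate can depend on observation history.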
Policy-based:
PG: Policy Gradient (PG).
A2C: Advantage Actor Critic (A2C).
PPOKL: Proximal Policy Optimization with KL Divergence (PPO-KL).
PPOCLIP: Proximal Policy Optimization with Clipped Objective (PPO-Clip).
PPG: Phasic Policy Gradient (PPG).
SAC: Soft Actor-Critic (SAC).
TD3: Twin Delayed Deep Deterministic Policy Gradient (TD3).
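The two PPO entries above differ mainly in how they keep the updated policy close to the old one. The sketch below contrasts the per-sample surrogate objectives in their commonly published form: PPO-Clip clips the probability ratio, while PPO-KL subtracts a KL penalty. The function names, the default `clip_eps=0.2`, and the default `beta=1.0` are assumptions made for this illustration and do not describe any particular library's API.

```python
def ppo_clip_objective(ratio, advantage, clip_eps=0.2):
    """Clipped surrogate for one sample (to be maximized).

    ratio     : pi_new(a|s) / pi_old(a|s)
    advantage : advantage estimate for the sampled action
    clip_eps  : clipping range (assumed default, not from the source)
    """
    clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    # Taking the minimum gives a pessimistic bound on the improvement.
    return min(ratio * advantage, clipped_ratio * advantage)


def ppo_kl_objective(ratio, advantage, kl, beta=1.0):
    """KL-penalized surrogate for one sample (to be maximized).

    kl   : KL divergence between old and new policy at this state
    beta : penalty coefficient (assumed default, not from the source)
    """
    return ratio * advantage - beta * kl


if __name__ == "__main__":
    print(ppo_clip_objective(1.5, 2.0))
    print(ppo_kl_objective(1.5, 2.0, kl=0.1, beta=2.0))
```

In practice these values would be averaged over a batch and combined with value-function and entropy terms; that part is omitted here to keep the contrast between the two variants visible.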