Atari Ice Hockey Environment

Overview

Ice Hockey is a game of two-on-two ice hockey. One player on each team is the goalie, and the other plays offense (although the goalie is not confined to the goal). As in the real sport, the object of the game is to take control of the puck and shoot it into the opposing goal to score points. When the puck is in a player's control, it moves left and right along the blade of the hockey stick. The puck can be shot at any of 32 angles, depending on the position of the puck when it’s shot.

Human players take control of the skater in control of (or closest to) the puck. The puck can be stolen from its holder; shots can also be blocked by the blade of the hockey stick.

Description from Wikipedia

Performances of RL Agents

We list various reinforcement learning algorithms that were tested in this environment. These results are from RL Database.

Human Starts

Result	Algorithm	Source
0.5	PDD DQN	Dueling Network Architectures for Deep Reinforcement Learning
0.5	Human	Massively Parallel Methods for Deep Reinforcement Learning
-0.1	Distributional DQN	Rainbow: Combining Improvements in Deep Reinforcement Learning
-0.2	Prioritized DDQN (rank, tuned)	Prioritized Experience Replay
-0.7	Rainbow	Rainbow: Combining Improvements in Deep Reinforcement Learning
-1.0	Prioritized DDQN (prop, tuned)	Prioritized Experience Replay
-1.3	DuDQN	Dueling Network Architectures for Deep Reinforcement Learning
-1.7	A3C LSTM	Asynchronous Methods for Deep Reinforcement Learning
-1.72	Gorila DQN	Massively Parallel Methods for Deep Reinforcement Learning
-2.5	DDQN (tuned)	Deep Reinforcement Learning with Double Q-learning
-2.8	A3C FF	Asynchronous Methods for Deep Reinforcement Learning
-3.6	DDQN	Deep Reinforcement Learning with Double Q-learning
-3.8	Prioritized DQN (rank)	Prioritized Experience Replay
-3.8	DQN	Massively Parallel Methods for Deep Reinforcement Learning
-4.7	A3C FF 1 day	Asynchronous Methods for Deep Reinforcement Learning
-9.7	Random	Massively Parallel Methods for Deep Reinforcement Learning

No-op Starts

Result	Algorithm	Source
15.7	Reactor ND	The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
10.7	Reactor	The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
3.48	IMPALA (deep)	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
3.4	Reactor	The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
3	NoisyNet DuDQN	Noisy Networks for Exploration
1.3	Distributional DQN	Rainbow: Combining Improvements in Deep Reinforcement Learning
1.1	Rainbow	Rainbow: Combining Improvements in Deep Reinforcement Learning
0.9	Human	Dueling Network Architectures for Deep Reinforcement Learning
0.9	Human	Human-level control through deep reinforcement learning
0.5	DuDQN	Dueling Network Architectures for Deep Reinforcement Learning
0.2	IQN	Implicit Quantile Networks for Distributional Reinforcement Learning
0	DuDQN	Noisy Networks for Exploration
-0.4	PDD DQN	Dueling Network Architectures for Deep Reinforcement Learning
-0.61	Gorila DQN	Massively Parallel Methods for Deep Reinforcement Learning
-1.6	DQN	Human-level control through deep reinforcement learning
-1.7	QR-DQN-1	Distributional Reinforcement Learning with Quantile Regression
-1.9	DQN	A Distributional Perspective on Reinforcement Learning
-2	DQN	Noisy Networks for Exploration
-2	A3C	Noisy Networks for Exploration
-2.4	DDQN	Deep Reinforcement Learning with Double Q-learning
-2.7	DDQN	A Distributional Perspective on Reinforcement Learning
-3	NoisyNet DQN	Noisy Networks for Exploration
-3	NoisyNet A3C	Noisy Networks for Exploration
-3.2	Contingency	Human-level control through deep reinforcement learning
-3.5	C51	A Distributional Perspective on Reinforcement Learning
-3.6	QR-DQN-0	Distributional Reinforcement Learning with Quantile Regression
-4.2	ACKTR	Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
-5.25	IMPALA (shallow)	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
-9.5	Linear	Human-level control through deep reinforcement learning
-11.2	Random	Human-level control through deep reinforcement learning
-13.55	IMPALA (deep, multitask)	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
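
Raw Ice Hockey scores are small and often negative, so it can help to rescale them against the Random and Human rows. The sketch below uses the human-normalized score that is common in the Atari literature, (agent - random) / (human - random). That convention is an assumption here and is not stated by this page. The function name `human_normalized` and the choice of rows are illustrative only. The listed papers do not all share one evaluation protocol, so treat the output as a rough reading aid rather than a like-for-like comparison.

```python
def human_normalized(agent: float, random: float, human: float) -> float:
    """Rescale a raw score so that random play maps to 0.0 and human play to 1.0."""
    if human == random:
        raise ValueError("human and random baselines must differ")
    return (agent - random) / (human - random)


# Baselines taken from the no-op starts table above
# (the Random and Human rows sourced from "Human-level control ...").
RANDOM_NOOP = -11.2
HUMAN_NOOP = 0.9

# A few rows copied from the same table; which rows to include is arbitrary.
noop_scores = {
    "Reactor ND": 15.7,
    "Rainbow": 1.1,
    "DQN": -1.6,
    "Linear": -9.5,
}

normalized = {
    name: human_normalized(score, RANDOM_NOOP, HUMAN_NOOP)
    for name, score in noop_scores.items()
}

for name, value in normalized.items():
    # Values above 100% mean the agent outscored the human baseline.
    print(f"{name}: {value:.1%}")
```

Under this convention an agent scoring exactly the Human row's 0.9 would sit at 100%, and one matching the Random row's -11.2 would sit at 0%.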

Normal Starts

Result	Algorithm	Source
-3.5	DQN Ours	Deep Recurrent Q-Learning for Partially Observable MDPs
-4.2	DQN Ours	Deep Recurrent Q-Learning for Partially Observable MDPs
-4.2	PPO	Proximal Policy Optimization Algorithms
-4.4	DRQN	Deep Recurrent Q-Learning for Partially Observable MDPs
-5.4	DRQN	Deep Recurrent Q-Learning for Partially Observable MDPs
-5.9	ACER	Proximal Policy Optimization Algorithms
-6.4	A2C	Proximal Policy Optimization Algorithms