Atari Frostbite Environment

Overview

The bottom two thirds of the screen are covered by a mass of water with four rows of ice blocks floating horizontally. The player moves by jumping from one row to another while trying to avoid various kinds of foes including crabs and birds. There are also fish which grant extra points.

On the top of the screen is the shore where the player must build the Igloo. From the fourth level onwards there is also a polar bear walking around on the shore which must be avoided.

The game levels alternate between large ice blocks and little ice pieces. The levels with the little pieces are actually easier, since one can walk left or right over them without falling in the water.

Each time the player jumps on a piece of ice in a row its color changes from white to blue and the player gets an ice block in the Igloo on the shore. The player has the ability to change the direction in which the ice is flowing by pressing the fire button, but that costs a piece of the Igloo.

After the player has jumped on all the pieces on the screen, they all turn back to white and one can jump on them again. When all the 15 ice blocks required for building the Igloo are gathered, the player has to get back to the shore and get inside it, thus proceeding to the next level. On every level the enemies and the ice blocks move slightly faster than in the previous level making the game more difficult.

Each level must be completed in 45 seconds, (represented as the declining temperature,) else the eskimo dies frozen. The faster the level is completed the more bonus points are awarded to the player. If player makes it past level 20, a “magic” fish will appear between the temperature gage and the number of lives remaining, this serves no real purpose other than as an Easter egg to the game.

Description from Wikipedia

Performances of RL Agents

We list various reinforcement learning algorithms that were tested in this environment. These results are from RL Database. If this page was helpful, please consider giving a star!

Star

Human Starts

Result	Algorithm	Source
4202.8	Human	Massively Parallel Methods for Deep Reinforcement Learning
4141.1	Rainbow	Rainbow: Combining Improvements in Deep Reinforcement Learning
4038.4	PDD DQN	Dueling Network Architectures for Deep Reinforcement Learning
3510.0	Prioritized DDQN (rank, tuned)	Prioritized Experience Replay
2930.2	Prioritized DDQN (prop, tuned)	Prioritized Experience Replay
2813.9	Distributional DQN	Rainbow: Combining Improvements in Deep Reinforcement Learning
2332.4	DuDQN	Dueling Network Architectures for Deep Reinforcement Learning
1448.1	DDQN (tuned)	Deep Reinforcement Learning with Double Q-learning
426.6	Gorila DQN	Massively Parallel Methods for Deep Reinforcement Learning
288.7	Prioritized DQN (rank)	Prioritized Experience Replay
258.3	DDQN	Deep Reinforcement Learning with Double Q-learning
197.6	A3C LSTM	Asynchronous Methods for Deep Reinforcement Learning
190.5	A3C FF	Asynchronous Methods for Deep Reinforcement Learning
180.1	A3C FF 1 day	Asynchronous Methods for Deep Reinforcement Learning
157.4	DQN	Massively Parallel Methods for Deep Reinforcement Learning
66.4	Random	Massively Parallel Methods for Deep Reinforcement Learning

No-op Starts

Result	Algorithm	Source
9590.5	Rainbow	Rainbow: Combining Improvements in Deep Reinforcement Learning
8042.1	Reactor	The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
7932.2	Reactor	The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
7413.0	PDD DQN	Dueling Network Architectures for Deep Reinforcement Learning
7136.7	Reactor ND	The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
4839	QR-DQN-0	Distributional Reinforcement Learning with Quantile Regression
4672.8	DuDQN	Dueling Network Architectures for Deep Reinforcement Learning
4384	QR-DQN-1	Distributional Reinforcement Learning with Quantile Regression
4334.7	Human	Dueling Network Architectures for Deep Reinforcement Learning
4334.7	Human	Human-level control through deep reinforcement learning
4324	IQN	Implicit Quantile Networks for Distributional Reinforcement Learning
3965	C51	A Distributional Perspective on Reinforcement Learning
3938.2	Distributional DQN	Rainbow: Combining Improvements in Deep Reinforcement Learning
2923	NoisyNet DuDQN	Noisy Networks for Exploration
2807	DuDQN	Noisy Networks for Exploration
2744.15	IMPALA (deep, multitask)	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
1683.3	DDQN	A Distributional Perspective on Reinforcement Learning
1000	DQN	Noisy Networks for Exploration
797.4	DQN	A Distributional Perspective on Reinforcement Learning
753	NoisyNet DQN	Noisy Networks for Exploration
605.16	Gorila DQN	Massively Parallel Methods for Deep Reinforcement Learning
328.3	DQN	Human-level control through deep reinforcement learning
317.75	IMPALA (deep)	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
288	A3C	Noisy Networks for Exploration
269.65	IMPALA (shallow)	IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
261	NoisyNet A3C	Noisy Networks for Exploration
241.5	DDQN	Deep Reinforcement Learning with Double Q-learning
216.9	Linear	Human-level control through deep reinforcement learning
180.9	Contingency	Human-level control through deep reinforcement learning
65.2	Random	Human-level control through deep reinforcement learning

Normal Starts

Result	Algorithm	Source
2875	DRQN	Deep Recurrent Q-Learning for Partially Observable MDPs
519	DQN Ours	Deep Recurrent Q-Learning for Partially Observable MDPs
436	DQN Ours	Deep Recurrent Q-Learning for Partially Observable MDPs
414	DRQN	Deep Recurrent Q-Learning for Partially Observable MDPs
314.2	PPO	Proximal Policy Optimization Algorithm
285.6	ACER	Proximal Policy Optimization Algorithm
261.8	A2C	Proximal Policy Optimization Algorithm