Overview

Boxing is an Atari 2600 video game based on the sport of boxing, designed by Activision programmer Bob Whitehead. The game shows a top-down view of two boxers, one white and one black. When close enough, a boxer can punch his opponent by pressing the fire button on the Atari joystick, causing the opponent to reel back slightly. Long punches score one point, while closer punches (‘power punches’ in the manual) score two. There are no knockdowns or rounds. A match ends either when one player lands 100 punches (a ‘knockout’) or when two minutes have elapsed (a ‘decision’). In the case of a decision, the player who has landed more punches wins. Ties are possible.

While the gameplay is simple, there are subtleties, such as getting an opponent on the ‘ropes’ and ‘juggling’ him back and forth with alternating punches. Boxing was made available on Microsoft’s Game Room service for its Xbox 360 console and for Windows-based PCs on September 1, 2010.

Description from Wikipedia

Performances of RL Agents

We list the scores of various reinforcement learning algorithms tested in this environment, grouped by evaluation protocol: human starts (episodes begin from states sampled from human play), no-op starts (episodes begin after a random number of no-op actions), and normal starts. These results are from RL Database. If this page was helpful, please consider giving a star!

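For context, the scores below are episode returns in the ALE Boxing environment, where the per-step reward is (roughly) the change in the difference between punches landed and punches received, so a perfect knockout corresponds to a return of 100. The following is a minimal interaction sketch, not taken from the original page: it assumes the `gymnasium` and `ale-py` packages with Atari ROMs installed and the `ALE/Boxing-v5` environment id, and uses a random policy as a stand-in for any of the agents listed.

```python
# Minimal sketch: run one episode of Atari Boxing with a random policy.
# Assumes `gymnasium` and `ale-py` with the Atari ROMs are installed,
# e.g. via `pip install "gymnasium[atari,accept-rom-license]"`.
import gymnasium as gym

# Env id assumed; older Gym releases expose ids such as "BoxingNoFrameskip-v4",
# and recent ale-py versions may require registering the ALE envs first.
env = gym.make("ALE/Boxing-v5")
obs, info = env.reset(seed=0)

episode_return = 0.0
terminated = truncated = False
while not (terminated or truncated):
    action = env.action_space.sample()  # random policy in place of a trained agent
    obs, reward, terminated, truncated, info = env.step(action)
    episode_return += reward

print(f"Episode return: {episode_return}")
env.close()
```

The episode return accumulated by this loop is the quantity reported in the tables below.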

Human Starts

| Result | Algorithm | Source |
| --- | --- | --- |
| 79.2 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 77.3 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 74.2 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 73.5 | DDQN (tuned) | Deep Reinforcement Learning with Double Q-learning |
| 72.3 | Prioritized DDQN (rank, tuned) | Prioritized Experience Replay |
| 69.6 | Prioritized DQN (rank) | Prioritized Experience Replay |
| 68.6 | Prioritized DDQN (prop, tuned) | Prioritized Experience Replay |
| 62.1 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 59.8 | A3C FF | Asynchronous Methods for Deep Reinforcement Learning |
| 54.9 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 52.1 | DDQN | Deep Reinforcement Learning with Double Q-learning |
| 37.3 | A3C LSTM | Asynchronous Methods for Deep Reinforcement Learning |
| 33.7 | A3C FF 1 day | Asynchronous Methods for Deep Reinforcement Learning |
| 25.8 | DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 9.6 | Human | Massively Parallel Methods for Deep Reinforcement Learning |
| -1.5 | Random | Massively Parallel Methods for Deep Reinforcement Learning |

No-op Starts

| Result | Algorithm | Source |
| --- | --- | --- |
| 100 | NoisyNet A3C | Noisy Networks for Exploration |
| 100 | NoisyNet DuDQN | Noisy Networks for Exploration |
| 99.96 | IMPALA (deep) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 99.9 | QR-DQN-1 | Distributional Reinforcement Learning with Quantile Regression |
| 99.8 | QR-DQN-0 | Distributional Reinforcement Learning with Quantile Regression |
| 99.8 | IQN | Implicit Quantile Networks for Distributional Reinforcement Learning |
| 99.7 | Reactor ND | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 99.6 | Rainbow | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 99.4 | DuDQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 99.4 | Reactor | The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
| 99 | DuDQN | Noisy Networks for Exploration |
| 98.9 | PDD DQN | Dueling Network Architectures for Deep Reinforcement Learning |
| 98.1 | Distributional DQN | Rainbow: Combining Improvements in Deep Reinforcement Learning |
| 97.8 | C51 | A Distributional Perspective on Reinforcement Learning |
| 96.63 | IMPALA (deep, multitask) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 96.3 | IMPALA (shallow) | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures |
| 94.88 | Gorila DQN | Massively Parallel Methods for Deep Reinforcement Learning |
| 91.6 | DDQN | A Distributional Perspective on Reinforcement Learning |
| 91 | A3C | Noisy Networks for Exploration |
| 89 | NoisyNet DQN | Noisy Networks for Exploration |
| 88.0 | DQN | A Distributional Perspective on Reinforcement Learning |
| 87 | DQN | Noisy Networks for Exploration |
| 81.7 | DDQN | Deep Reinforcement Learning with Double Q-learning |
| 71.8 | DQN | Human-level control through deep reinforcement learning |
| 44 | Linear | Human-level control through deep reinforcement learning |
| 12.1 | Human | Dueling Network Architectures for Deep Reinforcement Learning |
| 9.8 | Contingency | Human-level control through deep reinforcement learning |
| 4.3 | Human | Human-level control through deep reinforcement learning |
| 1.45 | ACKTR | Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
| 0.1 | Random | Human-level control through deep reinforcement learning |

Normal Starts

| Result | Algorithm | Source |
| --- | --- | --- |
| 98.9 | ACER | Proximal Policy Optimization Algorithms |
| 94.6 | PPO | Proximal Policy Optimization Algorithms |
| 17.7 | A2C | Proximal Policy Optimization Algorithms |