Overview
In the game, the player controls a scuba diver who must protect a treasure from an octopus at the top of the screen. The octopus tries to capture the treasure with its tentacles. Meanwhile, a great white shark tries to distract the diver by swimming back and forth toward the bottom of the screen.
The diver loses a life if he is captured by the shark or the octopus’s tentacles, or if the air meter runs out. The diver can refill his air meter by touching a long pole which extends from a boat that appears from time to time.
Description from Wikipedia
Performances of RL Agents
We list various reinforcement learning algorithms that were tested in this environment. These results are from RL Database.
Normal Starts
Result | Algorithm | Source |
---|---|---|
8488.0 | ACER | Proximal Policy Optimization Algorithms |
6254.9 | PPO | Proximal Policy Optimization Algorithms |
5961.2 | A2C | Proximal Policy Optimization Algorithms |
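
To make the gaps between these scores easier to read, the following minimal sketch ranks the three algorithms and expresses each score as a percentage change relative to A2C. The scores are copied from the table above. The helper name `relative_gap` and the choice of A2C as the baseline are illustrative assumptions, not part of RL Database.

```python
# Scores copied from the results table above.
scores = {"ACER": 8488.0, "PPO": 6254.9, "A2C": 5961.2}


def relative_gap(scores, baseline):
    """Return each score as a percentage change relative to `baseline`.

    `relative_gap` is a hypothetical helper written for this example.
    """
    base = scores[baseline]
    return {name: 100.0 * (value - base) / base for name, value in scores.items()}


# Sort algorithm names from highest to lowest score.
ranking = sorted(scores, key=scores.get, reverse=True)

# A2C is chosen as the baseline only because it has the lowest score here.
gaps = relative_gap(scores, "A2C")

for name in ranking:
    print(f"{name}: {scores[name]:.1f} ({gaps[name]:+.1f}% vs A2C)")
```

These are single reported numbers per algorithm, so the percentages describe the table only and say nothing about variance across runs.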