Why I Read This

  • I saw it on Twitter.
  • I enjoy participating in reinforcement learning competitions.

Short Summary

  • Reinforcement learning (RL) has now reached superhuman performance on most environments in the Arcade Learning Environment, so we need a new benchmark.
  • Obstacle Tower is a new environment that poses challenges in generalization, vision, planning, and control.
  • Agents that use hierarchical RL, intrinsic motivation, meta-learning, or model-based methods will probably perform better than pure baseline algorithms such as Rainbow or PPO.
  • The Obstacle Tower Challenge will begin on February 11th.


  • The Obstacle Tower environment can perhaps be better summarized as a 3D stochastic version of Montezuma’s Revenge with an easy version of Sokoban.
  • The environment is perhaps too difficult: it requires an agent with good exploration and planning, paired with a good convolutional neural network (CNN).
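
To make "good exploration" a bit more concrete, here is a minimal sketch of one simple form of intrinsic motivation: a count-based bonus that pays the agent extra for visiting states it has rarely seen. This is not the method from the paper. The names `CountBonus`, `shaped_reward`, and `state_key` are hypothetical, and the sketch assumes you can map an observation to some hashable key (for example a floor number plus a coarse position), which is itself a hard problem when the input is raw pixels.

```python
import math
from collections import defaultdict


class CountBonus:
    """Count-based exploration bonus: beta / sqrt(N(s)).

    Hypothetical helper for illustration; `state_key` is assumed to be
    any hashable summary of the observation.
    """

    def __init__(self, beta=0.1):
        self.beta = beta
        self.counts = defaultdict(int)  # visit count per state key

    def __call__(self, state_key):
        # Record the visit first, so the first visit yields exactly beta.
        self.counts[state_key] += 1
        return self.beta / math.sqrt(self.counts[state_key])


def shaped_reward(extrinsic, bonus, state_key):
    """Add the exploration bonus to the environment's own reward."""
    return extrinsic + bonus(state_key)
```

The bonus shrinks as a state is revisited, so the agent is nudged toward unexplored rooms even when the environment's own reward is sparse. The shaped reward would then be fed to a baseline such as PPO in place of the raw reward.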

Accompanying Resources

If you want to learn more about the Arcade Learning Environment (ALE), the predecessor of the Obstacle Tower environment, check these links.

If you want to learn more about the Obstacle Tower environment itself, check these links.