Today’s RL algorithms are great at exploiting a particular environment, terrible at using that knowledge in new situations. Here’s a new environment which is already helping us understand why, and which may help develop RL algorithms that generalize:
We’re releasing CoinRun, an environment generator that provides a metric for an agent’s ability to generalize across new environments - blog.openai.com/quantifying-…
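For context, "provides a metric" here refers to a train/test split over levels: train on a fixed pool of procedurally generated levels, evaluate on levels the agent has never seen, and treat the gap as the measure of generalization. Below is a minimal sketch of that protocol; the env id and keyword arguments follow the later procgen package's Gym interface and are assumptions for illustration, not details from this thread.

```python
# Minimal sketch of the train/test protocol CoinRun is built for: train on a
# fixed pool of procedurally generated levels, then measure return on levels
# the agent has never seen. Env id and kwargs follow the later `procgen`
# package's Gym interface (an assumption for illustration).
import gym

def make_env(num_levels, start_level):
    # A finite num_levels fixes the training pool; num_levels=0 means
    # "sample from all levels", which serves as the held-out test set.
    return gym.make(
        "procgen:procgen-coinrun-v0",
        num_levels=num_levels,
        start_level=start_level,
        distribution_mode="easy",
    )

def average_return(env, policy, episodes=10):
    """Roll out `policy` (obs -> action) and report the mean episode return."""
    total = 0.0
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            obs, reward, done, _ = env.step(policy(obs))  # classic Gym API
            total += reward
    return total / episodes

train_env = make_env(num_levels=500, start_level=0)   # 500 fixed training levels
test_env = make_env(num_levels=0, start_level=0)      # full level distribution

# Stand-in for a trained agent; any obs -> action callable works here.
policy = lambda obs: train_env.action_space.sample()

gap = average_return(train_env, policy) - average_return(test_env, policy)
print(f"generalization gap (train - test return): {gap:.2f}")
```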
This is our third major attempt in the past two years (Universe, Retro Contest) to develop a platform for RL generalization. Each time, we've made the task easier but more focused on the core generalization challenge. We're already seeing promising results on CoinRun.

Dec 6, 2018 · 4:46 PM UTC

Replying to @gdb
Promising result: standard anti-overfitting tools (dropout, batch norm), which normally don't help in RL, improve generalization in CoinRun. This suggests that most RL benchmarks encourage overfitting.
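To make the regularization point concrete, here is a rough sketch of what adding dropout and batch norm to a convolutional policy network looks like in PyTorch. The layer sizes, dropout rate, and placement are illustrative assumptions, not the architecture used in the CoinRun experiments.

```python
# Illustrative policy/value network with the supervised-learning regularizers
# mentioned above (dropout, batch norm). Shapes assume 64x64 RGB observations,
# as in CoinRun; everything else is an assumption, not the paper's exact setup.
import torch
import torch.nn as nn

class RegularizedPolicy(nn.Module):
    def __init__(self, num_actions, dropout_p=0.1):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4),
            nn.BatchNorm2d(32),        # batch norm after each conv layer
            nn.ReLU(),
            nn.Dropout2d(dropout_p),   # dropout is uncommon in RL, but helps generalization here
            nn.Conv2d(32, 64, kernel_size=4, stride=2),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.Dropout2d(dropout_p),
            nn.Flatten(),
        )
        feat_dim = 64 * 6 * 6          # 64x64 input -> 15x15 -> 6x6 feature map
        self.policy_head = nn.Linear(feat_dim, num_actions)
        self.value_head = nn.Linear(feat_dim, 1)

    def forward(self, obs):
        features = self.encoder(obs)
        return self.policy_head(features), self.value_head(features)

# Example forward pass with a batch of 8 dummy observations.
net = RegularizedPolicy(num_actions=15)  # action count depends on the env
logits, value = net(torch.zeros(8, 3, 64, 64))
```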
Replying to @gdb
Looks interesting! Seems very related to our paper, "Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation": arxiv.org/pdf/1806.10729.pdf @togelius @nojustesen
Replying to @gdb
One of my frustrations with Atari AI game playing is the lack of consideration of generalization and transfer. Hopefully this will draw attention to these issues.