You make what you measure. Procgen Benchmark lets you directly measure how well and how quickly an RL agent learns generalizable skills.
We've found that when trained on fewer than ~500-1000 levels, today's algorithms memorize those levels rather than learn something general.
We're releasing Procgen Benchmark, 16 procedurally-generated environments for measuring how quickly a reinforcement learning agent learns generalizable skills.
This has become the standard research platform used by the OpenAI RL team: openai.com/blog/procgen-benc…
Dec 3, 2019 · 5:19 PM UTC

