Today’s RL algorithms are great at exploiting a particular environment but terrible at using that knowledge in new situations. Here’s a new environment that is already helping us understand why, and that may help develop RL algorithms that generalize:
We’re releasing CoinRun, an environment generator that provides a metric for an agent’s ability to generalize across new environments - blog.openai.com/quantifying-…
Dec 6, 2018 · 4:44 PM UTC








