Last week, OpenAI showed off an AI system that was able to beat 5 professional gamers at the very popular video game Dota 2. In this episode, I discuss how the system was built and my view on what this means for the field of AI! @gdb @cbd @OpenAI youtube.com/0eO2TSVVP1Y
Cool video! One note: don’t see the “dense sampling” hypothesis as well-supported by the data. Our physical robotic work (another application of our Dota training system) samples from simulators very unlike the real world. We do see lopsided self-play games. Likely reward artifact!
So in game 3, the bots' defense strategy of pushing was the result of their realization that "We're so far behind we can't defend without dying, so safely pushing creep waves right now will maximize total expected reward"? Any plans on making the reward sparser in future versions?
Wouldn’t consider it proven without further evidence, but best hypothesis is that once you realize you can’t get the win reward, you try to drag out the game as much as possible. Preliminary sparse results mentioned here: blog.openai.com/openai-five/

Aug 14, 2018 · 4:06 PM UTC
