Wouldn’t consider it proven without further evidence, but best hypothesis is that once you realize you can’t get the win reward, you try to drag out the game as much as possible. Preliminary sparse results mentioned here: blog.openai.com/openai-five/
Cool video!
One note: don’t see the “dense sampling” hypothesis as well-supported by the data. Our physical robotics work (another application of our Dota training system) samples from simulators very unlike the real world. We do see lopsided self-play games. Likely a reward artifact!
How did @OpenAI's team of 5 neural networks manage to beat some of the best Dota 2 players in the world? I'll explain their algorithmic techniques in detail in this video. Amazing work! youtube.com/DzzFSyzv1p0
(Note that during training it just plays randomized lineups, so it’s not likely it’s played many games with all the lineups — instead it must generalize from what it has experienced.)
Worth noting: reality is *not* in the distribution of randomized simulations. Currently definitely need a simulator that captures important aspects of your problem, but we have non-trivial ability to generalize.
Has been really cool to see how excited & welcoming the Dota community has been about our project — on Sunday, /r/Dota2 front page was mostly OpenAI threads. Really motivating for the entire team. See you at TI ;)!
Had a stable fork of our codebase for Benchmark; now merging everything back into master for The International in what the team is affectionately calling the "Merge of Doom".
An analysis of the Benchmark games and draft: nitter.vloup.ch/SlashStrikeDotA/…, and what human players can learn from Five's gameplay. (Interestingly, playing with "confidence" is the first lesson.)
Posted a new video on the @OpenAI vs team humans match youtube.com/watch?v=eDSh2Hjw… explaining drafts and why the bots won!
Also, best comment on the video gets a free replay analysis! 🏆🙂
Two weeks until OpenAI Five faces Dota professionals at The International! To get a sense of the life of a pro, read venturebeat.com/2017/02/12/d…. Truly the peak of achievable human performance in this domain.
I'm super excited to share that I'll soon be joining the policy team at @OpenAI! I'll be working on a range of policy issues related to OAI's vital mission of "discovering and enacting the path to safe artificial general intelligence."