The time-tested way of getting better performance from deep learning — use a bigger model:
1. Double model size
2. Initialize from old parameters
3. 10 additional days of training

Result: 80% win rate versus model that played on-stage at The International vs paiN Gaming.
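
The thread does not say how the old parameters were carried into the larger model. As an illustration only, here is a minimal sketch of one known approach, function-preserving widening in the style of Net2Net, on a toy two-layer network. The functions `forward` and `double_hidden` and the toy weights are hypothetical and are not OpenAI Five's actual procedure.

```python
import math

def forward(x, w1, b1, w2, b2):
    """Toy MLP on plain lists: tanh(x @ w1 + b1) @ w2 + b2."""
    hidden = [
        math.tanh(sum(xi * w1[i][j] for i, xi in enumerate(x)) + b1[j])
        for j in range(len(b1))
    ]
    return [
        sum(h * w2[j][k] for j, h in enumerate(hidden)) + b2[k]
        for k in range(len(b2))
    ]

def double_hidden(w1, b1, w2):
    """Double the hidden width (assumed scheme, not from the thread).

    Each hidden unit is duplicated, and its outgoing weights are halved,
    so the widened network computes the same function as the old one.
    """
    new_w1 = [row + row for row in w1]              # copy incoming weights
    new_b1 = b1 + b1                                # copy biases
    halved = [[v / 2 for v in row] for row in w2]   # each copy contributes half
    new_w2 = halved + [list(row) for row in halved]
    return new_w1, new_b1, new_w2

# Hypothetical toy weights: 2 inputs, 3 hidden units, 1 output.
w1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.6]]
b1 = [0.0, 0.1, -0.1]
w2 = [[1.0], [-0.5], [0.25]]
b2 = [0.2]

big_w1, big_b1, big_w2 = double_hidden(w1, b1, w2)
old_out = forward([1.0, -2.0], w1, b1, w2, b2)
new_out = forward([1.0, -2.0], big_w1, big_b1, big_w2, b2)
```

Under this scheme the larger model starts out matching the smaller one's outputs up to floating-point rounding, so the additional training can build on existing behavior. In practice a little noise is usually added to the copies so the duplicated units can diverge.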
From a student's perspective: when getting more performance from a model increase like this, one usually worries about overfitting. Is that a problem for you? How do you manage it?
This isn't really true for game playing. You can overfit on how to beat another RL agent and yet perform terribly against humans because the state space distribution overlaps minimally.
This is one of the things I find most remarkable about OpenAI Five: somehow, the bot and human strategy spaces overlap significantly. We know that for minigame environments we needed to add randomizations to cause that to happen. Less certain as the env has gotten harder.

Sep 7, 2018 · 7:45 PM UTC
