The time-tested way of getting better performance from deep learning — use a bigger model:
1. Double model size
2. Initialize from old parameters
3. 10 additional days of training

Result: 80% win rate versus model that played on-stage at The International vs paiN Gaming.
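(The tweet does not say how the old parameters are mapped into the doubled model. Below is a minimal, hypothetical sketch of one common warm-start trick: tiling a layer's weights into a wider copy plus a little noise. The helper name `widen_linear`, the plain linear layer, and the 2x output-width scheme are illustrative assumptions, not OpenAI's published method.)

```python
import torch
import torch.nn as nn

def widen_linear(old: nn.Linear, width_mult: int = 2) -> nn.Linear:
    """Build a layer with width_mult x the output features, warm-started
    from the old layer by tiling its weights (illustrative only)."""
    new = nn.Linear(old.in_features, old.out_features * width_mult)
    with torch.no_grad():
        w = old.weight.repeat(width_mult, 1)   # shape: (out * mult, in)
        b = old.bias.repeat(width_mult)        # shape: (out * mult,)
        # Small noise breaks symmetry between the tiled copies.
        new.weight.copy_(w + 0.01 * torch.randn_like(w))
        new.bias.copy_(b)
    return new

# Example: warm-start a doubled layer, then continue training it.
old_layer = nn.Linear(128, 256)
big_layer = widen_linear(old_layer)  # 128 -> 512, initialized from old_layer
```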
From a student's perspective: when getting more performance from a model increase like this, one usually worries about overfitting. Is that a problem for you? How do you manage it?
But isn't playing Dota perfectly different from beating generation X perfectly? Wouldn't a validation against humans make sense as a, well, validation? I get the feeling that I'm being simplistic, but thanks for the reply!
Replying to @gwern @pcasaretto
Fun fact: we actually don’t know whether or not the opponent distribution is necessary — with 1v1 we trained entirely against past selves; with 5v5 we made an arbitrary choice to do 80% self-play and 20% past selves. Will be interesting to investigate if it’s needed.

Sep 7, 2018 · 3:52 AM UTC

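(A minimal sketch of the 80/20 opponent mix described in the tweet: 80% of games against the current parameters, 20% against a past self. Uniform sampling over checkpoints, the snapshot cadence, and the `max_kept` cap are assumptions; the tweet only states the split.)

```python
import random

def pick_opponent(current_params, past_checkpoints, p_self=0.8):
    """Return the opponent for the next game: the current learner with
    probability p_self, otherwise a uniformly sampled past checkpoint."""
    if not past_checkpoints or random.random() < p_self:
        return current_params                 # pure self-play game
    return random.choice(past_checkpoints)    # game against a past self

def snapshot(current_params, past_checkpoints, max_kept=100):
    """Periodically store the learner so it can later appear as a past self."""
    past_checkpoints.append(current_params)
    del past_checkpoints[:-max_kept]          # keep only the most recent ones
```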