From a student's perspective: when getting more performance from a model increase like this, one usually worries about overfitting. Is that a problem for you? How do you manage it?
Fun fact: we actually don’t know whether or not the opponent distribution is necessary — with 1v1 we trained entirely against past selves; with 5v5 we made an arbitrary choice to do 80% self-play and 20% past selves. Will be interesting to investigate if it’s needed.
Sep 7, 2018 · 3:52 AM UTC
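For readers who want to picture the 80/20 split from the reply above, here is a minimal sketch of per-game opponent sampling. It is not the authors' code: the function name `sample_opponent`, the `self_play_prob` parameter, and the uniform choice among past versions are all assumptions made for illustration, since the reply does not say how past selves are selected.

```python
import random


def sample_opponent(current_policy, past_policies, self_play_prob=0.8, rng=random):
    """Pick the opponent for one game (hypothetical helper, not from the source).

    With probability `self_play_prob` the agent plays its current self;
    otherwise it plays a past version. Choosing uniformly among past
    versions is an assumption; the source does not specify this.
    """
    # Fall back to self-play when no past versions have been saved yet.
    if not past_policies or rng.random() < self_play_prob:
        return current_policy
    return rng.choice(past_policies)


if __name__ == "__main__":
    rng = random.Random(0)
    snapshots = ["v1", "v2", "v3"]  # placeholder names for saved past selves
    games = [sample_opponent("current", snapshots, 0.8, rng) for _ in range(10_000)]
    print("fraction of self-play games:", games.count("current") / len(games))
```

Under these assumptions, `self_play_prob=0.8` mirrors the 5v5 choice described in the reply, and `self_play_prob=0.0` mirrors the 1v1 setup of training entirely against past selves.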


