Statistics and deep learning give conflicting intuitions about how models should behave. It turns out that many deep learning models transition from behaving the way statisticians expect to behaving the way deep learning practitioners expect as scale increases:
A surprising deep learning mystery:
Contrary to conventional wisdom, performance of unregularized CNNs, ResNets, and transformers is non-monotonic: improves, then gets worse, then improves again with increasing model size, data size, or training time.
openai.com/blog/deep-double-…
Dec 5, 2019 · 5:47 PM UTC
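The tweet describes double descent: test performance improves, degrades near the interpolation threshold, then improves again as capacity grows. A minimal sketch of the same shape of experiment, using minimum-norm least squares with a Legendre polynomial basis in plain numpy (the basis choice, sample sizes, and noise level here are illustrative assumptions, not the setup from the OpenAI post):

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    return np.sin(2 * np.pi * x)

# Small noisy training set; the interpolation threshold sits at
# n_features == n_train (here, 15).
n_train = 15
x_train = rng.uniform(-1, 1, n_train)
y_train = target(x_train) + 0.1 * rng.normal(size=n_train)
x_test = np.linspace(-1, 1, 200)
y_test = target(x_test)

def features(x, k):
    # Legendre polynomial basis with k columns
    return np.polynomial.legendre.legvander(x, k - 1)

widths = [2, 5, 10, 15, 20, 40, 80]
test_errors = []
for k in widths:
    # np.linalg.lstsq returns the minimum-norm solution when the
    # system is underdetermined (k > n_train), which is the regime
    # where the second descent can appear.
    w, *_ = np.linalg.lstsq(features(x_train, k), y_train, rcond=None)
    pred = features(x_test, k) @ w
    test_errors.append(float(np.mean((pred - y_test) ** 2)))

for k, e in zip(widths, test_errors):
    print(f"features={k:3d}  test MSE={e:.4f}")
```

Plotting `test_errors` against `widths` for a run like this typically shows the non-monotonic curve the tweet describes, with the worst test error near the interpolation threshold rather than at the largest model.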