From Richard Sutton (incompleteideas.net/), an essay on the repeated historical finding that computational scale has always beaten cleverness in AI (and some commentary on why this is such a hard-to-accept fact): incompleteideas.net/IncIdeas…

Mar 14, 2019 · 11:02 PM UTC

Replying to @gdb
I agree to some extent, but as I mentioned in person previously, Q-learning is a nontrivial algorithm with a nontrivial framework (MDP and RL), and without Q-learning there certainly wouldn't be DQN and the current explosion in DRL.
Agree!
Replying to @etzioni
Both are important! Look at GPT-2 for instance — that's a general-purpose architectural improvement (i.e. the Transformer) run at massive scale. One interesting point from the essay is that scale gets a bad rap — doing the reverse isn't a good way of fixing the problem!
Replying to @gdb
One needs new tricks. For example, AutoML comes up with better versions of Inception modules (NAS), but it won't come up with attention on its own.
Replying to @gdb
"Essay", really? More a very confused long tweet. It's not just scale that helped improve most of the problems mentioned, but "clever" algorithmic methods (e.g. learning: CNN, RL, ...). To hear that from an expert in ML is depressing.
Replying to @gdb
Hmm, seems the dichotomy is too simplistic. Does having more and higher-quality data go into the “more computation” bucket? Far-fetched.
Replying to @gdb
Related testable question: how much compute would a Deep Blue-type algorithm need to be superhuman at Go? In other words, how many times more efficient is AlphaGo? @metaculus
Replying to @gdb
Is the corollary that advancements in AI will only come from big enterprises with the money to access big computational resources? Or would you say that there is room for individuals and smaller companies?
Replying to @gdb
Sutton's argument is that cleverness (i.e. human knowledge models) scales less well than computational power applied to search and learning. This is just a restatement of the principles of reinforcement learning applied to human knowledge, but it is equally true of machine knowledge.
Replying to @gdb
cc @vnovakovski re: our conversation a few days ago on modeling vs data vs compute, would love to hear your view on this!