For differentiable problems, there’s backpropagation. For everything else, there’s RL.

Jan 31, 2019 · 5:11 PM UTC

Replying to @gdb
Not quite right. A more accurate statement would be "for everything else, there is gradient-free (zeroth-order) optimization." RL is when there is a sequential decision process and what you see depends on previous actions you took.
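For contrast, here is a minimal sketch (my own illustration, not from the thread) of what "gradient-free (zeroth-order) optimization" means in this sense: simple random search that only queries the value of a made-up black-box objective, with no states, no trajectories, and nothing that depends on previous actions.

```python
# Minimal zeroth-order optimization sketch: random search using only function values.
# The objective and hyperparameters are illustrative, not from the thread.
import numpy as np

def objective(x):
    # Stand-in black-box function; only its value is observable, no gradients.
    return -np.sum((x - 3.0) ** 2)

rng = np.random.default_rng(0)
best_x = np.zeros(2)
best_val = objective(best_x)

for _ in range(2000):
    candidate = best_x + 0.3 * rng.standard_normal(best_x.size)
    val = objective(candidate)
    if val > best_val:          # keep the perturbation only if it improves the value
        best_x, best_val = candidate, val

print("best point found:", best_x)  # approaches [3, 3]
```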
We use different definitions of RL. In mine, any problem can be phrased as a one-step MDP (as in arxiv.org/abs/1611.01578), and zeroth-order optimization is a special case. We can debate definitions, but I use mine because algorithms like PPO are doing RL regardless of the MDP used.
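A minimal sketch of that one-step MDP framing (again my own illustration, with a made-up objective and hyperparameters): a Gaussian policy over the parameter vector is treated as a bandit whose single action is the parameter setting, and it is trained with a score-function (REINFORCE-style) gradient estimator, which needs only reward evaluations, so zeroth-order optimization falls out as a special case of the RL machinery.

```python
# One-step MDP sketch: each "episode" is a single action (a parameter vector)
# followed by a reward; the Gaussian policy mean is updated with the
# score-function (REINFORCE) estimator, using only reward evaluations.
import numpy as np

def black_box_reward(x):
    # Stand-in non-differentiable objective (illustrative quadratic).
    return -np.sum((x - 3.0) ** 2)

rng = np.random.default_rng(0)
mean = np.zeros(2)      # policy mean, the parameters being optimized
sigma = 0.5             # fixed exploration noise
lr = 0.05
batch = 64

for step in range(200):
    # Sample actions from the Gaussian policy and observe rewards (one-step episodes).
    actions = mean + sigma * rng.standard_normal((batch, mean.size))
    rewards = np.array([black_box_reward(a) for a in actions])
    # REINFORCE update: E[(R - baseline) * grad_log_pi(a)], grad_log_pi = (a - mean) / sigma^2.
    advantages = rewards - rewards.mean()
    grad = (advantages[:, None] * (actions - mean)).mean(axis=0) / sigma**2
    mean += lr * grad

print("estimated optimum:", mean)  # should approach [3, 3]
```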
Replying to @gdb
Are generative models differentiable?
Replying to @gdb
What
Replying to @gdb
Am I the only one who read this as a MasterCard ad riff?
Replying to @gdb
Imagine a world of complete optimization.
Replying to @gdb
How about proximal methods?
Replying to @gdb
Real life? 😂