The robot didn't get to train *at all* with tied fingers — it had to adapt on the fly.
(Also, humans have a billion-plus years of evolutionary practice behind solving the cube with untied fingers; the robot only gets about 10,000 years of untied practice.)
I could learn it in a year with all my fingers. Then it would probably only take a couple days to adjust my mental model to solve it with two fingers tied.
How many billion years of training did the Deep RL agent need again?