ML bugs are so much trickier than bugs in traditional software because rather than getting an error, you get degraded performance (and it's not obvious a priori what ideal performance is). So ML debugging works by continual sanity checking, e.g. comparing to various baselines.

May 14, 2022 · 4:23 PM UTC
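
A minimal sketch of the kind of sanity check described in the tweet above, assuming a classification task: compare the model's accuracy against a trivial majority-class baseline and flag the run if the model does not beat it. The function names, the toy data, and the `margin` parameter are hypothetical and chosen only for illustration; they do not come from the thread.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for y in test_labels if y == most_common_label)
    return hits / len(test_labels)


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    hits = sum(1 for p, y in zip(predictions, labels) if p == y)
    return hits / len(labels)


def sanity_check(model_acc, baseline_acc, margin=0.0):
    """Return True only if the model beats the baseline by more than `margin`."""
    return model_acc > baseline_acc + margin


# Hypothetical toy data: the "model" has silently collapsed to one class.
train_labels = [0, 0, 0, 1, 0, 1, 0, 0]
test_labels = [0, 1, 0, 0, 1, 0]
model_predictions = [0, 0, 0, 0, 0, 0]

baseline_acc = majority_baseline_accuracy(train_labels, test_labels)
model_acc = accuracy(model_predictions, test_labels)

if not sanity_check(model_acc, baseline_acc):
    print("warning: model does not beat the trivial baseline")
```

Nothing here raises an error even though the model is useless, which is the point of the tweet: only the comparison against the baseline exposes the problem.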

Replying to @gdb
Sometimes you don't even realize there's a bug until sufficient time has passed and you've collected enough data to realize your performance is lower. In certain use cases, this time is unfortunately very long.
Replying to @gdb @ykilcher
That's why I am so happy we have self-driving cars now. Debugging happens on the street... This is not against ML, but against putting not-yet-ripe technologies on our streets.
Replying to @gdb
I couldn't agree more
Deep learning is cool because it leads to double bugs. Is it a bug in your code, is it a bug in your idea? 🤷‍♂️
Replying to @gdb
Real AI needs a virtual physical environment. I think the cloud companies just love this misdirected, power-hungry approach. Karl Sims's paper on evolved virtual creatures is the road to emergent GAI, not data science that doesn't know the physical meaning of the input it sees.
Replying to @gdb
To move the discussion one step forward: How do biological brain “bugs” differ from artificial brain bugs?
Replying to @gdb
There are different forms of this depending on the lifecycle of a model: 1) during model development, trying to constantly re-evaluate and sanity-check baselines/edge cases, and 2) during inference, when there are generally sparse labels
Replying to @gdb
ML bugs in the input preparation stage can lead to perf degradation, but model construction and loading don’t tolerate bugs?
Replying to @gdb
I recall hearing you talk about the OpenAI Five team fixing bugs to improve agent performance. Open invitation for you to appear on the TalkRL Podcast; the audience would love to hear from you @gdb
Replying to @gdb
Right! For me the problem is that small ML bugs often still yield results on par with the baselines. Only when they accumulate do you start seeing degradation. It's also hard to know whether your idea is worse than the baseline or whether it’s a bug.
Replying to @gdb
This is why @kolenaIO exists :)