ML bugs are so much trickier than bugs in traditional software because rather than getting an error, you get degraded performance (and it's not obvious a priori what ideal performance is). So ML debugging works by continual sanity checking, e.g. comparing to various baselines.

May 14, 2022 · 4:23 PM UTC
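
As a concrete illustration of the "continual sanity checking against baselines" idea, here is a minimal sketch that compares a model's validation accuracy to a trivial majority-class baseline. The model accuracy value, labels, and margin are hypothetical, not from the tweet:

```python
# Minimal sketch: flag a possible bug if the model barely beats a trivial baseline.
import numpy as np

def majority_class_accuracy(y_val):
    """Accuracy of always predicting the most frequent class."""
    _, counts = np.unique(y_val, return_counts=True)
    return counts.max() / len(y_val)

def sanity_check(model_accuracy, y_val, margin=0.02):
    """Warn when the model's accuracy is suspiciously close to the baseline."""
    baseline = majority_class_accuracy(y_val)
    if model_accuracy < baseline + margin:
        print(f"Suspicious: model {model_accuracy:.3f} vs majority baseline {baseline:.3f}")
    else:
        print(f"OK: model {model_accuracy:.3f} beats majority baseline {baseline:.3f}")

# Example with synthetic labels: 0.71 barely beats a 0.70 majority baseline.
y_val = np.array([0] * 70 + [1] * 30)
sanity_check(model_accuracy=0.71, y_val=y_val)
```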

Replying to @gdb
Though you know you **can** unit-test your ML code, right?
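
For instance, here is a minimal sketch of what unit-testing ML code could look like, assuming a PyTorch setup (the model, shapes, and step count are hypothetical): a shape check plus a check that a few optimization steps can drive the loss down on a single batch.

```python
# Minimal sketch of unit tests for ML code (plain asserts, PyTorch).
import torch
import torch.nn as nn

def test_output_shape():
    model = nn.Linear(10, 3)
    x = torch.randn(8, 10)
    assert model(x).shape == (8, 3)

def test_loss_decreases_on_one_batch():
    torch.manual_seed(0)
    model = nn.Linear(10, 3)
    x, y = torch.randn(32, 10), torch.randint(0, 3, (32,))
    loss_fn = nn.CrossEntropyLoss()
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    initial = loss_fn(model(x), y).item()
    for _ in range(20):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    # A model that cannot overfit one tiny batch usually has a bug somewhere.
    assert loss.item() < initial

if __name__ == "__main__":
    test_output_shape()
    test_loss_decreases_on_one_batch()
    print("all tests passed")
```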
Replying to @gdb
There are also more ways a bug can be introduced, because the data itself is an input, not just the business logic. It's not always possible to sanitize all data (e.g. for continuously/live-trained models).
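
One partial mitigation is validating each incoming batch before it touches the model. A minimal sketch, assuming numeric features and integer labels (all names and thresholds are hypothetical):

```python
# Minimal sketch: reject batches that would silently corrupt a live-training model.
import numpy as np

def validate_batch(features, labels, num_classes, feature_dim):
    assert features.ndim == 2 and features.shape[1] == feature_dim, "unexpected feature shape"
    assert np.isfinite(features).all(), "NaN or inf in features"
    assert features.shape[0] == labels.shape[0], "feature/label count mismatch"
    assert labels.min() >= 0 and labels.max() < num_classes, "label out of range"

# Example: a NaN sneaks in from a live data stream and the batch is rejected.
features = np.random.randn(16, 4)
features[3, 2] = np.nan
labels = np.random.randint(0, 3, size=16)
try:
    validate_batch(features, labels, num_classes=3, feature_dim=4)
except AssertionError as e:
    print(f"batch rejected: {e}")
```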
Replying to @gdb @ykilcher
The kernel (and real-time embedded systems) have some very challenging bugs of this same kind: performance-related, with the ideal behavior not known up front.
Replying to @gdb
This also leads to old datasets being favoured over new ones, for which you don't have any baselines. We overfit our preferred model architectures to popular datasets (which then influences theoretical advances).
Replying to @gdb
Creating the loss function is key.
Replying to @gdb
Our @Modulos_ai #datacentricai platform can tell you which training samples have the most negative effect on your model performance.
Replying to @gdb
Is there a reference I could use that goes into detail for this?
Replying to @gdb
Now try causal inference
Replying to @gdb
Sometimes you get an error as well. And it's from a model that was fine-tuned for months, while the base model on the same settings doesn't show the same error. Then you cry.
Sounds like engineering