working in a big c++ codebase is learning to accept that "yeah that memory just randomly corrupts sometimes?" is a valid resolution to a bug
When you closely analyze crash reports from a product as widely used as Firefox, some of the corruption is clearly hardware.
I've seen crashes where the last few instructions copy a pointer from the stack to a register & dereference it, with the stack & register dumps showing the two copies differ by exactly 1 flipped bit.
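To make the "1 bit flipped" check concrete, here is a minimal sketch of how one might compare the pointer value from the stack dump against the one from the register dump. The function names are made up for illustration, and this is not taken from any Mozilla crash-analysis tooling.

```cpp
#include <cstdint>

// Hypothetical helper: true when the two values differ in exactly one bit.
// XOR leaves a 1 in every differing position; a nonzero value with a single
// set bit satisfies (x & (x - 1)) == 0.
bool differs_by_one_bit(std::uint64_t a, std::uint64_t b) {
    std::uint64_t diff = a ^ b;
    return diff != 0 && (diff & (diff - 1)) == 0;
}

// Hypothetical helper: index of the flipped bit (0 = least significant),
// or -1 when the values do not differ by exactly one bit.
int flipped_bit_index(std::uint64_t a, std::uint64_t b) {
    if (!differs_by_one_bit(a, b)) {
        return -1;
    }
    std::uint64_t diff = a ^ b;
    int index = 0;
    while ((diff & 1) == 0) {
        diff >>= 1;
        ++index;
    }
    return index;
}
```

If the value on the stack and the value in the register should be identical (a plain load with nothing in between) and this check returns true, a hardware fault becomes a plausible suspect, though it is not proof on its own.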
Hard to say if this sort of thing is 5% of crashes or 25% of crashes, but anecdotally (from crashes I've looked at closely) it's nontrivial.
For similar issues in our own infrastructure, see bugzil.la/787281, where I found intermittent CI failures due to bad hardware.
Replying to @davidbaron @Gankro
Of course, now that this infrastructure is all "in the cloud", it's presumably much harder to track down that kind of hardware problem.

Oct 18, 2017 · 1:16 AM UTC

Probably a bigger source of crashes than the memory→register pattern above is bitflips in the binary on disk, like bugzil.la/1272750.
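A rough sketch of how an on-disk bitflip could be spotted, assuming you already have the bytes of a known-good copy of the binary and the bytes of the suspect copy in memory. The type and function names are hypothetical, and this is not how the linked bug was actually diagnosed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one differing bit between two byte buffers.
struct BitDiff {
    std::size_t byte_offset;  // offset of the differing byte
    int bit;                  // bit position within that byte (0 = least significant)
};

// Lists every differing bit over the common prefix of the two buffers.
std::vector<BitDiff> find_bit_diffs(const std::vector<std::uint8_t>& reference,
                                    const std::vector<std::uint8_t>& suspect) {
    std::vector<BitDiff> diffs;
    std::size_t n = reference.size() < suspect.size() ? reference.size()
                                                      : suspect.size();
    for (std::size_t i = 0; i < n; ++i) {
        std::uint8_t x = static_cast<std::uint8_t>(reference[i] ^ suspect[i]);
        for (int bit = 0; bit < 8; ++bit) {
            if (x & (1u << bit)) {
                diffs.push_back(BitDiff{i, bit});
            }
        }
    }
    return diffs;
}

// Assumption for illustration: same length and exactly one differing bit
// is treated as the signature of an isolated bitflip.
bool looks_like_isolated_bitflip(const std::vector<std::uint8_t>& reference,
                                 const std::vector<std::uint8_t>& suspect) {
    return reference.size() == suspect.size() &&
           find_bit_diffs(reference, suspect).size() == 1;
}
```

A single differing bit in an otherwise identical file points toward storage or memory corruption rather than a bad build or a partial update, which would normally change many bytes at once.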
Also (a bit off topic), there are crashes that are actually CPU bugs, like bugzil.la/772330 and bugzil.la/1296630 (PGO was involved in both, too).