A few years ago I designed a way to detect bit-flips in Firefox crash reports and last year we deployed an actual memory tester that runs on user machines after the browser crashes. Today I was looking at the data that comes out of these tests and now I'm 100% positive that the heuristic is sound and a lot of the crashes we see are from users with bad memory or similarly flaky hardware. Here's a few numbers to give you an idea of how large the problem is. 🧵 1/5
What makes Firefox more susceptible to bitflips than any other software? Wouldn’t that mean that 10% of all software crashes are caused by bitflips, and it just depends on what software you are running when that happens?
I don’t think they’re arguing that Firefox is more susceptible to bit flips. They’re trying to say that their software is “solid” enough that a significant number of the reported crashes are due to faulty hardware, which is essentially out of their control.
If other software used the same methodology, you could probably use the numbers to statistically compare how “solid” the code base is between the two programs. For example, if the other software found that 20% of their crashes were caused by bit flips, you could reasonably assume that the other software is built better because a smaller portion of their crashes is within their control.
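To make that comparison concrete, here is a rough sketch of the arithmetic. It assumes (and this is only an assumption, not something from the crash data) that both programs run on the same hardware for the same amount of time, so they see the same number of hardware-induced crashes. The function name and the figure of 10 hardware crashes are made up for illustration.

```python
def software_crashes(hardware_crashes: float, hardware_fraction: float) -> float:
    """Estimate crashes caused by the software itself.

    hardware_crashes: crashes blamed on bit flips (hypothetical count).
    hardware_fraction: share of all reported crashes blamed on bit flips.
    """
    if not 0.0 < hardware_fraction <= 1.0:
        raise ValueError("hardware_fraction must be in (0, 1]")
    # If hardware crashes are this fraction of the total, the total follows.
    total_crashes = hardware_crashes / hardware_fraction
    return total_crashes - hardware_crashes


# Assumed: both programs hit the same 10 hardware-induced crashes.
program_a = software_crashes(10, 0.10)  # 10% of crashes blamed on hardware
program_b = software_crashes(10, 0.20)  # 20% of crashes blamed on hardware
print(program_a, program_b)
```

Under that assumption the program with the 20% share ends up with fewer than half as many software-caused crashes. If the two programs do not see the same hardware fault rate, the percentages alone say nothing about which one is more stable.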
Interesting metrics to measure, but since I have no reference to how many crashes are caused by bitflips in any other software, it’s really hard to say if Firefox is super stable or super flaky.
This checks out with Linus Torvalds saying most OS crashes across Linux and Windows are caused by hardware issues, which is also why he uses ECC RAM.
Honestly, yeah, it 100% checks out.
I have a device with ECC RAM, and I can keep it online with applications running for well over 18 months with no stability issues.
However, both my work computers and my personal computer start to become unstable after about 15 to 20 days, and they degrade over the course of 1 to 2 years (with a considerable increase in the number of corrupt system files).
Firefox and Chrome usually start to become unstable after about a week if they have really high memory usage.
Can confirm, my Linux server with ECC RAM has 1040 days of uptime now without a single issue.
Programs that use more memory could be slightly more susceptible to this sort of thing: if a bit gets randomly flipped somewhere in a computer’s memory, the flip is more likely to land in an application with a large RAM footprint than in one with a small RAM footprint.
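As a back-of-the-envelope sketch of that argument: if a flip hits a uniformly random location in physical memory (a simplifying assumption, real faults tend to cluster in bad cells), the chance it lands in a given process is just that process's share of RAM. The function name and the sizes below are hypothetical.

```python
def hit_probability(footprint_gib: float, total_ram_gib: float) -> float:
    """Chance that one uniformly random bit flip lands in a process's memory."""
    if total_ram_gib <= 0 or not 0 <= footprint_gib <= total_ram_gib:
        raise ValueError("need 0 <= footprint_gib <= total_ram_gib and total_ram_gib > 0")
    return footprint_gib / total_ram_gib


# Hypothetical machine with 16 GiB of RAM.
browser = hit_probability(4.0, 16.0)      # browser holding 4 GiB
small_tool = hit_probability(0.1, 16.0)   # small utility holding about 100 MiB
print(f"browser: {browser:.2%}, small tool: {small_tool:.2%}")
```

Even if the flip lands in the process, it only causes a crash when it corrupts something that is actually read later, so this is an upper bound on the effect rather than a crash probability.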
I’m still surprised the percentage is this high.
Is it high? How does it compare to any other software?