Hi Silifuzz,
We used Silifuzz to run a large-scale measurement (10K hours in total) on 200 CloudLab machines, to understand more about SDC characteristics. The detailed setups are attached below.
Surprisingly, we did not observe any SDCs, apart from one false positive (which I reported in October).
The Google and Meta papers report “on the order of a few mercurial cores per several thousand machines” and an “SDC occurrence rate of one in thousand silicon devices”, respectively.
I’m writing to ask whether you have any insights into our observation. Are the SDCs you observed specific to certain CPU families (such as Intel CPUs)? Or would you suggest an even larger-scale testbed than CloudLab?
Our measurement setups are:
- Unique snapshots: 17,333
- Total hours spent executing: 10,031
- Machines: 200
- CPU: 4-core Intel Xeon E5530 (50 machines) and 8-core Intel Xeon E5-2630v3 (150 machines)
- Total execution time per machine: ~50 hours
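
As a rough sanity check on the numbers above, the sketch below estimates how likely it is to see no affected machine at all in a sample of this size. It assumes (our simplification, not something stated in either paper) that each machine is independently affected with the Meta figure of one in a thousand, and that an affected machine would always be detected. The function name `prob_no_defective_machine` is just illustrative.

```python
def prob_no_defective_machine(machines: int, prevalence: float) -> float:
    """P(zero affected machines), assuming each machine is independently
    affected with probability `prevalence` (a simplifying assumption)."""
    return (1.0 - prevalence) ** machines


machines = 200           # from the setup above
prevalence = 1 / 1000    # assumption: the quoted per-device rate applied per machine

expected = machines * prevalence
p_zero = prob_no_defective_machine(machines, prevalence)

print(f"expected affected machines: {expected:.2f}")    # 0.20
print(f"P(no affected machine in sample): {p_zero:.3f}")  # 0.819
```

Under these assumptions, a sample of 200 machines would contain no affected machine about 82% of the time, and imperfect detection would push that number higher. So a null result at this scale may not contradict the reported rates.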
We made a few changes to Silifuzz (https://github.com/xlab-uiuc/SDCBench):
- edited Silifuzz to collect and organize the generated snapshots, before that feature was added to the repository (Add a script to collect corpus from fuzzing results for real CPU tests #1)
- edited Silifuzz to run and report results at scale. The framework generates a snapshot to run on all machines, and the machines report their results back to be stored in a database.
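
To illustrate the reporting flow described in the last item, here is a minimal sketch of storing per-machine results and querying for machines that disagree on a snapshot. It is not the SDCBench implementation: the SQLite backend, the `results` table, the `outcome` values, and all function names are hypothetical and chosen only for illustration.

```python
import sqlite3


def open_results_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) results table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS results (
               machine     TEXT NOT NULL,
               snapshot_id TEXT NOT NULL,
               outcome     TEXT NOT NULL,
               PRIMARY KEY (machine, snapshot_id)
           )"""
    )
    return conn


def record_result(conn: sqlite3.Connection, machine: str,
                  snapshot_id: str, outcome: str) -> None:
    """Store one machine's report for one snapshot (latest report wins)."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO results VALUES (?, ?, ?)",
            (machine, snapshot_id, outcome),
        )


def machines_with_mismatch(conn: sqlite3.Connection, snapshot_id: str) -> list[str]:
    """Return machines whose outcome for the snapshot is anything but 'ok'."""
    rows = conn.execute(
        "SELECT machine FROM results "
        "WHERE snapshot_id = ? AND outcome != 'ok' ORDER BY machine",
        (snapshot_id,),
    )
    return [row[0] for row in rows]


# Made-up reports for a single snapshot, purely to exercise the functions.
conn = open_results_db()
for machine, outcome in [("node-a", "ok"), ("node-b", "mismatch"), ("node-c", "ok")]:
    record_result(conn, machine, "snap-0001", outcome)

flagged = machines_with_mismatch(conn, "snap-0001")
print(flagged)
```

Any machine flagged this way would be a candidate for re-running the same snapshot, which helps separate a reproducible SDC from a false positive.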