Questions on Silifuzz measurement results on CloudLab #3

@Maknee

Hi Silifuzz,

We used Silifuzz to run a large-scale measurement (10K hours in total) on 200 CloudLab machines to understand more about SDC characteristics. The detailed setup is listed below.

Surprisingly, we didn’t observe any SDCs apart from a false positive (which I reported in October).

The Google and Meta papers report “on the order of a few mercurial cores per several thousand machines” and an “SDC occurrence rate of one in thousand silicon devices”.

I’m writing to ask whether you have any insights into our observation. Are the SDCs you observed specific to certain CPU families (like Intel CPUs)? Or would you suggest an even larger-scale testbed than CloudLab?

Our measurement setup is:

  • Unique snapshots: 17,333
  • Total hours spent executing: 10,031
  • Machines: 200
  • CPU: Intel Xeon 4-core E5530 processors (50 machines) and Intel Xeon 8-core E5-2630v3 processors (150 machines)
  • Execution time per machine: ~50 hours
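
As a rough sanity check on these numbers, the sketch below estimates how likely it is to see no defective machine at all in a fleet of 200. It assumes, purely for illustration, that the quoted one-in-a-thousand rate applies per machine and that machines fail independently; neither assumption is confirmed for this hardware, and the function name is made up for this example.

```python
def prob_zero_defective(machines: int, defect_rate: float) -> float:
    """Probability that none of `machines` independent machines is defective.

    Assumes each machine is defective with probability `defect_rate`,
    independently of the others (an illustrative simplification).
    """
    return (1.0 - defect_rate) ** machines


# 200 machines, assumed per-machine defect rate of 1 in 1000.
p_zero = prob_zero_defective(machines=200, defect_rate=1 / 1000)
print(f"P(no defective machine among 200) = {p_zero:.3f}")  # prints 0.819
```

Under these assumptions, finding zero SDCs on 200 machines would be the most likely outcome (about 82%), so the sample size alone might explain the result.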

We made a few changes (https://github.com/xlab-uiuc/SDCBench):

  • edited Silifuzz to collect and organize the generated snapshots, before that feature was added to the repository (Add a script to collect corpus from fuzzing results for real CPU tests #1)
  • edited Silifuzz to run and report results at scale: the framework generates a snapshot to run on all machines, and the machines report their results back to be stored in a database
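
To make the reporting step concrete, here is a minimal sketch of how per-machine results could be stored and queried. It is not the SDCBench code: the SQLite backend, the `results` table, its columns, and all function names are hypothetical and chosen only for illustration.

```python
import sqlite3


def open_results_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a results database with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        " machine TEXT NOT NULL,"
        " snapshot_id TEXT NOT NULL,"
        " mismatch INTEGER NOT NULL,"  # 1 if the snapshot's end state differed
        " PRIMARY KEY (machine, snapshot_id))"
    )
    return conn


def record_result(conn: sqlite3.Connection, machine: str,
                  snapshot_id: str, mismatch: bool) -> None:
    """Store one machine's outcome for one snapshot (latest report wins)."""
    conn.execute(
        "INSERT OR REPLACE INTO results VALUES (?, ?, ?)",
        (machine, snapshot_id, int(mismatch)),
    )
    conn.commit()


def machines_with_mismatches(conn: sqlite3.Connection) -> list[str]:
    """Return the machines that reported at least one mismatching snapshot."""
    rows = conn.execute(
        "SELECT DISTINCT machine FROM results WHERE mismatch = 1 ORDER BY machine"
    )
    return [row[0] for row in rows]


# Example: the same snapshot runs on two machines; one reports a mismatch.
conn = open_results_db()
record_result(conn, "node-001", "snap-a", False)
record_result(conn, "node-002", "snap-a", True)
flagged = machines_with_mismatches(conn)
print(flagged)
```

Keying rows on (machine, snapshot) keeps a re-run of the same snapshot from double-counting, and any machine returned by the query is a candidate for closer inspection.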
