@cerenmert14
Collaborator

The current Bespoke generator for STLC Rackcheck generates roughly 50% duplicate or trivial inputs on a sample of 100000 (see the Tyche results below).
[images: Tyche results for the current generator]
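
For reference, a figure like the 50% above can be reproduced offline with a check along these lines. This is a minimal Python sketch, not the Tyche computation: the `is_trivial` predicate and the string samples are hypothetical stand-ins, and a sample counts as a duplicate when an equal value appeared earlier in the run.

```python
def duplicate_or_trivial_rate(samples, is_trivial):
    """Fraction of samples that repeat an earlier sample or are trivial.

    `is_trivial` is a hypothetical predicate; what counts as trivial
    depends on the workload and is not defined in this PR.
    """
    seen = set()
    flagged = 0
    for s in samples:
        if s in seen or is_trivial(s):
            flagged += 1
        seen.add(s)
    return flagged / len(samples)


# Illustrative data only: the second "x" is a duplicate, "unit" is trivial.
samples = ["x", "x", "y", "unit", "z"]
rate = duplicate_or_trivial_rate(samples, lambda s: s == "unit")
```

Samples must be hashable for the `set` lookup, so terms would need to be serialized (for example, printed to strings) first.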

This fix tunes the generator by:

  • making gen:typ sized
  • making backtrack weighted
  • halving the size in recursive calls instead of decrementing it by 1, along with similar small adjustments
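
To illustrate the idea behind these three changes, here is a minimal Python sketch. It is not the Racket code in this PR: the type representation, the weights, and the `backtrack` semantics (try generators in weighted-random order, dropping any that fail) are assumptions made for the example.

```python
import random


def gen_typ(size, rng):
    """Sized type generator. Types are "N" or ("Fun", t1, t2) (hypothetical encoding)."""
    if size <= 0:
        return "N"  # no budget left: only the base type
    # Weighted choice; the weights (1, 3) are illustrative, not the PR's values.
    kind = rng.choices(["base", "fun"], weights=[1, 3])[0]
    if kind == "base":
        return "N"
    # Halve the budget for each child instead of decrementing by 1, so the
    # number of nodes is bounded roughly linearly in `size`, not exponentially.
    half = size // 2
    return ("Fun", gen_typ(half, rng), gen_typ(half, rng))


def depth(t):
    """Nesting depth of a generated type."""
    return 0 if t == "N" else 1 + max(depth(t[1]), depth(t[2]))


def backtrack(weighted_gens, rng):
    """Assumed semantics: pick a generator by weight; if it fails (returns
    None), drop it and retry with the rest. Returns None if all fail."""
    remaining = list(weighted_gens)
    while remaining:
        weights = [w for w, _ in remaining]
        idx = rng.choices(range(len(remaining)), weights=weights)[0]
        _, gen = remaining.pop(idx)
        result = gen()
        if result is not None:
            return result
    return None
```

With halving, a budget of 8 shrinks as 8, 4, 2, 1, 0, so generated types nest at most 4 levels deep; with decrement-by-1 the same budget allows 8 levels and up to 2^8 - 1 arrow nodes.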

Other changes:

  • renamed gen:zero to gen:one to be consistent with other frameworks' STLC case studies
  • turned off one mutant (shift_var_leq) that had been left active
  • changed the number of tests to 500000 (see the note below)

Note: with the current deadline and the current number of tests (4000000), the generator times out at around 675000 tests. This can be addressed either by changing the weights of the generators passed to backtrack (which results in a less even distribution) or by changing the deadline.

Sample results with the updated generator are below:
[images: Tyche results for the updated generator]

@cerenmert14 requested a review from alpaylan on January 21, 2026, 04:50
@alpaylan
Owner

The Tyche charts look good. What I would like to see further down the road:

  • Run the etna tasks with both versions to confirm that the sizing differences actually change bug-finding performance, measured as either number of inputs or time to failure.
  • The Tyche charts use different scales, which makes them hard to compare. Let's see if it's possible to render them with the same bins, time axis, etc.
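
On the scale point, one way to get comparable charts outside of Tyche is to bin both samples against the same edges. A minimal Python sketch, assuming the per-input measurements (for example, term sizes) can be exported as plain lists of numbers; the function names and data are hypothetical and this is not a Tyche feature.

```python
def shared_bin_edges(a, b, n_bins):
    """Equal-width bin edges spanning the combined range of both samples."""
    lo = min(min(a), min(b))
    hi = max(max(a), max(b))
    width = (hi - lo) / n_bins or 1.0  # avoid zero width when all values are equal
    return [lo + i * width for i in range(n_bins + 1)]


def histogram(values, edges):
    """Count values per bin; the last bin includes its right edge."""
    counts = [0] * (len(edges) - 1)
    lo = edges[0]
    width = (edges[-1] - lo) / len(counts)
    for v in values:
        i = min(int((v - lo) / width), len(counts) - 1)
        counts[i] += 1
    return counts


# Illustrative data only, not measurements from this PR.
old_sizes = [1, 2, 2, 3]
new_sizes = [1, 5, 9, 10]
edges = shared_bin_edges(old_sizes, new_sizes, 3)
old_counts = histogram(old_sizes, edges)
new_counts = histogram(new_sizes, edges)
```

Because both histograms share `edges`, their bars line up bin for bin and can be plotted on one axis.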
