stdlib: add property based testing by saem · Pull Request #1559 · nim-works/nimskull

saem · 2025-07-05T22:07:18Z

Summary

Introduces experimental property based testing support via the property_testing module in the stdlib, available under C and JS backends.

Details

Creates an integrated-shrinking approach to property based testing, based on the work in the Python Hypothesis testing library by Dr David R. MacIver.

The key benefits of this approach are:

- generators/types do not have to specify shrinking strategies; instead, a unified set of strategies is applied to the backing random byte stream
- generator authors automatically get a very good shrinker with no effort on their part
- arbitrarily complex generators, those composed of many generators, automatically have shrinking support as well
- it's very fast

The library ships with a number of basic generators for scalar primitives such as numerics, booleans, characters, and the like, as well as for vector-like types such as arrays, sequences, sets, and so on. There are also generators for tuples and procedural values (which must be closures). In addition to these types, a variety of combinators (map, filter, etc.) are provided to allow composition of generators. These provide the foundational primitives to compose arbitrarily complex generation of data types that a user might want.
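To make the idea of shrinking the backing byte stream concrete, here is a minimal toy sketch. It is not the PR's implementation: `Source`, `draw`, `genPair`, `fails`, and `shrink` are all hypothetical names, and the shrinker is a naive greedy one. The point it illustrates is that the generator never mentions shrinking; only the bytes it consumed are made smaller.

```nim
# Toy sketch of integrated shrinking; every name here is hypothetical.
type
  Source = object        # replays a fixed byte buffer
    data: seq[byte]
    pos: int

proc draw(s: var Source): byte =
  ## Returns the next byte, or 0 once the buffer is exhausted.
  if s.pos < s.data.len:
    result = s.data[s.pos]
  inc s.pos

proc genPair(s: var Source): (int, int) =
  ## A composed generator; it knows nothing about shrinking.
  let a = int(s.draw())
  let b = int(s.draw())
  (a, b)

proc fails(buf: seq[byte]): bool =
  ## The property under test is "a + b < 100"; true when it is violated.
  var s = Source(data: buf, pos: 0)
  let (a, b) = genPair(s)
  a + b >= 100

proc shrink(buf: seq[byte]): seq[byte] =
  ## Greedily lowers each byte while the property keeps failing.
  result = buf
  var improved = true
  while improved:
    improved = false
    for i in 0 ..< result.len:
      while result[i] > 0:
        var candidate = result
        dec candidate[i]
        if fails(candidate):
          result = candidate
          improved = true
        else:
          break

# A failing input (200, 150) shrinks to the smallest failing pair (0, 100).
let minimal = shrink(@[200'u8, 150'u8])
```

Because the shrinker only sees bytes, any generator built by composing `draw` calls would get the same treatment without extra code.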

Notes for Reviewers

write a proper commit message
Disclosure: some of this code was written with the assistance of an LLM (i.e., Gemini)

saem · 2025-08-30T23:56:53Z

The output from the current tests looks as follows, each property has an id that should be useable to run it specifically in the future:

nim

1 uint32
- are >= 0 (id: 1) - status: success, totalRuns: 1000
- within the range[100000000, 4294967295] (id: 2) - status: success, totalRuns: 1000

1.1 enums
- are typically ordinals (id: 3) - status: success, totalRuns: 1000

1.2 characters
        1.2.1 are ordinals
        - forming a bijection with int values between 0..255 (inclusive) (id: 4) - status: success, totalRuns: 1000
        - have successors and predecessors or are at the end range (id: 5) - status: success, totalRuns: 1000

- ascii - are from 0 to 127 (id: 6) - status: success, totalRuns: 1000

1.3 strings
- concatenation - len is == the sum of the len of the parts (id: 7) - status: success, totalRuns: 1000

1.4 sets
- cannot contain more items than the enum itself (id: 8) - status: success, totalRuns: 1000
        1.4.1 union
        - a union of sets contain all elements of each (id: 9) - status: success, totalRuns: 1000
        - union is commutative (id: 10) - status: success, totalRuns: 1000

        1.4.2 intersection
        - an intersection is a subset of both operands (id: 11) - status: success, totalRuns: 1000
        - intersection is commutative (id: 12) - status: success, totalRuns: 1000

        1.4.3 difference (or relative complement)
        - a difference has no overlap with the second operand (id: 13) - status: success, totalRuns: 1000
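For readers unfamiliar with the kind of check listed above, the following is a self-contained sketch of one of them ("union is commutative"). It deliberately does not use the `property_testing` module, whose API is not shown in this thread; `Color`, `randomSet`, and `unionIsCommutative` are hypothetical, and plain `std/random` stands in for the library's generators.

```nim
import std/random

type Color = enum red, green, blue, yellow   # hypothetical enum for the example

proc randomSet(rng: var Rand): set[Color] =
  ## Includes each enum member with probability 1/2.
  for c in low(Color) .. high(Color):
    if rng.rand(1) == 1:
      result.incl c

proc unionIsCommutative(runs: int): bool =
  ## Checks `a + b == b + a` for `runs` random pairs of sets.
  var rng = initRand(42)   # fixed seed so a failure can be replayed
  for _ in 1 .. runs:
    let a = randomSet(rng)
    let b = randomSet(rng)
    if a + b != b + a:
      return false
  true

doAssert unionIsCommutative(1000)
```

Unlike the library described in this PR, this sketch has no shrinking: on failure it would only report that some pair failed, not a minimal one.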

saem · 2026-03-25T00:31:22Z

There are places for improvement but I'm not sure exactly where to take it without more experience with the library.

So with that said I think this is ready to be reviewed and merged.

saem · 2026-03-26T05:05:49Z

I'm heavily reworking the tests:

splitting them up into more test suites:
- public api/how to use them
- core primitives, both for generator authors, but also further library/application builders
- documentation of various use cases, such as generator composition, and also pushing potential exhaustion
- multi-overlapping testing of shrinking, candidate selection, and ast parsing (structured byte generation)
expanding their coverage
removing redundant code, especially around exhaustive generator tests

The above found some bugs, so I'll flush those out as well.

saem · 2026-03-28T17:11:01Z

+  for (pos, kind) in nodes:
+    if kind in {skArray8, skArray16, skArray32}:
+      let lenBytes =
+        case kind
+        of skArray8:  1
+        of skArray16: 2
+        of skArray32: 4
+        else:         unreachable("Invalid kind: " & $kind)


Suggested change

-  for (pos, kind) in nodes:
-    if kind in {skArray8, skArray16, skArray32}:
-      let lenBytes =
-        case kind
-        of skArray8:  1
-        of skArray16: 2
-        of skArray32: 4
-        else:         unreachable("Invalid kind: " & $kind)
+  for (pos, kind) in nodes:
+    if kind == skArray8 or kind == skArray16 or kind == skArray32:
+      let lenBytes =
+        case kind
+        of skArray8:  1
+        of skArray16: 2
+        of skArray32: 4
+        else:         unreachable("Invalid kind: " & $kind)

Update: the partially good news is that the error is actually intermittent, so some other memory corruption is happening... yay. I still don't know where, though.

I'm currently hitting the unreachable unless I change the code to the above, and I must be blind because I don't see why that's happening.

The cgen seems fine:

cgen for or variant:

NU8* kind;
FR_.line = (NI64)898;
FR_.filename = "property_testing.nim";
kind = (&_107->Field1);
NIM_BOOL _108;
_108 = NIM_FALSE;
NIM_BOOL _109;
_109 = NIM_FALSE;
FR_.line = (NI64)899;
_109 = ((*kind) == (NU8)9);
NIM_BOOL _111;
_111 = !_109;
if (_111) {
  _109 = ((*kind) == (NU8)10);
}
_108 = _109;
NIM_BOOL _113;
_113 = !_108;
if (_113) {
  _108 = ((*kind) == (NU8)11);
}
if (_108) {

cgen for in variant:

kind = (&_107->Field1);
NIM_BOOL _109;
FR_.line = (NI64)899;
_109 = !(((NU16)3584 & ((NU16)1 << ((NU16)(*kind)))) == (NU16)0);
if (_109) {
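As a sanity check on the two generated forms, the constant 3584 in the `in` variant is the bitmask with bits 9, 10, and 11 set, which matches the ordinals 9, 10, and 11 compared in the `or` variant. The sketch below checks that all three formulations agree. It uses plain `uint8` constants as hypothetical stand-ins for the real enum, so it says nothing about where the corruption comes from, only that the two code paths are logically equivalent for in-range values.

```nim
# Hypothetical stand-ins: only the ordinals 9, 10, and 11 are taken from the cgen.
const
  skOtherSketch = 0'u8
  skArray8Sketch = 9'u8
  skArray16Sketch = 10'u8
  skArray32Sketch = 11'u8

const arrayMask = (1'u16 shl 9) or (1'u16 shl 10) or (1'u16 shl 11)

proc viaSet(k: uint8): bool =
  k in {skArray8Sketch, skArray16Sketch, skArray32Sketch}

proc viaOr(k: uint8): bool =
  k == skArray8Sketch or k == skArray16Sketch or k == skArray32Sketch

proc viaMask(k: uint8): bool =
  ## Mirrors the generated C; only meaningful for k < 16.
  (arrayMask and (1'u16 shl int(k))) != 0

doAssert arrayMask == 3584'u16
for k in [skOtherSketch, skArray8Sketch, skArray16Sketch, skArray32Sketch]:
  doAssert viaSet(k) == viaOr(k)
  doAssert viaSet(k) == viaMask(k)
```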

zerbina

I did a first review pass, focusing on the things that can cause behaviour that looks like memory corruption.

zerbina · 2026-03-28T20:07:59Z

+        let rMax = decodeUint64(buffer, pos, sBytes)
+        pos += sBytes
+        let rangeSize = if rMax > rMin: rMax - rMin else: 0'u64
+        pos += bytesForRange(rangeSize)


Suggested change

-        pos += bytesForRange(rangeSize)
+        pos += bytesForRange(rMax - rMin)

If rMax < rMin (happens for signed integer ranges crossing zero), pos wouldn't be advanced properly when bytesForRange(rMax - rMin) != 1.
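A small worked example may help show why the clamp is harmful. It assumes the bounds of a signed range are stored as their raw two's-complement bits in a `uint64`, and it uses `bytesForRangeSketch`, a hypothetical stand-in for the real `bytesForRange` that picks the smallest of 1, 2, 4, or 8 bytes. Under those assumptions, the unsigned subtraction wraps to the true range size, while the clamped version collapses to zero and yields a different byte count.

```nim
proc bytesForRangeSketch(size: uint64): int =
  ## Hypothetical stand-in: smallest of 1, 2, 4, or 8 bytes that can hold `size`.
  if size <= 0xFF'u64: 1
  elif size <= 0xFFFF'u64: 2
  elif size <= 0xFFFF_FFFF'u64: 4
  else: 8

let
  rMin = cast[uint64](-1000'i64)  # signed lower bound stored as raw bits
  rMax = cast[uint64](1000'i64)   # signed upper bound stored as raw bits

# The raw bit pattern of a negative bound is a huge unsigned value.
doAssert rMax < rMin

let clamped = if rMax > rMin: rMax - rMin else: 0'u64
let wrapped = rMax - rMin         # unsigned subtraction wraps modulo 2^64

doAssert wrapped == 2000'u64
doAssert bytesForRangeSketch(clamped) == 1
doAssert bytesForRangeSketch(wrapped) == 2
```

With the clamp, a reader would skip one byte where the writer emitted two, so every node after it would be misread.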

zerbina · 2026-03-28T20:08:08Z

+            rMin = decodeUint64(buffer, pos + 2, sBytes)
+            rMax = decodeUint64(buffer, pos + 2 + sBytes, sBytes)
+            rangeSize = if rMax > rMin: rMax - rMin else: 0'u64
+            vBytes = bytesForRange(rangeSize)


Suggested change

-            vBytes = bytesForRange(rangeSize)
+            vBytes = bytesForRange(rMax - rMin)

Same as above.

I think with the conversion I have in place this should no longer be an issue.

It doesn't cause any issues anymore, yea, but the rMax > rMin check is still unnecessary.

zerbina · 2026-03-28T20:09:23Z

+    if kind in {skByte, sk2Bytes, sk4Bytes, sk8Bytes}:
+      let sBytes = getScalarBytes(kind)
+      if pos + 1 + sBytes <= buffer.len:
+        let


Since the scalar reduction is identical to the one used by range reduction (and also has the same issue with yielding some candidates twice), could you factor the code out into a template?

I've pulled out the core of it into numberShrinker, I'll see if I can rework the code to make them more similar to pull a bit more in tomorrow. 🤞🏽
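One way a shared number shrinker could avoid yielding a candidate twice is sketched below. This is not the PR's `numberShrinker`; `shrinkCandidates` and `candidatesOf` are hypothetical names, and the strategy (zero first, then halving the distance to the original value) is just one common choice. Since the gap strictly decreases on every step, each candidate is strictly larger than the previous one and no value repeats.

```nim
iterator shrinkCandidates(value: uint64): uint64 =
  ## Yields values smaller than `value`, simplest first, each at most once.
  if value > 0:
    yield 0'u64
    var gap = value div 2
    # Walk toward `value`, halving the remaining distance each step.
    while gap > 0:
      yield value - gap
      gap = gap div 2

proc candidatesOf(value: uint64): seq[uint64] =
  ## Collects the iterator's output so it can be inspected.
  for c in shrinkCandidates(value):
    result.add c

doAssert candidatesOf(10) == @[0'u64, 5'u64, 8'u64, 9'u64]
```

The same iterator could serve both an unbounded scalar and a range, by shrinking the offset from the range minimum rather than the value itself.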

saem · 2026-03-28T20:39:54Z

Note to self: drop bool strings, they're not implemented and would be annoying to deal with, and the savings are questionable

saem · 2026-03-29T06:53:25Z

+
+# MARK: Buffer Tree Tools
+
+proc treeRepr*(buffer: seq[byte]): string =


Note to self: debug why this, and/or collectNodes, is not printing the tree properly in the failing test

In case you didn't get to debug it yet, this is because the buffer doesn't start with an array or group node, meaning that collectNodes only yields the node itself (a range).

I got that visualized, I'm just trying to sort out where it's happening now:

# what collect nodes sees
@[(0, skRange)]
# the buffer; note `5` corresponds to `skRange` and it's missing group
@[5, 2, 1, 0, 0, 0, 10, 0, 0, 0, 1, 6, 2, 9, 2, 5, 3, 0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 255, 255, 255, 255, 255, 255, 0, 0, 0, 0, 0, 0, 0, 0, 5, 2, 1, 0, 0, 0, 100, 0, 0, 0, 0, 6, 0, 9, 2, 5, 3, 0, 0, 0, 0, 0, 0, 0, 0, 255, 255, 255, 255, 255, 255, 255, 255, 0, 0, 0, 0, 0, 0, 0, 0, 5, 2, 1, 0, 0, 0, 100, 0, 0, 0, 0, 6, 1, 5, 0, 0, 255, 65]
# what my repr ends up outputting, because of what collect sees
skRange (kind: sk4Bytes, min: 1, max: 10, val: 1)

The structure comes to be because in genSeq, there's a chooseRange call (for computing the length to use) before beginArray, adding a superfluous node to the tree.

The best way I can think of to address this is to add two more fields to the skArrayX nodes, one for the minimum and one for the maximum length. This would also fix shrinking being able to produce candidates violating the minimum length specification.

I've started reworking the API: I collapsed all arrays into one kind, skArray, and made the length an skRange. The beginArray API now takes a min and a max, so the range should be created in the correct position; it always writes the array kind first and returns the chosen length. This has the side effect of dropping a lot of recording-state checks from various callers.
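The shape of such an API might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Recorder` type, the tag value, the one-byte widths, and the `pick` parameter (which stands in for a random draw) are hypothetical and not taken from the PR. What it shows is the ordering described above: the array tag is written first, followed by the bounds and the chosen offset, and the chosen length is returned.

```nim
const skArraySketch = 9'u8    # hypothetical tag value

type Recorder = object        # hypothetical recording buffer
  buffer: seq[byte]

proc beginArraySketch(r: var Recorder; minLen, maxLen, pick: uint8): uint8 =
  ## Writes the array tag first, then min, max, and the chosen offset.
  ## `pick` stands in for a random draw in this sketch.
  assert minLen <= maxLen
  let span = int(maxLen) - int(minLen)
  let offset = uint8(int(pick) mod (span + 1))
  r.buffer.add skArraySketch
  r.buffer.add minLen
  r.buffer.add maxLen
  r.buffer.add offset
  result = minLen + offset

var rec = Recorder()
let chosenLen = rec.beginArraySketch(2, 5, 7)
doAssert chosenLen == 5
doAssert rec.buffer == @[9'u8, 2'u8, 5'u8, 3'u8]
```

Because the minimum travels with the node, a shrinker that lowers the offset to zero lands on `minLen` rather than on a length below it.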

Now I gotta debug various bits and bobs, such as minimum length violations and whatever bugs lie in my rework.

Might be better to keep the various byte width arrays but they're easy enough to introduce after, I believe. It would save me a range kind byte.

@zerbina

saem · 2026-03-30T15:57:48Z

I'm going to be pushing a lot of changes that are incremental as I rework the internals, so I've marked it as draft for now so as to not burn CI resources unnecessarily.

This test is the next failure to fix:

> candidates yields structurally valid byte sequences (Meta-test)

It shows that int -> uint range encoding is busted; see the `treeRepr` output (confirmed by mentally converting the byte buffer as well):

> skRange (kind: sk8Bytes, min: 9223372036854774808, max: 9223372036854776808, offset: 1741, value: 9223372036854776549)

code is still ugly:
- need to create a single range/array unpack template
- array shouldn't have an skrange following it, just the kind, min, max, and offset
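For reference when reading numbers like these, a common order-preserving mapping between `int64` and `uint64` is to flip the sign bit. The sketch below uses hypothetical names (`toOrderedUint`, `toSignedInt`) and is not the PR's `renumerate` templates, whose definitions are not shown here. Under this mapping, -1000 and 1000 map to the `min` and `max` values printed above; whether that is the range the test actually requested is not stated in the thread.

```nim
proc toOrderedUint(x: int64): uint64 =
  ## Flips the sign bit so that signed order matches unsigned order.
  cast[uint64](x) xor (1'u64 shl 63)

proc toSignedInt(u: uint64): int64 =
  ## Inverse of `toOrderedUint`.
  cast[int64](u xor (1'u64 shl 63))

doAssert toOrderedUint(-1000) == 9223372036854774808'u64
doAssert toOrderedUint(1000) == 9223372036854776808'u64
doAssert toOrderedUint(-1) < toOrderedUint(0)
doAssert toSignedInt(toOrderedUint(-1000)) == -1000
```

With bounds encoded this way, `rMax - rMin` is always the true range size and never needs a guard for `rMax < rMin`.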

Saem Ghani and others added 2 commits July 5, 2025 14:17

initial commit with previous attempt

669b060

got things closer to working, now segfaulting

60ecb46

saem added this to the Standard library additions or cleanup milestone Jul 5, 2025

saem added the stdlib Standard library label Jul 5, 2025

saem marked this pull request as draft July 5, 2025 22:11

saem added 22 commits July 5, 2025 16:49

- rename Predicate to PropCheck

4a61048

- made `PropCheck` a `func`

create an arbitrary, constArb, that produces the same value over an…

952b804

…d over again

mark constArb as exhaustive

87d9b20

rename pred for predicate, to propCheck

e2b0f25

Merge remote-tracking branch 'origin' into saem-property-based-testing

f188a3b

fix test

a01fd13

got tests running again

bae6384

prep for exhaustive arbitrary handling

4f405a2

- enumLen exists in std/typetraits

cf7d594

- start tests for `set`s

add more set tests

9b58401

remove blank space

9b96dfc

expand test cases

d2bfd68

remove tuple args for property check

85ade78

add extra overloads up to five arbitraries; reduce redundant code in `execProperty` thanks to Zerbina's suggestion

remove biasing

6b3c4ba

this doesn't seem like the right approach

clean-up reporting and track rng calls

3cff7b8

this allows for replication

fix string test

2ec41c1

remove RunExecution

48a8db4

If it makes sense in the future it'll naturally pop out of a refactoring

split the core and api of property based testing

719dd99

tap crap, likely will remove

65c253f

add numbering to specs

45fbb25

remove extra space

7a2bd94

add property numbering

056ac22

still debating if numbering per section is sensible, or should each property have its own number for easy reference?

saem added 2 commits August 31, 2025 11:20

make propNum unique across a spec set

16aa396

rename propNum to id

61365cb

saem added 4 commits March 24, 2026 10:03

formatting

2646c35

remove deadcode and formatting

c3b1ffa

fix tests and clean-up formatting

f3b15a1

some test checks were weaker than they could be, only checking if failure was detected even if failure was guaranteed.
clean-up:
- trailing white spaces
- superfluous comments

trim trailing blank spaces

93fbf70

zerbina self-requested a review March 25, 2026 16:20

saem commented Mar 28, 2026

zerbina reviewed Mar 28, 2026

saem added 9 commits March 28, 2026 18:00

break-up test suites, add marks, and more coverage

34b0955

properly handle signed numbers and chooseRange

8f6401a

- introduce `int64` `chooseRange`
- create a conversion function from `uint64` to `int64` that preserves order
- fix-up call sites and tests

redundant code clean-up and add assertions

9b9fad0

focuses on `candidates` iterator, to avoid misreading the buffer

skipNode favour assertions instead of if guards

0b87611

These should change from `doAssert` to `assert`.
Also, consider allowing catchable assertions for facilitating testing.

trim blankspace

e55bd89

reduce redundant code between range and unbounded shrinker

892d7c0

remove skBoolString* this turned out to be a deadend

c5a7b30

add reminder for skNBytes shrinking (and generation)

c69517d

implement treeRepr for a buffer

bbef1ce

Currently buggy, it seems `collectNodes` isn't gathering all the nodes

saem commented Mar 29, 2026

beginGroup takes uint8 for number of elements

9d2f108

Since it creates an `skGroup` and that only takes a byte worth of elements this makes the API safer. suggested by @zerbina

zerbina reviewed Mar 29, 2026

Comment thread lib/experimental/property_testing.nim Outdated

write reciprocal conversion for signed/unsigned 64 bit ints

9c758f7

also renamed them to `renumerateXtoY` as renumerate means to renumber/recount, instead of conversion which is more ambiguous.

zerbina reviewed Mar 29, 2026

Comment thread lib/experimental/property_testing.nim Outdated

Comment thread lib/experimental/property_testing.nim Outdated

saem and others added 2 commits March 29, 2026 12:22

fix renumerate templates

259bb4c

- work even if the input expression (`x`) has side-effects
- avoid conversions that would result in range check errors and potentially extra work

Co-authored-by: zerbina <100542850+zerbina@users.noreply.github.com>

partial rework of array

b3f0515

need to fix `collectNodes` still

saem marked this pull request as draft March 30, 2026 15:56

saem added 2 commits April 3, 2026 12:29

closer to working

77f6bed

the array shrinking is more correct wherein `rMax` and `rMin` aren't as far off in the final `skipToRangeOffsetAndGetSizeAndMin` template.
