Add small integer representation #4204

DRMacIver · 2024-12-16T12:59:07Z

Long-standing Hypothesis shrinking advice is that when you write one_of(x, y) then you should make sure that x is simpler than y so as to get good shrinking behaviour.

But, uh, what does "simpler" mean? Well it means "has smaller representations in the Hypothesis internal shrink order". Fair enough.

Well it turns out that this secretly means two different things:

1. Which of these is typically smaller?
2. Which of these is smaller once shrunk?

Which of these do we mean?

Well, uh, unfortunately we mean both. The former is important to get good shrinking performance/behaviour, the latter is important to get good results once fully shrunk.

Sure would be a shame if there were common pairs of strategies where these gave different answers, huh?

Anyway, one_of(integers(), text()) is such an example. integers() are typically (and sort of logically should be) smaller than text(), but 0 is actually larger than '' in both the new and old representations.
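To make the mismatch concrete, here is a minimal sketch of the shortlex ordering used for the old buffer-based representation (shorter buffers first, ties broken lexicographically). The two byte strings are hypothetical encodings chosen only to illustrate the problem; they are not the actual layouts Hypothesis produces.

```python
def sort_key(buffer: bytes) -> tuple[int, bytes]:
    """Shortlex key: shorter buffers are simpler; ties compare bytewise."""
    return (len(buffer), buffer)


# Hypothetical encodings, for illustration only.
EMPTY_TEXT = bytes([0])   # assumed: one byte saying "stop drawing characters"
ZERO_INT = bytes([0, 0])  # assumed: a size-selector byte plus a value byte


def simpler(a: bytes, b: bytes) -> bytes:
    """Return whichever buffer comes first in the shrink order."""
    return min(a, b, key=sort_key)


# Under these assumed encodings '' beats 0 purely on length,
# even though the integer "feels" like the simpler value.
assert simpler(EMPTY_TEXT, ZERO_INT) == EMPTY_TEXT
```

Because length is compared first, any value whose minimal encoding is one byte longer loses, regardless of how small its bytes are.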

This PR adds a small-integer optimisation that fixes both. It gives us a single-byte representation of small integers in the old buffer-based implementation, and adds a special case for zero in the serialisation format of the new IR representation. Non-zero integers don't need special casing there, because in the new representation this is only a problem for 0: any string of length > 0 will be at least two bytes, so the IR already handles the sizing of small non-zero integers correctly.
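As a rough illustration of what a single-byte representation for small integers could look like, here is a sketch. It is not the encoding this PR implements: the `ESCAPE` marker, the index mapping, and the helper names are all assumptions made up for the example.

```python
ESCAPE = 0xFF  # hypothetical marker byte: "a longer encoding follows"


def to_index(n: int) -> int:
    """Map integers to 0, 1, 2, ... in the order 0, 1, -1, 2, -2, ..."""
    return 2 * n - 1 if n > 0 else -2 * n


def from_index(i: int) -> int:
    """Inverse of to_index."""
    return (i + 1) // 2 if i % 2 else -(i // 2)


def encode(n: int) -> bytes:
    i = to_index(n)
    if i < ESCAPE:
        return bytes([i])  # small integers (-127..127) take a single byte
    payload = i.to_bytes((i.bit_length() + 7) // 8, "big")
    if len(payload) > 255:
        raise ValueError("integer too large for this sketch")
    # Escape byte, payload length, then the big-endian payload.
    return bytes([ESCAPE, len(payload)]) + payload


def decode(buf: bytes) -> int:
    if buf[0] != ESCAPE:
        return from_index(buf[0])
    size = buf[1]
    return from_index(int.from_bytes(buf[2 : 2 + size], "big"))


assert len(encode(0)) == 1
assert decode(encode(-300)) == -300
```

In this sketch the encoding is monotone with respect to shortlex order, so an integer that is simpler in the 0, 1, -1, 2, -2, ... sense never gets a larger buffer than a more complex one.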

tybug

Without looking too closely yet at the _draw_unbounded_integer changes: while this makes sense in principle for the bytestring, I'd just forewarn that a lot of this work will be redundant/overwritten/solved in a different way on the typed choice sequence (TCS), so don't be surprised to see this code go away in the near future! (E.g., shrink ordering is independent of buffer size on the TCS.)

tybug · 2024-12-18T21:10:59Z

hypothesis-python/tests/conjecture/test_shrinker.py

+            data.mark_interesting()
+
+    shrinker.fixate_shrink_passes(["minimize_individual_nodes"])
+    assert shrinker.shrink_target.ir_nodes[0].value == boundary


shrink_target.choices[0] is a nice concise alternative here! I envision .buffer[i] -> .choices[i] being the default migration path for bytestring tests, though of course with tweaked indices.

(or better yet shrinker.choices, using the implicit forwarding to .shrink_target).

DRMacIver requested a review from Zac-HD as a code owner December 16, 2024 12:59

DRMacIver force-pushed the DRMacIver/smol-integers branch from 730ac18 to 5bcce8b Compare December 16, 2024 13:00

Add small integer representation

b066401

DRMacIver force-pushed the DRMacIver/smol-integers branch from 5bcce8b to b066401 Compare December 16, 2024 13:02

DRMacIver marked this pull request as draft December 16, 2024 23:13

tybug reviewed Dec 18, 2024
