test: add benchmark to Enum/EnumArray #1303

bonjourmauko · 2024-11-20T02:05:16Z

"Now" means the version introduced by 43.0.0:

I like this because we can add per-method benchmarks, store a baseline before making changes, and verify before submitting a PR that performance has not degraded.

The list of methods to watch will, I guess, be built through trial and error.
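As a rough illustration of the baseline idea, independent of pytest-benchmark, here is a minimal stdlib sketch. `Housing`, `median_us` and `relative` are hypothetical names made up for this example; `Housing` is a plain `enum.Enum`, not the Enum from openfisca_core.

```python
import enum
import statistics
import timeit


class Housing(enum.Enum):  # hypothetical stand-in, not openfisca_core's Enum
    OWNER = "owner"
    TENANT = "tenant"


def median_us(func, rounds=20, iterations=1_000):
    """Median time per call, in microseconds, over `rounds` rounds."""
    totals = timeit.repeat(func, repeat=rounds, number=iterations)
    return statistics.median(totals) / iterations * 1e6


def relative(current, baseline):
    """Ratio of current to baseline, like the figures in parentheses in the tables."""
    return round(current / baseline, 2)


# Measure once before a change (the baseline) and once after it.
baseline = median_us(lambda: Housing.OWNER == Housing.TENANT)
# ... the implementation under test would change here ...
current = median_us(lambda: Housing.OWNER == Housing.TENANT)
print(f"baseline={baseline:.4f}us current={current:.4f}us ({relative(current, baseline)})")
```

The median is used rather than the mean so that a few slow rounds do not dominate the comparison.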

(The version of @clallemand restores the performance of the baseline)

cc @sandcha @benjello

bonjourmauko · 2024-11-20T02:46:00Z

I added a performance test to Enum.encode, and that one is actually 2x faster, while Enum.__eq__ is 3x slower (in 43.2.2, prior to @clallemand's fixes):

------------------------------------------------------------------------------------- benchmark 'Enum.__eq__': 2 tests ------------------------------------------------------------------------------------
Name (time in us)                            Min                Max              Mean            StdDev            Median               IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_eq (0001_8e9da04)     1.4625 (1.0)       5.6417 (1.0)      1.5307 (1.0)      0.1313 (1.0)      1.5125 (1.0)      0.0166 (1.0)      956;4656      653.2840 (1.0)       50000          10
test_benchmark_enum_eq (NOW)              3.9541 (2.70)     21.6958 (3.85)     4.0818 (2.67)     0.2282 (1.74)     4.0166 (2.66)     0.1041 (6.27)    3005;4195      244.9888 (0.38)      50000          10
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------ benchmark 'Enum.encode': 2 tests ------------------------------------------------------------------------------------------
Name (time in us)                                 Min                   Max               Mean             StdDev             Median               IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_encode (NOW)              19.5292 (1.0)        170.2417 (1.0)      20.1429 (1.0)       1.6433 (1.0)      19.9459 (1.0)      0.2542 (1.0)      727;5533       49.6453 (1.0)       50000          10
test_benchmark_enum_encode (0001_8e9da04)     35.1708 (1.80)     1,317.6416 (7.74)     37.8116 (1.88)     14.4782 (8.81)     36.2125 (1.82)     0.7584 (2.98)     870;4857       26.4469 (0.53)      50000          10
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

This, of course, does not take into account how many times each method is actually called in a real simulation, or in a bump/restore, so we can't do a "system-wide" comparison with this alone.

By system-wide I mean, for example, making one method slower for the benefit of another, or bulking up an operation in memory to make it faster in the long run (for example, a method can be 3x faster but have a huge max outlier).
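To make the point about call counts concrete, here is a small sketch that weights the medians from the two tables above by a call mix. The call counts are entirely made up for illustration; real ones would have to come from profiling an actual simulation. `total_us` and `net_ratio` are hypothetical helper names.

```python
# Median time per call in microseconds, copied from the two tables above.
MEDIAN_US = {
    "Enum.__eq__": {"baseline": 1.5125, "now": 4.0166},
    "Enum.encode": {"baseline": 36.2125, "now": 19.9459},
}


def total_us(calls, version):
    """Sum of (number of calls x median time) over the benchmarked methods."""
    return sum(n * MEDIAN_US[name][version] for name, n in calls.items())


def net_ratio(calls):
    """A ratio above 1 means "now" is slower overall for this call mix."""
    return total_us(calls, "now") / total_us(calls, "baseline")


# Hypothetical call mixes, NOT measured from any real simulation.
eq_heavy = {"Enum.__eq__": 1_000, "Enum.encode": 10}
encode_heavy = {"Enum.__eq__": 10, "Enum.encode": 1_000}

print(f"eq-heavy mix:     {net_ratio(eq_heavy):.2f}x")
print(f"encode-heavy mix: {net_ratio(encode_heavy):.2f}x")
```

With the same per-method numbers, the overall result flips depending on which method dominates the workload, which is why the per-method tables alone cannot settle the question.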

Also, disabling GC or GIL can have an impact. I don't know if you do that in CSAD.

Bulking up benefits both Python API and Web API users: Enums get indexed once at definition time, and the more times we call Enum.encode, the faster it gets in the median.

Also, it is clear that just comparing Enums by their memory address (object.__eq__ and object.__hash__) looks unbeatable performance-wise, although it introduces limitations to be aware of:

As definitions are loaded from external files (country-package and extension-template) as in-memory modules, it is impossible to do enum_1 == enum_1 comparisons across those modules, because, by virtue of object.__eq__, they will be different objects in memory, regardless of being the same in the code (same class, same member, same value).

You can see that limitation being taken into account here:

        elif _is_enum_array(value) and cls.__name__ is value[0].__class__.__name__:
            indices = _enum_to_index(value)

Without that piece of code, Enum.encode would fail because of __eq__ = object.__eq__.
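The limitation can be reproduced with the standard library alone. In this sketch the same source is executed twice into separate in-memory modules, which is an assumption about how the loading behaves in spirit, not a copy of the actual loader; `Housing`, `load` and the module names are hypothetical.

```python
import types

# The same definition, as it would appear in an external file.
SOURCE = """
import enum

class Housing(enum.Enum):
    OWNER = "owner"
    TENANT = "tenant"
"""


def load(module_name):
    """Execute SOURCE in a fresh in-memory module and return it."""
    module = types.ModuleType(module_name)
    exec(SOURCE, module.__dict__)
    return module


first = load("country_package_a").Housing
second = load("country_package_b").Housing

# Members compare by identity, so the two copies do not match,
# even though class, member and value are the same in the code.
print(first.OWNER == second.OWNER)

# Comparing by class name and member name still works across copies.
print(first.__name__ == second.__name__ and first.OWNER.name == second.OWNER.name)
```

This is the situation the class-name check in the snippet above works around: the names agree even when the objects do not.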

If you find out how to produce a histogram, I'd be happy to know, to see more clearly what is hiding behind the mean, which is not really a useful measure here (I suspect the distribution is log-normal).
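One possible way to look at the distribution without extra tooling is to collect raw timings and bucket them on a logarithmic scale, which suits a suspected log-normal shape. This is only a sketch: `sample_ns`, `log_histogram` and `render` are hypothetical helpers, and the timed function is a placeholder rather than an Enum method.

```python
import math
import time
from collections import Counter


def sample_ns(func, rounds=2_000):
    """Time `func` once per round and return the samples in nanoseconds."""
    samples = []
    for _ in range(rounds):
        start = time.perf_counter_ns()
        func()
        samples.append(time.perf_counter_ns() - start)
    return samples


def log_histogram(samples, bins_per_decade=4):
    """Count samples per logarithmic bin; keys are the bins' lower edges."""
    counts = Counter()
    for value in samples:
        if value <= 0:  # below timer resolution; log10 is undefined
            continue
        index = math.floor(math.log10(value) * bins_per_decade)
        counts[10 ** (index / bins_per_decade)] += 1
    return dict(sorted(counts.items()))


def render(histogram, width=50):
    """Return one text bar per bin, scaled to the fullest bin."""
    peak = max(histogram.values())
    return "\n".join(
        f"{edge:>12.0f} ns | {'#' * max(1, count * width // peak)}"
        for edge, count in histogram.items()
    )


# Placeholder workload; an Enum.encode call would go here instead.
print(render(log_histogram(sample_ns(lambda: sorted(range(100))))))
```

If the bars look roughly symmetric on this log scale, that would be consistent with a log-normal distribution, and a long right tail would show up as sparse bins at the bottom of the output.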

bonjourmauko · 2024-11-20T05:08:49Z

I think it now works as it should: Enum.__eq__ is <1% slower than 42.0.0, but without the hack, and Enum.encode matches what I saw before, ~40-50% faster. Furthermore, we now have a baseline:

pytest openfisca_core/indexed_enums --benchmark-only --benchmark-compare

------------------------------------------------------------------------------------ benchmark 'Enum.__eq__': 2 tests ------------------------------------------------------------------------------------
Name (time in us)                            Min               Max              Mean            StdDev            Median               IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_eq (0001_8e9da04)     1.4625 (1.0)      5.6417 (1.01)     1.5307 (1.0)      0.1313 (1.55)     1.5125 (1.0)      0.0166 (1.0)      956;4656      653.2840 (1.0)       50000          10
test_benchmark_enum_eq (NOW)              1.4750 (1.01)     5.5709 (1.0)      1.5380 (1.00)     0.0846 (1.0)      1.5333 (1.01)     0.0166 (1.00)     566;1789      650.2020 (1.00)      50000          10
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------ benchmark 'Enum.encode': 2 tests ------------------------------------------------------------------------------------------
Name (time in us)                                 Min                   Max               Mean             StdDev             Median               IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_encode (NOW)              20.2542 (1.0)         60.1250 (1.0)      20.8112 (1.0)       0.6201 (1.0)      20.7292 (1.0)      0.1416 (1.0)     1677;5426       48.0510 (1.0)       50000          10
test_benchmark_enum_encode (0001_8e9da04)     35.1708 (1.74)     1,317.6416 (21.92)    37.8116 (1.82)     14.4782 (23.35)    36.2125 (1.75)     0.7584 (5.36)     870;4857       26.4469 (0.55)      50000          10
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

(To get better performance, we would have to code a custom Enum in C or Rust)

bonjourmauko · 2024-11-20T09:38:06Z

One last test with 50k shows that the performance changes are mixed. Enum.encode excels except when passed a list of str (which is the most common scenario):

------------------------------------------------------------------------------------ benchmark 'Enum.__eq__': 2 tests ------------------------------------------------------------------------------------
Name (time in us)                            Min               Max              Mean            StdDev            Median               IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_eq (0001_3f253db)     6.0880 (1.0)      6.2798 (1.0)      6.1524 (1.0)      0.0337 (1.0)      6.1526 (1.0)      0.0410 (1.0)          29;3      162.5381 (1.0)         100       10000
test_benchmark_enum_eq (NOW)              6.1442 (1.01)     6.5179 (1.04)     6.2145 (1.01)     0.0697 (2.07)     6.1949 (1.01)     0.0497 (1.21)        11;10      160.9138 (0.99)        100       10000
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------- benchmark 'Enum.encode': 2 tests --------------------------------------------------------------------------------------
Name (time in ms)                                 Min                Max               Mean            StdDev             Median               IQR            Outliers       OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_encode (0001_3f253db)      1.1622 (1.0)       1.2289 (1.0)       1.1774 (1.0)      0.0104 (1.0)       1.1762 (1.0)      0.0098 (1.0)          20;6  849.3253 (1.0)         100          10
test_benchmark_enum_encode (NOW)              21.5223 (18.52)    30.4022 (24.74)    21.9348 (18.63)    1.0418 (99.76)    21.6959 (18.45)    0.1438 (14.64)        5;14   45.5897 (0.05)        100          10
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------ benchmark 'EnumArray.__eq__': 2 tests -------------------------------------------------------------------------------------
Name (time in us)                                  Min               Max              Mean            StdDev            Median               IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_array_eq (NOW)              6.3896 (1.0)      7.1213 (1.0)      6.4851 (1.0)      0.0999 (1.02)     6.4517 (1.0)      0.0790 (1.0)          10;6      154.1990 (1.0)         100         100
test_benchmark_enum_array_eq (0001_3f253db)     8.8371 (1.38)     9.4550 (1.33)     8.9318 (1.38)     0.0978 (1.0)      8.8933 (1.38)     0.1158 (1.47)         10;3      111.9598 (0.73)        100         100
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------- benchmark 'EnumArray.decode': 2 tests ---------------------------------------------------------------------------------------------
Name (time in us)                                        Min                   Max                Mean             StdDev              Median                IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_array_decode (NOW)              240.3592 (1.0)        252.2633 (1.0)      241.7589 (1.0)       1.4377 (1.0)      241.4629 (1.0)       0.9042 (1.0)           4;3        4.1364 (1.0)         100         100
test_benchmark_enum_array_decode (0001_3f253db)     750.8642 (3.12)     1,362.8329 (5.40)     772.7886 (3.20)     73.1442 (50.88)    757.5662 (3.14)     11.1679 (12.35)         2;9        1.2940 (0.31)        100         100
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------- benchmark 'EnumArray.decode_to_str': 2 tests ------------------------------------------------------------------------------------------
Name (time in us)                                               Min                 Max                Mean            StdDev              Median               IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_array_decode_to_str (NOW)              316.5763 (1.0)      322.2183 (1.0)      317.8096 (1.0)      0.8534 (1.0)      317.7663 (1.0)      0.8179 (1.0)          17;3        3.1465 (1.0)         100         100
test_benchmark_enum_array_decode_to_str (0001_3f253db)     859.9421 (2.72)     904.4842 (2.81)     868.7573 (2.73)     8.2100 (9.62)     866.1508 (2.73)     4.8421 (5.92)        13;12        1.1511 (0.37)        100         100
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

bonjourmauko · 2024-11-20T12:24:57Z

Full test (for 50k):

------------------------------------------------------------------------------------ benchmark 'Enum.__eq__': 2 tests ------------------------------------------------------------------------------------
Name (time in us)                            Min               Max              Mean            StdDev            Median               IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_eq (0001_e8b0acf)     6.1164 (1.0)      6.2392 (1.0)      6.1712 (1.0)      0.0236 (1.0)      6.1702 (1.0)      0.0284 (1.0)          30;2      162.0429 (1.0)         100       10000
test_benchmark_enum_eq (NOW)              6.2180 (1.02)     6.5928 (1.06)     6.2866 (1.02)     0.0616 (2.62)     6.2660 (1.02)     0.0355 (1.25)        14;12      159.0686 (0.98)        100       10000
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------- benchmark 'Enum.encode (Enum)': 2 tests -----------------------------------------------------------------------------------
Name (time in ms)                                     Min               Max              Mean            StdDev            Median               IQR            Outliers       OPS            Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_encode_enum (NOW)              1.8012 (1.0)      1.8782 (1.0)      1.8230 (1.0)      0.0123 (1.0)      1.8219 (1.0)      0.0161 (1.0)          30;2  548.5553 (1.0)         100          10
test_benchmark_enum_encode_enum (0001_e8b0acf)     1.9466 (1.08)     2.4946 (1.33)     2.0038 (1.10)     0.0812 (6.59)     1.9726 (1.08)     0.0592 (3.67)        15;11  499.0517 (0.91)        100          10
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------- benchmark 'Enum.encode (int)': 2 tests --------------------------------------------------------------------------------------------------
Name (time in ns)                                         Min                    Max                   Mean                StdDev                 Median                 IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_encode_int (0001_e8b0acf)        600.0000 (1.0)       4,333.4001 (1.0)         660.3360 (1.0)        371.1875 (1.0)         620.9000 (1.0)       12.4500 (1.0)           1;3    1,514.3805 (1.0)         100          10
test_benchmark_enum_encode_int (NOW)              36,554.1000 (60.92)    60,558.3999 (13.97)    38,288.2460 (57.98)    4,060.5919 (10.94)    36,887.4500 (59.41)    945.8501 (75.97)       10;14       26.1177 (0.02)        100          10
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------- benchmark 'Enum.encode (str)': 2 tests -----------------------------------------------------------------------------------
Name (time in ms)                                    Min               Max              Mean            StdDev            Median               IQR            Outliers       OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_encode_str (0001_e8b0acf)     1.1514 (1.0)      5.4571 (2.68)     1.3338 (1.0)      0.5570 (23.23)    1.1804 (1.0)      0.0383 (1.79)         6;16  749.7253 (1.0)         100          10
test_benchmark_enum_encode_str (NOW)              1.8314 (1.59)     2.0398 (1.0)      1.8600 (1.39)     0.0240 (1.0)      1.8566 (1.57)     0.0214 (1.0)          14;3  537.6388 (0.72)        100          10
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------- benchmark 'EnumArray.__eq__': 2 tests -------------------------------------------------------------------------------------
Name (time in us)                                  Min                Max              Mean            StdDev            Median               IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_array_eq (NOW)              6.3462 (1.0)      10.5004 (1.12)     7.1608 (1.0)      1.2814 (12.70)    6.4923 (1.0)      0.5665 (11.62)       20;20      139.6490 (1.0)         100         100
test_benchmark_enum_array_eq (0001_e8b0acf)     8.8654 (1.40)      9.3842 (1.0)      8.9354 (1.25)     0.1009 (1.0)      8.8969 (1.37)     0.0487 (1.0)         10;15      111.9138 (0.80)        100         100
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------- benchmark 'EnumArray.decode': 2 tests --------------------------------------------------------------------------------------------
Name (time in us)                                        Min                   Max                Mean             StdDev              Median               IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_array_decode (NOW)              241.6167 (1.0)        255.0425 (1.0)      245.2318 (1.0)       3.1863 (1.0)      243.6250 (1.0)      4.5190 (1.05)         22;1        4.0778 (1.0)         100         100
test_benchmark_enum_array_decode (0001_e8b0acf)     752.3354 (3.11)     1,501.3621 (5.89)     765.7738 (3.12)     74.5790 (23.41)    757.6515 (3.11)     4.3040 (1.0)           1;3        1.3059 (0.32)        100         100
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------- benchmark 'EnumArray.decode_to_str': 2 tests ------------------------------------------------------------------------------------------
Name (time in us)                                               Min                 Max                Mean            StdDev              Median               IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_enum_array_decode_to_str (NOW)              317.2813 (1.0)      355.2050 (1.0)      320.5240 (1.0)      4.9967 (1.0)      319.1794 (1.0)      2.1779 (1.0)          7;10        3.1199 (1.0)         100         100
test_benchmark_enum_array_decode_to_str (0001_e8b0acf)     861.6308 (2.72)     916.3904 (2.58)     871.6658 (2.72)     9.9436 (1.99)     868.3410 (2.72)     6.7525 (3.10)        12;11        1.1472 (0.37)        100         100
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

bonjourmauko · 2024-11-20T12:33:41Z

...-64bit/0001_e8b0acf18413d142a4778eb63f40c518f32b94bc_20241120_094745_uncommited-changes.json

@@ -0,0 +1,288 @@
+{


It does not make sense to keep this file in version control, but it is useful to have it as an artifact in CI to make comparisons and, eventually, to have tests fail when performance degrades past a certain threshold (say, 25%).
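A threshold check over two saved runs could look roughly like the sketch below. It assumes the saved JSON has a top-level "benchmarks" list whose entries carry "name" and "stats"; that layout should be checked against the actual file. `regressions` is a hypothetical helper, and the medians are copied from the 'Enum.encode (str)' and 'EnumArray.decode' tables above.

```python
def regressions(baseline, current, threshold=0.25, stat="median"):
    """Return {name: ratio} for benchmarks slower than baseline by more than threshold."""
    reference = {b["name"]: b["stats"][stat] for b in baseline["benchmarks"]}
    slower = {}
    for bench in current["benchmarks"]:
        before = reference.get(bench["name"])
        if before is None:  # new benchmark, nothing to compare against
            continue
        ratio = bench["stats"][stat] / before
        if ratio > 1 + threshold:
            slower[bench["name"]] = round(ratio, 2)
    return slower


# Assumed layout of the saved files; in CI these would come from json.load().
# Units differ between the two benchmarks (ms and us), which is fine because
# each ratio is computed within a single benchmark.
baseline = {"benchmarks": [
    {"name": "test_benchmark_enum_encode_str", "stats": {"median": 1.1804}},
    {"name": "test_benchmark_enum_array_decode", "stats": {"median": 757.6515}},
]}
current = {"benchmarks": [
    {"name": "test_benchmark_enum_encode_str", "stats": {"median": 1.8566}},
    {"name": "test_benchmark_enum_array_decode", "stats": {"median": 243.6250}},
]}

failed = regressions(baseline, current)
print(failed)  # a CI step could exit non-zero when this dict is not empty
```

pytest-benchmark also documents a `--benchmark-compare-fail` option (for example `mean:25%`) that may cover this without custom code; the sketch only shows the comparison logic.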

bonjourmauko added 3 commits November 20, 2024 02:06

perf: add pytest-benchmark

dd03bed

revert: Enum to v42.0.0

e7e4e77

revert: Enum to v43.0.0

5dec58b

bonjourmauko added the kind:test Adding missing tests or correcting existing tests label Nov 20, 2024

bonjourmauko requested a review from benoit-cty November 20, 2024 02:05

bonjourmauko self-assigned this Nov 20, 2024

bonjourmauko added 2 commits November 20, 2024 03:42

revert: Enum to v42.0.0

d5060fc

revert: Enum to v43.2.2

82f0055

bonjourmauko added 3 commits November 20, 2024 05:45

revert: Enum to v43.2.2

5fc125d

test: add benchmark groups

572da3b

perf: improve Enum.__eq__

196145a

bonjourmauko force-pushed the perf/add-benchmark-to-perf-test branch from 03ef54b to 196145a Compare November 20, 2024 04:46

bonjourmauko requested a review from clallemand November 20, 2024 04:46

test: add perf tests to EnumArray

e8b0acf

bonjourmauko force-pushed the perf/add-benchmark-to-perf-test branch from 3f253db to e8b0acf Compare November 20, 2024 09:36

bonjourmauko added 3 commits November 20, 2024 11:28

test: add missing perf tests

0ef333f

perf: improve Enum.encode(str) by 15x

84668d3

perf: improve Enum.encode(int) by 339x

d95a1ce

bonjourmauko mentioned this pull request Nov 21, 2024

Fix enums performance #1306

Open

bonjourmauko changed the title ~~perf: add benchmark to Enum.__eq__~~ test: add benchmark to Enum.__eq__ Nov 21, 2024

test: fix failing test

84b1ba4

bonjourmauko changed the title ~~test: add benchmark to Enum.__eq__~~ test: add benchmark to Enum/EnumArray Nov 21, 2024
