Reevaluate jump threading in GDScript VM #11418

Open
YYF233333 opened this issue Dec 25, 2024 · 0 comments
Describe the project you are working on

The GDScript module of Godot.

Describe the problem or limitation you are having in your project

I'm trying to add immediate number operands to GDScript VM instructions. Having two implementations of the dispatch loop (jump threading and loop-switch) significantly increases the workload and the maintenance burden.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

According to https://inria.hal.science/hal-01100647/document, as hardware has evolved, jump threading plays a less important role in reducing branch mispredictions.
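
For readers unfamiliar with the two dispatch styles, here is a minimal sketch on a toy stack VM. It is not Godot's code: the opcodes, `run_loop_switch`, and `run_jump_threading` are hypothetical names, and it assumes that "jump threading" here means computed-goto dispatch through the GCC/Clang labels-as-values extension. It also assumes well-formed bytecode, so there are no bounds or stack checks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy instruction set, for illustration only.
enum Opcode : int32_t { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

// Loop + switch: every handler returns to one shared dispatch point.
int64_t run_loop_switch(const std::vector<int32_t> &code) {
    assert(!code.empty());
    std::vector<int64_t> stack;
    std::size_t ip = 0;
    while (true) {
        switch (code[ip++]) {
            case OP_PUSH:
                stack.push_back(code[ip++]); // operand follows the opcode
                break;
            case OP_ADD: {
                const int64_t rhs = stack.back();
                stack.pop_back();
                stack.back() += rhs;
            } break;
            case OP_MUL: {
                const int64_t rhs = stack.back();
                stack.pop_back();
                stack.back() *= rhs;
            } break;
            case OP_HALT:
                return stack.back();
            default:
                return -1; // unknown opcode
        }
    }
}

#if defined(__GNUC__)
// Computed goto: each handler ends with its own indirect jump to the next
// handler, so there is no shared dispatch point and no loop back-edge.
int64_t run_jump_threading(const std::vector<int32_t> &code) {
    assert(!code.empty());
    // Table order must match the Opcode enum order.
    static void *const targets[] = { &&do_push, &&do_add, &&do_mul, &&do_halt };
    std::vector<int64_t> stack;
    std::size_t ip = 0;
#define DISPATCH() goto *targets[code[ip++]]
    DISPATCH();
do_push:
    stack.push_back(code[ip++]);
    DISPATCH();
do_add: {
    const int64_t rhs = stack.back();
    stack.pop_back();
    stack.back() += rhs;
}
    DISPATCH();
do_mul: {
    const int64_t rhs = stack.back();
    stack.pop_back();
    stack.back() *= rhs;
}
    DISPATCH();
do_halt:
    return stack.back();
#undef DISPATCH
}
#endif
```

Both functions run the same bytecode, for example `{OP_PUSH, 6, OP_PUSH, 7, OP_MUL, OP_HALT}`. The maintenance cost discussed in this proposal comes from keeping the label table, the macros, and the switch cases in sync whenever an instruction is added or changed.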

I tested the two versions with godot-benchmark; the results are below:

Template Release
                               jump_threading_release   loop_switch_release   diff      percent
Deep Tree                                         254.25          256.57   -0002.32     -00.91%
Duplicate                                        1732.25         1730.75   +0001.50     +00.09%
Fragmentation                                    2653.00         2675.50   -0022.50     -00.84%
Wide Tree                                         219.95          216.22   +0003.72     +01.72%
Fill Loop                                         276.85          293.65   -0016.80     -05.72%
Fill Method                                       106.25          111.08   -0004.83     -04.34%
Packed Color Array                                172.28          179.40   -0007.12     -03.97%
Packed Float 32 Array                             146.40          148.43   -0002.03     -01.36%
Packed Float 64 Array                             150.47          157.25   -0006.78     -04.31%
Packed Int 32 Array                               101.49          110.95   -0009.46     -08.53%
Packed Int 64 Array                               104.53          110.88   -0006.35     -05.73%
Packed String Array                               954.00          958.38   -0004.38     -00.46%
Packed Vector 2 Array                             145.75          154.28   -0008.53     -05.53%
Packed Vector 3 Array                             166.35          177.80   -0011.45     -06.44%
Typed Color Array                                 283.23          290.27   -0007.05     -02.43%
Typed Float Array                                 240.05          235.22   +0004.83     +02.05%
Typed Int Array                                   228.10          228.07   +0000.03     +00.01%
Typed String Array                               1113.00         1121.75   -0008.75     -00.78%
Typed Vector 2 Array                              258.80          275.77   -0016.97     -06.16%
Typed Vector 3 Array                              277.15          283.10   -0005.95     -02.10%
Untyped Color Array                               469.20          472.15   -0002.95     -00.62%
Untyped Float Array                               415.15          413.32   +0001.82     +00.44%
Untyped Int Array                                 410.45          411.75   -0001.30     -00.32%
Untyped String Array                             1305.00         1309.50   -0004.50     -00.34%
Untyped Vector 2 Array                            450.20          454.02   -0003.82     -00.84%
Untyped Vector 3 Array                            463.20          469.42   -0006.22     -01.33%
Binary Trees 13                                   685.98          703.85   -0017.88     -02.54%
Binary Trees 15                                  3294.25         3357.25   -0063.00     -01.88%
Control                                             0.01            0.01   -0000.00     -08.33%
For Loop Add                                        9.07           10.28   -0001.21     -11.72%
For Loop Call                                      85.09           87.84   -0002.74     -03.12%
Hello World                                         0.10            0.10   -0000.00     -02.25%
Lambda Call                                        85.88           86.50   -0000.62     -00.72%
Mandelbrot Set                                   2766.75         2992.75   -0226.00     -07.55%
Merkle Trees 13                                  2125.50         2140.75   -0015.25     -00.71%
Merkle Trees 15                                 10577.50        10732.50   -0155.00     -01.44%
Nbody 1 000 000                                  8001.00         8980.75   -0979.75     -10.91%
Nbody 500 000                                    4007.00         4490.25   -0483.25     -10.76%
Spectral Norm 100                                  61.32           63.73   -0002.41     -03.78%
Spectral Norm 1000                               6096.75         6329.50   -0232.75     -03.68%
Spectral Norm 500                                1520.25         1580.00   -0059.75     -03.78%
Md 5 Buffer Empty                                 244.57          243.80   +0000.77     +00.32%
Md 5 Buffer Non Empty                             730.50          729.05   +0001.45     +00.20%
Md 5 Text Empty                                   194.25          193.00   +0001.25     +00.65%
Md 5 Text Non Empty                               675.38          666.50   +0008.88     +01.33%
Sha 1 Buffer Empty                                262.80          261.55   +0001.25     +00.48%
Sha 1 Buffer Non Empty                            800.38          804.33   -0003.95     -00.49%
Sha 1 Text Empty                                  220.95          219.10   +0001.85     +00.84%
Sha 1 Text Non Empty                              755.05          755.25   -0000.20     -00.03%
Sha 256 Buffer Empty                              422.27          414.52   +0007.75     +01.87%
Sha 256 Buffer Non Empty                         1386.75         1374.25   +0012.50     +00.91%
Sha 256 Text Empty                                388.82          381.70   +0007.12     +01.87%
Sha 256 Text Non Empty                           1351.50         1346.50   +0005.00     +00.37%
Complex Variable Concatenate                     2931.00         2955.25   -0024.25     -00.82%
Complex Variable Method                          4891.00         5427.00   -0536.00     -09.88%
Complex Variable Percent                         3973.25         4041.50   -0068.25     -01.69%
No Op Constant Method                             247.30          242.55   +0004.75     +01.96%
Simple Constant Concatenate                         3.58            3.36   +0000.22     +06.59%
Simple Constant Method                           1034.00         1040.00   -0006.00     -00.58%
Simple Constant Method Constant Dict              686.38          692.92   -0006.55     -00.95%
Simple Constant Percent                             3.21            3.37   -0000.15     -04.53%
Simple Variable Concatenate                       264.88          271.57   -0006.70     -02.47%
Simple Variable Method                           1036.50         1033.00   +0003.50     +00.34%
Simple Variable Percent                           622.00          619.23   +0002.77     +00.45%
Begins With                                        11.28           11.73   -0000.45     -03.84%
Bigrams                                           893.08         1041.38   -0148.30     -14.24%
Capitalize                                       1461.00         1489.50   -0028.50     -01.91%
Casecmp To                                         10.92           11.41   -0000.48     -04.23%
Contains                                           10.94           11.92   -0000.97     -08.18%
Contains Gdscript In                                3.20            3.48   -0000.27     -07.87%
Count                                             148.78          146.72   +0002.05     +01.40%
Countn                                            380.80          384.18   -0003.38     -00.88%
Ends With                                          12.69           13.11   -0000.42     -03.22%
Find                                               75.05           73.67   +0001.38     +01.87%
Findn                                              90.21           89.17   +0001.04     +01.16%
Get Slice                                         100.36           98.71   +0001.65     +01.67%
Get Slice Count                                    21.41           22.93   -0001.51     -06.60%
Humanize Size                                    1188.25         1188.25   +0000.00     +00.00%
Insert                                             86.22           85.40   +0000.81     +00.95%
Is Valid Filename                                  38.39           37.87   +0000.52     +01.37%
Lpad                                              303.93          304.90   -0000.97     -00.32%
Naturalnocasecmp To                                13.63           14.58   -0000.95     -06.48%
Nocasecmp To                                       46.39           47.20   -0000.81     -01.71%
Pad Decimals                                      682.58          674.75   +0007.83     +01.16%
Pad Decimals Pre Constructed                       99.19           98.45   +0000.75     +00.76%
Pad Zeros                                         488.30          485.20   +0003.10     +00.64%
Pad Zeros Pre Constructed                         299.70          296.25   +0003.45     +01.16%
Rfind                                              71.37           71.89   -0000.53     -00.73%
Rfindn                                            245.18          245.47   -0000.30     -00.12%
Rpad                                              229.00          230.35   -0001.35     -00.59%
Rsplit                                            625.40          621.50   +0003.90     +00.63%
Similarity                                         24.07           25.52   -0001.45     -05.66%
Simplify Path                                    1823.50         1820.25   +0003.25     +00.18%
Split                                             604.85          602.05   +0002.80     +00.47%
Split Floats                                      467.38          465.18   +0002.20     +00.47%
Substr                                             85.34           84.54   +0000.80     +00.94%
To Camel Case                                     677.23          686.40   -0009.17     -01.34%
To Lower                                          230.38          230.38   +0000.00     +00.00%
To Pascal Case                                   1680.50         1715.75   -0035.25     -02.05%
To Snake Case                                    1236.75         1285.25   -0048.50     -03.77%
To Utf 16 Buffer                                  210.60          211.57   -0000.97     -00.46%
To Utf 32 Buffer                                  136.57          136.88   -0000.30     -00.22%
To Utf 8 Buffer                                   205.85          207.18   -0001.33     -00.64%
To Wchar Buffer                                   209.95          209.47   +0000.47     +00.23%
Uri Decode                                        765.75          772.83   -0007.08     -00.92%
Uri Encode                                        583.12          593.25   -0010.12     -01.71%
Validate Filename                                 439.73          443.38   -0003.65     -00.82%
Validate Node Name                                120.58          122.97   -0002.40     -01.95%
Xml Escape                                        650.58          656.00   -0005.42     -00.83%
Xml Unescape                                       99.20          111.31   -0012.11     -10.88%
Sum of all tests                                93429.25        96700.96   -3271.71     -03.38%
Editor
                                          jump_threading     loop_switch       diff    percent
Deep Tree                                         310.70          307.10   +0003.60     +01.17%
Duplicate                                        2098.00         2110.67   -0012.67     -00.60%
Fragmentation                                    3091.33         3083.67   +0007.67     +00.25%
Wide Tree                                         243.30          244.90   -0001.60     -00.65%
Fill Loop                                         336.90          348.33   -0011.43     -03.28%
Fill Method                                       104.43          114.30   -0009.87     -08.63%
Packed Color Array                                229.63          238.13   -0008.50     -03.57%
Packed Float 32 Array                             193.63          200.20   -0006.57     -03.28%
Packed Float 64 Array                             195.77          203.87   -0008.10     -03.97%
Packed Int 32 Array                               139.43          146.47   -0007.03     -04.80%
Packed Int 64 Array                               150.03          149.17   +0000.87     +00.58%
Packed String Array                              1219.33         1229.33   -0010.00     -00.81%
Packed Vector 2 Array                             200.03          210.27   -0010.23     -04.87%
Packed Vector 3 Array                             226.60          239.30   -0012.70     -05.31%
Typed Color Array                                 354.43          377.40   -0022.97     -06.09%
Typed Float Array                                 277.03          282.37   -0005.33     -01.89%
Typed Int Array                                   272.30          275.03   -0002.73     -00.99%
Typed String Array                               1371.00         1374.33   -0003.33     -00.24%
Typed Vector 2 Array                              325.43          335.10   -0009.67     -02.88%
Typed Vector 3 Array                              344.43          352.63   -0008.20     -02.33%
Untyped Color Array                               562.60          583.67   -0021.07     -03.61%
Untyped Float Array                               478.93          489.27   -0010.33     -02.11%
Untyped Int Array                                 471.60          485.43   -0013.83     -02.85%
Untyped String Array                             1588.00         1593.00   -0005.00     -00.31%
Untyped Vector 2 Array                            532.83          544.43   -0011.60     -02.13%
Untyped Vector 3 Array                            550.43          562.73   -0012.30     -02.19%
Binary Trees 13                                   873.90          889.90   -0016.00     -01.80%
Binary Trees 15                                  4213.00         4322.33   -0109.33     -02.53%
Control                                             0.01            0.01   +0000.00     +00.00%
For Loop Add                                       14.24           17.29   -0003.04     -17.61%
For Loop Call                                     126.23          129.30   -0003.07     -02.37%
Hello World                                         0.17            0.14   +0000.04     +26.02%
Lambda Call                                       101.87          102.43   -0000.57     -00.55%
Mandelbrot Set                                   4054.00         4860.00   -0806.00     -16.58%
Merkle Trees 13                                  3383.33         3368.00   +0015.33     +00.46%
Merkle Trees 15                                 16766.67        16836.67   -0070.00     -00.42%
Nbody 1 000 000                                 10146.67        10440.00   -0293.33     -02.81%
Nbody 500 000                                    5072.33         5226.00   -0153.67     -02.94%
Spectral Norm 100                                  84.95           90.30   -0005.35     -05.92%
Spectral Norm 1000                               8503.67         8992.33   -0488.67     -05.43%
Spectral Norm 500                                2120.67         2240.67   -0120.00     -05.36%
Md 5 Buffer Empty                                 271.47          272.47   -0001.00     -00.37%
Md 5 Buffer Non Empty                             756.70          757.43   -0000.73     -00.10%
Md 5 Text Empty                                   211.90          212.30   -0000.40     -00.19%
Md 5 Text Non Empty                               698.17          697.47   +0000.70     +00.10%
Sha 1 Buffer Empty                                284.40          284.53   -0000.13     -00.05%
Sha 1 Buffer Non Empty                            825.47          828.13   -0002.67     -00.32%
Sha 1 Text Empty                                  236.73          233.03   +0003.70     +01.59%
Sha 1 Text Non Empty                              770.80          765.20   +0005.60     +00.73%
Sha 256 Buffer Empty                              445.13          445.27   -0000.13     -00.03%
Sha 256 Buffer Non Empty                         1426.67         1437.00   -0010.33     -00.72%
Sha 256 Text Empty                                400.23          401.27   -0001.03     -00.26%
Sha 256 Text Non Empty                           1375.00         1384.67   -0009.67     -00.70%
Complex Variable Concatenate                     3294.00         3307.00   -0013.00     -00.39%
Complex Variable Method                          5410.33         5398.33   +0012.00     +00.22%
Complex Variable Percent                         4098.67         4121.00   -0022.33     -00.54%
No Op Constant Method                             276.47          286.03   -0009.57     -03.34%
Simple Constant Concatenate                         6.49            7.90   -0001.41     -17.87%
Simple Constant Method                           1137.67         1142.33   -0004.67     -00.41%
Simple Constant Method Constant Dict              766.50          764.87   +0001.63     +00.21%
Simple Constant Percent                             6.46            7.75   -0001.29     -16.60%
Simple Variable Concatenate                       289.30          292.20   -0002.90     -00.99%
Simple Variable Method                           1138.00         1143.67   -0005.67     -00.50%
Simple Variable Percent                           648.63          658.23   -0009.60     -01.46%
Begins With                                        16.24           17.13   -0000.88     -05.16%
Bigrams                                           980.80          980.27   +0000.53     +00.05%
Capitalize                                       1550.00         1558.00   -0008.00     -00.51%
Casecmp To                                         16.09           19.50   -0003.42     -17.52%
Contains                                           16.50           17.11   -0000.61     -03.55%
Contains Gdscript In                                6.33            7.78   -0001.46     -18.71%
Count                                             167.63          170.80   -0003.17     -01.85%
Countn                                            396.60          402.47   -0005.87     -01.46%
Ends With                                          17.41           18.09   -0000.67     -03.72%
Find                                               83.36           85.34   -0001.98     -02.32%
Findn                                              99.85          100.53   -0000.68     -00.68%
Get Slice                                         113.73          115.93   -0002.20     -01.90%
Get Slice Count                                    27.02           27.68   -0000.66     -02.38%
Humanize Size                                    1254.67         1258.67   -0004.00     -00.32%
Insert                                             99.17           99.41   -0000.24     -00.24%
Is Valid Filename                                  42.77           43.80   -0001.03     -02.36%
Lpad                                              332.53          334.43   -0001.90     -00.57%
Naturalnocasecmp To                                18.72           20.29   -0001.57     -07.72%
Nocasecmp To                                       33.82           35.36   -0001.53     -04.34%
Pad Decimals                                      712.17          711.47   +0000.70     +00.10%
Pad Decimals Pre Constructed                      111.30          111.63   -0000.33     -00.30%
Pad Zeros                                         599.97          606.43   -0006.47     -01.07%
Pad Zeros Pre Constructed                         331.93          334.00   -0002.07     -00.62%
Rfind                                              85.19           82.18   +0003.01     +03.67%
Rfindn                                            252.97          252.77   +0000.20     +00.08%
Rpad                                              259.77          261.80   -0002.03     -00.78%
Rsplit                                            682.73          685.03   -0002.30     -00.34%
Similarity                                         29.45           30.03   -0000.57     -01.91%
Simplify Path                                    1909.67         1903.67   +0006.00     +00.32%
Split                                             650.33          662.83   -0012.50     -01.89%
Split Floats                                      530.30          497.23   +0033.07     +06.65%
Substr                                             99.39           99.18   +0000.21     +00.21%
To Camel Case                                     705.00          704.90   +0000.10     +00.01%
To Lower                                          244.17          246.87   -0002.70     -01.09%
To Pascal Case                                   1787.67         1796.33   -0008.67     -00.48%
To Snake Case                                    1320.67         1323.67   -0003.00     -00.23%
To Utf 16 Buffer                                  240.67          239.80   +0000.87     +00.36%
To Utf 32 Buffer                                  157.13          159.20   -0002.07     -01.30%
To Utf 8 Buffer                                   236.30          238.13   -0001.83     -00.77%
To Wchar Buffer                                   240.20          239.50   +0000.70     +00.29%
Uri Decode                                        786.60          800.40   -0013.80     -01.72%
Uri Encode                                        624.13          626.07   -0001.93     -00.31%
Validate Filename                                 472.33          477.30   -0004.97     -01.04%
Validate Node Name                                133.07          135.40   -0002.33     -01.72%
Xml Escape                                        748.47          757.07   -0008.60     -01.14%
Xml Unescape                                      110.20          112.33   -0002.13     -01.90%
Sum of all tests                               115440.09       117852.36   -2412.27     -02.05%

Overall, the jump threading version is 2% to 3% faster.
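
The tables do not state how the `diff` and `percent` columns are computed. Judging from the rows, `diff` is the jump threading value minus the loop-switch value, and `percent` is that difference relative to the loop-switch value. This is an inference from the numbers, not something the benchmark tool documents here. A small sketch with hypothetical helper names:

```cpp
#include <cassert>

// diff column: jump threading value minus loop-switch value (inferred).
double bench_diff(double jump_threading, double loop_switch) {
    return jump_threading - loop_switch;
}

// percent column: diff relative to the loop-switch value, in percent (inferred).
double bench_percent(double jump_threading, double loop_switch) {
    assert(loop_switch != 0.0);
    return bench_diff(jump_threading, loop_switch) / loop_switch * 100.0;
}
```

Applied to the "Sum of all tests" rows, `bench_percent(93429.25, 96700.96)` gives about -3.38 for the release template and `bench_percent(115440.09, 117852.36)` gives about -2.05 for the editor build, which matches the tables.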

A breakdown with Intel VTune Profiler shows that the speedup comes mainly from a reduced instruction count rather than from fewer branch mispredictions.

The difference in retired instructions is about the same percentage as the difference in execution time, and the IPC is the same, which means jump threading is faster mainly because it executes fewer instructions.
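
This reasoning can be written out with the usual relation between run time, retired instructions, and IPC. The symbols $T$ (execution time), $N$ (retired instructions), and $f$ (clock frequency) are introduced here for illustration, and the clock frequency is assumed to be the same in both runs:

$$
T = \frac{N}{\mathrm{IPC} \cdot f}
\quad\Longrightarrow\quad
\frac{T_{\text{loop}}}{T_{\text{jump}}} = \frac{N_{\text{loop}}}{N_{\text{jump}}}
\quad \text{when } \mathrm{IPC}_{\text{loop}} = \mathrm{IPC}_{\text{jump}} \text{ and } f_{\text{loop}} = f_{\text{jump}}.
$$

If better branch prediction were the main cause, the IPC would be expected to differ between the two builds while the instruction counts stayed close.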

In conclusion, I think a 2% to 3% speed loss is an acceptable trade for maintainability. Without jump threading, we can split the huge GDScriptFunction::call, which is full of macro magic, into small, standard, readable parts. You also no longer need to check compatibility between two versions whenever you change the VM, which makes enhancements much easier to implement.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

Remove the jump threading version of the VM and split GDScriptFunction::call. This is not very hard to do if the proposal is widely accepted.
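
As a rough illustration of what the split could look like, here is a sketch on a toy VM. It is not the actual GDScriptFunction::call: `ToyFrame`, `toy_call`, and the `toy_op_*` handlers are hypothetical names, and the bytecode is assumed to be well-formed. With only switch dispatch left, each opcode body can become an ordinary function, which a computed-goto version cannot do as easily because its labels must all live in a single function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy instruction set; TOY_PUSH_IMM carries an immediate operand.
enum ToyOp : int32_t { TOY_PUSH_IMM, TOY_ADD, TOY_RET };

// Interpreter state shared by all handlers.
struct ToyFrame {
    const std::vector<int32_t> &code;
    std::vector<int64_t> stack;
    std::size_t ip;
    explicit ToyFrame(const std::vector<int32_t> &p_code) : code(p_code), ip(0) {}
};

// One small, ordinary function per opcode instead of one macro-heavy body.
static void toy_op_push_imm(ToyFrame &f) {
    f.stack.push_back(f.code[f.ip++]); // the immediate follows the opcode
}

static void toy_op_add(ToyFrame &f) {
    const int64_t rhs = f.stack.back();
    f.stack.pop_back();
    f.stack.back() += rhs;
}

// The dispatch loop only decodes the opcode and forwards to a handler.
int64_t toy_call(const std::vector<int32_t> &code) {
    assert(!code.empty());
    ToyFrame f(code);
    while (true) {
        switch (f.code[f.ip++]) {
            case TOY_PUSH_IMM:
                toy_op_push_imm(f);
                break;
            case TOY_ADD:
                toy_op_add(f);
                break;
            case TOY_RET:
                return f.stack.back();
            default:
                return -1; // unknown opcode
        }
    }
}
```

For example, `toy_call({TOY_PUSH_IMM, 40, TOY_PUSH_IMM, 2, TOY_ADD, TOY_RET})` evaluates 40 + 2. Whether handlers in the real VM should be separate functions, and how much inlining the compiler then recovers, would need to be measured with the same benchmarks.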

If this enhancement will not be used often, can it be worked around with a few lines of script?

No.

Is there a reason why this should be core and not an add-on in the asset library?

It is part of Godot.
