Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LogUp-GKR integration into VM #1364

Closed
wants to merge 93 commits into from
Closed

LogUp-GKR integration into VM #1364

wants to merge 93 commits into from

Conversation

plafer
Copy link
Contributor

@plafer plafer commented Jun 25, 2024

Built on top of #1307

Integrates LogUp GKR implementation into the VM.

Al-Kindi-0 and others added 30 commits March 22, 2024 19:41
refactor: serialize usize with write_usize (#1266)
* change prover

* change verifier

* cleanup

* Remove `VirtualBusVerifier`

* Remove `VirtualBusProver`

* reorganize exports

* `LayerGatesInputs`

* use constant

* docstrings

* add TODO

* remove TODO

* clippy

* add static_assertions

* add padding comment

* add back `claim` in verifier
Copy link
Contributor Author

@plafer plafer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial implementation ready for review! I ran the BLAKE3 benchmarks:

  • next branch:
    • prover time: 42s
    • verifier time: 1.7ms
  • current branch:
    • prover time: 68s
    • verifier time: 3ms

Upon closer inspection,

  • our implementation's constraint evaluation is 12 seconds slower - we could expect parallelization to shrink that down close to 0.
  • Also notably, we are currently saving 1 second in auxiliary trace commitment with, but I believe this is noise, since we actually have one extra column currently (we removed 1, but added 2). Hopefully when we go from 7 to 2, we can see an actual speedup.

Find below the full benchmark output. The command ran for both branches was:

$ make exec-info
$ MIDEN_LOG=info RAYON_NUM_THREADS=16 ./target/optimized/miden example blake3 -n 200

next branch output

============================================================
INFO     assemble_with_opts_in_context [ 6.25ms | 100.00% ]
Generated a program to compute 200-th iteration of BLAKE3 1-to-1 hash; expected result: [3273414978, 861395390, 3406933114, 1249690799, 3478855123, 2478635211, 4040123218, 3120808209]
--------------------------------
INFO     prove_program [ 42.2s | 4.46% / 100.00% ]
INFO     ┝━ execute_program [ 641ms | 1.52% ]
INFO     ┝━ i [info]: Generated execution trace of 70 columns and 2097152 steps (49% padded) in 640 ms
INFO     ┝━ build_domain [ 21.3ms | 0.05% ] trace_length: 2097152 | lde_domain_size: 16777216
INFO     ┝━ commit_to_main_trace_segment [ 20.0s | 0.00% / 47.44% ]
INFO     │  ┝━ extend_execution_trace [ 16.6s | 39.36% ] num_cols: 70 | blowup: 8
INFO     │  ┕━ compute_execution_trace_commitment [ 3.41s | 8.08% ] tree_depth: 24
INFO     ┝━ commit_to_aux_trace_segment [ 3.64s | 0.00% / 8.62% ]
INFO     │  ┝━ extend_execution_trace [ 2.95s | 6.98% ] num_cols: 7 | blowup: 8
INFO     │  ┕━ compute_execution_trace_commitment [ 692ms | 1.64% ] tree_depth: 24
INFO     ┝━ evaluate_constraints [ 11.0s | 25.93% ] ce_domain_size: 16777216
INFO     ┝━ commit_to_constraint_evaluations [ 3.28s | 0.00% / 7.75% ]
INFO     │  ┝━ build_composition_poly_columns [ 275ms | 0.65% ] num_columns: 8
INFO     │  ┝━ evaluate_composition_poly_columns [ 2.30s | 5.44% ]
INFO     │  ┕━ compute_constraint_evaluation_commitment [ 702ms | 1.66% ] tree_depth: 24
INFO     ┝━ build_deep_composition_poly [ 1.32s | 3.12% ]
INFO     ┝━ evaluate_deep_composition_poly [ 133ms | 0.32% ]
INFO     ┝━ compute_fri_layers [ 163ms | 0.39% ] num_layers: 5
INFO     ┝━ determine_query_positions [ 1.22ms | 0.00% ] grinding_factor: 16 | num_positions: 27
INFO     ┕━ build_proof_object [ 173ms | 0.41% ]
--------------------------------
Executed program in 42248 ms
Stack outputs: [3273414978, 861395390, 3406933114, 1249690799, 3478855123, 2478635211, 4040123218, 3120808209]
Execution proof size: 107 KB
Execution proof security: 96 bits
--------------------------------
INFO     verify_program [ 1.70ms | 100.00% ]
Execution verified in 1 ms

current branch output

============================================================
INFO     assemble_with_opts_in_context [ 6.03ms | 100.00% ]
Generated a program to compute 200-th iteration of BLAKE3 1-to-1 hash; expected result: [3273414978, 861395390, 3406933114, 1249690799, 3478855123, 2478635211, 4040123218, 3120808209]
--------------------------------
INFO     prove_program [ 68.5s | 21.03% / 100.00% ]
INFO     ┝━ execute_program [ 638ms | 0.93% ]
INFO     ┝━ i [info]: Generated execution trace of 70 columns and 2097152 steps (49% padded) in 638 ms
INFO     ┝━ build_domain [ 20.5ms | 0.03% ] trace_length: 2097152 | lde_domain_size: 16777216
INFO     ┝━ commit_to_main_trace_segment [ 21.3s | 0.00% / 31.06% ]
INFO     │  ┝━ extend_execution_trace [ 17.0s | 24.77% ] num_cols: 70 | blowup: 8
INFO     │  ┕━ compute_execution_trace_commitment [ 4.32s | 6.30% ] tree_depth: 24
INFO     ┝━ commit_to_aux_trace_segment [ 2.38s | 0.00% / 3.47% ]
INFO     │  ┝━ extend_execution_trace [ 1.77s | 2.59% ] num_cols: 8 | blowup: 8
INFO     │  ┕━ compute_execution_trace_commitment [ 604ms | 0.88% ] tree_depth: 24
INFO     ┝━ evaluate_constraints [ 23.7s | 34.53% ] ce_domain_size: 16777216
INFO     ┝━ commit_to_constraint_evaluations [ 3.32s | 0.00% / 4.84% ]
INFO     │  ┝━ build_composition_poly_columns [ 241ms | 0.35% ] num_columns: 8
INFO     │  ┝━ evaluate_composition_poly_columns [ 2.31s | 3.38% ]
INFO     │  ┕━ compute_constraint_evaluation_commitment [ 762ms | 1.11% ] tree_depth: 24
INFO     ┝━ build_deep_composition_poly [ 2.32s | 3.39% ]
INFO     ┝━ evaluate_deep_composition_poly [ 142ms | 0.21% ]
INFO     ┝━ compute_fri_layers [ 168ms | 0.24% ] num_layers: 5
INFO     ┝━ determine_query_positions [ 246µs | 0.00% ] grinding_factor: 16 | num_positions: 27
INFO     ┕━ build_proof_object [ 188ms | 0.27% ]
--------------------------------
Executed program in 68527 ms
Stack outputs: [3273414978, 861395390, 3406933114, 1249690799, 3478855123, 2478635211, 4040123218, 3120808209]
Execution proof size: 128 KB
Execution proof security: 96 bits
--------------------------------
INFO     verify_program [ 3.54ms | 100.00% ]
Execution verified in 3 ms

pub use gkr_verifier::{
verify_virtual_bus, CompositionPolyQueryBuilder, SumCheckVerifier, SumCheckVerifierError,
};
pub mod logup_gkr;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving this as-is, since we'll be moving it out of the VM in the next PR.

Comment on lines +31 to +37
// Hack: These columns already have an assertion
CLK_COL_IDX
| FMP_COL_IDX
| STACK_TRACE_OFFSET..=STACK_TRACE_END
| B0_COL_IDX_MAIN_TRACE
| B1_COL_IDX_MAIN_TRACE
| V_COL_IDX => (),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How we deal with the main_trace_first_row's assertions currently is hacky. The main issue before this hack was that winterfell would panic because of overlapping assertions (i.e. this assertion being redundant with other ones set by other modules).

The real underlying issue is that all the columns listed here shouldn't actually be part of the public inputs. But we'll need to fix it properly when we move this into winterfell, so I left it as is for now.

Comment on lines +51 to +52
let final_opening_claim =
verify_virtual_bus(E::ZERO, gkr_proof, vec![log_up_randomness], public_coin)?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be missing a check after this; will double-check in the morning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all the necessary checks are there, but @Al-Kindi-0 please double-check

@plafer plafer requested review from bobbinth and Al-Kindi-0 July 8, 2024 21:09
@plafer plafer marked this pull request as ready for review July 8, 2024 21:10
@bobbinth
Copy link
Contributor

bobbinth commented Jul 8, 2024

Not a review yet - just a few observations about the performance because I think something looks odd.

First, the times are quite a bit higher than what I'm getting on my machine. Specifically, I'm getting 16 seconds for the next branch and 40 seconds for this branch. Is it possible that you are running in a memory-constrained environment?

Also, I usually run the benchmarks for 1M cycles. Previously, this is what we were getting for -n 200 - but since we stopped inlining, -n 200 now requires 2M cycles. So, I also ran the benchmark for -n 100 and the results are below:

next branch

============================================================
INFO     assemble_with_opts_in_context [ 3.48ms | 100.00% ]
Generated a program to compute 100-th iteration of BLAKE3 1-to-1 hash; expected result: [4210649924, 4239425932, 2583891669, 2278324621, 1697424527, 1323302812, 3062448259, 2695025053]
--------------------------------
INFO     prove_program [ 8.06s | 4.28% / 100.00% ]
INFO     ┝━ execute_program [ 355ms | 4.41% ]
INFO     ┝━ i [info]: Generated execution trace of 70 columns and 1048576 steps (49% padded) in 355 ms
INFO     ┝━ build_domain [ 8.51ms | 0.11% ] trace_length: 1048576 | lde_domain_size: 8388608
INFO     ┝━ commit_to_main_trace_segment [ 2.89s | 0.00% / 35.87% ]
INFO     │  ┝━ extend_execution_trace [ 2.04s | 25.26% ] num_cols: 70 | blowup: 8
INFO     │  ┕━ compute_execution_trace_commitment [ 856ms | 10.61% ] tree_depth: 23
INFO     ┝━ commit_to_aux_trace_segment [ 760ms | 0.00% / 9.42% ]
INFO     │  ┝━ extend_execution_trace [ 463ms | 5.74% ] num_cols: 7 | blowup: 8
INFO     │  ┕━ compute_execution_trace_commitment [ 297ms | 3.68% ] tree_depth: 23
INFO     ┝━ evaluate_constraints [ 2.28s | 28.31% ] ce_domain_size: 8388608
INFO     ┝━ commit_to_constraint_evaluations [ 774ms | 0.00% / 9.60% ]
INFO     │  ┝━ build_composition_poly_columns [ 129ms | 1.60% ] num_columns: 8
INFO     │  ┝━ evaluate_composition_poly_columns [ 380ms | 4.71% ]
INFO     │  ┕━ compute_constraint_evaluation_commitment [ 266ms | 3.30% ] tree_depth: 23
INFO     ┝━ build_deep_composition_poly [ 397ms | 4.93% ]
INFO     ┝━ evaluate_deep_composition_poly [ 117ms | 1.45% ]
INFO     ┝━ compute_fri_layers [ 83.6ms | 1.04% ] num_layers: 4
INFO     ┝━ determine_query_positions [ 266µs | 0.00% ] grinding_factor: 16 | num_positions: 27
INFO     ┕━ build_proof_object [ 46.6ms | 0.58% ]
--------------------------------
Executed program in 8065 ms
Stack outputs: [4210649924, 4239425932, 2583891669, 2278324621, 1697424527, 1323302812, 3062448259, 2695025053]
Execution proof size: 99 KB
Execution proof security: 96 bits
--------------------------------
INFO     verify_program [ 886µs | 100.00% ]
Execution verified in 0 ms

This branch

============================================================
INFO     assemble_with_opts_in_context [ 5.17ms | 100.00% ]
Generated a program to compute 100-th iteration of BLAKE3 1-to-1 hash; expected result: [4210649924, 4239425932, 2583891669, 2278324621, 1697424527, 1323302812, 3062448259, 2695025053]
--------------------------------
INFO     prove_program [ 18.9s | 32.96% / 100.00% ]
INFO     ┝━ execute_program [ 331ms | 1.76% ]
INFO     ┝━ i [info]: Generated execution trace of 70 columns and 1048576 steps (49% padded) in 330 ms
INFO     ┝━ build_domain [ 8.90ms | 0.05% ] trace_length: 1048576 | lde_domain_size: 8388608
INFO     ┝━ commit_to_main_trace_segment [ 2.90s | 0.00% / 15.36% ]
INFO     │  ┝━ extend_execution_trace [ 2.04s | 10.82% ] num_cols: 70 | blowup: 8
INFO     │  ┕━ compute_execution_trace_commitment [ 857ms | 4.54% ] tree_depth: 23
INFO     ┝━ commit_to_aux_trace_segment [ 733ms | 0.00% / 3.89% ]
INFO     │  ┝━ extend_execution_trace [ 443ms | 2.35% ] num_cols: 8 | blowup: 8
INFO     │  ┕━ compute_execution_trace_commitment [ 290ms | 1.54% ] tree_depth: 23
INFO     ┝━ evaluate_constraints [ 7.02s | 37.24% ] ce_domain_size: 8388608
INFO     ┝━ commit_to_constraint_evaluations [ 762ms | 0.00% / 4.04% ]
INFO     │  ┝━ build_composition_poly_columns [ 127ms | 0.67% ] num_columns: 8
INFO     │  ┝━ evaluate_composition_poly_columns [ 371ms | 1.97% ]
INFO     │  ┕━ compute_constraint_evaluation_commitment [ 264ms | 1.40% ] tree_depth: 23
INFO     ┝━ build_deep_composition_poly [ 721ms | 3.82% ]
INFO     ┝━ evaluate_deep_composition_poly [ 66.5ms | 0.35% ]
INFO     ┝━ compute_fri_layers [ 75.3ms | 0.40% ] num_layers: 4
INFO     ┝━ determine_query_positions [ 115µs | 0.00% ] grinding_factor: 16 | num_positions: 27
INFO     ┕━ build_proof_object [ 24.8ms | 0.13% ]
--------------------------------
Executed program in 18854 ms
Stack outputs: [4210649924, 4239425932, 2583891669, 2278324621, 1697424527, 1323302812, 3062448259, 2695025053]
Execution proof size: 119 KB
Execution proof security: 96 bits
--------------------------------
INFO     verify_program [ 1.39ms | 100.00% ]
Execution verified in 1 ms

To summarize the running times in a table for each step (in seconds):

Step next this
execute program 0.35 0.33
commit main trace 2.9 2.9
commit aux trace 0.76 0.73
evaluate constraints 2.28 7.02
commit to constraint evaluations 0.77 0.76
everything else 0.66 1.51
unaccounted 0.34 5.65
total 8.06 18.9

So, the delta between the next and this branch is wholly explained by: (1) the increase in constraint evaluation step which is now taking 4.8 extra seconds, and (2) some unaccounted steps which now take an extra 5.3 seconds.

My hypothesis is that there are 2 "unaccounted" steps: building of the auxiliary trace and generating GKR proof. We should probably update Winterfell to log these steps explicitly. Also, would be nice to add "sub-items" to the constraint evaluation step logging so that we can see Lagrange kernel constraint evaluation time (and maybe something else?) explicitly as well.

@plafer
Copy link
Contributor Author

plafer commented Jul 10, 2024

Added instrumentation for generate_gkr_proof() and build_aux_trace(). On my machine (only 7Gb of free RAM...), I got 64s for the benchmark again, with:

  • generate_gkr_proof(): 11.2s
  • build_aux_trace(): 1.58s

@bobbinth
Copy link
Contributor

I re-ran the benchmark with the latest changed to Winterfell included (and also added a couple more instrumentations inside the GKR prover), and now everything seems to be accounted for. It seems like building the the GKR proof adds pretty much as much extra time as constraint evaluation and accounts for almost 5.6 seconds. Here is the breakdown:

============================================================
INFO     assemble_with_opts_in_context [ 3.65ms | 100.00% ]
Generated a program to compute 100-th iteration of BLAKE3 1-to-1 hash; expected result: [4210649924, 4239425932, 2583891669, 2278324621, 1697424527, 1323302812, 3062448259, 2695025053]
--------------------------------
INFO     prove_program [ 19.5s | 0.54% / 100.00% ]
INFO     ┝━ execute_program [ 354ms | 1.81% ]
INFO     ┝━ i [info]: Generated execution trace of 70 columns and 1048576 steps (49% padded) in 354 ms
INFO     ┝━ build_domain [ 10.1ms | 0.05% ] trace_length: 1048576 | lde_domain_size: 8388608
INFO     ┝━ commit_to_main_trace_segment [ 2.93s | 0.00% / 14.98% ]
INFO     │  ┝━ extend_execution_trace [ 2.06s | 10.57% ] num_cols: 70 | blowup: 8
INFO     │  ┕━ compute_execution_trace_commitment [ 861ms | 4.41% ] tree_depth: 23
INFO     ┝━ generate_gkr_proof [ 5.59s | 0.00% / 28.60% ]
INFO     │  ┕━ prove_virtual_bus [ 5.59s | 0.96% / 28.60% ]
INFO     │     ┝━ new_evaluated_circuit [ 1.31s | 6.69% ]
INFO     │     ┝━ prove_before_final_circuit_layers [ 556ms | 2.85% ]
INFO     │     ┕━ prove_final_circuit_layer [ 3.54s | 18.10% ]
INFO     ┝━ build_aux_trace [ 786ms | 4.02% ]
INFO     ┝━ commit_to_aux_trace_segment [ 719ms | 0.00% / 3.68% ]
INFO     │  ┝━ extend_execution_trace [ 448ms | 2.29% ] num_cols: 8 | blowup: 8
INFO     │  ┕━ compute_execution_trace_commitment [ 271ms | 1.39% ] tree_depth: 23
INFO     ┝━ evaluate_constraints [ 7.32s | 14.21% / 37.49% ] ce_domain_size: 8388608
INFO     │  ┕━ evaluate_constraints [ 4.55s | 23.28% ]
INFO     ┝━ commit_to_constraint_evaluations [ 821ms | 0.00% / 4.20% ]
INFO     │  ┝━ build_composition_poly_columns [ 126ms | 0.65% ] num_columns: 8
INFO     │  ┝━ evaluate_composition_poly_columns [ 423ms | 2.16% ]
INFO     │  ┕━ compute_constraint_evaluation_commitment [ 272ms | 1.39% ] tree_depth: 23
INFO     ┝━ build_deep_composition_poly [ 732ms | 3.75% ]
INFO     ┝━ evaluate_deep_composition_poly [ 62.1ms | 0.32% ]
INFO     ┝━ compute_fri_layers [ 85.3ms | 0.44% ] num_layers: 4
INFO     ┝━ determine_query_positions [ 131µs | 0.00% ] grinding_factor: 16 | num_positions: 27
INFO     ┕━ build_proof_object [ 23.0ms | 0.12% ]
--------------------------------
Executed program in 19536 ms
Stack outputs: [4210649924, 4239425932, 2583891669, 2278324621, 1697424527, 1323302812, 3062448259, 2695025053]
Execution proof size: 119 KB
Execution proof security: 96 bits
--------------------------------
INFO     verify_program [ 1.11ms | 100.00% ]
Execution verified in 1 ms

@plafer
Copy link
Contributor Author

plafer commented Sep 12, 2024

Closing in favor of #1493

@plafer plafer closed this Sep 12, 2024
@bobbinth bobbinth deleted the plafer-logupgkr-e2e branch November 5, 2024 03:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants