"Wholesale" blob compression #366

Open
1 of 4 tasks
Tabaie opened this issue Nov 29, 2024 · 0 comments · May be fixed by #367
Open
1 of 4 tasks

"Wholesale" blob compression #366

Tabaie opened this issue Nov 29, 2024 · 0 comments · May be fixed by #367
Assignees
Labels
Data compressor enhancement New feature or request

Comments

@Tabaie
Copy link
Contributor

Tabaie commented Nov 29, 2024

Problem Statement

Currently the blob compressor processes its payload block by block, which denies the underlying lzss compressor some opportunities to find longer back references across block boundaries. Recompressing the entire payload of sample Sepolia blobs in a single pass gives the following results:
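For instance, if a calldata pattern first seen in one block recurs in a much later block, a per-block pass cannot emit a back reference to the earlier occurrence, whereas a single pass over the whole payload can.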

```
recompressing  4454961-4457351-e1e96ae51f742703127930ac10b1271bba431670cb58df245d447fec45b5abb0-getZkBlobCompressionProof.json
recompressed size 130016 -> 121312
Utilization 99.19% -> 92.55%

recompressing  4457352-4459439-11b64ec1986a1b3f91b2682f863ba295174141bdafa43991384e18f9cecb4424-getZkBlobCompressionProof.json
recompressed size 130048 -> 122496
Utilization 99.22% -> 93.46%

recompressing  4459440-4461292-033bc5d3158db24465a9320fdae89118d95e1f6d0845d6e6d274ee093458256f-getZkBlobCompressionProof.json
recompressed size 130016 -> 123296
Utilization 99.19% -> 94.07%

recompressing  4461293-4463201-a53da3be9d3a80b5ffe7f69942bfdd75d9728d11d00d0f46d6ef564f9f6b03c3-getZkBlobCompressionProof.json
recompressed size 130048 -> 123104
Utilization 99.22% -> 93.92%

recompressing  4463202-4464942-ae8a1b736e364c93e76034d42560ba2ac4937c3b13c3df49cc64f231acbd3231-getZkBlobCompressionProof.json
recompressed size 130048 -> 123712
Utilization 99.22% -> 94.38%

recompressing  4464943-4466962-e747d64b41e98ba6211a9af05971beee141fab4e4a4e3b7a9ed3ba0db297ea51-getZkBlobCompressionProof.json
recompressed size 129952 -> 122592
Utilization 99.15% -> 93.53%

recompressing  4466963-4469060-e33f117a3638c9d15d6f9295f4b69632cf5b48558ada42536287abca693c8e97-getZkBlobCompressionProof.json
recompressed size 130016 -> 122368
Utilization 99.19% -> 93.36%

recompressing  4469061-4471076-6d2296765837ff017e6a6fb9043bf95c8cac0d5ac9cb6c15818f1821f9ef9ed5-getZkBlobCompressionProof.json
recompressed size 129824 -> 122496
Utilization 99.05% -> 93.46%

recompressing  4471077-4473022-dfcd2c387d8f5563e3a1a890c5bc7846c0ce9fa9830a6650f6dd52fe077435be-getZkBlobCompressionProof.json
recompressed size 130048 -> 122944
Utilization 99.22% -> 93.80%

recompressing  4473023-4474914-2c11da9113438ffd919da190bf11cabd2057112eb2e5caba6bf9fd6b0d224f25-getZkBlobCompressionProof.json
recompressed size 130016 -> 123136
Utilization 99.19% -> 93.95%

recompressing  4474915-4476909-7a4a04da84f9821c001c0fcc25e7866c629a648e8650870cb4d8cde3958e94c6-getZkBlobCompressionProof.json
recompressed size 128832 -> 121568
Utilization 98.29% -> 92.75%

--- PASS: TestRecompress (3.34s)
```
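(Utilization here is measured against the EIP-4844 blob capacity of 4096 field elements × 32 bytes = 131072 bytes; e.g. 121312 / 131072 ≈ 92.55%.)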

Proposed Solution

We already knew that wholesale compression would yield better results, though perhaps not to this extent. It was avoided in order to keep the latency of compressing a single block low. To get both benefits, I propose keeping the Write method as is, and adding methods StartOptimizer(period time.Duration) and StopOptimizer(). The optimizer runs in the background, periodically recompressing the entire payload wholesale. Its results are discarded if a new block was submitted while the recompression was running (see the sketch below).

Note: This will not affect the decompressor.

Pitfalls: Async work is tricky and errors are hard to reproduce.
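Below is a minimal sketch of the intended shape, assuming a hypothetical BlobCompressor wrapper; all field and helper names here are illustrative, not the existing compressor API:

```go
package compressor

import (
	"sync"
	"time"
)

// BlobCompressor is a hypothetical wrapper around the existing block-by-block
// compressor; field and method names are illustrative only.
type BlobCompressor struct {
	mu         sync.Mutex
	generation uint64                       // bumped by every Write
	payload    []byte                       // everything written so far
	compressed []byte                       // current block-by-block output
	recompress func([]byte) ([]byte, error) // wholesale lzss pass
	stop       chan struct{}
}

// StartOptimizer launches a background loop that periodically recompresses
// the whole payload and swaps the result in, unless a new block arrived.
func (c *BlobCompressor) StartOptimizer(period time.Duration) {
	c.stop = make(chan struct{})
	go func() {
		ticker := time.NewTicker(period)
		defer ticker.Stop()
		for {
			select {
			case <-c.stop:
				return
			case <-ticker.C:
				c.mu.Lock()
				gen := c.generation
				payload := append([]byte(nil), c.payload...) // snapshot
				c.mu.Unlock()

				// Expensive wholesale pass, run without holding the lock.
				out, err := c.recompress(payload)
				if err != nil {
					continue // keep the block-by-block result
				}

				c.mu.Lock()
				// Void the result if a new block was written meanwhile.
				if c.generation == gen && len(out) < len(c.compressed) {
					c.compressed = out
				}
				c.mu.Unlock()
			}
		}
	}()
}

// StopOptimizer terminates the background loop.
func (c *BlobCompressor) StopOptimizer() { close(c.stop) }
```

The generation counter is the key piece: Write would increment it, and the optimizer only swaps in its result if the counter is unchanged, which enforces the "voided if a new block has been submitted" rule without blocking Write for the duration of the wholesale pass.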

@gbotrel @jpnovais thoughts?

Reference

```go
// Imports reconstructed for context; the exact module paths are assumed from
// the linea-monorepo layout and may need adjusting.
import (
	"bytes"
	"encoding/base64"
	"fmt"
	"os"
	"path/filepath"
	"testing"

	"github.com/consensys/compress/lzss"
	fr381 "github.com/consensys/gnark-crypto/ecc/bls12-381/fr"
	"github.com/consensys/linea-monorepo/prover/backend/blobdecompression"
	"github.com/consensys/linea-monorepo/prover/lib/compressor/blob/dictionary"
	"github.com/consensys/linea-monorepo/prover/lib/compressor/blob/encode"
	v1 "github.com/consensys/linea-monorepo/prover/lib/compressor/blob/v1"
	"github.com/consensys/linea-monorepo/prover/utils/test_utils"
	"github.com/stretchr/testify/require"
)

const responseDir = "integration/all-backend/testdata/sepolia-v0.8.0-rc3/prover-compression/responses"

func TestRecompress(t *testing.T) {
	dir, err := os.ReadDir(responseDir)
	require.NoError(t, err)

	const dictPath = "lib/compressor/compressor_dict.bin"
	dict, err := os.ReadFile(dictPath)
	require.NoError(t, err)
	compressor, err := lzss.NewCompressor(dict)
	require.NoError(t, err)
	dictStore := dictionary.NewStore(dictPath)

	var resp blobdecompression.Response
	var bbH, bbT bytes.Buffer
	for _, file := range dir {
		fmt.Println("recompressing ", file.Name())
		test_utils.LoadJson(t, filepath.Join(responseDir, file.Name()), &resp)
		blob, err := base64.StdEncoding.DecodeString(resp.CompressedData)
		require.NoError(t, err)

		// Sanity check: the last 32-byte word of the blob is not all zeros,
		// i.e. the blob is full up to word granularity.
		nbValidBytes := len(blob)
		for nbValidBytes != 0 && blob[nbValidBytes-1] == 0 {
			nbValidBytes--
		}
		nbValidBytes = (nbValidBytes + 31) / 32 * 32 // round up to a full word
		require.Equal(t, nbValidBytes, len(blob))

		header, payload, _, err := v1.DecompressBlob(blob, dictStore)
		require.NoError(t, err)

		bbH.Reset()
		bbT.Reset()
		compressor.Reset()

		// Re-serialize the header unchanged.
		_, err = header.WriteTo(&bbH)
		require.NoError(t, err)

		// Recompress the entire payload in a single pass ("wholesale").
		_, err = compressor.Write(payload)
		require.NoError(t, err)

		// Pack header + recompressed payload into field-element-aligned form.
		_, err = encode.PackAlign(&bbT, bbH.Bytes(), fr381.Bits-1, encode.WithAdditionalInput(compressor.Bytes()))
		require.NoError(t, err)

		// Round-trip: the recompressed blob must decompress to the same data.
		headerBack, payloadBack, _, err := v1.DecompressBlob(bbT.Bytes(), dictStore)
		require.NoError(t, err)
		require.Equal(t, header, headerBack)
		require.Equal(t, payload, payloadBack)

		// Utilization is measured against the 4096 x 32-byte blob capacity.
		fmt.Printf("recompressed size %d -> %d\nUtilization %.2f%% -> %.2f%%\n\n",
			len(blob), len(bbT.Bytes()),
			float64(len(blob))/32/4096*100, float64(len(bbT.Bytes()))/32/4096*100)
	}
}
```
- Contains some manner of action item.
- Contains the service the action pertains to
- Network scope: select those that apply, or select All
  - All
  - Mainnet
  - Testnet - Sepolia
  - Devnet
Tabaie added the Data compressor and enhancement (New feature or request) labels Nov 29, 2024
Tabaie self-assigned this Nov 29, 2024
Tabaie linked pull request #367 (Nov 30, 2024) that will close this issue