mirror of
https://github.com/OffchainLabs/prysm.git
synced 2026-02-03 09:35:00 -05:00
<!-- Thanks for sending a PR! Before submitting:

1. If this is your first PR, check out our contribution guide here https://docs.prylabs.network/docs/contribute/contribution-guidelines
   You will then need to sign our Contributor License Agreement (CLA), which will show up as a comment from a bot in this pull request after you open it. We cannot review code without a signed CLA.
2. Please file an associated tracking issue if this pull request is non-trivial and requires context for our team to understand. All features and most bug fixes should have an associated issue with a design discussed and decided upon. Small bug fixes and documentation improvements don't need issues.
3. New features and bug fixes must have tests. Documentation may need to be updated. If you're unsure what to update, send the PR, and we'll discuss in review.
4. Note that PRs updating dependencies and new Go versions are not accepted. Please file an issue instead.
5. A changelog entry is required for user facing issues.
-->

**What type of PR is this?**

Feature

**What does this PR do? Why is it needed?**

This PR replaces the previous PR https://github.com/OffchainLabs/prysm/pull/16121, which built the entire Merkle tree and generated proofs only after the tree was complete. In this PR, the Merkle proof is produced by collecting hashes while the Merkle tree is being built. This approach has proven to be more efficient than the one in https://github.com/OffchainLabs/prysm/pull/16121.

- **ProofCollector**:
  - New `ProofCollector` type in `encoding/ssz/query/proof_collector.go`: Collects sibling hashes and leaves needed for Merkle proofs during merkleization.
  - Multiproof-ready design with `requiredSiblings`/`requiredLeaves` maps for registering target gindices before merkleization.
  - Thread-safe: read-only required maps during merkleization, mutex-protected writes to `siblings`/`leaves`.
  - `AddTarget(gindex)` registers a target leaf and computes all required sibling gindices along the path to root.
  - `toProof()` converts collected data into `fastssz.Proof` structure.
  - Parallel execution in `merkleizeVectorBody` for composite elements with worker pool pattern.
  - Optimized container hashing: Generalized `stateutil.OptimizedValidatorRoots` pattern for any SSZ container type:
    - `optimizedContainerRoots`: Parallelized field root computation + level-by-level vectorized hashing via `VectorizedSha256`.
    - `hashContainerHelper`: Worker goroutine for processing container subsets.
    - `containerFieldRoots`: Computes field roots for a single container using reflection and SszInfo metadata.
- **`Prove(gindex)` method** in `encoding/ssz/query/merkle_proof.go`: Entry point for generating SSZ Merkle proofs for a given generalized index.
- **Testing**
  - Added `merkle_proof_test.go` and `proof_collector_test.go` to test and benchmark this feature.

The main outcomes of the optimizations are here:

```
❯ go test ./encoding/ssz/query -run=^$ -bench='Benchmark(OptimizedContainerRoots|OptimizedValidatorRoots|ProofCollectorMerkleize)$' -benchmem
goos: darwin
goarch: arm64
pkg: github.com/OffchainLabs/prysm/v7/encoding/ssz/query
cpu: Apple M2 Pro
BenchmarkOptimizedValidatorRoots-10     3237     361029 ns/op     956858 B/op     6024 allocs/op
BenchmarkOptimizedContainerRoots-10     1138     969002 ns/op    3245223 B/op    11024 allocs/op
BenchmarkProofCollectorMerkleize-10      522    2262066 ns/op    3216000 B/op    19000 allocs/op
PASS
ok      github.com/OffchainLabs/prysm/v7/encoding/ssz/query    4.619s
```

Because `OptimizedValidatorRoots` already implements very effective optimizations, `OptimizedContainerRoots` mimics them. In the benchmark we can see that `OptimizedValidatorRoots` remains the most performant, so it is the baseline here:

- `ProofCollectorMerkleize` is **~6.3× slower**, uses **~3.4× more memory** (B/op), and performs **~3.2× more allocations**.
- `OptimizedContainerRoots` sits in between: it’s **~2.7× slower** than `OptimizedValidatorRoots` (and **~3.4× higher B/op**, **~1.8× more allocations**), but it is a clear win over `ProofCollectorMerkleize` for lists/vectors: **~2.3× faster** with **~1.7× fewer allocations** (and essentially the same memory footprint).

The main drawback is that `OptimizedContainerRoots` can only be applied to vector/list subtrees where we don’t need to collect any sibling/leaf data (i.e., no proof targets within that subtree); integrating it into the recursive `merkleize(...)` flow when targets are outside the subtree is expected to land in a follow-up PR.

**Which issue(s) does this PR fix?**

Partially https://github.com/OffchainLabs/prysm/issues/15598

**Other notes for review**

In this [write-up](https://hackmd.io/@fernantho/BJbZ1xmmbg), I describe the process that led to this solution.

Future improvements:

- Defensive check that the gindex is not too big, described [here](https://github.com/OffchainLabs/prysm/pull/16177#discussion_r2671684100).
- Integrate `optimizedContainerRoots` into the recursive `merkleize(...)` flow when proof targets are not within the subtree (skip full traversal for container lists).
- Add multiproofs.
- Connect `proofCollector` to SSZ-QL endpoints (direct integration of `proofCollector` for BeaconBlock endpoint and "hybrid" approach for BeaconState endpoint).

**Acknowledgements**

- [x] I have read [CONTRIBUTING.md](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md).
- [x] I have included a uniquely named [changelog fragment file](https://github.com/prysmaticlabs/prysm/blob/develop/CONTRIBUTING.md#maintaining-changelogmd).
- [x] I have added a description with sufficient context for reviewers to understand this PR.
- [x] I have tested that my changes work as expected and I added a testing plan to the PR description (if applicable).
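**Illustrative sketch (not the code in this PR)**

For reviewers new to the idea of collecting a proof while the tree is being built, here is a minimal, single-target sketch. It is an assumption-laden toy: `siblingPath`, `collector`, `newCollector`, `merkleize` and `rootFromProof` are hypothetical names, the tree is a plain power-of-two binary tree hashed with `crypto/sha256`, and there is no parallelism, mutex, or multiproof support, unlike the `ProofCollector` described above. It only shows the two steps: registering the sibling gindices on the path from the target to the root, then recording each of those hashes at the moment merkleization computes it.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// siblingPath returns the generalized indices of the siblings needed to prove
// the node at gindex, ordered from the target's level up to the root's children.
func siblingPath(gindex uint64) []uint64 {
	var path []uint64
	for g := gindex; g > 1; g /= 2 {
		path = append(path, g^1) // a sibling differs only in the lowest bit
	}
	return path
}

// collector is a hypothetical, single-target stand-in for a proof collector.
type collector struct {
	required map[uint64]bool     // gindices to record, fixed before hashing
	siblings map[uint64][32]byte // hashes recorded during hashing
}

func newCollector(target uint64) *collector {
	c := &collector{required: map[uint64]bool{}, siblings: map[uint64][32]byte{}}
	for _, g := range siblingPath(target) {
		c.required[g] = true
	}
	return c
}

// merkleize hashes a power-of-two number of leaves rooted at gindex and
// records every required node as soon as its hash is known.
func (c *collector) merkleize(leaves [][32]byte, gindex uint64) [32]byte {
	var root [32]byte
	if len(leaves) == 1 {
		root = leaves[0]
	} else {
		mid := len(leaves) / 2
		left := c.merkleize(leaves[:mid], 2*gindex)
		right := c.merkleize(leaves[mid:], 2*gindex+1)
		root = sha256.Sum256(append(left[:], right[:]...))
	}
	if c.required[gindex] {
		c.siblings[gindex] = root
	}
	return root
}

// rootFromProof recomputes the root from a leaf and the collected siblings.
func rootFromProof(leaf [32]byte, target uint64, siblings map[uint64][32]byte) [32]byte {
	node := leaf
	for g := target; g > 1; g /= 2 {
		sib := siblings[g^1]
		if g%2 == 0 { // node is a left child
			node = sha256.Sum256(append(node[:], sib[:]...))
		} else { // node is a right child
			node = sha256.Sum256(append(sib[:], node[:]...))
		}
	}
	return node
}

func main() {
	var leaves [4][32]byte
	for i := range leaves {
		leaves[i][0] = byte(i + 1)
	}
	const target = 6 // third leaf of a 4-leaf tree (gindex = 4 + 2)
	c := newCollector(target)
	root := c.merkleize(leaves[:], 1)
	fmt.Println(siblingPath(target)) // [7 2]
	fmt.Println("proof valid:", rootFromProof(leaves[2], target, c.siblings) == root)
}
```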
---------

Co-authored-by: Radosław Kapka <radoslaw.kapka@gmail.com>
Co-authored-by: Jun Song <87601811+syjn99@users.noreply.github.com>