Reth Development Guide for AI Agents
This guide provides comprehensive instructions for AI agents working on the Reth codebase. It covers the architecture, development workflows, and critical guidelines for effective contributions.
Project Overview
Reth is a high-performance Ethereum execution client written in Rust, focusing on modularity, performance, and contributor-friendliness. The codebase is organized into well-defined crates with clear boundaries and responsibilities.
Architecture Overview
Core Components
- Consensus (crates/consensus/): Validates blocks according to Ethereum consensus rules
- Storage (crates/storage/): Hybrid database using MDBX + static files for optimal performance
- Networking (crates/net/): P2P networking stack with discovery, sync, and transaction propagation
- RPC (crates/rpc/): JSON-RPC server supporting all standard Ethereum APIs
- Execution (crates/evm/, crates/ethereum/): Transaction execution and state transitions
- Pipeline (crates/stages/): Staged sync architecture for blockchain synchronization
- Trie (crates/trie/): Merkle Patricia Trie implementation with parallel state root computation
- Node Builder (crates/node/): High-level node orchestration and configuration
- The Consensus Engine (crates/engine/): Handles processing blocks received from the consensus layer with the Engine API (newPayload, forkchoiceUpdated)
Key Design Principles
- Modularity: Each crate can be used as a standalone library
- Performance: Extensive use of parallelism, memory-mapped I/O, and optimized data structures
- Extensibility: Traits and generic types allow for different implementations (Ethereum, Optimism, etc.)
- Type Safety: Strong typing throughout with minimal use of dynamic dispatch
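The extensibility and type-safety principles above can be illustrated with a minimal sketch. Every name here (`Hardforks`, `MainnetLikeSpec`, `AlwaysActiveSpec`, `Validator`) is hypothetical and not part of the Reth API; the sketch only shows the general shape of a component that is generic over a chain-spec trait and resolved at compile time rather than through `dyn` dispatch. The 49,152-byte limit is the init code size from EIP-3860 and is used purely as an illustrative value.

```rust
/// Hypothetical trait: the fork information this component needs.
pub trait Hardforks {
    /// Returns true if the init-code-limit fork is active at `timestamp`.
    fn is_limit_fork_active(&self, timestamp: u64) -> bool;
}

/// Hypothetical spec for a chain that activates the fork at a fixed time.
pub struct MainnetLikeSpec {
    pub fork_time: u64,
}

/// Hypothetical spec for a chain where the fork is active from genesis.
pub struct AlwaysActiveSpec;

impl Hardforks for MainnetLikeSpec {
    fn is_limit_fork_active(&self, timestamp: u64) -> bool {
        timestamp >= self.fork_time
    }
}

impl Hardforks for AlwaysActiveSpec {
    fn is_limit_fork_active(&self, _timestamp: u64) -> bool {
        true
    }
}

/// Generic over the chain spec: the compiler monomorphizes one copy per `C`,
/// so no dynamic dispatch is involved.
pub struct Validator<C> {
    spec: C,
}

impl<C: Hardforks> Validator<C> {
    pub fn new(spec: C) -> Self {
        Self { spec }
    }

    /// Returns the init code size limit if the fork is active, otherwise `None`.
    pub fn max_initcode_size(&self, timestamp: u64) -> Option<usize> {
        if self.spec.is_limit_fork_active(timestamp) {
            Some(49_152)
        } else {
            None
        }
    }
}
```

The same `Validator` code serves both chain types; supporting another chain only requires a new `Hardforks` implementation.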
Development Workflow
Code Style and Standards
- Formatting: Always use nightly rustfmt
    cargo +nightly fmt --all
- Linting: Run clippy with all features
    RUSTFLAGS="-D warnings" cargo +nightly clippy --workspace --lib --examples --tests --benches --all-features --locked
- Testing: Use nextest for faster test execution
    cargo nextest run --workspace
Common Contribution Types
Based on actual recent PRs, here are typical contribution patterns:
1. Small Bug Fixes (1-10 lines)
Real example: Fixing beacon block root handling (#16767)
// Changed a single line to fix logic error
- parent_beacon_block_root: parent.parent_beacon_block_root(),
+ parent_beacon_block_root: parent.parent_beacon_block_root().map(|_| B256::ZERO),
2. Integration with Upstream Changes
Real example: Integrating revm updates (#16752)
// Update code to use new APIs from dependencies
- if self.fork_tracker.is_shanghai_activated() {
- if let Err(err) = transaction.ensure_max_init_code_size(MAX_INIT_CODE_BYTE_SIZE) {
+ if let Some(init_code_size_limit) = self.fork_tracker.max_initcode_size() {
+ if let Err(err) = transaction.ensure_max_init_code_size(init_code_size_limit) {
3. Adding Comprehensive Tests
Real example: ETH69 protocol tests (#16759)
#[tokio::test(flavor = "multi_thread")]
async fn test_eth69_peers_can_connect() {
    // Create test network with specific protocol versions
    let p0 = PeerConfig::with_protocols(NoopProvider::default(), Some(EthVersion::Eth69.into()));
    // Test connection and version negotiation
}
4. Making Components Generic
Real example: Making EthEvmConfig generic over chainspec (#16758)
// Before: Hardcoded to ChainSpec
- pub struct EthEvmConfig<EvmFactory = EthEvmFactory> {
- pub executor_factory: EthBlockExecutorFactory<RethReceiptBuilder, Arc<ChainSpec>, EvmFactory>,
// After: Generic over any chain spec type
+ pub struct EthEvmConfig<C = ChainSpec, EvmFactory = EthEvmFactory>
+ where
+ C: EthereumHardforks,
+ {
+ pub executor_factory: EthBlockExecutorFactory<RethReceiptBuilder, Arc<C>, EvmFactory>,
5. Resource Management Improvements
Real example: ETL directory cleanup (#16770)
// Add cleanup logic on startup
+ if let Err(err) = fs::remove_dir_all(&etl_path) {
+ warn!(target: "reth::cli", ?etl_path, %err, "Failed to remove ETL path on launch");
+ }
6. Feature Additions
Real example: Sharded mempool support (#16756)
// Add new filtering policies for transaction announcements
pub struct ShardedMempoolAnnouncementFilter<T> {
    pub inner: T,
    pub shard_bits: u8,
    pub node_id: Option<B256>,
}
Testing Guidelines
- Unit Tests: Test individual functions and components
- Integration Tests: Test interactions between components
- Benchmarks: For performance-critical code
- Fuzz Tests: For parsing and serialization code
- Property Tests: For checking component correctness on a wide variety of inputs
Example test structure:
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_component_behavior() {
        // Arrange
        let component = Component::new();
        // Act
        let result = component.operation();
        // Assert
        assert_eq!(result, expected);
    }
}
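As a concrete instance of the structure above, here is a minimal, dependency-free sketch that pairs a unit test with a property-style round-trip check. The helpers `encode_be`, `decode_be`, `XorShift64`, and `roundtrip_holds` are hypothetical names invented for this illustration. The hand-rolled generator stands in for a property-testing crate only to keep the example self-contained.

```rust
/// Hypothetical helper: encodes a `u64` as 8 big-endian bytes.
pub fn encode_be(value: u64) -> [u8; 8] {
    value.to_be_bytes()
}

/// Hypothetical helper: decodes 8 big-endian bytes, or returns `None`
/// if the input is not exactly 8 bytes long.
pub fn decode_be(bytes: &[u8]) -> Option<u64> {
    if bytes.len() != 8 {
        return None;
    }
    let mut array = [0u8; 8];
    array.copy_from_slice(bytes);
    Some(u64::from_be_bytes(array))
}

/// Small deterministic generator (xorshift64) so the property check is
/// reproducible without external crates.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // The xorshift state must never be zero.
        Self(seed.max(1))
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Property: decoding an encoded value yields the original value.
pub fn roundtrip_holds(samples: usize, seed: u64) -> bool {
    let mut rng = XorShift64::new(seed);
    (0..samples).all(|_| {
        let value = rng.next_u64();
        decode_be(&encode_be(value)) == Some(value)
    })
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn roundtrip_property() {
        // Arrange + Act: check the property over many generated inputs.
        // Assert: it holds for all of them.
        assert!(roundtrip_holds(1_000, 42));
    }

    #[test]
    fn rejects_wrong_length() {
        assert_eq!(decode_be(&[0u8; 7]), None);
    }
}
```

Fixing the seed keeps failures reproducible, which matters when a property check fails only on rare inputs.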
Performance Considerations
- Avoid Allocations in Hot Paths: Use references and borrowing
- Parallel Processing: Use rayon for CPU-bound parallel work
- Async/Await: Use tokio for I/O-bound operations
- File Operations: Use reth_fs_util instead of std::fs for better error handling
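To make the first point above concrete, the sketch below contrasts an allocating helper with one that borrows its input and writes into a caller-owned buffer. Both functions (`hex_alloc`, `hex_into`) are hypothetical and written only for this illustration; they are not taken from the Reth codebase.

```rust
/// Allocating version: creates a temporary `String` per byte plus a new
/// output `String` on every call.
pub fn hex_alloc(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Buffer-reusing version: borrows the input and overwrites `out`, so a
/// caller in a hot loop can keep one buffer and avoid repeated allocations
/// once its capacity has grown large enough.
pub fn hex_into(bytes: &[u8], out: &mut String) {
    const DIGITS: &[u8; 16] = b"0123456789abcdef";
    out.clear();
    out.reserve(bytes.len() * 2);
    for &b in bytes {
        out.push(DIGITS[(b >> 4) as usize] as char);
        out.push(DIGITS[(b & 0x0f) as usize] as char);
    }
}
```

Whether the second form is worth the less convenient signature depends on how hot the path is, so a benchmark should back up the change.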
Common Pitfalls
- Don't Block Async Tasks: Use spawn_blocking for CPU-intensive work or work with lots of blocking I/O
- Handle Errors Properly: Use the ? operator and proper error types
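For the error-handling point, a minimal sketch of a dedicated error type combined with the `?` operator might look like the following. `ConfigError` and `parse_port` are hypothetical names made up for this example; the sketch uses only the standard library and does not reflect Reth's actual error types.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type with one variant per failure mode.
#[derive(Debug)]
pub enum ConfigError {
    MissingSeparator,
    InvalidPort(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::MissingSeparator => write!(f, "address is missing a ':' separator"),
            Self::InvalidPort(err) => write!(f, "invalid port: {}", err),
        }
    }
}

impl std::error::Error for ConfigError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            Self::MissingSeparator => None,
            Self::InvalidPort(err) => Some(err),
        }
    }
}

/// Lets `?` convert a `ParseIntError` into `ConfigError` automatically.
impl From<ParseIntError> for ConfigError {
    fn from(err: ParseIntError) -> Self {
        Self::InvalidPort(err)
    }
}

/// Extracts the port from a `host:port` string without panicking.
pub fn parse_port(addr: &str) -> Result<u16, ConfigError> {
    let (_, port) = addr.rsplit_once(':').ok_or(ConfigError::MissingSeparator)?;
    Ok(port.parse::<u16>()?)
}
```

Callers can match on the variant to react to a specific failure, while `source()` preserves the underlying cause for logging.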
What to Avoid
Based on PR patterns, avoid:
- Large, sweeping changes: Keep PRs focused and reviewable
- Mixing unrelated changes: One logical change per PR
- Ignoring CI failures: All checks must pass
- Incomplete implementations: Finish features before submitting
- Modifying libmdbx sources: Never modify files in crates/storage/libmdbx-rs/mdbx-sys/libmdbx/ because this is vendored third-party code
CI Requirements
Before submitting changes, ensure:
- Format Check: cargo +nightly fmt --all --check
- Clippy: No warnings with RUSTFLAGS="-D warnings"
- Tests Pass: All unit and integration tests
- Documentation: Update relevant docs and add doc comments with cargo docs --document-private-items
- Commit Messages: Follow conventional format (feat:, fix:, chore:, etc.)
Opening PRs against https://github.com/paradigmxyz/reth
Label PRs appropriately. First check the available labels, then apply the relevant ones:
- when changes are RPC related, add A-rpc label
- when changes are docs related, add C-docs label
- when changes are optimism related (e.g. new feature or exclusive changes to crates/optimism), add A-op-reth label
- ... and so on, check the available labels for more options.
- if tasked with opening a PR, ensure that all changes are properly formatted:
cargo +nightly fmt --all
If changes in reth include changes to dependencies, run zepter and make lint-toml before finalizing the PR. Assume the zepter binary is installed.
Debugging Tips
- Logging: Use the tracing crate with appropriate levels
    tracing::debug!(target: "reth::component", ?value, "description");
- Metrics: Add metrics for monitoring
    metrics::counter!("reth_component_operations").increment(1);
- Test Isolation: Use separate test databases/directories
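One way to get the test isolation mentioned above is a small guard type that creates a unique directory per test and removes it on drop. The sketch below uses only the standard library, and `TestDir` is a hypothetical helper rather than an existing Reth utility. In practice a crate such as `tempfile` is a common choice for the same job.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};

/// Process-wide counter so that tests running in parallel get distinct paths.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// Hypothetical guard: owns a unique directory and deletes it when dropped.
pub struct TestDir {
    path: PathBuf,
}

impl TestDir {
    /// Creates `<tmp>/<label>-<pid>-<counter>`.
    pub fn new(label: &str) -> io::Result<Self> {
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        let name = format!("{}-{}-{}", label, std::process::id(), id);
        let path = std::env::temp_dir().join(name);
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for TestDir {
    fn drop(&mut self) {
        // Best-effort cleanup: a failure here should not mask the test result.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

A test would create a `TestDir` at the start and point its database or static-file path at `dir.path()`; cleanup then happens even if the test panics.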
Finding Where to Contribute
- Check Issues: Look for issues labeled good-first-issue or help-wanted
- Review TODOs: Search for TODO comments in the codebase
- Improve Tests: Areas with low test coverage are good targets
- Documentation: Improve code comments and documentation
- Performance: Profile and optimize hot paths (with benchmarks)
Common PR Patterns
Small, Focused Changes
Most PRs change only 1-5 files. Examples:
- Single-line bug fixes
- Adding a missing trait implementation
- Updating error messages
- Adding test cases for edge conditions
Integration Work
When dependencies update (especially revm), code needs updating:
- Check for breaking API changes
- Update to use new features (like EIP implementations)
- Ensure compatibility with new versions
Test Improvements
Tests often need expansion for:
- New protocol versions (ETH68, ETH69)
- Edge cases in state transitions
- Network behavior under specific conditions
- Concurrent operations
Making Code More Generic
Common refactoring pattern:
- Replace concrete types with generics
- Add trait bounds for flexibility
- Enable reuse across different chain types (Ethereum, Optimism)
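One detail of this refactoring pattern worth sketching is the default type parameter, which lets existing call sites keep compiling after a struct becomes generic. The types below (`ChainRules`, `DefaultSpec`, `CustomSpec`, `EvmConfigSketch`) are hypothetical stand-ins chosen for this illustration and do not mirror the real Reth definitions.

```rust
use std::sync::Arc;

/// Hypothetical trait bound added during the refactor.
pub trait ChainRules {
    fn chain_id(&self) -> u64;
}

/// Stand-in for the previously hardcoded concrete type.
pub struct DefaultSpec;

impl ChainRules for DefaultSpec {
    fn chain_id(&self) -> u64 {
        1
    }
}

/// Stand-in for a second chain type that can now reuse the component.
pub struct CustomSpec {
    pub id: u64,
}

impl ChainRules for CustomSpec {
    fn chain_id(&self) -> u64 {
        self.id
    }
}

/// After the refactor: generic over `C`, defaulting to the old concrete type
/// so code that names `EvmConfigSketch` without parameters is unaffected.
pub struct EvmConfigSketch<C = DefaultSpec> {
    spec: Arc<C>,
}

impl<C: ChainRules> EvmConfigSketch<C> {
    pub fn new(spec: Arc<C>) -> Self {
        Self { spec }
    }

    pub fn chain_id(&self) -> u64 {
        self.spec.chain_id()
    }
}

/// Convenience constructor that exists only for the default parameter.
impl EvmConfigSketch {
    pub fn with_default_spec() -> Self {
        Self::new(Arc::new(DefaultSpec))
    }
}
```

Keeping a constructor on the defaulted type preserves the old ergonomics, while new chain types go through the generic `new`.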
When to Comment
Write comments that remain valuable after the PR is merged. Future readers won't have PR context - they only see the current code.
✅ DO: Add Value
Explain WHY and non-obvious behavior:
// Process must handle allocations atomically to prevent race conditions
// between dealloc on drop and concurrent limit checks
unsafe impl GlobalAlloc for LimitedAllocator { ... }
// Binary search requires sorted input. Panics on unsorted slices.
fn find_index(items: &[Item], target: &Item) -> Option<usize>
// Timeout set to 5s to match EVM block processing limits
const TRACER_TIMEOUT: Duration = Duration::from_secs(5);
Document constraints and assumptions:
/// Returns heap size estimate.
///
/// Note: May undercount shared references (Rc/Arc). For precise
/// accounting, combine with an allocator-based approach.
fn deep_size_of(&self) -> usize
Explain complex logic:
// We reset limits at task start because tokio reuses threads in
// spawn_blocking pool. Without reset, second task inherits first
// task's allocation count and immediately hits limit.
THREAD_ALLOCATED.with(|allocated| allocated.set(0));
❌ DON'T: Describe Changes
// ❌ BAD - Describes the change, not the code
// Changed from Vec to HashMap for O(1) lookups
// ✅ GOOD - Explains the decision
// HashMap provides O(1) symbol lookups during trace replay
// ❌ BAD - PR-specific context
// Fix for issue #234 where memory wasn't freed
// ✅ GOOD - Documents the actual behavior
// Explicitly drop allocations before limit check to ensure
// accurate accounting
// ❌ BAD - States the obvious
// Increment counter
counter += 1;
// ✅ GOOD - Explains non-obvious purpose
// Track allocations across all threads for global limit enforcement
GLOBAL_COUNTER.fetch_add(1, Ordering::SeqCst);
✅ Comment when:
- Non-obvious behavior or edge cases
- Performance trade-offs
- Safety requirements (unsafe blocks must always be documented)
- Limitations or gotchas
- Why simpler alternatives don't work
❌ Don't comment when:
- Code is self-explanatory
- Just restating the code in English
- Describing what changed in this PR
The Test: "Will this make sense in 6 months?"
Before adding a comment, ask: Would someone reading just the current code (no PR, no history) find this helpful?
Example Contribution Workflow
Let's say you want to fix a bug where external IP resolution fails on startup:
- Create a branch:
    git checkout -b fix-external-ip-resolution
- Find the relevant code:
    # Search for IP resolution code
    rg "external.*ip" --type rust
- Reason about the problem. Once it is identified, make the fix:
    // In crates/net/discv4/src/lib.rs
    pub fn resolve_external_ip() -> Option<IpAddr> {
        // Add fallback mechanism
        nat::external_ip()
            .or_else(|| nat::external_ip_from_stun())
            .or_else(|| Some(DEFAULT_IP))
    }
- Add a test:
    #[test]
    fn test_external_ip_fallback() {
        // Test that resolution has proper fallbacks
    }
- Run checks:
    cargo +nightly fmt --all
    cargo clippy --all-features
    cargo test -p reth-discv4
- Commit with clear message:
    git commit -m "fix: add fallback for external IP resolution

    Previously, node startup could fail if external IP resolution failed.
    This adds fallback mechanisms to ensure the node can always start
    with a reasonable default."
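The fallback chain in the fix step of this workflow can be exercised in isolation with a small, self-contained sketch. Everything here is hypothetical: `resolve_with_fallbacks`, the stub resolvers, and the choice of the unspecified address for `DEFAULT_IP` are assumptions made for illustration, not code from the discv4 crate.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Assumed default for this sketch; the real default is not specified above.
pub const DEFAULT_IP: IpAddr = IpAddr::V4(Ipv4Addr::UNSPECIFIED);

/// Tries each resolver in order and returns the first address found,
/// falling back to `DEFAULT_IP` if every resolver fails.
pub fn resolve_with_fallbacks(resolvers: &[fn() -> Option<IpAddr>]) -> IpAddr {
    resolvers
        .iter()
        // `find_map` is lazy: later resolvers run only if earlier ones return `None`.
        .find_map(|resolve| resolve())
        .unwrap_or(DEFAULT_IP)
}

/// Stub resolver that simulates a failed lookup.
pub fn always_fails() -> Option<IpAddr> {
    None
}

/// Stub resolver that simulates a successful lookup.
pub fn loopback() -> Option<IpAddr> {
    Some(IpAddr::V4(Ipv4Addr::LOCALHOST))
}
```

Passing the resolvers as a slice makes the fallback order explicit and lets a test substitute stubs for network calls, which is one way to fill in the empty test body from the workflow.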
Quick Reference
Essential Commands
# Format code
cargo +nightly fmt --all
# Run lints
RUSTFLAGS="-D warnings" cargo +nightly clippy --workspace --all-features --locked
# Run tests
cargo nextest run --workspace
# Run specific benchmark
cargo bench --bench bench_name
# Build optimized binary
cargo build --release --features "jemalloc asm-keccak"
# Check compilation for all features
cargo check --workspace --all-features
# Check documentation
cargo docs --document-private-items