test(engine): cover stale sparse trie checkout drop

fix(engine): tighten shared cache visibility
fix(node): preserve sparse trie cache kind
2026-04-30 03:01:58 -04:00 · 2026-03-16 05:14:05 +00:00 · 2026-03-16 05:11:35 +00:00 · 2026-03-16 04:55:28 +00:00 · 2026-03-16 04:55:28 +00:00 · 2026-03-16 04:53:57 +00:00
299 changed files with 17009 additions and 5410 deletions
--- a/.changelog/cool-owls-rest.md
+++ b/.changelog/cool-owls-rest.md
@@ -0,0 +1,5 @@
+---
+reth-trie-sparse: patch
+---
+
+Fixed a bug in `merge_subtrie_updates` where source insertions did not cancel destination removals (and vice versa), which left trie updates inconsistent when they accumulated across multiple `root()` calls without an intermediate `take_updates()`. Added a test covering the cross-cancellation behavior.
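Reviewer note: the cross-cancellation described in this changelog entry can be illustrated with a small Python sketch. This is not the Rust code from `reth-trie-sparse`; the function name `merge_updates`, the dict/set representation, and the sample paths are assumptions chosen only to show the intended semantics.

```python
def merge_updates(dst_inserted, dst_removed, src_inserted, src_removed):
    """Merge source updates into destination updates with cross-cancellation.

    dst_inserted / src_inserted: dict mapping node path -> node value
    dst_removed / src_removed:   set of node paths
    (Hypothetical representation, for illustration only.)
    """
    for path, node in src_inserted.items():
        # A newer insertion cancels an older removal of the same path.
        dst_removed.discard(path)
        dst_inserted[path] = node
    for path in src_removed:
        # A newer removal cancels an older insertion of the same path.
        dst_inserted.pop(path, None)
        dst_removed.add(path)
    return dst_inserted, dst_removed


# Destination inserted 0x1 and removed 0x2; the source then does the opposite.
inserted, removed = merge_updates(
    {"0x1": "branch-a"}, {"0x2"},
    {"0x2": "branch-b"}, {"0x1"},
)
```

After the merge, each path appears in exactly one of the two collections, which is the invariant the fix is meant to keep.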
--- a/.changelog/dark-ants-write.md
+++ b/.changelog/dark-ants-write.md
@@ -0,0 +1,5 @@
+---
+reth-tasks: patch
+---
+
+Added a panic handler to all rayon thread pools that logs panics via `tracing::error` instead of aborting the process.
--- a/.changelog/dull-clams-pack.md
+++ b/.changelog/dull-clams-pack.md
@@ -0,0 +1,5 @@
+---
+reth-trie-sparse: patch
+---
+
+Refactored arena trie internals by adding a `BranchChildIdx::sibling()` helper, deduplicating `Index`/`NodeArena` type aliases, and replacing `is_empty()` with a `drop_root()` method. Fixed a bug where `cursor.pop()` was called before checking if the leaf was the root node, which could cause incorrect dirty-state propagation.
--- a/.changelog/fair-wolves-smile.md
+++ b/.changelog/fair-wolves-smile.md
@@ -0,0 +1,5 @@
+---
+reth-payload-builder: minor
+---
+
+Added observability metrics for payload resolve latency and new payload job creation latency to the payload builder service.
--- a/.changelog/fine-koalas-shout.md
+++ b/.changelog/fine-koalas-shout.md
@@ -0,0 +1,10 @@
+---
+reth-chain-state: minor
+reth-engine-primitives: minor
+reth-engine-tree: minor
+reth-node-core: minor
+reth-node-events: minor
+reth: patch
+---
+
+Added configurable slow block logging (`--engine.slow-block-threshold`) that emits a structured `warn!` log with detailed timing, state-operation counts, and cache hit-rate metrics for blocks whose total processing time exceeds the threshold. Introduced `ExecutionTimingStats`, `CacheStats`, `StateProviderStats`, and `SlowBlockInfo` types to carry execution statistics from block validation through persistence, and refactored `PersistenceResult` to carry commit duration alongside the last persisted block.
--- a/.changelog/merry-bees-break.md
+++ b/.changelog/merry-bees-break.md
@@ -0,0 +1,8 @@
+---
+reth-engine-primitives: minor
+reth-engine-tree: minor
+reth-node-core: minor
+reth-trie-parallel: minor
+---
+
+Added `--engine.proof-jitter` CLI option behind the `trie-debug` feature flag. When set, each proof worker sleeps for a random duration up to the specified value before starting proof computation, useful for stress-testing timing-sensitive proof logic.
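Reviewer note: the jitter behavior described in this entry amounts to a bounded random sleep before the work starts. A minimal Python sketch follows, assuming a uniform distribution and using hypothetical names (`jitter_sleep`, `max_jitter_s`, and the injectable `rng`/`sleep` hooks); the actual Rust worker may differ.

```python
import random
import time


def jitter_sleep(max_jitter_s, rng=random, sleep=time.sleep):
    """Sleep for a random duration in [0, max_jitter_s] and return it.

    `rng` and `sleep` are injectable (hypothetical hooks) so tests can run
    without real delays.
    """
    if max_jitter_s <= 0:
        return 0.0
    delay = rng.uniform(0.0, max_jitter_s)
    sleep(delay)
    return delay


# Record the requested delays instead of actually sleeping.
recorded = []
delays = [jitter_sleep(0.05, sleep=recorded.append) for _ in range(100)]
```

Randomizing the start time of each worker changes the order in which proofs complete, which is what makes timing-sensitive bugs easier to reproduce.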
--- a/.changelog/neat-ducks-whisper.md
+++ b/.changelog/neat-ducks-whisper.md
@@ -0,0 +1,6 @@
+---
+reth-trie: patch
+reth-trie-sparse: patch
+---
+
+Refactored test harness for sparse trie tests by extracting `TrieTestHarness` into a shared `reth-trie` test utility, replacing duplicated inline harness code across multiple test modules. Updated `proof_v2` return type to include an optional root hash, and converted `original_root` and `storage` from public fields to accessor methods.
--- a/.changelog/odd-frogs-break.md
+++ b/.changelog/odd-frogs-break.md
@@ -0,0 +1,7 @@
+---
+reth-cli-commands: minor
+reth-node-core: minor
+reth: patch
+---
+
+Made v2 storage the default for all new databases and deprecated the `--storage.v2` flag, which is now a hidden no-op kept for backwards compatibility. Updated the CLI reference docs to remove the hidden flag from all command help pages.
--- a/.changelog/proud-slugs-fly.md
+++ b/.changelog/proud-slugs-fly.md
@@ -0,0 +1,7 @@
+---
+reth-engine-tree: patch
+reth-trie-sparse: patch
+reth-tasks: patch
+---
+
+Offloaded deallocation of expensive proof node buffers to a persistent background thread (`Runtime::spawn_drop`) to avoid blocking state root computation or lock-holding code.
--- a/.changelog/rare-whales-pack.md
+++ b/.changelog/rare-whales-pack.md
@@ -0,0 +1,5 @@
+---
+reth-trie-sparse: minor
+---
+
+Added a comprehensive generic `SparseTrie` test suite covering `set_root`, `reveal_nodes`, `update_leaves`, `root`, `take_updates`, `commit_updates`, `prune`, `wipe`/`clear`, `get_leaf_value`, `find_leaf`, `size_hint`, and integration lifecycle scenarios. Tests are stamped out for all concrete `SparseTrie` implementations via a macro.
--- a/.changelog/tall-fish-climb.md
+++ b/.changelog/tall-fish-climb.md
@@ -0,0 +1,5 @@
+---
+reth-cli-commands: minor
+---
+
+Added `reth_version` field to `SnapshotManifest` to record the Reth version that produced a snapshot. The field is optional and populated automatically during manifest generation.
--- a/.changelog/vain-goats-cook.md
+++ b/.changelog/vain-goats-cook.md
@@ -0,0 +1,9 @@
+---
+reth-trie-sparse: minor
+reth-engine-primitives: minor
+reth-engine-tree: minor
+reth-node-core: minor
+reth-trie-common: patch
+---
+
+Added an arena-based sparse trie implementation (`ArenaParallelSparseTrie`) using `slotmap` arena allocation for node storage, enabling parallel subtrie mutation without per-node hashing overhead. Added `ConfigurableSparseTrie` enum to switch between the arena and hash-map implementations, and a `--engine.enable-arena-sparse-trie` CLI flag to opt in at runtime.
--- a/.changelog/zesty-clouds-wave.md
+++ b/.changelog/zesty-clouds-wave.md
@@ -0,0 +1,5 @@
+---
+reth-trie: patch
+---
+
+Fixed a potential panic in `ProofCalculator` by clearing internal computation state (`branch_stack`, `child_stack`, `branch_path`, etc.) after errors, preventing stale state from causing `usize` underflow panics when the calculator is reused. Added a test verifying correct behavior after simulated mid-computation errors.
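Reviewer note: the pattern behind this fix (clear reusable scratch state when a computation fails) can be sketched in Python. `StackCalculator` and its fields are hypothetical stand-ins, not the `ProofCalculator` API.

```python
class StackCalculator:
    """Reusable calculator that keeps scratch state between calls."""

    def __init__(self):
        self.stack = []

    def _reset(self):
        # Drop any partially built state so the next call starts clean.
        self.stack.clear()

    def total(self, items):
        try:
            for item in items:
                if item is None:
                    raise ValueError("simulated mid-computation error")
                self.stack.append(item)
            result = 0
            while self.stack:
                result += self.stack.pop()
            return result
        except Exception:
            self._reset()
            raise


calc = StackCalculator()
try:
    calc.total([1, None])      # fails after pushing 1
except ValueError:
    pass
after_error = list(calc.stack)  # scratch state was cleared
second = calc.total([2, 3])     # reuse after the error
```

Without the reset in the `except` branch, the leftover `1` would leak into the second call and change its result.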
--- a/.dockerignore
+++ b/.dockerignore
@@ -20,6 +20,11 @@
 # include dist directory, where the reth binary is located after compilation
 !/dist

+# include PGO build helper used by Dockerfile.depot
+!/.github
+!/.github/scripts
+!/.github/scripts/build_pgo_bolt.sh
+
 # include licenses
 !LICENSE-*

--- a/.github/actionlint.yaml
+++ b/.github/actionlint.yaml
@@ -5,3 +5,4 @@ self-hosted-runner:
    - depot-ubuntu-latest-4
    - depot-ubuntu-latest-8
    - depot-ubuntu-latest-16
+    - available
--- a/.github/scripts/bench-metrics-proxy.py
+++ b/.github/scripts/bench-metrics-proxy.py
@@ -0,0 +1,276 @@
+#!/usr/bin/env python3
+"""
+Prometheus metrics proxy that fetches from a local reth node and
+re-exposes with additional benchmark labels.
+
+Reads labels from a JSON file (updated by bench-reth-local.sh between runs)
+and injects them into every Prometheus metric line.
+
+Returns empty 200 when reth is not running (clean Grafana gaps).
+"""
+import argparse
+import ipaddress
+import json
+import subprocess
+import sys
+import time
+from http.server import HTTPServer, BaseHTTPRequestHandler
+from urllib.request import urlopen
+from urllib.error import URLError
+
+
+def read_labels(path):
+    try:
+        with open(path) as f:
+            return json.load(f)
+    except (FileNotFoundError, json.JSONDecodeError):
+        return {}
+
+
+def inject_labels(metrics_bytes, label_str, label_names):
+    """Inject labels into Prometheus text format.
+
+    Operates on bytes and uses simple string ops instead of regex
+    for speed on large payloads (reth exposes thousands of metrics).
+
+    Skips injecting into lines that already contain any of the label names
+    to avoid duplicate labels (which Prometheus rejects).
+    """
+    if not label_str:
+        return metrics_bytes
+
+    label_bytes = label_str.encode("utf-8")
+    # Pre-encode label names for fast duplicate detection
+    label_name_bytes = [n.encode("utf-8") for n in label_names]
+    out = []
+    for line in metrics_bytes.split(b"\n"):
+        # Skip comments and blank lines
+        if line.startswith(b"#") or not line:
+            out.append(line)
+            continue
+
+        brace = line.find(b"{")
+        space = line.find(b" ")
+
+        if space == -1:
+            # Malformed, pass through
+            out.append(line)
+        elif brace != -1 and brace < space:
+            # Has labels: metric{existing="val"} 123
+            close = line.find(b"}", brace)
+            if close == -1:
+                out.append(line)
+                continue
+
+            # Filter out labels that already exist in this line
+            existing = line[brace + 1:close]
+            inject = label_bytes
+            if existing:
+                for name in label_name_bytes:
+                    if existing.startswith(name + b"=") or b"," + name + b"=" in existing:
+                        # Rebuild inject string excluding this label
+                        inject = _remove_label(inject, name)
+                if not inject:
+                    out.append(line)
+                    continue
+
+            if close == brace + 1:
+                # Empty braces: metric{} 123
+                out.append(line[:close] + inject + line[close:])
+            else:
+                out.append(line[:close] + b"," + inject + line[close:])
+        else:
+            # No labels: metric 123
+            out.append(line[:space] + b"{" + label_bytes + b"}" + line[space:])
+
+    return b"\n".join(out)
+
+
+def _remove_label(label_bytes, name):
+    """Remove a single label (name=\"...\") from a comma-separated label string."""
+    parts = []
+    for part in label_bytes.split(b","):
+        if not part.startswith(name + b"="):
+            parts.append(part)
+    return b",".join(parts)
+
+
+def build_label_str(labels):
+    """Pre-format the label injection string: key1="val1",key2="val2" """
+    if not labels:
+        return ""
+    return ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
+
+
+def build_elapsed_gauge(labels):
+    """Build a bench_elapsed_seconds gauge from run_start_epoch in labels."""
+    start = labels.get("run_start_epoch")
+    if not start:
+        return b""
+    try:
+        elapsed = time.time() - float(start)
+    except (ValueError, TypeError):
+        return b""
+    # Build labels excluding internal keys
+    display = {k: v for k, v in labels.items()
+               if k not in ("run_start_epoch", "reference_epoch")}
+    lstr = build_label_str(display)
+    return (
+        f"# HELP bench_elapsed_seconds Seconds since benchmark run started\n"
+        f"# TYPE bench_elapsed_seconds gauge\n"
+        f"bench_elapsed_seconds{{{lstr}}} {elapsed:.1f}\n"
+    ).encode("utf-8")
+
+
+def compute_timestamp_ms(labels):
+    """Compute a synthetic timestamp so all runs share a common time origin.
+
+    Returns the timestamp in milliseconds, or None if not enough info.
+    Uses: reference_epoch + (now - run_start_epoch) → all runs overlay at
+    the same Grafana time range.
+    """
+    ref = labels.get("reference_epoch")
+    start = labels.get("run_start_epoch")
+    if not ref or not start:
+        return None
+    try:
+        elapsed = time.time() - float(start)
+        return int((float(ref) + elapsed) * 1000)
+    except (ValueError, TypeError):
+        return None
+
+
+def inject_timestamps(metrics_bytes, timestamp_ms):
+    """Append a Prometheus timestamp (ms) to every data line.
+
+    Prometheus text format: metric{labels} value [timestamp_ms]
+    Adding timestamps causes Prometheus to store all runs' samples
+    at the same relative time, enabling natural overlay in Grafana.
+    """
+    if timestamp_ms is None:
+        return metrics_bytes
+
+    ts = str(timestamp_ms).encode("utf-8")
+    out = []
+    for line in metrics_bytes.split(b"\n"):
+        if line.startswith(b"#") or not line:
+            out.append(line)
+        else:
+            out.append(line + b" " + ts)
+    return b"\n".join(out)
+
+
+class MetricsHandler(BaseHTTPRequestHandler):
+    # Use HTTP/1.1 so Content-Length is respected and Prometheus
+    # doesn't have to rely on connection close to detect end of body.
+    protocol_version = "HTTP/1.1"
+
+    def do_GET(self):
+        src = self.client_address[0]
+        try:
+            with urlopen(self.server.upstream, timeout=2) as resp:
+                metrics = resp.read()
+        except (URLError, ConnectionError, OSError):
+            # reth not running — return empty 200
+            self._send(b"")
+            #print(f"  scrape from {src}: empty (reth not running)", flush=True)
+            return
+
+        all_labels = read_labels(self.server.labels_file)
+        # Internal keys — not injected as Prometheus labels
+        internal = ("run_start_epoch", "reference_epoch")
+        labels = {k: v for k, v in all_labels.items() if k not in internal}
+        label_str = build_label_str(labels)
+        label_names = sorted(labels.keys())
+
+        t0 = time.monotonic()
+        result = inject_labels(metrics, label_str, label_names)
+        result += build_elapsed_gauge(all_labels)
+        ts_ms = compute_timestamp_ms(all_labels)
+        result = inject_timestamps(result, ts_ms)
+        dt = time.monotonic() - t0
+
+        self._send(result)
+        print(f"  scrape from {src}: {len(metrics)} -> {len(result)} bytes, "
+              f"inject {dt*1000:.1f}ms", flush=True)
+
+    def _send(self, body):
+        self.send_response(200)
+        self.send_header("Content-Type", "text/plain; version=0.0.4")
+        self.send_header("Content-Length", str(len(body)))
+        self.send_header("Connection", "close")
+        self.end_headers()
+        if body:
+            self.wfile.write(body)
+
+    def log_message(self, format, *args):
+        pass  # suppress per-request logging
+
+
+def resolve_bind_address(subnet_cidr):
+    """Find the local IP address that belongs to the given subnet.
+
+    Uses ``ip -j addr show`` to enumerate interfaces and returns the first
+    address that falls within *subnet_cidr* (e.g. ``10.10.0.0/24``).
+    """
+    network = ipaddress.ip_network(subnet_cidr, strict=False)
+    try:
+        result = subprocess.run(
+            ["ip", "-j", "addr", "show"],
+            capture_output=True, text=True, check=True,
+        )
+        interfaces = json.loads(result.stdout)
+    except (subprocess.CalledProcessError, FileNotFoundError, json.JSONDecodeError) as exc:
+        print(f"Error: cannot enumerate interfaces: {exc}", file=sys.stderr)
+        sys.exit(1)
+
+    for iface in interfaces:
+        for addr_info in iface.get("addr_info", []):
+            try:
+                addr = ipaddress.ip_address(addr_info["local"])
+            except (KeyError, ValueError):
+                continue
+            if addr in network:
+                return str(addr)
+
+    print(f"Error: no interface address found in subnet {subnet_cidr}", file=sys.stderr)
+    sys.exit(1)
+
+
+def main():
+    parser = argparse.ArgumentParser(description="Prometheus metrics proxy with label injection")
+    parser.add_argument("--labels", default="/tmp/bench-metrics-labels.json",
+                        help="Path to JSON file with labels to inject (default: /tmp/bench-metrics-labels.json)")
+    parser.add_argument("--upstream", default="http://127.0.0.1:9100/",
+                        help="Upstream reth metrics URL (default: http://127.0.0.1:9100/)")
+
+    bind_group = parser.add_mutually_exclusive_group()
+    bind_group.add_argument("--bind", default=None,
+                            help="Address to bind the proxy (default: 0.0.0.0)")
+    bind_group.add_argument("--subnet", default=None,
+                            help="Auto-detect bind address from a local interface in this subnet (e.g. 10.10.0.0/24)")
+
+    parser.add_argument("--port", type=int, default=9090,
+                        help="Port to bind the proxy (default: 9090)")
+    args = parser.parse_args()
+
+    if args.subnet:
+        bind_addr = resolve_bind_address(args.subnet)
+    elif args.bind:
+        bind_addr = args.bind
+    else:
+        bind_addr = "0.0.0.0"
+
+    server = HTTPServer((bind_addr, args.port), MetricsHandler)
+    server.upstream = args.upstream
+    server.labels_file = args.labels
+
+    print(f"bench-metrics-proxy listening on {bind_addr}:{args.port}")
+    print(f"  upstream: {args.upstream}")
+    print(f"  labels:   {args.labels}")
+    sys.stdout.flush()
+    server.serve_forever()
+
+
+if __name__ == "__main__":
+    main()
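Reviewer note: the synthetic timestamp logic in `compute_timestamp_ms` is easier to follow with concrete numbers. The sketch below copies the function's arithmetic and adds a `now` parameter, which is a hypothetical test hook that the script itself does not have; the epoch values are made up.

```python
import time


def compute_timestamp_ms(labels, now=None):
    """Same arithmetic as the proxy: reference_epoch + (now - run_start_epoch)."""
    ref = labels.get("reference_epoch")
    start = labels.get("run_start_epoch")
    if not ref or not start:
        return None
    now = time.time() if now is None else now  # `now` is an assumed test hook
    try:
        elapsed = now - float(start)
        return int((float(ref) + elapsed) * 1000)
    except (ValueError, TypeError):
        return None


REFERENCE = 1_700_000_000  # shared origin for every run (illustrative value)

# Two runs that started a day apart, each sampled 30 s after its own start.
run_a = {"reference_epoch": REFERENCE, "run_start_epoch": 1_700_003_600}
run_b = {"reference_epoch": REFERENCE, "run_start_epoch": 1_700_090_000}
ts_a = compute_timestamp_ms(run_a, now=1_700_003_630)
ts_b = compute_timestamp_ms(run_b, now=1_700_090_030)
```

Both samples map to the same point 30 seconds after the reference epoch, so the two runs overlay in a single Grafana time range.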
--- a/.github/scripts/bench-reth-build.sh
+++ b/.github/scripts/bench-reth-build.sh
@@ -22,6 +22,22 @@ MODE="$1"
 SOURCE_DIR="$2"
 COMMIT="$3"

+# Tracy support: when BENCH_TRACY is "on" or "full", add Tracy cargo features
+# and frame pointers for accurate stack traces.
+EXTRA_FEATURES=""
+EXTRA_RUSTFLAGS=""
+if [ "${BENCH_TRACY:-off}" != "off" ]; then
+  EXTRA_FEATURES="tracy,tracy-client/ondemand"
+  EXTRA_RUSTFLAGS=" -C force-frame-pointers=yes"
+fi
+
+# Cache suffix: hash of features+rustflags so different build configs get separate cache entries
+if [ -n "$EXTRA_FEATURES" ] || [ -n "$EXTRA_RUSTFLAGS" ]; then
+  BUILD_SUFFIX="-$(echo "${EXTRA_FEATURES}${EXTRA_RUSTFLAGS}" | sha256sum | cut -c1-12)"
+else
+  BUILD_SUFFIX=""
+fi
+
 # Verify a cached reth binary was built from the expected commit.
 # `reth --version` outputs "Commit SHA: <full-sha>" on its own line.
 verify_binary() {
@@ -42,7 +58,7 @@ verify_binary() {

 case "$MODE" in
  baseline|main)
-    BUCKET="minio/reth-binaries/${COMMIT}"
+    BUCKET="minio/reth-binaries/${COMMIT}${BUILD_SUFFIX}"
    mkdir -p "${SOURCE_DIR}/target/profiling"

    CACHE_VALID=false
@@ -59,14 +75,23 @@ case "$MODE" in
    if [ "$CACHE_VALID" = false ]; then
      echo "Building baseline (${COMMIT}) from source..."
      cd "${SOURCE_DIR}"
-      cargo build --profile profiling --bin reth
+      FEATURES_ARG=""
+      WORKSPACE_ARG=""
+      if [ -n "$EXTRA_FEATURES" ]; then
+        # --workspace is needed for cross-package feature syntax (tracy-client/ondemand)
+        FEATURES_ARG="--features ${EXTRA_FEATURES}"
+        WORKSPACE_ARG="--workspace"
+      fi
+      # shellcheck disable=SC2086
+      RUSTFLAGS="-C target-cpu=native${EXTRA_RUSTFLAGS}" \
+        cargo build --profile profiling --bin reth $WORKSPACE_ARG $FEATURES_ARG
      $MC cp target/profiling/reth "${BUCKET}/reth"
    fi
    ;;

  feature|branch)
    BRANCH_SHA="${4:-$COMMIT}"
-    BUCKET="minio/reth-binaries/${BRANCH_SHA}"
+    BUCKET="minio/reth-binaries/${BRANCH_SHA}${BUILD_SUFFIX}"

    CACHE_VALID=false
    if $MC stat "${BUCKET}/reth" &>/dev/null && $MC stat "${BUCKET}/reth-bench" &>/dev/null; then
@@ -85,7 +110,14 @@ case "$MODE" in
      echo "Building feature (${COMMIT}) from source..."
      cd "${SOURCE_DIR}"
      rustup show active-toolchain || rustup default stable
-      make profiling
+      if [ -n "$EXTRA_FEATURES" ]; then
+        # Can't use `make profiling` when adding features; build explicitly
+        # --workspace is needed for cross-package feature syntax (tracy-client/ondemand)
+        RUSTFLAGS="-C target-cpu=native${EXTRA_RUSTFLAGS}" \
+          cargo build --profile profiling --workspace --bin reth --features "${EXTRA_FEATURES}"
+      else
+        make profiling
+      fi
      make install-reth-bench
      $MC cp target/profiling/reth "${BUCKET}/reth"
      $MC cp "$(which reth-bench)" "${BUCKET}/reth-bench"
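Reviewer note: the `BUILD_SUFFIX` derivation above can be mirrored in Python to check which build configurations share a cache entry. This is a sketch; `build_suffix` is a hypothetical helper, and the trailing newline reproduces what `echo` feeds to `sha256sum`.

```python
import hashlib


def build_suffix(extra_features, extra_rustflags):
    """Return "" for the default build, else "-" plus 12 hex chars of SHA-256."""
    if not extra_features and not extra_rustflags:
        return ""
    # `echo` appends a newline before the data reaches sha256sum.
    payload = (extra_features + extra_rustflags + "\n").encode("utf-8")
    return "-" + hashlib.sha256(payload).hexdigest()[:12]


default_suffix = build_suffix("", "")
tracy_suffix = build_suffix(
    "tracy,tracy-client/ondemand", " -C force-frame-pointers=yes"
)
```

A default build keeps the plain `minio/reth-binaries/<sha>` key, while a Tracy build gets a distinct suffixed key, so the two never overwrite each other in the cache.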
--- a/.github/scripts/bench-reth-local.sh
+++ b/.github/scripts/bench-reth-local.sh
@@ -0,0 +1,581 @@
+#!/usr/bin/env bash
+#
+# bench-reth-local.sh — Run the reth Engine API benchmark locally.
+#
+# Replicates the CI bench.yml workflow (build, snapshot, system tuning,
+# interleaved B-F-F-B execution, summary, charts) without any GitHub
+# Actions glue (no PR comments, no artifact upload, no Slack).
+#
+# Usage:
+#   bench-reth-local.sh <baseline-ref> <feature-ref> [options]
+#
+# Options:
+#   --blocks N      Number of blocks to benchmark (default: 500)
+#   --warmup N      Number of warmup blocks (default: 100)
+#   --cores N       Limit reth to N CPU cores, 0 = all available (default: 0)
+#   --samply        Enable samply profiling
+#   --tracy MODE    Tracy profiling: off, on, full (default: off)
+#   --tracy-filter F Tracy tracing filter (default: debug)
+#   --no-tune       Skip system tuning (useful on dev machines / macOS)
+#
+# Requires: the reth repo at RETH_REPO (default: ~/reth)
+#
+# Dependencies (install before first run):
+#   mc (MinIO client), schelk, cpupower, taskset, stdbuf, python3, curl,
+#   make, uv, pzstd, jq, Rust toolchain (cargo/rustup)
+#
+# The script delegates to the existing bench-reth-*.sh scripts in the reth
+# repo for the actual build, snapshot, and run steps.
+set -euo pipefail
+
+# ── PATH ──────────────────────────────────────────────────────────────
+# Ensure cargo and user-local bins (mc, uv) are visible
+export PATH="$HOME/.local/bin:$HOME/.cargo/bin:$PATH"
+
+# ── Defaults ──────────────────────────────────────────────────────────
+RETH_REPO="${RETH_REPO:-$HOME/reth}"
+BLOCKS=500
+WARMUP=100
+CORES=0
+SAMPLY=false
+TRACY="off"
+TRACY_FILTER="debug"
+TUNE=true
+BASELINE_REF=""
+FEATURE_REF=""
+
+# ── Parse arguments ──────────────────────────────────────────────────
+usage() {
+  cat <<EOF
+Usage: $(basename "$0") <baseline-ref> <feature-ref> [options]
+
+Options:
+  --blocks N         Number of blocks to benchmark (default: 500)
+  --warmup N         Number of warmup blocks (default: 100)
+  --cores N          Limit reth to N CPU cores (default: 0 = all)
+  --samply           Enable samply profiling
+  --tracy MODE       Tracy profiling: off, on, full (default: off)
+                       on   = tracing only (lower overhead)
+                       full = tracing + CPU sampling (higher overhead)
+  --tracy-filter F   Tracy tracing filter (default: debug)
+  --no-tune          Skip system tuning
+EOF
+  exit 1
+}
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --blocks)       BLOCKS="$2"; shift 2 ;;
+    --warmup)       WARMUP="$2"; shift 2 ;;
+    --cores)        CORES="$2"; shift 2 ;;
+    --samply)       SAMPLY=true; shift ;;
+    --tracy)        TRACY="$2"; shift 2 ;;
+    --tracy-filter) TRACY_FILTER="$2"; shift 2 ;;
+    --no-tune)      TUNE=false; shift ;;
+    --help|-h)  usage ;;
+    -*)         echo "Unknown option: $1"; usage ;;
+    *)
+      if [ -z "$BASELINE_REF" ]; then
+        BASELINE_REF="$1"
+      elif [ -z "$FEATURE_REF" ]; then
+        FEATURE_REF="$1"
+      else
+        echo "Unexpected argument: $1"; usage
+      fi
+      shift
+      ;;
+  esac
+done
+
+if [ -z "$BASELINE_REF" ] || [ -z "$FEATURE_REF" ]; then
+  echo "Error: both <baseline-ref> and <feature-ref> are required."
+  usage
+fi
+
+# Validate --tracy value
+case "$TRACY" in
+  off|on|full) ;;
+  *) echo "Error: --tracy must be off, on, or full (got: $TRACY)"; usage ;;
+esac
+
+# Samply + tracy=full are mutually exclusive (both use perf sampling)
+if [ "$SAMPLY" = "true" ] && [ "$TRACY" = "full" ]; then
+  echo "Warning: samply and tracy=full both use perf sampling; downgrading tracy to 'on'."
+  TRACY="on"
+fi
+
+# ── Check dependencies ───────────────────────────────────────────────
+missing=()
+for cmd in mc schelk cpupower taskset stdbuf python3 curl make uv pzstd jq cargo; do
+  command -v "$cmd" &>/dev/null || missing+=("$cmd")
+done
+if [ ${#missing[@]} -gt 0 ]; then
+  echo "Error: missing required tools: ${missing[*]}"
+  echo "See the CI 'Install dependencies' step in .github/workflows/bench.yml for install instructions."
+  exit 1
+fi
+
+if [ "$TRACY" != "off" ]; then
+  if ! command -v tracy-capture &>/dev/null; then
+    echo "Error: tracy-capture is required for --tracy $TRACY"
+    exit 1
+  fi
+fi
+
+# Ensure tools that run via sudo are in a sudo-visible path.
+# The bench scripts use `sudo schelk` / `sudo samply` but cargo installs
+# them to ~/.cargo/bin which sudo's secure_path doesn't include.
+for cmd in schelk samply; do
+  if command -v "$cmd" &>/dev/null && ! sudo sh -c "command -v $cmd" &>/dev/null; then
+    echo "Installing $cmd to /usr/local/bin (needed for sudo)..."
+    sudo install "$(command -v "$cmd")" /usr/local/bin/
+  fi
+done
+
+if [ ! -d "$RETH_REPO/.git" ]; then
+  echo "Error: RETH_REPO=$RETH_REPO is not a git repository."
+  echo "Set RETH_REPO or clone reth to ~/reth"
+  exit 1
+fi
+
+# ── Resolve paths ────────────────────────────────────────────────────
+SELF_DIR="$(cd "$(dirname "$0")" && pwd)"
+SCRIPTS_DIR="${RETH_REPO}/.github/scripts"
+BENCH_WORK_DIR="${RETH_REPO}/../bench-work-$(date +%Y%m%d-%H%M%S)"
+BASELINE_SRC="${RETH_REPO}/../reth-baseline"
+FEATURE_SRC="${RETH_REPO}/../reth-feature"
+
+mkdir -p "$BENCH_WORK_DIR"
+BENCH_WORK_DIR="$(cd "$BENCH_WORK_DIR" && pwd)"
+
+# ── Global cleanup trap (restores system tuning on any exit) ─────────
+TUNING_APPLIED=false
+CSTATE_PID=
+METRICS_PROXY_PID=
+cleanup_global() {
+  [ -n "$METRICS_PROXY_PID" ] && kill "$METRICS_PROXY_PID" 2>/dev/null || true
+  if [ "$TUNING_APPLIED" = true ]; then
+    echo
+    echo "▸ Restoring system settings..."
+    [ -n "$CSTATE_PID" ] && kill "$CSTATE_PID" 2>/dev/null || true
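+    sudo systemctl start irqbalance cron atd unattended-upgrades snapd 2>/dev/null || true
+    echo "  Background services restarted; CPU governor, turbo, SMT, THP, swap, and ASLR changes persist until reset or reboot."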
+  fi
+}
+trap cleanup_global EXIT
+
+echo "═══════════════════════════════════════════════════════════"
+echo "  reth local benchmark"
+echo "═══════════════════════════════════════════════════════════"
+echo "  Baseline ref : $BASELINE_REF"
+echo "  Feature ref  : $FEATURE_REF"
+echo "  Blocks       : $BLOCKS"
+echo "  Warmup       : $WARMUP"
+echo "  Cores        : $CORES"
+echo "  Samply       : $SAMPLY"
+echo "  Tracy        : $TRACY"
+echo "  Tracy filter : $TRACY_FILTER"
+echo "  System tune  : $TUNE"
+echo "  Work dir     : $BENCH_WORK_DIR"
+echo "  Reth repo    : $RETH_REPO"
+echo "═══════════════════════════════════════════════════════════"
+echo
+
+# Enable sccache if available (matches CI's RUSTC_WRAPPER=sccache)
+if command -v sccache &>/dev/null; then
+  export RUSTC_WRAPPER="sccache"
+fi
+
+# Export env vars expected by the bench-reth-*.sh scripts
+export BENCH_BLOCKS="$BLOCKS"
+export BENCH_WARMUP_BLOCKS="$WARMUP"
+export BENCH_CORES="$CORES"
+export BENCH_SAMPLY="$SAMPLY"
+export BENCH_TRACY="$TRACY"
+export BENCH_TRACY_FILTER="$TRACY_FILTER"
+export BENCH_WORK_DIR
+export SCHELK_MOUNT="${SCHELK_MOUNT:-/reth-bench}"
+export BENCH_RPC_URL="${BENCH_RPC_URL:-https://ethereum.reth.rs/rpc}"
+export BENCH_METRICS_ADDR="127.0.0.1:9100"
+
+# ── Step 1: Resolve refs to full SHAs ────────────────────────────────
+echo "▸ Resolving git refs..."
+cd "$RETH_REPO"
+
+resolve_ref() {
+  local ref="$1"
+  git fetch origin "$ref" --quiet 2>/dev/null || true
+  git rev-parse --verify --quiet "${ref}^{commit}" 2>/dev/null \
+    || git rev-parse --verify --quiet "origin/${ref}^{commit}" 2>/dev/null \
+    || { echo "Error: cannot resolve ref '$ref'" >&2; exit 1; }
+}
+
+BASELINE_SHA="$(resolve_ref "$BASELINE_REF")"
+FEATURE_SHA="$(resolve_ref "$FEATURE_REF")"
+echo "  Baseline SHA : $BASELINE_SHA"
+echo "  Feature SHA  : $FEATURE_SHA"
+echo
+
+# ── Step 2: Prepare source directories ───────────────────────────────
+echo "▸ Preparing source directories..."
+
+prepare_source() {
+  local src_dir="$1" ref="$2"
+  if [ -d "$src_dir" ]; then
+    git -C "$src_dir" fetch origin "$ref" 2>/dev/null || true
+  else
+    git clone --recurse-submodules "$RETH_REPO" "$src_dir"
+  fi
+  git -C "$src_dir" checkout "$ref" --force
+  git -C "$src_dir" submodule update --init --recursive
+}
+
+prepare_source "$BASELINE_SRC" "$BASELINE_SHA"
+prepare_source "$FEATURE_SRC" "$FEATURE_SHA"
+BASELINE_SRC="$(cd "$BASELINE_SRC" && pwd)"
+FEATURE_SRC="$(cd "$FEATURE_SRC" && pwd)"
+echo "  Baseline src : $BASELINE_SRC"
+echo "  Feature src  : $FEATURE_SRC"
+echo
+
+# ── Step 3: Check / download snapshot ────────────────────────────────
+echo "▸ Checking snapshot..."
+cd "$RETH_REPO"
+SNAPSHOT_NEEDED=false
+if ! "${SCRIPTS_DIR}/bench-reth-snapshot.sh" --check; then
+  SNAPSHOT_NEEDED=true
+  echo "  Snapshot needs update."
+else
+  echo "  Snapshot is up-to-date."
+fi
+echo
+
+# ── Step 4: Build binaries (+ snapshot download) in parallel ─────────
+echo "▸ Building binaries (parallel)..."
+cd "$RETH_REPO"
+
+FAIL=0
+
+"${SCRIPTS_DIR}/bench-reth-build.sh" baseline "$BASELINE_SRC" "$BASELINE_SHA" &
+PID_BASELINE=$!
+
+"${SCRIPTS_DIR}/bench-reth-build.sh" feature "$FEATURE_SRC" "$FEATURE_SHA" &
+PID_FEATURE=$!
+
+PID_SNAPSHOT=
+if [ "$SNAPSHOT_NEEDED" = "true" ]; then
+  echo "  Also downloading snapshot in parallel..."
+  "${SCRIPTS_DIR}/bench-reth-snapshot.sh" &
+  PID_SNAPSHOT=$!
+fi
+
+wait $PID_BASELINE || FAIL=1
+wait $PID_FEATURE  || FAIL=1
+[ -n "$PID_SNAPSHOT" ] && { wait $PID_SNAPSHOT || FAIL=1; }
+
+if [ $FAIL -ne 0 ]; then
+  echo "Error: one or more parallel tasks failed (builds / snapshot)"
+  exit 1
+fi
+echo "  Binaries built successfully."
+echo
+
+# ── Step 5: System tuning (optional) ────────────────────────────────
+if [ "$TUNE" = "true" ]; then
+  echo "▸ Applying system tuning..."
+
+  sudo cpupower frequency-set -g performance 2>/dev/null || true
+
+  # Disable turbo boost (Intel + AMD)
+  echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo 2>/dev/null || true
+  echo 0 | sudo tee /sys/devices/system/cpu/cpufreq/boost 2>/dev/null || true
+
+  sudo swapoff -a 2>/dev/null || true
+  echo 0 | sudo tee /proc/sys/kernel/randomize_va_space 2>/dev/null || true
+
+  # Disable SMT (hyperthreading)
+  for cpu in /sys/devices/system/cpu/cpu*/topology/thread_siblings_list; do
+    [ -f "$cpu" ] || continue
+    first=$(cut -d, -f1 < "$cpu" | cut -d- -f1)
+    current=$(echo "$cpu" | grep -o 'cpu[0-9]*' | grep -o '[0-9]*')
+    if [ "$current" != "$first" ]; then
+      echo 0 | sudo tee "/sys/devices/system/cpu/cpu${current}/online" 2>/dev/null || true
+    fi
+  done
+  echo "  Online CPUs: $(nproc)"
+
+  # Disable transparent huge pages
+  for p in /sys/kernel/mm/transparent_hugepage /sys/kernel/mm/transparent_hugepages; do
+    if [ -d "$p" ]; then
+      echo never | sudo tee "$p/enabled" 2>/dev/null || true
+      echo never | sudo tee "$p/defrag" 2>/dev/null || true
+      break
+    fi
+  done
+
+  # Prevent deep C-states
+  sudo sh -c 'exec 3<>/dev/cpu_dma_latency; printf "\000\000\000\000" >&3; sleep infinity' &
+  CSTATE_PID=$!
+
+  # Pin IRQs to core 0
+  for irq in /proc/irq/*/smp_affinity_list; do
+    echo 0 | sudo tee "$irq" 2>/dev/null || true
+  done
+
+  # Stop noisy background services
+  sudo systemctl stop irqbalance cron atd unattended-upgrades snapd 2>/dev/null || true
+
+  TUNING_APPLIED=true
+
+  # Log environment for reproducibility (matches CI)
+  echo "  === Benchmark environment ==="
+  echo "  Kernel : $(uname -r)"
+  lscpu | grep -E 'Model name|CPU\(s\)|MHz|NUMA' | sed 's/^/  /'
+  echo "  Governor : $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor 2>/dev/null || echo unknown)"
+  echo "  Freq     : $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq 2>/dev/null || echo unknown)"
+  echo "  THP      : $(cat /sys/kernel/mm/transparent_hugepage/enabled 2>/dev/null || cat /sys/kernel/mm/transparent_hugepages/enabled 2>/dev/null || echo unknown)"
+  free -h | sed 's/^/  /'
+  echo "  System tuning applied."
+  echo
+fi
+
+# ── Step 5b: Tracefs mount (tracy=full only) ─────────────────────────
+if [ "$TRACY" = "full" ] && [ "$(uname)" = "Linux" ]; then
+  echo "▸ Mounting tracefs for Tracy full mode..."
+  sudo mount -t tracefs tracefs /sys/kernel/tracing -o mode=755 2>/dev/null || true
+fi
+
+# ── Tracy upload & viewer helpers ────────────────────────────────────
+TRACY_VIEWER_BASE="${TRACY_VIEWER_BASE:-}"
+
+tracy_viewer_url() {
+  local profile_url="$1"
+  if [ -z "$TRACY_VIEWER_BASE" ]; then
+    echo ""
+    return
+  fi
+  local encoded
+  encoded=$(python3 -c "import urllib.parse, sys; print(urllib.parse.quote(sys.argv[1], safe=''))" "$profile_url")
+  echo "${TRACY_VIEWER_BASE}?profile_url=${encoded}"
+}
+
+upload_tracy() {
+  local label="$1" output_dir="$2" sha="$3"
+  local tracy_file="$output_dir/tracy-profile.tracy"
+
+  if [ ! -f "$tracy_file" ]; then
+    echo "  Tracy: no profile found, skipping upload."
+    return
+  fi
+
+  local timestamp short_sha remote_name bucket mc_alias
+  timestamp=$(date +%Y%m%d-%H%M%S)
+  short_sha="${sha:0:7}"
+  remote_name="${label}-${short_sha}-${timestamp}.tracy"
+  bucket="${TRACY_BUCKET:-tracy-profiles}"
+  mc_alias="${MC_ALIAS:-minio}"
+  local minio_base="${TRACY_MINIO_URL:-http://minio.minio.svc.cluster.local:9000}"
+
+  echo "  Tracy: uploading profile..."
+  if mc cp "$tracy_file" "${mc_alias}/${bucket}/${remote_name}"; then
+    local url="${minio_base}/${bucket}/${remote_name}"
+    echo "$url" > "$output_dir/tracy_url.txt"
+    local viewer
+    viewer=$(tracy_viewer_url "$url")
+    if [ -n "$viewer" ]; then
+      echo "$viewer" > "$output_dir/tracy_viewer_url.txt"
+      echo "  Tracy: uploaded → $viewer"
+    else
+      echo "  Tracy: uploaded → $url"
+    fi
+  else
+    echo "  Tracy: upload failed (non-fatal)."
+  fi
+
+  # Delete large profile to free disk
+  rm -f "$tracy_file"
+}
+
+# ── Step 6: Pre-flight cleanup ───────────────────────────────────────
+echo "▸ Pre-flight cleanup..."
+pkill -f bench-metrics-proxy 2>/dev/null || true
+sudo pkill -9 reth 2>/dev/null || true
+sleep 1
+if mountpoint -q "$SCHELK_MOUNT" 2>/dev/null; then
+  sudo umount -l "$SCHELK_MOUNT" 2>/dev/null || true
+  sudo schelk recover -y 2>/dev/null || true
+fi
+echo
+
+# ── Step 7: Interleaved benchmark runs (B-F-F-B) ────────────────────
+# This ordering reduces systematic bias from thermal drift and cache warming.
+BASELINE_BIN="${BASELINE_SRC}/target/profiling/reth"
+FEATURE_BIN="${FEATURE_SRC}/target/profiling/reth"
+
+# Start metrics proxy (reth → label injection → Prometheus)
+LABELS_FILE="/tmp/bench-metrics-labels.json"
+echo '{}' > "$LABELS_FILE"
+METRICS_SUBNET="${METRICS_SUBNET:-10.10.0.0/24}"
+METRICS_PORT="${METRICS_PORT:-9090}"
+python3 "${SELF_DIR}/bench-metrics-proxy.py" \
+  --labels "$LABELS_FILE" \
+  --upstream "http://${BENCH_METRICS_ADDR}/" \
+  --subnet "$METRICS_SUBNET" \
+  --port "$METRICS_PORT" &
+METRICS_PROXY_PID=$!
+echo "▸ Metrics proxy started (PID $METRICS_PROXY_PID) on subnet ${METRICS_SUBNET}, port ${METRICS_PORT}"
+
+# Unique benchmark ID: local-<timestamp> for local runs, ci-<run_id> for CI
+BENCH_ID="local-$(basename "$BENCH_WORK_DIR" | sed 's/bench-work-//')"
+# Reference epoch: shared time origin so all runs overlay in Grafana.
+# The proxy maps each run's elapsed time onto this common origin.
+BENCH_REFERENCE_EPOCH=$(date +%s)
+
+write_labels() {
+  local run_label="$1" run_type="$2" ref="$3" sha="$4"
+  LAST_RUN_START=$(date +%s)
+  cat > "$LABELS_FILE" <<-EOF
+	{"benchmark_run":"${run_label}","run_type":"${run_type}","git_ref":"${ref}","bench_sha":"${sha}","benchmark_id":"${BENCH_ID}","run_start_epoch":"${LAST_RUN_START}","reference_epoch":"${BENCH_REFERENCE_EPOCH}"}
+	EOF
+}
+
+run_bench() {
+  local label="$1" binary="$2" output_dir="$3"
+  echo "▸ Running benchmark: ${label}..."
+  cd "$RETH_REPO"
+  if command -v taskset &>/dev/null; then
+    taskset -c 0 "${SCRIPTS_DIR}/bench-reth-run.sh" "$label" "$binary" "$output_dir"
+  else
+    "${SCRIPTS_DIR}/bench-reth-run.sh" "$label" "$binary" "$output_dir"
+  fi
+  echo "  ✓ ${label} complete."
+  echo
+}
+
+write_labels "baseline-1" "baseline" "$BASELINE_REF" "$BASELINE_SHA"
+run_bench "baseline-1" "$BASELINE_BIN" "$BENCH_WORK_DIR/baseline-1"
+
+write_labels "feature-1" "feature" "$FEATURE_REF" "$FEATURE_SHA"
+run_bench "feature-1"  "$FEATURE_BIN"  "$BENCH_WORK_DIR/feature-1"
+
+write_labels "feature-2" "feature" "$FEATURE_REF" "$FEATURE_SHA"
+run_bench "feature-2"  "$FEATURE_BIN"  "$BENCH_WORK_DIR/feature-2"
+
+write_labels "baseline-2" "baseline" "$BASELINE_REF" "$BASELINE_SHA"
+run_bench "baseline-2" "$BASELINE_BIN" "$BENCH_WORK_DIR/baseline-2"
+
+# ── Compute Grafana URL ──────────────────────────────────────────────
+GRAFANA_BASE_URL="https://tempoxyz.grafana.net/d/reth-bench-ghr/reth-bench-ghr"
+GRAFANA_DATASOURCE="ef57fux92e9z4e"
+LAST_RUN_DURATION=$(( $(date +%s) - LAST_RUN_START ))
+FROM_MS=$(( BENCH_REFERENCE_EPOCH * 1000 ))
+TO_MS=$(( (BENCH_REFERENCE_EPOCH + LAST_RUN_DURATION) * 1000 ))
+GRAFANA_URL="${GRAFANA_BASE_URL}?orgId=1&from=${FROM_MS}&to=${TO_MS}&timezone=browser&var-datasource=${GRAFANA_DATASOURCE}&var-job=reth-bench&var-benchmark_id=${BENCH_ID}&var-benchmark_run=\$__all"
+
+# ── Step 8: Scan logs for errors ─────────────────────────────────────
+echo "▸ Scanning logs for errors..."
+ERRORS_FILE="$BENCH_WORK_DIR/errors.md"
+found_errors=false
+for run_dir in baseline-1 feature-1 feature-2 baseline-2; do
+  LOG="$BENCH_WORK_DIR/$run_dir/node.log"
+  [ -f "$LOG" ] || continue
+  panics=$(grep -c -E 'panicked at' "$LOG" 2>/dev/null || true)
+  errors=$(grep -c ' ERROR ' "$LOG" 2>/dev/null || true)
+  if [ "$panics" -gt 0 ] || [ "$errors" -gt 0 ]; then
+    if [ "$found_errors" = false ]; then
+      printf '### ⚠️ Node Errors\n\n' >> "$ERRORS_FILE"
+      found_errors=true
+    fi
+    printf '<details><summary><b>%s</b>: %d panic(s), %d error(s)</summary>\n\n' \
+      "$run_dir" "$panics" "$errors" >> "$ERRORS_FILE"
+    if [ "$panics" -gt 0 ]; then
+      printf '**Panics:**\n```\n' >> "$ERRORS_FILE"
+      grep -E 'panicked at' "$LOG" | head -10 >> "$ERRORS_FILE"
+      printf '```\n' >> "$ERRORS_FILE"
+    fi
+    if [ "$errors" -gt 0 ]; then
+      printf '**Errors (first 20):**\n```\n' >> "$ERRORS_FILE"
+      grep ' ERROR ' "$LOG" | head -20 >> "$ERRORS_FILE"
+      printf '```\n' >> "$ERRORS_FILE"
+    fi
+    printf '\n</details>\n\n' >> "$ERRORS_FILE"
+  fi
+done
+if [ "$found_errors" = true ]; then
+  echo "  ⚠ Errors found — see $ERRORS_FILE"
+else
+  echo "  No errors found."
+fi
+echo
+
+# ── Step 9: Parse results ───────────────────────────────────────────
+echo "▸ Parsing results..."
+cd "$RETH_REPO"
+
+SUMMARY_ARGS=(
+  --output-summary "$BENCH_WORK_DIR/summary.json"
+  --output-markdown "$BENCH_WORK_DIR/comment.md"
+  --repo "paradigmxyz/reth"
+  --baseline-ref "$BASELINE_SHA"
+  --baseline-name "$BASELINE_REF"
+  --feature-name "$FEATURE_REF"
+  --feature-ref "$FEATURE_SHA"
+  --baseline-csv "$BENCH_WORK_DIR/baseline-1/combined_latency.csv" "$BENCH_WORK_DIR/baseline-2/combined_latency.csv"
+  --feature-csv "$BENCH_WORK_DIR/feature-1/combined_latency.csv" "$BENCH_WORK_DIR/feature-2/combined_latency.csv"
+  --gas-csv "$BENCH_WORK_DIR/feature-1/total_gas.csv"
+  --grafana-url "$GRAFANA_URL"
+)
+
+python3 "${SCRIPTS_DIR}/bench-reth-summary.py" "${SUMMARY_ARGS[@]}"
+echo
+
+# ── Step 10: Generate charts ─────────────────────────────────────────
+echo "▸ Generating charts..."
+CHART_ARGS=(
+  --output-dir "$BENCH_WORK_DIR/charts"
+  --feature "$BENCH_WORK_DIR/feature-1/combined_latency.csv" "$BENCH_WORK_DIR/feature-2/combined_latency.csv"
+  --baseline "$BENCH_WORK_DIR/baseline-1/combined_latency.csv" "$BENCH_WORK_DIR/baseline-2/combined_latency.csv"
+  --baseline-name "$BASELINE_REF"
+  --feature-name "$FEATURE_REF"
+)
+
+if python3 -c "import matplotlib" 2>/dev/null; then
+  python3 "${SCRIPTS_DIR}/bench-reth-charts.py" "${CHART_ARGS[@]}"
+elif command -v uv &>/dev/null; then
+  uv run --with matplotlib python3 "${SCRIPTS_DIR}/bench-reth-charts.py" "${CHART_ARGS[@]}"
+else
+  echo "  Warning: matplotlib not available, skipping chart generation."
+fi
+echo
+
+# ── Step 11: Upload Tracy profiles ────────────────────────────────────
+if [ "$TRACY" != "off" ]; then
+  echo "▸ Uploading Tracy profiles..."
+  upload_tracy "baseline-1" "$BENCH_WORK_DIR/baseline-1" "$BASELINE_SHA"
+  upload_tracy "feature-1"  "$BENCH_WORK_DIR/feature-1"  "$FEATURE_SHA"
+  upload_tracy "feature-2"  "$BENCH_WORK_DIR/feature-2"  "$FEATURE_SHA"
+  upload_tracy "baseline-2" "$BENCH_WORK_DIR/baseline-2" "$BASELINE_SHA"
+  echo
+fi
+
+# ── Done (system restore happens via EXIT trap) ─────────────────────
+echo "═══════════════════════════════════════════════════════════"
+echo "  Benchmark complete!"
+echo "═══════════════════════════════════════════════════════════"
+echo "  Results  : $BENCH_WORK_DIR/summary.json"
+echo "  Markdown : $BENCH_WORK_DIR/comment.md"
+echo "  Charts   : $BENCH_WORK_DIR/charts/"
+if [ -f "$ERRORS_FILE" ]; then
+  echo "  Errors   : $ERRORS_FILE"
+fi
+echo "  Grafana  : $GRAFANA_URL"
+if [ "$TRACY" != "off" ]; then
+  echo "  ─── Tracy Profiles ───"
+  for run_dir in baseline-1 feature-1 feature-2 baseline-2; do
+    url_file="$BENCH_WORK_DIR/$run_dir/tracy_viewer_url.txt"
+    if [ -f "$url_file" ]; then
+      echo "  $run_dir : $(cat "$url_file")"
+    fi
+  done
+fi
+echo "═══════════════════════════════════════════════════════════"
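Review note on the Grafana link built in the script above: the time window starts at the shared reference epoch and ends at that epoch plus the duration of the last run, both converted to milliseconds. A minimal sketch of that arithmetic follows. `grafana_window` is a hypothetical helper name that does not appear in the script, and the example values are made up.

```shell
#!/usr/bin/env bash
# grafana_window is a hypothetical helper, not part of the script above.
# It converts a reference epoch (seconds) and a run duration (seconds)
# into the from/to query parameters (milliseconds) used in the Grafana URL.
grafana_window() {
  local ref_epoch="$1" duration_s="$2"
  local from_ms=$(( ref_epoch * 1000 ))
  local to_ms=$(( (ref_epoch + duration_s) * 1000 ))
  printf 'from=%s&to=%s\n' "$from_ms" "$to_ms"
}

# Example values are invented for illustration.
grafana_window 1700000000 120
```

Because every run is mapped onto the same origin, the window only needs to be as long as one run, which is why the script uses the duration of the last run and not the total wall-clock time of all four.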
--- a/.github/scripts/bench-reth-run.sh
+++ b/.github/scripts/bench-reth-run.sh
@@ -6,6 +6,13 @@
 # Usage: bench-reth-run.sh <label> <binary> <output-dir>
 #
 # Required env: SCHELK_MOUNT, BENCH_RPC_URL, BENCH_BLOCKS, BENCH_WARMUP_BLOCKS
+# Optional env: BENCH_BIG_BLOCKS (true/false), BENCH_WORK_DIR (for big blocks path)
+#               BENCH_RETH_NEW_PAYLOAD (true/false, default true)
+#               BENCH_WAIT_TIME (duration like 500ms, default empty)
+#               BENCH_BASELINE_ARGS (extra reth node args for baseline runs)
+#               BENCH_FEATURE_ARGS (extra reth node args for feature runs)
+#               BENCH_OTLP_TRACES_ENDPOINT (OTLP HTTP endpoint for traces, e.g. https://host/insert/opentelemetry/v1/traces)
+#               BENCH_OTLP_LOGS_ENDPOINT (OTLP HTTP endpoint for logs, e.g. https://host/insert/opentelemetry/v1/logs)
 set -euo pipefail

 LABEL="$1"
@@ -17,6 +24,24 @@ LOG="${OUTPUT_DIR}/node.log"

 cleanup() {
  kill "$TAIL_PID" 2>/dev/null || true
+  # Stop tracy-capture first (SIGINT makes it disconnect and flush to disk)
+  # Must happen before killing reth, otherwise reth keeps streaming data.
+  if [ -n "${TRACY_PID:-}" ] && kill -0 "$TRACY_PID" 2>/dev/null; then
+    echo "Stopping tracy-capture..."
+    kill -INT "$TRACY_PID" 2>/dev/null || true
+    for i in $(seq 1 30); do
+      kill -0 "$TRACY_PID" 2>/dev/null || break
+      if [ $((i % 10)) -eq 0 ]; then
+        echo "Waiting for tracy-capture to finish writing... (${i}s)"
+      fi
+      sleep 1
+    done
+    if kill -0 "$TRACY_PID" 2>/dev/null; then
+      echo "tracy-capture still running after 30s, killing..."
+      kill -9 "$TRACY_PID" 2>/dev/null || true
+    fi
+    wait "$TRACY_PID" 2>/dev/null || true
+  fi
  if [ -n "${RETH_PID:-}" ] && sudo kill -0 "$RETH_PID" 2>/dev/null; then
    if [ "${BENCH_SAMPLY:-false}" = "true" ]; then
      # Send SIGINT to the inner reth process by exact name (not -f which
@@ -53,8 +78,14 @@ cleanup() {
  fi
 }
 TAIL_PID=
+TRACY_PID=
 trap cleanup EXIT

+# Clean up stale schelk state from a previous cancelled run.
+# If schelk thinks it's still mounted (e.g. a cancelled run skipped cleanup),
+# recover first to reset state.
+sudo schelk recover -y -k || true
+
 # Mount
 sudo schelk mount -y
 sync
@@ -73,6 +104,8 @@ if [ "${BENCH_CORES:-0}" -gt 0 ] && [ "$BENCH_CORES" -lt "$MAX_RETH" ]; then
 fi
 RETH_CPUS="1-${MAX_RETH}"

+BIG_BLOCKS="${BENCH_BIG_BLOCKS:-false}"
+
 RETH_ARGS=(
  node
  --datadir "$DATADIR"
@@ -87,16 +120,68 @@ RETH_ARGS=(
  --no-persist-peers
 )

+# Big blocks mode requires the testing API and skip-invalid-transactions
+if [ "$BIG_BLOCKS" = "true" ]; then
+  RETH_ARGS+=(--http.api eth,net,web3,reth,testing --testing.skip-invalid-transactions)
+fi
+
+# Append per-label extra node args (baseline or feature)
+EXTRA_NODE_ARGS=""
+case "$LABEL" in
+  baseline*) EXTRA_NODE_ARGS="${BENCH_BASELINE_ARGS:-}" ;;
+  feature*)  EXTRA_NODE_ARGS="${BENCH_FEATURE_ARGS:-}" ;;
+esac
+if [ -n "$EXTRA_NODE_ARGS" ]; then
+  # Word-split the string into individual args
+  # shellcheck disable=SC2206
+  RETH_ARGS+=($EXTRA_NODE_ARGS)
+fi
+
+if [ -n "${BENCH_METRICS_ADDR:-}" ]; then
+  RETH_ARGS+=(--metrics "$BENCH_METRICS_ADDR")
+fi
+
+# OTLP traces and logs export
+if [ -n "${BENCH_OTLP_TRACES_ENDPOINT:-}" ]; then
+  RETH_ARGS+=(--tracing-otlp="${BENCH_OTLP_TRACES_ENDPOINT}" --tracing-otlp.service-name=reth-bench)
+fi
+if [ -n "${BENCH_OTLP_LOGS_ENDPOINT:-}" ]; then
+  RETH_ARGS+=(--logs-otlp="${BENCH_OTLP_LOGS_ENDPOINT}" --logs-otlp.filter=debug)
+fi
+
+# Tracy profiling: add --log.tracy flags and set environment
+if [ "${BENCH_TRACY:-off}" != "off" ]; then
+  RETH_ARGS+=(--log.tracy --log.tracy.filter "${BENCH_TRACY_FILTER:-debug}")
+  if [ "${BENCH_TRACY}" = "on" ]; then
+    export TRACY_NO_SYS_TRACE=1
+  elif [ "${BENCH_TRACY}" = "full" ]; then
+    export TRACY_SAMPLING_HZ="${BENCH_TRACY_SAMPLING_HZ:-1}"
+  fi
+fi
+
+SUDO_ENV=()
+if [ -n "${OTEL_RESOURCE_ATTRIBUTES:-}" ]; then
+  SUDO_ENV+=("OTEL_RESOURCE_ATTRIBUTES=${OTEL_RESOURCE_ATTRIBUTES}")
+  SUDO_ENV+=("OTEL_BSP_MAX_QUEUE_SIZE=65536" "OTEL_BLRP_MAX_QUEUE_SIZE=65536")
+fi
+
+# Limit reth memory to 95% of total RAM (MemTotal) to prevent OOM kills
+TOTAL_MEM_KB=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
+MEM_LIMIT=$(( TOTAL_MEM_KB * 95 / 100 * 1024 ))
+echo "Memory limit: $(( MEM_LIMIT / 1024 / 1024 ))MB (95% of $(( TOTAL_MEM_KB / 1024 ))MB)"
+
 if [ "${BENCH_SAMPLY:-false}" = "true" ]; then
  RETH_ARGS+=(--log.samply)
  SAMPLY="$(which samply)"
-  sudo taskset -c "$RETH_CPUS" nice -n -20 \
+  sudo systemd-run --scope -p MemoryMax="$MEM_LIMIT" -p AllowedCPUs="$RETH_CPUS" \
+    env "${SUDO_ENV[@]}" nice -n -20 \
    "$SAMPLY" record --save-only --presymbolicate --rate 10000 \
    --output "$OUTPUT_DIR/samply-profile.json.gz" \
    -- "$BINARY" "${RETH_ARGS[@]}" \
    > "$LOG" 2>&1 &
 else
-  sudo taskset -c "$RETH_CPUS" nice -n -20 "$BINARY" "${RETH_ARGS[@]}" \
+  sudo systemd-run --scope -p MemoryMax="$MEM_LIMIT" -p AllowedCPUs="$RETH_CPUS" \
+    env "${SUDO_ENV[@]}" nice -n -20 "$BINARY" "${RETH_ARGS[@]}" \
    > "$LOG" 2>&1 &
 fi

@@ -124,21 +209,65 @@ done
 # files are not root-owned (avoids EACCES on next checkout).
 BENCH_NICE="sudo nice -n -20 sudo -u $(id -un)"

-# Warmup
-$BENCH_NICE "$RETH_BENCH" new-payload-fcu \
-  --rpc-url "$BENCH_RPC_URL" \
-  --engine-rpc-url http://127.0.0.1:8551 \
-  --jwt-secret "$DATADIR/jwt.hex" \
-  --advance "${BENCH_WARMUP_BLOCKS:-50}" \
-  --reth-new-payload 2>&1 | sed -u "s/^/[bench] /"
+# Build optional flags
+EXTRA_BENCH_ARGS=()
+if [ "${BENCH_RETH_NEW_PAYLOAD:-true}" != "false" ]; then
+  EXTRA_BENCH_ARGS+=(--reth-new-payload)
+fi
+if [ -n "${BENCH_WAIT_TIME:-}" ]; then
+  EXTRA_BENCH_ARGS+=(--wait-time "$BENCH_WAIT_TIME")
+fi

-# Benchmark
-$BENCH_NICE "$RETH_BENCH" new-payload-fcu \
-  --rpc-url "$BENCH_RPC_URL" \
-  --engine-rpc-url http://127.0.0.1:8551 \
-  --jwt-secret "$DATADIR/jwt.hex" \
-  --advance "$BENCH_BLOCKS" \
-  --reth-new-payload \
-  --output "$OUTPUT_DIR" 2>&1 | sed -u "s/^/[bench] /"
+if [ "$BIG_BLOCKS" = "true" ]; then
+  # Big blocks mode: replay pre-generated payloads with gas ramp
+  BIG_BLOCKS_DIR="${BENCH_WORK_DIR}/big-blocks"
+  # Count gas ramp blocks for reporting
+  GAS_RAMP_COUNT=$(find "$BIG_BLOCKS_DIR/gas-ramp-dir" -name '*.json' | wc -l)
+  echo "$GAS_RAMP_COUNT" > "$OUTPUT_DIR/gas_ramp_blocks.txt"
+  echo "Gas ramp blocks: $GAS_RAMP_COUNT"
+
+  # Start tracy-capture so profile only covers the benchmark
+  if [ "${BENCH_TRACY:-off}" != "off" ]; then
+    echo "Starting tracy-capture..."
+    tracy-capture -f -o "$OUTPUT_DIR/tracy-profile.tracy" &
+    TRACY_PID=$!
+    sleep 0.5  # give tracy-capture time to connect
+  fi
+
+  echo "Running big blocks benchmark (replay-payloads)..."
+  $BENCH_NICE "$RETH_BENCH" replay-payloads \
+    "${EXTRA_BENCH_ARGS[@]}" \
+    --gas-ramp-dir "$BIG_BLOCKS_DIR/gas-ramp-dir" \
+    --payload-dir "$BIG_BLOCKS_DIR/payloads" \
+    --engine-rpc-url http://127.0.0.1:8551 \
+    --jwt-secret "$DATADIR/jwt.hex" \
+    --output "$OUTPUT_DIR" 2>&1 | sed -u "s/^/[bench] /"
+else
+  # Standard mode: warmup + new-payload-fcu
+  # Warmup
+  $BENCH_NICE "$RETH_BENCH" new-payload-fcu \
+    --rpc-url "$BENCH_RPC_URL" \
+    --engine-rpc-url http://127.0.0.1:8551 \
+    --jwt-secret "$DATADIR/jwt.hex" \
+    --advance "${BENCH_WARMUP_BLOCKS:-50}" \
+    "${EXTRA_BENCH_ARGS[@]}" 2>&1 | sed -u "s/^/[bench] /"
+
+  # Start tracy-capture after warmup so profile only covers the benchmark
+  if [ "${BENCH_TRACY:-off}" != "off" ]; then
+    echo "Starting tracy-capture..."
+    tracy-capture -f -o "$OUTPUT_DIR/tracy-profile.tracy" &
+    TRACY_PID=$!
+    sleep 0.5  # give tracy-capture time to connect
+  fi
+
+  # Benchmark
+  $BENCH_NICE "$RETH_BENCH" new-payload-fcu \
+    --rpc-url "$BENCH_RPC_URL" \
+    --engine-rpc-url http://127.0.0.1:8551 \
+    --jwt-secret "$DATADIR/jwt.hex" \
+    --advance "$BENCH_BLOCKS" \
+    "${EXTRA_BENCH_ARGS[@]}" \
+    --output "$OUTPUT_DIR" 2>&1 | sed -u "s/^/[bench] /"
+fi

 # cleanup runs via trap
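Review note on the memory cap in bench-reth-run.sh: the limit passed to `MemoryMax` is 95% of `MemTotal`, converted from KiB to bytes with integer arithmetic. A minimal sketch of the same calculation, assuming the total is passed in as an argument so it does not need `/proc/meminfo`. `mem_limit_bytes` is a hypothetical helper name, and the input value is made up.

```shell
#!/usr/bin/env bash
# mem_limit_bytes is a hypothetical helper, not part of the script above.
# Input:  total memory in KiB (the unit /proc/meminfo reports).
# Output: 95% of that amount, in bytes.
mem_limit_bytes() {
  local total_kb="$1"
  # Multiply before dividing so integer division truncates as little as possible.
  printf '%s\n' "$(( total_kb * 95 / 100 * 1024 ))"
}

# Invented example value of 16,000,000 KiB.
mem_limit_bytes 16000000
```

Shell arithmetic is integer-only, so the order of operations matters: dividing by 100 first would discard up to 99 KiB before the percentage is applied.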
--- a/.github/scripts/bench-reth-summary.py
+++ b/.github/scripts/bench-reth-summary.py
@@ -350,6 +350,7 @@ def generate_comparison_table(
    baseline_name: str,
    feature_name: str,
    feature_sha: str,
+    big_blocks: bool = False,
 ) -> str:
    """Generate a markdown comparison table between baseline and feature."""
    n = paired["blocks"]
@@ -390,7 +391,7 @@ def generate_comparison_table(
        f"| Mgas/s | {fmt_mgas(run1['mean_mgas_s'])} | {fmt_mgas(run2['mean_mgas_s'])} | {change_str(gas_pct, mgas_ci_pct, lower_is_better=False)} |",
        f"| Wall Clock | {fmt_s(run1['wall_clock_s'])} | {fmt_s(run2['wall_clock_s'])} | {change_str(wall_pct, wall_ci_pct, lower_is_better=True)} |",
        "",
-        f"*{n} blocks*",
+        f"*{n} {'big blocks' if big_blocks else 'blocks'}*",
    ]
    return "\n".join(lines)

@@ -421,6 +422,7 @@ def generate_markdown(
    summary: dict, comparison_table: str,
    wait_time_tables: list[str] | None = None,
    behind_baseline: int = 0, repo: str = "", baseline_ref: str = "", baseline_name: str = "",
+    grafana_url: str | None = None,
 ) -> str:
    """Generate a markdown comment body."""
    lines = ["## Benchmark Results", ""]
@@ -440,6 +442,9 @@ def generate_markdown(
                lines.append(table)
                lines.append("")
        lines.append("</details>")
+    if grafana_url:
+        lines.append("")
+        lines.append(f"**[Grafana Dashboard]({grafana_url})**")
    return "\n".join(lines)


@@ -466,6 +471,9 @@ def main():
    parser.add_argument("--feature-name", "--branch-name", default=None, help="Feature branch name")
    parser.add_argument("--feature-ref", "--branch-sha", "--feature-sha", default=None, help="Feature commit SHA")
    parser.add_argument("--behind-baseline", "--behind-main", type=int, default=0, help="Commits behind baseline")
+    parser.add_argument("--big-blocks", action="store_true", default=False, help="Big blocks mode")
+    parser.add_argument("--gas-ramp-blocks", type=int, default=0, help="Number of gas ramp blocks (big blocks mode)")
+    parser.add_argument("--grafana-url", default=None, help="Grafana dashboard URL for this benchmark run")
    args = parser.parse_args()

    if len(args.baseline_csv) != len(args.feature_csv):
@@ -514,6 +522,7 @@ def main():
        baseline_name=baseline_name,
        feature_name=feature_name,
        feature_sha=feature_sha,
+        big_blocks=args.big_blocks,
    )
    print(f"Generated comparison ({paired_stats['n']} paired blocks, "
          f"mean diff {paired_stats['mean_diff_ms']:+.3f}ms ± {paired_stats['ci_ms']:.3f}ms)")
@@ -544,6 +553,8 @@ def main():

    summary = {
        "blocks": paired_stats["blocks"],
+        "big_blocks": args.big_blocks,
+        "gas_ramp_blocks": args.gas_ramp_blocks,
        "baseline": {
            "name": baseline_name,
            "ref": baseline_ref,
@@ -569,6 +580,7 @@ def main():
        repo=args.repo,
        baseline_ref=baseline_ref,
        baseline_name=baseline_name,
+        grafana_url=args.grafana_url,
    )

    with open(args.output_markdown, "w") as f:
--- a/.github/scripts/bench-slack-notify.js
+++ b/.github/scripts/bench-slack-notify.js
@@ -7,6 +7,8 @@
 //   BENCH_PR               – PR number (may be empty)
 //   BENCH_ACTOR            – GitHub user who triggered the bench
 //   BENCH_JOB_URL          – URL to the Actions job page
+//   BENCH_BASELINE_ARGS    – Extra CLI args for the baseline reth node
+//   BENCH_FEATURE_ARGS     – Extra CLI args for the feature reth node
 //   BENCH_SAMPLY           – 'true' if samply profiling was enabled
 //
 // Usage from actions/github-script:
@@ -118,15 +120,27 @@ function buildSuccessBlocks({ summary, prNumber, actor, actorSlackId, jobUrl, re
  if (fl1) featureLine += ` | <${fl1}|Samply 1>`;
  if (fl2) featureLine += ` | <${fl2}|Samply 2>`;

-  const warmup = summary.warmup_blocks || process.env.BENCH_WARMUP_BLOCKS || '';
  const cores = process.env.BENCH_CORES || '0';
  const countsParts = [];
-  if (warmup) countsParts.push(`*Warmup:* ${warmup}`);
-  countsParts.push(`*Blocks:* ${summary.blocks}`);
+  if (summary.big_blocks) {
+    const gasRamp = summary.gas_ramp_blocks || 0;
+    if (gasRamp > 0) countsParts.push(`*Gas Ramp:* ${gasRamp}`);
+    countsParts.push(`*Big Blocks:* ${summary.blocks}`);
+  } else {
+    const warmup = summary.warmup_blocks || process.env.BENCH_WARMUP_BLOCKS || '';
+    if (warmup) countsParts.push(`*Warmup:* ${warmup}`);
+    countsParts.push(`*Blocks:* ${summary.blocks}`);
+  }
  if (cores !== '0') countsParts.push(`*Cores:* ${cores}`);
  const countsLine = countsParts.join(' | ');

-  const sectionText = [metaParts.join(' | '), '', baselineLine, featureLine, countsLine].join('\n');
+  const baselineArgs = process.env.BENCH_BASELINE_ARGS || '';
+  const featureArgs = process.env.BENCH_FEATURE_ARGS || '';
+  const argsLines = [];
+  if (baselineArgs) argsLines.push(`*Baseline Args:* \`${baselineArgs}\``);
+  if (featureArgs) argsLines.push(`*Feature Args:* \`${featureArgs}\``);
+
+  const sectionText = [metaParts.join(' | '), '', baselineLine, featureLine, ...argsLines, countsLine].join('\n');

  // Action buttons
  const diffUrl = `https://github.com/${repo}/compare/${summary.baseline.ref}...${summary.feature.ref}`;
--- a/.github/scripts/build_pgo_bolt.sh
+++ b/.github/scripts/build_pgo_bolt.sh
@@ -0,0 +1,414 @@
+#!/usr/bin/env bash
+#
+# Full PGO+BOLT optimized build for reth using real reth-bench workloads.
+#
+# Phases:
+#   1. Build PGO-instrumented reth, run reth-bench → collect PGO profiles
+#   2. Build BOLT-instrumented reth (with PGO), run reth-bench → collect BOLT profiles
+#   3. Build final PGO+BOLT optimized binary
+#
+# Required environment variables:
+#   DATADIR    - Path to reth datadir (must already contain chain data)
+#   RPC_URL    - Source RPC URL for reth-bench to fetch payloads from
+#
+# Optional environment variables:
+#   PGO_BLOCKS      - Number of blocks for PGO profiling (default: 20)
+#   BOLT_BLOCKS     - Number of blocks for BOLT profiling (default: 20)
+#   SKIP_BOLT       - Temporarily skip BOLT phases (default: false)
+#   STRIP_SYMBOLS   - Strip debug symbols from output binary (default: true)
+#   COLLECT_PGO_ONLY - Stop after producing merged.profdata (default: false)
+#   PGO_PROFDATA    - Path to pre-collected merged.profdata (optional)
+#   PROFILE         - Cargo profile (default: maxperf-symbols)
+#   FEATURES        - Cargo features (default: jemalloc,asm-keccak,min-debug-logs)
+#   TARGET          - Target triple (default: auto-detected)
+#   EXTRA_RUSTFLAGS - Additional RUSTFLAGS (e.g. -C target-cpu=x86-64-v3)
+#
+# Output:
+#   target/$PROFILE_DIR/reth  — final optimized binary
+set -euo pipefail
+
+gha_section_start() {
+    local title="$1"
+    if [ -n "${GITHUB_ACTIONS:-}" ]; then
+        echo "::group::$title"
+    else
+        echo ""
+        echo "=== $title ==="
+    fi
+}
+
+gha_section_end() {
+    if [ -n "${GITHUB_ACTIONS:-}" ]; then
+        echo "::endgroup::"
+    fi
+}
+
+cd "$(dirname "$0")/../.."
+
+# ── Configuration ──────────────────────────────────────────────────────────────
+PGO_BLOCKS="${PGO_BLOCKS:-20}"
+BOLT_BLOCKS="${BOLT_BLOCKS:-20}"
+SKIP_BOLT="${SKIP_BOLT:-false}"
+STRIP_SYMBOLS="${STRIP_SYMBOLS:-true}"
+COLLECT_PGO_ONLY="${COLLECT_PGO_ONLY:-false}"
+PROFILE="${PROFILE:-maxperf-symbols}"
+FEATURES="${FEATURES:-jemalloc,asm-keccak,min-debug-logs}"
+TARGET="${TARGET:-$(rustc -Vv | grep host | cut -d' ' -f2)}"
+BASE_RUSTFLAGS="${RUSTFLAGS:-}"
+EXTRA_RUSTFLAGS="${EXTRA_RUSTFLAGS:-}"
+COMBINED_RUSTFLAGS="$BASE_RUSTFLAGS $EXTRA_RUSTFLAGS"
+PGO_PROFDATA="${PGO_PROFDATA:-}"
+DATADIR="${DATADIR:-}"
+RPC_URL="${RPC_URL:-}"
+
+SKIP_BOLT_BOOL=false
+if [[ "${SKIP_BOLT,,}" == "true" || "$SKIP_BOLT" == "1" ]]; then
+    SKIP_BOLT_BOOL=true
+fi
+
+STRIP_SYMBOLS_BOOL=false
+if [[ "${STRIP_SYMBOLS,,}" == "true" || "$STRIP_SYMBOLS" == "1" ]]; then
+    STRIP_SYMBOLS_BOOL=true
+fi
+
+COLLECT_PGO_ONLY_BOOL=false
+if [[ "${COLLECT_PGO_ONLY,,}" == "true" || "$COLLECT_PGO_ONLY" == "1" ]]; then
+    COLLECT_PGO_ONLY_BOOL=true
+fi
+
+USE_PRECOLLECTED_PGO=false
+if [ -n "$PGO_PROFDATA" ]; then
+    if [ ! -f "$PGO_PROFDATA" ]; then
+        echo "error: PGO_PROFDATA points to a missing file: $PGO_PROFDATA"
+        exit 1
+    fi
+    USE_PRECOLLECTED_PGO=true
+fi
+
+NEEDS_BENCH_WORKLOAD=true
+if [ "$USE_PRECOLLECTED_PGO" = true ] && [ "$SKIP_BOLT_BOOL" = true ]; then
+    NEEDS_BENCH_WORKLOAD=false
+fi
+
+if [ "$NEEDS_BENCH_WORKLOAD" = true ]; then
+    : "${DATADIR:?DATADIR must be set to the reth data directory}"
+    : "${RPC_URL:?RPC_URL must be set}"
+fi
+
+if [[ "$PROFILE" == dev ]]; then
+    PROFILE_DIR=debug
+else
+    PROFILE_DIR=$PROFILE
+fi
+
+MANIFEST_PATH="bin/reth"
+
+LLVM_VERSION=$(rustc -Vv | grep -oP 'LLVM version: \K\d+')
+PGO_DIR="$PWD/target/pgo-profiles"
+BOLT_DIR="$PWD/target/bolt-profiles"
+CARGO_ARGS=(--profile "$PROFILE" --features "$FEATURES" --manifest-path "$MANIFEST_PATH/Cargo.toml" --bin "reth" --locked)
+
+# Keep the symbol table for BOLT (it needs symbols to reorder code) by stripping
+# only debuginfo during the build; remaining symbols are stripped at the end.
+PROFILE_UPPER=$(echo "$PROFILE" | tr '[:lower:]-' '[:upper:]_')
+export "CARGO_PROFILE_${PROFILE_UPPER}_STRIP=debuginfo"
+
+gha_section_start "Full PGO+BOLT Build"
+echo "Binary:      reth"
+echo "Manifest:    $MANIFEST_PATH"
+echo "Target:      $TARGET"
+echo "Profile:     $PROFILE"
+echo "Features:    $FEATURES"
+echo "LLVM:        $LLVM_VERSION"
+echo "PGO blocks:  $PGO_BLOCKS"
+echo "BOLT blocks: $BOLT_BLOCKS"
+echo "Skip BOLT:   $SKIP_BOLT"
+echo "Strip symbols: $STRIP_SYMBOLS"
+echo "Collect only: $COLLECT_PGO_ONLY"
+echo "PGO profdata: ${PGO_PROFDATA:-<collect with reth-bench>}"
+echo "RUSTFLAGS:   ${BASE_RUSTFLAGS:-<unset>}"
+echo "EXTRA_RUSTFLAGS: ${EXTRA_RUSTFLAGS:-<unset>}"
+if [ "$NEEDS_BENCH_WORKLOAD" = true ]; then
+    echo "Datadir:     $DATADIR"
+    echo "RPC URL:     $RPC_URL"
+else
+    echo "Datadir:     <not required>"
+    echo "RPC URL:     <not required>"
+fi
+gha_section_end
+
+# ── Prerequisites ──────────────────────────────────────────────────────────────
+gha_section_start "Installing prerequisites"
+rustup component add llvm-tools-preview
+
+LLVM_PROFDATA=$(find "$(rustc --print sysroot)" -name llvm-profdata -type f | head -1)
+if [ -z "$LLVM_PROFDATA" ]; then
+    echo "error: llvm-profdata not found"
+    exit 1
+fi
+
+install_bolt() {
+    if command -v llvm-bolt &>/dev/null; then
+        echo "BOLT already installed"
+        return
+    fi
+    echo "Installing BOLT from apt.llvm.org..."
+    wget -qO- https://apt.llvm.org/llvm-snapshot.gpg.key | sudo tee /etc/apt/trusted.gpg.d/apt.llvm.org.asc >/dev/null
+    CODENAME=$(lsb_release -cs)
+    echo "deb http://apt.llvm.org/$CODENAME/ llvm-toolchain-$CODENAME-$LLVM_VERSION main" | sudo tee /etc/apt/sources.list.d/llvm.list >/dev/null
+    sudo apt-get update -qq
+    sudo apt-get install -y -qq "bolt-$LLVM_VERSION"
+    sudo ln -sf "/usr/bin/llvm-bolt-$LLVM_VERSION" /usr/local/bin/llvm-bolt
+    sudo ln -sf "/usr/bin/merge-fdata-$LLVM_VERSION" /usr/local/bin/merge-fdata
+}
+if [ "$SKIP_BOLT_BOOL" = true ]; then
+    echo "Skipping BOLT installation (SKIP_BOLT=$SKIP_BOLT)"
+else
+    install_bolt
+fi
+gha_section_end
+
+if [ "$NEEDS_BENCH_WORKLOAD" = true ]; then
+    # Build reth-bench once (non-instrumented) — reused for both phases.
+    gha_section_start "Building reth-bench"
+    RUSTFLAGS="$COMBINED_RUSTFLAGS" \
+        cargo build --profile "$PROFILE" --features "$FEATURES" \
+        --manifest-path bin/reth-bench/Cargo.toml --bin reth-bench --locked
+    RETH_BENCH_BIN="$(find target -name reth-bench -type f -executable | head -1)"
+    echo "reth-bench: $RETH_BENCH_BIN"
+    gha_section_end
+else
+    gha_section_start "Building reth-bench"
+    echo "Skipping reth-bench build (pre-collected PGO with SKIP_BOLT=true)"
+    gha_section_end
+fi
+
+# ── Helpers ────────────────────────────────────────────────────────────────────
+RETH_PID=
+cleanup() {
+    if [ -n "${RETH_PID:-}" ] && sudo kill -0 "$RETH_PID" 2>/dev/null; then
+        echo "Stopping reth (pid $RETH_PID)..."
+        sudo kill "$RETH_PID" 2>/dev/null || true
+        for i in $(seq 1 60); do
+            sudo kill -0 "$RETH_PID" 2>/dev/null || break
+            if [ $((i % 10)) -eq 0 ]; then
+                echo "  waiting... (${i}s)"
+            fi
+            sleep 1
+        done
+        sudo kill -9 "$RETH_PID" 2>/dev/null || true
+    fi
+}
+trap cleanup EXIT
+
+# Start reth, wait for RPC, run reth-bench, then stop reth.
+# Arguments: $1 = reth binary path, $2 = number of blocks, $3 = log label
+run_bench_workload() {
+    local reth_bin="$1" blocks="$2" label="$3"
+    local http_port=8545 authrpc_port=8551
+
+    echo "--- Starting reth ($label) ---"
+    sudo "$reth_bin" node \
+        --datadir "$DATADIR" \
+        --log.file.directory "/tmp/reth-${label}-logs" \
+        --engine.accept-execution-requests-hash \
+        --http --http.port "$http_port" \
+        --authrpc.port "$authrpc_port" \
+        --disable-discovery --no-persist-peers \
+        > "/tmp/reth-${label}.log" 2>&1 &
+    RETH_PID=$!
+
+    echo "Waiting for reth RPC..."
+    for i in $(seq 1 120); do
+        if curl -sf "http://127.0.0.1:$http_port" -X POST \
+            -H 'Content-Type: application/json' \
+            -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
+            > /dev/null 2>&1; then
+            echo "reth is ready after ${i}s"
+            break
+        fi
+        if [ "$i" -eq 120 ]; then
+            echo "error: reth failed to start within 120s"
+            cat "/tmp/reth-${label}.log"
+            exit 1
+        fi
+        sleep 1
+    done
+
+    echo "Running reth-bench ($blocks blocks)..."
+    "$RETH_BENCH_BIN" new-payload-fcu \
+        --rpc-url "$RPC_URL" \
+        --engine-rpc-url "http://127.0.0.1:$authrpc_port" \
+        --jwt-secret "$DATADIR/jwt.hex" \
+        --advance "$blocks" \
+        --reth-new-payload 2>&1 | sed -u "s/^/[$label] /"
+
+    echo "Stopping reth ($label)..."
+    sudo kill "$RETH_PID" 2>/dev/null || true
+    for i in $(seq 1 60); do
+        sudo kill -0 "$RETH_PID" 2>/dev/null || break
+        sleep 1
+    done
+    sudo kill -9 "$RETH_PID" 2>/dev/null || true
+    RETH_PID=
+}
+
+publish_binary() {
+    local source_bin="$1"
+    for out in "target/$TARGET/$PROFILE_DIR" "target/$PROFILE_DIR"; do
+        local destination="$out/reth"
+        mkdir -p "$out"
+        # Skip copying when source and destination resolve to the same inode.
+        if [ -e "$destination" ] && [ "$source_bin" -ef "$destination" ]; then
+            continue
+        fi
+        cp "$source_bin" "$destination"
+    done
+}
+
+if [ "$USE_PRECOLLECTED_PGO" = true ]; then
+    gha_section_start "Phase 1: Using Pre-Collected PGO Profile"
+    rm -rf "$PGO_DIR"
+    mkdir -p "$PGO_DIR"
+    cp "$PGO_PROFDATA" "$PGO_DIR/merged.profdata"
+    echo "Using pre-collected profile: $PGO_PROFDATA"
+    echo "PGO profile: $PGO_DIR/merged.profdata ($(ls -lh "$PGO_DIR/merged.profdata" | awk '{print $5}'))"
+    gha_section_end
+else
+    # ── Phase 1: PGO profile collection ───────────────────────────────────────
+    gha_section_start "Phase 1: PGO Profile Collection"
+
+    rm -rf "$PGO_DIR"
+    mkdir -p "$PGO_DIR"
+
+    echo "Building PGO-instrumented binary..."
+    RUSTFLAGS="-Cprofile-generate=$PGO_DIR -Crelocation-model=pic $COMBINED_RUSTFLAGS" \
+        cargo build "${CARGO_ARGS[@]}" --target "$TARGET"
+
+    PGO_RETH_BIN="$PWD/target/$TARGET/$PROFILE_DIR/reth"
+    echo "Instrumented binary: $PGO_RETH_BIN ($(ls -lh "$PGO_RETH_BIN" | awk '{print $5}'))"
+
+    run_bench_workload "$PGO_RETH_BIN" "$PGO_BLOCKS" "pgo"
+
+    # Fix ownership if reth ran as root.
+    sudo chown -R "$(id -un):$(id -gn)" "$PGO_DIR" 2>/dev/null || true
+
+    # Merge PGO profiles.
+    echo "Merging PGO profiles..."
+    PROFRAW_COUNT=$(find "$PGO_DIR" -name '*.profraw' | wc -l)
+    echo "Found $PROFRAW_COUNT .profraw files"
+    if [ "$PROFRAW_COUNT" -eq 0 ]; then
+        echo "error: no .profraw files — instrumented binary did not produce profiles"
+        exit 1
+    fi
+    "$LLVM_PROFDATA" merge -o "$PGO_DIR/merged.profdata" "$PGO_DIR"/*.profraw
+    echo "PGO profile: $PGO_DIR/merged.profdata ($(ls -lh "$PGO_DIR/merged.profdata" | awk '{print $5}'))"
+    gha_section_end
+fi
+
+if [ "$COLLECT_PGO_ONLY_BOOL" = true ]; then
+    gha_section_start "PGO Collection Complete"
+    echo "COLLECT_PGO_ONLY=true, skipping PGO/BOLT optimized binary build"
+    echo "Profile: $PGO_DIR/merged.profdata"
+    gha_section_end
+    exit 0
+fi
+
+if [ "$SKIP_BOLT_BOOL" = true ]; then
+    gha_section_start "BOLT Phase Skipped"
+    echo "SKIP_BOLT=$SKIP_BOLT, building PGO-only binary"
+    echo "Building PGO-optimized binary..."
+    RUSTFLAGS="-Cprofile-use=$PGO_DIR/merged.profdata $COMBINED_RUSTFLAGS" \
+        cargo build "${CARGO_ARGS[@]}" --target "$TARGET"
+
+    BUILT_BIN="$PWD/target/$TARGET/$PROFILE_DIR/reth"
+    if [ "$STRIP_SYMBOLS_BOOL" = true ]; then
+        echo "Stripping debug symbols..."
+        strip "$BUILT_BIN"
+    else
+        echo "Skipping strip (STRIP_SYMBOLS=$STRIP_SYMBOLS)"
+    fi
+    publish_binary "$BUILT_BIN"
+    gha_section_end
+else
+    # ── Phase 2: BOLT profile collection (with PGO) ──────────────────────────
+    gha_section_start "Phase 2: BOLT Profile Collection (with PGO)"
+
+    rm -rf "$BOLT_DIR"
+    mkdir -p "$BOLT_DIR"
+
+    echo "Building BOLT-instrumented binary with PGO..."
+    # --emit-relocs preserves relocation entries in the binary, required by llvm-bolt -instrument
+    RUSTFLAGS="-Cprofile-use=$PGO_DIR/merged.profdata -Clink-arg=-Wl,--emit-relocs $COMBINED_RUSTFLAGS" \
+        cargo build "${CARGO_ARGS[@]}" --target "$TARGET"
+
+    # Instrument with BOLT
+    BUILT_BIN="$PWD/target/$TARGET/$PROFILE_DIR/reth"
+    BOLT_INSTRUMENTED_BIN="$BUILT_BIN-bolt-instrumented"
+
+    echo "Instrumenting binary with BOLT..."
+    # --skip-funcs: skip compiler-generated drop_in_place functions that BOLT can't handle
+    # as split functions in relocation mode (triggered by --emit-relocs)
+    llvm-bolt "$BUILT_BIN" \
+        -instrument \
+        --instrumentation-file-append-pid \
+        --instrumentation-file="$BOLT_DIR/prof" \
+        --skip-funcs='.*drop_in_place.*' \
+        -o "$BOLT_INSTRUMENTED_BIN"
+    echo "BOLT-instrumented binary: $BOLT_INSTRUMENTED_BIN ($(ls -lh "$BOLT_INSTRUMENTED_BIN" | awk '{print $5}'))"
+
+    run_bench_workload "$BOLT_INSTRUMENTED_BIN" "$BOLT_BLOCKS" "bolt"
+
+    # Fix ownership for BOLT profiles
+    sudo chown -R "$(id -un):$(id -gn)" "$BOLT_DIR" 2>/dev/null || true
+
+    # Merge BOLT profiles
+    echo "Merging BOLT profiles..."
+    FDATA_COUNT=$(find "$BOLT_DIR" -name '*.fdata' | wc -l)
+    echo "Found $FDATA_COUNT .fdata files"
+    if [ "$FDATA_COUNT" -eq 0 ]; then
+        echo "error: no .fdata files — BOLT-instrumented binary did not produce profiles"
+        exit 1
+    fi
+    merge-fdata "$BOLT_DIR"/*.fdata > "$BOLT_DIR/merged.fdata"
+    echo "BOLT profile: $BOLT_DIR/merged.fdata ($(ls -lh "$BOLT_DIR/merged.fdata" | awk '{print $5}'))"
+    gha_section_end
+
+    # ── Phase 3: Final optimized build ───────────────────────────────────────
+    gha_section_start "Phase 3: Final PGO+BOLT Optimized Build"
+
+    echo "Building PGO-optimized binary..."
+    # --emit-relocs preserves relocation entries in the binary, required by llvm-bolt for code reordering
+    RUSTFLAGS="-Cprofile-use=$PGO_DIR/merged.profdata -Clink-arg=-Wl,--emit-relocs $COMBINED_RUSTFLAGS" \
+        cargo build "${CARGO_ARGS[@]}" --target "$TARGET"
+
+    BUILT_BIN="$PWD/target/$TARGET/$PROFILE_DIR/reth"
+    OPTIMIZED_BIN="$BUILT_BIN-bolt-optimized"
+
+    echo "Optimizing with BOLT..."
+    llvm-bolt "$BUILT_BIN" \
+        -o "$OPTIMIZED_BIN" \
+        --data "$BOLT_DIR/merged.fdata" \
+        -reorder-blocks=ext-tsp \
+        -reorder-functions=cdsort \
+        -split-functions \
+        -split-all-cold \
+        -dyno-stats \
+        -icf=1 \
+        -use-gnu-stack \
+        --skip-funcs='.*drop_in_place.*'
+
+    if [ "$STRIP_SYMBOLS_BOOL" = true ]; then
+        echo "Stripping debug symbols..."
+        strip "$OPTIMIZED_BIN"
+    else
+        echo "Skipping strip (STRIP_SYMBOLS=$STRIP_SYMBOLS)"
+    fi
+    publish_binary "$OPTIMIZED_BIN"
+    gha_section_end
+fi
+
+gha_section_start "Build Complete"
+ls -lh "target/$PROFILE_DIR/reth"
+echo "Output: target/$PROFILE_DIR/reth"
+gha_section_end
--- a/.github/workflows/bench.yml
+++ b/.github/workflows/bench.yml
@@ -8,11 +8,11 @@

 on:
  issue_comment:
-    types: [created, edited]
+    types: [created]
  workflow_dispatch:
    inputs:
      blocks:
-        description: "Number of blocks to benchmark"
+        description: "Number of blocks to benchmark (or 'big' for big blocks mode)"
        required: false
        default: "500"
        type: string
@@ -31,16 +31,46 @@ on:
        required: false
        default: ""
        type: string
+      wait_time:
+        description: "Fixed wait time between blocks (e.g. 500ms, 1s)"
+        required: false
+        default: ""
+        type: string
+      baseline_args:
+        description: "Extra CLI args for the baseline reth node"
+        required: false
+        default: ""
+        type: string
+      feature_args:
+        description: "Extra CLI args for the feature reth node"
+        required: false
+        default: ""
+        type: string
      samply:
        description: "Enable samply profiling"
        required: false
        default: "false"
        type: boolean
+      reth_newPayload:
+        description: "Use reth_newPayload RPC (server-side timing)"
+        required: false
+        default: "true"
+        type: boolean
      cores:
        description: "Limit reth to N CPU cores (0 = all available)"
        required: false
        default: "0"
        type: string
+      no_slack:
+        description: "Suppress Slack notifications for benchmark results"
+        required: false
+        default: "true"
+        type: boolean
+      abba:
+        description: "Run interleaved ABBA order (baseline, feature, feature, baseline); false = single AB pass"
+        required: false
+        default: "true"
+        type: boolean

 env:
  CARGO_TERM_COLOR: always
@@ -72,7 +102,14 @@ jobs:
      baseline-name: ${{ steps.args.outputs.baseline-name }}
      feature-name: ${{ steps.args.outputs.feature-name }}
      samply: ${{ steps.args.outputs.samply }}
+      no-slack: ${{ steps.args.outputs.no-slack }}
      cores: ${{ steps.args.outputs.cores }}
+      big-blocks: ${{ steps.args.outputs.big-blocks }}
+      reth-new-payload: ${{ steps.args.outputs.reth-new-payload }}
+      wait-time: ${{ steps.args.outputs.wait-time }}
+      baseline-args: ${{ steps.args.outputs.baseline-args }}
+      feature-args: ${{ steps.args.outputs.feature-args }}
+      abba: ${{ steps.args.outputs.abba }}
      comment-id: ${{ steps.ack.outputs.comment-id }}
    steps:
      - name: Check org membership
@@ -100,7 +137,7 @@ jobs:
        with:
          github-token: ${{ secrets.DEREK_PAT }}
          script: |
-            let pr, actor, blocks, warmup, baseline, feature, samply, cores;
+            let pr, actor, blocks, warmup, baseline, feature, samply, cores, bigBlocks;

            if (context.eventName === 'workflow_dispatch') {
              actor = '${{ github.actor }}';
@@ -109,7 +146,14 @@ jobs:
              baseline = '${{ github.event.inputs.baseline }}';
              feature = '${{ github.event.inputs.feature }}';
              samply = '${{ github.event.inputs.samply }}' === 'true' ? 'true' : 'false';
+              var noSlack = '${{ github.event.inputs.no_slack }}' !== 'false' ? 'true' : 'false';
              cores = '${{ github.event.inputs.cores }}' || '0';
+              bigBlocks = blocks === 'big' ? 'true' : 'false';
+              var rethNewPayload = '${{ github.event.inputs.reth_newPayload }}' !== 'false' ? 'true' : 'false';
+              var abba = '${{ github.event.inputs.abba }}' !== 'false' ? 'true' : 'false';
+              var waitTime = '${{ github.event.inputs.wait_time }}' || '';
+              var baselineNodeArgs = '${{ github.event.inputs.baseline_args }}' || '';
+              var featureNodeArgs = '${{ github.event.inputs.feature_args }}' || '';

              // Find PR for the selected branch
              const branch = '${{ github.ref_name }}';
@@ -129,37 +173,75 @@ jobs:
              actor = context.payload.comment.user.login;

              const body = context.payload.comment.body.trim();
-              const intArgs = new Set(['blocks', 'warmup', 'cores']);
+              const intArgs = new Set(['warmup', 'cores']);
+              const intOrKeywordArgs = new Map([['blocks', new Set(['big'])]]);
              const refArgs = new Set(['baseline', 'feature']);
-              const boolArgs = new Set(['samply']);
-              const defaults = { blocks: '500', warmup: '100', baseline: '', feature: '', samply: 'false', cores: '0' };
+              const boolArgs = new Set(['samply', 'no-slack']);
+              const boolDefaultTrue = new Set(['reth_newPayload', 'abba']);
+              const durationArgs = new Set(['wait-time']);
+              const stringArgs = new Set(['baseline-args', 'feature-args']);
+              const defaults = { blocks: '500', warmup: '100', baseline: '', feature: '', samply: 'false', 'no-slack': 'false', cores: '0', reth_newPayload: 'true', abba: 'true', 'wait-time': '', 'baseline-args': '', 'feature-args': '' };
              const unknown = [];
              const invalid = [];
              const args = body.replace(/^(?:@decofe|derek) bench\s*/, '');
-              for (const part of args.split(/\s+/).filter(Boolean)) {
+              // Parse args, handling quoted values like key="value with spaces"
+              const parts = [];
+              const argRegex = /(\S+?="[^"]*"|\S+?='[^']*'|\S+)/g;
+              let m;
+              while ((m = argRegex.exec(args)) !== null) parts.push(m[1]);
+              for (const part of parts) {
                const eq = part.indexOf('=');
                if (eq === -1) {
                  if (boolArgs.has(part)) {
                    defaults[part] = 'true';
+                  } else if (boolDefaultTrue.has(part)) {
+                    defaults[part] = 'true';
                  } else {
                    unknown.push(part);
                  }
                  continue;
                }
                const key = part.slice(0, eq);
-                const value = part.slice(eq + 1);
-                if (intArgs.has(key)) {
+                let value = part.slice(eq + 1);
+                // Strip surrounding quotes
+                if ((value.startsWith('"') && value.endsWith('"')) || (value.startsWith("'") && value.endsWith("'"))) {
+                  value = value.slice(1, -1);
+                }
+                if (boolDefaultTrue.has(key)) {
+                  if (value === 'true' || value === 'false') {
+                    defaults[key] = value;
+                  } else {
+                    invalid.push(`\`${key}=${value}\` (must be true or false)`);
+                  }
+                } else if (durationArgs.has(key)) {
+                  if (/^\d+(ms|s|m)$/.test(value)) {
+                    defaults[key] = value;
+                  } else {
+                    invalid.push(`\`${key}=${value}\` (must be a duration like 500ms, 1s, 2m)`);
+                  }
+                } else if (intArgs.has(key)) {
                  if (!/^\d+$/.test(value)) {
                    invalid.push(`\`${key}=${value}\` (must be a positive integer)`);
                  } else {
                    defaults[key] = value;
                  }
+                } else if (intOrKeywordArgs.has(key)) {
+                  const keywords = intOrKeywordArgs.get(key);
+                  if (keywords.has(value)) {
+                    defaults[key] = value;
+                  } else if (/^\d+$/.test(value)) {
+                    defaults[key] = value;
+                  } else {
+                    invalid.push(`\`${key}=${value}\` (must be a positive integer or one of: ${[...keywords].join(', ')})`);
+                  }
                } else if (refArgs.has(key)) {
                  if (!value) {
                    invalid.push(`\`${key}=\` (must be a git ref)`);
                  } else {
                    defaults[key] = value;
                  }
+                } else if (stringArgs.has(key)) {
+                  defaults[key] = value;
                } else {
                  unknown.push(key);
                }
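Review note: the comment-argument parsing in this hunk can be read as one standalone function. The following is a minimal sketch of that logic, assuming the same argument sets and defaults as the hunk. The function name `parseBenchArgs` and its return shape are hypothetical, and the error messages are shortened to `key=value`.

```javascript
// Hypothetical standalone version of the parsing logic in the hunk above.
// `parseBenchArgs` and the { values, unknown, invalid } shape are assumptions.
function parseBenchArgs(body) {
  const intArgs = new Set(['warmup', 'cores']);
  const intOrKeywordArgs = new Map([['blocks', new Set(['big'])]]);
  const refArgs = new Set(['baseline', 'feature']);
  const boolArgs = new Set(['samply', 'no-slack']);
  const boolDefaultTrue = new Set(['reth_newPayload', 'abba']);
  const durationArgs = new Set(['wait-time']);
  const stringArgs = new Set(['baseline-args', 'feature-args']);
  const values = {
    blocks: '500', warmup: '100', baseline: '', feature: '',
    samply: 'false', 'no-slack': 'false', cores: '0',
    reth_newPayload: 'true', abba: 'true',
    'wait-time': '', 'baseline-args': '', 'feature-args': '',
  };
  const unknown = [];
  const invalid = [];

  const args = body.trim().replace(/^(?:@decofe|derek) bench\s*/, '');
  // Tokenize, keeping key="value with spaces" and key='...' as single tokens.
  const parts = args.match(/(\S+?="[^"]*"|\S+?='[^']*'|\S+)/g) || [];

  for (const part of parts) {
    const eq = part.indexOf('=');
    if (eq === -1) {
      // Bare flags: both flag kinds resolve to 'true'.
      if (boolArgs.has(part) || boolDefaultTrue.has(part)) values[part] = 'true';
      else unknown.push(part);
      continue;
    }
    const key = part.slice(0, eq);
    let value = part.slice(eq + 1);
    // Strip one pair of surrounding quotes.
    if ((value.startsWith('"') && value.endsWith('"')) ||
        (value.startsWith("'") && value.endsWith("'"))) {
      value = value.slice(1, -1);
    }
    let ok;
    if (boolDefaultTrue.has(key)) ok = value === 'true' || value === 'false';
    else if (durationArgs.has(key)) ok = /^\d+(ms|s|m)$/.test(value);
    else if (intArgs.has(key)) ok = /^\d+$/.test(value);
    else if (intOrKeywordArgs.has(key)) ok = intOrKeywordArgs.get(key).has(value) || /^\d+$/.test(value);
    else if (refArgs.has(key)) ok = value !== '';
    else if (stringArgs.has(key)) ok = true;
    else { unknown.push(key); continue; }

    if (ok) values[key] = value;
    else invalid.push(`${key}=${value}`);
  }
  return { values, unknown, invalid };
}
```

As in the hunk, `samply` and `no-slack` are accepted only as bare flags, so `samply=true` lands in `unknown`. Default-true flags such as `abba` can only be turned off with an explicit `abba=false`.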
@@ -168,7 +250,7 @@ jobs:
              if (unknown.length) errors.push(`Unknown argument(s): \`${unknown.join('`, `')}\``);
              if (invalid.length) errors.push(`Invalid value(s): ${invalid.join(', ')}`);
              if (errors.length) {
-                const msg = `❌ **Invalid bench command**\n\n${errors.join('\n')}\n\n**Usage:** \`@decofe bench [blocks=N] [warmup=N] [baseline=REF] [feature=REF] [samply] [cores=N]\``;
+                const msg = `❌ **Invalid bench command**\n\n${errors.join('\n')}\n\n**Usage:** \`@decofe bench [blocks=N|big] [warmup=N] [baseline=REF] [feature=REF] [samply] [no-slack] [cores=N] [reth_newPayload=true|false] [abba=true|false] [wait-time=DURATION] [baseline-args="..."] [feature-args="..."]\``;
                await github.rest.issues.createComment({
                  owner: context.repo.owner,
                  repo: context.repo.repo,
@@ -183,7 +265,14 @@ jobs:
              baseline = defaults.baseline;
              feature = defaults.feature;
              samply = defaults.samply;
+              var noSlack = defaults['no-slack'];
              cores = defaults.cores;
+              bigBlocks = blocks === 'big' ? 'true' : 'false';
+              var rethNewPayload = defaults.reth_newPayload;
+              var abba = defaults.abba;
+              var waitTime = defaults['wait-time'];
+              var baselineNodeArgs = defaults['baseline-args'];
+              var featureNodeArgs = defaults['feature-args'];
            }

            // Resolve display names for baseline/feature
@@ -211,7 +300,14 @@ jobs:
            core.setOutput('baseline-name', baselineName);
            core.setOutput('feature-name', featureName);
            core.setOutput('samply', samply);
+            core.setOutput('no-slack', noSlack);
            core.setOutput('cores', cores);
+            core.setOutput('big-blocks', bigBlocks);
+            core.setOutput('reth-new-payload', rethNewPayload);
+            core.setOutput('wait-time', waitTime);
+            core.setOutput('baseline-args', baselineNodeArgs);
+            core.setOutput('feature-args', featureNodeArgs);
+            core.setOutput('abba', abba);

      - name: Acknowledge request
        id: ack
@@ -269,10 +365,24 @@ jobs:
            const baseline = '${{ steps.args.outputs.baseline-name }}';
            const feature = '${{ steps.args.outputs.feature-name }}';
            const samply = '${{ steps.args.outputs.samply }}' === 'true';
+            const noSlack = '${{ steps.args.outputs.no-slack }}' === 'true';
+            const bigBlocks = '${{ steps.args.outputs.big-blocks }}' === 'true';
            const samplyNote = samply ? ', samply: `enabled`' : '';
+            const noSlackNote = noSlack ? ', no-slack' : '';
            const cores = '${{ steps.args.outputs.cores }}';
            const coresNote = cores && cores !== '0' ? `, cores: \`${cores}\`` : '';
-            const config = `**Config:** ${blocks} blocks, ${warmup} warmup blocks, baseline: \`${baseline}\`, feature: \`${feature}\`${samplyNote}${coresNote}`;
+            const rethNP = '${{ steps.args.outputs.reth-new-payload }}' !== 'false';
+            const rethNPNote = !rethNP ? ', reth_newPayload: `disabled`' : '';
+            const abbaEnabled = '${{ steps.args.outputs.abba }}' !== 'false';
+            const abbaNote = !abbaEnabled ? ', abba: `disabled`' : '';
+            const waitTimeVal = '${{ steps.args.outputs.wait-time }}';
+            const waitTimeNote = waitTimeVal ? `, wait-time: \`${waitTimeVal}\`` : '';
+            const baselineArgsVal = '${{ steps.args.outputs.baseline-args }}';
+            const baselineArgsNote = baselineArgsVal ? `, baseline-args: \`${baselineArgsVal}\`` : '';
+            const featureArgsVal = '${{ steps.args.outputs.feature-args }}';
+            const featureArgsNote = featureArgsVal ? `, feature-args: \`${featureArgsVal}\`` : '';
+            const blocksDesc = bigBlocks ? 'blocks: `big`' : `${blocks} blocks, ${warmup} warmup blocks`;
+            const config = `**Config:** ${blocksDesc}, baseline: \`${baseline}\`, feature: \`${feature}\`${samplyNote}${noSlackNote}${coresNote}${rethNPNote}${abbaNote}${waitTimeNote}${baselineArgsNote}${featureArgsNote}`;

            const { data: comment } = await github.rest.issues.createComment({
              owner: context.repo.owner,
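Review note: the `**Config:**` line assembled in this hunk is built the same way in three steps of the workflow. As a sketch only, the assembly could be expressed as one helper. The name `buildConfigLine` and the shape of its options object are assumptions for illustration and do not exist in the repository.

```javascript
// Hypothetical helper mirroring the config-line assembly in the hunk above.
// All option values are strings, matching how step outputs arrive.
function buildConfigLine(o) {
  const notes = [
    o.bigBlocks === 'true'
      ? 'blocks: `big`'
      : `${o.blocks} blocks, ${o.warmup} warmup blocks`,
    `baseline: \`${o.baseline}\``,
    `feature: \`${o.feature}\``,
  ];
  if (o.samply === 'true') notes.push('samply: `enabled`');
  if (o.noSlack === 'true') notes.push('no-slack');
  if (o.cores && o.cores !== '0') notes.push(`cores: \`${o.cores}\``);
  // Default-true flags are mentioned only when explicitly disabled.
  if (o.rethNewPayload === 'false') notes.push('reth_newPayload: `disabled`');
  if (o.abba === 'false') notes.push('abba: `disabled`');
  if (o.waitTime) notes.push(`wait-time: \`${o.waitTime}\``);
  if (o.baselineArgs) notes.push(`baseline-args: \`${o.baselineArgs}\``);
  if (o.featureArgs) notes.push(`feature-args: \`${o.featureArgs}\``);
  return `**Config:** ${notes.join(', ')}`;
}
```

Joining the parts with `', '` gives the same string as concatenating notes that each start with `', '`, which is how the hunk builds it. Optional settings are omitted from the line when they are at their default values.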
@@ -297,10 +407,24 @@ jobs:
            const baseline = '${{ steps.args.outputs.baseline-name }}';
            const feature = '${{ steps.args.outputs.feature-name }}';
            const samply = '${{ steps.args.outputs.samply }}' === 'true';
+            const noSlack = '${{ steps.args.outputs.no-slack }}' === 'true';
+            const bigBlocks = '${{ steps.args.outputs.big-blocks }}' === 'true';
            const samplyNote = samply ? ', samply: `enabled`' : '';
+            const noSlackNote = noSlack ? ', no-slack' : '';
            const cores = '${{ steps.args.outputs.cores }}';
            const coresNote = cores && cores !== '0' ? `, cores: \`${cores}\`` : '';
-            const config = `**Config:** ${blocks} blocks, ${warmup} warmup blocks, baseline: \`${baseline}\`, feature: \`${feature}\`${samplyNote}${coresNote}`;
+            const rethNP = '${{ steps.args.outputs.reth-new-payload }}' !== 'false';
+            const rethNPNote = !rethNP ? ', reth_newPayload: `disabled`' : '';
+            const abbaEnabled = '${{ steps.args.outputs.abba }}' !== 'false';
+            const abbaNote = !abbaEnabled ? ', abba: `disabled`' : '';
+            const waitTimeVal = '${{ steps.args.outputs.wait-time }}';
+            const waitTimeNote = waitTimeVal ? `, wait-time: \`${waitTimeVal}\`` : '';
+            const baselineArgsVal = '${{ steps.args.outputs.baseline-args }}';
+            const baselineArgsNote = baselineArgsVal ? `, baseline-args: \`${baselineArgsVal}\`` : '';
+            const featureArgsVal = '${{ steps.args.outputs.feature-args }}';
+            const featureArgsNote = featureArgsVal ? `, feature-args: \`${featureArgsVal}\`` : '';
+            const blocksDesc = bigBlocks ? 'blocks: `big`' : `${blocks} blocks, ${warmup} warmup blocks`;
+            const config = `**Config:** ${blocksDesc}, baseline: \`${baseline}\`, feature: \`${feature}\`${samplyNote}${noSlackNote}${coresNote}${rethNPNote}${abbaNote}${waitTimeNote}${baselineArgsNote}${featureArgsNote}`;
            const runUrl = `${context.serverUrl}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`;

            const numRunners = parseInt(process.env.BENCH_RUNNERS) || 1;
@@ -352,7 +476,7 @@ jobs:
  reth-bench:
    needs: reth-bench-ack
    name: reth-bench
-    runs-on: [self-hosted, Linux, X64]
+    runs-on: [self-hosted, Linux, X64, available]
    timeout-minutes: 120
    env:
      BENCH_RPC_URL: https://ethereum.reth.rs/rpc
@@ -364,7 +488,17 @@ jobs:
      BENCH_WARMUP_BLOCKS: ${{ needs.reth-bench-ack.outputs.warmup }}
      BENCH_SAMPLY: ${{ needs.reth-bench-ack.outputs.samply }}
      BENCH_CORES: ${{ needs.reth-bench-ack.outputs.cores }}
+      BENCH_BIG_BLOCKS: ${{ needs.reth-bench-ack.outputs.big-blocks }}
+      BENCH_RETH_NEW_PAYLOAD: ${{ needs.reth-bench-ack.outputs.reth-new-payload }}
+      BENCH_WAIT_TIME: ${{ needs.reth-bench-ack.outputs.wait-time }}
+      BENCH_BASELINE_ARGS: ${{ needs.reth-bench-ack.outputs.baseline-args }}
+      BENCH_FEATURE_ARGS: ${{ needs.reth-bench-ack.outputs.feature-args }}
+      BENCH_ABBA: ${{ needs.reth-bench-ack.outputs.abba }}
      BENCH_COMMENT_ID: ${{ needs.reth-bench-ack.outputs.comment-id }}
+      BENCH_NO_SLACK: ${{ needs.reth-bench-ack.outputs.no-slack }}
+      BENCH_METRICS_ADDR: "127.0.0.1:9100"
+      BENCH_OTLP_TRACES_ENDPOINT: ${{ secrets.BENCH_OTLP_TRACES_ENDPOINT }}
+      BENCH_OTLP_LOGS_ENDPOINT: ${{ secrets.BENCH_OTLP_LOGS_ENDPOINT }}
    steps:
      - name: Clean up previous bench-work
        run: sudo rm -rf "$BENCH_WORK_DIR" 2>/dev/null || true
@@ -383,13 +517,11 @@ jobs:
              repo: context.repo.repo,
              pull_number: parseInt(process.env.BENCH_PR),
            });
-            // For closed/merged PRs, the merge ref doesn't exist — use head SHA
-            if (pr.state !== 'open') {
-              core.info(`PR #${process.env.BENCH_PR} is ${pr.state}, using head SHA ${pr.head.sha}`);
-              core.setOutput('ref', pr.head.sha);
-            } else {
-              core.setOutput('ref', `refs/pull/${process.env.BENCH_PR}/merge`);
-            }
+            // Always use head SHA — the merge ref (refs/pull/N/merge) may not
+            // exist if the PR has conflicts, was force-pushed, or was
+            // merged/closed between this step and checkout.
+            core.info(`PR #${process.env.BENCH_PR} (${pr.state}), using head SHA ${pr.head.sha}`);
+            core.setOutput('ref', pr.head.sha);

      - uses: actions/checkout@v6
        with:
@@ -417,10 +549,24 @@ jobs:
            const baseline = '${{ needs.reth-bench-ack.outputs.baseline-name }}';
            const feature = '${{ needs.reth-bench-ack.outputs.feature-name }}';
            const samply = process.env.BENCH_SAMPLY === 'true';
+            const noSlack = process.env.BENCH_NO_SLACK === 'true';
+            const bigBlocks = process.env.BENCH_BIG_BLOCKS === 'true';
            const samplyNote = samply ? ', samply: `enabled`' : '';
+            const noSlackNote = noSlack ? ', no-slack' : '';
            const cores = process.env.BENCH_CORES || '0';
            const coresNote = cores && cores !== '0' ? `, cores: \`${cores}\`` : '';
-            core.exportVariable('BENCH_CONFIG', `**Config:** ${blocks} blocks, ${warmup} warmup blocks, baseline: \`${baseline}\`, feature: \`${feature}\`${samplyNote}${coresNote}`);
+            const rethNP = (process.env.BENCH_RETH_NEW_PAYLOAD || 'true') !== 'false';
+            const rethNPNote = !rethNP ? ', reth_newPayload: `disabled`' : '';
+            const abbaEnabled = (process.env.BENCH_ABBA || 'true') !== 'false';
+            const abbaNote = !abbaEnabled ? ', abba: `disabled`' : '';
+            const waitTimeVal = process.env.BENCH_WAIT_TIME || '';
+            const waitTimeNote = waitTimeVal ? `, wait-time: \`${waitTimeVal}\`` : '';
+            const baselineArgsVal = process.env.BENCH_BASELINE_ARGS || '';
+            const baselineArgsNote = baselineArgsVal ? `, baseline-args: \`${baselineArgsVal}\`` : '';
+            const featureArgsVal = process.env.BENCH_FEATURE_ARGS || '';
+            const featureArgsNote = featureArgsVal ? `, feature-args: \`${featureArgsVal}\`` : '';
+            const blocksDesc = bigBlocks ? 'blocks: `big`' : `${blocks} blocks, ${warmup} warmup blocks`;
+            core.exportVariable('BENCH_CONFIG', `**Config:** ${blocksDesc}, baseline: \`${baseline}\`, feature: \`${feature}\`${samplyNote}${noSlackNote}${coresNote}${rethNPNote}${abbaNote}${waitTimeNote}${baselineArgsNote}${featureArgsNote}`);

            const { buildBody } = require('./.github/scripts/bench-update-status.js');
            await github.rest.issues.updateComment({
@@ -680,6 +826,45 @@ jobs:
          rm -rf "$BENCH_WORK_DIR"
          mkdir -p "$BENCH_WORK_DIR"

+      - name: Download big blocks
+        if: env.BENCH_BIG_BLOCKS == 'true'
+        run: |
+          set -euo pipefail
+          MC="mc --config-dir /home/ubuntu/.mc"
+          BUCKET="minio/reth-snapshots/reth-1-minimal-nightly-previous-big-blocks.tar.zst"
+          BIG_BLOCKS_DIR="${BENCH_WORK_DIR}/big-blocks"
+          rm -rf "$BIG_BLOCKS_DIR"; mkdir -p "$BIG_BLOCKS_DIR"
+          echo "Downloading big blocks from $BUCKET..."
+          $MC cat "$BUCKET" | pzstd -d -p 6 | tar -xf - -C "$BIG_BLOCKS_DIR"
+          echo "Big blocks downloaded to $BIG_BLOCKS_DIR"
+          # Verify expected directory structure
+          if [ ! -d "$BIG_BLOCKS_DIR/gas-ramp-dir" ] || [ ! -d "$BIG_BLOCKS_DIR/payloads" ]; then
+            echo "::error::Big blocks archive missing expected gas-ramp-dir/ or payloads/ directories"
+            ls -laR "$BIG_BLOCKS_DIR"
+            exit 1
+          fi
+          echo "Payload files: $(find "$BIG_BLOCKS_DIR/payloads" -name '*.json' | wc -l)"
+
+      - name: Start metrics proxy
+        run: |
+          BENCH_ID="ci-${{ github.run_id }}"
+          BENCH_REFERENCE_EPOCH=$(date +%s)
+          echo "BENCH_ID=${BENCH_ID}" >> "$GITHUB_ENV"
+          echo "BENCH_REFERENCE_EPOCH=${BENCH_REFERENCE_EPOCH}" >> "$GITHUB_ENV"
+
+          LABELS_FILE="/tmp/bench-metrics-labels.json"
+          echo '{}' > "$LABELS_FILE"
+          echo "BENCH_LABELS_FILE=${LABELS_FILE}" >> "$GITHUB_ENV"
+
+          python3 .github/scripts/bench-metrics-proxy.py \
+            --labels "$LABELS_FILE" \
+            --upstream "http://${BENCH_METRICS_ADDR}/" \
+            --subnet 10.10.0.0/24 \
+            --port 9090 &
+          PROXY_PID=$!
+          echo "BENCH_METRICS_PROXY_PID=${PROXY_PID}" >> "$GITHUB_ENV"
+          echo "Metrics proxy started (PID $PROXY_PID)"
+
      - name: Update status (running benchmarks)
        if: success() && env.BENCH_COMMENT_ID
        uses: actions/github-script@v8
@@ -693,19 +878,64 @@ jobs:
      # thermal drift and cache warming.
      - name: "Run benchmark: baseline (1/2)"
        id: run-baseline-1
-        run: taskset -c 0 .github/scripts/bench-reth-run.sh baseline ../reth-baseline/target/profiling/reth "$BENCH_WORK_DIR/baseline-1"
+        env:
+          BASELINE_REF: ${{ steps.refs.outputs.baseline-ref }}
+          OTEL_RESOURCE_ATTRIBUTES: "benchmark_id=${{ env.BENCH_ID }},benchmark_run=baseline-1,run_type=baseline,git_ref=${{ steps.refs.outputs.baseline-ref }}"
+        run: |
+          cat > "$BENCH_LABELS_FILE" <<LABELS
+          {"benchmark_run":"baseline-1","run_type":"baseline","git_ref":"${BASELINE_REF}","bench_sha":"${BASELINE_REF}","benchmark_id":"${BENCH_ID}","run_start_epoch":"$(date +%s)","reference_epoch":"${BENCH_REFERENCE_EPOCH}"}
+          LABELS
+          taskset -c 0 .github/scripts/bench-reth-run.sh baseline ../reth-baseline/target/profiling/reth "$BENCH_WORK_DIR/baseline-1"

      - name: "Run benchmark: feature (1/2)"
        id: run-feature-1
-        run: taskset -c 0 .github/scripts/bench-reth-run.sh feature ../reth-feature/target/profiling/reth "$BENCH_WORK_DIR/feature-1"
+        env:
+          FEATURE_REF: ${{ steps.refs.outputs.feature-ref }}
+          OTEL_RESOURCE_ATTRIBUTES: "benchmark_id=${{ env.BENCH_ID }},benchmark_run=feature-1,run_type=feature,git_ref=${{ steps.refs.outputs.feature-ref }}"
+        run: |
+          cat > "$BENCH_LABELS_FILE" <<LABELS
+          {"benchmark_run":"feature-1","run_type":"feature","git_ref":"${FEATURE_REF}","bench_sha":"${FEATURE_REF}","benchmark_id":"${BENCH_ID}","run_start_epoch":"$(date +%s)","reference_epoch":"${BENCH_REFERENCE_EPOCH}"}
+          LABELS
+          taskset -c 0 .github/scripts/bench-reth-run.sh feature ../reth-feature/target/profiling/reth "$BENCH_WORK_DIR/feature-1"

      - name: "Run benchmark: feature (2/2)"
+        if: env.BENCH_ABBA != 'false'
        id: run-feature-2
-        run: taskset -c 0 .github/scripts/bench-reth-run.sh feature ../reth-feature/target/profiling/reth "$BENCH_WORK_DIR/feature-2"
+        env:
+          FEATURE_REF: ${{ steps.refs.outputs.feature-ref }}
+          OTEL_RESOURCE_ATTRIBUTES: "benchmark_id=${{ env.BENCH_ID }},benchmark_run=feature-2,run_type=feature,git_ref=${{ steps.refs.outputs.feature-ref }}"
+        run: |
+          cat > "$BENCH_LABELS_FILE" <<LABELS
+          {"benchmark_run":"feature-2","run_type":"feature","git_ref":"${FEATURE_REF}","bench_sha":"${FEATURE_REF}","benchmark_id":"${BENCH_ID}","run_start_epoch":"$(date +%s)","reference_epoch":"${BENCH_REFERENCE_EPOCH}"}
+          LABELS
+          taskset -c 0 .github/scripts/bench-reth-run.sh feature ../reth-feature/target/profiling/reth "$BENCH_WORK_DIR/feature-2"

      - name: "Run benchmark: baseline (2/2)"
+        if: env.BENCH_ABBA != 'false'
        id: run-baseline-2
-        run: taskset -c 0 .github/scripts/bench-reth-run.sh baseline ../reth-baseline/target/profiling/reth "$BENCH_WORK_DIR/baseline-2"
+        env:
+          BASELINE_REF: ${{ steps.refs.outputs.baseline-ref }}
+          OTEL_RESOURCE_ATTRIBUTES: "benchmark_id=${{ env.BENCH_ID }},benchmark_run=baseline-2,run_type=baseline,git_ref=${{ steps.refs.outputs.baseline-ref }}"
+        run: |
+          LAST_RUN_START=$(date +%s)
+          echo "BENCH_LAST_RUN_START=${LAST_RUN_START}" >> "$GITHUB_ENV"
+          cat > "$BENCH_LABELS_FILE" <<LABELS
+          {"benchmark_run":"baseline-2","run_type":"baseline","git_ref":"${BASELINE_REF}","bench_sha":"${BASELINE_REF}","benchmark_id":"${BENCH_ID}","run_start_epoch":"${LAST_RUN_START}","reference_epoch":"${BENCH_REFERENCE_EPOCH}"}
+          LABELS
+          taskset -c 0 .github/scripts/bench-reth-run.sh baseline ../reth-baseline/target/profiling/reth "$BENCH_WORK_DIR/baseline-2"
+
+      - name: Stop metrics proxy & generate Grafana URL
+        id: metrics
+        if: "!cancelled()"
+        run: |
+          kill "$BENCH_METRICS_PROXY_PID" 2>/dev/null || true
+
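+          # BENCH_LAST_RUN_START is only exported by the baseline (2/2) step, which is
+          # skipped when abba=false; fall back to the reference epoch so the window stays sane.
+          LAST_RUN_DURATION=$(( $(date +%s) - ${BENCH_LAST_RUN_START:-$BENCH_REFERENCE_EPOCH} ))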
+          FROM_MS=$(( BENCH_REFERENCE_EPOCH * 1000 ))
+          TO_MS=$(( (BENCH_REFERENCE_EPOCH + LAST_RUN_DURATION) * 1000 ))
+          GRAFANA_URL="https://tempoxyz.grafana.net/d/reth-bench-ghr/reth-bench-ghr?orgId=1&from=${FROM_MS}&to=${TO_MS}&timezone=browser&var-datasource=ef57fux92e9z4e&var-job=reth-bench&var-benchmark_id=${BENCH_ID}&var-benchmark_run=\$__all"
+          echo "grafana-url=${GRAFANA_URL}" >> "$GITHUB_OUTPUT"
+          echo "Grafana URL: ${GRAFANA_URL}"

      - name: Scan logs for errors
        if: "!cancelled()"
@@ -807,12 +1037,30 @@ jobs:
          SUMMARY_ARGS="$SUMMARY_ARGS --baseline-name ${BASELINE_NAME}"
          SUMMARY_ARGS="$SUMMARY_ARGS --feature-name ${FEATURE_NAME}"
          SUMMARY_ARGS="$SUMMARY_ARGS --feature-ref ${FEATURE_REF}"
-          SUMMARY_ARGS="$SUMMARY_ARGS --baseline-csv $BENCH_WORK_DIR/baseline-1/combined_latency.csv $BENCH_WORK_DIR/baseline-2/combined_latency.csv"
-          SUMMARY_ARGS="$SUMMARY_ARGS --feature-csv $BENCH_WORK_DIR/feature-1/combined_latency.csv $BENCH_WORK_DIR/feature-2/combined_latency.csv"
+          BASELINE_CSVS="$BENCH_WORK_DIR/baseline-1/combined_latency.csv"
+          FEATURE_CSVS="$BENCH_WORK_DIR/feature-1/combined_latency.csv"
+          if [ "${BENCH_ABBA:-true}" = "true" ]; then
+            BASELINE_CSVS="$BASELINE_CSVS $BENCH_WORK_DIR/baseline-2/combined_latency.csv"
+            FEATURE_CSVS="$FEATURE_CSVS $BENCH_WORK_DIR/feature-2/combined_latency.csv"
+          fi
+          SUMMARY_ARGS="$SUMMARY_ARGS --baseline-csv $BASELINE_CSVS"
+          SUMMARY_ARGS="$SUMMARY_ARGS --feature-csv $FEATURE_CSVS"
          SUMMARY_ARGS="$SUMMARY_ARGS --gas-csv $BENCH_WORK_DIR/feature-1/total_gas.csv"
          if [ "$BEHIND_BASELINE" -gt 0 ]; then
            SUMMARY_ARGS="$SUMMARY_ARGS --behind-baseline $BEHIND_BASELINE"
          fi
+          if [ "${BENCH_BIG_BLOCKS:-false}" = "true" ]; then
+            SUMMARY_ARGS="$SUMMARY_ARGS --big-blocks"
+            # Read gas ramp blocks count from first baseline run (same for all runs)
+            GAS_RAMP_FILE="$BENCH_WORK_DIR/baseline-1/gas_ramp_blocks.txt"
+            if [ -f "$GAS_RAMP_FILE" ]; then
+              SUMMARY_ARGS="$SUMMARY_ARGS --gas-ramp-blocks $(tr -d '[:space:]' < "$GAS_RAMP_FILE")"
+            fi
+          fi
+          GRAFANA_URL='${{ steps.metrics.outputs.grafana-url }}'
+          if [ -n "$GRAFANA_URL" ]; then
+            SUMMARY_ARGS="$SUMMARY_ARGS --grafana-url $GRAFANA_URL"
+          fi
          # shellcheck disable=SC2086
          python3 .github/scripts/bench-reth-summary.py $SUMMARY_ARGS

@@ -823,8 +1071,14 @@ jobs:
          FEATURE_NAME: ${{ steps.refs.outputs.feature-name }}
        run: |
          CHART_ARGS="--output-dir $BENCH_WORK_DIR/charts"
-          CHART_ARGS="$CHART_ARGS --feature $BENCH_WORK_DIR/feature-1/combined_latency.csv $BENCH_WORK_DIR/feature-2/combined_latency.csv"
-          CHART_ARGS="$CHART_ARGS --baseline $BENCH_WORK_DIR/baseline-1/combined_latency.csv $BENCH_WORK_DIR/baseline-2/combined_latency.csv"
+          FEATURE_CSVS="$BENCH_WORK_DIR/feature-1/combined_latency.csv"
+          BASELINE_CSVS="$BENCH_WORK_DIR/baseline-1/combined_latency.csv"
+          if [ "${BENCH_ABBA:-true}" = "true" ]; then
+            FEATURE_CSVS="$FEATURE_CSVS $BENCH_WORK_DIR/feature-2/combined_latency.csv"
+            BASELINE_CSVS="$BASELINE_CSVS $BENCH_WORK_DIR/baseline-2/combined_latency.csv"
+          fi
+          CHART_ARGS="$CHART_ARGS --feature $FEATURE_CSVS"
+          CHART_ARGS="$CHART_ARGS --baseline $BASELINE_CSVS"
          CHART_ARGS="$CHART_ARGS --baseline-name ${BASELINE_NAME}"
          CHART_ARGS="$CHART_ARGS --feature-name ${FEATURE_NAME}"
          # shellcheck disable=SC2086
@@ -832,7 +1086,7 @@ jobs:

      - name: Upload results
        if: "!cancelled()"
-        uses: actions/upload-artifact@v6
+        uses: actions/upload-artifact@v7
        with:
          name: bench-reth-results
          path: ${{ env.BENCH_WORK_DIR }}
@@ -864,7 +1118,7 @@ jobs:
          rm -rf "${TMP_DIR}"

      - name: Compare & comment
-        if: success()
+        if: success() && env.BENCH_COMMENT_ID
        uses: actions/github-script@v8
        with:
          github-token: ${{ secrets.DEREK_PAT }}
@@ -900,7 +1154,8 @@ jobs:

            // Samply profile links (URLs point directly to Firefox Profiler)
            if (process.env.BENCH_SAMPLY === 'true') {
-              const runs = ['baseline-1', 'feature-1', 'feature-2', 'baseline-2'];
+              const abba = (process.env.BENCH_ABBA || 'true') !== 'false';
+              const runs = abba ? ['baseline-1', 'feature-1', 'feature-2', 'baseline-2'] : ['baseline-1', 'feature-1'];
              const links = [];
              for (const run of runs) {
                try {
@@ -915,6 +1170,12 @@ jobs:
              }
            }

+            // Grafana dashboard link
+            const grafanaUrl = '${{ steps.metrics.outputs.grafana-url }}';
+            if (grafanaUrl) {
+              comment += `\n\n### Grafana Dashboard\n\n[View real-time metrics](${grafanaUrl})\n`;
+            }
+
            // Node errors (panics / ERROR logs)
            try {
              const errors = fs.readFileSync(process.env.BENCH_WORK_DIR + '/errors.md', 'utf8');
@@ -940,7 +1201,7 @@ jobs:
            }

      - name: Send Slack notification (success)
-        if: success()
+        if: success() && env.BENCH_NO_SLACK != 'true'
        uses: actions/github-script@v8
        env:
          SLACK_BENCH_BOT_TOKEN: ${{ secrets.SLACK_BENCH_BOT_TOKEN }}
@@ -956,12 +1217,13 @@ jobs:
        with:
          github-token: ${{ secrets.DEREK_PAT }}
          script: |
+            const abba = (process.env.BENCH_ABBA || 'true') !== 'false';
            const steps_status = [
              ['building binaries${{ steps.snapshot-check.outputs.needed == 'true' && ' & downloading snapshot' || '' }}', '${{ steps.build.outcome }}'],
              ['running baseline benchmark (1/2)', '${{ steps.run-baseline-1.outcome }}'],
              ['running feature benchmark (1/2)', '${{ steps.run-feature-1.outcome }}'],
-              ['running feature benchmark (2/2)', '${{ steps.run-feature-2.outcome }}'],
-              ['running baseline benchmark (2/2)', '${{ steps.run-baseline-2.outcome }}'],
+              ...(abba ? [['running feature benchmark (2/2)', '${{ steps.run-feature-2.outcome }}']] : []),
+              ...(abba ? [['running baseline benchmark (2/2)', '${{ steps.run-baseline-2.outcome }}']] : []),
            ];
            const failed = steps_status.find(([, o]) => o === 'failure');
            const failedStep = failed ? failed[0] : 'unknown step';
@@ -990,12 +1252,13 @@ jobs:
          SLACK_BENCH_CHANNEL: ${{ secrets.SLACK_BENCH_CHANNEL }}
        with:
          script: |
+            const abba = (process.env.BENCH_ABBA || 'true') !== 'false';
            const steps_status = [
              ['building binaries${{ steps.snapshot-check.outputs.needed == 'true' && ' & downloading snapshot' || '' }}', '${{ steps.build.outcome }}'],
              ['running baseline benchmark (1/2)', '${{ steps.run-baseline-1.outcome }}'],
              ['running feature benchmark (1/2)', '${{ steps.run-feature-1.outcome }}'],
-              ['running feature benchmark (2/2)', '${{ steps.run-feature-2.outcome }}'],
-              ['running baseline benchmark (2/2)', '${{ steps.run-baseline-2.outcome }}'],
+              ...(abba ? [['running feature benchmark (2/2)', '${{ steps.run-feature-2.outcome }}']] : []),
+              ...(abba ? [['running baseline benchmark (2/2)', '${{ steps.run-baseline-2.outcome }}']] : []),
            ];
            const failed = steps_status.find(([, o]) => o === 'failure');
            const failedStep = failed ? failed[0] : 'unknown step';
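
The "Stop metrics proxy & generate Grafana URL" step above anchors the dashboard window at the reference epoch and extends it by the duration of the last run. A minimal sketch of that arithmetic follows. `grafana_window` is a hypothetical helper name that is not part of the workflow, and the epoch values are made-up seconds used only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper, for illustration only; the workflow inlines this arithmetic.
# Args: reference epoch (s), start of the last run (s), current time (s).
# Prints "<from_ms> <to_ms>", the values substituted into the from/to URL parameters.
grafana_window() {
  local reference_epoch=$1 last_run_start=$2 now=$3
  local duration=$(( now - last_run_start ))
  local from_ms=$(( reference_epoch * 1000 ))
  local to_ms=$(( (reference_epoch + duration) * 1000 ))
  echo "$from_ms $to_ms"
}

# Example: the last run started 300 seconds before "now".
grafana_window 1700000000 1700003600 1700003900
```

Passing the current time as an argument instead of calling `date +%s` inside the function keeps the calculation deterministic and easy to check.
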
--- a/.github/workflows/docker-test.yml
+++ b/.github/workflows/docker-test.yml
@@ -6,7 +6,7 @@ on:
      hive_target:
        required: true
        type: string
-        description: "Docker bake target to build (e.g. hive-stable, hive-edge)"
+        description: "Docker bake target to build (e.g. hive)"
      artifact_name:
        required: false
        type: string
@@ -76,7 +76,7 @@ jobs:
            *.dockerfile=Dockerfile

      - name: Upload reth image
-        uses: actions/upload-artifact@v6
+        uses: actions/upload-artifact@v7
        with:
          name: ${{ inputs.artifact_name }}
          path: ./artifacts
--- a/.github/workflows/docker.yml
+++ b/.github/workflows/docker.yml
@@ -28,12 +28,30 @@ on:
        required: false
        type: boolean
        default: false
+      pgo:
+        description: "Enable PGO profiling"
+        required: false
+        type: boolean
+        default: false
+      pgo_blocks:
+        description: "Number of blocks to execute for PGO profiling"
+        required: false
+        type: string
+        default: "20"

 jobs:
+  collect-pgo-profile:
+    if: github.repository == 'paradigmxyz/reth' && github.event_name == 'workflow_dispatch' && inputs.pgo
+    uses: ./.github/workflows/pgo-profile.yml
+    with:
+      pgo_blocks: ${{ inputs.pgo_blocks || '20' }}
+    secrets: inherit
+
  build:
-    if: github.repository == 'paradigmxyz/reth'
+    if: github.repository == 'paradigmxyz/reth' && !failure() && !cancelled()
    name: Build Docker images
    runs-on: ubuntu-24.04
+    needs: collect-pgo-profile
    permissions:
      packages: write
      contents: read
@@ -45,7 +63,7 @@ jobs:
        uses: depot/setup-action@v1

      - name: Log in to GHCR
-        uses: docker/login-action@v3
+        uses: docker/login-action@v4
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
@@ -58,6 +76,30 @@ jobs:
          echo "describe=$(git describe --always --tags)" >> "$GITHUB_OUTPUT"
          echo "dirty=false" >> "$GITHUB_OUTPUT"

+      - name: Download pre-collected PGO profile
+        if: ${{ github.event_name == 'workflow_dispatch' && inputs.pgo }}
+        uses: actions/download-artifact@v7
+        with:
+          name: pgo-profdata
+          path: dist
+
+      - name: Configure PGO build args
+        id: pgo
+        run: |
+          if [[ "${{ github.event_name }}" == "workflow_dispatch" ]] && [[ "${{ inputs.pgo }}" == "true" ]]; then
+            if [ ! -f dist/merged.profdata ]; then
+              echo "::error::Expected dist/merged.profdata from collect-pgo-profile job"
+              exit 1
+            fi
+            echo "use_pgo_bolt=true" >> "$GITHUB_OUTPUT"
+            echo "pgo_profdata=dist/merged.profdata" >> "$GITHUB_OUTPUT"
+            echo "Using pre-collected PGO profile from collect-pgo-profile job"
+          else
+            echo "use_pgo_bolt=false" >> "$GITHUB_OUTPUT"
+            echo "pgo_profdata=" >> "$GITHUB_OUTPUT"
+            echo "PGO disabled"
+          fi
+
      - name: Determine build parameters
        id: params
        run: |
@@ -107,6 +149,9 @@ jobs:
          push: ${{ !(github.event_name == 'workflow_dispatch' && inputs.dry_run) }}
          set: |
            ${{ steps.params.outputs.ethereum_set }}
+            *.args.USE_PGO_BOLT=${{ steps.pgo.outputs.use_pgo_bolt }}
+            *.args.PGO_PROFDATA=${{ steps.pgo.outputs.pgo_profdata }}
+            *.args.STRIP_SYMBOLS=false

      - name: Verify image architectures
        env:
--- a/.github/workflows/e2e.yml
+++ b/.github/workflows/e2e.yml
@@ -63,6 +63,6 @@ jobs:
        run: |
          cargo nextest run \
            --no-fail-fast \
-            --locked --features "edge" \
+            --locked \
            -p reth-e2e-test-utils \
            -E 'binary(rocksdb)'
--- a/.github/workflows/hive.yml
+++ b/.github/workflows/hive.yml
@@ -15,18 +15,11 @@ concurrency:
  cancel-in-progress: true

 jobs:
-  build-reth-stable:
+  build-reth:
    uses: ./.github/workflows/docker-test.yml
    with:
-      hive_target: hive-stable
-      artifact_name: "reth-stable"
-    secrets: inherit
-
-  build-reth-edge:
-    uses: ./.github/workflows/docker-test.yml
-    with:
-      hive_target: hive-edge
-      artifact_name: "reth-edge"
+      hive_target: hive
+      artifact_name: "reth"
    secrets: inherit

  prepare-hive:
@@ -75,7 +68,7 @@ jobs:
          chmod +x hive

      - name: Upload hive assets
-        uses: actions/upload-artifact@v6
+        uses: actions/upload-artifact@v7
        with:
          name: hive_assets
          path: ./hive_assets
@@ -84,7 +77,6 @@ jobs:
    strategy:
      fail-fast: false
      matrix:
-        storage: [stable, edge]
        # ethereum/rpc to be deprecated:
        # https://github.com/ethereum/hive/pull/1117
        scenario:
@@ -184,10 +176,9 @@ jobs:
          - sim: ethereum/eels/consume-rlp
            limit: .*tests/paris.*
    needs:
-      - build-reth-stable
-      - build-reth-edge
+      - build-reth
      - prepare-hive
-    name: ${{ matrix.storage }} / ${{ matrix.scenario.sim }}${{ matrix.scenario.limit && format(' - {0}', matrix.scenario.limit) }}
+    name: ${{ matrix.scenario.sim }}${{ matrix.scenario.limit && format(' - {0}', matrix.scenario.limit) }}
    # Use larger runners for eels tests to avoid OOM runner crashes
    runs-on: ${{ github.repository == 'paradigmxyz/reth' && (contains(matrix.scenario.sim, 'eels') && 'depot-ubuntu-latest-8' || 'depot-ubuntu-latest-4') || 'ubuntu-latest' }}
    permissions:
@@ -198,15 +189,15 @@ jobs:
          fetch-depth: 0

      - name: Download hive assets
-        uses: actions/download-artifact@v7
+        uses: actions/download-artifact@v8
        with:
          name: hive_assets
          path: /tmp

      - name: Download reth image
-        uses: actions/download-artifact@v7
+        uses: actions/download-artifact@v8
        with:
-          name: reth-${{ matrix.storage }}
+          name: reth
          path: /tmp

      - name: Load Docker images
--- a/.github/workflows/integration.yml
+++ b/.github/workflows/integration.yml
@@ -22,7 +22,7 @@ concurrency:

 jobs:
  test:
-    name: test / ${{ matrix.network }} / ${{ matrix.storage }}
+    name: test / ${{ matrix.network }}
    if: github.event_name != 'schedule'
    runs-on: ${{ github.repository == 'paradigmxyz/reth' && 'depot-ubuntu-latest-4' || 'ubuntu-latest' }}
    env:
@@ -30,7 +30,6 @@ jobs:
    strategy:
      matrix:
        network: ["ethereum"]
-        storage: ["stable", "edge"]
    timeout-minutes: 60
    steps:
      - uses: actions/checkout@v6
@@ -47,7 +46,7 @@ jobs:
        run: |
          cargo nextest run \
            --no-fail-fast \
-            --locked --features "asm-keccak ${{ matrix.network }} ${{ matrix.storage == 'edge' && 'edge' || '' }}" \
+            --locked --features "asm-keccak ${{ matrix.network }}" \
            --workspace --exclude ef-tests \
            -E "kind(test) and not binary(e2e_testsuite)"

--- a/.github/workflows/kurtosis.yml
+++ b/.github/workflows/kurtosis.yml
@@ -40,7 +40,7 @@ jobs:
          fetch-depth: 0

      - name: Download reth image
-        uses: actions/download-artifact@v7
+        uses: actions/download-artifact@v8
        with:
          name: artifacts
          path: /tmp
--- a/.github/workflows/pgo-profile.yml
+++ b/.github/workflows/pgo-profile.yml
@@ -0,0 +1,107 @@
+name: pgo-profile
+
+on:
+  workflow_call:
+    inputs:
+      pgo_blocks:
+        description: "Number of blocks to execute for PGO profiling"
+        required: false
+        type: string
+        default: "20"
+  workflow_dispatch:
+    inputs:
+      pgo_blocks:
+        description: "Number of blocks to execute for PGO profiling"
+        required: false
+        type: string
+        default: "20"
+
+jobs:
+  collect:
+    name: collect PGO profiles
+    runs-on: [self-hosted, Linux, X64]
+    timeout-minutes: 180
+    env:
+      SCHELK_MOUNT: /reth-bench
+      BENCH_RPC_URL: https://ethereum.reth.rs/rpc
+      RUSTC_WRAPPER: "sccache"
+    steps:
+      - uses: actions/checkout@v6
+        with:
+          submodules: true
+      - uses: rui314/setup-mold@v1
+      - uses: dtolnay/rust-toolchain@stable
+        with:
+          target: x86_64-unknown-linux-gnu
+      - uses: mozilla-actions/sccache-action@v0.0.9
+        continue-on-error: true
+      - uses: Swatinem/rust-cache@v2
+        with:
+          cache-on-failure: true
+
+      - name: Install dependencies
+        run: |
+          sudo apt-get update -qq
+          sudo apt-get install -y --no-install-recommends dmsetup lsb-release wget
+          sudo apt-get install -y --no-install-recommends linux-tools-"$(uname -r)" || \
+            sudo apt-get install -y --no-install-recommends linux-tools-generic
+
+      - name: Download snapshot if needed
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          BENCH_REPO: ${{ github.repository }}
+        run: |
+          if ! .github/scripts/bench-reth-snapshot.sh --check; then
+            echo "Snapshot outdated or missing, downloading..."
+            .github/scripts/bench-reth-snapshot.sh
+          fi
+
+      - name: Mount snapshot
+        run: |
+          sudo pkill -9 reth || true
+          sleep 1
+          if mountpoint -q "$SCHELK_MOUNT"; then
+            sudo umount -l "$SCHELK_MOUNT" || true
+            sudo schelk recover -y || true
+          fi
+          sudo schelk mount -y
+          sync
+          sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
+
+      - name: Collect PGO profile
+        run: |
+          DATADIR="$SCHELK_MOUNT/datadir" \
+          RPC_URL="$BENCH_RPC_URL" \
+          PGO_BLOCKS="${{ inputs.pgo_blocks || '20' }}" \
+          BOLT_BLOCKS="${{ inputs.pgo_blocks || '20' }}" \
+          COLLECT_PGO_ONLY=true \
+          SKIP_BOLT=true \
+          PROFILE=maxperf-symbols \
+          FEATURES="jemalloc,asm-keccak,min-debug-logs" \
+          TARGET=x86_64-unknown-linux-gnu \
+          EXTRA_RUSTFLAGS="-C target-cpu=x86-64-v3 -C target-feature=+pclmulqdq" \
+            .github/scripts/build_pgo_bolt.sh
+
+      - name: Show PGO profile stats
+        run: |
+          LLVM_PROFDATA=$(find "$(rustc --print sysroot)" -name llvm-profdata -type f | head -1)
+          if [ -z "$LLVM_PROFDATA" ]; then
+            echo "::error::llvm-profdata not found in rust toolchain"
+            exit 1
+          fi
+          "$LLVM_PROFDATA" show --detailed-summary --topn=20 target/pgo-profiles/merged.profdata
+
+      - name: Upload PGO profile
+        uses: actions/upload-artifact@v7
+        with:
+          name: pgo-profdata
+          path: target/pgo-profiles/merged.profdata
+          retention-days: 1
+
+      - name: Recover snapshot
+        if: always()
+        run: |
+          if mountpoint -q "$SCHELK_MOUNT"; then
+            sudo umount -l "$SCHELK_MOUNT" || true
+            sudo schelk recover -y || true
+          fi
--- a/.github/workflows/release-reproducible.yml
+++ b/.github/workflows/release-reproducible.yml
@@ -52,7 +52,7 @@ jobs:
        uses: docker/setup-buildx-action@v3

      - name: Log in to GitHub Container Registry
-        uses: docker/login-action@v3
+        uses: docker/login-action@v4
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -13,6 +13,14 @@ on:
        description: "Enable dry run mode (builds artifacts but skips uploads and release creation)"
        type: boolean
        default: false
+      pgo:
+        description: "Enable PGO profiling"
+        type: boolean
+        default: false
+      pgo_blocks:
+        description: "Number of blocks to execute for PGO profiling on self-hosted runner"
+        type: string
+        default: "20"

 env:
  REPO_NAME: ${{ github.repository_owner }}/reth
@@ -69,11 +77,6 @@ jobs:
      fail-fast: true
      matrix:
        configs:
-          - target: x86_64-unknown-linux-gnu
-            os: ubuntu-24.04
-            profile: maxperf
-            allow_fail: false
-            rustflags: "-C target-cpu=x86-64-v3 -C target-feature=+pclmulqdq"
          - target: aarch64-unknown-linux-gnu
            os: ubuntu-24.04-arm
            profile: maxperf
@@ -142,23 +145,105 @@ jobs:

      - name: Upload artifact
        if: ${{ github.event.inputs.dry_run != 'true' }}
-        uses: actions/upload-artifact@v6
+        uses: actions/upload-artifact@v7
        with:
          name: ${{ matrix.build.binary }}-${{ needs.extract-version.outputs.VERSION }}-${{ matrix.configs.target }}.tar.gz
          path: ${{ matrix.build.binary }}-${{ needs.extract-version.outputs.VERSION }}-${{ matrix.configs.target }}.tar.gz

      - name: Upload signature
        if: ${{ github.event.inputs.dry_run != 'true' }}
-        uses: actions/upload-artifact@v6
+        uses: actions/upload-artifact@v7
        with:
          name: ${{ matrix.build.binary }}-${{ needs.extract-version.outputs.VERSION }}-${{ matrix.configs.target }}.tar.gz.asc
          path: ${{ matrix.build.binary }}-${{ needs.extract-version.outputs.VERSION }}-${{ matrix.configs.target }}.tar.gz.asc

+  collect-pgo-profile:
+    if: github.event_name == 'workflow_dispatch' && inputs.pgo
+    uses: ./.github/workflows/pgo-profile.yml
+    with:
+      pgo_blocks: ${{ inputs.pgo_blocks || '20' }}
+    secrets: inherit
+
+  build-pgo:
+    if: github.event_name == 'workflow_dispatch' && inputs.pgo
+    name: build release (x86_64-linux PGO+BOLT)
+    runs-on: [self-hosted, Linux, X64]
+    needs: [extract-version, collect-pgo-profile]
+    timeout-minutes: 120
+    env:
+      RUSTC_WRAPPER: "sccache"
+    steps:
+      - uses: actions/checkout@v6
+        with:
+          submodules: true
+      - uses: rui314/setup-mold@v1
+      - uses: dtolnay/rust-toolchain@stable
+        with:
+          target: x86_64-unknown-linux-gnu
+      - uses: mozilla-actions/sccache-action@v0.0.9
+        continue-on-error: true
+      - uses: Swatinem/rust-cache@v2
+        with:
+          cache-on-failure: true
+
+      - name: Download pre-collected PGO profile
+        uses: actions/download-artifact@v7
+        with:
+          name: pgo-profdata
+          path: dist
+
+      - name: Verify PGO profile artifact
+        run: |
+          test -f dist/merged.profdata
+          ls -lh dist/merged.profdata
+
+      - name: Build Reth with PGO+BOLT
+        run: |
+          SKIP_BOLT=true \
+          PGO_PROFDATA="$PWD/dist/merged.profdata" \
+          PROFILE=maxperf-symbols \
+          FEATURES="jemalloc,asm-keccak,min-debug-logs" \
+          TARGET=x86_64-unknown-linux-gnu \
+          EXTRA_RUSTFLAGS="-C target-cpu=x86-64-v3 -C target-feature=+pclmulqdq" \
+            .github/scripts/build_pgo_bolt.sh
+
+      - name: Move binary
+        run: |
+          mkdir artifacts
+          mv target/maxperf-symbols/reth ./artifacts
+
+      - name: Configure GPG and create artifacts
+        env:
+          GPG_SIGNING_KEY: ${{ secrets.GPG_SIGNING_KEY }}
+          GPG_PASSPHRASE: ${{ secrets.GPG_PASSPHRASE }}
+        run: |
+          export GPG_TTY=$(tty)
+          echo -n "$GPG_SIGNING_KEY" | base64 --decode | gpg --batch --import
+          cd artifacts
+          tar -czf reth-${{ needs.extract-version.outputs.VERSION }}-x86_64-unknown-linux-gnu.tar.gz reth*
+          echo "$GPG_PASSPHRASE" | gpg --passphrase-fd 0 --pinentry-mode loopback --batch -ab reth-${{ needs.extract-version.outputs.VERSION }}-x86_64-unknown-linux-gnu.tar.gz
+          mv *tar.gz* ..
+        shell: bash
+
+      - name: Upload artifact
+        if: ${{ github.event.inputs.dry_run != 'true' }}
+        uses: actions/upload-artifact@v6
+        with:
+          name: reth-${{ needs.extract-version.outputs.VERSION }}-x86_64-unknown-linux-gnu.tar.gz
+          path: reth-${{ needs.extract-version.outputs.VERSION }}-x86_64-unknown-linux-gnu.tar.gz
+
+      - name: Upload signature
+        if: ${{ github.event.inputs.dry_run != 'true' }}
+        uses: actions/upload-artifact@v6
+        with:
+          name: reth-${{ needs.extract-version.outputs.VERSION }}-x86_64-unknown-linux-gnu.tar.gz.asc
+          path: reth-${{ needs.extract-version.outputs.VERSION }}-x86_64-unknown-linux-gnu.tar.gz.asc
+
  draft-release:
    name: draft release
    runs-on: ubuntu-latest
-    needs: [build, extract-version]
-    if: ${{ github.event.inputs.dry_run != 'true' }}
+    needs: [build, build-pgo, extract-version]
+    if: ${{ !failure() && !cancelled() && github.event.inputs.dry_run != 'true' }}
    env:
      VERSION: ${{ needs.extract-version.outputs.VERSION }}
    permissions:
@@ -171,7 +256,7 @@ jobs:
        with:
          fetch-depth: 0
      - name: Download artifacts
-        uses: actions/download-artifact@v7
+        uses: actions/download-artifact@v8
      - name: Generate full changelog
        id: changelog
        run: |
--- a/.github/workflows/reproducible-build.yml
+++ b/.github/workflows/reproducible-build.yml
@@ -43,7 +43,7 @@ jobs:
          echo "Binaries SHA256 on ${{ matrix.machine }}: $(cat checksum.sha256)"

      - name: Upload the hash
-        uses: actions/upload-artifact@v6
+        uses: actions/upload-artifact@v7
        with:
          name: checksum-${{ matrix.machine }}
          path: |
@@ -56,12 +56,12 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Download artifacts from machine-1
-        uses: actions/download-artifact@v7
+        uses: actions/download-artifact@v8
        with:
          name: checksum-machine-1
          path: machine-1/
      - name: Download artifacts from machine-2
-        uses: actions/download-artifact@v7
+        uses: actions/download-artifact@v8
        with:
          name: checksum-machine-2
          path: machine-2/
--- a/.github/workflows/stage.yml
+++ b/.github/workflows/stage.yml
@@ -51,15 +51,12 @@ jobs:
      - name: Run execution stage
        run: |
          reth stage run execution --from ${{ env.FROM_BLOCK }} --to ${{ env.TO_BLOCK }} --commit --checkpoints
-      - name: Run account-hashing stage
-        run: |
-          reth stage run account-hashing --from ${{ env.FROM_BLOCK }} --to ${{ env.TO_BLOCK }} --commit --checkpoints
-      - name: Run storage hashing stage
-        run: |
-          reth stage run storage-hashing --from ${{ env.FROM_BLOCK }} --to ${{ env.TO_BLOCK }} --commit --checkpoints
-      - name: Run hashing stage
-        run: |
-          reth stage run hashing --from ${{ env.FROM_BLOCK }} --to ${{ env.TO_BLOCK }} --commit --checkpoints
+      # NOTE: account-hashing, storage-hashing, and hashing stages are omitted.
+      # With storage v2 (now default), these stages are no-ops because the
+      # execution stage writes directly to HashedAccounts/HashedStorages.
+      # Running them here is harmful: `stage run` unwinds before executing,
+      # and the unwind reverts the hashed state that execution wrote, but
+      # the no-op execute never restores it — causing merkle to fail.
      - name: Run merkle stage
        run: |
          reth stage run merkle --from ${{ env.FROM_BLOCK }} --to ${{ env.TO_BLOCK }} --commit --checkpoints
--- a/.github/workflows/unit.yml
+++ b/.github/workflows/unit.yml
@@ -19,15 +19,13 @@ concurrency:

 jobs:
  test:
-    name: test / ${{ matrix.type }} / ${{ matrix.storage }}
+    name: test / ${{ matrix.type }}
    runs-on: ${{ github.repository == 'paradigmxyz/reth' && 'depot-ubuntu-latest-4' || 'ubuntu-latest' }}
    env:
      RUST_BACKTRACE: 1
-      EDGE_FEATURES: ${{ matrix.storage == 'edge' && 'edge' || '' }}
    strategy:
      matrix:
        type: [ethereum]
-        storage: [stable, edge]
        include:
          - type: ethereum
            features: asm-keccak ethereum
@@ -50,14 +48,14 @@ jobs:
        run: |
          cargo nextest run \
            --no-fail-fast \
-            --features "${{ matrix.features }} $EDGE_FEATURES" --locked \
+            --features "${{ matrix.features }}" --locked \
            ${{ matrix.exclude_args }} --workspace \
            --exclude ef-tests --no-tests=warn \
            -E "!kind(test) and not binary(e2e_testsuite)"

  state:
    name: Ethereum state tests
-    runs-on: ${{ github.repository == 'paradigmxyz/reth' && 'depot-ubuntu-latest-4' || 'ubuntu-latest' }}
+    runs-on: ${{ github.repository == 'paradigmxyz/reth' && 'depot-ubuntu-latest-8' || 'ubuntu-latest' }}
    env:
      RUST_LOG: info,sync=error
      RUST_BACKTRACE: 1
--- a/Cargo.lock
+++ b/Cargo.lock
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -1,5 +1,5 @@
 [workspace.package]
-version = "1.11.1"
+version = "1.11.3"
 edition = "2024"
 rust-version = "1.93"
 license = "MIT OR Apache-2.0"
@@ -539,6 +539,8 @@ serde_json = { version = "1.0", default-features = false, features = ["alloc"] }
 serde_with = { version = "3", default-features = false, features = ["macros"] }
 sha2 = { version = "0.10", default-features = false }
 shlex = "1.3"
+# https://github.com/orlp/slotmap/pull/148
+slotmap = { git = "https://github.com/DaniPopes/slotmap.git", branch = "dani/shrink-methods" }
 smallvec = "1"
 strum = { version = "0.27", default-features = false }
 strum_macros = "0.27"
--- a/Dockerfile.depot
+++ b/Dockerfile.depot
@@ -1,8 +1,10 @@
 # syntax=docker/dockerfile:1

 # Dockerfile for reth, optimized for Depot builds
+# Supports PGO+BOLT optimization for maximum performance
 # Usage:
 #   reth: --build-arg BINARY=reth
+#   PGO: --build-arg USE_PGO_BOLT=true --build-arg PGO_PROFDATA=<path> (linux/amd64 only; requires a pre-collected profile, BOLT is skipped)

 FROM rust:1.93 AS builder
 WORKDIR /app
@@ -43,6 +45,18 @@ ENV VERGEN_GIT_SHA=$VERGEN_GIT_SHA
 ENV VERGEN_GIT_DESCRIBE=$VERGEN_GIT_DESCRIBE
 ENV VERGEN_GIT_DIRTY=$VERGEN_GIT_DIRTY

+# Enable PGO optimization (applied on linux/amd64 only; the BOLT step is skipped)
+ARG USE_PGO_BOLT=false
+ENV USE_PGO_BOLT=$USE_PGO_BOLT
+
+# Optional path to a pre-collected merged.profdata file in build context.
+ARG PGO_PROFDATA=""
+ENV PGO_PROFDATA=$PGO_PROFDATA
+
+# Whether to strip debug symbols from PGO-built binaries.
+ARG STRIP_SYMBOLS=true
+ENV STRIP_SYMBOLS=$STRIP_SYMBOLS
+
 # Build application
 # Platform-specific RUSTFLAGS: amd64 uses x86-64-v3 (Haswell+) with pclmulqdq for rocksdb
 ARG TARGETPLATFORM
@@ -53,12 +67,21 @@ RUN --mount=type=secret,id=DEPOT_TOKEN,env=SCCACHE_WEBDAV_TOKEN \
    --mount=type=cache,target=$SCCACHE_DIR,sharing=shared \
    export RUSTC_WRAPPER=sccache SCCACHE_WEBDAV_ENDPOINT=https://cache.depot.dev SCCACHE_DIR=/sccache && \
    sccache --start-server && \
-    if [ -n "$RUSTFLAGS" ]; then \
-        export RUSTFLAGS="$RUSTFLAGS"; \
-    elif [ "$TARGETPLATFORM" = "linux/amd64" ]; then \
-        export RUSTFLAGS="-C target-cpu=x86-64-v3 -C target-feature=+pclmulqdq"; \
+    if [ "$USE_PGO_BOLT" = "true" ] && [ "$TARGETPLATFORM" = "linux/amd64" ] && [ -n "$PGO_PROFDATA" ] && [ -f "$PGO_PROFDATA" ]; then \
+        apt-get update && apt-get install -y -qq lsb-release wget sudo && \
+        BINARY="$BINARY" PROFILE="$BUILD_PROFILE" FEATURES="$FEATURES" SKIP_BOLT=true STRIP_SYMBOLS="$STRIP_SYMBOLS" PGO_PROFDATA="$PGO_PROFDATA" \
+            ./.github/scripts/build_pgo_bolt.sh; \
+    else \
+        if [ "$USE_PGO_BOLT" = "true" ]; then \
+            echo "PGO requested but pre-collected profile missing at '${PGO_PROFDATA:-<unset>}' - falling back to non-PGO build"; \
+        fi; \
+        if [ -n "$RUSTFLAGS" ]; then \
+            export RUSTFLAGS="$RUSTFLAGS"; \
+        elif [ "$TARGETPLATFORM" = "linux/amd64" ]; then \
+            export RUSTFLAGS="-C target-cpu=x86-64-v3 -C target-feature=+pclmulqdq"; \
+        fi && \
+        cargo build --profile $BUILD_PROFILE --features "$FEATURES" --locked --bin $BINARY --manifest-path $MANIFEST_PATH/Cargo.toml; \
    fi && \
-    cargo build --profile $BUILD_PROFILE --features "$FEATURES" --locked --bin $BINARY --manifest-path $MANIFEST_PATH/Cargo.toml && \
    sccache --show-stats

 # Copy binary to a known location (ARG not resolved in COPY)
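
The rewritten `RUN` step above takes the PGO path only when all four conditions hold: PGO was requested, the target platform is `linux/amd64`, a profile path was given, and the file exists. In every other case it falls back to the plain `cargo build`. A minimal sketch of that decision, assuming a hypothetical helper named `should_use_pgo` (the Dockerfile inlines the test, and the temporary file merely stands in for a real `merged.profdata`):

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the condition in the Dockerfile's RUN step.
# Returns 0 when the PGO build script would run, 1 when the regular cargo build runs.
should_use_pgo() {
  local use_pgo=$1 platform=$2 profdata=$3
  [ "$use_pgo" = "true" ] &&
    [ "$platform" = "linux/amd64" ] &&
    [ -n "$profdata" ] &&
    [ -f "$profdata" ]
}

profile=$(mktemp)   # stands in for a pre-collected merged.profdata
for platform in linux/amd64 linux/arm64; do
  if should_use_pgo true "$platform" "$profile"; then
    echo "$platform: PGO build"
  else
    echo "$platform: regular build"
  fi
done
rm -f "$profile"
```

Note that the fallback is silent about the platform: the Dockerfile's warning mentions only a missing profile, so a PGO request on a non-amd64 target also lands in the regular build.
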
--- a/bin/reth-bench/src/bench/context.rs
+++ b/bin/reth-bench/src/bench/context.rs
@@ -7,7 +7,7 @@ use alloy_primitives::address;
 use alloy_provider::{network::AnyNetwork, Provider, RootProvider};
 use alloy_rpc_client::ClientBuilder;
 use alloy_rpc_types_engine::JwtSecret;
-use alloy_transport::layers::RetryBackoffLayer;
+use alloy_transport::layers::{RateLimitRetryPolicy, RetryBackoffLayer};
 use reqwest::Url;
 use reth_node_core::args::BenchmarkArgs;
 use tracing::info;
@@ -53,9 +53,15 @@ impl BenchContext {
            }
        }

-        // set up alloy client for blocks
+        // set up alloy client for blocks, retrying on 429/503 (default) and 502
+        let retry_policy =
+            RateLimitRetryPolicy::default().or(|err: &alloy_transport::TransportError| -> bool {
+                err.as_transport_err()
+                    .and_then(|t| t.as_http_error())
+                    .is_some_and(|e| e.status == 502)
+            });
        let client = ClientBuilder::default()
-            .layer(RetryBackoffLayer::new(10, 800, u64::MAX))
+            .layer(RetryBackoffLayer::new_with_policy(10, 800, u64::MAX, retry_policy))
            .http(rpc_url.parse()?);
        let block_provider = RootProvider::<AnyNetwork>::new(client);

--- a/bin/reth-bench/src/bench/output.rs
+++ b/bin/reth-bench/src/bench/output.rs
@@ -24,7 +24,7 @@ pub(crate) struct GasRampPayloadFile {
    /// Engine API version (1-5).
    ///
    /// `None` indicates that `reth_newPayload` should be used.
-    #[serde(skip_serializing_if = "Option::is_none")]
+    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub(crate) version: Option<u8>,
    /// The block hash for FCU.
    pub(crate) block_hash: B256,
--- a/bin/reth-bench/src/bench/replay_payloads.rs
+++ b/bin/reth-bench/src/bench/replay_payloads.rs
@@ -292,10 +292,6 @@ impl Command {

            info!(target: "reth-bench", gas_ramp_payload = i + 1, "Gas ramp payload executed successfully");

-            if let Some(w) = &mut waiter {
-                w.on_block(payload.block_number).await?;
-            }
-
            parent_hash = payload.file.block_hash;
        }

--- a/bin/reth-bench/src/valid_payload.rs
+++ b/bin/reth-bench/src/valid_payload.rs
@@ -235,8 +235,8 @@ pub(crate) fn payload_to_new_payload(
                        ))?,
                    )
                } else {
-                    // Extract actual Requests from RequestsOrHash
-                    let requests = prague.requests.requests_hash();
+                    // Preserve the original RequestsOrHash payload for engine_newPayloadV4.
+                    let requests = prague.requests.clone();
                    (
                        version,
                        serde_json::to_value((
--- a/bin/reth/Cargo.toml
+++ b/bin/reth/Cargo.toml
@@ -89,7 +89,6 @@ default = [
    "keccak-cache-global",
    "asm-keccak",
    "min-debug-logs",
-    "rocksdb",
 ]

 otlp = [
@@ -191,8 +190,6 @@ min-trace-logs = [
 ]

 trie-debug = ["reth-node-builder/trie-debug", "reth-node-core/trie-debug"]
-rocksdb = ["reth-ethereum-cli/rocksdb", "reth-node-core/rocksdb"]
-edge = ["rocksdb"]

 [[bin]]
 name = "reth"
--- a/crates/chain-state/src/execution_stats.rs
+++ b/crates/chain-state/src/execution_stats.rs
@@ -0,0 +1,69 @@
+//! Execution timing statistics for detailed block logging.
+//!
+//! This module provides types for collecting and passing execution timing statistics
+//! through the block processing pipeline, enabling unified detailed block logging after
+//! database commit.
+
+use std::time::Duration;
+
+use alloy_primitives::B256;
+
+/// Statistics collected during block execution for cross-client performance analysis.
+///
+/// These statistics are populated during block validation and carried through to
+/// persistence, where they are used to emit a single unified log entry that includes
+/// complete timing information (including commit time).
+#[derive(Debug, Clone, Default)]
+pub struct ExecutionTimingStats {
+    /// Block number
+    pub block_number: u64,
+    /// Block hash
+    pub block_hash: B256,
+    /// Total gas used by the block
+    pub gas_used: u64,
+    /// Number of transactions in the block
+    pub tx_count: usize,
+    /// Time spent executing transactions (includes state reads)
+    pub execution_duration: Duration,
+    /// Time spent fetching state during execution (subset of `execution_duration`, includes cache
+    /// hits)
+    pub state_read_duration: Duration,
+    /// Time spent computing state root hash
+    pub state_hash_duration: Duration,
+    /// Number of accounts read during execution
+    pub accounts_read: usize,
+    /// Number of storage slots read (SLOAD operations)
+    pub storage_read: usize,
+    /// Number of code reads (EXTCODE* operations)
+    pub code_read: usize,
+    /// Total bytes of code read
+    pub code_bytes_read: usize,
+    /// Number of accounts changed (balance/nonce updates)
+    pub accounts_changed: usize,
+    /// Number of accounts deleted (SELFDESTRUCT)
+    pub accounts_deleted: usize,
+    /// Number of storage slots changed (SSTORE operations)
+    pub storage_slots_changed: usize,
+    /// Number of storage slots deleted (set to zero)
+    pub storage_slots_deleted: usize,
+    /// Number of bytecodes created/changed (contract deployments)
+    pub bytecodes_changed: usize,
+    /// Total bytes of code written
+    pub code_bytes_written: usize,
+    /// Number of EIP-7702 delegations set
+    pub eip7702_delegations_set: usize,
+    /// Number of EIP-7702 delegations cleared
+    pub eip7702_delegations_cleared: usize,
+    /// Account cache hits
+    pub account_cache_hits: usize,
+    /// Account cache misses
+    pub account_cache_misses: usize,
+    /// Storage cache hits
+    pub storage_cache_hits: usize,
+    /// Storage cache misses
+    pub storage_cache_misses: usize,
+    /// Code cache hits
+    pub code_cache_hits: usize,
+    /// Code cache misses
+    pub code_cache_misses: usize,
+}
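Reviewer note: the struct above stores raw hit/miss counters and durations rather than derived ratios. A minimal sketch of how a consumer might derive a cache hit rate and the state-read share of execution time follows. The helper names `hit_rate_percent` and `state_read_share` are hypothetical and do not appear in this change; returning `None` for empty inputs is an assumption made here to avoid dividing by zero.

```rust
use std::time::Duration;

/// Hit rate as a percentage, or `None` when no lookups were recorded
/// (or the sum would overflow), so callers never divide by zero.
fn hit_rate_percent(hits: usize, misses: usize) -> Option<f64> {
    let total = hits.checked_add(misses)?;
    if total == 0 {
        return None;
    }
    Some(hits as f64 * 100.0 / total as f64)
}

/// Fraction of execution time spent reading state. The field docs describe
/// `state_read_duration` as a subset of `execution_duration`, so the result
/// is expected to fall in `0.0..=1.0`.
fn state_read_share(execution: Duration, state_read: Duration) -> Option<f64> {
    if execution.is_zero() {
        return None;
    }
    Some(state_read.as_secs_f64() / execution.as_secs_f64())
}

fn main() {
    // Example values standing in for `account_cache_hits` / `account_cache_misses`.
    match hit_rate_percent(75, 25) {
        Some(rate) => println!("account cache hit rate: {rate:.1}%"),
        None => println!("account cache hit rate: n/a"),
    }

    // Example values standing in for `execution_duration` / `state_read_duration`.
    let share = state_read_share(Duration::from_secs(1), Duration::from_millis(250));
    println!("state read share: {share:?}");
}
```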
--- a/crates/chain-state/src/in_memory.rs
+++ b/crates/chain-state/src/in_memory.rs
@@ -992,7 +992,7 @@ impl<N: NodePrimitives<SignedTx: SignedTransaction>> NewCanonicalChain<N> {
    ///
    /// Returns the new tip for [`Self::Reorg`] and [`Self::Commit`] variants which commit at least
    /// 1 new block.
-    pub fn tip(&self) -> &SealedBlock<N::Block> {
+    pub fn tip(&self) -> &RecoveredBlock<N::Block> {
        match self {
            Self::Commit { new } | Self::Reorg { new, .. } => {
                new.last().expect("non empty blocks").recovered_block()
--- a/crates/chain-state/src/lib.rs
+++ b/crates/chain-state/src/lib.rs
@@ -8,6 +8,9 @@
 #![cfg_attr(not(test), warn(unused_crate_dependencies))]
 #![cfg_attr(docsrs, feature(doc_cfg))]

+mod execution_stats;
+pub use execution_stats::ExecutionTimingStats;
+
 mod in_memory;
 pub use in_memory::*;

--- a/crates/cli/cli/src/lib.rs
+++ b/crates/cli/cli/src/lib.rs
@@ -21,7 +21,7 @@ use crate::chainspec::ChainSpecParser;
 ///
 /// This trait is supposed to be implemented by the main struct of the CLI.
 ///
-/// It provides commonly used functionality for running commands and information about the CL, such
+/// It provides commonly used functionality for running commands and information about the CLI, such
 /// as the name and version.
 pub trait RethCli: Sized {
    /// The associated `ChainSpecParser` type
--- a/crates/cli/commands/Cargo.toml
+++ b/crates/cli/commands/Cargo.toml
@@ -110,7 +110,6 @@ reth-provider = { workspace = true, features = ["test-utils"] }
 tempfile.workspace = true

 [features]
-default = []
 arbitrary = [
    "dep:proptest",
    "dep:arbitrary",
@@ -135,6 +134,3 @@ arbitrary = [
    "reth-primitives-traits/arbitrary",
    "reth-ethereum-primitives/arbitrary",
 ]
-
-rocksdb = ["reth-db-common/rocksdb", "reth-stages/rocksdb", "reth-provider/rocksdb", "reth-prune/rocksdb"]
-edge = ["rocksdb"]
--- a/crates/cli/commands/src/common.rs
+++ b/crates/cli/commands/src/common.rs
@@ -73,17 +73,12 @@ pub struct EnvironmentArgs<C: ChainSpecParser> {
 }

 impl<C: ChainSpecParser> EnvironmentArgs<C> {
-    /// Returns the effective storage settings derived from `--storage.v2`.
+    /// Returns the storage settings for new database initialization.
    ///
-    /// The base storage mode is determined by `--storage.v2`:
-    /// - When `--storage.v2` is set: uses [`StorageSettings::v2()`] defaults
-    /// - Otherwise: uses [`StorageSettings::base()`] defaults
+    /// Always returns [`StorageSettings::v2()`] — v2 is the default for all new
+    /// databases. Existing databases use the settings persisted in their metadata.
    pub fn storage_settings(&self) -> StorageSettings {
-        if self.storage.v2 {
-            StorageSettings::v2()
-        } else {
-            StorageSettings::base()
-        }
+        StorageSettings::v2()
    }

    /// Initializes environment according to [`AccessRights`] and returns an instance of
--- a/crates/cli/commands/src/db/checksum/mod.rs
+++ b/crates/cli/commands/src/db/checksum/mod.rs
@@ -21,7 +21,6 @@ use std::{
 };
 use tracing::{info, warn};

-#[cfg(all(unix, feature = "rocksdb"))]
 mod rocksdb;

 /// Interval for logging progress during checksum computation.
@@ -73,7 +72,6 @@ enum Subcommand {
        limit: Option<usize>,
    },
    /// Calculates the checksum of a RocksDB table
-    #[cfg(all(unix, feature = "rocksdb"))]
    Rocksdb {
        /// The RocksDB table
        #[arg(value_enum)]
@@ -100,7 +98,6 @@ impl Command {
            Subcommand::StaticFile { segment, start_block, end_block, limit } => {
                checksum_static_file(tool, segment, start_block, end_block, limit)?;
            }
-            #[cfg(all(unix, feature = "rocksdb"))]
            Subcommand::Rocksdb { table, limit } => {
                rocksdb::checksum_rocksdb(tool, table, limit)?;
            }
--- a/crates/cli/commands/src/db/mod.rs
+++ b/crates/cli/commands/src/db/mod.rs
@@ -102,14 +102,14 @@ impl<C: ChainSpecParser<ChainSpec: EthChainSpec + EthereumHardforks>> Command<C>
        let static_files_path = data_dir.static_files();
        let exex_wal_path = data_dir.exex_wal();

-        // ensure the provided datadir exist
+        // ensure the provided datadir exists
        eyre::ensure!(
            data_dir.data_dir().is_dir(),
            "Datadir does not exist: {:?}",
            data_dir.data_dir()
        );

-        // ensure the provided database exist
+        // ensure the provided database exists
        eyre::ensure!(db_path.is_dir(), "Database does not exist: {:?}", db_path);

        match self.command {
--- a/crates/cli/commands/src/db/state.rs
+++ b/crates/cli/commands/src/db/state.rs
@@ -19,7 +19,7 @@ use std::{
 };
 use tracing::info;

-/// Log progress every 5 seconds
+/// Log progress every 30 seconds
 const LOG_INTERVAL: Duration = Duration::from_secs(30);

 /// The arguments for the `reth db state` command
--- a/crates/cli/commands/src/download/config_gen.rs
+++ b/crates/cli/commands/src/download/config_gen.rs
@@ -290,6 +290,7 @@ mod tests {
            storage_version: 2,
            timestamp: 0,
            base_url: None,
+            reth_version: None,
            components: BTreeMap::new(),
        }
    }
--- a/crates/cli/commands/src/download/manifest.rs
+++ b/crates/cli/commands/src/download/manifest.rs
@@ -38,6 +38,9 @@ pub struct SnapshotManifest {
    /// When omitted, downloaders should derive the base URL from the manifest URL.
    #[serde(default, skip_serializing_if = "Option::is_none")]
    pub base_url: Option<String>,
+    /// Reth version that produced this snapshot.
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub reth_version: Option<String>,
    /// Available snapshot components.
    pub components: BTreeMap<String, ComponentManifest>,
 }
@@ -553,6 +556,7 @@ pub fn generate_manifest(
        storage_version: 2,
        timestamp,
        base_url: base_url.map(str::to_owned),
+        reth_version: Some(reth_node_core::version::version_metadata().short_version.to_string()),
        components,
    })
 }
@@ -834,6 +838,7 @@ mod tests {
            storage_version: 2,
            timestamp: 0,
            base_url: Some("https://example.com".to_string()),
+            reth_version: None,
            components,
        }
    }
@@ -884,6 +889,7 @@ mod tests {
            storage_version: 2,
            timestamp: 0,
            base_url: Some("https://example.com".to_string()),
+            reth_version: None,
            components,
        };

@@ -953,6 +959,7 @@ mod tests {
            storage_version: 2,
            timestamp: 0,
            base_url: Some("https://example.com".to_string()),
+            reth_version: None,
            components,
        };
        let urls = m.archive_urls(SnapshotComponentType::StorageChangesets);
@@ -1028,6 +1035,7 @@ mod tests {
            storage_version: 2,
            timestamp: 0,
            base_url: Some("https://example.com".to_string()),
+            reth_version: None,
            components,
        };

--- a/crates/cli/commands/src/download/mod.rs
+++ b/crates/cli/commands/src/download/mod.rs
@@ -7,7 +7,7 @@ use crate::common::EnvironmentArgs;
 use blake3::Hasher;
 use clap::Parser;
 use config_gen::{config_for_selections, write_config};
-use eyre::Result;
+use eyre::{Result, WrapErr};
 use futures::stream::{self, StreamExt};
 use lz4::Decoder;
 use manifest::{
@@ -17,6 +17,7 @@ use manifest::{
 use reqwest::{blocking::Client as BlockingClient, header::RANGE, Client, StatusCode};
 use reth_chainspec::{EthChainSpec, EthereumHardfork, EthereumHardforks};
 use reth_cli::chainspec::ChainSpecParser;
+use reth_cli_util::cancellation::CancellationToken;
 use reth_db::{init_db, Database};
 use reth_db_api::transaction::DbTx;
 use reth_fs_util as fs;
@@ -42,7 +43,8 @@ use url::Url;
 use zstd::stream::read::Decoder as ZstdDecoder;

 const BYTE_UNITS: [&str; 4] = ["B", "KB", "MB", "GB"];
-const MERKLE_BASE_URL: &str = "https://downloads.merkle.io";
+const RETH_SNAPSHOTS_BASE_URL: &str = "https://snapshots-r2.reth.rs";
+const RETH_SNAPSHOTS_API_URL: &str = "https://snapshots.reth.rs/api/snapshots";
 const EXTENSION_TAR_LZ4: &str = ".tar.lz4";
 const EXTENSION_TAR_ZSTD: &str = ".tar.zst";
 const DOWNLOAD_CACHE_DIR: &str = ".download-cache";
@@ -97,14 +99,14 @@ impl DownloadDefaults {
        DOWNLOAD_DEFAULTS.get_or_init(DownloadDefaults::default_download_defaults)
    }

-    /// Default download configuration with defaults from merkle.io and publicnode
+    /// Default download configuration, with sources from snapshots.reth.rs and publicnode
    pub fn default_download_defaults() -> Self {
        Self {
            available_snapshots: vec![
-                Cow::Borrowed("https://www.merkle.io/snapshots (default, mainnet archive)"),
+                Cow::Borrowed("https://snapshots.reth.rs (default)"),
                Cow::Borrowed("https://publicnode.com/snapshots (full nodes & testnets)"),
            ],
-            default_base_url: Cow::Borrowed(MERKLE_BASE_URL),
+            default_base_url: Cow::Borrowed(RETH_SNAPSHOTS_BASE_URL),
            default_chain_aware_base_url: None,
            long_help: None,
        }
@@ -120,7 +122,9 @@ impl DownloadDefaults {
        }

        let mut help = String::from(
-            "Specify a snapshot URL or let the command propose a default one.\n\nAvailable snapshot sources:\n",
+            "Specify a snapshot URL or let the command propose a default one.\n\n\
+             Browse available snapshots at https://snapshots.reth.rs\n\
+             or use --list-snapshots to see them from the CLI.\n\nAvailable snapshot sources:\n",
        );

        for source in &self.available_snapshots {
@@ -187,6 +191,7 @@ pub struct DownloadCommand<C: ChainSpecParser> {
    /// Custom URL to download a single snapshot archive (legacy mode).
    ///
    /// When provided, downloads and extracts a single archive without component selection.
+    /// Browse available snapshots at <https://snapshots.reth.rs> or use --list-snapshots.
    #[arg(long, short, long_help = DownloadDefaults::get_global().long_help())]
    url: Option<String>,

@@ -213,22 +218,30 @@ pub struct DownloadCommand<C: ChainSpecParser> {
    #[arg(long, alias = "with-changesets", conflicts_with_all = ["minimal", "full", "archive"])]
    with_state_history: bool,

+    /// Include transaction sender static files. Requires `--with-txs`.
+    #[arg(long, requires = "with_txs", conflicts_with_all = ["minimal", "full", "archive"])]
+    with_senders: bool,
+
+    /// Include RocksDB index files.
+    #[arg(long, conflicts_with_all = ["minimal", "full", "archive", "without_rocksdb"])]
+    with_rocksdb: bool,
+
    /// Download all available components (archive node, no pruning).
-    #[arg(long, alias = "all", conflicts_with_all = ["with_txs", "with_receipts", "with_state_history", "minimal", "full"])]
+    #[arg(long, alias = "all", conflicts_with_all = ["with_txs", "with_receipts", "with_state_history", "with_senders", "with_rocksdb", "minimal", "full"])]
    archive: bool,

    /// Download the minimal component set (same default as --non-interactive).
-    #[arg(long, conflicts_with_all = ["with_txs", "with_receipts", "with_state_history", "archive", "full"])]
+    #[arg(long, conflicts_with_all = ["with_txs", "with_receipts", "with_state_history", "with_senders", "with_rocksdb", "archive", "full"])]
    minimal: bool,

    /// Download the full node component set (matches default full prune settings).
-    #[arg(long, conflicts_with_all = ["with_txs", "with_receipts", "with_state_history", "archive", "minimal"])]
+    #[arg(long, conflicts_with_all = ["with_txs", "with_receipts", "with_state_history", "with_senders", "with_rocksdb", "archive", "minimal"])]
    full: bool,

    /// Skip optional RocksDB indices even when archive components are selected.
    ///
    /// This affects `--archive`/`--all` and TUI archive preset (`a`).
-    #[arg(long, conflicts_with = "url")]
+    #[arg(long, conflicts_with_all = ["url", "with_rocksdb"])]
    without_rocksdb: bool,

    /// Skip interactive component selection. Downloads the minimal set
@@ -247,6 +260,13 @@ pub struct DownloadCommand<C: ChainSpecParser> {
    /// Maximum number of concurrent modular archive workers.
    #[arg(long, default_value_t = MAX_CONCURRENT_DOWNLOADS)]
    download_concurrency: usize,
+
+    /// List available snapshots from snapshots.reth.rs and exit.
+    ///
+    /// Queries the snapshots API and prints all available snapshots for the selected chain,
+    /// including block number, size, and manifest URL.
+    #[arg(long, alias = "list-snapshots", conflicts_with_all = ["url", "manifest_url", "manifest_path"])]
+    list: bool,
 }

 impl<C: ChainSpecParser<ChainSpec: EthChainSpec + EthereumHardforks>> DownloadCommand<C> {
@@ -256,22 +276,39 @@ impl<C: ChainSpecParser<ChainSpec: EthChainSpec + EthereumHardforks>> DownloadCo
        let data_dir = self.env.datadir.clone().resolve_datadir(chain);
        fs::create_dir_all(&data_dir)?;

+        let cancel_token = CancellationToken::new();
+        let _cancel_guard = cancel_token.drop_guard();
+
+        // --list: print available snapshots and exit
+        if self.list {
+            let entries = fetch_snapshot_api_entries(chain_id).await?;
+            print_snapshot_listing(&entries, chain_id);
+            return Ok(());
+        }
+
        // Legacy single-URL mode: download one archive and extract it
-        if let Some(url) = self.url {
+        if let Some(ref url) = self.url {
            info!(target: "reth::cli",
                dir = ?data_dir.data_dir(),
                url = %url,
                "Starting snapshot download and extraction"
            );

-            stream_and_extract(&url, data_dir.data_dir(), None, self.resumable).await?;
+            stream_and_extract(
+                url,
+                data_dir.data_dir(),
+                None,
+                self.resumable,
+                cancel_token.clone(),
+            )
+            .await?;
            info!(target: "reth::cli", "Snapshot downloaded and extracted successfully");

            return Ok(());
        }

        // Modular download: fetch manifest and select components
-        let manifest_source = self.resolve_manifest_source(chain_id);
+        let manifest_source = self.resolve_manifest_source(chain_id).await?;

        info!(target: "reth::cli", source = %manifest_source, "Fetching snapshot manifest");
        let mut manifest = fetch_manifest_from_source(&manifest_source).await?;
@@ -365,7 +402,7 @@ impl<C: ChainSpecParser<ChainSpec: EthChainSpec + EthereumHardforks>> DownloadCo
            "Downloading all archives"
        );

-        let shared = SharedProgress::new(total_size, total_archives as u64);
+        let shared = SharedProgress::new(total_size, total_archives as u64, cancel_token.clone());
        let progress_handle = spawn_progress_display(Arc::clone(&shared));

        let target = target_dir.to_path_buf();
@@ -377,9 +414,17 @@ impl<C: ChainSpecParser<ChainSpec: EthChainSpec + EthereumHardforks>> DownloadCo
                let dir = target.clone();
                let cache = cache_dir.clone();
                let sp = Arc::clone(&shared);
+                let ct = cancel_token.clone();
                async move {
-                    process_modular_archive(planned, &dir, cache.as_deref(), Some(sp), resumable)
-                        .await?;
+                    process_modular_archive(
+                        planned,
+                        &dir,
+                        cache.as_deref(),
+                        Some(sp),
+                        resumable,
+                        ct,
+                    )
+                    .await?;
                    Ok(())
                }
            })
@@ -467,7 +512,11 @@ impl<C: ChainSpecParser<ChainSpec: EthChainSpec + EthereumHardforks>> DownloadCo
            });
        }

-        let has_explicit_flags = self.with_txs || self.with_receipts || self.with_state_history;
+        let has_explicit_flags = self.with_txs ||
+            self.with_receipts ||
+            self.with_state_history ||
+            self.with_senders ||
+            self.with_rocksdb;

        if has_explicit_flags {
            let mut selections = BTreeMap::new();
@@ -494,6 +543,13 @@ impl<C: ChainSpecParser<ChainSpec: EthChainSpec + EthereumHardforks>> DownloadCo
                        .insert(SnapshotComponentType::StorageChangesets, ComponentSelection::All);
                }
            }
+            if self.with_senders && available(SnapshotComponentType::TransactionSenders) {
+                selections
+                    .insert(SnapshotComponentType::TransactionSenders, ComponentSelection::All);
+            }
+            if self.with_rocksdb && available(SnapshotComponentType::RocksdbIndices) {
+                selections.insert(SnapshotComponentType::RocksdbIndices, ComponentSelection::All);
+            }
            return Ok(ResolvedComponents { selections, preset: None });
        }

@@ -602,17 +658,14 @@ impl<C: ChainSpecParser<ChainSpec: EthChainSpec + EthereumHardforks>> DownloadCo
        }
    }

-    fn resolve_manifest_source(&self, chain_id: u64) -> String {
+    async fn resolve_manifest_source(&self, chain_id: u64) -> Result<String> {
        if let Some(path) = &self.manifest_path {
-            return path.display().to_string();
+            return Ok(path.display().to_string());
        }

        match &self.manifest_url {
-            Some(url) => url.clone(),
-            None => {
-                let base_url = get_base_url(chain_id);
-                format!("{base_url}/manifest.json")
-            }
+            Some(url) => Ok(url.clone()),
+            None => discover_manifest_url(chain_id).await,
        }
    }
 }
@@ -800,19 +853,25 @@ struct SharedProgress {
    total_archives: u64,
    archives_done: AtomicU64,
    done: AtomicBool,
+    cancel_token: CancellationToken,
 }

 impl SharedProgress {
-    fn new(total_size: u64, total_archives: u64) -> Arc<Self> {
+    fn new(total_size: u64, total_archives: u64, cancel_token: CancellationToken) -> Arc<Self> {
        Arc::new(Self {
            downloaded: AtomicU64::new(0),
            total_size,
            total_archives,
            archives_done: AtomicU64::new(0),
            done: AtomicBool::new(false),
+            cancel_token,
        })
    }

+    fn is_cancelled(&self) -> bool {
+        self.cancel_token.is_cancelled()
+    }
+
    fn add(&self, bytes: u64) {
        self.downloaded.fetch_add(bytes, Ordering::Relaxed);
    }
@@ -901,16 +960,20 @@ fn spawn_progress_display(progress: Arc<SharedProgress>) -> tokio::task::JoinHan
 struct ProgressReader<R> {
    reader: R,
    progress: DownloadProgress,
+    cancel_token: CancellationToken,
 }

 impl<R: Read> ProgressReader<R> {
-    fn new(reader: R, total_size: u64) -> Self {
-        Self { reader, progress: DownloadProgress::new(total_size) }
+    fn new(reader: R, total_size: u64, cancel_token: CancellationToken) -> Self {
+        Self { reader, progress: DownloadProgress::new(total_size), cancel_token }
    }
 }

 impl<R: Read> Read for ProgressReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
+        if self.cancel_token.is_cancelled() {
+            return Err(io::Error::new(io::ErrorKind::Interrupted, "download cancelled"));
+        }
        let bytes = self.reader.read(buf)?;
        if bytes > 0 &&
            let Err(e) = self.progress.update(bytes as u64)
@@ -953,8 +1016,9 @@ fn extract_archive<R: Read>(
    total_size: u64,
    format: CompressionFormat,
    target_dir: &Path,
+    cancel_token: CancellationToken,
 ) -> Result<()> {
-    let progress_reader = ProgressReader::new(reader, total_size);
+    let progress_reader = ProgressReader::new(reader, total_size, cancel_token);

    match format {
        CompressionFormat::Lz4 => {
@@ -998,7 +1062,7 @@ fn extract_from_file(path: &Path, format: CompressionFormat, target_dir: &Path)
        "Extracting local archive"
    );
    let start = Instant::now();
-    extract_archive(file, total_size, format, target_dir)?;
+    extract_archive(file, total_size, format, target_dir, CancellationToken::new())?;
    info!(target: "reth::cli",
        file = %path.display(),
        elapsed = %DownloadProgress::format_duration(start.elapsed()),
@@ -1015,10 +1079,14 @@ const RETRY_BACKOFF_SECS: u64 = 5;
 struct ProgressWriter<W> {
    inner: W,
    progress: DownloadProgress,
+    cancel_token: CancellationToken,
 }

 impl<W: Write> Write for ProgressWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
+        if self.cancel_token.is_cancelled() {
+            return Err(io::Error::new(io::ErrorKind::Interrupted, "download cancelled"));
+        }
        let n = self.inner.write(buf)?;
        let _ = self.progress.update(n as u64);
        Ok(n)
@@ -1038,6 +1106,9 @@ struct SharedProgressWriter<W> {

 impl<W: Write> Write for SharedProgressWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
+        if self.progress.is_cancelled() {
+            return Err(io::Error::new(io::ErrorKind::Interrupted, "download cancelled"));
+        }
        let n = self.inner.write(buf)?;
        self.progress.add(n as u64);
        Ok(n)
@@ -1057,6 +1128,9 @@ struct SharedProgressReader<R> {

 impl<R: Read> Read for SharedProgressReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
+        if self.progress.is_cancelled() {
+            return Err(io::Error::new(io::ErrorKind::Interrupted, "download cancelled"));
+        }
        let n = self.inner.read(buf)?;
        self.progress.add(n as u64);
        Ok(n)
@@ -1073,6 +1147,7 @@ fn resumable_download(
    url: &str,
    target_dir: &Path,
    shared: Option<&Arc<SharedProgress>>,
+    cancel_token: CancellationToken,
 ) -> Result<(PathBuf, u64)> {
    let file_name = Url::parse(url)
        .ok()
@@ -1196,7 +1271,11 @@ fn resumable_download(
            // Legacy single-download path: local progress bar
            let mut progress = DownloadProgress::new(current_total);
            progress.downloaded = start_offset;
-            let mut writer = ProgressWriter { inner: BufWriter::new(file), progress };
+            let mut writer = ProgressWriter {
+                inner: BufWriter::new(file),
+                progress,
+                cancel_token: cancel_token.clone(),
+            };
            copy_result = io::copy(&mut reader, &mut writer);
            flush_result = writer.inner.flush();
            println!();
@@ -1229,6 +1308,7 @@ fn streaming_download_and_extract(
    format: CompressionFormat,
    target_dir: &Path,
    shared: Option<&Arc<SharedProgress>>,
+    cancel_token: CancellationToken,
 ) -> Result<()> {
    let quiet = shared.is_some();
    let mut last_error: Option<eyre::Error> = None;
@@ -1268,7 +1348,8 @@ fn streaming_download_and_extract(
            let reader = SharedProgressReader { inner: response, progress: Arc::clone(sp) };
            extract_archive_raw(reader, format, target_dir)
        } else {
-            extract_archive_raw(response, format, target_dir)
+            let total_size = response.content_length().unwrap_or(0);
+            extract_archive(response, total_size, format, target_dir, cancel_token.clone())
        };

        match result {
@@ -1293,9 +1374,11 @@ fn download_and_extract(
    format: CompressionFormat,
    target_dir: &Path,
    shared: Option<&Arc<SharedProgress>>,
+    cancel_token: CancellationToken,
 ) -> Result<()> {
    let quiet = shared.is_some();
-    let (downloaded_path, total_size) = resumable_download(url, target_dir, shared)?;
+    let (downloaded_path, total_size) =
+        resumable_download(url, target_dir, shared, cancel_token.clone())?;

    let file_name =
        downloaded_path.file_name().map(|f| f.to_string_lossy().to_string()).unwrap_or_default();
@@ -1313,7 +1396,7 @@ fn download_and_extract(
        // Skip progress tracking for extraction in parallel mode
        extract_archive_raw(file, format, target_dir)?;
    } else {
-        extract_archive(file, total_size, format, target_dir)?;
+        extract_archive(file, total_size, format, target_dir, cancel_token)?;
        info!(target: "reth::cli",
            file = %file_name,
            "Extraction complete"
@@ -1339,6 +1422,7 @@ fn blocking_download_and_extract(
    target_dir: &Path,
    shared: Option<Arc<SharedProgress>>,
    resumable: bool,
+    cancel_token: CancellationToken,
 ) -> Result<()> {
    let format = CompressionFormat::from_url(url)?;

@@ -1356,9 +1440,10 @@ fn blocking_download_and_extract(
        }
        result
    } else if resumable {
-        download_and_extract(url, format, target_dir, shared.as_ref())
+        download_and_extract(url, format, target_dir, shared.as_ref(), cancel_token)
    } else {
-        let result = streaming_download_and_extract(url, format, target_dir, shared.as_ref());
+        let result =
+            streaming_download_and_extract(url, format, target_dir, shared.as_ref(), cancel_token);
        if result.is_ok() &&
            let Some(sp) = shared
        {
@@ -1378,11 +1463,12 @@ async fn stream_and_extract(
    target_dir: &Path,
    shared: Option<Arc<SharedProgress>>,
    resumable: bool,
+    cancel_token: CancellationToken,
 ) -> Result<()> {
    let target_dir = target_dir.to_path_buf();
    let url = url.to_string();
    task::spawn_blocking(move || {
-        blocking_download_and_extract(&url, &target_dir, shared, resumable)
+        blocking_download_and_extract(&url, &target_dir, shared, resumable, cancel_token)
    })
    .await??;

@@ -1395,6 +1481,7 @@ async fn process_modular_archive(
    cache_dir: Option<&Path>,
    shared: Option<Arc<SharedProgress>>,
    resumable: bool,
+    cancel_token: CancellationToken,
 ) -> Result<()> {
    let target_dir = target_dir.to_path_buf();
    let cache_dir = cache_dir.map(Path::to_path_buf);
@@ -1406,6 +1493,7 @@ async fn process_modular_archive(
            cache_dir.as_deref(),
            shared,
            resumable,
+            cancel_token,
        )
    })
    .await??;
@@ -1419,6 +1507,7 @@ fn blocking_process_modular_archive(
    cache_dir: Option<&Path>,
    shared: Option<Arc<SharedProgress>>,
    resumable: bool,
+    cancel_token: CancellationToken,
 ) -> Result<()> {
    let archive = &planned.archive;
    if verify_output_files(target_dir, &archive.output_files)? {
@@ -1439,13 +1528,19 @@ fn blocking_process_modular_archive(
            let archive_path = cache_dir.join(&archive.file_name);
            let part_path = cache_dir.join(format!("{}.part", archive.file_name));
            let (downloaded_path, _downloaded_size) =
-                resumable_download(&archive.url, cache_dir, shared.as_ref())?;
+                resumable_download(&archive.url, cache_dir, shared.as_ref(), cancel_token.clone())?;
            let file = fs::open(&downloaded_path)?;
            extract_archive_raw(file, format, target_dir)?;
            let _ = fs::remove_file(&archive_path);
            let _ = fs::remove_file(&part_path);
        } else {
-            streaming_download_and_extract(&archive.url, format, target_dir, shared.as_ref())?;
+            streaming_download_and_extract(
+                &archive.url,
+                format,
+                target_dir,
+                shared.as_ref(),
+                cancel_token.clone(),
+            )?;
        }

        if verify_output_files(target_dir, &archive.output_files)? {
@@ -1511,20 +1606,149 @@ fn file_blake3_hex(path: &Path) -> Result<String> {
    Ok(hasher.finalize().to_hex().to_string())
 }

-/// Builds the base URL for the given chain ID using configured defaults.
-fn get_base_url(chain_id: u64) -> String {
-    let defaults = DownloadDefaults::get_global();
-    match &defaults.default_chain_aware_base_url {
-        Some(url) => format!("{url}/{chain_id}"),
-        None => defaults.default_base_url.to_string(),
+/// Discovers the latest snapshot manifest URL for the given chain from the snapshots API.
+///
+/// Queries `snapshots.reth.rs/api/snapshots` and returns the manifest URL for the most
+/// recent modular snapshot matching the requested chain.
+async fn discover_manifest_url(chain_id: u64) -> Result<String> {
+    let api_url = RETH_SNAPSHOTS_API_URL;
+
+    info!(target: "reth::cli", %api_url, %chain_id, "Discovering latest snapshot manifest");
+
+    let entries = fetch_snapshot_api_entries(chain_id).await?;
+
+    let entry =
+        entries.iter().filter(|s| s.is_modular()).max_by_key(|s| s.block).ok_or_else(|| {
+            eyre::eyre!(
+                "No modular snapshot manifest found for chain \
+                 {chain_id} at {api_url}\n\n\
+                 You can provide a manifest URL directly with --manifest-url, or\n\
+                 use a direct snapshot URL with -u from:\n\
+                 \t- https://snapshots.reth.rs\n\n\
+                 Use --list to see all available snapshots."
+            )
+        })?;
+
+    info!(target: "reth::cli",
+        block = entry.block,
+        url = %entry.metadata_url,
+        "Found latest snapshot manifest"
+    );
+
+    Ok(entry.metadata_url.clone())
+}
+
+/// Deserializes a JSON value that may be either a number or a string-encoded number.
+fn deserialize_string_or_u64<'de, D>(deserializer: D) -> std::result::Result<u64, D::Error>
+where
+    D: serde::Deserializer<'de>,
+{
+    use serde::Deserialize;
+    let value = serde_json::Value::deserialize(deserializer)?;
+    match &value {
+        serde_json::Value::Number(n) => {
+            n.as_u64().ok_or_else(|| serde::de::Error::custom("expected u64"))
+        }
+        serde_json::Value::String(s) => {
+            s.parse::<u64>().map_err(|_| serde::de::Error::custom("expected numeric string"))
+        }
+        _ => Err(serde::de::Error::custom("expected number or string")),
    }
 }

+/// An entry from the `snapshots.reth.rs/api/snapshots` listing.
+#[derive(serde::Deserialize)]
+#[serde(rename_all = "camelCase")]
+struct SnapshotApiEntry {
+    #[serde(deserialize_with = "deserialize_string_or_u64")]
+    chain_id: u64,
+    #[serde(deserialize_with = "deserialize_string_or_u64")]
+    block: u64,
+    #[serde(default)]
+    date: Option<String>,
+    #[serde(default)]
+    profile: Option<String>,
+    metadata_url: String,
+    #[serde(default)]
+    size: u64,
+}
+
+impl SnapshotApiEntry {
+    fn is_modular(&self) -> bool {
+        self.metadata_url.ends_with("manifest.json")
+    }
+}
+
+/// Fetches the full snapshot listing from the snapshots API, filtered by chain ID.
+async fn fetch_snapshot_api_entries(chain_id: u64) -> Result<Vec<SnapshotApiEntry>> {
+    let api_url = RETH_SNAPSHOTS_API_URL;
+
+    let entries: Vec<SnapshotApiEntry> = Client::new()
+        .get(api_url)
+        .send()
+        .await
+        .and_then(|r| r.error_for_status())
+        .wrap_err_with(|| format!("Failed to fetch snapshot listing from {api_url}"))?
+        .json()
+        .await?;
+
+    Ok(entries.into_iter().filter(|e| e.chain_id == chain_id).collect())
+}
+
+/// Prints a formatted table of available modular snapshots.
+fn print_snapshot_listing(entries: &[SnapshotApiEntry], chain_id: u64) {
+    let modular: Vec<_> = entries.iter().filter(|e| e.is_modular()).collect();
+
+    println!("Available snapshots for chain {chain_id} (https://snapshots.reth.rs):\n");
+    println!("{:<12}  {:>10}  {:<10}  {:>10}  MANIFEST URL", "DATE", "BLOCK", "PROFILE", "SIZE");
+    println!("{}", "-".repeat(100));
+
+    for entry in &modular {
+        let date = entry.date.as_deref().unwrap_or("-");
+        let profile = entry.profile.as_deref().unwrap_or("-");
+        let size = if entry.size > 0 {
+            DownloadProgress::format_size(entry.size)
+        } else {
+            "-".to_string()
+        };
+
+        println!(
+            "{date:<12}  {:>10}  {profile:<10}  {size:>10}  {}",
+            entry.block, entry.metadata_url
+        );
+    }
+
+    if modular.is_empty() {
+        println!("  (no modular snapshots found)");
+    }
+
+    println!(
+        "\nTo download a specific snapshot, copy its manifest URL and run:\n  \
+         reth download --manifest-url <URL>"
+    );
+}
+
 async fn fetch_manifest_from_source(source: &str) -> Result<SnapshotManifest> {
    if let Ok(parsed) = Url::parse(source) {
        return match parsed.scheme() {
            "http" | "https" => {
-                Ok(Client::new().get(source).send().await?.error_for_status()?.json().await?)
+                let response = Client::new()
+                    .get(source)
+                    .send()
+                    .await
+                    .and_then(|r| r.error_for_status())
+                    .wrap_err_with(|| {
+                        format!(
+                            "Failed to fetch snapshot manifest from {source}\n\n\
+                             The manifest endpoint may not be available for this snapshot source.\n\
+                             You can use a direct snapshot URL instead:\n\n\
+                             \treth download -u <snapshot-url>\n\n\
+                             Available snapshot sources:\n\
+                             \t- https://snapshots.reth.rs\n\
+                             \t- https://publicnode.com/snapshots"
+                        )
+                    })?;
+                Ok(response.json().await?)
            }
            "file" => {
                let path = parsed
@@ -1589,26 +1813,6 @@ fn resolve_manifest_base_url(manifest: &SnapshotManifest, source: &str) -> Resul
    Ok(base)
 }

-/// Builds default URL for latest mainnet archive snapshot using configured defaults.
-///
-/// Used by the legacy single-archive download flow when no manifest is available.
-#[allow(dead_code)]
-async fn get_latest_snapshot_url(chain_id: u64) -> Result<String> {
-    let base_url = get_base_url(chain_id);
-    let latest_url = format!("{base_url}/latest.txt");
-    let filename = Client::new()
-        .get(latest_url)
-        .send()
-        .await?
-        .error_for_status()?
-        .text()
-        .await?
-        .trim()
-        .to_string();
-
-    Ok(format!("{base_url}/{filename}"))
-}
-
 #[cfg(test)]
 mod tests {
    use super::*;
@@ -1641,6 +1845,7 @@ mod tests {
            storage_version: 2,
            timestamp: 0,
            base_url: Some("https://example.com".to_string()),
+            reth_version: None,
            components,
        }
    }
@@ -1672,7 +1877,7 @@ mod tests {
        let help = defaults.long_help();

        assert!(help.contains("Available snapshot sources:"));
-        assert!(help.contains("merkle.io"));
+        assert!(help.contains("snapshots.reth.rs"));
        assert!(help.contains("publicnode.com"));
        assert!(help.contains("file://"));
    }
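
Review note: the selection rule in `discover_manifest_url` (keep modular entries, take the highest block) can be sketched in isolation as below. This is only an illustration under stated assumptions: `Entry` is a hypothetical, trimmed-down stand-in for `SnapshotApiEntry`, `latest_manifest_url` is not a function in this change, and the chain filter is folded into the same pass instead of being applied by the fetch helper.

```rust
/// Hypothetical, trimmed-down stand-in for the `SnapshotApiEntry` in the diff.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    chain_id: u64,
    block: u64,
    metadata_url: String,
}

impl Entry {
    /// Same heuristic as the diff: a modular snapshot points at a `manifest.json`.
    fn is_modular(&self) -> bool {
        self.metadata_url.ends_with("manifest.json")
    }
}

/// Returns the manifest URL of the highest-block modular entry for `chain_id`,
/// or `None` when the listing has no modular entry for that chain.
fn latest_manifest_url(entries: &[Entry], chain_id: u64) -> Option<&str> {
    entries
        .iter()
        .filter(|e| e.chain_id == chain_id && e.is_modular())
        .max_by_key(|e| e.block)
        .map(|e| e.metadata_url.as_str())
}
```

A non-modular entry is skipped even when its block is higher, which is why the error path in the diff points users at `--manifest-url` or `-u` when nothing matches.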
--- a/crates/cli/commands/src/import_core.rs
+++ b/crates/cli/commands/src/import_core.rs
@@ -19,11 +19,12 @@ use reth_node_api::BlockTy;
 use reth_node_events::node::NodeEvent;
 use reth_provider::{
    providers::ProviderNodeTypes, BlockNumReader, HeaderProvider, ProviderError, ProviderFactory,
-    StageCheckpointReader,
+    RocksDBProviderFactory, StageCheckpointReader,
 };
 use reth_prune::PruneModes;
 use reth_stages::{prelude::*, ControlFlow, Pipeline, StageId, StageSet};
 use reth_static_file::StaticFileProducer;
+use reth_storage_api::StorageSettingsCache;
 use std::{path::Path, sync::Arc};
 use tokio::sync::watch;
 use tracing::{debug, error, info, warn};
@@ -108,7 +109,11 @@ where

    let provider = provider_factory.provider()?;
    let init_blocks = provider.tx_ref().entries::<tables::HeaderNumbers>()?;
-    let init_txns = provider.tx_ref().entries::<tables::TransactionHashNumbers>()?;
+    let init_txns = if provider_factory.cached_storage_settings().storage_v2 {
+        provider_factory.rocksdb_provider().iter::<tables::TransactionHashNumbers>()?.count()
+    } else {
+        provider.tx_ref().entries::<tables::TransactionHashNumbers>()?
+    };
    drop(provider);

    let mut total_decoded_blocks = 0;
@@ -215,8 +220,12 @@ where

    let provider = provider_factory.provider()?;
    let total_imported_blocks = provider.tx_ref().entries::<tables::HeaderNumbers>()? - init_blocks;
-    let total_imported_txns =
-        provider.tx_ref().entries::<tables::TransactionHashNumbers>()? - init_txns;
+    let current_txns = if provider_factory.cached_storage_settings().storage_v2 {
+        provider_factory.rocksdb_provider().iter::<tables::TransactionHashNumbers>()?.count()
+    } else {
+        provider.tx_ref().entries::<tables::TransactionHashNumbers>()?
+    };
+    let total_imported_txns = current_txns - init_txns;

    let result = ImportResult {
        total_decoded_blocks,
--- a/crates/cli/commands/src/init_state/mod.rs
+++ b/crates/cli/commands/src/init_state/mod.rs
@@ -47,7 +47,7 @@ pub struct InitStateCommand<C: ChainSpecParser> {
    /// Specifies whether to initialize the state without relying on EVM historical data.
    ///
    /// When enabled, and before inserting the state, it creates a dummy chain up to the last EVM
-    /// block specified. It then, appends the first block provided block.
+    /// block specified. It then appends the first provided block.
    ///
    /// - **Note**: **Do not** import receipts and blocks beforehand, or this will fail or be
    ///   ignored.
--- a/crates/cli/commands/src/node.rs
+++ b/crates/cli/commands/src/node.rs
@@ -125,12 +125,12 @@ pub struct NodeCommand<C: ChainSpecParser, Ext: clap::Args + fmt::Debug = NoArgs
 }

 impl<C: ChainSpecParser> NodeCommand<C> {
-    /// Parsers only the default CLI arguments
+    /// Parses only the default CLI arguments
    pub fn parse_args() -> Self {
        Self::parse()
    }

-    /// Parsers only the default [`NodeCommand`] arguments from the given iterator
+    /// Parses only the default [`NodeCommand`] arguments from the given iterator
    pub fn try_parse_args_from<I, T>(itr: I) -> Result<Self, clap::error::Error>
    where
        I: IntoIterator<Item = T>,
--- a/crates/cli/commands/src/p2p/mod.rs
+++ b/crates/cli/commands/src/p2p/mod.rs
@@ -193,7 +193,10 @@ impl<C: ChainSpecParser> DownloadArgs<C> {
        let default_secret_key_path = data_dir.p2p_secret();
        let p2p_secret_key = self.network.secret_key(default_secret_key_path)?;
        let rlpx_socket = (self.network.addr, self.network.port).into();
-        let boot_nodes = self.chain.bootnodes().unwrap_or_default();
+        let boot_nodes = self
+            .network
+            .resolved_bootnodes()
+            .unwrap_or_else(|| self.chain.bootnodes().unwrap_or_default());

        let net =
            NetworkConfigBuilder::<N::NetworkPrimitives>::new(p2p_secret_key, Runtime::test())
--- a/crates/cli/commands/src/prune.rs
+++ b/crates/cli/commands/src/prune.rs
@@ -12,7 +12,6 @@ use reth_node_metrics::{
    server::{MetricServer, MetricServerConfig},
    version::VersionInfo,
 };
-#[cfg(all(unix, feature = "rocksdb"))]
 use reth_provider::RocksDBProviderFactory;
 use reth_prune::PrunerBuilder;
 use reth_static_file::StaticFileProducer;
@@ -122,7 +121,6 @@ impl<C: ChainSpecParser<ChainSpec: EthChainSpec + EthereumHardforks>> PruneComma
        }

        // Flush and compact RocksDB to reclaim disk space after pruning
-        #[cfg(all(unix, feature = "rocksdb"))]
        {
            info!(target: "reth::cli", "Flushing and compacting RocksDB...");
            provider_factory.rocksdb_provider().flush_and_compact()?;
--- a/crates/cli/commands/src/stage/run.rs
+++ b/crates/cli/commands/src/stage/run.rs
@@ -107,7 +107,7 @@ impl<C: ChainSpecParser<ChainSpec: EthChainSpec + Hardforks + EthereumHardforks>
        Comp: CliNodeComponents<N>,
        F: FnOnce(Arc<C::ChainSpec>) -> Comp,
    {
-        // Quit early if the stages requires a commit and `--commit` is not provided.
+        // Quit early if the stage requires a commit and `--commit` is not provided.
        if self.requires_commit() && !self.commit {
            return Err(eyre::eyre!(
                "The stage {} requires overwriting existing static files and must commit, but `--commit` was not provided. Please pass `--commit` and try again.",
--- a/crates/cli/runner/src/lib.rs
+++ b/crates/cli/runner/src/lib.rs
@@ -71,7 +71,12 @@ impl CliRunner {
    ) -> Result<(), E>
    where
        F: Future<Output = Result<(), E>>,
-        E: Send + Sync + From<std::io::Error> + From<reth_tasks::PanickedTaskError> + 'static,
+        E: Send
+            + Sync
+            + std::fmt::Display
+            + From<std::io::Error>
+            + From<reth_tasks::PanickedTaskError>
+            + 'static,
    {
        let (context, task_manager_handle) = cli_context(&self.runtime);

@@ -81,8 +86,8 @@ impl CliRunner {
            run_until_ctrl_c(command(context)),
        ));

-        if command_res.is_err() {
-            error!(target: "reth::cli", "shutting down due to error");
+        if let Err(err) = &command_res {
+            error!(target: "reth::cli", %err, "shutting down due to error");
        } else {
            debug!(target: "reth::cli", "shutting down gracefully");
            // after the command has finished or exit signal was received we shutdown the
@@ -105,7 +110,12 @@ impl CliRunner {
    ) -> Result<(), E>
    where
        F: Future<Output = Result<(), E>> + Send + 'static,
-        E: Send + Sync + From<std::io::Error> + From<reth_tasks::PanickedTaskError> + 'static,
+        E: Send
+            + Sync
+            + std::fmt::Display
+            + From<std::io::Error>
+            + From<reth_tasks::PanickedTaskError>
+            + 'static,
    {
        let (context, task_manager_handle) = cli_context(&self.runtime);

@@ -122,8 +132,8 @@ impl CliRunner {
            ),
        ));

-        if command_res.is_err() {
-            error!(target: "reth::cli", "shutting down due to error");
+        if let Err(err) = &command_res {
+            error!(target: "reth::cli", %err, "shutting down due to error");
        } else {
            debug!(target: "reth::cli", "shutting down gracefully");
            self.runtime.graceful_shutdown_with_timeout(self.config.graceful_shutdown_timeout);
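
Review note: the new `std::fmt::Display` bound on `E` is what allows the error to be rendered in the shutdown log. A minimal sketch of that pattern follows; `shutdown_message` is a hypothetical helper that returns a string rather than calling `tracing`, so it runs with the standard library alone.

```rust
use std::fmt::Display;

/// Builds the shutdown line for a command result.
/// The `Display` bound is what permits formatting `err` here, which mirrors
/// the `%err` field added to the `error!` call in the diff.
fn shutdown_message<E: Display>(res: &Result<(), E>) -> String {
    match res {
        Err(err) => format!("shutting down due to error: {err}"),
        Ok(()) => "shutting down gracefully".to_string(),
    }
}
```

Callers whose error type does not implement `Display` would no longer satisfy the bound, so this is a small API tightening as well as a logging improvement.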
--- a/crates/e2e-test-utils/Cargo.toml
+++ b/crates/e2e-test-utils/Cargo.toml
@@ -75,8 +75,3 @@ path = "tests/e2e-testsuite/main.rs"
 [[test]]
 name = "rocksdb"
 path = "tests/rocksdb/main.rs"
-required-features = ["rocksdb"]
-
-[features]
-rocksdb = ["reth-node-core/rocksdb", "reth-provider/rocksdb", "reth-cli-commands/rocksdb"]
-edge = ["rocksdb"]
--- a/crates/e2e-test-utils/tests/rocksdb/main.rs
+++ b/crates/e2e-test-utils/tests/rocksdb/main.rs
@@ -1,7 +1,5 @@
 //! E2E tests for `RocksDB` provider functionality.

-#![cfg(all(feature = "rocksdb", unix))]
-
 use alloy_consensus::BlockHeader;
 use alloy_primitives::B256;
 use alloy_rpc_types_eth::{Transaction, TransactionReceipt};
--- a/crates/engine/primitives/Cargo.toml
+++ b/crates/engine/primitives/Cargo.toml
@@ -39,6 +39,7 @@ thiserror.workspace = true

 [features]
 default = ["std"]
+trie-debug = []
 std = [
    "reth-execution-types/std",
    "reth-ethereum-primitives/std",
--- a/crates/engine/primitives/src/config.rs
+++ b/crates/engine/primitives/src/config.rs
@@ -26,10 +26,15 @@ pub const DEFAULT_RESERVED_CPU_CORES: usize = 1;
 /// Depth 4 means we keep roughly 16^4 = 65536 potential branch paths at most.
 pub const DEFAULT_SPARSE_TRIE_PRUNE_DEPTH: usize = 4;

-/// Default maximum number of storage tries to keep after pruning.
+/// Default LFU hot-slot capacity for sparse trie pruning.
 ///
-/// Storage tries beyond this limit are cleared (but allocations preserved).
-pub const DEFAULT_SPARSE_TRIE_MAX_STORAGE_TRIES: usize = 100;
+/// Limits the number of `(address, slot)` pairs retained across prune cycles.
+pub const DEFAULT_SPARSE_TRIE_MAX_HOT_SLOTS: usize = 1500;
+
+/// Default LFU hot-account capacity for sparse trie pruning.
+///
+/// Limits the number of account addresses retained across prune cycles.
+pub const DEFAULT_SPARSE_TRIE_MAX_HOT_ACCOUNTS: usize = 1000;

 /// Default timeout for the state root task before spawning a sequential fallback.
 pub const DEFAULT_STATE_ROOT_TASK_TIMEOUT: Duration = Duration::from_secs(1);
@@ -131,15 +136,28 @@ pub struct TreeConfig {
    disable_cache_metrics: bool,
    /// Depth for sparse trie pruning after state root computation.
    sparse_trie_prune_depth: usize,
-    /// Maximum number of storage tries to retain after pruning.
-    sparse_trie_max_storage_tries: usize,
+    /// LFU hot-slot capacity: max `(address, slot)` pairs retained across prune cycles.
+    sparse_trie_max_hot_slots: usize,
+    /// LFU hot-account capacity: max account addresses retained across prune cycles.
+    sparse_trie_max_hot_accounts: usize,
+    /// When set, blocks whose total processing time (execution + state root + DB commit) exceeds
+    /// this duration trigger a structured `warn!` log with detailed timing, state-operation
+    /// counts, and cache hit-rate metrics. `Duration::ZERO` logs every block.
+    slow_block_threshold: Option<Duration>,
    /// Whether to fully disable sparse trie cache pruning between blocks.
    disable_sparse_trie_cache_pruning: bool,
+    /// Whether to use the arena-based sparse trie implementation.
+    enable_arena_sparse_trie: bool,
    /// Timeout for the state root task before spawning a sequential fallback computation.
    /// If `Some`, after waiting this duration for the state root task, a sequential state root
    /// computation is spawned in parallel and whichever finishes first is used.
    /// If `None`, the timeout fallback is disabled.
    state_root_task_timeout: Option<Duration>,
+    /// Maximum random jitter applied before each proof computation (trie-debug only).
+    /// When set, each proof worker sleeps for a random duration up to this value
+    /// before starting a proof calculation.
+    #[cfg(feature = "trie-debug")]
+    proof_jitter: Option<Duration>,
 }

 impl Default for TreeConfig {
@@ -165,9 +183,14 @@ impl Default for TreeConfig {
            allow_unwind_canonical_header: false,
            disable_cache_metrics: false,
            sparse_trie_prune_depth: DEFAULT_SPARSE_TRIE_PRUNE_DEPTH,
-            sparse_trie_max_storage_tries: DEFAULT_SPARSE_TRIE_MAX_STORAGE_TRIES,
+            sparse_trie_max_hot_slots: DEFAULT_SPARSE_TRIE_MAX_HOT_SLOTS,
+            sparse_trie_max_hot_accounts: DEFAULT_SPARSE_TRIE_MAX_HOT_ACCOUNTS,
+            slow_block_threshold: None,
            disable_sparse_trie_cache_pruning: false,
+            enable_arena_sparse_trie: false,
            state_root_task_timeout: Some(DEFAULT_STATE_ROOT_TASK_TIMEOUT),
+            #[cfg(feature = "trie-debug")]
+            proof_jitter: None,
        }
    }
 }
@@ -196,7 +219,9 @@ impl TreeConfig {
        allow_unwind_canonical_header: bool,
        disable_cache_metrics: bool,
        sparse_trie_prune_depth: usize,
-        sparse_trie_max_storage_tries: usize,
+        sparse_trie_max_hot_slots: usize,
+        sparse_trie_max_hot_accounts: usize,
+        slow_block_threshold: Option<Duration>,
        state_root_task_timeout: Option<Duration>,
    ) -> Self {
        Self {
@@ -220,9 +245,14 @@ impl TreeConfig {
            allow_unwind_canonical_header,
            disable_cache_metrics,
            sparse_trie_prune_depth,
-            sparse_trie_max_storage_tries,
+            sparse_trie_max_hot_slots,
+            sparse_trie_max_hot_accounts,
+            slow_block_threshold,
            disable_sparse_trie_cache_pruning: false,
+            enable_arena_sparse_trie: false,
            state_root_task_timeout,
+            #[cfg(feature = "trie-debug")]
+            proof_jitter: None,
        }
    }

@@ -471,14 +501,43 @@ impl TreeConfig {
        self
    }

-    /// Returns the maximum number of storage tries to retain after pruning.
-    pub const fn sparse_trie_max_storage_tries(&self) -> usize {
-        self.sparse_trie_max_storage_tries
+    /// Returns the LFU hot-slot capacity for sparse trie pruning.
+    pub const fn sparse_trie_max_hot_slots(&self) -> usize {
+        self.sparse_trie_max_hot_slots
    }

-    /// Setter for maximum storage tries to retain.
-    pub const fn with_sparse_trie_max_storage_tries(mut self, max_tries: usize) -> Self {
-        self.sparse_trie_max_storage_tries = max_tries;
+    /// Setter for LFU hot-slot capacity.
+    pub const fn with_sparse_trie_max_hot_slots(mut self, max_hot_slots: usize) -> Self {
+        self.sparse_trie_max_hot_slots = max_hot_slots;
+        self
+    }
+
+    /// Returns the LFU hot-account capacity for sparse trie pruning.
+    pub const fn sparse_trie_max_hot_accounts(&self) -> usize {
+        self.sparse_trie_max_hot_accounts
+    }
+
+    /// Setter for LFU hot-account capacity.
+    pub const fn with_sparse_trie_max_hot_accounts(mut self, max_hot_accounts: usize) -> Self {
+        self.sparse_trie_max_hot_accounts = max_hot_accounts;
+        self
+    }
+
+    /// Returns the slow block threshold, if configured.
+    ///
+    /// When `Some`, blocks whose total processing time exceeds this duration emit a structured
+    /// warning with timing, state-operation, and cache-hit-rate details. `Duration::ZERO` logs
+    /// every block.
+    pub const fn slow_block_threshold(&self) -> Option<Duration> {
+        self.slow_block_threshold
+    }
+
+    /// Setter for slow block threshold.
+    pub const fn with_slow_block_threshold(
+        mut self,
+        slow_block_threshold: Option<Duration>,
+    ) -> Self {
+        self.slow_block_threshold = slow_block_threshold;
        self
    }

@@ -493,6 +552,17 @@ impl TreeConfig {
        self
    }

+    /// Returns whether the arena-based sparse trie is enabled.
+    pub const fn enable_arena_sparse_trie(&self) -> bool {
+        self.enable_arena_sparse_trie
+    }
+
+    /// Setter for whether to enable the arena-based sparse trie.
+    pub const fn with_enable_arena_sparse_trie(mut self, value: bool) -> Self {
+        self.enable_arena_sparse_trie = value;
+        self
+    }
+
    /// Returns the state root task timeout.
    pub const fn state_root_task_timeout(&self) -> Option<Duration> {
        self.state_root_task_timeout
@@ -503,4 +573,17 @@ impl TreeConfig {
        self.state_root_task_timeout = timeout;
        self
    }
+
+    /// Returns the proof jitter duration, if configured (trie-debug only).
+    #[cfg(feature = "trie-debug")]
+    pub const fn proof_jitter(&self) -> Option<Duration> {
+        self.proof_jitter
+    }
+
+    /// Setter for proof jitter (trie-debug only).
+    #[cfg(feature = "trie-debug")]
+    pub const fn with_proof_jitter(mut self, proof_jitter: Option<Duration>) -> Self {
+        self.proof_jitter = proof_jitter;
+        self
+    }
 }
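
Review note: the `slow_block_threshold` semantics documented above can be sketched as follows. `Config` and `is_slow` are hypothetical names, not part of `TreeConfig`, and the strict `>` comparison is an assumption read off the word "exceeds" in the doc comment; with it, a `Duration::ZERO` threshold matches any block whose total time is non-zero.

```rust
use std::time::Duration;

/// Hypothetical miniature of `TreeConfig`, holding only the new threshold.
#[derive(Debug, Clone, Copy, Default)]
struct Config {
    slow_block_threshold: Option<Duration>,
}

impl Config {
    /// Builder-style setter in the same shape as the setters in the diff.
    const fn with_slow_block_threshold(mut self, threshold: Option<Duration>) -> Self {
        self.slow_block_threshold = threshold;
        self
    }

    /// Assumption: "exceeds" means strictly greater than the threshold.
    /// `None` disables slow-block logging entirely.
    fn is_slow(&self, total: Duration) -> bool {
        match self.slow_block_threshold {
            Some(threshold) => total > threshold,
            None => false,
        }
    }
}
```

Keeping the field an `Option<Duration>` separates "disabled" (`None`) from "log everything" (`Some(Duration::ZERO)`), which a plain `Duration` default could not express.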
--- a/crates/engine/primitives/src/event.rs
+++ b/crates/engine/primitives/src/event.rs
@@ -9,7 +9,7 @@ use core::{
    fmt::{Display, Formatter, Result},
    time::Duration,
 };
-use reth_chain_state::ExecutedBlock;
+use reth_chain_state::{ExecutedBlock, ExecutionTimingStats};
 use reth_ethereum_primitives::EthPrimitives;
 use reth_primitives_traits::{NodePrimitives, SealedBlock, SealedHeader};

@@ -32,6 +32,8 @@ pub enum ConsensusEngineEvent<N: NodePrimitives = EthPrimitives> {
    CanonicalChainCommitted(Box<SealedHeader<N::BlockHeader>>, Duration),
    /// The consensus engine processed an invalid block.
    InvalidBlock(Box<SealedBlock<N::Block>>),
+    /// A slow block was detected after persistence, with its timing statistics.
+    SlowBlock(SlowBlockInfo),
 }

 impl<N: NodePrimitives> ConsensusEngineEvent<N> {
@@ -73,6 +75,25 @@ where
            Self::BlockReceived(num_hash) => {
                write!(f, "BlockReceived({num_hash:?})")
            }
+            Self::SlowBlock(info) => {
+                write!(
+                    f,
+                    "SlowBlock(block={}, total={:?})",
+                    info.stats.block_number, info.total_duration
+                )
+            }
        }
    }
 }
+
+/// Information about a slow block detected after persistence.
+#[derive(Clone, Debug)]
+pub struct SlowBlockInfo {
+    /// The timing statistics for the slow block.
+    pub stats: Box<ExecutionTimingStats>,
+    /// The commit duration for the batch containing this block.
+    pub commit_duration: Duration,
+    /// The total duration (execution + `state_root` + commit).
+    /// Note: `state_read` is a subset of execution and is not added separately.
+    pub total_duration: Duration,
+}
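
Review note: the doc comment on `total_duration` says state reads are a subset of execution and are not added separately. A small worked sketch of that arithmetic is below; the `Timings` struct and its field names are hypothetical and do not reflect the real layout of `ExecutionTimingStats`.

```rust
use std::time::Duration;

/// Hypothetical per-block timings; field names are illustrative only.
struct Timings {
    execution: Duration,
    /// Time spent reading state. Already contained in `execution`.
    state_read: Duration,
    state_root: Duration,
    commit: Duration,
}

/// Total as described in the diff: execution + state root + commit.
/// `state_read` is deliberately left out to avoid counting it twice.
fn total_duration(t: &Timings) -> Duration {
    t.execution + t.state_root + t.commit
}
```

With execution at 100 ms (of which 40 ms are state reads), state root at 30 ms, and commit at 20 ms, the total is 150 ms rather than 190 ms.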
--- a/crates/engine/tree/Cargo.toml
+++ b/crates/engine/tree/Cargo.toml
@@ -100,7 +100,6 @@ revm-state.workspace = true
 assert_matches.workspace = true
 eyre.workspace = true
 serde_json.workspace = true
-crossbeam-channel.workspace = true
 proptest.workspace = true
 rand.workspace = true
 rand_08.workspace = true
@@ -134,14 +133,12 @@ test-utils = [
    "reth-evm-ethereum/test-utils",
    "reth-tasks/test-utils",
 ]
-trie-debug = ["reth-trie-sparse/trie-debug", "dep:serde_json"]
-rocksdb = [
-    "reth-provider/rocksdb",
-    "reth-prune/rocksdb",
-    "reth-stages?/rocksdb",
-    "reth-e2e-test-utils/rocksdb",
+trie-debug = [
+    "reth-trie-sparse/trie-debug",
+    "reth-trie-parallel/trie-debug",
+    "reth-engine-primitives/trie-debug",
+    "dep:serde_json",
 ]
-edge = ["rocksdb"]

 [[test]]
 name = "e2e_testsuite"
--- a/crates/engine/tree/src/persistence.rs
+++ b/crates/engine/tree/src/persistence.rs
@@ -18,10 +18,20 @@ use std::{
        Arc,
    },
    thread::JoinHandle,
+    time::Duration,
 };
 use thiserror::Error;
 use tracing::{debug, error, instrument};

+/// Unified result of any persistence operation.
+#[derive(Debug)]
+pub struct PersistenceResult {
+    /// The last block that was persisted, if any.
+    pub last_block: Option<BlockNumHash>,
+    /// The commit duration, only available for save-blocks operations.
+    pub commit_duration: Option<Duration>,
+}
+
 /// Writes parts of reth's in memory tree state to the database and static files.
 ///
 /// This is meant to be a spawned service that listens for various incoming persistence operations,
@@ -86,18 +96,16 @@ where
        while let Ok(action) = self.incoming.recv() {
            match action {
                PersistenceAction::RemoveBlocksAbove(new_tip_num, sender) => {
-                    let result = self.on_remove_blocks_above(new_tip_num)?;
+                    let last_block = self.on_remove_blocks_above(new_tip_num)?;
                    // send new sync metrics based on removed blocks
                    let _ =
                        self.sync_metrics_tx.send(MetricEvent::SyncHeight { height: new_tip_num });
-                    // we ignore the error because the caller may or may not care about the result
-                    let _ = sender.send(result);
+                    let _ = sender.send(PersistenceResult { last_block, commit_duration: None });
                }
                PersistenceAction::SaveBlocks(blocks, sender) => {
                    let result = self.on_save_blocks(blocks)?;
-                    let result_number = result.map(|r| r.number);
+                    let result_number = result.last_block.map(|b| b.number);

-                    // we ignore the error because the caller may or may not care about the result
                    let _ = sender.send(result);

                    if let Some(block_number) = result_number {
@@ -140,7 +148,7 @@ where
    fn on_save_blocks(
        &mut self,
        blocks: Vec<ExecutedBlock<N::Primitives>>,
-    ) -> Result<Option<BlockNumHash>, PersistenceError> {
+    ) -> Result<PersistenceResult, PersistenceError> {
        let first_block = blocks.first().map(|b| b.recovered_block.num_hash());
        let last_block = blocks.last().map(|b| b.recovered_block.num_hash());
        let block_count = blocks.len();
@@ -157,8 +165,6 @@ where
            provider_rw.save_blocks(blocks, SaveBlocksMode::Full)?;

            if let Some(finalized) = pending_finalized {
-                // Clamp to the highest persisted block so that on restart
-                // `last_finalized_block_number` never points past available state.
                provider_rw.save_finalized_block_number(finalized.min(last.number))?;
                if finalized > last.number {
                    self.pending_finalized_block = Some(finalized);
@@ -183,10 +189,11 @@ where

        debug!(target: "engine::persistence", first=?first_block, last=?last_block, "Saved range of blocks");

+        let elapsed = start_time.elapsed();
        self.metrics.save_blocks_batch_size.record(block_count as f64);
-        self.metrics.save_blocks_duration_seconds.record(start_time.elapsed());
+        self.metrics.save_blocks_duration_seconds.record(elapsed);

-        Ok(last_block)
+        Ok(PersistenceResult { last_block, commit_duration: Some(elapsed) })
    }
 }

@@ -210,13 +217,13 @@ pub enum PersistenceAction<N: NodePrimitives = EthPrimitives> {
    ///
    /// First, header, transaction, and receipt-related data should be written to static files.
    /// Then the execution history-related data will be written to the database.
-    SaveBlocks(Vec<ExecutedBlock<N>>, CrossbeamSender<Option<BlockNumHash>>),
+    SaveBlocks(Vec<ExecutedBlock<N>>, CrossbeamSender<PersistenceResult>),

    /// Removes block data above the given block number from the database.
    ///
    /// This will first update checkpoints from the database, then remove actual block data from
    /// static files.
-    RemoveBlocksAbove(u64, CrossbeamSender<Option<BlockNumHash>>),
+    RemoveBlocksAbove(u64, CrossbeamSender<PersistenceResult>),

    /// Update the persisted finalized block on disk
    SaveFinalizedBlock(u64),
@@ -295,7 +302,7 @@ impl<T: NodePrimitives> PersistenceHandle<T> {
    pub fn save_blocks(
        &self,
        blocks: Vec<ExecutedBlock<T>>,
-        tx: CrossbeamSender<Option<BlockNumHash>>,
+        tx: CrossbeamSender<PersistenceResult>,
    ) -> Result<(), SendError<PersistenceAction<T>>> {
        self.send_action(PersistenceAction::SaveBlocks(blocks, tx))
    }
@@ -330,7 +337,7 @@ impl<T: NodePrimitives> PersistenceHandle<T> {
    pub fn remove_blocks_above(
        &self,
        block_num: u64,
-        tx: CrossbeamSender<Option<BlockNumHash>>,
+        tx: CrossbeamSender<PersistenceResult>,
    ) -> Result<(), SendError<PersistenceAction<T>>> {
        self.send_action(PersistenceAction::RemoveBlocksAbove(block_num, tx))
    }
@@ -390,8 +397,8 @@ mod tests {

        handle.save_blocks(blocks, tx).unwrap();

-        let hash = rx.recv().unwrap();
-        assert_eq!(hash, None);
+        let result = rx.recv().unwrap();
+        assert!(result.last_block.is_none());
    }

    #[test]
@@ -409,12 +416,9 @@ mod tests {

        handle.save_blocks(blocks, tx).unwrap();

-        let BlockNumHash { hash: actual_hash, number: _ } = rx
-            .recv_timeout(std::time::Duration::from_secs(10))
-            .expect("test timed out")
-            .expect("no hash returned");
+        let result = rx.recv_timeout(std::time::Duration::from_secs(10)).expect("test timed out");

-        assert_eq!(block_hash, actual_hash);
+        assert_eq!(block_hash, result.last_block.unwrap().hash);
    }

    #[test]
@@ -428,8 +432,8 @@ mod tests {
        let (tx, rx) = crossbeam_channel::bounded(1);

        handle.save_blocks(blocks, tx).unwrap();
-        let BlockNumHash { hash: actual_hash, number: _ } = rx.recv().unwrap().unwrap();
-        assert_eq!(last_hash, actual_hash);
+        let result = rx.recv().unwrap();
+        assert_eq!(last_hash, result.last_block.unwrap().hash);
    }

    #[test]
@@ -446,8 +450,8 @@ mod tests {

            handle.save_blocks(blocks, tx).unwrap();

-            let BlockNumHash { hash: actual_hash, number: _ } = rx.recv().unwrap().unwrap();
-            assert_eq!(last_hash, actual_hash);
+            let result = rx.recv().unwrap();
+            assert_eq!(last_hash, result.last_block.unwrap().hash);
        }
    }
 }
--- a/crates/engine/tree/src/tree/cached_state.rs
+++ b/crates/engine/tree/src/tree/cached_state.rs
@@ -18,9 +18,7 @@ use reth_trie::{
    updates::TrieUpdates, AccountProof, HashedPostState, HashedStorage, MultiProof,
    MultiProofTargets, StorageMultiProof, StorageProof, TrieInput,
 };
-use revm_primitives::eip7907::MAX_CODE_SIZE;
 use std::{
-    mem::size_of,
    sync::{
        atomic::{AtomicU64, AtomicUsize, Ordering},
        Arc,
@@ -56,8 +54,17 @@ const fn fixed_cache_key_size_with_value<K>(value: usize) -> usize {
    raw_size.div_ceil(FIXED_CACHE_ALIGNMENT) * FIXED_CACHE_ALIGNMENT
 }

-/// Size in bytes of a single code cache entry.
-const CODE_CACHE_ENTRY_SIZE: usize = fixed_cache_key_size_with_value::<Address>(MAX_CODE_SIZE);
+/// Estimated average bytecode size for cache budget calculation.
+///
+/// The fixed-cache stores `Option<Bytecode>` inline (pointer-sized), but each cached contract
+/// also holds bytecode on the heap. For budget estimation we use 8 KiB, which is close to the
+/// observed mainnet average (~7 KiB). Using `MAX_CODE_SIZE` (48 KiB) overestimates by ~7x,
+/// yielding only 4096 entries for a 228 MB code-cache budget when 16384 fit comfortably.
+const ESTIMATED_AVG_CODE_SIZE: usize = 8 * 1024;
+
+/// Size in bytes of a single code cache entry (inline metadata + estimated heap).
+const CODE_CACHE_ENTRY_SIZE: usize =
+    fixed_cache_key_size_with_value::<Address>(ESTIMATED_AVG_CODE_SIZE);

 /// Size in bytes of a single storage cache entry.
 const STORAGE_CACHE_ENTRY_SIZE: usize =
@@ -94,6 +101,10 @@ pub struct CachedStateProvider<S, const PREWARM: bool = false> {

    /// Metrics for the cached state provider
    metrics: CachedStateMetrics,
+
+    /// Optional cache statistics for detailed block logging. Only tracked when a slow block
+    /// threshold is configured.
+    cache_stats: Option<Arc<CacheStats>>,
 }

 impl<S> CachedStateProvider<S> {
@@ -104,7 +115,7 @@ impl<S> CachedStateProvider<S> {
        caches: ExecutionCache,
        metrics: CachedStateMetrics,
    ) -> Self {
-        Self { state_provider, caches, metrics }
+        Self { state_provider, caches, metrics, cache_stats: None }
    }
 }

@@ -115,14 +126,28 @@ impl<S> CachedStateProvider<S, true> {
        caches: ExecutionCache,
        metrics: CachedStateMetrics,
    ) -> Self {
-        Self { state_provider, caches, metrics }
+        Self { state_provider, caches, metrics, cache_stats: None }
    }
 }

-/// Metrics for the cached state provider, showing hits / misses / size for each cache.
-///
-/// This struct combines both the provider-level metrics (hits/misses tracked by the provider)
-/// and the fixed-cache internal stats (collisions, size, capacity).
+impl<S, const PREWARM: bool> CachedStateProvider<S, PREWARM> {
+    /// Enables cache statistics tracking for detailed block logging.
+    pub fn with_cache_stats(mut self, stats: Option<Arc<CacheStats>>) -> Self {
+        self.cache_stats = stats;
+        self
+    }
+}
+
+/// Represents the status of a key in the cache.
+#[derive(Debug, Clone, PartialEq, Eq)]
+pub enum CachedStatus<T> {
+    /// The key is not in the cache (or was invalidated). The value was recalculated.
+    NotCached(T),
+    /// The key exists in cache and has a specific value.
+    Cached(T),
+}
+
+/// Metrics for the cached state provider, showing hits / misses for each cache.
 #[derive(Metrics, Clone)]
 #[metrics(scope = "sync.caching")]
 pub struct CachedStateMetrics {
@@ -211,6 +236,73 @@ impl CachedStateMetrics {
    }
 }

+/// Cache hit/miss statistics for detailed block logging.
+#[derive(Debug, Default)]
+pub struct CacheStats {
+    /// Account cache hits
+    account_hits: AtomicUsize,
+    /// Account cache misses
+    account_misses: AtomicUsize,
+    /// Storage cache hits
+    storage_hits: AtomicUsize,
+    /// Storage cache misses
+    storage_misses: AtomicUsize,
+    /// Code cache hits
+    code_hits: AtomicUsize,
+    /// Code cache misses
+    code_misses: AtomicUsize,
+}
+
+impl CacheStats {
+    pub(crate) fn record_account_hit(&self) {
+        self.account_hits.fetch_add(1, Ordering::Relaxed);
+    }
+
+    pub(crate) fn record_account_miss(&self) {
+        self.account_misses.fetch_add(1, Ordering::Relaxed);
+    }
+
+    pub(crate) fn account_hits(&self) -> usize {
+        self.account_hits.load(Ordering::Relaxed)
+    }
+
+    pub(crate) fn account_misses(&self) -> usize {
+        self.account_misses.load(Ordering::Relaxed)
+    }
+
+    pub(crate) fn record_storage_hit(&self) {
+        self.storage_hits.fetch_add(1, Ordering::Relaxed);
+    }
+
+    pub(crate) fn record_storage_miss(&self) {
+        self.storage_misses.fetch_add(1, Ordering::Relaxed);
+    }
+
+    pub(crate) fn storage_hits(&self) -> usize {
+        self.storage_hits.load(Ordering::Relaxed)
+    }
+
+    pub(crate) fn storage_misses(&self) -> usize {
+        self.storage_misses.load(Ordering::Relaxed)
+    }
+
+    pub(crate) fn record_code_hit(&self) {
+        self.code_hits.fetch_add(1, Ordering::Relaxed);
+    }
+
+    pub(crate) fn record_code_miss(&self) {
+        self.code_misses.fetch_add(1, Ordering::Relaxed);
+    }
+
+    pub(crate) fn code_hits(&self) -> usize {
+        self.code_hits.load(Ordering::Relaxed)
+    }
+
+    pub(crate) fn code_misses(&self) -> usize {
+        self.code_misses.load(Ordering::Relaxed)
+    }
+}
+
 /// A stats handler for fixed-cache that tracks collisions and size.
 ///
 /// Note: Hits and misses are tracked directly by the [`CachedStateProvider`] via
@@ -306,27 +398,36 @@ impl<S: AccountReader, const PREWARM: bool> AccountReader for CachedStateProvide
            match self.caches.get_or_try_insert_account_with(*address, || {
                self.state_provider.basic_account(address)
            })? {
-                CachedStatus::NotCached(value) | CachedStatus::Cached(value) => Ok(value),
+                // During prewarm we only record stats (not Prometheus metrics)
+                CachedStatus::NotCached(value) => {
+                    if let Some(stats) = &self.cache_stats {
+                        stats.record_account_miss();
+                    }
+                    Ok(value)
+                }
+                CachedStatus::Cached(value) => {
+                    if let Some(stats) = &self.cache_stats {
+                        stats.record_account_hit();
+                    }
+                    Ok(value)
+                }
            }
        } else if let Some(account) = self.caches.0.account_cache.get(address) {
            self.metrics.account_cache_hits.increment(1);
+            if let Some(stats) = &self.cache_stats {
+                stats.record_account_hit();
+            }
            Ok(account)
        } else {
            self.metrics.account_cache_misses.increment(1);
+            if let Some(stats) = &self.cache_stats {
+                stats.record_account_miss();
+            }
            self.state_provider.basic_account(address)
        }
    }
 }

-/// Represents the status of a key in the cache.
-#[derive(Debug, Clone, PartialEq, Eq)]
-pub enum CachedStatus<T> {
-    /// The key is not in the cache (or was invalidated). The value was recalculated.
-    NotCached(T),
-    /// The key exists in cache and has a specific value.
-    Cached(T),
-}
-
 impl<S: StateProvider, const PREWARM: bool> StateProvider for CachedStateProvider<S, PREWARM> {
    fn storage(
        &self,
@@ -337,17 +438,31 @@ impl<S: StateProvider, const PREWARM: bool> StateProvider for CachedStateProvide
            match self.caches.get_or_try_insert_storage_with(account, storage_key, || {
                self.state_provider.storage(account, storage_key).map(Option::unwrap_or_default)
            })? {
-                CachedStatus::NotCached(value) | CachedStatus::Cached(value) => {
-                    // The slot that was never written to is indistinguishable from a slot
-                    // explicitly set to zero. We return `None` in both cases.
+                // During prewarm we only record stats (not Prometheus metrics)
+                CachedStatus::NotCached(value) => {
+                    if let Some(stats) = &self.cache_stats {
+                        stats.record_storage_miss();
+                    }
+                    Ok(Some(value).filter(|v| !v.is_zero()))
+                }
+                CachedStatus::Cached(value) => {
+                    if let Some(stats) = &self.cache_stats {
+                        stats.record_storage_hit();
+                    }
                    Ok(Some(value).filter(|v| !v.is_zero()))
                }
            }
        } else if let Some(value) = self.caches.0.storage_cache.get(&(account, storage_key)) {
            self.metrics.storage_cache_hits.increment(1);
+            if let Some(stats) = &self.cache_stats {
+                stats.record_storage_hit();
+            }
            Ok(Some(value).filter(|v| !v.is_zero()))
        } else {
            self.metrics.storage_cache_misses.increment(1);
+            if let Some(stats) = &self.cache_stats {
+                stats.record_storage_miss();
+            }
            self.state_provider.storage(account, storage_key)
        }
    }
@@ -359,13 +474,31 @@ impl<S: BytecodeReader, const PREWARM: bool> BytecodeReader for CachedStateProvi
            match self.caches.get_or_try_insert_code_with(*code_hash, || {
                self.state_provider.bytecode_by_hash(code_hash)
            })? {
-                CachedStatus::NotCached(code) | CachedStatus::Cached(code) => Ok(code),
+                // During prewarm we only record stats (not Prometheus metrics)
+                CachedStatus::NotCached(code) => {
+                    if let Some(stats) = &self.cache_stats {
+                        stats.record_code_miss();
+                    }
+                    Ok(code)
+                }
+                CachedStatus::Cached(code) => {
+                    if let Some(stats) = &self.cache_stats {
+                        stats.record_code_hit();
+                    }
+                    Ok(code)
+                }
            }
        } else if let Some(code) = self.caches.0.code_cache.get(code_hash) {
            self.metrics.code_cache_hits.increment(1);
+            if let Some(stats) = &self.cache_stats {
+                stats.record_code_hit();
+            }
            Ok(code)
        } else {
            self.metrics.code_cache_misses.increment(1);
+            if let Some(stats) = &self.cache_stats {
+                stats.record_code_miss();
+            }
            self.state_provider.bytecode_by_hash(code_hash)
        }
    }
@@ -707,7 +840,8 @@ impl ExecutionCache {
                }

                self.0.account_cache.remove(addr);
-                continue
+                self.0.account_stats.decrement_size();
+                continue;
            }

            // If we have an account that was modified, but it has a `None` account info, some wild
@@ -837,8 +971,10 @@ impl SavedCache {
        self.caches.update_metrics(&self.metrics);
    }

-    /// Clears all caches, resetting them to empty state.
-    pub(crate) fn clear(&self) {
+    /// Clears all caches, resetting them to empty state,
+    /// and updates the hash of the block this cache belongs to.
+    pub(crate) fn clear_with_hash(&mut self, hash: B256) {
+        self.hash = hash;
        self.caches.clear();
    }
 }
@@ -1085,4 +1221,20 @@ mod tests {
        assert!(caches.0.account_cache.get(&addr1).is_none());
        assert!(caches.0.account_cache.get(&addr2).is_some());
    }
+
+    #[test]
+    fn test_code_cache_capacity_with_default_budget() {
+        // Default cross-block cache is 4 GB; code gets 5.56% = ~228 MB.
+        let total_cache_size = 4 * 1024 * 1024 * 1024; // 4 GB
+        let code_budget = (total_cache_size * 556) / 10000; // 228 MB
+
+        let capacity = ExecutionCache::bytes_to_entries(code_budget, CODE_CACHE_ENTRY_SIZE);
+
+        // With ESTIMATED_AVG_CODE_SIZE (8 KiB) we expect 16384 entries.
+        // If someone accidentally reverts to MAX_CODE_SIZE (48 KiB), this would drop to 4096.
+        assert_eq!(
+            capacity, 16384,
+            "code cache should have 16384 entries with default 4 GB budget"
+        );
+    }
 }
--- a/crates/engine/tree/src/tree/instrumented_state.rs
+++ b/crates/engine/tree/src/tree/instrumented_state.rs
@@ -13,7 +13,10 @@ use reth_trie::{
    MultiProofTargets, StorageMultiProof, StorageProof, TrieInput,
 };
 use std::{
-    sync::atomic::{AtomicU64, Ordering},
+    sync::{
+        atomic::{AtomicU64, AtomicUsize, Ordering},
+        Arc,
+    },
    time::Duration,
 };

@@ -33,11 +36,6 @@ pub(crate) struct AtomicDuration {
 }

 impl AtomicDuration {
-    /// Returns a zero duration.
-    pub(crate) const fn zero() -> Self {
-        Self { nanos: AtomicU64::new(0) }
-    }
-
    /// Returns the duration as a [`Duration`]
    pub(crate) fn duration(&self) -> Duration {
        let nanos = self.nanos.load(Ordering::Relaxed);
@@ -63,18 +61,10 @@ impl AtomicDuration {
 pub struct InstrumentedStateProvider<S> {
    /// The state provider
    state_provider: S,
-
-    /// Metrics for the instrumented state provider
+    /// Prometheus metrics for the instrumented state provider
    metrics: StateProviderMetrics,
-
-    /// The total time we spend fetching storage over the lifetime of this state provider
-    total_storage_fetch_latency: AtomicDuration,
-
-    /// The total time we spend fetching code over the lifetime of this state provider
-    total_code_fetch_latency: AtomicDuration,
-
-    /// The total time we spend fetching accounts over the lifetime of this state provider
-    total_account_fetch_latency: AtomicDuration,
+    /// Shared fetch statistics, readable after the provider is consumed.
+    stats: Arc<StateProviderStats>,
 }

 impl<S> InstrumentedStateProvider<S>
@@ -87,58 +77,33 @@ where
        Self {
            state_provider,
            metrics: StateProviderMetrics::new_with_labels(&[("source", source)]),
-            total_storage_fetch_latency: AtomicDuration::zero(),
-            total_code_fetch_latency: AtomicDuration::zero(),
-            total_account_fetch_latency: AtomicDuration::zero(),
+            stats: Arc::new(StateProviderStats::default()),
        }
    }
-}

-impl<S> InstrumentedStateProvider<S> {
-    /// Records the latency for a storage fetch, and increments the duration counter for the storage
-    /// fetch.
-    fn record_storage_fetch(&self, latency: Duration) {
-        self.metrics.storage_fetch_latency.record(latency);
-        self.total_storage_fetch_latency.add_duration(latency);
-    }
-
-    /// Records the latency for a code fetch, and increments the duration counter for the code
-    /// fetch.
-    fn record_code_fetch(&self, latency: Duration) {
-        self.metrics.code_fetch_latency.record(latency);
-        self.total_code_fetch_latency.add_duration(latency);
-    }
-
-    /// Records the latency for an account fetch, and increments the duration counter for the
-    /// account fetch.
-    fn record_account_fetch(&self, latency: Duration) {
-        self.metrics.account_fetch_latency.record(latency);
-        self.total_account_fetch_latency.add_duration(latency);
-    }
-
-    /// Records the total latencies into their respective gauges and histograms.
-    pub(crate) fn record_total_latency(&self) {
-        let total_storage_fetch_latency = self.total_storage_fetch_latency.duration();
-        self.metrics.total_storage_fetch_latency.record(total_storage_fetch_latency);
-        self.metrics
-            .total_storage_fetch_latency_gauge
-            .set(total_storage_fetch_latency.as_secs_f64());
-
-        let total_code_fetch_latency = self.total_code_fetch_latency.duration();
-        self.metrics.total_code_fetch_latency.record(total_code_fetch_latency);
-        self.metrics.total_code_fetch_latency_gauge.set(total_code_fetch_latency.as_secs_f64());
-
-        let total_account_fetch_latency = self.total_account_fetch_latency.duration();
-        self.metrics.total_account_fetch_latency.record(total_account_fetch_latency);
-        self.metrics
-            .total_account_fetch_latency_gauge
-            .set(total_account_fetch_latency.as_secs_f64());
+    /// Returns a shared handle (an `Arc` clone) to the accumulated fetch statistics.
+    pub fn stats(&self) -> Arc<StateProviderStats> {
+        Arc::clone(&self.stats)
    }
 }

 impl<S> Drop for InstrumentedStateProvider<S> {
    fn drop(&mut self) {
-        self.record_total_latency();
+        let total_storage_fetch_latency = self.stats.total_storage_fetch_latency.duration();
+        self.metrics.total_storage_fetch_latency.record(total_storage_fetch_latency);
+        self.metrics
+            .total_storage_fetch_latency_gauge
+            .set(total_storage_fetch_latency.as_secs_f64());
+
+        let total_code_fetch_latency = self.stats.total_code_fetch_latency.duration();
+        self.metrics.total_code_fetch_latency.record(total_code_fetch_latency);
+        self.metrics.total_code_fetch_latency_gauge.set(total_code_fetch_latency.as_secs_f64());
+
+        let total_account_fetch_latency = self.stats.total_account_fetch_latency.duration();
+        self.metrics.total_account_fetch_latency.record(total_account_fetch_latency);
+        self.metrics
+            .total_account_fetch_latency_gauge
+            .set(total_account_fetch_latency.as_secs_f64());
    }
 }

@@ -183,7 +148,10 @@ impl<S: AccountReader> AccountReader for InstrumentedStateProvider<S> {
    fn basic_account(&self, address: &Address) -> ProviderResult<Option<Account>> {
        let start = Instant::now();
        let res = self.state_provider.basic_account(address);
-        self.record_account_fetch(start.elapsed());
+        let elapsed = start.elapsed();
+        self.metrics.account_fetch_latency.record(elapsed);
+        self.stats.total_account_fetches.fetch_add(1, Ordering::Relaxed);
+        self.stats.total_account_fetch_latency.add_duration(elapsed);
        res
    }
 }
@@ -196,7 +164,10 @@ impl<S: StateProvider> StateProvider for InstrumentedStateProvider<S> {
    ) -> ProviderResult<Option<StorageValue>> {
        let start = Instant::now();
        let res = self.state_provider.storage(account, storage_key);
-        self.record_storage_fetch(start.elapsed());
+        let elapsed = start.elapsed();
+        self.metrics.storage_fetch_latency.record(elapsed);
+        self.stats.total_storage_fetches.fetch_add(1, Ordering::Relaxed);
+        self.stats.total_storage_fetch_latency.add_duration(elapsed);
        res
    }
 }
@@ -205,7 +176,17 @@ impl<S: BytecodeReader> BytecodeReader for InstrumentedStateProvider<S> {
    fn bytecode_by_hash(&self, code_hash: &B256) -> ProviderResult<Option<Bytecode>> {
        let start = Instant::now();
        let res = self.state_provider.bytecode_by_hash(code_hash);
-        self.record_code_fetch(start.elapsed());
+        let elapsed = start.elapsed();
+        self.metrics.code_fetch_latency.record(elapsed);
+        self.stats.total_code_fetches.fetch_add(1, Ordering::Relaxed);
+        self.stats.total_code_fetch_latency.add_duration(elapsed);
+        self.stats.total_code_fetched_bytes.fetch_add(
+            res.as_ref()
+                .ok()
+                .and_then(|code| code.as_ref().map(|code| code.len()))
+                .unwrap_or_default(),
+            Ordering::Relaxed,
+        );
        res
    }
 }
@@ -308,3 +289,56 @@ impl<S: HashedPostStateProvider> HashedPostStateProvider for InstrumentedStatePr
        self.state_provider.hashed_post_state(bundle_state)
    }
 }
+
+/// Accumulated fetch statistics from an [`InstrumentedStateProvider`].
+///
+/// Shared via `Arc` so statistics can be read after the provider is consumed.
+#[derive(Debug, Default)]
+pub struct StateProviderStats {
+    total_storage_fetches: AtomicUsize,
+    total_storage_fetch_latency: AtomicDuration,
+
+    total_code_fetches: AtomicUsize,
+    total_code_fetch_latency: AtomicDuration,
+    total_code_fetched_bytes: AtomicUsize,
+
+    total_account_fetches: AtomicUsize,
+    total_account_fetch_latency: AtomicDuration,
+}
+
+impl StateProviderStats {
+    /// Returns total number of storage fetches.
+    pub fn total_storage_fetches(&self) -> usize {
+        self.total_storage_fetches.load(Ordering::Relaxed)
+    }
+
+    /// Returns total time spent on storage fetches.
+    pub fn total_storage_fetch_latency(&self) -> Duration {
+        self.total_storage_fetch_latency.duration()
+    }
+
+    /// Returns total number of code fetches.
+    pub fn total_code_fetches(&self) -> usize {
+        self.total_code_fetches.load(Ordering::Relaxed)
+    }
+
+    /// Returns total time spent on code fetches.
+    pub fn total_code_fetch_latency(&self) -> Duration {
+        self.total_code_fetch_latency.duration()
+    }
+
+    /// Returns total amount of code fetched, in bytes.
+    pub fn total_code_fetched_bytes(&self) -> usize {
+        self.total_code_fetched_bytes.load(Ordering::Relaxed)
+    }
+
+    /// Returns total number of account fetches.
+    pub fn total_account_fetches(&self) -> usize {
+        self.total_account_fetches.load(Ordering::Relaxed)
+    }
+
+    /// Returns total time spent on account fetches.
+    pub fn total_account_fetch_latency(&self) -> Duration {
+        self.total_account_fetch_latency.duration()
+    }
+}
--- a/crates/engine/tree/src/tree/mod.rs
+++ b/crates/engine/tree/src/tree/mod.rs
@@ -13,13 +13,13 @@ use alloy_rpc_types_engine::{
 };
 use error::{InsertBlockError, InsertBlockFatalError};
 use reth_chain_state::{
-    CanonicalInMemoryState, ComputedTrieData, ExecutedBlock, MemoryOverlayStateProvider,
-    NewCanonicalChain,
+    CanonicalInMemoryState, ComputedTrieData, ExecutedBlock, ExecutionTimingStats,
+    MemoryOverlayStateProvider, NewCanonicalChain,
 };
 use reth_consensus::{Consensus, FullConsensus};
 use reth_engine_primitives::{
    BeaconEngineMessage, BeaconOnNewPayloadError, ConsensusEngineEvent, ExecutionPayload,
-    ForkchoiceStateTracker, NewPayloadTimings, OnForkChoiceUpdated,
+    ForkchoiceStateTracker, NewPayloadTimings, OnForkChoiceUpdated, SlowBlockInfo,
 };
 use reth_errors::{ConsensusError, ProviderResult};
 use reth_evm::ConfigureEvm;
@@ -42,7 +42,7 @@ use reth_tasks::{spawn_os_thread, utils::increase_thread_priority};
 use reth_trie_db::ChangesetCache;
 use revm::interpreter::debug_unreachable;
 use state::TreeState;
-use std::{fmt::Debug, ops, sync::Arc, time::Duration};
+use std::{collections::HashMap, fmt::Debug, ops, sync::Arc, time::Duration};

 use crossbeam_channel::{Receiver, Sender};
 use tokio::sync::{
@@ -65,7 +65,7 @@ pub mod precompile_cache;
 mod tests;
 mod trie_updates;

-use crate::tree::error::AdvancePersistenceError;
+use crate::{persistence::PersistenceResult, tree::error::AdvancePersistenceError};
 pub use block_buffer::BlockBuffer;
 pub use cached_state::{CachedStateMetrics, CachedStateProvider, ExecutionCache, SavedCache};
 pub use invalid_headers::InvalidHeaderCache;
@@ -273,6 +273,10 @@ where
    evm_config: C,
    /// Changeset cache for in-memory trie changesets
    changeset_cache: ChangesetCache,
+    /// Timing statistics for executed blocks, keyed by block hash.
+    /// Stored here (not in `ExecutedBlock`) to avoid leaking observability concerns into the block
+    /// type. Entries are removed when blocks are persisted or invalidated.
+    execution_timing_stats: HashMap<B256, Box<ExecutionTimingStats>>,
    /// Whether the node uses hashed state as canonical storage (v2 mode).
    /// Cached at construction to avoid threading `StorageSettingsCache` bounds everywhere.
    use_hashed_state: bool,
@@ -303,6 +307,7 @@ where
            .field("engine_kind", &self.engine_kind)
            .field("evm_config", &self.evm_config)
            .field("changeset_cache", &self.changeset_cache)
+            .field("execution_timing_stats", &self.execution_timing_stats.len())
            .field("use_hashed_state", &self.use_hashed_state)
            .field("runtime", &self.runtime)
            .finish()
@@ -367,6 +372,7 @@ where
            engine_kind,
            evm_config,
            changeset_cache,
+            execution_timing_stats: HashMap::new(),
            use_hashed_state,
            runtime,
        }
@@ -503,8 +509,8 @@ where
                recv(persistence_rx) -> result => {
                    // Don't put it back - consumed (oneshot-like behavior)
                    match result {
-                        Ok(value) => LoopEvent::PersistenceComplete {
-                            result: value,
+                        Ok(result) => LoopEvent::PersistenceComplete {
+                            result,
                            start_time,
                        },
                        Err(_) => LoopEvent::Disconnected,
@@ -1369,15 +1375,16 @@ where
    /// Handles a completed persistence task.
    fn on_persistence_complete(
        &mut self,
-        last_persisted_hash_num: Option<BlockNumHash>,
+        result: PersistenceResult,
        start_time: Instant,
    ) -> Result<(), AdvancePersistenceError> {
        self.metrics.engine.persistence_duration.record(start_time.elapsed());

+        let commit_duration = result.commit_duration;
        let Some(BlockNumHash {
            hash: last_persisted_block_hash,
            number: last_persisted_block_number,
-        }) = last_persisted_hash_num
+        }) = result.last_block
        else {
            // if this happened, then we persisted no blocks because we sent an empty vec of blocks
            warn!(target: "engine::tree", "Persistence task completed but did not persist any blocks");
@@ -1423,6 +1430,8 @@ where
            });
        }

+        self.purge_timing_stats(last_persisted_block_number, commit_duration);
+
        Ok(())
    }

@@ -1728,6 +1737,7 @@ where

        // remove all buffered blocks below the backfill height
        self.state.buffer.remove_old_blocks(backfill_height);
+        self.purge_timing_stats(backfill_height, None);
        // we remove all entries because now we're synced to the backfill target and consider this
        // the canonical chain
        self.canonical_in_memory_state.clear_state();
@@ -1861,6 +1871,43 @@ where
        Ok(())
    }

+    /// Removes timing stats for blocks at or below `below_number`.
+    ///
+    /// No-op when detailed block logging is disabled (no stats are recorded in that case).
+    /// When `commit_duration` is provided and a slow block threshold is configured, checks
+    /// each removed block against the threshold and emits a [`ConsensusEngineEvent::SlowBlock`]
+    /// event for blocks that exceed it.
+    fn purge_timing_stats(&mut self, below_number: u64, commit_duration: Option<Duration>) {
+        let threshold = self.config.slow_block_threshold();
+        let check_slow = commit_duration.is_some() && threshold.is_some();
+
+        // Two-pass: collect keys first because emit_event borrows &mut self.
+        let keys_to_remove: Vec<B256> = self
+            .execution_timing_stats
+            .iter()
+            .filter(|(_, stats)| stats.block_number <= below_number)
+            .map(|(k, _)| *k)
+            .collect();
+
+        for key in keys_to_remove {
+            let stats = self.execution_timing_stats.remove(&key).expect("key just found");
+            if check_slow {
+                let commit_dur = commit_duration.expect("checked above");
+                // state_read_duration is already included in execution_duration
+                let total_duration =
+                    stats.execution_duration + stats.state_hash_duration + commit_dur;
+
+                if total_duration > threshold.expect("checked above") {
+                    self.emit_event(ConsensusEngineEvent::SlowBlock(SlowBlockInfo {
+                        stats,
+                        commit_duration: commit_dur,
+                        total_duration,
+                    }));
+                }
+            }
+        }
+    }
+
    /// Emits an outgoing event to the engine.
    fn emit_event(&mut self, event: impl Into<EngineApiEvent<N>>) {
        let event = event.into();
@@ -2173,18 +2220,26 @@ where

    /// Finds any invalid ancestor for the given payload.
    ///
-    /// This function walks up the chain of buffered ancestors from the payload's block
-    /// hash and checks if any ancestor is marked as invalid in the tree state.
+    /// This function first checks if the block itself is in the invalid headers cache (to
+    /// avoid re-executing a known-invalid block). Then it walks up the chain of buffered
+    /// ancestors and checks if any ancestor is marked as invalid.
    ///
    /// The check works by:
-    /// 1. Finding the lowest buffered ancestor for the given block hash
-    /// 2. If the ancestor is the same as the block hash itself, using the parent hash instead
-    /// 3. Checking if this ancestor is in the `invalid_headers` map
+    /// 1. Checking if the block hash itself is in the `invalid_headers` map
+    /// 2. Finding the lowest buffered ancestor for the given block hash
+    /// 3. If the ancestor is the same as the block hash itself, using the parent hash instead
+    /// 4. Checking if this ancestor is in the `invalid_headers` map
    ///
    /// Returns the invalid ancestor block info if found, or None if no invalid ancestor exists.
    fn find_invalid_ancestor(&mut self, payload: &T::ExecutionData) -> Option<BlockWithParent> {
        let parent_hash = payload.parent_hash();
        let block_hash = payload.block_hash();
+
+        // Check if the block itself is already known to be invalid, avoiding re-execution
+        if let Some(entry) = self.state.invalid_headers.get(&block_hash) {
+            return Some(entry);
+        }
+
        let mut lowest_buffered_ancestor = self.lowest_buffered_ancestor_or(block_hash);
        if lowest_buffered_ancestor == block_hash {
            lowest_buffered_ancestor = parent_hash;
@@ -2736,7 +2791,12 @@ where
        &mut self,
        block_id: BlockWithParent,
        input: Input,
-        execute: impl FnOnce(&mut V, Input, TreeCtx<'_, N>) -> Result<ExecutedBlock<N>, Err>,
+        execute: impl FnOnce(
+            &mut V,
+            Input,
+            TreeCtx<'_, N>,
+        )
+            -> Result<(ExecutedBlock<N>, Option<Box<ExecutionTimingStats>>), Err>,
        convert_to_block: impl FnOnce(&mut Self, Input) -> Result<SealedBlock<N::Block>, Err>,
    ) -> Result<InsertPayloadOk, Err>
    where
@@ -2806,7 +2866,12 @@ where

        let start = Instant::now();

-        let executed = execute(&mut self.payload_validator, input, ctx)?;
+        let (executed, timing_stats) = execute(&mut self.payload_validator, input, ctx)?;
+
+        // Store timing stats for detailed block logging after persistence
+        if let Some(stats) = timing_stats {
+            self.execution_timing_stats.insert(executed.recovered_block().hash(), stats);
+        }

        // if the parent is the canonical head, we can insert the block as the pending block
        if self.state.tree_state.canonical_block_hash() == executed.recovered_block().parent_hash()
@@ -3128,8 +3193,8 @@ where
    EngineMessage(FromEngine<EngineApiRequest<T, N>, N::Block>),
    /// A persistence task completed.
    PersistenceComplete {
-        /// The result of the persistence operation.
-        result: Option<BlockNumHash>,
+        /// The unified result of the persistence operation.
+        result: PersistenceResult,
        /// When the persistence operation started.
        start_time: Instant,
    },
--- a/crates/engine/tree/src/tree/payload_processor/mod.rs
+++ b/crates/engine/tree/src/tree/payload_processor/mod.rs
@@ -38,9 +38,7 @@ use reth_trie_parallel::{
    proof_task::{ProofTaskCtx, ProofWorkerHandle},
    root::ParallelStateRootError,
 };
-use reth_trie_sparse::{
-    ParallelSparseTrie, ParallelismThresholds, RevealableSparseTrie, SparseStateTrie,
-};
+use reth_trie_sparse::ParallelismThresholds;
 use std::{
    ops::Not,
    sync::{
@@ -54,19 +52,22 @@ use tracing::{debug, debug_span, instrument, warn, Span};

 pub mod bal;
 pub mod multiproof;
-pub mod post_exec;
 mod preserved_sparse_trie;
 pub mod prewarm;
+pub mod receipt_root_task;
 pub mod sparse_trie;

-use preserved_sparse_trie::{PreservedSparseTrie, SharedPreservedSparseTrie};
+pub use preserved_sparse_trie::{
+    PayloadSparseTrieCache, PayloadSparseTrieKind, PayloadSparseTrieStoreOutcome,
+    SparseTrieCheckout,
+};

 /// Default parallelism thresholds to use with the [`ParallelSparseTrie`].
 ///
 /// These values were determined by performing benchmarks using gradually increasing values to judge
-/// the affects. Below 100 throughput would generally be equal or slightly less, while above 150 it
+/// the effects. Below 100 throughput would generally be equal or slightly less, while above 150 it
 /// would deteriorate to the point where PST might as well not be used.
-pub const PARALLEL_SPARSE_TRIE_PARALLELISM_THRESHOLDS: ParallelismThresholds =
+const PARALLEL_SPARSE_TRIE_PARALLELISM_THRESHOLDS: ParallelismThresholds =
    ParallelismThresholds { min_revealed_nodes: 100, min_updated_nodes: 100 };

 /// Default node capacity for shrinking the sparse trie. This is used to limit the number of trie
@@ -103,6 +104,52 @@ type IteratorPayloadHandle<Evm, I, N> = PayloadHandle<
    <N as NodePrimitives>::Receipt,
 >;

+/// Shared cache handles that can be exported to engine consumers and downstream payload builders.
+#[derive(Debug, Clone)]
+pub struct EngineSharedCaches<Evm: ConfigureEvm> {
+    execution_cache: PayloadExecutionCache,
+    sparse_trie_cache: PayloadSparseTrieCache,
+    precompile_cache_map: PrecompileCacheMap<SpecFor<Evm>>,
+}
+
+impl<Evm> Default for EngineSharedCaches<Evm>
+where
+    Evm: ConfigureEvm,
+{
+    fn default() -> Self {
+        Self::with_sparse_trie_kind(PayloadSparseTrieKind::default())
+    }
+}
+
+impl<Evm> EngineSharedCaches<Evm>
+where
+    Evm: ConfigureEvm,
+{
+    /// Creates shared caches backed by the requested sparse trie implementation.
+    pub fn with_sparse_trie_kind(sparse_trie_kind: PayloadSparseTrieKind) -> Self {
+        Self {
+            execution_cache: Default::default(),
+            sparse_trie_cache: PayloadSparseTrieCache::new(sparse_trie_kind),
+            precompile_cache_map: Default::default(),
+        }
+    }
+
+    /// Returns the shared execution cache handle for engine-internal use.
+    pub(crate) fn execution_cache(&self) -> PayloadExecutionCache {
+        self.execution_cache.clone()
+    }
+
+    /// Returns the shared sparse trie cache handle.
+    pub fn sparse_trie_cache(&self) -> PayloadSparseTrieCache {
+        self.sparse_trie_cache.clone()
+    }
+
+    /// Returns the shared precompile cache map.
+    pub fn precompile_cache_map(&self) -> PrecompileCacheMap<SpecFor<Evm>> {
+        self.precompile_cache_map.clone()
+    }
+}
+
 /// Entrypoint for executing the payload.
 #[derive(Debug)]
 pub struct PayloadProcessor<Evm>
@@ -111,8 +158,8 @@ where
 {
    /// The executor used to spawn tasks.
    executor: Runtime,
-    /// The most recent cache used for execution.
-    execution_cache: PayloadExecutionCache,
+    /// Shared caches reused across payload processing.
+    shared_caches: EngineSharedCaches<Evm>,
    /// Metrics for trie operations
    trie_metrics: MultiProofTaskMetrics,
    /// Cross-block cache size in bytes.
@@ -125,16 +172,10 @@ where
    evm_config: Evm,
    /// Whether precompile cache should be disabled.
    precompile_cache_disabled: bool,
-    /// Precompile cache map.
-    precompile_cache_map: PrecompileCacheMap<SpecFor<Evm>>,
-    /// A pruned `SparseStateTrie`, kept around as a cache of already revealed trie nodes and to
-    /// re-use allocated memory. Stored with the block hash it was computed for to enable trie
-    /// preservation across sequential payload validations.
-    sparse_state_trie: SharedPreservedSparseTrie,
-    /// Sparse trie prune depth.
-    sparse_trie_prune_depth: usize,
-    /// Maximum storage tries to retain after pruning.
-    sparse_trie_max_storage_tries: usize,
+    /// LFU hot-slot capacity: max storage slots retained across prune cycles.
+    sparse_trie_max_hot_slots: usize,
+    /// LFU hot-account capacity: max account addresses retained across prune cycles.
+    sparse_trie_max_hot_accounts: usize,
    /// Whether sparse trie cache pruning is fully disabled.
    disable_sparse_trie_cache_pruning: bool,
    /// Whether to disable cache metrics recording.
@@ -156,31 +197,23 @@ where
        executor: Runtime,
        evm_config: Evm,
        config: &TreeConfig,
-        precompile_cache_map: PrecompileCacheMap<SpecFor<Evm>>,
+        shared_caches: EngineSharedCaches<Evm>,
    ) -> Self {
        Self {
            executor,
-            execution_cache: Default::default(),
+            shared_caches,
            trie_metrics: Default::default(),
            cross_block_cache_size: config.cross_block_cache_size(),
            disable_transaction_prewarming: config.disable_prewarming(),
            evm_config,
            disable_state_cache: config.disable_state_cache(),
            precompile_cache_disabled: config.precompile_cache_disabled(),
-            precompile_cache_map,
-            sparse_state_trie: SharedPreservedSparseTrie::default(),
-            sparse_trie_prune_depth: config.sparse_trie_prune_depth(),
-            sparse_trie_max_storage_tries: config.sparse_trie_max_storage_tries(),
+            sparse_trie_max_hot_slots: config.sparse_trie_max_hot_slots(),
+            sparse_trie_max_hot_accounts: config.sparse_trie_max_hot_accounts(),
            disable_sparse_trie_cache_pruning: config.disable_sparse_trie_cache_pruning(),
            disable_cache_metrics: config.disable_cache_metrics(),
        }
    }
-
-    /// Creates a new post-execution handle for a block, immediately spawning the
-    /// single event-driven post-exec background worker.
-    pub fn post_exec_handle(&self, receipts_len: usize) -> post_exec::PostExecHandle<N::Receipt> {
-        post_exec::PostExecHandle::new(&self.executor, receipts_len)
-    }
 }

 impl<Evm> WaitForCaches for PayloadProcessor<Evm>
@@ -191,8 +224,8 @@ where
        debug!(target: "engine::tree::payload_processor", "Waiting for execution cache and sparse trie locks");

        // Wait for both caches in parallel
-        let execution_cache = self.execution_cache.clone();
-        let sparse_trie = self.sparse_state_trie.clone();
+        let execution_cache = self.shared_caches.execution_cache();
+        let sparse_trie = self.shared_caches.sparse_trie_cache();

        // Use channels and spawn_blocking instead of std::thread::spawn
        let (execution_tx, execution_rx) = std::sync::mpsc::channel();
@@ -357,6 +390,8 @@ where
        let (to_multi_proof, from_multi_proof) = crossbeam_channel::unbounded();

        let task_ctx = ProofTaskCtx::new(multiproof_provider_factory);
+        #[cfg(feature = "trie-debug")]
+        let task_ctx = task_ctx.with_proof_jitter(config.proof_jitter());
        let halve_workers = env.transaction_count <= Self::SMALL_BLOCK_PROOF_WORKER_TX_THRESHOLD;
        let proof_handle = ProofWorkerHandle::new(&self.executor, task_ctx, halve_workers);

@@ -504,12 +539,12 @@ where
            terminate_execution: Arc::new(AtomicBool::new(false)),
            executed_tx_index: Arc::clone(&executed_tx_index),
            precompile_cache_disabled: self.precompile_cache_disabled,
-            precompile_cache_map: self.precompile_cache_map.clone(),
+            precompile_cache_map: self.shared_caches.precompile_cache_map(),
        };

        let (prewarm_task, to_prewarm_task) = PrewarmCacheTask::new(
            self.executor.clone(),
-            self.execution_cache.clone(),
+            self.shared_caches.execution_cache(),
            prewarm_ctx,
            to_multi_proof,
        );
@@ -537,7 +572,7 @@ where
    /// instance.
    #[instrument(level = "debug", target = "engine::caching", skip(self))]
    fn cache_for(&self, parent_hash: B256) -> SavedCache {
-        if let Some(cache) = self.execution_cache.get_cache_for(parent_hash) {
+        if let Some(cache) = self.shared_caches.execution_cache().get_cache_for(parent_hash) {
            debug!("reusing execution cache");
            cache
        } else {
@@ -562,54 +597,33 @@ where
        parent_state_root: B256,
        chunk_size: usize,
    ) {
-        let preserved_sparse_trie = self.sparse_state_trie.clone();
+        let sparse_trie_cache = self.shared_caches.sparse_trie_cache();
        let trie_metrics = self.trie_metrics.clone();
-        let prune_depth = self.sparse_trie_prune_depth;
-        let max_storage_tries = self.sparse_trie_max_storage_tries;
+        let max_hot_slots = self.sparse_trie_max_hot_slots;
+        let max_hot_accounts = self.sparse_trie_max_hot_accounts;
        let disable_cache_pruning = self.disable_sparse_trie_cache_pruning;
        let executor = self.executor.clone();

        let parent_span = Span::current();
        self.executor.spawn_blocking_named("sparse-trie", move || {
-            reth_tasks::once!(increase_thread_priority());
+            reth_tasks::once!(increase_thread_priority);

            let _enter = debug_span!(target: "engine::tree::payload_processor", parent: parent_span, "sparse_trie_task")
                .entered();

-            // Reuse a stored SparseStateTrie if available, applying continuation logic.
-            // If this payload's parent state root matches the preserved trie's anchor,
-            // we can reuse the pruned trie structure. Otherwise, we clear the trie but
-            // keep allocations.
            let start = Instant::now();
-            let preserved = preserved_sparse_trie.take();
+            let mut checkout = sparse_trie_cache.take_or_create_for(parent_state_root);
            trie_metrics
                .sparse_trie_cache_wait_duration_histogram
                .record(start.elapsed().as_secs_f64());
+            checkout.set_hot_cache_capacities(max_hot_slots, max_hot_accounts);

-            let sparse_state_trie = preserved
-                .map(|preserved| preserved.into_trie_for(parent_state_root))
-                .unwrap_or_else(|| {
-                    debug!(
-                        target: "engine::tree::payload_processor",
-                        "Creating new sparse trie - no preserved trie available"
-                    );
-                    let default_trie = RevealableSparseTrie::blind_from(
-                        ParallelSparseTrie::default().with_parallelism_thresholds(
-                            PARALLEL_SPARSE_TRIE_PARALLELISM_THRESHOLDS,
-                        ),
-                    );
-                    SparseStateTrie::new()
-                        .with_accounts_trie(default_trie.clone())
-                        .with_default_storage_trie(default_trie)
-                        .with_updates(true)
-                });
-
-            let mut task = SparseTrieCacheTask::new_with_trie(
+            let mut task = SparseTrieCacheTask::new_with_checkout(
                &executor,
                from_multi_proof,
                proof_worker_handle,
                trie_metrics.clone(),
-                sparse_state_trie,
+                checkout,
                chunk_size,
            );

@@ -620,7 +634,7 @@ where
            // causing take() to return None and forcing it to create a new empty trie
            // instead of reusing the preserved one. Holding the guard ensures the next
            // block's take() blocks until we've stored the trie for reuse.
-            let mut guard = preserved_sparse_trie.lock();
+            let mut guard = sparse_trie_cache.lock();

            let task_result = result.as_ref().ok().cloned();
            // Send state root computation result - next block may start but will block on take()
@@ -635,10 +649,9 @@ where
                    SPARSE_TRIE_MAX_NODES_SHRINK_CAPACITY,
                    SPARSE_TRIE_MAX_VALUES_SHRINK_CAPACITY,
                );
-                guard.store(PreservedSparseTrie::cleared(trie));
-                // Drop guard before deferred to release lock before expensive deallocations
+                trie.store_prepared_cleared_with_guard(&mut guard);
                drop(guard);
-                drop(deferred);
+                executor.spawn_drop(deferred);
                return;
            }

@@ -649,8 +662,8 @@ where
            let deferred = if let Some(result) = task_result {
                let start = Instant::now();
                let (trie, deferred) = task.into_trie_for_reuse(
-                    prune_depth,
-                    max_storage_tries,
+                    max_hot_slots,
+                    max_hot_accounts,
                    SPARSE_TRIE_MAX_NODES_SHRINK_CAPACITY,
                    SPARSE_TRIE_MAX_VALUES_SHRINK_CAPACITY,
                    disable_cache_pruning,
@@ -665,7 +678,7 @@ where
                trie_metrics
                    .sparse_trie_retained_storage_tries
                    .set(trie.retained_storage_tries_count() as f64);
-                guard.store(PreservedSparseTrie::anchored(trie, result.state_root));
+                trie.store_anchored_with_guard(&mut guard, result.state_root);
                deferred
            } else {
                debug!(
@@ -676,12 +689,11 @@ where
                    SPARSE_TRIE_MAX_NODES_SHRINK_CAPACITY,
                    SPARSE_TRIE_MAX_VALUES_SHRINK_CAPACITY,
                );
-                guard.store(PreservedSparseTrie::cleared(trie));
+                trie.store_prepared_cleared_with_guard(&mut guard);
                deferred
            };
-            // Drop guard before deferred to release lock before expensive deallocations
            drop(guard);
-            drop(deferred);
+            executor.spawn_drop(deferred);
        });
    }

@@ -698,7 +710,7 @@ where
        bundle_state: &BundleState,
    ) {
        let disable_cache_metrics = self.disable_cache_metrics;
-        self.execution_cache.update_with_guard(|cached| {
+        self.shared_caches.execution_cache().update_with_guard(|cached| {
            if cached.as_ref().is_some_and(|c| c.executed_block_hash() != block_with_parent.parent) {
                debug!(
                    target: "engine::caching",
@@ -1002,7 +1014,7 @@ impl<R> Drop for CacheTaskHandle<R> {
 /// - Prepares data for state root proof computation
 /// - Runs concurrently but must not interfere with cache saves
 #[derive(Clone, Debug, Default)]
-pub struct PayloadExecutionCache {
+pub(crate) struct PayloadExecutionCache {
    /// Guarded cloneable cache identified by a block hash.
    inner: Arc<RwLock<Option<SavedCache>>>,
    /// Metrics for cache operations.
@@ -1010,15 +1022,15 @@ pub struct PayloadExecutionCache {
 }

 impl PayloadExecutionCache {
-    /// Returns the cache for `parent_hash` if it's available for use.
+    /// Returns the cache backing store for `parent_hash` if it's available for reuse.
    ///
-    /// A cache is considered available when:
-    /// - It exists and matches the requested parent hash
-    /// - No other tasks are currently using it (checked via Arc reference count)
+    /// If the tracked cache is available but keyed to a different parent hash, the cache is
+    /// cleared and returned so callers can reuse the underlying allocations without carrying over
+    /// stale state.
    #[instrument(level = "debug", target = "engine::tree::payload_processor", skip(self))]
    pub(crate) fn get_cache_for(&self, parent_hash: B256) -> Option<SavedCache> {
        let start = Instant::now();
-        let cache = self.inner.read();
+        let mut cache = self.inner.write();

        let elapsed = start.elapsed();
        self.metrics.execution_cache_wait_duration.record(elapsed.as_secs_f64());
@@ -1026,7 +1038,7 @@ impl PayloadExecutionCache {
            warn!(blocked_for=?elapsed, "Blocked waiting for execution cache mutex");
        }

-        if let Some(c) = cache.as_ref() {
+        if let Some(c) = cache.as_mut() {
            let cached_hash = c.executed_block_hash();
            // Check that the cache hash matches the parent hash of the current block. It won't
            // match in case it's a fork block.
@@ -1047,13 +1059,13 @@ impl PayloadExecutionCache {
            );

            if available {
-                // If the has is available (no other threads are using it), but has a mismatching
-                // parent hash, we can just clear it and keep using without re-creating from
-                // scratch.
                if !hash_matches {
-                    c.clear();
+                    // Fork block: clear and update the hash on the ORIGINAL before cloning.
+                    // This prevents the canonical chain from matching on the stale hash
+                    // and picking up polluted data if the fork block fails.
+                    c.clear_with_hash(parent_hash);
                }
-                return Some(c.clone())
+                return Some(c.clone());
            } else if hash_matches {
                self.metrics.execution_cache_in_use.increment(1);
            }
@@ -1064,19 +1076,13 @@ impl PayloadExecutionCache {
        None
    }

-    /// Clears the tracked cache
-    #[expect(unused)]
-    pub(crate) fn clear(&self) {
-        self.inner.write().take();
-    }
-
    /// Waits until the execution cache becomes available for use.
    ///
    /// This acquires a write lock to ensure exclusive access, then immediately releases it.
    /// This is useful for synchronization before starting payload processing.
    ///
    /// Returns the time spent waiting for the lock.
-    pub fn wait_for_availability(&self) -> Duration {
+    pub(crate) fn wait_for_availability(&self) -> Duration {
        let start = Instant::now();
        // Acquire write lock to wait for any current holders to finish
        let _guard = self.inner.write();
@@ -1104,7 +1110,7 @@ impl PayloadExecutionCache {
    ///
    /// Violating this requirement can result in cache corruption, incorrect state data,
    /// and potential consensus failures.
-    pub fn update_with_guard<F>(&self, update_fn: F)
+    pub(crate) fn update_with_guard<F>(&self, update_fn: F)
    where
        F: FnOnce(&mut Option<SavedCache>),
    {
@@ -1173,8 +1179,9 @@ mod tests {
    use super::PayloadExecutionCache;
    use crate::tree::{
        cached_state::{CachedStateMetrics, ExecutionCache, SavedCache},
-        payload_processor::{evm_state_to_hashed_post_state, ExecutionEnv, PayloadProcessor},
-        precompile_cache::PrecompileCacheMap,
+        payload_processor::{
+            evm_state_to_hashed_post_state, EngineSharedCaches, ExecutionEnv, PayloadProcessor,
+        },
        StateProviderBuilder, TreeConfig,
    };
    use alloy_eips::eip1898::{BlockNumHash, BlockWithParent};
@@ -1247,10 +1254,18 @@ mod tests {

        execution_cache.update_with_guard(|slot| *slot = Some(make_saved_cache(hash)));

-        // When the parent hash doesn't match, the cache is cleared and returned for reuse
+        // When the parent hash doesn't match (fork block), the cache is cleared, the hash
+        // is updated on the original, and a clone is returned for reuse.
        let different_hash = B256::from([4u8; 32]);
        let cache = execution_cache.get_cache_for(different_hash);
-        assert!(cache.is_some(), "cache should be returned for reuse after clearing")
+        assert!(cache.is_some(), "cache should be returned for reuse after clearing");
+
+        drop(cache);
+
+        // The stored cache now has the fork block's parent hash.
+        // Canonical chain looking for original hash sees a mismatch → clears and reuses.
+        let original = execution_cache.get_cache_for(hash);
+        assert!(original.is_some(), "canonical chain gets cache back via mismatch+clear");
    }

    #[test]
@@ -1278,7 +1293,7 @@ mod tests {
            reth_tasks::Runtime::test(),
            EthEvmConfig::new(Arc::new(ChainSpec::default())),
            &TreeConfig::default(),
-            PrecompileCacheMap::default(),
+            EngineSharedCaches::default(),
        );

        let parent_hash = B256::from([1u8; 32]);
@@ -1290,13 +1305,17 @@ mod tests {
        let bundle_state = BundleState::default();

        // Cache should be empty initially
-        assert!(payload_processor.execution_cache.get_cache_for(block_hash).is_none());
+        assert!(payload_processor
+            .shared_caches
+            .execution_cache()
+            .get_cache_for(block_hash)
+            .is_none());

        // Update cache with inserted block
        payload_processor.on_inserted_executed_block(block_with_parent, &bundle_state);

        // Cache should now exist for the block hash
-        let cached = payload_processor.execution_cache.get_cache_for(block_hash);
+        let cached = payload_processor.shared_caches.execution_cache().get_cache_for(block_hash);
        assert!(cached.is_some());
        assert_eq!(cached.unwrap().executed_block_hash(), block_hash);
    }
@@ -1307,13 +1326,14 @@ mod tests {
            reth_tasks::Runtime::test(),
            EthEvmConfig::new(Arc::new(ChainSpec::default())),
            &TreeConfig::default(),
-            PrecompileCacheMap::default(),
+            EngineSharedCaches::default(),
        );

        // Setup: populate cache with block 1
        let block1_hash = B256::from([1u8; 32]);
        payload_processor
-            .execution_cache
+            .shared_caches
+            .execution_cache()
            .update_with_guard(|slot| *slot = Some(make_saved_cache(block1_hash)));

        // Try to insert block 3 with wrong parent (should skip and keep block 1's cache)
@@ -1328,11 +1348,11 @@ mod tests {
        payload_processor.on_inserted_executed_block(block_with_parent, &bundle_state);

        // Cache should still be for block 1 (unchanged)
-        let cached = payload_processor.execution_cache.get_cache_for(block1_hash);
+        let cached = payload_processor.shared_caches.execution_cache().get_cache_for(block1_hash);
        assert!(cached.is_some(), "Original cache should be preserved");

        // Cache for block 3 should not exist
-        let cached3 = payload_processor.execution_cache.get_cache_for(block3_hash);
+        let cached3 = payload_processor.shared_caches.execution_cache().get_cache_for(block3_hash);
        assert!(cached3.is_none(), "New block cache should not be created on mismatch");
    }

@@ -1442,7 +1462,7 @@ mod tests {
            reth_tasks::Runtime::test(),
            EthEvmConfig::new(factory.chain_spec()),
            &TreeConfig::default(),
-            PrecompileCacheMap::default(),
+            EngineSharedCaches::default(),
        );

        let provider_factory = BlockchainProvider::new(factory).unwrap();
@@ -1474,4 +1494,61 @@ mod tests {
            "State root mismatch: task={root_from_task}, base={root_from_regular}"
        );
    }
+
+    /// Tests the full prewarm lifecycle for a fork block:
+    ///
+    /// 1. Cache is at canonical block 4.
+    /// 2. Fork block (parent = block 2) checks out the cache via `get_cache_for`, simulating what
+    ///    `PrewarmCacheTask` does when it receives a `SavedCache`.
+    /// 3. Prewarm populates the shared cache with fork-specific state.
+    /// 4. While the prewarm clone is alive, the cache is unavailable (`usage_guard` > 1).
+    /// 5. Prewarm drops without calling `save_cache` (fork block was invalid).
+    /// 6. Canonical block 5 (parent = block 4) must get a cache with correct hash and no stale fork
+    ///    data.
+    #[test]
+    fn fork_prewarm_dropped_without_save_does_not_corrupt_cache() {
+        let execution_cache = PayloadExecutionCache::default();
+
+        // Canonical chain at block 4.
+        let block4_hash = B256::from([4u8; 32]);
+        execution_cache.update_with_guard(|slot| *slot = Some(make_saved_cache(block4_hash)));
+
+        // Fork block arrives with parent = block 2. Prewarm task checks out the cache.
+        // This simulates PrewarmCacheTask receiving a SavedCache clone from get_cache_for.
+        let fork_parent = B256::from([2u8; 32]);
+        let prewarm_cache = execution_cache.get_cache_for(fork_parent);
+        assert!(prewarm_cache.is_some(), "prewarm should obtain cache for fork block");
+        let prewarm_cache = prewarm_cache.unwrap();
+        assert_eq!(prewarm_cache.executed_block_hash(), fork_parent);
+
+        // Prewarm populates cache with fork-specific state (ancestor data for block 2).
+        // Since ExecutionCache uses Arc<Inner>, this data is shared with the stored original.
+        let fork_addr = Address::from([0xBB; 20]);
+        let fork_key = B256::from([0xCC; 32]);
+        prewarm_cache.cache().insert_storage(fork_addr, fork_key, Some(U256::from(999)));
+
+        // While prewarm holds the clone, the usage_guard count > 1 → cache is in use.
+        let during_prewarm = execution_cache.get_cache_for(block4_hash);
+        assert!(
+            during_prewarm.is_none(),
+            "cache must be unavailable while prewarm holds a reference"
+        );
+
+        // Fork block fails — prewarm task drops without calling save_cache/update_with_guard.
+        drop(prewarm_cache);
+
+        // Canonical block 5 arrives (parent = block 4).
+        // Stored hash = fork_parent (our fix), so get_cache_for sees a mismatch,
+        // clears the stale fork data, and returns a cache with hash = block4_hash.
+        let block5_cache = execution_cache.get_cache_for(block4_hash);
+        assert!(
+            block5_cache.is_some(),
+            "canonical chain must get cache after fork prewarm is dropped"
+        );
+        assert_eq!(
+            block5_cache.as_ref().unwrap().executed_block_hash(),
+            block4_hash,
+            "cache must carry the canonical parent hash, not the fork parent"
+        );
+    }
 }
--- a/crates/engine/tree/src/tree/payload_processor/multiproof.rs
+++ b/crates/engine/tree/src/tree/payload_processor/multiproof.rs
@@ -47,16 +47,6 @@ pub enum MultiProofMessage {
    PrefetchProofs(MultiProofTargetsV2),
    /// New state update from transaction execution with its source
    StateUpdate(Source, EvmState),
-    /// State update that can be applied to the sparse trie without any new proofs.
-    ///
-    /// It can be the case when all accounts and storage slots from the state update were already
-    /// fetched and revealed.
-    EmptyProof {
-        /// The index of this proof in the sequence of state updates
-        sequence_number: u64,
-        /// The state update that was used to calculate the proof
-        state: HashedPostState,
-    },
    /// Pre-hashed state update from BAL conversion that can be applied directly without proofs.
    HashedStateUpdate(HashedPostState),
    /// Block Access List (EIP-7928; BAL) containing complete state changes for the block.
@@ -128,41 +118,6 @@ pub(crate) fn evm_state_to_hashed_post_state(update: EvmState) -> HashedPostStat
 #[derive(Metrics, Clone)]
 #[metrics(scope = "tree.root")]
 pub(crate) struct MultiProofTaskMetrics {
-    /// Histogram of active storage workers processing proofs.
-    pub active_storage_workers_histogram: Histogram,
-    /// Histogram of active account workers processing proofs.
-    pub active_account_workers_histogram: Histogram,
-    /// Gauge for the maximum number of storage workers in the pool.
-    pub max_storage_workers: Gauge,
-    /// Gauge for the maximum number of account workers in the pool.
-    pub max_account_workers: Gauge,
-    /// Histogram of pending storage multiproofs in the queue.
-    pub pending_storage_multiproofs_histogram: Histogram,
-    /// Histogram of pending account multiproofs in the queue.
-    pub pending_account_multiproofs_histogram: Histogram,
-
-    /// Histogram of the number of prefetch proof target accounts.
-    pub prefetch_proof_targets_accounts_histogram: Histogram,
-    /// Histogram of the number of prefetch proof target storages.
-    pub prefetch_proof_targets_storages_histogram: Histogram,
-    /// Histogram of the number of prefetch proof target chunks.
-    pub prefetch_proof_chunks_histogram: Histogram,
-
-    /// Histogram of the number of state update proof target accounts.
-    pub state_update_proof_targets_accounts_histogram: Histogram,
-    /// Histogram of the number of state update proof target storages.
-    pub state_update_proof_targets_storages_histogram: Histogram,
-    /// Histogram of the number of state update proof target chunks.
-    pub state_update_proof_chunks_histogram: Histogram,
-
-    /// Histogram of prefetch proof batch sizes (number of messages merged).
-    pub prefetch_batch_size_histogram: Histogram,
-
-    /// Histogram of proof calculation durations.
-    pub proof_calculation_duration_histogram: Histogram,
-
-    /// Histogram of sparse trie update durations.
-    pub sparse_trie_update_duration_histogram: Histogram,
    /// Histogram of durations spent revealing multiproof results into the sparse trie.
    pub sparse_trie_reveal_multiproof_duration_histogram: Histogram,
    /// Histogram of durations spent coalescing multiple proof results from the channel.
@@ -175,17 +130,6 @@ pub(crate) struct MultiProofTaskMetrics {
    pub sparse_trie_final_update_duration_histogram: Histogram,
    /// Histogram of sparse trie total durations.
    pub sparse_trie_total_duration_histogram: Histogram,
-
-    /// Histogram of state updates received.
-    pub state_updates_received_histogram: Histogram,
-    /// Histogram of proofs processed.
-    pub proofs_processed_histogram: Histogram,
-    /// Histogram of total time spent in the multiproof task.
-    pub multiproof_task_total_duration_histogram: Histogram,
-    /// Total time spent waiting for the first state update or prefetch request.
-    pub first_update_wait_time_histogram: Histogram,
-    /// Total time spent waiting for the last proof result.
-    pub last_proof_wait_time_histogram: Histogram,
    /// Time spent preparing the sparse trie for reuse after state root computation.
    pub into_trie_for_reuse_duration_histogram: Histogram,
    /// Time spent waiting for preserved sparse trie cache to become available.
--- a/crates/engine/tree/src/tree/payload_processor/post_exec.rs
+++ b/crates/engine/tree/src/tree/payload_processor/post_exec.rs
@@ -1,483 +0,0 @@
-//! Per-block post-execution handle for background post-execution artifact computation.
-//!
-//! This module provides [`PostExecHandle`], a block-scoped facade that coordinates
-//! background tasks:
-//!
-//! 1. **Receipt root worker** — spawned at construction via [`Runtime::spawn_blocking_named`].
-//!    Receipts are streamed incrementally during execution; when the channel closes the worker
-//!    finalizes the receipt trie root and aggregated bloom.
-//!
-//! 2. **Hashed post-state task** — spawned by [`PostExecHandle::finish`] so it starts immediately
-//!    after execution, running in parallel with receipt-root finalization.
-//!
-//! 3. **Transaction root task** — spawned by [`PostExecHandle::finish`] for payload blocks,
-//!    computing the transaction trie root in parallel with the other tasks.
-//!
-//! Results are accessed via blocking accessors that wait for the background tasks to complete.
-
-use alloy_eips::Encodable2718;
-use alloy_primitives::{Bloom, B256};
-use crossbeam_channel::Sender as CrossbeamSender;
-use reth_primitives_traits::Receipt;
-use reth_tasks::{LazyHandle, Runtime};
-use reth_trie::HashedPostState;
-use reth_trie_common::ordered_root::OrderedTrieRootEncodedBuilder;
-use std::sync::{Arc, OnceLock};
-use tracing::error;
-
-/// Receipt with index, ready to be sent to the background task for encoding and trie building.
-#[derive(Debug, Clone)]
-pub struct IndexedReceipt<R> {
-    /// The transaction index within the block.
-    pub index: usize,
-    /// The receipt.
-    pub receipt: R,
-}
-
-impl<R> IndexedReceipt<R> {
-    /// Creates a new indexed receipt.
-    #[inline]
-    pub const fn new(index: usize, receipt: R) -> Self {
-        Self { index, receipt }
-    }
-}
-
-/// Block-scoped handle for post-execution background tasks.
-///
-/// Created once per block via [`PostExecHandle::new`], which immediately spawns a
-/// receipt-root background worker. During transaction execution, receipts are streamed
-/// via [`push_receipt`](Self::push_receipt). After execution completes, call
-/// [`finish`](Self::finish) to close the receipt channel and spawn hashed-post-state
-/// and (optionally) transaction-root computation in parallel.
-#[must_use]
-pub struct PostExecHandle<R> {
-    tx: Option<CrossbeamSender<IndexedReceipt<R>>>,
-    receipt_root_bloom: Arc<OnceLock<Option<(B256, Bloom)>>>,
-    hashed_post_state: Option<LazyHandle<HashedPostState>>,
-    transaction_root: Option<LazyHandle<B256>>,
-    executor: Runtime,
-}
-
-impl<R> core::fmt::Debug for PostExecHandle<R> {
-    fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result {
-        f.debug_struct("PostExecHandle").field("finished", &self.tx.is_none()).finish()
-    }
-}
-
-impl<R: Receipt + 'static> PostExecHandle<R> {
-    /// Creates a new handle and immediately spawns the receipt-root background worker.
-    ///
-    /// The worker begins waiting for receipts via the crossbeam channel and builds the
-    /// receipt trie incrementally as they arrive. When the channel closes (via
-    /// [`finish`](Self::finish) or handle drop), the worker finalizes the receipt root.
-    pub fn new(executor: &Runtime, receipts_len: usize) -> Self {
-        let (tx, rx) = crossbeam_channel::unbounded();
-        let receipt_root_bloom = Arc::new(OnceLock::new());
-        let receipt_root_bloom_worker = receipt_root_bloom.clone();
-
-        // Use spawn_blocking_named for consistent thread naming; ignore the LazyHandle<()> return.
-        let _ = executor.spawn_blocking_named("receipt-root", move || {
-            run_receipt_root_worker(rx, receipt_root_bloom_worker, receipts_len);
-        });
-
-        Self {
-            tx: Some(tx),
-            receipt_root_bloom,
-            hashed_post_state: None,
-            transaction_root: None,
-            executor: executor.clone(),
-        }
-    }
-
-    /// Streams one receipt to the background worker.
-    #[inline]
-    pub fn push_receipt(&self, index: usize, receipt: R) {
-        if let Some(tx) = self.tx.as_ref() &&
-            tx.send(IndexedReceipt::new(index, receipt)).is_err()
-        {
-            error!(
-                target: "engine::tree::payload_processor",
-                index,
-                "receipt-root worker dropped before receipt event",
-            );
-        }
-    }
-
-    /// Closes the receipt channel and spawns hashed-post-state and (optionally)
-    /// transaction-root computation.
-    ///
-    /// Dropping the channel sender signals the receipt-root worker to finalize.
-    /// The hashed-post-state and transaction-root closures are each spawned on
-    /// separate threads, running in parallel with receipt-root finalization.
-    ///
-    /// Pass `None` for `tx_root_fn` when the block is not a payload (no tx root needed).
-    ///
-    /// Must be called after all receipts have been pushed.
-    pub fn finish(
-        &mut self,
-        hashed_state_fn: impl FnOnce() -> HashedPostState + Send + 'static,
-        tx_root_fn: Option<impl FnOnce() -> B256 + Send + 'static>,
-    ) {
-        // Drop receipt channel sender — signals worker to finalize receipt root.
-        self.tx.take();
-
-        // Spawn hashed-post-state computation immediately on a separate thread.
-        self.hashed_post_state =
-            Some(self.executor.spawn_blocking_named("hash-post-state", hashed_state_fn));
-
-        // Spawn transaction-root computation if this is a payload block.
-        self.transaction_root =
-            tx_root_fn.map(|f| self.executor.spawn_blocking_named("payload-tx-root", f));
-    }
-
-    /// Returns the computed receipt root and aggregated logs bloom.
-    ///
-    /// Blocks until the receipt-root worker completes. Returns `None` if the receipt
-    /// stream was incomplete (e.g., execution was aborted).
-    pub fn receipt_root_bloom(&self) -> Option<(B256, Bloom)> {
-        *self.receipt_root_bloom.wait()
-    }
-
-    /// Returns the computed transaction root, if this was a payload block.
-    ///
-    /// Blocks until the transaction-root task completes. Returns `None` for non-payload
-    /// blocks where tx root computation was not requested.
-    pub fn transaction_root(&self) -> Option<B256> {
-        self.transaction_root.as_ref().map(|h| *h.get())
-    }
-
-    /// Returns a reference to the computed hashed post state.
-    ///
-    /// Blocks until the background task completes.
-    ///
-    /// # Panics
-    ///
-    /// Panics if [`finish`](Self::finish) was not called before this method.
-    pub fn hashed_post_state(&self) -> &HashedPostState {
-        self.hashed_post_state
-            .as_ref()
-            .expect("finish() must be called before hashed_post_state()")
-            .get()
-    }
-
-    /// Extracts the [`LazyHandle<HashedPostState>`] from this handle.
-    ///
-    /// # Panics
-    ///
-    /// Panics if [`finish`](Self::finish) was not called.
-    pub fn into_lazy_hashed_state(&mut self) -> LazyHandle<HashedPostState> {
-        self.hashed_post_state
-            .take()
-            .expect("finish() must be called before into_lazy_hashed_state()")
-    }
-}
-
-impl<R> Drop for PostExecHandle<R> {
-    fn drop(&mut self) {
-        // Drop the channel sender if finish() was never called, so the receipt-root
-        // worker can observe channel closure and terminate.
-        self.tx.take();
-    }
-}
-
-/// Runs the receipt-root background worker.
-///
-/// Receives indexed receipts from the channel, incrementally builds the receipt trie,
-/// and aggregates the logs bloom. When the channel closes, it finalizes the root and
-/// stores the result in the shared [`OnceLock`].
-fn run_receipt_root_worker<R: Receipt>(
-    rx: crossbeam_channel::Receiver<IndexedReceipt<R>>,
-    receipt_root_bloom: Arc<OnceLock<Option<(B256, Bloom)>>>,
-    receipts_len: usize,
-) {
-    // RAII guard ensures the OnceLock is set to None if we return early / panic.
-    struct AbortGuard<'a> {
-        lock: &'a OnceLock<Option<(B256, Bloom)>>,
-        disarmed: bool,
-    }
-    impl Drop for AbortGuard<'_> {
-        fn drop(&mut self) {
-            if !self.disarmed {
-                let _ = self.lock.set(None);
-            }
-        }
-    }
-
-    let mut guard = AbortGuard { lock: &receipt_root_bloom, disarmed: false };
-    let mut builder = OrderedTrieRootEncodedBuilder::new(receipts_len);
-    let mut aggregated_bloom = Bloom::ZERO;
-    let mut encode_buf = Vec::new();
-    let mut received_count = 0usize;
-
-    for indexed_receipt in &rx {
-        let receipt_with_bloom = indexed_receipt.receipt.with_bloom_ref();
-
-        encode_buf.clear();
-        receipt_with_bloom.encode_2718(&mut encode_buf);
-
-        match builder.push(indexed_receipt.index, &encode_buf) {
-            Ok(()) => {
-                received_count += 1;
-                aggregated_bloom |= *receipt_with_bloom.bloom_ref();
-            }
-            Err(err) => {
-                error!(
-                    target: "engine::tree::payload_processor",
-                    index = indexed_receipt.index,
-                    ?err,
-                    "Receipt root worker received invalid receipt index, skipping"
-                );
-            }
-        }
-    }
-
-    // Finalize receipt root.
-    match builder.finalize() {
-        Ok(root) => {
-            let _ = receipt_root_bloom.set(Some((root, aggregated_bloom)));
-        }
-        Err(_) => {
-            error!(
-                target: "engine::tree::payload_processor",
-                expected = receipts_len,
-                received = received_count,
-                "Receipt-root worker received incomplete receipts, execution likely aborted"
-            );
-            let _ = receipt_root_bloom.set(None);
-            return;
-        }
-    }
-
-    guard.disarmed = true;
-}
-
-#[cfg(test)]
-mod tests {
-    use super::*;
-    use alloy_consensus::{proofs::calculate_receipt_root, TxReceipt};
-    use alloy_primitives::{Address, Bytes, Log, B256};
-    use reth_ethereum_primitives::{Receipt, TxType};
-
-    fn test_runtime() -> Runtime {
-        Runtime::test()
-    }
-
-    fn sample_receipts() -> Vec<Receipt> {
-        vec![
-            Receipt {
-                tx_type: TxType::Legacy,
-                cumulative_gas_used: 21_000,
-                success: true,
-                logs: vec![],
-            },
-            Receipt {
-                tx_type: TxType::Eip1559,
-                cumulative_gas_used: 42_000,
-                success: true,
-                logs: vec![Log {
-                    address: Address::ZERO,
-                    data: alloy_primitives::LogData::new_unchecked(vec![B256::ZERO], Bytes::new()),
-                }],
-            },
-            Receipt {
-                tx_type: TxType::Eip2930,
-                cumulative_gas_used: 63_000,
-                success: false,
-                logs: vec![],
-            },
-        ]
-    }
-
-    fn expected_root_bloom(receipts: &[Receipt]) -> (B256, Bloom) {
-        let receipts_with_bloom: Vec<_> = receipts.iter().map(|r| r.with_bloom_ref()).collect();
-        let root = calculate_receipt_root(&receipts_with_bloom);
-        let bloom =
-            receipts_with_bloom.iter().fold(Bloom::ZERO, |acc, receipt| acc | *receipt.bloom_ref());
-        (root, bloom)
-    }
-
-    #[test]
-    fn post_exec_handle_computes_receipt_root_and_bloom() {
-        let rt = test_runtime();
-
-        let receipts = sample_receipts();
-        let (expected_root, expected_bloom) = expected_root_bloom(&receipts);
-
-        let mut handle = PostExecHandle::<Receipt>::new(&rt, receipts.len());
-        for (index, receipt) in receipts.into_iter().enumerate() {
-            handle.push_receipt(index, receipt);
-        }
-        handle.finish(HashedPostState::default, None::<fn() -> B256>);
-
-        let (root, bloom) = handle.receipt_root_bloom().unwrap();
-        assert_eq!(root, expected_root);
-        assert_eq!(bloom, expected_bloom);
-    }
-
-    #[test]
-    fn post_exec_handle_handles_out_of_order_receipts() {
-        let rt = test_runtime();
-
-        let receipts = sample_receipts();
-        let (expected_root, expected_bloom) = expected_root_bloom(&receipts);
-
-        let mut handle = PostExecHandle::<Receipt>::new(&rt, receipts.len());
-        for (index, receipt) in receipts.into_iter().enumerate().rev() {
-            handle.push_receipt(index, receipt);
-        }
-        handle.finish(HashedPostState::default, None::<fn() -> B256>);
-
-        let (root, bloom) = handle.receipt_root_bloom().unwrap();
-        assert_eq!(root, expected_root);
-        assert_eq!(bloom, expected_bloom);
-    }
-
-    #[test]
-    fn post_exec_handle_ignores_invalid_index_for_bloom_aggregation() {
-        let rt = test_runtime();
-
-        let valid = Receipt::default();
-        let invalid = Receipt {
-            tx_type: TxType::Legacy,
-            cumulative_gas_used: 21_000,
-            success: true,
-            logs: vec![Log {
-                address: Address::ZERO,
-                data: alloy_primitives::LogData::new_unchecked(vec![B256::ZERO], Bytes::new()),
-            }],
-        };
-
-        let expected = expected_root_bloom(core::slice::from_ref(&valid));
-
-        let mut handle = PostExecHandle::<Receipt>::new(&rt, 1);
-        handle.push_receipt(0, valid);
-        handle.push_receipt(999, invalid);
-        handle.finish(HashedPostState::default, None::<fn() -> B256>);
-
-        assert_eq!(handle.receipt_root_bloom(), Some(expected));
-    }
-
-    #[test]
-    fn post_exec_handle_returns_none_for_incomplete_stream() {
-        let rt = test_runtime();
-
-        let mut handle = PostExecHandle::<Receipt>::new(&rt, 2);
-        handle.push_receipt(0, Receipt::default());
-        // Finish with only 1 of 2 receipts — root should be None.
-        handle.finish(HashedPostState::default, None::<fn() -> B256>);
-
-        assert!(handle.receipt_root_bloom().is_none());
-    }
-
-    #[test]
-    fn post_exec_handle_with_hashed_post_state() {
-        let rt = test_runtime();
-
-        let mut handle = PostExecHandle::<Receipt>::new(&rt, 0);
-        let expected = HashedPostState::default();
-        handle.finish(HashedPostState::default, None::<fn() -> B256>);
-
-        assert_eq!(handle.hashed_post_state(), &expected);
-    }
-
-    #[test]
-    fn post_exec_handle_with_transaction_root() {
-        let rt = test_runtime();
-
-        let expected_root = B256::repeat_byte(0x42);
-        let mut handle = PostExecHandle::<Receipt>::new(&rt, 0);
-        handle.finish(HashedPostState::default, Some(move || expected_root));
-
-        assert_eq!(handle.transaction_root(), Some(expected_root));
-    }
-
-    #[test]
-    fn post_exec_handle_without_transaction_root() {
-        let rt = test_runtime();
-
-        let mut handle = PostExecHandle::<Receipt>::new(&rt, 0);
-        handle.finish(HashedPostState::default, None::<fn() -> B256>);
-
-        assert_eq!(handle.transaction_root(), None);
-    }
-
-    #[test]
-    fn post_exec_handle_parallel_blocks() {
-        let rt = test_runtime();
-
-        let receipts_a = sample_receipts();
-        let (expected_root_a, expected_bloom_a) = expected_root_bloom(&receipts_a);
-
-        let receipts_b = vec![Receipt::default(); 2];
-        let (expected_root_b, expected_bloom_b) = expected_root_bloom(&receipts_b);
-
-        let mut handle_a = PostExecHandle::<Receipt>::new(&rt, receipts_a.len());
-        let mut handle_b = PostExecHandle::<Receipt>::new(&rt, receipts_b.len());
-
-        for (index, receipt) in receipts_a.into_iter().enumerate() {
-            handle_a.push_receipt(index, receipt);
-        }
-        for (index, receipt) in receipts_b.into_iter().enumerate() {
-            handle_b.push_receipt(index, receipt);
-        }
-
-        handle_a.finish(HashedPostState::default, None::<fn() -> B256>);
-        handle_b.finish(HashedPostState::default, None::<fn() -> B256>);
-
-        let (root_a, bloom_a) = handle_a.receipt_root_bloom().unwrap();
-        let (root_b, bloom_b) = handle_b.receipt_root_bloom().unwrap();
-
-        assert_eq!(root_a, expected_root_a);
-        assert_eq!(bloom_a, expected_bloom_a);
-        assert_eq!(root_b, expected_root_b);
-        assert_eq!(bloom_b, expected_bloom_b);
-    }
-
-    #[test]
-    fn post_exec_handle_aborted_block_then_next_succeeds() {
-        let rt = test_runtime();
-
-        // First block: aborted (dropped without finishing all receipts)
-        let handle = PostExecHandle::<Receipt>::new(&rt, 2);
-        handle.push_receipt(0, Receipt::default());
-        drop(handle);
-
-        // Second block: succeeds
-        let mut handle = PostExecHandle::<Receipt>::new(&rt, 1);
-        handle.push_receipt(0, Receipt::default());
-        handle.finish(HashedPostState::default, None::<fn() -> B256>);
-        assert!(handle.receipt_root_bloom().is_some());
-    }
-
-    #[test]
-    fn lazy_hashed_post_state_get_and_try_into_inner() {
-        let rt = test_runtime();
-
-        let mut handle = PostExecHandle::<Receipt>::new(&rt, 0);
-        handle.finish(HashedPostState::default, None::<fn() -> B256>);
-
-        let lazy = handle.into_lazy_hashed_state();
-        // handle is partially consumed but Drop is safe (hashed_post_state is now None)
-        drop(handle);
-        assert_eq!(lazy.get(), &HashedPostState::default());
-
-        let inner = lazy.try_into_inner().unwrap();
-        assert_eq!(inner, HashedPostState::default());
-    }
-
-    #[test]
-    fn lazy_hashed_post_state_clone_prevents_try_into_inner() {
-        let rt = test_runtime();
-
-        let mut handle = PostExecHandle::<Receipt>::new(&rt, 0);
-        handle.finish(HashedPostState::default, None::<fn() -> B256>);
-
-        let lazy = handle.into_lazy_hashed_state();
-        drop(handle);
-        let _clone = lazy.clone();
-
-        // try_into_inner fails because there are multiple Arc references.
-        let lazy = lazy.try_into_inner().unwrap_err();
-        assert_eq!(lazy.get(), &HashedPostState::default());
-    }
-}
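Reviewer note: the `preserved_sparse_trie.rs` change that follows guards the shared cache with a monotonically increasing checkout id, so a lease that has been superseded cannot overwrite what a newer lease stored. A minimal, self-contained sketch of that rule is given here, assuming a plain `String` in place of the sparse trie and `std::sync::Mutex` in place of `parking_lot::Mutex`. The names `Cache`, `State`, `Checkout`, and `StoreOutcome` are hypothetical stand-ins, not the types in the patch.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the shared cache state: a counter plus an optional stored value.
#[derive(Default)]
struct State {
    latest_checkout_id: u64,
    preserved: Option<String>,
}

/// Hypothetical shared cache; cloning shares the same underlying state.
#[derive(Clone, Default)]
struct Cache(Arc<Mutex<State>>);

#[derive(Debug, PartialEq, Eq)]
enum StoreOutcome {
    Stored,
    IgnoredStaleCheckout,
}

/// Hypothetical lease over the cached value, tagged with the id it was issued under.
struct Checkout {
    value: Option<String>,
    cache: Cache,
    id: u64,
}

impl Cache {
    /// Issues a new lease; every call bumps the counter, which makes older leases stale.
    fn take_or_create(&self) -> Checkout {
        let mut state = self.0.lock().unwrap();
        state.latest_checkout_id += 1;
        let id = state.latest_checkout_id;
        let value = state.preserved.take().unwrap_or_default();
        drop(state);
        Checkout { value: Some(value), cache: self.clone(), id }
    }

    /// Stores `value` only if `id` is still the most recently issued lease.
    fn store(&self, id: u64, value: String) -> StoreOutcome {
        let mut state = self.0.lock().unwrap();
        if id != state.latest_checkout_id {
            return StoreOutcome::IgnoredStaleCheckout;
        }
        state.preserved = Some(value);
        StoreOutcome::Stored
    }
}

impl Checkout {
    /// Consumes the lease and tries to store `value`; emptying the slot disarms `Drop`.
    fn store(mut self, value: &str) -> StoreOutcome {
        self.value.take();
        self.cache.store(self.id, value.to_owned())
    }
}

impl Drop for Checkout {
    fn drop(&mut self) {
        // A lease dropped without storing hands back a cleared value; a stale lease is ignored.
        if self.value.take().is_some() {
            let _ = self.cache.store(self.id, String::new());
        }
    }
}

fn main() {
    let cache = Cache::default();
    let stale = cache.take_or_create();
    let fresh = cache.take_or_create();

    assert_eq!(fresh.store("anchored"), StoreOutcome::Stored);
    // Dropping the superseded lease must not clobber the newer store.
    drop(stale);
    assert_eq!(cache.0.lock().unwrap().preserved.as_deref(), Some("anchored"));
}
```

The sketch compares ids for equality rather than ordering, so only the single most recent lease can ever write back; any earlier lease, whether stored explicitly or dropped, leaves the cache untouched.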
--- a/crates/engine/tree/src/tree/payload_processor/preserved_sparse_trie.rs
+++ b/crates/engine/tree/src/tree/payload_processor/preserved_sparse_trie.rs
@@ -1,44 +1,128 @@
 //! Preserved sparse trie for reuse across payload validations.

+use super::{
+    PARALLEL_SPARSE_TRIE_PARALLELISM_THRESHOLDS, SPARSE_TRIE_MAX_NODES_SHRINK_CAPACITY,
+    SPARSE_TRIE_MAX_VALUES_SHRINK_CAPACITY,
+};
 use alloy_primitives::B256;
 use parking_lot::Mutex;
-use reth_trie_sparse::SparseStateTrie;
-use std::{sync::Arc, time::Instant};
+use reth_trie_sparse::{
+    ArenaParallelSparseTrie, ConfigurableSparseTrie, ParallelSparseTrie, RevealableSparseTrie,
+    SparseStateTrie,
+};
+use std::{
+    ops::{Deref, DerefMut},
+    sync::Arc,
+    time::{Duration, Instant},
+};
 use tracing::debug;

 /// Type alias for the sparse trie type used in preservation.
-pub(super) type SparseTrie = SparseStateTrie;
+type SparseTrie = SparseStateTrie<ConfigurableSparseTrie, ConfigurableSparseTrie>;

-/// Shared handle to a preserved sparse trie that can be reused across payload validations.
+/// Sparse trie implementation used by [`PayloadSparseTrieCache`].
+#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
+pub enum PayloadSparseTrieKind {
+    /// Back sparse trie storage with hash maps.
+    #[default]
+    HashMap,
+    /// Back sparse trie storage with arena allocations.
+    Arena,
+}
+
+impl From<bool> for PayloadSparseTrieKind {
+    fn from(enable_arena_sparse_trie: bool) -> Self {
+        if enable_arena_sparse_trie {
+            Self::Arena
+        } else {
+            Self::HashMap
+        }
+    }
+}
+
+#[derive(Debug, Default)]
+struct PayloadSparseTrieState {
+    latest_checkout_id: u64,
+    preserved: Option<PreservedSparseTrie>,
+}
+
+/// Outcome of storing a checked-out sparse trie back into the shared cache.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum PayloadSparseTrieStoreOutcome {
+    /// The checkout was the most recent lease and the trie was stored.
+    Stored,
+    /// A newer checkout had already been issued, so this stale lease was ignored.
+    IgnoredStaleCheckout,
+}
+
+/// Shared sparse trie cache that can be reused across payload validations.
 ///
-/// This is stored in [`PayloadProcessor`](super::PayloadProcessor) and cloned to pass to
-/// [`SparseTrieCacheTask`](super::sparse_trie::SparseTrieCacheTask) for trie reuse.
-#[derive(Debug, Default, Clone)]
-pub(super) struct SharedPreservedSparseTrie(Arc<Mutex<Option<PreservedSparseTrie>>>);
+/// This is the public sparse-trie SDK surface exposed through
+/// [`EngineSharedCaches`](super::EngineSharedCaches). Callers take or create a trie, use it for
+/// payload work, then store it back either anchored to the resulting state root or cleared for
+/// allocation reuse.
+#[derive(Debug, Clone)]
+pub struct PayloadSparseTrieCache {
+    kind: PayloadSparseTrieKind,
+    state: Arc<Mutex<PayloadSparseTrieState>>,
+}

-impl SharedPreservedSparseTrie {
-    /// Takes the preserved trie if present, leaving `None` in its place.
-    pub(super) fn take(&self) -> Option<PreservedSparseTrie> {
-        self.0.lock().take()
+impl Default for PayloadSparseTrieCache {
+    fn default() -> Self {
+        Self::new(PayloadSparseTrieKind::default())
+    }
+}
+
+impl PayloadSparseTrieCache {
+    /// Creates a sparse trie cache backed by the requested trie implementation.
+    pub fn new(kind: PayloadSparseTrieKind) -> Self {
+        Self { kind, state: Arc::new(Mutex::new(PayloadSparseTrieState::default())) }
    }

-    /// Acquires a guard that blocks `take()` until dropped.
-    /// Use this before sending the state root result to ensure the next block
-    /// waits for the trie to be stored.
-    pub(super) fn lock(&self) -> PreservedTrieGuard<'_> {
-        PreservedTrieGuard(self.0.lock())
+    /// Returns the sparse trie implementation used when the cache needs to create a new trie.
+    pub const fn kind(&self) -> PayloadSparseTrieKind {
+        self.kind
+    }
+
+    /// Takes the preserved trie, cleared unless anchored at `parent_state_root`, or creates one.
+    pub fn take_or_create_for(&self, parent_state_root: B256) -> SparseTrieCheckout {
+        let start = Instant::now();
+        let mut state = self.state.lock();
+        state.latest_checkout_id += 1;
+        let checkout_id = state.latest_checkout_id;
+        let trie = state
+            .preserved
+            .take()
+            .map(|preserved| preserved.into_trie_for(parent_state_root))
+            .unwrap_or_else(|| {
+                debug!(
+                    target: "engine::tree::payload_processor",
+                    %parent_state_root,
+                    kind = ?self.kind,
+                    "Creating new sparse trie - no preserved trie available"
+                );
+                new_sparse_trie(self.kind)
+            });
+        drop(state);
+
+        let elapsed = start.elapsed();
+        if elapsed.as_millis() > 5 {
+            debug!(
+                target: "engine::tree::payload_processor",
+                blocked_for=?elapsed,
+                "Waited for preserved sparse trie checkout"
+            );
+        }
+
+        SparseTrieCheckout { trie: Some(trie), cache: self.clone(), checkout_id }
    }

    /// Waits until the sparse trie lock becomes available.
    ///
-    /// This acquires and immediately releases the lock, ensuring that any
-    /// ongoing operations complete before returning. Useful for synchronization
-    /// before starting payload processing.
-    ///
    /// Returns the time spent waiting for the lock.
-    pub(super) fn wait_for_availability(&self) -> std::time::Duration {
+    pub fn wait_for_availability(&self) -> Duration {
        let start = Instant::now();
-        let _guard = self.0.lock();
+        let _guard = self.state.lock();
        let elapsed = start.elapsed();
        if elapsed.as_millis() > 5 {
            debug!(
@@ -49,27 +133,142 @@ impl SharedPreservedSparseTrie {
        }
        elapsed
    }
+
+    /// Acquires a guard that blocks cache mutation until dropped.
+    ///
+    /// Engine-internal code uses this before making the state-root result visible so the next
+    /// payload cannot observe an empty cache between send and store.
+    pub(super) fn lock(&self) -> PreservedTrieGuard<'_> {
+        PreservedTrieGuard { state: self.state.lock() }
+    }
+}
+
+/// A checked-out sparse trie lease.
+///
+/// This dereferences to [`SparseStateTrie`] so callers can reuse the trie directly. If the lease is
+/// dropped without being stored back, a cleared trie is returned to the shared cache unless a newer
+/// checkout has already superseded it.
+#[derive(Debug)]
+pub struct SparseTrieCheckout {
+    trie: Option<SparseTrie>,
+    cache: PayloadSparseTrieCache,
+    checkout_id: u64,
+}
+
+impl SparseTrieCheckout {
+    /// Stores the trie back into the shared cache anchored to the given state root.
+    pub fn store_anchored(self, state_root: B256) -> PayloadSparseTrieStoreOutcome {
+        let cache = self.cache.clone();
+        let mut guard = cache.lock();
+        self.store_anchored_with_guard(&mut guard, state_root)
+    }
+
+    /// Stores the trie back into the shared cache in a cleared state.
+    pub fn store_cleared(mut self) -> PayloadSparseTrieStoreOutcome {
+        let cache = self.cache.clone();
+        let mut trie = self.take_trie();
+        prepare_cleared_trie(&mut trie);
+        let deferred = trie.take_deferred_drops();
+        let mut guard = cache.lock();
+        let outcome = guard.store(self.checkout_id, PreservedSparseTrie::cleared(trie));
+        drop(guard);
+        drop(deferred);
+        outcome
+    }
+
+    /// Stores the trie back into the shared cache anchored to the given state root while the
+    /// caller is already holding the preservation lock.
+    pub(super) fn store_anchored_with_guard(
+        mut self,
+        guard: &mut PreservedTrieGuard<'_>,
+        state_root: B256,
+    ) -> PayloadSparseTrieStoreOutcome {
+        guard.store(self.checkout_id, PreservedSparseTrie::anchored(self.take_trie(), state_root))
+    }
+
+    /// Stores an already-cleared trie back into the shared cache while the caller is already
+    /// holding the preservation lock.
+    pub(super) fn store_prepared_cleared_with_guard(
+        mut self,
+        guard: &mut PreservedTrieGuard<'_>,
+    ) -> PayloadSparseTrieStoreOutcome {
+        guard.store(self.checkout_id, PreservedSparseTrie::cleared(self.take_trie()))
+    }
+
+    fn take_trie(&mut self) -> SparseTrie {
+        self.trie.take().expect("sparse trie checkout must hold a trie until it is stored")
+    }
+}
+
+impl Deref for SparseTrieCheckout {
+    type Target = SparseTrie;
+
+    fn deref(&self) -> &Self::Target {
+        self.trie.as_ref().expect("sparse trie checkout must hold a trie until it is stored")
+    }
+}
+
+impl DerefMut for SparseTrieCheckout {
+    fn deref_mut(&mut self) -> &mut Self::Target {
+        self.trie.as_mut().expect("sparse trie checkout must hold a trie until it is stored")
+    }
+}
+
+impl Drop for SparseTrieCheckout {
+    fn drop(&mut self) {
+        let Some(mut trie) = self.trie.take() else { return };
+
+        debug!(
+            target: "engine::tree::payload_processor",
+            checkout_id = self.checkout_id,
+            "Sparse trie checkout dropped before store, returning cleared trie to cache"
+        );
+
+        prepare_cleared_trie(&mut trie);
+        let deferred = trie.take_deferred_drops();
+        let mut guard = self.cache.lock();
+        let _ = guard.store(self.checkout_id, PreservedSparseTrie::cleared(trie));
+        drop(guard);
+        drop(deferred);
+    }
 }

 /// Guard that holds the lock on the preserved trie.
-/// While held, `take()` will block. Call `store()` to save the trie before dropping.
-pub(super) struct PreservedTrieGuard<'a>(parking_lot::MutexGuard<'a, Option<PreservedSparseTrie>>);
+/// While held, `take_or_create_for` blocks. Store through the checkout's `*_with_guard` methods.
+pub(super) struct PreservedTrieGuard<'a> {
+    state: parking_lot::MutexGuard<'a, PayloadSparseTrieState>,
+}

 impl PreservedTrieGuard<'_> {
-    /// Stores a preserved trie for later reuse.
-    pub(super) fn store(&mut self, trie: PreservedSparseTrie) {
-        self.0.replace(trie);
+    /// Stores a preserved trie for later reuse if the checkout is still current.
+    fn store(
+        &mut self,
+        checkout_id: u64,
+        trie: PreservedSparseTrie,
+    ) -> PayloadSparseTrieStoreOutcome {
+        if checkout_id != self.state.latest_checkout_id {
+            debug!(
+                target: "engine::tree::payload_processor",
+                checkout_id,
+                latest_checkout_id = self.state.latest_checkout_id,
+                "Ignoring stale sparse trie checkout"
+            );
+            return PayloadSparseTrieStoreOutcome::IgnoredStaleCheckout;
+        }
+
+        self.state.preserved.replace(trie);
+        PayloadSparseTrieStoreOutcome::Stored
    }
 }

 /// A preserved sparse trie that can be reused across payload validations.
 ///
 /// The trie exists in one of two states:
-/// - **Anchored**: Has a computed state root and can be reused for payloads whose parent state root
-///   matches the anchor.
+/// - **Anchored**: Has a computed state root and can be reused for payloads whose parent state
+///   root matches the anchor.
 /// - **Cleared**: Trie data has been cleared but allocations are preserved for reuse.
 #[derive(Debug)]
-pub(super) enum PreservedSparseTrie {
+enum PreservedSparseTrie {
    /// Trie with a computed state root that can be reused for continuation payloads.
    Anchored {
        /// The sparse state trie (pruned after root computation).
@@ -87,24 +286,17 @@ pub(super) enum PreservedSparseTrie {

 impl PreservedSparseTrie {
    /// Creates a new anchored preserved trie.
-    ///
-    /// The `state_root` is the computed state root from the trie, which becomes the
-    /// anchor for determining if subsequent payloads can reuse this trie.
-    pub(super) const fn anchored(trie: SparseTrie, state_root: B256) -> Self {
+    const fn anchored(trie: SparseTrie, state_root: B256) -> Self {
        Self::Anchored { trie, state_root }
    }

    /// Creates a cleared preserved trie (allocations preserved, data cleared).
-    pub(super) const fn cleared(trie: SparseTrie) -> Self {
+    const fn cleared(trie: SparseTrie) -> Self {
        Self::Cleared { trie }
    }

    /// Consumes self and returns the trie for reuse.
-    ///
-    /// If the preserved trie is anchored and the parent state root matches, the pruned
-    /// trie structure is reused directly. Otherwise, the trie is cleared but allocations
-    /// are preserved to reduce memory overhead.
-    pub(super) fn into_trie_for(self, parent_state_root: B256) -> SparseTrie {
+    fn into_trie_for(self, parent_state_root: B256) -> SparseTrie {
        match self {
            Self::Anchored { trie, state_root } if state_root == parent_state_root => {
                debug!(
@@ -135,3 +327,111 @@ impl PreservedSparseTrie {
        }
    }
 }
+
+fn new_sparse_trie(kind: PayloadSparseTrieKind) -> SparseTrie {
+    let default_trie = match kind {
+        PayloadSparseTrieKind::HashMap => {
+            RevealableSparseTrie::blind_from(ConfigurableSparseTrie::HashMap(
+                ParallelSparseTrie::default()
+                    .with_parallelism_thresholds(PARALLEL_SPARSE_TRIE_PARALLELISM_THRESHOLDS),
+            ))
+        }
+        PayloadSparseTrieKind::Arena => RevealableSparseTrie::blind_from(
+            ConfigurableSparseTrie::Arena(ArenaParallelSparseTrie::default()),
+        ),
+    };
+
+    SparseStateTrie::default()
+        .with_accounts_trie(default_trie.clone())
+        .with_default_storage_trie(default_trie)
+        .with_updates(true)
+}
+
+fn prepare_cleared_trie(trie: &mut SparseTrie) {
+    trie.clear();
+    trie.shrink_to(SPARSE_TRIE_MAX_NODES_SHRINK_CAPACITY, SPARSE_TRIE_MAX_VALUES_SHRINK_CAPACITY);
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn take_or_create_reuses_matching_anchor() {
+        let cache = PayloadSparseTrieCache::default();
+        let state_root = B256::with_last_byte(1);
+
+        assert_eq!(
+            cache.take_or_create_for(state_root).store_anchored(state_root),
+            PayloadSparseTrieStoreOutcome::Stored
+        );
+
+        match cache.state.lock().preserved.as_ref() {
+            Some(PreservedSparseTrie::Anchored { state_root: anchored, .. }) => {
+                assert_eq!(*anchored, state_root);
+            }
+            other => panic!("expected anchored trie, got {other:?}"),
+        }
+    }
+
+    #[test]
+    fn drop_restores_cleared_trie() {
+        let cache = PayloadSparseTrieCache::default();
+        let state_root = B256::with_last_byte(2);
+
+        let mut checkout = cache.take_or_create_for(state_root);
+        checkout.set_updates(true);
+        drop(checkout);
+
+        match cache.state.lock().preserved.as_ref() {
+            Some(PreservedSparseTrie::Cleared { .. }) => {}
+            other => panic!("expected cleared trie, got {other:?}"),
+        }
+    }
+
+    #[test]
+    fn stale_checkout_does_not_overwrite_newer_store() {
+        let cache = PayloadSparseTrieCache::default();
+        let parent_state_root = B256::with_last_byte(3);
+        let anchored_state_root = B256::with_last_byte(4);
+
+        let stale = cache.take_or_create_for(parent_state_root);
+        let fresh = cache.take_or_create_for(parent_state_root);
+
+        assert_eq!(
+            fresh.store_anchored(anchored_state_root),
+            PayloadSparseTrieStoreOutcome::Stored
+        );
+        assert_eq!(stale.store_cleared(), PayloadSparseTrieStoreOutcome::IgnoredStaleCheckout);
+
+        match cache.state.lock().preserved.as_ref() {
+            Some(PreservedSparseTrie::Anchored { state_root, .. }) => {
+                assert_eq!(*state_root, anchored_state_root);
+            }
+            other => panic!("expected anchored trie to survive stale checkout, got {other:?}"),
+        }
+    }
+
+    #[test]
+    fn stale_checkout_drop_does_not_overwrite_newer_store() {
+        let cache = PayloadSparseTrieCache::default();
+        let parent_state_root = B256::with_last_byte(5);
+        let anchored_state_root = B256::with_last_byte(6);
+
+        let stale = cache.take_or_create_for(parent_state_root);
+        let fresh = cache.take_or_create_for(parent_state_root);
+
+        assert_eq!(
+            fresh.store_anchored(anchored_state_root),
+            PayloadSparseTrieStoreOutcome::Stored
+        );
+        drop(stale);
+
+        match cache.state.lock().preserved.as_ref() {
+            Some(PreservedSparseTrie::Anchored { state_root, .. }) => {
+                assert_eq!(*state_root, anchored_state_root);
+            }
+            other => panic!("expected anchored trie to survive stale checkout drop, got {other:?}"),
+        }
+    }
+}
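The stale-checkout tests above can be illustrated with a generation counter. This is a minimal sketch under assumptions, not the implementation in this change: `Cache`, `Checkout`, `StoreOutcome`, and the `generation` field are hypothetical names, and a `String` stands in for the preserved trie. Each checkout bumps the generation, and a store is accepted only if no newer checkout has been handed out since.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical outcome type, mirroring the two outcomes asserted in the tests.
#[derive(Debug, PartialEq, Eq)]
enum StoreOutcome {
    Stored,
    IgnoredStaleCheckout,
}

/// Assumed shared state: a generation counter plus the preserved value.
#[derive(Default)]
struct State {
    generation: u64,
    preserved: Option<String>,
}

#[derive(Clone, Default)]
struct Cache {
    state: Arc<Mutex<State>>,
}

/// A checkout remembers the generation it was created under.
struct Checkout {
    cache: Cache,
    generation: u64,
}

impl Cache {
    /// Takes whatever is preserved and invalidates all earlier checkouts.
    fn checkout(&self) -> Checkout {
        let mut state = self.state.lock().unwrap();
        state.generation += 1;
        state.preserved = None;
        Checkout { cache: self.clone(), generation: state.generation }
    }
}

impl Checkout {
    /// Stores `value` only if this checkout is still the most recent one.
    fn store(self, value: &str) -> StoreOutcome {
        let mut state = self.cache.state.lock().unwrap();
        if state.generation != self.generation {
            return StoreOutcome::IgnoredStaleCheckout;
        }
        state.preserved = Some(value.to_owned());
        StoreOutcome::Stored
    }
}

fn main() {
    let cache = Cache::default();
    let stale = cache.checkout();
    let fresh = cache.checkout();

    assert_eq!(fresh.store("anchored"), StoreOutcome::Stored);
    assert_eq!(stale.store("cleared"), StoreOutcome::IgnoredStaleCheckout);
    assert_eq!(cache.state.lock().unwrap().preserved.as_deref(), Some("anchored"));
}
```

The sketch leaves out the `Drop` path that the tests also cover; a drop handler would presumably need the same generation check before restoring a cleared trie.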
--- a/crates/engine/tree/src/tree/payload_processor/prewarm.rs
+++ b/crates/engine/tree/src/tree/payload_processor/prewarm.rs
@@ -84,7 +84,7 @@ where
    Evm: ConfigureEvm<Primitives = N> + 'static,
 {
    /// Initializes the task with the given transactions pending execution
-    pub fn new(
+    pub(crate) fn new(
        executor: Runtime,
        execution_cache: PayloadExecutionCache,
        ctx: PrewarmContext<N, P, Evm>,
--- a/crates/engine/tree/src/tree/payload_processor/receipt_root_task.rs
+++ b/crates/engine/tree/src/tree/payload_processor/receipt_root_task.rs
@@ -0,0 +1,281 @@
+//! Receipt root computation in a background task.
+//!
+//! This module provides a streaming receipt root builder that computes the receipt trie root
+//! in a background thread. Receipts are sent via a channel with their index, and for each
+//! receipt received, the builder incrementally flushes leaves to the underlying
+//! [`OrderedTrieRootEncodedBuilder`] when possible. When the channel closes, the task returns the
+//! computed root.
+
+use alloy_eips::Encodable2718;
+use alloy_primitives::{Bloom, B256};
+use crossbeam_channel::Receiver;
+use reth_primitives_traits::Receipt;
+use reth_trie_common::ordered_root::OrderedTrieRootEncodedBuilder;
+use tokio::sync::oneshot;
+use tracing::debug_span;
+
+/// Receipt with index, ready to be sent to the background task for encoding and trie building.
+#[derive(Debug, Clone)]
+pub struct IndexedReceipt<R> {
+    /// The transaction index within the block.
+    pub index: usize,
+    /// The receipt.
+    pub receipt: R,
+}
+
+impl<R> IndexedReceipt<R> {
+    /// Creates a new indexed receipt.
+    #[inline]
+    pub const fn new(index: usize, receipt: R) -> Self {
+        Self { index, receipt }
+    }
+}
+
+/// Handle for running the receipt root computation in a background task.
+///
+/// This struct holds the channels needed to receive receipts and send the result.
+/// Use [`Self::run`] to execute the computation (typically in a spawned blocking task).
+#[derive(Debug)]
+pub struct ReceiptRootTaskHandle<R> {
+    /// Receiver for indexed receipts.
+    receipt_rx: Receiver<IndexedReceipt<R>>,
+    /// Sender for the computed result.
+    result_tx: oneshot::Sender<(B256, Bloom)>,
+}
+
+impl<R: Receipt> ReceiptRootTaskHandle<R> {
+    /// Creates a new handle from the receipt receiver and result sender channels.
+    pub const fn new(
+        receipt_rx: Receiver<IndexedReceipt<R>>,
+        result_tx: oneshot::Sender<(B256, Bloom)>,
+    ) -> Self {
+        Self { receipt_rx, result_tx }
+    }
+
+    /// Runs the receipt root computation, consuming the handle.
+    ///
+    /// This method receives indexed receipts from the channel, encodes them,
+    /// and builds the trie incrementally. When all receipts have been received
+    /// (channel closed), it sends the result through the oneshot channel.
+    ///
+    /// This is designed to be called inside a blocking task (e.g., via
+    /// `executor.spawn_blocking(move || handle.run(receipts_len))`).
+    ///
+    /// # Arguments
+    ///
+    /// * `receipts_len` - The total number of receipts expected. This is needed to correctly order
+    ///   the trie keys according to RLP encoding rules.
+    pub fn run(self, receipts_len: usize) {
+        let _span = debug_span!(
+            target: "engine::tree::payload_processor",
+            "receipt_root",
+            receipts_len,
+        )
+        .entered();
+
+        let mut builder = OrderedTrieRootEncodedBuilder::new(receipts_len);
+        let mut aggregated_bloom = Bloom::ZERO;
+        let mut encode_buf = Vec::new();
+        let mut received_count = 0usize;
+
+        for indexed_receipt in self.receipt_rx {
+            let receipt_with_bloom = indexed_receipt.receipt.with_bloom_ref();
+
+            encode_buf.clear();
+            receipt_with_bloom.encode_2718(&mut encode_buf);
+
+            aggregated_bloom |= *receipt_with_bloom.bloom_ref();
+            match builder.push(indexed_receipt.index, &encode_buf) {
+                Ok(()) => {
+                    received_count += 1;
+                }
+                Err(err) => {
+                    // If a duplicate or out-of-bounds index is streamed, skip it and
+                    // fall back to computing the receipt root from the full receipts
+                    // vector later.
+                    tracing::error!(
+                        target: "engine::tree::payload_processor",
+                        index = indexed_receipt.index,
+                        ?err,
+                        "Receipt root task received invalid receipt index, skipping"
+                    );
+                }
+            }
+        }
+
+        let Ok(root) = builder.finalize() else {
+            // Finalize fails if we didn't receive exactly `receipts_len` receipts. This can
+            // happen if execution was aborted early (e.g., invalid transaction encountered).
+            // We return without sending a result, allowing the caller to handle the abort.
+            tracing::error!(
+                target: "engine::tree::payload_processor",
+                expected = receipts_len,
+                received = received_count,
+                "Receipt root task received incomplete receipts, execution likely aborted"
+            );
+            return;
+        };
+        let _ = self.result_tx.send((root, aggregated_bloom));
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use alloy_consensus::{proofs::calculate_receipt_root, TxReceipt};
+    use alloy_primitives::{b256, hex, Address, Bytes, Log};
+    use crossbeam_channel::bounded;
+    use reth_ethereum_primitives::{Receipt, TxType};
+
+    #[tokio::test]
+    async fn test_receipt_root_task_empty() {
+        let (tx, rx) = bounded::<IndexedReceipt<Receipt>>(1);
+        let (result_tx, result_rx) = oneshot::channel();
+        drop(tx);
+
+        let handle = ReceiptRootTaskHandle::new(rx, result_tx);
+        tokio::task::spawn_blocking(move || handle.run(0)).await.unwrap();
+
+        let (root, bloom) = result_rx.await.unwrap();
+
+        // Empty trie root
+        assert_eq!(root, reth_trie_common::EMPTY_ROOT_HASH);
+        assert_eq!(bloom, Bloom::ZERO);
+    }
+
+    #[tokio::test]
+    async fn test_receipt_root_task_single_receipt() {
+        let receipts: Vec<Receipt> = vec![Receipt::default()];
+
+        let (tx, rx) = bounded(1);
+        let (result_tx, result_rx) = oneshot::channel();
+        let receipts_len = receipts.len();
+
+        let handle = ReceiptRootTaskHandle::new(rx, result_tx);
+        let join_handle = tokio::task::spawn_blocking(move || handle.run(receipts_len));
+
+        for (i, receipt) in receipts.clone().into_iter().enumerate() {
+            tx.send(IndexedReceipt::new(i, receipt)).unwrap();
+        }
+        drop(tx);
+
+        join_handle.await.unwrap();
+        let (root, _bloom) = result_rx.await.unwrap();
+
+        // Verify against the standard calculation
+        let receipts_with_bloom: Vec<_> = receipts.iter().map(|r| r.with_bloom_ref()).collect();
+        let expected_root = calculate_receipt_root(&receipts_with_bloom);
+
+        assert_eq!(root, expected_root);
+    }
+
+    #[tokio::test]
+    async fn test_receipt_root_task_multiple_receipts() {
+        let receipts: Vec<Receipt> = vec![Receipt::default(); 5];
+
+        let (tx, rx) = bounded(4);
+        let (result_tx, result_rx) = oneshot::channel();
+        let receipts_len = receipts.len();
+
+        let handle = ReceiptRootTaskHandle::new(rx, result_tx);
+        let join_handle = tokio::task::spawn_blocking(move || handle.run(receipts_len));
+
+        for (i, receipt) in receipts.into_iter().enumerate() {
+            tx.send(IndexedReceipt::new(i, receipt)).unwrap();
+        }
+        drop(tx);
+
+        join_handle.await.unwrap();
+        let (root, bloom) = result_rx.await.unwrap();
+
+        // Verify against expected values from existing test
+        assert_eq!(
+            root,
+            b256!("0x61353b4fb714dc1fccacbf7eafc4273e62f3d1eed716fe41b2a0cd2e12c63ebc")
+        );
+        assert_eq!(
+            bloom,
+            Bloom::from(hex!("00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"))
+        );
+    }
+
+    #[tokio::test]
+    async fn test_receipt_root_matches_standard_calculation() {
+        // Create some receipts with actual data
+        let receipts = vec![
+            Receipt {
+                tx_type: TxType::Legacy,
+                cumulative_gas_used: 21000,
+                success: true,
+                logs: vec![],
+            },
+            Receipt {
+                tx_type: TxType::Eip1559,
+                cumulative_gas_used: 42000,
+                success: true,
+                logs: vec![Log {
+                    address: Address::ZERO,
+                    data: alloy_primitives::LogData::new_unchecked(vec![B256::ZERO], Bytes::new()),
+                }],
+            },
+            Receipt {
+                tx_type: TxType::Eip2930,
+                cumulative_gas_used: 63000,
+                success: false,
+                logs: vec![],
+            },
+        ];
+
+        // Calculate expected values first (before we move receipts)
+        let receipts_with_bloom: Vec<_> = receipts.iter().map(|r| r.with_bloom_ref()).collect();
+        let expected_root = calculate_receipt_root(&receipts_with_bloom);
+        let expected_bloom =
+            receipts_with_bloom.iter().fold(Bloom::ZERO, |bloom, r| bloom | r.bloom_ref());
+
+        // Calculate using the task
+        let (tx, rx) = bounded(4);
+        let (result_tx, result_rx) = oneshot::channel();
+        let receipts_len = receipts.len();
+
+        let handle = ReceiptRootTaskHandle::new(rx, result_tx);
+        let join_handle = tokio::task::spawn_blocking(move || handle.run(receipts_len));
+
+        for (i, receipt) in receipts.into_iter().enumerate() {
+            tx.send(IndexedReceipt::new(i, receipt)).unwrap();
+        }
+        drop(tx);
+
+        join_handle.await.unwrap();
+        let (task_root, task_bloom) = result_rx.await.unwrap();
+
+        assert_eq!(task_root, expected_root);
+        assert_eq!(task_bloom, expected_bloom);
+    }
+
+    #[tokio::test]
+    async fn test_receipt_root_task_out_of_order() {
+        let receipts: Vec<Receipt> = vec![Receipt::default(); 5];
+
+        // Calculate expected values first (before we move receipts)
+        let receipts_with_bloom: Vec<_> = receipts.iter().map(|r| r.with_bloom_ref()).collect();
+        let expected_root = calculate_receipt_root(&receipts_with_bloom);
+
+        let (tx, rx) = bounded(4);
+        let (result_tx, result_rx) = oneshot::channel();
+        let receipts_len = receipts.len();
+
+        let handle = ReceiptRootTaskHandle::new(rx, result_tx);
+        let join_handle = tokio::task::spawn_blocking(move || handle.run(receipts_len));
+
+        // Send in reverse order to test out-of-order handling
+        for (i, receipt) in receipts.into_iter().enumerate().rev() {
+            tx.send(IndexedReceipt::new(i, receipt)).unwrap();
+        }
+        drop(tx);
+
+        join_handle.await.unwrap();
+        let (root, _bloom) = result_rx.await.unwrap();
+
+        assert_eq!(root, expected_root);
+    }
+}
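The streaming pattern in `receipt_root_task.rs` (indexed items arrive over a channel in any order, invalid indices are skipped, and a result is produced only if every index was received) can be sketched with the standard library alone. The names below are hypothetical: `OrderedCollector` stands in for `OrderedTrieRootEncodedBuilder`, a `u64` stands in for `Bloom`, and `std::sync::mpsc` replaces the crossbeam and oneshot channels. It returns the ordered items rather than a trie root.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for the ordered builder: stores items by index and
/// finalizes only once every slot has been filled exactly once.
struct OrderedCollector {
    slots: Vec<Option<Vec<u8>>>,
    filled: usize,
}

impl OrderedCollector {
    fn new(len: usize) -> Self {
        Self { slots: vec![None; len], filled: 0 }
    }

    /// Rejects out-of-bounds and duplicate indices without changing state.
    fn push(&mut self, index: usize, bytes: &[u8]) -> Result<(), String> {
        match self.slots.get_mut(index) {
            None => Err(format!("index {index} out of bounds")),
            Some(Some(_)) => Err(format!("duplicate index {index}")),
            Some(slot) => {
                *slot = Some(bytes.to_vec());
                self.filled += 1;
                Ok(())
            }
        }
    }

    /// Fails unless exactly `len` distinct indices were pushed.
    fn finalize(self) -> Result<Vec<Vec<u8>>, String> {
        if self.filled != self.slots.len() {
            return Err(format!("expected {} items, got {}", self.slots.len(), self.filled));
        }
        Ok(self.slots.into_iter().flatten().collect())
    }
}

/// Drains the channel until it closes, OR-ing the bloom stand-ins together.
/// Returns `None` when the stream was incomplete, mirroring the early return.
fn run_collector(
    rx: mpsc::Receiver<(usize, Vec<u8>, u64)>,
    len: usize,
) -> Option<(Vec<Vec<u8>>, u64)> {
    let mut collector = OrderedCollector::new(len);
    let mut bloom = 0u64;
    for (index, bytes, item_bloom) in rx {
        bloom |= item_bloom;
        if let Err(err) = collector.push(index, &bytes) {
            eprintln!("skipping invalid item: {err}");
        }
    }
    collector.finalize().ok().map(|items| (items, bloom))
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || run_collector(rx, 3));

    // Send in reverse order to exercise out-of-order arrival.
    for index in (0..3usize).rev() {
        tx.send((index, vec![index as u8], 1u64 << index)).unwrap();
    }
    drop(tx); // closing the channel ends the worker's loop

    let (items, bloom) = worker.join().unwrap().expect("all items were sent");
    assert_eq!(items, vec![vec![0], vec![1], vec![2]]);
    assert_eq!(bloom, 0b111);
}
```

As in the task itself, an incomplete stream yields no result, which leaves the fallback decision to the caller.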
--- a/crates/engine/tree/src/tree/payload_processor/sparse_trie.rs
+++ b/crates/engine/tree/src/tree/payload_processor/sparse_trie.rs
@@ -7,13 +7,13 @@ use crate::tree::{
        dispatch_with_chunking, evm_state_to_hashed_post_state, MultiProofMessage,
        DEFAULT_MAX_TARGETS_FOR_CHUNKING,
    },
-    payload_processor::multiproof::MultiProofTaskMetrics,
+    payload_processor::{multiproof::MultiProofTaskMetrics, SparseTrieCheckout},
 };
 use alloy_primitives::B256;
 use alloy_rlp::{Decodable, Encodable};
 use crossbeam_channel::{Receiver as CrossbeamReceiver, Sender as CrossbeamSender};
-use rayon::iter::ParallelIterator;
-use reth_primitives_traits::{Account, FastInstant as Instant, ParallelBridgeBuffered};
+use rayon::iter::{IntoParallelIterator, ParallelIterator};
+use reth_primitives_traits::{Account, FastInstant as Instant};
 use reth_tasks::Runtime;
 use reth_trie::{
    updates::TrieUpdates, DecodedMultiProofV2, HashedPostState, TrieAccount, EMPTY_ROOT_HASH,
@@ -29,8 +29,8 @@ use reth_trie_parallel::{
 #[cfg(feature = "trie-debug")]
 use reth_trie_sparse::debug_recorder::TrieDebugRecorder;
 use reth_trie_sparse::{
-    errors::SparseTrieResult, DeferredDrops, LeafUpdate, ParallelSparseTrie, SparseStateTrie,
-    SparseTrie,
+    errors::SparseTrieResult, ConfigurableSparseTrie, DeferredDrops, LeafUpdate,
+    RevealableSparseTrie,
 };
 use revm_primitives::{hash_map::Entry, B256Map};
 use tracing::{debug, debug_span, error, instrument, trace_span};
@@ -39,7 +39,7 @@ use tracing::{debug, debug_span, error, instrument, trace_span};
 const MAX_PENDING_UPDATES: usize = 100;

 /// Sparse trie task implementation that uses in-memory sparse trie data to schedule proof fetching.
-pub(super) struct SparseTrieCacheTask<A = ParallelSparseTrie, S = ParallelSparseTrie> {
+pub(super) struct SparseTrieCacheTask {
    /// Sender for proof results.
    proof_result_tx: CrossbeamSender<ProofResultMessage>,
    /// Receiver for proof results directly from workers.
@@ -47,7 +47,7 @@ pub(super) struct SparseTrieCacheTask<A = ParallelSparseTrie, S = ParallelSparse
    /// Receives updates from execution and prewarming.
    updates: CrossbeamReceiver<SparseTrieTaskMessage>,
    /// `SparseStateTrie` used for computing the state root.
-    trie: SparseStateTrie<A, S>,
+    trie: SparseTrieCheckout,
    /// Handle to the proof worker pools (storage and account).
    proof_worker_handle: ProofWorkerHandle,

@@ -110,18 +110,14 @@ pub(super) struct SparseTrieCacheTask<A = ParallelSparseTrie, S = ParallelSparse
    metrics: MultiProofTaskMetrics,
 }

-impl<A, S> SparseTrieCacheTask<A, S>
-where
-    A: SparseTrie + Default,
-    S: SparseTrie + Default + Clone,
-{
+impl SparseTrieCacheTask {
    /// Creates a new sparse trie, pre-populating with an existing [`SparseStateTrie`].
-    pub(super) fn new_with_trie(
+    pub(super) fn new_with_checkout(
        executor: &Runtime,
        updates: CrossbeamReceiver<MultiProofMessage>,
        proof_worker_handle: ProofWorkerHandle,
        metrics: MultiProofTaskMetrics,
-        trie: SparseStateTrie<A, S>,
+        trie: SparseTrieCheckout,
        chunk_size: usize,
    ) -> Self {
        let (proof_result_tx, proof_result_rx) = crossbeam_channel::unbounded();
@@ -179,9 +175,7 @@ where
                MultiProofMessage::FinishedStateUpdates => {
                    SparseTrieTaskMessage::FinishedStateUpdates
                }
-                MultiProofMessage::EmptyProof { .. } | MultiProofMessage::BlockAccessList(_) => {
-                    continue
-                }
+                MultiProofMessage::BlockAccessList(_) => continue,
                MultiProofMessage::HashedStateUpdate(state) => {
                    SparseTrieTaskMessage::HashedState(state)
                }
@@ -201,17 +195,17 @@ where
    /// benchmarking purposes.
    pub(super) fn into_trie_for_reuse(
        self,
-        prune_depth: usize,
-        max_storage_tries: usize,
+        max_hot_slots: usize,
+        max_hot_accounts: usize,
        max_nodes_capacity: usize,
        max_values_capacity: usize,
        disable_pruning: bool,
        updates: &TrieUpdates,
-    ) -> (SparseStateTrie<A, S>, DeferredDrops) {
+    ) -> (SparseTrieCheckout, DeferredDrops) {
        let Self { mut trie, .. } = self;
        trie.commit_updates(updates);
        if !disable_pruning {
-            trie.prune(prune_depth, max_storage_tries);
+            trie.prune(max_hot_slots, max_hot_accounts);
            trie.shrink_to(max_nodes_capacity, max_values_capacity);
        }
        let deferred = trie.take_deferred_drops();
@@ -226,7 +220,7 @@ where
        self,
        max_nodes_capacity: usize,
        max_values_capacity: usize,
-    ) -> (SparseStateTrie<A, S>, DeferredDrops) {
+    ) -> (SparseTrieCheckout, DeferredDrops) {
        let Self { mut trie, .. } = self;
        trie.clear();
        trie.shrink_to(max_nodes_capacity, max_values_capacity);
@@ -308,9 +302,9 @@ where
                self.promote_pending_account_updates()?;
                self.metrics.sparse_trie_process_updates_duration_histogram.record(t.elapsed());

-                if self.finished_state_updates &&
-                    self.account_updates.is_empty() &&
-                    self.storage_updates.iter().all(|(_, updates)| updates.is_empty())
+                if self.finished_state_updates
+                    && self.account_updates.is_empty()
+                    && self.storage_updates.iter().all(|(_, updates)| updates.is_empty())
                {
                    break;
                }
@@ -384,13 +378,13 @@ where
        }

        for (address, slots) in targets.storage_targets {
-            for slot in slots {
-                // Only touch storages that are not yet present in the updates set.
-                self.new_storage_updates
-                    .entry(address)
-                    .or_default()
-                    .entry(slot.key())
-                    .or_insert(LeafUpdate::Touched);
+            if !slots.is_empty() {
+                // Look up outer map once per address instead of once per slot.
+                let new_updates = self.new_storage_updates.entry(address).or_default();
+                for slot in slots {
+                    // Only touch storages that are not yet present in the updates set.
+                    new_updates.entry(slot.key()).or_insert(LeafUpdate::Touched);
+                }
            }

            // Touch corresponding account leaf to make sure its revealed in accounts trie for
@@ -407,19 +401,26 @@ where
    )]
    fn on_hashed_state_update(&mut self, hashed_state_update: HashedPostState) {
        for (address, storage) in hashed_state_update.storages {
-            for (slot, value) in storage.storage {
-                let encoded = if value.is_zero() {
-                    Vec::new()
-                } else {
-                    alloy_rlp::encode_fixed_size(&value).to_vec()
-                };
-                self.new_storage_updates
-                    .entry(address)
-                    .or_default()
-                    .insert(slot, LeafUpdate::Changed(encoded));
+            if !storage.storage.is_empty() {
+                // Look up outer maps once per address instead of once per slot.
+                let new_updates = self.new_storage_updates.entry(address).or_default();
+                let mut existing_updates = self.storage_updates.get_mut(&address);

-                // Remove an existing storage update if it exists.
-                self.storage_updates.get_mut(&address).and_then(|updates| updates.remove(&slot));
+                for (slot, value) in storage.storage {
+                    self.trie.record_slot_touch(address, slot);
+
+                    let encoded = if value.is_zero() {
+                        Vec::new()
+                    } else {
+                        alloy_rlp::encode_fixed_size(&value).to_vec()
+                    };
+                    new_updates.insert(slot, LeafUpdate::Changed(encoded));
+
+                    // Remove an existing storage update if it exists.
+                    if let Some(ref mut existing) = existing_updates {
+                        existing.remove(&slot);
+                    }
+                }
            }

            // Make sure account is tracked in `account_updates` so that it is revealed in accounts
@@ -432,6 +433,8 @@ where
        }

        for (address, account) in hashed_state_update.accounts {
+            self.trie.record_account_touch(address);
+
            // Track account as touched.
            //
            // This might overwrite an existing update, which is fine, because storage root from it
@@ -593,6 +596,59 @@ where

        Ok(updates_len_after < updates_len_before)
    }
+    /// Computes storage roots for accounts whose storage updates are fully drained.
+    ///
+    /// For each storage trie that:
+    /// 1. was modified in the current block,
+    /// 2. has had all of its storage updates drained, and
+    /// 3. does not have a cached storage root yet,
+    ///
+    /// we trigger storage root computation on a rayon pool.
+    #[instrument(
+        level = "debug",
+        target = "engine::tree::payload_processor::sparse_trie",
+        skip_all
+    )]
+    fn compute_drained_storage_roots(&mut self) {
+        let addresses_to_compute_roots: Vec<_> = self
+            .storage_updates
+            .iter()
+            .filter_map(|(address, updates)| updates.is_empty().then_some(*address))
+            .collect();
+
+        struct SendStorageTriePtr(*mut RevealableSparseTrie<ConfigurableSparseTrie>);
+        // SAFETY: this wrapper only forwards the pointer across rayon; deref invariants are
+        // documented at the use site below.
+        unsafe impl Send for SendStorageTriePtr {}
+
+        let mut tries_to_compute_roots: Vec<(B256, SendStorageTriePtr)> =
+            Vec::with_capacity(addresses_to_compute_roots.len());
+        for address in addresses_to_compute_roots {
+            if let Some(trie) = self.trie.storage_tries_mut().get_mut(&address)
+                && !trie.is_root_cached()
+            {
+                tries_to_compute_roots.push((address, SendStorageTriePtr(trie)));
+            }
+        }
+
+        let parent_span = tracing::Span::current();
+        tries_to_compute_roots.into_par_iter().for_each(|(address, SendStorageTriePtr(trie))| {
+            let _enter = debug_span!(
+                target: "engine::tree::payload_processor::sparse_trie",
+                parent: &parent_span,
+                "storage_root",
+                ?address
+            )
+            .entered();
+            // SAFETY:
+            // - pointers are created from `storage_tries_mut().get_mut(address)` above;
+            // - `addresses_to_compute_roots` comes from map iteration, so addresses are unique;
+            // - we do not insert/remove entries between pointer collection and use, so pointers
+            //   stay valid and map reallocation cannot occur;
+            // - each pointer is consumed by at most one rayon task, so no aliasing mutable access.
+            unsafe { (*trie).root().expect("updates are drained, trie should be revealed by now") };
+        });
+    }

    /// Iterates through all storage tries for which all updates were processed, computes their
    /// storage roots, and promotes corresponding pending account updates into proper leaf updates
@@ -609,21 +665,7 @@ where
            return Ok(());
        }

-        let span = debug_span!("compute_storage_roots").entered();
-        self
-            .trie
-            .storage_tries_mut()
-            .iter_mut()
-            .filter(|(address, trie)| {
-                self.storage_updates.get(*address).is_some_and(|updates| updates.is_empty()) &&
-                    !trie.is_root_cached()
-            })
-            .par_bridge_buffered()
-            .for_each(|(address, trie)| {
-                let _enter = debug_span!(target: "engine::tree::payload_processor::sparse_trie", parent: &span, "storage_root", ?address).entered();
-                trie.root().expect("updates are drained, trie should be revealed by now");
-            });
-        drop(span);
+        self.compute_drained_storage_roots();

        loop {
            let span = debug_span!("promote_updates", promoted = tracing::field::Empty).entered();
@@ -682,7 +724,7 @@ where
            // We need to keep iterating if any updates are being drained because that might
            // indicate that more pending account updates can be promoted.
            if num_promoted == 0 || !self.process_account_leaf_updates(false)? {
-                break
+                break;
            }
        }

@@ -803,7 +845,6 @@ pub struct StateRootComputeOutcome {
 mod tests {
    use super::*;
    use alloy_primitives::{keccak256, Address, B256, U256};
-    use reth_trie_sparse::ParallelSparseTrie;

    #[test]
    fn test_run_hashing_task_hashed_state_update_forwards() {
@@ -826,10 +867,7 @@ mod tests {
        let expected_state = hashed_state.clone();

        let handle = std::thread::spawn(move || {
-            SparseTrieCacheTask::<ParallelSparseTrie, ParallelSparseTrie>::run_hashing_task(
-                updates_rx,
-                hashed_state_tx,
-            );
+            SparseTrieCacheTask::run_hashing_task(updates_rx, hashed_state_tx);
        });

        updates_tx.send(MultiProofMessage::HashedStateUpdate(hashed_state)).unwrap();
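`compute_drained_storage_roots` hands raw pointers to rayon and relies on the documented SAFETY invariants. For comparison, here is a sketch of a different technique, not the one used in this change: the disjoint mutable references are collected from a single `iter_mut()` pass and processed with `std::thread::scope`, so no `unsafe` is needed. `Trie`, `compute_drained_roots`, and the `u32` keys are hypothetical stand-ins, the "root" is just a sum, and one thread per trie replaces the rayon pool.

```rust
use std::collections::{HashMap, HashSet};
use std::thread;

/// Hypothetical stand-in for a storage trie with a cached root.
struct Trie {
    values: Vec<u64>,
    cached_root: Option<u64>,
}

impl Trie {
    fn new(values: Vec<u64>) -> Self {
        Self { values, cached_root: None }
    }

    /// Placeholder "root": the sum of the values, cached after the first call.
    fn root(&mut self) -> u64 {
        *self.cached_root.get_or_insert_with(|| self.values.iter().sum())
    }
}

/// Computes roots in parallel for tries whose updates are drained and whose
/// root is not cached yet.
fn compute_drained_roots(tries: &mut HashMap<u32, Trie>, drained: &HashSet<u32>) {
    // `iter_mut` yields every entry at most once, so the collected `&mut`
    // references are disjoint by construction.
    let selected: Vec<&mut Trie> = tries
        .iter_mut()
        .filter(|(address, trie)| drained.contains(*address) && trie.cached_root.is_none())
        .map(|(_, trie)| trie)
        .collect();

    // Scoped threads may borrow from the enclosing stack frame; all of them
    // are joined before `scope` returns.
    thread::scope(|scope| {
        for trie in selected {
            scope.spawn(move || {
                trie.root();
            });
        }
    });
}

fn main() {
    let mut tries = HashMap::new();
    tries.insert(1u32, Trie::new(vec![1, 2, 3]));
    tries.insert(2u32, Trie::new(vec![10]));
    let drained: HashSet<u32> = [1u32].into_iter().collect();

    compute_drained_roots(&mut tries, &drained);

    assert_eq!(tries[&1].cached_root, Some(6));
    assert_eq!(tries[&2].cached_root, None);
}
```

Whether this shape fits the real code depends on details not shown in this diff, such as what `storage_tries_mut()` returns and why the previous `iter_mut()`-based version was replaced.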
--- a/crates/engine/tree/src/tree/payload_validator.rs
+++ b/crates/engine/tree/src/tree/payload_validator.rs
@@ -1,10 +1,10 @@
 //! Types and traits for validating blocks and payloads.

 use crate::tree::{
-    cached_state::CachedStateProvider,
+    cached_state::{CacheStats, CachedStateProvider},
    error::{InsertBlockError, InsertBlockErrorKind, InsertPayloadError},
-    instrumented_state::InstrumentedStateProvider,
-    payload_processor::PayloadProcessor,
+    instrumented_state::{InstrumentedStateProvider, StateProviderStats},
+    payload_processor::{EngineSharedCaches, PayloadProcessor},
    precompile_cache::{CachedPrecompile, CachedPrecompileMetrics, PrecompileCacheMap},
    sparse_trie::StateRootComputeOutcome,
    CacheWaitDurations, EngineApiMetrics, EngineApiTreeState, ExecutionEnv, PayloadHandle,
@@ -14,12 +14,14 @@ use alloy_consensus::transaction::{Either, TxHashRef};
 use alloy_eip7928::BlockAccessList;
 use alloy_eips::{eip1898::BlockWithParent, eip4895::Withdrawal, NumHash};
 use alloy_evm::Evm;
-use alloy_primitives::B256;
+use alloy_primitives::{map::B256Set, B256};
 #[cfg(feature = "trie-debug")]
 use reth_trie_sparse::debug_recorder::TrieDebugRecorder;

-use crate::tree::payload_processor::post_exec::PostExecHandle;
-use reth_chain_state::{CanonicalInMemoryState, DeferredTrieData, ExecutedBlock, LazyOverlay};
+use crate::tree::payload_processor::receipt_root_task::{IndexedReceipt, ReceiptRootTaskHandle};
+use reth_chain_state::{
+    CanonicalInMemoryState, DeferredTrieData, ExecutedBlock, ExecutionTimingStats, LazyOverlay,
+};
 use reth_consensus::{ConsensusError, FullConsensus, ReceiptRootBloom};
 use reth_engine_primitives::{
    ConfigureEngineEvm, ExecutableTxIterator, ExecutionPayload, InvalidBlockHook, PayloadValidator,
@@ -42,11 +44,11 @@ use reth_provider::{
    ProviderError, PruneCheckpointReader, StageCheckpointReader, StateProvider,
    StateProviderFactory, StateReader, StorageChangeSetReader, StorageSettingsCache,
 };
-use reth_revm::db::{states::bundle_state::BundleRetention, State};
+use reth_revm::db::{states::bundle_state::BundleRetention, BundleAccount, State};
 use reth_trie::{trie_cursor::TrieCursorFactory, updates::TrieUpdates, HashedPostState, StateRoot};
 use reth_trie_db::ChangesetCache;
 use reth_trie_parallel::root::{ParallelStateRoot, ParallelStateRootError};
-use revm_primitives::Address;
+use revm_primitives::{Address, KECCAK_EMPTY};
 use std::{
    collections::HashMap,
    panic::{self, AssertUnwindSafe},
@@ -55,12 +57,23 @@ use std::{
        mpsc::RecvTimeoutError,
        Arc,
    },
+    time::Duration,
 };
 use tracing::{debug, debug_span, error, info, instrument, trace, warn, Span};

+/// Output of block or payload validation.
+pub type ValidationOutcome<N, E = InsertPayloadError<BlockTy<N>>> =
+    Result<(ExecutedBlock<N>, Option<Box<ExecutionTimingStats>>), E>;
+
 /// Handle to a [`HashedPostState`] computed on a background thread.
 type LazyHashedPostState = reth_tasks::LazyHandle<HashedPostState>;

+/// Result type for block validation with optional timing stats.
+type InsertPayloadResult<N> = Result<
+    (ExecutedBlock<N>, Option<Box<ExecutionTimingStats>>),
+    InsertPayloadError<<N as NodePrimitives>::Block>,
+>;
+
 /// Context providing access to tree state during validation.
 ///
 /// This context is provided to the [`EngineValidator`] and includes the state of the tree's
@@ -89,7 +102,9 @@ impl<'a, N: NodePrimitives> TreeCtx<'a, N> {
    ) -> Self {
        Self { state, canonical_in_memory_state }
    }
+}

+impl<'a, N: NodePrimitives> TreeCtx<'a, N> {
    /// Returns a reference to the engine tree state
    pub const fn state(&self) -> &EngineApiTreeState<N> {
        &*self.state
@@ -175,16 +190,13 @@ where
        validator: V,
        config: TreeConfig,
        invalid_block_hook: Box<dyn InvalidBlockHook<N>>,
+        shared_caches: EngineSharedCaches<Evm>,
        changeset_cache: ChangesetCache,
        runtime: reth_tasks::Runtime,
    ) -> Self {
-        let precompile_cache_map = PrecompileCacheMap::default();
-        let payload_processor = PayloadProcessor::new(
-            runtime.clone(),
-            evm_config.clone(),
-            &config,
-            precompile_cache_map.clone(),
-        );
+        let precompile_cache_map = shared_caches.precompile_cache_map();
+        let payload_processor =
+            PayloadProcessor::new(runtime.clone(), evm_config.clone(), &config, shared_caches);
        Self {
            provider,
            consensus,
@@ -280,7 +292,7 @@ where
        input: BlockOrPayload<T>,
        execution_err: InsertBlockErrorKind,
        parent_block: &SealedHeader<N::BlockHeader>,
-    ) -> Result<ExecutedBlock<N>, InsertPayloadError<N::Block>>
+    ) -> InsertPayloadResult<N>
    where
        V: PayloadValidator<T, Block = N::Block>,
    {
@@ -298,7 +310,7 @@ where
        // Validate block consensus rules which includes header validation
        if let Err(consensus_err) = self.validate_block_inner(&block, None) {
            // Header validation error takes precedence over execution error
-            return Err(InsertBlockError::new(block, consensus_err.into()).into())
+            return Err(InsertBlockError::new(block, consensus_err.into()).into());
        }

        // Also validate against the parent
@@ -306,7 +318,7 @@ where
            self.consensus.validate_header_against_parent(block.sealed_header(), parent_block)
        {
            // Parent validation error takes precedence over execution error
-            return Err(InsertBlockError::new(block, consensus_err.into()).into())
+            return Err(InsertBlockError::new(block, consensus_err.into()).into());
        }

        // No header validation errors, return the original execution error
@@ -333,7 +345,7 @@ where
        &mut self,
        input: BlockOrPayload<T>,
        mut ctx: TreeCtx<'_, N>,
-    ) -> ValidationOutcome<N, InsertPayloadError<N::Block>>
+    ) -> InsertPayloadResult<N>
    where
        V: PayloadValidator<T, Block = N::Block> + Clone,
        Evm: ConfigureEngineEvm<T::ExecutionData, Primitives = N>,
@@ -381,7 +393,7 @@ where
                    Ok(val) => val,
                    Err(e) => {
                        let block = convert_to_block(input)?;
-                        return Err(InsertBlockError::new(block, e.into()).into())
+                        return Err(InsertBlockError::new(block, e.into()).into());
                    }
                }
            };
@@ -414,7 +426,7 @@ where
                convert_to_block(input)?,
                ProviderError::HeaderNotFound(parent_hash.into()).into(),
            )
-            .into())
+            .into());
        };
        let mut state_provider = ensure_ok!(provider_builder.build());
        drop(_enter);
@@ -427,7 +439,7 @@ where
                convert_to_block(input)?,
                ProviderError::HeaderNotFound(parent_hash.into()).into(),
            )
-            .into())
+            .into());
        };

        let evm_env = debug_span!(target: "engine::tree::payload_validator", "evm_env")
@@ -485,25 +497,39 @@ where
            block_access_list,
        ));

+        // Collect cache stats only when slow-block logging is enabled.
+        let slow_block_enabled = self.config.slow_block_threshold().is_some();
+        let cache_stats = slow_block_enabled.then(|| Arc::new(CacheStats::default()));
+
        // Use cached state provider before executing, used in execution after prewarming threads
        // complete
        if let Some((caches, cache_metrics)) = handle.caches().zip(handle.cache_metrics()) {
-            state_provider =
-                Box::new(CachedStateProvider::new(state_provider, caches, cache_metrics));
+            state_provider = Box::new(
+                CachedStateProvider::new(state_provider, caches, cache_metrics)
+                    .with_cache_stats(cache_stats.clone()),
+            );
        };

-        if self.config.state_provider_metrics() {
-            state_provider = Box::new(InstrumentedStateProvider::new(state_provider, "engine"));
-        }
+        let state_provider_stats = if slow_block_enabled || self.config.state_provider_metrics() {
+            let instrumented_state_provider =
+                InstrumentedStateProvider::new(state_provider, "engine");
+            let stats = slow_block_enabled.then(|| instrumented_state_provider.stats());
+            state_provider = Box::new(instrumented_state_provider);
+            stats
+        } else {
+            None
+        };

        // Execute the block and handle any execution errors.
-        // The post-exec handle manages receipt root computation in a background worker,
-        // receiving receipts incrementally as transactions complete.
-        let (output, senders, mut post_exec) =
+        // The receipt root task is spawned before execution and receives receipts incrementally
+        // as transactions complete, allowing parallel computation during execution.
+        let execute_block_start = Instant::now();
+        let (output, senders, receipt_root_rx) =
            match self.execute_block(state_provider, env, &input, &mut handle) {
                Ok(output) => output,
                Err(err) => return self.handle_execution_error(input, err, &parent_block),
            };
+        let execution_duration = execute_block_start.elapsed();

        // After executing the block we can stop prewarming transactions
        handle.stop_prewarming_execution();
@@ -517,40 +543,59 @@ where
        // needed. This frees up resources while state root computation continues.
        let valid_block_tx = handle.terminate_caching(Some(output.clone()));

-        // Spawn hashed post state and (for payloads) transaction root on separate threads,
-        // in parallel with receipt-root finalization. Dropping the channel closes the receipt
-        // stream.
+        // Spawn hashed post state computation in background so it runs concurrently with
+        // block conversion and receipt root computation. This is a pure CPU-bound task
+        // (keccak256 hashing of all changed addresses and storage slots).
        let hashed_state_output = output.clone();
        let hashed_state_provider = self.provider.clone();
-        let block = convert_to_block(input)?;
-        let tx_root_fn = is_payload.then(|| {
-            let block = block.clone();
-            let parent_span = Span::current();
-            let num_hash = block.num_hash();
-            move || {
-                let _span =
-                    debug_span!(target: "engine::tree::payload_validator", parent: parent_span, "payload_tx_root", block = ?num_hash)
-                        .entered();
-                block.body().calculate_tx_root()
-            }
-        });
-        post_exec.finish(
-            move || {
+        let hashed_state: LazyHashedPostState =
+            self.payload_processor.executor().spawn_blocking_named("hash-post-state", move || {
                let _span = debug_span!(
                    target: "engine::tree::payload_validator",
                    "hashed_post_state",
                )
                .entered();
                hashed_state_provider.hashed_post_state(&hashed_state_output.state)
-            },
-            tx_root_fn,
-        );
+            });

+        let block = convert_to_block(input)?;
+        let transaction_root = is_payload.then(|| {
+            let block = block.clone();
+            let parent_span = Span::current();
+            let num_hash = block.num_hash();
+            self.payload_processor.executor().spawn_blocking_named("payload-tx-root", move || {
+                let _span =
+                    debug_span!(target: "engine::tree::payload_validator", parent: parent_span, "payload_tx_root", block = ?num_hash)
+                        .entered();
+                block.body().calculate_tx_root()
+            })
+        });
        let block = block.with_senders(senders);

-        let receipt_root_bloom = post_exec.receipt_root_bloom();
-        let transaction_root = post_exec.transaction_root();
-        let hashed_state: LazyHashedPostState = post_exec.into_lazy_hashed_state();
+        // Wait for the receipt root computation to complete.
+        let receipt_root_bloom = {
+            let _enter = debug_span!(
+                target: "engine::tree::payload_validator",
+                "wait_receipt_root",
+            )
+            .entered();
+
+            receipt_root_rx
+                .blocking_recv()
+                .inspect_err(|_| {
+                    tracing::error!(
+                        target: "engine::tree::payload_validator",
+                        "Receipt root task dropped sender without result, receipt root calculation likely aborted"
+                    );
+                })
+                .ok()
+        };
+        let transaction_root = transaction_root.map(|handle| {
+            let _span =
+                debug_span!(target: "engine::tree::payload_validator", "wait_payload_tx_root")
+                    .entered();
+            handle.try_into_inner().expect("sole handle")
+        });

        let hashed_state = ensure_ok_post_block!(
            self.validate_post_execution(
@@ -714,9 +759,20 @@ where
                )
                .into(),
            )
-            .into())
+            .into());
        }

+        let timing_stats = state_provider_stats.map(|stats| {
+            self.calculate_timing_stats(
+                &block,
+                stats,
+                cache_stats,
+                &output,
+                execution_duration,
+                root_elapsed,
+            )
+        });
+
        if let Some(valid_block_tx) = valid_block_tx {
            let _ = valid_block_tx.send(());
        }
@@ -728,14 +784,15 @@ where
        let changeset_provider =
            ensure_ok_post_block!(overlay_factory.database_provider_ro(), block);

-        Ok(self.spawn_deferred_trie_task(
+        let executed_block = self.spawn_deferred_trie_task(
            block,
            output,
            &ctx,
            hashed_state,
            trie_output,
            changeset_provider,
-        ))
+        );
+        Ok((executed_block, timing_stats))
    }

    /// Return sealed block header from database or in-memory state by hash.
@@ -764,14 +821,14 @@ where
    ) -> Result<(), ConsensusError> {
        if let Err(e) = self.consensus.validate_header(block.sealed_header()) {
            error!(target: "engine::tree::payload_validator", ?block, "Failed to validate header {}: {e}", block.hash());
-            return Err(e)
+            return Err(e);
        }

        if let Err(e) =
            self.consensus.validate_block_pre_execution_with_tx_root(block, transaction_root)
        {
            error!(target: "engine::tree::payload_validator", ?block, "Failed to validate block {}: {e}", block.hash());
-            return Err(e)
+            return Err(e);
        }

        Ok(())
@@ -793,7 +850,11 @@ where
        input: &BlockOrPayload<T>,
        handle: &mut PayloadHandle<impl ExecutableTxFor<Evm>, Err, N::Receipt>,
    ) -> Result<
-        (BlockExecutionOutput<N::Receipt>, Vec<Address>, PostExecHandle<N::Receipt>),
+        (
+            BlockExecutionOutput<N::Receipt>,
+            Vec<Address>,
+            tokio::sync::oneshot::Receiver<(B256, alloy_primitives::Bloom)>,
+        ),
        InsertBlockErrorKind,
    >
    where
@@ -842,10 +903,15 @@ where
            );
        }

-        // Create a unified post-exec handle that manages receipt root, hashed post state,
-        // and transaction root computation in parallel background tasks.
+        // Spawn background task to compute receipt root and logs bloom incrementally.
+        // An unbounded channel suffices: capacity is bounded by the tx count (max ~30k per block).
        let receipts_len = input.transaction_count();
-        let post_exec = self.payload_processor.post_exec_handle(receipts_len);
+        let (receipt_tx, receipt_rx) = crossbeam_channel::unbounded();
+        let (result_tx, result_rx) = tokio::sync::oneshot::channel();
+        let task_handle = ReceiptRootTaskHandle::new(receipt_rx, result_tx);
+        self.payload_processor
+            .executor()
+            .spawn_blocking_named("receipt-root", move || task_handle.run(receipts_len));

        let transaction_count = input.transaction_count();
        let executed_tx_index = Arc::clone(handle.executed_tx_index());
@@ -860,9 +926,10 @@ where
            executor,
            transaction_count,
            handle.iter_transactions(),
-            &post_exec,
+            &receipt_tx,
            &executed_tx_index,
        )?;
+        drop(receipt_tx);

        // Finish execution and get the result
        let post_exec_start = Instant::now();
@@ -880,12 +947,12 @@ where
        let execution_duration = execution_start.elapsed();
        self.metrics.record_block_execution(&output, execution_duration);
        self.metrics.record_block_execution_gas_bucket(output.result.gas_used, execution_duration);
-
        debug!(target: "engine::tree::payload_validator", elapsed = ?execution_duration, "Executed block");
-        Ok((output, senders, post_exec))
+
+        Ok((output, senders, result_rx))
    }

-    /// Executes transactions and collects senders, streaming receipts to the post-exec worker.
+    /// Executes transactions and collects senders, streaming receipts to a background task.
    ///
    /// This method handles:
    /// - Applying pre-execution changes (e.g., beacon root updates)
@@ -899,7 +966,7 @@ where
        mut executor: E,
        transaction_count: usize,
        transactions: impl Iterator<Item = Result<Tx, Err>>,
-        post_exec: &PostExecHandle<N::Receipt>,
+        receipt_tx: &crossbeam_channel::Sender<IndexedReceipt<N::Receipt>>,
        executed_tx_index: &AtomicUsize,
    ) -> Result<(E, Vec<Address>), BlockExecutionError>
    where
@@ -939,6 +1006,7 @@ where
            let _enter = debug_span!(
                target: "engine::tree",
                "execute tx",
+                tx_index = senders.len() - 1,
            )
            .entered();
            trace!(target: "engine::tree", "Executing transaction");
@@ -953,11 +1021,10 @@ where
            let current_len = executor.receipts().len();
            if current_len > last_sent_len {
                last_sent_len = current_len;
-                // Stream the latest receipt to the post-exec worker for incremental root
-                // computation.
+                // Send the latest receipt to the background task for incremental root computation.
                if let Some(receipt) = executor.receipts().last() {
                    let tx_index = current_len - 1;
-                    post_exec.push_receipt(tx_index, receipt.clone());
+                    let _ = receipt_tx.send(IndexedReceipt::new(tx_index, receipt.clone()));
                }
            }
        }
@@ -1253,7 +1320,7 @@ where
        trace!(target: "engine::tree::payload_validator", block=?block.num_hash(), "Validating block consensus");
        // validate block consensus rules
        if let Err(e) = self.validate_block_inner(block, transaction_root) {
-            return Err(e.into())
+            return Err(e.into());
        }

        // now validate against the parent
@@ -1262,7 +1329,7 @@ where
            self.consensus.validate_header_against_parent(block.sealed_header(), parent_block)
        {
            warn!(target: "engine::tree::payload_validator", ?block, "Failed to validate header {} against parent: {e}", block.hash());
-            return Err(e.into())
+            return Err(e.into());
        }
        drop(_enter);

@@ -1275,7 +1342,7 @@ where
        {
            // call post-block hook
            self.on_invalid_block(parent_block, block, output, None, ctx.state_mut());
-            return Err(err.into())
+            return Err(err.into());
        }
        drop(_enter);

@@ -1291,7 +1358,7 @@ where
        {
            // call post-block hook
            self.on_invalid_block(parent_block, block, output, None, ctx.state_mut());
-            return Err(err.into())
+            return Err(err.into());
        }

        // record post-execution validation duration
@@ -1399,7 +1466,7 @@ where
                self.provider.clone(),
                historical,
                Some(blocks),
-            )))
+            )));
        }

        // Check if the block is persisted
@@ -1407,7 +1474,7 @@ where
            debug!(target: "engine::tree::payload_validator", %hash, number = %header.number(), "found canonical state for block in database, creating provider builder");
            // For persisted blocks, we create a builder that will fetch state directly from the
            // database
-            return Ok(Some(StateProviderBuilder::new(self.provider.clone(), hash, None)))
+            return Ok(Some(StateProviderBuilder::new(self.provider.clone(), hash, None)));
        }

        debug!(target: "engine::tree::payload_validator", %hash, "no canonical state found for block");
@@ -1439,7 +1506,7 @@ where
    ) {
        if state.invalid_headers.get(&block.hash()).is_some() {
            // we already marked this block as invalid
-            return
+            return;
        }
        self.invalid_block_hook.on_invalid_block(parent_header, block, output, trie_updates);
    }
@@ -1630,10 +1697,142 @@ where
            deferred_trie_data,
        )
    }
-}

-/// Output of block or payload validation.
-pub type ValidationOutcome<N, E = InsertPayloadError<BlockTy<N>>> = Result<ExecutedBlock<N>, E>;
+    fn calculate_timing_stats(
+        &self,
+        block: &RecoveredBlock<N::Block>,
+        provider_stats: Arc<StateProviderStats>,
+        cache_stats: Option<Arc<CacheStats>>,
+        output: &BlockExecutionOutput<N::Receipt>,
+        execution_duration: Duration,
+        state_hash_duration: Duration,
+    ) -> Box<ExecutionTimingStats> {
+        let accounts_read = provider_stats.total_account_fetches();
+        let storage_read = provider_stats.total_storage_fetches();
+        let code_read = provider_stats.total_code_fetches();
+        let code_bytes_read = provider_stats.total_code_fetched_bytes();
+
+        // Write stats from BundleState (final state changes)
+        let accounts_changed = output.state.state.len();
+        let accounts_deleted =
+            output.state.state.values().filter(|acc| acc.was_destroyed()).count();
+        let storage_slots_changed =
+            output.state.state.values().map(|account| account.storage.len()).sum::<usize>();
+        let storage_slots_deleted = output
+            .state
+            .state
+            .values()
+            .flat_map(|account| account.storage.values())
+            .filter(|slot| {
+                slot.present_value.is_zero() && !slot.previous_or_original_value.is_zero()
+            })
+            .count();
+
+        // Helper: check if account represents a new contract deployment
+        let is_new_deployment = |acc: &BundleAccount| -> bool {
+            let has_code_now = acc.info.as_ref().is_some_and(|info| info.code_hash != KECCAK_EMPTY);
+            let had_no_code_before = acc
+                .original_info
+                .as_ref()
+                .map(|info| info.code_hash == KECCAK_EMPTY)
+                .unwrap_or(true);
+            has_code_now && had_no_code_before
+        };
+
+        let bytecodes_changed =
+            output.state.state.values().filter(|acc| is_new_deployment(acc)).count();
+
+        // Unique new code hashes to count actual bytes persisted (deduplicated)
+        let unique_new_code_hashes: B256Set = output
+            .state
+            .state
+            .values()
+            .filter(|acc| is_new_deployment(acc))
+            .filter_map(|acc| acc.info.as_ref().map(|info| info.code_hash))
+            .collect();
+        let code_bytes_written: usize = unique_new_code_hashes
+            .iter()
+            .filter_map(|hash| {
+                output.state.contracts.get(hash).map(|bytecode| bytecode.original_bytes().len())
+            })
+            .sum();
+
+        // Total time spent fetching state during execution
+        let state_read_duration = provider_stats.total_account_fetch_latency() +
+            provider_stats.total_storage_fetch_latency() +
+            provider_stats.total_code_fetch_latency();
+
+        // EIP-7702 delegation tracking, derived from bytecode changes.
+        // Delegations set: number of EIP-7702 bytecodes in the bundle state's contracts.
+        let eip7702_delegations_set =
+            output.state.contracts.values().filter(|bytecode| bytecode.is_eip7702()).count();
+        // Delegations cleared: accounts whose bytecode changed from EIP-7702 to empty,
+        // i.e. an EIP-7702 delegation was removed by setting the code to empty.
+        // Note: clearing a delegation does NOT destroy the account; it only empties the
+        // bytecode.
+        let eip7702_delegations_cleared = output
+            .state
+            .state
+            .values()
+            .filter(|acc| {
+                // Check if original bytecode was EIP-7702
+                let original_was_eip7702 = acc
+                    .original_info
+                    .as_ref()
+                    .and_then(|info| info.code.as_ref())
+                    .map(|bytecode| bytecode.is_eip7702())
+                    .unwrap_or(false);
+
+                // Check if current code is empty (delegation cleared)
+                let code_now_empty =
+                    acc.info.as_ref().map(|info| info.code_hash == KECCAK_EMPTY).unwrap_or(false);
+
+                original_was_eip7702 && code_now_empty
+            })
+            .count();
+
+        // Cache hit/miss counters for slow-block logging (all zero when cache stats are absent).
+        let (account_cache_hits, account_cache_misses) = cache_stats
+            .as_ref()
+            .map(|s| (s.account_hits(), s.account_misses()))
+            .unwrap_or_default();
+        let (storage_cache_hits, storage_cache_misses) = cache_stats
+            .as_ref()
+            .map(|s| (s.storage_hits(), s.storage_misses()))
+            .unwrap_or_default();
+        let (code_cache_hits, code_cache_misses) =
+            cache_stats.as_ref().map(|s| (s.code_hits(), s.code_misses())).unwrap_or_default();
+
+        // Build execution timing stats for slow-block logging.
+        Box::new(ExecutionTimingStats {
+            block_number: block.number(),
+            block_hash: block.hash(),
+            gas_used: output.result.gas_used,
+            tx_count: block.transaction_count(),
+            execution_duration,
+            state_read_duration,
+            state_hash_duration,
+            accounts_read,
+            storage_read,
+            code_read,
+            code_bytes_read,
+            accounts_changed,
+            accounts_deleted,
+            storage_slots_changed,
+            storage_slots_deleted,
+            bytecodes_changed,
+            code_bytes_written,
+            eip7702_delegations_set,
+            eip7702_delegations_cleared,
+            account_cache_hits,
+            account_cache_misses,
+            storage_cache_hits,
+            storage_cache_misses,
+            code_cache_hits,
+            code_cache_misses,
+        })
+    }
+}

 /// Strategy describing how to compute the state root.
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
--- a/crates/engine/tree/src/tree/persistence_state.rs
+++ b/crates/engine/tree/src/tree/persistence_state.rs
@@ -20,6 +20,7 @@
 //! The [`PersistenceState`] tracks ongoing persistence operations and coordinates
 //! between the main execution thread and background persistence workers.

+use crate::persistence::PersistenceResult;
 use alloy_eips::BlockNumHash;
 use alloy_primitives::B256;
 use crossbeam_channel::Receiver as CrossbeamReceiver;
@@ -36,7 +37,7 @@ pub struct PersistenceState {
    /// Receiver end of channel where the result of the persistence task will be
    /// sent when done. A None value means there's no persistence task in progress.
    pub(crate) rx:
-        Option<(CrossbeamReceiver<Option<BlockNumHash>>, Instant, CurrentPersistenceAction)>,
+        Option<(CrossbeamReceiver<PersistenceResult>, Instant, CurrentPersistenceAction)>,
 }

 impl PersistenceState {
@@ -50,7 +51,7 @@ impl PersistenceState {
    pub(crate) fn start_remove(
        &mut self,
        new_tip_num: u64,
-        rx: CrossbeamReceiver<Option<BlockNumHash>>,
+        rx: CrossbeamReceiver<PersistenceResult>,
    ) {
        self.rx =
            Some((rx, Instant::now(), CurrentPersistenceAction::RemovingBlocks { new_tip_num }));
@@ -60,7 +61,7 @@ impl PersistenceState {
    pub(crate) fn start_save(
        &mut self,
        highest: BlockNumHash,
-        rx: CrossbeamReceiver<Option<BlockNumHash>>,
+        rx: CrossbeamReceiver<PersistenceResult>,
    ) {
        self.rx = Some((rx, Instant::now(), CurrentPersistenceAction::SavingBlocks { highest }));
    }
--- a/crates/engine/tree/src/tree/precompile_cache.rs
+++ b/crates/engine/tree/src/tree/precompile_cache.rs
@@ -169,11 +169,11 @@ where
    }

    fn call(&self, input: PrecompileInput<'_>) -> PrecompileResult {
-        if let Some(entry) = &self.cache.get(input.data, self.spec_id.clone()) {
+        if let Some(entry) = &self.cache.get(input.data, self.spec_id.clone()) &&
+            input.gas >= entry.gas_used()
+        {
            self.increment_by_one_precompile_cache_hits();
-            if input.gas >= entry.gas_used() {
-                return entry.to_precompile_result()
-            }
+            return entry.to_precompile_result()
        }

        let calldata = input.data;
--- a/crates/engine/tree/src/tree/tests.rs
+++ b/crates/engine/tree/src/tree/tests.rs
@@ -36,6 +36,7 @@ use std::{
        mpsc::{Receiver, Sender},
        Arc,
    },
+    time::Duration,
 };
 use tokio::sync::oneshot;

@@ -202,6 +203,7 @@ impl TestHarness {
            payload_validator,
            TreeConfig::default(),
            Box::new(NoopInvalidBlockHook::default()),
+            EngineSharedCaches::default(),
            changeset_cache.clone(),
            reth_tasks::Runtime::test(),
        );
@@ -406,6 +408,7 @@ impl ValidatorTestHarness {
            payload_validator,
            TreeConfig::default(),
            Box::new(NoopInvalidBlockHook::default()),
+            EngineSharedCaches::default(),
            changeset_cache,
            reth_tasks::Runtime::test(),
        );
@@ -415,14 +418,13 @@ impl ValidatorTestHarness {

    /// Configure `PersistenceState` for specific persistence scenarios
    fn start_persistence_operation(&mut self, action: CurrentPersistenceAction) {
-        // Create a dummy receiver for testing - it will never receive a value
-        let (_tx, rx) = crossbeam_channel::bounded(1);
-
        match action {
            CurrentPersistenceAction::SavingBlocks { highest } => {
+                let (_tx, rx) = crossbeam_channel::bounded(1);
                self.harness.tree.persistence_state.start_save(highest, rx);
            }
            CurrentPersistenceAction::RemovingBlocks { new_tip_num } => {
+                let (_tx, rx) = crossbeam_channel::bounded(1);
                self.harness.tree.persistence_state.start_remove(new_tip_num, rx);
            }
        }
@@ -759,7 +761,12 @@ async fn test_tree_state_on_new_head_reorg() {
    assert_eq!(saved_blocks, vec![blocks[0].clone(), blocks[1].clone()]);

    // send the response so we can advance again
-    sender.send(Some(blocks[1].recovered_block().num_hash())).unwrap();
+    sender
+        .send(PersistenceResult {
+            last_block: Some(blocks[1].recovered_block().num_hash()),
+            commit_duration: Some(Duration::ZERO),
+        })
+        .unwrap();

    // we should be persisting blocks[1] because we threw out the prev action
    let current_action = test_harness.tree.persistence_state.current_action().cloned();
@@ -1582,6 +1589,39 @@ mod check_invalid_ancestors_tests {
        }
    }

+    /// Tests that `find_invalid_ancestor` detects a block that is itself in the invalid cache.
+    #[test]
+    fn test_find_invalid_ancestor_detects_block_itself() {
+        reth_tracing::init_test_tracing();
+
+        let mut test_harness = TestHarness::new(HOLESKY.clone());
+
+        // Read block 1
+        let s1 = include_str!("../../test-data/holesky/1.rlp");
+        let data1 = Bytes::from_str(s1).unwrap();
+        let block1 = Block::decode(&mut data1.as_ref()).unwrap();
+        let sealed1 = block1.seal_slow();
+        let hash1 = sealed1.hash();
+        let parent1 = sealed1.parent_hash();
+
+        // Mark block 1 itself as invalid (simulates a block that failed execution)
+        test_harness
+            .tree
+            .state
+            .invalid_headers
+            .insert(BlockWithParent { block: sealed1.num_hash(), parent: parent1 });
+
+        // Create payload for block 1 (same block, sent again by CL)
+        let payload1 = ExecutionData {
+            payload: ExecutionPayloadV1::from_block_unchecked(hash1, &sealed1.into_block()).into(),
+            sidecar: ExecutionPayloadSidecar::none(),
+        };
+
+        // find_invalid_ancestor should detect the block itself without re-execution
+        let result = test_harness.tree.find_invalid_ancestor(&payload1);
+        assert!(result.is_some(), "Should detect block itself in invalid headers cache");
+    }
+
    /// Helper function to create a malformed payload that descends from a given parent
    fn create_malformed_payload_descending_from(parent_hash: B256) -> ExecutionData {
        // Create a block with invalid hash (mismatch between computed and provided hash)
@@ -2034,7 +2074,12 @@ mod forkchoice_updated_tests {
                if let Some(last) = saved_blocks.last() {
                    last_persisted_number = last.recovered_block().number;
                }
-                sender.send(saved_blocks.last().map(|b| b.recovered_block().num_hash())).unwrap();
+                sender
+                    .send(PersistenceResult {
+                        last_block: saved_blocks.last().map(|b| b.recovered_block().num_hash()),
+                        commit_duration: Some(Duration::ZERO),
+                    })
+                    .unwrap();
            }
        }

--- a/crates/engine/tree/tests/shared_caches_sdk.rs
+++ b/crates/engine/tree/tests/shared_caches_sdk.rs
@@ -0,0 +1,27 @@
+//! SDK smoke tests for `EngineSharedCaches`.
+
+use alloy_primitives::B256;
+use reth_engine_tree::tree::{
+    EngineSharedCaches, PayloadSparseTrieKind, PayloadSparseTrieStoreOutcome,
+};
+use reth_evm_ethereum::EthEvmConfig;
+
+#[test]
+fn engine_shared_caches_exposes_public_sparse_trie_sdk() {
+    let caches =
+        EngineSharedCaches::<EthEvmConfig>::with_sparse_trie_kind(PayloadSparseTrieKind::Arena);
+
+    let _precompile_cache_map = caches.precompile_cache_map();
+
+    let sparse_trie_cache = caches.sparse_trie_cache();
+    assert_eq!(sparse_trie_cache.kind(), PayloadSparseTrieKind::Arena);
+    let state_root = B256::with_last_byte(1);
+
+    assert_eq!(
+        sparse_trie_cache.take_or_create_for(state_root).store_anchored(state_root),
+        PayloadSparseTrieStoreOutcome::Stored
+    );
+
+    let checkout = sparse_trie_cache.take_or_create_for(state_root);
+    assert!(checkout.memory_size() > 0 || checkout.retained_storage_tries_count() == 0);
+}
--- a/crates/era-downloader/src/client.rs
+++ b/crates/era-downloader/src/client.rs
@@ -4,6 +4,7 @@ use eyre::{eyre, OptionExt};
 use futures_util::{stream::StreamExt, Stream, TryStreamExt};
 use reqwest::{Client, IntoUrl, Url};
 use reth_era::common::file_ops::EraFileType;
+use reth_fs_util::FsPathError;
 use sha2::{Digest, Sha256};
 use std::{future::Future, path::Path, str::FromStr};
 use tokio::{
@@ -136,7 +137,7 @@ impl<Http: HttpClient + Clone> EraClient<Http> {
                    let Some(number) = self.file_name_to_number(name) &&
                    (number < index || number >= last)
                {
-                    reth_fs_util::remove_file(entry.path())?;
+                    remove_file_ignore_not_found(entry.path())?;
                }
            }
        }
@@ -321,6 +322,16 @@ impl<Http: HttpClient + Clone> EraClient<Http> {
    }
 }

+fn remove_file_ignore_not_found(path: impl AsRef<Path>) -> eyre::Result<()> {
+    match reth_fs_util::remove_file(path) {
+        Ok(()) => Ok(()),
+        Err(FsPathError::RemoveFile { source, .. }) if source.kind() == io::ErrorKind::NotFound => {
+            Ok(())
+        }
+        Err(err) => Err(err.into()),
+    }
+}
+
 async fn checksum(mut reader: impl AsyncRead + Unpin) -> eyre::Result<Vec<u8>> {
    let mut hasher = Sha256::new();

@@ -367,4 +378,25 @@ mod tests {

        assert_eq!(actual_number, expected_number);
    }
+
+    #[test]
+    fn test_remove_file_ignore_not_found() {
+        let temp_dir = tempfile::tempdir().unwrap();
+        let path = temp_dir.path().join("missing.era1");
+
+        assert!(remove_file_ignore_not_found(&path).is_ok());
+    }
+
+    #[test]
+    fn test_remove_file_ignore_not_found_preserves_other_errors() {
+        let temp_dir = tempfile::tempdir().unwrap();
+        let path = temp_dir.path().join("dir");
+        std::fs::create_dir_all(&path).unwrap();
+
+        let err = remove_file_ignore_not_found(&path).unwrap_err();
+        assert!(matches!(
+            err.downcast_ref::<FsPathError>(),
+            Some(FsPathError::RemoveFile { source, .. }) if source.kind() != io::ErrorKind::NotFound
+        ));
+    }
 }
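
Note on the hunk above: `remove_file_ignore_not_found` treats a missing file as success and passes every other removal error through. A minimal standard-library sketch of the same pattern follows. The name `remove_file_if_exists` is hypothetical, and the sketch uses `std::fs` and `std::io` directly rather than `reth_fs_util` and `FsPathError`, so it illustrates the idea and is not the crate's code.

```rust
use std::{fs, io, path::Path};

/// Removes `path`, treating "file does not exist" as success.
///
/// Hypothetical helper for illustration; it mirrors the shape of the diff's
/// `remove_file_ignore_not_found` but returns a plain `io::Result`.
fn remove_file_if_exists(path: impl AsRef<Path>) -> io::Result<()> {
    match fs::remove_file(path) {
        Ok(()) => Ok(()),
        // Already gone (for example, removed by a concurrent cleanup): fine.
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        // Anything else (permissions, path is a directory, ...) is still an error.
        Err(err) => Err(err),
    }
}
```

Matching on the error kind after the call, rather than checking for existence first, avoids a race between the check and the removal.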
--- a/crates/ethereum/cli/Cargo.toml
+++ b/crates/ethereum/cli/Cargo.toml
@@ -36,8 +36,6 @@ tracing.workspace = true
 tempfile.workspace = true

 [features]
-default = []
-
 otlp = ["reth-tracing/otlp", "reth-node-core/otlp"]
 otlp-logs = ["reth-tracing/otlp-logs", "reth-node-core/otlp-logs"]

@@ -89,6 +87,3 @@ min-trace-logs = [
    "tracing/release_max_level_trace",
    "reth-node-core/min-trace-logs",
 ]
-
-rocksdb = ["reth-cli-commands/rocksdb"]
-edge = ["rocksdb"]
--- a/crates/ethereum/cli/src/interface.rs
+++ b/crates/ethereum/cli/src/interface.rs
@@ -21,7 +21,7 @@ use reth_node_core::{
    args::{LogArgs, OtlpInitStatus, OtlpLogsStatus, TraceArgs},
    version::version_metadata,
 };
-use reth_rpc_server_types::{DefaultRpcModuleValidator, RpcModuleValidator};
+use reth_rpc_server_types::{DefaultRpcModuleValidator, RethRpcModule, RpcModuleValidator};
 use reth_tracing::{FileWorkerGuard, Layers};
 use std::{ffi::OsString, fmt, future::Future, marker::PhantomData, sync::Arc};
 use tracing::{info, warn};
@@ -223,7 +223,9 @@ impl<
        let otlp_status = runner.block_on(self.traces.init_otlp_tracing(&mut layers))?;
        let otlp_logs_status = runner.block_on(self.traces.init_otlp_logs(&mut layers))?;

-        let guard = self.logs.init_tracing_with_layers(layers)?;
+        // Enable reload support if debug RPC namespace is available
+        let enable_reload = self.command.debug_namespace_enabled();
+        let file_guard = self.logs.init_tracing_with_layers(layers, enable_reload)?;
        info!(target: "reth::cli", "Initialized tracing, debug log directory: {}", self.logs.log_file_directory);

        match otlp_status {
@@ -246,7 +248,7 @@ impl<
            OtlpLogsStatus::Disabled => {}
        }

-        Ok(guard)
+        Ok(file_guard)
    }
 }

@@ -349,6 +351,16 @@ impl<C: ChainSpecParser, Ext: clap::Args + fmt::Debug, SubCmd: Subcommand + fmt:
            Self::Ext(_) => None,
        }
    }
+
+    /// Returns `true` if this is a node command with debug RPC namespace enabled.
+    ///
+    /// This is used to determine whether to enable runtime log level changes.
+    pub fn debug_namespace_enabled(&self) -> bool {
+        match self {
+            Self::Node(cmd) => cmd.rpc.is_namespace_enabled(RethRpcModule::Debug),
+            _ => false,
+        }
+    }
 }

 #[cfg(test)]
--- a/crates/ethereum/node/tests/e2e/eth.rs
+++ b/crates/ethereum/node/tests/e2e/eth.rs
@@ -273,7 +273,7 @@ async fn test_sparse_trie_reuse_across_blocks() -> eyre::Result<()> {
    let tree_config = TreeConfig::default()
        .with_legacy_state_root(false)
        .with_sparse_trie_prune_depth(2)
-        .with_sparse_trie_max_storage_tries(100);
+        .with_sparse_trie_max_hot_slots(100);

    let (mut nodes, _wallet) = setup_engine::<EthereumNode>(
        1,
--- a/crates/ethereum/node/tests/e2e/simulate.rs
+++ b/crates/ethereum/node/tests/e2e/simulate.rs
@@ -78,3 +78,43 @@ async fn test_simulate_v1_with_max_fee_per_blob_gas_only() -> eyre::Result<()> {

    Ok(())
 }
+
+#[tokio::test]
+async fn test_simulate_v1_too_many_blocks_error() -> eyre::Result<()> {
+    reth_tracing::init_test_tracing();
+
+    let chain_spec = Arc::new(
+        ChainSpecBuilder::default()
+            .chain(MAINNET.chain)
+            .genesis(serde_json::from_str(include_str!("../assets/genesis.json")).unwrap())
+            .cancun_activated()
+            .build(),
+    );
+
+    let (mut nodes, wallet) = setup_engine::<EthereumNode>(
+        1,
+        chain_spec,
+        false,
+        Default::default(),
+        eth_payload_attributes,
+    )
+    .await?;
+    let node = nodes.pop().unwrap();
+    let provider = ProviderBuilder::new()
+        .wallet(EthereumWallet::new(wallet.wallet_gen().swap_remove(0)))
+        .connect_http(node.rpc_url());
+
+    let payload: SimulatePayload<TransactionRequest> =
+        (0..257).fold(SimulatePayload::default(), |payload, _| payload.extend(SimBlock::default()));
+
+    let err = provider
+        .raw_request::<_, Vec<SimulatedBlock>>("eth_simulateV1".into(), (&payload, "latest"))
+        .await
+        .unwrap_err();
+    let err = err.as_error_resp().expect("expected JSON-RPC error response");
+
+    assert_eq!(err.code, -38026);
+    assert_eq!(err.message, "too many blocks");
+
+    Ok(())
+}
--- a/crates/ethereum/payload/src/lib.rs
+++ b/crates/ethereum/payload/src/lib.rs
@@ -367,7 +367,7 @@ where
        .is_prague_active_at_timestamp(attributes.timestamp)
        .then_some(execution_result.requests);

-    let sealed_block = Arc::new(block.sealed_block().clone());
+    let sealed_block = Arc::new(block.into_sealed_block());
    debug!(target: "payload_builder", id=%attributes.id, sealed_block_header = ?sealed_block.sealed_header(), "sealed built block");

    if is_osaka && sealed_block.rlp_length() > MAX_RLP_BLOCK_SIZE {
--- a/crates/ethereum/primitives/Cargo.toml
+++ b/crates/ethereum/primitives/Cargo.toml
@@ -85,4 +85,7 @@ serde = [
    "alloy-rpc-types-eth?/serde",
    "rand/serde",
 ]
-rpc = ["dep:alloy-rpc-types-eth"]
+rpc = [
+    "dep:alloy-rpc-types-eth",
+    "alloy-rpc-types-eth?/serde",
+]
--- a/crates/exex/exex/Cargo.toml
+++ b/crates/exex/exex/Cargo.toml
@@ -66,8 +66,6 @@ secp256k1.workspace = true
 tempfile.workspace = true

 [features]
-default = []
-edge = ["reth-provider/edge"]
 serde = [
    "reth-exex-types/serde",
    "reth-revm/serde",
--- a/crates/net/downloaders/src/file_client.rs
+++ b/crates/net/downloaders/src/file_client.rs
@@ -3,7 +3,7 @@ use alloy_eips::BlockHashOrNumber;
 use alloy_primitives::{BlockHash, BlockNumber, Sealable, B256};
 use async_compression::tokio::bufread::GzipDecoder;
 use futures::Future;
-use itertools::Either;
+use itertools::{Either, Itertools};
 use reth_consensus::{Consensus, ConsensusError};
 use reth_network_p2p::{
    bodies::client::{BodiesClient, BodiesFut},
@@ -163,17 +163,9 @@ impl<B: FullBlock> FileClient<B> {
        if self.headers.is_empty() {
            return true
        }
-        let mut nums = self.headers.keys().copied().collect::<Vec<_>>();
-        nums.sort_unstable();
-        let mut iter = nums.into_iter();
-        let mut lowest = iter.next().expect("not empty");
-        for next in iter {
-            if next != lowest + 1 {
-                return false
-            }
-            lowest = next;
-        }
-        true
+        let (min, max) = self.headers.keys().minmax().into_option().expect("not empty");
+        // Contiguous range from min to max means no gaps
+        *max - *min + 1 == self.headers.len() as u64
    }

    /// Use the provided bodies as the file client's block body buffer.
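
Note on the hunk above: the new check relies on map keys being unique, so `len` distinct block numbers inside `[min, max]` leave no gaps exactly when the range holds `len` values. A small sketch of that reasoning, assuming a plain `HashMap<u64, V>` and a hypothetical free function `has_contiguous_keys` (the diff itself uses `Itertools::minmax` on the client's header map):

```rust
use std::collections::HashMap;

/// Returns `true` if the keys form a gap-free range, or if the map is empty.
///
/// Hypothetical stand-in for the check in the diff, written with std only.
fn has_contiguous_keys<V>(headers: &HashMap<u64, V>) -> bool {
    let (Some(min), Some(max)) = (headers.keys().min(), headers.keys().max()) else {
        // No keys at all: nothing can be missing.
        return true;
    };
    // Keys are distinct, so they fill [min, max] exactly when the range
    // contains `len` numbers. Comparing `max - min` with `len - 1` avoids
    // the `+ 1`, which could overflow for a range ending at `u64::MAX`.
    max - min == headers.len() as u64 - 1
}
```

Compared with the removed sort-and-scan loop, this needs no allocation and no sort; the sketch makes two passes over the keys, where `minmax` does it in one.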
--- a/crates/net/eth-wire-types/src/broadcast.rs
+++ b/crates/net/eth-wire-types/src/broadcast.rs
@@ -25,6 +25,7 @@ use reth_primitives_traits::{Block, SignedTransaction};
    RlpDecodableWrapper,
    Default,
    Deref,
+    DerefMut,
    IntoIterator,
 )]
 #[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
@@ -238,7 +239,7 @@ impl NewPooledTransactionHashes {
    /// the rest. If `len` is greater than the number of hashes, this has no effect.
    pub fn truncate(&mut self, len: usize) {
        match self {
-            Self::Eth66(msg) => msg.0.truncate(len),
+            Self::Eth66(msg) => msg.truncate(len),
            Self::Eth68(msg) => {
                msg.types.truncate(len);
                msg.sizes.truncate(len);
@@ -263,7 +264,7 @@ impl NewPooledTransactionHashes {
        }
    }

-    /// Returns an immutable reference to the inner type if this an eth68 announcement.
+    /// Returns an immutable reference to the inner type if this is an eth68 announcement.
    pub const fn as_eth68(&self) -> Option<&NewPooledTransactionHashes68> {
        match self {
            Self::Eth66(_) => None,
@@ -271,7 +272,7 @@ impl NewPooledTransactionHashes {
        }
    }

-    /// Returns a mutable reference to the inner type if this an eth68 announcement.
+    /// Returns a mutable reference to the inner type if this is an eth68 announcement.
    pub const fn as_eth68_mut(&mut self) -> Option<&mut NewPooledTransactionHashes68> {
        match self {
            Self::Eth66(_) => None,
@@ -279,7 +280,7 @@ impl NewPooledTransactionHashes {
        }
    }

-    /// Returns a mutable reference to the inner type if this an eth66 announcement.
+    /// Returns a mutable reference to the inner type if this is an eth66 announcement.
    pub const fn as_eth66_mut(&mut self) -> Option<&mut NewPooledTransactionHashes66> {
        match self {
            Self::Eth66(msg) => Some(msg),
@@ -287,7 +288,7 @@ impl NewPooledTransactionHashes {
        }
    }

-    /// Returns the inner type if this an eth68 announcement.
+    /// Returns the inner type if this is an eth68 announcement.
    pub fn take_eth68(&mut self) -> Option<NewPooledTransactionHashes68> {
        match self {
            Self::Eth66(_) => None,
@@ -295,7 +296,7 @@ impl NewPooledTransactionHashes {
        }
    }

-    /// Returns the inner type if this an eth66 announcement.
+    /// Returns the inner type if this is an eth66 announcement.
    pub fn take_eth66(&mut self) -> Option<NewPooledTransactionHashes66> {
        match self {
            Self::Eth66(msg) => Some(mem::take(msg)),
@@ -336,6 +337,7 @@ impl From<NewPooledTransactionHashes68> for NewPooledTransactionHashes {
    RlpDecodableWrapper,
    Default,
    Deref,
+    DerefMut,
    IntoIterator,
 )]
 #[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
@@ -865,8 +867,8 @@ mod tests {
        let latest = blocks.latest().unwrap();
        assert_eq!(latest.number, 0);

-        blocks.0.push(BlockHashNumber { hash: B256::random(), number: 100 });
-        blocks.0.push(BlockHashNumber { hash: B256::random(), number: 2 });
+        blocks.push(BlockHashNumber { hash: B256::random(), number: 100 });
+        blocks.push(BlockHashNumber { hash: B256::random(), number: 2 });
        let latest = blocks.latest().unwrap();
        assert_eq!(latest.number, 100);
    }
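
Note on the hunks above: deriving `DerefMut` alongside `Deref` is what lets call sites write `msg.truncate(len)` and `blocks.push(..)` instead of reaching through `.0`. The sketch below shows the mechanism with hand-written impls on a hypothetical `Hashes` newtype over `Vec<u64>`; the real types use derive macros and hold hashes, so this is an illustration only.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical newtype wrapper, standing in for the wire message types.
#[derive(Debug, Default)]
struct Hashes(Vec<u64>);

impl Deref for Hashes {
    type Target = Vec<u64>;

    // Enables read-only `Vec` methods such as `len` and `to_vec`.
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

impl DerefMut for Hashes {
    // Enables mutating `Vec` methods such as `push` and `truncate`.
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.0
    }
}
```

With only `Deref`, a call like `hashes.push(1)` would not compile, because method resolution cannot obtain a `&mut Vec<u64>` from the wrapper.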
--- a/crates/net/eth-wire/src/errors/eth.rs
+++ b/crates/net/eth-wire/src/errors/eth.rs
@@ -63,6 +63,28 @@ impl EthStreamError {
        }
    }

+    /// Returns whether this error indicates a protocol breach on the receive side.
+    ///
+    /// These are errors caused by the remote peer sending invalid or malformed data
+    /// that warrant disconnecting with [`DisconnectReason::ProtocolBreach`].
+    pub const fn is_protocol_breach(&self) -> bool {
+        matches!(
+            self,
+            Self::InvalidMessage(_) |
+                Self::MessageTooBig(_) |
+                Self::TransactionHashesInvalidLenOfFields { .. } |
+                Self::UnsupportedMessage { .. } |
+                Self::P2PStreamError(
+                    P2PStreamError::Rlp(_) |
+                        P2PStreamError::Snap(_) |
+                        P2PStreamError::MessageTooBig { .. } |
+                        P2PStreamError::UnknownReservedMessageId(_) |
+                        P2PStreamError::EmptyProtocolMessage |
+                        P2PStreamError::UnknownDisconnectReason(_)
+                )
+        )
+    }
+
    /// Returns the [`io::Error`] if it was caused by IO
    pub const fn as_io(&self) -> Option<&io::Error> {
        if let Self::P2PStreamError(P2PStreamError::Io(io)) = self {
--- a/crates/net/network-api/src/lib.rs
+++ b/crates/net/network-api/src/lib.rs
@@ -262,12 +262,12 @@ pub enum Direction {
 }

 impl Direction {
-    /// Returns `true` if this an incoming connection.
+    /// Returns `true` if this is an incoming connection.
    pub const fn is_incoming(&self) -> bool {
        matches!(self, Self::Incoming)
    }

-    /// Returns `true` if this an outgoing connection.
+    /// Returns `true` if this is an outgoing connection.
    pub const fn is_outgoing(&self) -> bool {
        matches!(self, Self::Outgoing(_))
    }
--- a/crates/net/network-types/Cargo.toml
+++ b/crates/net/network-types/Cargo.toml
@@ -21,7 +21,7 @@ alloy-eip2124.workspace = true
 # misc
 serde = { workspace = true, optional = true }
 humantime-serde = { workspace = true, optional = true }
-serde_json = { workspace = true, features = ["std"] }
+serde_json = { workspace = true, features = ["std"], optional = true }

 # misc
 tracing.workspace = true
@@ -30,6 +30,7 @@ tracing.workspace = true
 serde = [
    "dep:serde",
    "dep:humantime-serde",
+    "dep:serde_json",
    "alloy-eip2124/serde",
 ]
 test-utils = []
--- a/crates/net/network-types/src/peers/config.rs
+++ b/crates/net/network-types/src/peers/config.rs
@@ -1,15 +1,9 @@
 //! Configuration for peering.

-use std::{
-    collections::HashSet,
-    io::{self, ErrorKind},
-    path::Path,
-    time::Duration,
-};
+use std::{collections::HashSet, time::Duration};

 use reth_net_banlist::{BanList, IpFilter};
 use reth_network_peers::{NodeRecord, TrustedPeer};
-use tracing::info;

 use crate::{peers::PersistedPeerInfo, BackoffKind, ReputationChangeWeights};

@@ -311,16 +305,16 @@ impl PeersConfig {
    #[cfg(feature = "serde")]
    pub fn with_basic_nodes_from_file(
        mut self,
-        optional_file: Option<impl AsRef<Path>>,
-    ) -> Result<Self, io::Error> {
+        optional_file: Option<impl AsRef<std::path::Path>>,
+    ) -> Result<Self, std::io::Error> {
        let Some(file_path) = optional_file else { return Ok(self) };
        let raw = match std::fs::read_to_string(file_path.as_ref()) {
            Ok(contents) => contents,
-            Err(e) if e.kind() == ErrorKind::NotFound => return Ok(self),
+            Err(e) if e.kind() == std::io::ErrorKind::NotFound => return Ok(self),
            Err(e) => return Err(e),
        };

-        info!(target: "net::peers", file = %file_path.as_ref().display(), "Loading saved peers");
+        tracing::info!(target: "net::peers", file = %file_path.as_ref().display(), "Loading saved peers");

        // Try the new format first, fall back to legacy Vec<NodeRecord>
        let peers: Vec<PersistedPeerInfo> = serde_json::from_str(&raw)
@@ -330,9 +324,9 @@ impl PeersConfig {
                    nodes.into_iter().map(PersistedPeerInfo::from_node_record).collect(),
                )
            })
-            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
+            .map_err(|e| std::io::Error::new(std::io::ErrorKind::InvalidData, e))?;

-        info!(target: "net::peers", count = peers.len(), "Loaded persisted peers");
+        tracing::info!(target: "net::peers", count = peers.len(), "Loaded persisted peers");
        self.persisted_peers = peers;
        Ok(self)
    }
--- a/crates/net/network/src/manager.rs
+++ b/crates/net/network/src/manager.rs
@@ -637,7 +637,7 @@ impl<N: NetworkPrimitives> NetworkManager<N> {
            PeerMessage::NewBlockHashes(hashes) => {
                self.within_pow_or_disconnect(peer_id, |this| {
                    // update peer's state, to track what blocks this peer has seen
-                    this.swarm.state_mut().on_new_block_hashes(peer_id, hashes.0.clone());
+                    this.swarm.state_mut().on_new_block_hashes(peer_id, hashes.to_vec());
                    // start block import process for the hashes
                    this.block_import.on_new_block(peer_id, NewBlockEvent::Hashes(hashes));
                })
--- a/crates/net/network/src/message.rs
+++ b/crates/net/network/src/message.rs
@@ -65,6 +65,36 @@ pub enum PeerMessage<N: NetworkPrimitives = EthNetworkPrimitives> {
    Other(RawCapabilityMessage),
 }

+impl<N: NetworkPrimitives> PeerMessage<N> {
+    /// Returns a static string identifying the message variant for logging.
+    pub const fn message_kind(&self) -> &'static str {
+        match self {
+            Self::NewBlockHashes(_) => "NewBlockHashes",
+            Self::NewBlock(_) => "NewBlock",
+            Self::ReceivedTransaction(_) => "ReceivedTransaction",
+            Self::SendTransactions(_) => "SendTransactions",
+            Self::PooledTransactions(_) => "PooledTransactions",
+            Self::EthRequest(_) => "EthRequest",
+            Self::BlockRangeUpdated(_) => "BlockRangeUpdated",
+            Self::Other(_) => "Other",
+        }
+    }
+
+    /// Returns the number of items in the message payload, if applicable.
+    pub fn message_item_count(&self) -> usize {
+        match self {
+            Self::NewBlockHashes(msg) => msg.len(),
+            Self::ReceivedTransaction(msg) => msg.len(),
+            Self::SendTransactions(msg) => msg.len(),
+            Self::PooledTransactions(msg) => msg.len(),
+            Self::NewBlock(_) |
+            Self::EthRequest(_) |
+            Self::BlockRangeUpdated(_) |
+            Self::Other(_) => 1,
+        }
+    }
+}
+
 /// Request Variants that only target block related data.
 #[derive(Debug, Clone, PartialEq, Eq)]
 pub enum BlockRequest {