## Summary
Adds a GitHub-style copy icon next to the data source URL in both the
autoprecompile analyzer and metrics viewer. Clicking the icon copies the
data URL to the clipboard and shows a checkmark confirmation animation.
<img width="577" height="35" alt="Screenshot 2026-03-24 at 10 54 19"
src="https://github.com/user-attachments/assets/db862aa1-921e-4b18-ac1f-2f78f2d81afa"
/>
<img width="561" height="42" alt="Screenshot 2026-03-24 at 10 54 26"
src="https://github.com/user-attachments/assets/fff6d0d6-f75d-4392-9fb2-28093bf95031"
/>
## Features
- SVG clipboard icon that toggles to a checkmark on click
- "Copy data URL" tooltip with theme-appropriate styling
- 2-second confirmation period before resetting
- Dark theme (analyzer) and light theme (metrics viewer) variants
## Testing
- Test clicking the copy icon in both tools with sample data loaded
- Verify the URL is copied to clipboard
- Confirm the checkmark briefly appears with "Copied!" message
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Fixes the bug found by Claude in #3686.
The core issue is that AIR IDs are only unique for a given VM, but not
across proving phases. For example, we might have the following entries:
```
{
  "labels": [["air_name", "MemoryMerkleAir<8>"], ["air_id", "3"]],
  "metric": "constraints",
  "value": "33"
},
...
{
  "labels": [["air_name", "SymbolicExpressionAir<BabyBearParameters>"], ["air_id", "3"]],
  "metric": "constraints",
  "value": "318"
},
```
The previous code would have summed both constraint counts, because the
AIR ID is the same.
I guess the proper fix would be to also include a "group" label here,
but that requires changing `stark-backend` again...
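A minimal Python sketch of the deduplication idea (illustrative only; the
actual fix lives in the viewer/`spec.py` and may key differently, e.g. once a
"group" label exists):

```python
from collections import defaultdict

def sum_constraints(entries):
    """Sum the 'constraints' metric per AIR, keyed by (air_name, air_id)
    instead of air_id alone, so AIRs from different proving phases that
    happen to reuse the same air_id are not merged into one bucket."""
    totals = defaultdict(int)
    for entry in entries:
        if entry["metric"] != "constraints":
            continue
        labels = dict(entry["labels"])
        totals[(labels.get("air_name"), labels.get("air_id"))] += int(entry["value"])
    return totals
```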
With these changes, the number of constraint instances and bus
interaction messages on the [OpenVM 2.0
data](eec5e5a086/metrics_v2_pairing_combined.json)
is a lot closer to the reported software costs
[here](https://powdr-labs.github.io/powdr/autoprecompile-analyzer/?data=https%3A%2F%2Fgist.githubusercontent.com%2Fleonardoalt%2F4e7f5a1e81048df77d4a92cb01636a4a%2Fraw%2F562784471ff6f11082af0f0e096fb9dd61920a1b%2Fapc_candidates_pairing_v2.json):
- Metrics viewer: 1.13B constraint instances; 966M bus interaction
messages
- *Before this PR: 6.14B constraint instances, 1.44B bus interaction
messages*
- APC analyzer: 896M constraint instances; 797M bus interaction messages
I think the remaining difference can likely be explained by
continuations overhead, additional AIRs (not accounted for by the APC
analyzer) and rounding effects (instances are computed by multiplying by
the number of rows, which always appear to be powers of 2).
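For reference, the instance count behind these numbers is roughly the following
product (a sketch; the concrete row counts below are made up just to show the
rounding effect):

```python
def constraint_instances(num_constraints, num_rows):
    """Each constraint holds on every row of the AIR's trace, so it
    contributes num_rows instances; trace heights are padded up to the
    next power of two, which inflates the count."""
    return num_constraints * num_rows

# Hypothetical example: 33 constraints on a trace padded from 70_000 rows
# to 2**17 = 131_072 rows:
#   33 * 131_072 = 4_325_376 instances  (vs. 33 * 70_000 = 2_310_000 unpadded)
```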
I did a pass to review the spec and improve some of the descriptions in
the UI.
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary
- Rotate x-axis labels (-35°) so long run names don't overlap
- Switch to pixel-based label threshold (18px min) so small stacked
segments don't get cluttered text
- Dynamically position legend below the actual x-axis label bounding box
to prevent overlap
- Highlight PowdrAir slices in the pie chart with a distinct red color
and render them first
<img width="774" height="521" alt="Screenshot 2026-03-23 at 12 07 22"
src="https://github.com/user-attachments/assets/b0bf9f5f-36ab-43f0-ab6f-9ddb08d582b8"
/>
<img width="348" height="518" alt="Screenshot 2026-03-23 at 12 09 41"
src="https://github.com/user-attachments/assets/8cc51e85-4cf4-4302-bae2-2b16490265dd"
/>
## Test plan
- [ ] Load V2 pairing data — verify stacked and grouped charts are
readable
- [ ] Load V1 keccak data — verify no regressions
- [ ] Check legend doesn't overlap rotated labels on both chart tabs
- [ ] Verify PowdrAir slices appear first in pie chart with red color
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This is a breaking change; it only works for data generated after #3678.
This should compute the number of constraints and bus interactions (and
their respective instances) correctly.
The main goal of this PR is to add support for OpenVM 2.0, with this
example input file:
003b4ba129/metrics_pairing_openvm2_combined.json
It renders like this:
<img width="1593" height="1126" alt="Screenshot 2026-03-19 at 16 08 31"
src="https://github.com/user-attachments/assets/bb48a71f-36f3-4012-b753-4876cbc3fcb4"
/>
To implement and sanity-check this, I added the following:
- a `spec.py`, which recomputes the details table. I find the Python
code easier to audit and debug than JS code, hence the duplication.
- I added info tooltips to each entry, which also include Python-like
pseudo-code describing how each value is computed.
- There is version detection to determine whether the metrics JSON comes
from OpenVM 1 or 2. The two OpenVM versions have different metrics,
especially in the proof time breakdown (a rough sketch follows this list).
- Some small adjustments: the proof time breakdown is now collapsible
and the legend in the proof breakdown plot is laid out better.
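A hypothetical Python sketch of such a version check (the viewer's actual
heuristic is implemented in JS and may look at different keys; the marker
metric name below is a placeholder, not the real one):

```python
def detect_openvm_version(run_metrics, v2_marker="execute_metered_time_ms"):
    """Guess whether a run's metrics come from OpenVM 1 or 2 by checking
    for a metric name that only one of the versions emits. `v2_marker`
    is a placeholder; the real detection may differ."""
    names = {entry["metric"] for entry in run_metrics.get("counter", [])}
    names |= {entry["metric"] for entry in run_metrics.get("gauge", [])}
    return 2 if v2_marker in names else 1
```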
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
### Motivation
- Make the metrics viewer accept both the existing combined-metrics
format and single-run raw metrics files that contain top-level `counter`
and `gauge` arrays so the example `metrics.json` URL can be opened
directly.
- Surface clear errors when a user supplies malformed JSON (wrong shape
/ missing fields) by validating payloads up front.
For example, now it also works with:
https://github.com/powdr-labs/bench-results/blob/gh-pages/results/2026-03-18-0537/reth/apc030.json
### Description
- Added a normalization/validation layer (`normalizeMetricsData`,
`assertRawMetricsShape`, `assertMetricEntries`, `hasRawMetricsShape`,
`inferRawRunName`, etc.) to `openvm/metrics-viewer/index.html` that
detects raw vs combined formats and wraps raw files as a single
experiment (a rough Python sketch follows this list).
- Threaded a `sourceLabel` through file/URL loading paths (`handleFile`
and `loadFromUrl`) so the viewer can infer a sensible run name from URLs
(e.g. `/apc030/metrics.json`).
- Added format assertions that check that `counter` and `gauge` entries
are arrays with `labels`, `metric`, and `value`, which throw clear
errors if the JSON is malformed.
- Updated the UI copy and small CSS hint to advertise support for both
formats and updated `openvm/metrics-viewer/CLAUDE.md` to document the
raw-vs-combined behavior and the new `normalizeMetricsData` step.
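A rough Python sketch of the detection/wrapping idea described above (the
actual implementation is JavaScript inside `index.html`; names mirror the
description, details are illustrative):

```python
def has_raw_metrics_shape(data):
    """A 'raw' single-run file has top-level `counter` and `gauge` arrays."""
    return isinstance(data, dict) and isinstance(data.get("counter"), list) \
        and isinstance(data.get("gauge"), list)

def infer_raw_run_name(source_label):
    """Derive a run name from the file/URL path, e.g. '.../apc030/metrics.json'
    becomes 'apc030'. Purely illustrative."""
    parts = [p for p in source_label.replace("\\", "/").split("/") if p]
    return parts[-2] if len(parts) >= 2 else "run"

def normalize_metrics_data(data, source_label=""):
    """Wrap a raw single-run file as a one-experiment 'combined' mapping;
    pass combined files through unchanged."""
    if has_raw_metrics_shape(data):
        return {infer_raw_run_name(source_label): data}
    return data
```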
### Testing
- Ran the bundled JS-based unit smoke test `node
/tmp/test_metrics_viewer.js` which exercises `normalizeMetricsData`,
`inferRawRunName`, and `extractMetrics` on synthesized raw+combined
inputs; the test passed locally.
- Attempted to fetch the real remote example with `curl` during testing,
but the environment returned a `403` (network/proxy), so that automated
fetch failed; this does not affect the viewer logic validated by the
Node-based tests.
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_b_69ba9c72043483248210d8936ad8fc38)
## Summary
Extract the `total_cells_used` metric and display it in the basic stats
table, with a gray percentage showing the padding ratio. Pass the metrics
object to formatter functions to enable ratio calculations.
This helps identify how much of the total cell count is actually used vs.
how much is padding overhead.
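One plausible way the ratio could be computed (a sketch under the assumption
that the combined metrics expose both `total_cells` and `total_cells_used`;
the viewer's exact formula may differ):

```python
def used_cell_ratio(metrics):
    """Fraction of the padded trace area that carries real data:
    total_cells_used / total_cells. The gray percentage next to the
    stat can be derived from this (the remainder being padding)."""
    used = int(metrics["total_cells_used"])
    total = int(metrics["total_cells"])
    return used / total if total else 0.0

# e.g. used_cell_ratio({"total_cells_used": 750_000_000, "total_cells": 1_000_000_000})
# -> 0.75, i.e. 25% padding overhead
```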
<img width="1354" height="501" alt="Screenshot 2026-03-18 at 12 04 13"
src="https://github.com/user-attachments/assets/a9048e66-ac31-4670-8220-ed6635732fec"
/>
**Note**: The numbers are currently wrong, but this is fixed by #3672.
🤖 Generated with Claude Code
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Needs files generated by #3665
## Summary
Add a metrics-viewer single-page web app in `openvm/metrics-viewer/`
that visualizes proof metrics from OpenVM benchmarks. Includes a proof
time breakdown stacked bar chart (port of `basic_metrics.py`), summary
table, and trace cell pie chart (port of `plot_trace_cells.py`). Add a
`combine` command to `basic_metrics.py` to merge multiple `metrics.json`
files into a single JSON keyed by run name. Update
`run_guest_benches.sh` to generate `combined_metrics.json` for each
benchmark. The viewer loads data via file upload or URL parameter and
follows the light theme of autoprecompile-analyzer.
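For illustration, a minimal Python sketch of what the `combine` command does
(the real command lives in `basic_metrics.py`; deriving the run name from the
parent directory is an assumption here):

```python
import json
import sys
from pathlib import Path

def combine(paths):
    """Merge multiple metrics.json files into one JSON object keyed by run
    name, here taken from the parent directory (e.g. .../apc030/metrics.json
    -> 'apc030')."""
    combined = {}
    for path in map(Path, paths):
        combined[path.parent.name] = json.loads(path.read_text())
    return combined

if __name__ == "__main__":
    json.dump(combine(sys.argv[1:]), sys.stdout, indent=2)
```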
Example deployments:
-
[Reth](https://fluttering-heat.surge.sh/?data=https%3A%2F%2Fgithub.com%2Fgeorgwiese%2Ffiles%2Fblob%2Fmain%2Freth_combined_2026-03-12-0537.json&baseline=apc000&run=apc030)
-
[Keccak](https://fluttering-heat.surge.sh/?data=https%3A%2F%2Fgist.githubusercontent.com%2Fgeorgwiese%2Fb146800a3b5eb633a6d5157f8aff1123%2Fraw%2Fe02ba2cec6a4cc063e4bff117cf46c69ff775e1e%2Fkeccak_combined.json&baseline=apc000&run=apc030)
<img width="1590" height="846" alt="Screenshot 2026-03-16 at 17 32 16"
src="https://github.com/user-attachments/assets/e495c083-714f-41c5-b596-ee20c43dd55a"
/>
## Original Prompt
Similar to autoprecompile-analyzer, create a new folder
openvm/metrics-viewer with a CLAUDE.md and single index.html.
Read openvm-riscv/scripts/basic_metrics.py and
openvm-riscv/scripts/plot_trace_cells.py for context.
You can also run the following to have an example:
```
python3 -m venv .venv
source .venv/bin/activate
pip install -r openvm-riscv/scripts/requirements.txt
pip install -r autoprecompiles/scripts/requirements.txt
# For summary table:
python openvm-riscv/scripts/basic_metrics.py summary-table /Users/georg/coding/powdr/results/2026-03-12-0537/keccak/**/metrics.json
# For proof time breakdown
python openvm-riscv/scripts/basic_metrics.py plot /Users/georg/coding/powdr/results/2026-03-12-0537/keccak/**/metrics.json -o result.png
# For trace cell plot
python openvm-riscv/scripts/plot_trace_cells.py -o result.png /Users/georg/coding/powdr/results/2026-03-12-0537/keccak/apc030/metrics.json
```
Do the following:
- To basic_metrics.py add a combine command. It makes a new single JSON
of the input JSON files, mapping from name to the JSON content. We
should be able to call it the same way, see example usage in
openvm-riscv/scripts/run_guest_benches.sh.
- Then, just like the autoprecompile analyzer, it should expect a file
or link to this combined metrics file.
- Then, in different panes, it should generate the plots and summary
table: one for the proof time breakdown, one for the summary table, and
a third one for the trace cell pie chart. The third pane should only
display something if a run has been selected.
## Test plan
- Run `python3 openvm-riscv/scripts/basic_metrics.py combine
/path/to/**/metrics.json > combined.json` and verify valid JSON output
- Open `openvm/metrics-viewer/` in a browser, upload the combined JSON,
and verify the stacked bar chart, summary table, and pie chart render
correctly
- Test URL loading with `?data=<url>&run=<name>` parameters
- Verify all three commands (summary-table, plot, combine) still work
with the refactored `get_label()` function
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>
Removes the assumption that instructions are consecutive in the program
in openvm.
This only supports unconditional superblocks (which are guaranteed to
run), so any violation of the empirical constraints would cause a failure
at execution time.
This only contains the changes required for openvm; no unconditional
superblocks are generated.
## Summary
- Extracts the `compile`/`assert_machine_output` APC snapshot test
utilities from `openvm-riscv/tests/common/mod.rs` into a reusable
`powdr_openvm::test_utils` module (behind a `test-utils` feature flag)
- Makes the functions generic over `ISA: OpenVmISA` so that other ISA
implementations (e.g. WOMIR in
[womir-openvm](https://github.com/powdr-labs/womir-openvm)) can share
the same test infrastructure instead of copying ~100 lines of equivalent
code
- Updates `openvm-riscv` tests to use the shared utilities as thin
wrappers
Motivation: https://github.com/powdr-labs/womir-openvm/pull/291 copies
nearly identical test infrastructure from powdr-riscv. This PR enables
code reuse.
## Test plan
- [x] `cargo check -p powdr-openvm --features test-utils` compiles
- [x] `cargo test -p powdr-openvm-riscv --test
apc_builder_single_instructions --test apc_builder_complex` — all pass
(except pre-existing div/rem ordering issue #3646)
- [x] Verified womir-openvm APC tests pass with path patch pointing to
this branch
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Make powdr-openvm generic over the ISA
- Introduce an `OpenVmISA` trait in powdr-openvm which contains everything
that depends on the ISA
- Move all riscv-specific code to `powdr-openvm-riscv`
Most of the changes come from splitting the (large) `lib.rs` in two.
TODO:
- [ ] Fix reth
- [x] Put back info prints showing the top 10 blocks (for now using
`powdr_riscv_elf::SymbolTable` for all ISAs; it would make sense to
extract `SymbolTable` somewhere neutral)
- [ ] Test on womir
## Summary
- Update all openvm dependencies across the repo to use the
`v1.4.2-powdr-rc.3` tag
- Convert workspace root deps from `branch = "cuda-fp-as-support"` to
`tag = "v1.4.2-powdr-rc.3"`
- Convert guest crate deps from old `rev` to `rev = "v1.4.2-powdr-rc.3"`
## Test plan
- [x] `cargo check --all-targets` passes
- [ ] CI passes
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
This was a weird bug a while ago which doesn't seem to happen anymore.
The issue was that when wrapping an executor into another executor,
openvm didn't find the system executor after wrapping. The solution was
to keep a System variant in the wrapper, which complicates the `From`
implementations, as the system executor must be wired to `System(_)` and
not to `Sdk(System(_))`.
While working on #3619, I noticed that we still put a CBOR path in our
`apc_candidates.json`, even though we no longer generate CBOR files. I
don't think this field is actually used, so I removed it.
If this PR gets approved, I'll merge
https://github.com/georgwiese/autoprecompile-analyzer/pull/7.
A `SuperBlock` is a sequence of `BasicBlock`s seen during execution.
APCs are now generated over `SuperBlocks`: a `SuperBlock` with a single
element is a `BasicBlock`.
Still, this PR does not include detection of or support for `SuperBlock`s
of length > 1.
When comparing apc candidates, we currently check for overlap using
`Snapshot::instret`, which is a trait method on the snapshot type.
However, `instret` always originally comes from the execution state anyway.
This PR removes the trait and instead gets the instret/global_clk from
the passed state, doing the overlap analysis based on that.
The assertion in `SharedPeripheryChipsGpuProverExt::extend_prover` was
using `VariableRangeCheckerChip` (CPU type) instead of
`VariableRangeCheckerChipGPU` when searching for the range checker in
the GPU chip inventory.
Since GPU inventories only contain GPU chip types, `find_chip` for a CPU
type always returned an empty iterator, making the assertion
`.nth(1).is_none()` trivially pass regardless of the actual inventory
state. This made the sanity check effectively dead code.
Changes:
- Replaced `Arc<VariableRangeCheckerChip>` with
`Arc<VariableRangeCheckerChipGPU>`
- Changed assertion logic from `.nth(1).is_none()` to
`.next().is_some()` to
match the CPU version and actually verify the chip exists
- Removed unused CPU chip import
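To illustrate why the old assertion was vacuous, here is the same logic in
Python on a toy GPU-only inventory (hypothetical data, not the actual chip
inventory API):

```python
from itertools import islice

gpu_only_inventory = ["VariableRangeCheckerChipGPU"]  # toy stand-in for the GPU chip inventory

# Old check: look up the CPU type, then assert "there is no second match"
# (`.nth(1).is_none()`). With zero matches this is trivially true, so the
# assertion passed even though the range checker was never found.
cpu_matches = (c for c in gpu_only_inventory if c == "VariableRangeCheckerChip")
assert next(islice(cpu_matches, 1, None), None) is None  # vacuously passes

# New check: look up the GPU type and assert that a match exists
# (`.next().is_some()`), mirroring the CPU prover's sanity check.
gpu_matches = (c for c in gpu_only_inventory if c == "VariableRangeCheckerChipGPU")
assert next(gpu_matches, None) is not None
```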
In `basic_metrics`, we defined `total_proof_time_ms = app_proof_time_ms
+ leaf_proof_time_ms + internal_proof_time_ms`, where the per-phase proof
times *exclude* trace gen.
This is not obvious, and it gets especially confusing because OpenVM
also has a metric called `total_proof_time_ms`, which includes trace
gen.
This PR changes it such that `total_proof_time_ms` is the end-to-end
number, and we additionally have `total_proof_time_excluding_trace_ms`.
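A sketch of the relationship between the two metrics after this change
(assuming, as a simplification, that everything outside the per-phase proving
times is trace generation):

```python
def derive_proof_times(app_ms, leaf_ms, internal_ms, trace_gen_ms):
    """`total_proof_time_ms` is now the end-to-end number; the old sum of
    per-phase proving times (excluding trace gen) gets its own metric."""
    total_excluding_trace = app_ms + leaf_ms + internal_ms
    return {
        "total_proof_time_excluding_trace_ms": total_excluding_trace,
        "total_proof_time_ms": total_excluding_trace + trace_gen_ms,
    }
```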
Changes in this PR:
- Use `DegreeBound` everywhere possible instead of `max_degree: usize`
where it's unclear what is meant
- Make stats non-optional: before this PR, we only computed the stats in
`Cell` mode, because they are used for the apc selection. After this PR,
we also compute the stats in other PGO modes; it simply happens after
selection.
- Simplify some of the APIs
- Add a trait method to `Adapter` to generate the stats. This is because
stats are client-specific.
- Generalize usage of `ApcWithStats` to simplify function signatures.
Basically, inside the apc engine, we pass around the apc with its stats.
- Make the `InstructionHandler` specific to the degree bound. This
avoids passing around the degree bound; it can instead be recovered from
the instruction handler.
This PR generates the "optimistic constraints" (which I'd prefer to call
"execution constraints") introduced in #3491 for optimistic precompiles.
They are currently ignored; actually passing them to the execution
engine is left for another PR.
At a high level, this is what happens (a rough sketch follows the list):
1. `optimistic_literals()` computes a map `AlgebraicReference ->
OptimisticLiteral`. It works by finding memory accesses with
compile-time addresses (essentially register accesses). The columns
representing the data in the memory bus interaction correspond to limbs
of register values at some point in time and therefore can be mapped to
an execution literal.
2. `BlockEmpiricalConstraints::filtered` is used to remove any
constraints on columns that cannot be mapped to execution literals. As a
result, all empirical constraints can be checked at execution time, but
the resulting optimistic precompiles are less effective.
3. `ConstraintGenerator::generate_constraints` turns empirical
constraints into equality constraints, i.e., constraints of the form
`(number|algebraic_reference) = (number|algebraic_reference)`. These
constraints can be converted to `SymbolicConstraint` (to be added to the
solver) and to execution constraints via
`generate_execution_constraints` (using the map computed in step 1).
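Very roughly, steps 2 and 3 look like this (a loose Python-flavoured sketch
with hypothetical helpers such as `columns` and `as_equality`; the actual
types are Rust's `AlgebraicReference`, `BlockEmpiricalConstraints`,
`ConstraintGenerator`, etc.):

```python
def filter_constraints(empirical_constraints, literal_map):
    """Step 2: drop constraints that touch a column with no execution
    literal, so everything that remains is checkable at execution time."""
    return [c for c in empirical_constraints
            if all(col in literal_map for col in c.columns)]

def generate_execution_constraints(empirical_constraints, literal_map):
    """Step 3: turn each remaining empirical constraint into an equality
    `lhs = rhs` (each side a number or a column reference), then replace
    column references by the execution literals from step 1."""
    equalities = []
    for c in filter_constraints(empirical_constraints, literal_map):
        lhs, rhs = c.as_equality()  # hypothetical helper
        equalities.append((literal_map.get(lhs, lhs), literal_map.get(rhs, rhs)))
    return equalities
```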
To test:
`POWDR_RESTRICTED_OPTIMISTIC_PRECOMPILES=1 cargo run --bin powdr_openvm
-r prove guest-keccak --input 100 --autoprecompiles 1
--apc-candidates-dir keccak100 --mock --optimistic-precompiles`
Also see the evaluation on reth that I posted in #3366.
---------
Co-authored-by: Thibaut Schaeffer <schaeffer.thibaut@gmail.com>
Co-authored-by: schaeff <thibaut@powdrlabs.com>
During execution, we match the arenas to Initialized/Uninitialized many
times under the hood. Also, we have a lot of methods on `OriginalArenas`
which delegate to Initialized/Uninitialized, or panic.
Simplify this by getting an `InitializedOriginalArena` at the beginning
of execution.
During tracegen, use the initialized status to detect if an apc was not
called instead of going through `number_of_calls()`.
I don't expect this to make much of a difference performance-wise, but
it simplifies the implementation a bit.
This PR contains two changes that are useful in #3501:
- In addition to `statements_to_symbolic_machine`, we now have
`statements_to_symbolic_machines`, which returns one machine per
instruction. `statements_to_symbolic_machine` calls
`statements_to_symbolic_machines` and concatenates the machines, which
should yield the same result as the previous implementation. `statements_to_symbolic_machines`
will be used in #3501.
- It used to be that we hard-coded the PC only for the first instruction.
I changed that, because otherwise the result of
`statements_to_symbolic_machines` would still have PC lookups after
optimization (which fails an assertion).
Some small refactorings extracted from #3501:
- Extracted `BlockEmpiricalConstraints`, a block-local version of
`EmpiricalConstraints`.
- `apply_pc_threshold` is now called only once in `customize`, instead
of once per basic block.
- Extracted a `BlockCellAlgebraicReferenceMapper`
---------
Co-authored-by: Thibaut Schaeffer <schaeffer.thibaut@gmail.com>