As @qwang98 pointed out, my implementation of `evaluate_apc` was too
inefficient, because it recalculated the statistics for the original
AIRs for each instruction of the basic block. This PR fixes it by
requiring a new function
`InstructionHandler::get_instruction_air_stats`.
Now `JsonExport` and `plot_effectiveness.py` both live in APC crate.
Didn't move other scripts, which rely on `metrics.json` and I think it's
specific to OVM?
---------
Co-authored-by: Georg Wiese <georgwiese@gmail.com>
```
$ cd openvm
$ cargo bench --bench optimizer_benchmark
Benchmarking optimize-keccak/optimize: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 44.2s.
optimize-keccak/optimize
time: [4.3703 s 4.3856 s 4.4020 s]
change: [-1.2781% -0.5240% +0.2094%] (p = 0.21 > 0.05)
No change in performance detected.
```
Turns `Solver` into a trait to abstract away the bus interaction handler
and also allow other implementations in the future.
It also makes the solver persistent across optimization loops so that we
do not have to derive the same range constraints all over again.
This is also a preparatory PR to make the solver more independent from
the constraint system: The constraint system inside the solver will have
way more redundancies in the future.
---------
Co-authored-by: Georg Wiese <georgwiese@gmail.com>
Removes algebraic constraints that are implied by other algebraic
constraints.
In order to do that, grouped expressions are turned into products of
factors such that these factors are as much normalized as possible. Then
we try to find algebraic constraints that are divisors of other
algebraic constraints (i.e. the factors are subsets).
We used to need the Opcode of the APC in order to hard-code it in the PC
lookup. However, for a while, we have removed PC lookups entirely,
hard-coding the PC in the execution bridge receive instead.
This removes the necessity to store the opcode. It was only used as an
ID of the APC. But for this, we can use the PC instead. This has the
added benefit that we can more easily relate it to the original program.
The `BusType` enum in the `autoprecompiles` crate still listed the
OpenVM bus types. Now, the bus types that we actually rely on are
hard-coded (e.g. execution bridge, memory, etc), but there is an `Other`
variant where the integration can add their own types:
92ae7469e8/autoprecompiles/src/bus_map.rs (L5-L19)
---------
Co-authored-by: Thibaut Schaeffer <schaeffer.thibaut@gmail.com>
This PR generalizes the memory optimizer to not make any specific
assumptions about the structure of memory bus interactions. This is done
by turning `MemoryBusInteraction` into a trait, which is an associated
type in the `Adapter`. Currently, there is one concrete implementation,
`OpenVmMemoryBusInteraction`.
In order to get a feel for the optimizer, I think it is good to also
output the column count. It is rather difficult to get the column count
before optimization, but in the end we are only interested in the effect
of a new optimization step anyway, so I think this is fine.
---------
Co-authored-by: Georg Wiese <georgwiese@gmail.com>
Generalizations that make fewer assumptions about the structure of bus
interactions like PC lookups and bus interactions. See comments below.
One change is that we know inline the initial PC (because we know it for
any given basic block), which leads to *all* PC lookups being removed,
slightly improving performance (at least reducing one witness column &
one bus interaction).
---------
Co-authored-by: schaeff <thibaut@powdrlabs.com>
Suggested alternative to #3057: Instead of `start_idx`, we store the
`start_pc` to identify a basic block. Both should uniquely identify the
basic block, but the start PC is useful information on its own (whereas
the `start_idx` only makes sense to someone who knows the entire
program).
See also
https://github.com/powdr-labs/powdr/pull/3055#discussion_r2215738707.
This PR does two things:
- it partially removes the notion of opcode in the autoprecompile crate,
and instead relies on user provided functions to know if an instruction
is allowed/branching
- it uses VmOpcode as much as possible in the openvm crate instead of
usize
After this PR, the only remaining notion of opcode in the apc crate is
in the `SymbolicInstructionStatement` where it is still a single field
element. @georgwiese shared ideas to remove that too, or at least shift
it to the client, see #3055
Based on #3040, this PR makes the autoprecompile crate responsible for
converting from and to the user's field type. As a result, the openvm
crate only deals with the openvm Babybear, the powdr Babybear is only
used inside the APC crate. This leads to a lot of simplification, and
enables some derives.
TODO:
- [x] fix the cargo tree clash
Key design choices revolve around passing `ApcCandidate` stats out from
`create_apcs_with_cell_pgo`, `create_apcs_with_pgo`, etc. I explored a
few ways:
1. [Current solution] Add an `ApcStats` struct that wraps stats we want
to pass to `PowdrPrecompile`. I believe this is the most versatile
solution and avoids defects from the two solutions below.
2. Return `ApcCandidate` for `create_apcs_with_[x]_pgo` and
`create_apcs_with_pgo`. This solution requires filling dummy zeros for
`execution_frequency`, `width_before`, and `width_after`, which I think
aren't very clean. Besides, passing these values still require computing
the stats we care about later, so I think it's cleaner to explicitly
declare what we care about in `ApcStats` instead.
3. Use `Either::Left(ApcCandidate)` for cell PGO while
`Either::Right(Apc)` for other PGO modes. This isn't very clean, and
also has the disadvantage of requiring match clauses when constructing
elements of `PowdrExtension`.
```
struct ApcCandidate<P> {
apc: Apc<P>,
execution_frequency: usize,
width_before: AirWidths,
width_after: AirWidths,
}
```
Larger PR, but after this basically everything that can be made agnostic
is agnostic.
`PowdrConfig` is split in two, as the `autoprecompile` crate does not
care about the arithmetization method. This could be cleaned up a bit
though.
Need to merge
https://github.com/powdr-labs/openvm-reth-benchmark/pull/24/ first and
update the hashes.
TODO:
- [x] merge #3033
- [x] fix reth
- [x] remove POWDR_OPCODE constant in apc and pass it instead
In general, it makes more sense to always test:
- The full `AirMetrics` (including log up and preprocessed columns),
instead of just the main columns.
- The sum of `AirMetrics` for all APCs used, instead of just the first
APC in a testing profile (because of wasted data, and we can always just
create one APC if we only want to test one).
These won't add any cost to the testing CI run, because all data are
already there but we just aren't testing them fully yet.
1 is recommended, and having a constant is an improvement over the
current way the blowup factor can be set everywhere.
update: 1 does not work, so set to 2 for now, as it was before.
We currently export the APCs to disk in the openvm crate.
Now that the APCs are generic, we can export them in the APC crate
already.
The json file is still generated in openvm, since it contains some
openvm-specific data, such as the number of logup columns (which depends
on the stark config, which the APC crate does not know about)
@georgwiese built some graphs from the APC candidates by exporting the
main stats to CSV. This PR start extending that by persisting the
candidates to disk for further inspection.
This first PR creates one file per APC, and #3015 creates a single JSON
which has the main stats and points to this APC file.