Based on #3040, this PR makes the autoprecompile crate responsible for
converting from and to the user's field type. As a result, the openvm
crate only deals with the openvm Babybear, the powdr Babybear is only
used inside the APC crate. This leads to a lot of simplification, and
enables some derives.
TODO:
- [x] fix the cargo tree clash
Key design choices revolve around passing `ApcCandidate` stats out from
`create_apcs_with_cell_pgo`, `create_apcs_with_pgo`, etc. I explored a
few ways:
1. [Current solution] Add an `ApcStats` struct that wraps stats we want
to pass to `PowdrPrecompile`. I believe this is the most versatile
solution and avoids defects from the two solutions below.
2. Return `ApcCandidate` for `create_apcs_with_[x]_pgo` and
`create_apcs_with_pgo`. This solution requires filling dummy zeros for
`execution_frequency`, `width_before`, and `width_after`, which I think
aren't very clean. Besides, passing these values still require computing
the stats we care about later, so I think it's cleaner to explicitly
declare what we care about in `ApcStats` instead.
3. Use `Either::Left(ApcCandidate)` for cell PGO while
`Either::Right(Apc)` for other PGO modes. This isn't very clean, and
also has the disadvantage of requiring match clauses when constructing
elements of `PowdrExtension`.
```
struct ApcCandidate<P> {
apc: Apc<P>,
execution_frequency: usize,
width_before: AirWidths,
width_after: AirWidths,
}
```
Larger PR, but after this basically everything that can be made agnostic
is agnostic.
`PowdrConfig` is split in two, as the `autoprecompile` crate does not
care about the arithmetization method. This could be cleaned up a bit
though.
Need to merge
https://github.com/powdr-labs/openvm-reth-benchmark/pull/24/ first and
update the hashes.
TODO:
- [x] merge #3033
- [x] fix reth
- [x] remove POWDR_OPCODE constant in apc and pass it instead
In general, it makes more sense to always test:
- The full `AirMetrics` (including log up and preprocessed columns),
instead of just the main columns.
- The sum of `AirMetrics` for all APCs used, instead of just the first
APC in a testing profile (because of wasted data, and we can always just
create one APC if we only want to test one).
These won't add any cost to the testing CI run, because all data are
already there but we just aren't testing them fully yet.
1 is recommended, and having a constant is an improvement over the
current way the blowup factor can be set everywhere.
update: 1 does not work, so set to 2 for now, as it was before.
We currently export the APCs to disk in the openvm crate.
Now that the APCs are generic, we can export them in the APC crate
already.
The json file is still generated in openvm, since it contains some
openvm-specific data, such as the number of logup columns (which depends
on the stark config, which the APC crate does not know about)
Exposes the changes of #3015 to the CLI
To test, run this from `cli-openvm`:
`mkdir -p apcs && cargo run -r prove $(pwd)/../openvm/guest-keccak
--input 1 --autoprecompiles 1 --apc-candidates-dir apcs`
@georgwiese built some graphs from the APC candidates by exporting the
main stats to CSV. This PR start extending that by persisting the
candidates to disk for further inspection.
This first PR creates one file per APC, and #3015 creates a single JSON
which has the main stats and points to this APC file.
Depend on #2962.
- `Add` and `Sum` traits for `AirMetric`, so that we don't need to
manually add them in tests.
- Removed `name` field from `AirMetric`, because they currently live in
`OriginalAirs`, which already contains a map from air name `String` to
`AirMetric`.
---------
Co-authored-by: Thibaut Schaeffer <schaeffer.thibaut@gmail.com>
Currently all Powdr symbolic machines in `debug.pil` appear as
`PowdrAir<BabyBear>`, which gives no way of distinguishing one from
another.
This PR fixes this by using the existing `air_name` API of powdr
executor and plonk executor, which gives names like:
`powdr_air_for_opcode_4381`. This makes it easier to debug and in the
future, cross reference to `ApcCandidates` cbor and json files written
in #3008 and #3015.
Depends on #2960. I think we've long needed a more comprehensive test
for PGO, especially the cell mode.
---------
Co-authored-by: Thibaut Schaeffer <schaeffer.thibaut@gmail.com>
Just so we don't do this:
```rust
quadratic_symbolic_expression_to_algebraic(
&algebraic_to_quadratic_symbolic_expression(
&quadratic_symbolic_expression_to_algebraic(&expr)))
```
😉
When we generate the PGO data, we first collect all visited pcs in
execution order which is linear in the execution length.
Then we reduce this to a map of unique PC to number of occurrences,
which is linear in the program size.
This PR removes the first collection, generating the map directly.
Update: since for reth we do ~300 million additions to the hashmap, this
PR also switches to using a vector to keep track of the count as opposed
to a hashmap. Not to be confused: before this PR the vector is the
sequence of all pcs visited. After this PR, it is the number of times a
pc was visited, by pc.
Other update:
copilot made a good point about switching away from
`Arc<Mutex<Vec<u32>>>` to `Arc<Vec<AtomicU32>>`. Done.
This PR runs a few benchmarks nightly, saving the results to a separate
repo: `bench-results`.
Idea is to get something going on and improve reporting later.
Currently, it runs `keccak` and `reth`.
Need to first merge https://github.com/powdr-labs/stark-backend/pull/11.
And then merge https://github.com/powdr-labs/openvm/pull/32.
Original comment by @Schaeff:
```
so here's my suspicion:
to get the number of columns of an air, we use executor.air().width(), in fully qualified syntax openvm_stark_backend::p3_air::BaseAir::width(&executor.air())
this is the number of main trace columns, which does not include interaction columns
in order to get the full number of columns, we need to do what we already do for get_constraints but additionally call InteractionPhaseAirBuilder::finalize_interactions on the symbolic builder to materialize the interactions. Sadly InteractionPhaseAirBuilder is pub(crate) so it seems we need to change stark-backend
```
This PR follows the comment above, but has additional changes:
1. Instead of using `finalize_interactions`, call lower level functions
for greater efficiency.
2. Refactored `AirMetrics` related functions, so we always show the
number of main constraint vs bus interaction columns.
Next step:
1. We should calculate log up columns for APCs as well during PGO.
---------
Co-authored-by: Thibaut Schaeffer <schaeffer.thibaut@gmail.com>