Following #1934, adds these new variants on the Rust side. A follow-up
PR by @georgwiese will actually compute multiplicities in witgen.
---------
Co-authored-by: Georg Wiese <georgwiese@gmail.com>
Step towards #1633
This PR adds witness generation for any public that is referenced from
an identity.
Note that publics and public references now exist independently:
- A public is still defined as a pointer to a cell in the trace. The
prover extracts the values from the trace and returns them to the
verifier; witgen has nothing to do with them (except providing the
values in the trace).
- A public reference (i.e., a public that is referenced by a constraint)
was previously unimplemented. Now, witgen solves for this value. This
value might not be the same as the value of the public being
referenced! We don't check for consistency.
After #1633 is completed, publics will no longer be defined in terms of
trace cells, so the values returned by witgen will be the ones that are
returned to the verifier.
For now, the values are not returned (and different machines might
find conflicting values for the same public). But the solving works, and
I added a log message, e.g.:
```
$ cargo run pil test_data/pil/fibonacci_with_public.pil -o output -f
Writing output/fibonacci_with_public_analyzed.pil.
done.
Optimizing pil...
Removed 0 witness and 0 fixed columns. Total count now: 2 witness and 1 fixed columns.
Writing output/fibonacci_with_public_opt.pil.
Evaluating fixed columns...
Fixed column generation took 0.001645084s
Writing output/constants.bin.
Deducing witness columns...
Running main machine for 4 rows
[00:00:00 (ETA: 00:00:00)] ░░░░░░░░░░░░░░░░░░░░ 0% - Starting...
=> out (public) = 5
[00:00:00 (ETA: 00:00:00)] ████████████████████ 100% - Starting...
Witness generation took 0.00259025s
Writing output/commits.bin.
```
With this change, we check the fraction of used rows in each machine. If
the fraction is above 50%, we don't log anything at INFO level;
otherwise, we suggest that the machine could be configured with a
smaller min_degree.
This is not a warning, because on some backends, it might not be
possible to use VADCOP.
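The check described above can be sketched as follows (the function name and message shape are illustrative, not the actual powdr code):

```rust
/// Illustrative sketch of the used-rows check: returns a hint message
/// only when less than half of the rows are used.
fn used_rows_hint(machine: &str, used: usize, size: usize) -> Option<String> {
    let fraction = used as f64 / size as f64;
    if fraction >= 0.5 {
        // Above 50%: nothing is logged at INFO level.
        None
    } else {
        Some(format!(
            "Only {used} of {size} rows ({:.2}%) are used in machine '{machine}'. \
             If the backend supports it, consider lowering the min_degree.",
            fraction * 100.0
        ))
    }
}

fn main() {
    // 101 of 256 rows used, as in the example below:
    assert!(used_rows_hint("Main Machine", 101, 256).unwrap().contains("39.45"));
    assert!(used_rows_hint("Main Machine", 200, 256).is_none());
}
```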
Example:
```
$ cargo run pil test_data/std/poseidon_gl_test.asm -o output -f
...
Only 101 of 256 rows (39.45%) are used in machine 'Main Machine', which is configured to be of static size 256. If the min_degree of this machine was lower, we could size it down such that the fraction of used rows is at least 50%. If the backend supports it, consider lowering the min_degree.
```
---------
Co-authored-by: chriseth <chris@ethereum.org>
We represent lookup and permutation selectors as
`Option<AlgebraicExpression<T>>`.
The issue with this is that there are two ways to represent 1: `Some(1)`
and `None`.
This leads to issues in witgen where we check `is_none` but not against
`Some(1)`.
This already led to an awkward optimizer step which reduces one to the
other so that witgen only sees one of the two options.
This PR makes the selector non-optional. Therefore, 1 is only 1. Display
is adjusted to not print the selector if it renders to "1". This
string-level comparison is there to avoid introducing invasive bounds on
`T` which would contaminate everything.
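A minimal sketch of the Display logic, with a `String` standing in for `AlgebraicExpression<T>` (the real type and the printed lookup/permutation syntax are richer):

```rust
use std::fmt;

// Stand-in for `AlgebraicExpression<T>`.
type Expr = String;

struct SelectedExpressions {
    selector: Expr,
    expressions: Vec<Expr>,
}

impl fmt::Display for SelectedExpressions {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // String-level comparison: skips the selector when it renders to
        // "1", without needing extra bounds on the expression type.
        if self.selector.to_string() != "1" {
            write!(f, "{} $ ", self.selector)?;
        }
        write!(f, "[{}]", self.expressions.join(", "))
    }
}

fn main() {
    let trivial = SelectedExpressions {
        selector: "1".into(),
        expressions: vec!["a".into(), "b".into()],
    };
    let selected = SelectedExpressions {
        selector: "sel".into(),
        expressions: vec!["a".into(), "b".into()],
    };
    assert_eq!(trivial.to_string(), "[a, b]");
    assert_eq!(selected.to_string(), "sel $ [a, b]");
}
```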
`Identity` is currently a struct which acts as a union: all kinds of
identities are represented using the same struct fields, with runtime
checks that the right fields are used.
In the context of #1934 where we add new kinds of identities, it seems
advantageous to move to an enum on the Rust side as well.
Todo:
- [x] reimplement ids
- [x] minimize code duplication between lookups and permutations, if
possible
- [x] remove set_id
- [x] find better way to compare identities
- [x] fix connect
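A rough sketch of the struct-as-union vs. enum distinction (the real powdr variants carry ids, source references, and selected expressions):

```rust
// With an enum, pattern matching replaces the runtime checks that the
// right fields of the old union-like struct are in use.
enum Identity {
    Polynomial { expression: String },
    Lookup { left: Vec<String>, right: Vec<String> },
    Permutation { left: Vec<String>, right: Vec<String> },
    Connect { left: Vec<String>, right: Vec<String> },
}

impl Identity {
    fn kind(&self) -> &'static str {
        match self {
            Identity::Polynomial { .. } => "polynomial",
            Identity::Lookup { .. } => "lookup",
            Identity::Permutation { .. } => "permutation",
            Identity::Connect { .. } => "connect",
        }
    }
}

fn main() {
    let id = Identity::Lookup {
        left: vec!["x".into()],
        right: vec!["BYTE".into()],
    };
    assert_eq!(id.kind(), "lookup");
}
```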
Mainly a test file to see what the JIT features are we need to support
https://github.com/powdr-labs/powdr/pull/1623
The "Add functionality benchmark" commit is the only important one here,
the rest are merges from the implementing branches.
This is a major change to the plonky3 prover to support proving many
machines.
# Sharing costs across tables
- at setup phase, fixed columns for each machine are committed to for
each possible size. This happens in separate commitments, so that the
prover and verifier can pick the relevant ones for a given execution
- for each phase of the proving, the corresponding traces across all
machines are committed to jointly
- the quotient chunks are committed to jointly across all tables
# Multi-stage publics
The implementation supports public values for each stage of each table.
This is tested internally in the plonky3 crate but not end-to-end in
pipeline tests.
---------
Co-authored-by: Leo Alt <leo@powdrlabs.com>
related to [this PR](https://github.com/powdr-labs/powdr/pull/1898)
We need to switch to the nightly toolchain to integrate stwo.
I kept the RISC-V-related toolchain at "nightly-2024-08-01", as it is
handled separately in the workflow, so this is the smallest change that
makes stwo integrable for now.
This also fixes some clippy issues about the comment format in some
files.
---------
Co-authored-by: chriseth <chris@ethereum.org>
This removes a complicated comparison mechanism introduced in #1874.
Previously, time steps were compared differently from addresses. Now,
both are compared the same way which is:
- The prover provides a bit indicating whether the high limbs of the
values are equal.
- If this bit is 1, a constraint asserts that this is actually the case.
- The prover provides the 16-bit difference minus one on the limb that
differs. This asserts that the limb value is strictly increasing.
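The witness side of this scheme can be illustrated as follows (plain `u32`/`u16` values stand in for field-element limbs; this is not the actual constraint code):

```rust
// Splits two consecutive 32-bit values into 16-bit limbs and computes the
// two prover-provided witnesses: the "high limbs equal" bit and the
// 16-bit "difference minus one" on the differing limb. Assumes valid
// input, i.e. `next > prev`.
fn comparison_witness(prev: u32, next: u32) -> (bool, u16) {
    let (prev_hi, prev_lo) = ((prev >> 16) as u16, prev as u16);
    let (next_hi, next_lo) = ((next >> 16) as u16, next as u16);
    let high_same = prev_hi == next_hi;
    // The differing limb is strictly increasing, so the difference is at
    // least 1 and `difference - 1` fits into 16 bits (checkable by a
    // range constraint).
    let diff_minus_one = if high_same {
        next_lo - prev_lo - 1
    } else {
        next_hi - prev_hi - 1
    };
    (high_same, diff_minus_one)
}

fn main() {
    // Same high limb, low limb increases by 2:
    assert_eq!(comparison_witness(5, 7), (true, 1));
    // High limb increases by 1:
    assert_eq!(comparison_witness(0x0001_0000, 0x0002_0000), (false, 0));
}
```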
Extracted from https://github.com/powdr-labs/powdr/pull/1790
This PR adds std and witgen support for the new 16-bit limb memory
machine.
- The current version only supports 24-bit addresses. I think it's fine
for now, but we should fix it later.
- GitHub fails miserably at showing the proper diff, but:
- the new file `double_sorted_32.rs` is the same as the old unique
`double_sorted.rs`, minus the common parts, which were left in the
outermost `double_sorted.rs` that dispatches calls depending on the
field
- the new file `double_sorted_16.rs` is very similar to the 32-bit case,
but adjusted for 2 value fields.
- There are probably more common things we can extract, but this here is
enough for a first version.
---------
Co-authored-by: Leo Alt <leo@ethereum.org>
Co-authored-by: Georg Wiese <georgwiese@gmail.com>
Tries to call the JIT compiler on each fixed column when generating the
values of the fixed columns. Uses the interpreter we already have for
all the columns where JIT compilation failed.
When running this command:
`cargo run pil test_data/asm/book/hello_world.asm --inputs 0 -o output
--field m31 --prove-with plonky3 -f`
it errors with:
```
thread 'main' panicked at
powdr/executor/src/witgen/global_constraints.rs:195:37:
attempt to subtract with overflow
```
The problem is in this function:
```
fn smallest_period_candidate<T: FieldElement>(fixed: &[T]) -> Option<u64> {
    if fixed.first() != Some(&0.into()) {
        return None;
    }
    (1..63).find(|bit| fixed.last() == Some(&((1u64 << bit) - 1).into()))
}
```
When using the Mersenne31 field (prime: 2^31 - 1), a last element of 0
in `fixed` is equivalent to 2^31 - 1, so the function returns `bit = 31`.
`bit = 31` then overflows at this line in `process_fixed_column`, since
the literal `1` defaults to `i32`:
`let mask = T::Integer::from(((1 << bit) - 1) as u64);`
So I added the code below to handle the special case, and made the
literal `1u64`:
```
if fixed.last() == Some(&0.into()) {
    return None;
}
```
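Putting the guard together with the original function, a standalone sketch of the patched logic (over plain `u64` values instead of field elements) looks like this:

```rust
// Patched candidate search: the added guard rejects a trailing 0, which
// in Mersenne31 is the same element as 2^31 - 1 and would otherwise make
// the caller compute `(1 << 31) - 1` with an `i32` literal and overflow.
fn smallest_period_candidate(fixed: &[u64]) -> Option<u64> {
    if fixed.first() != Some(&0) {
        return None;
    }
    if fixed.last() == Some(&0) {
        return None;
    }
    (1..63).find(|bit| fixed.last() == Some(&((1u64 << bit) - 1)))
}

fn main() {
    // A column cycling through 0..=3 has period candidate 2^2:
    assert_eq!(smallest_period_candidate(&[0, 1, 2, 3]), Some(2));
    // A trailing 0 no longer produces a candidate:
    assert_eq!(smallest_period_candidate(&[0, 1, 2, 0]), None);
}
```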
---------
Co-authored-by: Georg Wiese <georgwiese@gmail.com>
This PR does a few things:
- unify `Input/DataIdentifier` into a single `Input` query that takes
`(channel, idx)` and returns a field element
- the `Output` query takes `(channel, fe)`
- change the related prover functions to reflect this
This interface is used by, but is not the same as, the input/output for
the RISC-V machine.
How to use these to input/output bytes or serialized data is a job of
the runtime implementation.
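A hedged sketch of the unified shape (the real query type, channel semantics, and field-element type differ; `u64` stands in for a field element):

```rust
#[derive(Debug)]
enum Query {
    /// (channel, idx): read one field element.
    Input { channel: u32, idx: usize },
    /// (channel, fe): write one field element.
    Output { channel: u32, value: u64 },
}

/// A toy runtime: channel 0 reads from a data vector; outputs are
/// collected into a buffer. How bytes or serialized data map onto these
/// single-element queries is up to the runtime implementation.
fn handle(query: Query, data: &[u64], out: &mut Vec<(u32, u64)>) -> Option<u64> {
    match query {
        Query::Input { channel: 0, idx } => data.get(idx).copied(),
        Query::Input { .. } => None,
        Query::Output { channel, value } => {
            out.push((channel, value));
            Some(0)
        }
    }
}

fn main() {
    let mut out = Vec::new();
    assert_eq!(handle(Query::Input { channel: 0, idx: 1 }, &[7, 8], &mut out), Some(8));
    handle(Query::Output { channel: 2, value: 42 }, &[], &mut out);
    assert_eq!(out, vec![(2, 42)]);
}
```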
(First 4 commits of #1650)
This PR prepares witness generation for scalar publics (#1756). Scalar
publics are similar to cells in the trace, but are global (i.e.,
independent of the row number).
With this PR, affine expressions use a new `AlgebraicVariable` enum,
that can be either a column reference (`&'a AlgebraicReference`, which
was used previously), or a reference to a public.
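A simplified sketch of the distinction (the real variant wraps `&'a AlgebraicReference` rather than a plain name):

```rust
#[derive(Debug, PartialEq)]
enum AlgebraicVariable<'a> {
    /// A column reference: the value depends on the current row.
    Column(&'a str),
    /// A scalar public: one global value, independent of the row number.
    Public(&'a str),
}

impl AlgebraicVariable<'_> {
    fn is_global(&self) -> bool {
        matches!(self, AlgebraicVariable::Public(_))
    }
}

fn main() {
    assert!(AlgebraicVariable::Public("out").is_global());
    assert!(!AlgebraicVariable::Column("x").is_global());
}
```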
At the onsite, we introduced operators for all lookup-like constraints.
This means that `[x, y] in s $ [a, b];` is not parsed as a lookup
identity any more, but just as a regular expression with binary
operators. The whole concept of a lookup or identity as a Rust type is
now only present after the condenser has run. Because of that, we can
remove a lot of code that was concerned with parsed identities.
Since prover functions were introduced, we can have both constraints and
prover functions at statement level. Because of that I extended the
concept of "identity" (which we partly renamed to "constraint" already)
to "proof item". A proof item is either a constraint or a prover
function. Later on, we might also include fixed columns, challenges,
etc.
Co-authored-by: chriseth <chriseth.github@gmail.com>
We currently hardcode the range of degrees that variable-degree machines
are preprocessed for. Expose that in machines instead.
This changes pil namespaces to accept a min and max degree:
```
namespace main(123..456);
namespace main(5); // allowed for backward compatibility, translates to `5..5`
```
It adds two new builtins:
```
std::prover::min_degree
std::prover::max_degree
```
And sets the behavior of the `std::prover::degree` builtin to only
succeed if `min_degree` and `max_degree` are equal.
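The described behavior of the `degree` builtin can be sketched as follows (illustrative Rust, not the actual evaluator code):

```rust
/// `std::prover::degree` only succeeds if the degree range collapses to
/// a single size; otherwise `min_degree`/`max_degree` must be used.
fn degree(min_degree: u64, max_degree: u64) -> Result<u64, String> {
    if min_degree == max_degree {
        Ok(min_degree)
    } else {
        Err(format!(
            "machine has degree range {min_degree}..{max_degree}; \
             use min_degree or max_degree instead"
        ))
    }
}

fn main() {
    // `namespace main(5);` translates to `5..5`, so `degree` succeeds:
    assert_eq!(degree(5, 5), Ok(5));
    // `namespace main(123..456);` only supports min_degree/max_degree:
    assert!(degree(123, 456).is_err());
}
```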
Builds on #1687. Fixes #1572.
With this PR, we are using dynamic VADCOP in the RISC-V zk-VM.
There were a few smaller fixes needed to make this work. In summary, the
changes are as follows:
- We set the degree of the main machine to `None`, and all fixed lookup
machines to the appropriate size. As a consequence, the CPU, all block
machines & memory have a dynamic size.
- As a consequence, I had to adjust some tests (set the size of all
machines, so they can still be run with monolithic provers) *and* was
able to remove the `Memory_<size>` machines 🎉
- With the main machine being of flexible size, the prover can choose
how long to run it. We run it for `1 << (MAX_DEGREE_LOG - 2)` steps and
compute the bootloader inputs accordingly. With this choice, we can
guarantee that the register memory (which can be up to 4x larger than
the main machine) does not run out of rows.
Note that while we do access `MAX_DEGREE_LOG` in a bunch of places now,
this will go away once #1667 is merged, which will allow us to configure
the degree range in ASM and for each machine individually.
### Example:
```bash
export MAX_LOG_DEGREE=18
cargo run -r --bin powdr-rs compile riscv/tests/riscv_data/many_chunks -o output --continuations
cargo run -r --bin powdr-rs execute output/many_chunks.asm -o output --continuations -w
cargo run -r --features plonky3,halo2 prove output/many_chunks.asm -d output/chunk_0 --field gl --backend plonky3-composite
```
This leads to the following output:
```
== Proving machine: main (size 65536), stage 0
==> Proof stage computed in 1.918317417s
== Proving machine: main__rom (size 8192), stage 0
==> Proof stage computed in 45.847375ms
== Proving machine: main_binary (size 1024), stage 0
==> Proof stage computed in 27.718416ms
== Proving machine: main_bit2 (size 4), stage 0
==> Proof stage computed in 15.280667ms
== Proving machine: main_bit6 (size 64), stage 0
==> Proof stage computed in 17.449875ms
== Proving machine: main_bit7 (size 128), stage 0
==> Proof stage computed in 20.717834ms
== Proving machine: main_bootloader_inputs (size 262144), stage 0
==> Proof stage computed in 524.013375ms
== Proving machine: main_byte (size 256), stage 0
==> Proof stage computed in 17.280167ms
== Proving machine: main_byte2 (size 65536), stage 0
==> Proof stage computed in 164.709625ms
== Proving machine: main_byte_binary (size 262144), stage 0
==> Proof stage computed in 504.743917ms
== Proving machine: main_byte_compare (size 65536), stage 0
==> Proof stage computed in 169.881542ms
== Proving machine: main_byte_shift (size 65536), stage 0
==> Proof stage computed in 146.235916ms
== Proving machine: main_memory (size 32768), stage 0
==> Proof stage computed in 326.522167ms
== Proving machine: main_poseidon_gl (size 16384), stage 0
==> Proof stage computed in 1.324662625s
== Proving machine: main_regs (size 262144), stage 0
==> Proof stage computed in 2.009408667s
== Proving machine: main_shift (size 32), stage 0
==> Proof stage computed in 13.71825ms
== Proving machine: main_split_gl (size 16384), stage 0
==> Proof stage computed in 108.019334ms
Proof generation took 7.364567s
Proof size: 8432928 bytes
Writing output/chunk_0/many_chunks_proof.bin.
```
Note that `main_bootloader_inputs` is still equal to the maximum size;
we should fix that in a follow-up PR!
When making fixed lookup machines smaller in the RISC-V VM (#1683), I
came across the issue that range-constraint lookups (e.g. `[two_bits] in
[TWO_BITS]` where `TWO_BITS = [0, 1, 2, 3]`) were not recognized as
such if the fixed column was *just* the right size (in the above
example, `TWO_BITS = [0, 1, 2, 3, 0]` would have worked).
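The recognition issue can be illustrated with a simplified check (not the actual powdr code): a fixed column acts as a range constraint for `0..=max` exactly when its values cover that range and nothing else, regardless of whether the padding repeats.

```rust
/// Returns `Some(max)` if the column is a range constraint for `0..=max`.
/// Simplified O(n^2) sketch; accepts both `[0, 1, 2, 3]` (column *just*
/// the right size) and the padded `[0, 1, 2, 3, 0]`.
fn as_range_constraint(fixed: &[u64]) -> Option<u64> {
    let max = *fixed.iter().max()?;
    (0..=max).all(|v| fixed.contains(&v)).then_some(max)
}

fn main() {
    assert_eq!(as_range_constraint(&[0, 1, 2, 3]), Some(3));
    assert_eq!(as_range_constraint(&[0, 1, 2, 3, 0]), Some(3));
    // A gap means the column is not a contiguous range:
    assert_eq!(as_range_constraint(&[0, 2, 3]), None);
}
```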
This PR turns the `FixedLookup` into a "normal" machine, i.e., it
implements the `Machine` trait. This removes special handling of the
fixed lookup in various places.
Changes:
- `Machine::take_witness_col_values` now takes a reference to
`MutableState`, similar to `Machine::process_plookup`. With this,
machines can still call other machines while finalizing. This is needed
because some machines appear to call the fixed lookup when finalizing.
- To handle this correctly, I changed the code such that:
- Machines are finalized in the order in which they appear in the
machines list (`FixedLookup` is the last machine)
- When finalizing, machines can access all *following* machines, but not
the ones before, as those are already finalized.
`FixedLookup` is still a weird machine, which is responsible for many
sets of fixed columns (i.e., several ASM machines) which might not even
have the same length. But that can be fixed in a separate PR.
Equivalent to #1623 but in Rust
Closes #1570
@georgwiese I wanted to add a test, but everything I try either panics
(we sometimes assume the fixed columns to be available in a single size)
or runs forever (when I add fixed columns to a machine, I assume witgen
keeps increasing the degree and never stops? Or it picks the largest
size and just takes time?). Any thoughts?
Fixes #1604
With this PR, we bypass machine detection during witness generation of
stages > 0. See [this
comment](https://github.com/powdr-labs/powdr/issues/1604#issuecomment-2257059636)
for a motivation.
This currently needs to be tested manually, as follows:
```
$ RUST_LOG=trace cargo run pil test_data/asm/block_to_block_with_bus.asm -o output -f --field bn254 --prove-with halo2-mock
...
===== Summary for row 7:
main.acc1 = 20713437912485111384541749944547180564950035591542371144095269313127123163196
main.acc2 = 5162472027861336027760332823162682203738251621730423286600997430635718406729
main.z = 3
main.res = 9
main_arith.acc1 = 1174804959354163837704655800710094523598328808873663199602934873448685332421
main_arith.acc2 = 16725770843977939194486072922094592884810112778685611057097206755940090088888
main_arith.acc1_next = 463668501342879563405020640323131794083013726819708055681247370540753473777
main_arith.acc2_next = 20043340305711842349747334022818855888193664738087810292789887394167185113571
main_arith.y = 1
main_arith.z = 1
main.dummy = 0
main.acc1_next = 0
main.acc2_next = 0
main_arith.x = 0
main_arith.sel[0] = 0
---------------------
...
```
`main.acc1 + main_arith.acc1` and `main.acc2 + main_arith.acc2` both
evaluate to
`21888242871839275222246405745257275088548364400416034343698204186575808495617`,
which is the BN254 scalar field prime! In other words, the partial
accumulators sum to 0.
---------
Co-authored-by: Leo <leo@powdrlabs.com>
Another step towards #1572
Builds on #1574
I modified witness generation as follows:
- Each machine keeps track of its current size; whenever a fixed column
value is read, the caller has to pass the requested size as well.
- If fixed columns are available in several sizes, witness generation
starts out by using the largest size, as before.
- When finalizing a block machine, it "downsizes" the machine to the
smallest possible size.
Doing this for other machine types (e.g. VM, memory, etc) should be done
in another PR.
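The downsizing step can be sketched as follows (assuming sizes are powers of two within the machine's allowed range; names are illustrative):

```rust
/// Picks the smallest allowed power-of-two size that still fits the rows
/// actually used by a block machine.
fn downsized(used_rows: usize, min_size: usize, max_size: usize) -> usize {
    used_rows.next_power_of_two().clamp(min_size, max_size)
}

fn main() {
    // 20 used rows fit into a machine of size 32:
    assert_eq!(downsized(20, 8, 1024), 32);
    // Never go below the minimum size...
    assert_eq!(downsized(3, 8, 1024), 8);
    // ...or above the maximum.
    assert_eq!(downsized(5000, 8, 1024), 1024);
}
```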
In the `vm_to_block_dynamic_length.pil` example, witness generation now
picks the minimum size instead of the maximum size for `main_arith`:
```
$ cargo run pil test_data/pil/vm_to_block_dynamic_length.pil -o output -f --field bn254 --prove-with halo2-mock-composite
...
== Proving machine: main (size 256)
==> Machine proof of 256 rows (0 bytes) computed in 60.174583ms
size: 256
Machine: main__rom
== Proving machine: main__rom (size 256)
==> Machine proof of 256 rows (0 bytes) computed in 33.310292ms
size: 32
Machine: main_arith
== Proving machine: main_arith (size 32)
==> Machine proof of 32 rows (0 bytes) computed in 2.766541ms
```
Fixes #1496
Also, a step towards #1572
This PR implements the steps needed in `CompositeBackend` to implement
dynamic VADCOP.
In summary:
- If a machine's size (a.k.a. "degree") is set to `None`, fixed columns
are computed in all powers of two in some hard-coded range. This fixes
#1572. As a result, machines with a size set to `None` are available in
multiple sizes. If the size is explicitly set by the user, the machine
is only available in that one size.
- Note that the ASM linker still sets the size of machines without a
size. So, currently, this can only happen when coming from PIL directly.
- `CompositeBackend` instantiates a new backend for each machine *and
size*:
- The verification key contains a key for each machine and size.
- When proving, it uses the backend of whatever size the witness has.
The size chosen is also stored in the proof.
- When verifying, the verification key of the reported size is used.
- Witness generation currently chooses the largest available size. This
will change in a future PR.
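The "all powers of two in some hard-coded range" part can be sketched as follows (the specific range used here is an assumption, not powdr's actual one):

```rust
/// All candidate sizes for a machine whose size is set to `None`.
fn candidate_sizes(min_degree_log: u32, max_degree_log: u32) -> Vec<usize> {
    (min_degree_log..=max_degree_log).map(|log| 1usize << log).collect()
}

fn main() {
    // With a hypothetical range of 2^5..=2^8:
    assert_eq!(candidate_sizes(5, 8), vec![32, 64, 128, 256]);
}
```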
This is an example:
```
$ cargo run pil test_data/pil/vm_to_block_dynamic_length.pil -o output -f --field bn254 --prove-with halo2-mock-composite
...
== Proving machine: main (size 256)
==> Machine proof of 256 rows (0 bytes) computed in 209.101166ms
== Proving machine: main__rom (size 256)
==> Machine proof of 256 rows (0 bytes) computed in 226.87175ms
== Proving machine: main_arith (size 1024)
==> Machine proof of 1024 rows (0 bytes) computed in 432.807583ms
```
This PR adds `number::VariablySizedColumns`, which can store several
sizes of the same column. Currently, we always just have one size, but
as part of #1496, we can relax that.
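A hedged sketch of the idea (the actual `VariablySizedColumn` API differs):

```rust
use std::collections::BTreeMap;

/// The same fixed column stored at several sizes, keyed by size.
struct VariablySizedColumn<T> {
    by_size: BTreeMap<usize, Vec<T>>,
}

impl<T> VariablySizedColumn<T> {
    /// The values at one particular size, if available.
    fn get(&self, size: usize) -> Option<&[T]> {
        self.by_size.get(&size).map(Vec::as_slice)
    }

    /// Currently there is always exactly one size, so callers that
    /// assume a single size can use this accessor.
    fn unique(&self) -> Option<&[T]> {
        (self.by_size.len() == 1).then(|| self.by_size.values().next().unwrap().as_slice())
    }
}

fn main() {
    let column = VariablySizedColumn {
        by_size: BTreeMap::from([(4, vec![1u64, 2, 3, 4])]),
    };
    assert_eq!(column.get(4), Some(&[1u64, 2, 3, 4][..]));
    assert_eq!(column.unique(), column.get(4));
}
```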