This simulates one approach we could go for when moving registers to
memory. The memory machine remains completely unchanged, but the step is
increased by more than 1 in each step of the main machine. This way,
from the point of view of memory, all the memory operations happen at
different time steps, which allows for:
- Reading from the same address twice
- Writing to the same address that we read from (which from the point of
view of memory should happen *after* the read)
The only downside I see with this approach is that this makes the
differences of time steps between memory accesses bigger: Before it was
at most the degree, now it is some small constant times the degree (in
this example 3). The way the memory machine is currently built, the
difference can be at most $2^{32} - 1$, so I think this is fine in
practice. E.g., for a degree $2^{30}$ machine we could do up to 4
parallel reads / writes.
*Cherry-picked b1a07bd9a7 from #1380, and
extended on it.*
Fixes#1382.
With this PR, a lookup like `selector { byte_lower + 256 * byte_upper }
in { <some other machine> }` works, even if the range constraints on
`byte_lower` and `byte_upper` are not "global". For example, they could
be implemented as `selector { byte_lower } in { BYTES }` (i.e.,
`byte_lower` is only range constrained when the machine call is active).
To make this work, I changed the `Machine::process_plookup` interface
like this:
```diff
fn process_plookup<'b, Q: QueryCallback<T>>(
&mut self,
mutable_state: &'b mut MutableState<'a, 'b, T, Q>,
identity_id: u64,
- args: &[AffineExpression<&'a AlgebraicReference, T>],
+ caller_rows: &'b RowPair<'b, 'a, T>,
) -> EvalResult<'a, T>;
```
The `RowPair` passed by the caller contains all range constraints known
at runtime. The LHS of the lookup (or permutation) is no longer
evaluated by the caller but by the callee. For this, the callee needs to
remember the identity associated with the `identity_id` (before this PR,
most machines just remembered the RHS, not the full identity). I don't
expect there to be any performance implications, because we only invoke
one machine (since #1154).
### Benchmark results
```
executor-benchmark/keccak
time: [14.609 s 14.645 s 14.678 s]
change: [-2.5984% -2.3127% -2.0090%] (p = 0.00 < 0.05)
Performance has improved.
executor-benchmark/many_chunks_chunk_0
time: [39.299 s 39.380 s 39.452 s]
change: [-3.9505% -3.6909% -3.4063%] (p = 0.00 < 0.05)
Performance has improved.
```
---------
Co-authored-by: Leo <leo@powdrlabs.com>
This PR is part of issue https://github.com/powdr-labs/powdr/pull/1345.
In particular, it adds the struct BlockExpression to Expressions to
homogenize the structure before including source references.
This PR is part of issue https://github.com/powdr-labs/powdr/pull/1345.
In particular, it adds the struct MatchExpression to Expressions to
homogenise the structure before including source references.
This PR is part of issue https://github.com/powdr-labs/powdr/pull/1345.
In particular, it adds the struct UnaryOperation to Expressions to
homogenise the structure before including source references.
This PR is part of issue #1345.
In particular, it adds the struct BinaryOperation to Expressions to
homogenise the structure before including source references.
---------
Co-authored-by: Thibaut Schaeffer <schaeffer.thibaut@gmail.com>
Cherry-picked ef6a72fcfa from #1380.
With this PR, we track whether a call to a machine led to some side
effect (e.g. added a block). In that case, the processed identity should
count has having led to some progress, even if no values were returned
to the calling machine. An example would be writing values to memory,
which does not return any values and hence does not change the state of
the caller.
This fixes a soundness bug in the recently added permutation argument:
To fingerprint a list $A = (a_1, ..., a_n)$ with selectors $S = (s_1,
..., s_n)$, we were computing:
$$
\prod_{i = 1}^n (X - s_i \cdot a_i)
$$
As a result, any element $a_i$ can be replaced with $0$, by setting the
selector to $0$!
Now, an element $i$ with $s_i = 0$ does not contribute to the product
anymore:
$$
\prod_{i = 1}^n (s_i \cdot (X - a_i - 1) + 1)
$$
Fixes an issue that @leonardoalt had on his `binary-mux2` branch.
There are two ways to have a block machine that is connected via a
permutation:
1. Use permutations `<sel> { ... } is (sub.sel * sub.LATCH) { ... }`.
This makes sure only rows where `sub.LATCH` is `1` can be selected. This
is what we do when we compile from ASM to PIL.
2. Use permutations `<sel> { ... } is sub.sel { ... }`, but also a
constraint `(1 - sub.LATCH) * sub.sel = 0`. This achieves something
similar.
The problem is that in the second case, detecting the block size is
harder, because the latch doesn't appear anywhere in the selector. So we
used to look into *all* fixed columns to detect the period. But this
includes fixed columns that might have a larger period (as is the case
for the multiplexer machine).
This PR simply removes support for the second approach. I think this is
fine in practice, as I don't see a disadvantage of the first approach
and when you come from ASM everything works as expected. I did need to
adjust `test_data/pil/block_lookup_or_permutation.pil`, which used the
second approach.
In Halo2, we need to simulate the "wrapping" behavior that we expect
from the backend, by copying the first row of witness columns after the
last. This was previously implemented by using copy constraints.
Considering [how the argument
works](https://zcash.github.io/halo2/design/proving-system/permutation.html),
this leads to 1 additional fixed column during setup *per witness
column*, and several additional witness columns (depending on the degree
bound, but linear in the number of witness columns).
But because Halo2 offers rotations of arbitrary offsets, we can instead
constrain this as:
`first_step * (witness_column - witness_column'degree)`
On the `simple_sum` example, this improved the proof time by ~15%
(353.01825ms -> 302.067125ms). The benchmarks on Keccak are still
running...
As suggested by @lvella .
I'm not too sure about this actually. I agree that compiling Rust ->
powdr-asm should have a separate pipeline somewhere, but before we could
do Rust -> proof in a single command, and with this we can't anymore.
- Do we expose the `prove` command and everything else here as well?
That'd be a lot of duplicated code.
- Do we force the user to first run `powdr-rs compile <rust_proj>` then
`powdr pil/prove ...`? That's potentially worse UX than right now.
- Do we keep all the Rust stuff in the `cli` crate?
Depends on #1346
Adds a module to the standard library to implement extension field
arithmetic. This should work and be correct on both the Goldilocks and
BN254 field. Needed for #1306.
---------
Co-authored-by: chriseth <chris@ethereum.org>
This PR is part of issue https://github.com/powdr-labs/powdr/pull/1345.
In particular, it adds the struct Number to Expressions to homogenize
the structure before including source references.
Adds binary operation precedence support to avoid unnecessary
parentheses in expression printed format
- #962
---------
Co-authored-by: chriseth <chris@ethereum.org>
This PR attempts various issues around using challenges in hints, which
is blocking #1306:
1. Hints of later-phase witness columns are now removed in witgen, as
these columns don't need to be computed yet anyway and the hint might be
accessing a challenge that does not exist.
2. The query callback is now cloned for each phase of witness generation
(because otherwise it was only available in the first phase).
3. `SymbolicEvaluator` no longer panics when encountering challenges,
but returns an error. This evaluator is used to detect patterns in
identities, like `A' - A = 0`. This means that we can't detect patterns
in identities that involve challenges, but at least it doesn't panic.
4. `witgen::query_processor::Symbols` can now evaluate challenges.
5. `witgen::query_processor::Symbols` now also looks up intermediate
"polynomials" (which includes challenges). This is necessary because we
don't currently inline intermediate polynomials in hints (which we do
for identities).
I added a test that demonstrates that challenges can now be used in
hints.
This PR
- removes the riscv dev dependency from the pipeline crate
- splits the benchmark into riscv and pil benches
- fixes the query callback in the keccak bench