Commit Graph

39 Commits

Author SHA1 Message Date
chriseth
e2eadffa2a Create is_first inside the lookup. (#1796)
Since we can now create constant columns inside `constr` functions, we
don't need to pass `is_first` any more.
2024-09-13 08:31:02 +00:00
chriseth
b2a06c2b03 More prover functions for std. (#1762) 2024-09-11 14:20:09 +00:00
Steve Wang
6c1c31a4da BabyBear shift machine (#1784)
All tests passed. :)

Operations:
- shl<0> A1, A2, B -> C1, C2
- shr<1> A1, A2, B -> C1, C2
- `A1` and `A2` are the 16-bit limbs of the 32-bit `A` in little-endian order.
Likewise for `C1` and `C2`.

Implementation:
- We adopted a similar implementation to our prior shift machine, which
decomposes `A` to 4 bytes and looks up each byte to a lookup table of
`[A_byte, B (shift amount), block row, operation id]`, so the size of
the lookup table is `[256, 32, 4, 2] = 65536`. Each row looks up to the
resulting byte after the bit shifting, and the results are added
together to obtain `C`.
- In our design, instead of looking up a 32-bit `C` column, we
look up two 16-bit columns, `C1` and `C2`. Overall, there are more
witness columns due to decomposing into 16-bit limbs in the main shift
machine and one more fixed column in the lookup table, but the same
number of lookups is performed.
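
The byte-wise decomposition described above can be sketched in plain Python. This is only an illustration of the idea, not the powdr asm source: the table layout and the names `build_table` and `shift` are assumptions made for this example.

```python
# Illustrative sketch only: names and table layout are assumptions, not the powdr source.
MASK32 = 0xFFFFFFFF

def build_table():
    """Map (byte, shift amount, block row, operation id) to two 16-bit result limbs."""
    table = {}
    for op in (0, 1):                  # 0 = shl, 1 = shr
        for row in range(4):           # position of the byte inside the 32-bit word
            for b in range(32):        # shift amount
                for byte in range(256):
                    placed = byte << (8 * row)
                    if op == 0:
                        shifted = (placed << b) & MASK32
                    else:
                        shifted = placed >> b
                    table[(byte, b, row, op)] = (shifted & 0xFFFF, shifted >> 16)
    return table

TABLE = build_table()

def shift(a1, a2, b, op):
    """Shift the 32-bit value with little-endian 16-bit limbs (a1, a2) by b bits."""
    a = a1 | (a2 << 16)
    c1 = c2 = 0
    for row in range(4):               # one lookup per byte, as in the block machine
        byte = (a >> (8 * row)) & 0xFF
        lo, hi = TABLE[(byte, b, row, op)]
        c1 += lo                       # the per-byte results occupy disjoint bits,
        c2 += hi                       # so adding them never overflows a limb
    return c1, c2
```

The table has `256 * 32 * 4 * 2 = 65536` entries, matching the size stated above, and each shift performs four lookups whose results are summed per limb.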

Future optimization:
- There's ample ground for "reshaping" the main machine to have more
columns but fewer rows, and do more lookups in each row so that we are
not just processing one byte in each row. In the most aggressive case,
we can even process everything in the same row.
- For example, processing two bytes in one row (instead of one byte)
should have half the number of rows but less than twice the number of
columns, and therefore fewer cells in total. This should be good for
provers with time linear to the number of cells.

---------

Co-authored-by: onurinanc <e191322@metu.edu.tr>
2024-09-11 13:28:23 +00:00
Leo
c9ebdd9682 Baby bear initial machine tests (#1682) 2024-08-30 12:43:07 +00:00
Georg Wiese
34fdbd1ccd RISC-V machine: Use dynamic VADCOP (#1683)
Builds on #1687
Fixed #1572 

With this PR, we are using dynamic VADCOP in the RISC-V zk-VM.

There were a few smaller fixes needed to make this work. In summary, the
changes are as follows:
- We set the degree of the main machine to `None`, and all fixed lookup
machines to the appropriate size. As a consequence, the CPU, all block
machines & memory have a dynamic size.
- As a consequence, I had to adjust some tests (set the size of all
machines, so they can still be run with monolithic provers) *and* was
able to remove the `Memory_<size>` machines 🎉
- With the main machine being of flexible size, the prover can choose for
how long to run it. We run it for `1 << (MAX_DEGREE_LOG - 2)` steps and
compute the bootloader inputs accordingly. With this choice, we can
guarantee that the register memory (which can be up to 4x larger than
the main machine) does not run out of rows.
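
The sizing argument in the last bullet is simple arithmetic and can be sketched as follows. The function names are hypothetical; only the `1 << (MAX_DEGREE_LOG - 2)` formula and the factor of 4 come from the description above.

```python
# Hypothetical helper names; the formula and the 4x factor are taken from the text above.
def main_machine_steps(max_degree_log: int) -> int:
    """Number of steps the main machine is run for."""
    return 1 << (max_degree_log - 2)

def register_memory_fits(max_degree_log: int, factor: int = 4) -> bool:
    """Register memory may need up to `factor` rows per main-machine step."""
    worst_case_rows = factor * main_machine_steps(max_degree_log)
    return worst_case_rows <= (1 << max_degree_log)
```

With a maximum log degree of 18, this gives 65536 main-machine steps and at most 262144 register memory rows, which is consistent with the `main` and `main_regs` sizes in the example output below.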

Note that while we do access `MAX_DEGREE_LOG` in a bunch of places now,
this will go away once #1667 is merged, which will allow us to configure
the degree range in ASM and for each machine individually.

### Example:
```bash
export MAX_LOG_DEGREE=18
cargo run -r --bin powdr-rs compile riscv/tests/riscv_data/many_chunks -o output --continuations
cargo run -r --bin powdr-rs execute output/many_chunks.asm -o output --continuations -w
cargo run -r --features plonky3,halo2 prove output/many_chunks.asm -d output/chunk_0 --field gl --backend plonky3-composite
```

This leads to the following output:
```
== Proving machine: main (size 65536), stage 0
==> Proof stage computed in 1.918317417s
== Proving machine: main__rom (size 8192), stage 0
==> Proof stage computed in 45.847375ms
== Proving machine: main_binary (size 1024), stage 0
==> Proof stage computed in 27.718416ms
== Proving machine: main_bit2 (size 4), stage 0
==> Proof stage computed in 15.280667ms
== Proving machine: main_bit6 (size 64), stage 0
==> Proof stage computed in 17.449875ms
== Proving machine: main_bit7 (size 128), stage 0
==> Proof stage computed in 20.717834ms
== Proving machine: main_bootloader_inputs (size 262144), stage 0
==> Proof stage computed in 524.013375ms
== Proving machine: main_byte (size 256), stage 0
==> Proof stage computed in 17.280167ms
== Proving machine: main_byte2 (size 65536), stage 0
==> Proof stage computed in 164.709625ms
== Proving machine: main_byte_binary (size 262144), stage 0
==> Proof stage computed in 504.743917ms
== Proving machine: main_byte_compare (size 65536), stage 0
==> Proof stage computed in 169.881542ms
== Proving machine: main_byte_shift (size 65536), stage 0
==> Proof stage computed in 146.235916ms
== Proving machine: main_memory (size 32768), stage 0
==> Proof stage computed in 326.522167ms
== Proving machine: main_poseidon_gl (size 16384), stage 0
==> Proof stage computed in 1.324662625s
== Proving machine: main_regs (size 262144), stage 0
==> Proof stage computed in 2.009408667s
== Proving machine: main_shift (size 32), stage 0
==> Proof stage computed in 13.71825ms
== Proving machine: main_split_gl (size 16384), stage 0
==> Proof stage computed in 108.019334ms
Proof generation took 7.364567s
Proof size: 8432928 bytes
Writing output/chunk_0/many_chunks_proof.bin.
```

Note that `main_bootloader_inputs` still has the maximum size. We
should fix that in a follow-up PR!
2024-08-15 11:27:52 +00:00
chriseth
440e555c8f Change display for witness columns. (#1665) 2024-08-15 08:56:58 +00:00
onurinanc
47dd48399d Implement LogUp via Bus (#1624)
LogUp implementation using Bus

As @georgwiese suggested in
https://github.com/powdr-labs/powdr/issues/1573#issuecomment-2250875201

- That makes the machine detection simpler, because we wouldn't have
constraints that depend on both the LHS and RHS of a lookup.
- It's what we'll actually need for VADCOP, so maybe we don't even have
to support std::protocols::lookup as a first step.
2024-08-05 09:05:16 +00:00
Leo
86be6d2509 Make main instantiate all submachines and pass them as arguments (#1606) 2024-07-26 13:23:31 +00:00
chriseth
ac3b96eecf Even more quick tests (#1613) 2024-07-26 08:16:45 +00:00
onurinanc
60acaec5fc Remove some unused code in some tests & std (#1592) 2024-07-22 11:50:09 +00:00
onurinanc
4f873e6205 Implement Basic Bus (#1566)
Related to the issue of implementing basic bus (#1497), I have
implemented basic bus together with an example
(`permutation_via_bus.asm`) as specified inside the issue.

Currently, `test_data/std/bus_permutation_via_challenges.asm` works as
intended. (To make it sound, the stage(1) witness columns need to be
exposed publicly and the verifier needs to check a condition such as
`out_z1 + out_z2 = 0`.) For now, this can be checked by running with
`RUST_LOG=trace` and verifying that the final `z1` and `z2` add up to 0.

However, `test_data/std/bus_permutation_via_challenges_ext.asm` is not
working correctly as intended. This will be fixed with the following
commits.
2024-07-17 10:50:57 +00:00
Georg Wiese
b6f41e242c Add PoseidonGLMemory machine (#1525)
Implements #1055 for the Poseidon machines. Pulled out of #1508.

Specifically, this PR adds a new `PoseidonGLMemory` machine which
receives 2 memory pointers and then:
- Reads 24 32-Bit words and packs them into 12 field elements
- Computes the Poseidon permutation (just like `PoseidonGL`)
- For each of the 4 output field elements, it:
  - Invokes the `SplitGL` machine to get the canonical `u64`
    representation
  - Writes the 8 32-Bit words to memory at the provided memory pointer
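
The packing and splitting steps around the permutation can be sketched in Python (the Poseidon permutation itself is omitted). This is a sketch under assumptions: the word order (low word first) and the helper names `pack_words` and `split_elements` are chosen for illustration and are not taken from the machine's source.

```python
# Goldilocks prime. Word order (low word first) is an assumption for this sketch.
P = 2**64 - 2**32 + 1

def pack_words(words):
    """Pack 24 32-bit words into 12 field elements, two words per element."""
    assert len(words) == 24
    assert all(0 <= w < 2**32 for w in words)
    return [(words[2 * i] + (words[2 * i + 1] << 32)) % P for i in range(12)]

def split_elements(elements):
    """Split each field element into the two 32-bit words of its canonical u64 value."""
    words = []
    for e in elements:
        canonical = e % P              # canonical representative in [0, P)
        words.append(canonical & 0xFFFFFFFF)
        words.append(canonical >> 32)
    return words
```

Four output elements yield the 8 words that are written back to memory.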

The read and write memory regions can even overlap! 🎉 

This should simplify our RISC-V machine, as the syscall already expects
two memory pointers. We can simply pass it to the machine directly.

I started doing that in #1533, but I think it makes sense to wait until
#1443 is merged.

To test:
```
cargo run -r pil test_data/std/poseidon_gl_test.asm -o output -f --export-csv --prove-with estark-starky
```

I recommend reviewing the diff between
`std/machines/hash/poseidon_gl.asm` and
`std/machines/hash/poseidon_gl_memory.asm`

### Discussion

The overhead of the memory read / write is quite high (18 extra witness
columns, see [this
comment](40bdca4368/std/machines/hash/poseidon_gl_memory.asm (L13-L23)),
mostly because we now need to have the input available in all rows
(which previously was only the case for the outputs). If we had offsets
other than 0 and 1, this could be avoided. Doing 24 parallel memory
reads in the first row would *not* help, because we'd have to add 24
witness columns (instead of 2 now) to store the result of the memory
operation.

A few more notes:
- With Vadcop, 18 extra witness columns in a secondary machine is *a
lot* better than introducing more registers (either "regular" registers
or assignment registers) in the main machine
- As mentioned
[here](40bdca4368/std/machines/hash/poseidon_gl_memory.asm (L111-L113)),
we could get rid of two permutations if either:
  - We were able to express explicitly that we want to call at most one
    operation in the current row, or
  - We had an optimizer that would be smart enough to batch the memory
    reads and writes.
- We could also have just 1 read or write at a time (instead of 2), but
we'd have to increase the block size from 31 to 32 and the
implementation would be more complicated.
- We could also store the full final state of the Poseidon permutation,
instead of just the first 4 elements. This would need 8 more witness
columns to make the entire output available in all rows. Then, one could
use the machine to implement a Poseidon sponge, instead of.
- Looking at the bootloader, maybe it makes sense to pass 3 input
pointers instead of 1: One for the first 4 elements, one for the next 4,
and one for the capacity (often just a constant). For example, when
computing a Merkle root, you'd pass pointers for the two children hashes
and a pointer to the capacity constant.
2024-07-08 15:20:39 +00:00
Georg Wiese
f0af0c18c3 Remove all top level declarations (#1512)
Includes #1511, but also removes the declarations of the challenges in
the `permutation` and `lookup` modules.

With these changes, we never declare a column or challenge outside a
machine namespace. This greatly simplifies #1470.

---------

Co-authored-by: schaeff <thibaut@schaeff.fr>
Co-authored-by: chriseth <chris@ethereum.org>
2024-07-02 13:02:59 +00:00
Leandro Pacheco
72b4d739d7 merge compatible links from different instructions (#1467)
Merge compatible `link`s into a single permutation/lookup.
We only consider merging links from different instructions, as only a
single instruction can be active at a time.
Links with next references are ignored due to a limitation in witgen
(left a TODO so it's easily fixed once witgen supports them).
2024-07-02 10:53:07 +00:00
onurinanc
9cce7ea4f9 implement lookup arguments from logarithmic derivatives in pil (#1477)
This PR solves #1374

There are 3 examples included in the PR:

1. `lookup_via_challenges_ext_simple.asm` is the most basic example
which implements {a} in {b} using extension field without using
selectors
2. `lookup_via_challenges_ext_no_selector.asm` implements {a1, a2} in
{b1, b2} using extension field without using selectors
3. `lookup_via_challenges_ext.asm` is a more complex example than the others
which implements {a1, a2, a3} in {b1, b2, b3} using extension field
(which also handles tuples using Reed-Solomon fingerprinting). It also
uses different lhs and rhs selectors.
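
The identity behind a logarithmic-derivative lookup can be sketched in Python. This is a simplified illustration, not the PIL code from the PR: it works in the Goldilocks base field rather than the extension field, omits selectors, assumes the right-hand side has no duplicate rows, and uses made-up helper names.

```python
from collections import Counter

# Simplified sketch: base field only, no selectors, distinct rhs rows assumed.
P = 2**64 - 2**32 + 1  # Goldilocks prime

def inv(x):
    """Modular inverse via Fermat's little theorem (x must be non-zero mod P)."""
    return pow(x, P - 2, P)

def fingerprint(row, alpha):
    """Fold a tuple into one field element using powers of the challenge alpha."""
    acc = 0
    for value in row:
        acc = (acc * alpha + value) % P
    return acc

def logup_sums(lhs, rhs, alpha, beta):
    """Return both sides of sum 1/(beta - a_i) = sum m_j/(beta - b_j)."""
    multiplicity = Counter(lhs)        # how often each rhs row is looked up
    left = sum(inv((beta - fingerprint(r, alpha)) % P) for r in lhs) % P
    right = sum(
        multiplicity[r] * inv((beta - fingerprint(r, alpha)) % P) for r in rhs
    ) % P
    return left, right
```

For a valid lookup both sums agree. If some left-hand row is missing from the right-hand side, they differ except with small probability over the choice of challenges.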

---------

Co-authored-by: Georg Wiese <georgwiese@gmail.com>
2024-06-30 13:04:15 +00:00
Georg Wiese
1c465e566b Add MemoryWithBootloaderWrite machine to std (#1463)
This PR adds a new memory machine to the standard library that also
supports a `mstore_bootloader` operation: It's like a normal `mstore`,
but the first access to every memory cell *has to* come from this
operation.

As a follow-up, we'll be able to use it in the RISC-V machine (#1462).

I think this is best reviewed by reviewing the diff to the normal memory
machine and comparing it to [the code we have currently inlined in the
RISC-V
machine](abbe26618f/riscv/src/code_gen.rs (L500-L553)):

```diff
3,5c3,7
< // A read/write memory, similar to that of Polygon:
< // https://github.com/0xPolygonHermez/zkevm-proverjs/blob/main/pil/mem.pil
< machine Memory with
---
> /// This machine is a slightly extended version of std::machines::memory::Memory,
> /// where in addition to mstore, there is an mstore_bootloader operation. It behaves
> /// just like mstore, except that the first access to each memory cell must come
> /// from the mstore_bootloader operation.
> machine MemoryWithBootloaderWrite with
7c9
<     operation_id: m_is_write,
---
>     operation_id: operation_id,
13a16
>     operation mstore_bootloader<2> m_addr, m_step, m_value ->;
29a33
>     col witness m_is_bootloader_write;
30a35,36
>     std::utils::force_bool(m_is_bootloader_write);
>     col operation_id = m_is_write + 2 * m_is_bootloader_write;
35a42
>     (1 - is_mem_op) * m_is_bootloader_write = 0;
37,39c44,45
<     // If the next line is a not a write and we have an address change,
<     // then the value is zero.
<     (1 - m_is_write') * m_change * m_value' = 0;
---
>     // The first operation of a new address has to be a bootloader write
>     m_change * (1 - m_is_bootloader_write') = 0;
41,42c47,52
<     // change has to be 1 in the last row, so that a first read on row zero is constrained to return 0
<     (1 - m_change) * LAST = 0;
---
>     // m_change has to be 1 in the last row, so that the above constraint is triggered.
>     // An exception to this when the last address is -1, which is only possible if there is
>     // no memory operation in the entire chunk (because addresses are 32 bit unsigned).
>     // This exception is necessary so that there can be valid assignment in this case.
>     pol m_change_or_no_memory_operations = (1 - m_change) * (m_addr + 1);
>     LAST * m_change_or_no_memory_operations = 0;
46c56
<     (1 - m_is_write') * (1 - m_change) * (m_value' - m_value) = 0;
---
>     (1 - m_is_write' - m_is_bootloader_write') * (1 - m_change) * (m_value' - m_value) = 0;
```
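
The behaviour enforced by the changed constraints can be sketched as a plain trace checker in Python. The trace format and the name `check_trace` are assumptions for illustration; the actual machine expresses these rules as polynomial constraints over sorted rows.

```python
# Hypothetical trace format: (addr, step, kind, value) with
# kind in {"read", "write", "bootloader_write"}.
def check_trace(ops):
    """Check the memory rules described by the diff above."""
    last_value = {}
    # The machine works on operations sorted by address, then by time step.
    for addr, step, kind, value in sorted(ops, key=lambda op: (op[0], op[1])):
        if addr not in last_value:
            # The first operation on a new address has to be a bootloader write.
            if kind != "bootloader_write":
                return False
            last_value[addr] = value
        elif kind == "read":
            # A read must return the value of the previous operation on this address.
            if value != last_value[addr]:
                return False
        else:
            # Both kinds of write may change the value.
            last_value[addr] = value
    return True
```

Unlike the plain memory machine, a first read of an uninitialized cell is rejected instead of being constrained to return 0.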
2024-06-24 21:02:07 +00:00
Leandro Pacheco
abbe26618f Instructions with link statements (#1439)
Allow VM instructions to use the `link` notation, unifying the way
machines are linked from VMs and block machines.
The previous syntax for "external instructions" is no longer allowed;
use the new `link` syntax instead.
2024-06-18 17:31:38 +00:00
Georg Wiese
7a851317bc Implement permutation argument using extension field (#1306)
Makes the permutation argument sound on the Goldilocks field by
evaluating polynomials on the extension field introduced in #1310.

I also used the new `Constr::Permutation` variant!

A few test cases (also tested in CI):

#### No extension field
`cargo run pil test_data/std/permutation_via_challenges.asm -o output -f
--field bn254 --prove-with halo2-mock`

This still works and produces the same output as before, thanks to the
PIL evaluator removing multiplications by 0 etc:

```
    col witness stage(1) z;
    (std::protocols::permutation::is_first * (main.z - 1)) = 0;
    ((((1 - main.first_four) * ((std::protocols::permutation::beta1 - ((std::protocols::permutation::alpha1 * main.b1) + main.b2)) - 1)) + 1) * main.z') = (((main.first_four * ((std::protocols::permutation::beta1 - ((std::protocols::permutation::alpha1 * main.a1) + main.a2)) - 1)) + 1) * main.z);
```
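
To make the constraint above easier to read, here is a Python sketch that evaluates the same recurrence on a small example. It is an illustration only: the modulus is a stand-in prime (not BN254), and the function names and example rows are made up.

```python
# Stand-in prime for illustration (2^61 - 1); the test case itself runs on BN254.
P = 2**61 - 1

def factor(sel, row, alpha, beta):
    """sel * (beta - (alpha * x1 + x2) - 1) + 1: the folded tuple if selected, else 1."""
    x1, x2 = row
    return (sel * (beta - (alpha * x1 + x2) - 1) + 1) % P

def accumulate(a_rows, sel_a, b_rows, sel_b, alpha, beta):
    """Fill z from z[0] = 1 using z' * factor_b = z * factor_a.

    Returns the column and the value that wraps around to row 0.
    """
    z = [1]
    for a, sa, b, sb in zip(a_rows, sel_a, b_rows, sel_b):
        fa = factor(sa, a, alpha, beta)
        fb = factor(sb, b, alpha, beta)
        z.append(z[-1] * fa * pow(fb, P - 2, P) % P)
    return z[:-1], z[-1]

# Example: the first four rows of (a1, a2) are a permutation of the last four of (b1, b2).
first_four = [1, 1, 1, 1, 0, 0, 0, 0]
not_first_four = [1 - s for s in first_four]
a_rows = [(1, 2), (3, 4), (5, 6), (7, 8)] + [(0, 0)] * 4
b_rows = [(0, 0)] * 4 + [(7, 8), (1, 2), (5, 6), (3, 4)]
z, wrap = accumulate(a_rows, first_four, b_rows, not_first_four, 5, 1000003)
```

Because the constraint also applies from the last row back to the first, where `is_first` forces `z = 1`, the argument is satisfiable only if `wrap` equals 1, that is, if the two products of selected factors agree.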

#### With extension field
`cargo run pil test_data/std/permutation_via_challenges_ext.asm -o
output -f --field bn254 --prove-with halo2-mock`

The constraints are significantly more complex but seem correct to me:

```
    col witness stage(1) z1;
    col witness stage(1) z2;
    (std::protocols::permutation::is_first * (main.z1 - 1)) = 0;
    (std::protocols::permutation::is_first * main.z2) = 0;
    (((((1 - main.first_four) * ((std::protocols::permutation::beta1 - ((std::protocols::permutation::alpha1 * main.b1) + main.b2)) - 1)) + 1) * main.z1') + ((7 * ((1 - main.first_four) * (std::protocols::permutation::beta2 - (std::protocols::permutation::alpha2 * main.b1)))) * main.z2')) = ((((main.first_four * ((std::protocols::permutation::beta1 - ((std::protocols::permutation::alpha1 * main.a1) + main.a2)) - 1)) + 1) * main.z1) + ((7 * (main.first_four * (std::protocols::permutation::beta2 - (std::protocols::permutation::alpha2 * main.a1)))) * main.z2));
    ((((1 - main.first_four) * (std::protocols::permutation::beta2 - (std::protocols::permutation::alpha2 * main.b1))) * main.z1') + ((((1 - main.first_four) * ((std::protocols::permutation::beta1 - ((std::protocols::permutation::alpha1 * main.b1) + main.b2)) - 1)) + 1) * main.z2')) = (((main.first_four * (std::protocols::permutation::beta2 - (std::protocols::permutation::alpha2 * main.a1))) * main.z1) + (((main.first_four * ((std::protocols::permutation::beta1 - ((std::protocols::permutation::alpha1 * main.a1) + main.a2)) - 1)) + 1) * main.z2));
```
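
The two long constraints have the shape of a single multiplication in a degree-2 extension field. The factor 7 suggests elements of the form `x1 + x2 * X` with `X^2 = 7`; that reading, and the names below, are assumptions made for this sketch.

```python
# Sketch of multiplication in F_p[X] / (X^2 - 7); the value 7 is read off the constraints.
P = 2**64 - 2**32 + 1  # Goldilocks prime, used here as the base field
W = 7                  # X^2 = W

def ext_mul(a, b):
    """(a1 + a2*X) * (b1 + b2*X) with X^2 = W, returned as a pair of components."""
    a1, a2 = a
    b1, b2 = b
    return ((a1 * b1 + W * a2 * b2) % P, (a1 * b2 + a2 * b1) % P)
```

Under this reading, the constraints state that `ext_mul(factor_b, (z1', z2'))` equals `ext_mul(factor_a, (z1, z2))`, component by component.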

#### On Goldilocks

Running the first example on GL fails, because using the permutation
argument without the extension field would not be sound. The second
example works, but because we don't support challenges on GL yet, it
doesn't actually run the second-phase witness generation.

---------

Co-authored-by: chriseth <chris@ethereum.org>
2024-06-12 11:15:50 +00:00
Gastón Zanitti
ed64f5ce09 Allow the type checker to accept empty values (#1393)
Co-authored-by: chriseth <chris@ethereum.org>
2024-06-11 15:12:39 +00:00
Steve Wang
fba0ba8d08 modul asm only (#1410)
Only the asm file changes that are ready to merge. To be paired with
another PR that will be updated and merged later: #1404
2024-05-30 15:19:56 +00:00
Georg Wiese
0105da5b4d Add test for parallel memory accesses (#1403)
This simulates one approach we could go for when moving registers to
memory. The memory machine remains completely unchanged, but the step is
increased by more than 1 in each step of the main machine. This way,
from the point of view of memory, all the memory operations happen at
different time steps, which allows for:
- Reading from the same address twice
- Writing to the same address that we read from (which from the point of
view of memory should happen *after* the read)

The only downside I see with this approach is that this makes the
differences of time steps between memory accesses bigger: Before it was
at most the degree, now it is some small constant times the degree (in
this example 3). The way the memory machine is currently built, the
difference can be at most $2^{32} - 1$, so I think this is fine in
practice. E.g., for a degree $2^{30}$ machine we could do up to 4
parallel reads / writes.
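
The time-step scheme can be sketched in Python as follows. The helper names are hypothetical; the bound of $2^{32} - 1$ and the factor of 3 come from the description above.

```python
# Hypothetical helpers illustrating the time-step scheme described above.
MAX_STEP_DIFF = 2**32 - 1  # largest time-step difference the memory machine supports

def memory_step(row, slot, accesses_per_row):
    """Time step seen by memory for the `slot`-th access in main-machine row `row`."""
    assert 0 <= slot < accesses_per_row
    return row * accesses_per_row + slot

def max_parallel_accesses(degree):
    """Largest k such that the last time step, k * degree - 1, stays within the bound."""
    return (MAX_STEP_DIFF + 1) // degree
```

With 3 accesses per row, the accesses of row 5 get time steps 15, 16 and 17, so a read in slot 0 is ordered before a write to the same address in slot 1.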
2024-05-29 13:18:44 +00:00
chriseth
3e43e33796 Constr as user-defined enum (#1252)
Turns some of the built-in types into user types in the prelude.
2024-05-01 10:29:19 +00:00
chriseth
75968f286a Btree data structure (#1299)
Co-authored-by: Georg Wiese <georgwiese@gmail.com>
2024-04-30 09:40:58 +00:00
Leandro Pacheco
cea207ff3f Machine properties using with syntax (#1267)
This implements issue #1251.
Basically `machine Foo(a,b) { ... }` is now `machine Foo with latch: a,
operation_id: b { ... }`
2024-04-25 16:02:01 +00:00
Georg Wiese
46e8bb8ee5 Move STD machines into submodule (#1201)
Moves all machines in the standard library to `std::machines`. That way,
it is separated from the PIL utilities in STD.
2024-04-25 12:14:33 +00:00
Georg Wiese
a917e4f35a Add permutation to PIL STD (#1297)
First part of #1296 

This PR adds a `permutation()` function to the standard library. The
code is inspired by the `permutation_via_challenges` test (removed now),
[this
comment](https://github.com/powdr-labs/powdr/issues/424#issuecomment-1931686047)
by @chriseth and [this Halo2
implementation](https://github.com/privacy-scaling-explorations/halo2/blob/main/halo2_proofs/examples/shuffle.rs).
2024-04-25 07:32:35 +00:00
Georg Wiese
96893cc143 Add Write-once memory to STD (#1202)
Fixes #844

This PR adds a new machine to the STD: `WriteOnceMemory`. This can be
used in our RISC-V machine for bootloader inputs (#1203).

Most of the issues mentioned in the issue were fixed in the meantime or
had a simple workaround (like defining `let LATCH = 1`). The only
remaining issues were in the machine detection, which I fixed here.

I also refactored two existing tests.
2024-03-26 18:50:30 +00:00
Georg Wiese
a18577fed2 Add read/write memory to standard library (#1129)
With the recent changes by @pacheco, we can extract our [memory
machine](https://github.com/powdr-labs/powdr/blob/main/riscv/src/compiler.rs#L687-L841)
as a separate machine and add it to the standard library.

The result should be the same as calling the function linked above with
`with_bootloader=false`, except that the memory alignment stuff is not
inlined. For this reason, the machine is not yet used by the RISC-V
machine, but it could be after #1077 is implemented.

[This](eb320dca0c) shows the diff from
what we have in `compiler.rs`.

2024-03-25 17:33:55 +00:00
Georg Wiese
10aeae6cb5 Allow and use permutation in all std machines 2024-03-21 17:20:18 +01:00
Georg Wiese
f87a760071 Hints for arithmetic machine 2024-03-18 11:57:40 +01:00
Georg Wiese
3ce5a9f98a Add tests for Shift and Binary machines 2024-02-26 11:24:03 +01:00
Georg Wiese
40fe394c19 Add equations 1-4 2024-02-08 11:41:53 +01:00
Georg Wiese
9ea67369ce Arithmetic machine: Add equation selectors 2024-02-06 16:55:20 +01:00
Leandro Pacheco
39c5c62028 AST parsing and impl Display fixes
- `instr x = y` statements must end with semicolon
- fixed Display implementation for AST objects such that
  source->AST->source->AST works properly
- tests for the above
2024-02-05 18:40:00 -03:00
Georg Wiese
f91823e0ef Arith machine, Equation 0 2024-01-31 19:05:53 +01:00
chriseth
a1ba5707a9 Fix poseidon parsing on goldilocks. 2024-01-25 16:12:55 +01:00
Georg Wiese
f26a161ae3 Add SplitBN254 machine, use queries in SplitGL machine 2023-10-31 18:27:19 +00:00
Georg Wiese
263c03d77d WrapGL machine: return both high and low values 2023-10-05 16:18:28 +00:00
Georg Wiese
5268e6fc24 Add more machines to Powdr STD 2023-10-03 18:02:01 +00:00