Three changes:
* faster integer->ff conversion
* parallel construction of bellman LCs
* parallel R1CS checking (on the CirC side)
The first change is the most important, by far. Our previous integer->ff
conversion was very slow.
The R1CS optimizer tracks variable uses so that it can run faster.
Our tracker would sometimes think that a variable has cancelled from a
constraint when it had not.
Now, the tracker is conservative, but correct.
This allows folks to use the RAM machinery while sticking with (non-interactive) R1CS output.
We're going to need this anyway when we benchmark our new approach.
* Improve the oblivious RAM pass by killing the hack where we treat selects as arrays.
* Fix a bug where the volatile RAM pass would not place selects before stores against the same array.
* Improve that volatile RAM pass by placing selects against the same array literal in the same RAM. Before, they would each end up in different RAMs, which sucks. This is especially bad for ROMs.
Previously, a GC call would scan the HC table, identifying dead nodes.
Now, whenever a Node is destroyed, if it's ready to be GC'd, it gets added to a list. That list is used at GC time instead of the scan.
This substantially decreases the cost of running GC many times (as the Z# FE does).
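The change above can be sketched roughly as follows. This is a minimal illustration, not the actual hash-consing code: the `Table` type, its fields, and the `insert`/`acquire`/`release`/`collect` methods are all hypothetical names chosen for the example, and handles are modelled as a plain counter.

```rust
use std::collections::HashMap;

/// Hypothetical hash-consing table: maps a node id to its payload and a
/// count of live handles. None of these names come from the real crate.
struct Table {
    nodes: HashMap<u32, (String, usize)>,
    /// Ids whose handle count reached zero since the last collection.
    to_collect: Vec<u32>,
}

impl Table {
    fn new() -> Self {
        Table { nodes: HashMap::new(), to_collect: Vec::new() }
    }

    /// Register a node with one live handle.
    fn insert(&mut self, id: u32, payload: &str) {
        self.nodes.insert(id, (payload.to_string(), 1));
    }

    /// Take an additional handle to an existing node.
    fn acquire(&mut self, id: u32) {
        if let Some((_, count)) = self.nodes.get_mut(&id) {
            *count += 1;
        }
    }

    /// Called when a handle is destroyed. If it was the last one, record
    /// the id so that `collect` does not have to scan the whole table.
    fn release(&mut self, id: u32) {
        if let Some((_, count)) = self.nodes.get_mut(&id) {
            if *count > 0 {
                *count -= 1;
                if *count == 0 {
                    self.to_collect.push(id);
                }
            }
        }
    }

    /// Visit only the recorded candidates. The cost is proportional to the
    /// number of candidates, not to the size of the table.
    fn collect(&mut self) -> usize {
        let mut removed = 0;
        for id in std::mem::take(&mut self.to_collect) {
            // Re-check the count: the node may have been re-acquired
            // (or already removed) after it was recorded.
            if matches!(self.nodes.get(&id), Some((_, 0))) {
                self.nodes.remove(&id);
                removed += 1;
            }
        }
        removed
    }
}
```

With this shape, calling `collect` repeatedly when nothing has died costs almost nothing, which is the property that matters for a front-end that runs GC many times.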
* r1cs: fix boolean majority
for m = majority(a,b,c),
we were lowering
(-ab + bc + ac - 2abc)
instead of
(ab + bc + ac - 2abc)
Whoops! This is a functional correctness bug: we were lowering
soundly and completely, for the wrong specification.
Credit to Anna for raising the issue!
* add test_sha256 (commented out b/c it is slow)
* fmt + lint
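The sign error in the majority fix can be checked exhaustively. The sketch below evaluates both polynomials over the integers on 0/1 inputs and compares them with a reference majority function; the function names are made up for this example and are not taken from the R1CS lowering code.

```rust
/// Reference semantics: true when at least two of the three inputs are true.
fn majority(a: bool, b: bool, c: bool) -> bool {
    (a as u8 + b as u8 + c as u8) >= 2
}

/// The corrected lowering, ab + bc + ac - 2abc, on 0/1 inputs.
fn fixed_poly(a: i64, b: i64, c: i64) -> i64 {
    a * b + b * c + a * c - 2 * a * b * c
}

/// The earlier lowering, with the sign error on the `ab` term.
fn buggy_poly(a: i64, b: i64, c: i64) -> i64 {
    -(a * b) + b * c + a * c - 2 * a * b * c
}

/// All boolean inputs on which `poly` disagrees with `majority`.
fn mismatches(poly: fn(i64, i64, i64) -> i64) -> Vec<(bool, bool, bool)> {
    let mut bad = Vec::new();
    for a in [false, true] {
        for b in [false, true] {
            for c in [false, true] {
                let expected = majority(a, b, c) as i64;
                if poly(a as i64, b as i64, c as i64) != expected {
                    bad.push((a, b, c));
                }
            }
        }
    }
    bad
}
```

`mismatches(fixed_poly)` is empty, while `mismatches(buggy_poly)` contains exactly the two inputs with `a = b = 1`, where the buggy polynomial evaluates to -1 instead of 1.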
To reduce CI build time:
- Replaced ABY dependency with corresponding binary.
- Removed dependencies on KaHIP and KaHyPar for now because these dependencies aren't used upstream.
Minor updates:
- Updated ABY source to Public branch
Note:
- The aby_interpreter binary will only work on Linux. We can rebuild the binary from this repo.
A basic implementation of committed witnesses & volatile RAM extraction in the Z# front-end.
The passes in question are still a bit brittle, so I left them behind a flag.
- Upgraded CI pipeline to [v3](https://github.com/actions/cache/blob/main/README.md)
- Included installation and build scripts for KaHIP and KaHyPar in driver.py
- Used absolute paths for caching in CI pipeline (relative paths don't work).
Average CI time brought down from 15 minutes to 8 minutes!
Adds:
* an implementation of the Mirage proof system, generalized to multiple rounds of interaction
* a notion of rounds for variables
* a notion of randomness for variables
* to the R1CS layer:
  * committed witnesses
  * rounds
  * new witness computation machinery (to support multiple rounds)
Before, whenever we constructed a bit-wise encoding of a BV, we also
debitified it into a UINT encoding.
Now, we do the debitification on-demand. This saves ~10-15% of compilation
time.
While debitification is free from the perspective of R1CS, it is costly
for compilation time.
I measured compile time for the sha2 round function.
```terminal
Benchmark #1: ./circ-old third_party/ZoKrates/zokrates_stdlib/stdlib/hashes/sha256/shaRound.zok r1cs
Time (mean ± σ): 1.219 s ± 0.019 s [User: 1.145 s, System: 0.071 s]
Range (min … max): 1.180 s … 1.240 s 10 runs
Benchmark #2: ./circ-new third_party/ZoKrates/zokrates_stdlib/stdlib/hashes/sha256/shaRound.zok r1cs
Time (mean ± σ): 1.055 s ± 0.024 s [User: 987.1 ms, System: 65.9 ms]
Range (min … max): 1.026 s … 1.094 s 10 runs
Summary
'./circ-new third_party/ZoKrates/zokrates_stdlib/stdlib/hashes/sha256/shaRound.zok r1cs' ran
1.15 ± 0.03 times faster than './circ-old third_party/ZoKrates/zokrates_stdlib/stdlib/hashes/sha256/shaRound.zok r1cs'
```
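The on-demand debitification described above is essentially lazy evaluation with a cache. A minimal sketch, assuming a hypothetical `BvEncoding` type that always holds the bit-wise form and builds the integer form only on first request (the real encoding types are not shown in this log, and the `conversions` counter exists only to make the laziness observable):

```rust
/// Hypothetical bit-vector encoding: the bits are always present, the
/// integer ("UINT") form is computed lazily and cached.
struct BvEncoding {
    /// Little-endian bits: `bits[0]` is the least significant bit.
    bits: Vec<bool>,
    /// Cached integer form, filled in on first use.
    uint: Option<u64>,
    /// Number of times the bits were actually recombined.
    conversions: usize,
}

impl BvEncoding {
    fn from_bits(bits: Vec<bool>) -> Self {
        assert!(bits.len() <= 64, "this sketch only handles up to 64 bits");
        BvEncoding { bits, uint: None, conversions: 0 }
    }

    /// Return the integer form, recombining the bits only if no cached
    /// value exists yet.
    fn uint(&mut self) -> u64 {
        if let Some(u) = self.uint {
            return u;
        }
        self.conversions += 1;
        // Fold from the most significant bit down to the least.
        let u = self
            .bits
            .iter()
            .rev()
            .fold(0u64, |acc, &b| (acc << 1) | b as u64);
        self.uint = Some(u);
        u
    }
}
```

An encoding whose integer form is never requested pays nothing for it, which is where the compile-time saving would come from.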
I split the hash-consing implementation into its own crate and re-implemented it.
In the new implementation, the table is thread-local. Terms are not Send, but linear terms are.
Decreased use of atomics gives a non-trivial speed-up. I tested on Z#'s sha2 round function, with an exit after IR optimization:
```terminal
Benchmark #1: ./circ-old third_party/ZoKrates/zokrates_stdlib/stdlib/hashes/sha256/shaRound.zok r1cs
Time (mean ± σ): 236.2 ms ± 16.1 ms [User: 223.3 ms, System: 12.4 ms]
Range (min … max): 221.1 ms … 264.1 ms 11 runs
Benchmark #2: ./circ-new third_party/ZoKrates/zokrates_stdlib/stdlib/hashes/sha256/shaRound.zok r1cs
Time (mean ± σ): 141.8 ms ± 13.1 ms [User: 131.3 ms, System: 10.0 ms]
Range (min … max): 125.4 ms … 160.4 ms 18 runs
Summary
'./circ-new third_party/ZoKrates/zokrates_stdlib/stdlib/hashes/sha256/shaRound.zok r1cs' ran
1.67 ± 0.19 times faster than './circ-old third_party/ZoKrates/zokrates_stdlib/stdlib/hashes/sha256/shaRound.zok r1cs'
```
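To illustrate why a thread-local table avoids atomics, here is a small sketch of hash-consing strings with `thread_local!` and `Rc`. It is an assumption-laden toy, not the new crate's API: the `TABLE` static and the `intern` function are invented for the example. Because `Rc` is not `Send`, the interned values cannot leave their thread, which mirrors the note above that terms are not Send.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

thread_local! {
    /// One table per thread, so lookups need neither locks nor atomic
    /// reference counts.
    static TABLE: RefCell<HashMap<String, Rc<str>>> = RefCell::new(HashMap::new());
}

/// Return the canonical shared copy of `s` for the current thread.
fn intern(s: &str) -> Rc<str> {
    TABLE.with(|t| {
        let mut t = t.borrow_mut();
        if let Some(existing) = t.get(s) {
            // Already present: hand out another non-atomic reference.
            return Rc::clone(existing);
        }
        let fresh: Rc<str> = Rc::from(s);
        t.insert(s.to_string(), Rc::clone(&fresh));
        fresh
    })
}
```

Two calls to `intern` with equal strings on the same thread return pointers to the same allocation, and a freshly spawned thread starts with an empty table of its own.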
The first bug is a disagreement between our bellman circuit and our bellman
verify function. The circuit omitted unused variables (public or private). The
verify function includes unused public variables.
Now, the circuit includes unused public variables.
The second bug is related to large BV comparisons. The fix is to emit a
bitwise comparator. We could optimize further in the future.
closes #125
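For reference, a bitwise unsigned comparator has the following shape when written as ordinary software. This is only an analogue under stated assumptions: the log does not say how the emitted comparator is structured, and `bitwise_ult` and `to_bits` are names invented here.

```rust
/// Unsigned `x < y` over little-endian bit slices of equal length, using
/// only per-bit boolean operations.
fn bitwise_ult(x: &[bool], y: &[bool]) -> bool {
    assert_eq!(x.len(), y.len());
    let mut lt = false;
    // Scan from the least to the most significant bit. A differing bit
    // overrides whatever the less significant bits decided; an equal bit
    // keeps the previous verdict.
    for (&xi, &yi) in x.iter().zip(y) {
        lt = (!xi & yi) | (!(xi ^ yi) & lt);
    }
    lt
}

/// Little-endian bits of `v`, truncated to `width` bits (width <= 128).
fn to_bits(v: u128, width: usize) -> Vec<bool> {
    (0..width).map(|i| (v >> i) & 1 == 1).collect()
}
```

The loop never forms an integer wider than a single bit, so the same construction works for any width.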
* Configuration system. Kill DFL_T
  * add circ::cfg::CircCfg that holds cfg info
    * it's constructible from circ_opt::CircOpt
      * implements clap::Args, so you can set it from your compiler's
        CLI/envvars
      * defined in external crate to keep clap out of our main build
    * organized by circ module, but not feature gated
      * no point: the build wouldn't meaningfully change
    * includes a way to set the default field
  * added circ::cfg::set and circ::cfg::cfg
    * also circ::cfg::set_default and circ::cfg::set_cfg
    * access a sync::once_cell, static configuration
  * killed DFL_T
  * workflows
    * unit-tested components probably should not read circ::cfg::cfg.
    * compilers need to call circ::cfg::set or circ::cfg::set_default.
  * rm dead features
* WIP
* Alternate formatting machinery.
* fmt & lint
* rm Letified
* ilp
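The set-once static configuration described in the configuration-system notes above can be sketched with the standard library's `OnceLock`. This is a stand-in, not the real `circ::cfg` module: the `Cfg` struct, its single `field_modulus` field, and the placeholder default of 17 are assumptions made for the example, and the real code is described as using once_cell rather than `std::sync::OnceLock`.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for the configuration struct. The real struct's
/// fields are not listed in this log.
#[derive(Debug, Clone, PartialEq)]
struct Cfg {
    /// Placeholder for "the default field"; 17 is an arbitrary toy value.
    field_modulus: u64,
}

impl Default for Cfg {
    fn default() -> Self {
        Cfg { field_modulus: 17 }
    }
}

/// Process-wide configuration, written at most once.
static CFG: OnceLock<Cfg> = OnceLock::new();

/// Install a configuration. Fails, returning the rejected value, if one
/// has already been installed.
fn set_cfg(c: Cfg) -> Result<(), Cfg> {
    CFG.set(c)
}

/// Install the default configuration.
fn set_default() -> Result<(), Cfg> {
    set_cfg(Cfg::default())
}

/// Read the configuration. Panics if nothing was installed, which is why
/// a compiler has to call one of the setters first.
fn cfg() -> &'static Cfg {
    CFG.get().expect("configuration was never set")
}
```

Because the value is global and can only be written once, tests that read it become order-dependent on whichever test sets it first, which is one reason to keep unit-tested components from reading it directly.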
Fixes 4 bugs in R1CS lowering:
* division-by-zero in finite fields: previously, this always caused
  incompleteness. Now, the behavior depends on the CIRC_RELAXATION
  envvar:
  * CIRC_RELAXATION=incomplete: divide-by-0 causes incompleteness
  * CIRC_RELAXATION=nondet: divide-by-0 allows the output to take any value
  * CIRC_RELAXATION=det: divide-by-0 forces the output to 0
* bit-vector overshift: previously, this caused incompleteness. Now,
  we follow the semantics of SMT-LIB (the result saturates).
* bit-vector udiv-by-zero: previously, the output value was an
  unconstrained bit-vector. Now, it is MAX (following SMT-LIB).
* bit-vector comparisons: previously, the prover could lie, claiming
  that x < y when actually x >= y. All bit-vector comparisons are
  affected, including those in udiv and urem.
  * soundness bug!
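The three division-by-zero modes and the udiv-by-zero rule can be restated as a small executable model. This is a sketch of the semantics listed above, not of the lowering itself: the `Relaxation` enum, `parse_relaxation`, `accepts_div`, `bvudiv8`, and the toy prime `P = 7` are all assumptions introduced for illustration.

```rust
/// Toy prime modulus, used only to keep the example small.
const P: u64 = 7;

/// The three values of CIRC_RELAXATION named in the list above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Relaxation {
    Incomplete,
    Nondet,
    Det,
}

/// Parse the value of the envvar; unknown strings yield `None`.
fn parse_relaxation(s: &str) -> Option<Relaxation> {
    match s {
        "incomplete" => Some(Relaxation::Incomplete),
        "nondet" => Some(Relaxation::Nondet),
        "det" => Some(Relaxation::Det),
        _ => None,
    }
}

/// Would `out` be accepted as the result of `x / y` in the field mod P?
fn accepts_div(mode: Relaxation, x: u64, y: u64, out: u64) -> bool {
    if y % P != 0 {
        // Ordinary case: out is the quotient iff out * y == x (mod P).
        return (out % P) * (y % P) % P == x % P;
    }
    match mode {
        // No output is accepted: the constraints are unsatisfiable.
        Relaxation::Incomplete => false,
        // Any output is accepted.
        Relaxation::Nondet => true,
        // Only zero is accepted.
        Relaxation::Det => out % P == 0,
    }
}

/// 8-bit unsigned division with the SMT-LIB convention for a zero divisor.
fn bvudiv8(a: u8, b: u8) -> u8 {
    if b == 0 { u8::MAX } else { a / b }
}
```

A compiler would typically obtain the mode string from `std::env::var("CIRC_RELAXATION")`; the sketch takes it as an argument so that it stays deterministic.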