Previously, the traversal stack held (node, children queued) pairs.
When visiting a node whose children were not yet queued, we would queue
them all at once; all those queued children take a lot of memory.
Now, the stack holds children iterators instead.
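A minimal sketch of the iterator-stack scheme, with an illustrative `Node` type rather than CirC's actual term representation:

```rust
struct Node {
    children: Vec<Node>,
}

// Post-order traversal with a stack of (node, remaining-children iterator)
// frames: children are pulled one at a time instead of being queued up front.
fn post_order_visit(root: &Node, visit: &mut impl FnMut(&Node)) {
    let mut stack = vec![(root, root.children.iter())];
    while let Some((node, children)) = stack.last_mut() {
        match children.next() {
            // Descend into the next child; its own children are not queued yet.
            Some(child) => stack.push((child, child.children.iter())),
            // All children traversed: visit the node and pop its frame.
            None => {
                visit(node);
                stack.pop();
            }
        }
    }
}
```

Each frame holds a single iterator, so memory stays proportional to the traversal depth rather than to the number of queued children.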
Also: this patch fixes many bugs introduced by the prior one.
For memories with constant values whose sorts are linear groups, there
is a way to optimize linear-scan memory-checking.
This patch implements that optimization.
This started as an optimization patch, but my first optimization
revealed a bug. In chasing the bug, I found more optimizations.
Changes:
1. Eliminate tuples in preprocessing. (opt)
2. Handle CStore in tuple elimination pass. (bugfix)
3. Use tuples instead of arrays in a few more extension ops: (opt)
* GCD for vanishing polynomials and their derivatives
* sorting in transcript checking
4. A few logging revisions
Previously, we would eliminate only final-round witnesses. This is
obviously sound, but sub-optimal.
Anna has found some protocols (SHA with lookups) where the
sub-optimality matters, so I've made the optimizer a bit more
aggressive.
Three changes:
* faster integer->ff conversion
* parallel construction of bellman LCs
* parallel R1CS checking (on the CirC side)
The first change is the most important, by far. Our previous integer->ff
conversion was very slow.
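A typical way to make this kind of conversion fast is to fold over the integer's machine-word limbs rather than going through a string or per-bit representation. A minimal sketch, assuming the `ff` crate's `PrimeField` trait; the limb representation and function name are illustrative, not CirC's actual code:

```rust
use ff::PrimeField;

// Build a field element from 64-bit little-endian limbs with Horner's method,
// one field multiply-and-add per limb.
fn int_to_ff<F: PrimeField>(limbs_le: &[u64]) -> F {
    // 2^64 as a field element, computed once as (2^32)^2.
    let limb_base = F::from(1u64 << 32).square();
    limbs_le
        .iter()
        .rev()
        .fold(F::from(0u64), |acc, &limb| acc * limb_base + F::from(limb))
}
```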
The R1CS optimizer tracks variable uses so that it can run faster.
Our tracker would sometimes think that a variable had cancelled out of a
constraint when it had not.
Now, the tracker is conservative, but correct.
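In other words, the tracker now over-approximates uses and only drops an entry on events it is sure about (e.g., a constraint being removed outright), never because a coefficient might have cancelled. A minimal sketch of that shape, with illustrative types that are not CirC's actual tracker:

```rust
use std::collections::{HashMap, HashSet};

type Var = usize;
type ConstraintId = usize;

#[derive(Default)]
struct UseTracker {
    // For each variable, the constraints that (may) mention it.
    uses: HashMap<Var, HashSet<ConstraintId>>,
}

impl UseTracker {
    /// Record that constraint `c` mentions `v`. Over-approximating is safe.
    fn add_use(&mut self, v: Var, c: ConstraintId) {
        self.uses.entry(v).or_default().insert(c);
    }

    /// Drop uses only when the constraint is actually removed, never because
    /// a coefficient might have cancelled to zero during rewriting.
    fn remove_constraint(&mut self, c: ConstraintId, vars_in_c: &[Var]) {
        for v in vars_in_c {
            if let Some(cs) = self.uses.get_mut(v) {
                cs.remove(&c);
            }
        }
    }
}
```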
This allows folks to use the RAM machinery while sticking with (non-interactive) R1CS output.
We're going to need this anyway when we benchmark our new approach.
* Improve the oblivious RAM pass by killing the hack where we treat selects as arrays.
* Fix a bug where the volatile RAM pass would not place selects before stores against the same array.
* Improve that volatile RAM pass by placing selects against the same array literal in the same RAM. Before, each select would end up in its own RAM, which is wasteful and especially bad for ROMs (see the sketch below).
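A minimal sketch of the grouping idea behind that last change, with illustrative types rather than CirC's IR: key each select by the array term it reads, so all selects against the same array literal share one RAM.

```rust
use std::collections::HashMap;

// Stand-in for a hash-consed array term; equal literals compare equal.
#[derive(Clone, PartialEq, Eq, Hash)]
struct ArrayTerm(String);

struct Select {
    array: ArrayTerm,
    index: usize,
}

/// One RAM per distinct array term, holding every select that reads it.
fn group_selects_into_rams(selects: Vec<Select>) -> HashMap<ArrayTerm, Vec<Select>> {
    let mut rams: HashMap<ArrayTerm, Vec<Select>> = HashMap::new();
    for s in selects {
        rams.entry(s.array.clone()).or_default().push(s);
    }
    rams
}
```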
Previously, a GC call would scan the hash-cons (HC) table, identifying dead nodes.
Now, whenever a Node is destroyed, if it's ready to be GC'd, it gets added to a list. That list is used at GC time instead of the scan.
This substantially decreases the cost of running GC many times (as the Z# FE does).
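A minimal sketch of the idea, with illustrative types rather than CirC's actual hash-consing machinery: instead of scanning every table entry, a destroyed node records itself on a to-collect list that gc() drains.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

thread_local! {
    // Ids of nodes that became collectible since the last gc() call.
    static TO_COLLECT: RefCell<Vec<u64>> = RefCell::new(Vec::new());
}

struct Node {
    id: u64,
}

impl Drop for Node {
    fn drop(&mut self) {
        // The real check is "is this node now referenced only by the table?";
        // that is elided here. If so, remember it for the next gc() call.
        TO_COLLECT.with(|list| list.borrow_mut().push(self.id));
    }
}

/// Drain the to-collect list instead of scanning the whole table.
fn gc(table: &mut HashMap<u64, String /* cached node data */>) {
    TO_COLLECT.with(|list| {
        for id in list.borrow_mut().drain(..) {
            table.remove(&id);
        }
    });
}
```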
* r1cs: fix boolean majority
For m = majority(a, b, c), we were lowering
(-ab + bc + ac - 2abc)
instead of
(ab + bc + ac - 2abc).
Whoops! This is a functional correctness bug: we were lowering
soundly and completely, but for the wrong specification. (A quick
exhaustive check of the corrected polynomial appears after this list.)
Credit to Anna for raising the issue!
* add test_sha256 (commented out b/c it is slow)
* fmt + lint
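For completeness, a quick exhaustive check of the corrected polynomial over all eight boolean assignments (plain integer arithmetic stands in for the field):

```rust
fn main() {
    for a in 0i64..2 {
        for b in 0i64..2 {
            for c in 0i64..2 {
                // Specification: majority is 1 iff at least two inputs are 1.
                let maj = i64::from(a + b + c >= 2);
                // Corrected lowering: ab + bc + ac - 2abc.
                let lowered = a * b + b * c + a * c - 2 * a * b * c;
                assert_eq!(lowered, maj, "majority({a}, {b}, {c})");
            }
        }
    }
}
```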
To reduce CI build time:
- Replaced the ABY dependency with the corresponding binary.
- Removed the KaHIP and KaHyPar dependencies for now, since they aren't used upstream.
Minor updates:
- Updated ABY source to Public branch
Note:
- The aby_interpreter binary will only work on Linux. We can rebuild the binary from this repo.
A basic implementation of committed witnesses & volatile RAM extraction in the Z# front-end.
The passes in question are still a bit brittle, so I left them behind a flag.