Previously, a GC call would scan the HC table, identifying dead nodes.
Now, whenever a Node is destroyed, if it's ready to be GC'd, it gets added to a list. That list is used at GC time instead of the scan.
This substantially decreases the cost of running GC many times (as the Z# FE does).
A basic implementation of committed witnesses & volatile RAM extraction in the Z# front-end.
The passes in question are still a bit brittle, so I left them behind a flag.
I split the hash-consing implementation into its own crate and re-implemented it.
In the new implementation, the table is thread-local. Terms are not Send, but linear terms are.
Decreased used of atomics gives a non-trivial speed-up. I tested on Z#'s sha2 round function, with an exit after IR optimization:
Benchmark #1: ./circ-old third_party/ZoKrates/zokrates_stdlib/stdlib/hashes/sha256/shaRound.zok r1cs
Time (mean ± σ): 236.2 ms ± 16.1 ms [User: 223.3 ms, System: 12.4 ms]
Range (min … max): 221.1 ms … 264.1 ms 11 runs
Benchmark #2: ./circ-new third_party/ZoKrates/zokrates_stdlib/stdlib/hashes/sha256/shaRound.zok r1cs
Time (mean ± σ): 141.8 ms ± 13.1 ms [User: 131.3 ms, System: 10.0 ms]
Range (min … max): 125.4 ms … 160.4 ms 18 runs
Summary
'./circ-new third_party/ZoKrates/zokrates_stdlib/stdlib/hashes/sha256/shaRound.zok r1cs' ran
1.67 ± 0.19 times faster than './circ-old third_party/ZoKrates/zokrates_stdlib/stdlib/hashes/sha256/shaRound.zok r1cs'