Change: The tuple elimination pass now represents its *tuple trees*
using nested persistent vectors of non-tuple terms, rather than standard
terms.
Rationale: When large oblivious arrays are transformed to tuples, the
resulting tuples are large, and updates to them use the tuple update
operator.
However, when we eliminate those tuples, we essentially replace the
update operator with an explicit tuple (tree). For a large number of
updates, this entails quadratic memory usage.
Of course, it doesn't *need* to: each update only modifies one spot in
the tuple.
Thus, in this patch, we change the representation of tuples in the tuple
tree: instead of a standard term, whose children are stored in a plain
vector, the children are now stored in a persistent vector from the `im`
crate. This allows an "update" operation on the vector that takes
logarithmic time and logarithmic new space.
Effect: This reduces the compiler's memory usage on
`examples/C/mpc/benchmarks/2pc_biomatch_40x.c` from ~28GB to ~1GB. It
also makes the compiler substantially faster on that benchmark.
In *non-recursive* type-checking, perhaps better called *type
computing*, we perform a minimal traversal to compute the type of a
term, without recursively type-checking it. Informally, we assume the
term is well typed and do the minimal amount of work needed to compute
its type.
Two improvements:
1. No longer implemented with recursion.
2. Caches all intermediate results.
Implementation:
* `check_dependencies(Term) -> Vec<Term>`: maps a term to the immediate
children whose types are needed to compute its type.
* `check_raw_step(Term, TypeTable) -> Sort`: assumes those children have
their types in the table, and computes this term's type.
* `check_raw`: glues the two above functions together into a suitable
traversal. Similar to `rec_check_raw`, but the traversal isn't total.
Significance:
Previously, we had a non-recursive implementation for array stores that
*didn't cache intermediate results*, which could cause quadratic
type-checking time (if type-checking was invoked repeatedly during a
top-down traversal). Edward sent me a benchmark that was experiencing
this,
resulting in 74.1% of total compilation time being spent type-checking.
Now it's down to 0.4% of total compilation time.
This PR does four things:
- Fixes some scoping issues in Z# having to do with the context in which function and struct definitions are interpreted.
- Improves speed and reduces `TermTable` memory consumption by 2x (by keying on `Arc<TermData>` rather than `TermData`)
- Improves speed for Z# array handling with a `BTreeMap` -> `HashMap` swap
- Adds intrinsic widening casts in Z#. These were previously handled by bit-splitting, which is wasteful.
On array-heavy Z# programs, this PR improves speed by about 5x and memory consumption by roughly the same.
---
* `zsharp::term::ir_array` should use `HashMap` (faster, less memory)
* perf improvements
  - constant folding: get rid of redundant clones
  - bitvector: impl arith for refs
  - `TermTable`: don't store redundant term as key
* EMBED widening casts
* generic inf: need to push fn defn context onto filestack
* canonicalize path in err msg
* need struct defn on filestack, too
* no need to warn about U32 indexing in zsharp
* fix warning