The current CAPI of CompilerEngine isn't really a CAPI. Its initial
purpose was to give the Python bindings access to the CompilerEngine
through a convenient API. We now make a clear separation between the
CAPI and the Python wrappers: on one side, wrapper functions that can be
implemented in C/C++ and are exposed to Python via pybind11; on the
other, a CAPI (which still needs fixing, as it still contains C++ code)
that can be used as is, or to build bindings for other languages (such
as Rust).
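A rough sketch of the intended layering (all names here are
hypothetical, not the actual concrete-compiler API):

  #include <pybind11/pybind11.h>
  #include <string>

  // CAPI: a C-compatible entry point, usable as is or from other
  // language bindings (e.g., Rust). Name is invented for illustration.
  extern "C" const char *compilerEngineCompile(const char *source);

  // Python wrapper: implemented in C++, exposed via pybind11.
  std::string compileWrapper(const std::string &source) {
    return std::string(compilerEngineCompile(source.c_str()));
  }

  PYBIND11_MODULE(_compiler, m) {
    m.def("compile", &compileWrapper, "Compile an MLIR source string");
  }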
The batching pass passes operands to the batched operation as a flat,
one-dimensional vector produced through a `tensor.collapse_shape`
operation collapsing all dimensions of the original tensor of
operands. Similarly, the shape of the result vector of the batched
operation is expanded to the original shape afterwards using a
`tensor.expand_shape` operation.
The pass emits the `tensor.collapse_shape` and `tensor.expand_shape`
operations unconditionally, even for tensors that already have only
a single dimension. This causes the verifiers of these operations to
fail in some cases, aborting the entire compilation process.
This patch lets the batching pass emit `tensor.collapse_shape` and
`tensor.expand_shape` for batched operands and batched results only if
the rank of the corresponding tensors is greater than one.
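A minimal sketch of the guard using MLIR's C++ API (the helper name is
invented, and the actual pass code differs):

  #include "mlir/Dialect/Tensor/IR/Tensor.h"
  #include "mlir/Dialect/Utils/ReshapeOpsUtils.h"
  #include "mlir/IR/Builders.h"
  #include "llvm/ADT/SmallVector.h"

  // Flatten a tensor value to rank 1, but only if it is not flat
  // already; emitting tensor.collapse_shape on a rank-1 tensor trips
  // the op verifier and aborts compilation.
  static mlir::Value flattenOperand(mlir::OpBuilder &builder,
                                    mlir::Location loc,
                                    mlir::Value operand) {
    auto type = operand.getType().cast<mlir::RankedTensorType>();

    if (type.getRank() <= 1)
      return operand;

    // Collapse all dimensions into a single one; the symmetric guard
    // applies to the tensor.expand_shape on the batched result.
    mlir::ReassociationIndices allDims;
    for (int64_t i = 0; i < type.getRank(); ++i)
      allDims.push_back(i);
    return builder.create<mlir::tensor::CollapseShapeOp>(
        loc, operand,
        llvm::SmallVector<mlir::ReassociationIndices>{allDims});
  }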
The new option `--batch-concrete-ops` invokes the batching pass after
lowering to the Concrete dialect and after linalg operations containing
operations from the Concrete dialect have been lowered to loops.
The new action `dump-concrete-with-loops` dumps the IR right before
batching.
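For example (the driver name and exact invocation are assumptions; only
the option and action names come from this change):

  # Enable batching of Concrete operations:
  concretecompiler --batch-concrete-ops input.mlir
  # Inspect the IR right before batching:
  concretecompiler --action=dump-concrete-with-loops input.mlir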
This adds a new pass that is able to hoist operations implementing the
`BatchableOpInterface` out of a loop nest that applies the operation
to the elements of a tensor indexed by the loop induction variables.
Example:
  scf.for %i = %c0 to %cN step %c1 {
    scf.for %j = %c0 to %cM step %c1 {
      scf.for %k = %c0 to %cK step %c1 {
        %s = tensor.extract %T[%i, %j, %k]
        %res = batchable_op %s
        ...
      }
    }
  }

is replaced with:

  %batchedSlice = tensor.extract_slice
      %T[%c0, %c0, %c0] [%cN, %cM, %cK] [%c1, %c1, %c1]
  %flatSlice = tensor.collapse_shape %batchedSlice
  %resTFlat = batched_op %flatSlice
  %resT = tensor.expand_shape %resTFlat
  scf.for %i = %c0 to %cN step %c1 {
    scf.for %j = %c0 to %cM step %c1 {
      scf.for %k = %c0 to %cK step %c1 {
        %res = tensor.extract %resT[%i, %j, %k]
        ...
      }
    }
  }
Each index into the tensor of input values may be a quasi-affine
expression in a single loop induction variable, as long as the
difference between the results of the expression for any two
consecutive values of that induction variable is constant. For
example, an access `%T[2 * %i + 1]` qualifies: consecutive values of
`%i` change the index by the same constant amount.
This adds a new operation interface that allows an operation to
specify that a batched version of it exists, which applies the
operation to the elements of a flat tensor in parallel.
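As a rough C++ sketch of what an implementation of such an interface
might provide (the method names are invented for illustration; the real
interface is defined in ODS and may differ):

  #include "mlir/IR/Builders.h"
  #include "mlir/IR/Value.h"

  // Hypothetical shape of the interface; method names are invented.
  struct BatchableOpSketch {
    // The scalar operand that the hoisting pass may replace with a
    // flat tensor of operands.
    virtual mlir::OpOperand &getBatchableOperand() = 0;

    // Build the batched version of the operation, applying it in
    // parallel to the elements of a flat (rank-1) tensor of operands
    // and returning a flat tensor of results.
    virtual mlir::Value createBatchedOperation(mlir::OpBuilder &builder,
                                               mlir::Location loc,
                                               mlir::Value flatOperands) = 0;

    virtual ~BatchableOpSketch() = default;
  };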
Also updated the API to make it easier to use by:
- creating MLIR components from native Rust types instead of requiring
MLIR components in the API
- adding helpers around the creation of standard dialects

This required a CAPI that, when asked for types, returns a structure
that can report whether an error occurred during type creation. This is
necessary because a failure at that stage in the compiler would
otherwise lead to a segfault in the Python bindings, for example, and we
want to be able to handle this scenario gracefully.
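A minimal sketch of that pattern (struct and function names are
hypothetical), using handle types from the standard MLIR CAPI:

  #include "mlir-c/IR.h"

  extern "C" {

  // Hypothetical result type; names are invented for illustration.
  typedef struct {
    MlirType type;     // meaningful only when error is null
    const char *error; // null on success, error message on failure
  } TypeOrError;

  // Instead of returning a bare MlirType (and segfaulting downstream
  // bindings on failure), report the error to the caller.
  TypeOrError getEncryptedIntegerType(MlirContext ctx, unsigned width);

  } // extern "C"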
Instead of overriding the process stderr to get the string
representation from MLIR, we can directly capture it into a string
using MLIR's printOperation.

Another problem with overriding stderr is that each `#[test]` runs as a
different thread, meaning that as soon as we have 2+ tests, they could
panic due to conflicts/races between the different overrides.

This also moves the expected string directly into the test as a
literal.
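For illustration, the same capture expressed with MLIR's C++ API:

  #include "mlir/IR/Operation.h"
  #include "llvm/Support/raw_ostream.h"
  #include <string>

  // Print the operation directly into a string instead of redirecting
  // the process stderr around a dump() call.
  std::string operationToString(mlir::Operation *op) {
    std::string repr;
    llvm::raw_string_ostream stream(repr);
    op->print(stream);
    return stream.str();
  }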
The Rust bindings are intended to access both the LLVM/MLIR CAPI and
the concrete-compiler one. This initial commit provides the API for
LLVM/MLIR only. The tests should be used as an example of how to
generate a valid DAG of operations in MLIR.
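For a flavor of the C-level calls the bindings wrap, a minimal sketch
against the standard MLIR CAPI (setup and teardown only; operation
construction is elided):

  #include "mlir-c/IR.h"

  int main(void) {
    // Create a context and an empty module to hold operations.
    MlirContext ctx = mlirContextCreate();
    MlirLocation loc = mlirLocationUnknownGet(ctx);
    MlirModule module = mlirModuleCreateEmpty(loc);

    // Operations are then built with mlirOperationStateGet /
    // mlirOperationCreate and inserted into the module's body.

    mlirModuleDestroy(module);
    mlirContextDestroy(ctx);
    return 0;
  }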
- unify CPU and GPU bootstrapping operations
- remove operations to build GLWE from table: this is now done in
wrapper functions
- remove GPU memory management operations: done in wrappers now, but we
will have to think about how to deal with it later in MLIR
When building the ServerLib before any other concretelang target that
depends on `mlir-headers`, compilation fails due to missing include
files generated by tablegen, e.g., `llvm/IR/Attributes.inc`.
This adds a dependency from ServerLib to `mlir-headers`, which forces
the generation of the missing header files.
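The fix boils down to a one-line CMake dependency (sketched here; the
surrounding target definitions are omitted):

  # Ensure tablegen'd MLIR/LLVM headers exist before ServerLib compiles.
  add_dependencies(ServerLib mlir-headers)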