The C struct now contains an additional char* pointer, which is either
NULL when there is no error, or a buffer containing the error message.
It is the responsibility of the destructor function to free that
memory.
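A minimal sketch of that convention, with hypothetical names (this is
illustrative only, not the actual CAPI declarations):

// Illustrative only: struct name, fields and destructor are made up.
#include <stdlib.h>

typedef struct CompilationResult {
  void *impl;  /* opaque handle to the underlying C++ object */
  char *error; /* NULL on success, otherwise a heap-allocated message */
} CompilationResult;

/* The destructor is the single owner of the error buffer: callers may
   read it, but must never free it themselves. */
void compilationResultDestroy(CompilationResult *result) {
  if (result->error != NULL)
    free(result->error);
  /* ... also release the opaque handle here ... */
}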
CAPI covering a wider API of the Support library.
Better error handling. This could be further improved by returning the
error message back from C to Rust (left as a TODO).
- Missing offset in woppbs routine
- Better error message for the check of the tensor result in the
  end-to-end fixture
- Modify the fixture generator for testing purposes
* Update the Makefile to use appropriate arguments to find executables
on macOS systems.
* Add `set -e` to make sure that the tests fail if something goes wrong
in the build or test steps of the macOS job of the CI.
Closes #783
The current CAPI of CompilerEngine isn't really a C API. Its initial
purpose was to give the Python bindings access to the CompilerEngine
through a convenient API. We now make a clear separation between the
CAPI and the Python wrappers: on one side, wrapper functions that can
be implemented in C/C++ and are exposed to Python via pybind11; on the
other, a CAPI (which still needs fixing, as it still contains C++ code)
that can be used as is, or to build bindings for other languages (such
as Rust).
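As a rough sketch of the split (module and function names below are
hypothetical, not the actual bindings):

// Hypothetical illustration of the wrapper layer: it may use C++ types
// freely, because it is only consumed by the Python bindings.
#include <pybind11/pybind11.h>
#include <string>

std::string wrapperRoundTrip(const std::string &mlirSource) {
  // In the real wrapper this would call into CompilerEngine; stubbed here.
  return mlirSource;
}

// Exposed to Python via pybind11, separately from the C API that other
// language bindings (e.g. Rust) would build on.
PYBIND11_MODULE(_compiler, m) {
  m.def("round_trip", &wrapperRoundTrip,
        "Parse and re-print an MLIR module (illustrative stub)");
}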
The batching pass passes operands to the batched operation as a flat,
one-dimensional vector produced through a `tensor.collapse_shape`
operation collapsing all dimensions of the original tensor of
operands. Similarly, the shape of the result vector of the batched
operation is expanded to the original shape afterwards using a
`tensor.expand_shape` operation.
The pass emits the `tensor.collapse_shape` and `tensor.expand_shape`
operations unconditionally, even for tensors that already have only a
single dimension. This causes the verifiers of these operations to
fail in some cases, aborting the entire compilation process.
This patch lets the batching pass emit `tensor.collapse_shape` and
`tensor.expand_shape` for batched operands and batched results only if
the rank of the corresponding tensors is greater than one.
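Conceptually, the guard looks roughly like the following sketch
(illustrative code, not the actual pass; the helper name is made up):

// Sketch only: flatten an operand into a rank-1 tensor, but skip the
// tensor.collapse_shape when the tensor is already one-dimensional, so
// that the op verifier is not violated.
#include "mlir/Dialect/Tensor/IR/Tensor.h"
#include "mlir/Dialect/Utils/ReshapeOpsUtils.h"
#include "mlir/IR/Builders.h"
#include "mlir/IR/BuiltinTypes.h"

static mlir::Value flattenIfNeeded(mlir::OpBuilder &builder,
                                   mlir::Location loc, mlir::Value operand) {
  auto tensorType = operand.getType().cast<mlir::RankedTensorType>();

  // Rank-1 tensors are already flat; reuse them directly.
  if (tensorType.getRank() <= 1)
    return operand;

  // Group all dimensions into a single reassociation, collapsing the
  // tensor down to one dimension.
  mlir::ReassociationIndices allDims;
  for (int64_t i = 0; i < tensorType.getRank(); ++i)
    allDims.push_back(i);
  llvm::SmallVector<mlir::ReassociationIndices> reassociation{allDims};

  return builder.create<mlir::tensor::CollapseShapeOp>(loc, operand,
                                                       reassociation);
}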
This CI "feature" is meant to circumvent the 6 hours hard-limit
for a job in GitHub Action.
The benchmark is done using a matrix which is handled by Slab.
Here's the workflow:
1. ML benchmarks are started in a fire-and-forget fashion via
start_ml_benchmarks.yml
2. Slab will read ci/slab.toml to get the AWS EC2 configuration
and the matrix parameters
3. Slab will launch at most max_parallel_jobs EC2 instances in
parallel
4. Each job will trigger ml_benchmark_subset.yml, which will run
only one of the YAML files generated via `make generate-mlbench`,
based on the value of the matrix item it was given.
5. As soon as a job completes, the next one in the matrix starts.
This continues until all the matrix items are exhausted.
This adds a new end-to-end test `apply_lookup_table_batched`, which
forces batching of Concrete operations when invoking the compiler
engine, indirectly causing the `concrete.bootstrap_lwe` and
`concrete.keyswitch_lwe` operations generated from the
`FHELinalg.apply_lookup_table` operation of the test to be batched
into `concrete.batched_bootstrap_lwe` and
`concrete.batched_keyswitch_lwe` operations. The batched operations
trigger the generation of calls to batching wrapper functions further
down the pipeline, effectively testing both the lowering and the
implementation of batched operations.
The new option `--batch-concrete-ops` invokes the batching pass after
lowering to the Concrete dialect and after lowering linalg operations
with operations from the Concrete dialect to loops.
The new action `dump-concrete-with-loops` dumps the IR right before
batching.
This adds a new pass that is able to hoist operations implementing the
`BatchableOpInterface` out of a loop nest that applies the operation
to the elements of a tensor indexed by the loop induction variables.
Example:
scf.for %i = %c0 to %cN step %c1 {
  scf.for %j = %c0 to %cM step %c1 {
    scf.for %k = %c0 to %cK step %c1 {
      %s = tensor.extract %T[%i, %j, %k]
      %res = batchable_op %s
      ...
    }
  }
}
is replaced with:
%batchedSlice = tensor.extract_slice
    %T[%c0, %c0, %c0] [%cN, %cM, %cK] [%c1, %c1, %c1]
%flatSlice = tensor.collapse_shape %batchedSlice
%resTFlat = batchedOp %flatSlice
%resT = tensor.expand_shape %resTFlat
scf.for %i = %c0 to %cN step %c1 {
  scf.for %j = %c0 to %cM step %c1 {
    scf.for %k = %c0 to %cK step %c1 {
      %res = tensor.extract %resT[%i, %j, %k]
      ...
    }
  }
}
Every index of the tensor with the input values may be a quasi-affine
expression in a single loop induction variable, as long as the
difference between the results of the expression for any two
consecutive values of the referenced induction variable is constant.
For example, an index such as 2 * %i + 1 qualifies, since incrementing
%i always changes the index by the same amount.
This adds a new operation interface that allows an operation to
specify that a batched version of it exists, which applies the
operation to the elements of a flat tensor in parallel.
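A rough sketch of how a transformation might consume such an interface
(the interface name comes from this change, but the way it is queried
and its contents are assumptions, not the actual definition):

// Hypothetical usage sketch only: requires the project-specific header
// declaring BatchableOpInterface, which is not shown here.
#include "mlir/IR/Operation.h"

void processCandidate(mlir::Operation *op) {
  // Only operations that declare a batched counterpart via the
  // interface are considered for hoisting/batching.
  if (auto batchable = llvm::dyn_cast<BatchableOpInterface>(op)) {
    // ... ask the interface for the batched version operating on a
    //     flat tensor of operands, and replace the scalar op inside
    //     the loop nest ...
  }
}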
Also updated the API to make it easier to use by:
- creating MLIR components from native Rust types instead of requiring
MLIR components in the API
- adding helpers around the creation of standard dialects