This adds a new operation interface `SDFGConvertibleOpInterface` that
allows an operation to specify how it is converted to an SDFG
process. The interface consists of a single method, `convert`, that
receives as arguments the DFG created using `SDFG.init`, a set of
SDFG input streams corresponding to the operands, and a set of output
streams for the results. The order of the input and output streams
corresponds to the order of the operands and result values,
respectively.
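As a rough illustration, a minimal sketch of how such a `convert` method
could be declared in C++ is shown below; the parameter types, the return
type, and the builder argument are assumptions for this sketch, not the
actual interface definition.

  #include "mlir/IR/Builders.h"
  #include "mlir/IR/Value.h"
  #include "mlir/IR/ValueRange.h"

  // Hypothetical sketch of the interface's single method: `dfg` is the data
  // flow graph value created via `SDFG.init`, `inputStreams[i]` feeds the
  // i-th operand and `outputStreams[i]` receives the i-th result.
  class SDFGConvertibleOpInterfaceSketch {
  public:
    virtual ~SDFGConvertibleOpInterfaceSketch() = default;
    virtual mlir::Value convert(mlir::OpBuilder &builder, mlir::Value dfg,
                                mlir::ValueRange inputStreams,
                                mlir::ValueRange outputStreams) = 0;
  };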
This adds a new dialect called "SDFG" for data flow graphs. An SDFG
data flow graph is composed of a set of processes connected through
data streams. Special streams allow data to be injected into and
retrieved from the data flow graph.
The dialect is intended to be lowered to API calls that allow
offloading the graph to hardware accelerators.
The C struct now contains an additional char* pointer, which is either
NULL if there is no error, or a buffer containing the error
message. It is the responsibility of the destructor function to free that
memory.
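As a minimal illustration of this pattern (the struct and function names
below are made up for the example, not the actual CAPI names):

  #include <stdlib.h>

  // Hypothetical result struct following the pattern described above:
  // `error` is NULL on success, otherwise it points to a heap-allocated
  // buffer holding the error message.
  typedef struct {
    void *value; // the actual result, only meaningful when error == NULL
    char *error; // NULL if no error, otherwise the error message
  } ResultWithError;

  // The destructor function is responsible for freeing the error buffer.
  void resultWithErrorDestroy(ResultWithError *res) {
    if (res->error != NULL) {
      free(res->error);
      res->error = NULL;
    }
  }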
The CAPI now covers a wider portion of the Support library API.
Better error handling. It could be improved further by returning an error
message from C back to Rust (left as a TODO).
The current CAPI of CompilerEngine isn't really a C API. It was initially
needed so that the Python bindings could access the CompilerEngine through
a convenient API. We now make a clear separation between the CAPI and the
Python wrappers: on one side, wrapper functions that can be implemented in
C/C++ and are exposed to Python via pybind11; on the other, a CAPI (which
still needs fixing, as it still contains C++ code) that can be used as-is
or to build bindings for other languages (such as Rust).
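A minimal sketch of the intended layering, with made-up function and module
names: the CAPI entry point can be consumed directly (or from other
language bindings), while the wrapper adapts it and is exposed to Python
via pybind11.

  #include <pybind11/pybind11.h>
  #include <stdexcept>
  #include <string>

  // CAPI side (hypothetical stub): usable as-is or from other bindings.
  extern "C" const char *compilerEngineGetVersion() {
    return "0.0.0-sketch"; // stands in for a real CAPI call
  }

  // Wrapper side: implemented in C/C++, adapts the CAPI for Python.
  std::string getVersionWrapper() {
    const char *v = compilerEngineGetVersion();
    if (v == nullptr)
      throw std::runtime_error("compilerEngineGetVersion failed");
    return std::string(v);
  }

  // Exposed to Python via pybind11.
  PYBIND11_MODULE(_compiler_sketch, m) {
    m.def("get_version", &getVersionWrapper,
          "Hypothetical wrapper around the CAPI");
  }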
The new option `--batch-concrete-ops` invokes the batching pass after
lowering to the Concrete dialect and after lowering linalg operations
containing Concrete dialect operations to loops.
The new action `dump-concrete-with-loops` dumps the IR right before
batching.
This adds a new pass that is able to hoist operations implementing the
`BatchableOpInterface` out of a loop nest that applies the operation
to the elements of a tensor indexed by the loop induction variables.
Example:
  scf.for %i = %c0 to %cN step %c1 {
    scf.for %j = %c0 to %cM step %c1 {
      scf.for %k = %c0 to %cK step %c1 {
        %s = tensor.extract %T[%i, %j, %k]
        %res = batchable_op %s
        ...
      }
    }
  }

is replaced with:

  %batchedSlice = tensor.extract_slice %T[%c0, %c0, %c0] [%cN, %cM, %cK] [%c1, %c1, %c1]
  %flatSlice = tensor.collapse_shape %batchedSlice
  %resTFlat = batchedOp %flatSlice
  %resT = tensor.expand_shape %resTFlat

  scf.for %i = %c0 to %cN step %c1 {
    scf.for %j = %c0 to %cM step %c1 {
      scf.for %k = %c0 to %cK step %c1 {
        %res = tensor.extract %resT[%i, %j, %k]
        ...
      }
    }
  }
Every index of the tensor holding the input values may be a quasi-affine
expression on a single loop induction variable, as long as the
difference between the results of the expression for any two
consecutive values of the referenced induction variable is
constant. For example, an index such as `2 * %i + 1` is accepted (the
difference between consecutive values of `%i` is always 2), while
`%i floordiv 2` is not (the difference alternates between 0 and 1).
This adds a new operation interface, `BatchableOpInterface`, that allows
an operation to specify that a batched version of the operation exists
which applies it to the elements of a flat tensor in parallel.
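A rough sketch of what the interface method could look like (the method
name, parameters, and return type are assumptions for illustration only):

  #include "mlir/IR/Builders.h"
  #include "mlir/IR/Location.h"
  #include "mlir/IR/Value.h"

  // Hypothetical sketch: given a flat 1-D tensor collecting all the scalar
  // operands from the loop nest, the op emits its batched counterpart and
  // returns the flat tensor of results.
  class BatchableOpInterfaceSketch {
  public:
    virtual ~BatchableOpInterfaceSketch() = default;
    virtual mlir::Value createBatchedOperation(mlir::OpBuilder &builder,
                                               mlir::Location loc,
                                               mlir::Value flatOperands) = 0;
  };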
This required having a CAPI that, when asked for types, returns a
structure that can report whether an error occurred during type creation.
This is necessary because a failure at that stage in the compiler would
otherwise lead to a segfault in the Python bindings, for example, and we
want to be able to handle this scenario gracefully.
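For instance (with hypothetical names, following the error-reporting
struct described earlier), a type-creation entry point could report
failures like this instead of crashing:

  #include <stdlib.h>
  #include <string.h>

  // Hypothetical result type: `error` is NULL on success, otherwise it
  // carries a heap-allocated message describing why type creation failed.
  typedef struct {
    void *type;  // opaque handle to the created type, valid if error == NULL
    char *error; // NULL on success, error message on failure
  } TypeOrError;

  // Hypothetical CAPI entry point: a failure is reported through the struct
  // so callers (e.g. the Python bindings) can handle it gracefully instead
  // of hitting a segfault.
  TypeOrError createIntegerTypeSketch(unsigned width) {
    TypeOrError res = {NULL, NULL};
    if (width == 0 || width > 64) {
      const char *msg = "invalid width for integer type";
      res.error = (char *)malloc(strlen(msg) + 1);
      if (res.error != NULL)
        strcpy(res.error, msg);
      return res;
    }
    res.type = malloc(1); // placeholder for the real type handle
    return res;
  }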
Converting the types of the original op seems to have an impact on other
operations using the result type, which would then need to check the
different cases (whether the type has been converted yet or not).
Creating a new op, however, does not have this issue.
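A sketch of the pattern that sidesteps the issue, assuming a standard MLIR
dialect-conversion setup (the ops involved are left as template parameters;
this is an illustration, not the compiler's actual code):

  #include "mlir/Transforms/DialectConversion.h"
  #include "llvm/ADT/SmallVector.h"

  // Hypothetical conversion pattern: instead of changing the result types
  // of the matched op in place, build a fresh op with the converted types
  // and let the rewriter replace the original, so that operations using the
  // old results are updated consistently by the conversion driver.
  template <typename SourceOp, typename TargetOp>
  struct RebuildWithConvertedTypes
      : public mlir::OpConversionPattern<SourceOp> {
    using mlir::OpConversionPattern<SourceOp>::OpConversionPattern;

    mlir::LogicalResult
    matchAndRewrite(SourceOp op, typename SourceOp::Adaptor adaptor,
                    mlir::ConversionPatternRewriter &rewriter) const override {
      llvm::SmallVector<mlir::Type> newResultTypes;
      if (mlir::failed(this->getTypeConverter()->convertTypes(
              op->getResultTypes(), newResultTypes)))
        return mlir::failure();

      // Create a new op with the already-converted operands and result
      // types; no in-place type mutation of the original op happens.
      rewriter.replaceOpWithNewOp<TargetOp>(op, newResultTypes,
                                            adaptor.getOperands());
      return mlir::success();
    }
  };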
- unify CPU and GPU bootstrapping operations
- remove operations to build GLWE from a table: this is now done in
wrapper functions
- remove GPU memory management operations: done in wrappers for now, but
we will have to think about how to deal with this in MLIR later