This adds a new option `--unroll-loops-with-sdfg-convertible-ops`,
which causes loops containing SDFG-convertible operations to be fully
unrolled upon the extraction of SDFG-operations using the
`--emit-sdfg-ops` switch. This avoids constant roundtrips between an
SDFG-capable accelerator and the host during execution of a loop.
The option is limited to `scf.for` loops with static bounds and a
static step size. Since full unrolling of loops with large bounds
results in a large number of operations, the option is disabled by
default.
This adds a new pass `ExtractSDGOps`, which scans a function for
operations that implement `SDFGConvertibleOpInterface`, replaces them
with SDFG processes and constructs an SDFG graph around the processes.
Initialization and teardown of the SDFG graph are embedded into the
function and take place at the beginning of the function and before
the function's terminator, respectively.
The pass can be invoked using concretecompiler by specifying the new
compilation option `--emit-sdfg-ops` or programmatically on a
`CompilerEngine` using the new compilation option `extractSDFGOps`.
This adds a new operation interface `SDFGConvertibleOpInterface` that
allows an operation to specify how it is converted to an SDFG
process. The interface consists of a single method `convert` that
receives as the arguments the DFG created using `SDFG.init`, a set of
SDFG input streams corresponding to the operands and a set of output
streams for results. The order of the input and output streams
corresponds to the order of the operands and output values,
respectively.
This adds a new dialect called "SDFG" for data flow graphs. An SDFG
data flow graph is composed of a set of processes, connected through
data streams. Special streams allow for data to be injected into and
to be retrieved from the data flow graph.
The dialect is intended to be lowered to API calls that allow for
offloading of the graph on hardware accelerators.
The new option `--batch-concrete-ops` invokes the batching pass after
lowering to the Concrete dialect and after lowering linalg operations
with operations from the Concrete dialect to loops.
The new action `dump-concrete-with-loops` dumps the IR right before
batching.
- unify CPU and GPU bootstrapping operations
- remove operations to build GLWE from table: this is now done in
wrapper functions
- remove GPU memory management operations: done in wrappers now, but we
will have to think about how to deal with it later in MLIR
This patch adds support for scalar results to the client/server
protocol and tests. In addition to `TensorData`, a new type
`ScalarData` is added. Previous representations of scalar values using
one-dimensional `TensorData` instances have been replaced with proper
instantiations of `ScalarData`.
The generic use of `TensorData` for scalar and tensor values has been
replaced with uses of a new variant `ScalarOrTensorData`, which can
either hold an instance of `TensorData` or `ScalarData`.
Returning tensors with elements whose width is not equal to 64 results
in garbled data. This commit extends the `TensorData` class used to
represent tensors in JIT compilation with support for signed /
unsigned elements of 8/16/32 and 64 bits, such that all clear text
tensors with up to 64 bits can be represented accurately.
Starting from Mac 11 (Big Sur), it appears we need to add -L
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -lSystem for the
sharedlib to link properly.i
For now what it works are only levelled ops with user parameters. (take a look to the tests)
Done:
- Add parameters to the fhe parameters to support CRT-based large integers
- Add command line options and tests options to allows the user to give those new parameters
- Update the dialects and pipeline to handle new fhe parameters for CRT-based large integers
- Update the client parameters and the client library to handle the CRT-based large integers
Todo:
- Plug the optimizer to compute the CRT-based large interger parameters
- Plug the pbs for the CRT-based large integer