Until now, the pass hoisting `RT.await_future` operations only
supports `tensor.parallel_insert_slice` operations that use loop
induction variables directly as indexes. Any more complex indexing
expressions produce a domination error, since a
`tensor.parallel_insert_slice` cloned by the pass into an additional
parallel for loop is left with references to values from the original
loop.
This change properly clones operations producing intermediate values
within the original parallel for loop and thus adds support for
indexing expressions that reference loops IVs only indirectly.
The type parser of the RT dialect unconditionally attempts to parse
the string "future". This breaks the parsing of any RT type. Remove
unconditional parsing of this prefix.
The tiling infrastructure preserves attributes of tiled
`linalg.generic` operations, such that the attribute for the tile
sizes specified for the `linalg.generic` operation before tiling is
copied to the `linalg.generic` operation that is part of the generated
IR for a single tile.
This change causes the attribute to be removed after tiling, since it
does not make sense to preserve the attribute for per-tile operations.
For now, the attribute "tile-sizes" is copied from the FHELinalg
operation to the corresponding `linalg.generic` operation only for
multiplications of encrypted matrices with plaintext matrices.
The changes of this commit also cause the attribute to be copied for
multiplications between plaintext ad ciphertext matrices, as well as
for multiplications between two ciphertext matrices.
This adds support for the tiling of `linalg.generic` operations that
have only parallel iterators or only parallel iterators and a single
reduction dimension via the linalg tiling infrastructure (i.e.,
`mlir::linalg::tileToForallOpUsingTileSizes()` and
`mlir::linalg::tileReductionUsingForall()`).
This allows for the tiling of FHELinalg operations by first replacing
them with appropriate `linalg.generic` oeprations and then invoking
the tiling pass in the pipeline. In order for the tiling to take
place, tile sizes must be specified using the `tile-sizes` operation
attribute, either directly for `linalg.generic` operations or
indirectly for the FHELinalg operation, e.g.,
"FHELinalg.matmul_eint_int"(%a, %b) { "tile-sizes" = [0, 0, 7] } : ...
Tiling of operations with a reduction dimension is currently limited
to tiling of the reduction dimension, i.e., the tile sizes for the
parallel dimensions must be zero.
This makes the trip counts of operations in the TFHE statistics pass
as well as the per-location memory usage statistics in the memory
usage statistics pass optional. These values are unset if the trip
count could not be determined statically.
The Concrete Optimizer is invoked on a representation of the program
in the high-level FHELinalg / FHE Dialects and yields a solution with
a one-to-one mapping of operations to keys. However, the abstractions
used by these dialects do not allow for references to keys and the
application of the solution is delayed until the pipeline reaches a
representation of the program in the lower-level TFHE dialect. Various
transformations applied by the pipeline along the way may break the
one-to-one mapping and add indirections into producer-consumer
relationships, resulting in ambiguous or partial mappings of TFHE
operations to the keys. In particular, explicit frontiers between
optimizer partitions may not be recovered.
This commit preserves explicit frontiers between optimizer partitions
as `optimizer.partition_frontier` operations and lowers these to
keyswitch operations before parametrization of TFHE operations.
This adds a new dialect called `Optimizer` with operations related to
the Concrete Optimizer. Currently, there is only one operation
`optimizer.partition_frontier` that can be inserted between a producer
and a consumer which belong to different partitions computed by the
optimizer. The purpose of this operation is to preserve explicit key
changes from the invocation of the optimizer on high-level dialects
(i.e., FHELinalg / FHE) until the IR is provided with actual
references to keys in low-level dialects (i.e., TFHE).
The DAG pass establishing a mapping between operations in the IR and
the optimizer DAG currently omits assignment of the optimizer ID to
`FHE.reinterpret_precision` operations via the `TFHE.OId`
attribute. This prevents subsequent passes from determining to which
optimizer partition a `FHE.reinterpret_precision` operation belongs.
This commit removes the early exit in `FunctionToDag::addOperation`
for the handling of `FHE.reinterpret_precision` that prevented the
code assigning the optimizer ID from being executed.
The current pass applying the parameters determined by the optimizer
to the IR propagates the parametrized TFHE types to operations not
directly tagged with an optimizer ID only under certain conditions. In
particular, it does not always properly propagate types into nested
regions (e.g., of `scf.for` loops).
This burdens preceding transformations that are applied in between the
invocation of the optimizer and the parametrization pass with
data-flow analysis and book-keeping in order to tag newly inserted
operations with the right optimizer IDs that ensure proper
parametrization.
This commit replaces the current parametrization pass with a new pass
that propagates parametrized TFHE types up and down def-use chains
using type inference and a proper rewriter. The pass is limited to the
operations supported by `TFHEParametrizationTypeResolver::resolve`.
The `TypeInference` dialect provides three operations.
The operation `TypeInference.propagate_downwards` respresents a type
barrier, which is supposed to forward the type of its operand as its
result type during type inference.
The operation `TypeInference.propagate_upwards` also respresents a
type barrier, but is supposed to forward the type of its result as the
type for its operand during type inference.
The operation `TypeInference.unresolved_conflict` can be used as a
marker when two different types have beed inferred for a value (e.g.,
one type during forward dataflow analysis and the other during
backward dataflow analysis)
In the optimizer, nodes without consumers are identified as outputs.
Since we can now return multiple values, this is inherently buggy,
since a value can then be both returned, and consumed to create another
input.
This commit fixes this by allowing the compiler to tag nodes as being
outputs.
This commit:
+ Adds support for a protocol which enables inter-op between concrete,
tfhe-rs and potentially other contributors to the fhe ecosystem.
+ Gets rid of hand-made serialization in the compiler, and
client/server libs.
+ Refactors client/server libs to allow more pre/post processing of
circuit inputs/outputs.
The protocol is supported by a definition in the shape of a capnp file,
which defines different types of objects among which:
+ ProgramInfo object, which is a precise description of a set of fhe
circuit coming from the same compilation (understand function type
information), and the associated key set.
+ *Key objects, which represent secret/public keys used to
encrypt/execute fhe circuits.
+ Value object, which represent values that can be transferred between
client and server to support calls to fhe circuits.
The hand-rolled serialization that was previously used is completely
dropped in favor of capnp in the whole codebase.
The client/server libs, are refactored to introduce a modular design for
pre-post processing. Reading the ProgramInfo file associated with a
compilation, the client and server libs assemble a pipeline of
transformers (functions) for pre and post processing of values coming in
and out of a circuit. This design properly decouples various aspects of
the processing, and allows these capabilities to be safely extended.
In practice this commit includes the following:
+ Defines the specification in a concreteprotocol package
+ Integrate the compilation of this package as a compiler dependency
via cmake
+ Modify the compiler to use the Encodings objects defined in the
protocol
+ Modify the compiler to emit ProgramInfo files as compilation
artifact, and gets rid of the bloated ClientParameters.
+ Introduces a new Common library containing the functionalities shared
between the compiler and the client/server libs.
+ Introduces a functional pre-post processing pipeline to this common
library
+ Modify the client/server libs to support loading ProgramInfo objects,
and calling circuits using Value messages.
+ Drops support of JIT.
+ Drops support of C-api.
+ Drops support of Rust bindings.
Co-authored-by: Nikita Frolov <nf@mkmks.org>
Instead of having one `getSQManp` implementation per op with a lot of repetition, the noise
calculation is now modular.
- Ops that implements`UnaryEint`/`BinaryInt`/`BinaryEint` interfaces share the operand noise
presence check.
- For many scalar ops no further calculation is needed. If it's not the case, an op can override
`sqMANP`.
- Integer operand types lookups are abstracted into `BinaryInt::operandIntType()`
- Finding largest operand value for a type is abstracted into `BinaryInt::operandMaxConstant`
- Noise calculation for matmul ops is simplified and it's now general enough to work for
`matmul_eint_int`, `matmul_int_eint` and `dot_eint_int` at once.