Commit Graph

16 Commits

Author SHA1 Message Date
Miles
5d68424531 fix typos 2024-04-11 14:50:16 +02:00
Andi Drebes
6c1cef1fd3 refactor(compiler): Factor normalization of loop IVs from Batching-specific code
This introduces a new function `normalizeInductionVar()` to the static
loop utility code in `concretelang/Analysis/StaticLoops.h` with code
extracted for IV normalization from the batching code and changes the
batching code to make use of the factored function.
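
As an illustration, a minimal sketch of what IV normalization amounts to, assuming the factored function receives the IV together with the loop's lower bound and step (the actual signature of `normalizeInductionVar()` may differ): for a loop `for %i = lb to ub step s`, the normalized value is `(%i - lb) / s`.

```
// Hypothetical sketch only: maps the IV of `scf.for %i = %lb to %ub step %s`
// to its zero-based, unit-step equivalent (%i - %lb) / %s. The real
// normalizeInductionVar() in concretelang/Analysis/StaticLoops.h may have a
// different signature and avoid emitting redundant arithmetic.
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/IR/Builders.h"

static mlir::Value normalizeIVSketch(mlir::OpBuilder &builder,
                                     mlir::Location loc, mlir::Value iv,
                                     mlir::Value lowerBound, mlir::Value step) {
  // (%i - %lb)
  mlir::Value shifted =
      builder.create<mlir::arith::SubIOp>(loc, iv, lowerBound);
  // (%i - %lb) / %s
  return builder.create<mlir::arith::DivSIOp>(loc, shifted, step);
}
```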
2024-04-08 12:02:49 +02:00
Andi Drebes
14f39fefe7 refactor(compiler): Separate code for static loop analysis from batching 2024-03-14 06:32:19 +01:00
Benoit Chevallier-Mames
e37a840e7e docs(compiler): adding API doc 2024-03-13 17:10:00 +01:00
Andi Drebes
6b5baca515 fix(compiler): Batching: Fix hoisting of indexed ops with mixed offsets
For `ExtractSliceOp` and `InsertSliceOp`, the code performing the
hoisting of indexed operations in the batching pass derives the
indexes of the hoisted operation from the indexes provided by
`ExtractSliceOp::getOffsets()` and
`InsertSliceOp::getOffsets()`. However, these methods only return
dynamic indexes, such that operations with mixed static and dynamic
offsets are hoisted incorrectly.

This patch replaces the invocations of `ExtractSliceOp::getOffsets()`
and `InsertSliceOp::getOffsets()` with invocations of
`ExtractSliceOp::getMixedOffsets()` and
`InsertSliceOp::getMixedOffsets()`, respectively, in order to take into
account both static and dynamic indexes.
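
For illustration, a hedged sketch of the difference between the two accessors; the inspected values in the comment are examples, not code from the patch:

```
// For an op like tensor.extract_slice %t[0, %i] [1, 1] [1, 1], getOffsets()
// only returns the dynamic offset operands ({%i}), whereas getMixedOffsets()
// returns one OpFoldResult per dimension ({0 : index, %i}), covering static
// and dynamic offsets alike.
#include "mlir/Dialect/Tensor/IR/Tensor.h"

static void inspectOffsets(mlir::tensor::ExtractSliceOp extractOp) {
  auto dynamicOnly = extractOp.getOffsets();   // dynamic offsets only
  auto mixed = extractOp.getMixedOffsets();    // static + dynamic offsets
  (void)dynamicOnly;
  (void)mixed;
}
```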
2024-01-05 13:15:12 +00:00
Andi Drebes
c24d9c9553 fix(compiler): Batching: Do not generate tensor.collapse_shape operations reproducing the same shape
The IR generated by the batching pattern introduces intermediate
tensor values omitting the batched dimensions for batched
operands. This happens unconditionally, leading to the generation of
`tensor.collapse_shape` operations with identical input and output
shapes. However, verification of such operations fails, since the
verifier assumes that the rank of the resulting tensor is reduced by at
least one.

This commit modifies the check in `flattenTensor`, such that no
flattening operation is generated if the input and output shapes would
be identical.
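
A minimal sketch of such a guard, assuming a helper that receives the already-computed flattened type and reassociation (names and signature are illustrative, not the actual `flattenTensor` code):

```
// Only emit tensor.collapse_shape when the flattened type actually differs
// from the input type; a collapse that does not reduce the rank is rejected
// by the verifier.
#include "mlir/Dialect/Tensor/IR/Tensor.h"
#include "mlir/Dialect/Utils/ReshapeOpsUtils.h"
#include "mlir/IR/Builders.h"

static mlir::Value
flattenIfNeeded(mlir::OpBuilder &builder, mlir::Location loc,
                mlir::Value tensor, mlir::RankedTensorType flatType,
                llvm::ArrayRef<mlir::ReassociationIndices> reassociation) {
  if (tensor.getType() == flatType)
    return tensor; // Shapes are identical; reuse the value unchanged.
  return builder.create<mlir::tensor::CollapseShapeOp>(loc, flatType, tensor,
                                                       reassociation);
}
```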
2024-01-05 13:15:12 +00:00
Andi Drebes
51f10e38b3 fix(compiler): Batching: Ensure identical order of indexing IVs in cleanup pattern
Currently, the cleanup pattern matches sequences of `tensor.extract` /
`tensor.extract_slice`, `tensor.insert` / `tensor.insert_slice` and
`scf.yield` operations that are embedded into perfect loop nests whose
IVs are referenced in quasi-affine expressions with constant
steps. The pattern ensures that the extraction and insertion operations
use the same IVs for indexing, but it does not check whether they appear
in the same order.

However, the order in which IVs are used for indexing is crucial when
replacing the operations with `tensor.extract_slice` and
`tensor.insert_slice` operations in order to preserve the shape of the
slices and the order of elements.

For example, when the cleanup pattern is applied to the following IR:

```
 %c0 = arith.constant 0 : index
 %c1 = arith.constant 1 : index
 %c2 = arith.constant 2 : index
 %c3 = arith.constant 3 : index

 scf.for %i = %c0 to %c3 step %c1 ... {
   scf.for %j = %c0 to %c2 step %c1 ... {
      %v = tensor.extract ... [%i, %j]
      %t = tensor.insert ... [%j, %i]
      scf.yield %t
   }
   ...
 }
```

The extracted slice has a shape of 3x2, while the insertion must be
performed with a slice of shape 2x3.

This commit adds an additional check to the cleanup pattern that
ensures that loop IVs are used for indexing in the same order and
appear the same number of times.
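
A minimal sketch of the added check, assuming the pattern has already resolved each index of the extract and insert operations to the loop IV it refers to (helper name and representation are illustrative):

```
// Reject the pattern if the IVs indexing the extraction and the insertion do
// not appear in the same order; for the example above, {%i, %j} vs. {%j, %i}
// fails, while {%i, %j} vs. {%i, %j} passes.
#include <algorithm>
#include "mlir/IR/Value.h"
#include "llvm/ADT/ArrayRef.h"

static bool sameIVsInSameOrder(llvm::ArrayRef<mlir::Value> extractIVs,
                               llvm::ArrayRef<mlir::Value> insertIVs) {
  return extractIVs.size() == insertIVs.size() &&
         std::equal(extractIVs.begin(), extractIVs.end(), insertIVs.begin());
}
```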
2024-01-05 13:15:12 +00:00
Andi Drebes
dbcfa92bd3 refactor(compiler): Batching: Do not shift / scale already normalized indexes
Normalization of indexes in the batching pass currently
unconditionally emits arithmetic operations to shift and scale indexes
regardless of whether these indexes are already normalized. This leads
to unnecessary subtractions with 0 and divisions by 1.

This commit introduces two additional checks to the code normalizing
indexes that prevent arithmetic operations to be emitted for indexes
that do not need to be shifted or scaled.
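
A hedged sketch of the two guards, assuming the lower bound and step are already known as static values at this point (names are illustrative):

```
// Emit the shift only for a non-zero lower bound and the scaling only for a
// step different from one, so already-normalized indexes pass through
// without extra arith operations.
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/IR/Builders.h"

static mlir::Value normalizeIndexSketch(mlir::OpBuilder &builder,
                                        mlir::Location loc, mlir::Value idx,
                                        int64_t lowerBound, int64_t step) {
  mlir::Value result = idx;
  if (lowerBound != 0) { // Otherwise a subtraction with 0 would be emitted.
    mlir::Value lb =
        builder.create<mlir::arith::ConstantIndexOp>(loc, lowerBound);
    result = builder.create<mlir::arith::SubIOp>(loc, result, lb);
  }
  if (step != 1) { // Otherwise a division by 1 would be emitted.
    mlir::Value s = builder.create<mlir::arith::ConstantIndexOp>(loc, step);
    result = builder.create<mlir::arith::DivSIOp>(loc, result, s);
  }
  return result;
}
```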
2024-01-05 13:15:12 +00:00
Andi Drebes
38e14446d6 feat(compiler): Batching: Add pattern folding operations on tensors of constants
This adds a new pattern to the batching pass that folds operations on
tensors of constants into new tensors of constants. E.g.,

  %cst = arith.constant dense<...> : tensor<Nxi9>
  %res = scf.for %i = %c0 to %cN {
    %cst_i9 = tensor.extract %cst[%i]
    %cst_i64 = arith.extui %cst_i9 : i9 to i64
    ...
  }

becomes:

  %cst = arith.constant dense<...> : tensor<Nxi64>
  %res = scf.for %i = %c0 to %cN {
    %cst_i64 = tensor.extract %cst[%i]
    ...
  }

The pattern only applies to static loops, to indexes that are
quasi-affine expressions of a single loop induction variable with a
constant step size across iterations, and to foldable operations that
have a single result.
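
As an illustration of the idea for the example above, a hedged sketch that rebuilds the dense constant with the extension already applied (the actual pattern is generic over foldable single-result operations; names are illustrative):

```
// Fold arith.extui over a tensor of i9 constants by materializing a new
// dense constant with i64 elements, so the loop body can extract the final
// values directly.
#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/BuiltinTypes.h"
#include "llvm/ADT/APInt.h"

static mlir::DenseElementsAttr
zeroExtendConstant(mlir::DenseElementsAttr cst, mlir::IntegerType newElemTy) {
  return cst.mapValues(newElemTy, [&](const llvm::APInt &v) {
    return v.zext(newElemTy.getWidth()); // arith.extui applied at compile time
  });
}
```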
2023-06-12 22:51:30 +01:00
Andi Drebes
3516ae7682 feat(compiler): Add option for maximum batch size to batching pass
This adds a new compilation option `maxBatchSize` and a command line
option `--max-batch-size` to `concretecompiler`.
2023-06-12 22:51:30 +01:00
Andi Drebes
38a5b5e928 feat(compiler): Add support for batching with multiple batchable operands
The current batching pass only supports batching of operations that
have a single batchable operand, that can be batched in only one way,
and that operate on scalar values. However, this does not allow for
efficient batching of all arithmetic operations in TFHE, since these
are often applied to pairs of scalar values from tensors, to tensors
and scalars, or to tensors that can be grouped into higher-order
tensors.

This commit introduces three new features for batching:

  1. Support of multiple batchable operands

     The operation interface for batching now allows for the
     specification of multiple batchable operands. This set can be
     composed of any subset of an operation's operands, i.e., it is
     not limited to sets of operands with contiguous operand indexes.

  2. Support for multiple batching variants

     To account for multiple kinds of batching, the batching operation
     interface `BatchableOpInterface` now supports variants. The
     batching pass attempts to batch an operation by trying the
     batching variants expressed via the interface in order until it
     succeeds (see the sketch after this list).

  3. Support for batching of tensor values

     Some operations that could be batched already operate on tensor
     values. The new batching pass detects those patterns and groups
     the batchable tensors' values into higher-dimensional tensors.
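
Purely as an illustration of the variant-driven strategy from point 2, a sketch of the driver loop; `getNumVariantsHypothetical` and `tryBatchVariantHypothetical` are made-up placeholders, not the actual `BatchableOpInterface` methods:

```
#include "mlir/IR/Operation.h"
#include "mlir/Support/LogicalResult.h"

// Hypothetical placeholders standing in for the real interface methods and
// rewriting logic of the batching pass.
unsigned getNumVariantsHypothetical(mlir::Operation *op);
mlir::LogicalResult tryBatchVariantHypothetical(mlir::Operation *op,
                                                unsigned variant);

// Try the batching variants an operation advertises, in order, and stop at
// the first one that succeeds.
static mlir::LogicalResult batchWithVariants(mlir::Operation *op) {
  for (unsigned v = 0, n = getNumVariantsHypothetical(op); v < n; ++v)
    if (mlir::succeeded(tryBatchVariantHypothetical(op, v)))
      return mlir::success();
  return mlir::failure();
}
```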
2023-06-12 22:51:30 +01:00
Andi Drebes
5a6ed84076 fix(compiler): Batching: Do not attempt to extract static bounds for dynamic loops
The batching pass erroneously assumes that any expression solely
composed of an induction variable has static bounds. This commit adds
a check verifying that the lower bound, upper bound and step are indeed
static before attempting to determine their static values.
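
A minimal sketch of such a check for an `scf.for` loop, assuming bounds are considered static when they are defined by index constants (helper name is illustrative):

```
// Verify that lower bound, upper bound and step are all produced by
// arith.constant index operations before reading their static values.
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/Dialect/SCF/IR/SCF.h"

static bool hasStaticBounds(mlir::scf::ForOp forOp) {
  auto isConstIndex = [](mlir::Value v) {
    return static_cast<bool>(v.getDefiningOp<mlir::arith::ConstantIndexOp>());
  };
  return isConstIndex(forOp.getLowerBound()) &&
         isConstIndex(forOp.getUpperBound()) && isConstIndex(forOp.getStep());
}
```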
2023-06-12 22:51:30 +01:00
Andi Drebes
b24709a1ec feat(compiler): Batching: Hoist non-batchable operands produced by pure ops
The batching pass only creates a batched version of a batchable
operation if all of its non-batchable operands are defined outside of
the outermost loop iterating over the values of the batchable operand.

This change also allows operations to be batched if their
non-batchable operands are generated by operations that are pure and
thus hoistable out of the outermost loop.
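
A minimal sketch of the relaxed condition, assuming the decision is made per non-batchable operand against the outermost loop (helper name is illustrative):

```
// An operand defined inside the outermost loop is still acceptable if its
// defining operation is pure and can therefore be hoisted out of the loop
// before batching; block arguments and impure producers are rejected.
#include "mlir/Dialect/SCF/IR/SCF.h"
#include "mlir/Interfaces/SideEffectInterfaces.h"

static bool nonBatchableOperandIsUsable(mlir::Value operand,
                                        mlir::scf::ForOp outermostLoop) {
  mlir::Operation *def = operand.getDefiningOp();
  if (!def)
    return false; // Block arguments (e.g., other IVs) cannot be hoisted.
  if (!outermostLoop->isProperAncestor(def))
    return true; // Already defined outside of the outermost loop.
  return mlir::isPure(def); // Pure producers can be hoisted out first.
}
```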
2023-03-24 11:06:51 +01:00
Andi Drebes
3309615d7b fix(compiler): Batching: Bail out if batchable operand is not produced by an op
An early test for a batchable operation checks whether the batchable
operand is produced by a `tensor.extract` operation and bails out if
this is not the case. However, the use of `llvm::dyn_cast<T>()` directly
on the defining operation of the batchable operand causes an attempt
to cast a null value for operands that are not produced by an
operation (e.g., block arguments).

Using `llvm::dyn_cast_or_null<T>()` fixes this issue.
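
For illustration, a hedged sketch of the fixed check (function name is illustrative); `Value::getDefiningOp<OpTy>()` would be an equivalent null-tolerant alternative:

```
// getDefiningOp() returns null for values that are not produced by an
// operation, such as block arguments; dyn_cast_or_null tolerates that,
// whereas a plain dyn_cast would dereference a null pointer.
#include "mlir/Dialect/Tensor/IR/Tensor.h"
#include "llvm/Support/Casting.h"

static mlir::tensor::ExtractOp
getProducingExtract(mlir::Value batchableOperand) {
  return llvm::dyn_cast_or_null<mlir::tensor::ExtractOp>(
      batchableOperand.getDefiningOp());
}
```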
2023-03-24 11:06:51 +01:00
Andi Drebes
c8c969773e Rebase onto llvm-project 465ee9bfb26d with local changes
This commit rebases the compiler onto commit 465ee9bfb26d from
llvm-project with locally maintained patches on top, i.e.:

  * 5d8669d669ee: Fix the element alignment (size) for memrefCopy
  * 4239163ea337: fix: Do not fold the memref.subview if the offset are
                  != 0 and strides != 1
  * 72c5decfcc21: remove github stuff from llvm
  * 8d0ce8f9eca1: Support arbitrary element types in named operations
                  via attributes
  * 94f64805c38c: Copy attributes of scf.for on bufferization and make
                  it an allocation hoisting barrier

Main upstream changes from llvm-project that required modification of
concretecompiler (a few of the mechanical migrations are illustrated
after the list):

  * Switch to C++17
  * Various changes in the interfaces for linalg named operations
  * Transition from `llvm::Optional` to `std::optional`
  * Use of enums instead of string values for iterator types in linalg
  * Changed default naming convention of getter methods in
    ODS-generated operation classes from `some_value()` to
    `getSomeValue()`
  * Renaming of Arithmetic dialect to Arith
  * Refactoring of side effect interfaces (i.e., renaming from
    `NoSideEffect` to `Pure`)
  * Re-design of the data flow analysis framework
  * Refactoring of build targets for Python bindings
  * Refactoring of array attributes with integer values
  * Renaming of `linalg.init_tensor` to `tensor.empty`
  * Emission of `linalg.map` operations in bufferization of the Tensor
    dialect requiring another linalg conversion pass and registration
    of the bufferization op interfaces for linalg operations
  * Refactoring of the one-shot bufferizer
  * Necessity to run the expand-strided-metadata, affine-to-std and
    finalize-memref-to-llvm passes before conversion to the LLVM
    dialect
  * Renaming of `BlockAndValueMapping` to `IRMapping`
  * Changes in the build function of `LLVM::CallOp`
  * Refactoring of the construction of `llvm::ArrayRef` and
    `llvm::MutableArrayRef` (direct invocation of constructor instead
    of builder functions for some cases)
  * New naming conventions for generated SSA values requiring rewrite
    of some check tests
  * Refactoring of `mlir::LLVM::lookupOrCreateMallocFn()`
  * Interface changes in generated type parsers
  * New dependencies of the runtime on mlir_float16_utils and
    MLIRSparseTensorRuntime
  * Overhaul of MLIR-c deleting `mlir-c/Registration.h`
  * Deletion of library MLIRLinalgToSPIRV
  * Deletion of library MLIRLinalgAnalysis
  * Deletion of library MLIRMemRefUtils
  * Deletion of library MLIRQuantTransforms
  * Deletion of library MLIRVectorToROCDL
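
For illustration, a few of the purely mechanical migrations as they typically appear in downstream code (examples, not excerpts from the actual diff):

```
#include <optional>
#include "mlir/IR/IRMapping.h"

// llvm::Optional -> std::optional
std::optional<int64_t> staticSize; // was: llvm::Optional<int64_t> staticSize;

// ODS getter prefixing: op.lowerBound() -> op.getLowerBound()

// mlir::BlockAndValueMapping -> mlir::IRMapping
mlir::IRMapping mapping;           // was: mlir::BlockAndValueMapping mapping;
```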
2023-03-09 17:47:16 +01:00
Quentin Bourgerie
f975415e86 chore: Integrate concrete-compiler to the mono-repo 2023-03-03 09:20:01 +01:00