The batching pass passes operands to the batched operation as a flat,
one-dimensional vector produced through a `tensor.collapse_shape`
operation collapsing all dimensions of the original tensor of
operands. Similarly, the shape of the result vector of the batched
operation is expanded to the original shape afterwards using a
`tensor.expand_shape` operation.
The pass emits the `tensor.collapse_shape` and `tensor.expand_shape`
operations unconditionally, even for tensors, which already have only
a single dimension. This causes the verifiers of these operations to
fail in some cases, aborting the entire compilation process.
This patch lets the batching pass emit `tensor.collapse_shape` and
`tensor.expand_shape` for batched operands and batched results only if
the rank of the corresponding tensors is greater than one.
This adds a new pass that is able to hoist operations implementing the
`BatchableOpInterface` out of a loop nest that applies the operation
to the elements of a tensor indexed by the loop induction variables.
Example:
scf.for %i = c0 to %cN step %c1 {
scf.for %j = c0 to %cM step %c1 {
scf.for %k = c0 to %cK step %c1 {
%s = tensor.extract %T[%i, %j, %k]
%res = batchable_op %s
...
}
}
}
is replaced with:
%batchedSlice = tensor.extract_slice
%T[%c0, %c0, %c0] [%cN, %cM, %cK] [%c1, %c1, %c1]
%flatSlice = tensor.collapse_shape %batchedSlice
%resTFlat = batchedOp %flatSlice
%resT = tensor.expand_shape %resTFlat
scf.for %i = c0 to %cN step %c1 {
scf.for %j = c0 to %cM step %c1 {
scf.for %k = c0 to %cK step %c1 {
%res = tensor.extract %resT[%i, %j, %k]
...
}
}
}
Every index of the tensor with the input values may be a quasi-affine
expression on a single loop induction variable, as long as the
difference between the results of the expression for any two
consecutive values of the referenced loop induction variable is
constant.
Rebase to llvm-project at 3f81841474fe with a pending upstream patch
for arbitrary element types in linalg named operations.
Co-authored-by: Ayoub Benaissa <ayoub.benaissa@zama.ai>
This commit rebases the compiler onto commit f69328049e9e from
llvm-project.
Changes:
* Use of the one-shot bufferizer for improved memory management
* A new pass `OneShotBufferizeDPSWrapper` that converts functions
returning tensors to destination-passing-style as required by the
one-shot bufferizer
* A new pass `LinalgGenericOpWithTensorsToLoopsPass` that converts
`linalg.generic` operations with value semantics to loop nests
* Rebase onto a fork of llvm-project at f69328049e9e with local
modifications to enable bufferization of `linalg.generic` operations
with value semantics
* Workaround for the absence of type propagation after type conversion
via extra patterns in all dialect conversion passes
* Printer, parser and verifier definitions moved from inline
declarations in ODS to the respective source files as required by
upstream changes
* New tests for functions with a large number of inputs
* Increase the number of allowed task inputs as required by new tests
* Use upstream function `mlir_configure_python_dev_packages()` to
locate Python development files for compatibility with various CMake
versions
Co-authored-by: Quentin Bourgerie <quentin.bourgerie@zama.ai>
Co-authored-by: Ayoub Benaissa <ayoub.benaissa@zama.ai>
Co-authored-by: Antoniu Pop <antoniu.pop@zama.ai>