Google Benchmark is built twice because of the new benchmark
infrastructure for concrete-cuda. This commit fixes that by introducing
the CONCRETE_CUDA_BUILD_TESTS and CONCRETE_CUDA_BUILD_BENCHMARKS options
to skip unnecessary builds.
For debugging purposes, add a CMake variable that enables the generation
of insecure keycaches, which lets tracing ops show the message in the
ciphertext body.
Re-introduce the previously deleted batching tests for BConcrete as
tests for TFHE with the addition of a new test, checking that
non-batchable operands generated by pure operations are hoisted.
With TFHE operations becoming batchable, the batching pass must now be
run after the conversion to TFHE and TFHE parametrization, but before
any further lowering.
The batching pass only creates a batched version of a batchable
operation if all of its non-batchable operands are defined outside of
the outermost loop iterating over the values of the batchable operand.
This change also allows operations to be batched if their
non-batchable operands are generated by operations that are pure and
thus hoistable out of the outermost loop.
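
The batching condition described above can be sketched as follows. This
is a simplified standalone model in plain C++, not the pass itself:
`Val`, `Op`, `isHoistable` and `canBatch` are hypothetical names, and the
recursive check on the operands of pure operations is an assumption
about what makes a pure operation hoistable.

```cpp
#include <vector>

struct Op;

// Hypothetical stand-in for an SSA value used as a non-batchable operand.
struct Val {
  const Op *producer = nullptr;    // nullptr: not produced by an operation
  bool definedOutsideLoop = false; // defined outside of the outermost loop
};

// Hypothetical stand-in for an operation inside the loop nest.
struct Op {
  bool pure = false;                // no side effects
  std::vector<const Val *> operands;
};

// A value can be made available outside of the outermost loop if it is
// already defined there, or if it is produced by a pure operation whose
// operands can all be made available there as well (assumption).
bool isHoistable(const Val &v) {
  if (v.definedOutsideLoop)
    return true;
  if (!v.producer || !v.producer->pure)
    return false;
  for (const Val *operand : v.producer->operands)
    if (!isHoistable(*operand))
      return false;
  return true;
}

// A batched version is only created if every non-batchable operand is
// available outside of the outermost loop, possibly after hoisting.
bool canBatch(const std::vector<const Val *> &nonBatchableOperands) {
  for (const Val *v : nonBatchableOperands)
    if (!isHoistable(*v))
      return false;
  return true;
}
```

In this model, an operand produced by an impure operation, or by a pure
operation that depends on a loop induction variable, prevents batching.
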
An early test for a batchable operation checks whether the batchable
operand is produced by a `tensor.extract` operation and bails out if
this is not the case. However, the use of `llvm::dyn_cast<T>()` directly
on the defining operation of the batchable operand causes an attempt
to cast a null value for an operand which is not produced by an
operation (e.g., block arguments).
Using `llvm::dyn_cast_or_null<T>()` fixes this issue.
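
The difference between the two casts can be illustrated with a minimal
standalone model. `Operation`, `ExtractOp`, `AddOp`, `Value`, `dynCast`
and `dynCastOrNull` below are simplified stand-ins written for this
sketch, not the LLVM/MLIR classes; only the null-handling contract of
`llvm::dyn_cast` (the argument must be non-null) and
`llvm::dyn_cast_or_null` (a null argument yields null) is taken from LLVM.

```cpp
#include <cassert>

// Simplified stand-ins for an operation hierarchy (hypothetical).
struct Operation {
  virtual ~Operation() = default;
};
struct ExtractOp : Operation {};
struct AddOp : Operation {};

// A value is either the result of an operation or a block argument.
// A block argument has no defining operation, modeled here by nullptr.
struct Value {
  Operation *definingOp = nullptr;
  Operation *getDefiningOp() const { return definingOp; }
};

// Models the contract of llvm::dyn_cast<T>(): the argument must not be null.
template <typename T> T *dynCast(Operation *op) {
  assert(op && "dynCast used on a null pointer");
  return dynamic_cast<T *>(op);
}

// Models the contract of llvm::dyn_cast_or_null<T>(): null in, null out.
template <typename T> T *dynCastOrNull(Operation *op) {
  return op ? dynCast<T>(op) : nullptr;
}

// Early bail-out test: is the operand produced by an extract operation?
// Safe for block arguments because the null case is handled explicitly.
bool isProducedByExtract(const Value &v) {
  return dynCastOrNull<ExtractOp>(v.getDefiningOp()) != nullptr;
}
```

Calling `dynCast<ExtractOp>(v.getDefiningOp())` directly in
`isProducedByExtract` would trip the assertion for a block argument,
which mirrors the failure described above.
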
The new wrapper function calls the main compiled function, and this
caused problems in the GOT/PLT due to functions sharing the same name.
We now prefix with `concrete_` to avoid that.
This was already implemented for JIT using mlir::ExecutionEngine, but
library compilation and execution used a different, more complex
mechanism, which caused a bad calling convention at the assembly level
on macOS M1 machines. This commit unifies the invocation of JIT- and
library-compiled circuits, which solves that issue and also makes it
possible to extend compiled libraries to support more than one returned
value.