previous version was not working properly (too recent). So we switched
to 11.8 and thus downgraded to gcc11. The reason for using ?= in the
Makefile is to default to gcc11 in the docker image, as those variables
are set in the env.
With the type inference pass now being able to handle RT operations
and multiple functions, the restriction disallowing data-flow
parallelization when choosing the dag-multi optimization strategy can
be lifted. Remove the check and bail out in `CompilerEngine.cpp`.
This adds support for the operations `RT.await_future`,
`RT.build_return_ptr_placeholder`, `RT.create_async_task`,
`RT.deref_return_ptr_placeholder`,
`RT.deref_work_function_argument_ptr_placeholder`,
`RT.make_ready_future`, `RT.register_task_work_function`, and
`RT.work_function_return`, the RT types `RT.ptr` and `RT.future`, as
well as the auxiliary operation `arith.select`.
The type inference pass currently only supports a single function as
it is unable to infer types across function boundaries. This commit
adds support for multiple functions and tracks types across
`func.func`, `func.call` and function `constant` operations.
A function may be referenced at multiple sites, including at locations
that have not been visited by the rewriter when the function itself is
rewritten. By using `IRRewriter::replaceOp` after rewriting a
function, the function is erased immediately, which may cause the
rewriting to fail due to remaining uses of the function.
Defering the erasure of functions to the end of the entire rewriting
process ensures that all uses of the original functions have been
updated to the rewritten functions.
Operations with regions containing multiple blocks are currently not
handled correctly, since the list of successors of an operation is not
updated with rewritten blocks when the operation is
rewritten. Furthermore, functions were assumed to have only a single
block.
This change supports operations with multiple blocks by fixing the
issues above.
Updates of the types inferred for the values related to an operation
only propagate correctly to the operation if there is a direct
producer-consumer relationship or an indirect producer-consumer
relationship via some additional mechanism (e.g., region
successors). However, this is not sufficient to ensure updates of
values and operations that are related otherwise.
This change explicitly models the dependencies between related values
and an operation via `DataFlowAnalysis::addDependency` in order to
guarantee that all updates of the types of values are propagated to
all operations that the type resolver has designated as related
operations.
Furthermore, all operations are visited once initially in order to
guarantee that updates propagate to operations, for which the dataflow
framework does not invoke `visitOperation`.
This delegates the enumeration of values related to an operation to
the class `TypeResolver`. This allows for the customization of this
process via a class inheriting `TypeResolver`.
Until now, the pass hoisting `RT.await_future` operations only
supports `tensor.parallel_insert_slice` operations that use loop
induction variables directly as indexes. Any more complex indexing
expressions produce a domination error, since a
`tensor.parallel_insert_slice` cloned by the pass into an additional
parallel for loop is left with references to values from the original
loop.
This change properly clones operations producing intermediate values
within the original parallel for loop and thus adds support for
indexing expressions that reference loops IVs only indirectly.
The type parser of the RT dialect unconditionally attempts to parse
the string "future". This breaks the parsing of any RT type. Remove
unconditional parsing of this prefix.
Return-like operations require special treatment by the type inference
rewriter, since their operand types are both tied to the result types
of their producers and to result types of their parent operations.
The inference scheme for ordinary operations, in which the initial
local inference state is composed of the operand types of the
rewritten producers and the old types of related operations before
rewriting is insufficient, since this may result in a mismatch between
the inferred types and the actual types of the already rewritten
parent operation.
Until now, precedence of the new result types of the parent operation
has been implemented by simply designating these types as the operand
types of a return-like operation. However, while this works as
intended for return-like operations, which simply forward values
(e.g., `func.return`), this creates invalid IR for other return-like
operations (e.g., `tensor.yield`).
This change implements precedence of the result types of the parent
operation of a return-like operation by adding the return types of the
already rewritten parent operation to the initial local inference
state before final invocation of type inference.
The type inference rewriter changes the name of the rewritten function
to the name of the original function when the rewriting process is
complete. However, the name is retrieved from the original function
operation after the operation has already been replaced and thus
destroyed, resulting in a null pointer dereference.
This change retrieves the name of the original function before it is
replaced and saves it in a copy, which is then used to safely assign
the new name to the rewritten function.