* Fix some issues with defaults
Fixes to llama2 cpu compilation (turns off data tiling for old argmax
mode)
---------
Co-authored-by: Max Dawkins <max.dawkins@gmail.com>
* Update default CPU compilation flags.
c5a6cdc8dd52eb7e9b82
tweak CPU iree-compile flags to match upstream changes.
* Add an option for data tiling on SD models.
* Move clean_device_info to compile_utils
* Update compile_utils.py
* Fix .mlir writes for some user-level permissions
* Fix cases where full URI is given
* Fix conditionals.
* Fix device path handling in vulkan utils.
compile_str is always False in compile_module_to_flatbuffer since there
is a parameter 'model_name' before 'debug'.
This issue is relative to https://github.com/nod-ai/SHARK/pull/1863.
Then we can use mlir model buffer in RAM to run inference.
* Switch most compile flows to use ireec.compile_file.
* re-add input type to compile_str path.
* Check if mlir_module exists before checking if it's a path or pyobject.
* Fix some save_dir cases
Print a note ahead of a potentially long inactivity to set the right expectations.
Separately, we should add progress to the UI and make this loading faster.
-- Currently SHARK suggests that vmfb has been saved, while
that is not the case and no vmfb is generated.
This creates a misdirection for IR/vmfbs which are of larger
size.
-- This commit therefore fixes that misdirection.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
-- This commit fixes the wrong Vulkan device being selected during
runtime.
-- It also adds couple of IREE compilation flags to target specific
Vulkan device.
-- It also changes the Vulkan device listing to be more in tune with
lowering control flow.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
This allows to pass more arguemnts to the IREE compiler
Example:
python my-app.py --additional_compile_args="--mlir-pretty-debuginfo --mlir-timing"
Co-authored-by: Boian Petkantchin <boian@nod-labs.com>
-- This commit adds Scaled Dot Product Flash Attention's decomposition
in shark_importer.
-- It also updates `iree-flow-enable-data-tiling` to `iree-opt-data-tiling`.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
* WIP: MSVC ROCM support for SHARK Studio
* Make get_iree_rocm_args platform-agnostic.
* Update stable_args.py
* Update rocm arg handling in SD utils
* Guard quantization imports.
Co-authored-by: jam https://github.com/jammm