* Switch most compile flows to use `ireec.compile_file` (a sketch of both entry points follows below).
* Re-add the `input_type` argument to the `compile_str` path.
* Check that `mlir_module` exists before checking whether it is a file path or a Python object.
* Fix some `save_dir` cases.
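A minimal sketch of the two compile paths, with `iree.compiler` imported as `ireec`; the dispatch logic and argument names here are illustrative of the change, not SHARK's exact code ("auto" lets the compiler infer the input dialect):

```python
import os
import iree.compiler as ireec

def compile_module(mlir_module, device="llvm-cpu", input_type="auto"):
    # Guard: make sure mlir_module exists at all before asking whether it
    # is a path on disk or an in-memory object.
    if mlir_module is None:
        raise ValueError("no MLIR module provided")
    if isinstance(mlir_module, str) and os.path.isfile(mlir_module):
        # File on disk: compile_file avoids round-tripping large modules
        # through Python memory.
        return ireec.compile_file(
            mlir_module, target_backends=[device], input_type=input_type
        )
    # In-memory str/bytes: compile_str, re-passing input_type, which this
    # path previously dropped.
    return ireec.compile_str(
        mlir_module, target_backends=[device], input_type=input_type
    )
```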
-- This commit adds a decomposition for Scaled Dot Product (Flash) Attention
in shark_importer (see the sketch below).
-- It also renames the `iree-flow-enable-data-tiling` compiler flag to `iree-opt-data-tiling`.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
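The decomposition itself is the standard expansion `softmax(Q @ K^T / sqrt(head_dim)) @ V`. Below is a plain PyTorch sketch of that expansion; the names are illustrative, and the actual change registers an equivalent decomposition inside shark_importer rather than calling a helper like this:

```python
import math
import torch

def sdpa_decomposed(q, k, v, attn_mask=None):
    # scores = Q @ K^T / sqrt(head_dim)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])
    if attn_mask is not None:
        scores = scores + attn_mask
    # softmax over the key dimension, then weight the values
    return torch.softmax(scores, dim=-1) @ v

q = torch.randn(2, 8, 16, 64)  # (batch, heads, seq, head_dim)
k, v = torch.randn_like(q), torch.randn_like(q)
ref = torch.nn.functional.scaled_dot_product_attention(q, k, v)
assert torch.allclose(sdpa_decomposed(q, k, v), ref, atol=1e-5)
```

Checking against `torch.nn.functional.scaled_dot_product_attention`, as the assert does, is the same numerics check the importer-level decomposition needs to pass.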
- Fix setup_venv.sh for benchmarks, imports, etc.
- Fix torch benchmarks in SharkBenchmarkRunner.
- Generate Stable Diffusion (SD) artifacts with build_tools/stable_diffusion_testing.py and --import_mlir (example invocation below).
- Decouple SD artifact generation from tank/generate_sharktank for now.
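A minimal invocation sketch, assuming the script is run from the repository root; `--import_mlir` is the flag named above, and any other options would be whatever the script itself defines:

```python
import subprocess

# Drive SD artifact generation as a subprocess so a failure surfaces in CI.
subprocess.run(
    ["python", "build_tools/stable_diffusion_testing.py", "--import_mlir"],
    check=True,
)
```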
* Change the script to the 1.3b model and add a PyTorch comparison.
* Fix the CLI command.
* Match OPT transformers model updates and validate numerics against the latest version.
* Clean up the OPT sentence-completion script.
* Fix formatting and add standalone validation scripts.
* Add a minimal OPT wrapper and an example using `import_with_fx` (sketch after this list).
* Rename the OPT full-model wrapper.
* Clean up the OPT test scripts.
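A sketch of what a minimal OPT wrapper looks like: a forward that returns only logits keeps the model traceable and makes numerics comparison simple. The `import_with_fx` call is left commented out because its exact signature lives in shark_importer and may differ from what is shown:

```python
import torch
from transformers import AutoTokenizer, OPTForCausalLM

class OPTWrapper(torch.nn.Module):
    def __init__(self, name="facebook/opt-1.3b"):
        super().__init__()
        self.model = OPTForCausalLM.from_pretrained(name)

    def forward(self, input_ids, attention_mask):
        # Return a bare tensor rather than a ModelOutput so FX tracing and
        # PyTorch-vs-SHARK comparisons stay simple.
        return self.model(input_ids, attention_mask=attention_mask).logits

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
inputs = tokenizer("The quick brown fox", return_tensors="pt")

# Hypothetical import call; see shark_importer for the real signature:
# from shark.shark_importer import import_with_fx
# mlir_module = import_with_fx(OPTWrapper(), (inputs.input_ids, inputs.attention_mask))
```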
Adding the cpu-sync and cpu-task device configs allowed their tests to bypass the xfail conditional applied to cpu pytest cases marked in tank/all_models.csv. This commit updates the conditional so that those cases are also xfailed on cpu-sync and cpu-task, as sketched below.
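A sketch of the updated conditional, with illustrative names (the real check reads its xfail flags from the model's row in tank/all_models.csv):

```python
import pytest

# Treat every CPU device variant the same so cpu-sync/cpu-task cannot slip
# past the CSV-driven xfail.
CPU_DEVICES = ("cpu", "cpu-sync", "cpu-task")

def maybe_xfail(device: str, xfail_cpu: bool, model_name: str):
    # xfail_cpu would come from the model's entry in tank/all_models.csv.
    if xfail_cpu and device in CPU_DEVICES:
        pytest.xfail(f"{model_name} is marked xfail on CPU devices")
```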
* Only xfail Windows models in CI.
* downloader: make model updates more robust.
* Separate baseline and native benchmarks in pytest.
* Fix native benchmarks
* Fix torchvision model utils.
* Add a few xfails to enable the macOS builder.
* Convert string batch sizes to ints where needed (see the conftest sketch below).
* Allow pytest to retry fetching model artifacts, as sketched below.
* Reduce the retry attempts and add an assert message.
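A sketch of the retry behavior; the attempt count and the `download_model` helper are illustrative:

```python
import time

MAX_ATTEMPTS = 3  # reduced attempt count, per the item above

def fetch_artifacts(model_name, download_model):
    artifacts, last_err = None, None
    for attempt in range(MAX_ATTEMPTS):
        try:
            artifacts = download_model(model_name)
            break
        except Exception as err:
            last_err = err
            time.sleep(2 ** attempt)  # simple backoff between attempts
    # Assert with a message so a flaky download fails loudly and legibly.
    assert artifacts is not None, (
        f"Failed to fetch {model_name} after {MAX_ATTEMPTS} attempts: {last_err}"
    )
    return artifacts
```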
* Fix sharktank generation and add a `batch_size` pytest option for torch (conftest sketch below).
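A conftest.py sketch of the option; pytest hands option values back as strings unless told otherwise, which is why the int conversion mentioned above is needed:

```python
def pytest_addoption(parser):
    parser.addoption(
        "--batch_size",
        action="store",
        default="1",
        help="batch size to use for torch model tests",
    )

def get_batch_size(request) -> int:
    # Convert the string option to an int where a test needs a number.
    return int(request.config.getoption("--batch_size"))
```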
* Disable TorchDynamo until Python 3.11 is supported.
* Compile the torch model without Dynamo if `torch.compile` fails (fallback sketch below).
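A sketch of the fallback, assuming the failure surfaces when `torch.compile` is invoked (as it did on then-unsupported Python versions):

```python
import torch

def compile_torch_model(model: torch.nn.Module) -> torch.nn.Module:
    try:
        # Prefer torch.compile (TorchDynamo) when it works.
        return torch.compile(model)
    except Exception as err:
        # Fall back to the eager module, e.g. on Python versions where
        # Dynamo is not yet supported.
        print(f"torch.compile failed ({err}); falling back to eager module")
        return model
```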
* Use release versions of TF/Keras for importer.
* Pin torchvision and remove debug prints.
* Remove duplicates from torch model list.
* Update generate_sharktank.py
* xfail a few models that fail sharktank generation or numerics.
* Roll back T5 models for torch, as their inputs cause issues that are not trivial to resolve.
* xfail efficientnet-b0 on torch+cuda -- see "CUDA requesting shared memory size larger than allowed size" (openxla/iree#12771).
* Add a `gen_shark_files` function to shark_downloader for on-the-fly (OTF) artifact generation (sketch after this list).
* Add generate_sharktank as a Python module under tank/.
* Fix some paths in tank generation.
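A sketch of how on-the-fly generation can slot into the downloader; the function bodies and signatures here are guesses at the shape of the change, not the actual API:

```python
import os

def get_model_artifacts(model_name: str, frontend: str, tank_dir: str):
    model_dir = os.path.join(tank_dir, f"{model_name}_{frontend}")
    if not os.path.exists(model_dir):
        # Cache miss: generate the artifacts on the fly instead of failing.
        gen_shark_files(model_name, frontend, tank_dir)
    return model_dir

def gen_shark_files(model_name: str, frontend: str, tank_dir: str):
    # generate_sharktank is now a tank/ module, so the downloader can import
    # it directly; the exact generation entry point is elided here.
    from tank import generate_sharktank  # noqa: F401
    ...  # call into generate_sharktank for just this model
```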