AmosLewis
c199ac78eb
Add decompose of aten._scaled_dot_product_flash_attention.default
...
The new decomposition was implemented in PyTorch only recently.
Here is the PyTorch PR: https://github.com/pytorch/pytorch/pull/117390
This decomposition is required for lowering the chatglm model in torch-mlir.
Here is the issue: https://github.com/llvm/torch-mlir/issues/2730
2024-01-16 03:03:14 +00:00
Ean Garvey
fa95ed30d1
Relocate quantized matmul reassociation flag ( #2047 )
...
* Remove quantized matmul reassociation flag
This flag should be a model/use-case specific addition, not a default CPU compile flag.
2023-12-20 12:48:40 -08:00
Daniel Garvey
ebfcfec338
remove shark 1.0 tests, add support for 2.0 llm
...
* add support for external weights
* add tests and edit deps
2023-12-14 21:44:37 -06:00
Richard Pastirčák
3af0c6c658
#1843 - Add Export Default settings button ( #2016 )
...
* #1843 - Add Export Default settings button
* #1843 reformatting unit tests
---------
Co-authored-by: Richard Pastirčák <richard.pastircak@student.tuke.sk>
2023-12-06 14:58:17 -06:00
Eliasj42
dfdd3b1f78
improved sharded performance and fixed issue with lmhead on rocm ( #2008 )
...
* improved sharded performance and fixed issue with lmhead on rocm
* mmap shards + disable sharing of device arrays across devices
* fix device_idx for non-layer vmfbs
* fix time calc for sharded
---------
Co-authored-by: Elias Joseph <elias@nod-labs.com>
Co-authored-by: PhaneeshB <b.phaneesh@gmail.com>
2023-12-05 11:53:44 -08:00
Ean Garvey
6384780d16
Fixes to llama2 cpu compilation and studio UI, schedulers ( #2013 )
...
* Fix some issues with defaults
Fixes to llama2 cpu compilation (turns off data tiling for old argmax
mode)
---------
Co-authored-by: Max Dawkins <max.dawkins@gmail.com>
2023-12-05 11:19:19 -05:00
Ean Garvey
d72da3801f
(Studio) Update gradio and multicontrolnet UI. ( #2001 )
...
* (Studio) Update gradio and multicontrolnet UI.
* Fixes for outputgallery, exe build
* Fix image return types.
* Update Gradio to 4.7.1
* Fix send buttons and hiresfix
* Various bugfixes and SDXL additions.
* More UI fixes and txt2img_sdxl presets.
* enable SDXL-Turbo and custom models, custom VAE for sdxl
* img2img ui tweaks
2023-12-04 12:37:51 -06:00
Ean Garvey
795fc33001
Update default compilation flags for data tiling. ( #2000 )
...
* Update default CPU compilation flags.
c5a6cdc8dd
52eb7e9b82
tweak CPU iree-compile flags to match upstream changes.
* Add an option for data tiling on SD models.
2023-11-30 17:05:37 -06:00
Evan Ruttenberg
78c607e1d3
Fix typo in default_rocm_arch ( #1998 )
2023-11-29 20:40:56 -05:00
Ean Garvey
da50a16242
Create specified dir if needed during save_mlir and fix vulkan device fetching without URI/ID ( #1989 )
2023-11-23 01:01:41 -06:00
PhaneeshB
2f780f0d38
quick fix rocm None device
2023-11-22 21:17:25 +05:30
Ean Garvey
d051c3a4a7
Use clean_device_info() by default and don't write .mlir to /tmp/ ( #1984 )
...
* Move clean_device_info to compile_utils
* Update compile_utils.py
* Fix .mlir writes for some user-level permissions
* Fix cases where full URI is given
* Fix conditionals.
* Fix device path handling in vulkan utils.
2023-11-20 13:10:31 -06:00
Ean Garvey
905d0103ff
Revert "Re-enable SD tunings without matmuls. ( #1976 )" ( #1979 )
...
This reverts commit 70817bb50a.
2023-11-17 23:44:33 +05:30
Ean Garvey
70817bb50a
Re-enable SD tunings without matmuls. ( #1976 )
2023-11-15 20:42:53 -06:00
jinchen62
dd37c26d36
Update brevitas quant api ( #1975 )
2023-11-15 10:04:07 -08:00
Ean Garvey
f6d41affd9
(SHARK Studio) Add Turbine-based llm chatbot. ( #1933 )
...
* Dan shark studio (#1970)
* Fix issue in Falcon-GPTQ
* initial webui and llama2
---------
Co-authored-by: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
* Fix formatting.
---------
Co-authored-by: Daniel Garvey <34486624+dan-garvey@users.noreply.github.com>
Co-authored-by: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>
2023-11-14 09:56:28 -06:00
PhaneeshB
11510d5111
add intra rocm vmfb differentiator
2023-11-13 23:35:55 +05:30
PhaneeshB
392bade0bf
enable non default rocm device selection for webui
2023-11-13 23:35:55 +05:30
PhaneeshB
51afe19e20
fix rocm arch selection
2023-11-10 13:22:51 +05:30
Ean Garvey
31005bcf73
Don't require vulkan installation to query devices. ( #1953 )
2023-11-09 14:46:44 -06:00
Phaneesh Barwaria
db89b1bdc1
Fix MacOS web execution flow ( #1899 )
...
* fix metal device path for chatbot
* single device remove indexing
* lint fix
2023-11-09 10:59:29 -06:00
Huang Qi
2754e2e257
Fix wrong parameter index passed to 'compile_module_to_flatbuffer' ( #1921 )
...
compile_str is always False in compile_module_to_flatbuffer since there
is a parameter 'model_name' before 'debug', so positional arguments land
in the wrong slots.
This issue is related to https://github.com/nod-ai/SHARK/pull/1863.
With this fix, an MLIR model buffer held in RAM can be used to run inference.
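The failure mode described in this commit message can be sketched in a few lines. The function below is a hypothetical stand-in, not the real compile_module_to_flatbuffer signature; it only assumes, as the message states, that a 'model_name' parameter sits before 'debug' and 'compile_str'.

```python
# Hypothetical stand-in: the real SHARK function takes more parameters.
# The only assumption taken from the commit message is the ordering
# model_name -> debug -> compile_str.
def compile_module(module, model_name="", debug=False, compile_str=False):
    return {"model_name": model_name, "debug": debug, "compile_str": compile_str}


# A caller that passes the flags positionally, as if 'model_name' did not
# exist: every value shifts one slot to the left.
shifted = compile_module("module-bytes", False, True)

# Passing the flags by keyword makes the call independent of parameter order.
fixed = compile_module("module-bytes", debug=False, compile_str=True)

print(shifted)
print(fixed)
```

In the shifted call the value intended for compile_str ends up in debug, and compile_str silently keeps its default. Keyword arguments are one way to avoid this class of bug; the actual commit may have fixed the index differently.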
2023-11-09 10:58:05 -06:00
PhaneeshB
ab0e870c43
fix vicuna cli vulkan
2023-11-09 22:27:13 +05:30
Stanley Winata
500c4f2306
[compile utils] Fix ROCM to not expect config.id as a default. ( #1939 )
2023-11-06 08:44:53 -08:00
Ean Garvey
5001db3415
Add 7800xt to target triples explicitly. ( #1928 )
2023-11-01 17:11:45 -05:00
PhaneeshB
7963abb8ec
remove caching for rocm args
2023-10-29 07:07:57 +05:30
PhaneeshB
72c0a8abc8
remove dependency on external commands for driver installation check
2023-10-27 10:30:40 +05:30
Vivek Khandelwal
ea920f2955
Add sharded Falcon support
2023-10-26 21:53:25 +05:30
Phaneesh Barwaria
486202377a
update dependency on rocm/hip info command ( #1900 )
...
* add support for rocm flags
* add rocm target flag to chat args
* rm rocm libs dependency message
2023-10-26 15:18:25 +05:30
Ean Garvey
e6cb5cef57
Add --additional_runtime_args option and use in OPT example. ( #1855 )
...
* Add --additional_runtime_args option and use in OPT example.
Fix the func name. (#1838)
Co-authored-by: Sungsoon Cho <sungsoon.cho@gmail.com>
2023-10-19 13:29:39 -05:00
Huang Qi
66abee8e5b
SharkInference: Fix various examples and README.md ( #1903 )
...
Following https://github.com/nod-ai/SHARK/pull/708, remove the parameter
'func_name' from SharkInference usages.
2023-10-19 09:28:36 -05:00
Ean Garvey
4797bb89f5
Stringify path for ireec.compile_file ( #1901 )
...
* Stringify path for ireec.compile_file
* Update test-models.yml
2023-10-18 14:59:23 -05:00
Ean Garvey
0b77059628
Add matmul reassociation flags ( #1891 )
2023-10-12 20:12:37 -05:00
Vivek Khandelwal
b83d32fafe
Fix Falcon GPTQ Pipeline
2023-10-11 20:09:32 +05:30
Vivek Khandelwal
0a618e1863
Add support for Falcon GPTQ
2023-10-11 10:47:48 +05:30
Phaneesh Barwaria
a731eb6ed4
Macos fixes ( #1883 )
...
* fix venv setup for MacOS
* allow stream fuse binding on mac
* clean iree metal args
2023-10-09 23:36:12 -07:00
Ean Garvey
caf6cc5d8f
Switch most compile flows to use ireec.compile_file. ( #1863 )
...
* Switch most compile flows to use ireec.compile_file.
* re-add input type to compile_str path.
* Check if mlir_module exists before checking if it's a path or pyobject.
* Fix some save_dir cases
2023-10-06 23:04:43 -05:00
powderluv
a38cc9d216
Update vulkan_utils.py for Radeon 780m igpu ( #1866 )
2023-10-04 20:33:07 -07:00
Jakub Kuderski
1c382449ec
[vulkan] Print note about module load times. NFC. ( #1862 )
...
Print a note ahead of a potentially long inactivity to set the right expectations.
Separately, we should add progress to the UI and make this loading faster.
2023-10-03 17:27:27 -04:00
Vivek Khandelwal
8dd7850c69
Add Falcon-GPTQ support
2023-10-02 16:39:57 +05:30
PhaneeshB
94594542a9
remove use of vulkaninfo
2023-09-28 21:57:00 +05:30
Jakub Kuderski
4fec03a6cc
[vulkan] Switch from coop matrix NV to KHR ( #1848 )
2023-09-27 21:43:37 -04:00
Abhishek Varma
ad1a0f35ff
Fix misdirection while saving vmfb
...
-- Currently SHARK suggests that vmfb has been saved, while
that is not the case and no vmfb is generated.
This creates a misdirection for IR/vmfbs which are of larger
size.
-- This commit therefore fixes that misdirection.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-09-27 16:25:29 +05:30
Abhishek Varma
9a0efffcca
[Llama2] Fix wrong Vulkan device ID + Add Vulkan compile flags
...
-- This commit fixes the wrong Vulkan device being selected during
runtime.
-- It also adds a couple of IREE compilation flags to target specific
Vulkan device.
-- It also changes the Vulkan device listing to be more in tune with
lowering control flow.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-09-22 22:24:18 +05:30
Boian Petkantchin
79267931c1
Add argument --additional_compile_args ( #1119 )
...
This allows passing more arguments to the IREE compiler.
Example:
python my-app.py --additional_compile_args="--mlir-pretty-debuginfo --mlir-timing"
Co-authored-by: Boian Petkantchin <boian@nod-labs.com>
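A minimal sketch of how a single --additional_compile_args string could be turned into a list of compiler flags. This is an assumption about the mechanism, not the code from this commit: the parser, the helper name parse_extra_compile_args, and the use of shlex are all illustrative.

```python
import argparse
import shlex

# Illustrative parser; the real application defines many more options.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--additional_compile_args",
    default="",
    help="Extra flags forwarded to the compiler, given as one quoted string.",
)


def parse_extra_compile_args(argv):
    """Hypothetical helper: split the quoted string into separate flags."""
    args = parser.parse_args(argv)
    # shlex.split honours shell-style quoting inside the string.
    return shlex.split(args.additional_compile_args)


# Equivalent to the command line shown in the commit message; the shell
# strips the outer quotes, so the program sees one argv element.
extra = parse_extra_compile_args(
    ["--additional_compile_args=--mlir-pretty-debuginfo --mlir-timing"]
)
print(extra)
```

The `=` form matters here: because the value itself starts with `--`, writing it as a separate argv element would make argparse treat it as another option rather than as the value.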
2023-09-19 11:26:03 -05:00
Gaurav Shukla
11bdce9790
[flags] Fix vulkan runtime flags as vma is dropped from iree ( #1831 )
2023-09-14 08:58:59 -05:00
Ean Garvey
780f520f02
Fix vk.target_env extensions and remove redundant SD imports. ( #1826 )
...
* Remove redundant IREE runtime imports.
* Fix vulkan target env extensions.
2023-09-11 13:42:52 -05:00
Dom
c61b6f8d65
Code refactoring ( #1817 )
...
* use join
* fix bug
* further code optimizations
---------
Co-authored-by: Daniel Garvey <34486624+dan-garvey@users.noreply.github.com >
2023-09-11 11:30:56 -05:00
Vivek Khandelwal
9681d494eb
Update decomp list and shark trainer for DLRM
2023-09-06 21:24:50 +05:30
Vivek Khandelwal
1d31b2b2c6
Fix StableHLO Compilation flag
2023-09-05 21:32:33 +05:30