powderluv
a38cc9d216
Update vulkan_utils.py for Radeon 780m igpu ( #1866 )
2023-10-04 20:33:07 -07:00
Jakub Kuderski
1c382449ec
[vulkan] Print note about module load times. NFC. ( #1862 )
...
Print a note ahead of a potentially long period of inactivity to set the right expectations.
Separately, we should add a progress indicator to the UI and make this loading faster.
20231004.976
20231003.975
2023-10-03 17:27:27 -04:00
Gaurav Shukla
7cc9b3f8e8
[llama cli] Fix llama cli
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
20231003.974
2023-10-03 20:39:53 +05:30
Gaurav Shukla
e54517e967
[UI] Disable config generator, lora train and model manager ( #1858 )
...
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-10-02 22:34:40 -07:00
Ean Garvey
326327a799
Collect pipeline submodules for diffusers ckpt preprocessing. ( #1859 )
20231002.973
20231002.972
2023-10-03 00:29:28 -04:00
Ean Garvey
785b65c7b0
Add flag for specifying device-local caching allocator heap key. ( #1856 )
2023-10-03 00:28:39 -04:00
Sungsoon Cho
0d16c81687
Remove unused import. ( #1857 )
2023-10-02 11:36:08 -05:00
Vivek Khandelwal
8dd7850c69
Add Falcon-GPTQ support
2023-10-02 16:39:57 +05:30
Gaurav Shukla
e930ba85b4
[os] Remove os dependency from vmfb naming ( #1854 )
...
Also fixes a small ui issue for chatbot.
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
20231001.971
20230930.970
20230930.969
20230929.968
2023-09-29 12:38:17 -05:00
Gaurav Shukla
cd732e7a38
[chatbot] split execution time into prefill and decode
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-29 13:18:03 +05:30
Gaurav Shukla
8e0f8b3227
[ui] Update chatbot UI
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-29 13:18:03 +05:30
Gaurav Shukla
b8210ef796
[chatbot] Re-instantiate the chatbot object if device id changes
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-29 13:18:03 +05:30
PhaneeshB
94594542a9
remove use of vulkaninfo
20230928.967
2023-09-28 21:57:00 +05:30
Gaurav Shukla
82f833e87d
[vulkan] Update vmfb naming
...
Update vmfb naming for Vulkan devices in order to resolve naming
conflicts in the presence of multiple Vulkan devices.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
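A minimal sketch of the kind of disambiguation described above, with hypothetical names (the actual naming scheme lives in SHARK's compile utilities): fold the Vulkan device index into the vmfb filename so two GPUs on one machine do not overwrite each other's artifacts.

def vmfb_filename(model_name: str, device_index: int, target_triple: str) -> str:
    # e.g. "llama2_vulkan_0_rdna3.vmfb" vs "llama2_vulkan_1_rdna2.vmfb"
    return f"{model_name}_vulkan_{device_index}_{target_triple}.vmfb"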
2023-09-28 14:52:11 +05:30
Vivek Khandelwal
c9d6870105
Modify falcon pipeline for 180b support
2023-09-28 12:39:35 +05:30
Jakub Kuderski
4fec03a6cc
[vulkan] Switch from coop matrix NV to KHR ( #1848 )
20230927.965
2023-09-27 21:43:37 -04:00
harsh-nod
9a27f51378
Deprecate inference directory
...
This patch removes the inference directory that was no longer being used.
2023-09-27 14:29:00 -07:00
Abhishek Varma
ad1a0f35ff
Fix misdirection while saving vmfb
...
-- Currently SHARK suggests that a vmfb has been saved even when
that is not the case and no vmfb is actually generated.
This is misleading, especially for larger IRs/vmfbs.
-- This commit fixes that misdirection.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
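A hedged illustration of the class of fix described above (not the actual patch): report a saved vmfb only after confirming the artifact exists on disk. The function name and messages below are hypothetical.

import os

def report_vmfb_saved(vmfb_path: str) -> None:
    # Illustrative only: avoid claiming success when compilation
    # produced no artifact (common with very large IRs/vmfbs).
    if os.path.isfile(vmfb_path):
        print(f"Saved vmfb to {vmfb_path}")
    else:
        print(f"No vmfb was generated at {vmfb_path}")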
2023-09-27 16:25:29 +05:30
Nelson Sharpe
6773278ec2
Fix checkpoint_path unexpected argument ( #1832 )
20230926.964
20230925.963
20230924.962
2023-09-24 14:17:52 -07:00
Abhishek Varma
9a0efffcca
[Llama2] Fix wrong Vulkan device ID + Add Vulkan compile flags
...
-- This commit fixes the wrong Vulkan device being selected at
runtime.
-- It also adds a couple of IREE compilation flags to target a specific
Vulkan device.
-- It also changes the Vulkan device listing to be more in tune with the
lowering control flow.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
20230923.961
20230922.960
2023-09-22 22:24:18 +05:30
gpetters94
61c6f153d9
Switch to keras-nightly to fix a Linux issue ( #1835 )
20230921.959
2023-09-21 12:33:45 -04:00
Phaneesh Barwaria
effd42e8f5
pin gradio to v3.44.3
2023-09-21 17:33:43 +05:30
Sungsoon Cho
b5fbb1a8a0
Rename the func arg save_json to avoid name collision. ( #1837 )
...
* Rename the func arg save_json to avoid a name collision.
* Apply black formatting.
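The collision being avoided here is ordinary Python shadowing; a minimal sketch with hypothetical names (not the actual SHARK code):

import json

def save_json(obj, path):
    with open(path, "w") as f:
        json.dump(obj, f)

# Before the rename: the keyword argument shadows the module-level helper,
# so the call below hits the bool and raises TypeError.
def export_results_broken(results, save_json=True):
    if save_json:
        save_json(results, "results.json")  # TypeError: 'bool' object is not callable

# After the rename: the helper stays reachable.
def export_results(results, write_json=True):
    if write_json:
        save_json(results, "results.json")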
20230920.958
20230919.957
2023-09-19 17:29:27 -05:00
Quinn Dawkins
ded74d09cd
[vicuna.py] Keep past key values on device ( #1836 )
...
The past key values are only used within the models themselves and can
be kept on device. For Vulkan int4, this gives 44 tok/s (for the first
prompt) and settles at around 26 tok/s on a 7900 XTX.
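A hedged sketch of the general pattern (not the vicuna.py code): keep the KV cache device-resident between decode steps instead of round-tripping it through the host. All names below are hypothetical.

def greedy_decode(model, prompt_tokens, sample, max_new_tokens):
    past_key_values = None        # stays on the accelerator between steps
    next_token = prompt_tokens    # first step consumes the prompt
    tokens = []
    for _ in range(max_new_tokens):
        logits, past_key_values = model.forward(
            next_token, past_key_values=past_key_values
        )
        # Only the sampled token comes back to the host; past_key_values
        # is fed straight into the next forward call, never copied off device.
        next_token = sample(logits)
        tokens.append(next_token)
    return tokens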
2023-09-19 18:17:41 -04:00
Boian Petkantchin
79267931c1
Add argument --additional_compile_args ( #1119 )
...
This allows passing additional arguments to the IREE compiler.
Example:
python my-app.py --additional_compile_args="--mlir-pretty-debuginfo --mlir-timing"
Co-authored-by: Boian Petkantchin <boian@nod-labs.com>
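One way such a flag can be plumbed through, as a hedged sketch with hypothetical names (the real wiring lives in SHARK's flag handling):

import shlex

def build_compile_args(base_args, additional_compile_args=""):
    # Split the user-supplied string shell-style so quoted values survive,
    # then append to the flags already passed to the IREE compiler.
    return list(base_args) + shlex.split(additional_compile_args)

# build_compile_args([], "--mlir-pretty-debuginfo --mlir-timing")
# -> ["--mlir-pretty-debuginfo", "--mlir-timing"]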
2023-09-19 11:26:03 -05:00
zjgarvey
9eceba69b7
local_tank_cache included into clear_all ( #1833 )
20230918.956
2023-09-18 00:27:23 -05:00
Ean Garvey
ca609afb6a
Update README.md ( #1830 )
20230917.955
20230916.954
20230915.953
20230914.951
20230914.950
2023-09-14 10:33:57 -05:00
Gaurav Shukla
11bdce9790
[flags] Fix vulkan runtime flags as vma is dropped from iree ( #1831 )
2023-09-14 08:58:59 -05:00
Ean Garvey
684943a4a6
(SD) Fix tokenizers imports in pyinstaller builds. ( #1828 )
...
* Fix tokenizers metadata.
* (SD) Disable VAE lowering configs (rdna3) and add versioned tunings.
* Update sd_annotation.py
* (SD) Add cv2 to spec.
* Update stencil pipeline with the new img2img arg.
20230913.949
20230912.948
2023-09-12 12:23:48 -05:00
PhaneeshB
b817bb8455
add roles for llama2
2023-09-12 10:59:28 +05:30
Ean Garvey
780f520f02
Fix vk.target_env extensions and remove redundant SD imports. ( #1826 )
...
* Remove redundant IREE runtime imports.
* Fix vulkan target env extensions.
20230911.946
2023-09-11 13:42:52 -05:00
Dom
c61b6f8d65
Code refactoring ( #1817 )
...
* use join
* fix bug
* further code optimizations
---------
Co-authored-by: Daniel Garvey <34486624+dan-garvey@users.noreply.github.com>
2023-09-11 11:30:56 -05:00
Abhishek Varma
c854208d49
[Llama2] Prefetch llama2 tokenizer configs ( #1824 )
...
-- This commit prefetches llama2 tokenizer configs from shark_tank.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
20230910.942
20230909.941
20230908.940
2023-09-08 11:29:54 -07:00
Gaurav Shukla
c5dcfc1f13
[vicuna] Exit when mlir is not present in shark tank ( #1825 )
...
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-08 10:30:29 -07:00
Abhishek Varma
bde63ee8ae
Add logging feature in WebUI ( #1821 )
2023-09-08 05:48:05 -07:00
Vivek Khandelwal
9681d494eb
Update decomp list and shark trainer for DLRM
20230907.939
20230907.938
2023-09-06 21:24:50 +05:30
Gaurav Shukla
ede6bf83e2
[vicuna] Disabling the IR generation path
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-06 20:13:17 +05:30
Ean Garvey
2c2693fb7d
Fix torchvision versioning in Linux importer setup. ( #1809 )
20230905.935
20230905.934
2023-09-05 12:57:03 -05:00
Vivek Khandelwal
1d31b2b2c6
Fix StableHLO Compilation flag
2023-09-05 21:32:33 +05:30
Gaurav Shukla
d2f64eefa3
[chatbot] Remove few outdated models from list ( #1814 )
20230904.932
2023-09-04 09:26:32 -07:00
Abhishek Varma
87ae14b6ff
[SD] Add sdpfa decomposition + update IREE flag
...
-- This commit adds Scaled Dot Product Flash Attention's decomposition
in shark_importer.
-- It also updates `iree-flow-enable-data-tiling` to `iree-opt-data-tiling`.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
20230904.931
2023-09-04 18:03:53 +05:30
Phaneesh Barwaria
1ccafa1fc1
fix llama2-70b rewrite tensor dim
20230903.930
20230903.929
20230902.928
20230901.927
2023-09-01 17:27:06 +05:30
jinchen62
4c3d8a0a7f
Enable downloading vmfb/mlir for webui ( #1807 )
20230831.925
20230831.924
2023-08-31 11:05:47 -07:00
jinchen62
3601dc7c3b
Fix llama2 13b combined ir ( #1803 )
20230830.923
20230829.922
20230829.921
20230828.920
2023-08-28 11:34:44 -07:00
Daniel Garvey
671881cf87
Llama2 70b ( #1783 )
...
* llama2 70b IR gen
* fix IR sec llama2 + debug
* llama270b
---------
Co-authored-by: PhaneeshB <b.phaneesh@gmail.com>
20230827.919
20230826.918
20230826.917
2023-08-25 23:04:28 -07:00
Gaurav Shukla
4e9be6be59
[chatbot] Add debug as class attribute ( #1799 )
...
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
20230825.916
20230825.915
2023-08-25 21:46:29 -07:00
Ean Garvey
9c8cbaf498
Add support for ROCM (Windows) in Studio + compile utils ( #1770 )
...
* WIP: MSVC ROCM support for SHARK Studio
* Make get_iree_rocm_args platform-agnostic.
* Update stable_args.py
* Update rocm arg handling in SD utils
* Guard quantization imports.
Co-authored-by: jam https://github.com/jammm
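The "Guard quantization imports." item is the usual optional-dependency pattern; a minimal, hypothetical sketch (module names are illustrative, not the actual SHARK code):

# Keep the quantization path optional so platforms without the extra
# packages (e.g. a Windows ROCm install) can still import the module.
try:
    import brevitas  # hypothetical optional dependency
    HAS_QUANT = True
except ImportError:
    HAS_QUANT = False

def maybe_quantize(model):
    if not HAS_QUANT:
        return model  # fall back to the unquantized path
    return model  # quantization would happen here in the real code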
20230825.914
2023-08-25 20:56:05 -07:00
Ean Garvey
9e348a114e
Revert changes process_skipfiles.py ( #1798 )
...
Keeps a small typo fix but reverts the rest of the changes to this file from 450c231171.
2023-08-25 15:31:49 -07:00
jinchen62
51f90a4d56
Update conversion passes for brevitas quant op ( #1795 )
2023-08-25 17:28:07 -05:00
Abhishek Varma
310d5d0a49
Fix llama2 13b crashing + add spec file for CLI execution of Llama ( #1797 )
...
* [Llama2] Add a fix for Llama2 13B downloading/crashing
-- This commit fixes llama2 13B downloading/crashing on the wrong
.mlir file.
-- Also adds support for downloading the vmfb from shark_tank in the CLI.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
* [llama2] Add a spec file to run Llama/Vicuna CLI exe
-- This commit adds a spec file to run the Llama/Vicuna CLI exe.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
---------
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
20230825.913
2023-08-25 09:36:09 -05:00