Jakub Kuderski
9c4610b9da
Add microbenchmark mode to vicuna CLI ( #1864 )
...
Add flags to enable a non-interactive mode for microbenchmarking llama
models. In this mode, the system and user prompts are specified with CLI
flags, and the numbers of generated tokens and iterations are fixed.
Also move the stats below the response and trim any response whitespace.
2023-10-05 00:12:08 -04:00
Gaurav Shukla
7cc9b3f8e8
[llama cli] Fix llama cli
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-10-03 20:39:53 +05:30
Gaurav Shukla
e54517e967
[UI] Disable config generator, lora train and model manager ( #1858 )
...
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-10-02 22:34:40 -07:00
Ean Garvey
326327a799
Collect pipeline submodules for diffusers ckpt preprocessing. ( #1859 )
2023-10-03 00:29:28 -04:00
Ean Garvey
785b65c7b0
Add flag for specifying device-local caching allocator heap key. ( #1856 )
2023-10-03 00:28:39 -04:00
Vivek Khandelwal
8dd7850c69
Add Falcon-GPTQ support
2023-10-02 16:39:57 +05:30
Gaurav Shukla
e930ba85b4
[os] Remove os dependency from vmfb naming ( #1854 )
...
Also fixes a small UI issue in the chatbot.
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-29 12:38:17 -05:00
Gaurav Shukla
cd732e7a38
[chatbot] split execution time to prefill and decode
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-29 13:18:03 +05:30
Gaurav Shukla
8e0f8b3227
[ui] Update chatbot UI
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-29 13:18:03 +05:30
Gaurav Shukla
b8210ef796
[chatbot] Re-instantiate the chatbot object if device id changes
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-29 13:18:03 +05:30
PhaneeshB
94594542a9
remove use of vulkaninfo
2023-09-28 21:57:00 +05:30
Gaurav Shukla
82f833e87d
[vulkan] Update vmfb naming
...
Update vmfb naming for vulkan devices in order to resolve naming
conflicts in the presence of multiple vulkan devices.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-28 14:52:11 +05:30
Vivek Khandelwal
c9d6870105
Modify falcon pipeline for 180b support
2023-09-28 12:39:35 +05:30
Nelson Sharpe
6773278ec2
Fix checkpoint_path unexpected argument ( #1832 )
2023-09-24 14:17:52 -07:00
Abhishek Varma
9a0efffcca
[Llama2] Fix wrong Vulkan device ID + Add Vulkan compile flags
...
-- This commit fixes the wrong Vulkan device being selected at
runtime.
-- It also adds a couple of IREE compilation flags to target a specific
Vulkan device.
-- It also changes the Vulkan device listing to be more in tune with
lowering control flow.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-09-22 22:24:18 +05:30
Quinn Dawkins
ded74d09cd
[vicuna.py] Keep past key values on device ( #1836 )
...
The past key values are only used within the models themselves and can
be kept on device. For vulkan int4, this gives 44 tok/s (for the first
prompt) and settles at around 26 tok/s on 7900xtx.
2023-09-19 18:17:41 -04:00
zjgarvey
9eceba69b7
local_tank_cache included into clear_all ( #1833 )
2023-09-18 00:27:23 -05:00
Ean Garvey
684943a4a6
(SD) Fix tokenizers imports in pyinstaller builds. ( #1828 )
...
* Fix tokenizers metadata.
* (SD) Disable VAE lowering configs (rdna3) and add versioned tunings.
* Update sd_annotation.py
* (SD) Add cv2 to spec.
* Update stencil pipeline with the new img2img arg.
2023-09-12 12:23:48 -05:00
PhaneeshB
b817bb8455
add roles for llama2
2023-09-12 10:59:28 +05:30
Ean Garvey
780f520f02
Fix vk.target_env extensions and remove redundant SD imports. ( #1826 )
...
* Remove redundant IREE runtime imports.
* Fix vulkan target env extensions.
2023-09-11 13:42:52 -05:00
Abhishek Varma
c854208d49
[Llama2] Prefetch llama2 tokenizer configs ( #1824 )
...
-- This commit prefetches llama2 tokenizer configs from shark_tank.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-09-08 11:29:54 -07:00
Gaurav Shukla
c5dcfc1f13
[vicuna] Exit when mlir is not present in shark tank ( #1825 )
...
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-08 10:30:29 -07:00
Abhishek Varma
bde63ee8ae
Add logging feature in WebUI ( #1821 )
2023-09-08 05:48:05 -07:00
Gaurav Shukla
ede6bf83e2
[vicuna] Disabling the IR generation path
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-06 20:13:17 +05:30
Gaurav Shukla
d2f64eefa3
[chatbot] Remove few outdated models from list ( #1814 )
2023-09-04 09:26:32 -07:00
Phaneesh Barwaria
1ccafa1fc1
fix llama2-70b rewrite tensor dim
2023-09-01 17:27:06 +05:30
jinchen62
4c3d8a0a7f
Enable downloading vmfb/mlir for webui ( #1807 )
2023-08-31 11:05:47 -07:00
jinchen62
3601dc7c3b
Fix llama2 13b combined ir ( #1803 )
2023-08-28 11:34:44 -07:00
Daniel Garvey
671881cf87
Llama2 70b ( #1783 )
...
* llama2 70b IR gen
* fix IR sec llama2 + debug
* llama270b
---------
Co-authored-by: PhaneeshB <b.phaneesh@gmail.com>
2023-08-25 23:04:28 -07:00
Gaurav Shukla
4e9be6be59
[chatbot] Add debug as class attribute ( #1799 )
...
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-25 21:46:29 -07:00
Ean Garvey
9c8cbaf498
Add support for ROCM (Windows) in Studio + compile utils ( #1770 )
...
* WIP: MSVC ROCM support for SHARK Studio
* Make get_iree_rocm_args platform-agnostic.
* Update stable_args.py
* Update rocm arg handling in SD utils
* Guard quantization imports.
Co-authored-by: jam https://github.com/jammm
2023-08-25 20:56:05 -07:00
jinchen62
51f90a4d56
Update conversion passes for brevitas quant op ( #1795 )
2023-08-25 17:28:07 -05:00
Abhishek Varma
310d5d0a49
Fix llama2 13b crashing + add spec file for CLI execution of Llama ( #1797 )
...
* [Llama2] Add a fix for Llama2 13B downloading/crashing
-- This commit fixes llama2 13B downloading the wrong .mlir file
and crashing on it.
-- Also adds support for downloading vmfb from shark_tank in CLI.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
* [llama2] Add a spec file to run Llama/Vicuna CLI exe
-- This commit adds a spec file to run Llama/Vicuna CLI exe.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
---------
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-08-25 09:36:09 -05:00
Ean Garvey
9697981004
Pipe through a debug option to iree compile utils. ( #1796 )
...
* Update compile_utils.py
* Pipe through a flag to toggle debug options in compile utils.
* Update SharkLLMBase.py
2023-08-25 07:11:11 -07:00
Ean Garvey
8e3860c9e6
Remove flags that are default in upstream IREE ( #1785 )
...
* Remove index bits flags now set by default
* Update shark_studio_imports.py
2023-08-24 11:57:54 -05:00
xzuyn
e37d6720eb
Add Hires Fix ( #1787 )
...
* improper test hiresfix
* add sliders & use `clear_cache`
* add resample choices & fix step adjustment
* add step adjustment to img2img
* add resample options to img2img
* simplify hiresfix
- import `img2img_inf` from `img2img_ui.py` instead of just copying it into `txt2img_ui.py`
* set `hri` to None after using
* add more resample types, and don't show output until hiresfix is done
* cleaner implementation
* ran black
* ran black again with jupyter dependencies
2023-08-24 09:01:41 -07:00
Vivek Khandelwal
16160d9a7d
Fix combine mlir script
2023-08-24 19:10:49 +05:30
Abhishek Varma
db990826d3
Add Llama2 13B int4 fp16 support ( #1784 )
...
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-08-23 10:00:32 -07:00
gpetters94
7ee3e4ba5d
Add stencil_unet_512 support ( #1778 )
...
This should fix any remaining issues with stencils and long prompts.
2023-08-22 12:23:46 -04:00
Vivek Khandelwal
05889a8fe1
Add LLaMa2-int4-fp16 support ( #1782 )
2023-08-22 07:45:50 -07:00
gpetters94
82b462de3a
Fix stencils for long prompts ( #1777 )
2023-08-19 00:26:51 -07:00
Daniel Garvey
d8f0f7bade
replace public with private ( #1776 )
...
unload footguns
2023-08-18 14:22:46 -07:00
gpetters94
79bd0b84a1
Fix an issue with diffusers>0.19.3 ( #1775 )
2023-08-18 14:06:06 -04:00
jinchen62
8738571d1e
Adapt the change of brevitas custom op name ( #1772 )
2023-08-17 14:24:43 -07:00
Gaurav Shukla
a4c354ce54
[version] Pin diffusers==0.19.3
...
Unpin it once the latest diffusers release works with LoRA training.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 21:27:10 +05:30
Gaurav Shukla
cc53efa89f
[cli] Fix chatbot cli
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 21:27:10 +05:30
Gaurav Shukla
9ae8bc921e
[chatbot] Fix chatbot cli and webview warning
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 21:27:10 +05:30
Gaurav Shukla
32eb78f0f9
[chatbot] Fix switching parameters in chatbot
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 19:14:17 +05:30
Ean Garvey
9dee7ae652
fix tkinter window ( #1766 )
2023-08-15 13:23:09 -07:00
Ean Garvey
343dfd901c
Update SHARK-Runtime links to SRT ( #1765 )
...
* Update nightly.yml
* Update setup_venv.ps1
* Update CMakeLists.txt
* Update shark_iree_profiling.md
* Update setup_venv.sh
* Update README.md
* Update .gitmodules
* Update CMakeLists.txt
* Update README.md
* fix signtool flags
* Update nightly.yml
* Update benchmark_utils.py
* uncomment tkinter launch
2023-08-15 12:40:44 -07:00