Gaurav Shukla
e930ba85b4
[os] Remove os dependency from vmfb naming (#1854)
Also fixes a small UI issue for the chatbot.
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-29 12:38:17 -05:00
Gaurav Shukla
cd732e7a38
[chatbot] split execution time to prefill and decode
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-29 13:18:03 +05:30
Gaurav Shukla
82f833e87d
[vulkan] Update vmfb naming
Update vmfb naming for vulkan devices in order to resolve naming
conflicts in the presence of multiple vulkan devices.
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-28 14:52:11 +05:30
Vivek Khandelwal
c9d6870105
Modify falcon pipeline for 180b support
2023-09-28 12:39:35 +05:30
Abhishek Varma
9a0efffcca
[Llama2] Fix wrong Vulkan device ID + Add Vulkan compile flags
-- This commit fixes the wrong Vulkan device being selected during
runtime.
-- It also adds a couple of IREE compilation flags to target a specific
Vulkan device.
-- It also changes the Vulkan device listing to be more in tune with
lowering control flow.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-09-22 22:24:18 +05:30
Quinn Dawkins
ded74d09cd
[vicuna.py] Keep past key values on device (#1836)
The past key values are only used within the models themselves and can
be kept on device. For vulkan int4, this gives 44 tok/s (for the first
prompt) and settles at around 26 tok/s on 7900xtx.
2023-09-19 18:17:41 -04:00
PhaneeshB
b817bb8455
add roles for llama2
2023-09-12 10:59:28 +05:30
Abhishek Varma
c854208d49
[Llama2] Prefetch llama2 tokenizer configs (#1824)
-- This commit prefetches llama2 tokenizer configs from shark_tank.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-09-08 11:29:54 -07:00
Gaurav Shukla
c5dcfc1f13
[vicuna] Exit when mlir is not present in shark tank (#1825)
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-08 10:30:29 -07:00
Gaurav Shukla
ede6bf83e2
[vicuna] Disabling the IR generation path
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-09-06 20:13:17 +05:30
Gaurav Shukla
d2f64eefa3
[chatbot] Remove a few outdated models from the list (#1814)
2023-09-04 09:26:32 -07:00
Phaneesh Barwaria
1ccafa1fc1
fix llama2-70b rewrite tensor dim
2023-09-01 17:27:06 +05:30
jinchen62
4c3d8a0a7f
Enable downloading vmfb/mlir for webui (#1807)
2023-08-31 11:05:47 -07:00
jinchen62
3601dc7c3b
Fix llama2 13b combined ir (#1803)
2023-08-28 11:34:44 -07:00
Daniel Garvey
671881cf87
Llama2 70b (#1783)
* llama2 70b IR gen
* fix IR sec llama2 + debug
* llama270b
---------
Co-authored-by: PhaneeshB <b.phaneesh@gmail.com>
2023-08-25 23:04:28 -07:00
Gaurav Shukla
4e9be6be59
[chatbot] Add debug as class attribute (#1799)
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-25 21:46:29 -07:00
Ean Garvey
9c8cbaf498
Add support for ROCM (Windows) in Studio + compile utils (#1770)
* WIP: MSVC ROCM support for SHARK Studio
* Make get_iree_rocm_args platform-agnostic.
* Update stable_args.py
* Update rocm arg handling in SD utils
* Guard quantization imports.
Co-authored-by: jam <https://github.com/jammm>
2023-08-25 20:56:05 -07:00
jinchen62
51f90a4d56
Update conversion passes for brevitas quant op (#1795)
2023-08-25 17:28:07 -05:00
Abhishek Varma
310d5d0a49
Fix llama2 13b crashing + add spec file for CLI execution of Llama (#1797)
* [Llama2] Add a fix for Llama2 13B downloading/crashing
-- This commit fixes downloading/crashing of llama2 13B due to a wrong
.mlir file.
-- Also adds support for downloading vmfb from shark_tank in CLI.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
* [llama2] Add a spec file to run Llama/Vicuna CLI exe
-- This commit adds a spec file to run Llama/Vicuna CLI exe.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
---------
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-08-25 09:36:09 -05:00
Ean Garvey
9697981004
Pipe through a debug option to iree compile utils. (#1796)
* Update compile_utils.py
* Pipe through a flag to toggle debug options in compile utils.
* Update SharkLLMBase.py
2023-08-25 07:11:11 -07:00
Vivek Khandelwal
16160d9a7d
Fix combine mlir script
2023-08-24 19:10:49 +05:30
Abhishek Varma
db990826d3
Add Llama2 13B int4 fp16 support (#1784)
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-08-23 10:00:32 -07:00
Vivek Khandelwal
05889a8fe1
Add LLaMa2-int4-fp16 support (#1782)
2023-08-22 07:45:50 -07:00
Daniel Garvey
d8f0f7bade
replace public with private (#1776)
unload footguns
2023-08-18 14:22:46 -07:00
jinchen62
8738571d1e
Adapt to the change of brevitas custom op name (#1772)
2023-08-17 14:24:43 -07:00
Gaurav Shukla
a4c354ce54
[version] Pin diffusers==0.19.3
Once the latest version works with LORA training, unpin it.
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 21:27:10 +05:30
Gaurav Shukla
cc53efa89f
[cli] Fix chatbot cli
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 21:27:10 +05:30
Gaurav Shukla
9ae8bc921e
[chatbot] Fix chatbot cli and webview warning
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 21:27:10 +05:30
Daniel Garvey
045c3c3852
enable iree-opt-const-expr-hoisting in vicuna (#1742)
Co-authored-by: powderluv <powderluv@users.noreply.github.com>
2023-08-14 18:43:42 -07:00
PhaneeshB
4f61d69d86
add support for passing iree flags for LLMs
2023-08-15 00:22:56 +05:30
Vivek Khandelwal
16f46f8de9
Update langchain_requirements.txt
2023-08-14 14:32:19 +05:30
Vivek Khandelwal
c4723f469f
Update langchain_requirements.txt
2023-08-14 14:32:19 +05:30
Vivek Khandelwal
d804f45a61
Update langchain_requirements.txt
2023-08-14 14:32:19 +05:30
George Petterson
75e68f02f4
Remove CUDNN
2023-08-14 14:32:19 +05:30
Gaurav Shukla
4dc9c59611
[chatbot] Add tokens generated per second (#1753)
2023-08-13 11:25:41 -07:00
Vivek Khandelwal
e8c1203be2
Fix vicuna script (#1745)
2023-08-10 06:11:14 -07:00
Vivek Khandelwal
e4d7abb519
Final patch for fixing Langchain token streaming issue (#1744)
2023-08-09 10:09:41 -07:00
Eliasj42
5203679f1f
Bandaid fix 2 (#1728)
* download all mlirs
* fixed install method
* download all mlirs (#1727)
Co-authored-by: Elias Joseph <elias@nod-labs.com>
* added tags
* fix name check for file existence
* Remove SD from all_models.csv (#1706)
Removes SD from pytests as it has its own test suite.
* gpt_langchain.py fixes for pydantic (#1722)
* removed dead code
---------
Co-authored-by: Elias Joseph <elias@nod-labs.com>
Co-authored-by: PhaneeshB <b.phaneesh@gmail.com>
Co-authored-by: Ean Garvey <87458719+monorimet@users.noreply.github.com>
Co-authored-by: Stefan Kapusniak <121311569+one-lithe-rune@users.noreply.github.com>
2023-08-08 12:14:57 -05:00
Vivek Khandelwal
bf073f8f37
[Langchain] Expand pipelines to fix token streaming issue
2023-08-08 10:27:23 +05:30
Stefan Kapusniak
9b8c4401b5
gpt_langchain.py fixes for pydantic (#1722)
2023-08-07 00:55:38 -07:00
Eliasj42
fd1c4db5d0
download all mlirs (#1727)
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2023-08-04 18:22:06 -05:00
Daniel Garvey
14fd0cdd87
add missing subprocess import (#1721)
2023-08-04 15:15:22 -05:00
Eliasj42
ed484b8253
added functionality for int8 vicuna and 4 shards (#1712)
combined vicuna_4_shards.py and vicuna.py to reduce code duplication
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2023-08-04 14:05:05 -05:00
gpetters94
7fe57ebaaf
Add vector database and add support on the web UI (#1699)
2023-08-04 13:47:19 -04:00
Gaurav Shukla
bd30044c0b
[Shard] Add sharding generation in shark studio
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-04 21:51:14 +05:30
Vivek Khandelwal
a5b13fcc2f
[Langchain] Patch for fixing streaming of tokens (#1709)
2023-08-03 10:06:49 -07:00
Stefan Kapusniak
6bb329c4af
Unsharded Vicuna: Fix Memory Error compiling mlir for lmsys/vicuna-7b-v1.3 fp16 with 64 GiB (#1702)
2023-08-01 06:07:56 -07:00
Vivek Khandelwal
98fb6c52df
Expand pipelines to fix streaming of tokens
2023-07-31 22:11:01 +05:30
Daniel Garvey
ab57af43c1
Couple of fixes for vicuna.py (#1696)
* mega vicuna merge pt 2
* add fallback to ensure compile is called
2023-07-27 15:53:05 -07:00
jinchen62
4d5c55dd9f
Fix vicuna script (#1697)
2023-07-27 17:24:26 -05:00