Vivek Khandelwal
05889a8fe1
Add LLaMa2-int4-fp16 support ( #1782 )
2023-08-22 07:45:50 -07:00
Daniel Garvey
d8f0f7bade
replace public with private ( #1776 )
...
unload footguns
2023-08-18 14:22:46 -07:00
jinchen62
8738571d1e
Adapt the change of brevitas custom op name ( #1772 )
2023-08-17 14:24:43 -07:00
Gaurav Shukla
a4c354ce54
[version] Pin diffusers==0.19.3
...
Once the latest works with LORA train, unpin it.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 21:27:10 +05:30
Gaurav Shukla
cc53efa89f
[cli] Fix chatbot cli
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 21:27:10 +05:30
Gaurav Shukla
9ae8bc921e
[chatbot] Fix chatbot cli and webview warning
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-17 21:27:10 +05:30
Daniel Garvey
045c3c3852
enable iree-opt-const-expr-hoisting in vicuna ( #1742 )
...
Co-authored-by: powderluv <powderluv@users.noreply.github.com>
2023-08-14 18:43:42 -07:00
PhaneeshB
4f61d69d86
add support passing iree flags for LLMs
2023-08-15 00:22:56 +05:30
Vivek Khandelwal
16f46f8de9
Update langchain_requirements.txt
2023-08-14 14:32:19 +05:30
Vivek Khandelwal
c4723f469f
Update langchain_requirements.txt
2023-08-14 14:32:19 +05:30
Vivek Khandelwal
d804f45a61
Update langchain_requirements.txt
2023-08-14 14:32:19 +05:30
George Petterson
75e68f02f4
Remove CUDNN
2023-08-14 14:32:19 +05:30
Gaurav Shukla
4dc9c59611
[chatbot] Add tokens generated per second ( #1753 )
2023-08-13 11:25:41 -07:00
Vivek Khandelwal
e8c1203be2
Fix vicuna script ( #1745 )
2023-08-10 06:11:14 -07:00
Vivek Khandelwal
e4d7abb519
Final patch for fixing Langchain token streaming issue ( #1744 )
2023-08-09 10:09:41 -07:00
Eliasj42
5203679f1f
Bandaid fix 2 ( #1728 )
...
* download all mlirs
* fixed install method
* download all mlirs (#1727 )
Co-authored-by: Elias Joseph <elias@nod-labs.com>
* added tags
* fix name check for file existence
* Remove SD from all_models.csv (#1706 )
Removes SD from pytests as it has its own test suite.
* gpt_langchain.py fixes for pydantic (#1722 )
* removed dead code
---------
Co-authored-by: Elias Joseph <elias@nod-labs.com>
Co-authored-by: PhaneeshB <b.phaneesh@gmail.com>
Co-authored-by: Ean Garvey <87458719+monorimet@users.noreply.github.com>
Co-authored-by: Stefan Kapusniak <121311569+one-lithe-rune@users.noreply.github.com>
2023-08-08 12:14:57 -05:00
Vivek Khandelwal
bf073f8f37
[Langchain] Expand pipelines to fix token streaming issue
2023-08-08 10:27:23 +05:30
Stefan Kapusniak
9b8c4401b5
gpt_langchain.py fixes for pydantic ( #1722 )
2023-08-07 00:55:38 -07:00
Eliasj42
fd1c4db5d0
download all mlirs ( #1727 )
...
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2023-08-04 18:22:06 -05:00
Daniel Garvey
14fd0cdd87
add missing subprocess import ( #1721 )
2023-08-04 15:15:22 -05:00
Eliasj42
ed484b8253
added functionality for int8 vicuna and 4 shards ( #1712 )
...
combined vicuna_4_shards.py and vicuna.py to reduce code duplication
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2023-08-04 14:05:05 -05:00
gpetters94
7fe57ebaaf
Add vector database and add support on the web UI ( #1699 )
2023-08-04 13:47:19 -04:00
Gaurav Shukla
bd30044c0b
[Shard] Add sharding generation in shark studio
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-08-04 21:51:14 +05:30
Vivek Khandelwal
a5b13fcc2f
[Langchain] Patch for fixing streaming of tokens ( #1709 )
2023-08-03 10:06:49 -07:00
Stefan Kapusniak
6bb329c4af
Unsharded Vicuna: Fix Memory Error compiling mlir for lmsys/vicuna-7b-v1.3 fp16 with 64 GiB ( #1702 )
2023-08-01 06:07:56 -07:00
Vivek Khandelwal
98fb6c52df
Expand pipelines to fix streaming of tokens
2023-07-31 22:11:01 +05:30
Daniel Garvey
ab57af43c1
Couple of fixes for vicuna.py ( #1696 )
...
* mega vicuna merge pt 2
* add fallback to ensure compile is called
2023-07-27 15:53:05 -07:00
jinchen62
4d5c55dd9f
Fix vicuna script ( #1697 )
2023-07-27 17:24:26 -05:00
Vivek Khandelwal
07399ad65c
[Langchain] Remove unused code ( #1698 )
2023-07-27 11:59:54 -05:00
Vivek Khandelwal
776a9c2293
Fix for Langchain ( #1694 )
...
For CPU, remove max time stopping criteria
Fix web UI issue
2023-07-26 09:00:23 -07:00
Eliasj42
9d399eb988
fixed bug where device_idx was hardcoded ( #1693 )
...
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2023-07-25 19:00:13 -05:00
Vivek Khandelwal
927b662aa7
Add Langchain SHARK Compilation support for all paths
2023-07-25 22:15:42 +05:30
Abhishek Varma
47f8a79c75
[MiniGPT4] Add MiniGPT4 to SHARK ( #1554 )
...
* [MiniGPT4] Add MiniGPT4 to SHARK
-- This is the first installment of MiniGPT4 in SHARK.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
* Add int8 support for MiniGPT4
-- This commit adds int8 support for MiniGPT4.
Signed-off-by: Abhishek Varma <abhishek@nod-lab.com>
* Update .spec for MiniGPT4's config files
* black format MiniGPT4
---------
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
Signed-off-by: Abhishek Varma <abhishek@nod-lab.com>
2023-07-25 09:42:27 -07:00
Daniel Garvey
453e46562f
mega vicuna merge pt 2 ( #1685 )
2023-07-24 12:42:20 -05:00
Vivek Khandelwal
f3cb63fc9c
Fix Langchain multiple device issue ( #1688 )
2023-07-24 08:03:46 -07:00
Vivek Khandelwal
d7092aafaa
Fix multiple issues for Langchain
...
This commit fixes the following issues for Langchain:
1.) Web UI not able to fetch results.
2.) Model getting reloaded for each query.
3.) SHARK module not using user provided device and precision.
4.) Create a class for main Langchain code.
5.) Misc issues
2023-07-21 21:56:27 +05:30
Vivek Khandelwal
a415f3f70e
Fix Langchain Prompt issue and add web UI support ( #1682 )
2023-07-21 06:36:55 -07:00
Vivek Khandelwal
c292e5c9d7
Add Langchain CPU support and update requirements
2023-07-20 18:53:34 +05:30
Vivek Khandelwal
03c4d9e171
Add support for Llama-2-70b for web and cli, and for hf_auth_token
2023-07-20 14:57:48 +05:30
jinchen62
3662224c04
Update brevitas requirement ( #1677 )
...
also clean up useless args
Co-authored-by: powderluv <powderluv@users.noreply.github.com>
2023-07-19 22:03:32 -07:00
Vivek Khandelwal
db3f222933
Revert "Add Llama2 70B option in CLI and WebUI ( #1673 )" ( #1679 )
...
This reverts commit 41e5088908.
2023-07-19 22:02:48 -07:00
Abhishek Varma
41e5088908
Add Llama2 70B option in CLI and WebUI ( #1673 )
2023-07-19 10:41:42 -07:00
PhaneeshB
0a8f7673f4
Add README for CodeGen server
2023-07-19 23:10:23 +05:30
PhaneeshB
c482ab78da
fix second vic clearing for low mem device
2023-07-19 23:10:23 +05:30
Vivek Khandelwal
4be80f7158
Add support for the Llama-2 model
2023-07-19 20:57:08 +05:30
Daniel Garvey
8927cb0a2c
set optional vmfb download ( #1667 )
2023-07-18 10:57:28 -07:00
Daniel Garvey
8c317e4809
fix cli for vicuna ( #1666 )
2023-07-18 10:03:40 -07:00
Vivek Khandelwal
b0136593df
Add support for different compilation paths for DocuChat ( #1665 )
2023-07-18 09:49:44 -07:00
Vivek Khandelwal
ab01f0f048
Add Langchain model in SHARK ( #1657 )
...
* Add H2OGPT
* Add UI tab for h2ogpt
* Add source files from h2ogpt
* Add the rest of the files
* Add h2ogpt support
* Add SHARK Compilation support for langchain model for cli mode
---------
Co-authored-by: George Petterson <gpetters@protonmail.com>
2023-07-17 09:58:15 -07:00
Phaneesh Barwaria
c471d17cca
codegen API ( #1655 )
2023-07-16 20:00:39 -07:00