Vivek Khandelwal
03c4d9e171
Add support for Llama-2-70b for web and cli, and for hf_auth_token
2023-07-20 14:57:48 +05:30
jinchen62
3662224c04
Update brevitas requirement (#1677)
...
also clean up useless args
Co-authored-by: powderluv <powderluv@users.noreply.github.com>
2023-07-19 22:03:32 -07:00
Vivek Khandelwal
db3f222933
Revert "Add Llama2 70B option in CLI and WebUI (#1673)" (#1679)
...
This reverts commit 41e5088908.
2023-07-19 22:02:48 -07:00
Abhishek Varma
41e5088908
Add Llama2 70B option in CLI and WebUI (#1673)
2023-07-19 10:41:42 -07:00
PhaneeshB
0a8f7673f4
Add README for CodeGen server
2023-07-19 23:10:23 +05:30
PhaneeshB
c482ab78da
fix second vic clearing for low mem device
2023-07-19 23:10:23 +05:30
Vivek Khandelwal
4be80f7158
Add support for the Llama-2 model
2023-07-19 20:57:08 +05:30
Daniel Garvey
8927cb0a2c
set optional vmfb download (#1667)
2023-07-18 10:57:28 -07:00
Daniel Garvey
8c317e4809
fix cli for vicuna (#1666)
2023-07-18 10:03:40 -07:00
Vivek Khandelwal
b0136593df
Add support for different compilation paths for DocuChat (#1665)
2023-07-18 09:49:44 -07:00
Vivek Khandelwal
ab01f0f048
Add Langchain model in SHARK (#1657)
...
* Add H2OGPT
* Add UI tab for h2ogpt
* Add source files from h2ogpt
* Add the rest of the files
* Add h2ogpt support
* Add SHARK Compilation support for langchain model for cli mode
---------
Co-authored-by: George Petterson <gpetters@protonmail.com>
2023-07-17 09:58:15 -07:00
Phaneesh Barwaria
c471d17cca
codegen API (#1655)
2023-07-16 20:00:39 -07:00
jinchen62
e20cd71314
Change to a separate pass to unpack quantized weights (#1652)
2023-07-15 04:54:53 -07:00
jinchen62
91027f8719
Remove done TODOs, a supplementary PR for #1644 (#1647)
2023-07-12 23:30:45 -07:00
jinchen62
247f69cf9d
Apply canonicalize for unpacking int4 (#1644)
...
- tested it unpacks int4 as expected
- tested it doesn't make difference on int8
2023-07-11 19:41:09 -07:00
PhaneeshB
3b8f7cc231
Add codegen support in UI + lint
2023-07-11 21:58:01 +05:30
PhaneeshB
6e8dbf72bd
mlir/vmfb path fixes for vic pipeline
2023-07-11 21:58:01 +05:30
PhaneeshB
1c7eecc981
add codegen support in vic pipeline
2023-07-11 21:58:01 +05:30
PhaneeshB
be417f0bf4
fix precision for fp16
2023-07-11 21:58:01 +05:30
jinchen62
47ec7275e6
Fix brevitas quantize argument (#1633)
2023-07-07 11:30:31 -07:00
Abhishek Varma
1b62dc4529
[Vicuna] Revert the formatting for Brevitas op (#1626)
...
-- This commit reverts the formatting for Brevitas op.
-- It also excludes vicuna.py script from `black` formatter.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-07-06 06:56:17 -07:00
Abhishek Varma
a1b1ce935c
int8 e2e for WebUI (#1620)
2023-07-05 07:08:36 -07:00
jinchen62
bc6fee1a0c
Add int4/int8 vicuna (#1598)
2023-07-05 07:01:51 -07:00
Eliasj42
4015793f84
changed method of compiling vicuna to remove first and second vicuna (#1611)
...
Co-authored-by: Elias Joseph <elias@nod-labs.com>
Co-authored-by: powderluv <powderluv@users.noreply.github.com>
2023-07-03 12:12:43 -07:00
jinchen62
534de05791
Update precision check for vicuna (#1610)
2023-06-29 16:16:33 -05:00
Daniel Garvey
5779e8c039
int4/int8 vicuna download support (#1609)
...
* set task_topology_max_group to cpu_count
by default. Can be overridden with a flag of the same str
* add download for int4/int8 mlir
2023-06-29 13:35:51 -07:00
Gaurav Shukla
1d6a1f9f8a
[vicuna] Add tokens streaming(step=3) (#1600)
...
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-06-27 08:59:27 -07:00
powderluv
726d73d6ba
Revert "[vicuna] Add streaming of tokens (#1587)" (#1588)
...
This reverts commit 4d55e51d46.
2023-06-23 10:29:00 -07:00
Gaurav Shukla
4d55e51d46
[vicuna] Add streaming of tokens (#1587)
...
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-06-23 08:20:46 -07:00
jinchen62
4002da7161
Add int4/int8 options to chatbot webui (#1586)
2023-06-23 07:18:34 -07:00
Eliasj42
8822b9acd7
added ability to use config file to shard vicuna (#1565)
...
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2023-06-22 17:40:35 -05:00
Daniel Garvey
0ca3b9fce3
fix some mmap and vicuna bugs (#1576)
2023-06-22 17:39:55 -05:00
Daniel Garvey
a202bb466a
fp16 fixes for webui (#1571)
2023-06-21 20:24:02 -07:00
Phaneesh Barwaria
88cc2423cc
Enable Vicuna fp16 cpu (#1562)
...
* fix second vic mlir gen
* fp16 mlir/vmfb download from shark_tank
2023-06-20 13:43:21 -05:00
Vivek Khandelwal
855435ee24
Fix for the user input for Falcon pipeline
2023-06-20 18:09:32 +05:30
Elias Joseph
6f9f868fc0
fixed a bug where designating device for vicuna didn't work
2023-06-20 17:09:32 +05:30
Vivek Khandelwal
fafd713141
Minor change to falcon pipeline
2023-06-19 22:36:32 +05:30
Vivek Khandelwal
015d0132c3
Modify falcon pipeline to add fp16 support (#1551)
2023-06-19 09:57:13 -07:00
Vivek Khandelwal
46184a81ac
Add Falcon pipeline (#1534)
2023-06-14 09:39:16 -07:00
PhaneeshB
149165a2f0
add multi-device multi-precision vmfb names
2023-06-14 22:08:24 +05:30
dan
bec82a665f
mega vicuna merge
...
single endpoint in apps/language/models/scripts/vicuna.py
removed main functions from pipelines
replaced divergent utils compile with shark_importer
adds support for different precisions
2023-06-14 19:06:29 +05:30
Nithin Meganathan
34f1295349
Add a model config generator (#1511)
...
Model config generator takes a PyTorch model as input and generates a JSON file with model layers and other properties that define sharding on particular hardware.
2023-06-09 15:32:00 -07:00
Phaneesh Barwaria
1980d7b2c3
Cpu device map (#1515)
...
* update cpu iree device
* fix vmfb paths vic unsharded
2023-06-09 11:27:02 -05:00
Phaneesh Barwaria
436f58ddc4
cli using generate and mem fixes (#1509)
2023-06-08 13:13:32 -05:00
Phaneesh Barwaria
6b29bd17c8
Enable compilation vicuna (#1507)
...
* add cli for unsharded vic
* enable mlir download and compile
2023-06-07 13:08:22 -07:00
Daniel Garvey
f206ecc635
reenable compilation in vicuna pipeline, add flags (#1505)
...
* replace vicuna.py backend with pipeline
* add some memory management to first vicuna compile
reenable compilation
2023-06-07 09:49:27 -07:00
PhaneeshB
f23b778a6c
remove old vicuna scripts
2023-06-06 21:35:58 +05:30
PhaneeshB
436edf900d
add vic sharded pipeline
2023-06-06 21:35:58 +05:30
Phaneesh Barwaria
a83808ddc5
Vicuna cuda on A100 40G (#1496)
...
* vic chat with memory management (precompiled vmfb)
* fix vmfb path and download
2023-06-06 15:10:33 +05:30
Phaneesh Barwaria
f0a4e59758
LLM Pipeline Wrapper (#1477)
...
* [LLM] Add LLM pipeline
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
* add base pipeline and stableLM
* StableLM on UI - full block
* add SLM default model name
* add vicuna with pipeline
* add one token gen api for vic
* Fix stableLM bugs
* debug vic memory
* lint fix
---------
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
Co-authored-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-05-31 10:17:20 -07:00