Vivek Khandelwal
03c4d9e171
Add support for Llama-2-70b for web and cli, and for hf_auth_token
2023-07-20 14:57:48 +05:30
jinchen62
3662224c04
Update brevitas requirement (#1677)
Also clean up unused args
Co-authored-by: powderluv <powderluv@users.noreply.github.com>
2023-07-19 22:03:32 -07:00
Vivek Khandelwal
db3f222933
Revert "Add Llama2 70B option in CLI and WebUI (#1673)" (#1679)
This reverts commit 41e5088908.
2023-07-19 22:02:48 -07:00
Abhishek Varma
41e5088908
Add Llama2 70B option in CLI and WebUI (#1673)
2023-07-19 10:41:42 -07:00
PhaneeshB
c482ab78da
fix second vic clearing for low mem device
2023-07-19 23:10:23 +05:30
Vivek Khandelwal
4be80f7158
Add support for the Llama-2 model
2023-07-19 20:57:08 +05:30
Daniel Garvey
8927cb0a2c
set optional vmfb download (#1667)
2023-07-18 10:57:28 -07:00
Daniel Garvey
8c317e4809
fix cli for vicuna (#1666)
2023-07-18 10:03:40 -07:00
Phaneesh Barwaria
c471d17cca
codegen API (#1655)
2023-07-16 20:00:39 -07:00
jinchen62
e20cd71314
Change to a separate pass to unpack quantized weights (#1652)
2023-07-15 04:54:53 -07:00
jinchen62
91027f8719
Remove done TODOs, a follow-up PR for #1644 (#1647)
2023-07-12 23:30:45 -07:00
jinchen62
247f69cf9d
Apply canonicalize for unpacking int4 (#1644)
- tested it unpacks int4 as expected
- tested it doesn't make a difference on int8
2023-07-11 19:41:09 -07:00
PhaneeshB
3b8f7cc231
Add codegen support in UI + lint
2023-07-11 21:58:01 +05:30
PhaneeshB
6e8dbf72bd
mlir/vmfb path fixes for vic pipeline
2023-07-11 21:58:01 +05:30
PhaneeshB
1c7eecc981
add codegen support in vic pipeline
2023-07-11 21:58:01 +05:30
PhaneeshB
be417f0bf4
fix precision for fp16
2023-07-11 21:58:01 +05:30
jinchen62
47ec7275e6
Fix brevitas quantize argument (#1633)
2023-07-07 11:30:31 -07:00
Abhishek Varma
1b62dc4529
[Vicuna] Revert the formatting for Brevitas op (#1626)
-- This commit reverts the formatting for Brevitas op.
-- It also excludes the vicuna.py script from the `black` formatter.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-07-06 06:56:17 -07:00
Abhishek Varma
a1b1ce935c
int8 e2e for WebUI (#1620)
2023-07-05 07:08:36 -07:00
jinchen62
bc6fee1a0c
Add int4/int8 vicuna (#1598)
2023-07-05 07:01:51 -07:00
Eliasj42
8822b9acd7
added ability to use config file to shard vicuna (#1565)
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2023-06-22 17:40:35 -05:00
Daniel Garvey
0ca3b9fce3
fix some mmap and vicuna bugs (#1576)
2023-06-22 17:39:55 -05:00
PhaneeshB
149165a2f0
add multi-device multi-precision vmfb names
2023-06-14 22:08:24 +05:30
dan
bec82a665f
mega vicuna merge
single endpoint in apps/language/models/scripts/vicuna.py
removed main functions from pipelines
replaced divergent utils compile with shark_importer
adds support for different precisions
2023-06-14 19:06:29 +05:30
PhaneeshB
f23b778a6c
remove old vicuna scripts
2023-06-06 21:35:58 +05:30
Elias Joseph
73cd7e8320
added full vicuna to vicuna.py
2023-05-26 22:06:40 +05:30
PhaneeshB
6d64b8e273
vic and slm common generation base
2023-05-25 20:29:41 +05:30
PhaneeshB
a8ea0326f5
correct SLM saved vmfb naming
2023-05-25 20:29:41 +05:30
PhaneeshB
eb360e255d
remove unused imports
2023-05-25 20:29:41 +05:30
PhaneeshB
a6f88d7f72
refactor mlir compile
2023-05-25 20:29:41 +05:30
Phaneesh Barwaria
f5ce121988
SLM on Sharkstudio (#1454)
* localize import, fix file reading, device cpu
* extract out model args
2023-05-19 11:21:08 -07:00
PhaneeshB
09bea17e59
fix #2 SLM in SharkStudio
2023-05-18 00:56:22 +05:30
Daniel Garvey
aefcf80b48
swap to cpu and remove hardcoded paths (#1448)
Co-authored-by: powderluv <powderluv@users.noreply.github.com>
2023-05-17 10:53:34 -07:00
PhaneeshB
6602a2f5ba
add continuous output for CLI
2023-05-17 18:33:46 +05:30
powderluv
8ee2ac89f8
Rename sharded_vicuna_fp32_web.py to vicuna_web.py
2023-05-16 09:41:35 -07:00
powderluv
60cb48be2e
Rename sharded_vicuna_fp32.py to vicuna.py
2023-05-16 09:40:51 -07:00
powderluv
86a215b063
Delete sharded_vicunia.py
2023-05-16 09:37:39 -07:00
powderluv
d6e3a9a236
Delete standalone_vicuna.py
2023-05-16 09:37:26 -07:00
Daniel Garvey
4731c1a835
prevent loading tokenizer on import (#1432)
also adds sentencepiece dep for exe
moved vicuna imports to after an if statement
in general we should avoid importing files that load whole models as global variables
2023-05-12 19:11:45 -07:00
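The lazy-import practice this commit describes, deferring model-loading imports until a code path actually needs them, can be sketched roughly as follows (the `vicuna_model` module and `load_tokenizer` helper are hypothetical names used only for illustration):

```python
# Sketch of deferring a heavy import so that importing this module does not
# itself load a whole model or tokenizer as a global variable.
# NOTE: `vicuna_model` and `load_tokenizer` are hypothetical placeholders,
# not real names from this repository.

def respond(prompt: str, use_vicuna: bool = False) -> str:
    if use_vicuna:
        # The expensive import runs only when this branch is taken,
        # not at module-import time.
        from vicuna_model import load_tokenizer  # hypothetical module
        tokenizer = load_tokenizer()
        return tokenizer.decode(tokenizer.encode(prompt))
    # Without the flag, no model code is ever imported.
    return prompt
```

Callers that never pass `use_vicuna=True` pay no model-loading cost at import time, which is the point of moving the import behind the `if`.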
Gaurav Shukla
e0cc2871bb
[SD] Yield 2 tokens at a time in vicuna
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-05-11 23:49:01 +05:30
Gaurav Shukla
649f39408b
[SD] Fix vicuna response
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-05-11 18:06:21 +05:30
Gaurav Shukla
9e07360b00
[SD] Standalone vicuna with web
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-05-11 17:23:44 +05:30
Eliasj42
fa833f8366
fixed spacing issue with chat-bot (#1417)
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2023-05-10 16:07:50 -07:00
Gaurav Shukla
fcb059aa38
[SD] Integrate vicuna in the web (#1410)
2023-05-10 11:30:22 -07:00
PhaneeshB
517c670f82
vicuna chat cli
2023-05-10 22:55:06 +05:30
Eliasj42
59df14f18b
added vicuna demo (#1408)
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2023-05-09 21:18:20 -07:00
Daniel Garvey
7a4a51ae73
vulkan vic f16 (#1404)
Co-authored-by: dan <dan@nod-labs.com>
2023-05-08 16:46:53 -07:00
Eliasj42
54ce3d48ca
added standalone vicuna script (#1401)
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2023-05-05 18:05:52 -05:00
Gaurav Shukla
fed63dfd4b
[SD] Add stableLM chatbot (#1383)
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
Co-authored-by: powderluv <powderluv@users.noreply.github.com>
2023-05-03 15:37:20 -05:00
Vivek Khandelwal
d2f7e03b7e
Add StableLM model (#1331)
2023-04-21 09:51:02 -07:00