Abhishek Varma
1b62dc4529
[Vicuna] Revert the formatting for Brevitas op (#1626)
...
-- This commit reverts the formatting for Brevitas op.
-- It also excludes vicuna.py script from `black` formatter.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-07-06 06:56:17 -07:00
Abhishek Varma
a1b1ce935c
int8 e2e for WebUI (#1620)
2023-07-05 07:08:36 -07:00
jinchen62
bc6fee1a0c
Add int4/int8 vicuna (#1598)
2023-07-05 07:01:51 -07:00
Eliasj42
4015793f84
changed method of compiling vicuna to remove first and second vicuna (#1611)
...
Co-authored-by: Elias Joseph <elias@nod-labs.com>
Co-authored-by: powderluv <powderluv@users.noreply.github.com>
2023-07-03 12:12:43 -07:00
jinchen62
534de05791
Update precision check for vicuna (#1610)
2023-06-29 16:16:33 -05:00
Daniel Garvey
5779e8c039
int4/int8 vicuna download support (#1609)
...
* set task_topology_max_group to cpu_count
by default. Can be overridden with a flag of the same str
* add download for int4/int8 mlir
2023-06-29 13:35:51 -07:00
Gaurav Shukla
1d6a1f9f8a
[vicuna] Add tokens streaming(step=3) (#1600)
...
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-06-27 08:59:27 -07:00
powderluv
726d73d6ba
Revert "[vicuna] Add streaming of tokens (#1587)" (#1588)
...
This reverts commit 4d55e51d46.
2023-06-23 10:29:00 -07:00
Gaurav Shukla
4d55e51d46
[vicuna] Add streaming of tokens (#1587)
...
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-06-23 08:20:46 -07:00
jinchen62
4002da7161
Add int4/int8 options to chatbot webui (#1586)
2023-06-23 07:18:34 -07:00
Eliasj42
8822b9acd7
added ability to use config file to shard vicuna (#1565)
...
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2023-06-22 17:40:35 -05:00
Daniel Garvey
0ca3b9fce3
fix some mmap and vicuna bugs (#1576)
2023-06-22 17:39:55 -05:00
Daniel Garvey
a202bb466a
fp16 fixes for webui (#1571)
2023-06-21 20:24:02 -07:00
Phaneesh Barwaria
88cc2423cc
Enable Vicuna fp16 cpu (#1562)
...
* fix second vic mlir gen
* fp16 mlir/vmfb download from shark_tank
2023-06-20 13:43:21 -05:00
Vivek Khandelwal
855435ee24
Fix for the user input for Falcon pipeline
2023-06-20 18:09:32 +05:30
Elias Joseph
6f9f868fc0
fixed a bug where designating device for vicuna didn't work
2023-06-20 17:09:32 +05:30
Vivek Khandelwal
fafd713141
Minor change to falcon pipeline
2023-06-19 22:36:32 +05:30
Vivek Khandelwal
015d0132c3
Modify falcon pipeline to add fp16 support (#1551)
2023-06-19 09:57:13 -07:00
Vivek Khandelwal
46184a81ac
Add Falcon pipeline (#1534)
2023-06-14 09:39:16 -07:00
PhaneeshB
149165a2f0
add multi-device multi-precision vmfb names
2023-06-14 22:08:24 +05:30
dan
bec82a665f
mega vicuna merge
...
single endpoint in apps/language/models/scripts/vicuna.py
removed main functions from pipelines
replaced divergent utils compile with shark_importer
adds support for different precisions
2023-06-14 19:06:29 +05:30
Nithin Meganathan
34f1295349
Add a model config generator (#1511)
...
Model config generator takes a PyTorch model as input and generates a JSON file with model layers and other properties that define sharding on a particular hardware.
2023-06-09 15:32:00 -07:00
Phaneesh Barwaria
1980d7b2c3
Cpu device map (#1515)
...
* update cpu iree device
* fix vmfb paths vic unsharded
2023-06-09 11:27:02 -05:00
Phaneesh Barwaria
436f58ddc4
cli using generate and mem fixes (#1509)
2023-06-08 13:13:32 -05:00
Phaneesh Barwaria
6b29bd17c8
Enable compilation vicuna (#1507)
...
* add cli for unsharded vic
* enable mlir download and compile
2023-06-07 13:08:22 -07:00
Daniel Garvey
f206ecc635
reenable compilation in vicuna pipeline, add flags (#1505)
...
* replace vicuna.py backend with pipeline
* add some memory management to first vicuna compile
reenable compilation
2023-06-07 09:49:27 -07:00
PhaneeshB
f23b778a6c
remove old vicuna scripts
2023-06-06 21:35:58 +05:30
PhaneeshB
436edf900d
add vic sharded pipeline
2023-06-06 21:35:58 +05:30
Phaneesh Barwaria
a83808ddc5
Vicuna cuda on A100 40G (#1496)
...
* vic chat with memory management (precompiled vmfb)
* fix vmfb path and download
2023-06-06 15:10:33 +05:30
Phaneesh Barwaria
f0a4e59758
LLM Pipeline Wrapper (#1477)
...
* [LLM] Add LLM pipeline
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
* add base pipeline and stableLM
* StableLM on UI - full block
* add SLM default model name
* add vicuna with pipeline
* add one token gen api for vic
* Fix stableLM bugs
* debug vic memory
* lint fix
---------
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
Co-authored-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-05-31 10:17:20 -07:00
Elias Joseph
73cd7e8320
added full vicuna to vicuna.py
2023-05-26 22:06:40 +05:30
PhaneeshB
6d64b8e273
vic and slm common generation base
2023-05-25 20:29:41 +05:30
PhaneeshB
a8ea0326f5
correct SLM saved vmfb naming
2023-05-25 20:29:41 +05:30
PhaneeshB
58e9194553
add Lists import
2023-05-25 20:29:41 +05:30
PhaneeshB
eb360e255d
remove unused imports
2023-05-25 20:29:41 +05:30
PhaneeshB
a6f88d7f72
refactor mlir compile
2023-05-25 20:29:41 +05:30
Phaneesh Barwaria
f5ce121988
SLM on Sharkstudio (#1454)
...
* localize import, fix file reading, device cpu
* extract out model args
2023-05-19 11:21:08 -07:00
PhaneeshB
09bea17e59
fix #2 SLM in SharkStudio
2023-05-18 00:56:22 +05:30
Daniel Garvey
aefcf80b48
swap to cpu and remove hardcoded paths (#1448)
...
Co-authored-by: powderluv <powderluv@users.noreply.github.com>
2023-05-17 10:53:34 -07:00
PhaneeshB
6602a2f5ba
add continuous output for CLI
2023-05-17 18:33:46 +05:30
powderluv
8ee2ac89f8
Rename sharded_vicuna_fp32_web.py to vicuna_web.py
2023-05-16 09:41:35 -07:00
powderluv
60cb48be2e
Rename sharded_vicuna_fp32.py to vicuna.py
2023-05-16 09:40:51 -07:00
powderluv
86a215b063
Delete sharded_vicunia.py
2023-05-16 09:37:39 -07:00
powderluv
d6e3a9a236
Delete standalone_vicuna.py
2023-05-16 09:37:26 -07:00
Daniel Garvey
4731c1a835
prevent loading tokenizer on import (#1432)
...
also adds sentencepiece dep for exe
moved vicuna imports to after an if statement
in general we should avoid importing files that load whole models as
global variables
2023-05-12 19:11:45 -07:00
Gaurav Shukla
e0cc2871bb
[SD] Yield 2 tokens at a time in vicuna
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-05-11 23:49:01 +05:30
Gaurav Shukla
649f39408b
[SD] Fix vicuna response
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-05-11 18:06:21 +05:30
Gaurav Shukla
9e07360b00
[SD] Standalone vicuna with web
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2023-05-11 17:23:44 +05:30
Eliasj42
fa833f8366
fixed spacing issue with chat-bot (#1417)
...
Co-authored-by: Elias Joseph <elias@nod-labs.com>
2023-05-10 16:07:50 -07:00
Gaurav Shukla
fcb059aa38
[SD] Integrate vicuna in the web (#1410)
2023-05-10 11:30:22 -07:00