AMD-SHARK-Studio

mirror of https://github.com/nod-ai/AMD-SHARK-Studio.git synced 2026-04-03 03:00:17 -04:00

Author	SHA1	Message	Date
Vivek Khandelwal	03c4d9e171	Add support for Llama-2-70b for web and cli, and for hf_auth_token	2023-07-20 14:57:48 +05:30
jinchen62	3662224c04	Update brevitas requirement (#1677) also clean up useless args Co-authored-by: powderluv <powderluv@users.noreply.github.com>	2023-07-19 22:03:32 -07:00
Vivek Khandelwal	db3f222933	Revert "Add Llama2 70B option in CLI and WebUI (#1673)" (#1679) This reverts commit `41e5088908`.	2023-07-19 22:02:48 -07:00
Abhishek Varma	41e5088908	Add Llama2 70B option in CLI and WebUI (#1673)	2023-07-19 10:41:42 -07:00
PhaneeshB	0a8f7673f4	Add README for CodeGen server	2023-07-19 23:10:23 +05:30
PhaneeshB	c482ab78da	fix second vic clearing for low mem device	2023-07-19 23:10:23 +05:30
Vivek Khandelwal	4be80f7158	Add support for the Llama-2 model	2023-07-19 20:57:08 +05:30
Daniel Garvey	8927cb0a2c	set optional vmfb download (#1667)	2023-07-18 10:57:28 -07:00
Daniel Garvey	8c317e4809	fix cli for vicuna (#1666)	2023-07-18 10:03:40 -07:00
Vivek Khandelwal	b0136593df	Add support for different compilation paths for DocuChat (#1665)	2023-07-18 09:49:44 -07:00
Vivek Khandelwal	ab01f0f048	Add Langchain model in SHARK (#1657) * Add H2OGPT * Add UI tab for h2ogpt * Add source files from h2ogpt * Add the rest of the files * Add h2ogpt support * Add SHARK Compilation support for langchain model for cli mode --------- Co-authored-by: George Petterson <gpetters@protonmail.com>	2023-07-17 09:58:15 -07:00
Phaneesh Barwaria	c471d17cca	codegen API (#1655)	2023-07-16 20:00:39 -07:00
jinchen62	e20cd71314	Change to a separate pass to unpack quantized weights (#1652)	2023-07-15 04:54:53 -07:00
jinchen62	91027f8719	Remove done TODOs, a sup PR for #1644 (#1647)	2023-07-12 23:30:45 -07:00
jinchen62	247f69cf9d	Apply canonicalize for unpacking int4 (#1644) - tested it unpacks int4 as expected - tested it doesn't make difference on int8	2023-07-11 19:41:09 -07:00
PhaneeshB	3b8f7cc231	Add codegen support in UI + lint	2023-07-11 21:58:01 +05:30
PhaneeshB	6e8dbf72bd	mlir/vmfb path fixes for vic pipeline	2023-07-11 21:58:01 +05:30
PhaneeshB	1c7eecc981	add codegen support in vic pipeline	2023-07-11 21:58:01 +05:30
PhaneeshB	be417f0bf4	fix precision for fp16	2023-07-11 21:58:01 +05:30
jinchen62	47ec7275e6	Fix brevitas quantize argument (#1633)	2023-07-07 11:30:31 -07:00
Abhishek Varma	1b62dc4529	[Vicuna] Revert the formatting for Brevitas op (#1626) -- This commit reverts the formatting for Brevitas op. -- It also excludes vicuna.py script from `black` formatter. Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>	2023-07-06 06:56:17 -07:00
Abhishek Varma	a1b1ce935c	int8 e2e for WebUI (#1620)	2023-07-05 07:08:36 -07:00
jinchen62	bc6fee1a0c	Add int4/int8 vicuna (#1598)	2023-07-05 07:01:51 -07:00
Eliasj42	4015793f84	changed method of compiling vicuna to remove first and second vicuna (#1611) Co-authored-by: Elias Joseph <elias@nod-labs.com> Co-authored-by: powderluv <powderluv@users.noreply.github.com>	2023-07-03 12:12:43 -07:00
jinchen62	534de05791	Update precision check for vicuna (#1610)	2023-06-29 16:16:33 -05:00
Daniel Garvey	5779e8c039	int4/int8 vicuna download support (#1609) * set task_topology_max_group to cpu_count by default. Can be overriden with a flag of the same str * add download for int4/int8 mlir	2023-06-29 13:35:51 -07:00
Gaurav Shukla	1d6a1f9f8a	[vicuna] Add tokens streaming(step=3) (#1600) Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>	2023-06-27 08:59:27 -07:00
powderluv	726d73d6ba	Revert "[vicuna] Add streaming of tokens (#1587)" (#1588) This reverts commit `4d55e51d46`.	2023-06-23 10:29:00 -07:00
Gaurav Shukla	4d55e51d46	[vicuna] Add streaming of tokens (#1587) Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>	2023-06-23 08:20:46 -07:00
jinchen62	4002da7161	Add int4/int8 options to chatbot webui (#1586)	2023-06-23 07:18:34 -07:00
Eliasj42	8822b9acd7	added ability to use config file to shard vicuna (#1565) Co-authored-by: Elias Joseph <elias@nod-labs.com>	2023-06-22 17:40:35 -05:00
Daniel Garvey	0ca3b9fce3	fix some mmap and vicuna bugs (#1576)	2023-06-22 17:39:55 -05:00
Daniel Garvey	a202bb466a	fp16 fixes for webui (#1571)	2023-06-21 20:24:02 -07:00
Phaneesh Barwaria	88cc2423cc	Enable Vicuna fp16 cpu (#1562) * fix second vic mlir gen * fp16 mlir/vmfb download from shark_tank	2023-06-20 13:43:21 -05:00
Vivek Khandelwal	855435ee24	Fix for the user input for Falcon pipeline	2023-06-20 18:09:32 +05:30
Elias Joseph	6f9f868fc0	fixed a bug where designating device for vicuna didn't work	2023-06-20 17:09:32 +05:30
Vivek Khandelwal	fafd713141	Minor change to falcon pipeline	2023-06-19 22:36:32 +05:30
Vivek Khandelwal	015d0132c3	Modify falcon pipeline to add fp16 support (#1551)	2023-06-19 09:57:13 -07:00
Vivek Khandelwal	46184a81ac	Add Falcon pipeline (#1534)	2023-06-14 09:39:16 -07:00
PhaneeshB	149165a2f0	add multi-device mutli-precision vmfb names	2023-06-14 22:08:24 +05:30
dan	bec82a665f	mega vicuna merge single endpoint in apps/language/models/scripts/vicuna.py removed main functions from pipelines replaced divergent utils compile with shark_importer adds support for different precisions	2023-06-14 19:06:29 +05:30
Nithin Meganathan	34f1295349	Add a model config generator (#1511) Model config generator takes a PyTorch model as input and generates a JSON file with model layers and other propperties that define sharding on a particular hardware.	2023-06-09 15:32:00 -07:00
Phaneesh Barwaria	1980d7b2c3	Cpu device map (#1515) * update cpu iree device * fix vmfb paths vic unsharded	2023-06-09 11:27:02 -05:00
Phaneesh Barwaria	436f58ddc4	cli using generate and mem fixes (#1509)	2023-06-08 13:13:32 -05:00
Phaneesh Barwaria	6b29bd17c8	Enable compilation vicuna (#1507) * add cli for unsharded vic * enable mlir download and compile	2023-06-07 13:08:22 -07:00
Daniel Garvey	f206ecc635	reenable compilation in vicuna pipeline, add flags (#1505) * replace vicuna.py backend with pipeline * add some memory management to fist vicuna compile reenable compilation	2023-06-07 09:49:27 -07:00
PhaneeshB	f23b778a6c	remove old vicuna scripts	2023-06-06 21:35:58 +05:30
PhaneeshB	436edf900d	add vic sharded pipeline	2023-06-06 21:35:58 +05:30
Phaneesh Barwaria	a83808ddc5	Vicuna cuda on A100 40G (#1496) * vic chat with memory management (precompiled vmfb) * fix vmfb path and download	2023-06-06 15:10:33 +05:30
Phaneesh Barwaria	f0a4e59758	LLM Pipeline Wrapper (#1477) * [LLM] Add LLM pipeline Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com> * add base pipeline and stableLM * StableLM on UI - full block * add SLM default model name * add vicuna with pipeline * add one token gen api for vic * Fix stableLM bugs * debug vic memory * lint fix --------- Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com> Co-authored-by: Gaurav Shukla <gaurav@nod-labs.com>	2023-05-31 10:17:20 -07:00

76 Commits