AMD-SHARK-Studio

mirror of https://github.com/nod-ai/AMD-SHARK-Studio.git synced 2026-04-03 03:00:17 -04:00

Author	SHA1	Message	Date
PhaneeshB	a53e714ddf	fix combine mlir for llama2	2023-11-20 14:00:55 +05:30
jinchen62	dd37c26d36	Update brevitas quant api (#1975 )	2023-11-15 10:04:07 -08:00
PhaneeshB	54bff4611d	fix cli rocm device selection	2023-11-13 23:35:55 +05:30
PhaneeshB	11510d5111	add intra rocm vmfb differentiator	2023-11-13 23:35:55 +05:30
PhaneeshB	392bade0bf	enable non default rocm device selection for webui	2023-11-13 23:35:55 +05:30
dependabot[bot]	df20cf9c8a	Bump langchain in /apps/language_models/langchain (#1968 ) Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.0.325 to 0.0.329. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/v0.0.325...v0.0.329) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-11-12 19:46:00 -08:00
dependabot[bot]	f41ad87ef6	Bump langchain in /apps/language_models/langchain (#1926 ) Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.0.202 to 0.0.325. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/v0.0.202...v0.0.325) --- updated-dependencies: - dependency-name: langchain dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-11-09 11:03:47 -06:00
dependabot[bot]	d811524a00	Bump pypdf from 3.12.2 to 3.17.0 in /apps/language_models/langchain (#1929 ) Bumps [pypdf](https://github.com/py-pdf/pypdf) from 3.12.2 to 3.17.0. - [Release notes](https://github.com/py-pdf/pypdf/releases) - [Changelog](https://github.com/py-pdf/pypdf/blob/main/CHANGELOG.md) - [Commits](https://github.com/py-pdf/pypdf/compare/3.12.2...3.17.0) --- updated-dependencies: - dependency-name: pypdf dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-11-09 11:02:43 -06:00
PhaneeshB	ab0e870c43	fix vicuna cli vulkan	2023-11-09 22:27:13 +05:30
Jakub Kuderski	488a172292	[vicuna.py] Allow to pass extra arguments to iree-compile (#1935 ) Add a new flag `-Xiree_compile` to forward extra compiler arguments to `iree-compile`. This flag can be set multiple times to pass more than one extra argument.	2023-11-06 12:12:34 -05:00
Vivek Khandelwal	92b694db4d	Add support for Falcon-40b-GPTQ	2023-11-06 19:49:19 +05:30
Vivek Khandelwal	322874f7f9	Fix issue in Falcon-GPTQ	2023-11-03 11:48:36 +05:30
Vivek Khandelwal	71846344a2	Add sharded Falcon-GPTQ support This commit adds the support for sharded Falcon-7b-GPTQ and Falcon-180B-GPTQ. This commit also adds the support for 4-way sharding of the Falcon model for the device ROCM. Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>	2023-11-01 12:11:44 +05:30
Vivek Khandelwal	ea920f2955	Add sharded Falcon support	2023-10-26 21:53:25 +05:30
Vivek Khandelwal	205e57683a	Modify Falcon-180b-GPTQ sharded pipeline	2023-10-17 20:26:01 +05:30
Vivek Khandelwal	2866d665ee	Fix Sharded Falcon-180b-GPTQ Pipeline	2023-10-17 20:26:01 +05:30
Vivek Khandelwal	202ffff67b	Add support for sharded Falcon model	2023-10-13 22:05:10 +05:30
Vivek Khandelwal	b83d32fafe	Fix Falcon GPTQ Pipeline	2023-10-11 20:09:32 +05:30
Vivek Khandelwal	0a618e1863	Add support for Falcon GPTQ	2023-10-11 10:47:48 +05:30
Gaurav Shukla	6e409bfb77	fix else if syntax error Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>	2023-10-10 06:23:56 +05:30
Ean Garvey	66f6e79d68	Split CPU/GPU definitions conditionally outside of torch contexts. (#1879 )	2023-10-09 16:46:41 -07:00
Ean Garvey	3b825579a7	(LLaMa-2) Point to int4 + f32 acc .mlir for cpu (#1878 ) - fixes some issues with non-system prompt invocation Co-authored-by: Gaurav Shukla <gauravshukla789@gmail.com>	2023-10-09 14:37:35 -05:00
Ean Garvey	caf6cc5d8f	Switch most compile flows to use ireec.compile_file. (#1863 ) * Switch most compile flows to use ireec.compile_file. * re-add input type to compile_str path. * Check if mlir_module exists before checking if it's a path or pyobject. * Fix some save_dir cases	2023-10-06 23:04:43 -05:00
Ean Garvey	8614a18474	Remove tf dependencies from importer path. (#1874 ) * Remove tf dependencies from import path. * Fix formatting.	2023-10-06 12:27:12 -07:00
Jakub Kuderski	86c1c0c215	Add aggregate statistics to microbenchmark (#1871 ) Print averaged results at the end of all iterations. Increase the default number of iterations to 5. Example: ``` Number of iterations: 5 Prefill: avg. 0.03 s, stddev 0.00 Decode: avg. 43.34 tokens/s, stdev 0.13 ``` Also remove the -2 in the number of generated tokens -- I did not find any evidence we need it.	2023-10-06 10:03:07 -07:00
Daniel Garvey	8bb364bcb8	enforce fp32 accumulates for cpu (#1873 )	2023-10-06 11:34:49 -05:00
Daniel Garvey	7abddd01ec	argmax inside model + brevitas pin (#1872 )	2023-10-05 20:15:21 -07:00
Abhishek Varma	2a451fa0c7	[Llama2] Add a standalone utility for dynamic and combining IRs -- This script adds a standalone utility for converting Llama IRs to dynamic and combining them as well. Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>	2023-10-05 20:01:06 +05:30
Jakub Kuderski	9c4610b9da	Add microbenchmark mode to vicuna CLI (#1864 ) Add flags to enable a non-internactive mode for microbenchmarking llama models. In this mode, the system and user prompts are specified with CLI flags, and the number of generated tokens and iterations is fixed. Also move the stats below the response and trim any response blankspace.	2023-10-05 00:12:08 -04:00
Gaurav Shukla	7cc9b3f8e8	[llama cli] Fix llama cli Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>	2023-10-03 20:39:53 +05:30
Vivek Khandelwal	8dd7850c69	Add Falcon-GPTQ support	2023-10-02 16:39:57 +05:30
Gaurav Shukla	e930ba85b4	[os] Remove os dependency from vmfb naming (#1854 ) Also fixes a small ui issue for chatbot. Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>	2023-09-29 12:38:17 -05:00
Gaurav Shukla	cd732e7a38	[chatbot] split execution time to prefill and decode Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>	2023-09-29 13:18:03 +05:30
Gaurav Shukla	82f833e87d	[vulkan] Update vmfb naming Update vmfb naming for vulkan devices in order to resolve naming conflicts in the presence of multiple vulkan devices. Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>	2023-09-28 14:52:11 +05:30
Vivek Khandelwal	c9d6870105	Modify falcon pipeline for 180b support	2023-09-28 12:39:35 +05:30
Abhishek Varma	9a0efffcca	[Llama2] Fix wrong Vulkan device ID + Add Vulkan compile flags -- This commit fixes the wrong Vulkan device being selected during runtime. -- It also adds couple of IREE compilation flags to target specific Vulkan device. -- It also changes the Vulkan device listing to be more in tune with lowering control flow. Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>	2023-09-22 22:24:18 +05:30
Quinn Dawkins	ded74d09cd	[vicuna.py] Keep past key values on device (#1836 ) The past key values are only used within the models themselves and can be kept on device. For vulkan int4, this gives 44 tok/s (for the first prompt) and settles at around 26 tok/s on 7900xtx.	2023-09-19 18:17:41 -04:00
PhaneeshB	b817bb8455	add roles for llama2	2023-09-12 10:59:28 +05:30
Abhishek Varma	c854208d49	[Llama2] Prefetch llama2 tokenizer configs (#1824 ) -- This commit prefetches llama2 tokenizer configs from shark_tank. Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>	2023-09-08 11:29:54 -07:00
Gaurav Shukla	c5dcfc1f13	[vicuna] Exit when mlir is not present in shark tank (#1825 ) Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>	2023-09-08 10:30:29 -07:00
Gaurav Shukla	ede6bf83e2	[vicuna] Disabling the IR generation path Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>	2023-09-06 20:13:17 +05:30
Gaurav Shukla	d2f64eefa3	[chatbot] Remove few outdated models from list (#1814 )	2023-09-04 09:26:32 -07:00
Phaneesh Barwaria	1ccafa1fc1	fix llama2-70b rewrite tensor dim	2023-09-01 17:27:06 +05:30
jinchen62	4c3d8a0a7f	Enable downloading vmfb/mlir for webui (#1807 )	2023-08-31 11:05:47 -07:00
jinchen62	3601dc7c3b	Fix llama2 13b combined ir (#1803 )	2023-08-28 11:34:44 -07:00
Daniel Garvey	671881cf87	Llama2 70b (#1783 ) * llama2 70b IR gen * fix IR sec llama2 + debug * llama270b --------- Co-authored-by: PhaneeshB <b.phaneesh@gmail.com>	2023-08-25 23:04:28 -07:00
Gaurav Shukla	4e9be6be59	[chatbot] Add debug as class attribute (#1799 ) Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>	2023-08-25 21:46:29 -07:00
Ean Garvey	9c8cbaf498	Add support for ROCM (Windows) in Studio + compile utils (#1770 ) * WIP: MSVC ROCM support for SHARK Studio * Make get_iree_rocm_args platform-agnostic. * Update stable_args.py * Update rocm arg handling in SD utils * Guard quantization imports. Co-authored-by: jam https://github.com/jammm	2023-08-25 20:56:05 -07:00
jinchen62	51f90a4d56	Update conversion passes for brevitas quant op (#1795 )	2023-08-25 17:28:07 -05:00
Abhishek Varma	310d5d0a49	Fix llama2 13b crashing + add spec file for CLI execution of Llama (#1797 ) * [Llama2] Add a fix for Llama2 13B downloading/crashing -- This commit fixes downloading/crashing of llama2 13B on wrong .mlir file. -- Also adds support for downloading vmfb from shark_tank in CLI. Signed-off-by: Abhishek Varma <abhishek@nod-labs.com> * [llama2] Add a spec file to run Llama/Vicuna CLI exe -- This commit adds a spec file to run Llama/Vicuna CLI exe. Signed-off-by: Abhishek Varma <abhishek@nod-labs.com> --------- Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>	2023-08-25 09:36:09 -05:00

1 2 3 4

167 Commits