AMD-SHARK-Studio

mirror of https://github.com/nod-ai/AMD-SHARK-Studio.git synced 2026-02-19 11:56:43 -05:00

Author	SHA1	Message	Date
AmosLewis	c199ac78eb	Add decompose of aten._scaled_dot_product_flash_attention.default The new decompose was implemented in PyTorch only recently. Here is the PyTorch PR: https://github.com/pytorch/pytorch/pull/117390 This decompose is required for lowering the chatglm model in torch-mlir. Here is the issue: https://github.com/llvm/torch-mlir/issues/2730	2024-01-16 03:03:14 +00:00
Ean Garvey	fa95ed30d1	Relocate quantized matmul reassociation flag (#2047) * Remove quantized matmul reassociation flag This flag should be a model/use-case specific addition, not a default CPU compile flag.	2023-12-20 12:48:40 -08:00
Daniel Garvey	ebfcfec338	remove shark 1.0 tests, add support for 2.0 llm * add support for external weights * add tests and edit deps	2023-12-14 21:44:37 -06:00
Richard Pastirčák	3af0c6c658	#1843 - Add Export Default settings button (#2016) * #1843 - Add Export Default settings button * #1843 reformatting unit tests --------- Co-authored-by: Richard Pastirčák <richard.pastircak@student.tuke.sk>	2023-12-06 14:58:17 -06:00
Eliasj42	dfdd3b1f78	improved sharded performance and fixed issue with lmhead on rocm (#2008) * improved sharded performance and fixed issue with lmhead on rocm * mmap shards + disable sharing of device arrays across devices * fix device_idx for non-layer vmfbs * fix time calc for sharded --------- Co-authored-by: Elias Joseph <elias@nod-labs.com> Co-authored-by: PhaneeshB <b.phaneesh@gmail.com>	2023-12-05 11:53:44 -08:00
Ean Garvey	6384780d16	Fixes to llama2 cpu compilation and studio UI, schedulers (#2013) * Fix some issues with defaults Fixes to llama2 cpu compilation (turns off data tiling for old argmax mode) --------- Co-authored-by: Max Dawkins <max.dawkins@gmail.com>	2023-12-05 11:19:19 -05:00
Ean Garvey	d72da3801f	(Studio) Update gradio and multicontrolnet UI. (#2001) * (Studio) Update gradio and multicontrolnet UI. * Fixes for outputgallery, exe build * Fix image return types. * Update Gradio to 4.7.1 * Fix send buttons and hiresfix * Various bugfixes and SDXL additions. * More UI fixes and txt2img_sdxl presets. enable SDXL-Turbo and custom models, custom VAE for sdxl img2img ui tweaks	2023-12-04 12:37:51 -06:00
Ean Garvey	795fc33001	Update default compilation flags for data tiling. (#2000) * Update default CPU compilation flags. `c5a6cdc8dd` `52eb7e9b82` tweak CPU iree-compile flags to match upstream changes. * Add an option for data tiling on SD models.	2023-11-30 17:05:37 -06:00
Evan Ruttenberg	78c607e1d3	Fix typo in default_rocm_arch (#1998)	2023-11-29 20:40:56 -05:00
Ean Garvey	da50a16242	Create specified dir if needed during save_mlir and fix vulkan device fetching without URI/ID (#1989)	2023-11-23 01:01:41 -06:00
PhaneeshB	2f780f0d38	quick fix rocm None device	2023-11-22 21:17:25 +05:30
Ean Garvey	d051c3a4a7	Use clean_device_info() by default and don't write .mlir to /tmp/ (#1984) * Move clean_device_info to compile_utils * Update compile_utils.py * Fix .mlir writes for some user-level permissions * Fix cases where full URI is given * Fix conditionals. * Fix device path handling in vulkan utils.	2023-11-20 13:10:31 -06:00
Ean Garvey	905d0103ff	Revert "Re-enable SD tunings without matmuls. (#1976)" (#1979) This reverts commit `70817bb50a`.	2023-11-17 23:44:33 +05:30
Ean Garvey	70817bb50a	Re-enable SD tunings without matmuls. (#1976)	2023-11-15 20:42:53 -06:00
jinchen62	dd37c26d36	Update brevitas quant api (#1975)	2023-11-15 10:04:07 -08:00
Ean Garvey	f6d41affd9	(SHARK Studio) Add Turbine-based llm chatbot. (#1933) * Dan shark studio (#1970) * Fix issue in Falcon-GPTQ * initial webui and llama2 --------- Co-authored-by: Vivek Khandelwal <vivekkhandelwal1424@gmail.com> * Fix formatting. --------- Co-authored-by: Daniel Garvey <34486624+dan-garvey@users.noreply.github.com> Co-authored-by: Vivek Khandelwal <vivekkhandelwal1424@gmail.com>	2023-11-14 09:56:28 -06:00
PhaneeshB	11510d5111	add intra rocm vmfb differentiator	2023-11-13 23:35:55 +05:30
PhaneeshB	392bade0bf	enable non default rocm device selection for webui	2023-11-13 23:35:55 +05:30
PhaneeshB	51afe19e20	fix rocm arch selection	2023-11-10 13:22:51 +05:30
Ean Garvey	31005bcf73	Don't require vulkan installation to query devices. (#1953)	2023-11-09 14:46:44 -06:00
Phaneesh Barwaria	db89b1bdc1	Fix MacOS web execution flow (#1899) * fix metal device path for chatbot * single device remove indexing * lint fix	2023-11-09 10:59:29 -06:00
Huang Qi	2754e2e257	Fix wrong parameter index passed to 'compile_module_to_flatbuffer' (#1921) compile_str is always False in compile_module_to_flatbuffer since there is a parameter 'model_name' before 'debug'. This issue is related to https://github.com/nod-ai/SHARK/pull/1863. With this fix, the MLIR model buffer in RAM can be used to run inference.	2023-11-09 10:58:05 -06:00
PhaneeshB	ab0e870c43	fix vicuna cli vulkan	2023-11-09 22:27:13 +05:30
Stanley Winata	500c4f2306	[compile utils] Fix ROCM to not expect config.id as a default. (#1939)	2023-11-06 08:44:53 -08:00
Ean Garvey	5001db3415	Add 7800xt to target triples explicitly. (#1928)	2023-11-01 17:11:45 -05:00
PhaneeshB	7963abb8ec	remove caching for rocm args	2023-10-29 07:07:57 +05:30
PhaneeshB	72c0a8abc8	remove dependency on external commands for driver installation check	2023-10-27 10:30:40 +05:30
Vivek Khandelwal	ea920f2955	Add sharded Falcon support	2023-10-26 21:53:25 +05:30
Phaneesh Barwaria	486202377a	update dependency on rocm/hip info command (#1900) * add support for rocm flags * add rocm target flag to chat args * rm rocm libs dependency message	2023-10-26 15:18:25 +05:30
Ean Garvey	e6cb5cef57	Add --additional_runtime_args option and use in OPT example. (#1855) * Add --additional_runtime_args option and use in OPT example. Fix the func name. (#1838) Co-authored-by: Sungsoon Cho <sungsoon.cho@gmail.com>	2023-10-19 13:29:39 -05:00
Huang Qi	66abee8e5b	SharkInference: Fix various examples and README.md (#1903) Follow https://github.com/nod-ai/SHARK/pull/708, remove parameter 'func_name' for SharkInference.	2023-10-19 09:28:36 -05:00
Ean Garvey	4797bb89f5	Stringify path for ireec.compile_file (#1901) * Stringify path for ireec.compile_file * Update test-models.yml	2023-10-18 14:59:23 -05:00
Ean Garvey	0b77059628	Add matmul reassociation flags (#1891)	2023-10-12 20:12:37 -05:00
Vivek Khandelwal	b83d32fafe	Fix Falcon GPTQ Pipeline	2023-10-11 20:09:32 +05:30
Vivek Khandelwal	0a618e1863	Add support for Falcon GPTQ	2023-10-11 10:47:48 +05:30
Phaneesh Barwaria	a731eb6ed4	Macos fixes (#1883) * fix venv setup for MacOS * allow stream fuse binding on mac * clean iree metal args	2023-10-09 23:36:12 -07:00
Ean Garvey	caf6cc5d8f	Switch most compile flows to use ireec.compile_file. (#1863) * Switch most compile flows to use ireec.compile_file. * re-add input type to compile_str path. * Check if mlir_module exists before checking if it's a path or pyobject. * Fix some save_dir cases	2023-10-06 23:04:43 -05:00
powderluv	a38cc9d216	Update vulkan_utils.py for Radeon 780m igpu (#1866)	2023-10-04 20:33:07 -07:00
Jakub Kuderski	1c382449ec	[vulkan] Print note about module load times. NFC. (#1862) Print a note ahead of a potentially long inactivity to set the right expectations. Separately, we should add progress to the UI and make this loading faster.	2023-10-03 17:27:27 -04:00
Vivek Khandelwal	8dd7850c69	Add Falcon-GPTQ support	2023-10-02 16:39:57 +05:30
PhaneeshB	94594542a9	remove use of vulkaninfo	2023-09-28 21:57:00 +05:30
Jakub Kuderski	4fec03a6cc	[vulkan] Switch from coop matrix NV to KHR (#1848)	2023-09-27 21:43:37 -04:00
Abhishek Varma	ad1a0f35ff	Fix misdirection while saving vmfb -- Currently SHARK suggests that vmfb has been saved, while that is not the case and no vmfb is generated. This creates a misdirection for IR/vmfbs which are of larger size. -- This commit therefore fixes that misdirection. Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>	2023-09-27 16:25:29 +05:30
Abhishek Varma	9a0efffcca	[Llama2] Fix wrong Vulkan device ID + Add Vulkan compile flags -- This commit fixes the wrong Vulkan device being selected during runtime. -- It also adds a couple of IREE compilation flags to target a specific Vulkan device. -- It also changes the Vulkan device listing to be more in tune with lowering control flow. Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>	2023-09-22 22:24:18 +05:30
Boian Petkantchin	79267931c1	Add argument --additional_compile_args (#1119) This allows passing more arguments to the IREE compiler. Example: python my-app.py --additional_compile_args="--mlir-pretty-debuginfo --mlir-timing" Co-authored-by: Boian Petkantchin <boian@nod-labs.com>	2023-09-19 11:26:03 -05:00
Gaurav Shukla	11bdce9790	[flags] Fix vulkan runtime flags as vma is dropped from iree (#1831)	2023-09-14 08:58:59 -05:00
Ean Garvey	780f520f02	Fix vk.target_env extensions and remove redundant SD imports. (#1826) * Remove redundant IREE runtime imports. * Fix vulkan target env extensions.	2023-09-11 13:42:52 -05:00
Dom	c61b6f8d65	Code refactoring (#1817) * use join * fix bug * further code optimizations --------- Co-authored-by: Daniel Garvey <34486624+dan-garvey@users.noreply.github.com>	2023-09-11 11:30:56 -05:00
Vivek Khandelwal	9681d494eb	Update decomp list and shark trainer for DLRM	2023-09-06 21:24:50 +05:30
Vivek Khandelwal	1d31b2b2c6	Fix StableHLO Compilation flag	2023-09-05 21:32:33 +05:30

618 Commits