Abhishek-Varma
ce00c1c5e1
[SharkInference] Make SharkInference compile the entire module
...
-- Previously SharkInference compiled and provided run APIs only
for a hardcoded function with function name "forward".
-- This commit makes the compilation functionality generic, so
any function defined within the module can now be run.
-- It also creates an API to fetch all the function names defined
within the compiled module.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2022-12-24 09:05:06 +00:00
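The generic compile-and-run flow described above can be sketched as follows. This is an illustrative assumption, not SharkInference's actual API: the class shape and the method names `function_names` and `run` are invented for the sketch.

```python
# Illustrative sketch only: the real SharkInference compiles an MLIR module;
# here a plain dict of callables stands in for the compiled module.

class CompiledModule:
    def __init__(self, functions):
        # Maps function name -> callable, mimicking a compiled module that
        # defines several entry points, not just "forward".
        self._functions = dict(functions)

    def function_names(self):
        # New-style API: fetch every function name defined in the module.
        return sorted(self._functions)

    def run(self, name, *args):
        # Dispatch to any compiled function instead of a hardcoded "forward".
        return self._functions[name](*args)
```

With this shape, `run("encode", x)` works just as well as `run("forward", x)`.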
Stanley Winata
136021424c
[SD] Change default VMA large heap block size for windows perf. ( #715 )
...
Windows performance improves from 2.67s/image to 2.4523s/image.
Linux stays the same.
20221224.411
20221224.410
20221224.409
20221224.408
2022-12-24 01:40:58 +07:00
PhaneeshB
fee4ba3746
Add openjourney
2022-12-23 23:34:22 +05:30
Gaurav Shukla
a5b70335d4
[SD][web] Add variant support in the web UI
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-23 23:18:27 +05:30
Stanley Winata
5cf4976054
[Vulkan][utils] Add GTX Pascal support. ( #709 )
20221223.407
20221223.406
2022-12-22 15:24:15 -08:00
PhaneeshB
1aa3255061
Add vaebase for av3 and ad
2022-12-23 04:17:17 +05:30
Daniel Garvey
b01f29f10d
add support for clear_all ( #691 )
2022-12-22 11:25:03 -06:00
Boian Petkantchin
2673abca88
Fix concurrency issue in stress_test for CUDA devices
2022-12-22 08:54:19 -08:00
Gaurav Shukla
7eeb7f0715
[SD] Update all the utilities to make web and CLI codebase closer ( #707 )
...
At this point, all the utilities of SD web and CLI are exactly the same.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-22 02:49:48 -08:00
powderluv
37262a2479
Remove spurious characters
20221222.405
20221222.404
2022-12-21 19:23:54 -08:00
Gaurav Shukla
de6e304959
[SD] Fix the resource location in shark_sd.spec ( #706 )
2022-12-21 14:41:56 -08:00
Quinn Dawkins
234475bbc7
Add base_vae entries for variant models ( #705 )
2022-12-21 14:35:08 -08:00
Quinn Dawkins
abbd9f7cfc
[SD] Set unet flags for cuda ( #704 )
2022-12-21 13:22:04 -08:00
Gaurav Shukla
dfd6ba67b3
[SD] Update SD CLI to use model_db.json
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-22 02:13:04 +05:30
yzhang93
1595254eab
Modify model annotation tool to walk through ops by shape ( #692 )
2022-12-21 10:46:30 -08:00
PhaneeshB
6964c5eeba
encapsulate relevant methods in one method
20221221.402
2022-12-21 23:56:17 +05:30
PhaneeshB
2befe771b3
Add support for automatic target triple selection for SD
2022-12-21 22:38:06 +05:30
Prashant Kumar
b133a035a4
Add the download progress bar.
2022-12-21 15:47:33 +05:30
Gaurav Shukla
726c062327
[SD] Update spec files
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-21 14:16:04 +05:30
Gaurav Shukla
9083672de3
[SD][web] Tuned models only for stablediffusion/fp16 and rdna3 cards
...
Currently tuned models are only available for stablediffusion/fp16 and
rdna3 cards.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-21 14:15:39 +05:30
Quinn Dawkins
cdbaf880af
[SD] [web] Add model variants to web
2022-12-21 13:42:22 +05:30
Quinn Dawkins
9434981cdc
Add random seed generation for seed = -1 in cli ( #689 )
2022-12-20 17:15:22 -05:00
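The seed handling above can be sketched minimally as below; the function name and the seed range are assumptions for illustration, not the CLI's actual code.

```python
import random

# Sketch: a seed of -1 means "pick a random seed"; any other value is
# used as-is. The 32-bit range is an assumption.
def resolve_seed(seed: int) -> int:
    if seed == -1:
        return random.randint(0, 2**32 - 1)
    return seed
```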
Phaneesh Barwaria
8b3706f557
Add Anything v3 and AnalogDiffusion variants of SD ( #685 )
...
* base support for anythingv3
* add analogdiffusion
* Update readme
* keep max len 77 till support for 64 added for variants
* lint fix
2022-12-20 13:08:13 -08:00
Gaurav Shukla
0d5173833d
[SD] Add a json file for model names information. ( #687 )
...
This commit simplifies the code to identify the model name for a
particular set of flags. This is achieved by introducing a json file
that stores the model names information. The models are uploaded in
gcloud with these names.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-20 11:47:31 -08:00
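A minimal sketch of the flag-to-model-name lookup via a JSON file follows. The file name model_db.json comes from the commit, but the schema, keys, and model names below are invented for illustration; the log does not show the actual contents.

```python
import json

# Invented example database; the real model_db.json schema is not shown
# in this log.
MODEL_DB = json.loads("""
{
  "stablediffusion/fp16/tuned": "stable_diffusion_fp16_tuned",
  "stablediffusion/fp16/untuned": "stable_diffusion_fp16"
}
""")

def model_name(variant: str, precision: str, tuned: bool) -> str:
    # Build a key from the flag set and look up the name the model was
    # uploaded under.
    key = f"{variant}/{precision}/{'tuned' if tuned else 'untuned'}"
    return MODEL_DB[key]
```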
powderluv
bf1178eb79
roll to build 400
2022-12-20 10:34:31 -08:00
yzhang93
abcd3fa94a
[SD] Set model max length 64 as default ( #681 )
20221220.400
2022-12-19 21:13:04 -08:00
Quinn Dawkins
62aa1614b6
[SD] Add --use_base_vae flag to do conversion to pixel space on cpu ( #682 )
2022-12-19 21:09:39 -08:00
Quinn Dawkins
7027356126
[SD] Fix warmup for max length 64 ( #680 )
2022-12-19 21:04:44 -05:00
yzhang93
5ebe13a13d
Add Unet len 64 tuned model ( #679 )
2022-12-19 16:24:08 -08:00
Gaurav Shukla
c3bed9a2b7
[SD][web] Add flag to disable the progress bar animation
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-20 02:50:04 +05:30
yzhang93
f865222882
Update VAE 19dec tuned model ( #676 )
2022-12-19 12:42:28 -08:00
powderluv
e2fe2e4095
Point to 398
2022-12-19 12:08:30 -08:00
powderluv
0532a95f08
Update stable_diffusion_amd.md
2022-12-19 12:04:42 -08:00
Quinn Dawkins
ff536f6015
[SD] Deduplicate initial noise generation ( #677 )
2022-12-19 14:38:41 -05:00
Gaurav Shukla
097d0f27bb
[SD][web] Add 64 max_length support in SD web
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-20 00:00:58 +05:30
Prashant Kumar
2257f87edf
Update opt_params.py
2022-12-19 23:43:30 +05:30
PhaneeshB
a17800da00
Add 64 len f16 untuned mlir
2022-12-19 22:53:17 +05:30
Prashant Kumar
059c1b3a19
Disable vae --use_tuned version.
20221219.398
2022-12-19 22:45:45 +05:30
Stanley Winata
9a36816d27
[SD][CLI] Add a warmup phase ( #670 )
2022-12-20 00:14:23 +07:00
Gaurav Shukla
7986b9b20b
[SD][WEB] Update VAE model and wrapper
...
This commit updates the VAE model, which significantly improves performance
by ~300ms.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-19 22:32:05 +05:30
Gaurav Shukla
b2b3a0a62b
[SD] Move initial latent generation out of inference time
...
The initial random latent generation is no longer counted toward
total SD inference time.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-19 22:32:05 +05:30
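Excluding latent generation from the timed window can be sketched like this; the function names are placeholders rather than SHARK's actual code.

```python
import time

# Sketch: generate latents before starting the clock so that only the
# denoising loop counts toward the reported inference time.
def timed_generation(make_latents, run_inference):
    latents = make_latents()         # excluded from the timing below
    start = time.time()
    result = run_inference(latents)  # timed portion
    return result, time.time() - start
```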
Prashant Kumar
3173b7d1d9
Update VAE model and wrapper.
2022-12-19 19:54:50 +05:30
Gaurav Shukla
9d716d70d6
[SD][web] Fix performance issues on shark scheduler
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
20221219.397
2022-12-19 17:44:37 +05:30
Stanley Winata
e1901a8608
[SD][CL] Disable print at every iteration. ( #664 )
...
Printing at every iteration can add extra runtime, so a flag is added to hide it. To disable printing, set `--hide_steps`.
Co-authored-by: Stanley <stanley@MacStudio.lan>
2022-12-19 15:39:57 +07:00
Quinn Dawkins
7d0cbd8d90
[SD][web] Set default tuned unet to v2 ( #663 )
20221219.396
2022-12-19 11:50:08 +07:00
Quinn Dawkins
59358361f9
[SD] Make clip batch 2 for positive and negative prompts ( #662 )
...
Combines the forward passes for each input prompt type into a single batched clip pass.
2022-12-18 23:46:21 -05:00
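The batch-2 clip pass described above can be sketched as below; `clip_forward` is a stand-in for the real text encoder, which this log does not show.

```python
# Stand-in encoder: returns one "embedding" per prompt in the batch.
def clip_forward(token_batch):
    return [[float(t) for t in tokens] for tokens in token_batch]

def encode_prompts(pos_tokens, neg_tokens):
    # Before: one forward pass per prompt type. After: a single batch-2
    # pass over [positive, negative], then split the outputs.
    embeddings = clip_forward([pos_tokens, neg_tokens])
    return embeddings[0], embeddings[1]
```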
Quinn Dawkins
7fea2d3b68
[SD] update default large heap size for web as well ( #661 )
20221219.395
2022-12-18 21:50:26 -05:00
Quinn Dawkins
b6d3ff26bd
[SD] Change default VMA large heap block size ( #660 )
2022-12-18 21:41:46 -05:00
Stella Laurenzo
523e63f5c1
Fix NoneType exception if vulkan tuning flags not detected. ( #659 )
...
(This goes on to produce compilation errors, but one step at a time)
2022-12-18 16:40:56 -08:00
Stella Laurenzo
10630ab597
Add config stanza for NVIDIA RTX 2080. ( #658 )
...
Just happened to have this card on my Windows machine and verified that the SD demo works on it.
```
Average step time: 144.26142692565918ms/it
Clip Inference Avg time (ms) = (205.001 + 44.000) / 2 = 124.501
VAE Inference time (ms): 281.001
Total image generation time: 7.856997728347778sec
```
I'd love to add an API upstream to derive compiler tuning flags from a host device.
2022-12-18 16:40:47 -08:00