Ean Garvey
a4d28110b0
Add Resnet50 fp16 variant to pytests.
2023-01-06 08:04:08 +00:00
Prashant Kumar
4102c124a9
Add the shark upscaler model. ( #759 )
2023-01-05 14:07:20 -08:00
yzhang93
135bad3280
[SD] Update v1.4 tuned model ( #758 )
2023-01-05 11:04:30 -08:00
yzhang93
782b449c71
Add script to auto annotate SD models and variants ( #751 )
...
* Add script to auto annotate SD models and variants
* Add model config files
* Move config files to shark_tank
2023-01-04 15:53:10 -08:00
jinchen62
017dcab685
Add target triple support for TITAN RTX ( #756 )
2023-01-04 15:39:00 -08:00
Abhishek Varma
e60b4568c6
[SharkInference] Make SharkInference compile the entire module ( #708 )
...
* [SharkInference] Make SharkInference compile the entire module
-- Previously SharkInference was compiling and providing run APIs
for a hardcoded function with function name "forward".
-- This commit makes the compiling functionality generic and now
any function being defined within the module can be run.
-- It also creates an API to fetch all the function names defined
within the compiled module.
-- This commit updates both web and command-line execution of Stable
Diffusion to use new API of SharkInference.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-01-03 23:25:23 +05:30
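The change described in this commit (running any function in the module by name, plus an API to list the available names) can be illustrated with a minimal sketch. `CompiledModuleSketch`, `get_function_names`, and `run` are hypothetical names chosen for illustration; they are not taken from the SharkInference source, and plain Python callables stand in for compiled functions.

```python
from typing import Callable, Dict, List


class CompiledModuleSketch:
    """Hypothetical stand-in for a compiled module; not the SharkInference API."""

    def __init__(self, functions: Dict[str, Callable]):
        # Map every function defined in the module to its callable,
        # instead of keeping only a hardcoded "forward" entry.
        self._functions = dict(functions)

    def get_function_names(self) -> List[str]:
        """Return the names of all functions defined in the module."""
        return sorted(self._functions)

    def run(self, function_name: str, *inputs):
        """Run the named function, failing clearly if it does not exist."""
        if function_name not in self._functions:
            raise ValueError(
                f"unknown function {function_name!r}; "
                f"available: {self.get_function_names()}"
            )
        return self._functions[function_name](*inputs)


# Plain lambdas stand in for compiled functions (assumption for the sketch).
module = CompiledModuleSketch(
    {"forward": lambda x: x * 2, "encode": lambda x: x + 1}
)
print(module.get_function_names())
print(module.run("encode", 3))
```

Looking functions up by name keeps the caller in control of which entry point runs, and the name listing lets callers discover entry points without reading the module source.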
powderluv
4ee3d95a5a
Update to build 423
...
Post pytorch security breach
2023-01-01 12:10:23 -08:00
Graham
f18725bacc
replaced <username> with %username% for easy copy/paste ( #744 )
2022-12-31 21:29:37 -08:00
jinchen62
f6064a2b84
Add a prototype of the model compilation configs for SD ( #734 )
2022-12-28 15:14:36 -08:00
powderluv
2c09d63cd9
Update to build 417
2022-12-27 14:25:20 -08:00
powderluv
cc6fbdb0c3
Add sm_89 and point to nvcuda.dll ( #731 )
2022-12-26 10:54:38 -08:00
Gaurav Shukla
45af40fd14
[SD][web] Add openjourney and dreamlike in SD web UI
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-26 01:59:36 +05:30
Phaneesh Barwaria
d11cf42501
Add support for dreamlike diffusion ( #725 )
...
* Add support for dreamlike diffusion
* model wrapper to support 77 dreamlike
* lint fix
2022-12-26 01:35:17 +05:30
Gaurav Shukla
c3c1e3b055
[SD] Add bucket info in the model_db.json
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-25 20:38:33 +05:30
Gaurav Shukla
7c5e3b1d99
[SD] Fix flags for cuda devices
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-25 19:03:02 +05:30
Gaurav Shukla
ed6cec71e7
[SD] Fix clip inference time
...
Fix clip inference time by setting the default warmup_count to 5.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-25 18:16:53 +05:30
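Why a warmup count affects the reported time can be shown with a small sketch: the first few calls are run untimed so that one-off startup costs do not inflate the measurement. `time_after_warmup` and its `iterations` parameter are hypothetical names for illustration; only the default of 5 warmup runs comes from the commit message.

```python
import time
from typing import Callable


def time_after_warmup(
    fn: Callable[[], object], warmup_count: int = 5, iterations: int = 10
) -> float:
    """Return the mean wall-clock time per call in milliseconds.

    The first `warmup_count` calls are executed but not timed.
    """
    for _ in range(warmup_count):
        fn()  # untimed warmup call
    start = time.perf_counter()
    for _ in range(iterations):
        fn()  # timed call
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iterations


# Usage: time a cheap stand-in workload (assumption for the sketch).
mean_ms = time_after_warmup(lambda: sum(range(1000)))
print(f"mean time per call: {mean_ms:.3f} ms")
```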
PhaneeshB
1261074d95
Add tuned models for av3 and ad
2022-12-24 22:56:15 +05:30
Stanley Winata
136021424c
[SD] Change default VMA large heap block size for windows perf. ( #715 )
...
Windows performance improves from 2.67s/image to 2.4523s/image,
while Linux stays the same.
2022-12-24 01:40:58 +07:00
PhaneeshB
fee4ba3746
Add openjourney
2022-12-23 23:34:22 +05:30
Stanley Winata
5cf4976054
[Vulkan][utils] Add GTX Pascal support. ( #709 )
2022-12-22 15:24:15 -08:00
PhaneeshB
1aa3255061
Add vaebase for av3 and ad
2022-12-23 04:17:17 +05:30
Daniel Garvey
b01f29f10d
add support for clear_all ( #691 )
2022-12-22 11:25:03 -06:00
Boian Petkantchin
2673abca88
Fix concurrency issue in stress_test for CUDA devices
2022-12-22 08:54:19 -08:00
Gaurav Shukla
7eeb7f0715
[SD] Update all the utilities to make web and CLI codebase closer ( #707 )
...
At this point, all the utilities of SD web and CLI are exactly the same.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
Signed-off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-22 02:49:48 -08:00
Gaurav Shukla
dfd6ba67b3
[SD] Update SD CLI to use model_db.json
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-22 02:13:04 +05:30
yzhang93
1595254eab
Modify model annotation tool to walk through ops by shape ( #692 )
2022-12-21 10:46:30 -08:00
PhaneeshB
6964c5eeba
encapsulate relevant methods in one method
2022-12-21 23:56:17 +05:30
PhaneeshB
2befe771b3
Add support for automatic target triple selection for SD
2022-12-21 22:38:06 +05:30
Prashant Kumar
b133a035a4
Add the download progress bar.
2022-12-21 15:47:33 +05:30
Quinn Dawkins
9434981cdc
Add random seed generation for seed = -1 in cli ( #689 )
2022-12-20 17:15:22 -05:00
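The behavior named in this commit, treating a seed of -1 as a request for a random seed, can be sketched as follows. `resolve_seed` and the 32-bit upper bound are assumptions made for illustration; the commit message does not state the range or the helper used.

```python
import random


def resolve_seed(seed: int, max_seed: int = 2**32 - 1) -> int:
    """Return `seed` unchanged, or a random seed when `seed` is -1.

    The 32-bit range is an assumption for this sketch.
    """
    if seed == -1:
        return random.randint(0, max_seed)  # inclusive on both ends
    return seed


# Usage: an explicit seed passes through; -1 yields a fresh value.
print(resolve_seed(42))
print(resolve_seed(-1))
```

Printing or logging the resolved seed is what makes a randomly seeded run reproducible afterwards.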
Phaneesh Barwaria
8b3706f557
Add Anything v3 and AnalogDiffusion variants of SD ( #685 )
...
* base support for anythingv3
* add analogdiffusion
* Update readme
* keep max len 77 till support for 64 added for variants
* lint fix
2022-12-20 13:08:13 -08:00
powderluv
bf1178eb79
roll to build 400
2022-12-20 10:34:31 -08:00
yzhang93
abcd3fa94a
[SD] Set model max length 64 as default ( #681 )
2022-12-19 21:13:04 -08:00
Quinn Dawkins
7027356126
[SD] Fix warmup for max length 64 ( #680 )
2022-12-19 21:04:44 -05:00
yzhang93
5ebe13a13d
Add Unet len 64 tuned model ( #679 )
2022-12-19 16:24:08 -08:00
yzhang93
f865222882
Update VAE 19dec tuned model ( #676 )
2022-12-19 12:42:28 -08:00
powderluv
e2fe2e4095
Point to 398
2022-12-19 12:08:30 -08:00
powderluv
0532a95f08
Update stable_diffusion_amd.md
2022-12-19 12:04:42 -08:00
Quinn Dawkins
ff536f6015
[SD] Deduplicate initial noise generation ( #677 )
2022-12-19 14:38:41 -05:00
Prashant Kumar
2257f87edf
Update opt_params.py
2022-12-19 23:43:30 +05:30
PhaneeshB
a17800da00
Add 64 len f16 untuned mlir
2022-12-19 22:53:17 +05:30
Prashant Kumar
059c1b3a19
Disable vae --use_tuned version.
2022-12-19 22:45:45 +05:30
Stanley Winata
9a36816d27
[SD][CLI] Add a warmup phase ( #670 )
2022-12-20 00:14:23 +07:00
Gaurav Shukla
b2b3a0a62b
[SD] Move initial latent generation out of inference time
...
The initial random latent generation is not taken into account
for total SD inference time.
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-19 22:32:05 +05:30
Prashant Kumar
3173b7d1d9
Update VAE model and wrapper.
2022-12-19 19:54:50 +05:30
Gaurav Shukla
9d716d70d6
[SD][web] Fix performance issues on shark scheduler
...
Signed-Off-by: Gaurav Shukla <gaurav@nod-labs.com>
2022-12-19 17:44:37 +05:30
Stanley Winata
e1901a8608
[SD][CL] Disable print at every iteration. ( #664 )
...
Printing at every step can add to the measured runtime, so this adds a flag to hide it. To disable printing, pass `--hide_steps`.
Co-authored-by: Stanley <stanley@MacStudio.lan>
2022-12-19 15:39:57 +07:00
Quinn Dawkins
59358361f9
[SD] Make clip batch 2 for positive and negative prompts ( #662 )
...
Combines the forward passes for each input prompt type into a single batched clip pass.
2022-12-18 23:46:21 -05:00
Quinn Dawkins
b6d3ff26bd
[SD] Change default VMA large heap block size ( #660 )
2022-12-18 21:41:46 -05:00
Stella Laurenzo
10630ab597
Add config stanza for NVIDIA RTX 2080. ( #658 )
...
Just happened to have this card on my Windows machine and verified that the SD demo works on it.
```
Average step time: 144.26142692565918ms/it
Clip Inference Avg time (ms) = (205.001 + 44.000) / 2 = 124.501
VAE Inference time (ms): 281.001
Total image generation time: 7.856997728347778sec
```
I'd love to add an API upstream to derive compiler tuning flags from a host device.
2022-12-18 16:40:47 -08:00