Squashed commit of the following:

commit 1c649e4663 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Mon Sep 12 13:29:16 2022 -0400 fix torchvision dependency version #511 commit 4d197f699e Merge: a3e07fb 190ba78 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Mon Sep 12 07:29:19 2022 -0400 Merge branch 'development' of github.com:lstein/stable-diffusion into development commit a3e07fb84a Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Mon Sep 12 07:28:58 2022 -0400 fix grid crash commit 9fa1f31bf2 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Mon Sep 12 07:07:05 2022 -0400 fix opencv and realesrgan dependencies in mac install commit 190ba78960 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Mon Sep 12 01:50:58 2022 -0400 Update requirements-mac.txt Fixed dangling dash on last line. commit 25d9ccc509 Author: Any-Winter-4079 <50542132+Any-Winter-4079@users.noreply.github.com> Date: Mon Sep 12 03:17:29 2022 +0200 Update model.py commit 9cdf3aca7d Author: Any-Winter-4079 <50542132+Any-Winter-4079@users.noreply.github.com> Date: Mon Sep 12 02:52:36 2022 +0200 Update attention.py Performance improvements to generate larger images in M1 #431 Update attention.py Added dtype=r1.dtype to softmax commit 49a96b90d8 Author: Mihai <299015+mh-dm@users.noreply.github.com> Date: Sat Sep 10 16:58:07 2022 +0300 ~7% speedup (1.57 to 1.69it/s) from switch to += in ldm.modules.attention. (#482) Tested on 8GB eGPU nvidia setup so YMMV. 512x512 output, max VRAM stays same. commit aba94b85e8 Author: Niek van der Maas <mail@niekvandermaas.nl> Date: Fri Sep 9 15:01:37 2022 +0200 Fix macOS `pyenv` instructions, add code block highlight (#441) Fix: `anaconda3-latest` does not work, specify the correct virtualenv, add missing init. commit aac5102cf3 Author: Henry van Megen <h.vanmegen@gmail.com> Date: Thu Sep 8 05:16:35 2022 +0200 Disabled debug output (#436) Co-authored-by: Henry van Megen <hvanmegen@gmail.com> commit 0ab5a36464 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 17:19:46 2022 -0400 fix missing lines in outputs commit 5e433728b5 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 16:20:14 2022 -0400 upped max_steps in v1-finetune.yaml and fixed TI docs to address #493 commit 7708f4fb98 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 16:03:37 2022 -0400 slight efficiency gain by using += in attention.py commit b86a1deb00 Author: blessedcoolant <54517381+blessedcoolant@users.noreply.github.com> Date: Mon Sep 12 07:47:12 2022 +1200 Remove print statement styling (#504) Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com> commit 4951e66103 Author: chromaticist <mhostick@gmail.com> Date: Sun Sep 11 12:44:26 2022 -0700 Adding support for .bin files from huggingface concepts (#498) * Adding support for .bin files from huggingface concepts * Updating documentation to include huggingface .bin info commit 79b445b0ca Merge: a323070 f7662c1 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 15:39:38 2022 -0400 Merge branch 'development' of github.com:lstein/stable-diffusion into development commit a323070a4d Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 15:28:57 2022 -0400 update requirements for new location of gfpgan commit f7662c1808 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 15:00:24 2022 -0400 update requirements for changed location of gfpgan commit 93c242c9fb Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 14:47:58 2022 -0400 make gfpgan_model_exists flag available to web interface commit c7c6cd7735 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 14:43:07 2022 -0400 Update UPSCALE.md New instructions needed to accommodate fact that the ESRGAN and GFPGAN packages are now installed by environment.yaml. commit 77ca83e103 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 14:31:56 2022 -0400 Update CLI.md Final documentation tweak. commit 0ea145d188 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 14:29:26 2022 -0400 Update CLI.md More doc fixes. commit 162285ae86 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 14:28:45 2022 -0400 Update CLI.md Minor documentation fix commit 37c921dfe2 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 14:26:41 2022 -0400 documentation enhancements commit 4f72cb44ad Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 13:05:38 2022 -0400 moved the notebook files into their own directory commit 878ef2e9e0 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 12:58:06 2022 -0400 documentation tweaks commit 4923118610 Merge: 16f6a67 defafc0 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 12:51:25 2022 -0400 Merge branch 'development' of github.com:lstein/stable-diffusion into development commit defafc0e8e Author: Dominic Letz <dominic@diode.io> Date: Sun Sep 11 18:51:01 2022 +0200 Enable upscaling on m1 (#474) commit 16f6a6731d Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 12:47:26 2022 -0400 install GFPGAN inside SD repository in order to fix 'dark cast' issue #169 commit 0881d429f2 Author: blessedcoolant <54517381+blessedcoolant@users.noreply.github.com> Date: Mon Sep 12 03:52:43 2022 +1200 Docs Update (#466) Authored-by: @blessedcoolant Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com> commit 9a29d442b4 Author: Gérald LONLAS <gerald@lonlas.com> Date: Sun Sep 11 23:23:18 2022 +0800 Revert "Add 3x Upscale option on the Web UI (#442)" (#488) This reverts commit f8a540881c. commit d301836fbd Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 10:52:19 2022 -0400 can select prior output for init_img using -1, -2, etc commit 70aa674e9e Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 10:34:06 2022 -0400 merge PR #495 - keep using float16 in ldm.modules.attention commit 8748370f44 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 10:22:32 2022 -0400 negative -S indexing recovers correct previous seed; closes issue #476 commit 839e30e4b8 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 11 10:02:44 2022 -0400 improve CUDA VRAM monitoring extra check that device==cuda before getting VRAM stats commit bfb2781279 Author: tildebyte <337875+tildebyte@users.noreply.github.com> Date: Sat Sep 10 10:15:56 2022 -0400 fix(readme): add note about updating env via conda (#475) commit 5c43988862 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sat Sep 10 10:02:43 2022 -0400 reduce VRAM memory usage by half during model loading * This moves the call to half() before model.to(device) to avoid GPU copy of full model. Improves speed and reduces memory usage dramatically * This fix contributed by @mh-dm (Mihai) commit 99122708ca Merge: 817c4a2 ecc6b75 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sat Sep 10 09:54:34 2022 -0400 Merge branch 'development' of github.com:lstein/stable-diffusion into development commit 817c4a26de Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sat Sep 10 09:53:27 2022 -0400 remove -F option from normalized prompt; closes #483 commit ecc6b75a3e Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sat Sep 10 09:53:27 2022 -0400 remove -F option from normalized prompt commit 723d074442 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Fri Sep 9 18:49:51 2022 -0400 Allow ctrl c when using --from_file (#472) * added ansi escapes to highlight key parts of CLI session * adjust exception handling so that ^C will abort when reading prompts from a file commit 75f633cda8 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Fri Sep 9 12:03:45 2022 -0400 re-add new logo commit 10db192cc4 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Fri Sep 9 09:26:10 2022 -0400 changes to dogettx optimizations to run on m1 * Author @any-winter-4079 * Author @dogettx Thanks to many individuals who contributed time and hardware to benchmarking and debugging these changes. commit c85ae00b33 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Thu Sep 8 23:57:45 2022 -0400 fix bug which caused seed to get "stuck" on previous image even when UI specified -1 commit 1b5aae3ef3 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Thu Sep 8 22:36:47 2022 -0400 add icon to dream web server commit 6abf739315 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Thu Sep 8 22:25:09 2022 -0400 add favicon to web server commit db825b8138 Merge: 33874ba afee7f9 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Thu Sep 8 22:17:37 2022 -0400 Merge branch 'deNULL-development' into development commit 33874bae8d Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Thu Sep 8 22:16:29 2022 -0400 Squashed commit of the following: commit afee7f9cea Merge: 6531446 171f8db Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Thu Sep 8 22:14:32 2022 -0400 Merge branch 'development' of github.com:deNULL/stable-diffusion into deNULL-development commit 171f8db742 Author: Denis Olshin <me@denull.ru> Date: Thu Sep 8 03:15:20 2022 +0300 saving full prompt to metadata when using web ui commit d7e67b62f0 Author: Denis Olshin <me@denull.ru> Date: Thu Sep 8 01:51:47 2022 +0300 better logic for clicking to make variations commit afee7f9cea Merge: 6531446 171f8db Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Thu Sep 8 22:14:32 2022 -0400 Merge branch 'development' of github.com:deNULL/stable-diffusion into deNULL-development commit 653144694f Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Thu Sep 8 20:41:37 2022 -0400 work around unexplained crash when timesteps=1000 (#440) * work around unexplained crash when timesteps=1000 * this fix seems to work commit c33a84cdfd Author: blessedcoolant <54517381+blessedcoolant@users.noreply.github.com> Date: Fri Sep 9 12:39:51 2022 +1200 Add New Logo (#454) * Add instructions on how to install alongside pyenv (#393) Like probably many others, I have a lot of different virtualenvs, one for each project. Most of them are handled by `pyenv`. After installing according to these instructions I had issues with ´pyenv`and `miniconda` fighting over the $PATH of my system. But then I stumbled upon this nice solution on SO: https://stackoverflow.com/a/73139031 , upon which I have based my suggested changes. It runs perfectly on my M1 setup, with the anaconda setup as a virtual environment handled by pyenv. Feel free to incorporate these instructions as you see fit. Thanks a million for all your hard work. * Disabled debug output (#436) Co-authored-by: Henry van Megen <hvanmegen@gmail.com> * Add New Logo Co-authored-by: Håvard Gulldahl <havard@lurtgjort.no> Co-authored-by: Henry van Megen <h.vanmegen@gmail.com> Co-authored-by: Henry van Megen <hvanmegen@gmail.com> Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com> commit f8a540881c Author: Gérald LONLAS <gerald@lonlas.com> Date: Fri Sep 9 01:45:54 2022 +0800 Add 3x Upscale option on the Web UI (#442) commit 244239e5f6 Author: James Reynolds <magnusviri@users.noreply.github.com> Date: Thu Sep 8 05:36:33 2022 -0600 macOS CI workflow, dream.py exits with an error, but the workflow com… (#396) * macOS CI workflow, dream.py exits with an error, but the workflow completes. * Files for testing Co-authored-by: James Reynolds <magnsuviri@me.com> Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com> commit 711d49ed30 Author: James Reynolds <magnusviri@users.noreply.github.com> Date: Thu Sep 8 05:35:08 2022 -0600 Cache model workflow (#394) * Add workflow that caches the model, step 1 for CI * Change name of workflow job Co-authored-by: James Reynolds <magnsuviri@me.com> Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com> commit 7996a30e3a Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Thu Sep 8 07:34:03 2022 -0400 add auto-creation of mask for inpainting (#438) * now use a single init image for both image and mask * turn on debugging for now to write out mask and image * add back -M option as a fallback commit a69ca31f34 Author: elliotsayes <elliotsayes@gmail.com> Date: Thu Sep 8 15:30:06 2022 +1200 .gitignore WebUI temp files (#430) * Add instructions on how to install alongside pyenv (#393) Like probably many others, I have a lot of different virtualenvs, one for each project. Most of them are handled by `pyenv`. After installing according to these instructions I had issues with ´pyenv`and `miniconda` fighting over the $PATH of my system. But then I stumbled upon this nice solution on SO: https://stackoverflow.com/a/73139031 , upon which I have based my suggested changes. It runs perfectly on my M1 setup, with the anaconda setup as a virtual environment handled by pyenv. Feel free to incorporate these instructions as you see fit. Thanks a million for all your hard work. * .gitignore WebUI temp files Co-authored-by: Håvard Gulldahl <havard@lurtgjort.no> commit 5c6b612a72 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Wed Sep 7 22:50:55 2022 -0400 fix bug that caused same seed to be redisplayed repeatedly commit 56f155c590 Author: Johan Roxendal <johan@roxendal.com> Date: Thu Sep 8 04:50:06 2022 +0200 added support for parsing run log and displaying images in the frontend init state (#410) Co-authored-by: Johan Roxendal <johan.roxendal@litteraturbanken.se> Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com> commit 41687746be Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Wed Sep 7 20:24:35 2022 -0400 added missing initialization of latent_noise to None commit 171f8db742 Author: Denis Olshin <me@denull.ru> Date: Thu Sep 8 03:15:20 2022 +0300 saving full prompt to metadata when using web ui commit d7e67b62f0 Author: Denis Olshin <me@denull.ru> Date: Thu Sep 8 01:51:47 2022 +0300 better logic for clicking to make variations commit d1d044aa87 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Wed Sep 7 17:56:59 2022 -0400 actual image seed now written into web log rather than -1 (#428) commit edada042b3 Author: Arturo Mendivil <60411196+artmen1516@users.noreply.github.com> Date: Wed Sep 7 10:42:26 2022 -0700 Improve notebook and add requirements file (#422) commit 29ab3c2028 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Wed Sep 7 13:28:11 2022 -0400 disable neonpixel optimizations on M1 hardware (#414) * disable neonpixel optimizations on M1 hardware * fix typo that was causing random noise images on m1 commit 7670ecc63f Author: cody <cnmizell@gmail.com> Date: Wed Sep 7 12:24:41 2022 -0500 add more keyboard support on the web server (#391) add ability to submit prompts with the "enter" key add ability to cancel generations with the "escape" key commit dd2aedacaf Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Wed Sep 7 13:23:53 2022 -0400 report VRAM usage stats during initial model loading (#419) commit f6284777e6 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Tue Sep 6 17:12:39 2022 -0400 Squashed commit of the following: commit 7d1344282d942a33dcecda4d5144fc154ec82915 Merge: caf4ea3 ebeb556 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Mon Sep 5 10:07:27 2022 -0400 Merge branch 'development' of github.com:WebDev9000/stable-diffusion into WebDev9000-development commit ebeb556af9 Author: Web Dev 9000 <rirath@gmail.com> Date: Sun Sep 4 18:05:15 2022 -0700 Fixed unintentionally removed lines commit ff2c4b9a1b Author: Web Dev 9000 <rirath@gmail.com> Date: Sun Sep 4 17:50:13 2022 -0700 Add ability to recreate variations via image click commit c012929cda Author: Web Dev 9000 <rirath@gmail.com> Date: Sun Sep 4 14:35:33 2022 -0700 Add files via upload commit 02a6018992 Author: Web Dev 9000 <rirath@gmail.com> Date: Sun Sep 4 14:35:07 2022 -0700 Add files via upload commit eef788981c Author: Olivier Louvignes <olivier@mg-crea.com> Date: Tue Sep 6 12:41:08 2022 +0200 feat(txt2img): allow from_file to work with len(lines) < batch_size (#349) commit 720e5cd651 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Mon Sep 5 20:40:10 2022 -0400 Refactoring simplet2i (#387) * start refactoring -not yet functional * first phase of refactor done - not sure weighted prompts working * Second phase of refactoring. Everything mostly working. * The refactoring has moved all the hard-core inference work into ldm.dream.generator.*, where there are submodules for txt2img and img2img. inpaint will go in there as well. * Some additional refactoring will be done soon, but relatively minor work. * fix -save_orig flag to actually work * add @neonsecret attention.py memory optimization * remove unneeded imports * move token logging into conditioning.py * add placeholder version of inpaint; porting in progress * fix crash in img2img * inpainting working; not tested on variations * fix crashes in img2img * ported attention.py memory optimization #117 from basujindal branch * added @torch_no_grad() decorators to img2img, txt2img, inpaint closures * Final commit prior to PR against development * fixup crash when generating intermediate images in web UI * rename ldm.simplet2i to ldm.generate * add backward-compatibility simplet2i shell with deprecation warning * add back in mps exception, addresses @vargol comment in #354 * replaced Conditioning class with exported functions * fix wrong type of with_variations attribute during intialization * changed "image_iterator()" to "get_make_image()" * raise NotImplementedError for calling get_make_image() in parent class * Update ldm/generate.py better error message Co-authored-by: Kevin Gibbons <bakkot@gmail.com> * minor stylistic fixes and assertion checks from code review * moved get_noise() method into img2img class * break get_noise() into two methods, one for txt2img and the other for img2img * inpainting works on non-square images now * make get_noise() an abstract method in base class * much improved inpainting Co-authored-by: Kevin Gibbons <bakkot@gmail.com> commit 1ad2a8e567 Author: thealanle <35761977+thealanle@users.noreply.github.com> Date: Mon Sep 5 17:35:04 2022 -0700 Fix --outdir function for web (#373) * Fix --outdir function for web * Removed unnecessary hardcoded path commit 52d8bb2836 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Mon Sep 5 10:31:59 2022 -0400 Squashed commit of the following: commit 0cd48e932f1326e000c46f4140f98697eb9bdc79 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Mon Sep 5 10:27:43 2022 -0400 resolve conflicts with development commit d7bc8c12e0 Author: Scott McMillin <scott@scottmcmillin.com> Date: Sun Sep 4 18:52:09 2022 -0500 Add title attribute back to img tag commit 5397c89184 Author: Scott McMillin <scott@scottmcmillin.com> Date: Sun Sep 4 13:49:46 2022 -0500 Remove temp code commit 1da080b509 Author: Scott McMillin <scott@scottmcmillin.com> Date: Sun Sep 4 13:33:56 2022 -0500 Cleaned up HTML; small style changes; image click opens image; add seed to figcaption beneath image commit caf4ea3d89 Author: Adam Rice <adam@askadam.io> Date: Mon Sep 5 10:05:39 2022 -0400 Add a 'Remove Image' button to clear the file upload field (#382) * added "remove image" button * styled a new "remove image" button * Update index.js commit 95c088b303 Author: Kevin Gibbons <bakkot@gmail.com> Date: Sun Sep 4 19:04:14 2022 -0700 Revert "Add CORS headers to dream server to ease integration with third-party web interfaces" (#371) This reverts commit 91e826e5f4. commit a20113d5a3 Author: Kevin Gibbons <bakkot@gmail.com> Date: Sun Sep 4 18:59:12 2022 -0700 put no_grad decorator on make_image closures (#375) commit 0f93dadd6a Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 4 21:39:15 2022 -0400 fix several dangling references to --gfpgan option, which no longer exists commit f4004f660e Author: tildebyte <337875+tildebyte@users.noreply.github.com> Date: Sun Sep 4 19:43:04 2022 -0400 TOIL(requirements): Split requirements to per-platform (#355) * toil(reqs): split requirements to per-platform Signed-off-by: Ben Alkov <ben.alkov@gmail.com> * toil(reqs): fix for Win and Lin... ...allow pip to resolve latest torch, numpy Signed-off-by: Ben Alkov <ben.alkov@gmail.com> * toil(install): update reqs in Win install notebook Signed-off-by: Ben Alkov <ben.alkov@gmail.com> Signed-off-by: Ben Alkov <ben.alkov@gmail.com> commit 4406fd138d Merge: 5116c81 fd7a72e Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 4 08:23:53 2022 -0400 Merge branch 'SebastianAigner-main' into development Add support for full CORS headers for dream server. commit fd7a72e147 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 4 08:23:11 2022 -0400 remove debugging message commit 3a2be621f3 Merge: 91e826e 5116c81 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sun Sep 4 08:15:51 2022 -0400 Merge branch 'development' into main commit 5116c8178c Author: Justin Wong <1584142+wongjustin99@users.noreply.github.com> Date: Sun Sep 4 07:17:58 2022 -0400 fix save_original flag saving to the same filename (#360) * Update README.md with new Anaconda install steps (#347) pip3 version did not work for me and this is the recommended way to install Anaconda now it seems * fix save_original flag saving to the same filename Before this, the `--save_orig` flag was not working. The upscaled/GFPGAN would overwrite the original output image. Co-authored-by: greentext2 <112735219+greentext2@users.noreply.github.com> commit 91e826e5f4 Author: Sebastian Aigner <SebastianAigner@users.noreply.github.com> Date: Sun Sep 4 10:22:54 2022 +0200 Add CORS headers to dream server to ease integration with third-party web interfaces commit 6266d9e8d6 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sat Sep 3 15:45:20 2022 -0400 remove stray debugging message commit 138956e516 Author: greentext2 <112735219+greentext2@users.noreply.github.com> Date: Sat Sep 3 13:38:57 2022 -0500 Update README.md with new Anaconda install steps (#347) pip3 version did not work for me and this is the recommended way to install Anaconda now it seems commit 60be735e80 Author: Cora Johnson-Roberson <cora.johnson.roberson@gmail.com> Date: Sat Sep 3 14:28:34 2022 -0400 Switch to regular pytorch channel and restore Python 3.10 for Macs. (#301) * Switch to regular pytorch channel and restore Python 3.10 for Macs. Although pytorch-nightly should in theory be faster, it is currently causing increased memory usage and slower iterations: https://github.com/lstein/stable-diffusion/pull/283#issuecomment-1234784885 This changes the environment-mac.yaml file back to the regular pytorch channel and moves the `transformers` dep into pip for now (since it cannot be satisfied until tokenizers>=0.11 is built for Python 3.10). * Specify versions for Pip packages as well. commit d0d95d3a2a Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sat Sep 3 14:10:31 2022 -0400 make initimg appear in web log commit b90a215000 Merge: 1eee811 6270e31 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sat Sep 3 13:47:15 2022 -0400 Merge branch 'prixt-seamless' into development commit 6270e313b8 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sat Sep 3 13:46:29 2022 -0400 add credit to prixt for seamless circular tiling commit a01b7bdc40 Merge: 1eee811 9d88abe Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sat Sep 3 13:43:04 2022 -0400 add web interface for seamless option commit 1eee8111b9 Merge: 64eca42 fb857f0 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sat Sep 3 12:33:39 2022 -0400 Merge branch 'development' of github.com:lstein/stable-diffusion into development commit 64eca42610 Merge: 9130ad7 21a1f68 Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sat Sep 3 12:33:05 2022 -0400 Merge branch 'main' into development * brings in small documentation fixes that were added directly to main during release tweaking. commit fb857f05ba Author: Lincoln Stein <lincoln.stein@gmail.com> Date: Sat Sep 3 12:07:07 2022 -0400 fix typo in docs commit 9d88abe2ea Author: prixt <paraxite@naver.com> Date: Sat Sep 3 22:42:16 2022 +0900 fixed typo commit a61e49bc97 Author: prixt <paraxite@naver.com> Date: Sat Sep 3 22:39:35 2022 +0900 * Removed unnecessary code * Added description about --seamless commit 02bee4fdb1 Author: prixt <paraxite@naver.com> Date: Sat Sep 3 16:08:03 2022 +0900 added --seamless tag logging to normalize_prompt commit d922b53c26 Author: prixt <paraxite@naver.com> Date: Sat Sep 3 15:13:31 2022 +0900 added seamless tiling mode and commands
2026-02-19 03:24:37 -05:00 · 2022-09-12 14:31:48 -04:00
parent 62863ac586
commit 9d6d728b51
64 changed files with 3836 additions and 2738 deletions
--- a/ldm/modules/attention.py
+++ b/ldm/modules/attention.py
@@ -7,13 +7,14 @@ from einops import rearrange, repeat

 from ldm.modules.diffusionmodules.util import checkpoint

+import psutil

 def exists(val):
    return val is not None


 def uniq(arr):
-    return {el: True for el in arr}.keys()
+    return{el: True for el in arr}.keys()


 def default(val, d):
@@ -45,18 +46,19 @@ class GEGLU(nn.Module):


 class FeedForward(nn.Module):
-    def __init__(self, dim, dim_out=None, mult=4, glu=False, dropout=0.0):
+    def __init__(self, dim, dim_out=None, mult=4, glu=False, dropout=0.):
        super().__init__()
        inner_dim = int(dim * mult)
        dim_out = default(dim_out, dim)
-        project_in = (
-            nn.Sequential(nn.Linear(dim, inner_dim), nn.GELU())
-            if not glu
-            else GEGLU(dim, inner_dim)
-        )
+        project_in = nn.Sequential(
+            nn.Linear(dim, inner_dim),
+            nn.GELU()
+        ) if not glu else GEGLU(dim, inner_dim)

        self.net = nn.Sequential(
-            project_in, nn.Dropout(dropout), nn.Linear(inner_dim, dim_out)
+            project_in,
+            nn.Dropout(dropout),
+            nn.Linear(inner_dim, dim_out)
        )

    def forward(self, x):
@@ -73,9 +75,7 @@ def zero_module(module):


 def Normalize(in_channels):
-    return torch.nn.GroupNorm(
-        num_groups=32, num_channels=in_channels, eps=1e-6, affine=True
-    )
+    return torch.nn.GroupNorm(num_groups=32, num_channels=in_channels, eps=1e-6, affine=True)


 class LinearAttention(nn.Module):
@@ -83,28 +83,17 @@ class LinearAttention(nn.Module):
        super().__init__()
        self.heads = heads
        hidden_dim = dim_head * heads
-        self.to_qkv = nn.Conv2d(dim, hidden_dim * 3, 1, bias=False)
+        self.to_qkv = nn.Conv2d(dim, hidden_dim * 3, 1, bias = False)
        self.to_out = nn.Conv2d(hidden_dim, dim, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        qkv = self.to_qkv(x)
-        q, k, v = rearrange(
-            qkv,
-            'b (qkv heads c) h w -> qkv b heads c (h w)',
-            heads=self.heads,
-            qkv=3,
-        )
-        k = k.softmax(dim=-1)
+        q, k, v = rearrange(qkv, 'b (qkv heads c) h w -> qkv b heads c (h w)', heads = self.heads, qkv=3)
+        k = k.softmax(dim=-1)  
        context = torch.einsum('bhdn,bhen->bhde', k, v)
        out = torch.einsum('bhde,bhdn->bhen', context, q)
-        out = rearrange(
-            out,
-            'b heads c (h w) -> b (heads c) h w',
-            heads=self.heads,
-            h=h,
-            w=w,
-        )
+        out = rearrange(out, 'b heads c (h w) -> b (heads c) h w', heads=self.heads, h=h, w=w)
        return self.to_out(out)


@@ -114,18 +103,26 @@ class SpatialSelfAttention(nn.Module):
        self.in_channels = in_channels

        self.norm = Normalize(in_channels)
-        self.q = torch.nn.Conv2d(
-            in_channels, in_channels, kernel_size=1, stride=1, padding=0
-        )
-        self.k = torch.nn.Conv2d(
-            in_channels, in_channels, kernel_size=1, stride=1, padding=0
-        )
-        self.v = torch.nn.Conv2d(
-            in_channels, in_channels, kernel_size=1, stride=1, padding=0
-        )
-        self.proj_out = torch.nn.Conv2d(
-            in_channels, in_channels, kernel_size=1, stride=1, padding=0
-        )
+        self.q = torch.nn.Conv2d(in_channels,
+                                 in_channels,
+                                 kernel_size=1,
+                                 stride=1,
+                                 padding=0)
+        self.k = torch.nn.Conv2d(in_channels,
+                                 in_channels,
+                                 kernel_size=1,
+                                 stride=1,
+                                 padding=0)
+        self.v = torch.nn.Conv2d(in_channels,
+                                 in_channels,
+                                 kernel_size=1,
+                                 stride=1,
+                                 padding=0)
+        self.proj_out = torch.nn.Conv2d(in_channels,
+                                        in_channels,
+                                        kernel_size=1,
+                                        stride=1,
+                                        padding=0)

    def forward(self, x):
        h_ = x
@@ -135,12 +132,12 @@ class SpatialSelfAttention(nn.Module):
        v = self.v(h_)

        # compute attention
-        b, c, h, w = q.shape
+        b,c,h,w = q.shape
        q = rearrange(q, 'b c h w -> b (h w) c')
        k = rearrange(k, 'b c h w -> b c (h w)')
        w_ = torch.einsum('bij,bjk->bik', q, k)

-        w_ = w_ * (int(c) ** (-0.5))
+        w_ = w_ * (int(c)**(-0.5))
        w_ = torch.nn.functional.softmax(w_, dim=2)

        # attend to values
@@ -150,18 +147,16 @@ class SpatialSelfAttention(nn.Module):
        h_ = rearrange(h_, 'b c (h w) -> b c h w', h=h)
        h_ = self.proj_out(h_)

-        return x + h_
+        return x+h_


 class CrossAttention(nn.Module):
-    def __init__(
-        self, query_dim, context_dim=None, heads=8, dim_head=64, dropout=0.0
-    ):
+    def __init__(self, query_dim, context_dim=None, heads=8, dim_head=64, dropout=0.):
        super().__init__()
        inner_dim = dim_head * heads
        context_dim = default(context_dim, query_dim)

-        self.scale = dim_head**-0.5
+        self.scale = dim_head ** -0.5
        self.heads = heads

        self.to_q = nn.Linear(query_dim, inner_dim, bias=False)
@@ -169,69 +164,136 @@ class CrossAttention(nn.Module):
        self.to_v = nn.Linear(context_dim, inner_dim, bias=False)

        self.to_out = nn.Sequential(
-            nn.Linear(inner_dim, query_dim), nn.Dropout(dropout)
+            nn.Linear(inner_dim, query_dim),
+            nn.Dropout(dropout)
        )

+        if not torch.cuda.is_available():
+            mem_av = psutil.virtual_memory().available / (1024**3)
+            if mem_av > 32:
+                self.einsum_op = self.einsum_op_v1
+            elif mem_av > 12:
+                self.einsum_op = self.einsum_op_v2
+            else:
+                self.einsum_op = self.einsum_op_v3   
+            del mem_av 
+        else:
+            self.einsum_op = self.einsum_op_v4
+
+    # mps 64-128 GB
+    def einsum_op_v1(self, q, k, v, r1):
+        if q.shape[1] <= 4096: # for 512x512: the max q.shape[1] is 4096
+            s1 = einsum('b i d, b j d -> b i j', q, k) * self.scale # aggressive/faster: operation in one go
+            s2 = s1.softmax(dim=-1, dtype=q.dtype)
+            del s1
+            r1 = einsum('b i j, b j d -> b i d', s2, v)
+            del s2
+        else:
+            # q.shape[0] * q.shape[1] * slice_size >= 2**31 throws err
+            # needs around half of that slice_size to not generate noise
+            slice_size = math.floor(2**30 / (q.shape[0] * q.shape[1]))
+            for i in range(0, q.shape[1], slice_size):
+                end = i + slice_size
+                s1 = einsum('b i d, b j d -> b i j', q[:, i:end], k) * self.scale
+                s2 = s1.softmax(dim=-1, dtype=r1.dtype)
+                del s1  
+                r1[:, i:end] = einsum('b i j, b j d -> b i d', s2, v)
+                del s2
+        return r1
+
+    # mps 16-32 GB (can be optimized)
+    def einsum_op_v2(self, q, k, v, r1):
+        slice_size = math.floor(2**30 / (q.shape[0] * q.shape[1]))
+        for i in range(0, q.shape[1], slice_size): # conservative/less mem: operation in steps
+            end = i + slice_size
+            s1 = einsum('b i d, b j d -> b i j', q[:, i:end], k) * self.scale
+            s2 = s1.softmax(dim=-1, dtype=r1.dtype)
+            del s1  
+            r1[:, i:end] = einsum('b i j, b j d -> b i d', s2, v)
+            del s2
+        return r1
+
+    # mps 8 GB
+    def einsum_op_v3(self, q, k, v, r1):
+        slice_size = 1
+        for i in range(0, q.shape[0], slice_size): # iterate over q.shape[0]
+            end = min(q.shape[0], i + slice_size)
+            s1 = einsum('b i d, b j d -> b i j', q[i:end], k[i:end]) # adapted einsum for mem
+            s1 *= self.scale
+            s2 = s1.softmax(dim=-1, dtype=r1.dtype)
+            del s1
+            r1[i:end] = einsum('b i j, b j d -> b i d', s2, v[i:end]) # adapted einsum for mem
+            del s2
+        return r1
+
+    # cuda
+    def einsum_op_v4(self, q, k, v, r1):
+        stats = torch.cuda.memory_stats(q.device)
+        mem_active = stats['active_bytes.all.current']
+        mem_reserved = stats['reserved_bytes.all.current']
+        mem_free_cuda, _ = torch.cuda.mem_get_info(torch.cuda.current_device())
+        mem_free_torch = mem_reserved - mem_active
+        mem_free_total = mem_free_cuda + mem_free_torch
+
+        gb = 1024 ** 3
+        tensor_size = q.shape[0] * q.shape[1] * k.shape[1] * 4
+        mem_required = tensor_size * 2.5
+        steps = 1
+
+        if mem_required > mem_free_total:
+            steps = 2**(math.ceil(math.log(mem_required / mem_free_total, 2)))
+
+        if steps > 64:
+            max_res = math.floor(math.sqrt(math.sqrt(mem_free_total / 2.5)) / 8) * 64
+            raise RuntimeError(f'Not enough memory, use lower resolution (max approx. {max_res}x{max_res}). '
+                            f'Need: {mem_required/64/gb:0.1f}GB free, Have:{mem_free_total/gb:0.1f}GB free')
+        
+        slice_size = q.shape[1] // steps if (q.shape[1] % steps) == 0 else q.shape[1]  
+        for i in range(0, q.shape[1], slice_size):
+            end = min(q.shape[1], i + slice_size)
+            s1 = einsum('b i d, b j d -> b i j', q[:, i:end], k) * self.scale
+            s2 = s1.softmax(dim=-1, dtype=r1.dtype)
+            del s1
+            r1[:, i:end] = einsum('b i j, b j d -> b i d', s2, v)
+            del s2 
+        return r1
+
    def forward(self, x, context=None, mask=None):
        h = self.heads

-        q = self.to_q(x)
+        q_in = self.to_q(x)
        context = default(context, x)
-        k = self.to_k(context)
-        v = self.to_v(context)
+        k_in = self.to_k(context)
+        v_in = self.to_v(context)
+        device_type = 'mps' if x.device.type == 'mps' else 'cuda'
+        del context, x

-        q, k, v = map(
-            lambda t: rearrange(t, 'b n (h d) -> (b h) n d', h=h), (q, k, v)
-        )
+        q, k, v = map(lambda t: rearrange(t, 'b n (h d) -> (b h) n d', h=h), (q_in, k_in, v_in))
+        del q_in, k_in, v_in
+        r1 = torch.zeros(q.shape[0], q.shape[1], v.shape[2], device=q.device, dtype=q.dtype)
+        r1 = self.einsum_op(q, k, v, r1)
+        del q, k, v

-        sim = einsum('b i d, b j d -> b i j', q, k) * self.scale
+        r2 = rearrange(r1, '(b h) n d -> b n (h d)', h=h)
+        del r1

-        if exists(mask):
-            mask = rearrange(mask, 'b ... -> b (...)')
-            max_neg_value = -torch.finfo(sim.dtype).max
-            mask = repeat(mask, 'b j -> (b h) () j', h=h)
-            sim.masked_fill_(~mask, max_neg_value)
-
-        # attention, what we cannot get enough of
-        attn = sim.softmax(dim=-1)
-
-        out = einsum('b i j, b j d -> b i d', attn, v)
-        out = rearrange(out, '(b h) n d -> b n (h d)', h=h)
-        return self.to_out(out)
+        return self.to_out(r2)


 class BasicTransformerBlock(nn.Module):
-    def __init__(
-        self,
-        dim,
-        n_heads,
-        d_head,
-        dropout=0.0,
-        context_dim=None,
-        gated_ff=True,
-        checkpoint=True,
-    ):
+    def __init__(self, dim, n_heads, d_head, dropout=0., context_dim=None, gated_ff=True, checkpoint=True):
        super().__init__()
-        self.attn1 = CrossAttention(
-            query_dim=dim, heads=n_heads, dim_head=d_head, dropout=dropout
-        )  # is a self-attention
+        self.attn1 = CrossAttention(query_dim=dim, heads=n_heads, dim_head=d_head, dropout=dropout)  # is a self-attention
        self.ff = FeedForward(dim, dropout=dropout, glu=gated_ff)
-        self.attn2 = CrossAttention(
-            query_dim=dim,
-            context_dim=context_dim,
-            heads=n_heads,
-            dim_head=d_head,
-            dropout=dropout,
-        )  # is self-attn if context is none
+        self.attn2 = CrossAttention(query_dim=dim, context_dim=context_dim,
+                                    heads=n_heads, dim_head=d_head, dropout=dropout)  # is self-attn if context is none
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.norm3 = nn.LayerNorm(dim)
        self.checkpoint = checkpoint

    def forward(self, x, context=None):
-        return checkpoint(
-            self._forward, (x, context), self.parameters(), self.checkpoint
-        )
+        return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)

    def _forward(self, x, context=None):
        x = x.contiguous() if x.device.type == 'mps' else x
@@ -249,43 +311,29 @@ class SpatialTransformer(nn.Module):
    Then apply standard transformer action.
    Finally, reshape to image
    """
-
-    def __init__(
-        self,
-        in_channels,
-        n_heads,
-        d_head,
-        depth=1,
-        dropout=0.0,
-        context_dim=None,
-    ):
+    def __init__(self, in_channels, n_heads, d_head,
+                 depth=1, dropout=0., context_dim=None):
        super().__init__()
        self.in_channels = in_channels
        inner_dim = n_heads * d_head
        self.norm = Normalize(in_channels)

-        self.proj_in = nn.Conv2d(
-            in_channels, inner_dim, kernel_size=1, stride=1, padding=0
-        )
+        self.proj_in = nn.Conv2d(in_channels,
+                                 inner_dim,
+                                 kernel_size=1,
+                                 stride=1,
+                                 padding=0)

        self.transformer_blocks = nn.ModuleList(
-            [
-                BasicTransformerBlock(
-                    inner_dim,
-                    n_heads,
-                    d_head,
-                    dropout=dropout,
-                    context_dim=context_dim,
-                )
-                for d in range(depth)
-            ]
+            [BasicTransformerBlock(inner_dim, n_heads, d_head, dropout=dropout, context_dim=context_dim)
+                for d in range(depth)]
        )

-        self.proj_out = zero_module(
-            nn.Conv2d(
-                inner_dim, in_channels, kernel_size=1, stride=1, padding=0
-            )
-        )
+        self.proj_out = zero_module(nn.Conv2d(inner_dim,
+                                              in_channels,
+                                              kernel_size=1,
+                                              stride=1,
+                                              padding=0))

    def forward(self, x, context=None):
        # note: if no context is given, cross-attention defaults to self-attention