Compare commits

...

9 Commits

Author SHA1 Message Date
Lincoln Stein 9a121f6190 updated changelog 2022-08-22 15:34:57 -04:00
Lincoln Stein a20827697c adjusted instructions for the released stable-diffusion-v1 weights 2022-08-22 15:33:27 -04:00
Lincoln Stein 9391eaff0e Merge branch 'prompt-in-png' into main 2022-08-22 13:24:12 -04:00
Lincoln Stein e1d52822c5 fixed crash that occurs if you type an empty prompt at the dream> prompt 2022-08-22 12:40:54 -04:00
Lincoln Stein 63989ce6ff tidied up scripts directory by moving the original CompViz scripts into a subfolder 2022-08-22 10:11:54 -04:00
Lincoln Stein 24b88c6fc5 Update README.md 2022-08-22 10:01:06 -04:00
Lincoln Stein 7cb5149a02 Update README.md: Fixed typo in the change log (wrong version #) 2022-08-22 00:27:48 -04:00
Lincoln Stein ea3501a8c4 Merge branch 'main' of github.com:lstein/stable-diffusion into main 2022-08-22 00:26:11 -04:00
Lincoln Stein 8caa27bef0 Close #2 2022-08-22 00:26:03 -04:00
13 changed files with 76 additions and 18 deletions

README.md

@@ -86,7 +86,16 @@ completely). The default is 0.75, and ranges from 0.25-0.75 give interesting res
## Changes
* v1.01 (21 August 2022)
* v1.04 (22 August 2022 - after the drop)
* Updated README to reflect installation of the released weights.
* Suppressed very noisy and inconsequential warning when loading the frozen CLIP
tokenizer.
* v1.03 (22 August 2022)
* The original txt2img and img2img scripts from the CompVis repository have been moved into
a subfolder named "orig_scripts", to reduce confusion.
* v1.02 (21 August 2022)
* A copy of the prompt and all of its switches and options is now stored in the corresponding
image in a tEXt metadata field named "Dream". You can read the prompt using scripts/images2prompt.py,
or an image editor that allows you to explore the full metadata.
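As a sketch of what reading that field looks like from Python (the image path here is just an example; tEXt chunks surface in Pillow's info dict):
```
from PIL import Image

im = Image.open("outputs/img-samples/000001.png")  # example path
print(im.info.get("Dream"))  # the tEXt chunk written by dream.py
```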
@@ -142,17 +151,21 @@ After these steps, your command prompt will be prefixed by "(ldm)" as shown abov
7. Now you need to install the weights for the stable diffusion model.
For testing prior to the release of the real weights, you can use an older weight file that produces low-quality images. Create a directory within stable-diffusion named "models/ldm/text2img-large", and use wget to download the weight file into it:
```
(ldm) ~/stable-diffusion$ mkdir -p models/ldm/text2img-large
(ldm) ~/stable-diffusion$ wget -O models/ldm/text2img-large/model.ckpt https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt
```
For testing with the released weights, you will do something similar, but with a directory named "models/ldm/stable-diffusion-v1"
For running with the released weights, you will first need to set up an account with Hugging Face (https://huggingface.co).
Use your credentials to log in, and then point your browser at https://huggingface.co/CompVis/stable-diffusion-v-1-4-original.
You may be asked to sign a license agreement at this point.
Click on "Files and versions" near the top of the page, and then click on the file named "sd-v1-4.ckpt". You'll be taken
to a page that prompts you to click the "download" link. Now save the file somewhere safe on your local machine.
Now run the following commands from within the stable-diffusion directory to point it to the weights file.
```
(ldm) ~/stable-diffusion$ mkdir -p models/ldm/stable-diffusion-v1
(ldm) ~/stable-diffusion$ wget -O models/ldm/stable-diffusion-v1/model.ckpt <ENTER URL HERE>
(ldm) ~/stable-diffusion$ ln -sf /path/to/sd-v1-4.ckpt models/ldm/stable-diffusion-v1/model.ckpt
```
These weight files are ~5 GB in size, so downloading may take a while.
The weight file is >4 GB in size, so downloading may take a while.
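Before moving on, a quick sanity check that the link resolves to the file you saved (assuming the symlink approach above):
```
(ldm) ~/stable-diffusion$ ls -lh models/ldm/stable-diffusion-v1/model.ckpt
```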
8. Start generating images!
```
@@ -209,15 +222,36 @@ This will install all python requirements and activate the "ldm" environment whi
```
python scripts\preload_models.py
```
This installs two machine learning models that stable diffusion requires.
This installs several machine learning models that stable diffusion
requires. (Note that this step is required. I created it because some
people are using GPU systems that are behind a firewall, and the models
can't be downloaded just-in-time.)
9. Now you need to install the weights for the big stable diffusion model.
For testing prior to the release of the real weights, create a directory within stable-diffusion named "models\ldm\text2img-large".
For running with the released weights, you will first need to set up
an account with Hugging Face (https://huggingface.co). Use your
credentials to log in, and then point your browser at
https://huggingface.co/CompVis/stable-diffusion-v-1-4-original. You
may be asked to sign a license agreement at this point.
For testing with the released weights, create a directory within stable-diffusion named "models\ldm\stable-diffusion-v1".
Click on "Files and versions" near the top of the page, and then click
on the file named "sd-v1-4.ckpt". You'll be taken to a page that
prompts you to click the "download" link. Now save the file somewhere
safe on your local machine. The weight file is >4 GB in size, so
downloading may take a while.
Then use a web browser to download model.ckpt into the appropriate directory. For the text2img-large (pre-release) model, the weights are at https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt. Check back here later for the release URL.
Now run the following commands from **within the stable-diffusion
directory** to point it to the weights file:
```
mkdir -p models/ldm/stable-diffusion-v1
copy C:\path\to\sd-v1-4.ckpt models\ldm\stable-diffusion-v1\model.ckpt
```
Instead of copying the file, you may create a shortcut within the
models\ldm\stable-diffusion-v1\ directory that points to it.
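If you'd rather link than copy, an NTFS symbolic link also works; this is a sketch using the same hypothetical path as above (run from an Administrator command prompt, or with Developer Mode enabled):
```
mklink models\ldm\stable-diffusion-v1\model.ckpt C:\path\to\sd-v1-4.ckpt
```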
10. Start generating images!
```
@@ -227,7 +261,7 @@ python scripts\dream.py -l
# for the post-release weights
python scripts\dream.py
```
11. Subsequently, to relaunch the script, first activate the Anaconda command window (step 4), run "conda activate ldm" (step 7b), and then launch the dream script (step 10).
11. Subsequently, to relaunch the script, first activate the Anaconda command window (step 4), enter the stable-diffusion directory (step 6, "cd \path\to\stable-diffusion"), run "conda activate ldm" (step 7b), and then launch the dream script (step 10).
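Put together, a typical relaunch looks like this (the directory path is whatever you chose during installation):
```
cd \path\to\stable-diffusion
conda activate ldm
python scripts\dream.py
```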
### Updating to newer versions of the script

ldm/modules/encoders/modules.py

@@ -146,8 +146,8 @@ class FrozenCLIPEmbedder(AbstractEncoder):
"""Uses the CLIP transformer encoder for text (from Hugging Face)"""
def __init__(self, version="openai/clip-vit-large-patch14", device="cuda", max_length=77):
super().__init__()
self.tokenizer = CLIPTokenizer.from_pretrained(version)
self.transformer = CLIPTextModel.from_pretrained(version)
self.tokenizer = CLIPTokenizer.from_pretrained(version,local_files_only=True)
self.transformer = CLIPTextModel.from_pretrained(version,local_files_only=True)
self.device = device
self.max_length = max_length
self.freeze()
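The effect of local_files_only=True is that the model must already be in the local Hugging Face cache; a minimal sketch of the failure mode you would otherwise hit on a firewalled machine (hence preload_models.py):
```
from transformers import CLIPTokenizer

try:
    # raises instead of downloading when the files are not cached locally
    CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14", local_files_only=True)
except OSError:
    print("CLIP not cached yet - run scripts/preload_models.py first")
```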

scripts/dream.py

@@ -40,7 +40,11 @@ def main():
sys.path.append('.')
from pytorch_lightning import logging
from ldm.simplet2i import T2I
# these two lines prevent a horrible warning message from appearing
# when the frozen CLIP tokenizer is imported
import transformers
transformers.logging.set_verbosity_error()
# creating a simple text2image object with a handful of
# defaults passed on the command line.
# additional parameters will be added (or overridden) during
@@ -87,6 +91,9 @@ def main_loop(t2i,parser,log):
break
elements = shlex.split(command)
if len(elements)==0:
continue
if elements[0]=='q': #
done = True
break
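For context, shlex.split is what parses the dream> line, and an empty line yields an empty list, which is exactly what the new guard skips:
```
import shlex

print(shlex.split('"a cat on a mat" -n2'))  # ['a cat on a mat', '-n2']
print(shlex.split(''))                      # [] -> previously crashed, now skipped
```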

scripts/images2prompt.py Normal file → Executable file

@@ -6,6 +6,7 @@ from PIL import Image,PngImagePlugin
if len(sys.argv) < 2:
print("Usage: file2prompt.py <file1.png> <file2.png> <file3.png>...")
print("This script opens up the indicated dream.py-generated PNG file(s) and prints out the prompt used to generate them.")
exit(-1)
filenames = sys.argv[1:]

scripts/preload_models.py Normal file → Executable file

@@ -2,6 +2,10 @@
# Before running stable-diffusion on an internet-isolated machine,
# run this script from one with internet connectivity. The
# two machines must share a common .cache directory.
import sys
import transformers
transformers.logging.set_verbosity_error()
# this will preload the Bert tokenizer files
print("preloading bert tokenizer...")
@@ -10,7 +14,19 @@ tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
print("...success")
# this will download requirements for Kornia
print("preloading Kornia requirements...")
print("preloading Kornia requirements (ignore the warnings)...")
import kornia
print("...success")
# doesn't work - probably wrong logger
# logging.getLogger('transformers.tokenization_utils').setLevel(logging.ERROR)
version='openai/clip-vit-large-patch14'
print('preloading CLIP model (Ignore the warnings)...')
sys.stdout.flush()
import clip
from transformers import CLIPTokenizer, CLIPTextModel
tokenizer = CLIPTokenizer.from_pretrained(version)
transformer = CLIPTextModel.from_pretrained(version)
print('\n\n...success')
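A sketch of how this script is meant to be used across machines (the hostname and cache location are assumptions; recent transformers releases cache under ~/.cache/huggingface):
```
# on the internet-connected machine
python scripts/preload_models.py
# then mirror the cache to the isolated machine so from_pretrained()
# finds everything locally
rsync -a ~/.cache/ user@isolated-host:~/.cache/
```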