mirror of
https://github.com/invoke-ai/InvokeAI.git
synced 2026-01-16 05:27:59 -05:00
Compare commits
25 Commits
release_1.
...
release-1.
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
6d1219deec | ||
|
|
e019de34ac | ||
|
|
88563fd27a | ||
|
|
18289dabcb | ||
|
|
e70169257e | ||
|
|
2afa87e911 | ||
|
|
281e381cfc | ||
|
|
9a121f6190 | ||
|
|
a20827697c | ||
|
|
9391eaff0e | ||
|
|
e1d52822c5 | ||
|
|
63989ce6ff | ||
|
|
24b88c6fc5 | ||
|
|
7cb5149a02 | ||
|
|
ea3501a8c4 | ||
|
|
8caa27bef0 | ||
|
|
ddf0ef3af1 | ||
|
|
aa2729d868 | ||
|
|
5f352aec87 | ||
|
|
c4c4974b39 | ||
|
|
194f43f00b | ||
|
|
325bc5280e | ||
|
|
11cc8e545b | ||
|
|
9adac56f4e | ||
|
|
5d5307dcb4 |
128
README.md
128
README.md
@@ -17,7 +17,11 @@ initialization only happens once. After that image generation
|
||||
from the command-line interface is very fast.
|
||||
|
||||
The script uses the readline library to allow for in-line editing,
|
||||
command history (up and down arrows), autocompletion, and more.
|
||||
command history (up and down arrows), autocompletion, and more. To help
|
||||
keep track of which prompts generated which images, the script writes a
|
||||
log file of image names and prompts to the selected output directory.
|
||||
In addition, as of version 1.02, it also writes the prompt into the PNG
|
||||
file's metadata where it can be retrieved using scripts/images2prompt.py
|
||||
|
||||
The script is confirmed to work on Linux and Windows systems. It should
|
||||
work on MacOSX as well, but this is not confirmed. Note that this script
|
||||
@@ -38,12 +42,19 @@ setting sampler to plms
|
||||
* Initialization done! Awaiting your command...
|
||||
dream> ashley judd riding a camel -n2 -s150
|
||||
Outputs:
|
||||
outputs/txt2img-samples/00009.png: "ashley judd riding a camel" -n2 -s150 -S 416354203
|
||||
outputs/txt2img-samples/00010.png: "ashley judd riding a camel" -n2 -s150-S 1362479620
|
||||
outputs/img-samples/00009.png: "ashley judd riding a camel" -n2 -s150 -S 416354203
|
||||
outputs/img-samples/00010.png: "ashley judd riding a camel" -n2 -s150 -S 1362479620
|
||||
|
||||
dream> "there's a fly in my soup" -n6 -g
|
||||
outputs/txt2img-samples/00041.png: "there's a fly in my soup" -n6 -g -S 2685670268
|
||||
outputs/img-samples/00011.png: "there's a fly in my soup" -n6 -g -S 2685670268
|
||||
seeds for individual rows: [2685670268, 1216708065, 2335773498, 822223658, 714542046, 3395302430]
|
||||
dream> q
|
||||
|
||||
# this shows how to retrieve the prompt stored in the saved image's metadata
|
||||
(ldm) ~/stable-diffusion$ python3 ./scripts/images2prompt.py outputs/img_samples/*.png
|
||||
00009.png: "ashley judd riding a camel" -s150 -S 416354203
|
||||
00010.png: "ashley judd riding a camel" -s150 -S 1362479620
|
||||
00011.png: "there's a fly in my soup" -n6 -g -S 2685670268
|
||||
~~~~
|
||||
|
||||
The dream> prompt's arguments are pretty much
|
||||
@@ -75,19 +86,52 @@ completely). The default is 0.75, and ranges from 0.25-0.75 give interesting res
|
||||
|
||||
## Changes
|
||||
|
||||
- v1.01 (21 August 2022)
|
||||
* added k_lms sampling **Please run "conda update -f environment.yaml" to load the k_lms dependencies**
|
||||
* use half precision arithmetic by default, resulting in faster execution and lower memory requirements
|
||||
Pass argument --full_precision to dream.py to get slower but more accurate image generation
|
||||
* v1.05 (22 August 2022 - after the drop)
|
||||
* Filenames now use the following formats:
|
||||
000010.95183149.png -- Two files produced by the same command (e.g. -n2),
|
||||
000010.26742632.png -- distinguished by a different seed.
|
||||
|
||||
000011.455191342.01.png -- Two files produced by the same command using
|
||||
000011.455191342.02.png -- a batch size>1 (e.g. -b2). They have the same seed.
|
||||
|
||||
000011.4160627868.grid#1-4.png -- a grid of four images (-g); the whole grid can
|
||||
be regenerated with the indicated key
|
||||
|
||||
* It should no longer be possible for one image to overwrite another
|
||||
* You can use the "cd" and "pwd" commands at the dream> prompt to set and retrieve
|
||||
the path of the output directory.
|
||||
|
||||
* v1.04 (22 August 2022 - after the drop)
|
||||
* Updated README to reflect installation of the released weights.
|
||||
* Suppressed very noisy and inconsequential warning when loading the frozen CLIP
|
||||
tokenizer.
|
||||
|
||||
* v1.03 (22 August 2022)
|
||||
* The original txt2img and img2img scripts from the CompViz repository have been moved into
|
||||
a subfolder named "orig_scripts", to reduce confusion.
|
||||
|
||||
* v1.02 (21 August 2022)
|
||||
* A copy of the prompt and all of its switches and options is now stored in the corresponding
|
||||
image in a tEXt metadata field named "Dream". You can read the prompt using scripts/images2prompt.py,
|
||||
or an image editor that allows you to explore the full metadata.
|
||||
**Please run "conda env update -f environment.yaml" to load the k_lms dependencies!!**
|
||||
|
||||
* v1.01 (21 August 2022)
|
||||
* added k_lms sampling.
|
||||
**Please run "conda env update -f environment.yaml" to load the k_lms dependencies!!**
|
||||
* use half precision arithmetic by default, resulting in faster execution and lower memory requirements
|
||||
Pass argument --full_precision to dream.py to get slower but more accurate image generation
|
||||
|
||||
|
||||
## Installation
|
||||
|
||||
There are separate installation walkthroughs for [Linux/Mac](#linuxmac) and [Windows](#windows).
|
||||
|
||||
### Linux/Mac
|
||||
|
||||
1. You will need to install the following prerequisites if they are not already available. Use your
|
||||
operating system's preferred installer
|
||||
* Python (version 3.8 or higher)
|
||||
* Python (version 3.8.5 recommended; higher may work)
|
||||
* git
|
||||
|
||||
2. Install the Python Anaconda environment manager using pip3.
|
||||
@@ -122,19 +166,26 @@ After these steps, your command prompt will be prefixed by "(ldm)" as shown abov
|
||||
(ldm) ~/stable-diffusion$ python3 scripts/preload_models.py
|
||||
```
|
||||
|
||||
Note that this step is necessary because I modified the original
|
||||
just-in-time model loading scheme to allow the script to work on GPU
|
||||
machines that are not internet connected. See [Workaround for machines with limited internet connectivity](#workaround-for-machines-with-limited-internet-connectivity)
|
||||
|
||||
7. Now you need to install the weights for the stable diffusion model.
|
||||
|
||||
For testing prior to the release of the real weights, you can use an older weight file that produces low-quality images. Create a directory within stable-diffusion named "models/ldm/text2img.large", and use the wget URL downloader tool to copy the weight file into it:
|
||||
```
|
||||
(ldm) ~/stable-diffusion$ mkdir -p models/ldm/text2img-large
|
||||
(ldm) ~/stable-diffusion$ wget -O models/ldm/text2img-large/model.ckpt https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt
|
||||
```
|
||||
For testing with the released weighs, you will do something similar, but with a directory named "models/ldm/stable-diffusion-v1"
|
||||
For running with the released weights, you will first need to set up an acount with Hugging Face (https://huggingface.co).
|
||||
Use your credentials to log in, and then point your browser at https://huggingface.co/CompVis/stable-diffusion-v-1-4-original.
|
||||
You may be asked to sign a license agreement at this point.
|
||||
|
||||
Click on "Files and versions" near the top of the page, and then click on the file named "sd-v1-4.ckpt". You'll be taken
|
||||
to a page that prompts you to click the "download" link. Save the file somewhere safe on your local machine.
|
||||
|
||||
Now run the following commands from within the stable-diffusion directory. This will create a symbolic
|
||||
link from the stable-diffusion model.ckpt file, to the true location of the sd-v1-4.ckpt file.
|
||||
|
||||
```
|
||||
(ldm) ~/stable-diffusion$ mkdir -p models/ldm/stable-diffusion-v1
|
||||
(ldm) ~/stable-diffusion$ wget -O models/ldm/stable-diffusion-v1/model.ckpt <ENTER URL HERE>
|
||||
(ldm) ~/stable-diffusion$ ln -sf /path/to/sd-v1-4.ckpt models/ldm/stable-diffusion-v1/model.ckpt
|
||||
```
|
||||
These weight files are ~5 GB in size, so downloading may take a while.
|
||||
|
||||
8. Start generating images!
|
||||
```
|
||||
@@ -150,7 +201,7 @@ These weight files are ~5 GB in size, so downloading may take a while.
|
||||
9. Subsequently, to relaunch the script, be sure to run "conda activate ldm" (step 5, second command), enter the "stable-diffusion"
|
||||
directory, and then launch the dream script (step 8). If you forget to activate the ldm environment, the script will fail with multiple ModuleNotFound errors.
|
||||
|
||||
### Updating to newer versions of the script
|
||||
#### Updating to newer versions of the script
|
||||
|
||||
This distribution is changing rapidly. If you used the "git clone" method (step 5) to download the stable-diffusion directory, then to update to the latest and greatest version, launch the Anaconda window, enter "stable-diffusion", and type:
|
||||
```
|
||||
@@ -160,7 +211,8 @@ This will bring your local copy into sync with the remote one.
|
||||
|
||||
### Windows
|
||||
|
||||
1. Install the most recent Python from here: https://www.python.org/downloads/windows/
|
||||
1. Install Python version 3.8.5 from here: https://www.python.org/downloads/windows/
|
||||
(note that several users have reported that later versions do not work properly)
|
||||
|
||||
2. Install Anaconda3 (miniconda3 version) from here: https://docs.anaconda.com/anaconda/install/windows/
|
||||
|
||||
@@ -190,15 +242,37 @@ This will install all python requirements and activate the "ldm" environment whi
|
||||
```
|
||||
python scripts\preload_models.py
|
||||
```
|
||||
This installs two machine learning models that stable diffusion requires.
|
||||
|
||||
This installs several machine learning models that stable diffusion
|
||||
requires. (Note that this step is required. I created it because some people
|
||||
are using GPU systems that are behind a firewall and the models can't be
|
||||
downloaded just-in-time)
|
||||
|
||||
9. Now you need to install the weights for the big stable diffusion model.
|
||||
|
||||
For testing prior to the release of the real weights, create a directory within stable-diffusion named "models\ldm\text2img.large".
|
||||
For running with the released weights, you will first need to set up
|
||||
an acount with Hugging Face (https://huggingface.co). Use your
|
||||
credentials to log in, and then point your browser at
|
||||
https://huggingface.co/CompVis/stable-diffusion-v-1-4-original. You
|
||||
may be asked to sign a license agreement at this point.
|
||||
|
||||
For testing with the released weights, create a directory within stable-diffusion named "models\ldm\stable-diffusion-v1".
|
||||
Click on "Files and versions" near the top of the page, and then click
|
||||
on the file named "sd-v1-4.ckpt". You'll be taken to a page that
|
||||
prompts you to click the "download" link. Now save the file somewhere
|
||||
safe on your local machine. The weight file is >4 GB in size, so
|
||||
downloading may take a while.
|
||||
|
||||
Then use a web browser to copy model.ckpt into the appropriate directory. For the text2img.large (pre-release) model, the weights are at https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt. Check back here later for the release URL.
|
||||
Now run the following commands from **within the stable-diffusion
|
||||
directory** to copy the weights file to the right place:
|
||||
|
||||
```
|
||||
mkdir -p models/ldm/stable-diffusion-v1
|
||||
copy C:\path\to\sd-v1-4.ckpt models\ldm\stable-diffusion-v1\model.ckpt
|
||||
```
|
||||
Please replace "C:\path\to\sd-v1.4.ckpt" with the correct path to wherever
|
||||
you stashed this file. If you prefer not to copy or move the .ckpt file,
|
||||
you may instead create a shortcut to it from within
|
||||
"models\ldm\stable-diffusion-v1\".
|
||||
|
||||
10. Start generating images!
|
||||
```
|
||||
@@ -208,9 +282,9 @@ python scripts\dream.py -l
|
||||
# for the post-release weights
|
||||
python scripts\dream.py
|
||||
```
|
||||
11. Subsequently, to relaunch the script, first activate the Anaconda command window (step 4), run "conda activate ldm" (step 7b), and then launch the dream script (step 10).
|
||||
11. Subsequently, to relaunch the script, first activate the Anaconda command window (step 4), enter the stable-diffusion directory (step 6, "cd \path\to\stable-diffusion"), run "conda activate ldm" (step 7b), and then launch the dream script (step 10).
|
||||
|
||||
### Updating to newer versions of the script
|
||||
#### Updating to newer versions of the script
|
||||
|
||||
This distribution is changing rapidly. If you used the "git clone" method (step 5) to download the stable-diffusion directory, then to update to the latest and greatest version, launch the Anaconda window, enter "stable-diffusion", and type:
|
||||
```
|
||||
@@ -275,7 +349,9 @@ For support,
|
||||
please use this repository's GitHub Issues tracking service. Feel free
|
||||
to send me an email if you use and like the script.
|
||||
|
||||
*Author:* Lincoln D. Stein <lincoln.stein@gmail.com>
|
||||
*Original Author:* Lincoln D. Stein <lincoln.stein@gmail.com>
|
||||
|
||||
*Contributions by:* [Peter Kowalczyk](https://github.com/slix), [Henry Harrison](https://github.com/hwharrison), [xraxra](https://github.com/xraxra), and [bmaltais](https://github.com/bmaltais)
|
||||
|
||||
# Original README from CompViz/stable-diffusion
|
||||
*Stable Diffusion was made possible thanks to a collaboration with [Stability AI](https://stability.ai/) and [Runway](https://runwayml.com/) and builds upon our previous work:*
|
||||
|
||||
@@ -146,8 +146,8 @@ class FrozenCLIPEmbedder(AbstractEncoder):
|
||||
"""Uses the CLIP transformer encoder for text (from Hugging Face)"""
|
||||
def __init__(self, version="openai/clip-vit-large-patch14", device="cuda", max_length=77):
|
||||
super().__init__()
|
||||
self.tokenizer = CLIPTokenizer.from_pretrained(version)
|
||||
self.transformer = CLIPTextModel.from_pretrained(version)
|
||||
self.tokenizer = CLIPTokenizer.from_pretrained(version,local_files_only=True)
|
||||
self.transformer = CLIPTextModel.from_pretrained(version,local_files_only=True)
|
||||
self.device = device
|
||||
self.max_length = max_length
|
||||
self.freeze()
|
||||
|
||||
@@ -11,7 +11,7 @@ t2i = T2I(outdir = <path> // outputs/txt2img-samples
|
||||
batch_size = <integer> // how many images to generate per sampling (1)
|
||||
steps = <integer> // 50
|
||||
seed = <integer> // current system time
|
||||
sampler = ['ddim','plms','klms'] // klms
|
||||
sampler_name= ['ddim','plms','klms'] // klms
|
||||
grid = <boolean> // false
|
||||
width = <integer> // image width, multiple of 64 (512)
|
||||
height = <integer> // image height, multiple of 64 (512)
|
||||
@@ -60,6 +60,7 @@ from torch import autocast
|
||||
from contextlib import contextmanager, nullcontext
|
||||
import time
|
||||
import math
|
||||
import re
|
||||
|
||||
from ldm.util import instantiate_from_config
|
||||
from ldm.models.diffusion.ddim import DDIMSampler
|
||||
@@ -77,7 +78,7 @@ class T2I:
|
||||
batch_size
|
||||
steps
|
||||
seed
|
||||
sampler
|
||||
sampler_name
|
||||
grid
|
||||
individual
|
||||
width
|
||||
@@ -88,6 +89,8 @@ class T2I:
|
||||
downsampling_factor
|
||||
precision
|
||||
strength
|
||||
|
||||
The vast majority of these arguments default to reasonable values.
|
||||
"""
|
||||
def __init__(self,
|
||||
outdir="outputs/txt2img-samples",
|
||||
@@ -102,14 +105,15 @@ class T2I:
|
||||
cfg_scale=7.5,
|
||||
weights="models/ldm/stable-diffusion-v1/model.ckpt",
|
||||
config = "configs/latent-diffusion/txt2img-1p4B-eval.yaml",
|
||||
sampler="klms",
|
||||
sampler_name="klms",
|
||||
latent_channels=4,
|
||||
downsampling_factor=8,
|
||||
ddim_eta=0.0, # deterministic
|
||||
fixed_code=False,
|
||||
precision='autocast',
|
||||
full_precision=False,
|
||||
strength=0.75 # default in scripts/img2img.py
|
||||
strength=0.75, # default in scripts/img2img.py
|
||||
latent_diffusion_weights=False # just to keep track of this parameter when regenerating prompt
|
||||
):
|
||||
self.outdir = outdir
|
||||
self.batch_size = batch_size
|
||||
@@ -119,9 +123,9 @@ class T2I:
|
||||
self.grid = grid
|
||||
self.steps = steps
|
||||
self.cfg_scale = cfg_scale
|
||||
self.weights = weights
|
||||
self.weights = weights
|
||||
self.config = config
|
||||
self.sampler_name = sampler
|
||||
self.sampler_name = sampler_name
|
||||
self.fixed_code = fixed_code
|
||||
self.latent_channels = latent_channels
|
||||
self.downsampling_factor = downsampling_factor
|
||||
@@ -131,6 +135,7 @@ class T2I:
|
||||
self.strength = strength
|
||||
self.model = None # empty for now
|
||||
self.sampler = None
|
||||
self.latent_diffusion_weights=latent_diffusion_weights
|
||||
if seed is None:
|
||||
self.seed = self._new_seed()
|
||||
else:
|
||||
@@ -167,7 +172,6 @@ class T2I:
|
||||
|
||||
# make directories and establish names for the output files
|
||||
os.makedirs(outdir, exist_ok=True)
|
||||
base_count = len(os.listdir(outdir))-1
|
||||
|
||||
start_code = None
|
||||
if self.fixed_code:
|
||||
@@ -181,7 +185,7 @@ class T2I:
|
||||
sampler = self.sampler
|
||||
images = list()
|
||||
seeds = list()
|
||||
|
||||
filename = None
|
||||
tic = time.time()
|
||||
|
||||
with torch.no_grad():
|
||||
@@ -214,10 +218,11 @@ class T2I:
|
||||
if not grid:
|
||||
for x_sample in x_samples_ddim:
|
||||
x_sample = 255. * rearrange(x_sample.cpu().numpy(), 'c h w -> h w c')
|
||||
filename = os.path.join(outdir, f"{base_count:05}.png")
|
||||
filename = self._unique_filename(outdir,previousname=filename,
|
||||
seed=seed,isbatch=(batch_size>1))
|
||||
assert not os.path.exists(filename)
|
||||
Image.fromarray(x_sample.astype(np.uint8)).save(filename)
|
||||
images.append([filename,seed])
|
||||
base_count += 1
|
||||
else:
|
||||
all_samples.append(x_samples_ddim)
|
||||
seeds.append(seed)
|
||||
@@ -279,7 +284,6 @@ class T2I:
|
||||
|
||||
# make directories and establish names for the output files
|
||||
os.makedirs(outdir, exist_ok=True)
|
||||
base_count = len(os.listdir(outdir))-1
|
||||
|
||||
assert os.path.isfile(init_img)
|
||||
init_image = self._load_img(init_img).to(self.device)
|
||||
@@ -300,7 +304,8 @@ class T2I:
|
||||
|
||||
images = list()
|
||||
seeds = list()
|
||||
|
||||
filename = None
|
||||
|
||||
tic = time.time()
|
||||
|
||||
with torch.no_grad():
|
||||
@@ -329,10 +334,10 @@ class T2I:
|
||||
if not grid:
|
||||
for x_sample in x_samples:
|
||||
x_sample = 255. * rearrange(x_sample.cpu().numpy(), 'c h w -> h w c')
|
||||
filename = os.path.join(outdir, f"{base_count:05}.png")
|
||||
filename = self._unique_filename(outdir,filename,seed=seed,isbatch=(batch_size>1))
|
||||
assert not os.path.exists(filename)
|
||||
Image.fromarray(x_sample.astype(np.uint8)).save(filename)
|
||||
images.append([filename,seed])
|
||||
base_count += 1
|
||||
else:
|
||||
all_samples.append(x_samples)
|
||||
seeds.append(seed)
|
||||
@@ -353,7 +358,6 @@ class T2I:
|
||||
|
||||
def _make_grid(self,samples,seeds,batch_size,iterations,outdir):
|
||||
images = list()
|
||||
base_count = len(os.listdir(outdir))-1
|
||||
n_rows = batch_size if batch_size>1 else int(math.sqrt(batch_size * iterations))
|
||||
# save as grid
|
||||
grid = torch.stack(samples, 0)
|
||||
@@ -362,7 +366,7 @@ class T2I:
|
||||
|
||||
# to image
|
||||
grid = 255. * rearrange(grid, 'c h w -> h w c').cpu().numpy()
|
||||
filename = os.path.join(outdir, f"{base_count:05}.png")
|
||||
filename = self._unique_filename(outdir,seed=seeds[0],grid_count=batch_size*iterations)
|
||||
Image.fromarray(grid.astype(np.uint8)).save(filename)
|
||||
for s in seeds:
|
||||
images.append([filename,s])
|
||||
@@ -412,7 +416,7 @@ class T2I:
|
||||
if self.full_precision:
|
||||
print('Using slower but more accurate full-precision math (--full_precision)')
|
||||
else:
|
||||
print('Using half precision math. Call with --full_precision to use full precision')
|
||||
print('Using half precision math. Call with --full_precision to use slower but more accurate full precision.')
|
||||
model.half()
|
||||
return model
|
||||
|
||||
@@ -426,3 +430,40 @@ class T2I:
|
||||
image = image[None].transpose(0, 3, 1, 2)
|
||||
image = torch.from_numpy(image)
|
||||
return 2.*image - 1.
|
||||
|
||||
def _unique_filename(self,outdir,previousname=None,seed=0,isbatch=False,grid_count=None):
|
||||
revision = 1
|
||||
|
||||
if previousname is None:
|
||||
# count up until we find an unfilled slot
|
||||
dir_list = [a.split('.',1)[0] for a in os.listdir(outdir)]
|
||||
uniques = dict.fromkeys(dir_list,True)
|
||||
basecount = 1
|
||||
while f'{basecount:06}' in uniques:
|
||||
basecount += 1
|
||||
if grid_count is not None:
|
||||
grid_label = f'grid#1-{grid_count}'
|
||||
filename = f'{basecount:06}.{seed}.{grid_label}.png'
|
||||
elif isbatch:
|
||||
filename = f'{basecount:06}.{seed}.01.png'
|
||||
else:
|
||||
filename = f'{basecount:06}.{seed}.png'
|
||||
|
||||
return os.path.join(outdir,filename)
|
||||
|
||||
else:
|
||||
previousname = os.path.basename(previousname)
|
||||
x = re.match('^(\d+)\..*\.png',previousname)
|
||||
if not x:
|
||||
return self._unique_filename(outdir,previousname,seed)
|
||||
|
||||
basecount = int(x.groups()[0])
|
||||
series = 0
|
||||
finished = False
|
||||
while not finished:
|
||||
series += 1
|
||||
filename = f'{basecount:06}.{seed}.png'
|
||||
if isbatch or os.path.exists(os.path.join(outdir,filename)):
|
||||
filename = f'{basecount:06}.{seed}.{series:02}.png'
|
||||
finished = not os.path.exists(os.path.join(outdir,filename))
|
||||
return os.path.join(outdir,filename)
|
||||
|
||||
157
scripts/dream.py
157
scripts/dream.py
@@ -4,6 +4,7 @@ import shlex
|
||||
import atexit
|
||||
import os
|
||||
import sys
|
||||
from PIL import Image,PngImagePlugin
|
||||
|
||||
# readline unavailable on windows systems
|
||||
try:
|
||||
@@ -39,7 +40,11 @@ def main():
|
||||
sys.path.append('.')
|
||||
from pytorch_lightning import logging
|
||||
from ldm.simplet2i import T2I
|
||||
|
||||
# these two lines prevent a horrible warning message from appearing
|
||||
# when the frozen CLIP tokenizer is imported
|
||||
import transformers
|
||||
transformers.logging.set_verbosity_error()
|
||||
|
||||
# creating a simple text2image object with a handful of
|
||||
# defaults passed on the command line.
|
||||
# additional parameters will be added (or overriden) during
|
||||
@@ -48,10 +53,12 @@ def main():
|
||||
height=height,
|
||||
batch_size=opt.batch_size,
|
||||
outdir=opt.outdir,
|
||||
sampler=opt.sampler,
|
||||
sampler_name=opt.sampler_name,
|
||||
weights=weights,
|
||||
full_precision=opt.full_precision,
|
||||
config=config)
|
||||
config=config,
|
||||
latent_diffusion_weights=opt.laion400m # this is solely for recreating the prompt
|
||||
)
|
||||
|
||||
# make sure the output directory exists
|
||||
if not os.path.exists(opt.outdir):
|
||||
@@ -63,9 +70,9 @@ def main():
|
||||
# preload the model
|
||||
if not debugging:
|
||||
t2i.load_model()
|
||||
print("\n* Initialization done! Awaiting your command (-h for help, q to quit)...")
|
||||
print("\n* Initialization done! Awaiting your command (-h for help, 'q' to quit, 'cd' to change output dir, 'pwd' to print output dir)...")
|
||||
|
||||
log_path = os.path.join(opt.outdir,"dream_log.txt")
|
||||
log_path = os.path.join(opt.outdir,'..','dream_log.txt')
|
||||
with open(log_path,'a') as log:
|
||||
cmd_parser = create_cmd_parser()
|
||||
main_loop(t2i,cmd_parser,log)
|
||||
@@ -83,10 +90,31 @@ def main_loop(t2i,parser,log):
|
||||
done = True
|
||||
break
|
||||
|
||||
elements = shlex.split(command)
|
||||
if elements[0]=='q': #
|
||||
try:
|
||||
elements = shlex.split(command)
|
||||
except ValueError as e:
|
||||
print(str(e))
|
||||
continue
|
||||
|
||||
if len(elements)==0:
|
||||
continue
|
||||
|
||||
if elements[0]=='q':
|
||||
done = True
|
||||
break
|
||||
|
||||
if elements[0]=='cd' and len(elements)>1:
|
||||
if os.path.exists(elements[1]):
|
||||
print(f"setting image output directory to {elements[1]}")
|
||||
t2i.outdir=elements[1]
|
||||
else:
|
||||
print(f"directory {elements[1]} does not exist")
|
||||
continue
|
||||
|
||||
if elements[0]=='pwd':
|
||||
print(f"current output directory is {t2i.outdir}")
|
||||
continue
|
||||
|
||||
if elements[0].startswith('!dream'): # in case a stored prompt still contains the !dream command
|
||||
elements.pop(0)
|
||||
|
||||
@@ -119,42 +147,83 @@ def main_loop(t2i,parser,log):
|
||||
else:
|
||||
results = t2i.img2img(**vars(opt))
|
||||
print("Outputs:")
|
||||
write_log_message(opt,switches,results,log)
|
||||
write_log_message(t2i,opt,results,log)
|
||||
except KeyboardInterrupt:
|
||||
print('*interrupted*')
|
||||
continue
|
||||
except RuntimeError as e:
|
||||
print(str(e))
|
||||
continue
|
||||
|
||||
|
||||
print("goodbye!")
|
||||
|
||||
|
||||
def write_log_message(opt,switches,results,logfile):
|
||||
''' logs the name of the output image, its prompt and seed to both the terminal and the log file '''
|
||||
if opt.grid:
|
||||
_output_for_grid(switches,results,logfile)
|
||||
else:
|
||||
_output_for_individual(switches,results,logfile)
|
||||
def write_log_message(t2i,opt,results,logfile):
|
||||
''' logs the name of the output image, its prompt and seed to the terminal, log file, and a Dream text chunk in the PNG metadata '''
|
||||
switches = _reconstruct_switches(t2i,opt)
|
||||
prompt_str = ' '.join(switches)
|
||||
|
||||
# when multiple images are produced in batch, then we keep track of where each starts
|
||||
last_seed = None
|
||||
img_num = 1
|
||||
batch_size = opt.batch_size or t2i.batch_size
|
||||
seenit = {}
|
||||
|
||||
seeds = [a[1] for a in results]
|
||||
if batch_size > 1:
|
||||
seeds = f"(seeds for each batch row: {seeds})"
|
||||
else:
|
||||
seeds = f"(seeds for individual images: {seeds})"
|
||||
|
||||
def _output_for_individual(switches,results,logfile):
|
||||
for r in results:
|
||||
log_message = " ".join([' ',str(r[0])+':',
|
||||
f'"{switches[0]}"',
|
||||
*switches[1:],f'-S {r[1]}'])
|
||||
seed = r[1]
|
||||
log_message = (f'{r[0]}: {prompt_str} -S{seed}')
|
||||
|
||||
if batch_size > 1:
|
||||
if seed != last_seed:
|
||||
img_num = 1
|
||||
log_message += f' # (batch image {img_num} of {batch_size})'
|
||||
else:
|
||||
img_num += 1
|
||||
log_message += f' # (batch image {img_num} of {batch_size})'
|
||||
last_seed = seed
|
||||
print(log_message)
|
||||
logfile.write(log_message+"\n")
|
||||
logfile.flush()
|
||||
if r[0] not in seenit:
|
||||
seenit[r[0]] = True
|
||||
try:
|
||||
if opt.grid:
|
||||
_write_prompt_to_png(r[0],f'{prompt_str} -g -S{seed} {seeds}')
|
||||
else:
|
||||
_write_prompt_to_png(r[0],f'{prompt_str} -S{seed}')
|
||||
except FileNotFoundError:
|
||||
print(f"Could not open file '{r[0]}' for reading")
|
||||
|
||||
def _output_for_grid(switches,results,logfile):
|
||||
first_seed = results[0][1]
|
||||
log_message = " ".join([' ',str(results[0][0])+':',
|
||||
f'"{switches[0]}"',
|
||||
*switches[1:],f'-S {results[0][1]}'])
|
||||
print(log_message)
|
||||
logfile.write(log_message+"\n")
|
||||
all_seeds = [row[1] for row in results]
|
||||
log_message = f' seeds for individual rows: {all_seeds}'
|
||||
print(log_message)
|
||||
logfile.write(log_message+"\n")
|
||||
def _reconstruct_switches(t2i,opt):
|
||||
'''Normalize the prompt and switches'''
|
||||
switches = list()
|
||||
switches.append(f'"{opt.prompt}"')
|
||||
switches.append(f'-s{opt.steps or t2i.steps}')
|
||||
switches.append(f'-b{opt.batch_size or t2i.batch_size}')
|
||||
switches.append(f'-W{opt.width or t2i.width}')
|
||||
switches.append(f'-H{opt.height or t2i.height}')
|
||||
switches.append(f'-C{opt.cfg_scale or t2i.cfg_scale}')
|
||||
if opt.init_img:
|
||||
switches.append(f'-I{opt.init_img}')
|
||||
if opt.strength and opt.init_img is not None:
|
||||
switches.append(f'-f{opt.strength or t2i.strength}')
|
||||
if t2i.full_precision:
|
||||
switches.append('-F')
|
||||
return switches
|
||||
|
||||
def _write_prompt_to_png(path,prompt):
|
||||
info = PngImagePlugin.PngInfo()
|
||||
info.add_text("Dream",prompt)
|
||||
im = Image.open(path)
|
||||
im.save(path,"PNG",pnginfo=info)
|
||||
|
||||
def create_argv_parser():
|
||||
parser = argparse.ArgumentParser(description="Parse script's command line args")
|
||||
parser.add_argument("--laion400m",
|
||||
@@ -162,7 +231,7 @@ def create_argv_parser():
|
||||
"-l",
|
||||
dest='laion400m',
|
||||
action='store_true',
|
||||
help="fallback to the latent diffusion (LAION4400M) weights and config")
|
||||
help="fallback to the latent diffusion (laion400m) weights and config")
|
||||
parser.add_argument('-n','--iterations',
|
||||
type=int,
|
||||
default=1,
|
||||
@@ -174,11 +243,12 @@ def create_argv_parser():
|
||||
parser.add_argument('-b','--batch_size',
|
||||
type=int,
|
||||
default=1,
|
||||
help="number of images to produce per iteration (currently not working properly - producing too many images)")
|
||||
parser.add_argument('--sampler',
|
||||
help="number of images to produce per iteration (faster, but doesn't generate individual seeds")
|
||||
parser.add_argument('--sampler','-m',
|
||||
dest="sampler_name",
|
||||
choices=['plms','ddim', 'klms'],
|
||||
default='klms',
|
||||
help="which sampler to use (klms)")
|
||||
help="which sampler to use (klms) - can only be set on command line")
|
||||
parser.add_argument('-o',
|
||||
'--outdir',
|
||||
type=str,
|
||||
@@ -193,7 +263,7 @@ def create_cmd_parser():
|
||||
parser.add_argument('-s','--steps',type=int,help="number of steps")
|
||||
parser.add_argument('-S','--seed',type=int,help="image seed")
|
||||
parser.add_argument('-n','--iterations',type=int,default=1,help="number of samplings to perform")
|
||||
parser.add_argument('-b','--batch_size',type=int,default=1,help="number of images to produce per sampling (currently broken)")
|
||||
parser.add_argument('-b','--batch_size',type=int,default=1,help="number of images to produce per sampling")
|
||||
parser.add_argument('-W','--width',type=int,help="image width, multiple of 64")
|
||||
parser.add_argument('-H','--height',type=int,help="image height, multiple of 64")
|
||||
parser.add_argument('-C','--cfg_scale',default=7.5,type=float,help="prompt configuration scale")
|
||||
@@ -205,7 +275,8 @@ def create_cmd_parser():
|
||||
|
||||
if readline_available:
|
||||
def setup_readline():
|
||||
readline.set_completer(Completer(['--steps','-s','--seed','-S','--iterations','-n','--batch_size','-b',
|
||||
readline.set_completer(Completer(['cd','pwd',
|
||||
'--steps','-s','--seed','-S','--iterations','-n','--batch_size','-b',
|
||||
'--width','-W','--height','-H','--cfg_scale','-C','--grid','-g',
|
||||
'--individual','-i','--init_img','-I','--strength','-f']).complete)
|
||||
readline.set_completer_delims(" ")
|
||||
@@ -227,8 +298,13 @@ if readline_available:
|
||||
return
|
||||
|
||||
def complete(self,text,state):
|
||||
if text.startswith('-I') or text.startswith('--init_img'):
|
||||
return self._image_completions(text,state)
|
||||
buffer = readline.get_line_buffer()
|
||||
|
||||
if text.startswith(('-I','--init_img')):
|
||||
return self._path_completions(text,state,('.png'))
|
||||
|
||||
if buffer.strip().endswith('cd') or text.startswith(('.','/')):
|
||||
return self._path_completions(text,state,())
|
||||
|
||||
response = None
|
||||
if state == 0:
|
||||
@@ -248,12 +324,14 @@ if readline_available:
|
||||
response = None
|
||||
return response
|
||||
|
||||
def _image_completions(self,text,state):
|
||||
def _path_completions(self,text,state,extensions):
|
||||
# get the path so far
|
||||
if text.startswith('-I'):
|
||||
path = text.replace('-I','',1).lstrip()
|
||||
elif text.startswith('--init_img='):
|
||||
path = text.replace('--init_img=','',1).lstrip()
|
||||
else:
|
||||
path = text
|
||||
|
||||
matches = list()
|
||||
|
||||
@@ -270,7 +348,7 @@ if readline_available:
|
||||
if full_path.startswith(path):
|
||||
if os.path.isdir(full_path):
|
||||
matches.append(os.path.join(os.path.dirname(text),n)+'/')
|
||||
elif n.endswith('.png'):
|
||||
elif n.endswith(extensions):
|
||||
matches.append(os.path.join(os.path.dirname(text),n))
|
||||
|
||||
try:
|
||||
@@ -278,7 +356,6 @@ if readline_available:
|
||||
except IndexError:
|
||||
response = None
|
||||
return response
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
|
||||
30
scripts/images2prompt.py
Executable file
30
scripts/images2prompt.py
Executable file
@@ -0,0 +1,30 @@
|
||||
#!/usr/bin/env python3
|
||||
'''This script reads the "Dream" Stable Diffusion prompt embedded in files generated by dream.py'''
|
||||
|
||||
import sys
|
||||
from PIL import Image,PngImagePlugin
|
||||
|
||||
if len(sys.argv) < 2:
|
||||
print("Usage: file2prompt.py <file1.png> <file2.png> <file3.png>...")
|
||||
print("This script opens up the indicated dream.py-generated PNG file(s) and prints out the prompt used to generate them.")
|
||||
exit(-1)
|
||||
|
||||
filenames = sys.argv[1:]
|
||||
for f in filenames:
|
||||
try:
|
||||
im = Image.open(f)
|
||||
try:
|
||||
prompt = im.text['Dream']
|
||||
except KeyError:
|
||||
prompt = ''
|
||||
print(f'{f}: {prompt}')
|
||||
except FileNotFoundError:
|
||||
sys.stderr.write(f'{f} not found\n')
|
||||
continue
|
||||
except PermissionError:
|
||||
sys.stderr.write(f'{f} could not be opened due to inadequate permissions\n')
|
||||
continue
|
||||
|
||||
|
||||
|
||||
|
||||
18
scripts/preload_models.py
Normal file → Executable file
18
scripts/preload_models.py
Normal file → Executable file
@@ -2,6 +2,10 @@
|
||||
# Before running stable-diffusion on an internet-isolated machine,
|
||||
# run this script from one with internet connectivity. The
|
||||
# two machines must share a common .cache directory.
|
||||
import sys
|
||||
import transformers
|
||||
|
||||
transformers.logging.set_verbosity_error()
|
||||
|
||||
# this will preload the Bert tokenizer fles
|
||||
print("preloading bert tokenizer...")
|
||||
@@ -10,7 +14,19 @@ tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
|
||||
print("...success")
|
||||
|
||||
# this will download requirements for Kornia
|
||||
print("preloading Kornia requirements...")
|
||||
print("preloading Kornia requirements (ignore the warnings)...")
|
||||
import kornia
|
||||
print("...success")
|
||||
|
||||
# doesn't work - probably wrong logger
|
||||
# logging.getLogger('transformers.tokenization_utils').setLevel(logging.ERROR)
|
||||
version='openai/clip-vit-large-patch14'
|
||||
|
||||
print('preloading CLIP model (Ignore the warnings)...')
|
||||
sys.stdout.flush()
|
||||
import clip
|
||||
from transformers import CLIPTokenizer, CLIPTextModel
|
||||
tokenizer =CLIPTokenizer.from_pretrained(version)
|
||||
transformer=CLIPTextModel.from_pretrained(version)
|
||||
print('\n\n...success')
|
||||
|
||||
|
||||
Reference in New Issue
Block a user