Compare commits

9 Commits

Author SHA1 Message Date
Lincoln Stein
ddf0ef3af1 updated README for image metadata storage 2022-08-22 00:22:12 -04:00
Lincoln Stein
aa2729d868 user's prompt is now normalized for reproducibility and written into the destination PNG file as a tEXt metadata chunk named "Dream". You can retrieve the prompt with an image editing program that supports browsing the full metadata, or with the images2prompt.py script located in 'scripts' 2022-08-22 00:12:16 -04:00
Lincoln Stein
5f352aec87 test of normalization of prompt 2022-08-21 22:48:40 -04:00
Lincoln Stein
c4c4974b39 Update README.md
Fixed formatting in changelog.
2022-08-21 21:48:02 -04:00
Lincoln Stein
194f43f00b Update README.md
Add acknowledges for those who sent pull requests.
2022-08-21 21:46:00 -04:00
Lincoln Stein
325bc5280e Updated README.md
Fix the path for where to install the LIAON-400m model.
2022-08-21 20:48:44 -04:00
Lincoln Stein
11cc8e545b Clarified the required Python version (3.8.5) 2022-08-21 20:30:21 -04:00
Lincoln Stein
9adac56f4e Fixed incorrect conda env update command 2022-08-21 20:27:25 -04:00
Lincoln Stein
5d5307dcb4 Update README.md 2022-08-21 20:20:22 -04:00
4 changed files with 136 additions and 50 deletions

@@ -17,7 +17,11 @@ initialization only happens once. After that image generation
from the command-line interface is very fast.
The script uses the readline library to allow for in-line editing,
command history (up and down arrows), autocompletion, and more.
command history (up and down arrows), autocompletion, and more. To help
keep track of which prompts generated which images, the script writes a
log file of image names and prompts to the selected output directory.
In addition, as of version 1.02, it also writes the prompt into the PNG
file's metadata, where it can be retrieved using scripts/images2prompt.py.
The script is confirmed to work on Linux and Windows systems. It should
work on MacOSX as well, but this is not confirmed. Note that this script
@@ -38,12 +42,19 @@ setting sampler to plms
* Initialization done! Awaiting your command...
dream> ashley judd riding a camel -n2 -s150
Outputs:
outputs/txt2img-samples/00009.png: "ashley judd riding a camel" -n2 -s150 -S 416354203
outputs/txt2img-samples/00010.png: "ashley judd riding a camel" -n2 -s150-S 1362479620
outputs/img-samples/00009.png: "ashley judd riding a camel" -n2 -s150 -S 416354203
outputs/img-samples/00010.png: "ashley judd riding a camel" -n2 -s150 -S 1362479620
dream> "there's a fly in my soup" -n6 -g
outputs/txt2img-samples/00041.png: "there's a fly in my soup" -n6 -g -S 2685670268
outputs/img-samples/00011.png: "there's a fly in my soup" -n6 -g -S 2685670268
seeds for individual rows: [2685670268, 1216708065, 2335773498, 822223658, 714542046, 3395302430]
dream> q
# this shows how to retrieve the prompt stored in the saved image's metadata
(ldm) ~/stable-diffusion$ python3 ./scripts/images2prompt.py outputs/img-samples/*.png
00009.png: "ashley judd riding a camel" -s150 -S 416354203
00010.png: "ashley judd riding a camel" -s150 -S 1362479620
00011.png: "there's a fly in my soup" -n6 -g -S 2685670268
~~~~
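
Because the stored string uses the same switch syntax as the dream> prompt, it can be split back into options for regenerating an image. The following is a minimal sketch, not the parser used by dream.py: the function name `parse_dream_string` and the long option `--grid` are assumptions made for illustration, and only a few of the switches are covered.

```python
import argparse
import shlex


def parse_dream_string(dream: str) -> argparse.Namespace:
    """Split a stored "Dream" string into a prompt and its switches.

    Hypothetical helper: a simplified stand-in for the real command parser.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("prompt")
    parser.add_argument("-s", "--steps", type=int)
    parser.add_argument("-S", "--seed", type=int)
    parser.add_argument("-n", "--iterations", type=int, default=1)
    parser.add_argument("-g", "--grid", action="store_true")  # long name assumed
    # shlex.split honours the double quotes around the prompt, so an
    # apostrophe inside the prompt does not break the parse.
    return parser.parse_args(shlex.split(dream))


opts = parse_dream_string('"there\'s a fly in my soup" -n6 -g -S 2685670268')
print(opts.prompt, opts.iterations, opts.grid, opts.seed)
```

Both `-S 2685670268` and `-S2685670268` are accepted by argparse, so the sketch handles either spacing.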
The dream> prompt's arguments are pretty much
@@ -75,10 +86,17 @@ completely). The default is 0.75, and ranges from 0.25-0.75 give interesting res
## Changes
- v1.01 (21 August 2022)
* added k_lms sampling **Please run "conda update -f environment.yaml" to load the k_lms dependencies**
* use half precision arithmetic by default, resulting in faster execution and lower memory requirements
Pass argument --full_precision to dream.py to get slower but more accurate image generation
* v1.02 (21 August 2022)
* A copy of the prompt and all of its switches and options is now stored in the corresponding
image in a tEXt metadata field named "Dream". You can read the prompt using scripts/images2prompt.py,
or an image editor that allows you to explore the full metadata.
**Please run "conda env update -f environment.yaml" to load the k_lms dependencies!!**
* v1.01 (21 August 2022)
* added k_lms sampling.
**Please run "conda env update -f environment.yaml" to load the k_lms dependencies!!**
* use half precision arithmetic by default, resulting in faster execution and lower memory requirements
Pass argument --full_precision to dream.py to get slower but more accurate image generation
## Installation
@@ -87,7 +105,7 @@ Pass argument --full_precision to dream.py to get slower but more accurate image
1. You will need to install the following prerequisites if they are not already available. Use your
operating system's preferred installer
* Python (version 3.8 or higher)
* Python (version 3.8.5 recommended; higher may work)
* git
2. Install the Python Anaconda environment manager using pip3.
@@ -124,7 +142,7 @@ After these steps, your command prompt will be prefixed by "(ldm)" as shown abov
7. Now you need to install the weights for the stable diffusion model.
For testing prior to the release of the real weights, you can use an older weight file that produces low-quality images. Create a directory within stable-diffusion named "models/ldm/text2img.large", and use the wget URL downloader tool to copy the weight file into it:
For testing prior to the release of the real weights, you can use an older weight file that produces low-quality images. Create a directory within stable-diffusion named "models/ldm/text2img-large", and use the wget URL downloader tool to copy the weight file into it:
```
(ldm) ~/stable-diffusion$ mkdir -p models/ldm/text2img-large
(ldm) ~/stable-diffusion$ wget -O models/ldm/text2img-large/model.ckpt https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt
@@ -160,7 +178,8 @@ This will bring your local copy into sync with the remote one.
### Windows
1. Install the most recent Python from here: https://www.python.org/downloads/windows/
1. Install Python version 3.8.5 from here: https://www.python.org/downloads/windows/
(note that several users have reported that later versions do not work properly)
2. Install Anaconda3 (miniconda3 version) from here: https://docs.anaconda.com/anaconda/install/windows/
@@ -194,11 +213,11 @@ This installs two machine learning models that stable diffusion requires.
9. Now you need to install the weights for the big stable diffusion model.
For testing prior to the release of the real weights, create a directory within stable-diffusion named "models\ldm\text2img.large".
For testing prior to the release of the real weights, create a directory within stable-diffusion named "models\ldm\text2img-large".
For testing with the released weights, create a directory within stable-diffusion named "models\ldm\stable-diffusion-v1".
Then use a web browser to copy model.ckpt into the appropriate directory. For the text2img.large (pre-release) model, the weights are at https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt. Check back here later for the release URL.
Then use a web browser to copy model.ckpt into the appropriate directory. For the text2img-large (pre-release) model, the weights are at https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt. Check back here later for the release URL.
10. Start generating images!
```
@@ -275,7 +294,9 @@ For support,
please use this repository's GitHub Issues tracking service. Feel free
to send me an email if you use and like the script.
*Author:* Lincoln D. Stein <lincoln.stein@gmail.com>
*Original Author:* Lincoln D. Stein <lincoln.stein@gmail.com>
*Contributions by:* [Peter Kowalczyk](https://github.com/slix), [Henry Harrison](https://github.com/hwharrison), [xraxra](https://github.com/xraxra), and [bmaltais](https://github.com/bmaltais)
# Original README from CompVis/stable-diffusion
*Stable Diffusion was made possible thanks to a collaboration with [Stability AI](https://stability.ai/) and [Runway](https://runwayml.com/) and builds upon our previous work:*

@@ -11,7 +11,7 @@ t2i = T2I(outdir = <path> // outputs/txt2img-samples
batch_size = <integer> // how many images to generate per sampling (1)
steps = <integer> // 50
seed = <integer> // current system time
sampler = ['ddim','plms','klms'] // klms
sampler_name= ['ddim','plms','klms'] // klms
grid = <boolean> // false
width = <integer> // image width, multiple of 64 (512)
height = <integer> // image height, multiple of 64 (512)
@@ -77,7 +77,7 @@ class T2I:
batch_size
steps
seed
sampler
sampler_name
grid
individual
width
@@ -88,6 +88,8 @@ class T2I:
downsampling_factor
precision
strength
The vast majority of these arguments default to reasonable values.
"""
def __init__(self,
outdir="outputs/txt2img-samples",
@@ -102,14 +104,15 @@ class T2I:
cfg_scale=7.5,
weights="models/ldm/stable-diffusion-v1/model.ckpt",
config = "configs/latent-diffusion/txt2img-1p4B-eval.yaml",
sampler="klms",
sampler_name="klms",
latent_channels=4,
downsampling_factor=8,
ddim_eta=0.0, # deterministic
fixed_code=False,
precision='autocast',
full_precision=False,
strength=0.75 # default in scripts/img2img.py
strength=0.75, # default in scripts/img2img.py
latent_diffusion_weights=False # just to keep track of this parameter when regenerating prompt
):
self.outdir = outdir
self.batch_size = batch_size
@@ -119,9 +122,9 @@ class T2I:
self.grid = grid
self.steps = steps
self.cfg_scale = cfg_scale
self.weights = weights
self.weights = weights
self.config = config
self.sampler_name = sampler
self.sampler_name = sampler_name
self.fixed_code = fixed_code
self.latent_channels = latent_channels
self.downsampling_factor = downsampling_factor
@@ -131,6 +134,7 @@ class T2I:
self.strength = strength
self.model = None # empty for now
self.sampler = None
self.latent_diffusion_weights=latent_diffusion_weights
if seed is None:
self.seed = self._new_seed()
else:
@@ -412,7 +416,7 @@ class T2I:
if self.full_precision:
print('Using slower but more accurate full-precision math (--full_precision)')
else:
print('Using half precision math. Call with --full_precision to use full precision')
print('Using half precision math. Call with --full_precision to use slower but more accurate full precision.')
model.half()
return model

@@ -4,6 +4,7 @@ import shlex
import atexit
import os
import sys
from PIL import Image,PngImagePlugin
# readline unavailable on windows systems
try:
@@ -48,10 +49,12 @@ def main():
height=height,
batch_size=opt.batch_size,
outdir=opt.outdir,
sampler=opt.sampler,
sampler_name=opt.sampler_name,
weights=weights,
full_precision=opt.full_precision,
config=config)
config=config,
latent_diffusion_weights=opt.laion400m # this is solely for recreating the prompt
)
# make sure the output directory exists
if not os.path.exists(opt.outdir):
@@ -119,7 +122,7 @@ def main_loop(t2i,parser,log):
else:
results = t2i.img2img(**vars(opt))
print("Outputs:")
write_log_message(opt,switches,results,log)
write_log_message(t2i,opt,results,log)
except KeyboardInterrupt:
print('*interrupted*')
continue
@@ -127,34 +130,62 @@ def main_loop(t2i,parser,log):
print("goodbye!")
def write_log_message(opt,switches,results,logfile):
''' logs the name of the output image, its prompt and seed to both the terminal and the log file '''
if opt.grid:
_output_for_grid(switches,results,logfile)
else:
_output_for_individual(switches,results,logfile)
def write_log_message(t2i,opt,results,logfile):
''' logs the name of the output image, its prompt and seed to the terminal, log file, and a Dream text chunk in the PNG metadata '''
switches = _reconstruct_switches(t2i,opt)
prompt_str = ' '.join(switches)
def _output_for_individual(switches,results,logfile):
# when multiple images are produced in batch, then we keep track of where each starts
last_seed = None
img_num = 1
batch_size = opt.batch_size or t2i.batch_size
seenit = {}
for r in results:
log_message = " ".join([' ',str(r[0])+':',
f'"{switches[0]}"',
*switches[1:],f'-S {r[1]}'])
seed = r[1]
log_message = (f'{r[0]}: {prompt_str} -S{seed}')
if batch_size > 1:
if seed != last_seed:
img_num = 1
log_message += f' # (batch image {img_num} of {batch_size})'
else:
img_num += 1
log_message += f' # (batch image {img_num} of {batch_size})'
last_seed = seed
print(log_message)
logfile.write(log_message+"\n")
logfile.flush()
if r[0] not in seenit:
seenit[r[0]] = True
try:
_write_prompt_to_png(r[0],f'{prompt_str} -S{seed}')
except FileNotFoundError:
print(f"Could not open file '{r[0]}' for reading")
def _output_for_grid(switches,results,logfile):
first_seed = results[0][1]
log_message = " ".join([' ',str(results[0][0])+':',
f'"{switches[0]}"',
*switches[1:],f'-S {results[0][1]}'])
print(log_message)
logfile.write(log_message+"\n")
all_seeds = [row[1] for row in results]
log_message = f' seeds for individual rows: {all_seeds}'
print(log_message)
logfile.write(log_message+"\n")
def _reconstruct_switches(t2i,opt):
'''Normalize the prompt and switches'''
switches = list()
switches.append(f'"{opt.prompt}"')
switches.append(f'-s{opt.steps or t2i.steps}')
switches.append(f'-b{opt.batch_size or t2i.batch_size}')
switches.append(f'-W{opt.width or t2i.width}')
switches.append(f'-H{opt.height or t2i.height}')
switches.append(f'-C{opt.cfg_scale or t2i.cfg_scale}')
if opt.init_img:
switches.append(f'-I{opt.init_img}')
if opt.strength and opt.init_img is not None:
switches.append(f'-f{opt.strength or t2i.strength}')
if t2i.full_precision:
switches.append('-F')
return switches
def _write_prompt_to_png(path,prompt):
info = PngImagePlugin.PngInfo()
info.add_text("Dream",prompt)
im = Image.open(path)
im.save(path,"PNG",pnginfo=info)
def create_argv_parser():
parser = argparse.ArgumentParser(description="Parse script's command line args")
parser.add_argument("--laion400m",
@@ -162,7 +193,7 @@ def create_argv_parser():
"-l",
dest='laion400m',
action='store_true',
help="fallback to the latent diffusion (LAION4400M) weights and config")
help="fallback to the latent diffusion (laion400m) weights and config")
parser.add_argument('-n','--iterations',
type=int,
default=1,
@@ -174,11 +205,12 @@ def create_argv_parser():
parser.add_argument('-b','--batch_size',
type=int,
default=1,
help="number of images to produce per iteration (currently not working properly - producing too many images)")
parser.add_argument('--sampler',
help="number of images to produce per iteration (faster, but doesn't generate individual seeds)")
parser.add_argument('--sampler','-m',
dest="sampler_name",
choices=['plms','ddim', 'klms'],
default='klms',
help="which sampler to use (klms)")
help="which sampler to use (klms) - can only be set on command line")
parser.add_argument('-o',
'--outdir',
type=str,
@@ -193,7 +225,7 @@ def create_cmd_parser():
parser.add_argument('-s','--steps',type=int,help="number of steps")
parser.add_argument('-S','--seed',type=int,help="image seed")
parser.add_argument('-n','--iterations',type=int,default=1,help="number of samplings to perform")
parser.add_argument('-b','--batch_size',type=int,default=1,help="number of images to produce per sampling (currently broken)")
parser.add_argument('-b','--batch_size',type=int,default=1,help="number of images to produce per sampling")
parser.add_argument('-W','--width',type=int,help="image width, multiple of 64")
parser.add_argument('-H','--height',type=int,help="image height, multiple of 64")
parser.add_argument('-C','--cfg_scale',default=7.5,type=float,help="prompt configuration scale")
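
The normalization performed by `_reconstruct_switches` can be illustrated in isolation. This is a simplified sketch under stated assumptions, not the code from the diff: `pick` and `normalize_switches` are hypothetical names, only four switches are covered, and the fallback tests `is None` rather than using `or`, so that an explicit 0 would not be replaced by the default. The default values (50 steps, 512x512, cfg_scale 7.5) are the ones shown in the class documentation above.

```python
from types import SimpleNamespace


def pick(value, default):
    """Return value unless it is None, so 0 and 0.0 count as explicit choices."""
    return default if value is None else value


def normalize_switches(opt, defaults):
    """Build one canonical string from per-command options and object defaults."""
    switches = [f'"{opt.prompt}"']
    switches.append(f"-s{pick(opt.steps, defaults.steps)}")
    switches.append(f"-W{pick(opt.width, defaults.width)}")
    switches.append(f"-H{pick(opt.height, defaults.height)}")
    switches.append(f"-C{pick(opt.cfg_scale, defaults.cfg_scale)}")
    return " ".join(switches)


# Stand-ins for the T2I object and the parsed command line (both hypothetical).
defaults = SimpleNamespace(steps=50, width=512, height=512, cfg_scale=7.5)
opt = SimpleNamespace(prompt="a fly in my soup", steps=150,
                      width=None, height=None, cfg_scale=None)

print(normalize_switches(opt, defaults))
# "a fly in my soup" -s150 -W512 -H512 -C7.5
```

Writing every switch explicitly, including those left at their defaults, is what makes the stored string sufficient to reproduce the image later even if the defaults change.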

scripts/images2prompt.py (new file, 29 lines)

@@ -0,0 +1,29 @@
#!/usr/bin/env python3
'''This script reads the "Dream" Stable Diffusion prompt embedded in files generated by dream.py'''
import sys
from PIL import Image,PngImagePlugin
if len(sys.argv) < 2:
print("Usage: images2prompt.py <file1.png> <file2.png> <file3.png>...")
exit(-1)
filenames = sys.argv[1:]
for f in filenames:
try:
im = Image.open(f)
try:
prompt = im.text['Dream']
except KeyError:
prompt = ''
print(f'{f}: {prompt}')
except FileNotFoundError:
sys.stderr.write(f'{f} not found\n')
continue
except PermissionError:
sys.stderr.write(f'{f} could not be opened due to inadequate permissions\n')
continue
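
The script above relies on Pillow to expose the tEXt chunk through `im.text`. To show what that chunk looks like on disk, here is a standard-library-only sketch that builds a minimal 1x1 grayscale PNG carrying a "Dream" tEXt chunk and reads it back. The helper names (`make_chunk`, `read_text_chunks`, `demo_png`) are hypothetical and the reader is deliberately minimal: it does not verify CRCs or handle the zTXt and iTXt chunk types.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def read_text_chunks(png_bytes: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in the PNG data."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    texts = {}
    pos = 8
    while pos + 8 <= len(png_bytes):
        length, chunk_type = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            # tEXt layout: keyword, a single NUL byte, then Latin-1 text.
            keyword, _, value = data.partition(b"\x00")
            texts[keyword.decode("latin-1")] = value.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
        if chunk_type == b"IEND":
            break
    return texts


def demo_png(prompt: str) -> bytes:
    """Build a 1x1 8-bit grayscale PNG with the prompt in a "Dream" chunk."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    # tEXt is Latin-1 only; other characters would need an iTXt chunk.
    text = b"Dream\x00" + prompt.encode("latin-1")
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr) + make_chunk(b"tEXt", text)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


stored = '"ashley judd riding a camel" -s150 -S 416354203'
print(read_text_chunks(demo_png(stored)).get("Dream", ""))
```

Like the script, the sketch falls back to an empty string when no "Dream" chunk is present.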