Mirror of https://github.com/invoke-ai/InvokeAI.git (synced 2026-01-16 00:17:56 -05:00)

Compare commits: release-0. ... release_1. (10 commits)
| Author | SHA1 | Date |
|---|---|---|
|  | 4cb5fc5ed4 |  |
|  | d8926fb8c0 |  |
|  | 80c0e30099 |  |
|  | ac440a1197 |  |
|  | bb46c70ec5 |  |
|  | 2b2ebd19e7 |  |
|  | 74f238d310 |  |
|  | 58f1962671 |  |
|  | 7b8c883b07 |  |
|  | be6ab334c2 |  |
131 README.md
@@ -1,40 +1,29 @@
-# Stable Diffusion
+# Stable Diffusion Dream Script
 
 This is a fork of CompVis/stable-diffusion, the wonderful open source
 text-to-image generator.
 
-The original has been modified in several minor ways:
-
-## Simplified API for text to image generation
-
-There is now a simplified API for text to image generation, which
-lets you create images from a prompt in just three lines of code:
-
-~~~~
-from ldm.simplet2i import T2I
-model = T2I()
-outputs = model.text2image("a unicorn in manhattan")
-~~~~
-
-Outputs is a list of lists in the format [[filename1,seed1],[filename2,seed2]...]
-Please see ldm/simplet2i.py for more information.
+The original has been modified in several ways:
 
 ## Interactive command-line interface similar to the Discord bot
 
-There is now a command-line script, located in scripts/dream.py, which
+The *dream.py* script, located in scripts/dream.py,
 provides an interactive interface to image generation similar to
 the "dream mothership" bot that Stability AI provided on its Discord
-server. The advantage of this is that the lengthy model
-initialization only happens once. After that image generation is
-fast.
+server. Unlike the txt2img.py and img2img.py scripts provided in the
+original CompVis/stable-diffusion source code repository, the
+time-consuming initialization of the AI model
+only happens once. After that, image generation
+from the command-line interface is very fast.
 
 The script uses the readline library to allow for in-line editing,
-command history (up and down arrows) and more.
+command history (up and down arrows), autocompletion, and more.
 
-Note that this has only been tested in the Linux environment!
+Note that this has only been tested in the Linux environment. Testing
+and tweaking for Windows is in progress.
 
 ~~~~
-(ldm) ~/stable-diffusion$ ./scripts/dream.py
+(ldm) ~/stable-diffusion$ python3 ./scripts/dream.py
 * Initializing, be patient...
 Loading model from models/ldm/text2img-large/model.ckpt
 LatentDiffusion: Running in eps-prediction mode
@@ -46,24 +35,83 @@ Loading Bert tokenizer from "models/bert"
 setting sampler to plms
 
 * Initialization done! Awaiting your command...
-dream> ashley judd riding a camel -n2
+dream> ashley judd riding a camel -n2 -s150
 Outputs:
-outputs/txt2img-samples/00009.png: "ashley judd riding a camel" -n2 -S 416354203
-outputs/txt2img-samples/00010.png: "ashley judd riding a camel" -n2 -S 1362479620
+outputs/txt2img-samples/00009.png: "ashley judd riding a camel" -n2 -s150 -S 416354203
+outputs/txt2img-samples/00010.png: "ashley judd riding a camel" -n2 -s150 -S 1362479620
 
-dream> "your prompt here" -n6 -g
-outputs/txt2img-samples/00041.png: "your prompt here" -n6 -g -S 2685670268
+dream> "there's a fly in my soup" -n6 -g
+outputs/txt2img-samples/00041.png: "there's a fly in my soup" -n6 -g -S 2685670268
 seeds for individual rows: [2685670268, 1216708065, 2335773498, 822223658, 714542046, 3395302430]
 ~~~~
 
 Command-line arguments passed to the script allow you to change
 various defaults, and select between the mature stable-diffusion
 weights (512x512) and the older (256x256) latent diffusion weights
-(laion400m). From the dream> prompt, the arguments are (mostly)
+(laion400m). The dream> prompt's arguments are pretty much
 identical to those used in the Discord bot, except you don't need to
-type "!dream". Pass "-h" (or "--help") to list the arguments.
+type "!dream". A significant change is that creation of individual images is the default
+unless --grid (-g) is given. For backward compatibility, the -i switch is recognized.
+For command-line help, type -h (or --help) at the dream> prompt.
+
+The script itself also recognizes a series of command-line switches that will change
+important global defaults, such as the directory for image outputs and the location
+of the model weight files.
+
+## Image-to-Image
+
+This script also provides an img2img feature that lets you seed your
+creations with a drawing or photo. This is a really cool feature that tells
+stable diffusion to build the prompt on top of the image you provide, preserving
+the original's basic shape and layout. To use it, provide the --init_img
+option as shown here:
+
+~~~~
+dream> "waterfall and rainbow" --init_img=./init-images/crude_drawing.png --strength=0.5 -s100 -n4
+~~~~
+
+The --init_img (-I) option gives the path to the seed picture. --strength (-f) controls how much
+the original will be modified, ranging from 0.0 (keep the original intact) to 1.0 (ignore the original
+completely). The default is 0.75, and values from 0.25 to 0.75 give interesting results.
+
+## Installation
+
+For installation, follow the instructions from the original CompVis/stable-diffusion
+README, which is appended to this README for your convenience. A few things to be aware of:
+
+1. You will need the stable-diffusion model weights, which have to be downloaded separately as described
+in the CompVis instructions. They are expected to be released in the latter half of August.
+
+2. If you do not have the weights and want to play with low-quality image generation, then you can use
+the public LAION400m weights, which can be installed like this:
+
+~~~~
+mkdir -p models/ldm/text2img-large/
+wget -O models/ldm/text2img-large/model.ckpt https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt
+~~~~
+
+You will then have to invoke dream.py with the --laion400m (or -l for short) flag:
+
+~~~~
+(ldm) ~/stable-diffusion$ python3 ./scripts/dream.py -l
+~~~~
+
+3. To get around issues that arise when running the stable diffusion model on a machine without internet
+connectivity, I wrote a script that pre-downloads internet dependencies. Whether or not your GPU machine
+has connectivity, you will need to run this preloading script before the first run of dream.py. See
+"Workaround for machines with limited internet connectivity" below for the walkthrough.
+
+## Simplified API for text to image generation
+
+For programmers who wish to incorporate stable-diffusion into other
+products, this repository includes a simplified API for text to image generation, which
+lets you create images from a prompt in just three lines of code:
+
+~~~~
+from ldm.simplet2i import T2I
+model = T2I()
+outputs = model.text2image("a unicorn in manhattan")
+~~~~
+
+Outputs is a list of lists in the format [[filename1,seed1],[filename2,seed2]...]
+Please see ldm/simplet2i.py for more information.
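That [[filename, seed], ...] return shape can be consumed directly. A minimal sketch of doing so; the filenames and seeds below are fabricated for illustration, not real outputs:

```python
# Sketch: iterating the [[filename, seed], ...] list that text2image()
# is documented to return. The entries here are made up for illustration.
outputs = [
    ["outputs/img-samples/00009.png", 416354203],
    ["outputs/img-samples/00010.png", 1362479620],
]

for filename, seed in outputs:
    # Keeping the seed alongside the file lets you reproduce an image later.
    print(f"{filename} (seed {seed})")
```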
+
+For command-line help, type -h (or --help) at the dream> prompt.
 
 ## Workaround for machines with limited internet connectivity
 
@@ -100,14 +148,9 @@ time, copy over the file ldm/modules/encoders/modules.py from the
 CompVis/stable-diffusion repository. Or you can run preload_models.py
 on the target machine.
 
-## Minor fixes
+## Support
 
-I added the requirement for torchmetrics to environment.yaml.
-
-## Installation and support
-
-Follow the directions from the original README, which starts below, to
-configure the environment and install requirements. For support,
+For support,
 please use this repository's GitHub Issues tracking service. Feel free
 to send me an email if you use and like the script.
 
@@ -116,14 +159,16 @@ to send me an email if you use and like the script.
 # Original README from CompVis/stable-diffusion
 *Stable Diffusion was made possible thanks to a collaboration with [Stability AI](https://stability.ai/) and [Runway](https://runwayml.com/) and builds upon our previous work:*
 
-[**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752)<br/>
+[**High-Resolution Image Synthesis with Latent Diffusion Models**](https://ommer-lab.com/research/latent-diffusion-models/)<br/>
 [Robin Rombach](https://github.com/rromb)\*,
 [Andreas Blattmann](https://github.com/ablattmann)\*,
 [Dominik Lorenz](https://github.com/qp-qp)\,
 [Patrick Esser](https://github.com/pesser),
 [Björn Ommer](https://hci.iwr.uni-heidelberg.de/Staff/bommer)<br/>
 
-which is available on [GitHub](https://github.com/CompVis/latent-diffusion).
+**CVPR '22 Oral**
+
+which is available on [GitHub](https://github.com/CompVis/latent-diffusion). PDF at [arXiv](https://arxiv.org/abs/2112.10752). Please also visit our [Project page](https://ommer-lab.com/research/latent-diffusion-models/).
 
 
 [Stable Diffusion](#stable-diffusion-v1) is a latent text-to-image diffusion
 
@@ -197,7 +197,7 @@ class T2I:
         shape = [self.latent_channels, height // self.downsampling_factor, width // self.downsampling_factor]
         samples_ddim, _ = sampler.sample(S=steps,
                                          conditioning=c,
-                                         batch_size_size=batch_size,
+                                         batch_size=batch_size,
                                          shape=shape,
                                          verbose=False,
                                          unconditional_guidance_scale=cfg_scale,
191 scripts/dream.py
@@ -1,12 +1,17 @@
 #!/usr/bin/env python
 
-import readline
 import argparse
 import shlex
 import atexit
 import os
 
-debugging = False
+# readline unavailable on windows systems
+try:
+    import readline
+    readline_available = True
+except:
+    readline_available = False
+
+debugging = True
 
 def main():
     ''' Initialize command-line parsers and the diffusion model '''
@@ -26,7 +31,8 @@ def main():
     weights = "models/ldm/stable-diffusion-v1/model.ckpt"
 
     # command line history will be stored in a file called "~/.dream_history"
-    setup_readline()
+    if readline_available:
+        setup_readline()
 
     print("* Initializing, be patient...\n")
     from pytorch_lightning import logging
@@ -54,7 +60,7 @@ def main():
     # preload the model
     if not debugging:
         t2i.load_model()
-    print("\n* Initialization done! Awaiting your command (-h for help)...")
+    print("\n* Initialization done! Awaiting your command (-h for help, q to quit)...")
 
     log_path = os.path.join(opt.outdir,"dream_log.txt")
     with open(log_path,'a') as log:
@@ -62,17 +68,26 @@ def main():
         main_loop(t2i,cmd_parser,log)
         log.close()
 
 
 def main_loop(t2i,parser,log):
     ''' prompt/read/execute loop '''
-    while True:
+    done = False
+
+    while not done:
         try:
             command = input("dream> ")
         except EOFError:
            print("goodbye!")
+           done = True
            break
 
-        # rearrange the arguments to mimic how it works in the Dream bot.
        elements = shlex.split(command)
+       if elements[0]=='q':
+           done = True
+           break
        if elements[0].startswith('!dream'): # in case a stored prompt still contains the !dream command
            elements.pop(0)
 
+       # rearrange the arguments to mimic how it works in the Dream bot.
        switches = ['']
        switches_started = False
 
@@ -95,12 +110,19 @@ def main_loop(t2i,parser,log):
            print("Try again with a prompt!")
            continue
 
-       if opt.init_img is None:
-           results = t2i.txt2img(**vars(opt))
-       else:
-           results = t2i.img2img(**vars(opt))
-       print("Outputs:")
-       write_log_message(opt,switches,results,log)
+       try:
+           if opt.init_img is None:
+               results = t2i.txt2img(**vars(opt))
+           else:
+               results = t2i.img2img(**vars(opt))
+           print("Outputs:")
+           write_log_message(opt,switches,results,log)
+       except KeyboardInterrupt:
+           print('*interrupted*')
+           continue
 
    print("goodbye!")
 
 
 def write_log_message(opt,switches,results,logfile):
    ''' logs the name of the output image, its prompt and seed to both the terminal and the log file '''
@@ -153,7 +175,7 @@ def create_argv_parser():
    parser.add_argument('-o',
                        '--outdir',
                        type=str,
-                       default="outputs/txt2img-samples",
+                       default="outputs/img-samples",
                        help="directory in which to place generated images and a log of prompts and seeds")
    return parser
 
@@ -174,80 +196,81 @@ def create_cmd_parser():
    parser.add_argument('-f','--strength',default=0.75,type=float,help="strength for noising/unnoising. 0.0 preserves image exactly, 1.0 replaces it completely")
    return parser
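For readers following the dream> parsing flow above, here is a self-contained sketch of how a prompt line becomes an argparse namespace. Only the -f/--strength definition is taken verbatim from the diff; the other options are inferred from the completer's switch list, and their types and defaults here are illustrative assumptions:

```python
import argparse
import shlex

# Sketch of a dream>-style command parser. Only -f/--strength mirrors the
# diff above; -n, -s, and -g are inferred from the completer's switch list
# and their defaults are assumptions for illustration.
parser = argparse.ArgumentParser()
parser.add_argument('prompt')
parser.add_argument('-n', '--iterations', type=int, default=1)
parser.add_argument('-s', '--steps', type=int, default=50)
parser.add_argument('-g', '--grid', action='store_true')
parser.add_argument('-f', '--strength', default=0.75, type=float,
                    help="strength for noising/unnoising. 0.0 preserves image exactly, 1.0 replaces it completely")

# Mimic main_loop(): split the line shell-style, then parse the pieces.
opt = parser.parse_args(shlex.split('"there\'s a fly in my soup" -n6 -g'))
print(opt.prompt)      # there's a fly in my soup
print(opt.iterations)  # 6
print(opt.grid)        # True
```

shlex.split is what lets a quoted, apostrophe-containing prompt survive as a single positional argument, exactly as in the session transcript earlier.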
 
-def setup_readline():
-    readline.set_completer(Completer(['--steps','-s','--seed','-S','--iterations','-n','--batch_size','-b',
-                                      '--width','-W','--height','-H','--cfg_scale','-C','--grid','-g',
-                                      '--individual','-i','--init_img','-I','--strength','-f']).complete)
-    readline.set_completer_delims(" ")
-    readline.parse_and_bind('tab: complete')
-    load_history()
-
-def load_history():
-    histfile = os.path.join(os.path.expanduser('~'),".dream_history")
-    try:
-        readline.read_history_file(histfile)
-        readline.set_history_length(1000)
-    except FileNotFoundError:
-        pass
-    atexit.register(readline.write_history_file,histfile)
-
-class Completer():
-    def __init__(self,options):
-        self.options = sorted(options)
-        return
-
-    def complete(self,text,state):
-        if text.startswith('-I') or text.startswith('--init_img'):
-            return self._image_completions(text,state)
-
-        response = None
-        if state == 0:
-            # This is the first time for this text, so build a match list.
-            if text:
-                self.matches = [s
-                                for s in self.options
-                                if s and s.startswith(text)]
-            else:
-                self.matches = self.options[:]
-
-        # Return the state'th item from the match list,
-        # if we have that many.
-        try:
-            response = self.matches[state]
-        except IndexError:
-            response = None
-        return response
-
-    def _image_completions(self,text,state):
-        # get the path so far
-        if text.startswith('-I'):
-            path = text.replace('-I','',1).lstrip()
-        elif text.startswith('--init_img='):
-            path = text.replace('--init_img=','',1).lstrip()
-
-        matches = list()
-
-        path = os.path.expanduser(path)
-        if len(path)==0:
-            matches.append(text+'./')
-        else:
-            dir = os.path.dirname(path)
-            dir_list = os.listdir(dir)
-            for n in dir_list:
-                if n.startswith('.') and len(n)>1:
-                    continue
-                full_path = os.path.join(dir,n)
-                if full_path.startswith(path):
-                    if os.path.isdir(full_path):
-                        matches.append(os.path.join(os.path.dirname(text),n)+'/')
-                    elif n.endswith('.png'):
-                        matches.append(os.path.join(os.path.dirname(text),n))
-
-        try:
-            response = matches[state]
-        except IndexError:
-            response = None
-        return response
+if readline_available:
+    def setup_readline():
+        readline.set_completer(Completer(['--steps','-s','--seed','-S','--iterations','-n','--batch_size','-b',
+                                          '--width','-W','--height','-H','--cfg_scale','-C','--grid','-g',
+                                          '--individual','-i','--init_img','-I','--strength','-f']).complete)
+        readline.set_completer_delims(" ")
+        readline.parse_and_bind('tab: complete')
+        load_history()
+
+    def load_history():
+        histfile = os.path.join(os.path.expanduser('~'),".dream_history")
+        try:
+            readline.read_history_file(histfile)
+            readline.set_history_length(1000)
+        except FileNotFoundError:
+            pass
+        atexit.register(readline.write_history_file,histfile)
+
+    class Completer():
+        def __init__(self,options):
+            self.options = sorted(options)
+            return
+
+        def complete(self,text,state):
+            if text.startswith('-I') or text.startswith('--init_img'):
+                return self._image_completions(text,state)
+
+            response = None
+            if state == 0:
+                # This is the first time for this text, so build a match list.
+                if text:
+                    self.matches = [s
+                                    for s in self.options
+                                    if s and s.startswith(text)]
+                else:
+                    self.matches = self.options[:]
+
+            # Return the state'th item from the match list,
+            # if we have that many.
+            try:
+                response = self.matches[state]
+            except IndexError:
+                response = None
+            return response
+
+        def _image_completions(self,text,state):
+            # get the path so far
+            if text.startswith('-I'):
+                path = text.replace('-I','',1).lstrip()
+            elif text.startswith('--init_img='):
+                path = text.replace('--init_img=','',1).lstrip()
+
+            matches = list()
+
+            path = os.path.expanduser(path)
+            if len(path)==0:
+                matches.append(text+'./')
+            else:
+                dir = os.path.dirname(path)
+                dir_list = os.listdir(dir)
+                for n in dir_list:
+                    if n.startswith('.') and len(n)>1:
+                        continue
+                    full_path = os.path.join(dir,n)
+                    if full_path.startswith(path):
+                        if os.path.isdir(full_path):
+                            matches.append(os.path.join(os.path.dirname(text),n)+'/')
+                        elif n.endswith('.png'):
+                            matches.append(os.path.join(os.path.dirname(text),n))
+
+            try:
+                response = matches[state]
+            except IndexError:
+                response = None
+            return response
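The complete() method follows readline's stateful completer protocol: a call with state 0 builds the match list, and successive calls walk it one match at a time until None signals exhaustion. A standalone re-creation of just the switch-matching path, runnable without readline:

```python
# Minimal re-creation of the switch-completion logic above, runnable
# without readline. state 0 builds the match list; later states walk it.
class Completer():
    def __init__(self, options):
        self.options = sorted(options)

    def complete(self, text, state):
        if state == 0:
            if text:
                self.matches = [s for s in self.options if s.startswith(text)]
            else:
                self.matches = self.options[:]
        try:
            return self.matches[state]
        except IndexError:
            return None

c = Completer(['--steps', '--seed', '--strength', '-s', '-S'])
# readline calls complete() repeatedly, bumping state until it gets None:
print(c.complete('--s', 0))  # --seed
print(c.complete('--s', 1))  # --steps
print(c.complete('--s', 2))  # --strength
print(c.complete('--s', 3))  # None
```

This is why the real class caches self.matches on state 0: readline re-invokes the same completer function once per candidate rather than asking for the whole list at once.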
 
 
 if __name__ == "__main__":