Compare commits

...

10 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Lincoln Stein | 4cb5fc5ed4 | changed default output directory to outputs/img-samples because the same directory is now used for both txt2img and img2img | 2022-08-18 23:23:44 -04:00 |
| Lincoln Stein | d8926fb8c0 | indentation error prevented filenames from printing | 2022-08-18 23:15:03 -04:00 |
| Lincoln Stein | 80c0e30099 | intercept keyboard interrupt during processing and return to prompt; remove "!dream" from beginning of prompt; user can quit by typing `<q>` | 2022-08-18 23:03:22 -04:00 |
| Lincoln Stein | ac440a1197 | disable readline functionality on windows | 2022-08-18 16:00:44 -04:00 |
| Lincoln Stein | bb46c70ec5 | Added more info to README.md | 2022-08-18 14:54:19 -04:00 |
| Lincoln Stein | 2b2ebd19e7 | fixed a typo that introduced a crash | 2022-08-18 13:47:07 -04:00 |
| Lincoln Stein | 74f238d310 | Added info on img2img functionality. | 2022-08-18 13:35:54 -04:00 |
| Lincoln Stein | 58f1962671 | Merge branch 'CompVis:main' into main | 2022-08-18 13:32:45 -04:00 |
| owenvincent | 7b8c883b07 | Update README.md | 2022-08-18 15:46:44 +02:00 |
| owenvincent | be6ab334c2 | update links in README.md | 2022-08-18 13:49:59 +02:00 |
3 changed files with 196 additions and 128 deletions

README.md

@@ -1,40 +1,29 @@
-# Stable Diffusion
+# Stable Diffusion Dream Script
 This is a fork of CompVis/stable-diffusion, the wonderful open source
 text-to-image generator.
-The original has been modified in several minor ways:
-## Simplified API for text to image generation
-There is now a simplified API for text to image generation, which
-lets you create images from a prompt in just three lines of code:
-~~~~
-from ldm.simplet2i import T2I
-model = T2I()
-outputs = model.text2image("a unicorn in manhattan")
-~~~~
-Outputs is a list of lists in the format [[filename1,seed1],[filename2,seed2]...]
-Please see ldm/simplet2i.py for more information.
+The original has been modified in several ways:
 ## Interactive command-line interface similar to the Discord bot
-There is now a command-line script, located in scripts/dream.py, which
+The *dream.py* script, located in scripts/dream.py,
 provides an interactive interface to image generation similar to
 the "dream mothership" bot that Stability AI provided on its Discord
-server. The advantage of this is that the lengthy model
-initialization only happens once. After that image generation is
-fast.
+server. Unlike the txt2img.py and img2img.py scripts provided in the
+original CompVis/stable-diffusion source code repository, the
+time-consuming initialization of the AI model only happens
+once. After that, image generation
+from the command-line interface is very fast.
 The script uses the readline library to allow for in-line editing,
-command history (up and down arrows) and more.
+command history (up and down arrows), autocompletion, and more.
-Note that this has only been tested in the Linux environment!
+Note that this has only been tested in the Linux environment. Testing
+and tweaking for Windows is in progress.
 ~~~~
-(ldm) ~/stable-diffusion$ ./scripts/dream.py
+(ldm) ~/stable-diffusion$ python3 ./scripts/dream.py
 * Initializing, be patient...
 Loading model from models/ldm/text2img-large/model.ckpt
 LatentDiffusion: Running in eps-prediction mode
@@ -46,24 +35,83 @@ Loading Bert tokenizer from "models/bert"
 setting sampler to plms
 * Initialization done! Awaiting your command...
-dream> ashley judd riding a camel -n2
+dream> ashley judd riding a camel -n2 -s150
 Outputs:
-outputs/txt2img-samples/00009.png: "ashley judd riding a camel" -n2 -S 416354203
-outputs/txt2img-samples/00010.png: "ashley judd riding a camel" -n2 -S 1362479620
+outputs/txt2img-samples/00009.png: "ashley judd riding a camel" -n2 -s150 -S 416354203
+outputs/txt2img-samples/00010.png: "ashley judd riding a camel" -n2 -s150 -S 1362479620
-dream> "your prompt here" -n6 -g
-outputs/txt2img-samples/00041.png: "your prompt here" -n6 -g -S 2685670268
+dream> "there's a fly in my soup" -n6 -g
+outputs/txt2img-samples/00041.png: "there's a fly in my soup" -n6 -g -S 2685670268
+seeds for individual rows: [2685670268, 1216708065, 2335773498, 822223658, 714542046, 3395302430]
 ~~~~
-Command-line arguments passed to the script allow you to change
-various defaults, and select between the mature stable-diffusion
-weights (512x512) and the older (256x256) latent diffusion weights
-(laion400m). From the dream> prompt, the arguments are (mostly)
+The dream> prompt's arguments are pretty much
 identical to those used in the Discord bot, except you don't need to
-type "!dream". Pass "-h" (or "--help") to list the arguments.
-For command-line help, type -h (or --help) at the dream> prompt.
+type "!dream". A significant change is that creation of individual images is the default
+unless --grid (-g) is given. For backward compatibility, the -i switch is recognized.
+For command-line help, type -h (or --help) at the dream> prompt.
+The script itself also recognizes a series of command-line switches that will change
+important global defaults, such as the directory for image outputs and the location
+of the model weight files.
+## Image-to-Image
+This script also provides an img2img feature that lets you seed your
+creations with a drawing or photo. This is a really cool feature that tells
+stable diffusion to build the prompt on top of the image you provide, preserving
+the original's basic shape and layout. To use it, provide the --init_img
+option as shown here:
+~~~~
+dream> "waterfall and rainbow" --init_img=./init-images/crude_drawing.png --strength=0.5 -s100 -n4
+~~~~
+The --init_img (-I) option gives the path to the seed picture. --strength (-f) controls how much
+the original will be modified, ranging from 0.0 (keep the original intact) to 1.0 (ignore the original
+completely). The default is 0.75, and values from 0.25-0.75 give interesting results.
+## Installation
+For installation, follow the instructions from the original CompVis/stable-diffusion
+README, which is appended to this README for your convenience. A few things to be aware of:
+1. You will need the stable-diffusion model weights, which have to be downloaded separately as described
+in the CompVis instructions. They are expected to be released in the latter half of August.
+2. If you do not have the weights and want to play with low-quality image generation, then you can use
+the public LAION400m weights, which can be installed like this:
+~~~~
+mkdir -p models/ldm/text2img-large/
+wget -O models/ldm/text2img-large/model.ckpt https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt
+~~~~
+You will then have to invoke dream.py with the --laion400m (or -l for short) flag:
+~~~~
+(ldm) ~/stable-diffusion$ python3 ./scripts/dream.py -l
+~~~~
+3. To get around issues that arise when running the stable diffusion model on a machine without internet
+connectivity, I wrote a script that pre-downloads internet dependencies. Whether or not your GPU machine
+has connectivity, you will need to run this preloading script before the first run of dream.py. See
+"Workaround for machines with limited internet connectivity" below for the walkthrough.
+## Simplified API for text to image generation
+For programmers who wish to incorporate stable-diffusion into other
+products, this repository includes a simplified API for text to image generation, which
+lets you create images from a prompt in just three lines of code:
+~~~~
+from ldm.simplet2i import T2I
+model = T2I()
+outputs = model.text2image("a unicorn in manhattan")
+~~~~
+Outputs is a list of lists in the format [[filename1,seed1],[filename2,seed2]...]
+Please see ldm/simplet2i.py for more information.
 ## Workaround for machines with limited internet connectivity
@@ -100,14 +148,9 @@ time, copy over the file ldm/modules/encoders/modules.py from the
 CompVis/stable-diffusion repository. Or you can run preload_models.py
 on the target machine.
-## Minor fixes
-I added the requirement for torchmetrics to environment.yaml.
-## Installation and support
-Follow the directions from the original README, which starts below, to
-configure the environment and install requirements. For support,
+## Support
+For support,
 please use this repository's GitHub Issues tracking service. Feel free
 to send me an email if you use and like the script.
@@ -116,14 +159,16 @@ to send me an email if you use and like the script.
 # Original README from CompVis/stable-diffusion
 *Stable Diffusion was made possible thanks to a collaboration with [Stability AI](https://stability.ai/) and [Runway](https://runwayml.com/) and builds upon our previous work:*
-[**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752)<br/>
+[**High-Resolution Image Synthesis with Latent Diffusion Models**](https://ommer-lab.com/research/latent-diffusion-models/)<br/>
 [Robin Rombach](https://github.com/rromb)\*,
 [Andreas Blattmann](https://github.com/ablattmann)\*,
 [Dominik Lorenz](https://github.com/qp-qp)\,
 [Patrick Esser](https://github.com/pesser),
 [Björn Ommer](https://hci.iwr.uni-heidelberg.de/Staff/bommer)<br/>
-which is available on [GitHub](https://github.com/CompVis/latent-diffusion).
+**CVPR '22 Oral**
+which is available on [GitHub](https://github.com/CompVis/latent-diffusion). PDF at [arXiv](https://arxiv.org/abs/2112.10752). Please also visit our [Project page](https://ommer-lab.com/research/latent-diffusion-models/).
 ![txt2img-stable2](assets/stable-samples/txt2img/merged-0006.png)
 [Stable Diffusion](#stable-diffusion-v1) is a latent text-to-image diffusion
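As an aside for readers of the new "Simplified API" section: a minimal sketch of consuming the documented return value, assuming only the [[filename1,seed1],[filename2,seed2]...] format the README describes:

~~~~
# Sketch only: relies on the return format documented in the README above.
from ldm.simplet2i import T2I

model = T2I()
outputs = model.text2image("a unicorn in manhattan")
for filename, seed in outputs:
    # each entry pairs a saved image path with the seed that produced it
    print(f"{filename} (seed {seed})")
~~~~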

ldm/simplet2i.py

@@ -197,7 +197,7 @@ class T2I:
     shape = [self.latent_channels, height // self.downsampling_factor, width // self.downsampling_factor]
     samples_ddim, _ = sampler.sample(S=steps,
                                      conditioning=c,
-                                     batch_size_size=batch_size,
+                                     batch_size=batch_size,
                                      shape=shape,
                                      verbose=False,
                                      unconditional_guidance_scale=cfg_scale,
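This one-character fix is the "fixed a typo that introduced a crash" commit from the list above: Python rejects an unknown keyword argument at call time. A self-contained illustration with a hypothetical stand-in function (not project code):

~~~~
# Hypothetical stand-in for sampler.sample(), showing only the failure mode.
def sample(S, conditioning, batch_size, shape, verbose, unconditional_guidance_scale):
    return None

try:
    # misspelled keyword, as in the removed line above
    sample(S=50, conditioning=None, batch_size_size=1, shape=None,
           verbose=False, unconditional_guidance_scale=7.5)
except TypeError as e:
    print(e)  # sample() got an unexpected keyword argument 'batch_size_size'
~~~~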

scripts/dream.py

@@ -1,12 +1,17 @@
 #!/usr/bin/env python
-import readline
 import argparse
 import shlex
 import atexit
 import os
-debugging = False
+# readline unavailable on windows systems
+try:
+    import readline
+    readline_available = True
+except:
+    readline_available = False
+debugging = True

 def main():
     ''' Initialize command-line parsers and the diffusion model '''
@@ -26,7 +31,8 @@ def main():
     weights = "models/ldm/stable-diffusion-v1/model.ckpt"
     # command line history will be stored in a file called "~/.dream_history"
-    setup_readline()
+    if readline_available:
+        setup_readline()
     print("* Initializing, be patient...\n")
     from pytorch_lightning import logging
@@ -54,7 +60,7 @@ def main():
     # preload the model
     if not debugging:
         t2i.load_model()
-    print("\n* Initialization done! Awaiting your command (-h for help)...")
+    print("\n* Initialization done! Awaiting your command (-h for help, q to quit)...")
     log_path = os.path.join(opt.outdir,"dream_log.txt")
     with open(log_path,'a') as log:
@@ -62,17 +68,26 @@ def main():
         main_loop(t2i,cmd_parser,log)
         log.close()

 def main_loop(t2i,parser,log):
     ''' prompt/read/execute loop '''
-    while True:
+    done = False
+    while not done:
         try:
             command = input("dream> ")
         except EOFError:
-            print("goodbye!")
+            done = True
             break
-        # rearrange the arguments to mimic how it works in the Dream bot.
         elements = shlex.split(command)
+        if elements[0]=='q':
+            done = True
+            break
+        if elements[0].startswith('!dream'): # in case a stored prompt still contains the !dream command
+            elements.pop(0)
+        # rearrange the arguments to mimic how it works in the Dream bot.
         switches = ['']
         switches_started = False
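A note on the tokenizing step in this hunk: shlex.split() honors shell-style quoting, which is what keeps a quoted prompt together as elements[0] while the switches split apart. Standard-library behavior, illustrated:

~~~~
import shlex

# the quoted prompt survives as a single element; switches follow
elements = shlex.split('"there\'s a fly in my soup" -n6 -g')
print(elements)  # ["there's a fly in my soup", '-n6', '-g']
~~~~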
@@ -95,12 +110,19 @@ def main_loop(t2i,parser,log):
print("Try again with a prompt!")
continue
if opt.init_img is None:
results = t2i.txt2img(**vars(opt))
else:
results = t2i.img2img(**vars(opt))
print("Outputs:")
write_log_message(opt,switches,results,log)
try:
if opt.init_img is None:
results = t2i.txt2img(**vars(opt))
else:
results = t2i.img2img(**vars(opt))
print("Outputs:")
write_log_message(opt,switches,results,log)
except KeyboardInterrupt:
print('*interrupted*')
continue
print("goodbye!")
def write_log_message(opt,switches,results,logfile):
''' logs the name of the output image, its prompt and seed to both the terminal and the log file '''
@@ -153,7 +175,7 @@ def create_argv_parser():
     parser.add_argument('-o',
                         '--outdir',
                         type=str,
-                        default="outputs/txt2img-samples",
+                        default="outputs/img-samples",
                         help="directory in which to place generated images and a log of prompts and seeds")
     return parser
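With this change, txt2img and img2img results both land in outputs/img-samples by default, matching the first commit message in the list above. The -o/--outdir switch shown here still redirects output; for example (the directory name is illustrative):

~~~~
(ldm) ~/stable-diffusion$ python3 ./scripts/dream.py -o outputs/experiments
~~~~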
@@ -174,80 +196,81 @@ def create_cmd_parser():
     parser.add_argument('-f','--strength',default=0.75,type=float,help="strength for noising/unnoising. 0.0 preserves image exactly, 1.0 replaces it completely")
     return parser

-def setup_readline():
-    readline.set_completer(Completer(['--steps','-s','--seed','-S','--iterations','-n','--batch_size','-b',
-                                      '--width','-W','--height','-H','--cfg_scale','-C','--grid','-g',
-                                      '--individual','-i','--init_img','-I','--strength','-f']).complete)
-    readline.set_completer_delims(" ")
-    readline.parse_and_bind('tab: complete')
-    load_history()
-
-def load_history():
-    histfile = os.path.join(os.path.expanduser('~'),".dream_history")
-    try:
-        readline.read_history_file(histfile)
-        readline.set_history_length(1000)
-    except FileNotFoundError:
-        pass
-    atexit.register(readline.write_history_file,histfile)
-
-class Completer():
-    def __init__(self,options):
-        self.options = sorted(options)
-        return
-
-    def complete(self,text,state):
-        if text.startswith('-I') or text.startswith('--init_img'):
-            return self._image_completions(text,state)
-        response = None
-        if state == 0:
-            # This is the first time for this text, so build a match list.
-            if text:
-                self.matches = [s
-                                for s in self.options
-                                if s and s.startswith(text)]
-            else:
-                self.matches = self.options[:]
-        # Return the state'th item from the match list,
-        # if we have that many.
-        try:
-            response = self.matches[state]
-        except IndexError:
-            response = None
-        return response
-
-    def _image_completions(self,text,state):
-        # get the path so far
-        if text.startswith('-I'):
-            path = text.replace('-I','',1).lstrip()
-        elif text.startswith('--init_img='):
-            path = text.replace('--init_img=','',1).lstrip()
-        matches = list()
-        path = os.path.expanduser(path)
-        if len(path)==0:
-            matches.append(text+'./')
-        else:
-            dir = os.path.dirname(path)
-            dir_list = os.listdir(dir)
-            for n in dir_list:
-                if n.startswith('.') and len(n)>1:
-                    continue
-                full_path = os.path.join(dir,n)
-                if full_path.startswith(path):
-                    if os.path.isdir(full_path):
-                        matches.append(os.path.join(os.path.dirname(text),n)+'/')
-                    elif n.endswith('.png'):
-                        matches.append(os.path.join(os.path.dirname(text),n))
-        try:
-            response = matches[state]
-        except IndexError:
-            response = None
-        return response
+if readline_available:
+    def setup_readline():
+        readline.set_completer(Completer(['--steps','-s','--seed','-S','--iterations','-n','--batch_size','-b',
+                                          '--width','-W','--height','-H','--cfg_scale','-C','--grid','-g',
+                                          '--individual','-i','--init_img','-I','--strength','-f']).complete)
+        readline.set_completer_delims(" ")
+        readline.parse_and_bind('tab: complete')
+        load_history()
+
+    def load_history():
+        histfile = os.path.join(os.path.expanduser('~'),".dream_history")
+        try:
+            readline.read_history_file(histfile)
+            readline.set_history_length(1000)
+        except FileNotFoundError:
+            pass
+        atexit.register(readline.write_history_file,histfile)
+
+    class Completer():
+        def __init__(self,options):
+            self.options = sorted(options)
+            return
+
+        def complete(self,text,state):
+            if text.startswith('-I') or text.startswith('--init_img'):
+                return self._image_completions(text,state)
+            response = None
+            if state == 0:
+                # This is the first time for this text, so build a match list.
+                if text:
+                    self.matches = [s
+                                    for s in self.options
+                                    if s and s.startswith(text)]
+                else:
+                    self.matches = self.options[:]
+            # Return the state'th item from the match list,
+            # if we have that many.
+            try:
+                response = self.matches[state]
+            except IndexError:
+                response = None
+            return response
+
+        def _image_completions(self,text,state):
+            # get the path so far
+            if text.startswith('-I'):
+                path = text.replace('-I','',1).lstrip()
+            elif text.startswith('--init_img='):
+                path = text.replace('--init_img=','',1).lstrip()
+            matches = list()
+            path = os.path.expanduser(path)
+            if len(path)==0:
+                matches.append(text+'./')
+            else:
+                dir = os.path.dirname(path)
+                dir_list = os.listdir(dir)
+                for n in dir_list:
+                    if n.startswith('.') and len(n)>1:
+                        continue
+                    full_path = os.path.join(dir,n)
+                    if full_path.startswith(path):
+                        if os.path.isdir(full_path):
+                            matches.append(os.path.join(os.path.dirname(text),n)+'/')
+                        elif n.endswith('.png'):
+                            matches.append(os.path.join(os.path.dirname(text),n))
+            try:
+                response = matches[state]
+            except IndexError:
+                response = None
+            return response
if __name__ == "__main__":
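The completion logic above can be sanity-checked outside of a readline session by calling complete() directly with increasing state values, the way readline itself does. A sketch, assuming Completer is imported or pasted from scripts/dream.py:

~~~~
# readline calls complete(text, state) with state = 0, 1, 2, ... until it returns None
c = Completer(['--steps', '-s', '--seed', '-S'])
print(c.complete('--s', 0))  # '--seed'  (options are stored sorted)
print(c.complete('--s', 1))  # '--steps'
print(c.complete('--s', 2))  # None -> no more matches
~~~~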