updated README for image metadata storage

The user's prompt is now normalized for reproducibility and written into the destination PNG file as a tEXt metadata chunk named "Dream". You can retrieve the prompt with an image editing program that supports browsing the full metadata, or with the images2prompt.py script located in 'scripts'.
test of normalization of prompt
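
To make the "tEXt metadata chunk" concrete, here is a minimal sketch of how such a chunk sits inside a PNG file and how it can be read back with only the Python standard library. This is not how dream.py or images2prompt.py work (the diff below shows that they use Pillow); the helper names `make_png_with_text` and `read_text_chunks` are hypothetical, and the 1x1 image is demo data only.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def make_png_with_text(keyword: str, text: str) -> bytes:
    """Build a 1x1 grayscale PNG carrying a single tEXt chunk (demo data only)."""
    # IHDR: width, height, bit depth, color type, compression, filter, interlace
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    # One scanline: filter-type byte followed by a single black pixel
    idat = zlib.compress(b"\x00\x00")
    # tEXt payload: Latin-1 keyword, NUL separator, Latin-1 text
    text_data = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"tEXt", text_data)
        + _chunk(b"IDAT", idat)
        + _chunk(b"IEND", b"")
    )


def read_text_chunks(png_bytes: bytes) -> dict:
    """Return a {keyword: text} dict for every tEXt chunk in the PNG data."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    found = {}
    pos = 8
    while pos + 8 <= len(png_bytes):
        length, kind = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if kind == b"tEXt":
            key, _, value = data.partition(b"\x00")
            found[key.decode("latin-1")] = value.decode("latin-1")
        if kind == b"IEND":
            break
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return found


if __name__ == "__main__":
    demo = make_png_with_text("Dream", '"there\'s a fly in my soup" -s50 -S2685670268')
    print(read_text_chunks(demo)["Dream"])
```

For a real image, read the file in binary mode and pass the bytes to `read_text_chunks`. The sketch does not verify CRCs and ignores the compressed `zTXt` and international `iTXt` chunk types.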
2026-01-17 03:48:01 -05:00 · 2022-08-22 00:22:12 -04:00 · 2022-08-22 00:12:16 -04:00 · 2022-08-21 22:48:40 -04:00 · 2022-08-21 21:48:02 -04:00 · 2022-08-21 21:46:00 -04:00
4 changed files with 136 additions and 50 deletions
--- a/README.md
+++ b/README.md
@@ -17,7 +17,11 @@ initialization only happens once. After that image generation
 from the command-line interface is very fast.

 The script uses the readline library to allow for in-line editing,
-command history (up and down arrows), autocompletion, and more.
+command history (up and down arrows), autocompletion, and more. To help
+keep track of which prompts generated which images, the script writes a
+log file of image names and prompts to the selected output directory.
+In addition, as of version 1.02, it also writes the prompt into the PNG
+file's metadata, where it can be retrieved using scripts/images2prompt.py.

 The script is confirmed to work on Linux and Windows systems. It should
 work on MacOSX as well, but this is not confirmed. Note that this script
@@ -38,12 +42,19 @@ setting sampler to plms
 * Initialization done! Awaiting your command...
 dream> ashley judd riding a camel -n2 -s150
 Outputs:
-   outputs/txt2img-samples/00009.png: "ashley judd riding a camel" -n2 -s150 -S 416354203
-   outputs/txt2img-samples/00010.png: "ashley judd riding a camel" -n2 -s150-S 1362479620
+   outputs/img-samples/00009.png: "ashley judd riding a camel" -n2 -s150 -S 416354203
+   outputs/img-samples/00010.png: "ashley judd riding a camel" -n2 -s150 -S 1362479620

 dream> "there's a fly in my soup" -n6 -g
-    outputs/txt2img-samples/00041.png: "there's a fly in my soup" -n6 -g -S 2685670268
+    outputs/img-samples/00011.png: "there's a fly in my soup" -n6 -g -S 2685670268
    seeds for individual rows: [2685670268, 1216708065, 2335773498, 822223658, 714542046, 3395302430]
+dream> q
+
+# this shows how to retrieve the prompt stored in the saved image's metadata
+(ldm) ~/stable-diffusion$ python3 ./scripts/images2prompt.py outputs/img-samples/*.png
+00009.png: "ashley judd riding a camel" -s150 -S 416354203
+00010.png: "ashley judd riding a camel" -s150 -S 1362479620
+00011.png: "there's a fly in my soup" -n6 -g -S 2685670268
 ~~~~

 The dream> prompt's arguments are pretty much
@@ -75,10 +86,17 @@ completely). The default is 0.75, and ranges from 0.25-0.75 give interesting res

 ## Changes

- v1.01 (21 August 2022)
-* added k_lms sampling **Please run "conda update -f environment.yaml" to load the k_lms dependencies**
-* use half precision arithmetic by default, resulting in faster execution and lower memory requirements
-Pass argument --full_precision to dream.py to get slower but more accurate image generation
+* v1.02 (21 August 2022)
+    * A copy of the prompt and all of its switches and options is now stored in the corresponding
+    image in a tEXt metadata field named "Dream". You can read the prompt using scripts/images2prompt.py,
+    or an image editor that allows you to explore the full metadata.
+        **Please run "conda env update -f environment.yaml" to load the k_lms dependencies!!**
+
+* v1.01 (21 August 2022)
+    * added k_lms sampling. 
+      **Please run "conda env update -f environment.yaml" to load the k_lms dependencies!!**
+    * use half precision arithmetic by default, resulting in faster execution and lower memory requirements
+    Pass argument --full_precision to dream.py to get slower but more accurate image generation


 ## Installation
@@ -87,7 +105,7 @@ Pass argument --full_precision to dream.py to get slower but more accurate image

 1. You will need to install the following prerequisites if they are not already available. Use your
 operating system's preferred installer
-* Python (version 3.8 or higher)
+* Python (version 3.8.5 recommended; higher may work)
 * git

 2. Install the Python Anaconda environment manager using pip3.
@@ -124,7 +142,7 @@ After these steps, your command prompt will be prefixed by "(ldm)" as shown abov

 7. Now you need to install the weights for the stable diffusion model.

-For testing prior to the release of the real weights, you can use an older weight file that produces low-quality images. Create a directory within stable-diffusion named "models/ldm/text2img.large", and use the wget URL downloader tool to copy the weight file into it:
+For testing prior to the release of the real weights, you can use an older weight file that produces low-quality images. Create a directory within stable-diffusion named "models/ldm/text2img-large", and use the wget URL downloader tool to copy the weight file into it:
 ```
 (ldm) ~/stable-diffusion$ mkdir -p models/ldm/text2img-large
 (ldm) ~/stable-diffusion$ wget -O models/ldm/text2img-large/model.ckpt https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt
@@ -160,7 +178,8 @@ This will bring your local copy into sync with the remote one.

 ### Windows

-1. Install the most recent Python from here: https://www.python.org/downloads/windows/
+1. Install Python version 3.8.5 from here: https://www.python.org/downloads/windows/
+   (note that several users have reported that later versions do not work properly)

 2. Install Anaconda3 (miniconda3 version) from here: https://docs.anaconda.com/anaconda/install/windows/

@@ -194,11 +213,11 @@ This installs two machine learning models that stable diffusion requires.

 9. Now you need to install the weights for the big stable diffusion model.

-For testing prior to the release of the real weights, create a directory within stable-diffusion named "models\ldm\text2img.large".
+For testing prior to the release of the real weights, create a directory within stable-diffusion named "models\ldm\text2img-large".

 For testing with the released weights, create a directory within stable-diffusion named "models\ldm\stable-diffusion-v1".

-Then use a web browser to copy model.ckpt into the appropriate directory. For the text2img.large (pre-release) model, the weights are at https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt. Check back here later for the release URL.
+Then use a web browser to copy model.ckpt into the appropriate directory. For the text2img-large (pre-release) model, the weights are at https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt. Check back here later for the release URL.

 10. Start generating images!
 ```
@@ -275,7 +294,9 @@ For support,
 please use this repository's GitHub Issues tracking service. Feel free
 to send me an email if you use and like the script.

-*Author:* Lincoln D. Stein <lincoln.stein@gmail.com>
+*Original Author:* Lincoln D. Stein <lincoln.stein@gmail.com>
+
+*Contributions by:* [Peter Kowalczyk](https://github.com/slix), [Henry Harrison](https://github.com/hwharrison), [xraxra](https://github.com/xraxra), and [bmaltais](https://github.com/bmaltais)

 # Original README from CompVis/stable-diffusion
 *Stable Diffusion was made possible thanks to a collaboration with [Stability AI](https://stability.ai/) and [Runway](https://runwayml.com/) and builds upon our previous work:*
--- a/ldm/simplet2i.py
+++ b/ldm/simplet2i.py
@@ -11,7 +11,7 @@ t2i = T2I(outdir      = <path>        // outputs/txt2img-samples
          batch_size       = <integer>     // how many images to generate per sampling (1)
          steps       = <integer>     // 50
          seed        = <integer>     // current system time
-          sampler     = ['ddim','plms','klms']  // klms
+          sampler_name= ['ddim','plms','klms']  // klms
          grid        = <boolean>     // false
          width       = <integer>     // image width, multiple of 64 (512)
          height      = <integer>     // image height, multiple of 64 (512)
@@ -77,7 +77,7 @@ class T2I:
    batch_size
    steps
    seed
-    sampler
+    sampler_name
    grid
    individual
    width
@@ -88,6 +88,8 @@ class T2I:
    downsampling_factor
    precision
    strength
+
+The vast majority of these arguments default to reasonable values.
 """
    def __init__(self,
                 outdir="outputs/txt2img-samples",
@@ -102,14 +104,15 @@ class T2I:
                 cfg_scale=7.5,
                 weights="models/ldm/stable-diffusion-v1/model.ckpt",
                 config = "configs/latent-diffusion/txt2img-1p4B-eval.yaml",
-                 sampler="klms",
+                 sampler_name="klms",
                 latent_channels=4,
                 downsampling_factor=8,
                 ddim_eta=0.0,  # deterministic
                 fixed_code=False,
                 precision='autocast',
                 full_precision=False,
-                 strength=0.75 # default in scripts/img2img.py
+                 strength=0.75, # default in scripts/img2img.py
+                 latent_diffusion_weights=False  # just to keep track of this parameter when regenerating prompt
    ):
        self.outdir     = outdir
        self.batch_size      = batch_size
@@ -119,9 +122,9 @@ class T2I:
        self.grid       = grid
        self.steps      = steps
        self.cfg_scale  = cfg_scale
-        self.weights   = weights
+        self.weights    = weights
        self.config     = config
-        self.sampler_name  = sampler
+        self.sampler_name  = sampler_name
        self.fixed_code    = fixed_code
        self.latent_channels     = latent_channels
        self.downsampling_factor = downsampling_factor
@@ -131,6 +134,7 @@ class T2I:
        self.strength            = strength
        self.model      = None     # empty for now
        self.sampler    = None
+        self.latent_diffusion_weights=latent_diffusion_weights
        if seed is None:
            self.seed = self._new_seed()
        else:
@@ -412,7 +416,7 @@ class T2I:
        if self.full_precision:
            print('Using slower but more accurate full-precision math (--full_precision)')
        else:
-            print('Using half precision math. Call with --full_precision to use full precision')
+            print('Using half precision math. Call with --full_precision to use slower but more accurate full precision.')
            model.half()
        return model

--- a/scripts/dream.py
+++ b/scripts/dream.py
@@ -4,6 +4,7 @@ import shlex
 import atexit
 import os
 import sys
+from PIL import Image,PngImagePlugin

 # readline unavailable on windows systems
 try:
@@ -48,10 +49,12 @@ def main():
              height=height,
              batch_size=opt.batch_size,
              outdir=opt.outdir,
-              sampler=opt.sampler,
+              sampler_name=opt.sampler_name,
              weights=weights,
              full_precision=opt.full_precision,
-              config=config)
+              config=config,
+              latent_diffusion_weights=opt.laion400m # this is solely for recreating the prompt
+    )

    # make sure the output directory exists
    if not os.path.exists(opt.outdir):
@@ -119,7 +122,7 @@ def main_loop(t2i,parser,log):
            else:
                results = t2i.img2img(**vars(opt))
            print("Outputs:")
-            write_log_message(opt,switches,results,log)
+            write_log_message(t2i,opt,results,log)
        except KeyboardInterrupt:
            print('*interrupted*')
            continue
@@ -127,34 +130,62 @@ def main_loop(t2i,parser,log):
    print("goodbye!")


-def write_log_message(opt,switches,results,logfile):
-    ''' logs the name of the output image, its prompt and seed to both the terminal and the log file '''
-    if opt.grid:
-        _output_for_grid(switches,results,logfile)
-    else:
-        _output_for_individual(switches,results,logfile)
+def write_log_message(t2i,opt,results,logfile):
+    ''' logs the name of the output image, its prompt and seed to the terminal, log file, and a Dream text chunk in the PNG metadata '''
+    switches = _reconstruct_switches(t2i,opt)
+    prompt_str = ' '.join(switches)

-def _output_for_individual(switches,results,logfile):
+    # when multiple images are produced in batch, then we keep track of where each starts
+    last_seed  = None
+    img_num    = 1
+    batch_size = opt.batch_size or t2i.batch_size
+    seenit     = {}
+    
    for r in results:
-        log_message = " ".join(['   ',str(r[0])+':',
-                                f'"{switches[0]}"',
-                                *switches[1:],f'-S {r[1]}'])
+        seed = r[1]
+        log_message = (f'{r[0]}: {prompt_str} -S{seed}')
+
+        if batch_size > 1:
+            if seed != last_seed:
+                img_num = 1
+                log_message += f' # (batch image {img_num} of {batch_size})'
+            else:
+                img_num += 1
+                log_message += f' # (batch image {img_num} of {batch_size})'
+            last_seed = seed
        print(log_message)
        logfile.write(log_message+"\n")
        logfile.flush()
+        if r[0] not in seenit:
+            seenit[r[0]] = True
+            try:
+                _write_prompt_to_png(r[0],f'{prompt_str} -S{seed}')
+            except FileNotFoundError:
+                print(f"Could not open file '{r[0]}' for reading")

-def _output_for_grid(switches,results,logfile):
-    first_seed = results[0][1]
-    log_message = " ".join(['   ',str(results[0][0])+':',
-                            f'"{switches[0]}"',
-                            *switches[1:],f'-S {results[0][1]}'])
-    print(log_message)
-    logfile.write(log_message+"\n")
-    all_seeds   = [row[1] for row in results]
-    log_message = f'    seeds for individual rows: {all_seeds}'
-    print(log_message)
-    logfile.write(log_message+"\n")
+def _reconstruct_switches(t2i,opt):
+    '''Normalize the prompt and switches'''
+    switches = list()
+    switches.append(f'"{opt.prompt}"')
+    switches.append(f'-s{opt.steps        or t2i.steps}')
+    switches.append(f'-b{opt.batch_size   or t2i.batch_size}')
+    switches.append(f'-W{opt.width        or t2i.width}')
+    switches.append(f'-H{opt.height       or t2i.height}')
+    switches.append(f'-C{opt.cfg_scale    or t2i.cfg_scale}')
+    if opt.init_img:
+        switches.append(f'-I{opt.init_img}')
+    if opt.strength and opt.init_img is not None:
+        switches.append(f'-f{opt.strength or t2i.strength}')
+    if t2i.full_precision:
+        switches.append('-F')
+    return switches

+def _write_prompt_to_png(path,prompt):
+    info = PngImagePlugin.PngInfo()
+    info.add_text("Dream",prompt)
+    im = Image.open(path)
+    im.save(path,"PNG",pnginfo=info)
+    
 def create_argv_parser():
    parser = argparse.ArgumentParser(description="Parse script's command line args")
    parser.add_argument("--laion400m",
@@ -162,7 +193,7 @@ def create_argv_parser():
                        "-l",
                        dest='laion400m',
                        action='store_true',
-                        help="fallback to the latent diffusion (LAION4400M) weights and config")
+                        help="fallback to the latent diffusion (laion400m) weights and config")
    parser.add_argument('-n','--iterations',
                        type=int,
                        default=1,
@@ -174,11 +205,12 @@ def create_argv_parser():
    parser.add_argument('-b','--batch_size',
                        type=int,
                        default=1,
-                        help="number of images to produce per iteration (currently not working properly - producing too many images)")
-    parser.add_argument('--sampler',
+                        help="number of images to produce per iteration (faster, but doesn't generate individual seeds)")
+    parser.add_argument('--sampler','-m',
+                        dest="sampler_name",
                        choices=['plms','ddim', 'klms'],
                        default='klms',
-                        help="which sampler to use (klms)")
+                        help="which sampler to use (klms) - can only be set on command line")
    parser.add_argument('-o',
                        '--outdir',
                        type=str,
@@ -193,7 +225,7 @@ def create_cmd_parser():
    parser.add_argument('-s','--steps',type=int,help="number of steps")
    parser.add_argument('-S','--seed',type=int,help="image seed")
    parser.add_argument('-n','--iterations',type=int,default=1,help="number of samplings to perform")
-    parser.add_argument('-b','--batch_size',type=int,default=1,help="number of images to produce per sampling (currently broken)")
+    parser.add_argument('-b','--batch_size',type=int,default=1,help="number of images to produce per sampling")
    parser.add_argument('-W','--width',type=int,help="image width, multiple of 64")
    parser.add_argument('-H','--height',type=int,help="image height, multiple of 64")
    parser.add_argument('-C','--cfg_scale',default=7.5,type=float,help="prompt configuration scale")
--- a/scripts/images2prompt.py
+++ b/scripts/images2prompt.py
@@ -0,0 +1,29 @@
+#!/usr/bin/env python3
+'''This script reads the "Dream" Stable Diffusion prompt embedded in files generated by dream.py'''
+
+import sys
+from PIL import Image,PngImagePlugin
+
+if len(sys.argv) < 2:
+    print("Usage: images2prompt.py <file1.png> <file2.png> <file3.png>...")
+    exit(-1)
+
+filenames = sys.argv[1:]
+for f in filenames:
+    try:
+        im = Image.open(f)
+        try:
+            prompt = im.text['Dream']
+        except KeyError:
+            prompt = ''
+        print(f'{f}: {prompt}')
+    except FileNotFoundError:
+        sys.stderr.write(f'{f} not found\n')
+        continue
+    except PermissionError:
+        sys.stderr.write(f'{f} could not be opened due to inadequate permissions\n')
+        continue
+        
+
+
+
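
The prompt normalization performed by `_reconstruct_switches` in scripts/dream.py can be tried in isolation. The sketch below mirrors the logic from the diff above, with `types.SimpleNamespace` objects standing in for the real `T2I` instance and the parsed command options. The stand-in objects and the default values chosen for them are assumptions for illustration, not values read from the running script.

```python
from types import SimpleNamespace


def reconstruct_switches(t2i, opt):
    """Rebuild a canonical switch list, falling back to t2i defaults
    for any option the user did not set (mirrors the diff above)."""
    switches = [f'"{opt.prompt}"']
    switches.append(f'-s{opt.steps or t2i.steps}')
    switches.append(f'-b{opt.batch_size or t2i.batch_size}')
    switches.append(f'-W{opt.width or t2i.width}')
    switches.append(f'-H{opt.height or t2i.height}')
    switches.append(f'-C{opt.cfg_scale or t2i.cfg_scale}')
    if opt.init_img:
        switches.append(f'-I{opt.init_img}')
    if opt.strength and opt.init_img is not None:
        switches.append(f'-f{opt.strength or t2i.strength}')
    if t2i.full_precision:
        switches.append('-F')
    return switches


# Hypothetical stand-ins for the T2I object and the parsed options
t2i_defaults = SimpleNamespace(steps=50, batch_size=1, width=512, height=512,
                               cfg_scale=7.5, strength=0.75, full_precision=False)
opt = SimpleNamespace(prompt="there's a fly in my soup", steps=None, batch_size=1,
                      width=None, height=None, cfg_scale=7.5,
                      init_img=None, strength=None)

if __name__ == "__main__":
    print(' '.join(reconstruct_switches(t2i_defaults, opt)))
    # prints: "there's a fly in my soup" -s50 -b1 -W512 -H512 -C7.5
```

Because every switch is always emitted in the same order with its effective value, two runs that used different spellings of the same settings produce the same stored string, which is what makes the stored prompt reproducible.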
Author	SHA1	Message	Date
Lincoln Stein	ddf0ef3af1	updated README for image metadata storage	2022-08-22 00:22:12 -04:00
Lincoln Stein	aa2729d868	user's prompt is now normalized for reproducibility and written into the destination PNG file as a tEXt metadata chunk named "Dream". You can retrieve the prompt with an image editing program that supports browsing the full metadata, or with the images2prompt.py script located in 'scripts'	2022-08-22 00:12:16 -04:00
Lincoln Stein	5f352aec87	test of normalization of prompt	2022-08-21 22:48:40 -04:00
Lincoln Stein	c4c4974b39	Update README.md Fixed formatting in changelog.	2022-08-21 21:48:02 -04:00
Lincoln Stein	194f43f00b	Update README.md Add acknowledges for those who sent pull requests.	2022-08-21 21:46:00 -04:00
Lincoln Stein	325bc5280e	Updated README.md Fix the path for where to install the LAION-400m model.	2022-08-21 20:48:44 -04:00
Lincoln Stein	11cc8e545b	Clarified the required Python version (3.8.5)	2022-08-21 20:30:21 -04:00
Lincoln Stein	9adac56f4e	Fixed incorrect conda env update command	2022-08-21 20:27:25 -04:00
Lincoln Stein	5d5307dcb4	Update README.md	2022-08-21 20:20:22 -04:00