Mirror of https://github.com/invoke-ai/InvokeAI.git (synced 2026-01-15 17:18:11 -05:00)

Compare commits: version_0. ... release_1. (43 commits)
Commits in this comparison (SHA1): 3c74dd41c4, f5450bad61, 2ace56313c, 78aba5b770, 49f0d31fac, bb91ca0462, d340afc9e5, 7085d1910b, a997e09c48, 503f962f68, 41f0afbcb6, 6650b98e7c, 1ca3dc553c, 09afcc321c, 7b2335068c, d3eff4d827, 0d23a0f899, 985948c8b9, 6ae09f6e46, ae821ce0e6, ce5b94bf40, b5d9981125, 9a237015da, 5eff5d4cd2, 4527ef15f9, 0cea751476, a5fb8469ed, 9eaef0c5a8, 4cb5fc5ed4, d8926fb8c0, 80c0e30099, ac440a1197, bb46c70ec5, 2b2ebd19e7, 74f238d310, 58f1962671, 87fb4186d4, 750408f793, bf76c4f283, 7b8c883b07, be6ab334c2, 831bbd7a54, c477525036
README.md (258 lines changed)
@@ -1,36 +1,30 @@
# Stable Diffusion
# Stable Diffusion Dream Script

This is a fork of CompVis/stable-diffusion, the wonderful open source
text-to-image generator.

The original has been modified in several minor ways:

## Simplified API for text to image generation

There is now a simplified API for text to image generation, which
lets you create images from a prompt in just three lines of code:

~~~~
from ldm.simplet2i import T2I
model = T2I()
model.text2image("a unicorn in manhattan")
~~~~

Please see ldm/simplet2i.py for more information.
The original has been modified in several ways:

## Interactive command-line interface similar to the Discord bot

There is now a command-line script, located in scripts/dream.py, which
The *dream.py* script, located in scripts/dream.py,
provides an interactive interface to image generation similar to
the "dream mothership" bot that Stability AI provided on its Discord
server. The advantage of this is that the lengthy model
initialization only happens once. After that image generation is
fast.
server. Unlike the txt2img.py and img2img.py scripts provided in the
original CompVis/stable-diffusion source code repository, the
time-consuming initialization of the AI model
initialization only happens once. After that image generation
from the command-line interface is very fast.

Note that this has only been tested in the Linux environment!
The script uses the readline library to allow for in-line editing,
command history (up and down arrows), autocompletion, and more.

The script is confirmed to work on Linux and Windows systems. It should
work on MacOSX as well, but this is not confirmed. Note that this script
runs from the command-line (CMD or Terminal window), and does not have a GUI.

~~~~
(ldm) ~/stable-diffusion$ ./scripts/dream.py
(ldm) ~/stable-diffusion$ python3 ./scripts/dream.py
* Initializing, be patient...
Loading model from models/ldm/text2img-large/model.ckpt
LatentDiffusion: Running in eps-prediction mode
@@ -42,20 +36,203 @@ Loading Bert tokenizer from "models/bert"
setting sampler to plms

* Initialization done! Awaiting your command...
dream> ashley judd riding a camel -n2
dream> ashley judd riding a camel -n2 -s150
Outputs:
outputs/txt2img-samples/00009.png: "ashley judd riding a camel" -n2 -S 416354203
outputs/txt2img-samples/00010.png: "ashley judd riding a camel" -n2 -S 1362479620
outputs/txt2img-samples/00009.png: "ashley judd riding a camel" -n2 -s150 -S 416354203
outputs/txt2img-samples/00010.png: "ashley judd riding a camel" -n2 -s150 -S 1362479620

dream> "your prompt here" -n6 -g
...
dream> "there's a fly in my soup" -n6 -g
outputs/txt2img-samples/00041.png: "there's a fly in my soup" -n6 -g -S 2685670268
seeds for individual rows: [2685670268, 1216708065, 2335773498, 822223658, 714542046, 3395302430]
~~~~
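The in-line editing and history shown above come from Python's readline module. Here is a minimal, self-contained sketch of that loop idea; it is illustrative only, since the real scripts/dream.py (shown further down) also parses switches and drives the T2I class, and the history file name matches the one the script uses:

```
import atexit
import os

try:
    import readline  # not available on stock Windows Python
    histfile = os.path.join(os.path.expanduser('~'), ".dream_history")
    try:
        readline.read_history_file(histfile)
        readline.set_history_length(1000)
    except FileNotFoundError:
        pass
    atexit.register(readline.write_history_file, histfile)
except ImportError:
    pass

while True:
    try:
        command = input("dream> ")
    except EOFError:
        print("goodbye!")
        break
    if command.strip() == 'q':
        break
    print(f"would generate images for: {command}")
```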
Command-line arguments (`./scripts/dream.py -h`) allow you to change
various defaults, and select between the mature stable-diffusion
weights (512x512) and the older (256x256) latent diffusion weights
(laion400m). Within the script, the switches are (mostly) identical to
those used in the Discord bot, except you don't need to type "!dream".
The dream> prompt's arguments are pretty much
identical to those used in the Discord bot, except you don't need to
type "!dream" (it doesn't hurt if you do). A significant change is that creation of individual images
is now the default
unless --grid (-g) is given. For backward compatibility, the -i switch is recognized.
For command-line help type -h (or --help) at the dream> prompt.

The script itself also recognizes a series of command-line switches that will change
important global defaults, such as the directory for image outputs and the location
of the model weight files.
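Under the hood, each dream> line is split with shlex and handed to an argparse parser whose switches mirror the ones listed here. A simplified sketch of that round trip follows; the real parser in scripts/dream.py (further down) defines more options and assembles the prompt words slightly differently:

```
import argparse
import shlex

parser = argparse.ArgumentParser(description='Example: a fantastic alien landscape -W1024 -H960 -s100 -n12')
parser.add_argument('prompt')
parser.add_argument('-s', '--steps', type=int)
parser.add_argument('-n', '--iterations', type=int, default=1)
parser.add_argument('-g', '--grid', action='store_true')

line = '!dream "a fantastic alien landscape" -s100 -n2 -g'
elements = shlex.split(line)
if elements[0].startswith('!dream'):   # typing !dream is harmless
    elements.pop(0)

# gather the quoted prompt first, then the switches
prompt_words = [el for el in elements if not el.startswith('-')]
switches = [el for el in elements if el.startswith('-')]
opt = parser.parse_args([' '.join(prompt_words)] + switches)
print(opt.prompt, opt.steps, opt.iterations, opt.grid)
```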
## Image-to-Image

This script also provides an img2img feature that lets you seed your
creations with a drawing or photo. This is a really cool feature that tells
stable diffusion to build the prompt on top of the image you provide, preserving
the original's basic shape and layout. To use it, provide the --init_img
option as shown here:

~~~~
dream> "waterfall and rainbow" --init_img=./init-images/crude_drawing.png --strength=0.5 -s100 -n4
~~~~

The --init_img (-I) option gives the path to the seed picture. --strength (-f) controls how much
the original will be modified, ranging from 0.0 (keep the original intact), to 1.0 (ignore the original
completely). The default is 0.75, and ranges from 0.25-0.75 give interesting results.
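The same operation is available from Python through the T2I class. A hedged sketch, with keyword names following the ldm/simplet2i.py docstring further down and an image path that is only an example:

```
from ldm.simplet2i import T2I

t2i = T2I()
results = t2i.img2img(prompt="waterfall and rainbow",
                      init_img="./init-images/crude_drawing.png",
                      strength=0.5,
                      steps=100,
                      iterations=4)
for filename, seed in results:
    print(f"{filename} (seed {seed})")
```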
## Changes

- v1.01 (21 August 2022)
  * added k_lms sampling **Please run "conda env update -f environment.yaml" to load the k_lms dependencies**
  * use half precision arithmetic by default, resulting in faster execution and lower memory requirements.
    Pass argument --full_precision to dream.py to get slower but more accurate image generation
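The half-precision default is implemented in the model loader: unless --full_precision is passed, the weights are converted with .half() after loading. A minimal sketch of that logic, mirroring the load_model() change in the ldm/simplet2i.py diff below rather than introducing any new API:

```
import torch

def finish_model_setup(model, full_precision=False):
    model.cuda()
    model.eval()
    if full_precision:
        print('Using slower but more accurate full precision math (--full_precision)')
    else:
        print('Using half precision math. Call with --full_precision to use full precision')
        model.half()   # fp16 weights: faster and roughly half the memory
    return model
```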
## Installation

### Linux/Mac

1. You will need to install the following prerequisites if they are not already available. Use your
operating system's preferred installer
* Python (version 3.8 or higher)
* git

2. Install the Python Anaconda environment manager using pip3.
```
~$ pip3 install anaconda
```
After installing anaconda, you should log out of your system and log back in. If the installation
worked, your command prompt will be prefixed by the name of the current anaconda environment, "(base)".

3. Copy the stable-diffusion source code from GitHub:
```
(base) ~$ git clone https://github.com/lstein/stable-diffusion.git
```
This will create a stable-diffusion folder where you will follow the rest of the steps.

4. Enter the newly-created stable-diffusion folder. From this step forward make sure that you are working in the stable-diffusion directory!
```
(base) ~$ cd stable-diffusion
(base) ~/stable-diffusion$
```
5. Use anaconda to copy necessary python packages, create a new python environment named "ldm",
and activate the environment.
```
(base) ~/stable-diffusion$ conda env create -f environment.yaml
(base) ~/stable-diffusion$ conda activate ldm
(ldm) ~/stable-diffusion$
```
After these steps, your command prompt will be prefixed by "(ldm)" as shown above.
6. Load a couple of small machine-learning models required by stable diffusion:
```
(ldm) ~/stable-diffusion$ python3 scripts/preload_models.py
```

7. Now you need to install the weights for the stable diffusion model.

For testing prior to the release of the real weights, you can use an older weight file that produces low-quality images. Create a directory within stable-diffusion named "models/ldm/text2img-large", and use the wget URL downloader tool to copy the weight file into it:
```
(ldm) ~/stable-diffusion$ mkdir -p models/ldm/text2img-large
(ldm) ~/stable-diffusion$ wget -O models/ldm/text2img-large/model.ckpt https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt
```
For testing with the released weights, you will do something similar, but with a directory named "models/ldm/stable-diffusion-v1":
```
(ldm) ~/stable-diffusion$ mkdir -p models/ldm/stable-diffusion-v1
(ldm) ~/stable-diffusion$ wget -O models/ldm/stable-diffusion-v1/model.ckpt <ENTER URL HERE>
```
These weight files are ~5 GB in size, so downloading may take a while.
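If you want to confirm the download before continuing, a quick check such as the following will open the file with PyTorch. This snippet is not part of the repository; it assumes the usual checkpoint layout with a "state_dict" entry and uses the pre-release path from this step, so adjust it for whichever weights you fetched:

```
import torch

ckpt = torch.load("models/ldm/text2img-large/model.ckpt", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)
print(f"checkpoint loaded with {len(state_dict)} entries")
```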
8. Start generating images!
```
# for the pre-release weights use the -l or --laion400m switch
(ldm) ~/stable-diffusion$ python3 scripts/dream.py -l

# for the post-release weights do not use the switch
(ldm) ~/stable-diffusion$ python3 scripts/dream.py

# for additional configuration switches and arguments, use -h or --help
(ldm) ~/stable-diffusion$ python3 scripts/dream.py -h
```
9. Subsequently, to relaunch the script, be sure to run "conda activate ldm" (step 5, second command), enter the "stable-diffusion"
directory, and then launch the dream script (step 8). If you forget to activate the ldm environment, the script will fail with multiple ModuleNotFound errors.

### Updating to newer versions of the script

This distribution is changing rapidly. If you used the "git clone" method (step 3) to download the stable-diffusion directory, then to update to the latest and greatest version, enter the "stable-diffusion" directory and type:
```
(ldm) ~/stable-diffusion$ git pull
```
This will bring your local copy into sync with the remote one.
### Windows

1. Install the most recent Python from here: https://www.python.org/downloads/windows/

2. Install Anaconda3 (miniconda3 version) from here: https://docs.anaconda.com/anaconda/install/windows/

3. Install Git from here: https://git-scm.com/download/win

4. Launch Anaconda from the Windows Start menu. This will bring up a command window. Type all the remaining commands in this window.

5. Run the command:
```
git clone https://github.com/lstein/stable-diffusion.git
```
This will create a stable-diffusion folder where you will follow the rest of the steps.

6. Enter the newly-created stable-diffusion folder. From this step forward make sure that you are working in the stable-diffusion directory!
```
cd stable-diffusion
```

7. Run the following two commands:
```
conda env create -f environment.yaml (step 7a)
conda activate ldm (step 7b)
```
This will install all python requirements and activate the "ldm" environment which sets PATH and other environment variables properly.

8. Run the command:
```
python scripts\preload_models.py
```
This installs two machine learning models that stable diffusion requires.

9. Now you need to install the weights for the big stable diffusion model.

For testing prior to the release of the real weights, create a directory within stable-diffusion named "models\ldm\text2img-large".

For testing with the released weights, create a directory within stable-diffusion named "models\ldm\stable-diffusion-v1".

Then use a web browser to copy model.ckpt into the appropriate directory. For the text2img-large (pre-release) model, the weights are at https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt. Check back here later for the release URL.

10. Start generating images!
```
# for the pre-release weights
python scripts\dream.py -l

# for the post-release weights
python scripts\dream.py
```
11. Subsequently, to relaunch the script, first activate the Anaconda command window (step 4), run "conda activate ldm" (step 7b), and then launch the dream script (step 10).

### Updating to newer versions of the script

This distribution is changing rapidly. If you used the "git clone" method (step 5) to download the stable-diffusion directory, then to update to the latest and greatest version, launch the Anaconda window, enter "stable-diffusion", and type:
```
git pull
```
This will bring your local copy into sync with the remote one.
## Simplified API for text to image generation

For programmers who wish to incorporate stable-diffusion into other
products, this repository includes a simplified API for text to image generation, which
lets you create images from a prompt in just three lines of code:

~~~~
from ldm.simplet2i import T2I
model = T2I()
outputs = model.text2image("a unicorn in manhattan")
~~~~

Outputs is a list of lists in the format [[filename1,seed1],[filename2,seed2]...]
Please see ldm/simplet2i.py for more information.
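A slightly fuller sketch of the same API: iterate the returned [[filename, seed], ...] pairs, then reuse a recorded seed to regenerate an image deterministically. The keyword arguments follow the txt2img() docstring in the ldm/simplet2i.py diff shown further down:

```
from ldm.simplet2i import T2I

model = T2I()
outputs = model.text2image("a unicorn in manhattan")
for filename, seed in outputs:
    print(f"wrote {filename} with seed {seed}")

# reproduce the first image with its recorded seed
if outputs:
    model.txt2img("a unicorn in manhattan", seed=outputs[0][1], steps=50)
```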
## Workaround for machines with limited internet connectivity

@@ -92,14 +269,9 @@ time, copy over the file ldm/modules/encoders/modules.py from the
CompVis/stable-diffusion repository. Or you can run preload_models.py
on the target machine.

## Minor fixes
## Support

I added the requirement for torchmetrics to environment.yaml.

## Installation and support

Follow the directions from the original README, which starts below, to
configure the environment and install requirements. For support,
For support,
please use this repository's GitHub Issues tracking service. Feel free
to send me an email if you use and like the script.
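A quick way to confirm that the .cache directory copied from the connected machine is usable: the same offline-only load that the patched BERTTokenizer performs (see the ldm/modules/encoders/modules.py diff below) should succeed. This is a sketch, not part of the repository:

```
from transformers import BertTokenizerFast

try:
    BertTokenizerFast.from_pretrained("bert-base-uncased", local_files_only=True)
    print("Bert tokenizer cache is in place")
except OSError:
    print("cache missing - run scripts/preload_models.py on an internet-connected machine")
```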
@@ -108,14 +280,16 @@ to send me an email if you use and like the script.
# Original README from CompVis/stable-diffusion
*Stable Diffusion was made possible thanks to a collaboration with [Stability AI](https://stability.ai/) and [Runway](https://runwayml.com/) and builds upon our previous work:*

[**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752)<br/>
[**High-Resolution Image Synthesis with Latent Diffusion Models**](https://ommer-lab.com/research/latent-diffusion-models/)<br/>
[Robin Rombach](https://github.com/rromb)\*,
[Andreas Blattmann](https://github.com/ablattmann)\*,
[Dominik Lorenz](https://github.com/qp-qp)\,
[Patrick Esser](https://github.com/pesser),
[Björn Ommer](https://hci.iwr.uni-heidelberg.de/Staff/bommer)<br/>

which is available on [GitHub](https://github.com/CompVis/latent-diffusion).
**CVPR '22 Oral**

which is available on [GitHub](https://github.com/CompVis/latent-diffusion). PDF at [arXiv](https://arxiv.org/abs/2112.10752). Please also visit our [Project page](https://ommer-lab.com/research/latent-diffusion-models/).


[Stable Diffusion](#stable-diffusion-v1) is a latent text-to-image diffusion
@@ -128,6 +302,7 @@ See [this section](#stable-diffusion-v1) below and the [model card](https://hugg

## Requirements

A suitable [conda](https://conda.io/) environment named `ldm` can be created
and activated with:

@@ -142,8 +317,7 @@ You can also update an existing [latent diffusion](https://github.com/CompVis/la
conda install pytorch torchvision -c pytorch
pip install transformers==4.19.2
pip install -e .
```

```

## Stable Diffusion v1
environment.yaml
@@ -24,6 +24,8 @@ dependencies:
  - transformers==4.19.2
  - torchmetrics==0.6.0
  - kornia==0.6
  - -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
  - accelerate==0.12.0
  - -e git+https://github.com/openai/CLIP.git@main#egg=clip
  - -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
  - -e git+https://github.com/lstein/k-diffusion.git@master#egg=k-diffusion
  - -e .
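The additions above pull in accelerate and lstein's k-diffusion fork, which the new k_lms sampling depends on. A quick import check, run inside the activated ldm environment (illustrative only):

```
import accelerate
import k_diffusion as K

print(accelerate.__version__)                 # 0.12.0 per environment.yaml
print(hasattr(K.sampling, "sample_lms"))      # the routine used by the new KSampler
```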
ldm/models/diffusion/ddpm.py
@@ -17,6 +17,7 @@ from functools import partial
from tqdm import tqdm
from torchvision.utils import make_grid
from pytorch_lightning.utilities.distributed import rank_zero_only
import urllib

from ldm.util import log_txt_as_img, exists, default, ismap, isimage, mean_flat, count_params, instantiate_from_config
from ldm.modules.ema import LitEma
@@ -524,7 +525,10 @@ class LatentDiffusion(DDPM):
        else:
            assert config != '__is_first_stage__'
            assert config != '__is_unconditional__'
            model = instantiate_from_config(config)
            try:
                model = instantiate_from_config(config)
            except urllib.error.URLError:
                raise SystemExit("* Couldn't load a dependency. Try running scripts/preload_models.py from an internet-connected machine.")
            self.cond_stage_model = model

    def _get_denoise_row_from_list(self, samples, desc='', force_no_decoder_quantization=False):
ldm/models/diffusion/ksampler.py (new file, 74 lines)
@@ -0,0 +1,74 @@
'''wrapper around part of Katherine Crowson's k-diffusion library, making it call compatible with other Samplers'''
import k_diffusion as K
import torch
import torch.nn as nn
import accelerate

class CFGDenoiser(nn.Module):
    def __init__(self, model):
        super().__init__()
        self.inner_model = model

    def forward(self, x, sigma, uncond, cond, cond_scale):
        x_in = torch.cat([x] * 2)
        sigma_in = torch.cat([sigma] * 2)
        cond_in = torch.cat([uncond, cond])
        uncond, cond = self.inner_model(x_in, sigma_in, cond=cond_in).chunk(2)
        return uncond + (cond - uncond) * cond_scale

class KSampler(object):
    def __init__(self, model, schedule="lms", **kwargs):
        super().__init__()
        self.model = K.external.CompVisDenoiser(model)
        self.accelerator = accelerate.Accelerator()
        self.device = self.accelerator.device
        self.schedule = schedule

    # duplicates CFGDenoiser.forward, applied to the wrapped denoiser held in self.model
    def forward(self, x, sigma, uncond, cond, cond_scale):
        x_in = torch.cat([x] * 2)
        sigma_in = torch.cat([sigma] * 2)
        cond_in = torch.cat([uncond, cond])
        uncond, cond = self.model(x_in, sigma_in, cond=cond_in).chunk(2)
        return uncond + (cond - uncond) * cond_scale

    # most of these arguments are ignored and are only present for compatibility with
    # other samplers
    @torch.no_grad()
    def sample(self,
               S,
               batch_size,
               shape,
               conditioning=None,
               callback=None,
               normals_sequence=None,
               img_callback=None,
               quantize_x0=False,
               eta=0.,
               mask=None,
               x0=None,
               temperature=1.,
               noise_dropout=0.,
               score_corrector=None,
               corrector_kwargs=None,
               verbose=True,
               x_T=None,
               log_every_t=100,
               unconditional_guidance_scale=1.,
               unconditional_conditioning=None,
               # this has to come in the same format as the conditioning, e.g. as encoded tokens, ...
               **kwargs
               ):

        sigmas = self.model.get_sigmas(S)
        if x_T is not None:   # explicit None test; a tensor has no unambiguous truth value
            x = x_T
        else:
            x = torch.randn([batch_size, *shape], device=self.device) * sigmas[0] # for GPU draw
        model_wrap_cfg = CFGDenoiser(self.model)
        extra_args = {'cond': conditioning, 'uncond': unconditional_conditioning, 'cond_scale': unconditional_guidance_scale}
        return (K.sampling.sample_lms(model_wrap_cfg, x, sigmas, extra_args=extra_args, disable=not self.accelerator.is_main_process),
                None)

    def gather(self, samples_ddim):
        return self.accelerator.gather(samples_ddim)
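How this wrapper gets used in practice (mirroring the ldm/simplet2i.py diff further down): T2I constructs KSampler(model, 'lms') when the klms sampler is selected and calls sample() with the same keywords as the DDIM/PLMS samplers. A sketch under those assumptions, where model is a loaded LatentDiffusion and c/uc are conditioning tensors from get_learned_conditioning():

```
from ldm.models.diffusion.ksampler import KSampler

def sample_with_klms(model, c, uc, steps=50, cfg_scale=7.5, width=512, height=512):
    sampler = KSampler(model, 'lms')
    shape = [4, height // 8, width // 8]      # latent_channels, H/f, W/f
    samples, _ = sampler.sample(S=steps,
                                conditioning=c,
                                batch_size=1,
                                shape=shape,
                                unconditional_guidance_scale=cfg_scale,
                                unconditional_conditioning=uc,
                                eta=0.0)
    return samples
```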
ldm/modules/encoders/modules.py
@@ -60,7 +60,10 @@ class BERTTokenizer(AbstractEncoder):
        # by running:
        # from transformers import BertTokenizerFast
        # BertTokenizerFast.from_pretrained("bert-base-uncased")
        self.tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased",local_files_only=True)
        try:
            self.tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased",local_files_only=True)
        except OSError:
            raise SystemExit("* Couldn't load Bert tokenizer files. Try running scripts/preload_models.py from an internet-connected machine.")
        self.device = device
        self.vq_interface = vq_interface
        self.max_length = max_length
ldm/simplet2i.py (247 lines changed)
@@ -7,37 +7,43 @@ from ldm.simplet2i import T2I
|
||||
t2i = T2I(outdir = <path> // outputs/txt2img-samples
|
||||
model = <path> // models/ldm/stable-diffusion-v1/model.ckpt
|
||||
config = <path> // default="configs/stable-diffusion/v1-inference.yaml
|
||||
batch = <integer> // 1
|
||||
iterations = <integer> // how many times to run the sampling (1)
|
||||
batch_size = <integer> // how many images to generate per sampling (1)
|
||||
steps = <integer> // 50
|
||||
seed = <integer> // current system time
|
||||
sampler = ['ddim','plms'] // ddim
|
||||
sampler = ['ddim','plms','klms'] // klms
|
||||
grid = <boolean> // false
|
||||
width = <integer> // image width, multiple of 64 (512)
|
||||
height = <integer> // image height, multiple of 64 (512)
|
||||
cfg_scale = <float> // unconditional guidance scale (7.5)
|
||||
fixed_code = <boolean> // False
|
||||
)
|
||||
|
||||
# do the slow model initialization
|
||||
t2i.load_model()
|
||||
|
||||
# Do the fast inference & image generation. Any options passed here
|
||||
# override the default values assigned during class initialization
|
||||
# Will call load_model() if the model was not previously loaded.
|
||||
t2i.txt2img(prompt = <string> // required
|
||||
// the remaining option arguments override constructor values when present
|
||||
outdir = <path>
|
||||
iterations = <integer>
|
||||
batch = <integer>
|
||||
steps = <integer>
|
||||
seed = <integer>
|
||||
sampler = ['ddim','plms']
|
||||
grid = <boolean>
|
||||
width = <integer>
|
||||
height = <integer>
|
||||
cfg_scale = <float>
|
||||
) -> boolean
|
||||
# The method returns a list of images. Each row of the list is a sub-list of [filename,seed]
|
||||
results = t2i.txt2img(prompt = "an astronaut riding a horse"
|
||||
outdir = "./outputs/txt2img-samples)
|
||||
)
|
||||
|
||||
for row in results:
|
||||
print(f'filename={row[0]}')
|
||||
print(f'seed ={row[1]}')
|
||||
|
||||
# Same thing, but using an initial image.
|
||||
results = t2i.img2img(prompt = "an astronaut riding a horse"
|
||||
outdir = "./outputs/img2img-samples"
|
||||
init_img = "./sketches/horse+rider.png")
|
||||
|
||||
for row in results:
|
||||
print(f'filename={row[0]}')
|
||||
print(f'seed ={row[1]}')
|
||||
"""
|
||||
|
||||
import torch
|
||||
import numpy as np
|
||||
import random
|
||||
@@ -47,7 +53,7 @@ from omegaconf import OmegaConf
|
||||
from PIL import Image
|
||||
from tqdm import tqdm, trange
|
||||
from itertools import islice
|
||||
from einops import rearrange
|
||||
from einops import rearrange, repeat
|
||||
from torchvision.utils import make_grid
|
||||
from pytorch_lightning import seed_everything
|
||||
from torch import autocast
|
||||
@@ -56,8 +62,9 @@ import time
|
||||
import math
|
||||
|
||||
from ldm.util import instantiate_from_config
|
||||
from ldm.models.diffusion.ddim import DDIMSampler
|
||||
from ldm.models.diffusion.plms import PLMSSampler
|
||||
from ldm.models.diffusion.ddim import DDIMSampler
|
||||
from ldm.models.diffusion.plms import PLMSSampler
|
||||
from ldm.models.diffusion.ksampler import KSampler
|
||||
|
||||
class T2I:
|
||||
"""T2I class
|
||||
@@ -67,7 +74,7 @@ class T2I:
|
||||
model
|
||||
config
|
||||
iterations
|
||||
batch
|
||||
batch_size
|
||||
steps
|
||||
seed
|
||||
sampler
|
||||
@@ -80,10 +87,11 @@ class T2I:
|
||||
latent_channels
|
||||
downsampling_factor
|
||||
precision
|
||||
strength
|
||||
"""
|
||||
def __init__(self,
|
||||
outdir="outputs/txt2img-samples",
|
||||
batch=1,
|
||||
batch_size=1,
|
||||
iterations = 1,
|
||||
width=512,
|
||||
height=512,
|
||||
@@ -94,15 +102,17 @@ class T2I:
|
||||
cfg_scale=7.5,
|
||||
weights="models/ldm/stable-diffusion-v1/model.ckpt",
|
||||
config = "configs/latent-diffusion/txt2img-1p4B-eval.yaml",
|
||||
sampler="plms",
|
||||
sampler="klms",
|
||||
latent_channels=4,
|
||||
downsampling_factor=8,
|
||||
ddim_eta=0.0, # deterministic
|
||||
fixed_code=False,
|
||||
precision='autocast'
|
||||
precision='autocast',
|
||||
full_precision=False,
|
||||
strength=0.75 # default in scripts/img2img.py
|
||||
):
|
||||
self.outdir = outdir
|
||||
self.batch = batch
|
||||
self.batch_size = batch_size
|
||||
self.iterations = iterations
|
||||
self.width = width
|
||||
self.height = height
|
||||
@@ -117,16 +127,22 @@ class T2I:
|
||||
self.downsampling_factor = downsampling_factor
|
||||
self.ddim_eta = ddim_eta
|
||||
self.precision = precision
|
||||
self.full_precision = full_precision
|
||||
self.strength = strength
|
||||
self.model = None # empty for now
|
||||
self.sampler = None
|
||||
if seed is None:
|
||||
self.seed = self._new_seed()
|
||||
else:
|
||||
self.seed = seed
|
||||
def txt2img(self,prompt,outdir=None,batch=None,iterations=None,
|
||||
|
||||
def txt2img(self,prompt,outdir=None,batch_size=None,iterations=None,
|
||||
steps=None,seed=None,grid=None,individual=None,width=None,height=None,
|
||||
cfg_scale=None,ddim_eta=None):
|
||||
""" generate an image from the prompt, writing iteration images into the outdir """
|
||||
cfg_scale=None,ddim_eta=None,strength=None,init_img=None):
|
||||
"""
|
||||
Generate an image from the prompt, writing iteration images into the outdir
|
||||
The output is a list of lists in the format: [[filename1,seed1], [filename2,seed2],...]
|
||||
"""
|
||||
outdir = outdir or self.outdir
|
||||
steps = steps or self.steps
|
||||
seed = seed or self.seed
|
||||
@@ -134,8 +150,9 @@ class T2I:
|
||||
height = height or self.height
|
||||
cfg_scale = cfg_scale or self.cfg_scale
|
||||
ddim_eta = ddim_eta or self.ddim_eta
|
||||
batch = batch or self.batch
|
||||
batch_size = batch_size or self.batch_size
|
||||
iterations = iterations or self.iterations
|
||||
strength = strength or self.strength # not actually used here, but preserved for code refactoring
|
||||
|
||||
model = self.load_model() # will instantiate the model or return it from cache
|
||||
|
||||
@@ -146,7 +163,7 @@ class T2I:
|
||||
if individual:
|
||||
grid = False
|
||||
|
||||
data = [batch * [prompt]]
|
||||
data = [batch_size * [prompt]]
|
||||
|
||||
# make directories and establish names for the output files
|
||||
os.makedirs(outdir, exist_ok=True)
|
||||
@@ -154,7 +171,7 @@ class T2I:
|
||||
|
||||
start_code = None
|
||||
if self.fixed_code:
|
||||
start_code = torch.randn([batch,
|
||||
start_code = torch.randn([batch_size,
|
||||
self.latent_channels,
|
||||
height // self.downsampling_factor,
|
||||
width // self.downsampling_factor],
|
||||
@@ -176,14 +193,14 @@ class T2I:
|
||||
for prompts in tqdm(data, desc="data", dynamic_ncols=True):
|
||||
uc = None
|
||||
if cfg_scale != 1.0:
|
||||
uc = model.get_learned_conditioning(batch * [""])
|
||||
uc = model.get_learned_conditioning(batch_size * [""])
|
||||
if isinstance(prompts, tuple):
|
||||
prompts = list(prompts)
|
||||
c = model.get_learned_conditioning(prompts)
|
||||
shape = [self.latent_channels, height // self.downsampling_factor, width // self.downsampling_factor]
|
||||
samples_ddim, _ = sampler.sample(S=steps,
|
||||
conditioning=c,
|
||||
batch_size=batch,
|
||||
batch_size=batch_size,
|
||||
shape=shape,
|
||||
verbose=False,
|
||||
unconditional_guidance_scale=cfg_scale,
|
||||
@@ -208,24 +225,148 @@ class T2I:
|
||||
seed = self._new_seed()
|
||||
|
||||
if grid:
|
||||
n_rows = batch if batch>1 else int(math.sqrt(batch * iterations))
|
||||
# save as grid
|
||||
grid = torch.stack(all_samples, 0)
|
||||
grid = rearrange(grid, 'n b c h w -> (n b) c h w')
|
||||
grid = make_grid(grid, nrow=n_rows)
|
||||
|
||||
# to image
|
||||
grid = 255. * rearrange(grid, 'c h w -> h w c').cpu().numpy()
|
||||
filename = os.path.join(outdir, f"{base_count:05}.png")
|
||||
Image.fromarray(grid.astype(np.uint8)).save(filename)
|
||||
for s in seeds:
|
||||
images.append([filename,s])
|
||||
images = self._make_grid(samples=all_samples,
|
||||
seeds=seeds,
|
||||
batch_size=batch_size,
|
||||
iterations=iterations,
|
||||
outdir=outdir)
|
||||
|
||||
toc = time.time()
|
||||
print(f'{batch * iterations} images generated in',"%4.2fs"% (toc-tic))
|
||||
print(f'{batch_size * iterations} images generated in',"%4.2fs"% (toc-tic))
|
||||
|
||||
return images
|
||||
|
||||
# There is a lot of shared code between this and txt2img that should be refactored.
|
||||
def img2img(self,prompt,outdir=None,init_img=None,batch_size=None,iterations=None,
|
||||
steps=None,seed=None,grid=None,individual=None,width=None,height=None,
|
||||
cfg_scale=None,ddim_eta=None,strength=None):
|
||||
"""
|
||||
Generate an image from the prompt and the initial image, writing iteration images into the outdir
|
||||
The output is a list of lists in the format: [[filename1,seed1], [filename2,seed2],...]
|
||||
"""
|
||||
outdir = outdir or self.outdir
|
||||
steps = steps or self.steps
|
||||
seed = seed or self.seed
|
||||
cfg_scale = cfg_scale or self.cfg_scale
|
||||
ddim_eta = ddim_eta or self.ddim_eta
|
||||
batch_size = batch_size or self.batch_size
|
||||
iterations = iterations or self.iterations
|
||||
strength = strength or self.strength
|
||||
|
||||
if init_img is None:
|
||||
print("no init_img provided!")
|
||||
return []
|
||||
|
||||
model = self.load_model() # will instantiate the model or return it from cache
|
||||
|
||||
precision_scope = autocast if self.precision=="autocast" else nullcontext
|
||||
|
||||
# grid and individual are mutually exclusive, with individual taking priority.
|
||||
# not necessary, but needed for compatibility with dream bot
|
||||
if (grid is None):
|
||||
grid = self.grid
|
||||
if individual:
|
||||
grid = False
|
||||
|
||||
data = [batch_size * [prompt]]
|
||||
|
||||
# PLMS sampler not supported yet, so ignore previous sampler
|
||||
if self.sampler_name!='ddim':
|
||||
print(f"sampler '{self.sampler_name}' is not yet supported. Using DDM sampler")
|
||||
sampler = DDIMSampler(model)
|
||||
else:
|
||||
sampler = self.sampler
|
||||
|
||||
# make directories and establish names for the output files
|
||||
os.makedirs(outdir, exist_ok=True)
|
||||
base_count = len(os.listdir(outdir))-1
|
||||
|
||||
assert os.path.isfile(init_img)
|
||||
init_image = self._load_img(init_img).to(self.device)
|
||||
init_image = repeat(init_image, '1 ... -> b ...', b=batch_size)
|
||||
with precision_scope("cuda"):
|
||||
init_latent = model.get_first_stage_encoding(model.encode_first_stage(init_image)) # move to latent space
|
||||
|
||||
sampler.make_schedule(ddim_num_steps=steps, ddim_eta=ddim_eta, verbose=False)
|
||||
|
||||
try:
|
||||
assert 0. <= strength <= 1., 'can only work with strength in [0.0, 1.0]'
|
||||
except AssertionError:
|
||||
print(f"strength must be between 0.0 and 1.0, but received value {strength}")
|
||||
return []
|
||||
|
||||
t_enc = int(strength * steps)
|
||||
print(f"target t_enc is {t_enc} steps")
|
||||
|
||||
images = list()
|
||||
seeds = list()
|
||||
|
||||
tic = time.time()
|
||||
|
||||
with torch.no_grad():
|
||||
with precision_scope("cuda"):
|
||||
with model.ema_scope():
|
||||
all_samples = list()
|
||||
for n in trange(iterations, desc="Sampling"):
|
||||
seed_everything(seed)
|
||||
for prompts in tqdm(data, desc="data", dynamic_ncols=True):
|
||||
uc = None
|
||||
if cfg_scale != 1.0:
|
||||
uc = model.get_learned_conditioning(batch_size * [""])
|
||||
if isinstance(prompts, tuple):
|
||||
prompts = list(prompts)
|
||||
c = model.get_learned_conditioning(prompts)
|
||||
|
||||
# encode (scaled latent)
|
||||
z_enc = sampler.stochastic_encode(init_latent, torch.tensor([t_enc]*batch_size).to(self.device))
|
||||
# decode it
|
||||
samples = sampler.decode(z_enc, c, t_enc, unconditional_guidance_scale=cfg_scale,
|
||||
unconditional_conditioning=uc,)
|
||||
|
||||
x_samples = model.decode_first_stage(samples)
|
||||
x_samples = torch.clamp((x_samples + 1.0) / 2.0, min=0.0, max=1.0)
|
||||
|
||||
if not grid:
|
||||
for x_sample in x_samples:
|
||||
x_sample = 255. * rearrange(x_sample.cpu().numpy(), 'c h w -> h w c')
|
||||
filename = os.path.join(outdir, f"{base_count:05}.png")
|
||||
Image.fromarray(x_sample.astype(np.uint8)).save(filename)
|
||||
images.append([filename,seed])
|
||||
base_count += 1
|
||||
else:
|
||||
all_samples.append(x_samples)
|
||||
seeds.append(seed)
|
||||
|
||||
seed = self._new_seed()
|
||||
|
||||
if grid:
|
||||
images = self._make_grid(samples=all_samples,
|
||||
seeds=seeds,
|
||||
batch_size=batch_size,
|
||||
iterations=iterations,
|
||||
outdir=outdir)
|
||||
|
||||
toc = time.time()
|
||||
print(f'{batch_size * iterations} images generated in',"%4.2fs"% (toc-tic))
|
||||
|
||||
return images
|
||||
|
||||
def _make_grid(self,samples,seeds,batch_size,iterations,outdir):
|
||||
images = list()
|
||||
base_count = len(os.listdir(outdir))-1
|
||||
n_rows = batch_size if batch_size>1 else int(math.sqrt(batch_size * iterations))
|
||||
# save as grid
|
||||
grid = torch.stack(samples, 0)
|
||||
grid = rearrange(grid, 'n b c h w -> (n b) c h w')
|
||||
grid = make_grid(grid, nrow=n_rows)
|
||||
|
||||
# to image
|
||||
grid = 255. * rearrange(grid, 'c h w -> h w c').cpu().numpy()
|
||||
filename = os.path.join(outdir, f"{base_count:05}.png")
|
||||
Image.fromarray(grid.astype(np.uint8)).save(filename)
|
||||
for s in seeds:
|
||||
images.append([filename,s])
|
||||
return images
|
||||
|
||||
def _new_seed(self):
|
||||
self.seed = random.randrange(0,np.iinfo(np.uint32).max)
|
||||
@@ -249,6 +390,9 @@ class T2I:
|
||||
elif self.sampler_name == 'ddim':
|
||||
print("setting sampler to ddim")
|
||||
self.sampler = DDIMSampler(self.model)
|
||||
elif self.sampler_name == 'klms':
|
||||
print("setting sampler to klms")
|
||||
self.sampler = KSampler(self.model,'lms')
|
||||
else:
|
||||
print(f"unsupported sampler {self.sampler_name}, defaulting to plms")
|
||||
self.sampler = PLMSSampler(self.model)
|
||||
@@ -265,5 +409,20 @@ class T2I:
|
||||
m, u = model.load_state_dict(sd, strict=False)
|
||||
model.cuda()
|
||||
model.eval()
|
||||
if self.full_precision:
|
||||
print('Using slower but more accurate full-precision math (--full_precision)')
|
||||
else:
|
||||
print('Using half precision math. Call with --full_precision to use full precision')
|
||||
model.half()
|
||||
return model
|
||||
|
||||
def _load_img(self,path):
|
||||
image = Image.open(path).convert("RGB")
|
||||
w, h = image.size
|
||||
print(f"loaded input image of size ({w}, {h}) from {path}")
|
||||
w, h = map(lambda x: x - x % 32, (w, h)) # resize to integer multiple of 32
|
||||
image = image.resize((w, h), resample=Image.Resampling.LANCZOS)
|
||||
image = np.array(image).astype(np.float32) / 255.0
|
||||
image = image[None].transpose(0, 3, 1, 2)
|
||||
image = torch.from_numpy(image)
|
||||
return 2.*image - 1.
|
||||
|
||||
scripts/dream.py (169 lines changed)
@@ -1,10 +1,18 @@
|
||||
#!/usr/bin/env python
|
||||
|
||||
import readline
|
||||
#!/usr/bin/env python3
|
||||
import argparse
|
||||
import shlex
|
||||
import atexit
|
||||
import os
|
||||
import sys
|
||||
|
||||
# readline unavailable on windows systems
|
||||
try:
|
||||
import readline
|
||||
readline_available = True
|
||||
except:
|
||||
readline_available = False
|
||||
|
||||
debugging = False
|
||||
|
||||
def main():
|
||||
''' Initialize command-line parsers and the diffusion model '''
|
||||
@@ -24,9 +32,11 @@ def main():
|
||||
weights = "models/ldm/stable-diffusion-v1/model.ckpt"
|
||||
|
||||
# command line history will be stored in a file called "~/.dream_history"
|
||||
load_history()
|
||||
if readline_available:
|
||||
setup_readline()
|
||||
|
||||
print("* Initializing, be patient...\n")
|
||||
sys.path.append('.')
|
||||
from pytorch_lightning import logging
|
||||
from ldm.simplet2i import T2I
|
||||
|
||||
@@ -36,10 +46,11 @@ def main():
|
||||
# the user input loop
|
||||
t2i = T2I(width=width,
|
||||
height=height,
|
||||
batch=opt.batch,
|
||||
batch_size=opt.batch_size,
|
||||
outdir=opt.outdir,
|
||||
sampler=opt.sampler,
|
||||
weights=weights,
|
||||
full_precision=opt.full_precision,
|
||||
config=config)
|
||||
|
||||
# make sure the output directory exists
|
||||
@@ -50,8 +61,9 @@ def main():
|
||||
logging.getLogger("pytorch_lightning").setLevel(logging.ERROR)
|
||||
|
||||
# preload the model
|
||||
t2i.load_model()
|
||||
print("\n* Initialization done! Awaiting your command...")
|
||||
if not debugging:
|
||||
t2i.load_model()
|
||||
print("\n* Initialization done! Awaiting your command (-h for help, q to quit)...")
|
||||
|
||||
log_path = os.path.join(opt.outdir,"dream_log.txt")
|
||||
with open(log_path,'a') as log:
|
||||
@@ -59,16 +71,26 @@ def main():
|
||||
main_loop(t2i,cmd_parser,log)
|
||||
log.close()
|
||||
|
||||
|
||||
def main_loop(t2i,parser,log):
|
||||
''' prompt/read/execute loop '''
|
||||
while True:
|
||||
done = False
|
||||
|
||||
while not done:
|
||||
try:
|
||||
command = input("dream> ")
|
||||
except EOFError:
|
||||
print("goodbye!")
|
||||
done = True
|
||||
break
|
||||
|
||||
elements = shlex.split(command)
|
||||
if elements[0]=='q': #
|
||||
done = True
|
||||
break
|
||||
if elements[0].startswith('!dream'): # in case a stored prompt still contains the !dream command
|
||||
elements.pop(0)
|
||||
|
||||
# rearrange the arguments to mimic how it works in the Dream bot.
|
||||
switches = ['']
|
||||
switches_started = False
|
||||
|
||||
@@ -81,14 +103,29 @@ def main_loop(t2i,parser,log):
|
||||
switches[0] += el
|
||||
switches[0] += ' '
|
||||
switches[0] = switches[0][:len(switches[0])-1]
|
||||
|
||||
try:
|
||||
opt = parser.parse_args(switches)
|
||||
except SystemExit:
|
||||
parser.print_help()
|
||||
pass
|
||||
results = t2i.txt2img(**vars(opt))
|
||||
print("Outputs:")
|
||||
write_log_message(opt,switches,results,log)
|
||||
continue
|
||||
if len(opt.prompt)==0:
|
||||
print("Try again with a prompt!")
|
||||
continue
|
||||
|
||||
try:
|
||||
if opt.init_img is None:
|
||||
results = t2i.txt2img(**vars(opt))
|
||||
else:
|
||||
results = t2i.img2img(**vars(opt))
|
||||
print("Outputs:")
|
||||
write_log_message(opt,switches,results,log)
|
||||
except KeyboardInterrupt:
|
||||
print('*interrupted*')
|
||||
continue
|
||||
|
||||
print("goodbye!")
|
||||
|
||||
|
||||
def write_log_message(opt,switches,results,logfile):
|
||||
''' logs the name of the output image, its prompt and seed to both the terminal and the log file '''
|
||||
@@ -130,44 +167,118 @@ def create_argv_parser():
|
||||
type=int,
|
||||
default=1,
|
||||
help="number of images to generate")
|
||||
parser.add_argument('-b','--batch',
|
||||
parser.add_argument('-F','--full_precision',
|
||||
dest='full_precision',
|
||||
action='store_true',
|
||||
help="use slower full precision math for calculations")
|
||||
parser.add_argument('-b','--batch_size',
|
||||
type=int,
|
||||
default=1,
|
||||
help="number of images to produce per iteration (currently not working properly - producing too many images)")
|
||||
parser.add_argument('--sampler',
|
||||
choices=['plms','ddim'],
|
||||
default='plms',
|
||||
help="which sampler to use")
|
||||
choices=['plms','ddim', 'klms'],
|
||||
default='klms',
|
||||
help="which sampler to use (klms)")
|
||||
parser.add_argument('-o',
|
||||
'--outdir',
|
||||
type=str,
|
||||
default="outputs/txt2img-samples",
|
||||
default="outputs/img-samples",
|
||||
help="directory in which to place generated images and a log of prompts and seeds")
|
||||
return parser
|
||||
|
||||
|
||||
def create_cmd_parser():
|
||||
parser = argparse.ArgumentParser(description="Parse terminal input in a discord 'dreambot' fashion")
|
||||
parser = argparse.ArgumentParser(description='Example: dream> a fantastic alien landscape -W1024 -H960 -s100 -n12')
|
||||
parser.add_argument('prompt')
|
||||
parser.add_argument('-s','--steps',type=int,help="number of steps")
|
||||
parser.add_argument('-S','--seed',type=int,help="image seed")
|
||||
parser.add_argument('-n','--iterations',type=int,default=1,help="number of samplings to perform")
|
||||
parser.add_argument('-b','--batch',type=int,default=1,help="number of images to produce per sampling (currently broken)")
|
||||
parser.add_argument('-b','--batch_size',type=int,default=1,help="number of images to produce per sampling (currently broken)")
|
||||
parser.add_argument('-W','--width',type=int,help="image width, multiple of 64")
|
||||
parser.add_argument('-H','--height',type=int,help="image height, multiple of 64")
|
||||
parser.add_argument('-C','--cfg_scale',type=float,help="prompt configuration scale (7.5)")
|
||||
parser.add_argument('-C','--cfg_scale',default=7.5,type=float,help="prompt configuration scale")
|
||||
parser.add_argument('-g','--grid',action='store_true',help="generate a grid")
|
||||
parser.add_argument('-i','--individual',action='store_true',help="generate individual files (default)")
|
||||
parser.add_argument('-I','--init_img',type=str,help="path to input image (supersedes width and height)")
|
||||
parser.add_argument('-f','--strength',default=0.75,type=float,help="strength for noising/unnoising. 0.0 preserves image exactly, 1.0 replaces it completely")
|
||||
return parser
|
||||
|
||||
def load_history():
|
||||
histfile = os.path.join(os.path.expanduser('~'),".dream_history")
|
||||
try:
|
||||
readline.read_history_file(histfile)
|
||||
readline.set_history_length(1000)
|
||||
except FileNotFoundError:
|
||||
pass
|
||||
atexit.register(readline.write_history_file,histfile)
|
||||
if readline_available:
|
||||
def setup_readline():
|
||||
readline.set_completer(Completer(['--steps','-s','--seed','-S','--iterations','-n','--batch_size','-b',
|
||||
'--width','-W','--height','-H','--cfg_scale','-C','--grid','-g',
|
||||
'--individual','-i','--init_img','-I','--strength','-f']).complete)
|
||||
readline.set_completer_delims(" ")
|
||||
readline.parse_and_bind('tab: complete')
|
||||
load_history()
|
||||
|
||||
def load_history():
|
||||
histfile = os.path.join(os.path.expanduser('~'),".dream_history")
|
||||
try:
|
||||
readline.read_history_file(histfile)
|
||||
readline.set_history_length(1000)
|
||||
except FileNotFoundError:
|
||||
pass
|
||||
atexit.register(readline.write_history_file,histfile)
|
||||
|
||||
class Completer():
|
||||
def __init__(self,options):
|
||||
self.options = sorted(options)
|
||||
return
|
||||
|
||||
def complete(self,text,state):
|
||||
if text.startswith('-I') or text.startswith('--init_img'):
|
||||
return self._image_completions(text,state)
|
||||
|
||||
response = None
|
||||
if state == 0:
|
||||
# This is the first time for this text, so build a match list.
|
||||
if text:
|
||||
self.matches = [s
|
||||
for s in self.options
|
||||
if s and s.startswith(text)]
|
||||
else:
|
||||
self.matches = self.options[:]
|
||||
|
||||
# Return the state'th item from the match list,
|
||||
# if we have that many.
|
||||
try:
|
||||
response = self.matches[state]
|
||||
except IndexError:
|
||||
response = None
|
||||
return response
|
||||
|
||||
def _image_completions(self,text,state):
|
||||
# get the path so far
|
||||
if text.startswith('-I'):
|
||||
path = text.replace('-I','',1).lstrip()
|
||||
elif text.startswith('--init_img='):
|
||||
path = text.replace('--init_img=','',1).lstrip()
|
||||
|
||||
matches = list()
|
||||
|
||||
path = os.path.expanduser(path)
|
||||
if len(path)==0:
|
||||
matches.append(text+'./')
|
||||
else:
|
||||
dir = os.path.dirname(path)
|
||||
dir_list = os.listdir(dir)
|
||||
for n in dir_list:
|
||||
if n.startswith('.') and len(n)>1:
|
||||
continue
|
||||
full_path = os.path.join(dir,n)
|
||||
if full_path.startswith(path):
|
||||
if os.path.isdir(full_path):
|
||||
matches.append(os.path.join(os.path.dirname(text),n)+'/')
|
||||
elif n.endswith('.png'):
|
||||
matches.append(os.path.join(os.path.dirname(text),n))
|
||||
|
||||
try:
|
||||
response = matches[state]
|
||||
except IndexError:
|
||||
response = None
|
||||
return response
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
|
||||
scripts/preload_models.py
@@ -1,17 +1,16 @@
#!/usr/bin/env python

#!/usr/bin/env python3
# Before running stable-diffusion on an internet-isolated machine,
# run this script from one with internet connectivity. The
# two machines must share a common .cache directory.

# this will preload the Bert tokenizer files
print("preloading bert tokenizer...",end='')
print("preloading bert tokenizer...")
from transformers import BertTokenizerFast
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
print("...success")

# this will download requirements for Kornia
print("preloading Kornia requirements...",end='')
print("preloading Kornia requirements...")
import kornia
print("...success")
@@ -12,6 +12,10 @@ from pytorch_lightning import seed_everything
|
||||
from torch import autocast
|
||||
from contextlib import contextmanager, nullcontext
|
||||
|
||||
import accelerate
|
||||
import k_diffusion as K
|
||||
import torch.nn as nn
|
||||
|
||||
from ldm.util import instantiate_from_config
|
||||
from ldm.models.diffusion.ddim import DDIMSampler
|
||||
from ldm.models.diffusion.plms import PLMSSampler
|
||||
@@ -80,6 +84,11 @@ def main():
|
||||
action='store_true',
|
||||
help="use plms sampling",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--klms",
|
||||
action='store_true',
|
||||
help="use klms sampling",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--laion400m",
|
||||
action='store_true',
|
||||
@@ -190,6 +199,22 @@ def main():
|
||||
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
|
||||
model = model.to(device)
|
||||
|
||||
#for klms
|
||||
model_wrap = K.external.CompVisDenoiser(model)
|
||||
accelerator = accelerate.Accelerator()
|
||||
device = accelerator.device
|
||||
class CFGDenoiser(nn.Module):
|
||||
def __init__(self, model):
|
||||
super().__init__()
|
||||
self.inner_model = model
|
||||
|
||||
def forward(self, x, sigma, uncond, cond, cond_scale):
|
||||
x_in = torch.cat([x] * 2)
|
||||
sigma_in = torch.cat([sigma] * 2)
|
||||
cond_in = torch.cat([uncond, cond])
|
||||
uncond, cond = self.inner_model(x_in, sigma_in, cond=cond_in).chunk(2)
|
||||
return uncond + (cond - uncond) * cond_scale
|
||||
|
||||
if opt.plms:
|
||||
sampler = PLMSSampler(model)
|
||||
else:
|
||||
@@ -226,8 +251,8 @@ def main():
|
||||
with model.ema_scope():
|
||||
tic = time.time()
|
||||
all_samples = list()
|
||||
for n in trange(opt.n_iter, desc="Sampling"):
|
||||
for prompts in tqdm(data, desc="data"):
|
||||
for n in trange(opt.n_iter, desc="Sampling", disable =not accelerator.is_main_process):
|
||||
for prompts in tqdm(data, desc="data", disable =not accelerator.is_main_process):
|
||||
uc = None
|
||||
if opt.scale != 1.0:
|
||||
uc = model.get_learned_conditioning(batch_size * [""])
|
||||
@@ -235,18 +260,32 @@ def main():
|
||||
prompts = list(prompts)
|
||||
c = model.get_learned_conditioning(prompts)
|
||||
shape = [opt.C, opt.H // opt.f, opt.W // opt.f]
|
||||
samples_ddim, _ = sampler.sample(S=opt.ddim_steps,
|
||||
conditioning=c,
|
||||
batch_size=opt.n_samples,
|
||||
shape=shape,
|
||||
verbose=False,
|
||||
unconditional_guidance_scale=opt.scale,
|
||||
unconditional_conditioning=uc,
|
||||
eta=opt.ddim_eta,
|
||||
x_T=start_code)
|
||||
|
||||
|
||||
if not opt.klms:
|
||||
samples_ddim, _ = sampler.sample(S=opt.ddim_steps,
|
||||
conditioning=c,
|
||||
batch_size=opt.n_samples,
|
||||
shape=shape,
|
||||
verbose=False,
|
||||
unconditional_guidance_scale=opt.scale,
|
||||
unconditional_conditioning=uc,
|
||||
eta=opt.ddim_eta,
|
||||
x_T=start_code)
|
||||
else:
|
||||
sigmas = model_wrap.get_sigmas(opt.ddim_steps)
|
||||
if start_code:
|
||||
x = start_code
|
||||
else:
|
||||
x = torch.randn([opt.n_samples, *shape], device=device) * sigmas[0] # for GPU draw
|
||||
model_wrap_cfg = CFGDenoiser(model_wrap)
|
||||
extra_args = {'cond': c, 'uncond': uc, 'cond_scale': opt.scale}
|
||||
samples_ddim = K.sampling.sample_lms(model_wrap_cfg, x, sigmas, extra_args=extra_args, disable=not accelerator.is_main_process)
|
||||
|
||||
x_samples_ddim = model.decode_first_stage(samples_ddim)
|
||||
x_samples_ddim = torch.clamp((x_samples_ddim + 1.0) / 2.0, min=0.0, max=1.0)
|
||||
|
||||
if opt.klms:
|
||||
x_sample = accelerator.gather(x_samples_ddim)
|
||||
|
||||
if not opt.skip_save:
|
||||
for x_sample in x_samples_ddim:
|
||||