Compare commits

...

25 Commits

Author SHA1 Message Date
Lincoln Stein
7a701506a4 restore ability of ksamplers to process -v variation options
- supersedes PR #977
- works with both img2img and txt2img
2022-10-07 16:25:58 -04:00
Lincoln Stein
3d7bc074cf autorotate init images using exif orientation tag 2022-10-07 12:06:50 -04:00
Jakub Kolčář
70bb7f4a61 fixed perlin noise generation for mps (macos) - fix for cpu fallback 2022-10-07 10:36:45 -04:00
Lincoln Stein
9c9cb71544 rebuild frontend package 2022-10-07 10:20:02 -04:00
spezialspezial
a7515624b2 remove duplicated code 2022-10-07 08:12:55 -04:00
Lincoln Stein
9f34ddfcea fix crash on len(Nonetype) in k_sampler 2022-10-07 08:05:13 -04:00
Lincoln Stein
c6a7be63b8 fix crash in generate._transparency_check_and_warning() 2022-10-06 21:00:27 -04:00
Lincoln Stein
75165957c9 Revert "realesrgan inherits precision setting from main program"
This reverts commit 5f42d08945.

This fix was intended to solve issue #939, in which ESRGAN generates
dark images when upscaling 4X on certain GTX cards. However, the fix
apparently conflicts with some versions of the ESRGAN library, so it
will have to wait until after the release of 2.0.
2022-10-06 20:52:38 -04:00
Lincoln Stein
d60df54f69 fix k_samplers in img2img - probably correct now 2022-10-06 18:53:54 -04:00
Lincoln Stein
82481a6f9c Merge branch 'release-candidate-2' of github.com:invoke-ai/InvokeAI into release-candidate-2 2022-10-06 13:58:53 -04:00
Lincoln Stein
90d64388ab Merge branch 'release-candidate-2' into release-candidate-2
- This includes #949 "Bug fixes for new Threshold and Perlin Options"
2022-10-06 13:57:43 -04:00
Lincoln Stein
3444c8e6b8 Merge branch 'release-candidate-2' into release-candidate-2 2022-10-06 13:53:27 -04:00
psychedelicious
d84321e080 Adds hotkeys to modal 2022-10-06 13:49:09 -04:00
psychedelicious
6542556ebd Adds next/prev image buttons/hotkeys 2022-10-06 13:48:59 -04:00
blessedcoolant
70bbb670ec Add Basic Hotkey Support 2022-10-06 13:27:42 -04:00
Lincoln Stein
5f42d08945 realesrgan inherits precision setting from main program 2022-10-06 12:23:30 -04:00
blessedcoolant
911c99f125 Fix WebUI CORS Issue 2022-10-06 11:17:48 -04:00
Lincoln Stein
2154dd2349 prevent crashes due to uninitialized free_gpu_mem 2022-10-06 10:54:05 -04:00
Lincoln Stein
f3050fefce bug and warning message fixes
- txt2img2img back to using DDIM as img2img sampler; results produced
  by some k* samplers are just not reliable enough for good user
  experience
- img2img progress message clarifies why img2img steps taken != steps requested
- warn of potential problems when user tries to run img2img on a small init image
2022-10-06 10:39:08 -04:00
Lincoln Stein
183b98384f set perlin & threshold to zero on generator initialization 2022-10-06 09:35:04 -04:00
Peter Baylies
6d475ee290 * Bug fixes for new Threshold and Perlin options 2022-10-06 08:46:27 -04:00
Lincoln Stein
2f29b78a00 enable --hires to use k* samplers 2022-10-05 17:18:32 -04:00
ArDiouscuros
bcb6e2e506 Fix for crashes in txt2img hires fix mode 2022-10-05 17:13:43 -04:00
Lincoln Stein
194b875cf3 Update IMG2IMG.md
Added information on the small initial image size bug.
2022-10-05 15:55:38 -04:00
Lincoln Stein
b2cd98259d rename img files with colons 2022-10-05 12:56:57 -04:00
41 changed files with 1220 additions and 630 deletions

View File

@@ -53,17 +53,14 @@ class InvokeAIWebServer:
cors_allowed_origins = [
'http://127.0.0.1:5173',
'http://localhost:5173',
'http://localhost:9090'
]
additional_allowed_origins = (
opt.cors if opt.cors else []
) # additional CORS allowed origins
if self.host == '127.0.0.1':
cors_allowed_origins.extend(
[
f'http://{self.host}:{self.port}',
f'http://localhost:{self.port}',
]
)
cors_allowed_origins.append(f'http://{self.host}:{self.port}')
if self.host == '127.0.0.1' or self.host == '0.0.0.0':
cors_allowed_origins.append(f'http://localhost:{self.port}')
cors_allowed_origins = (
cors_allowed_origins + additional_allowed_origins
)
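
The fix above amounts to: always allow the origin of the bound `host:port`, and additionally allow `localhost` when binding to loopback or to all interfaces (previously only `127.0.0.1` got the extra origins, which appears to be the WebUI CORS issue being fixed). A minimal standalone sketch of the resulting computation (function and argument names are ours, not the server's):

```python
def compute_cors_origins(host: str, port: int, extra_origins=None):
    """Sketch of the new allowed-origins logic from InvokeAIWebServer."""
    origins = [
        'http://127.0.0.1:5173',
        'http://localhost:5173',
        'http://localhost:9090',
    ]
    # The bound host:port is now always allowed...
    origins.append(f'http://{host}:{port}')
    # ...and localhost is also allowed for loopback and wildcard binds.
    if host in ('127.0.0.1', '0.0.0.0'):
        origins.append(f'http://localhost:{port}')
    return origins + (extra_origins or [])

print(compute_cors_origins('0.0.0.0', 9090))
# ['http://127.0.0.1:5173', 'http://localhost:5173', 'http://localhost:9090',
#  'http://0.0.0.0:9090', 'http://localhost:9090']
```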

View File

Binary image: 501 KiB before and after.

View File

Binary image: 473 KiB before and after.

View File

Binary image: 618 KiB before and after.

View File

Binary image: 557 KiB before and after.

View File

@@ -10,18 +10,39 @@ top of the image you provide, preserving the original's basic shape and layout.
the `--init_img` option as shown here:
```commandline
dream> "waterfall and rainbow" --init_img=./init-images/crude_drawing.png --strength=0.5 -s100 -n4
tree on a hill with a river, nature photograph, national geographic -I./test-pictures/tree-and-river-sketch.png -f 0.85
```
This will take the original image shown here:
<img src="https://user-images.githubusercontent.com/50542132/193946000-c42a96d8-5a74-4f8a-b4c3-5213e6cadcce.png" width=350>
and generate a new image based on it as shown here:
<img src="https://user-images.githubusercontent.com/111189/194135515-53d4c060-e994-4016-8121-7c685e281ac9.png" width=350>
The `--init_img (-I)` option gives the path to the seed picture. `--strength (-f)` controls how much
the original will be modified, ranging from `0.0` (keep the original intact), to `1.0` (ignore the
original completely). The default is `0.75`, and ranges from `0.25-0.75` give interesting results.
original completely). The default is `0.75`, and ranges from `0.25-0.90` give interesting results.
Other relevant options include `-C` (classifier-free guidance scale), and `-s` (steps). Unlike `txt2img`,
adding steps will continuously change the resulting image and it will not converge.
You may also pass a `-v<variation_amount>` option to generate `-n<iterations>` variants of
the original image. This is done by passing the first generated image
back into img2img the requested number of times, which produces a chain of
interesting variants, as in the example below.
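For instance, this illustrative invocation (flags exactly as documented above) produces four progressively drifting variants of the sketch:
```commandline
dream> "waterfall and rainbow" -I./init-images/crude_drawing.png -f 0.6 -v 0.2 -n4
```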
Note that the prompt makes a big difference. For example, this slight variation on the prompt produces
a very different image:
`photograph of a tree on a hill with a river`
<img src="https://user-images.githubusercontent.com/111189/194135220-16b62181-b60c-4248-8989-4834a8fd7fbd.png" width=350>
(When designing prompts, think about how the images scraped from the internet were captioned. Very few photographs will
be labeled "photograph" or "photorealistic." They will, however, be captioned with the publication, photographer, camera
model, or film settings.)
If the initial image contains transparent regions, then Stable Diffusion will only draw within the
transparent regions, a process called "inpainting". However, for this to work correctly, the color
information underneath the transparent regions needs to be preserved, not erased.
@@ -29,6 +50,14 @@ information underneath the transparent needs to be preserved, not erased.
More details can be found here:
[Creating Transparent Images For Inpainting](./INPAINTING.md#creating-transparent-regions-for-inpainting)
**IMPORTANT ISSUE** `img2img` does not work properly on initial images smaller than 512x512. Please scale your
image to at least 512x512 before using it. Larger images are not a problem, but may run out of VRAM on your
GPU card. To avoid this, use the `--fit` option, which downscales the initial image to fit within the box specified
by width x height:
~~~
tree on a hill with a river, national geographic -I./test-pictures/big-sketch.png -H512 -W512 --fit
~~~
## How does it actually work, though?
The main difference between `img2img` and `prompt2img` is the starting point. While `prompt2img` always starts with pure

View File

@@ -114,7 +114,7 @@ is depth there, so the enclosing frame is actually a cube.
### "blue sphere:0.25 red cube:0.75 hybrid"
<img src="../assets/prompt-blending/blue-sphere:0.25-red-cube:0.75-hybrid.png" width=256>
<img src="../assets/prompt-blending/blue-sphere-0.25-red-cube-0.75-hybrid.png" width=256>
Now that's interesting. We get neither a blue sphere nor a red cube,
but a red sphere embedded in a brick wall, which represents a melding
@@ -123,14 +123,14 @@ representations. Where is Ludwig Wittgenstein when you need him?
### "blue sphere:0.75 red cube:0.25 hybrid"
<img src="../assets/prompt-blending/blue-sphere:0.75-red-cube:0.25-hybrid.png" width=256>
<img src="../assets/prompt-blending/blue-sphere-0.75-red-cube-0.25-hybrid.png" width=256>
Definitely more blue-spherey. The cube is gone entirely, but it's
really cool abstract art.
### "blue sphere:0.5 red cube:0.5 hybrid"
<img src="../assets/prompt-blending/blue-sphere:0.5-red-cube:0.5-hybrid.png" width=256>
<img src="../assets/prompt-blending/blue-sphere-0.5-red-cube-0.5-hybrid.png" width=256>
Whoa...! I see blue and red, but no spheres or cubes. Is the word
"hybrid" summoning up the concept of some sort of scifi creature?
@@ -138,7 +138,7 @@ Let's find out.
### "blue sphere:0.5 red cube:0.5"
<img src="../assets/prompt-blending/blue-sphere:0.5-red-cube:0.5.png" width=256>
<img src="../assets/prompt-blending/blue-sphere-0.5-red-cube-0.5.png" width=256>
Indeed, removing the word "hybrid" produces an image that is more like
what we'd expect.

frontend/dist/assets/index.3a9574b7.js (vendored new file, 483 lines)

File diff suppressed because one or more lines are too long.

File diffs for three further vendored frontend/dist assets are likewise suppressed because one or more lines are too long.

View File

@@ -6,8 +6,8 @@
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>InvokeAI - A Stable Diffusion Toolkit</title>
<link rel="shortcut icon" type="icon" href="/assets/favicon.0d253ced.ico" />
<script type="module" crossorigin src="/assets/index.d9916e7a.js"></script>
<link rel="stylesheet" href="/assets/index.853a336f.css">
<script type="module" crossorigin src="/assets/index.3a9574b7.js"></script>
<link rel="stylesheet" href="/assets/index.60ca0ee5.css">
</head>
<body>

View File

@@ -23,6 +23,7 @@
"react": "^18.2.0",
"react-dom": "^18.2.0",
"react-dropzone": "^14.2.2",
"react-hotkeys-hook": "^3.4.7",
"react-icons": "^4.4.0",
"react-redux": "^8.0.2",
"redux-persist": "^6.0.0",

View File

@@ -19,6 +19,8 @@ import { MdDelete, MdFace, MdHd, MdImage, MdInfo } from 'react-icons/md';
import InvokePopover from './InvokePopover';
import UpscaleOptions from '../options/AdvancedOptions/Upscale/UpscaleOptions';
import FaceRestoreOptions from '../options/AdvancedOptions/FaceRestore/FaceRestoreOptions';
import { useHotkeys } from 'react-hotkeys-hook';
import { useToast } from '@chakra-ui/react';
const systemSelector = createSelector(
(state: RootState) => state.system,
@@ -54,6 +56,8 @@ const CurrentImageButtons = ({
}: CurrentImageButtonsProps) => {
const dispatch = useAppDispatch();
const toast = useToast();
const intermediateImage = useAppSelector(
(state: RootState) => state.gallery.intermediateImage
);
@@ -71,19 +75,163 @@ const CurrentImageButtons = ({
const handleClickUseAsInitialImage = () =>
dispatch(setInitialImagePath(image.url));
useHotkeys(
'shift+i',
() => {
if (image) {
handleClickUseAsInitialImage();
toast({
title: 'Sent To Image To Image',
status: 'success',
duration: 2500,
isClosable: true,
});
} else {
toast({
title: 'No Image Loaded',
description: 'No image found to send to image to image module.',
status: 'error',
duration: 2500,
isClosable: true,
});
}
},
[image]
);
const handleClickUseAllParameters = () =>
dispatch(setAllParameters(image.metadata));
useHotkeys(
'a',
() => {
if (['txt2img', 'img2img'].includes(image?.metadata?.image?.type)) {
handleClickUseAllParameters();
toast({
title: 'Parameters Set',
status: 'success',
duration: 2500,
isClosable: true,
});
} else {
toast({
title: 'Parameters Not Set',
description: 'No metadata found for this image.',
status: 'error',
duration: 2500,
isClosable: true,
});
}
},
[image]
);
// Non-null assertion: this button is disabled if there is no seed.
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
const handleClickUseSeed = () => dispatch(setSeed(image.metadata.image.seed));
useHotkeys(
's',
() => {
if (image?.metadata?.image?.seed) {
handleClickUseSeed();
toast({
title: 'Seed Set',
status: 'success',
duration: 2500,
isClosable: true,
});
} else {
toast({
title: 'Seed Not Set',
description: 'Could not find seed for this image.',
status: 'error',
duration: 2500,
isClosable: true,
});
}
},
[image]
);
const handleClickUpscale = () => dispatch(runESRGAN(image));
useHotkeys(
'u',
() => {
if (
isESRGANAvailable &&
Boolean(!intermediateImage) &&
isConnected &&
!isProcessing &&
upscalingLevel
) {
handleClickUpscale();
} else {
toast({
title: 'Upscaling Failed',
status: 'error',
duration: 2500,
isClosable: true,
});
}
},
[
image,
isESRGANAvailable,
intermediateImage,
isConnected,
isProcessing,
upscalingLevel,
]
);
const handleClickFixFaces = () => dispatch(runGFPGAN(image));
useHotkeys(
'r',
() => {
if (
isGFPGANAvailable &&
Boolean(!intermediateImage) &&
isConnected &&
!isProcessing &&
gfpganStrength
) {
handleClickFixFaces();
} else {
toast({
title: 'Face Restoration Failed',
status: 'error',
duration: 2500,
isClosable: true,
});
}
},
[
image,
isGFPGANAvailable,
intermediateImage,
isConnected,
isProcessing,
gfpganStrength,
]
);
const handleClickShowImageDetails = () =>
setShouldShowImageDetails(!shouldShowImageDetails);
useHotkeys(
'i',
() => {
if (image) {
handleClickShowImageDetails();
} else {
toast({
title: 'Failed to load metadata',
status: 'error',
duration: 2500,
isClosable: true,
});
}
},
[image, shouldShowImageDetails]
);
return (
<div className="current-image-options">

View File

@@ -67,6 +67,44 @@
}
}
.current-image-next-prev-buttons {
position: absolute;
top: 0;
left: 0;
display: flex;
align-items: center;
justify-content: space-between;
width: calc(100% - 2rem);
padding: 0.5rem;
margin-left: 1rem;
z-index: 1;
height: calc($app-metadata-height - 1rem);
pointer-events: none;
}
.next-prev-button-trigger-area {
width: 7rem;
height: 100%;
display: flex;
align-items: center;
pointer-events: auto;
&.prev-button-trigger-area {
justify-content: flex-start;
}
&.next-button-trigger-area {
justify-content: flex-end;
}
}
.next-prev-button {
font-size: 5rem;
fill: var(--text-color-secondary);
filter: drop-shadow(0 0 1rem var(--text-color-secondary));
opacity: 70%;
}
.current-image-metadata-viewer {
border-radius: 0.5rem;
position: absolute;

View File

@@ -1,15 +1,21 @@
import { Image } from '@chakra-ui/react';
import { useAppSelector } from '../../app/store';
import { IconButton, Image } from '@chakra-ui/react';
import { useAppDispatch, useAppSelector } from '../../app/store';
import { RootState } from '../../app/store';
import { useState } from 'react';
import ImageMetadataViewer from './ImageMetadataViewer';
import CurrentImageButtons from './CurrentImageButtons';
import { MdPhoto } from 'react-icons/md';
import { FaAngleLeft, FaAngleRight } from 'react-icons/fa';
import { selectNextImage, selectPrevImage } from './gallerySlice';
/**
* Displays the current image if there is one, plus associated actions.
*/
const CurrentImageDisplay = () => {
const dispatch = useAppDispatch();
const [shouldShowNextPrevButtons, setShouldShowNextPrevButtons] =
useState<boolean>(false);
const { currentImage, intermediateImage } = useAppSelector(
(state: RootState) => state.gallery
);
@@ -19,6 +25,22 @@ const CurrentImageDisplay = () => {
const imageToDisplay = intermediateImage || currentImage;
const handleCurrentImagePreviewMouseOver = () => {
setShouldShowNextPrevButtons(true);
};
const handleCurrentImagePreviewMouseOut = () => {
setShouldShowNextPrevButtons(false);
};
const handleClickPrevButton = () => {
dispatch(selectPrevImage());
};
const handleClickNextButton = () => {
dispatch(selectNextImage());
};
return imageToDisplay ? (
<div className="current-image-display">
<div className="current-image-tools">
@@ -40,6 +62,38 @@ const CurrentImageDisplay = () => {
<ImageMetadataViewer image={imageToDisplay} />
</div>
)}
{!shouldShowImageDetails && (
<div className="current-image-next-prev-buttons">
<div
className="next-prev-button-trigger-area prev-button-trigger-area"
onMouseOver={handleCurrentImagePreviewMouseOver}
onMouseOut={handleCurrentImagePreviewMouseOut}
>
{shouldShowNextPrevButtons && (
<IconButton
aria-label="Previous image"
icon={<FaAngleLeft className="next-prev-button" />}
variant="unstyled"
onClick={handleClickPrevButton}
/>
)}
</div>
<div
className="next-prev-button-trigger-area next-button-trigger-area"
onMouseOver={handleCurrentImagePreviewMouseOver}
onMouseOut={handleCurrentImagePreviewMouseOut}
>
{shouldShowNextPrevButtons && (
<IconButton
aria-label="Next image"
icon={<FaAngleRight className="next-prev-button" />}
variant="unstyled"
onClick={handleClickNextButton}
/>
)}
</div>
</div>
)}
</div>
</div>
) : (

View File

@@ -27,6 +27,7 @@ import { deleteImage } from '../../app/socketio/actions';
import { RootState } from '../../app/store';
import { setShouldConfirmOnDelete, SystemState } from '../system/systemSlice';
import * as InvokeAI from '../../app/invokeai';
import { useHotkeys } from 'react-hotkeys-hook';
interface DeleteImageModalProps {
/**
@@ -67,6 +68,14 @@ const DeleteImageModal = forwardRef(
onClose();
};
useHotkeys(
'del',
() => {
shouldConfirmOnDelete ? onOpen() : handleDelete();
},
[image, shouldConfirmOnDelete]
);
const handleChangeShouldConfirmOnDelete = (
e: ChangeEvent<HTMLInputElement>
) => dispatch(setShouldConfirmOnDelete(!e.target.checked));

View File

@@ -1,8 +1,10 @@
import { Button } from '@chakra-ui/react';
import { useHotkeys } from 'react-hotkeys-hook';
import { MdPhotoLibrary } from 'react-icons/md';
import { requestImages } from '../../app/socketio/actions';
import { RootState, useAppDispatch } from '../../app/store';
import { useAppSelector } from '../../app/store';
import { selectNextImage, selectPrevImage } from './gallerySlice';
import HoverableImage from './HoverableImage';
/**
@@ -25,6 +27,22 @@ const ImageGallery = () => {
dispatch(requestImages());
};
useHotkeys(
'left',
() => {
dispatch(selectPrevImage());
},
[]
);
useHotkeys(
'right',
() => {
dispatch(selectNextImage());
},
[]
);
return (
<div className="image-gallery-container">
{images.length ? (

View File

@@ -1,6 +1,6 @@
import { createSlice } from '@reduxjs/toolkit';
import type { PayloadAction } from '@reduxjs/toolkit';
import { clamp } from 'lodash';
import _, { clamp } from 'lodash';
import * as InvokeAI from '../../app/invokeai';
export interface GalleryState {
@@ -85,6 +85,32 @@ export const gallerySlice = createSlice({
clearIntermediateImage: (state) => {
state.intermediateImage = undefined;
},
selectNextImage: (state) => {
const { images, currentImage } = state;
if (currentImage) {
const currentImageIndex = images.findIndex(
(i) => i.uuid === currentImage.uuid
);
if (_.inRange(currentImageIndex, 0, images.length - 1)) {
const newCurrentImage = images[currentImageIndex + 1];
state.currentImage = newCurrentImage;
state.currentImageUuid = newCurrentImage.uuid;
}
}
},
selectPrevImage: (state) => {
const { images, currentImage } = state;
if (currentImage) {
const currentImageIndex = images.findIndex(
(i) => i.uuid === currentImage.uuid
);
if (_.inRange(currentImageIndex, 1, images.length + 1)) {
const newCurrentImage = images[currentImageIndex - 1];
state.currentImage = newCurrentImage;
state.currentImageUuid = newCurrentImage.uuid;
}
}
},
addGalleryImages: (
state,
action: PayloadAction<{
@@ -122,6 +148,8 @@ export const {
setCurrentImage,
addGalleryImages,
setIntermediateImage,
selectNextImage,
selectPrevImage,
} = gallerySlice.actions;
export default gallerySlice.reducer;

View File

@@ -18,6 +18,8 @@ const SeedOptions = () => {
</Flex>
<Flex gap={2}>
<Threshold />
</Flex>
<Flex gap={2}>
<Perlin />
</Flex>
</Flex>

View File

@@ -4,12 +4,23 @@ import { cancelProcessing } from '../../../app/socketio/actions';
import { useAppDispatch, useAppSelector } from '../../../app/store';
import IAIIconButton from '../../../common/components/IAIIconButton';
import { systemSelector } from '../../../common/hooks/useCheckParameters';
import { useHotkeys } from 'react-hotkeys-hook';
export default function CancelButton() {
const dispatch = useAppDispatch();
const { isProcessing, isConnected } = useAppSelector(systemSelector);
const handleClickCancel = () => dispatch(cancelProcessing());
useHotkeys(
'shift+x',
() => {
if (isConnected || isProcessing) {
handleClickCancel();
}
},
[isConnected, isProcessing]
);
return (
<IAIIconButton
icon={<MdCancel />}

View File

@@ -1,5 +1,5 @@
import { FormControl, Textarea } from '@chakra-ui/react';
import { ChangeEvent, KeyboardEvent } from 'react';
import { ChangeEvent, KeyboardEvent, useRef } from 'react';
import { RootState, useAppDispatch, useAppSelector } from '../../../app/store';
import { generateImage } from '../../../app/socketio/actions';
@@ -9,6 +9,7 @@ import { isEqual } from 'lodash';
import useCheckParameters, {
systemSelector,
} from '../../../common/hooks/useCheckParameters';
import { useHotkeys } from 'react-hotkeys-hook';
export const optionsSelector = createSelector(
(state: RootState) => state.options,
@@ -28,6 +29,7 @@ export const optionsSelector = createSelector(
* Prompt input text area.
*/
const PromptInput = () => {
const promptRef = useRef<HTMLTextAreaElement>(null);
const { prompt } = useAppSelector(optionsSelector);
const { isProcessing } = useAppSelector(systemSelector);
const dispatch = useAppDispatch();
@@ -37,6 +39,24 @@ const PromptInput = () => {
dispatch(setPrompt(e.target.value));
};
useHotkeys(
'ctrl+enter',
() => {
if (isReady) {
dispatch(generateImage());
}
},
[isReady]
);
useHotkeys(
'alt+a',
() => {
promptRef.current?.focus();
},
[]
);
const handleKeyDown = (e: KeyboardEvent<HTMLTextAreaElement>) => {
if (e.key === 'Enter' && e.shiftKey === false && isReady) {
e.preventDefault();
@@ -60,6 +80,7 @@ const PromptInput = () => {
onKeyDown={handleKeyDown}
resize="vertical"
height={30}
ref={promptRef}
/>
</FormControl>
</div>

View File

@@ -246,7 +246,9 @@ export const optionsSlice = createSlice({
if (steps) state.steps = steps;
if (cfg_scale) state.cfgScale = cfg_scale;
if (threshold) state.threshold = threshold;
if (typeof threshold === 'undefined') state.threshold = 0;
if (perlin) state.perlin = perlin;
if (typeof perlin === 'undefined') state.perlin = 0;
if (typeof seamless === 'boolean') state.seamless = seamless;
if (width) state.width = width;
if (height) state.height = height;
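
The two `typeof ... === 'undefined'` lines above implement a reset-on-missing rule: when recalled metadata lacks `threshold` or `perlin`, the state is reset to `0` rather than left at whatever the UI previously held. The same rule in a language-neutral sketch (hypothetical helper, not the reducer itself):

```python
def recall_noise_options(state: dict, metadata: dict) -> dict:
    """Sketch: metadata fields that are absent reset to their defaults
    instead of silently keeping stale UI values."""
    new_state = dict(state)
    for key, default in (('threshold', 0), ('perlin', 0)):
        new_state[key] = metadata.get(key, default)
    return new_state

print(recall_noise_options({'threshold': 5, 'perlin': 0.2}, {}))
# {'threshold': 0, 'perlin': 0}
```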

View File

@@ -0,0 +1,53 @@
@use '../../../styles/Mixins/' as *;
.hotkeys-modal {
display: grid;
padding: 1rem;
background-color: var(--settings-modal-bg) !important;
row-gap: 1rem;
font-family: Inter;
h1 {
font-size: 1.2rem;
font-weight: bold;
}
}
.hotkeys-modal-items {
display: grid;
row-gap: 0.5rem;
max-height: 32rem;
overflow-y: scroll;
@include HideScrollbar;
}
.hotkey-modal-item {
display: grid;
grid-template-columns: auto max-content;
justify-content: space-between;
align-items: center;
background-color: var(--background-color);
padding: 0.5rem 1rem;
border-radius: 0.3rem;
.hotkey-info {
display: grid;
.hotkey-title {
font-weight: bold;
}
.hotkey-description {
font-size: 0.9rem;
color: var(--text-color-secondary);
}
}
.hotkey-key {
font-size: 0.8rem;
font-weight: bold;
border: 2px solid var(--settings-modal-bg);
padding: 0.2rem 0.5rem;
border-radius: 0.3rem;
}
}

View File

@@ -0,0 +1,98 @@
import {
Modal,
ModalCloseButton,
ModalContent,
ModalOverlay,
useDisclosure,
} from '@chakra-ui/react';
import React, { cloneElement, ReactElement } from 'react';
import HotkeysModalItem from './HotkeysModalItem';
type HotkeysModalProps = {
/* The button to open the Hotkeys Modal */
children: ReactElement;
};
export default function HotkeysModal({ children }: HotkeysModalProps) {
const {
isOpen: isHotkeyModalOpen,
onOpen: onHotkeysModalOpen,
onClose: onHotkeysModalClose,
} = useDisclosure();
const hotkeys = [
{ title: 'Invoke', desc: 'Generate an image', hotkey: 'Ctrl+Enter' },
{ title: 'Cancel', desc: 'Cancel image generation', hotkey: 'Shift+X' },
{
title: 'Set Seed',
desc: 'Use the seed of the current image',
hotkey: 'S',
},
{
title: 'Set Parameters',
desc: 'Use all parameters of the current image',
hotkey: 'A',
},
{ title: 'Restore Faces', desc: 'Restore the current image', hotkey: 'R' },
{ title: 'Upscale', desc: 'Upscale the current image', hotkey: 'U' },
{
title: 'Show Info',
desc: 'Show metadata info of the current image',
hotkey: 'I',
},
{
title: 'Send To Image To Image',
desc: 'Send the current image to Image to Image module',
hotkey: 'Shift+I',
},
{ title: 'Delete Image', desc: 'Delete the current image', hotkey: 'Del' },
{
title: 'Focus Prompt',
desc: 'Focus the prompt input area',
hotkey: 'Alt+A',
},
{
title: 'Previous Image',
desc: 'Display the previous image in the gallery',
hotkey: 'Arrow left',
},
{
title: 'Next Image',
desc: 'Display the next image in the gallery',
hotkey: 'Arrow right',
},
];
const renderHotkeyModalItems = () => {
const hotkeyModalItemsToRender: ReactElement[] = [];
hotkeys.forEach((hotkey, i) => {
hotkeyModalItemsToRender.push(
<HotkeysModalItem
key={i}
title={hotkey.title}
description={hotkey.desc}
hotkey={hotkey.hotkey}
/>
);
});
return hotkeyModalItemsToRender;
};
return (
<>
{cloneElement(children, {
onClick: onHotkeysModalOpen,
})}
<Modal isOpen={isHotkeyModalOpen} onClose={onHotkeysModalClose}>
<ModalOverlay />
<ModalContent className="hotkeys-modal">
<ModalCloseButton />
<h1>Keyboard Shortcuts</h1>
<div className="hotkeys-modal-items">{renderHotkeyModalItems()}</div>
</ModalContent>
</Modal>
</>
);
}

View File

@@ -0,0 +1,20 @@
import React from 'react';
interface HotkeysModalProps {
hotkey: string;
title: string;
description?: string;
}
export default function HotkeysModalItem(props: HotkeysModalProps) {
const { title, hotkey, description } = props;
return (
<div className="hotkey-modal-item">
<div className="hotkey-info">
<p className="hotkey-title">{title}</p>
{description && <p className="hotkey-description">{description}</p>}
</div>
<div className="hotkey-key">{hotkey}</div>
</div>
);
}

View File

@@ -2,6 +2,7 @@
.settings-modal {
background-color: var(--settings-modal-bg) !important;
font-family: Inter;
.settings-modal-content {
display: grid;

View File

@@ -21,7 +21,7 @@
.site-header-right-side {
display: grid;
grid-template-columns: repeat(5, max-content);
grid-template-columns: repeat(6, max-content);
align-items: center;
column-gap: 0.5rem;
}

View File

@@ -1,9 +1,11 @@
import { IconButton, Link, useColorMode } from '@chakra-ui/react';
import { FaSun, FaMoon, FaGithub } from 'react-icons/fa';
import { MdHelp, MdSettings } from 'react-icons/md';
import { MdHelp, MdKeyboard, MdSettings } from 'react-icons/md';
import InvokeAILogo from '../../assets/images/logo.png';
import HotkeysModal from './HotkeysModal/HotkeysModal';
import SettingsModal from './SettingsModal/SettingsModal';
import StatusIndicator from './StatusIndicator';
@@ -40,6 +42,16 @@ const SiteHeader = () => {
/>
</SettingsModal>
<HotkeysModal>
<IconButton
aria-label="Hotkeys"
variant="link"
fontSize={24}
size={'sm'}
icon={<MdKeyboard />}
/>
</HotkeysModal>
<IconButton
aria-label="Link to Github Issues"
variant="link"

View File

@@ -11,6 +11,7 @@
@use '../features/system/SiteHeader.scss';
@use '../features/system/StatusIndicator.scss';
@use '../features/system/SettingsModal/SettingsModal.scss';
@use '../features/system/HotkeysModal/HotkeysModal.scss';
@use '../features/system/Console.scss';
// options

View File

@@ -1582,6 +1582,11 @@ balanced-match@^1.0.0:
resolved "https://registry.yarnpkg.com/balanced-match/-/balanced-match-1.0.2.tgz#e83e3a7e3f300b34cb9d87f615fa0cbf357690ee"
integrity sha512-3oSeUO0TMV67hN1AmbXsK4yaqU7tjiHlbxRDZOpH0KW9+CeX4bRAaX0Anxt0tx2MrpRpWwQaPwIlISEJhYU5Pw==
base64id@2.0.0, base64id@~2.0.0:
version "2.0.0"
resolved "https://registry.yarnpkg.com/base64id/-/base64id-2.0.0.tgz#2770ac6bc47d312af97a8bf9a634342e0cd25cb6"
integrity sha512-lGe34o6EHj9y3Kts9R4ZYs/Gr+6N7MCaMlIFA3F1R2O5/m7K06AxfSeO5530PEERE6/WyEg3lsuyw4GHlPZHog==
binary-extensions@^2.0.0:
version "2.2.0"
resolved "https://registry.yarnpkg.com/binary-extensions/-/binary-extensions-2.2.0.tgz#75f502eeaf9ffde42fc98829645be4ea76bd9e2d"
@@ -2359,6 +2364,11 @@ hoist-non-react-statics@^3.3.0, hoist-non-react-statics@^3.3.1, hoist-non-react-
dependencies:
react-is "^16.7.0"
hotkeys-js@3.9.4:
version "3.9.4"
resolved "https://registry.yarnpkg.com/hotkeys-js/-/hotkeys-js-3.9.4.tgz#ce1aa4c3a132b6a63a9dd5644fc92b8a9b9cbfb9"
integrity sha512-2zuLt85Ta+gIyvs4N88pCYskNrxf1TFv3LR9t5mdAZIX8BcgQQ48F2opUptvHa6m8zsy5v/a0i9mWzTrlNWU0Q==
ignore@^5.2.0:
version "5.2.0"
resolved "https://registry.yarnpkg.com/ignore/-/ignore-5.2.0.tgz#6d3bac8fa7fe0d45d9f9be7bac2fc279577e345a"
@@ -2618,7 +2628,7 @@ normalize-path@^3.0.0, normalize-path@~3.0.0:
resolved "https://registry.yarnpkg.com/normalize-path/-/normalize-path-3.0.0.tgz#0dcd69ff23a1c9b11fd0978316644a0388216a65"
integrity sha512-6eZs5Ls3WtCisHWp9S2GUy8dqkpGi4BVSz3GaqiE6ezub0512ESztXUwUB6C6IKbQkY2Pnb/mD4WYojCRwcwLA==
object-assign@^4.1.1:
object-assign@^4, object-assign@^4.1.1:
version "4.1.1"
resolved "https://registry.yarnpkg.com/object-assign/-/object-assign-4.1.1.tgz#2109adc7965887cfc05cbbd442cac8bfbb360863"
integrity sha512-rJgTQnkUnH1sFw8yT6VSU3zD3sWmu6sZhIseY8VX+GRu3P6F7Fu+JNDoXfklElbLJSnc3FUQHVe4cU5hj+BcUg==
@@ -2818,6 +2828,13 @@ react-focus-lock@^2.9.1:
use-callback-ref "^1.3.0"
use-sidecar "^1.1.2"
react-hotkeys-hook@^3.4.7:
version "3.4.7"
resolved "https://registry.yarnpkg.com/react-hotkeys-hook/-/react-hotkeys-hook-3.4.7.tgz#e16a0a85f59feed9f48d12cfaf166d7df4c96b7a"
integrity sha512-+bbPmhPAl6ns9VkXkNNyxlmCAIyDAcWbB76O4I0ntr3uWCRuIQf/aRLartUahe9chVMPj+OEzzfk3CQSjclUEQ==
dependencies:
hotkeys-js "3.9.4"
react-icons@^4.4.0:
version "4.4.0"
resolved "https://registry.yarnpkg.com/react-icons/-/react-icons-4.4.0.tgz#a13a8a20c254854e1ec9aecef28a95cdf24ef703"
@@ -3044,6 +3061,18 @@ socket.io-parser@~4.2.0:
"@socket.io/component-emitter" "~3.1.0"
debug "~4.3.1"
socket.io@^4.5.2:
version "4.5.2"
resolved "https://registry.yarnpkg.com/socket.io/-/socket.io-4.5.2.tgz#1eb25fd380ab3d63470aa8279f8e48d922d443ac"
integrity sha512-6fCnk4ARMPZN448+SQcnn1u8OHUC72puJcNtSgg2xS34Cu7br1gQ09YKkO1PFfDn/wyUE9ZgMAwosJed003+NQ==
dependencies:
accepts "~1.3.4"
base64id "~2.0.0"
debug "~4.3.2"
engine.io "~6.2.0"
socket.io-adapter "~2.4.0"
socket.io-parser "~4.2.0"
"source-map-js@>=0.6.2 <2.0.0", source-map-js@^1.0.2:
version "1.0.2"
resolved "https://registry.yarnpkg.com/source-map-js/-/source-map-js-1.0.2.tgz#adbc361d9c62df380125e7f161f71c826f1e490c"

View File

@@ -21,6 +21,8 @@ class Generator():
self.seed = None
self.latent_channels = model.channels
self.downsampling_factor = downsampling # BUG: should come from model or config
self.perlin = 0.0
self.threshold = 0
self.variation_amount = 0
self.with_variations = []
@@ -122,8 +124,8 @@ class Generator():
raise NotImplementedError("get_noise() must be implemented in a descendent class")
def get_perlin_noise(self,width,height):
return torch.stack([rand_perlin_2d((height, width), (8, 8)).to(self.model.device) for _ in range(self.latent_channels)], dim=0)
fixdevice = 'cpu' if (self.model.device.type == 'mps') else self.model.device
return torch.stack([rand_perlin_2d((height, width), (8, 8), device = self.model.device).to(fixdevice) for _ in range(self.latent_channels)], dim=0).to(self.model.device)
def new_seed(self):
self.seed = random.randrange(0, np.iinfo(np.uint32).max)
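
The `fixdevice` dance above routes each Perlin channel through the CPU when running on Apple's MPS backend, where some tensor ops used by `rand_perlin_2d` were unreliable at the time; only the final stacked tensor returns to the model device. A sketch of the pattern (assuming `rand_perlin_2d` takes a `device` argument, as in the `ldm/util.py` diff later in this compare):

```python
import torch

def perlin_noise_mps_safe(rand_perlin_2d, channels, height, width, device):
    """Sketch of the MPS workaround: intermediates go to CPU on MPS,
    and only the stacked result goes back to the model device."""
    fixdevice = 'cpu' if device.type == 'mps' else device
    return torch.stack(
        [rand_perlin_2d((height, width), (8, 8), device=device).to(fixdevice)
         for _ in range(channels)],
        dim=0,
    ).to(device)
```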

View File

@@ -49,6 +49,7 @@ class Img2Img(Generator):
img_callback = step_callback,
unconditional_guidance_scale=cfg_scale,
unconditional_conditioning=uc,
init_latent = self.init_latent, # changes how noising is performed in ksampler
)
return self.sample_to_image(samples)

View File

@@ -27,7 +27,7 @@ class Inpaint(Img2Img):
# klms samplers not supported yet, so ignore previous sampler
if isinstance(sampler,KSampler):
print(
f">> sampler '{sampler.__class__.__name__}' is not yet supported for inpainting, using DDIMSampler instead."
f">> Using recommended DDIM sampler for inpainting."
)
sampler = DDIMSampler(self.model, device=self.model.device)

View File

@@ -40,7 +40,12 @@ class Txt2Img2Img(Generator):
init_width // self.downsampling_factor,
]
x = self.get_noise(init_width, init_height)
sampler.make_schedule(
ddim_num_steps=steps, ddim_eta=ddim_eta, verbose=False
)
#x = self.get_noise(init_width, init_height)
x = x_T
if self.free_gpu_mem and self.model.model.device != self.model.device:
self.model.model.to(self.model.device)
@@ -59,7 +64,7 @@ class Txt2Img2Img(Generator):
)
print(
f"\n>> Interpolating from {init_width}x{init_height} to {width}x{height}"
f"\n>> Interpolating from {init_width}x{init_height} to {width}x{height} using DDIM sampling"
)
# resizing
@@ -70,29 +75,19 @@ class Txt2Img2Img(Generator):
)
t_enc = int(strength * steps)
x = None
# Other samplers not supported yet, so ignore previous sampler
if not isinstance(sampler,DDIMSampler):
print(
f"\n>> Sampler '{sampler.__class__.__name__}' is not yet supported for img2img. Using DDIM sampler"
)
img_sampler = DDIMSampler(self.model, device=self.model.device)
img_sampler.make_schedule(
ddim_sampler = DDIMSampler(self.model, device=self.model.device)
ddim_sampler.make_schedule(
ddim_num_steps=steps, ddim_eta=ddim_eta, verbose=False
)
else:
img_sampler = sampler
z_enc = img_sampler.stochastic_encode(
)
z_enc = ddim_sampler.stochastic_encode(
samples,
torch.tensor([t_enc]).to(self.model.device),
noise=x_T
noise=self.get_noise(width,height,False)
)
# decode it
samples = img_sampler.decode(
samples = ddim_sampler.decode(
z_enc,
c,
t_enc,
@@ -110,17 +105,28 @@ class Txt2Img2Img(Generator):
# returns a tensor filled with random numbers from a normal distribution
def get_noise(self,width,height):
def get_noise(self,width,height,scale = True):
# print(f"Get noise: {width}x{height}")
if scale:
trained_square = 512 * 512
actual_square = width * height
scale = math.sqrt(trained_square / actual_square)
scaled_width = math.ceil(scale * width / 64) * 64
scaled_height = math.ceil(scale * height / 64) * 64
else:
scaled_width = width
scaled_height = height
device = self.model.device
if device.type == 'mps':
return torch.randn([1,
self.latent_channels,
height // self.downsampling_factor,
width // self.downsampling_factor],
scaled_height // self.downsampling_factor,
scaled_width // self.downsampling_factor],
device='cpu').to(device)
else:
return torch.randn([1,
self.latent_channels,
height // self.downsampling_factor,
width // self.downsampling_factor],
scaled_height // self.downsampling_factor,
scaled_width // self.downsampling_factor],
device=device)
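
The new `scale` branch of `get_noise()` draws the first-pass noise at roughly the 512x512 training resolution (rounded up to multiples of 64) instead of at the requested size; the second pass then interpolates up. A worked sketch of the arithmetic (a typical latent downsampling factor of 8 is assumed):

```python
import math

def scaled_noise_shape(width: int, height: int, downsampling: int = 8):
    """Latent noise shape for the first txt2img2img pass, per the rule above."""
    scale = math.sqrt((512 * 512) / (width * height))
    scaled_width = math.ceil(scale * width / 64) * 64
    scaled_height = math.ceil(scale * height / 64) * 64
    return scaled_height // downsampling, scaled_width // downsampling

# A 768x768 request: scale ~= 0.667, so noise is drawn at 512x512
# (latent 64x64) and only the second pass works at 768x768.
print(scaled_noise_shape(768, 768))  # (64, 64)
```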

View File

@@ -13,8 +13,6 @@ class Outpaint(object):
seed = old_opt.seed
prompt = old_opt.prompt
print(f'DEBUG: old seed={seed}, old prompt = {prompt}')
def wrapped_callback(img,seed,**kwargs):
image_callback(img,seed,use_prefix=prefix,**kwargs)

View File

@@ -34,23 +34,7 @@ from ldm.dream.image_util import InitImageResizer
from ldm.dream.devices import choose_torch_device, choose_precision
from ldm.dream.conditioning import get_uc_and_c
def fix_func(orig):
if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
def new_func(*args, **kw):
device = kw.get("device", "mps")
kw["device"]="cpu"
return orig(*args, **kw).to(device)
return new_func
return orig
torch.rand = fix_func(torch.rand)
torch.rand_like = fix_func(torch.rand_like)
torch.randn = fix_func(torch.randn)
torch.randn_like = fix_func(torch.randn_like)
torch.randint = fix_func(torch.randint)
torch.randint_like = fix_func(torch.randint_like)
torch.bernoulli = fix_func(torch.bernoulli)
torch.multinomial = fix_func(torch.multinomial)
def fix_func(orig):
if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
@@ -70,23 +54,7 @@ torch.randint_like = fix_func(torch.randint_like)
torch.bernoulli = fix_func(torch.bernoulli)
torch.multinomial = fix_func(torch.multinomial)
def fix_func(orig):
if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
def new_func(*args, **kw):
device = kw.get("device", "mps")
kw["device"]="cpu"
return orig(*args, **kw).to(device)
return new_func
return orig
torch.rand = fix_func(torch.rand)
torch.rand_like = fix_func(torch.rand_like)
torch.randn = fix_func(torch.randn)
torch.randn_like = fix_func(torch.randn_like)
torch.randint = fix_func(torch.randint)
torch.randint_like = fix_func(torch.randint_like)
torch.bernoulli = fix_func(torch.bernoulli)
torch.multinomial = fix_func(torch.multinomial)
"""Simplified text to image API for stable diffusion/latent diffusion
@@ -174,7 +142,8 @@ class Generate:
config = None,
gfpgan=None,
codeformer=None,
esrgan=None
esrgan=None,
free_gpu_mem=False,
):
models = OmegaConf.load(conf)
mconfig = models[model]
@@ -201,6 +170,7 @@ class Generate:
self.gfpgan = gfpgan
self.codeformer = codeformer
self.esrgan = esrgan
self.free_gpu_mem = free_gpu_mem
# Note that in previous versions, there was an option to pass the
# device to Generate(). However the device was then ignored, so
@@ -417,7 +387,8 @@ class Generate:
generator = self._make_txt2img()
generator.set_variation(
self.seed, variation_amount, with_variations)
self.seed, variation_amount, with_variations
)
results = generator.generate(
prompt,
iterations=iterations,
@@ -626,18 +597,14 @@ class Generate:
height,
)
if image.width < self.width and image.height < self.height:
print(f'>> WARNING: img2img and inpainting may produce unexpected results with initial images smaller than {self.width}x{self.height} in both dimensions')
# if image has a transparent area and no mask was provided, then try to generate mask
if self._has_transparency(image) and not mask:
print(
'>> Initial image has transparent areas. Will inpaint in these regions.')
if self._check_for_erasure(image):
print(
'>> WARNING: Colors underneath the transparent region seem to have been erased.\n',
'>> Inpainting will be suboptimal. Please preserve the colors when making\n',
'>> a transparency mask, or provide mask explicitly using --init_mask (-M).'
)
if self._has_transparency(image):
self._transparency_check_and_warning(image, mask)
# this returns a torch tensor
init_mask = self._create_init_mask(image,width,height,fit=fit)
init_mask = self._create_init_mask(image, width, height, fit=fit)
if (image.width * image.height) > (self.width * self.height):
print(">> This input is larger than your defaults. If you run out of memory, please use a smaller image.")
@@ -881,6 +848,7 @@ class Generate:
print(
f'>> loaded input image of size {image.width}x{image.height}'
)
image = ImageOps.exif_transpose(image)
return image
def _create_init_image(self, image, width, height, fit=True):
@@ -889,7 +857,6 @@ class Generate:
image = self._fit_image(image, (width, height))
else:
image = self._squeeze_image(image)
image = np.array(image).astype(np.float32) / 255.0
image = image[None].transpose(0, 3, 1, 2)
image = torch.from_numpy(image)
@@ -906,7 +873,6 @@ class Generate:
image = self._fit_image(image, (width, height))
else:
image = self._squeeze_image(image)
image = image.resize((image.width//downsampling, image.height //
downsampling), resample=Image.Resampling.NEAREST)
image = np.array(image)
@@ -953,6 +919,17 @@ class Generate:
colored += 1
return colored == 0
def _transparency_check_and_warning(self,image, mask):
if not mask:
print(
'>> Initial image has transparent areas. Will inpaint in these regions.')
if self._check_for_erasure(image):
print(
'>> WARNING: Colors underneath the transparent region seem to have been erased.\n',
'>> Inpainting will be suboptimal. Please preserve the colors when making\n',
'>> a transparency mask, or provide mask explicitly using --init_mask (-M).'
)
def _squeeze_image(self, image):
x, y, resize_needed = self._resolution_check(image.width, image.height)
if resize_needed:
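
The one-line `ImageOps.exif_transpose()` call is the whole autorotate fix: it applies (and removes) the EXIF Orientation tag that phone cameras write, so init images enter img2img the way they appear in an image viewer. In isolation, using the standard Pillow API:

```python
from PIL import Image, ImageOps

def load_init_image(path: str) -> Image.Image:
    """Open an init image and bake in its EXIF orientation."""
    image = Image.open(path)
    print(f'>> loaded input image of size {image.width}x{image.height}')
    return ImageOps.exif_transpose(image)
```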

View File

@@ -5,6 +5,12 @@ import torch.nn as nn
from ldm.dream.devices import choose_torch_device
from ldm.models.diffusion.sampler import Sampler
from ldm.util import rand_perlin_2d
from ldm.modules.diffusionmodules.util import (
make_ddim_sampling_parameters,
make_ddim_timesteps,
noise_like,
extract_into_tensor,
)
def cfg_apply_threshold(result, threshold = 0.0, scale = 0.7):
if threshold <= 0.0:
@@ -51,8 +57,9 @@ class KSampler(Sampler):
schedule,
steps=model.num_timesteps,
)
self.ds = None
self.s_in = None
self.sigmas = None
self.ds = None
self.s_in = None
def forward(self, x, sigma, uncond, cond, cond_scale):
x_in = torch.cat([x] * 2)
@@ -81,13 +88,54 @@ class KSampler(Sampler):
)
self.model = outer_model
self.ddim_num_steps = ddim_num_steps
sigmas = self.model.get_sigmas(ddim_num_steps)
self.sigmas = sigmas
# we don't need both of these sigmas, but storing them here to make
# comparison easier later on
self.model_sigmas = self.model.get_sigmas(ddim_num_steps)
self.karras_sigmas = K.sampling.get_sigmas_karras(
n=ddim_num_steps,
sigma_min=self.model.sigmas[0].item(),
sigma_max=self.model.sigmas[-1].item(),
rho=7.,
device=self.device,
)
self.sigmas = self.karras_sigmas
# ALERT: We are completely overriding the sample() method in the base class, which
# means that inpainting will (probably?) not work correctly. To get this to work
# we need to be able to modify the inner loop of k_heun, k_lms, etc, as is done
# in an ugly way in the lstein/k-diffusion branch.
# means that inpainting will not work. To get this to work we need to be able to
# modify the inner loop of k_heun, k_lms, etc, as is done in an ugly way
# in the lstein/k-diffusion branch.
@torch.no_grad()
def decode(
self,
z_enc,
cond,
t_enc,
img_callback=None,
unconditional_guidance_scale=1.0,
unconditional_conditioning=None,
use_original_steps=False,
init_latent = None,
mask = None,
):
samples,_ = self.sample(
batch_size = 1,
S = t_enc,
x_T = z_enc,
shape = z_enc.shape[1:],
conditioning = cond,
unconditional_guidance_scale=unconditional_guidance_scale,
unconditional_conditioning = unconditional_conditioning,
img_callback = img_callback,
x0 = init_latent,
mask = mask
)
return samples
# this is a no-op, provided here for compatibility with ddim and plms samplers
@torch.no_grad()
def stochastic_encode(self, x0, t, use_original_steps=False, noise=None):
return x0
# Most of these arguments are ignored and are only present for compatibility with
# other samplers
@@ -123,24 +171,27 @@ class KSampler(Sampler):
if img_callback is not None:
img_callback(k_callback_values['x'],k_callback_values['i'])
# sigmas = self.model.get_sigmas(S)
# sigmas are now set up in make_schedule - we take the last steps items
# sigmas are set up in make_schedule - we take the last steps items
total_steps = len(self.sigmas)
sigmas = self.sigmas[-S-1:]
# x_T is variation noise. When an init image is provided (in x0) we need to add
# more randomness to the starting image.
if x_T is not None:
x = x_T * sigmas[0]
if x0 is not None:
x = x_T + torch.randn_like(x0, device=self.device) * sigmas[0]
else:
x = x_T * sigmas[0]
else:
x = (
torch.randn([batch_size, *shape], device=self.device)
* sigmas[0]
) # for GPU draw
x = torch.randn([batch_size, *shape], device=self.device) * sigmas[0]
model_wrap_cfg = CFGDenoiser(self.model, threshold=threshold, warmup=max(0.8*S,S-10))
extra_args = {
'cond': conditioning,
'uncond': unconditional_conditioning,
'cond_scale': unconditional_guidance_scale,
}
print(f'>> Sampling with k_{self.schedule}')
print(f'>> Sampling with k_{self.schedule} starting at step {len(self.sigmas)-S-1} of {len(self.sigmas)-1} ({S} new sampling steps)')
return (
K.sampling.__dict__[f'sample_{self.schedule}'](
model_wrap_cfg, x, sigmas, extra_args=extra_args,
@@ -149,6 +200,8 @@ class KSampler(Sampler):
None,
)
# this code will support inpainting if and when ksampler API modified or
# a workaround is found.
@torch.no_grad()
def p_sample(
self,
@@ -196,10 +249,12 @@ class KSampler(Sampler):
return img, None, None
def get_initial_image(self,x_T,shape,steps):
print('WARNING: ksampler.get_initial_image() needs testing')
x = (torch.randn(shape, device=self.device) * self.sigmas[0])
if x_T is not None:
return x_T + x_T * self.sigmas[0]
else:
return (torch.randn(shape, device=self.device) * self.sigmas[0])
return x
def prepare_to_sample(self,t_enc):
self.t_enc = t_enc
@@ -213,29 +268,3 @@ class KSampler(Sampler):
'''
return self.model.inner_model.q_sample(x0,ts)
@torch.no_grad()
def decode(
self,
z_enc,
cond,
t_enc,
img_callback=None,
unconditional_guidance_scale=1.0,
unconditional_conditioning=None,
use_original_steps=False,
init_latent = None,
mask = None,
):
samples,_ = self.sample(
batch_size = 1,
S = t_enc,
x_T = z_enc,
shape = z_enc.shape[1:],
conditioning = cond,
unconditional_guidance_scale=unconditional_guidance_scale,
unconditional_conditioning = unconditional_conditioning,
img_callback = img_callback,
x0 = init_latent,
mask = mask
)
return samples
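
Two changes in this file do the heavy lifting. First, the schedule is now built with k-diffusion's Karras sigmas rather than the model's native ones. Second, img2img restarts sampling part-way down that schedule: only the last `S+1` sigmas are kept, and the encoded init latent gets fresh noise scaled by the first of them. A self-contained sketch (numpy stand-in for `K.sampling.get_sigmas_karras`; the sigma bounds are representative values, not the model's exact ones):

```python
import numpy as np

def get_sigmas_karras(n, sigma_min, sigma_max, rho=7.0):
    """Numpy equivalent of k-diffusion's Karras schedule: n noise levels on a
    rho-warped ramp from sigma_max down to sigma_min, plus a trailing zero."""
    ramp = np.linspace(0, 1, n)
    min_inv_rho = sigma_min ** (1 / rho)
    max_inv_rho = sigma_max ** (1 / rho)
    sigmas = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho
    return np.append(sigmas, 0.0)

sigmas = get_sigmas_karras(n=50, sigma_min=0.03, sigma_max=14.6)
S = 35                      # t_enc: the img2img steps actually run
partial = sigmas[-S - 1:]   # mirrors `sigmas = self.sigmas[-S-1:]`
# Starting latent, mirroring `x = x_T + torch.randn_like(x0) * sigmas[0]`:
# x = variation_noise + fresh_gaussian_noise * partial[0]
print(len(partial), partial[0])  # 36 entries, starting at step 15 of 50
```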

View File

@@ -39,6 +39,7 @@ class Sampler(object):
ddim_eta=0.0,
verbose=False,
):
self.total_steps = ddim_num_steps
self.ddim_timesteps = make_ddim_timesteps(
ddim_discr_method=ddim_discretize,
num_ddim_timesteps=ddim_num_steps,
@@ -211,6 +212,7 @@ class Sampler(object):
if ddim_use_original_steps
else np.flip(timesteps)
)
total_steps=steps
iterator = tqdm(
@@ -305,7 +307,7 @@ class Sampler(object):
time_range = np.flip(timesteps)
total_steps = timesteps.shape[0]
print(f'>> Running {self.__class__.__name__} Sampling with {total_steps} timesteps')
print(f'>> Running {self.__class__.__name__} sampling starting at step {self.total_steps - t_start} of {self.total_steps} ({total_steps} new sampling steps)')
iterator = tqdm(time_range, desc='Decoding image', total=total_steps)
x_dec = x_latent

View File

@@ -214,15 +214,19 @@ def parallel_data_prefetch(
else:
return gather_res
def rand_perlin_2d(shape, res, fade = lambda t: 6*t**5 - 15*t**4 + 10*t**3):
def rand_perlin_2d(shape, res, device, fade = lambda t: 6*t**5 - 15*t**4 + 10*t**3):
delta = (res[0] / shape[0], res[1] / shape[1])
d = (shape[0] // res[0], shape[1] // res[1])
grid = torch.stack(torch.meshgrid(torch.arange(0, res[0], delta[0]), torch.arange(0, res[1], delta[1]), indexing='ij'), dim = -1) % 1
angles = 2*math.pi*torch.rand(res[0]+1, res[1]+1)
grid = torch.stack(torch.meshgrid(torch.arange(0, res[0], delta[0]), torch.arange(0, res[1], delta[1]), indexing='ij'), dim = -1).to(device) % 1
rand_val = torch.rand(res[0]+1, res[1]+1)
angles = 2*math.pi*rand_val
gradients = torch.stack((torch.cos(angles), torch.sin(angles)), dim = -1)
tile_grads = lambda slice1, slice2: gradients[slice1[0]:slice1[1], slice2[0]:slice2[1]].repeat_interleave(d[0], 0).repeat_interleave(d[1], 1)
dot = lambda grad, shift: (torch.stack((grid[:shape[0],:shape[1],0] + shift[0], grid[:shape[0],:shape[1], 1] + shift[1] ), dim = -1) * grad[:shape[0], :shape[1]]).sum(dim = -1)
n00 = dot(tile_grads([0, -1], [0, -1]), [0, 0])

View File

@@ -75,7 +75,8 @@ def main():
precision = opt.precision,
gfpgan=gfpgan,
codeformer=codeformer,
esrgan=esrgan
esrgan=esrgan,
free_gpu_mem=opt.free_gpu_mem,
)
except (FileNotFoundError, IOError, KeyError) as e:
print(f'{e}. Aborting.')
@@ -104,8 +105,6 @@ def main():
# preload the model
gen.load_model()
#set additional option
gen.free_gpu_mem = opt.free_gpu_mem
# web server loops forever
if opt.web or opt.gui:
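
The `free_gpu_mem` change replaces an after-the-fact attribute assignment with a constructor argument, so the flag is defined before any code path can read it. The pattern in miniature (hypothetical minimal class, not the real `Generate`):

```python
class Generate:
    def __init__(self, free_gpu_mem: bool = False):
        # Always set at construction time, so `if self.free_gpu_mem:` can no
        # longer raise AttributeError on paths that run before the old
        # `gen.free_gpu_mem = opt.free_gpu_mem` assignment would have.
        self.free_gpu_mem = free_gpu_mem
```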