Overview
In 1.47.2, Koboldcpp added AUTOMATIC1111 integration for image generation. Since AMDSHARK implements a small subset of the A1111 REST API, you can also use AMDSHARK for this. This document gives you a starting point for getting this working.
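To give a feel for what Koboldcpp is doing behind the scenes, here is a minimal sketch of a request against the standard A1111 `txt2img` endpoint. The payload field names follow the A1111 API; whether AMDSHARK honours any particular field beyond the basics is an assumption, since it only implements a subset:

```python
import base64
import json
from urllib import request

# A minimal txt2img payload in the standard A1111 request shape.
# AMDSHARK may ignore parameters it does not implement.
payload = {
    "prompt": "a lighthouse on a rocky coast, oil painting",
    "negative_prompt": "blurry, low quality",
    "width": 512,
    "height": 512,
    "steps": 20,
}

def txt2img(base_url="http://127.0.0.1:7860"):
    """POST the payload and decode the first base64-encoded image."""
    req = request.Request(
        base_url + "/sdapi/v1/txt2img",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return base64.b64decode(body["images"][0])

if __name__ == "__main__":
    # Only works once AMDSHARK is running with --api on port 7860.
    with open("test.png", "wb") as f:
        f.write(txt2img())
```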
In Action
Memory considerations
Since both Koboldcpp and AMDSHARK use VRAM on your graphics card(s), running both at the same time on the same card imposes extra limits on the model size you can fully offload to the video card in Koboldcpp. For me, on an RX 7900 XTX on Windows with 24 GiB of VRAM, the limit was about a 13 billion parameter model with Q5_K_M quantisation.
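As a rough back-of-envelope check of that limit (a sketch: the ~5.5 bits per weight for Q5_K_M is an approximation that varies by model, and real usage adds KV cache, compute buffers, and whatever Stable Diffusion model AMDSHARK has loaded on top):

```python
# Rough VRAM estimate for fully offloading a quantised model.
# Assumption: Q5_K_M averages ~5.5 bits per weight (varies by model).
def model_vram_gib(params_billion, bits_per_weight=5.5):
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 2**30

# ~8.3 GiB for the 13B weights alone, leaving the rest of a 24 GiB
# card for context cache, runtime overhead, and the SD model.
print(f"13B @ Q5_K_M weights: {model_vram_gib(13):.1f} GiB")
```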
Performance Considerations
When using AMDSHARK for image generation, especially with Koboldcpp, you need to be aware that it is currently designed to pay a large upfront cost in time compiling and tuning the model you select, to get an optimal individual image generation time. You need to be the judge as to whether this trade-off is going to be worth it for your OS and hardware combination.
It means that the first time you run a particular Stable Diffusion model with a particular combination of image size, LoRA, and VAE, AMDSHARK will spend many minutes - even on a beefy machine with a very fast graphics card and lots of memory - building that model combination just so it can save it to disk. It may even have to download the model first if it doesn't already have it locally. Once it has built a model combination for your hardware, it shouldn't need to do so again until you upgrade to a newer AMDSHARK version, install different drivers, or change your graphics hardware. It will just upload the files it generated the first time to your graphics card and proceed from there.
This does mean, however, that on a brand new install of AMDSHARK that has not yet generated any images with the model you selected, the first image Koboldcpp requests may look like it is never going to finish and that the whole process has broken. Be forewarned, make yourself a cup of coffee, and expect a lot of messages about compilation and tuning from AMDSHARK in the terminal you ran it from.
Set up AMDSHARK and prerequisites:
- Make sure you have suitable drivers for your graphics card installed. See the prerequisites section of the README.
- Download the latest AMDSHARK studio .exe from here or follow the instructions in the README for an advanced, Linux or Mac install.
- Run AMDSHARK from a terminal/PowerShell with the --api flag. Since Koboldcpp also expects both CORS support and the image generator to be running on port 7860 rather than AMDSHARK's default of 8080, also include the --api_accept_origin flag with a suitable origin (use ="*" to enable all origins) and --server_port=7860 on the command line. (See the 'Connecting to AMDSHARK on a different address or port' section below if you want to run AMDSHARK on a different port.)
## Run the .exe in API mode, with CORS support, on the A1111 endpoint port:
.\node_ai_amdshark_studio_<date>_<ver>.exe --api --api_accept_origin="*" --server_port=7860
## Run from the base directory of a source clone of AMDSHARK on Windows:
.\setup_venv.ps1
python .\apps\stable_diffusion\web\index.py --api --api_accept_origin="*" --server_port=7860
## Run from the base directory of a source clone of AMDSHARK on Linux:
./setup_venv.sh
source amdshark.venv/bin/activate
python ./apps/stable_diffusion/web/index.py --api --api_accept_origin="*" --server_port=7860
## An example giving improved performance on AMD cards using Vulkan, running on the same port as A1111:
.\node_ai_amdshark_studio_20320901_2525.exe --api --api_accept_origin="*" --device_allocator="caching" --server_port=7860
## Since the API respects most applicable AMDSHARK command line arguments for options not specified,
## or not yet implemented in the API, there may be others you want to set, as listed in `--help`:
.\node_ai_amdshark_studio_20320901_2525.exe --help
## For instance, the example above, but with a custom VAE specified:
.\node_ai_amdshark_studio_20320901_2525.exe --api --api_accept_origin="*" --device_allocator="caching" --server_port=7860 --custom_vae="clearvae_v23.safetensors"
## An example with multiple specific CORS origins
python apps/stable_diffusion/web/index.py --api --api_accept_origin="koboldcpp.example.com:7001" --api_accept_origin="koboldcpp.example.com:7002" --server_port=7860
AMDSHARK should start in server mode, and you should see something like this:
- Note: When running in API mode with --api, the .exe will not function as a webUI. Thus, the address or port shown in the terminal output will only be useful for API requests.
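One quick way to confirm the API is reachable before pointing Koboldcpp at it (a sketch, assuming the A1111 `sd-models` endpoint, which Koboldcpp itself queries for the model list, is part of AMDSHARK's subset):

```python
import json
from urllib import request
from urllib.error import URLError

def list_models(base_url="http://127.0.0.1:7860"):
    """Return the model titles the server reports, or None if unreachable."""
    try:
        with request.urlopen(base_url + "/sdapi/v1/sd-models", timeout=10) as resp:
            models = json.load(resp)
    except URLError:
        return None
    return [m.get("title", "?") for m in models]

if __name__ == "__main__":
    titles = list_models()
    if titles is None:
        print("AMDSHARK API not reachable on port 7860")
    else:
        print("\n".join(titles))
```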
Configure Koboldcpp for local image generation:
-
Get the latest Koboldcpp if you don't already have it. If you have a recent AMD card with ROCm HIP support on Windows, or ROCm support on Linux, you'll likely prefer YellowRosecx's ROCm fork.
-
Start Koboldcpp in another terminal/PowerShell and set up your model configuration. Refer to the Koboldcpp README for more details on how to do this if this is your first time using Koboldcpp.
-
Once the main UI has loaded in your browser, click the settings button, go to the Advanced tab, and then choose 'Local A1111' from the 'generate images' dropdown:
(If you get an error here, see the 'Connecting to AMDSHARK on a different address or port' section below.)
-
The Stable Diffusion models available to your AMDSHARK instance should now be listed in the box below 'generate images'. The default value will usually be set to stabilityai/stable-diffusion-2-1-base. Choose the model you want to use for image generation from the list (but see the performance considerations above).
-
You should now be ready to generate images, either by clicking the 'Add Img' button above the text entry box:
...or by selecting the 'Autogenerate' option in the settings:
I often find that even if I have selected 'Autogenerate', I have to do an 'Add Img' to get things started.
-
There is one final piece of image generation configuration within Koboldcpp you might want to do. This is also in the 'generate images' section of advanced settings, where there is an easy-to-miss 'style' button:
This will bring up a dialog box where you can enter a short text that will be sent as a prefix to the prompt sent to AMDSHARK:
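The effect is roughly the following (a sketch only: that the style text is simply prepended, and the exact separator used between it and the prompt, are assumptions about Koboldcpp's behaviour, not confirmed details):

```python
def apply_style(style_text, prompt):
    """Illustrative only: prepend the configured style text to the prompt.

    The ", " separator is an assumption, not Koboldcpp's exact behaviour.
    """
    if not style_text:
        return prompt
    return f"{style_text}, {prompt}"

print(apply_style("oil painting, dramatic lighting",
                  "a knight resting by a campfire"))
```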
Connecting to AMDSHARK on a different address or port
If you didn't set the port with --server_port=7860 when starting AMDSHARK, or you are running it on a different machine on your network from the one running Koboldcpp (or the Koboldcpp kdlite client frontend), then you very likely got the following error:
As long as AMDSHARK is running correctly, this means you need to set the URL and port to the correct values in Koboldcpp. For instance, to set the port on which Koboldcpp looks for an image generator to AMDSHARK's default of 8080:
-
Select the cog icon in the Generate Images section of Advanced settings:
-
Then edit the port number at the end of the URL in the 'A1111 Endpoint Selection' dialog box to read 8080:
-
Similarly, when running AMDSHARK on a different machine, you will need to change the host part of the endpoint URL to the hostname or IP address of the machine where AMDSHARK is running:
Examples
Here's how Koboldcpp shows an image being requested:

The generated image in context in story mode:
And the same image when clicked on:
Where to find the images in AMDSHARK
Even though Koboldcpp requests images at a size of 512x512, it resizes them to 256x256, converts them to JPEG, and only shows them at 200x200 in the main text window. It does this so it can save them compactly, embedded in your story as a data: URI.
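The embedding step can be sketched as follows (stdlib only; the dummy bytes stand in for a real resized JPEG, and the resize/convert step itself is omitted):

```python
import base64

def to_data_uri(jpeg_bytes):
    """Embed raw JPEG bytes as a data: URI suitable for an <img> src."""
    b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    return "data:image/jpeg;base64," + b64

# Stand-in for a real 256x256 JPEG produced by the resize/convert step.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00" * 16 + b"\xff\xd9"
print(to_data_uri(fake_jpeg)[:40], "...")
```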
However, the images at their original size are saved by AMDSHARK in its output_dir, which is usually a folder named for the current date inside the generated_imgs folder in the AMDSHARK installation directory.
You can browse these, either using the Output Gallery tab from within the AMDSHARK web ui:
...or by browsing to the output_dir in your operating system's file manager: