Mirror of https://github.com/invoke-ai/InvokeAI.git, synced 2026-04-23 03:00:31 -04:00
If the transformer fills up VRAM, then when we VAE-encode the kontext latents we would first have to offload the transformer (partially, if partial loading is enabled), only to reload it for denoising. This is avoidable: encode the kontext latents *before* loading the transformer, which reduces model thrashing.
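The ordering effect can be illustrated with a toy VRAM cache. This is a minimal sketch, not InvokeAI's model-manager API: `VramCache`, `load`, and `simulate` are hypothetical names, and the model sizes are made-up numbers chosen so the transformer and VAE do not fit in VRAM together.

```python
class VramCache:
    """Hypothetical single-device VRAM budget that evicts resident models
    (oldest first) to make room for a new load. Not InvokeAI's real cache."""

    def __init__(self, capacity_gb: float):
        self.capacity_gb = capacity_gb
        self.loaded: dict[str, float] = {}  # name -> size_gb, insertion-ordered
        self.offloaded: list[str] = []      # record of every eviction

    def load(self, name: str, size_gb: float) -> None:
        if name in self.loaded:
            return  # already resident, nothing to do
        # Evict oldest residents until the new model fits.
        while sum(self.loaded.values()) + size_gb > self.capacity_gb and self.loaded:
            victim = next(iter(self.loaded))
            self.offloaded.append(victim)
            del self.loaded[victim]
        self.loaded[name] = size_gb


def simulate(load_order: list[tuple[str, float]]) -> list[str]:
    """Replay a sequence of model loads and return the models offloaded."""
    cache = VramCache(capacity_gb=24.0)  # assumed 24 GB card
    for name, size in load_order:
        cache.load(name, size)
    return cache.offloaded


# Naive order: transformer loads first, the VAE encode forces it out,
# then denoising forces it back in -- that reload is the thrashing.
naive = simulate([("transformer", 22.0), ("vae", 3.0), ("transformer", 22.0)])

# Reordered: VAE-encode the kontext latents first, then load the
# transformer once; only the (no-longer-needed) VAE gets evicted.
reordered = simulate([("vae", 3.0), ("transformer", 22.0)])
```

With these assumed sizes, the naive order offloads the transformer and reloads it, while the reordered flow never evicts the transformer at all; evicting the VAE after encoding is harmless because it is no longer needed.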