If the transformer fills up VRAM, then VAE-encoding the Kontext latents afterwards forces us to first offload the transformer (partially, if partial loading is enabled). There's no need for this: we can encode the Kontext latents before loading the transformer, reducing model thrashing.
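A minimal, self-contained sketch of the reordering, using toy stand-ins rather than InvokeAI's actual classes or invocation API (`ToyVAE`, `ToyTransformer`, and `generate` are all hypothetical names for illustration):

```python
import torch


class ToyVAE(torch.nn.Module):
    """Small stand-in for the VAE image encoder."""

    def __init__(self) -> None:
        super().__init__()
        self.proj = torch.nn.Conv2d(3, 4, kernel_size=8, stride=8)

    @torch.no_grad()
    def encode(self, image: torch.Tensor) -> torch.Tensor:
        return self.proj(image)


class ToyTransformer(torch.nn.Module):
    """Stand-in for the large transformer that dominates VRAM."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks = torch.nn.Sequential(
            *[torch.nn.Linear(64, 64) for _ in range(4)]
        )


def generate(image: torch.Tensor, device: torch.device) -> torch.Tensor:
    vae = ToyVAE().to(device)

    # Encode the Kontext latents FIRST, while the transformer has not
    # yet claimed VRAM.
    kontext_latents = vae.encode(image.to(device))
    vae.to("cpu")  # free the VAE's memory before the big model arrives

    # Only now load the transformer. Since the VAE encode is already
    # done, we never need to offload the transformer (fully or
    # partially) to make room for it, avoiding model thrashing.
    transformer = ToyTransformer().to(device)
    _ = transformer  # ... run the denoising loop with kontext_latents ...
    return kontext_latents
```

The key point is purely the ordering: with the encode moved ahead of the transformer load, the large model is loaded exactly once and stays resident for the denoising loop.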