InvokeAI/mkdocs.yml
CypherNaugh_0x 9deb545cc1 External models (Gemini Nano Banana & OpenAI GPT Image) (#8633) (#8884)
* feat: initial external model support

* feat: support reference images for external models

* fix: sorting lint error

* chore: hide Reidentify button for external models

* review: enable auto-install/remove for external models

* feat: show external model name during install

* review: model descriptions

* review: implemented review comments

* review: added optional seed control for external models

* chore: fix linter warning

* review: save api keys to a separate file

* docs: updated external model docs

* chore: fix linter errors

* fix: sync configured external starter models on startup

* feat(ui): add provider-specific external generation nodes

* feat: expose external panel schemas in model configs

* feat(ui): drive external panels from panel schema

* docs: sync app config docstring order

* feat: add gemini 3.1 flash image preview starter model

* feat: update gemini image model limits

* fix: resolve TypeScript errors and move external provider config to api_keys.yaml

Add 'external', 'external_image_generator', and 'external_api' to Zod
enum schemas (zBaseModelType, zModelType, zModelFormat) to match the
generated OpenAPI types. Remove redundant union workarounds from
component prop types and Record definitions.

Fix type errors in ModelEdit (react-hook-form Control invariance),
parsing.tsx (model identifier narrowing), buildExternalGraph (edge
typing), and ModelSettings import/export buttons.

Move external_gemini_base_url and external_openai_base_url into
api_keys.yaml alongside the API keys so all external provider config
lives in one dedicated file, separate from invokeai.yaml.

* feat: add resolution presets and imageConfig support for Gemini 3 models

Add combined resolution preset selector for external models that maps
aspect ratio + image size to fixed dimensions. Gemini 3 Pro and 3.1 Flash
now send imageConfig (aspectRatio + imageSize) via generationConfig instead
of text-based aspect ratio hints used by Gemini 2.5 Flash.

Backend: ExternalResolutionPreset model, resolution_presets capability field,
image_size on ExternalGenerationRequest, and Gemini provider imageConfig logic.

Frontend: ExternalSettingsAccordion with combo resolution select, dimension
slider disabling for fixed-size models, and panel schema constraint wiring
for Steps/Guidance/Seed controls.

* Remove unused external model fields and add provider-specific parameters

- Remove negative_prompt, steps, guidance, reference_image_weights,
  reference_image_modes from external model nodes (unused by any provider)
- Remove supports_negative_prompt, supports_steps, supports_guidance
  from ExternalModelCapabilities
- Add provider_options dict to ExternalGenerationRequest for
  provider-specific parameters
- Add OpenAI-specific fields: quality, background, input_fidelity
- Add Gemini-specific fields: temperature, thinking_level
- Add new OpenAI starter models: GPT Image 1.5, GPT Image 1 Mini,
  DALL-E 3, DALL-E 2
- Fix OpenAI provider to use output_format (GPT Image) vs
  response_format (DALL-E) and send model ID in requests
- Add fixed aspect ratio sizes for OpenAI models (bucketing)
- Add ExternalProviderRateLimitError with retry logic for 429 responses
- Add provider-specific UI components in ExternalSettingsAccordion
- Simplify ParamSteps/ParamGuidance by removing dead external overrides
- Update all backend and frontend tests
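
The retry behaviour for 429 responses mentioned above could look roughly like the sketch below. Only the exception name `ExternalProviderRateLimitError` comes from this commit; the `retry_after` attribute, the `call_with_retry` helper, and the exponential backoff policy are assumptions made for illustration and may differ from the actual provider code.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class ExternalProviderRateLimitError(Exception):
    """Signals an HTTP 429 from a provider (retry_after is an assumed field)."""

    def __init__(self, message: str, retry_after: Optional[float] = None) -> None:
        super().__init__(message)
        self.retry_after = retry_after


def call_with_retry(
    request: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying on rate-limit errors (hypothetical helper)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return request()
        except ExternalProviderRateLimitError as err:
            # Give up and re-raise once the last attempt has failed.
            if attempt == max_attempts - 1:
                raise
            # Prefer the provider's hint; otherwise back off exponentially.
            if err.retry_after is not None:
                delay = err.retry_after
            else:
                delay = base_delay * (2 ** attempt)
            sleep(delay)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real delays.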

* Chore Ruff check & format

* Chore typegen

* feat: full canvas workflow integration for external models

- Add missing aspect ratios (4:5, 5:4, 8:1, 4:1, 1:4, 1:8) to type
  system for external model support
- Sync canvas bbox when external model resolution preset is selected
- Use params preset dimensions in buildExternalGraph to prevent
  "unsupported aspect ratio" errors
- Lock all bbox controls (resize handles, aspect ratio select,
  width/height sliders, swap/optimal buttons) for external models
  with fixed dimension presets
- Disable denoise strength slider for external models (not applicable)
- Sync bbox aspect ratio changes back to paramsSlice for external models
- Initialize bbox dimensions when switching to an external model

* Chore typegen Linux separator

* feat: full canvas workflow integration for external models
- Update buildExternalGraph test to include dimensions in mock params

* Merge remote-tracking branch 'upstream/main' into external-models

* Chore pnpm fix

* add missing parameter

* docs: add External Models guide with Gemini and OpenAI provider pages

* fix(external-models): address PR review feedback

- Gemini recall: write temperature, thinking_level, image_size to image metadata;
  wire external graph as metadata receiver; add recall handlers.
- Canvas: gate regional guidance, inpaint mask, and control layer for external models.
- Canvas: throw a clear error on outpainting for external models (was falling back to
  inpaint and hitting an API-side mask/image size mismatch).
- Workflow editor: add ui_model_provider_id filter so OpenAI and Gemini nodes only
  list their own provider's models.
- Workflow editor: silently drop seed when the selected model does not support it
  instead of raising a capability error.
- Remove the legacy external_image_generation invocation and the graph-builder
  fallback; providers must register a dedicated node.
- Regenerate schema.ts.
- remove Gemini debug dumps to outputs/external_debug

* fix(external-models): resolve TSC errors in metadata parsing and external graph

- Export imageSizeChanged from paramsSlice (required by the new ImageSize
  recall handler).
- Emit the external graph's metadata model entry via zModelIdentifierField
  since ExternalApiModelConfig is not part of the AnyModelConfig union.

* chore: prettier format ModelIdentifierFieldInputComponent

* fix: remove unsupported thinkingConfig from Gemini image models and restrict GPT Image models to txt2img

* chore typegen

* chore(docs): regenerate settings.json for external provider fields

* fix(external): fix mask handling and mode support for external providers

- Remove img2img and inpaint modes from Gemini models (Gemini has no
  bitmap mask or dedicated edit API; image editing works via reference
  images in the UI)
- Fix DALL-E 2 inpainting: convert grayscale mask to RGBA with alpha
  channel transparency (OpenAI expects transparent=edit area) and
  convert init image to RGBA when mask is present
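
As a rough sketch of the mask conversion described above: each grayscale value becomes an RGBA pixel whose alpha channel marks the edit area as transparent. The function name and the polarity (white = area to edit) are assumptions for illustration; the actual provider code may use an image library and a different convention.

```python
def grayscale_mask_to_rgba(mask: bytes, white_is_edit_area: bool = True) -> bytes:
    """Convert 8-bit grayscale mask bytes to RGBA bytes (hypothetical helper).

    Assumption: white (255) marks the area to edit, so it maps to alpha 0
    (fully transparent), which is what the edits endpoint treats as editable.
    """
    out = bytearray()
    for value in mask:
        alpha = 255 - value if white_is_edit_area else value
        # RGB is set to black; only the alpha channel carries the mask.
        out += bytes((0, 0, 0, alpha))
    return bytes(out)
```

For example, a white mask pixel becomes `(0, 0, 0, 0)` and a black one becomes `(0, 0, 0, 255)` under the assumed polarity.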

* fix(external): update mode support and UI for external providers

- Remove DALL-E 2 from starter models (deprecated, shutdown May 12 2026)
- Enable img2img for GPT Image 1/1.5/1-mini (supports edits endpoint)
- Set Gemini models to txt2img only (no mask/edit API; editing via
  ref images)
- Hide mode/init_image/mask_image fields on Gemini node (not usable)
- Hide mask_image field on OpenAI node (no model supports inpaint)

* Chore typegen

* fix(external): improve OpenAI node UX and disable cache by default

- Hide OpenAI node's mode and init_image fields: OpenAI's API has no
  img2img/inpaint distinction (the edits endpoint is invoked
  automatically when reference images are provided). init_image is
  functionally identical to a reference image and was misleading users.
- Default use_cache to False for external image generation nodes:
  external API calls are non-deterministic and incur usage costs.
  Cache hits returned stale image references that did not produce new
  gallery entries on repeat invokes.

* fix(external): duplicate cached images on cache hit instead of skipping

External image generation nodes use the standard invocation cache, but
returning the cached output (with stale image_name references) on cache
hits resulted in no new gallery entries — the Invoke button would spin
indefinitely on repeat invokes with identical parameters.

Override invoke_internal so that on cache hit, the cached images are
loaded and re-saved as new gallery entries. The expensive API call is
still skipped (cost saving), but the user sees a new image as expected.

* Chore typegen + ruff

* Chore ruff format

* fix(external): restore OpenAI advanced settings on Remix recall

Remix recall iterates through ImageMetadataHandlers but only Gemini's
temperature handler was wired up — OpenAI's quality, background, and
input_fidelity were stored in image metadata but never parsed back into
the params slice. Add the three missing handlers so Remix restores
these settings as expected.

---------

Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
Co-authored-by: Alexander Eichhorn <alex@code-with.us>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
2026-04-20 17:13:26 +00:00

# yaml-language-server: $schema=https://squidfunk.github.io/mkdocs-material/schema.json
# General
site_name: Invoke
site_url: https://invoke-ai.github.io/InvokeAI
site_author: mauwii
dev_addr: '127.0.0.1:8080'
# Repository
repo_name: 'invoke-ai/InvokeAI'
repo_url: 'https://github.com/invoke-ai/InvokeAI'
edit_uri: edit/main/docs-old/
# Copyright
copyright: Copyright &copy; 2022-2024 InvokeAI Team
# Configuration
theme:
  name: material
  font:
    text: 'Inter'
    code: 'JetBrains Mono'
  logo: img/invoke-symbol-wht-lrg.svg
  icon:
    repo: fontawesome/brands/github
    edit: material/file-document-edit-outline
  favicon: img/favicon.ico
  palette:
    scheme: slate
    primary: black
  features:
    - navigation.instant
    - navigation.tabs
    - navigation.tabs.sticky
    - navigation.tracking
    - navigation.indexes
    - navigation.path
    - search.highlight
    - search.suggest
    - toc.integrate
# Extensions
markdown_extensions:
  - abbr
  - admonition
  - attr_list
  - def_list
  - footnotes
  - md_in_html
  - toc:
      permalink: '#'
  - pymdownx.arithmatex:
      generic: true
  - pymdownx.betterem:
      smart_enable: all
  - pymdownx.caret
  - pymdownx.details
  - pymdownx.emoji:
      emoji_index: !!python/name:material.extensions.emoji.twemoji
      emoji_generator: !!python/name:material.extensions.emoji.to_svg
  - pymdownx.highlight:
      anchor_linenums: true
  - pymdownx.inlinehilite
  - pymdownx.keys
  - pymdownx.magiclink:
      repo_url_shorthand: true
      user: 'invoke-ai'
      repo: 'InvokeAI'
  - pymdownx.mark
  - pymdownx.smartsymbols
  - pymdownx.superfences:
      custom_fences:
        - name: mermaid
          class: mermaid
          format: !!python/name:pymdownx.superfences.fence_code_format
  - pymdownx.snippets
  - pymdownx.tabbed:
      alternate_style: true
  - pymdownx.tasklist:
      custom_checkbox: true
  - pymdownx.tilde
  - tables
plugins:
  - search
  - git-revision-date-localized:
      enable_creation_date: true
  - redirects:
      redirect_maps:
        'installation/index.md': 'installation/quick_start.md'
        'installation/INSTALL_AUTOMATED.md': 'installation/quick_start.md'
        'installation/installer.md': 'installation/quick_start.md'
        'installation/INSTALL_MANUAL.md': 'installation/manual.md'
        'installation/INSTALL_SOURCE.md': 'installation/manual.md'
        'installation/INSTALL_DOCKER.md': 'installation/docker.md'
        'installation/INSTALLING_MODELS.md': 'installation/models.md'
        'installation/INSTALL_PATCHMATCH.md': 'installation/patchmatch.md'
        'installation/060_INSTALL_PATCHMATCH.md': 'installation/patchmatch.md'
  - mkdocstrings:
      handlers:
        python:
          options:
            separate_signature: true
            show_signature_annotations: true
            parameter_headings: false
            signature_crossrefs: true
            show_source: false
            summary: true
            show_root_heading: true
            show_root_full_path: false
            show_bases: false
extra:
  analytics:
    provider: google
    property: G-2X4JR4S4FB
nav:
  - Home: 'index.md'
  - Installation:
      - Quick Start: 'installation/quick_start.md'
      - Detailed Requirements: 'installation/requirements.md'
      - Manual Install: 'installation/manual.md'
      - Docker: 'installation/docker.md'
      - PatchMatch: 'installation/patchmatch.md'
      - Models: 'installation/models.md'
  - Workflows & Nodes:
      - Nodes Overview: 'nodes/overview.md'
      - Workflow Editor Basics: 'nodes/NODES.md'
      - List of Default Nodes: 'nodes/defaultNodes.md'
      - Community Nodes: 'nodes/communityNodes.md'
      - ComfyUI to InvokeAI: 'nodes/comfyToInvoke.md'
      - Facetool Node: 'nodes/detailedNodes/faceTools.md'
      - Contributing Nodes: 'nodes/contributingNodes.md'
      - Migrating from v3 to v4: 'nodes/NODES_MIGRATION_V3_V4.md'
      - Invocation API: 'nodes/invocation-api.md'
  - Configuration: 'configuration.md'
  - Features:
      - New to InvokeAI?: 'help/gettingStartedWithAI.md'
      - Low VRAM mode: 'features/low-vram.md'
      - Database: 'features/database.md'
      - Gallery: 'features/gallery.md'
      - Hot Keys: 'features/hotkeys.md'
      - External Models:
          - Overview: 'features/external-models/index.md'
          - Google Gemini: 'features/external-models/gemini.md'
          - OpenAI: 'features/external-models/openai.md'
  - Multi-User Mode:
      - User Guide: 'multiuser/user_guide.md'
      - Administrator Guide: 'multiuser/admin_guide.md'
      - API Guide: 'multiuser/api_guide.md'
      - Specification: 'multiuser/specification.md'
  - Contributing:
      - Overview: 'contributing/index.md'
      - Code of Conduct: 'CODE_OF_CONDUCT.md'
      - Dev Environment: 'contributing/dev-environment.md'
      - Development:
          - Overview: 'contributing/contribution_guides/development.md'
          - New Contributors: 'contributing/contribution_guides/newContributorChecklist.md'
          - Model Manager v2: 'contributing/MODEL_MANAGER.md'
          - Multiuser Mode: 'multiuser/specification.md'
          - Local Development: 'contributing/LOCAL_DEVELOPMENT.md'
          - System Architecture: 'contributing/ARCHITECTURE.md'
          - Hotkeys: 'contributing/HOTKEYS.md'
          - Testing: 'contributing/TESTS.md'
          - Frontend:
              - Overview: 'contributing/frontend/index.md'
              - State Management: 'contributing/frontend/state-management.md'
              - Workflows - Design and Implementation: 'contributing/frontend/workflows.md'
      - Documentation: 'contributing/contribution_guides/documentation.md'
      - Nodes: 'contributing/INVOCATIONS.md'
      - Model Manager: 'contributing/MODEL_MANAGER.md'
      - Download Queue: 'contributing/DOWNLOAD_QUEUE.md'
      - Translation: 'contributing/contribution_guides/translation.md'
      - Tutorials: 'contributing/contribution_guides/tutorials.md'
  - Help:
      - Getting Started: 'help/gettingStartedWithAI.md'
      - Diffusion Overview: 'help/diffusion.md'
      - Sampler Convergence: 'help/SAMPLER_CONVERGENCE.md'
  - FAQ: 'faq.md'