* feat: initial external model support
* feat: support reference images for external models
* fix: sorting lint error
* chore: hide Reidentify button for external models
* review: enable auto-install/remove for external models
* feat: show external model name during install
* review: model descriptions
* review: implemented review comments
* review: added optional seed control for external models
* chore: fix linter warning
* review: save API keys to a separate file
* docs: updated external model docs
* chore: fix linter errors
* fix: sync configured external starter models on startup
* feat(ui): add provider-specific external generation nodes
* feat: expose external panel schemas in model configs
* feat(ui): drive external panels from panel schema
* docs: sync app config docstring order
* feat: add Gemini 3.1 Flash Image Preview starter model
* feat: update Gemini image model limits
* fix: resolve TypeScript errors and move external provider config to api_keys.yaml
  Add 'external', 'external_image_generator', and 'external_api' to Zod enum schemas (zBaseModelType, zModelType, zModelFormat) to match the generated OpenAPI types. Remove redundant union workarounds from component prop types and Record definitions. Fix type errors in ModelEdit (react-hook-form Control invariance), parsing.tsx (model identifier narrowing), buildExternalGraph (edge typing), and ModelSettings import/export buttons. Move external_gemini_base_url and external_openai_base_url into api_keys.yaml alongside the API keys so all external provider config lives in one dedicated file, separate from invokeai.yaml.
* feat: add resolution presets and imageConfig support for Gemini 3 models
  Add a combined resolution preset selector for external models that maps aspect ratio + image size to fixed dimensions. Gemini 3 Pro and 3.1 Flash now send imageConfig (aspectRatio + imageSize) via generationConfig instead of the text-based aspect ratio hints used by Gemini 2.5 Flash.
  Backend: ExternalResolutionPreset model, resolution_presets capability field, image_size on ExternalGenerationRequest, and Gemini provider imageConfig logic. Frontend: ExternalSettingsAccordion with combined resolution select, dimension slider disabling for fixed-size models, and panel schema constraint wiring for Steps/Guidance/Seed controls.
* refactor: remove unused external model fields and add provider-specific parameters
  - Remove negative_prompt, steps, guidance, reference_image_weights, reference_image_modes from external model nodes (unused by any provider)
  - Remove supports_negative_prompt, supports_steps, supports_guidance from ExternalModelCapabilities
  - Add provider_options dict to ExternalGenerationRequest for provider-specific parameters
  - Add OpenAI-specific fields: quality, background, input_fidelity
  - Add Gemini-specific fields: temperature, thinking_level
  - Add new OpenAI starter models: GPT Image 1.5, GPT Image 1 Mini, DALL-E 3, DALL-E 2
  - Fix OpenAI provider to use output_format (GPT Image) vs response_format (DALL-E) and send the model ID in requests
  - Add fixed aspect ratio sizes for OpenAI models (bucketing)
  - Add ExternalProviderRateLimitError with retry logic for 429 responses
  - Add provider-specific UI components in ExternalSettingsAccordion
  - Simplify ParamSteps/ParamGuidance by removing dead external overrides
  - Update all backend and frontend tests
* chore: ruff check & format
* chore: typegen
* feat: full canvas workflow integration for external models
  - Add missing aspect ratios (4:5, 5:4, 8:1, 4:1, 1:4, 1:8) to the type system for external model support
  - Sync canvas bbox when an external model resolution preset is selected
  - Use params preset dimensions in buildExternalGraph to prevent "unsupported aspect ratio" errors
  - Lock all bbox controls (resize handles, aspect ratio select, width/height sliders, swap/optimal buttons) for external models with fixed dimension presets
  - Disable denoise strength slider for external models (not applicable)
  - Sync bbox aspect ratio changes back to paramsSlice for external models
  - Initialize bbox dimensions when switching to an external model
* chore: typegen (Linux separator)
* feat: full canvas workflow integration for external models
  - Update buildExternalGraph test to include dimensions in mock params
* Merge remote-tracking branch 'upstream/main' into external-models
* chore: pnpm fix
* add missing parameter
* docs: add External Models guide with Gemini and OpenAI provider pages
* fix(external-models): address PR review feedback
  - Gemini recall: write temperature, thinking_level, image_size to image metadata; wire the external graph as a metadata receiver; add recall handlers.
  - Canvas: gate regional guidance, inpaint mask, and control layer for external models.
  - Canvas: throw a clear error on outpainting for external models (it was falling back to inpaint and hitting an API-side mask/image size mismatch).
  - Workflow editor: add ui_model_provider_id filter so OpenAI and Gemini nodes only list their own provider's models.
  - Workflow editor: silently drop seed when the selected model does not support it instead of raising a capability error.
  - Remove the legacy external_image_generation invocation and the graph-builder fallback; providers must register a dedicated node.
  - Regenerate schema.ts.
  - Remove Gemini debug dumps to outputs/external_debug.
* fix(external-models): resolve TSC errors in metadata parsing and external graph
  - Export imageSizeChanged from paramsSlice (required by the new ImageSize recall handler).
  - Emit the external graph's metadata model entry via zModelIdentifierField since ExternalApiModelConfig is not part of the AnyModelConfig union.
* chore: prettier format ModelIdentifierFieldInputComponent
* fix: remove unsupported thinkingConfig from Gemini image models and restrict GPT Image models to txt2img
* chore: typegen
* chore(docs): regenerate settings.json for external provider fields
* fix(external): fix mask handling and mode support for external providers
  - Remove img2img and inpaint modes from Gemini models (Gemini has no bitmap mask or dedicated edit API; image editing works via reference images in the UI)
  - Fix DALL-E 2 inpainting: convert the grayscale mask to RGBA with alpha channel transparency (OpenAI expects transparent = edit area) and convert the init image to RGBA when a mask is present
* fix(external): update mode support and UI for external providers
  - Remove DALL-E 2 from starter models (deprecated; shutdown May 12, 2026)
  - Enable img2img for GPT Image 1/1.5/1-mini (supports the edits endpoint)
  - Set Gemini models to txt2img only (no mask/edit API; editing via reference images)
  - Hide mode/init_image/mask_image fields on the Gemini node (not usable)
  - Hide the mask_image field on the OpenAI node (no model supports inpaint)
* chore: typegen
* fix(external): improve OpenAI node UX and disable cache by default
  - Hide the OpenAI node's mode and init_image fields: OpenAI's API has no img2img/inpaint distinction (the edits endpoint is invoked automatically when reference images are provided). init_image is functionally identical to a reference image and was misleading users.
  - Default use_cache to False for external image generation nodes: external API calls are non-deterministic and incur usage costs. Cache hits returned stale image references that did not produce new gallery entries on repeat invokes.
* fix(external): duplicate cached images on cache hit instead of skipping
  External image generation nodes use the standard invocation cache, but returning the cached output (with stale image_name references) on cache hits resulted in no new gallery entries: the Invoke button would spin indefinitely on repeat invokes with identical parameters. Override invoke_internal so that on a cache hit, the cached images are loaded and re-saved as new gallery entries. The expensive API call is still skipped (cost saving), but the user sees a new image as expected.
* chore: typegen + ruff
* chore: ruff format
* fix(external): restore OpenAI advanced settings on Remix recall
  Remix recall iterates through ImageMetadataHandlers, but only Gemini's temperature handler was wired up: OpenAI's quality, background, and input_fidelity were stored in image metadata but never parsed back into the params slice. Add the three missing handlers so Remix restores these settings as expected.

---------

Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev>
Co-authored-by: Alexander Eichhorn <alex@code-with.us>
Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
InvokeAI Graph - Design Overview
High-level design for the graph module. Focuses on responsibilities, data flow, and how traversal works.
1) Purpose
Provide a typed, acyclic workflow model (Graph) plus a runtime scheduler (GraphExecutionState) that expands iterator patterns, tracks readiness via indegree (the number of incoming edges to a node in the directed graph), and executes nodes in class-grouped batches. In normal execution, runtime expansion happens in a separate execution graph instead of mutating the source graph.
2) Major Data Types
EdgeConnection
- Fields: `node_id: str`, `field: str`.
- Hashable; printed as `node.field` for readable diagnostics.
Edge
- Fields: `source: EdgeConnection`, `destination: EdgeConnection`.
- One directed connection from a specific output port to a specific input port.
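A minimal sketch of these two records, using the field names described above. The real definitions are Pydantic models inside the graph module; frozen dataclasses are used here only to illustrate the hashability and `node.field` formatting properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen=True makes instances hashable
class EdgeConnection:
    node_id: str
    field: str

    def __str__(self) -> str:
        # Printed as "node.field" for readable diagnostics.
        return f"{self.node_id}.{self.field}"


@dataclass(frozen=True)
class Edge:
    source: EdgeConnection
    destination: EdgeConnection


# One directed connection: output port "noise" -> input port "noise".
e = Edge(EdgeConnection("noise_node", "noise"), EdgeConnection("denoise", "noise"))
```

Because both types are hashable, edges and endpoints can be stored in sets and used as dict keys during validation.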
AnyInvocation / AnyInvocationOutput
- Pydantic wrappers that carry concrete invocation models and outputs.
- No registry logic in this file; they are permissive containers for heterogeneous nodes.
IterateInvocation / CollectInvocation
- Control nodes used by validation and execution:
  - IterateInvocation: input `collection`; outputs include `item` (and index/total).
  - CollectInvocation: many `item` inputs aggregated into one `collection` output.
3) Graph (author-time model)
A container for declared nodes and edges. Does not perform iteration expansion.
3.1 Data
- `nodes: dict[str, AnyInvocation]` - key must equal `node.id`.
- `edges: list[Edge]` - zero or more.
- Utility: `_get_input_edges(node_id, field?)`, `_get_output_edges(node_id, field?)`. These scan `self.edges` (no adjacency indices in the current code).
3.2 Validation (validate_self)
Runs a sequence of checks:
- Node ID uniqueness: no duplicate IDs; map key equals `node.id`.
- Endpoint existence: source and destination node IDs must exist.
- Port existence: input ports must exist on the node class; output ports on the node's output model.
- DAG constraint: build a flat `DiGraph` (no runtime expansion) and assert acyclicity.
- Type compatibility: `get_output_field_type` vs `get_input_field_type` and `are_connection_types_compatible`.
- Iterator / collector structure: enforce special rules:
  - Iterator's input must be `collection`; its outgoing edges use `item`.
  - Collector accepts many `item` inputs; outputs a single `collection`.
  - Edge fan-in to a non-collector input is rejected.
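The DAG constraint above is checked in the real code by building a networkx `DiGraph` and asserting acyclicity. An equivalent stdlib-only sketch of that check, using a standard three-color DFS:

```python
def has_cycle(edges: list[tuple[str, str]]) -> bool:
    """DFS three-color cycle check over directed edges: a stand-in for
    the networkx acyclicity assertion used by the real validator."""
    adj: dict[str, list[str]] = {}
    for src, dst in edges:
        adj.setdefault(src, []).append(dst)

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / in progress / done
    color: dict[str, int] = {}

    def visit(node: str) -> bool:
        color[node] = GRAY
        for child in adj.get(node, []):
            c = color.get(child, WHITE)
            if c == GRAY:  # back edge to an in-progress node -> cycle
                return True
            if c == WHITE and visit(child):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in adj)


assert not has_cycle([("a", "b"), ("b", "c"), ("a", "c")])  # DAG
assert has_cycle([("x", "y"), ("y", "z"), ("z", "x")])      # 3-cycle
```

Edge admission (3.3) reuses the same idea: a prospective edge is rejected if adding it would make `has_cycle` true.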
3.3 Edge admission (_validate_edge)
Checks a single prospective edge before insertion:
- Endpoints/ports exist.
- Destination port is not already occupied unless it's a collector `item`.
- Adding the edge to the flat DAG must keep it acyclic.
- Iterator/collector constraints re-checked when the edge creates relevant patterns.
3.4 Topology utilities
- `nx_graph()` - DiGraph of declared nodes and edges.
- `nx_graph_flat()` - "flattened" DAG (still author-time; no runtime copies). Used in validation and in `_prepare()` during execution planning.
3.5 Mutation helpers
- `add_node`, `update_node` (preserve edges, rewrite endpoints if the id changes), `delete_node`.
- `add_edge`, `delete_edge` (with validation).
4) GraphExecutionState (runtime)
Holds the state for a single run. Keeps the source graph intact and materializes a separate execution graph.
GraphExecutionState is still the public runtime entry point, but most execution behavior is now delegated to a small
set of internal helper classes.
The source graph is treated as stable during normal execution, but the runtime object still exposes guarded graph mutation helpers. Those helpers reject changes once the affected nodes have already been prepared or executed.
4.1 Data
- `graph: Graph` - source graph for the run; treated as stable during normal execution.
- `execution_graph: Graph` - materialized runtime nodes/edges.
- `executed: set[str]`, `executed_history: list[str]`.
- `results: dict[str, AnyInvocationOutput]`, `errors: dict[str, str]`.
- `prepared_source_mapping: dict[str, str]` - exec id -> source id.
- `source_prepared_mapping: dict[str, set[str]]` - source id -> exec ids.
- `indegree: dict[str, int]` - unmet inputs per exec node.
- Prepared exec metadata caches:
  - source node id
  - iteration path
  - runtime state such as pending, ready, executed, or skipped
- Ready queues grouped by class (private attrs): `_ready_queues: dict[class_name, deque[str]]`, `_active_class: Optional[str]`. Optional `ready_order: list[str]` to prioritize classes.
4.2 Core methods
- `next()` - returns the next ready exec node. If none are ready, it asks the materializer to expand more source nodes and then retries. Before returning a node, the runtime helper deep-copies inbound values into the node fields.
- `complete(node_id, output)` - records the result, marks the exec node executed, marks the source node executed once all of its prepared exec copies are done, then decrements downstream indegrees and enqueues newly ready nodes.
4.3 Runtime helper classes
GraphExecutionState now delegates most runtime behavior to internal helpers:
- `_PreparedExecRegistry` - owns the relationship between source graph nodes and prepared execution graph nodes, plus cached metadata such as iteration path and runtime state.
- `_ExecutionMaterializer` - expands source graph nodes into concrete execution graph nodes when the scheduler runs out of ready work.
- `_ExecutionScheduler` - owns indegree transitions, ready queues, class batching, and downstream release on completion.
- `_ExecutionRuntime` - owns iteration-path lookup and input hydration for prepared exec nodes.
- `_IfBranchScheduler` - applies lazy `If` semantics by deferring branch-local work until the condition is known, then releasing the selected branch and skipping the unselected branch.
4.4 Preparation (_prepare())
- Build a flat DAG from the source graph.
- Choose the next source node in topological order that:
  - has not been prepared,
  - if it is an iterator, has inputs that are already executed,
  - has no unexecuted iterator ancestors.
- If the node is a CollectInvocation: collapse all prepared parents into one mapping and create one exec node.
- Otherwise: compute all combinations of prepared iterator ancestors. For each combination, choose the prepared parent for each upstream by matching iterator ancestry, then create one exec node.
- For each new exec node:
  - Deep-copy the source node; assign a fresh ID (and `index` for iterators).
  - Wire edges from chosen prepared parents.
  - Set `indegree = number of unmet inputs` (i.e., parents not yet executed).
  - Try to resolve any `If`-specific scheduling state.
  - If the node is ready and not deferred by an unresolved `If`, enqueue it into its class queue.
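The combination step can be pictured with `itertools.product`: each prepared iterator ancestor contributes one axis, and every combination yields one exec copy of the downstream node. A toy sketch with hypothetical ids (the real code also matches iterator ancestry when choosing parents, which is omitted here):

```python
from itertools import product

# Two prepared iterators upstream of one downstream node:
# iterator A produced 3 exec copies, iterator B produced 2.
prepared_iterators = {
    "iter_a": ["a0", "a1", "a2"],  # exec ids of iterator A's copies
    "iter_b": ["b0", "b1"],        # exec ids of iterator B's copies
}

# One exec node is created per combination of iterator ancestors,
# so a node below both iterators fans out into 3 x 2 = 6 copies.
exec_copies = [
    dict(zip(prepared_iterators, combo))
    for combo in product(*prepared_iterators.values())
]

assert len(exec_copies) == 6
assert exec_copies[0] == {"iter_a": "a0", "iter_b": "b0"}
```

This is why nested iterators multiply: the number of exec copies of a node is the product of the item counts of its iterator ancestors.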
4.5 Readiness and batching
- `_enqueue_if_ready(nid)` - enqueues by class name only when `indegree == 0`, the node has not already executed, and the node is not deferred by an unresolved `If`.
- `_get_next_node()` - drains the `_active_class` queue FIFO; when empty, selects the next nonempty class queue (by `ready_order` if set, else alphabetical) and continues. Optional fairness knobs can limit batch size per class; the default is to drain fully.
4.5.1 Indegree (what it is and how it's used)
Indegree is the number of incoming edges to a node in the execution graph that are still unmet. In this engine:
- For every materialized exec node, `indegree[node]` equals the count of its prerequisite parents that have not finished yet.
- A node is "ready" exactly when `indegree[node] == 0`; only then is it enqueued.
- When a node completes, the scheduler decrements `indegree[child]` for each outgoing edge. Any child that reaches 0 is enqueued.
Example: edges A->C, B->C, C->D. Start: A:0, B:0, C:2, D:1. Run A -> C:1. Run B -> C:0 -> enqueue C.
Run C -> D:0 -> enqueue D. Run D -> done.
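The worked example above can be run through a minimal sketch of this bookkeeping (variable names are hypothetical; the real logic lives in `_ExecutionScheduler`):

```python
from collections import deque

edges = [("A", "C"), ("B", "C"), ("C", "D")]
nodes = {"A", "B", "C", "D"}

children: dict[str, list[str]] = {n: [] for n in nodes}
indegree: dict[str, int] = {n: 0 for n in nodes}
for src, dst in edges:
    children[src].append(dst)
    indegree[dst] += 1

# Start state matches the text: A:0, B:0, C:2, D:1.
assert indegree == {"A": 0, "B": 0, "C": 2, "D": 1}

ready = deque(sorted(n for n in nodes if indegree[n] == 0))
order = []
while ready:
    node = ready.popleft()
    order.append(node)                 # "execute" the node
    for child in children[node]:       # completion releases children
        indegree[child] -= 1
        if indegree[child] == 0:
            ready.append(child)

assert order == ["A", "B", "C", "D"]
```

Each scheduling decision is O(1): one decrement per edge and one queue append when a child reaches zero.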
4.6 Input hydration (_prepare_inputs())
- For CollectInvocation: gather all incoming `item` values into `collection`, sorting inputs by iteration path so collected results are stable across expanded iterations. Incoming `collection` values are merged first, then incoming `item` values are appended.
- For IfInvocation: hydrate only `condition` and the selected branch input.
- For all others: deep-copy each incoming edge's value into the destination field. This prevents cross-node mutation through shared references.
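Why the deep copy matters: with a shallow reference, a node mutating its input would silently corrupt the upstream result that other nodes still depend on. A minimal illustration with toy dicts (not real invocation outputs):

```python
import copy

# Aliasing bug: the destination field shares the upstream list.
upstream_output = {"latents": [1, 2, 3]}
aliased = upstream_output["latents"]
aliased.append(4)
assert upstream_output["latents"] == [1, 2, 3, 4]  # upstream corrupted!

# Deep-copy hydration, as the runtime does: mutation stays local.
upstream_output = {"latents": [1, 2, 3]}
hydrated = copy.deepcopy(upstream_output["latents"])
hydrated.append(4)
assert upstream_output["latents"] == [1, 2, 3]     # upstream intact
```

The cost is an extra copy per edge, traded for the guarantee that `results` entries are never mutated by downstream consumers.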
4.7 Lazy If semantics
IfInvocation now acts as a lazy branch boundary rather than a simple value multiplexer.
- The `condition` input must resolve first.
- Nodes that are exclusive to the true or false branch can remain deferred even when their indegree is zero.
- Once the prepared `If` node resolves its condition:
  - the selected branch is released,
  - the unselected branch is marked skipped,
  - branch-exclusive ancestors of the unselected branch are never executed.
- Shared ancestors still execute if they are required by the selected branch or by any other live path in the graph.
This behavior is implemented in the runtime scheduler, not in the invocation body itself.
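A toy sketch of that release/skip decision, with hypothetical node sets (the real logic in `_IfBranchScheduler` works over prepared exec nodes and iteration paths, not plain sets):

```python
def resolve_if(
    condition: bool,
    true_branch: set[str],
    false_branch: set[str],
) -> tuple[set[str], set[str]]:
    """Return (released, skipped) once the condition is known.

    The selected branch is released for scheduling; nodes exclusive to
    the unselected branch are skipped. Shared ancestors stay live.
    """
    selected, unselected = (
        (true_branch, false_branch) if condition else (false_branch, true_branch)
    )
    shared = true_branch & false_branch  # required by both branches
    skipped = unselected - shared
    return selected, skipped


released, skipped = resolve_if(
    True,
    true_branch={"blur", "shared_load"},
    false_branch={"sharpen", "shared_load"},
)
assert released == {"blur", "shared_load"}
assert skipped == {"sharpen"}
```

Note that `shared_load` survives even though it feeds the unselected branch, matching the rule that shared ancestors execute whenever any live path needs them.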
5) Traversal Summary
- Author builds a valid Graph.
- Create GraphExecutionState with that graph.
- Loop:
  - `node = state.next()` -> may trigger `_prepare()` expansion.
  - Execute node externally -> `output`.
  - `state.complete(node.id, output)` -> updates indegrees, `If` state, and ready queues.
- Finish when `next()` returns `None`.
In normal execution, all runtime expansion occurs in execution_graph with traceability back to source nodes.
6) Invariants
- Source Graph remains a DAG and type-consistent.
- `execution_graph` remains a DAG.
- Nodes are enqueued only when `indegree == 0` and they are not deferred by an unresolved `If`.
- `results` and `errors` are keyed by exec node id.
- Collectors aggregate `item` inputs and may also merge incoming `collection` inputs during runtime hydration.
- Branch-exclusive nodes behind an unselected `If` branch are skipped, not failed.
7) Extensibility
- New node types: implement as Pydantic models with typed fields and outputs. Register per your invocation system; this file accepts them as `AnyInvocation`.
- Scheduling policy: adjust `ready_order` to batch by class; add a batch cap for fairness without changing complexity.
- Dynamic behaviors (future): can be added in `GraphExecutionState` by creating exec nodes and edges at `complete()` time, as long as the DAG invariant holds.
8) Error Model (selected)
- `DuplicateNodeIdError`, `NodeAlreadyInGraphError`
- `NodeNotFoundError`, `NodeFieldNotFoundError`
- `InvalidEdgeError`, `CyclicalGraphError`
- `NodeInputError` (raised when preparing inputs for execution)
Messages favor short, precise diagnostics (node id, field, and failing condition).
9) Rationale
- Two-graph approach isolates authoring from execution expansion and keeps validation simple.
- Indegree + queues gives O(1) scheduling decisions with clear batching semantics.
- Iterator/collector separation keeps fan-out/fan-in explicit and testable.
- Deep-copy hydration avoids incidental aliasing bugs between nodes.