Adds full support for managing model-to-model relationships in the UI and backend.
Introduces RelatedModels subpanel for linking and unlinking models in model management.
- Adds REST API routes for adding, removing, and retrieving model relationships.
- New database migration: creates model_relationships table for bidirectional links.
- New service layer (model_relationships) for relationship management.
- Updated frontend: Related models float to top of LoRA/Main grouped model comboboxes for quick access.
- Added 'Show Only Related' toggle badge to MainModelPicker filter bar
**Amended commit to remove changes to ParamMainModelSelect.tsx and MainModelPicker.tsx to avoid conflict with upstream deletion/ rewrite**
There is a subtle change in behaviour with the new model probe API.
Previously, checks for model types was done in a specific order. For example, we did all main model checks before LoRA checks.
With the new API, the order of checks has changed. Check ordering is as follows:
- New API checks are run first, then legacy API checks.
- New API checks categorized by their speed. When we run new API checks, we sort them from fastest to slowest, and run them in that order. This is a performance optimization.
Currently, LoRA and LLaVA models are the only model types with the new API. Checks for them are thus run first.
LoRA checks involve checking the state dict for presence of keys with specific prefixes. We expect these keys to only exist in LoRAs.
It turns out that main models may have some of these keys.
For example, this model has keys that match the LoRA prefix `lora_te_`: https://civitai.com/models/134442/helloyoung25d
Under the old probe, we'd do the main model checks first and correctly identify this as a main model. But with the new setup, we do the LoRA check first, and those pass. So we import this model as a LoRA.
Thankfully, the old probe still exists. For now, the new probe is fully disabled. It was only called in one spot.
I've also added the example affected model as a test case for the model probe. Right now, this causes the test to fail, and I've marked the test as xfail. CI will pass.
Once we enable the new API again, the xfail will pass, and CI will fail, and we'll be reminded to update the test.
In `ObjectSerializerDisk`, we use `torch.load` to load serialized objects from disk. With torch 2.6.0, torch defaults to `weights_only=True`. As a result, torch will raise when attempting to deserialize anything with an unrecognized class.
For example, our `ConditioningFieldData` class is untrusted. When we load conditioning from disk, we will get a runtime error.
Torch provides a method to add trusted classes to an allowlist. This change adds an arg to `ObjectSerializerDisk` to add a list of safe globals to the allowlist and uses it for both `ObjectSerializerDisk` instances.
Note: My first attempt inferred the class from the generic type arg that `ObjectSerializerDisk` accepts, and added that to the allowlist. Unfortunately, this doesn't work.
For example, `ConditioningFieldData` has a `conditionings` attribute that may be one some other untrusted classes representing model-specific conditioning data. So, even if we allowlist `ConditioningFieldData`, loading will fail when torch deserializes the `conditionings` attribute.
This message is logged _every_ time we retrieve a list of models if there is an invalid model. Previously it logged the _whole_ row which can be a lot of data. Truncate the row to 64 characters to reduce log pollution.
In #7688 we optimized queuing preparation logic. This inadvertently broke retrying queue items.
Previously, a `NamedTuple` was used to store the values to insert in the DB when enqueuing. This handy class provides an API similar to a dataclass, where you can instantiate it with kwargs in any order. The resultant tuple re-orders the kwargs to match the order in the class definition.
For example, consider this `NamedTuple`:
```py
class SessionQueueValueToInsert(NamedTuple):
foo: str
bar: str
```
When instantiating it, no matter the order of the kwargs, if you make a normal tuple out of it, the tuple values are in the same order as in the class definition:
```
t1 = SessionQueueValueToInsert(foo="foo", bar="bar")
print(tuple(t1)) # -> ('foo', 'bar')
t2 = SessionQueueValueToInsert(bar="bar", foo="foo")
print(tuple(t2)) # -> ('foo', 'bar')
```
So, in the old code, when we used the `NamedTuple`, it implicitly normalized the order of the values we insert into the DB.
In the retry logic, the values of the tuple were not ordered correctly, but the use of `NamedTuple` had secretly fixed the order for us.
In the linked PR, `NamedTuple` was dropped for a normal tuple, after profiling showed `NamedTuple` to be meaningfully slower than a normal tuple.
The implicit order normalization behaviour wasn't understood, and the order wasn't fixed when changin the retry logic to use a normal tuple instead of `NamedTuple`. This results in a bug where we incorrectly create queue items in the DB. For example, we stored the `destination` in the `field_values` column.
When such an incorrectly-created queue item is dequeued, it fails pydantic validation and causes what appears to be an endless loop of errors.
The only user-facing solution is to add this line to `invokeai.yaml` and restart the app:
```yaml
clear_queue_on_startup: true
```
On next startup, the queue is forcibly cleared before the error loop is triggered. Then the user should remove this line so their queue is persisted across app launches per usual.
The solution is simple - fix the ordering of the tuple. I also added a type annotation and comment to the tuple type alias definition.
Note: The endless error loop, as a general problem, will take some thinking to fix. The queue service methods to cancel and fail a queue item still retrieve it and parse it. And the list queue items methods parse the queue items. Bit of a catch 22, maybe the solution is to simply delete totally borked queue items and log an error.
- Replace `get_counts` method with `get_tag_counts_with_filter` which gets the counts for a list of tags, filtering by a list of selected tags
- Update `get_many` logic to apply tag filtering with AND logic, to match the new `get_tag_counts_with_filter` method
- Update workflow library router