Commit Graph

2930 Commits

Author SHA1 Message Date
psychedelicious
43b0d07517 feat(api): add route to reset hf token 2025-05-05 23:25:29 +10:00
blessedcoolant
f83592a052 fix: deprecation warning in get_iso_timestamp 2025-05-05 11:45:30 +10:00
psychedelicious
cc54466db9 fix(nodes): default value for UIConfigBase.tags 2025-04-28 13:31:26 -04:00
psychedelicious
cbdafe7e38 feat(nodes): allow node clobbering 2025-04-28 13:31:26 -04:00
psychedelicious
8ed5585285 feat(nodes): move output metadata to BaseInvocationOutput 2025-04-28 09:19:43 -04:00
Mary Hipp
c64f20a72b remove output_metadata from schema 2025-04-28 09:19:43 -04:00
Mary Hipp
4a0df6b865 add optional output_metadata to baseinvocation 2025-04-28 09:19:43 -04:00
psychedelicious
deb1984289 fix(mm): disable new model probe API
There is a subtle change in behaviour with the new model probe API.

Previously, checks for model types were done in a specific order. For example, we did all main model checks before LoRA checks.

With the new API, the order of checks has changed. Check ordering is as follows:
- New API checks are run first, then legacy API checks.
- New API checks are categorized by their speed. When we run new API checks, we sort them from fastest to slowest, and run them in that order. This is a performance optimization.

Currently, LoRA and LLaVA models are the only model types with the new API. Checks for them are thus run first.

LoRA checks involve checking the state dict for presence of keys with specific prefixes. We expect these keys to only exist in LoRAs.

It turns out that main models may have some of these keys.

For example, this model has keys that match the LoRA prefix `lora_te_`: https://civitai.com/models/134442/helloyoung25d

Under the old probe, we'd do the main model checks first and correctly identify this as a main model. But with the new setup, we do the LoRA check first, and those pass. So we import this model as a LoRA.

Thankfully, the old probe still exists. For now, the new probe is fully disabled. It was only called in one spot.

I've also added the example affected model as a test case for the model probe. Right now, this causes the test to fail, and I've marked the test as xfail. CI will pass.

Once we enable the new API again, the xfail will pass, and CI will fail, and we'll be reminded to update the test.
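
To illustrate the ordering problem described above, here is a minimal sketch of a first-match-wins probe. The `probe` helper, the two check functions, and the `model.diffusion_model.` prefix are hypothetical stand-ins chosen for illustration, not the actual probe code; only the `lora_te_` prefix comes from the message above.

```python
from typing import Callable, List, Mapping, Optional, Tuple

# Only LORA_PREFIX comes from the commit message; MAIN_PREFIX is an assumption.
LORA_PREFIX = "lora_te_"
MAIN_PREFIX = "model.diffusion_model."

Check = Tuple[str, Callable[[Mapping[str, object]], bool]]


def is_lora(state_dict: Mapping[str, object]) -> bool:
    """Prefix-based LoRA check: true if any key starts with the LoRA prefix."""
    return any(key.startswith(LORA_PREFIX) for key in state_dict)


def is_main(state_dict: Mapping[str, object]) -> bool:
    """Hypothetical main-model check based on a key prefix."""
    return any(key.startswith(MAIN_PREFIX) for key in state_dict)


def probe(state_dict: Mapping[str, object], checks: List[Check]) -> Optional[str]:
    """Run the checks in order and return the first model type that matches."""
    for model_type, matches in checks:
        if matches(state_dict):
            return model_type
    return None


# A main model that also happens to carry a LoRA-style key.
state_dict = {
    "model.diffusion_model.input_blocks.0.0.weight": None,
    "lora_te_text_model_encoder_layers_0.alpha": None,
}

legacy_order: List[Check] = [("main", is_main), ("lora", is_lora)]
new_order: List[Check] = [("lora", is_lora), ("main", is_main)]

print(probe(state_dict, legacy_order))  # main
print(probe(state_dict, new_order))     # lora
```

The same state dict is classified differently purely because of check order, which is why a prefix check that is not exclusive to LoRAs is only safe when it runs after the main model checks.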
2025-04-18 22:44:10 +10:00
psychedelicious
814406d98a feat(mm): siglip model loading supports partial loading
In the previous commit, the LLaVA model was updated to support partial loading.

In this commit, the SigLIP model is updated in the same way.

This model is used for FLUX Redux. It's <4GB and only ever run in isolation, so it won't benefit from partial loading for the vast majority of users. Regardless, I think it is best if we make _all_ models work with partial loading.

PS: I also fixed the initial load dtype issue, described in the prev commit. It's probably a non-issue for this model, but we may as well fix it.
2025-04-18 10:12:03 +10:00
psychedelicious
c054501103 feat(mm): llava model loading supports partial loading; fix OOM crash on initial load
The model manager has two types of model cache entries:
- `CachedModelOnlyFullLoad`: The model may only ever be loaded and unloaded as a single object.
- `CachedModelWithPartialLoad`: The model may be partially loaded and unloaded.

Partial loading is enabled by overwriting certain torch layer classes, adding the ability to autocast the layer to a device on-the-fly. See `CustomLinear` for an example.

So, to take advantage of partial loading and be cached as a `CachedModelWithPartialLoad`, the model must inherit from `torch.nn.Module`.

The LLaVA classes provided by `transformers` do inherit from `torch.nn.Module`, but we wrap those classes in a separate class called `LlavaOnevisionModel`. The wrapper encapsulates both the LLaVA model and its "processor" - a lightweight class that prepares model inputs like text and images.

While it is more elegant to encapsulate both model and processor classes in a single entity, this prevents the model cache from enabling partial loading for the chunky vLLM model.
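
A rough sketch of why the wrapper blocks partial loading, assuming the cache picks the entry type with an `isinstance` check against `torch.nn.Module`. The `Module` class below is a stand-in so the sketch runs without torch, and `cache_entry_kind`, `VisionLanguageModel`, `Processor`, and `Wrapper` are hypothetical names.

```python
class Module:
    """Stand-in for torch.nn.Module so this sketch has no torch dependency."""


class VisionLanguageModel(Module):
    """Stand-in for the LLaVA model class provided by transformers."""


class Processor:
    """Stand-in for the lightweight processor that prepares model inputs."""


class Wrapper:
    """Bundles model and processor, but is not itself a Module."""

    def __init__(self, model: VisionLanguageModel, processor: Processor) -> None:
        self.model = model
        self.processor = processor


def cache_entry_kind(obj: object) -> str:
    """Assumed selection rule: only Module subclasses can be partially loaded."""
    if isinstance(obj, Module):
        return "CachedModelWithPartialLoad"
    return "CachedModelOnlyFullLoad"


model = VisionLanguageModel()
wrapped = Wrapper(model, Processor())

print(cache_entry_kind(wrapped))  # CachedModelOnlyFullLoad
print(cache_entry_kind(model))    # CachedModelWithPartialLoad
```

Under this assumption, handing the cache the inner model rather than the wrapper is what makes it eligible for partial loading.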

Fixing this involved a few changes.
- Update the `LlavaOnevisionModelLoader` class to operate on the vLLM model directly, instead of the `LlavaOnevisionModel` wrapper class.
- Instantiate the processor directly in the node. The processor is lightweight and does its business on the CPU. We don't need to worry about caching in the model manager.
- Remove caching support code from the `LlavaOnevisionModel` wrapper class. It's not needed, because we do not cache this class. The class now only handles running the models provided to it.
- Rename `LlavaOnevisionModel` to `LlavaOnevisionPipeline` to better represent its purpose.

These changes have a bonus effect of fixing an OOM crash when initially loading the models. This was most apparent when loading LLaVA 7B, which is pretty chunky.

The initial load is onto CPU RAM. In the old version of the loaders, we ignored the loader's target dtype for the initial load. Instead, we loaded the model at `transformers`'s "default" dtype of fp32.

LLaVA 7B is fp16 and weighs ~17GB. Loading as fp32 means we need double that amount (~34GB) of CPU RAM. Many users only have 32GB RAM, so this causes a _CPU_ OOM - which is a hard crash of the whole process.

With the updated loaders, the initial load logic now uses the target dtype for the initial load. LLaVA now needs the expected ~17GB RAM for its initial load.
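
The RAM figures above follow directly from bytes per element. As a back-of-the-envelope sketch (the `load_size_gb` helper is hypothetical and ignores any overhead beyond the weights themselves):

```python
# Bytes used per parameter for each dtype.
BYTES_PER_ELEMENT = {"fp16": 2, "fp32": 4}


def load_size_gb(native_size_gb: float, native_dtype: str, load_dtype: str) -> float:
    """Scale a model's size by the ratio of load dtype width to native dtype width."""
    return native_size_gb * BYTES_PER_ELEMENT[load_dtype] / BYTES_PER_ELEMENT[native_dtype]


# A ~17GB fp16 model loaded at fp32 vs. at its target dtype.
print(load_size_gb(17, "fp16", "fp32"))  # 34.0
print(load_size_gb(17, "fp16", "fp16"))  # 17.0
```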

PS: If we didn't make the accompanying partial loading changes, we still could have solved this OOM. We'd just need to pass the initial load dtype to the wrapper class and have it load on that dtype. But we may as well fix both issues.

PPS: There are other models whose model classes are wrappers around a torch module class, and thus cannot be partially loaded. However, these models are typically fairly small and/or are run only on their own, so they don't benefit as much from partial loading. It's the really big models (like LLaVA 7B) that benefit most from the partial loading.
2025-04-18 10:12:03 +10:00
psychedelicious
c1d819c7e5 feat(nodes): add get_absolute_path method to context.models API
Given a model config or path (presumably to a model), returns the absolute path to the model.

Check the next few commits for use-case.
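
As a rough illustration of the behaviour described, the sketch below resolves either a config or a path to an absolute path. It assumes relative paths are relative to a models directory; the `ModelConfig` class, the `models_dir` parameter, and this standalone function are hypothetical and do not reflect the actual method signature.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Union


@dataclass
class ModelConfig:
    """Hypothetical stand-in for a model config; only the path field matters here."""

    path: str


def get_absolute_path(config_or_path: Union[ModelConfig, Path, str], models_dir: Path) -> Path:
    """Return an absolute path, resolving relative paths against models_dir (an assumption)."""
    raw = config_or_path.path if isinstance(config_or_path, ModelConfig) else config_or_path
    path = Path(raw)
    return path if path.is_absolute() else models_dir / path


models_dir = Path.cwd() / "models"

# A config with a relative path and a bare relative path resolve the same way.
from_config = get_absolute_path(ModelConfig("llava/model.safetensors"), models_dir)
from_path = get_absolute_path(Path("llava/model.safetensors"), models_dir)

# An already-absolute path is returned unchanged.
unchanged = get_absolute_path(models_dir / "siglip", models_dir)
```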
2025-04-18 10:12:03 +10:00
psychedelicious
cbee6e6faf fix(app): remove accidentally committed tensor cache size
I had set this to zero for testing during the torch 2.6.0 upgrade and neglected to remove it.
2025-04-17 10:12:47 +10:00
skunkworxdark
566282bff0 Update metadata_linked.py
added metadata_to_string_collection, metadata_to_integer_collection, metadata_to_float_collection, metadata_to_bool_collection
2025-04-16 06:28:22 +10:00
psychedelicious
a5bc21cf50 feat(nodes): extract LaMa model url to constant 2025-04-15 07:13:25 +10:00
psychedelicious
ae8d1f26d6 fix(app): import CogView4Transformer2DModel from the module that exports it 2025-04-10 10:50:13 +10:00
psychedelicious
170ea4fb75 fix(app): add CogView4ConditioningInfo to ObjectSerializerDisk's safe_globals
needed for torch w/ weights_only=True
2025-04-10 10:50:13 +10:00
psychedelicious
e5b0f8b985 feat(app): remove cogview4 inpaint workflow
This doesn't make sense to have as a default workflow given the trickiness of producing alpha masks.
2025-04-10 10:50:13 +10:00
psychedelicious
3f656072cf feat(app): update cogview4 t2i workflow w/ form 2025-04-10 10:50:13 +10:00
psychedelicious
ad582c8cc5 feat(nodes): rename CogView4 nodes to match naming format 2025-04-10 10:50:13 +10:00
psychedelicious
c99e65bdab feat(app): add cogview4 default workflows 2025-04-10 10:50:13 +10:00
maryhipp
305c5761d0 add generation modes for cogview linear 2025-04-10 10:50:13 +10:00
Ryan Dick
d86cd66994 Add CogView4 VAE approximation for progress images. 2025-04-10 10:50:13 +10:00
Ryan Dick
13850271ab Add inpainting to CogView4DenoiseInvocation. 2025-04-10 10:50:13 +10:00
Ryan Dick
7e894ffe83 Consolidate InpaintExtension implementations for SD3 and FLUX. 2025-04-10 10:50:13 +10:00
Ryan Dick
0939030324 Support cfg_scale list in CogView4Denoise. 2025-04-10 10:50:13 +10:00
Ryan Dick
30f19dc37a Update CogView4Denoise to support image-to-image. 2025-04-10 10:50:13 +10:00
Ryan Dick
ace5e748f4 Simplify CogView4 timesteps schedule generation in preparation for timestep schedule clipping. 2025-04-10 10:50:13 +10:00
Ryan Dick
4fae8ad163 Add CogView4ImageToLatentsInvocation. 2025-04-10 10:50:13 +10:00
Ryan Dick
5e75bc570a Fix bug in CogView4 noise schedule handling that was resulting in low-quality images. 2025-04-10 10:50:13 +10:00
Ryan Dick
3166b5d2ea Switch to sequential CFG for CogView4 (for now, until I sort out the padding). 2025-04-10 10:50:13 +10:00
Ryan Dick
321c2d358c Add CogView4 model loader. And various other fixes to get a CogView4 workflow running (though quality is still below expectations). 2025-04-10 10:50:13 +10:00
Ryan Dick
cf76a0b575 Add CogView4ModelLoaderInvocation. (Not wired up with frontend yet.) 2025-04-10 10:50:13 +10:00
Ryan Dick
67bfd63c73 Require that the CogView4 height/width be multiples of 32. This requirement is documented here: https://huggingface.co/THUDM/CogView4-6B. I haven't tracked down the underlying source of this requirement. 2025-04-10 10:50:13 +10:00
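
A constraint like the one in this commit could be enforced with a check along these lines. This is only a sketch; `validate_dimensions` is a hypothetical helper, not the validation used by the node.

```python
def validate_dimensions(height: int, width: int, multiple: int = 32) -> None:
    """Raise ValueError if either dimension is not a multiple of `multiple`."""
    for name, value in (("height", height), ("width", width)):
        if value % multiple != 0:
            raise ValueError(f"{name} must be a multiple of {multiple}, got {value}")


validate_dimensions(1024, 768)  # both are multiples of 32, so no error is raised
```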
Ryan Dick
cdad8a4fd1 Add CogView4LatentsToImageInvocation. 2025-04-10 10:50:13 +10:00
Ryan Dick
5d9797945b Completed first pass of CogView4Denoise. 2025-04-10 10:50:13 +10:00
Ryan Dick
78159c3200 Simplify CogView4 timestep schedule initialization. 2025-04-10 10:50:13 +10:00
Ryan Dick
1320c4fa13 WIP - CogView4DenoiseInvocation. 2025-04-10 10:50:13 +10:00
Ryan Dick
bac05a7885 Add CogView4TextEncoderInvocation 2025-04-10 10:50:13 +10:00
Kevin Turner
52a8ad1c18 chore: rename model.size to model.file_size
to disambiguate from RAM size or pixel size
2025-04-10 09:53:03 +10:00
Kevin Turner
f09aacf992 fix: ModelProbe.probe needs to return a size field 2025-04-10 09:53:03 +10:00
Kevin Turner
9590e8ff39 feat: expose model storage size 2025-04-10 09:53:03 +10:00
psychedelicious
49622c37ed fix(nodes): logic bug in flux redux node 2025-04-08 10:33:45 +10:00
skunkworxdark
e1538af219 Update flux_redux.py
Add down sampling and weight to redux node
2025-04-08 10:33:45 +10:00
psychedelicious
b0d5e7e3d8 feat(app): restore "Using torch device" message on startup 2025-04-07 10:56:26 +10:00
psychedelicious
8d3743c6f2 tidy(nodes): rename controlnet_image_processors.py -> controlnet.py 2025-04-04 18:42:13 +11:00
psychedelicious
986b7426d2 tidy(nodes): remove unused old dw openpose detector class 2025-04-04 18:42:13 +11:00
psychedelicious
8d8150b47e tidy(nodes): remove deprecated controlnet "processor" nodes 2025-04-04 18:42:13 +11:00
psychedelicious
89c999ca58 fix(backend): remove mps_fixes
The fixes in this module monkeypatched `torch` to resolve some issues with FP16 on macOS. These issues have long since been resolved.

Included in the now-removed fixes is `CustomSlicedAttentionProcessor`, which is intended to reduce memory requirements for MPS. This overrides `diffusers`' own `SlicedAttentionProcessor`.

Unfortunately, `attention_type: sliced` produces hot garbage with the fixes and black images without the fixes. So this class appears to now be a moot point.

Regardless, SDPA is supported on MPS and very efficient, so sliced attention is largely obsolete.
2025-04-04 18:42:13 +11:00
psychedelicious
5fa2cf59e2 fix(app): add trusted classes to torch safe globals to prevent errors when loading them
In `ObjectSerializerDisk`, we use `torch.load` to load serialized objects from disk. With torch 2.6.0, torch defaults to `weights_only=True`. As a result, torch will raise when attempting to deserialize anything with an unrecognized class.

For example, our `ConditioningFieldData` class is untrusted. When we load conditioning from disk, we will get a runtime error.

Torch provides a method to add trusted classes to an allowlist. This change adds an arg to `ObjectSerializerDisk` to add a list of safe globals to the allowlist and uses it for both `ObjectSerializerDisk` instances.

Note: My first attempt inferred the class from the generic type arg that `ObjectSerializerDisk` accepts, and added that to the allowlist. Unfortunately, this doesn't work.

For example, `ConditioningFieldData` has a `conditionings` attribute that may be one of several other untrusted classes representing model-specific conditioning data. So, even if we allowlist `ConditioningFieldData`, loading will fail when torch deserializes the `conditionings` attribute.
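
The nested-class problem can be reproduced with the standard library's restricted-unpickler pattern. The sketch below is an analogy only: it uses plain `pickle` rather than `torch.load`, and `AllowlistUnpickler` and `restricted_loads` are hypothetical names. `OrderedDict` plays the role of the outer class and `Fraction` the role of a nested, separately untrusted class.

```python
import io
import pickle
from collections import OrderedDict
from fractions import Fraction


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that only resolves classes present in an explicit allowlist."""

    def __init__(self, file, safe_globals):
        super().__init__(file)
        self._allowed = {(cls.__module__, cls.__qualname__): cls for cls in safe_globals}

    def find_class(self, module, name):
        try:
            return self._allowed[(module, name)]
        except KeyError:
            raise pickle.UnpicklingError(f"{module}.{name} is not allowlisted") from None


def restricted_loads(data: bytes, safe_globals):
    """Deserialize `data`, refusing any class that is not in `safe_globals`."""
    return AllowlistUnpickler(io.BytesIO(data), safe_globals).load()


payload = pickle.dumps(OrderedDict(weight=Fraction(1, 2)))

# Allowlisting only the outer class is not enough: the nested class is rejected.
try:
    restricted_loads(payload, [OrderedDict])
except pickle.UnpicklingError as error:
    print(error)

# Allowlisting every class that can appear in the object graph succeeds.
restored = restricted_loads(payload, [OrderedDict, Fraction])
```

This mirrors why inferring a single class from the generic type arg falls short: the allowlist has to cover every class reachable from the serialized object, which is what passing an explicit list of safe globals allows.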
2025-04-04 18:42:13 +11:00
psychedelicious
38e7b23d18 feat(api): put all validation run data into single object 2025-04-04 11:38:04 +11:00