fix(backend): filter disabled blocks before batch_size in embedding backfill

The batch_size limit was applied before filtering out disabled blocks.
With 120+ disabled blocks and batch_size=100, the first 100 entries
were all disabled (skipped via continue), so the 36 enabled blocks
beyond the slice boundary were never indexed. This made core blocks
like AITextGeneratorBlock, AIConversationBlock, etc. invisible to
search.

Fix: filter disabled blocks from the missing list before slicing by
batch_size, so every batch slot goes to an enabled block that actually
needs indexing.
This commit is contained in:
Zamil Majdy
2026-03-13 16:20:36 +07:00
parent 70dfe64c6d
commit afcce75aff

@@ -188,10 +188,13 @@ class BlockHandler(ContentHandler):
         )
         existing_ids = {row["contentId"] for row in existing_result}
+        # Filter disabled blocks before applying batch_size so that a large
+        # number of disabled blocks can't exhaust the batch budget and prevent
+        # enabled blocks from being indexed.
         missing_blocks = [
             (block_id, block_cls)
             for block_id, block_cls in all_blocks.items()
-            if block_id not in existing_ids
+            if block_id not in existing_ids and not block_cls().disabled
         ]
         # Convert to ContentItem
@@ -200,9 +203,6 @@ class BlockHandler(ContentHandler):
             try:
                 block_instance = block_cls()
-                if block_instance.disabled:
-                    continue
                 # Build searchable text from block metadata
                 parts = []
                 if block_instance.name: