fix(integration): FirecrawlExtractBlock returns 400 Invalid JSON schema when output_schema is passed as a string (#10669)

The FirecrawlExtractBlock currently declares its output_schema field as
a str.
Pydantic therefore serialises the JSON-looking value as a string rather
than as an object, and the Firecrawl API rejects the request with:

`400 Bad Request – Invalid JSON schema. path: ['schema']`

Direct curl requests work because the same structure is sent as a proper
JSON object.
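
The difference can be sketched with the standard library's `json` module alone. This is only an illustration: it calls neither Pydantic nor the Firecrawl SDK, and the schema, the URL, the `build_payload` helper, and the payload shape are all assumptions made for the example.

```python
import json

# Hypothetical schema, used only for illustration.
schema = {
    "type": "object",
    "properties": {"title": {"type": "string"}},
    "required": ["title"],
}


def build_payload(output_schema):
    """Build a request body shaped like an extract request (assumed shape)."""
    return json.dumps({"urls": ["https://example.com"], "schema": output_schema})


# Before the fix: the field is a str, so the schema travels as a quoted string.
payload_str = build_payload(json.dumps(schema))

# After the fix: the field is a dict, so the schema travels as a JSON object.
payload_dict = build_payload(schema)

decoded_str = json.loads(payload_str)["schema"]
decoded_dict = json.loads(payload_dict)["schema"]

print(type(decoded_str).__name__)   # str
print(type(decoded_dict).__name__)  # dict
```

In the first payload the receiver sees a string where it expects a schema object, which is consistent with the `Invalid JSON schema` error above.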

### Changes 🏗️

- Changed the type of `output_schema` from `str` to `dict` and updated its description

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Test that `FirecrawlApp.extract(..., schema)` works with a dict rather than a str

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Swifty
2025-08-19 09:04:04 +02:00
committed by GitHub
parent 35bd7f7f7a
commit 650be0d1f7


@@ -29,8 +29,8 @@ class FirecrawlExtractBlock(Block):
prompt: str | None = SchemaField(
description="The prompt to use for the crawl", default=None, advanced=False
)
-output_schema: str | None = SchemaField(
-description="A more rigid structure if you already know the JSON layout.",
+output_schema: dict | None = SchemaField(
+description="A Json Schema describing the output structure if more rigid structure is desired.",
default=None,
)
enable_web_search: bool = SchemaField(
@@ -56,7 +56,6 @@ class FirecrawlExtractBlock(Block):
app = FirecrawlApp(api_key=credentials.api_key.get_secret_value())
# Sync call
extract_result = app.extract(
urls=input_data.urls,
prompt=input_data.prompt,