mirror of
https://github.com/Significant-Gravitas/AutoGPT.git
synced 2026-04-08 03:00:28 -04:00
fix(platform): fix prod Sentry errors and reduce on-call alert noise (#12565)
## Why
Multiple Sentry issues paging on-call in prod:
1. **AUTOGPT-SERVER-8BP**: `ConversionError: Failed to convert
anthropic/claude-sonnet-4-6 to <enum 'LlmModel'>` — the copilot passes
OpenRouter-style provider-prefixed model names
(`anthropic/claude-sonnet-4-6`) to blocks, but the `LlmModel` enum only
recognizes the bare model ID (`claude-sonnet-4-6`).
2. **BUILDER-7GF**: `Error invoking postEvent: Method not found` —
Sentry SDK internal error on Chrome Mobile Android, not a platform bug.
3. **XMLParserBlock**: `BlockUnknownError raised by XMLParserBlock with
message: Error in input xml syntax` — user sent bad XML but the block
raised `SyntaxError`, which gets wrapped as `BlockUnknownError`
(unexpected) instead of `BlockExecutionError` (expected).
4. **AUTOGPT-SERVER-8BS**: `Virus scanning failed for Screenshot
2026-03-26 091900.png: range() arg 3 must not be zero` — empty (0-byte)
file upload causes `range(0, 0, 0)` in the virus scanner chunking loop,
and the failure is logged at `error` level which pages on-call.
5. **AUTOGPT-SERVER-8BT**: `ValueError: <Token var=<ContextVar
name='current_context'>> was created in a different Context` —
OpenTelemetry `context.detach()` fails when the SDK streaming async
generator is garbage-collected in a different context than where it was
created (client disconnect mid-stream).
6. **AUTOGPT-SERVER-8BW**: `RuntimeError: Attempted to exit cancel scope
in a different task than it was entered in` — anyio's
`TaskGroup.__aexit__` detects cancel scope entered in one task but
exited in another when `GeneratorExit` interrupts the SDK cleanup during
client disconnect.
7. **Workspace UniqueViolationError**: `UniqueViolationError: Unique
constraint failed on (workspaceId, path)` — race condition during
concurrent file uploads handled by `WorkspaceManager._persist_db_record`
retry logic, but Sentry still captures the exception at the raise site.
8. **Library UniqueViolationError**: `UniqueViolationError` on
`LibraryAgent (userId, agentGraphId, agentGraphVersion)` — race
conditions in `add_graph_to_library` and `create_library_agent` caused
crashes or silent data loss.
9. **Graph version collision**: `UniqueViolationError` on `AgentGraph
(id, version)` — copilot re-saving an agent at an existing version
collides with the primary key.
## What
### Backend: `LlmModel._missing_()` for provider-prefixed model names
- Adds `_missing_` classmethod to `LlmModel` enum that strips the
provider prefix (e.g., `anthropic/`) when direct lookup fails
- Self-contained in the enum — no changes to the generic type conversion
system
### Frontend: Filter Sentry SDK noise
- Adds `postEvent: Method not found` to `ignoreErrors` — a known Sentry
SDK issue on certain mobile browsers
### Backend: XMLParserBlock — raise ValueError instead of SyntaxError
- Changed `_validate_tokens()` to raise `ValueError` instead of
`SyntaxError`
- Changed the `except SyntaxError` handler in `run()` to re-raise as
`ValueError`
- This ensures `Block.execute()` wraps XML parsing failures as
`BlockExecutionError` (expected/user-caused) instead of
`BlockUnknownError` (unexpected/alerts Sentry)
### Backend: Virus scanner — handle empty files + reduce alert noise
- Added early return for empty (0-byte) files in `scan_file()` to avoid
`range() arg 3 must not be zero` when `chunk_size` is 0
- Added `max(1, len(content))` guard on `chunk_size` as defense-in-depth
- Downgraded `scan_content_safe` failure log from `error` to `warning`
so single-file scan failures don't page on-call via Sentry
### Backend: Suppress SDK client cleanup errors on SSE disconnect
- Replaced `async with ClaudeSDKClient` in `_run_stream_attempt` with
manual `__aenter__`/`__aexit__` wrapped in new
`_safe_close_sdk_client()` helper
- `_safe_close_sdk_client()` catches `ValueError` (OTEL context token
mismatch) and `RuntimeError` (anyio cancel scope in wrong task) during
`__aexit__` and logs at `debug` level — these are expected when SSE
client disconnects mid-stream
- Added `_is_sdk_disconnect_error()` helper for defense-in-depth at the
outer `except BaseException` handler in `stream_chat_completion_sdk`
- Both Sentry errors (8BT and 8BW) are now suppressed without affecting
normal cleanup flow
### Backend: Filter workspace UniqueViolationError from Sentry alerts
- Added `before_send` filter in `_before_send()` to drop
`UniqueViolationError` events where the message contains `workspaceId`
and `path`
- The error is already handled by `WorkspaceManager._persist_db_record`
retry logic — it must propagate for the retry logic to work, so the fix
is at the Sentry filter level rather than catching/suppressing at source
### Backend: Library agent race condition fixes
- **`add_graph_to_library`**: Replaced check-then-create pattern with
create-then-catch-`UniqueViolationError`-then-update. On collision,
updates the existing row (restoring soft-deleted/archived agents)
instead of crashing.
- **`create_library_agent`**: Replaced `create` with `upsert` on the
`(userId, agentGraphId, agentGraphVersion)` composite unique constraint,
so concurrent adds restore soft-deleted entries instead of throwing.
### Backend: Graph version auto-increment on collision
- `__create_graph` now checks if the `(id, version)` already exists
before `create_many`, and auto-increments the version to `max_existing +
1` to avoid `UniqueViolationError` when the copilot re-saves an agent.
### Backend: Workspace `get_or_create_workspace` upsert
- Changed from find-then-create to `upsert` to atomically handle
concurrent workspace creation.
## Test plan
- [x] `LlmModel("anthropic/claude-sonnet-4-6")` resolves correctly
- [x] `LlmModel("claude-sonnet-4-6")` still works (no regression)
- [x] `LlmModel("invalid/nonexistent-model")` still raises `ValueError`
- [x] XMLParserBlock: unclosed tags, extra closing tags, empty XML all
raise `ValueError`
- [x] XMLParserBlock: `SyntaxError` from gravitasml library is caught
and re-raised as `ValueError`
- [x] Virus scanner: empty file (0 bytes) returns clean without hitting
ClamAV
- [x] Virus scanner: single-byte file scans normally (regression test)
- [x] Virus scanner: `scan_content_safe` logs at WARNING not ERROR on
failure
- [x] SDK disconnect: `_is_sdk_disconnect_error` correctly identifies
cancel scope and context var errors
- [x] SDK disconnect: `_is_sdk_disconnect_error` rejects unrelated
errors
- [x] SDK disconnect: `_safe_close_sdk_client` suppresses ValueError,
RuntimeError, and unexpected exceptions
- [x] SDK disconnect: `_safe_close_sdk_client` calls `__aexit__` on
clean exit
- [x] Library: `add_graph_to_library` creates new agent on first call
- [x] Library: `add_graph_to_library` updates existing on
UniqueViolationError
- [x] Library: `create_library_agent` uses upsert to handle concurrent
adds
- [x] All existing workspace overwrite tests still pass
- [x] All tests passing (existing + 4 XML syntax + 3 virus scanner + 10
SDK disconnect + library tests)
This commit is contained in:
@@ -17,8 +17,6 @@ from backend.data.includes import library_agent_include
|
||||
from backend.util.exceptions import NotFoundError
|
||||
from backend.util.json import SafeJson
|
||||
|
||||
from .db import get_library_agent_by_graph_id, update_library_agent
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@@ -61,28 +59,17 @@ async def add_graph_to_library(
|
||||
graph_model: GraphModel,
|
||||
user_id: str,
|
||||
) -> library_model.LibraryAgent:
|
||||
"""Check existing / restore soft-deleted / create new LibraryAgent."""
|
||||
if existing := await get_library_agent_by_graph_id(
|
||||
user_id, graph_model.id, graph_model.version
|
||||
):
|
||||
return existing
|
||||
"""Check existing / restore soft-deleted / create new LibraryAgent.
|
||||
|
||||
deleted_agent = await prisma.models.LibraryAgent.prisma().find_unique(
|
||||
where={
|
||||
"userId_agentGraphId_agentGraphVersion": {
|
||||
"userId": user_id,
|
||||
"agentGraphId": graph_model.id,
|
||||
"agentGraphVersion": graph_model.version,
|
||||
}
|
||||
},
|
||||
Uses a create-then-catch-UniqueViolationError-then-update pattern on
|
||||
the (userId, agentGraphId, agentGraphVersion) composite unique constraint.
|
||||
This is more robust than ``upsert`` because Prisma's upsert atomicity
|
||||
guarantees are not well-documented for all versions.
|
||||
"""
|
||||
settings_json = SafeJson(GraphSettings.from_graph(graph_model).model_dump())
|
||||
_include = library_agent_include(
|
||||
user_id, include_nodes=False, include_executions=False
|
||||
)
|
||||
if deleted_agent and (deleted_agent.isDeleted or deleted_agent.isArchived):
|
||||
return await update_library_agent(
|
||||
deleted_agent.id,
|
||||
user_id,
|
||||
is_deleted=False,
|
||||
is_archived=False,
|
||||
)
|
||||
|
||||
try:
|
||||
added_agent = await prisma.models.LibraryAgent.prisma().create(
|
||||
@@ -98,23 +85,32 @@ async def add_graph_to_library(
|
||||
},
|
||||
"isCreatedByUser": False,
|
||||
"useGraphIsActiveVersion": False,
|
||||
"settings": SafeJson(
|
||||
GraphSettings.from_graph(graph_model).model_dump()
|
||||
),
|
||||
"settings": settings_json,
|
||||
},
|
||||
include=library_agent_include(
|
||||
user_id, include_nodes=False, include_executions=False
|
||||
),
|
||||
include=_include,
|
||||
)
|
||||
except prisma.errors.UniqueViolationError:
|
||||
# Race condition: concurrent request created the row between our
|
||||
# check and create. Re-read instead of crashing.
|
||||
existing = await get_library_agent_by_graph_id(
|
||||
user_id, graph_model.id, graph_model.version
|
||||
# Already exists — update to restore if previously soft-deleted/archived
|
||||
added_agent = await prisma.models.LibraryAgent.prisma().update(
|
||||
where={
|
||||
"userId_agentGraphId_agentGraphVersion": {
|
||||
"userId": user_id,
|
||||
"agentGraphId": graph_model.id,
|
||||
"agentGraphVersion": graph_model.version,
|
||||
}
|
||||
},
|
||||
data={
|
||||
"isDeleted": False,
|
||||
"isArchived": False,
|
||||
"settings": settings_json,
|
||||
},
|
||||
include=_include,
|
||||
)
|
||||
if existing:
|
||||
return existing
|
||||
raise # Shouldn't happen, but don't swallow unexpected errors
|
||||
if added_agent is None:
|
||||
raise NotFoundError(
|
||||
f"LibraryAgent for graph #{graph_model.id} "
|
||||
f"v{graph_model.version} not found after UniqueViolationError"
|
||||
)
|
||||
|
||||
logger.debug(
|
||||
f"Added graph #{graph_model.id} v{graph_model.version} "
|
||||
|
||||
@@ -1,71 +1,80 @@
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import prisma.errors
|
||||
import pytest
|
||||
|
||||
from ._add_to_library import add_graph_to_library
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_add_graph_to_library_restores_archived_agent() -> None:
|
||||
graph_model = MagicMock(id="graph-id", version=2)
|
||||
archived_agent = MagicMock(id="library-agent-id", isDeleted=False, isArchived=True)
|
||||
restored_agent = MagicMock(name="LibraryAgentModel")
|
||||
async def test_add_graph_to_library_create_new_agent() -> None:
|
||||
"""When no matching LibraryAgent exists, create inserts a new one."""
|
||||
graph_model = MagicMock(id="graph-id", version=2, nodes=[])
|
||||
created_agent = MagicMock(name="CreatedLibraryAgent")
|
||||
converted_agent = MagicMock(name="ConvertedLibraryAgent")
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.api.features.library._add_to_library.get_library_agent_by_graph_id",
|
||||
new=AsyncMock(return_value=None),
|
||||
),
|
||||
patch(
|
||||
"backend.api.features.library._add_to_library.prisma.models.LibraryAgent.prisma"
|
||||
) as mock_prisma,
|
||||
patch(
|
||||
"backend.api.features.library._add_to_library.update_library_agent",
|
||||
new=AsyncMock(return_value=restored_agent),
|
||||
) as mock_update,
|
||||
"backend.api.features.library._add_to_library.library_model.LibraryAgent.from_db",
|
||||
return_value=converted_agent,
|
||||
) as mock_from_db,
|
||||
):
|
||||
mock_prisma.return_value.find_unique = AsyncMock(return_value=archived_agent)
|
||||
mock_prisma.return_value.create = AsyncMock(return_value=created_agent)
|
||||
|
||||
result = await add_graph_to_library("slv-id", graph_model, "user-id")
|
||||
|
||||
assert result is restored_agent
|
||||
mock_update.assert_awaited_once_with(
|
||||
"library-agent-id",
|
||||
"user-id",
|
||||
is_deleted=False,
|
||||
is_archived=False,
|
||||
)
|
||||
mock_prisma.return_value.create.assert_not_called()
|
||||
assert result is converted_agent
|
||||
mock_from_db.assert_called_once_with(created_agent)
|
||||
# Verify create was called with correct data
|
||||
create_call = mock_prisma.return_value.create.call_args
|
||||
create_data = create_call.kwargs["data"]
|
||||
assert create_data["User"] == {"connect": {"id": "user-id"}}
|
||||
assert create_data["AgentGraph"] == {
|
||||
"connect": {"graphVersionId": {"id": "graph-id", "version": 2}}
|
||||
}
|
||||
assert create_data["isCreatedByUser"] is False
|
||||
assert create_data["useGraphIsActiveVersion"] is False
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_add_graph_to_library_restores_deleted_agent() -> None:
|
||||
graph_model = MagicMock(id="graph-id", version=2)
|
||||
deleted_agent = MagicMock(id="library-agent-id", isDeleted=True, isArchived=False)
|
||||
restored_agent = MagicMock(name="LibraryAgentModel")
|
||||
async def test_add_graph_to_library_unique_violation_updates_existing() -> None:
|
||||
"""UniqueViolationError on create falls back to update."""
|
||||
graph_model = MagicMock(id="graph-id", version=2, nodes=[])
|
||||
updated_agent = MagicMock(name="UpdatedLibraryAgent")
|
||||
converted_agent = MagicMock(name="ConvertedLibraryAgent")
|
||||
|
||||
with (
|
||||
patch(
|
||||
"backend.api.features.library._add_to_library.get_library_agent_by_graph_id",
|
||||
new=AsyncMock(return_value=None),
|
||||
),
|
||||
patch(
|
||||
"backend.api.features.library._add_to_library.prisma.models.LibraryAgent.prisma"
|
||||
) as mock_prisma,
|
||||
patch(
|
||||
"backend.api.features.library._add_to_library.update_library_agent",
|
||||
new=AsyncMock(return_value=restored_agent),
|
||||
) as mock_update,
|
||||
"backend.api.features.library._add_to_library.library_model.LibraryAgent.from_db",
|
||||
return_value=converted_agent,
|
||||
) as mock_from_db,
|
||||
):
|
||||
mock_prisma.return_value.find_unique = AsyncMock(return_value=deleted_agent)
|
||||
mock_prisma.return_value.create = AsyncMock(
|
||||
side_effect=prisma.errors.UniqueViolationError(
|
||||
MagicMock(), message="unique constraint"
|
||||
)
|
||||
)
|
||||
mock_prisma.return_value.update = AsyncMock(return_value=updated_agent)
|
||||
|
||||
result = await add_graph_to_library("slv-id", graph_model, "user-id")
|
||||
|
||||
assert result is restored_agent
|
||||
mock_update.assert_awaited_once_with(
|
||||
"library-agent-id",
|
||||
"user-id",
|
||||
is_deleted=False,
|
||||
is_archived=False,
|
||||
)
|
||||
mock_prisma.return_value.create.assert_not_called()
|
||||
assert result is converted_agent
|
||||
mock_from_db.assert_called_once_with(updated_agent)
|
||||
# Verify update was called with correct where and data
|
||||
update_call = mock_prisma.return_value.update.call_args
|
||||
assert update_call.kwargs["where"] == {
|
||||
"userId_agentGraphId_agentGraphVersion": {
|
||||
"userId": "user-id",
|
||||
"agentGraphId": "graph-id",
|
||||
"agentGraphVersion": 2,
|
||||
}
|
||||
}
|
||||
update_data = update_call.kwargs["data"]
|
||||
assert update_data["isDeleted"] is False
|
||||
assert update_data["isArchived"] is False
|
||||
|
||||
@@ -436,32 +436,53 @@ async def create_library_agent(
|
||||
async with transaction() as tx:
|
||||
library_agents = await asyncio.gather(
|
||||
*(
|
||||
prisma.models.LibraryAgent.prisma(tx).create(
|
||||
data=prisma.types.LibraryAgentCreateInput(
|
||||
isCreatedByUser=(user_id == user_id),
|
||||
useGraphIsActiveVersion=True,
|
||||
User={"connect": {"id": user_id}},
|
||||
AgentGraph={
|
||||
"connect": {
|
||||
"graphVersionId": {
|
||||
"id": graph_entry.id,
|
||||
"version": graph_entry.version,
|
||||
prisma.models.LibraryAgent.prisma(tx).upsert(
|
||||
where={
|
||||
"userId_agentGraphId_agentGraphVersion": {
|
||||
"userId": user_id,
|
||||
"agentGraphId": graph_entry.id,
|
||||
"agentGraphVersion": graph_entry.version,
|
||||
}
|
||||
},
|
||||
data={
|
||||
"create": prisma.types.LibraryAgentCreateInput(
|
||||
isCreatedByUser=(user_id == graph.user_id),
|
||||
useGraphIsActiveVersion=True,
|
||||
User={"connect": {"id": user_id}},
|
||||
AgentGraph={
|
||||
"connect": {
|
||||
"graphVersionId": {
|
||||
"id": graph_entry.id,
|
||||
"version": graph_entry.version,
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
settings=SafeJson(
|
||||
GraphSettings.from_graph(
|
||||
graph_entry,
|
||||
hitl_safe_mode=hitl_safe_mode,
|
||||
sensitive_action_safe_mode=sensitive_action_safe_mode,
|
||||
).model_dump()
|
||||
),
|
||||
**(
|
||||
{"Folder": {"connect": {"id": folder_id}}}
|
||||
if folder_id and graph_entry is graph
|
||||
else {}
|
||||
),
|
||||
),
|
||||
"update": {
|
||||
"isDeleted": False,
|
||||
"isArchived": False,
|
||||
"useGraphIsActiveVersion": True,
|
||||
"settings": SafeJson(
|
||||
GraphSettings.from_graph(
|
||||
graph_entry,
|
||||
hitl_safe_mode=hitl_safe_mode,
|
||||
sensitive_action_safe_mode=sensitive_action_safe_mode,
|
||||
).model_dump()
|
||||
),
|
||||
},
|
||||
settings=SafeJson(
|
||||
GraphSettings.from_graph(
|
||||
graph_entry,
|
||||
hitl_safe_mode=hitl_safe_mode,
|
||||
sensitive_action_safe_mode=sensitive_action_safe_mode,
|
||||
).model_dump()
|
||||
),
|
||||
**(
|
||||
{"Folder": {"connect": {"id": folder_id}}}
|
||||
if folder_id and graph_entry is graph
|
||||
else {}
|
||||
),
|
||||
),
|
||||
},
|
||||
include=library_agent_include(
|
||||
user_id, include_nodes=False, include_executions=False
|
||||
),
|
||||
|
||||
@@ -1,4 +1,6 @@
|
||||
from contextlib import asynccontextmanager
|
||||
from datetime import datetime
|
||||
from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import prisma.enums
|
||||
import prisma.models
|
||||
@@ -85,10 +87,6 @@ async def test_get_library_agents(mocker):
|
||||
async def test_add_agent_to_library(mocker):
|
||||
await connect()
|
||||
|
||||
# Mock the transaction context
|
||||
mock_transaction = mocker.patch("backend.api.features.library.db.transaction")
|
||||
mock_transaction.return_value.__aenter__ = mocker.AsyncMock(return_value=None)
|
||||
mock_transaction.return_value.__aexit__ = mocker.AsyncMock(return_value=None)
|
||||
# Mock data
|
||||
mock_store_listing_data = prisma.models.StoreListingVersion(
|
||||
id="version123",
|
||||
@@ -143,13 +141,11 @@ async def test_add_agent_to_library(mocker):
|
||||
)
|
||||
|
||||
mock_library_agent = mocker.patch("prisma.models.LibraryAgent.prisma")
|
||||
mock_library_agent.return_value.find_first = mocker.AsyncMock(return_value=None)
|
||||
mock_library_agent.return_value.find_unique = mocker.AsyncMock(return_value=None)
|
||||
mock_library_agent.return_value.create = mocker.AsyncMock(
|
||||
return_value=mock_library_agent_data
|
||||
)
|
||||
|
||||
# Mock graph_db.get_graph function that's called to check for HITL blocks
|
||||
# Mock graph_db.get_graph function that's called in resolve_graph_for_library
|
||||
# (lives in _add_to_library.py after refactor, not db.py)
|
||||
mock_graph_db = mocker.patch(
|
||||
"backend.api.features.library._add_to_library.graph_db"
|
||||
@@ -175,37 +171,27 @@ async def test_add_agent_to_library(mocker):
|
||||
mock_store_listing_version.return_value.find_unique.assert_called_once_with(
|
||||
where={"id": "version123"}, include={"AgentGraph": True}
|
||||
)
|
||||
mock_library_agent.return_value.find_unique.assert_called_once_with(
|
||||
where={
|
||||
"userId_agentGraphId_agentGraphVersion": {
|
||||
"userId": "test-user",
|
||||
"agentGraphId": "agent1",
|
||||
"agentGraphVersion": 1,
|
||||
}
|
||||
},
|
||||
)
|
||||
# Check that create was called with the expected data including settings
|
||||
create_call_args = mock_library_agent.return_value.create.call_args
|
||||
assert create_call_args is not None
|
||||
|
||||
# Verify the main structure
|
||||
expected_data = {
|
||||
# Verify the create data structure
|
||||
create_data = create_call_args.kwargs["data"]
|
||||
expected_create = {
|
||||
"User": {"connect": {"id": "test-user"}},
|
||||
"AgentGraph": {"connect": {"graphVersionId": {"id": "agent1", "version": 1}}},
|
||||
"isCreatedByUser": False,
|
||||
"useGraphIsActiveVersion": False,
|
||||
}
|
||||
|
||||
actual_data = create_call_args[1]["data"]
|
||||
# Check that all expected fields are present
|
||||
for key, value in expected_data.items():
|
||||
assert actual_data[key] == value
|
||||
for key, value in expected_create.items():
|
||||
assert create_data[key] == value
|
||||
|
||||
# Check that settings field is present and is a SafeJson object
|
||||
assert "settings" in actual_data
|
||||
assert hasattr(actual_data["settings"], "__class__") # Should be a SafeJson object
|
||||
assert "settings" in create_data
|
||||
assert hasattr(create_data["settings"], "__class__") # Should be a SafeJson object
|
||||
|
||||
# Check include parameter
|
||||
assert create_call_args[1]["include"] == library_agent_include(
|
||||
assert create_call_args.kwargs["include"] == library_agent_include(
|
||||
"test-user", include_nodes=False, include_executions=False
|
||||
)
|
||||
|
||||
@@ -320,3 +306,50 @@ async def test_update_graph_in_library_allows_archived_library_agent(mocker):
|
||||
include_archived=True,
|
||||
)
|
||||
mock_update_library_agent.assert_awaited_once_with("test-user", created_graph)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_create_library_agent_uses_upsert():
|
||||
"""create_library_agent should use upsert (not create) to handle duplicates."""
|
||||
mock_graph = MagicMock()
|
||||
mock_graph.id = "graph-1"
|
||||
mock_graph.version = 1
|
||||
mock_graph.user_id = "user-1"
|
||||
mock_graph.nodes = []
|
||||
mock_graph.sub_graphs = []
|
||||
|
||||
mock_upserted = MagicMock(name="UpsertedLibraryAgent")
|
||||
|
||||
@asynccontextmanager
|
||||
async def fake_tx():
|
||||
yield None
|
||||
|
||||
with (
|
||||
patch("backend.api.features.library.db.transaction", fake_tx),
|
||||
patch("prisma.models.LibraryAgent.prisma") as mock_prisma,
|
||||
patch(
|
||||
"backend.api.features.library.db.add_generated_agent_image",
|
||||
new=AsyncMock(),
|
||||
),
|
||||
patch(
|
||||
"backend.api.features.library.model.LibraryAgent.from_db",
|
||||
return_value=MagicMock(),
|
||||
),
|
||||
):
|
||||
mock_prisma.return_value.upsert = AsyncMock(return_value=mock_upserted)
|
||||
|
||||
result = await db.create_library_agent(mock_graph, "user-1")
|
||||
|
||||
assert len(result) == 1
|
||||
upsert_call = mock_prisma.return_value.upsert.call_args
|
||||
assert upsert_call is not None
|
||||
# Verify the upsert where clause uses the composite unique key
|
||||
where = upsert_call.kwargs["where"]
|
||||
assert "userId_agentGraphId_agentGraphVersion" in where
|
||||
# Verify the upsert data has both create and update branches
|
||||
data = upsert_call.kwargs["data"]
|
||||
assert "create" in data
|
||||
assert "update" in data
|
||||
# Verify update branch restores soft-deleted/archived agents
|
||||
assert data["update"]["isDeleted"] is False
|
||||
assert data["update"]["isArchived"] is False
|
||||
|
||||
@@ -104,6 +104,18 @@ class LlmModelMeta(EnumMeta):
|
||||
|
||||
|
||||
class LlmModel(str, Enum, metaclass=LlmModelMeta):
|
||||
|
||||
@classmethod
|
||||
def _missing_(cls, value: object) -> "LlmModel | None":
|
||||
"""Handle provider-prefixed model names like 'anthropic/claude-sonnet-4-6'."""
|
||||
if isinstance(value, str) and "/" in value:
|
||||
stripped = value.split("/", 1)[1]
|
||||
try:
|
||||
return cls(stripped)
|
||||
except ValueError:
|
||||
return None
|
||||
return None
|
||||
|
||||
# OpenAI models
|
||||
O3_MINI = "o3-mini"
|
||||
O3 = "o3-2025-04-16"
|
||||
|
||||
@@ -207,6 +207,51 @@ class TestXMLParserBlockSecurity:
|
||||
pass
|
||||
|
||||
|
||||
class TestXMLParserBlockSyntaxErrors:
|
||||
"""XML syntax errors should raise ValueError (not SyntaxError).
|
||||
|
||||
This ensures the base Block.execute() wraps them as BlockExecutionError
|
||||
(expected / user-caused) instead of BlockUnknownError (unexpected / alerts
|
||||
Sentry).
|
||||
"""
|
||||
|
||||
async def test_unclosed_tag_raises_value_error(self):
|
||||
"""Unclosed tags should raise ValueError, not SyntaxError."""
|
||||
block = XMLParserBlock()
|
||||
bad_xml = "<root><unclosed>"
|
||||
|
||||
with pytest.raises(ValueError, match="Unclosed tag"):
|
||||
async for _ in block.run(XMLParserBlock.Input(input_xml=bad_xml)):
|
||||
pass
|
||||
|
||||
async def test_unexpected_closing_tag_raises_value_error(self):
|
||||
"""Extra closing tags should raise ValueError, not SyntaxError."""
|
||||
block = XMLParserBlock()
|
||||
bad_xml = "</unexpected>"
|
||||
|
||||
with pytest.raises(ValueError):
|
||||
async for _ in block.run(XMLParserBlock.Input(input_xml=bad_xml)):
|
||||
pass
|
||||
|
||||
async def test_empty_xml_raises_value_error(self):
|
||||
"""Empty XML input should raise ValueError."""
|
||||
block = XMLParserBlock()
|
||||
|
||||
with pytest.raises(ValueError, match="XML input is empty"):
|
||||
async for _ in block.run(XMLParserBlock.Input(input_xml="")):
|
||||
pass
|
||||
|
||||
async def test_syntax_error_from_parser_becomes_value_error(self):
|
||||
"""SyntaxErrors from gravitasml library become ValueError (BlockExecutionError)."""
|
||||
block = XMLParserBlock()
|
||||
# Malformed XML that might trigger a SyntaxError from the parser
|
||||
bad_xml = "<root><child>no closing"
|
||||
|
||||
with pytest.raises(ValueError):
|
||||
async for _ in block.run(XMLParserBlock.Input(input_xml=bad_xml)):
|
||||
pass
|
||||
|
||||
|
||||
class TestStoreMediaFileSecurity:
|
||||
"""Test file storage security limits."""
|
||||
|
||||
|
||||
@@ -809,3 +809,33 @@ class TestUserErrorStatusCodeHandling:
|
||||
|
||||
mock_warning.assert_called_once()
|
||||
mock_exception.assert_not_called()
|
||||
|
||||
|
||||
class TestLlmModelMissing:
|
||||
"""Test that LlmModel handles provider-prefixed model names."""
|
||||
|
||||
def test_provider_prefixed_model_resolves(self):
|
||||
"""Provider-prefixed model string should resolve to the correct enum member."""
|
||||
assert (
|
||||
llm.LlmModel("anthropic/claude-sonnet-4-6")
|
||||
== llm.LlmModel.CLAUDE_4_6_SONNET
|
||||
)
|
||||
|
||||
def test_bare_model_still_works(self):
|
||||
"""Bare (non-prefixed) model string should still resolve correctly."""
|
||||
assert llm.LlmModel("claude-sonnet-4-6") == llm.LlmModel.CLAUDE_4_6_SONNET
|
||||
|
||||
def test_invalid_prefixed_model_raises(self):
|
||||
"""Unknown provider-prefixed model string should raise ValueError."""
|
||||
with pytest.raises(ValueError):
|
||||
llm.LlmModel("invalid/nonexistent-model")
|
||||
|
||||
def test_slash_containing_value_direct_lookup(self):
|
||||
"""Enum values with '/' (e.g., OpenRouter models) should resolve via direct lookup, not _missing_."""
|
||||
assert llm.LlmModel("google/gemini-2.5-pro") == llm.LlmModel.GEMINI_2_5_PRO
|
||||
|
||||
def test_double_prefixed_slash_model(self):
|
||||
"""Double-prefixed value should still resolve by stripping first prefix."""
|
||||
assert (
|
||||
llm.LlmModel("extra/google/gemini-2.5-pro") == llm.LlmModel.GEMINI_2_5_PRO
|
||||
)
|
||||
|
||||
@@ -44,7 +44,7 @@ class XMLParserBlock(Block):
|
||||
elif token.type == "TAG_CLOSE":
|
||||
depth -= 1
|
||||
if depth < 0:
|
||||
raise SyntaxError("Unexpected closing tag in XML input.")
|
||||
raise ValueError("Unexpected closing tag in XML input.")
|
||||
elif token.type in {"TEXT", "ESCAPE"}:
|
||||
if depth == 0 and token.value:
|
||||
raise ValueError(
|
||||
@@ -53,7 +53,7 @@ class XMLParserBlock(Block):
|
||||
)
|
||||
|
||||
if depth != 0:
|
||||
raise SyntaxError("Unclosed tag detected in XML input.")
|
||||
raise ValueError("Unclosed tag detected in XML input.")
|
||||
if not root_seen:
|
||||
raise ValueError("XML must include a root element.")
|
||||
|
||||
@@ -76,4 +76,7 @@ class XMLParserBlock(Block):
|
||||
except ValueError as val_e:
|
||||
raise ValueError(f"Validation error for dict:{val_e}") from val_e
|
||||
except SyntaxError as syn_e:
|
||||
raise SyntaxError(f"Error in input xml syntax: {syn_e}") from syn_e
|
||||
# Raise as ValueError so the base Block.execute() wraps it as
|
||||
# BlockExecutionError (expected user-caused failure) instead of
|
||||
# BlockUnknownError (unexpected platform error that alerts Sentry).
|
||||
raise ValueError(f"Error in input xml syntax: {syn_e}") from syn_e
|
||||
|
||||
@@ -185,6 +185,24 @@ def _is_prompt_too_long(err: BaseException) -> bool:
|
||||
return False
|
||||
|
||||
|
||||
def _is_sdk_disconnect_error(exc: BaseException) -> bool:
|
||||
"""Return True if *exc* is an expected SDK cleanup error from client disconnect.
|
||||
|
||||
Two known patterns occur when ``GeneratorExit`` tears down the async
|
||||
generator and the SDK's ``__aexit__`` runs in a different context/task:
|
||||
|
||||
* ``RuntimeError``: cancel scope exited in wrong task (anyio)
|
||||
* ``ValueError``: ContextVar token created in a different Context (OTEL)
|
||||
|
||||
These are suppressed to avoid polluting Sentry with non-actionable noise.
|
||||
"""
|
||||
if isinstance(exc, RuntimeError) and "cancel scope" in str(exc):
|
||||
return True
|
||||
if isinstance(exc, ValueError) and "was created in a different Context" in str(exc):
|
||||
return True
|
||||
return False
|
||||
|
||||
|
||||
def _is_tool_only_message(sdk_msg: object) -> bool:
|
||||
"""Return True if *sdk_msg* is an AssistantMessage containing only ToolUseBlocks.
|
||||
|
||||
@@ -409,6 +427,63 @@ _HEARTBEAT_INTERVAL = 10.0 # seconds
|
||||
STREAM_LOCK_PREFIX = "copilot:stream:lock:"
|
||||
|
||||
|
||||
async def _safe_close_sdk_client(
|
||||
sdk_client: ClaudeSDKClient,
|
||||
log_prefix: str,
|
||||
) -> None:
|
||||
"""Close a ClaudeSDKClient, suppressing errors from client disconnect.
|
||||
|
||||
When the SSE client disconnects mid-stream, ``GeneratorExit`` propagates
|
||||
through the async generator stack and causes ``ClaudeSDKClient.__aexit__``
|
||||
to run in a different async context or task than where the client was
|
||||
opened. This triggers two known error classes:
|
||||
|
||||
* ``ValueError``: ``<Token var=<ContextVar name='current_context'>>
|
||||
was created in a different Context`` — OpenTelemetry's
|
||||
``context.detach()`` fails because the OTEL context token was
|
||||
created in the original generator coroutine but detach runs in
|
||||
the GC / cleanup coroutine (Sentry: AUTOGPT-SERVER-8BT).
|
||||
|
||||
* ``RuntimeError``: ``Attempted to exit cancel scope in a different
|
||||
task than it was entered in`` — anyio's ``TaskGroup.__aexit__``
|
||||
detects that the cancel scope was entered in one task but is
|
||||
being exited in another (Sentry: AUTOGPT-SERVER-8BW).
|
||||
|
||||
Both are harmless — the TCP connection is already dead and no
|
||||
resources leak. Logging them at ``debug`` level keeps observability
|
||||
without polluting Sentry.
|
||||
"""
|
||||
try:
|
||||
await sdk_client.__aexit__(None, None, None)
|
||||
except (ValueError, RuntimeError) as exc:
|
||||
if _is_sdk_disconnect_error(exc):
|
||||
# Expected during client disconnect — suppress to avoid Sentry noise.
|
||||
logger.debug(
|
||||
"%s SDK client cleanup error suppressed (client disconnect): %s: %s",
|
||||
log_prefix,
|
||||
type(exc).__name__,
|
||||
exc,
|
||||
)
|
||||
else:
|
||||
raise
|
||||
except GeneratorExit:
|
||||
# GeneratorExit can propagate through __aexit__ — suppress it here
|
||||
# since the generator is already being torn down.
|
||||
logger.debug(
|
||||
"%s SDK client cleanup GeneratorExit suppressed (client disconnect)",
|
||||
log_prefix,
|
||||
)
|
||||
except Exception:
|
||||
# Unexpected cleanup error — log at error level so Sentry captures it
|
||||
# (via its logging integration), but don't propagate since we're in
|
||||
# teardown and the caller cannot meaningfully handle this.
|
||||
logger.error(
|
||||
"%s Unexpected SDK client cleanup error",
|
||||
log_prefix,
|
||||
exc_info=True,
|
||||
)
|
||||
|
||||
|
||||
async def _iter_sdk_messages(
|
||||
client: ClaudeSDKClient,
|
||||
) -> AsyncGenerator[Any, None]:
|
||||
@@ -1188,7 +1263,17 @@ async def _run_stream_attempt(
|
||||
|
||||
consecutive_empty_tool_calls = 0
|
||||
|
||||
async with ClaudeSDKClient(options=state.options) as client:
|
||||
# Use manual __aenter__/__aexit__ instead of ``async with`` so we can
|
||||
# suppress SDK cleanup errors that occur when the SSE client disconnects
|
||||
# mid-stream. GeneratorExit causes the SDK's ``__aexit__`` to run in a
|
||||
# different async context/task than where the client was opened, which
|
||||
# triggers:
|
||||
# - ValueError: ContextVar token mismatch (AUTOGPT-SERVER-8BT)
|
||||
# - RuntimeError: cancel scope in wrong task (AUTOGPT-SERVER-8BW)
|
||||
# Both are harmless — the TCP connection is already dead.
|
||||
sdk_client = ClaudeSDKClient(options=state.options)
|
||||
client = await sdk_client.__aenter__()
|
||||
try:
|
||||
logger.info(
|
||||
"%s Sending query — resume=%s, total_msgs=%d, "
|
||||
"query_len=%d, attached_files=%d, image_blocks=%d",
|
||||
@@ -1448,6 +1533,8 @@ async def _run_stream_attempt(
|
||||
|
||||
if acc.stream_completed:
|
||||
break
|
||||
finally:
|
||||
await _safe_close_sdk_client(sdk_client, ctx.log_prefix)
|
||||
|
||||
# --- Post-stream processing (only on success) ---
|
||||
if state.adapter.has_unresolved_tool_calls:
|
||||
@@ -2169,9 +2256,16 @@ async def stream_chat_completion_sdk(
|
||||
error_msg = "Operation cancelled"
|
||||
else:
|
||||
error_msg = str(e) or type(e).__name__
|
||||
# SDK cleanup RuntimeError is expected during cancellation, log as warning
|
||||
if isinstance(e, RuntimeError) and "cancel scope" in str(e):
|
||||
logger.warning("%s SDK cleanup error: %s", log_prefix, error_msg)
|
||||
# SDK cleanup errors are expected during client disconnect —
|
||||
# log as warning rather than error to reduce Sentry noise.
|
||||
# These are normally caught by _safe_close_sdk_client but
|
||||
# can escape in edge cases (e.g. GeneratorExit timing).
|
||||
if _is_sdk_disconnect_error(e):
|
||||
logger.warning(
|
||||
"%s SDK cleanup error (client disconnect): %s",
|
||||
log_prefix,
|
||||
error_msg,
|
||||
)
|
||||
else:
|
||||
logger.error("%s Error: %s", log_prefix, error_msg, exc_info=True)
|
||||
|
||||
@@ -2193,10 +2287,11 @@ async def stream_chat_completion_sdk(
|
||||
)
|
||||
|
||||
# Yield StreamError for immediate feedback (only for non-cancellation errors)
|
||||
# Skip for CancelledError and RuntimeError cleanup issues (both are cancellations)
|
||||
is_cancellation = isinstance(e, asyncio.CancelledError) or (
|
||||
isinstance(e, RuntimeError) and "cancel scope" in str(e)
|
||||
)
|
||||
# Skip for CancelledError and SDK disconnect cleanup errors — these
|
||||
# are not actionable by the user and the SSE connection is already dead.
|
||||
is_cancellation = isinstance(
|
||||
e, asyncio.CancelledError
|
||||
) or _is_sdk_disconnect_error(e)
|
||||
if not is_cancellation:
|
||||
yield StreamError(errorText=display_msg, code=code)
|
||||
|
||||
|
||||
@@ -8,7 +8,12 @@ from unittest.mock import AsyncMock, MagicMock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
from .service import _prepare_file_attachments, _resolve_sdk_model
|
||||
from .service import (
|
||||
_is_sdk_disconnect_error,
|
||||
_prepare_file_attachments,
|
||||
_resolve_sdk_model,
|
||||
_safe_close_sdk_client,
|
||||
)
|
||||
|
||||
|
||||
@dataclass
|
||||
@@ -499,3 +504,111 @@ class TestResolveSdkModel:
|
||||
)
|
||||
monkeypatch.setattr("backend.copilot.sdk.service.config", cfg)
|
||||
assert _resolve_sdk_model() == "claude-opus-4-6"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# _is_sdk_disconnect_error — classify client disconnect cleanup errors
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestIsSdkDisconnectError:
|
||||
"""Tests for _is_sdk_disconnect_error — identifies expected SDK cleanup errors."""
|
||||
|
||||
def test_cancel_scope_runtime_error(self):
|
||||
"""RuntimeError about cancel scope in wrong task is a disconnect error."""
|
||||
exc = RuntimeError(
|
||||
"Attempted to exit cancel scope in a different task than it was entered in"
|
||||
)
|
||||
assert _is_sdk_disconnect_error(exc) is True
|
||||
|
||||
def test_context_var_value_error(self):
|
||||
"""ValueError about ContextVar token mismatch is a disconnect error."""
|
||||
exc = ValueError(
|
||||
"<Token var=<ContextVar name='current_context'>> "
|
||||
"was created in a different Context"
|
||||
)
|
||||
assert _is_sdk_disconnect_error(exc) is True
|
||||
|
||||
def test_unrelated_runtime_error(self):
|
||||
"""Unrelated RuntimeError should NOT be classified as disconnect error."""
|
||||
exc = RuntimeError("something else went wrong")
|
||||
assert _is_sdk_disconnect_error(exc) is False
|
||||
|
||||
def test_unrelated_value_error(self):
|
||||
"""Unrelated ValueError should NOT be classified as disconnect error."""
|
||||
exc = ValueError("invalid argument")
|
||||
assert _is_sdk_disconnect_error(exc) is False
|
||||
|
||||
def test_other_exception_types(self):
|
||||
"""Non-RuntimeError/ValueError should NOT be classified as disconnect error."""
|
||||
assert _is_sdk_disconnect_error(TypeError("bad type")) is False
|
||||
assert _is_sdk_disconnect_error(OSError("network down")) is False
|
||||
assert _is_sdk_disconnect_error(asyncio.CancelledError()) is False
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# _safe_close_sdk_client — suppress cleanup errors during disconnect
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestSafeCloseSdkClient:
|
||||
"""Tests for _safe_close_sdk_client — suppresses expected SDK cleanup errors."""
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_clean_exit(self):
|
||||
"""Normal __aexit__ (no error) should succeed silently."""
|
||||
client = AsyncMock()
|
||||
client.__aexit__ = AsyncMock(return_value=None)
|
||||
await _safe_close_sdk_client(client, "[test]")
|
||||
client.__aexit__.assert_awaited_once_with(None, None, None)
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_cancel_scope_runtime_error_suppressed(self):
|
||||
"""RuntimeError from cancel scope mismatch should be suppressed."""
|
||||
client = AsyncMock()
|
||||
client.__aexit__ = AsyncMock(
|
||||
side_effect=RuntimeError(
|
||||
"Attempted to exit cancel scope in a different task"
|
||||
)
|
||||
)
|
||||
# Should NOT raise
|
||||
await _safe_close_sdk_client(client, "[test]")
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_context_var_value_error_suppressed(self):
|
||||
"""ValueError from ContextVar token mismatch should be suppressed."""
|
||||
client = AsyncMock()
|
||||
client.__aexit__ = AsyncMock(
|
||||
side_effect=ValueError(
|
||||
"<Token var=<ContextVar name='current_context'>> "
|
||||
"was created in a different Context"
|
||||
)
|
||||
)
|
||||
# Should NOT raise
|
||||
await _safe_close_sdk_client(client, "[test]")
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_unexpected_exception_suppressed_with_error_log(self):
|
||||
"""Unexpected exceptions should be caught (not propagated) but logged at error."""
|
||||
client = AsyncMock()
|
||||
client.__aexit__ = AsyncMock(side_effect=OSError("unexpected"))
|
||||
# Should NOT raise — unexpected errors are also suppressed to
|
||||
# avoid crashing the generator during teardown. Logged at error
|
||||
# level so Sentry captures them via its logging integration.
|
||||
await _safe_close_sdk_client(client, "[test]")
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_unrelated_runtime_error_propagates(self):
|
||||
"""Non-cancel-scope RuntimeError should propagate (not suppressed)."""
|
||||
client = AsyncMock()
|
||||
client.__aexit__ = AsyncMock(side_effect=RuntimeError("something unrelated"))
|
||||
with pytest.raises(RuntimeError, match="something unrelated"):
|
||||
await _safe_close_sdk_client(client, "[test]")
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_unrelated_value_error_propagates(self):
|
||||
"""Non-disconnect ValueError should propagate (not suppressed)."""
|
||||
client = AsyncMock()
|
||||
client.__aexit__ = AsyncMock(side_effect=ValueError("invalid argument"))
|
||||
with pytest.raises(ValueError, match="invalid argument"):
|
||||
await _safe_close_sdk_client(client, "[test]")
|
||||
|
||||
@@ -581,7 +581,6 @@ class GraphModel(Graph, GraphMeta):
|
||||
field_name,
|
||||
field_info,
|
||||
) in node.block.input_schema.get_credentials_fields_info().items():
|
||||
|
||||
discriminator = field_info.discriminator
|
||||
if not discriminator:
|
||||
node_credential_data.append((field_info, (node.id, field_name)))
|
||||
@@ -1529,6 +1528,28 @@ async def fork_graph(graph_id: str, graph_version: int, user_id: str) -> GraphMo
|
||||
async def __create_graph(tx, graph: Graph, user_id: str):
|
||||
graphs = [graph] + graph.sub_graphs
|
||||
|
||||
# Auto-increment version for any graph entry (parent or sub-graph) whose
|
||||
# (id, version) already exists. This prevents UniqueViolationError when
|
||||
# the copilot re-saves an agent that already exists at the requested version.
|
||||
# NOTE: This issues one find_first query per graph entry (N+1 pattern).
|
||||
# Sub-graph counts are typically small (< 5), so the overhead is negligible.
|
||||
for g in graphs:
|
||||
existing = await AgentGraph.prisma(tx).find_first(
|
||||
where={"id": g.id},
|
||||
order={"version": "desc"},
|
||||
)
|
||||
if existing and existing.version >= g.version:
|
||||
old_version = g.version
|
||||
g.version = existing.version + 1
|
||||
logger.warning(
|
||||
"Auto-incremented graph %s version from %d to %d "
|
||||
"(version %d already exists)",
|
||||
g.id,
|
||||
old_version,
|
||||
g.version,
|
||||
existing.version,
|
||||
)
|
||||
|
||||
await AgentGraph.prisma(tx).create_many(
|
||||
data=[
|
||||
AgentGraphCreateInput(
|
||||
|
||||
@@ -76,20 +76,24 @@ async def get_or_create_workspace(user_id: str) -> Workspace:
|
||||
"""
|
||||
Get user's workspace, creating one if it doesn't exist.
|
||||
|
||||
Uses upsert to atomically handle concurrent creation attempts.
|
||||
|
||||
Args:
|
||||
user_id: The user's ID
|
||||
|
||||
Returns:
|
||||
Workspace instance
|
||||
"""
|
||||
workspace = await UserWorkspace.prisma().find_unique(where={"userId": user_id})
|
||||
if workspace:
|
||||
return Workspace.from_db(workspace)
|
||||
|
||||
try:
|
||||
workspace = await UserWorkspace.prisma().create(data={"userId": user_id})
|
||||
workspace = await UserWorkspace.prisma().upsert(
|
||||
where={"userId": user_id},
|
||||
data={
|
||||
"create": {"userId": user_id},
|
||||
"update": {}, # No-op update; workspace already exists
|
||||
},
|
||||
)
|
||||
except UniqueViolationError:
|
||||
# Concurrent request already created it
|
||||
# Defense-in-depth: should not happen with upsert, but handle gracefully
|
||||
workspace = await UserWorkspace.prisma().find_unique(where={"userId": user_id})
|
||||
if workspace is None:
|
||||
raise
|
||||
@@ -125,6 +129,13 @@ async def create_workspace_file(
|
||||
"""
|
||||
Create a new workspace file record.
|
||||
|
||||
Raises ``UniqueViolationError`` if a record with the same
|
||||
``(workspaceId, path)`` already exists. The caller
|
||||
(``WorkspaceManager._persist_db_record``) relies on this to trigger
|
||||
its delete-old-file-then-retry flow, which cleans up the old storage
|
||||
blob before re-creating the DB record. Using ``upsert`` here would
|
||||
silently overwrite ``storagePath`` and orphan the old blob in storage.
|
||||
|
||||
Args:
|
||||
workspace_id: The workspace ID
|
||||
file_id: The file ID (same as used in storage path for consistency)
|
||||
|
||||
@@ -84,6 +84,14 @@ def _before_send(event, hint):
|
||||
):
|
||||
return None
|
||||
|
||||
# Prisma UniqueViolationError — always caught and handled in our codebase.
|
||||
# These arise from concurrent create operations racing on unique constraints
|
||||
# (workspace files, credits, library folders, store listings, chat messages).
|
||||
# Every call site has an except handler; the global FastAPI handler also
|
||||
# catches them and returns 400. Safe to drop unconditionally.
|
||||
if exc_type and exc_type.__name__ == "UniqueViolationError":
|
||||
return None
|
||||
|
||||
# Google metadata DNS errors — expected in non-GCP environments
|
||||
if (
|
||||
"metadata.google.internal" in exc_msg
|
||||
|
||||
@@ -227,10 +227,16 @@ class AppService(BaseAppService, ABC):
|
||||
def _handle_internal_http_error(status_code: int = 500, log_error: bool = True):
|
||||
def handler(request: Request, exc: Exception):
|
||||
if log_error:
|
||||
logger.error(
|
||||
f"{request.method} {request.url.path} failed: {exc}",
|
||||
exc_info=exc if status_code == 500 else None,
|
||||
)
|
||||
if status_code >= 500:
|
||||
logger.error(
|
||||
f"{request.method} {request.url.path} failed: {exc}",
|
||||
exc_info=exc,
|
||||
)
|
||||
else:
|
||||
logger.warning(
|
||||
f"{request.method} {request.url.path} failed: {exc}",
|
||||
exc_info=exc,
|
||||
)
|
||||
return responses.JSONResponse(
|
||||
status_code=status_code,
|
||||
content=RemoteCallError(
|
||||
@@ -563,7 +569,6 @@ def get_service_client(
|
||||
self._connection_failure_count >= 3
|
||||
and current_time - self._last_client_reset > 30
|
||||
):
|
||||
|
||||
logger.warning(
|
||||
f"Connection failures detected ({self._connection_failure_count}), recreating HTTP clients"
|
||||
)
|
||||
|
||||
@@ -108,6 +108,9 @@ class VirusScannerService:
|
||||
return VirusScanResult(
|
||||
is_clean=True, scan_time_ms=0, file_size=len(content)
|
||||
)
|
||||
if len(content) == 0:
|
||||
logger.debug(f"Skipping virus scan for empty file {filename}")
|
||||
return VirusScanResult(is_clean=True, scan_time_ms=0, file_size=0)
|
||||
if len(content) > self.settings.max_scan_size:
|
||||
logger.warning(
|
||||
f"File {filename} ({len(content)} bytes) exceeds client max scan size ({self.settings.max_scan_size}); Stopping virus scan"
|
||||
@@ -123,7 +126,7 @@ class VirusScannerService:
|
||||
raise RuntimeError("ClamAV service is unreachable")
|
||||
|
||||
start = time.monotonic()
|
||||
chunk_size = len(content) # Start with full content length
|
||||
chunk_size = max(1, len(content)) # Start with full content length
|
||||
for retry in range(self.settings.max_retries):
|
||||
# For small files, don't check min_chunk_size limit
|
||||
if chunk_size < self.settings.min_chunk_size and chunk_size < len(content):
|
||||
@@ -212,5 +215,5 @@ async def scan_content_safe(content: bytes, *, filename: str = "unknown") -> Non
|
||||
except VirusDetectedError:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"Virus scanning failed for {filename}: {str(e)}")
|
||||
logger.warning(f"Virus scanning failed for {filename}: {str(e)}")
|
||||
raise VirusScanError(f"Virus scanning failed: {str(e)}") from e
|
||||
|
||||
@@ -85,7 +85,36 @@ class TestVirusScannerService:
|
||||
)
|
||||
assert result_dirty.is_clean is False
|
||||
|
||||
# Note: ping method was removed from current implementation
|
||||
@pytest.mark.asyncio
|
||||
async def test_scan_empty_file(self, scanner):
|
||||
"""Empty files (0 bytes) should be accepted without hitting ClamAV."""
|
||||
content = b""
|
||||
result = await scanner.scan_file(content, filename="empty.png")
|
||||
|
||||
assert result.is_clean is True
|
||||
assert result.threat_name is None
|
||||
assert result.file_size == 0
|
||||
assert result.scan_time_ms == 0
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_scan_single_byte_file(self, scanner):
|
||||
"""A 1-byte file should be scanned normally (regression: chunk_size must not be 0)."""
|
||||
|
||||
async def mock_instream(_):
|
||||
await asyncio.sleep(0.001)
|
||||
return None
|
||||
|
||||
mock_client = Mock()
|
||||
mock_client.ping = AsyncMock(return_value=True)
|
||||
mock_client.instream = AsyncMock(side_effect=mock_instream)
|
||||
scanner._client = mock_client
|
||||
|
||||
content = b"\x00"
|
||||
result = await scanner.scan_file(content, filename="tiny.bin")
|
||||
|
||||
assert result.is_clean is True
|
||||
assert result.file_size == 1
|
||||
assert result.scan_time_ms > 0
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_scan_clean_file(self, scanner):
|
||||
@@ -251,3 +280,27 @@ class TestHelperFunctions:
|
||||
|
||||
with pytest.raises(VirusScanError, match="Virus scanning failed"):
|
||||
await scan_content_safe(b"test content", filename="test.txt")
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_scan_content_safe_logs_warning_not_error_on_failure(self):
|
||||
"""Scan failures should log at WARNING level, not ERROR, to avoid paging on-call."""
|
||||
with patch("backend.util.virus_scanner.get_virus_scanner") as mock_get_scanner:
|
||||
mock_scanner = Mock()
|
||||
mock_scanner.scan_file = AsyncMock()
|
||||
mock_scanner.scan_file.side_effect = Exception(
|
||||
"range() arg 3 must not be zero"
|
||||
)
|
||||
mock_get_scanner.return_value = mock_scanner
|
||||
|
||||
with (
|
||||
pytest.raises(VirusScanError),
|
||||
patch("backend.util.virus_scanner.logger") as mock_logger,
|
||||
):
|
||||
await scan_content_safe(b"test", filename="screenshot.png")
|
||||
|
||||
mock_logger.warning.assert_called_once()
|
||||
# Check the formatted log message contains the error text.
|
||||
# Use str() to handle both f-string and %-style logging formats.
|
||||
log_msg = str(mock_logger.warning.call_args)
|
||||
assert "range()" in log_msg
|
||||
mock_logger.error.assert_not_called()
|
||||
|
||||
@@ -25,7 +25,11 @@ Sentry.init({
|
||||
// Suppress cross-origin stylesheet errors from Sentry Replay (rrweb)
|
||||
// serializing DOM snapshots with cross-origin stylesheets
|
||||
// (e.g., from browser extensions or CDN-loaded CSS)
|
||||
ignoreErrors: [/Not allowed to access cross-origin stylesheet/],
|
||||
ignoreErrors: [
|
||||
/Not allowed to access cross-origin stylesheet/,
|
||||
// Sentry SDK internal issue on some mobile browsers
|
||||
/Error invoking postEvent: Method not found/,
|
||||
],
|
||||
|
||||
// Add optional integrations for additional features
|
||||
integrations: [
|
||||
|
||||
Reference in New Issue
Block a user