fix(platform): fix prod Sentry errors and reduce on-call alert noise (#12560)

## Summary
Hotfix targeting master for production Sentry errors that are triggering
on-call pages. Fixes actual bugs and expands Sentry filters to suppress
user-caused errors that are not platform issues.

### Bug Fixes
- **Workspace race condition** (`get_or_create_workspace`): Replaced
Prisma's non-atomic `upsert` with a find-then-create pattern that catches
`UniqueViolationError` and re-reads the row. Prisma's upsert translates
to SELECT + INSERT (not PostgreSQL's native `INSERT ... ON CONFLICT`),
causing `UniqueViolationError` when concurrent requests arrive for the
same user (e.g. copilot + file upload simultaneously).
- **ChatSidebar crash**: Added null-safe `?.` for `sessions` which can
be `undefined` during error/loading states, preventing `TypeError:
Cannot read properties of undefined (reading 'length')`.
- **UsageLimits crash**: Added a null-safe guard (`usage?.daily`,
`usage?.weekly`) in `UsageLimits` and `CoPilotUsageSection` so they
render nothing when the API returns partial data. Previously an
`undefined` `usage.daily` or `usage.weekly` threw `TypeError: Cannot
read properties of undefined (reading 'limit')`.
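
The workspace race and its fix can be sketched without a database. This is a minimal illustration, not the production code: `FakeWorkspaceTable`, `naive_get_or_create`, `get_or_create`, and the local `UniqueViolationError` class are hypothetical stand-ins, and `asyncio.sleep(0)` is assumed to play the role of a database round trip where another request can interleave.

```python
import asyncio
from typing import List, Optional


class UniqueViolationError(Exception):
    """Hypothetical stand-in for prisma.errors.UniqueViolationError."""


class FakeWorkspaceTable:
    """In-memory table with a unique constraint on userId (illustration only)."""

    def __init__(self) -> None:
        self._rows = {}

    async def find_unique(self, user_id: str) -> Optional[dict]:
        await asyncio.sleep(0)  # yield control, as a real query would
        return self._rows.get(user_id)

    async def create(self, user_id: str) -> dict:
        await asyncio.sleep(0)
        if user_id in self._rows:
            raise UniqueViolationError(f"userId={user_id} already exists")
        self._rows[user_id] = {"userId": user_id}
        return self._rows[user_id]


async def naive_get_or_create(table: FakeWorkspaceTable, user_id: str) -> dict:
    # Non-atomic SELECT followed by INSERT: two callers can both see "no row".
    row = await table.find_unique(user_id)
    if row is not None:
        return row
    return await table.create(user_id)


async def get_or_create(table: FakeWorkspaceTable, user_id: str) -> dict:
    row = await table.find_unique(user_id)
    if row is not None:
        return row
    try:
        return await table.create(user_id)
    except UniqueViolationError:
        # A concurrent caller created the row first; read it back.
        row = await table.find_unique(user_id)
        if row is None:
            raise
        return row


async def run(fn, n: int) -> List[object]:
    table = FakeWorkspaceTable()
    calls = (fn(table, "user-1") for _ in range(n))
    return await asyncio.gather(*calls, return_exceptions=True)


naive_results = asyncio.run(run(naive_get_or_create, 3))
fixed_results = asyncio.run(run(get_or_create, 3))
```

In this sketch the naive version surfaces a `UniqueViolationError` for the callers that lose the race, while the find-then-create version returns the same row to every caller.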

### Sentry Filter Improvements
Expanded backend `_before_send` to stop user-caused errors from reaching
Sentry and triggering on-call alerts:
- **Consolidated auth keywords** into a shared `_USER_AUTH_KEYWORDS`
list used by both exception-based and log-based filters (previously
duplicated).
- **Added missing auth keywords**: `"invalid api key"`, `"unauthorized"`,
`"bad credentials"`, `"insufficient authentication scopes"`. Errors
containing these were leaking through.
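- **Added user integration HTTP error filter**: `"http 401 error"` and
`"http 403 error"`, which catch `BlockUnknownError` and
`HTTPClientError` from user integrations (e.g. expired GitHub tokens).
The keyword list in the diff has no `"http 404 error"` entry, so 404s
such as wrong Airtable IDs are filtered only when their message contains
another listed keyword.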
- **Fixed log-based event gap**: User auth errors logged via
`logger.error()` (not raised as exceptions) were bypassing the
`exc_info` filter. Now the same `_USER_AUTH_KEYWORDS` list is checked
against log messages too.
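
The two filter paths can be sketched as a standalone function. This is a simplified illustration, not the platform's `_before_send`: `USER_AUTH_KEYWORDS` is a subset of the real list, `before_send` is a hypothetical name, and the way the log message is read from the event (`logentry.message`, falling back to `message`) is an assumption, since that part is not shown in the diff.

```python
from typing import Optional

# Subset of the shared keyword list, for illustration only.
USER_AUTH_KEYWORDS = [
    "incorrect api key",
    "invalid x-api-key",
    "bad credentials",
    "unauthorized",
    "http 401 error",
]


def before_send(event: dict, hint: dict) -> Optional[dict]:
    """Return None to drop the event, or the event to keep it."""
    # Path 1: raised exceptions arrive with hint["exc_info"] = (type, value, tb).
    if "exc_info" in hint:
        _exc_type, exc_value, _tb = hint["exc_info"]
        exc_msg = str(exc_value).lower() if exc_value else ""
        if any(kw in exc_msg for kw in USER_AUTH_KEYWORDS):
            return None

    # Path 2: logger.error() calls carry no exc_info, so check the log text too.
    # Assumption: the message lives in logentry.message or the top-level message.
    log_msg = (event.get("logentry") or {}).get("message") or event.get("message")
    if event.get("logger") and log_msg:
        if any(kw in log_msg.lower() for kw in USER_AUTH_KEYWORDS):
            return None

    return event


raised = before_send(
    {}, {"exc_info": (ValueError, ValueError("HTTP 401 Error: Bad credentials"), None)}
)
logged = before_send(
    {"logger": "blocks", "logentry": {"message": "Incorrect API key provided"}}, {}
)
kept = before_send({"logger": "db", "message": "query timed out"}, {})
```

Sharing one keyword list between both paths means a new keyword only has to be added once to cover raised and logged errors alike.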

## On-Call Alerts Addressed

### Fixed (actual bugs)
| Alert | Issue | Root Cause |
|-------|-------|------------|
| `Unique constraint failed on the fields: (userId)` | [AUTOGPT-SERVER-8BM](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BM) | Prisma upsert race condition |
| `Unique constraint failed on the fields: (userId)` | [AUTOGPT-SERVER-8BK](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BK) | Same, via `/api/workspace/files/upload` |
| `Unique constraint failed on the fields: (userId)` | [AUTOGPT-SERVER-8BN](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BN) | Same, via `tools/call run_block` |
| `Upload failed (500): Unique constraint failed` | [BUILDER-7GA](https://significant-gravitas.sentry.io/issues/BUILDER-7GA) | Frontend surface of same workspace bug |
| `Cannot read properties of undefined (reading 'length')` | [BUILDER-7GD](https://significant-gravitas.sentry.io/issues/BUILDER-7GD) | `sessions` undefined in ChatSidebar |
| `Cannot read properties of undefined (reading 'limit')` | [BUILDER-7GB](https://significant-gravitas.sentry.io/issues/BUILDER-7GB) | `usage.daily` undefined in UsageLimits |

### Filtered (user-caused, not platform bugs)
| Alert | Issue | Why it's not a platform bug |
|-------|-------|-----------------------------|
| `Anthropic API error: invalid x-api-key` | [AUTOGPT-SERVER-8B6](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8B6), 8B7, 8B8 | User provided invalid Anthropic API key |
| `AI condition evaluation failed: Incorrect API key` | [AUTOGPT-SERVER-83Y](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-83Y) | User's OpenAI key is wrong (4.5K events, 1 user) |
| `GithubListIssuesBlock: HTTP 401 Bad credentials` | [AUTOGPT-SERVER-8BF](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BF) | User's GitHub token expired |
| `HTTPClientError: HTTP 401 Unauthorized` | [AUTOGPT-SERVER-8BG](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BG) | Same, credential check endpoint |
| `GithubReadIssueBlock: HTTP 401 Bad credentials` | [AUTOGPT-SERVER-8BH](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BH) | Same, different block |
| `AirtableCreateBaseBlock: HTTP 404 MODEL_ID_NOT_FOUND` | [AUTOGPT-SERVER-8BC](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-8BC) | User's Airtable model ID is wrong |

### Not addressed in this PR
| Alert | Issue | Reason |
|-------|-------|--------|
| `Unexpected token '<', "<html><hea"...` | [BUILDER-7GC](https://significant-gravitas.sentry.io/issues/BUILDER-7GC) | Transient: backend briefly returned an HTML error page |
| `undefined is not an object (activeResponse.state)` | [BUILDER-71J](https://significant-gravitas.sentry.io/issues/BUILDER-71J) | Bug in Vercel AI SDK `ai@6.0.59`, already resolved |
| `Last Tool Output is needed` | [AUTOGPT-SERVER-72T](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-72T) | User graph misconfiguration (1 user, 21 events) |
| `Cannot set property ethereum` | [BUILDER-7G6](https://significant-gravitas.sentry.io/issues/BUILDER-7G6) | Browser wallet extension conflict |
| `File already exists at path` | [BUILDER-7FS](https://significant-gravitas.sentry.io/issues/BUILDER-7FS) | Expected 409 conflict |

## Test plan
- [ ] Verify workspace creation works for new users
- [ ] Verify concurrent workspace access (e.g. copilot + file upload)
doesn't error
- [ ] Verify copilot ChatSidebar and UsageLimits load correctly when API
returns partial/error data
- [ ] Verify user auth errors (invalid API keys, expired tokens) no
longer appear in Sentry after deployment

Author: Zamil Majdy
Date: 2026-03-25 23:25:32 +07:00
Committed by: GitHub
Parent: 866563ad25
Commit: 85f0d8353a
5 changed files with 52 additions and 40 deletions


@@ -9,6 +9,7 @@ from datetime import datetime, timezone
 from typing import Optional
 import pydantic
+from prisma.errors import UniqueViolationError
 from prisma.models import UserWorkspace, UserWorkspaceFile
 from prisma.types import UserWorkspaceFileWhereInput
@@ -75,22 +76,23 @@ async def get_or_create_workspace(user_id: str) -> Workspace:
     """
     Get user's workspace, creating one if it doesn't exist.
-    Uses upsert to handle race conditions when multiple concurrent requests
-    attempt to create a workspace for the same user.
     Args:
         user_id: The user's ID
     Returns:
         Workspace instance
     """
-    workspace = await UserWorkspace.prisma().upsert(
-        where={"userId": user_id},
-        data={
-            "create": {"userId": user_id},
-            "update": {},  # No updates needed if exists
-        },
-    )
+    workspace = await UserWorkspace.prisma().find_unique(where={"userId": user_id})
+    if workspace:
+        return Workspace.from_db(workspace)
+    try:
+        workspace = await UserWorkspace.prisma().create(data={"userId": user_id})
+    except UniqueViolationError:
+        # Concurrent request already created it
+        workspace = await UserWorkspace.prisma().find_unique(where={"userId": user_id})
+        if workspace is None:
+            raise
     return Workspace.from_db(workspace)


@@ -21,6 +21,31 @@ class DiscordChannel(str, Enum):
     PRODUCT = "product"  # For product alerts (low balance, zero balance, etc.)
+_USER_AUTH_KEYWORDS = [
+    "incorrect api key",
+    "invalid x-api-key",
+    "invalid api key",
+    "missing authentication header",
+    "invalid api token",
+    "authentication_error",
+    "bad credentials",
+    "unauthorized",
+    "insufficient authentication scopes",
+    "http 401 error",
+    "http 403 error",
+]
+_AMQP_KEYWORDS = [
+    "amqpconnection",
+    "amqpconnector",
+    "connection_forced",
+    "channelinvalidstateerror",
+    "no active transport",
+]
+_AMQP_INDICATORS = ["aio_pika", "aiormq", "amqp", "pika", "rabbitmq"]
 def _before_send(event, hint):
     """Filter out expected/transient errors from Sentry to reduce noise."""
     if "exc_info" in hint:
@@ -28,36 +53,21 @@ def _before_send(event, hint):
         exc_msg = str(exc_value).lower() if exc_value else ""
         # AMQP/RabbitMQ transient connection errors — expected during deploys
-        amqp_keywords = [
-            "amqpconnection",
-            "amqpconnector",
-            "connection_forced",
-            "channelinvalidstateerror",
-            "no active transport",
-        ]
-        if any(kw in exc_msg for kw in amqp_keywords):
+        if any(kw in exc_msg for kw in _AMQP_KEYWORDS):
             return None
         # "connection refused" only for AMQP-related exceptions (not other services)
         if "connection refused" in exc_msg:
             exc_module = getattr(exc_type, "__module__", "") or ""
             exc_name = getattr(exc_type, "__name__", "") or ""
-            amqp_indicators = ["aio_pika", "aiormq", "amqp", "pika", "rabbitmq"]
             if any(
                 ind in exc_module.lower() or ind in exc_name.lower()
-                for ind in amqp_indicators
-            ) or any(kw in exc_msg for kw in ["amqp", "pika", "rabbitmq"]):
+                for ind in _AMQP_INDICATORS
+            ) or any(kw in exc_msg for kw in _AMQP_INDICATORS):
                 return None
-        # User-caused credential/auth errors — not platform bugs
-        user_auth_keywords = [
-            "incorrect api key",
-            "invalid x-api-key",
-            "missing authentication header",
-            "invalid api token",
-            "authentication_error",
-        ]
-        if any(kw in exc_msg for kw in user_auth_keywords):
+        # User-caused credential/auth/integration errors — not platform bugs
+        if any(kw in exc_msg for kw in _USER_AUTH_KEYWORDS):
             return None
         # Expected business logic — insufficient balance
@@ -93,18 +103,18 @@ def _before_send(event, hint):
     )
     if event.get("logger") and log_msg:
         msg = log_msg.lower()
-        noisy_patterns = [
+        noisy_log_patterns = [
             "amqpconnection",
             "connection_forced",
             "unclosed client session",
             "unclosed connector",
         ]
-        if any(p in msg for p in noisy_patterns):
+        if any(p in msg for p in noisy_log_patterns):
             return None
         # "connection refused" in logs only when AMQP-related context is present
-        if "connection refused" in msg and any(
-            ind in msg for ind in ("amqp", "pika", "rabbitmq", "aio_pika", "aiormq")
-        ):
+        if "connection refused" in msg and any(ind in msg for ind in _AMQP_INDICATORS):
             return None
+        # Same auth keywords — errors logged via logger.error() bypass exc_info
+        if any(kw in msg for kw in _USER_AUTH_KEYWORDS):
+            return None
     return event


@@ -290,12 +290,12 @@ export function ChatSidebar() {
   <div className="flex min-h-[30rem] items-center justify-center py-4">
     <LoadingSpinner size="small" className="text-neutral-600" />
   </div>
-) : sessions.length === 0 ? (
+) : !sessions?.length ? (
   <p className="py-4 text-center text-sm text-neutral-500">
     No conversations yet
   </p>
 ) : (
-  sessions.map((session) => (
+  sessions?.map((session) => (
     <div
       key={session.id}
       className={cn(


@@ -20,7 +20,7 @@ export function UsageLimits() {
     },
   });
-  if (isLoading || !usage) return null;
+  if (isLoading || !usage?.daily || !usage?.weekly) return null;
   if (usage.daily.limit <= 0 && usage.weekly.limit <= 0) return null;
   return (


@@ -34,7 +34,7 @@ function CoPilotUsageSection() {
     },
   });
-  if (isLoading || !usage) return null;
+  if (isLoading || !usage?.daily || !usage?.weekly) return null;
   if (usage.daily.limit <= 0 && usage.weekly.limit <= 0) return null;
   return (