Commit Graph

8176 Commits

Author SHA1 Message Date
Bentlybro
405bdb2808 fix(backend/llm-registry): enforce single recommended model in update_model
When is_recommended=True is set on a model, update_model first clears the
flag on all other models within the same transaction, so only one model
can be recommended at a time.
2026-04-13 15:59:23 +01:00
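A minimal sketch of that transaction, assuming the Prisma Python client; the model and field names here are illustrative, not the exact schema:

```python
from prisma import Prisma

async def set_recommended(db: Prisma, slug: str) -> None:
    async with db.tx() as tx:
        # Clear the flag on every other model first, inside the same
        # transaction, so at most one model is ever recommended.
        await tx.llmmodel.update_many(
            where={"slug": {"not": slug}},
            data={"isRecommended": False},
        )
        await tx.llmmodel.update(
            where={"slug": slug},
            data={"isRecommended": True},
        )
```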
Bentlybro
6fcd05ef61 chore: regenerate OpenAPI schema from current backend endpoints 2026-04-13 15:59:23 +01:00
Bentlybro
33c30c6990 fix: add trailing newline to openapi.json 2026-04-13 15:59:23 +01:00
Bentlybro
52d074d31f chore: regenerate OpenAPI schema for new migration endpoints 2026-04-13 15:59:23 +01:00
Bentlybro
93be8e5095 feat(backend): Add model migration system - usage tracking, safe delete, disable with migration, revert
- GET /llm/models/{slug}/usage - count AgentNodes using a model
- DELETE /llm/models/{slug} with optional replacement_model_slug for safe migration
- POST /llm/models/{slug}/toggle with migration support when disabling
- GET /llm/migrations - list model migrations (with include_reverted filter)
- POST /llm/migrations/{id}/revert - revert a migration (restores nodes, re-enables source model)
- Transactional migration: counts nodes, migrates atomically, creates LlmModelMigration audit record
- Ported from original PR #11699's db.py
2026-04-13 15:59:08 +01:00
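The transactional flow, sketched with assumed model and field names (the real queries are the ones ported from PR #11699's db.py):

```python
from prisma import Prisma

async def migrate_and_delete(db: Prisma, source: str, replacement: str) -> int:
    """Hypothetical shape of the safe-delete-with-migration path."""
    async with db.tx() as tx:
        # 1. Count AgentNodes still pointing at the source model.
        count = await tx.agentnode.count(where={"llmModelSlug": source})
        if count:
            # 2. Repoint every affected node atomically.
            await tx.agentnode.update_many(
                where={"llmModelSlug": source},
                data={"llmModelSlug": replacement},
            )
            # 3. Audit record so GET /llm/migrations can list it and
            #    POST /llm/migrations/{id}/revert can undo it.
            await tx.llmmodelmigration.create(
                data={
                    "sourceSlug": source,
                    "targetSlug": replacement,
                    "nodeCount": count,
                }
            )
        # 4. Remove the source model once nothing uses it.
        await tx.llmmodel.delete(where={"slug": source})
        return count
```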
Bentlybro
b0d9ef13e6 feat(backend): Add admin list endpoints, creator CRUD, model cost creation
- GET /llm/admin/providers - list all providers from DB (includes empty ones)
- GET /llm/admin/models - list all models with costs and creator info
- POST /llm/creators - create new creator
- PATCH /llm/creators/{name} - update creator
- DELETE /llm/creators/{name} - delete creator (with model check)
- Create LlmModelCost records when creating a model
- Resolve provider name to ID in create_model
- Add costs field to CreateLlmModelRequest
2026-04-13 15:59:08 +01:00
Bentlybro
d5a0ce2815 fix(backend): Use {slug:path} for model routes to support slugs with slashes
Model slugs like 'openai/gpt-oss-120b' contain forward slashes which
FastAPI's default {slug} parameter doesn't capture. Using {slug:path}
allows the full slug to be captured as a single parameter.
2026-04-13 15:59:08 +01:00
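The converter difference in a self-contained reproduction (handler body is illustrative):

```python
from fastapi import FastAPI

app = FastAPI()

# {slug} stops at the first "/", so /llm/models/openai/gpt-oss-120b
# would not match; {slug:path} greedily captures the remainder.
@app.get("/llm/models/{slug:path}")
async def get_model(slug: str):
    return {"slug": slug}  # slug == "openai/gpt-oss-120b"
```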
Bentlybro
3f6b1120f3 Add LLM creators endpoint and OpenAPI entry
Introduce a read endpoint for LLM model creators: add _map_creator_response serializer and an admin-only GET /llm/creators route that queries prisma.models.LlmModelCreator (ordered by name), logs results, and returns serialized creators with error handling. Also update frontend OpenAPI spec with the /api/llm/creators GET operation.
2026-04-13 15:59:08 +01:00
Bentlybro
77757a25a5 feat(platform): Implement LLM registry admin API functionality
Implement full CRUD operations for admin API:

Database layer (db_write.py):
- create_provider, update_provider, delete_provider
- create_model, update_model, delete_model
- refresh_runtime_caches - invalidates in-memory registry after mutations
- Proper validation and error handling

Admin routes (admin_routes.py):
- All endpoints now functional (no more 501)
- Proper error responses (400 for validation, 404 for not found, 500 for server errors)
- Lookup by slug/name before operations
- Cache refresh after all mutations

Features:
- Provider deletion blocked if models exist (FK constraint)
- All mutations refresh registry cache automatically
- Proper logging for audit trail
- Admin auth enforced on all endpoints

Based on original implementation from PR #11699 (upstream-llm branch).

Builds on:
- PR #12357: Schema foundation
- PR #12359: Registry core
- PR #12371: Public read API
2026-04-13 15:59:08 +01:00
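A hedged sketch of the mutate-then-refresh pattern these routes follow; refresh_runtime_caches is this commit's name, while the lookup and delete helpers are stand-ins:

```python
from fastapi import APIRouter, HTTPException

router = APIRouter(prefix="/llm", tags=["llm-admin"])

async def get_provider_by_name(name: str): ...   # stand-in for the db lookup
async def db_delete_provider(provider_id: str): ...  # stand-in for db_write.py
async def refresh_runtime_caches(): ...          # name from this commit

@router.delete("/providers/{name}")
async def delete_provider(name: str):
    provider = await get_provider_by_name(name)
    if provider is None:
        raise HTTPException(status_code=404, detail=f"Provider '{name}' not found")
    if provider.models:
        # FK constraint: deletion is blocked while models still exist.
        raise HTTPException(status_code=400, detail="Provider still has models")
    await db_delete_provider(provider.id)
    await refresh_runtime_caches()  # every mutation refreshes the registry cache
    return {"deleted": name}
```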
Bentlybro
e192695884 feat(platform): Add LLM registry admin API skeleton - Part 4 of 6
Add admin write API endpoints for LLM registry management:
- POST /api/llm/models - Create model
- PATCH /api/llm/models/{slug} - Update model
- DELETE /api/llm/models/{slug} - Delete model
- POST /api/llm/providers - Create provider
- PATCH /api/llm/providers/{name} - Update provider
- DELETE /api/llm/providers/{name} - Delete provider

All endpoints require admin authentication via requires_admin_user.

Request/response models defined in admin_model.py:
- CreateLlmModelRequest, UpdateLlmModelRequest
- CreateLlmProviderRequest, UpdateLlmProviderRequest

Implementation coming in follow-up commits (currently returns 501 Not Implemented).

This builds on:
- PR #12357: Schema foundation
- PR #12359: Registry core
- PR #12371: Public read API
2026-04-13 15:59:08 +01:00
Bentlybro
5b2d4595d1 refactor(backend/llm-public-api): extract _map_model helper, fix indentation, add total to providers response
- Extract _map_model() to eliminate ~25-line duplication between list_models and list_providers
- Fix misaligned is_recommended field in list_providers
- Remove duplicate tags=["llm"] from router definition
- Add total field to LlmProvidersResponse for consistency with LlmModelsResponse
- Tighten provider_map type annotation
2026-04-13 15:52:47 +01:00
Bentlybro
4e1774c939 test(backend/llm-public-api): public LLM models/providers endpoint tests
routes_test.py (new, 8 tests):
- GET /llm/models: enabled_only default, all, empty, creator, costs
- GET /llm/providers: single provider, multiple sorted, empty
2026-04-13 15:49:46 +01:00
Bentlybro
845ce6ae8d fix(backend): Include is_enabled in public model list response 2026-04-13 15:49:46 +01:00
Bentlybro
84f30775fd Add is_enabled field to OpenAPI model
Introduce a new boolean property `is_enabled` (default: true) into the OpenAPI schema in autogpt_platform/frontend/src/app/api/openapi.json next to `price_tier` and `is_recommended`. This exposes an enable/disable flag in the API model for consumers and defaults new entries to enabled.
2026-04-13 15:49:46 +01:00
Bentlybro
62065292ec Add is_enabled flag to LlmModel
Introduce an is_enabled: bool = True field to the LlmModel Pydantic model to allow toggling model availability. Defaulting to True preserves backward compatibility and avoids breaking changes; the flag can be used by APIs or UIs to filter or disable models without removing them.
2026-04-13 15:49:46 +01:00
Bentlybro
dff9b0f3b2 Add LLM models/providers endpoints to OpenAPI
Add two new GET endpoints to the OpenAPI spec: /api/llm/models (with optional enabled_only query param, JWT auth) and /api/llm/providers (JWT auth). These endpoints expose the in-memory LLM registry: list of models and grouped providers with their enabled models. Also add related component schemas (LlmModel, LlmModelCost, LlmModelCreator, LlmModelsResponse, LlmProvider, LlmProvidersResponse) describing model metadata, costs, creators and response shapes.
2026-04-13 15:49:46 +01:00
Bentlybro
fa47d898d1 fix: remove incorrectly placed openapi.json file 2026-04-13 15:49:46 +01:00
Bentlybro
67455f6a35 fix: regenerate OpenAPI schema after rebase 2026-04-13 15:49:46 +01:00
Bentlybro
cb8cf81be7 feat(platform): Add LLM registry public read API
Implements public GET endpoints for querying LLM models and providers - Part 3 of 6 in the incremental registry rollout.

**Endpoints:**
- GET /api/llm/models - List all models (filterable by enabled_only)
- GET /api/llm/providers - List providers with their models

**Design:**
- Uses in-memory registry from PR 2 (no DB queries)
- Fast reads from cache populated at startup
- Grouped by provider for easy UI rendering

**Response models:**
- LlmModel - model info with capabilities, costs, creator
- LlmProvider - provider with nested models
- LlmModelsResponse - list + total count
- LlmProvidersResponse - grouped by provider

**Authentication:**
- Requires user auth (requires_user dependency)
- Public within authenticated sessions

**Integration:**
- Registered in rest_api.py at /api prefix
- Tagged with v2 + llm for OpenAPI grouping

**What's NOT included (later PRs):**
- Admin write API (PR 4)
- Block integration (PR 5)
- Redis cache (PR 6)

Lines: ~180 total
Files: 4 (3 new, 1 modified)
Review time: < 10 minutes
2026-04-13 15:49:46 +01:00
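A plausible shape of the cache-only read path; requires_user and the registry getters are the names from these commits, stubbed here so the sketch stands alone:

```python
from fastapi import APIRouter, Depends

router = APIRouter(prefix="/llm", tags=["v2", "llm"])

def requires_user():  # stand-in for the real auth dependency
    ...

def get_all_models() -> list: return []      # stand-ins for the in-memory
def get_enabled_models() -> list: return []  # registry from PR 2
def _map_model(m): return m                  # response serializer

@router.get("/models")
async def list_models(enabled_only: bool = True, user=Depends(requires_user)):
    # Served entirely from the cache populated at startup; no DB queries.
    models = get_enabled_models() if enabled_only else get_all_models()
    return {"models": [_map_model(m) for m in models], "total": len(models)}
```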
Bentlybro
ef30c1ed76 fix(backend/llm-registry): guard subscription task lifecycle, top-level imports, tighten error match 2026-04-13 15:46:42 +01:00
Bentlybro
b5f63c13a4 test(backend/llm-registry): comprehensive registry + notifications tests
registry_test.py (+8 tests):
- clear_registry_cache, get_model (found/not found), get_all_models,
  get_enabled_models, get_all_model_slugs_for_validation
- refresh_llm_registry error re-raise

notifications_test.py (new, 9 tests):
- publish: happy path and Redis error swallowed
- subscribe: valid message triggers on_refresh, non-message types ignored,
  wrong channel ignored, None (timeout) handled, multiple messages,
  CancelledError stops loop, connection error triggers reconnect
2026-04-13 15:44:11 +01:00
Bentlybro
c5dfe3333d feat(backend/llm-registry): add Redis-backed cache and cross-process pub/sub sync
- Wrap DB fetch with @cached(shared_cache=True) so results are stored in
  Redis automatically — other workers skip the DB on warm cache
- Add notifications.py with publish/subscribe helpers using llm_registry:refresh
  pub/sub channel for cross-process invalidation
- clear_registry_cache() invalidates the shared Redis entry before a forced
  DB refresh (called by admin mutations)
- rest_api.py: start a background subscription task so every worker reloads
  its in-process cache when another worker refreshes the registry
2026-04-13 15:44:11 +01:00
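The cross-process invalidation, sketched directly against redis.asyncio (the real code routes through the platform's @cached helper and shared client; this assumes a client created with decode_responses=True):

```python
import redis.asyncio as redis

CHANNEL = "llm_registry:refresh"

async def publish_refresh(r: redis.Redis) -> None:
    try:
        await r.publish(CHANNEL, "refresh")
    except Exception:
        pass  # best-effort: publish errors are swallowed (per the tests)

async def subscribe_refresh(r: redis.Redis, on_refresh) -> None:
    pubsub = r.pubsub()
    await pubsub.subscribe(CHANNEL)
    while True:
        msg = await pubsub.get_message(ignore_subscribe_messages=True, timeout=1.0)
        if msg is None:          # timeout tick; keep listening
            continue
        if msg["type"] == "message" and msg["channel"] == CHANNEL:
            await on_refresh()   # reload this worker's in-process cache
```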
Bentlybro
696b273afc fix(registry): switch to Pydantic models, add typed capabilities, add unit tests
- Replace frozen dataclasses with Pydantic BaseModel(frozen=True) for true immutability
- Add typed boolean fields for model capabilities (supports_tools, etc.)
- Add comprehensive unit tests for registry module
- Addresses Majdyz review feedback on PR #12359
2026-04-13 15:44:11 +01:00
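The switch in miniature, with illustrative fields:

```python
from pydantic import BaseModel, ValidationError

class RegistryModel(BaseModel, frozen=True):
    slug: str
    supports_tools: bool = False  # typed capability flag

m = RegistryModel(slug="openai/gpt-oss-120b")
try:
    m.supports_tools = True       # mutation is rejected at runtime
except ValidationError as e:
    print(e.errors()[0]["type"])  # "frozen_instance"
```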
Bentlybro
732365cd8f fix(registry): address Majdyz review - extract helper, fix schema prefix, return copies, remove re-export 2026-04-13 15:44:10 +01:00
Bentlybro
7e85371ce5 style: fix trailing whitespace in registry.py 2026-04-13 15:43:27 +01:00
Bentlybro
3f964c8aba fix(startup): handle missing AgentNode table in migrate_llm_models
Tests fail with 'relation "platform.AgentNode" does not exist' because
migrate_llm_models() runs during startup and queries a table that doesn't
exist in fresh test databases.

This is an existing bug in the codebase - the function has no error handling.

Wrap the call in try/except to gracefully handle test environments where
the AgentNode table hasn't been created yet.
2026-04-13 15:43:27 +01:00
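The guard in spirit; migrate_llm_models is the existing startup function, the wrapper name is illustrative:

```python
import logging

logger = logging.getLogger(__name__)

async def run_llm_model_migration() -> None:
    # Best-effort: fresh test databases have no AgentNode table yet,
    # so a failed legacy migration must not abort startup.
    try:
        await migrate_llm_models()
    except Exception as exc:
        logger.warning("Skipping LLM model migration: %s", exc)
```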
Bentlybro
ad1f489c5c refactor: address CodeRabbit/Majdyz review feedback
- Fix ModelMetadata duplicate type collision by importing from blocks.llm
- Remove _json_to_dict helper, use dict() inline
- Add warning when Provider relation is missing (data corruption indicator)
- Optimize get_default_model_slug with next() (single sort pass)
- Optimize _build_schema_options to use list comprehension
- Move llm_registry import to top-level in rest_api.py
- Ensure max_output_tokens falls back to context_window when null

All critical and quick-win issues addressed.
2026-04-13 15:43:27 +01:00
Bentlybro
9c77a2207f fix: address Sentry/CodeRabbit critical and major issues
**CRITICAL FIX - ModelMetadata instantiation:**
- Removed non-existent 'supports_vision' argument
- Added required fields: display_name, provider_name, creator_name, price_tier
- Handle nullable DB fields (Creator, priceTier, maxOutputTokens) safely
- Fallback: creator_name='Unknown' if no Creator, price_tier=1 if invalid

**MAJOR FIX - Preserve pricing unit:**
- Added 'unit' field to RegistryModelCost dataclass
- Prevents RUN vs TOKENS ambiguity in cached costs
- Convert Prisma enum to string when building cost objects

**MAJOR FIX - Deterministic default model:**
- Sort recommended models by display_name before selection
- Prevents non-deterministic results when multiple models are recommended
- Ensures consistent default across refreshes

**STARTUP IMPROVEMENT:**
- Added comment: graceful fallback OK for now (no blocks use registry yet)
- Will be stricter in PR #5 when block integration lands
- Added success log message for registry refresh

Fixes identified by Sentry (critical TypeError) and CodeRabbit review.
2026-04-13 15:43:27 +01:00
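The deterministic selection in isolation (field names assumed):

```python
def get_default_model_slug(models) -> str | None:
    recommended = [m for m in models if m.is_recommended]
    if recommended:
        # min() is a single pass and, keyed on display_name, yields the
        # same model regardless of input order across refreshes.
        return min(recommended, key=lambda m: m.display_name).slug
    # Fall back to the first enabled model, if any.
    return next((m.slug for m in models if m.is_enabled), None)
```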
Bentlybro
081fa9f2db feat(platform): Add LLM registry core - DB layer + in-memory cache
Implements the registry core for dynamic LLM model management:

**DB Layer:**
- Fetch models with provider, costs, and creator relations
- Prisma query with includes for related data
- Convert DB records to typed dataclasses

**In-memory Cache:**
- Global dict for fast model lookups
- Atomic cache refresh with lock protection
- Schema options generation for UI dropdowns

**Public API:**
- get_model(slug) - lookup by slug
- get_all_models() - all models (including disabled)
- get_enabled_models() - enabled models only
- get_schema_options() - UI dropdown data
- get_default_model_slug() - recommended or first enabled
- refresh_llm_registry() - manual refresh trigger

**Integration:**
- Refresh at API startup (before block init)
- Graceful fallback if registry unavailable
- Enables blocks to consume registry data

**Models:**
- RegistryModel - full model with metadata
- RegistryModelCost - pricing configuration
- RegistryModelCreator - model creator info
- ModelMetadata - context window, capabilities

**Next PRs:**
- PR #3: Public read API (GET endpoints)
- PR #4: Admin write API (POST/PATCH/DELETE)
- PR #5: Block integration (update LLM block)
- PR #6: Redis cache (solve thundering herd)

Lines: ~230 (registry.py ~210, __init__.py ~30, model.py from draft)
Files: 4 (3 new, 1 modified)
2026-04-13 15:43:27 +01:00
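Skeleton of the cache layer; the public names match the list above, while the model type and DB fetch are stand-ins:

```python
import asyncio

class RegistryModel: ...                    # stand-in for the typed model

async def fetch_models_from_db() -> list[RegistryModel]:
    return []                               # stand-in for the Prisma fetch

_registry: dict[str, RegistryModel] = {}    # slug -> model
_lock = asyncio.Lock()

async def refresh_llm_registry() -> None:
    models = await fetch_models_from_db()
    async with _lock:   # atomic swap: readers never see a half-built map
        _registry.clear()
        _registry.update({m.slug: m for m in models})

def get_model(slug: str) -> RegistryModel | None:
    return _registry.get(slug)

def get_enabled_models() -> list[RegistryModel]:
    return [m for m in _registry.values() if m.is_enabled]
```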
Bentlybro
8b6dea2496 fix(backend): remove f-string from migrate_llm_models SQL query 2026-04-13 15:42:22 +01:00
Bentlybro
5e15213846 fix(schema): rename migration dirs to 14-digit Prisma timestamps, add onUpdate: Cascade
- Rename 20260310_ → 20260310120000_ (schema) and 20260310130000_ (seed)
- Add onUpdate: Cascade to LlmModelMigration FK relations
2026-04-04 20:47:49 +00:00
Bentlybro
f0cc4ae573 Seed LLM model creators and link models
Update the migration to seed LLM model creators and associate them with models: add INSERTs for LlmModelCreator, update the file header comment, introduce a creator_ids CTE, and extend the LlmModel INSERT to include creatorId (joining on creator name). Existing provider seeding and model cost logic remain unchanged; ON CONFLICT behavior is preserved.
2026-03-25 13:57:41 +00:00
Bently
e0282b00db Merge branch 'dev' into feat/llm-registry-schema 2026-03-23 13:43:15 +00:00
Bentlybro
9a9c36b806 Update LLM registry seeds and conflict clause
Add and rename model slugs and costs in the LLM registry seed migration (e.g. rename 'o3' -> 'o3-2025-04-16', add 'gpt-5.2-2025-12-11', Anthropic 'claude-opus-4-6'/'claude-sonnet-4-6', multiple Google Gemini and Mistralai OpenRouter entries, and other provider models). Also tighten the ON CONFLICT upsert semantics so conflicts are ignored only when "credentialId" IS NULL, preventing silent skips for credentialed entries. These changes seed new models and ensure correct conflict handling during migration.
2026-03-23 13:20:48 +00:00
Zamil Majdy
e86ac21c43 feat(platform): add workflow import from other tools (n8n, Make.com, Zapier) (#12440)
## Summary
- Enable one-click import of workflows from other platforms (n8n,
Make.com, Zapier, etc.) into AutoGPT via CoPilot
- **No backend endpoint** — import is entirely client-side: the dialog
reads the file or fetches the n8n template URL, uploads the JSON to the
workspace via `uploadFileDirect`, stores the file reference in
`sessionStorage`, and redirects to CoPilot with `autosubmit=true`
- CoPilot receives the workflow JSON as a proper file attachment and
uses the existing agent-generator pipeline to convert it
- Library dialog redesigned: 2 tabs — "AutoGPT agent" (upload exported
agent JSON) and "Another platform" (file upload + optional n8n URL)

## How it works
1. User uploads a workflow JSON (or pastes an n8n template URL)
2. Frontend fetches/reads the JSON and uploads it to the user's
workspace via the existing file upload API
3. User is redirected to `/copilot?source=import&autosubmit=true`
4. CoPilot picks up the file from `sessionStorage` and sends it as a
`FileUIPart` attachment with a prompt to recreate the workflow as an
AutoGPT agent

## Test plan
- [x] Manual test: import a real n8n workflow JSON via the dialog
- [x] Manual test: paste an n8n template URL and verify it fetches +
converts
- [x] Manual test: import Make.com / Zapier workflow export JSON
- [x] Repeated imports don't cause 409 conflicts (filenames use
`crypto.randomUUID()`)
- [x] E2E: Import dialog has 2 tabs (AutoGPT agent + Another platform)
- [x] E2E: n8n quick-start template buttons present
- [x] E2E: n8n URL input enables Import button on valid URL
- [x] E2E: Workspace upload API returns file_id
2026-03-23 13:03:02 +00:00
Bentlybro
d5381625cd Add timestamps to LLM registry seed inserts
Update migration.sql to include createdAt and updatedAt columns/values for LlmProvider, LlmModel, and LlmModelCost seed inserts. Uses CURRENT_TIMESTAMP for both timestamp fields and adjusts the INSERT SELECT ordering for models to match the added columns. This ensures the seed data satisfies schemas that require timestamps and provides consistent created/updated metadata.
2026-03-23 12:59:30 +00:00
Bentlybro
f6ae3d6593 Update migration.sql 2026-03-23 12:43:06 +00:00
Lluis Agusti
94224be841 Merge remote-tracking branch 'origin/master' into dev 2026-03-23 20:42:32 +08:00
Bentlybro
0fb1b854df add LLMs via migration 2026-03-23 12:37:29 +00:00
Otto
da4bdc7ab9 fix(backend+frontend): reduce Sentry noise from user-caused errors (#12513)
Requested by @majdyz

User-caused errors (no payment method, webhook agent invocation, missing
credentials, bad API keys) were hitting Sentry via `logger.exception()`
in the `ValueError` handler, creating noise that obscures real bugs.
Additionally, a frontend crash on the copilot page (BUILDER-71J) needed
fixing.

**Changes:**

**Backend — rest_api.py**
- Set `log_error=False` for the `ValueError` exception handler (line
278), consistent with how `FolderValidationError` and `NotFoundError`
are already handled. User-caused 400 errors no longer trigger
`logger.exception()` → Sentry.

**Backend — executor/manager.py**
- Downgrade `ExecutionManager` input validation skip errors from `error`
to `warning` level. Missing credentials is expected user behavior, not
an internal error.

**Backend — blocks/llm.py**
- Sanitize unpaired surrogates in LLM prompt content before sending to
provider APIs. Prevents `UnicodeEncodeError: surrogates not allowed`
when httpx encodes the JSON body (AUTOGPT-SERVER-8AX).

**Frontend — package.json**
- Upgrade `ai` SDK from `6.0.59` to `6.0.134` to fix BUILDER-71J
(`TypeError: undefined is not an object (evaluating
'this.activeResponse.state')` on /copilot page). This is a known issue
in the Vercel AI SDK fixed in later patch versions.

**Sentry issues addressed:**
- `No payment method found` (ValueError → 400)
- `This agent is triggered by an external event (webhook)` (ValueError →
400)
- `Node input updated with non-existent credentials` (ValueError → 400)
- `[ExecutionManager] Skip execution, input validation error: missing
input {credentials}`
- `UnicodeEncodeError: surrogates not allowed` (AUTOGPT-SERVER-8AX)
- `TypeError: activeResponse.state` (BUILDER-71J)

Resolves SECRT-2166

---
Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>

---------

Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>
2026-03-23 12:22:49 +00:00
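One way to strip unpaired surrogates before JSON-encoding prompt content (the exact helper in blocks/llm.py may differ):

```python
def sanitize_surrogates(text: str) -> str:
    # Unpaired surrogates (e.g. "\ud800") make str.encode("utf-8") raise
    # UnicodeEncodeError; round-tripping with errors="replace" swaps them
    # for "?" so httpx can encode the request body.
    return text.encode("utf-8", errors="replace").decode("utf-8")

print(sanitize_surrogates("ok\ud800"))  # -> "ok?"
```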
Zamil Majdy
7176cecf25 perf(copilot): reduce tool schema token cost by 34% (#12398)
## Summary

Reduce CoPilot per-turn token overhead by systematically trimming tool
descriptions, parameter schemas, and system prompt content. All 35 MCP
tool schemas are passed on every SDK call — this PR reduces their size.

### Strategy

1. **Tool descriptions**: Trimmed verbose multi-sentence explanations to
concise single-sentence summaries while preserving meaning
2. **Parameter schemas**: Shortened parameter descriptions to essential
info, removed some `default` values (handled in code)
3. **System prompt**: Condensed `_SHARED_TOOL_NOTES` and storage
supplement template in `prompting.py`
4. **Cross-tool references**: Removed duplicate workflow hints (e.g.
"call find_block before run_block" appeared in BOTH tools — kept only in
the dependent tool). Critical cross-tool references retained (e.g.
`continue_run_block` in `run_block`, `fix_agent_graph` in
`validate_agent`, `get_doc_page` in `search_docs`, `web_fetch`
preference in `browser_navigate`)

### Token Impact

| Metric | Before | After | Reduction |
|--------|--------|-------|-----------|
| System Prompt | ~865 tokens | ~497 tokens | 43% |
| Tool Schemas | ~9,744 tokens | ~6,470 tokens | 34% |
| **Grand Total** | **~10,609 tokens** | **~6,967 tokens** | **34%** |

Saves **~3,642 tokens per conversation turn**.

### Key Decisions

- **Mostly description changes**: Tool logic, parameters, and types
unchanged. However, some schema-level `default` fields were removed
(e.g. `save` in `customize_agent`) — these are machine-readable
metadata, not just prose, and may affect LLM behavior.
- **Quality preserved**: All descriptions still convey what the tool
does and essential usage patterns
- **Cross-references trimmed carefully**: Kept prerequisite hints in the
dependent tool (run_block mentions find_block) but removed the reverse
(find_block no longer mentions run_block). Critical cross-tool guidance
retained where removal would degrade model behavior.
- **`run_time` description fixed**: Added missing supported values
(today, last 30 days, ISO datetime) per review feedback

### Future Optimization

The SDK passes all 35 tools on every call. The MCP protocol's
`list_tools()` handler supports dynamic tool registration — a follow-up
PR could implement lazy tool loading (register core tools + a discovery
meta-tool) to further reduce per-turn token cost.

### Changes

- Trimmed descriptions across 25 tool files
- Condensed `_SHARED_TOOL_NOTES` and `_build_storage_supplement` in
`prompting.py`
- Fixed `run_time` schema description in `agent_output.py`

### Checklist

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] All 273 copilot tests pass locally
  - [x] All 35 tools load and produce valid schemas
  - [x] Before/after token dumps compared
  - [x] Formatting passes (`poetry run format`)
  - [x] CI green
2026-03-23 08:27:24 +00:00
Zamil Majdy
f35210761c feat(devops): add /pr-test skill + subscription mode auto-provisioning (#12507)
## Summary
- Adds `/pr-test` skill for automated E2E testing of PRs using docker
compose, agent-browser, and API calls
- Covers full environment setup (copy .env, configure copilot auth,
ARM64 Docker fix)
- Includes browser UI testing, direct API testing, screenshot capture,
and test report generation
- Has `--fix` mode for auto-fixing bugs found during testing (similar to
`/pr-address`)
- **Screenshot uploads use GitHub Git API** (blobs → tree → commit →
ref) — no local git operations, safe for worktrees
- **Subscription mode improvements:**
- Extract subscription auth logic to `sdk/subscription.py` — uses SDK's
bundled CLI binary instead of requiring `npm install -g
@anthropic-ai/claude-code`
- Auto-provision `~/.claude/.credentials.json` from
`CLAUDE_CODE_OAUTH_TOKEN` env var on container startup — no `claude
login` needed in Docker
- Add `scripts/refresh_claude_token.sh` — cross-platform helper
(macOS/Linux/Windows) to extract OAuth tokens from host and update
`backend/.env`

## Test plan
- [x] Validated skill on multiple PRs (#12482, #12483, #12499, #12500,
#12501, #12440, #12472) — all test scenarios passed
- [x] Confirmed screenshot upload via GitHub Git API renders correctly
on all 7 PRs
- [x] Verified subscription mode E2E in Docker:
`refresh_claude_token.sh` → `docker compose up` → copilot chat responds
correctly with no API keys (pure OAuth subscription)
- [x] Verified auto-provisioning of credentials file inside container
from `CLAUDE_CODE_OAUTH_TOKEN` env var
- [x] Confirmed bundled CLI detection
(`claude_agent_sdk._bundled/claude`) works without system-installed
`claude`
- [x] `poetry run pytest backend/copilot/sdk/service_test.py` — 24/24
tests pass
2026-03-23 15:29:00 +07:00
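The no-local-git upload path, sketched against GitHub's public Git Data API (blob -> tree -> commit -> ref); OWNER, REPO, BRANCH, and the token are placeholders:

```python
import base64
import requests

API = "https://api.github.com/repos/OWNER/REPO/git"
HEADERS = {"Authorization": "Bearer <token>",
           "Accept": "application/vnd.github+json"}

def push_screenshot(path: str, content: bytes, parent_sha: str, base_tree: str):
    # 1. Upload the file content as a blob.
    blob = requests.post(f"{API}/blobs", headers=HEADERS, json={
        "content": base64.b64encode(content).decode(),
        "encoding": "base64",
    }).json()["sha"]
    # 2. Build a tree that adds the blob at the target path.
    tree = requests.post(f"{API}/trees", headers=HEADERS, json={
        "base_tree": base_tree,
        "tree": [{"path": path, "mode": "100644", "type": "blob", "sha": blob}],
    }).json()["sha"]
    # 3. Create a commit pointing at that tree.
    commit = requests.post(f"{API}/commits", headers=HEADERS, json={
        "message": f"test: add {path}",
        "tree": tree,
        "parents": [parent_sha],
    }).json()["sha"]
    # 4. Fast-forward the branch ref -- no local git involved.
    requests.patch(f"{API}/refs/heads/BRANCH", headers=HEADERS,
                   json={"sha": commit})
```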
Zamil Majdy
1ebcf85669 fix(platform): resolve 5 production Sentry alerts (#12496)
## Summary

Fixes 5 high-priority Sentry alerts from production:

- **AUTOGPT-SERVER-8AM**: Fix `TypeError: TypedDict does not support
instance and class checks` — `_value_satisfies_type` in `type.py` now
handles TypedDict classes that don't support `isinstance()` checks
- **AUTOGPT-SERVER-8AN**: Fix `ValueError: No payment method found`
triggering Sentry error — catch the expected ValueError in the
auto-top-up endpoint and return HTTP 422 instead
- **BUILDER-7F5**: Fix `Upload failed (409): File already exists` — add
`overwrite` query param to workspace upload endpoint and set it to
`true` from the frontend direct-upload
- **BUILDER-7F0**: Fix `LaTeX-incompatible input` KaTeX warnings
flooding Sentry — set `strict: false` on rehype-katex plugin to suppress
warnings for unrecognized Unicode characters
- **AUTOGPT-SERVER-89N**: Fix `Tool execution with manager failed:
validation error for dict[str,list[any]]` — make RPC return type
validation resilient (log warning instead of crash) and downgrade
SmartDecisionMaker tool execution errors to warnings

## Test plan
- [ ] Verify TypedDict type coercion works for
GithubMultiFileCommitBlock inputs
- [ ] Verify auto-top-up without payment method returns 422, not 500
- [ ] Verify file re-upload in copilot succeeds (overwrites instead of
409)
- [ ] Verify LaTeX rendering with Unicode characters doesn't produce
console warnings
- [ ] Verify SmartDecisionMaker tool execution failures are logged at
warning level
2026-03-23 08:05:08 +00:00
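The TypedDict guard in isolation; the helper name mirrors the commit, the validation body is illustrative:

```python
from typing import TypedDict, is_typeddict

class Payload(TypedDict):
    name: str

def value_satisfies_type(value, expected) -> bool:
    if is_typeddict(expected):
        # isinstance(value, Payload) raises "TypedDict does not support
        # instance and class checks"; TypedDicts are plain dicts at
        # runtime, so check the shape instead.
        return isinstance(value, dict)
    return isinstance(value, expected)

print(value_satisfies_type({"name": "x"}, Payload))  # True, no TypeError
```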
Otto
ab7c38bda7 fix(frontend): detect closed OAuth popup and allow dismissing waiting modal (#12443)
Requested by @kcze

When a user closes the OAuth sign-in popup without completing
authentication, the 'Waiting on sign-in process' modal was stuck open
with no way to dismiss it, forcing a page refresh.

Two bugs caused this:

1. `oauth-popup.ts` had no detection for the popup being closed by the
user. The promise would hang until the 5-minute timeout.

2. The modal's cancel button aborted a disconnected `AbortController`
instead of the actual OAuth flow's abort function, so clicking
cancel/close did nothing.

### Changes

- Add `popup.closed` polling (500ms) in `openOAuthPopup()` that rejects
the promise when the user closes the auth window
- Add reject-on-abort so the cancel button properly terminates the flow
- Replace the disconnected `oAuthPopupController` with a direct
`cancelOAuthFlow()` function that calls the real abort ref
- Handle popup-closed and user-canceled as silent cancellations (no
error toast)

### Testing

Tested manually 
- [x] Start OAuth flow → close popup window → modal dismisses
automatically 
- [x] Start OAuth flow → click cancel on modal → popup closes, modal
dismisses 
- [x] Complete OAuth flow normally → works as before 

Resolves SECRT-2054

---
Co-authored-by: Krzysztof Czerwinski (@kcze)
<krzysztof.czerwinski@agpt.co>

---------

Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 14:41:09 +00:00

Ubbe
0f67e45d05 hotfix(marketplace): adjust card height overflow (#12497)
## Summary

### Before

<img width="500" height="501" alt="Screenshot 2026-03-20 at 21 50 31"
src="https://github.com/user-attachments/assets/6154cffb-6772-4c3d-a703-527c8ca0daff"
/>

### After

<img width="500" height="581" alt="Screenshot 2026-03-20 at 21 33 12"
src="https://github.com/user-attachments/assets/2f9bd69d-30c5-4d06-ad1e-ed76b184afe5"
/>

### Other minor fixes

- minor spacing adjustments in creator/search pages when empty and
between sections


### Summary

- Increase StoreCard height from 25rem to 26.5rem to prevent content
overflow
- Replace manual tooltip-based title truncation with `OverflowText`
component in StoreCard
- Adjust carousel indicator positioning and hide it on md+ when exactly
3 featured agents are shown

## Test plan
- [x] Verify marketplace cards display without text overflow
- [x] Verify featured section carousel indicators behave correctly
- [x] Check responsive behavior at common breakpoints

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 22:03:28 +08:00
Ubbe
b9ce37600e refactor(frontend/marketplace): move download below Add to library with contextual text (#12486)
## Summary

<img width="1487" height="670" alt="Screenshot 2026-03-20 at 00 52 58"
src="https://github.com/user-attachments/assets/f09de2a0-3c5b-4bce-b6f4-8a853f6792cf"
/>


- Move the download button from inline next to "Add to library" to a
separate line below it
- Add contextual text: "Want to use this agent locally? Download here"
- Style the "Download here" as a violet ghost button link with the
download icon

## Test plan
- [ ] Visit a marketplace agent page
- [ ] Verify "Add to library" button renders in its row
- [ ] Verify "Want to use this agent locally? Download here" appears
below it
- [ ] Click "Download here" and confirm the agent downloads correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 13:13:59 +00:00
Otto
3921deaef1 fix(frontend): truncate marketplace card description to 2 lines (#12494)
Reduces `line-clamp` from 3 to 2 on the marketplace `StoreCard`
description to prevent text from overlapping with the
absolutely-positioned run count and +Add button at the bottom of the
card.

Resolves SECRT-2156.

---
Co-authored-by: Abhimanyu Yadav (@Abhi1992002)
<122007096+Abhi1992002@users.noreply.github.com>
2026-03-20 09:10:21 +00:00
Nicholas Tindle
f01f668674 fix(backend): support Responses API in SmartDecisionMakerBlock (#12489)
## Summary

- Fixes SmartDecisionMakerBlock conversation management to work with
OpenAI's Responses API, which was introduced in #12099 (commit 1240f38)
- The migration to `responses.create` updated the outbound LLM call but
missed the conversation history serialization — the `raw_response` is
now the entire `Response` object (not a `ChatCompletionMessage`), and
tool calls/results use `function_call` / `function_call_output` types
instead of role-based messages
- This caused a 400 error on the second LLM call in agent mode:
`"Invalid value: ''. Supported values are: 'assistant', 'system',
'developer', and 'user'."`

### Changes

**`smart_decision_maker.py`** — 6 functions updated:
| Function | Fix |
|---|---|
| `_convert_raw_response_to_dict` | Detects Responses API `Response`
objects, extracts output items as a list |
| `_get_tool_requests` | Recognizes `type: "function_call"` items |
| `_get_tool_responses` | Recognizes `type: "function_call_output"`
items |
| `_create_tool_response` | New `responses_api` kwarg produces
`function_call_output` format |
| `_update_conversation` | Handles list return from
`_convert_raw_response_to_dict` |
| Non-agent mode path | Same list handling for traditional execution |

**`test_smart_decision_maker_responses_api.py`** — 61 tests covering:
- Every branch of all 6 affected helper functions
- Chat Completions, Anthropic, and Responses API formats
- End-to-end agent mode and traditional mode conversation validity

## Test plan

- [x] 61 new unit tests all pass
- [x] 11 existing SmartDecisionMakerBlock tests still pass (no
regressions)
- [x] All pre-commit hooks pass (ruff, black, isort, pyright)
- [ ] CI integration tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---

> [!NOTE]
> **Medium Risk**
> Updates core LLM invocation and agent conversation/tool-call
bookkeeping to match OpenAI’s Responses API, which can affect tool
execution loops and prompt serialization across providers. Risk is
mitigated by extensive new unit tests, but regressions could surface in
production agent-mode flows or token/usage accounting.
> 
> **Overview**
> **Migrates OpenAI calls from Chat Completions to the Responses API
end-to-end**, including tool schema conversion, output parsing,
reasoning/text extraction, and updated token usage fields in
`LLMResponse`.
> 
> **Fixes SmartDecisionMakerBlock conversation/tool handling for
Responses API** by treating `raw_response` as a Response object
(splitting it into `output` items for replay), recognizing
`function_call`/`function_call_output` entries, and emitting tool
outputs in the correct Responses format to prevent invalid follow-up
prompts.
> 
> Also adjusts prompt compaction/token estimation to understand
Responses API tool items, changes
`get_execution_outputs_by_node_exec_id` to return list-valued
`CompletedBlockOutput`, removes `gpt-3.5-turbo` from model/cost/docs
lists, and adds focused unit tests plus a lightweight `conftest.py` to
run these tests without the full server stack.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
ff292efd3d. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Otto <otto@agpt.co>
Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com>
Tag: autogpt-platform-beta-v0.6.52
2026-03-20 03:23:52 +00:00
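The shape difference the fix handles, reduced to plain dicts (illustrative items, not the SDK's actual types):

```python
# Old Chat Completions shape: one role-based assistant message.
chat_message = {"role": "assistant", "content": "done", "tool_calls": []}

# Responses API shape: raw_response.output is a list of typed items.
responses_output = [
    {"type": "function_call", "call_id": "c1",
     "name": "run_block", "arguments": '{"block_id": "abc"}'},
]

def get_tool_requests(items: list[dict]) -> list[dict]:
    # _get_tool_requests now also recognizes "function_call" items.
    return [i for i in items if i.get("type") == "function_call"]

# Tool results are replayed as "function_call_output" items rather than
# role-based "tool" messages, which is what broke the second LLM call.
tool_result = {"type": "function_call_output", "call_id": "c1", "output": "ok"}
```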
Otto
f7a3491f91 docs(platform): add TDD guidance to CLAUDE.md files (#12491)
Requested by @majdyz

Adds TDD (test-driven development) guidance to CLAUDE.md files so Claude
Code follows a test-first workflow when fixing bugs or adding features.

**Changes:**
- **Parent `CLAUDE.md`**: Cross-cutting TDD workflow — write a failing
`xfail` test, implement the fix, remove the marker
- **Backend `CLAUDE.md`**: Concrete pytest example with
`@pytest.mark.xfail` pattern
- **Frontend `CLAUDE.md`**: Note about using Playwright `.fixme`
annotation for bug-fix tests

The workflow is: write a failing test first → confirm it fails for the
right reason → implement → confirm it passes. This ensures every bug fix
is covered by a test that would have caught the regression.

---
Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>
2026-03-20 02:13:16 +00:00
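The backend pattern, roughly (parse_slug is a placeholder function):

```python
import pytest

# Step 1: reproduce the bug as a strict xfail, and confirm it fails
# for the right reason before touching the implementation.
@pytest.mark.xfail(reason="bug: slug with '/' is truncated", strict=True)
def test_model_slug_keeps_provider_prefix():
    assert parse_slug("openai/gpt-oss-120b") == ("openai", "gpt-oss-120b")

# Step 2: implement the fix.
# Step 3: remove the marker; with strict=True, an unexpected pass
# (XPASS) fails the suite, so a stale marker cannot linger silently.
```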
Nicholas Tindle
cbff3b53d3 Revert "feat(backend): migrate OpenAI provider to Responses API" (#12490)
Reverts Significant-Gravitas/AutoGPT#12099

---

> [!NOTE]
> **Medium Risk**
> Reverts the OpenAI integration in `llm_call` from the Responses API
back to `chat.completions`, which can change tool-calling, JSON-mode
behavior, and token accounting across core AI blocks. The change is
localized but touches the primary LLM execution path and associated
tests/docs.
> 
> **Overview**
> Reverts the OpenAI path in `backend/blocks/llm.py` from the Responses
API back to `chat.completions`, including updating JSON-mode
(`response_format`), tool handling, and usage extraction to match the
Chat Completions response shape.
> 
> Removes the now-unused `backend/util/openai_responses.py` helpers and
their unit tests, updates LLM tests to mock `chat.completions.create`,
and adds `gpt-3.5-turbo` to the supported model list, cost config, and
LLM docs.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
7d6226d10e. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
2026-03-20 01:51:56 +00:00