feat(backend/llm-registry): wire refresh_runtime_caches to Redis invalidation and pub/sub

After any admin DB mutation, clear the shared Redis cache, refresh this process's in-memory state, then publish a notification so all other workers reload from Redis without hitting the database.
fix(backend/llm-registry): enforce single recommended model in update_model
2026-04-08 03:00:28 -04:00 · 2026-04-07 18:35:41 +01:00 · 2026-04-07 18:35:08 +01:00 · 2026-04-07 18:35:08 +01:00 · 2026-04-07 18:35:08 +01:00 · 2026-04-07 18:35:08 +01:00
1022 changed files with 116934 additions and 36667 deletions
--- a/.claude/skills/pr-address/SKILL.md
+++ b/.claude/skills/pr-address/SKILL.md
@@ -0,0 +1,200 @@
+---
+name: pr-address
+description: Address PR review comments and loop until CI green and all comments resolved. TRIGGER when user asks to address comments, fix PR feedback, respond to reviewers, or babysit/monitor a PR.
+user-invocable: true
+argument-hint: "[PR number or URL] — if omitted, finds PR for current branch."
+metadata:
+  author: autogpt-team
+  version: "1.0.0"
+---
+
+# PR Address
+
+## Find the PR
+
+```bash
+gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT
+gh pr view {N}
+```
+
+## Fetch comments (all sources)
+
+### 1. Inline review threads — GraphQL (primary source of actionable items)
+
+Use GraphQL to fetch inline threads. It natively exposes `isResolved`, returns threads already grouped with all replies, and paginates via cursor — no manual thread reconstruction needed.
+
+```bash
+gh api graphql -f query='
+{
+  repository(owner: "Significant-Gravitas", name: "AutoGPT") {
+    pullRequest(number: {N}) {
+      reviewThreads(first: 100) {
+        pageInfo { hasNextPage endCursor }
+        nodes {
+          id
+          isResolved
+          path
+          comments(last: 1) {
+            nodes { databaseId body author { login } createdAt }
+          }
+        }
+      }
+    }
+  }
+}'
+```
+
+If `pageInfo.hasNextPage` is true, fetch subsequent pages by adding `after: "<endCursor>"` to `reviewThreads(first: 100, after: "...")` and repeat until `hasNextPage` is false.
+
+**Filter to unresolved threads only** — skip any thread where `isResolved: true`. `comments(last: 1)` returns the most recent comment in the thread — act on that; it reflects the reviewer's final ask. Use the thread `id` (Relay global ID) to track threads across polls.
+
+### 2. Top-level reviews — REST (MUST paginate)
+
+```bash
+gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate
+```
+
+**CRITICAL — always `--paginate`.** Reviews default to 30 per page. PRs can have 80–170+ reviews (mostly empty resolution events). Without pagination you miss reviews past position 30 — including `autogpt-reviewer`'s structured review which is typically posted after several CI runs and sits well beyond the first page.
+
+Two things to extract:
+- **Overall state**: look for `CHANGES_REQUESTED` or `APPROVED` reviews.
+- **Actionable feedback**: non-empty bodies only. Empty-body reviews are thread-resolution events — they indicate progress but have no feedback to act on.
+
+**Where each reviewer posts:**
+- `autogpt-reviewer` — posts detailed structured reviews ("Blockers", "Should Fix", "Nice to Have") as **top-level reviews**. Not present on every PR. Address ALL items.
+- `sentry[bot]` — posts bug predictions as **inline threads**. Fix real bugs, explain false positives.
+- `coderabbitai[bot]` — posts summaries as **top-level reviews** AND actionable items as **inline threads**. Address actionable items.
+- Human reviewers — can post in any source. Address ALL non-empty feedback.
+
+### 3. PR conversation comments — REST
+
+```bash
+gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments --paginate
+```
+
+Mostly contains: bot summaries (`coderabbitai[bot]`), CI/conflict detection (`github-actions[bot]`), and author status updates. Scan for non-empty messages from non-bot human reviewers that aren't the PR author — those are the ones that need a response.
+
+## For each unaddressed comment
+
+Address comments **one at a time**: fix → commit → push → inline reply → next.
+
+1. Read the referenced code, make the fix (or reply explaining why it's not needed)
+2. Commit and push the fix
+3. Reply **inline** (not as a new top-level comment) referencing the fixing commit — this is what resolves the conversation for bot reviewers (coderabbitai, sentry):
+
+| Comment type | How to reply |
+|---|---|
+| Inline review (`pulls/{N}/comments`) | `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID}/replies -f body="🤖 Fixed in <commit-sha>: <description>"` |
+| Conversation (`issues/{N}/comments`) | `gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments -f body="🤖 Fixed in <commit-sha>: <description>"` |
+
+## Format and commit
+
+After fixing, format the changed code:
+
+- **Backend** (from `autogpt_platform/backend/`): `poetry run format`
+- **Frontend** (from `autogpt_platform/frontend/`): `pnpm format && pnpm lint && pnpm types`
+
+If API routes changed, regenerate the frontend client:
+```bash
+cd autogpt_platform/backend && poetry run rest &
+REST_PID=$!
+trap "kill $REST_PID 2>/dev/null" EXIT
+WAIT=0; until curl -sf http://localhost:8006/health > /dev/null 2>&1; do sleep 1; WAIT=$((WAIT+1)); [ $WAIT -ge 60 ] && echo "Timed out" && exit 1; done
+cd ../frontend && pnpm generate:api:force
+kill $REST_PID 2>/dev/null; trap - EXIT
+```
+Never manually edit files in `src/app/api/__generated__/`.
+
+Then commit and **push immediately** — never batch commits without pushing.
+
+For backend commits in worktrees: `poetry run git commit` (pre-commit hooks).
+
+## The loop
+
+```text
+address comments → format → commit → push
+→ wait for CI (while addressing new comments) → fix failures → push
+→ re-check comments after CI settles
+→ repeat until: all comments addressed AND CI green AND no new comments arriving
+```
+
+### Polling for CI + new comments
+
+After pushing, poll for **both** CI status and new comments in a single loop.
+
+> **Note:** Do not use `gh pr checks --watch` (with or without `--fail-fast`). It blocks the entire Bash tool call, so the agent cannot check for or address new comments until CI fully completes. Always poll manually instead.
+
+**Polling loop — repeat every 30 seconds:**
+
+1. Check CI status:
+```bash
+gh pr checks {N} --repo Significant-Gravitas/AutoGPT --json bucket,name,link
+```
+   Parse the results: if every check has `bucket` of `"pass"` or `"skipping"`, CI is green. If any has `"fail"`, CI has failed. Otherwise CI is still pending.
+
+2. Check for merge conflicts:
+```bash
+gh pr view {N} --repo Significant-Gravitas/AutoGPT --json mergeable --jq '.mergeable'
+```
+   If the result is `"CONFLICTING"`, the PR has a merge conflict — see "Resolving merge conflicts" below. If `"UNKNOWN"`, GitHub is still computing mergeability — wait and re-check next poll.
+
+3. Check for new/changed comments (all three sources):
+
+   **Inline threads** — re-run the GraphQL query from "Fetch comments". For each unresolved thread, record `{thread_id, last_comment_databaseId}` as your baseline. On each poll, action is needed if:
+   - A new thread `id` appears that wasn't in the baseline (new thread), OR
+   - An existing thread's `last_comment_databaseId` has changed (new reply on existing thread)
+
+   **Conversation comments:**
+   ```bash
+   gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments --paginate
+   ```
+   Compare total count and newest `id` against baseline. Filter to non-empty, non-bot, non-author-update messages.
+
+   **Top-level reviews:**
+   ```bash
+   gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate
+   ```
+   Watch for new non-empty reviews (`CHANGES_REQUESTED` or `COMMENTED` with body). Compare total count and newest `id` against baseline.
+
+4. **React in this precedence order (first match wins):**
+
+| What happened | Action |
+|---|---|
+| Merge conflict detected | See "Resolving merge conflicts" below. |
+| Mergeability is `UNKNOWN` | GitHub is still computing mergeability. Sleep 30 seconds, then restart polling from the top. |
+| New comments detected | Address them (fix → commit → push → reply). After pushing, re-fetch all comments to update your baseline, then restart this polling loop from the top (new commits invalidate CI status). |
+| CI failed (bucket == "fail") | Get failed check links: `gh pr checks {N} --repo Significant-Gravitas/AutoGPT --json bucket,link --jq '.[] \| select(.bucket == "fail") \| .link'`. Extract run ID from link (format: `.../actions/runs/<run-id>/job/...`), read logs with `gh run view <run-id> --repo Significant-Gravitas/AutoGPT --log-failed`. Fix → commit → push → restart polling. |
+| CI green + no new comments | **Do not exit immediately.** Bots (coderabbitai, sentry) often post reviews shortly after CI settles. Continue polling for **2 more cycles (60s)** after CI goes green. Only exit after 2 consecutive green+quiet polls. |
+| CI pending + no new comments | Sleep 30 seconds, then poll again. |
+
+**The loop ends when:** CI fully green + all comments addressed + **2 consecutive polls with no new comments after CI settled.**
+
+### Resolving merge conflicts
+
+1. Identify the PR's target branch and remote:
+```bash
+gh pr view {N} --repo Significant-Gravitas/AutoGPT --json baseRefName --jq '.baseRefName'
+git remote -v   # find the remote pointing to Significant-Gravitas/AutoGPT (typically 'upstream' in forks, 'origin' for direct contributors)
+```
+
+2. Pull the latest base branch with a 3-way merge:
+```bash
+git pull {base-remote} {base-branch} --no-rebase
+```
+
+3. Resolve conflicting files, then verify no conflict markers remain:
+```bash
+if grep -R -n -E '^(<<<<<<<|=======|>>>>>>>)' <conflicted-files>; then
+  echo "Unresolved conflict markers found — resolve before proceeding."
+  exit 1
+fi
+```
+
+4. Stage and push:
+```bash
+git add <conflicted-files>
+git commit -m "Resolve merge conflicts with {base-branch}"
+git push
+```
+
+5. Restart the polling loop from the top — new commits reset CI status.
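The CI classification rule from step 1 of the polling loop above (any `fail` means failed, all `pass`/`skipping` means green, anything else means pending) can be sketched as a small shell helper. This is an illustrative sketch, not part of the skill: `ci_state` is a hypothetical name, and treating an empty check list as pending is an assumption the skill text does not state.

```shell
# ci_state (hypothetical helper): classify CI from bucket values on stdin,
# one per line, e.g. as produced by:
#   gh pr checks {N} --repo Significant-Gravitas/AutoGPT --json bucket --jq '.[].bucket'
# Prints "failed", "green", or "pending".
ci_state() {
  state="green"
  seen=0
  while IFS= read -r bucket || [ -n "$bucket" ]; do
    [ -n "$bucket" ] || continue          # ignore blank lines
    seen=$((seen + 1))
    case "$bucket" in
      fail) echo "failed"; return 0 ;;    # any failure wins immediately
      pass|skipping) ;;                   # counts toward green
      *) state="pending" ;;               # any other bucket: not settled yet
    esac
  done
  # Assumption: no checks reported yet is treated as pending, not green.
  [ "$seen" -gt 0 ] || state="pending"
  echo "$state"
}
```

In the polling loop, the output of the `gh pr checks` command shown in the comment would be piped into `ci_state`, and the result matched against the precedence table.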
--- a/.claude/skills/pr-review/SKILL.md
+++ b/.claude/skills/pr-review/SKILL.md
@@ -0,0 +1,74 @@
+---
+name: pr-review
+description: Review a PR for correctness, security, code quality, and testing issues. TRIGGER when user asks to review a PR, check PR quality, or give feedback on a PR.
+user-invocable: true
+argument-hint: "[PR number or URL] — if omitted, finds PR for current branch."
+metadata:
+  author: autogpt-team
+  version: "1.0.0"
+---
+
+# PR Review
+
+## Find the PR
+
+```bash
+gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT
+gh pr view {N}
+```
+
+## Read the diff
+
+```bash
+gh pr diff {N}
+```
+
+## Fetch existing review comments
+
+Before posting anything, fetch existing inline comments to avoid duplicates:
+
+```bash
+gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments --paginate
+gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate
+```
+
+## What to check
+
+**Correctness:** logic errors, off-by-one, missing edge cases, race conditions (TOCTOU in file access, credit charging), error handling gaps, async correctness (missing `await`, unclosed resources).
+
+**Security:** input validation at boundaries, no injection (command, XSS, SQL), secrets not logged, file paths sanitized (`os.path.basename()` in error messages).
+
+**Code quality:** apply rules from backend/frontend CLAUDE.md files.
+
+**Architecture:** DRY, single responsibility, modular functions. `Security()` vs `Depends()` for FastAPI auth. `data:` for SSE events, `: comment` for heartbeats. `transaction=True` for Redis pipelines.
+
+**Testing:** edge cases covered, colocated `*_test.py` (backend) / `__tests__/` (frontend), mocks target where symbol is **used** not defined, `AsyncMock` for async.
+
+## Output format
+
+Every comment **must** be prefixed with `🤖` and a criticality badge:
+
+| Tier | Badge | Meaning |
+|---|---|---|
+| Blocker | `🔴 **Blocker**` | Must fix before merge |
+| Should Fix | `🟠 **Should Fix**` | Important improvement |
+| Nice to Have | `🟡 **Nice to Have**` | Minor suggestion |
+| Nit | `🔵 **Nit**` | Style / wording |
+
+Example: `🤖 🔴 **Blocker**: Missing error handling for X — suggest wrapping in try/except.`
+
+## Post inline comments
+
+For each finding, post an inline comment on the PR (do not just write a local report):
+
+```bash
+# Get the latest commit SHA for the PR
+COMMIT_SHA=$(gh api repos/Significant-Gravitas/AutoGPT/pulls/{N} --jq '.head.sha')
+
+# Post an inline comment on a specific file/line
+gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments \
+  -f body="🤖 🔴 **Blocker**: <description>" \
+  -f commit_id="$COMMIT_SHA" \
+  -f path="<file path>" \
+  -F line=<line number>
+```
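The badge table in the pr-review skill above can be turned into a small helper that builds a correctly prefixed comment body. This is a sketch only: `format_finding` and its tier keywords (`blocker`, `should-fix`, `nice-to-have`, `nit`) are hypothetical names chosen for illustration, not part of the skill.

```shell
# format_finding (hypothetical helper): print a review comment body with the
# robot prefix and the criticality badge for the given tier.
# Usage: format_finding <tier> <description>
format_finding() {
  tier="$1"
  description="$2"
  case "$tier" in
    blocker)      badge='🔴 **Blocker**' ;;
    should-fix)   badge='🟠 **Should Fix**' ;;
    nice-to-have) badge='🟡 **Nice to Have**' ;;
    nit)          badge='🔵 **Nit**' ;;
    *) echo "unknown tier: $tier" >&2; return 1 ;;
  esac
  printf '🤖 %s: %s\n' "$badge" "$description"
}
```

The result could then be passed to the inline comment call, for example `-f body="$(format_finding blocker '<description>')"`.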
--- a/.claude/skills/pr-test/SKILL.md
+++ b/.claude/skills/pr-test/SKILL.md
@@ -0,0 +1,534 @@
+---
+name: pr-test
+description: "E2E manual testing of PRs/branches using docker compose, agent-browser, and API calls. TRIGGER when user asks to manually test a PR, test a feature end-to-end, or run integration tests against a running system."
+user-invocable: true
+argument-hint: "[worktree path or PR number] — tests the PR in the given worktree. Optional flags: --fix (auto-fix issues found)"
+metadata:
+  author: autogpt-team
+  version: "1.0.0"
+---
+
+# Manual E2E Test
+
+Test a PR/branch end-to-end by building the full platform, interacting via browser and API, capturing screenshots, and reporting results.
+
+## Arguments
+
+- `$ARGUMENTS` — worktree path (e.g. `$REPO_ROOT`) or PR number
+- If `--fix` flag is present, auto-fix bugs found and push fixes (like pr-address loop)
+
+## Step 0: Resolve the target
+
+```bash
+# If argument is a PR number, find its worktree
+gh pr view {N} --json headRefName --jq '.headRefName'
+# If argument is a path, use it directly
+```
+
+Determine:
+- `REPO_ROOT` — the root repo directory: `git -C "$WORKTREE_PATH" worktree list | head -1 | awk '{print $1}'` (or `git rev-parse --show-toplevel` if not a worktree)
+- `WORKTREE_PATH` — the worktree directory
+- `PLATFORM_DIR` — `$WORKTREE_PATH/autogpt_platform`
+- `BACKEND_DIR` — `$PLATFORM_DIR/backend`
+- `FRONTEND_DIR` — `$PLATFORM_DIR/frontend`
+- `PR_NUMBER` — the PR number (from `gh pr list --head $(git branch --show-current)`)
+- `PR_TITLE` — the PR title, slugified (e.g. "Add copilot permissions" → "add-copilot-permissions")
+- `RESULTS_DIR` — `$REPO_ROOT/test-results/PR-{PR_NUMBER}-{slugified-title}`
+
+Create the results directory:
+```bash
+PR_NUMBER=$(cd $WORKTREE_PATH && gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT --json number --jq '.[0].number')
+PR_TITLE=$(cd $WORKTREE_PATH && gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT --json title --jq '.[0].title' | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g' | sed 's/--*/-/g' | sed 's/^-//;s/-$//' | head -c 50)
+RESULTS_DIR="$REPO_ROOT/test-results/PR-${PR_NUMBER}-${PR_TITLE}"
+mkdir -p $RESULTS_DIR
+```
+
+**Test user credentials** (for logging into the UI or verifying results manually):
+- Email: `test@test.com`
+- Password: `testtest123`
+
+## Step 1: Understand the PR
+
+Before testing, understand what changed:
+
+```bash
+cd $WORKTREE_PATH
+git log --oneline dev..HEAD | head -20
+git diff dev --stat
+```
+
+Read the changed files to understand:
+1. What feature/fix does this PR implement?
+2. What components are affected? (backend, frontend, copilot, executor, etc.)
+3. What are the key user-facing behaviors to test?
+
+## Step 2: Write test scenarios
+
+Based on the PR analysis, write a test plan to `$RESULTS_DIR/test-plan.md`:
+
+```markdown
+# Test Plan: PR #{N} — {title}
+
+## Scenarios
+1. [Scenario name] — [what to verify]
+2. ...
+
+## API Tests (if applicable)
+1. [Endpoint] — [expected behavior]
+
+## UI Tests (if applicable)
+1. [Page/component] — [interaction to test]
+
+## Negative Tests
+1. [What should NOT happen]
+```
+
+**Be critical** — include edge cases, error paths, and security checks.
+
+## Step 3: Environment setup
+
+### 3a. Copy .env files from the root worktree
+
+The root worktree (`$REPO_ROOT`) has the canonical `.env` files with all API keys. Copy them to the target worktree:
+
+```bash
+# CRITICAL: .env files are NOT checked into git. They must be copied manually.
+cp $REPO_ROOT/autogpt_platform/.env $PLATFORM_DIR/.env
+cp $REPO_ROOT/autogpt_platform/backend/.env $BACKEND_DIR/.env
+cp $REPO_ROOT/autogpt_platform/frontend/.env $FRONTEND_DIR/.env
+```
+
+### 3b. Configure copilot authentication
+
+The copilot needs an LLM API to function. Two approaches (try subscription first):
+
+#### Option 1: Subscription mode (preferred — uses your Claude Max/Pro subscription)
+
+The `claude_agent_sdk` Python package **bundles its own Claude CLI binary** — no need to install `@anthropic-ai/claude-code` via npm. The backend auto-provisions credentials from environment variables on startup.
+
+Run the helper script to extract tokens from your host and auto-update `backend/.env` (works on macOS, Linux, and Windows/WSL):
+
+```bash
+# Extracts OAuth tokens and writes CLAUDE_CODE_OAUTH_TOKEN + CLAUDE_CODE_REFRESH_TOKEN into .env
+bash $BACKEND_DIR/scripts/refresh_claude_token.sh --env-file $BACKEND_DIR/.env
+```
+
+**How it works:** The script reads the OAuth token from:
+- **macOS**: system keychain (`"Claude Code-credentials"`)
+- **Linux/WSL**: `~/.claude/.credentials.json`
+- **Windows**: `%APPDATA%/claude/.credentials.json`
+
+It sets `CLAUDE_CODE_OAUTH_TOKEN`, `CLAUDE_CODE_REFRESH_TOKEN`, and `CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true` in the `.env` file. On container startup, the backend auto-provisions `~/.claude/.credentials.json` inside the container from these env vars. The SDK's bundled CLI then authenticates using that file. No `claude login`, no npm install needed.
+
+**Note:** The OAuth token expires (~24h). If copilot returns auth errors, re-run the script and restart: `$BACKEND_DIR/scripts/refresh_claude_token.sh --env-file $BACKEND_DIR/.env && docker compose up -d copilot_executor`
+
+#### Option 2: OpenRouter API key mode (fallback)
+
+If subscription mode doesn't work, switch to API key mode using OpenRouter:
+
+```bash
+# In $BACKEND_DIR/.env, ensure these are set:
+CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=false
+CHAT_API_KEY=<value of OPEN_ROUTER_API_KEY from the same .env>
+CHAT_BASE_URL=https://openrouter.ai/api/v1
+CHAT_USE_CLAUDE_AGENT_SDK=true
+```
+
+Use `perl` to update these values in place:
+```bash
+# -f2- keeps everything after the first '=', so keys that contain '=' are not truncated
+ORKEY=$(grep "^OPEN_ROUTER_API_KEY=" $BACKEND_DIR/.env | cut -d= -f2-)
+[ -n "$ORKEY" ] || { echo "ERROR: OPEN_ROUTER_API_KEY is missing in $BACKEND_DIR/.env"; exit 1; }
+perl -i -pe 's/CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true/CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=false/' $BACKEND_DIR/.env
+# Add or update CHAT_API_KEY and CHAT_BASE_URL
+grep -q "^CHAT_API_KEY=" $BACKEND_DIR/.env && perl -i -pe "s|^CHAT_API_KEY=.*|CHAT_API_KEY=$ORKEY|" $BACKEND_DIR/.env || echo "CHAT_API_KEY=$ORKEY" >> $BACKEND_DIR/.env
+grep -q "^CHAT_BASE_URL=" $BACKEND_DIR/.env && perl -i -pe 's|^CHAT_BASE_URL=.*|CHAT_BASE_URL=https://openrouter.ai/api/v1|' $BACKEND_DIR/.env || echo "CHAT_BASE_URL=https://openrouter.ai/api/v1" >> $BACKEND_DIR/.env
+```
+
+### 3c. Stop conflicting containers
+
+```bash
+# Stop any running app containers (keep infra: supabase, redis, rabbitmq, clamav)
+docker ps --format "{{.Names}}" | grep -E "rest_server|executor|copilot|websocket|database_manager|scheduler|notification|frontend|migrate" | while read name; do
+  docker stop "$name" 2>/dev/null
+done
+```
+
+### 3e. Build and start
+
+```bash
+cd $PLATFORM_DIR && docker compose build --no-cache 2>&1 | tail -20
+if [ ${PIPESTATUS[0]} -ne 0 ]; then echo "ERROR: Docker build failed"; exit 1; fi
+
+cd $PLATFORM_DIR && docker compose up -d 2>&1 | tail -20
+if [ ${PIPESTATUS[0]} -ne 0 ]; then echo "ERROR: Docker compose up failed"; exit 1; fi
+```
+
+**Note:** The build above uses `--no-cache` because Docker BuildKit may reuse cached `COPY` layers from a previous build on a different branch, which leaves the container running old code (e.g. missing PR changes).
+
+**Expected time: 3-8 minutes** for build, 5-10 minutes with `--no-cache`.
+
+### 3f. Wait for services to be ready
+
+```bash
+# Poll until backend and frontend respond
+for i in $(seq 1 60); do
+  BACKEND=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8006/docs 2>/dev/null)
+  FRONTEND=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:3000 2>/dev/null)
+  if [ "$BACKEND" = "200" ] && [ "$FRONTEND" = "200" ]; then
+    echo "Services ready"
+    break
+  fi
+  sleep 5
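+done
+# Fail loudly if either service never came up (60 attempts x 5s = 5 minutes)
+if [ "$BACKEND" != "200" ] || [ "$FRONTEND" != "200" ]; then echo "ERROR: services not ready after 5 minutes"; exit 1; fi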
+```
+
+
+### 3h. Create test user and get auth token
+
+```bash
+ANON_KEY=$(grep "NEXT_PUBLIC_SUPABASE_ANON_KEY=" $FRONTEND_DIR/.env | sed 's/.*NEXT_PUBLIC_SUPABASE_ANON_KEY=//' | tr -d '[:space:]')
+
+# Signup (idempotent — returns "User already registered" if exists)
+RESULT=$(curl -s -X POST 'http://localhost:8000/auth/v1/signup' \
+  -H "apikey: $ANON_KEY" \
+  -H 'Content-Type: application/json' \
+  -d '{"email":"test@test.com","password":"testtest123"}')
+
+# If "Database error finding user", restart supabase-auth and retry
+if echo "$RESULT" | grep -q "Database error"; then
+  docker restart supabase-auth && sleep 5
+  curl -s -X POST 'http://localhost:8000/auth/v1/signup' \
+    -H "apikey: $ANON_KEY" \
+    -H 'Content-Type: application/json' \
+    -d '{"email":"test@test.com","password":"testtest123"}'
+fi
+
+# Get auth token
+TOKEN=$(curl -s -X POST 'http://localhost:8000/auth/v1/token?grant_type=password' \
+  -H "apikey: $ANON_KEY" \
+  -H 'Content-Type: application/json' \
+  -d '{"email":"test@test.com","password":"testtest123"}' | jq -r '.access_token // ""')
+```
+
+**Use this token for ALL API calls:**
+```bash
+curl -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/...
+```
+
+## Step 4: Run tests
+
+### Service ports reference
+
+| Service | Port | URL |
+|---------|------|-----|
+| Frontend | 3000 | http://localhost:3000 |
+| Backend REST | 8006 | http://localhost:8006 |
+| Supabase Auth (via Kong) | 8000 | http://localhost:8000 |
+| Executor | 8002 | http://localhost:8002 |
+| Copilot Executor | 8008 | http://localhost:8008 |
+| WebSocket | 8001 | http://localhost:8001 |
+| Database Manager | 8005 | http://localhost:8005 |
+| Redis | 6379 | localhost:6379 |
+| RabbitMQ | 5672 | localhost:5672 |
+
+### API testing
+
+Use `curl` with the auth token for backend API tests:
+
+```bash
+# Example: List agents
+curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/graphs | jq . | head -20
+
+# Example: Create an agent
+curl -s -X POST http://localhost:8006/api/graphs \
+  -H "Authorization: Bearer $TOKEN" \
+  -H 'Content-Type: application/json' \
+  -d '{...}' | jq .
+
+# Example: Run an agent
+curl -s -X POST "http://localhost:8006/api/graphs/{graph_id}/execute" \
+  -H "Authorization: Bearer $TOKEN" \
+  -H 'Content-Type: application/json' \
+  -d '{"data": {...}}'
+
+# Example: Get execution results
+curl -s -H "Authorization: Bearer $TOKEN" \
+  "http://localhost:8006/api/graphs/{graph_id}/executions/{exec_id}" | jq .
+```
+
+### Browser testing with agent-browser
+
+```bash
+# Close any existing session
+agent-browser close 2>/dev/null || true
+
+# Use --session-name to persist cookies across navigations
+# This means login only needs to happen once per test session
+agent-browser --session-name pr-test open 'http://localhost:3000/login' --timeout 15000
+
+# Get interactive elements
+agent-browser --session-name pr-test snapshot | grep "textbox\|button"
+
+# Login
+agent-browser --session-name pr-test fill {email_ref} "test@test.com"
+agent-browser --session-name pr-test fill {password_ref} "testtest123"
+agent-browser --session-name pr-test click {login_button_ref}
+sleep 5
+
+# Dismiss cookie banner if present
+agent-browser --session-name pr-test click 'text=Accept All' 2>/dev/null || true
+
+# Navigate — cookies are preserved so login persists
+agent-browser --session-name pr-test open 'http://localhost:3000/copilot' --timeout 10000
+
+# Take screenshot
+agent-browser --session-name pr-test screenshot $RESULTS_DIR/01-page.png
+
+# Interact with elements
+agent-browser --session-name pr-test fill {ref} "text"
+agent-browser --session-name pr-test press "Enter"
+agent-browser --session-name pr-test click {ref}
+agent-browser --session-name pr-test click 'text=Button Text'
+
+# Read page content
+agent-browser --session-name pr-test snapshot | grep "text:"
+```
+
+**Key pages:**
+- `/copilot` — CoPilot chat (for testing copilot features)
+- `/build` — Agent builder (for testing block/node features)
+- `/build?flowID={id}` — Specific agent in builder
+- `/library` — Agent library (for testing listing/import features)
+- `/library/agents/{id}` — Agent detail with run history
+- `/marketplace` — Marketplace
+
+### Checking logs
+
+```bash
+# Backend REST server
+docker logs autogpt_platform-rest_server-1 2>&1 | tail -30
+
+# Executor (runs agent graphs)
+docker logs autogpt_platform-executor-1 2>&1 | tail -30
+
+# Copilot executor (runs copilot chat sessions)
+docker logs autogpt_platform-copilot_executor-1 2>&1 | tail -30
+
+# Frontend
+docker logs autogpt_platform-frontend-1 2>&1 | tail -30
+
+# Filter for errors
+docker logs autogpt_platform-executor-1 2>&1 | grep -i "error\|exception\|traceback" | tail -20
+```
+
+### Copilot chat testing
+
+The copilot uses SSE streaming. To test via API:
+
+```bash
+# Create a session
+SESSION_ID=$(curl -s -X POST 'http://localhost:8006/api/chat/sessions' \
+  -H "Authorization: Bearer $TOKEN" \
+  -H 'Content-Type: application/json' \
+  -d '{}' | jq -r '.id // .session_id // ""')
+
+# Stream a message (SSE - will stream chunks)
+curl -N -X POST "http://localhost:8006/api/chat/sessions/$SESSION_ID/stream" \
+  -H "Authorization: Bearer $TOKEN" \
+  -H 'Content-Type: application/json' \
+  -d '{"message": "Hello, what can you help me with?"}' \
+  --max-time 60 2>/dev/null | head -50
+```
+
+Or test via browser (preferred for UI verification):
+```bash
+agent-browser --session-name pr-test open 'http://localhost:3000/copilot' --timeout 10000
+# ... fill chat input and press Enter, wait 20-30s for response
+```
+
+## Step 5: Record results
+
+For each test scenario, record in `$RESULTS_DIR/test-report.md`:
+
+```markdown
+# E2E Test Report: PR #{N} — {title}
+Date: {date}
+Branch: {branch}
+Worktree: {path}
+
+## Environment
+- Docker services: [list running containers]
+- API keys: OpenRouter={present/missing}, E2B={present/missing}
+
+## Test Results
+
+### Scenario 1: {name}
+**Steps:**
+1. ...
+2. ...
+**Expected:** ...
+**Actual:** ...
+**Result:** PASS / FAIL
+**Screenshot:** {filename}.png
+**Logs:** (if relevant)
+
+### Scenario 2: {name}
+...
+
+## Summary
+- Total: X scenarios
+- Passed: Y
+- Failed: Z
+- Bugs found: [list]
+```
+
+Take screenshots at each significant step:
+```bash
+agent-browser --session-name pr-test screenshot $RESULTS_DIR/{NN}-{description}.png
+```
+
+## Step 6: Report results
+
+After all tests complete, output a summary to the user:
+
+1. Table of all scenarios with PASS/FAIL
+2. Screenshots of failures (read the PNG files to show them)
+3. Any bugs found with details
+4. Recommendations
+
+### Post test results as PR comment with screenshots
+
+Upload screenshots to the PR using the GitHub Git API (no local git operations — safe for worktrees).
+
+```bash
+# Upload screenshots via GitHub Git API (creates blobs, tree, commit, and ref remotely)
+REPO="Significant-Gravitas/AutoGPT"
+SCREENSHOTS_BRANCH="test-screenshots/pr-${PR_NUMBER}"
+SCREENSHOTS_DIR="test-screenshots/PR-${PR_NUMBER}"
+
+# Step 1: Create a blob for each screenshot (uploaded once) and collect the tree entries
+# Build the tree JSON manually since gh api doesn't handle arrays well
+TREE_JSON='['
+FIRST=true
+for img in "$RESULTS_DIR"/*.png; do
+  BASENAME=$(basename "$img")
+  B64=$(base64 < "$img")
+  BLOB_SHA=$(gh api "repos/${REPO}/git/blobs" -f content="$B64" -f encoding="base64" --jq '.sha')
+  if [ "$FIRST" = true ]; then FIRST=false; else TREE_JSON+=','; fi
+  TREE_JSON+="{\"path\":\"${SCREENSHOTS_DIR}/${BASENAME}\",\"mode\":\"100644\",\"type\":\"blob\",\"sha\":\"${BLOB_SHA}\"}"
+done
+TREE_JSON+=']'
+
+# Step 2: Create a tree with all screenshot blobs (the API expects a body of {"tree": [...]})
+TREE_SHA=$(echo "$TREE_JSON" | jq -c '{tree: .}' | gh api -X POST "repos/${REPO}/git/trees" --input - --jq '.sha')
+
+# Step 3: Create a commit pointing to that tree
+COMMIT_SHA=$(gh api "repos/${REPO}/git/commits" \
+  -f message="test: add E2E test screenshots for PR #${PR_NUMBER}" \
+  -f tree="$TREE_SHA" \
+  --jq '.sha')
+
+# Step 4: Create or update the ref (branch) — no local checkout needed
+gh api "repos/${REPO}/git/refs" \
+  -f ref="refs/heads/${SCREENSHOTS_BRANCH}" \
+  -f sha="$COMMIT_SHA" 2>/dev/null \
+  || gh api "repos/${REPO}/git/refs/heads/${SCREENSHOTS_BRANCH}" \
+    -X PATCH -f sha="$COMMIT_SHA" -f force=true
+
+# Step 5: Build image markdown and post the comment
+REPO_URL="https://raw.githubusercontent.com/${REPO}/${SCREENSHOTS_BRANCH}"
+IMAGE_MARKDOWN=""
+for img in $RESULTS_DIR/*.png; do
+  BASENAME=$(basename "$img")
+  IMAGE_MARKDOWN="$IMAGE_MARKDOWN
+![${BASENAME}](${REPO_URL}/${SCREENSHOTS_DIR}/${BASENAME})"
+done
+
+gh api "repos/${REPO}/issues/$PR_NUMBER/comments" -f body="$(cat <<EOF
+## 🧪 E2E Test Report
+
+$(cat $RESULTS_DIR/test-report.md)
+
+### Screenshots
+${IMAGE_MARKDOWN}
+EOF
+)"
+```
+
+This approach uses the GitHub Git API to create blobs, trees, commits, and refs entirely server-side. No local `git checkout` or `git push` — safe for worktrees and won't interfere with the PR branch.
+
+## Fix mode (--fix flag)
+
+When `--fix` is present, after finding a bug:
+
+1. Identify the root cause in the code
+2. Fix it in the worktree
+3. Rebuild the affected service: `cd $PLATFORM_DIR && docker compose up --build -d {service_name}`
+4. Re-test the scenario
+5. If fix works, commit and push:
+   ```bash
+   cd $WORKTREE_PATH
+   git add -A
+   git commit -m "fix: {description of fix}"
+   git push
+   ```
+6. Continue testing remaining scenarios
+7. After all fixes, run the full test suite again to ensure no regressions
+
+### Fix loop (like pr-address)
+
+```text
+test scenario → find bug → fix code → rebuild service → re-test
+→ if the fix works: commit + push
+→ repeat until all scenarios pass
+→ run full re-test to verify
+```
+
+## Known issues and workarounds
+
+### Problem: "Database error finding user" on signup
+**Cause:** Supabase auth service schema cache is stale after migration.
+**Fix:** `docker restart supabase-auth && sleep 5` then retry signup.
+
+### Problem: Copilot returns auth errors in subscription mode
+**Cause:** `CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true` but `CLAUDE_CODE_OAUTH_TOKEN` is not set or expired.
+**Fix:** Re-run the token refresh script (see step 3b, Option 1) to re-extract the OAuth token from your host, then recreate the container (`docker compose up -d copilot_executor`). The backend auto-provisions `~/.claude/.credentials.json` from the env var on startup. No `npm install` or `claude login` is needed: the SDK bundles its own CLI binary.
+
+### Problem: agent-browser can't find chromium
+**Cause:** The Dockerfile auto-provisions system chromium on all architectures (including ARM64). If your branch is behind `dev`, this may not be present yet.
+**Fix:** Check if chromium exists: `which chromium || which chromium-browser`. If missing, install it: `apt-get install -y chromium` and set `AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium` in the container environment.
+
+### Problem: agent-browser selector matches multiple elements
+**Cause:** `text=X` matches all elements containing that text.
+**Fix:** Use `agent-browser snapshot` to get specific `ref=eNN` references, then use those: `agent-browser click eNN`.
+
+### Problem: Frontend shows cookie banner blocking interaction
+**Fix:** `agent-browser click 'text=Accept All'` before other interactions.
+
+### Problem: Container loses npm packages after rebuild
+**Cause:** `docker compose up --build` rebuilds the image, losing runtime installs.
+**Fix:** Add packages to the Dockerfile instead of installing at runtime.
+
+### Problem: Services not starting after `docker compose up`
+**Fix:** Wait and check health: `docker compose ps`. Common cause: migration hasn't finished. Check: `docker logs autogpt_platform-migrate-1 2>&1 | tail -5`. If supabase-db isn't healthy: `docker restart supabase-db && sleep 10`.
+
+### Problem: Docker uses cached layers with old code (PR changes not visible)
+**Cause:** `docker compose up --build` reuses cached `COPY` layers from previous builds. If the PR branch changes Python files but the previous build already cached that layer from `dev`, the container runs `dev` code.
+**Fix:** Always use `docker compose build --no-cache` for the first build of a PR branch. Subsequent rebuilds within the same branch can use `--build`.
+
+### Problem: `agent-browser open` loses login session
+**Cause:** Without session persistence, `agent-browser open` starts fresh.
+**Fix:** Use `--session-name pr-test` on ALL agent-browser commands. This auto-saves/restores cookies and localStorage across navigations. Alternatively, use `agent-browser eval "window.location.href = '...'"` to navigate within the same context.
+
+### Problem: Supabase auth returns "Database error querying schema"
+**Cause:** The database schema changed (migration ran) but supabase-auth has a stale schema cache.
+**Fix:** `docker restart supabase-db && sleep 10 && docker restart supabase-auth && sleep 8`. If user data was lost, re-signup.
--- a/.claude/skills/worktree/SKILL.md
+++ b/.claude/skills/worktree/SKILL.md
@@ -0,0 +1,85 @@
+---
+name: worktree
+description: Set up a new git worktree for parallel development. Creates the worktree, copies .env files, installs dependencies, and generates Prisma client. TRIGGER when user asks to set up a worktree, work on a branch in isolation, or needs a separate environment for a branch or PR.
+user-invocable: true
+args: "[name] — optional worktree name (e.g., 'AutoGPT7'). If omitted, uses next available AutoGPT<N>."
+metadata:
+  author: autogpt-team
+  version: "3.0.0"
+---
+
+# Worktree Setup
+
+## Create the worktree
+
+Derive paths from the git toplevel. If a name is provided as an argument, use it. Otherwise, check `git worktree list` and pick the next `AutoGPT<N>`.
+
+```bash
+ROOT=$(git rev-parse --show-toplevel)
+PARENT=$(dirname "$ROOT")
+
+# From an existing branch
+git worktree add "$PARENT/<NAME>" <branch-name>
+
+# From a new branch off dev
+git worktree add -b <new-branch> "$PARENT/<NAME>" dev
+```
+
+## Copy environment files
+
+Copy `.env` from the root worktree. Falls back to `.env.default` if `.env` doesn't exist.
+
+```bash
+ROOT=$(git rev-parse --show-toplevel)
+TARGET="$(dirname "$ROOT")/<NAME>"
+
+for envpath in autogpt_platform/backend autogpt_platform/frontend autogpt_platform; do
+  if [ -f "$ROOT/$envpath/.env" ]; then
+    cp "$ROOT/$envpath/.env" "$TARGET/$envpath/.env"
+  elif [ -f "$ROOT/$envpath/.env.default" ]; then
+    cp "$ROOT/$envpath/.env.default" "$TARGET/$envpath/.env"
+  fi
+done
+```
+
+## Install dependencies
+
+```bash
+TARGET="$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
+cd "$TARGET/autogpt_platform/autogpt_libs" && poetry install
+cd "$TARGET/autogpt_platform/backend" && poetry install && poetry run prisma generate
+cd "$TARGET/autogpt_platform/frontend" && pnpm install
+```
+
+Replace `<NAME>` with the actual worktree name (e.g., `AutoGPT7`).
+
+## Running the app (optional)
+
+The backend uses ports 8001, 8002, 8003, 8005, 8006, 8007, and 8008. Free them first if needed:
+
+```bash
+TARGET="$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
+for port in 8001 8002 8003 8005 8006 8007 8008; do
+  lsof -ti :$port | xargs kill -9 2>/dev/null || true
+done
+cd "$TARGET/autogpt_platform/backend" && poetry run app
+```
+
+## CoPilot testing
+
+SDK mode spawns a Claude subprocess, which won't work inside Claude Code. Set `CHAT_USE_CLAUDE_AGENT_SDK=false` in `backend/.env` to use baseline mode instead.
+
+## Cleanup
+
+```bash
+# Replace <NAME> with the actual worktree name (e.g., AutoGPT7)
+git worktree remove "$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
+```
+
+## Alternative: Branchlet (optional)
+
+If [branchlet](https://www.npmjs.com/package/branchlet) is installed:
+
+```bash
+branchlet create -n <name> -s <source-branch> -b <new-branch>
+```
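Reviewer sketch, not part of the skill file: the "pick the next `AutoGPT<N>`" rule in the worktree skill could be made concrete roughly as follows. It assumes "next" means highest existing number plus one and that unnumbered directories (such as a plain `AutoGPT`) are ignored; the skill states neither. The function name `next_worktree_name` is hypothetical. The input is the text printed by `git worktree list --porcelain`.

```python
import re
from pathlib import PurePosixPath


def next_worktree_name(porcelain: str, prefix: str = "AutoGPT") -> str:
    """Pick the next '<prefix><N>' name from `git worktree list --porcelain` output.

    Assumption: "next" is max(existing N) + 1; names without a number are skipped.
    """
    numbers = []
    for line in porcelain.splitlines():
        # Each worktree entry starts with a line of the form "worktree <path>".
        if not line.startswith("worktree "):
            continue
        name = PurePosixPath(line[len("worktree "):]).name
        match = re.fullmatch(re.escape(prefix) + r"(\d+)", name)
        if match:
            numbers.append(int(match.group(1)))
    return f"{prefix}{max(numbers, default=0) + 1}"
```

If the intended rule is instead "lowest unused number", only the final line would need to change.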
--- a/.dockerignore
+++ b/.dockerignore
@@ -5,42 +5,13 @@
 !docs/

 # Platform - Libs
-!autogpt_platform/autogpt_libs/autogpt_libs/
-!autogpt_platform/autogpt_libs/pyproject.toml
-!autogpt_platform/autogpt_libs/poetry.lock
-!autogpt_platform/autogpt_libs/README.md
+!autogpt_platform/autogpt_libs/

 # Platform - Backend
-!autogpt_platform/backend/backend/
-!autogpt_platform/backend/test/e2e_test_data.py
-!autogpt_platform/backend/migrations/
-!autogpt_platform/backend/schema.prisma
-!autogpt_platform/backend/pyproject.toml
-!autogpt_platform/backend/poetry.lock
-!autogpt_platform/backend/README.md
-!autogpt_platform/backend/.env
-!autogpt_platform/backend/gen_prisma_types_stub.py
-
-# Platform - Market
-!autogpt_platform/market/market/
-!autogpt_platform/market/scripts.py
-!autogpt_platform/market/schema.prisma
-!autogpt_platform/market/pyproject.toml
-!autogpt_platform/market/poetry.lock
-!autogpt_platform/market/README.md
+!autogpt_platform/backend/

 # Platform - Frontend
-!autogpt_platform/frontend/src/
-!autogpt_platform/frontend/public/
-!autogpt_platform/frontend/scripts/
-!autogpt_platform/frontend/package.json
-!autogpt_platform/frontend/pnpm-lock.yaml
-!autogpt_platform/frontend/tsconfig.json
-!autogpt_platform/frontend/README.md
-## config
-!autogpt_platform/frontend/*.config.*
-!autogpt_platform/frontend/.env.*
-!autogpt_platform/frontend/.env
+!autogpt_platform/frontend/

 # Classic - AutoGPT
 !classic/original_autogpt/autogpt/
@@ -64,6 +35,38 @@
 # Classic - Frontend
 !classic/frontend/build/web/

-# Explicitly re-ignore some folders
-.*
-**/__pycache__
+# Explicitly re-ignore unwanted files from whitelisted directories
+# Note: These patterns MUST come after the whitelist rules to take effect
+
+# Hidden files and directories (but keep frontend .env files needed for build)
+**/.*
+!autogpt_platform/frontend/.env
+!autogpt_platform/frontend/.env.default
+!autogpt_platform/frontend/.env.production
+
+# Python artifacts
+**/__pycache__/
+**/*.pyc
+**/*.pyo
+**/.venv/
+**/.ruff_cache/
+**/.pytest_cache/
+**/.coverage
+**/htmlcov/
+
+# Node artifacts
+**/node_modules/
+**/.next/
+**/storybook-static/
+**/playwright-report/
+**/test-results/
+
+# Build artifacts
+**/dist/
+**/build/
+!autogpt_platform/frontend/src/**/build/
+**/target/
+
+# Logs and temp files
+**/*.log
+**/*.tmp
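Reviewer sketch, not part of the change: the note that the re-ignore patterns "MUST come after the whitelist rules" follows from last-match-wins evaluation, where each later rule that matches a path (or one of its parent directories) overrides earlier ones. The simplified matcher below illustrates that ordering effect only. It assumes the file begins with a catch-all `*` rule (not visible in this hunk), supports just `*`, `?`, and `**`, and is not a full reimplementation of Docker's matcher. All function names are hypothetical.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified ignore pattern into a regex (no character classes)."""
    pattern = pattern.rstrip("/")
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # a single '*' never crosses a '/'
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def is_ignored(path: str, rules: list[str]) -> bool:
    """Apply rules in order; the last rule matching the path or a parent wins."""
    parts = path.split("/")
    candidates = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    ignored = False
    for rule in rules:
        negated = rule.startswith("!")
        regex = pattern_to_regex(rule[1:] if negated else rule)
        if any(regex.fullmatch(c) for c in candidates):
            ignored = not negated
    return ignored


# Assumed excerpt: the leading "*" is a guess, the rest mirrors the diff.
RULES = [
    "*",
    "!autogpt_platform/frontend/",
    "**/.*",
    "!autogpt_platform/frontend/.env",
    "**/node_modules/",
]
```

With these rules, `autogpt_platform/frontend/.env` is kept because its `!` rule comes after `**/.*`; moving `**/.*` to the end would ignore it again.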
--- a/.github/scripts/detect_overlaps.py
+++ b/.github/scripts/detect_overlaps.py
--- a/.github/workflows/claude-ci-failure-auto-fix.yml
+++ b/.github/workflows/claude-ci-failure-auto-fix.yml
@@ -22,7 +22,7 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          ref: ${{ github.event.workflow_run.head_branch }}
          fetch-depth: 0
@@ -40,6 +40,48 @@ jobs:
          git checkout -b "$BRANCH_NAME"
          echo "branch_name=$BRANCH_NAME" >> $GITHUB_OUTPUT

+      # Backend Python/Poetry setup (so Claude can run linting/tests)
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+
+      - name: Set up Python dependency cache
+        uses: actions/cache@v5
+        with:
+          path: ~/.cache/pypoetry
+          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
+
+      - name: Install Poetry
+        run: |
+          cd autogpt_platform/backend
+          HEAD_POETRY_VERSION=$(python3 ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
+          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
+          echo "$HOME/.local/bin" >> $GITHUB_PATH
+
+      - name: Install Python dependencies
+        working-directory: autogpt_platform/backend
+        run: poetry install
+
+      - name: Generate Prisma Client
+        working-directory: autogpt_platform/backend
+        run: poetry run prisma generate && poetry run gen-prisma-stub
+
+      # Frontend Node.js/pnpm setup (so Claude can run linting/tests)
+      - name: Enable corepack
+        run: corepack enable
+
+      - name: Set up Node.js
+        uses: actions/setup-node@v6
+        with:
+          node-version: "22"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+
+      - name: Install JavaScript dependencies
+        working-directory: autogpt_platform/frontend
+        run: pnpm install --frozen-lockfile
+
      - name: Get CI failure details
        id: failure_details
        uses: actions/github-script@v8
--- a/.github/workflows/claude-dependabot.yml
+++ b/.github/workflows/claude-dependabot.yml
@@ -30,7 +30,7 @@ jobs:
      actions: read # Required for CI access
    steps:
      - name: Checkout code
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          fetch-depth: 1

@@ -77,27 +77,15 @@ jobs:
        run: poetry run prisma generate && poetry run gen-prisma-stub

      # Frontend Node.js/pnpm setup (mirrors platform-frontend-ci.yml)
+      - name: Enable corepack
+        run: corepack enable
+
      - name: Set up Node.js
        uses: actions/setup-node@v6
        with:
          node-version: "22"
-
-      - name: Enable corepack
-        run: corepack enable
-
-      - name: Set pnpm store directory
-        run: |
-          pnpm config set store-dir ~/.pnpm-store
-          echo "PNPM_HOME=$HOME/.pnpm-store" >> $GITHUB_ENV
-
-      - name: Cache frontend dependencies
-        uses: actions/cache@v5
-        with:
-          path: ~/.pnpm-store
-          key: ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}
-          restore-keys: |
-            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
-            ${{ runner.os }}-pnpm-
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml

      - name: Install JavaScript dependencies
        working-directory: autogpt_platform/frontend
--- a/.github/workflows/claude.yml
+++ b/.github/workflows/claude.yml
@@ -40,7 +40,7 @@ jobs:
      actions: read # Required for CI access
    steps:
      - name: Checkout code
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          fetch-depth: 1

@@ -93,27 +93,15 @@ jobs:
        run: poetry run prisma generate && poetry run gen-prisma-stub

      # Frontend Node.js/pnpm setup (mirrors platform-frontend-ci.yml)
+      - name: Enable corepack
+        run: corepack enable
+
      - name: Set up Node.js
        uses: actions/setup-node@v6
        with:
          node-version: "22"
-
-      - name: Enable corepack
-        run: corepack enable
-
-      - name: Set pnpm store directory
-        run: |
-          pnpm config set store-dir ~/.pnpm-store
-          echo "PNPM_HOME=$HOME/.pnpm-store" >> $GITHUB_ENV
-
-      - name: Cache frontend dependencies
-        uses: actions/cache@v5
-        with:
-          path: ~/.pnpm-store
-          key: ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}
-          restore-keys: |
-            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
-            ${{ runner.os }}-pnpm-
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml

      - name: Install JavaScript dependencies
        working-directory: autogpt_platform/frontend
--- a/.github/workflows/codeql.yml
+++ b/.github/workflows/codeql.yml
@@ -58,11 +58,11 @@ jobs:
        # your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages
    steps:
    - name: Checkout repository
-      uses: actions/checkout@v4
+      uses: actions/checkout@v6

    # Initializes the CodeQL tools for scanning.
    - name: Initialize CodeQL
-      uses: github/codeql-action/init@v3
+      uses: github/codeql-action/init@v4
      with:
        languages: ${{ matrix.language }}
        build-mode: ${{ matrix.build-mode }}
@@ -93,6 +93,6 @@ jobs:
        exit 1

    - name: Perform CodeQL Analysis
-      uses: github/codeql-action/analyze@v3
+      uses: github/codeql-action/analyze@v4
      with:
        category: "/language:${{matrix.language}}"
--- a/.github/workflows/copilot-setup-steps.yml
+++ b/.github/workflows/copilot-setup-steps.yml
@@ -27,7 +27,7 @@ jobs:
    # If you do not check out your code, Copilot will do this for you.
    steps:
      - name: Checkout code
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          fetch-depth: 0
          submodules: true
--- a/.github/workflows/docs-block-sync.yml
+++ b/.github/workflows/docs-block-sync.yml
@@ -23,7 +23,7 @@ jobs:

    steps:
      - name: Checkout code
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          fetch-depth: 1

--- a/.github/workflows/docs-claude-review.yml
+++ b/.github/workflows/docs-claude-review.yml
@@ -7,6 +7,10 @@ on:
      - "docs/integrations/**"
      - "autogpt_platform/backend/backend/blocks/**"

+concurrency:
+  group: claude-docs-review-${{ github.event.pull_request.number }}
+  cancel-in-progress: true
+
 jobs:
  claude-review:
    # Only run for PRs from members/collaborators
@@ -23,7 +27,7 @@ jobs:

    steps:
      - name: Checkout code
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          fetch-depth: 0

@@ -91,5 +95,35 @@ jobs:
            3. Read corresponding documentation files to verify accuracy
            4. Provide your feedback as a PR comment

+            ## IMPORTANT: Comment Marker
+            Start your PR comment with exactly this HTML comment marker on its own line:
+            <!-- CLAUDE_DOCS_REVIEW -->
+
+            This marker is used to identify and replace your comment on subsequent runs.
+
            Be constructive and specific. If everything looks good, say so!
            If there are issues, explain what's wrong and suggest how to fix it.
+
+      - name: Delete old Claude review comments
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        run: |
+          # Get all comment IDs with our marker, sorted by creation date (oldest first)
+          COMMENT_IDS=$(gh api --paginate \
+            repos/${{ github.repository }}/issues/${{ github.event.pull_request.number }}/comments \
+            --jq '[.[] | select(.body | contains("<!-- CLAUDE_DOCS_REVIEW -->"))] | sort_by(.created_at) | .[].id')
+
+          # Count comments
+          COMMENT_COUNT=$(echo "$COMMENT_IDS" | grep -c . || true)
+
+          if [ "$COMMENT_COUNT" -gt 1 ]; then
+            # Delete all but the last (newest) comment
+            echo "$COMMENT_IDS" | head -n -1 | while read -r COMMENT_ID; do
+              if [ -n "$COMMENT_ID" ]; then
+                echo "Deleting old review comment: $COMMENT_ID"
+                gh api -X DELETE repos/${{ github.repository }}/issues/comments/$COMMENT_ID
+              fi
+            done
+          else
+            echo "No old review comments to clean up"
+          fi
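Reviewer sketch, not part of the workflow: the selection rule in the cleanup step (keep the newest comment that carries the marker, delete every older one) can be written as a pure function, which makes the edge cases easy to check. The `Comment` shape and the name `ids_to_delete` are hypothetical; only the marker string and the sort-by-`created_at` rule come from the step itself. The sketch assumes all timestamps use the same ISO 8601 UTC format, so string order matches time order.

```python
from typing import TypedDict

MARKER = "<!-- CLAUDE_DOCS_REVIEW -->"  # marker string taken from the workflow prompt


class Comment(TypedDict):
    # Hypothetical minimal shape; mirrors the fields the jq filter reads.
    id: int
    body: str
    created_at: str  # e.g. "2026-04-07T18:35:08Z"


def ids_to_delete(comments: list[Comment]) -> list[int]:
    """Return the IDs of all marked comments except the newest one."""
    marked = sorted(
        (c for c in comments if MARKER in c["body"]),
        key=lambda c: c["created_at"],
    )
    # Slicing off the last element keeps the newest marked comment.
    return [c["id"] for c in marked[:-1]]
```

With zero or one marked comment the function returns an empty list, matching the "No old review comments to clean up" branch.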
--- a/.github/workflows/docs-enhance.yml
+++ b/.github/workflows/docs-enhance.yml
@@ -28,7 +28,7 @@ jobs:

    steps:
      - name: Checkout code
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          fetch-depth: 1

--- a/.github/workflows/platform-autogpt-deploy-dev.yaml
+++ b/.github/workflows/platform-autogpt-deploy-dev.yaml
@@ -25,7 +25,7 @@ jobs:

    steps:
      - name: Checkout code
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          ref: ${{ github.event.inputs.git_ref || github.ref_name }}

@@ -52,7 +52,7 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Trigger deploy workflow
-        uses: peter-evans/repository-dispatch@v3
+        uses: peter-evans/repository-dispatch@v4
        with:
          token: ${{ secrets.DEPLOY_TOKEN }}
          repository: Significant-Gravitas/AutoGPT_cloud_infrastructure
--- a/.github/workflows/platform-autogpt-deploy-prod.yml
+++ b/.github/workflows/platform-autogpt-deploy-prod.yml
@@ -17,7 +17,7 @@ jobs:

    steps:
      - name: Checkout code
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          ref: ${{ github.ref_name || 'master' }}

@@ -45,7 +45,7 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Trigger deploy workflow
-        uses: peter-evans/repository-dispatch@v3
+        uses: peter-evans/repository-dispatch@v4
        with:
          token: ${{ secrets.DEPLOY_TOKEN }}
          repository: Significant-Gravitas/AutoGPT_cloud_infrastructure
--- a/.github/workflows/platform-backend-ci.yml
+++ b/.github/workflows/platform-backend-ci.yml
@@ -5,12 +5,14 @@ on:
    branches: [master, dev, ci-test*]
    paths:
      - ".github/workflows/platform-backend-ci.yml"
+      - ".github/workflows/scripts/get_package_version_from_lockfile.py"
      - "autogpt_platform/backend/**"
      - "autogpt_platform/autogpt_libs/**"
  pull_request:
    branches: [master, dev, release-*]
    paths:
      - ".github/workflows/platform-backend-ci.yml"
+      - ".github/workflows/scripts/get_package_version_from_lockfile.py"
      - "autogpt_platform/backend/**"
      - "autogpt_platform/autogpt_libs/**"
  merge_group:
@@ -25,10 +27,91 @@ defaults:
    working-directory: autogpt_platform/backend

 jobs:
+  lint:
+    permissions:
+      contents: read
+    timeout-minutes: 10
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v6
+
+      - name: Set up Python 3.12
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Set up Python dependency cache
+        uses: actions/cache@v5
+        with:
+          path: ~/.cache/pypoetry
+          key: poetry-${{ runner.os }}-py3.12-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
+
+      - name: Install Poetry
+        run: |
+          HEAD_POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
+          echo "Using Poetry version ${HEAD_POETRY_VERSION}"
+          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
+
+      - name: Install Python dependencies
+        run: poetry install
+
+      - name: Run Linters
+        run: poetry run lint --skip-pyright
+
+    env:
+      CI: true
+      PLAIN_OUTPUT: True
+
+  type-check:
+    permissions:
+      contents: read
+    timeout-minutes: 10
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.11", "3.12", "3.13"]
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v6
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Set up Python dependency cache
+        uses: actions/cache@v5
+        with:
+          path: ~/.cache/pypoetry
+          key: poetry-${{ runner.os }}-py${{ matrix.python-version }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
+
+      - name: Install Poetry
+        run: |
+          HEAD_POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
+          echo "Using Poetry version ${HEAD_POETRY_VERSION}"
+          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
+
+      - name: Install Python dependencies
+        run: poetry install
+
+      - name: Generate Prisma Client
+        run: poetry run prisma generate && poetry run gen-prisma-stub
+
+      - name: Run Pyright
+        run: poetry run pyright --pythonversion ${{ matrix.python-version }}
+
+    env:
+      CI: true
+      PLAIN_OUTPUT: True
+
  test:
    permissions:
      contents: read
-    timeout-minutes: 30
+    timeout-minutes: 15
    strategy:
      fail-fast: false
      matrix:
@@ -41,13 +124,18 @@ jobs:
        ports:
          - 6379:6379
      rabbitmq:
-        image: rabbitmq:3.12-management
+        image: rabbitmq:4.1.4
        ports:
          - 5672:5672
-          - 15672:15672
        env:
          RABBITMQ_DEFAULT_USER: ${{ env.RABBITMQ_DEFAULT_USER }}
          RABBITMQ_DEFAULT_PASS: ${{ env.RABBITMQ_DEFAULT_PASS }}
+        options: >-
+          --health-cmd "rabbitmq-diagnostics -q ping"
+          --health-interval 30s
+          --health-timeout 10s
+          --health-retries 5
+          --health-start-period 10s
      clamav:
        image: clamav/clamav-debian:latest
        ports:
@@ -68,7 +156,7 @@ jobs:

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          fetch-depth: 0
          submodules: true
@@ -91,9 +179,9 @@ jobs:
        uses: actions/cache@v5
        with:
          path: ~/.cache/pypoetry
-          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
+          key: poetry-${{ runner.os }}-py${{ matrix.python-version }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}

-      - name: Install Poetry (Unix)
+      - name: Install Poetry
        run: |
          # Extract Poetry version from backend/poetry.lock
          HEAD_POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
@@ -151,22 +239,22 @@ jobs:
          echo "Waiting for ClamAV daemon to start..."
          max_attempts=60
          attempt=0
-          
+
          until nc -z localhost 3310 || [ $attempt -eq $max_attempts ]; do
            echo "ClamAV is unavailable - sleeping (attempt $((attempt+1))/$max_attempts)"
            sleep 5
            attempt=$((attempt+1))
          done
-          
+
          if [ $attempt -eq $max_attempts ]; then
            echo "ClamAV failed to start after $((max_attempts*5)) seconds"
            echo "Checking ClamAV service logs..."
            docker logs $(docker ps -q --filter "ancestor=clamav/clamav-debian:latest") 2>&1 | tail -50 || echo "No ClamAV container found"
            exit 1
          fi
-          
+
          echo "ClamAV is ready!"
-          
+
          # Verify ClamAV is responsive
          echo "Testing ClamAV connection..."
          timeout 10 bash -c 'echo "PING" | nc localhost 3310' || {
@@ -181,18 +269,13 @@ jobs:
          DATABASE_URL: ${{ steps.supabase.outputs.DB_URL }}
          DIRECT_URL: ${{ steps.supabase.outputs.DB_URL }}

-      - id: lint
-        name: Run Linter
-        run: poetry run lint
-
-      - name: Run pytest with coverage
+      - name: Run pytest
        run: |
          if [[ "${{ runner.debug }}" == "1" ]]; then
            poetry run pytest -s -vv -o log_cli=true -o log_cli_level=DEBUG
          else
            poetry run pytest -s -vv
          fi
-        if: success() || (failure() && steps.lint.outcome == 'failure')
        env:
          LOG_LEVEL: ${{ runner.debug && 'DEBUG' || 'INFO' }}
          DATABASE_URL: ${{ steps.supabase.outputs.DB_URL }}
@@ -204,6 +287,12 @@ jobs:
          REDIS_PORT: "6379"
          ENCRYPTION_KEY: "dvziYgz0KSK8FENhju0ZYi8-fRTfAdlz6YLhdB_jhNw=" # DO NOT USE IN PRODUCTION!!

+      # - name: Upload coverage reports to Codecov
+      #   uses: codecov/codecov-action@v4
+      #   with:
+      #     token: ${{ secrets.CODECOV_TOKEN }}
+      #     flags: backend,${{ runner.os }}
+
    env:
      CI: true
      PLAIN_OUTPUT: True
@@ -217,9 +306,3 @@ jobs:
      # the backend service, docker composes, and examples
      RABBITMQ_DEFAULT_USER: "rabbitmq_user_default"
      RABBITMQ_DEFAULT_PASS: "k0VMxyIJF9S35f3x2uaw5IWAl6Y536O7"
-
-      # - name: Upload coverage reports to Codecov
-      #   uses: codecov/codecov-action@v4
-      #   with:
-      #     token: ${{ secrets.CODECOV_TOKEN }}
-      #     flags: backend,${{ runner.os }}
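Reviewer sketch, not part of the workflow: the ClamAV wait step in this file is a bounded polling loop (probe the port, sleep, give up after a fixed number of attempts). The same pattern, under the assumption that a plain TCP connect is an adequate readiness probe, might look like this in Python. The names `port_open` and `wait_until_ready` are hypothetical, and the loop is a simplified equivalent rather than a line-by-line port of the shell.

```python
import socket
import time
from typing import Callable


def port_open(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    max_attempts: int = 60,
    delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe up to max_attempts times, sleeping delay_s after each failure."""
    for _ in range(max_attempts):
        if probe():
            return True
        sleep(delay_s)
    return False
```

A call such as `wait_until_ready(lambda: port_open("localhost", 3310))` would mirror the 60 x 5 s budget used in the step; the injectable `sleep` keeps the loop testable without real delays.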
--- a/.github/workflows/platform-dev-deploy-event-dispatcher.yml
+++ b/.github/workflows/platform-dev-deploy-event-dispatcher.yml
@@ -82,7 +82,7 @@ jobs:
          
      - name: Dispatch Deploy Event
        if: steps.check_status.outputs.should_deploy == 'true'
-        uses: peter-evans/repository-dispatch@v3
+        uses: peter-evans/repository-dispatch@v4
        with:
          token: ${{ secrets.DISPATCH_TOKEN }}
          repository: Significant-Gravitas/AutoGPT_cloud_infrastructure
@@ -110,7 +110,7 @@ jobs:

      - name: Dispatch Undeploy Event (from comment)
        if: steps.check_status.outputs.should_undeploy == 'true'
-        uses: peter-evans/repository-dispatch@v3
+        uses: peter-evans/repository-dispatch@v4
        with:
          token: ${{ secrets.DISPATCH_TOKEN }}
          repository: Significant-Gravitas/AutoGPT_cloud_infrastructure
@@ -168,7 +168,7 @@ jobs:
          github.event_name == 'pull_request' &&
          github.event.action == 'closed' &&
          steps.check_pr_close.outputs.should_undeploy == 'true'
-        uses: peter-evans/repository-dispatch@v3
+        uses: peter-evans/repository-dispatch@v4
        with:
          token: ${{ secrets.DISPATCH_TOKEN }}
          repository: Significant-Gravitas/AutoGPT_cloud_infrastructure
--- a/.github/workflows/platform-frontend-ci.yml
+++ b/.github/workflows/platform-frontend-ci.yml
@@ -6,10 +6,16 @@ on:
    paths:
      - ".github/workflows/platform-frontend-ci.yml"
      - "autogpt_platform/frontend/**"
+      - "autogpt_platform/backend/Dockerfile"
+      - "autogpt_platform/docker-compose.yml"
+      - "autogpt_platform/docker-compose.platform.yml"
  pull_request:
    paths:
      - ".github/workflows/platform-frontend-ci.yml"
      - "autogpt_platform/frontend/**"
+      - "autogpt_platform/backend/Dockerfile"
+      - "autogpt_platform/docker-compose.yml"
+      - "autogpt_platform/docker-compose.platform.yml"
  merge_group:
  workflow_dispatch:

@@ -26,12 +32,11 @@ jobs:
  setup:
    runs-on: ubuntu-latest
    outputs:
-      cache-key: ${{ steps.cache-key.outputs.key }}
      components-changed: ${{ steps.filter.outputs.components }}

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6

      - name: Check for component changes
        uses: dorny/paths-filter@v3
@@ -41,28 +46,17 @@ jobs:
            components:
              - 'autogpt_platform/frontend/src/components/**'

-      - name: Set up Node.js
-        uses: actions/setup-node@v6
-        with:
-          node-version: "22.18.0"
-
      - name: Enable corepack
        run: corepack enable

-      - name: Generate cache key
-        id: cache-key
-        run: echo "key=${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}" >> $GITHUB_OUTPUT
-
-      - name: Cache dependencies
-        uses: actions/cache@v5
+      - name: Set up Node
+        uses: actions/setup-node@v6
        with:
-          path: ~/.pnpm-store
-          key: ${{ steps.cache-key.outputs.key }}
-          restore-keys: |
-            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
-            ${{ runner.os }}-pnpm-
+          node-version: "22.18.0"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml

-      - name: Install dependencies
+      - name: Install dependencies to populate cache
        run: pnpm install --frozen-lockfile

  lint:
@@ -71,24 +65,17 @@ jobs:

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v4
-
-      - name: Set up Node.js
-        uses: actions/setup-node@v6
-        with:
-          node-version: "22.18.0"
+        uses: actions/checkout@v6

      - name: Enable corepack
        run: corepack enable

-      - name: Restore dependencies cache
-        uses: actions/cache@v5
+      - name: Set up Node
+        uses: actions/setup-node@v6
        with:
-          path: ~/.pnpm-store
-          key: ${{ needs.setup.outputs.cache-key }}
-          restore-keys: |
-            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
-            ${{ runner.os }}-pnpm-
+          node-version: "22.18.0"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml

      - name: Install dependencies
        run: pnpm install --frozen-lockfile
@@ -107,26 +94,19 @@ jobs:

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          fetch-depth: 0

-      - name: Set up Node.js
-        uses: actions/setup-node@v6
-        with:
-          node-version: "22.18.0"
-
      - name: Enable corepack
        run: corepack enable

-      - name: Restore dependencies cache
-        uses: actions/cache@v5
+      - name: Set up Node
+        uses: actions/setup-node@v6
        with:
-          path: ~/.pnpm-store
-          key: ${{ needs.setup.outputs.cache-key }}
-          restore-keys: |
-            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
-            ${{ runner.os }}-pnpm-
+          node-version: "22.18.0"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml

      - name: Install dependencies
        run: pnpm install --frozen-lockfile
@@ -140,163 +120,25 @@ jobs:
          token: ${{ secrets.GITHUB_TOKEN }}
          exitOnceUploaded: true

-  e2e_test:
-    runs-on: big-boi
-    needs: setup
-    strategy:
-      fail-fast: false
-
-    steps:
-      - name: Checkout repository
-        uses: actions/checkout@v4
-        with:
-          submodules: recursive
-
-      - name: Set up Node.js
-        uses: actions/setup-node@v6
-        with:
-          node-version: "22.18.0"
-
-      - name: Enable corepack
-        run: corepack enable
-
-      - name: Copy default supabase .env
-        run: |
-          cp ../.env.default ../.env
-
-      - name: Copy backend .env and set OpenAI API key
-        run: |
-          cp ../backend/.env.default ../backend/.env
-          echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
-        env:
-          # Used by E2E test data script to generate embeddings for approved store agents
-          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
-
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3
-
-      - name: Cache Docker layers
-        uses: actions/cache@v5
-        with:
-          path: /tmp/.buildx-cache
-          key: ${{ runner.os }}-buildx-frontend-test-${{ hashFiles('autogpt_platform/docker-compose.yml', 'autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/pyproject.toml', 'autogpt_platform/backend/poetry.lock') }}
-          restore-keys: |
-            ${{ runner.os }}-buildx-frontend-test-
-
-      - name: Run docker compose
-        run: |
-          NEXT_PUBLIC_PW_TEST=true docker compose -f ../docker-compose.yml up -d
-        env:
-          DOCKER_BUILDKIT: 1
-          BUILDX_CACHE_FROM: type=local,src=/tmp/.buildx-cache
-          BUILDX_CACHE_TO: type=local,dest=/tmp/.buildx-cache-new,mode=max
-
-      - name: Move cache
-        run: |
-          rm -rf /tmp/.buildx-cache
-          if [ -d "/tmp/.buildx-cache-new" ]; then
-            mv /tmp/.buildx-cache-new /tmp/.buildx-cache
-          fi
-
-      - name: Wait for services to be ready
-        run: |
-          echo "Waiting for rest_server to be ready..."
-          timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
-          echo "Waiting for database to be ready..."
-          timeout 60 sh -c 'until docker compose -f ../docker-compose.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done' || echo "Database ready check timeout, continuing..."
-
-      - name: Create E2E test data
-        run: |
-          echo "Creating E2E test data..."
-          # First try to run the script from inside the container
-          if docker compose -f ../docker-compose.yml exec -T rest_server test -f /app/autogpt_platform/backend/test/e2e_test_data.py; then
-            echo "✅ Found e2e_test_data.py in container, running it..."
-            docker compose -f ../docker-compose.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python backend/test/e2e_test_data.py" || {
-              echo "❌ E2E test data creation failed!"
-              docker compose -f ../docker-compose.yml logs --tail=50 rest_server
-              exit 1
-            }
-          else
-            echo "⚠️ e2e_test_data.py not found in container, copying and running..."
-            # Copy the script into the container and run it
-            docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.yml ps -q rest_server):/tmp/e2e_test_data.py || {
-              echo "❌ Failed to copy script to container"
-              exit 1
-            }
-            docker compose -f ../docker-compose.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
-              echo "❌ E2E test data creation failed!"
-              docker compose -f ../docker-compose.yml logs --tail=50 rest_server
-              exit 1
-            }
-          fi
-
-      - name: Restore dependencies cache
-        uses: actions/cache@v5
-        with:
-          path: ~/.pnpm-store
-          key: ${{ needs.setup.outputs.cache-key }}
-          restore-keys: |
-            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
-            ${{ runner.os }}-pnpm-
-
-      - name: Install dependencies
-        run: pnpm install --frozen-lockfile
-
-      - name: Install Browser 'chromium'
-        run: pnpm playwright install --with-deps chromium
-
-      - name: Run Playwright tests
-        run: pnpm test:no-build
-        continue-on-error: false
-
-      - name: Upload Playwright report
-        if: always()
-        uses: actions/upload-artifact@v4
-        with:
-          name: playwright-report
-          path: playwright-report
-          if-no-files-found: ignore
-          retention-days: 3
-
-      - name: Upload Playwright test results
-        if: always()
-        uses: actions/upload-artifact@v4
-        with:
-          name: playwright-test-results
-          path: test-results
-          if-no-files-found: ignore
-          retention-days: 3
-
-      - name: Print Final Docker Compose logs
-        if: always()
-        run: docker compose -f ../docker-compose.yml logs
-
  integration_test:
    runs-on: ubuntu-latest
    needs: setup

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          submodules: recursive

-      - name: Set up Node.js
-        uses: actions/setup-node@v6
-        with:
-          node-version: "22.18.0"
-
      - name: Enable corepack
        run: corepack enable

-      - name: Restore dependencies cache
-        uses: actions/cache@v5
+      - name: Set up Node
+        uses: actions/setup-node@v6
        with:
-          path: ~/.pnpm-store
-          key: ${{ needs.setup.outputs.cache-key }}
-          restore-keys: |
-            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
-            ${{ runner.os }}-pnpm-
+          node-version: "22.18.0"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml

      - name: Install dependencies
        run: pnpm install --frozen-lockfile
--- a/.github/workflows/platform-fullstack-ci.yml
+++ b/.github/workflows/platform-fullstack-ci.yml
@@ -1,14 +1,18 @@
-name: AutoGPT Platform - Frontend CI
+name: AutoGPT Platform - Full-stack CI

 on:
  push:
    branches: [master, dev]
    paths:
      - ".github/workflows/platform-fullstack-ci.yml"
+      - ".github/workflows/scripts/docker-ci-fix-compose-build-cache.py"
+      - ".github/workflows/scripts/get_package_version_from_lockfile.py"
      - "autogpt_platform/**"
  pull_request:
    paths:
      - ".github/workflows/platform-fullstack-ci.yml"
+      - ".github/workflows/scripts/docker-ci-fix-compose-build-cache.py"
+      - ".github/workflows/scripts/get_package_version_from_lockfile.py"
      - "autogpt_platform/**"
  merge_group:

@@ -24,113 +28,285 @@ defaults:
 jobs:
  setup:
    runs-on: ubuntu-latest
-    outputs:
-      cache-key: ${{ steps.cache-key.outputs.key }}

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v4
-
-      - name: Set up Node.js
-        uses: actions/setup-node@v6
-        with:
-          node-version: "22.18.0"
+        uses: actions/checkout@v6

      - name: Enable corepack
        run: corepack enable

-      - name: Generate cache key
-        id: cache-key
-        run: echo "key=${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}" >> $GITHUB_OUTPUT
-
-      - name: Cache dependencies
-        uses: actions/cache@v5
+      - name: Set up Node
+        uses: actions/setup-node@v6
        with:
-          path: ~/.pnpm-store
-          key: ${{ steps.cache-key.outputs.key }}
-          restore-keys: |
-            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
-            ${{ runner.os }}-pnpm-
+          node-version: "22.18.0"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml

-      - name: Install dependencies
+      - name: Install dependencies to populate cache
        run: pnpm install --frozen-lockfile

-  types:
-    runs-on: big-boi
+  check-api-types:
+    name: check API types
+    runs-on: ubuntu-latest
    needs: setup
-    strategy:
-      fail-fast: false

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v4
+        uses: actions/checkout@v6
        with:
          submodules: recursive

-      - name: Set up Node.js
+      # ------------------------ Backend setup ------------------------
+
+      - name: Set up Backend - Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Set up Backend - Install Poetry
+        working-directory: autogpt_platform/backend
+        run: |
+          POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
+          echo "Installing Poetry version ${POETRY_VERSION}"
+          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$POETRY_VERSION python3 -
+
+      - name: Set up Backend - Set up dependency cache
+        uses: actions/cache@v5
+        with:
+          path: ~/.cache/pypoetry
+          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
+
+      - name: Set up Backend - Install dependencies
+        working-directory: autogpt_platform/backend
+        run: poetry install
+
+      - name: Set up Backend - Generate Prisma client
+        working-directory: autogpt_platform/backend
+        run: poetry run prisma generate && poetry run gen-prisma-stub
+
+      - name: Set up Frontend - Export OpenAPI schema from Backend
+        working-directory: autogpt_platform/backend
+        run: poetry run export-api-schema --output ../frontend/src/app/api/openapi.json
+
+      # ------------------------ Frontend setup ------------------------
+
+      - name: Set up Frontend - Enable corepack
+        run: corepack enable
+
+      - name: Set up Frontend - Set up Node
        uses: actions/setup-node@v6
        with:
          node-version: "22.18.0"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml

-      - name: Enable corepack
-        run: corepack enable
-
-      - name: Copy default supabase .env
-        run: |
-          cp ../.env.default ../.env
-
-      - name: Copy backend .env
-        run: |
-          cp ../backend/.env.default ../backend/.env
-
-      - name: Run docker compose
-        run: |
-          docker compose -f ../docker-compose.yml --profile local up -d deps_backend
-
-      - name: Restore dependencies cache
-        uses: actions/cache@v5
-        with:
-          path: ~/.pnpm-store
-          key: ${{ needs.setup.outputs.cache-key }}
-          restore-keys: |
-            ${{ runner.os }}-pnpm-
-
-      - name: Install dependencies
+      - name: Set up Frontend - Install dependencies
        run: pnpm install --frozen-lockfile

-      - name: Setup .env
-        run: cp .env.default .env
-
-      - name: Wait for services to be ready
-        run: |
-          echo "Waiting for rest_server to be ready..."
-          timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
-          echo "Waiting for database to be ready..."
-          timeout 60 sh -c 'until docker compose -f ../docker-compose.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done' || echo "Database ready check timeout, continuing..."
-
-      - name: Generate API queries
-        run: pnpm generate:api:force
+      - name: Set up Frontend - Format OpenAPI schema
+        id: format-schema
+        run: pnpm prettier --write ./src/app/api/openapi.json

      - name: Check for API schema changes
        run: |
          if ! git diff --exit-code src/app/api/openapi.json; then
            echo "❌ API schema changes detected in src/app/api/openapi.json"
            echo ""
-            echo "The openapi.json file has been modified after running 'pnpm generate:api-all'."
+            echo "The openapi.json file has been modified after exporting the API schema."
            echo "This usually means changes have been made in the BE endpoints without updating the Frontend."
            echo "The API schema is now out of sync with the Front-end queries."
            echo ""
            echo "To fix this:"
-            echo "1. Pull the backend 'docker compose pull && docker compose up -d --build --force-recreate'"
-            echo "2. Run 'pnpm generate:api' locally"
-            echo "3. Run 'pnpm types' locally"
-            echo "4. Fix any TypeScript errors that may have been introduced"
-            echo "5. Commit and push your changes"
+            printf '\nIn the backend directory:\n'
+            echo "1. Run 'poetry run export-api-schema --output ../frontend/src/app/api/openapi.json'"
+            printf '\nIn the frontend directory:\n'
+            echo "2. Run 'pnpm prettier --write src/app/api/openapi.json'"
+            echo "3. Run 'pnpm generate:api'"
+            echo "4. Run 'pnpm types'"
+            echo "5. Fix any TypeScript errors that may have been introduced"
+            echo "6. Commit and push your changes"
            echo ""
            exit 1
          else
            echo "✅ No API schema changes detected"
          fi

-      - name: Run Typescript checks
+      - name: Set up Frontend - Generate API client
+        id: generate-api-client
+        run: pnpm orval --config ./orval.config.ts
+        # Continue with type generation & check even if there are schema changes
+        if: success() || (steps.format-schema.outcome == 'success')
+
+      - name: Check for TypeScript errors
        run: pnpm types
+        if: success() || (steps.generate-api-client.outcome == 'success')
+
+  e2e_test:
+    name: end-to-end tests
+    runs-on: big-boi
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v6
+        with:
+          submodules: recursive
+
+      - name: Set up Platform - Copy default supabase .env
+        run: |
+          cp ../.env.default ../.env
+
+      - name: Set up Platform - Copy backend .env and set OpenAI API key
+        run: |
+          cp ../backend/.env.default ../backend/.env
+          echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
+        env:
+          # Used by E2E test data script to generate embeddings for approved store agents
+          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+
+      - name: Set up Platform - Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+        with:
+          driver: docker-container
+          driver-opts: network=host
+
+      - name: Set up Platform - Expose GHA cache to docker buildx CLI
+        uses: crazy-max/ghaction-github-runtime@v4
+
+      - name: Set up Platform - Build Docker images (with cache)
+        working-directory: autogpt_platform
+        run: |
+          pip install pyyaml
+
+          # Resolve extends and generate a flat compose file that bake can understand
+          docker compose -f docker-compose.yml config > docker-compose.resolved.yml
+
+          # Add cache configuration to the resolved compose file
+          python ../.github/workflows/scripts/docker-ci-fix-compose-build-cache.py \
+            --source docker-compose.resolved.yml \
+            --cache-from "type=gha" \
+            --cache-to "type=gha,mode=max" \
+            --backend-hash "${{ hashFiles('autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/poetry.lock', 'autogpt_platform/backend/backend/**') }}" \
+            --frontend-hash "${{ hashFiles('autogpt_platform/frontend/Dockerfile', 'autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/src/**') }}" \
+            --git-ref "${{ github.ref }}"
+
+          # Build with bake using the resolved compose file (now includes cache config)
+          docker buildx bake --allow=fs.read=.. -f docker-compose.resolved.yml --load
+        env:
+          NEXT_PUBLIC_PW_TEST: true
+
+      - name: Set up tests - Cache E2E test data
+        id: e2e-data-cache
+        uses: actions/cache@v5
+        with:
+          path: /tmp/e2e_test_data.sql
+          key: e2e-test-data-${{ hashFiles('autogpt_platform/backend/test/e2e_test_data.py', 'autogpt_platform/backend/migrations/**', '.github/workflows/platform-fullstack-ci.yml') }}
+
+      - name: Set up Platform - Start Supabase DB + Auth
+        run: |
+          docker compose -f ../docker-compose.resolved.yml up -d db auth --no-build
+          echo "Waiting for database to be ready..."
+          timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done'
+          echo "Waiting for auth service to be ready..."
+          timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -c "SELECT 1 FROM auth.users LIMIT 1" 2>/dev/null; do sleep 2; done' || echo "Auth schema check timeout, continuing..."
+
+      - name: Set up Platform - Run migrations
+        run: |
+          echo "Running migrations..."
+          docker compose -f ../docker-compose.resolved.yml run --rm migrate
+          echo "✅ Migrations completed"
+        env:
+          NEXT_PUBLIC_PW_TEST: true
+
+      - name: Set up tests - Load cached E2E test data
+        if: steps.e2e-data-cache.outputs.cache-hit == 'true'
+        run: |
+          echo "✅ Found cached E2E test data, restoring..."
+          {
+            echo "SET session_replication_role = 'replica';"
+            cat /tmp/e2e_test_data.sql
+            echo "SET session_replication_role = 'origin';"
+          } | docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -b
+          # Refresh materialized views after restore
+          docker compose -f ../docker-compose.resolved.yml exec -T db \
+            psql -U postgres -d postgres -b -c "SET search_path TO platform; SELECT refresh_store_materialized_views();" || true
+
+          echo "✅ E2E test data restored from cache"
+
+      - name: Set up Platform - Start (all other services)
+        run: |
+          docker compose -f ../docker-compose.resolved.yml up -d --no-build
+          echo "Waiting for rest_server to be ready..."
+          timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
+        env:
+          NEXT_PUBLIC_PW_TEST: true
+
+      - name: Set up tests - Create E2E test data
+        if: steps.e2e-data-cache.outputs.cache-hit != 'true'
+        run: |
+          echo "Creating E2E test data..."
+          docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.resolved.yml ps -q rest_server):/tmp/e2e_test_data.py
+          docker compose -f ../docker-compose.resolved.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
+            echo "❌ E2E test data creation failed!"
+            docker compose -f ../docker-compose.resolved.yml logs --tail=50 rest_server
+            exit 1
+          }
+
+          # Dump auth.users + platform schema for cache (two separate dumps)
+          echo "Dumping database for cache..."
+          {
+            docker compose -f ../docker-compose.resolved.yml exec -T db \
+              pg_dump -U postgres --data-only --column-inserts \
+              --table='auth.users' postgres
+            docker compose -f ../docker-compose.resolved.yml exec -T db \
+              pg_dump -U postgres --data-only --column-inserts \
+              --schema=platform \
+              --exclude-table='platform._prisma_migrations' \
+              --exclude-table='platform.apscheduler_jobs' \
+              --exclude-table='platform.apscheduler_jobs_batched_notifications' \
+              postgres
+          } > /tmp/e2e_test_data.sql
+
+          echo "✅ Database dump created for caching ($(wc -l < /tmp/e2e_test_data.sql) lines)"
+
+      - name: Set up tests - Enable corepack
+        run: corepack enable
+
+      - name: Set up tests - Set up Node
+        uses: actions/setup-node@v6
+        with:
+          node-version: "22.18.0"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+
+      - name: Set up tests - Install dependencies
+        run: pnpm install --frozen-lockfile
+
+      - name: Set up tests - Install browser 'chromium'
+        run: pnpm playwright install --with-deps chromium
+
+      - name: Run Playwright tests
+        run: pnpm test:no-build
+        continue-on-error: false
+
+      - name: Upload Playwright report
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: playwright-report
+          path: autogpt_platform/frontend/playwright-report
+          if-no-files-found: ignore
+          retention-days: 3
+
+      - name: Upload Playwright test results
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: playwright-test-results
+          path: autogpt_platform/frontend/test-results
+          if-no-files-found: ignore
+          retention-days: 3
+
+      - name: Print Final Docker Compose logs
+        if: always()
+        run: docker compose -f ../docker-compose.resolved.yml logs
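
The "Load cached E2E test data" step above replays the cached dump between two `SET session_replication_role` statements. In PostgreSQL, the `replica` role stops ordinary triggers from firing, including the internal triggers that enforce foreign keys, which is presumably why a data-only dump can be replayed here without ordering the inserts by dependency. A minimal sketch of that wrapping, for illustration only: the helper name `wrap_restore_sql` is hypothetical and is not part of the workflow.

```python
def wrap_restore_sql(dump_sql: str) -> str:
    """Wrap a data-only SQL dump so it is replayed with triggers disabled.

    Hypothetical helper mirroring the shell block in the workflow step:
    the dump is preceded by a switch to the 'replica' role and followed
    by a switch back to the default 'origin' role.
    """
    lines = [
        "SET session_replication_role = 'replica';",
        dump_sql.rstrip("\n"),
        "SET session_replication_role = 'origin';",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example input; the real dump comes from pg_dump --data-only --column-inserts.
    print(wrap_restore_sql("INSERT INTO platform.example (id) VALUES (1);\n"), end="")
```

Because the role is set per session, the setting only holds if the whole script is piped into a single `psql` invocation, as the workflow step does.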
--- a/.github/workflows/pr-overlap-check.yml
+++ b/.github/workflows/pr-overlap-check.yml
@@ -0,0 +1,39 @@
+name: PR Overlap Detection
+
+on:
+  pull_request:
+    types: [opened, synchronize, reopened]
+    branches:
+      - dev
+      - master
+
+permissions:
+  contents: read
+  pull-requests: write
+
+jobs:
+  check-overlaps:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0  # Need full history for merge testing
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Configure git
+        run: |
+          git config user.email "github-actions[bot]@users.noreply.github.com"
+          git config user.name "github-actions[bot]"
+
+      - name: Run overlap detection
+        env:
+          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        # Always succeed: this check informs contributors and shouldn't block merging
+        continue-on-error: true
+        run: |
+          python .github/scripts/detect_overlaps.py ${{ github.event.pull_request.number }}
--- a/.github/workflows/repo-workflow-checker.yml
+++ b/.github/workflows/repo-workflow-checker.yml
@@ -11,7 +11,7 @@ jobs:
    steps:
      # - name: Wait some time for all actions to start
      #   run: sleep 30
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6
        # with:
          # fetch-depth: 0
      - name: Set up Python
--- a/.github/workflows/scripts/docker-ci-fix-compose-build-cache.py
+++ b/.github/workflows/scripts/docker-ci-fix-compose-build-cache.py
@@ -0,0 +1,195 @@
+#!/usr/bin/env python3
+"""
+Add cache configuration to a resolved docker-compose file for all services
+that have a build key, and ensure image names match what docker compose expects.
+"""
+
+import argparse
+
+import yaml
+
+
+DEFAULT_BRANCH = "dev"
+CACHE_BUILDS_FOR_COMPONENTS = ["backend", "frontend"]
+
+
+def main():
+    parser = argparse.ArgumentParser(
+        description="Add cache config to a resolved compose file"
+    )
+    parser.add_argument(
+        "--source",
+        required=True,
+        help="Source compose file to read (should be output of `docker compose config`)",
+    )
+    parser.add_argument(
+        "--cache-from",
+        default="type=gha",
+        help="Cache source configuration",
+    )
+    parser.add_argument(
+        "--cache-to",
+        default="type=gha,mode=max",
+        help="Cache destination configuration",
+    )
+    for component in CACHE_BUILDS_FOR_COMPONENTS:
+        parser.add_argument(
+            f"--{component}-hash",
+            default="",
+            help=f"Hash for {component} cache scope (e.g., from hashFiles())",
+        )
+    parser.add_argument(
+        "--git-ref",
+        default="",
+        help="Git ref for branch-based cache scope (e.g., refs/heads/master)",
+    )
+    args = parser.parse_args()
+
+    # Normalize git ref to a safe scope name (e.g., refs/heads/master -> master)
+    git_ref_scope = ""
+    if args.git_ref:
+        git_ref_scope = args.git_ref.replace("refs/heads/", "").replace("/", "-")
+
+    with open(args.source, "r") as f:
+        compose = yaml.safe_load(f)
+
+    # Get project name from compose file or default
+    project_name = compose.get("name", "autogpt_platform")
+
+    def get_image_name(dockerfile: str, target: str) -> str:
+        """Generate image name based on Dockerfile folder and build target."""
+        dockerfile_parts = dockerfile.replace("\\", "/").split("/")
+        if len(dockerfile_parts) >= 2:
+            folder_name = dockerfile_parts[-2]  # e.g., "backend" or "frontend"
+        else:
+            folder_name = "app"
+        return f"{project_name}-{folder_name}:{target}"
+
+    def get_build_key(dockerfile: str, target: str) -> str:
+        """Generate a unique key for a Dockerfile+target combination."""
+        return f"{dockerfile}:{target}"
+
+    def get_component(dockerfile: str) -> str | None:
+        """Get component name (frontend/backend) from dockerfile path."""
+        for component in CACHE_BUILDS_FOR_COMPONENTS:
+            if component in dockerfile:
+                return component
+        return None
+
+    # First pass: collect all services with build configs and identify duplicates
+    # Track which (dockerfile, target) combinations we've seen
+    build_key_to_first_service: dict[str, str] = {}
+    services_to_build: list[str] = []
+    services_to_dedupe: list[str] = []
+
+    for service_name, service_config in compose.get("services", {}).items():
+        if "build" not in service_config:
+            continue
+
+        build_config = service_config["build"]
+        dockerfile = build_config.get("dockerfile", "Dockerfile")
+        target = build_config.get("target", "latest")
+        build_key = get_build_key(dockerfile, target)
+
+        if build_key not in build_key_to_first_service:
+            # First service with this build config - it will do the actual build
+            build_key_to_first_service[build_key] = service_name
+            services_to_build.append(service_name)
+        else:
+            # Duplicate - will just use the image from the first service
+            services_to_dedupe.append(service_name)
+
+    # Second pass: configure builds and deduplicate
+    modified_services = []
+    for service_name, service_config in compose.get("services", {}).items():
+        if "build" not in service_config:
+            continue
+
+        build_config = service_config["build"]
+        dockerfile = build_config.get("dockerfile", "Dockerfile")
+        target = build_config.get("target", "latest")
+        image_name = get_image_name(dockerfile, target)
+
+        # Set image name for all services (needed for both builders and deduped)
+        service_config["image"] = image_name
+
+        if service_name in services_to_dedupe:
+            # Remove build config - this service will use the pre-built image
+            del service_config["build"]
+            continue
+
+        # This service will do the actual build - add cache config
+        cache_from_list = []
+        cache_to_list = []
+
+        component = get_component(dockerfile)
+        if not component:
+            # Skip services that don't clearly match frontend/backend
+            continue
+
+        # Get the hash for this component
+        component_hash = getattr(args, f"{component}_hash")
+
+        # Scope format: platform-{component}-{target}-{hash|ref}
+        # Example: platform-backend-server-abc123
+
+        if "type=gha" in args.cache_from:
+            # 1. Primary: exact hash match (most specific)
+            if component_hash:
+                hash_scope = f"platform-{component}-{target}-{component_hash}"
+                cache_from_list.append(f"{args.cache_from},scope={hash_scope}")
+
+            # 2. Fallback: branch-based cache
+            if git_ref_scope:
+                ref_scope = f"platform-{component}-{target}-{git_ref_scope}"
+                cache_from_list.append(f"{args.cache_from},scope={ref_scope}")
+
+            # 3. Fallback: dev branch cache (for PRs/feature branches)
+            if git_ref_scope and git_ref_scope != DEFAULT_BRANCH:
+                default_scope = f"platform-{component}-{target}-{DEFAULT_BRANCH}"
+                cache_from_list.append(f"{args.cache_from},scope={default_scope}")
+
+        if "type=gha" in args.cache_to:
+            # Write to both hash-based and branch-based scopes
+            if component_hash:
+                hash_scope = f"platform-{component}-{target}-{component_hash}"
+                cache_to_list.append(f"{args.cache_to},scope={hash_scope}")
+
+            if git_ref_scope:
+                ref_scope = f"platform-{component}-{target}-{git_ref_scope}"
+                cache_to_list.append(f"{args.cache_to},scope={ref_scope}")
+
+        # Ensure we have at least one cache source/target
+        if not cache_from_list:
+            cache_from_list.append(args.cache_from)
+        if not cache_to_list:
+            cache_to_list.append(args.cache_to)
+
+        build_config["cache_from"] = cache_from_list
+        build_config["cache_to"] = cache_to_list
+        modified_services.append(service_name)
+
+    # Write back to the same file
+    with open(args.source, "w") as f:
+        yaml.dump(compose, f, default_flow_style=False, sort_keys=False)
+
+    print(f"Added cache config to {len(modified_services)} services in {args.source}:")
+    for svc in modified_services:
+        svc_config = compose["services"][svc]
+        build_cfg = svc_config.get("build", {})
+        cache_from_list = build_cfg.get("cache_from", ["none"])
+        cache_to_list = build_cfg.get("cache_to", ["none"])
+        print(f"  - {svc}")
+        print(f"      image: {svc_config.get('image', 'N/A')}")
+        print(f"      cache_from: {cache_from_list}")
+        print(f"      cache_to: {cache_to_list}")
+    if services_to_dedupe:
+        print(
+            f"Deduplicated {len(services_to_dedupe)} services (will use pre-built images):"
+        )
+        for svc in services_to_dedupe:
+            print(f"  - {svc} -> {compose['services'][svc].get('image', 'N/A')}")
+
+
+if __name__ == "__main__":
+    main()
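
The scope selection in the script above can be read in isolation. The following is a minimal sketch, not the script itself: `cache_scopes` is a hypothetical helper name, and it returns only the scope names, without the `type=gha,...` prefix the script prepends. It follows the same order as the script: exact hash first, then the current ref, then the default branch as a last read-only fallback.

```python
DEFAULT_BRANCH = "dev"  # same value as the constant in the script


def cache_scopes(
    component: str, target: str, component_hash: str, git_ref: str
) -> tuple[list[str], list[str]]:
    """Return (read_scopes, write_scopes) for one build, in lookup order.

    Hypothetical helper: it reproduces only the scope naming, assuming
    both --cache-from and --cache-to are of type gha.
    """
    # Same normalization as the script: strip refs/heads/, flatten slashes.
    ref_scope = git_ref.replace("refs/heads/", "").replace("/", "-") if git_ref else ""
    prefix = f"platform-{component}-{target}"

    read_scopes: list[str] = []
    write_scopes: list[str] = []

    if component_hash:
        # Most specific: inputs are byte-identical to a previous build.
        read_scopes.append(f"{prefix}-{component_hash}")
        write_scopes.append(f"{prefix}-{component_hash}")
    if ref_scope:
        # Fallback: the last build on the same ref.
        read_scopes.append(f"{prefix}-{ref_scope}")
        write_scopes.append(f"{prefix}-{ref_scope}")
    if ref_scope and ref_scope != DEFAULT_BRANCH:
        # Last resort, read-only: the default branch's cache.
        read_scopes.append(f"{prefix}-{DEFAULT_BRANCH}")

    return read_scopes, write_scopes


if __name__ == "__main__":
    reads, writes = cache_scopes("backend", "server", "abc123", "refs/heads/feature/x")
    print("read: ", reads)
    print("write:", writes)
```

One consequence of the normalization: only the `refs/heads/` prefix is stripped, so a pull request ref such as `refs/pull/12/merge` becomes the scope suffix `refs-pull-12-merge` rather than a branch name.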
--- a/.gitignore
+++ b/.gitignore
@@ -180,4 +180,6 @@ autogpt_platform/backend/settings.py
 .claude/settings.local.json
 CLAUDE.local.md
 /autogpt_platform/backend/logs
-.next
+.next
+# Implementation plans (generated by AI agents)
+plans/
--- a/.nvmrc
+++ b/.nvmrc
@@ -0,0 +1 @@
+22
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -1,3 +1,10 @@
+default_install_hook_types:
+  - pre-commit
+  - pre-push
+  - post-checkout
+
+default_stages: [pre-commit]
+
 repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
@@ -17,6 +24,7 @@ repos:
        name: Detect secrets
        description: Detects high entropy strings that are likely to be passwords.
        files: ^autogpt_platform/
+        exclude: pnpm-lock\.yaml$
        stages: [pre-push]

  - repo: local
@@ -26,49 +34,106 @@ repos:
      - id: poetry-install
        name: Check & Install dependencies - AutoGPT Platform - Backend
        alias: poetry-install-platform-backend
-        entry: poetry -C autogpt_platform/backend install
        # include autogpt_libs source (since it's a path dependency)
-        files: ^autogpt_platform/(backend|autogpt_libs)/poetry\.lock$
-        types: [file]
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^autogpt_platform/(backend|autogpt_libs)/poetry\.lock$" || exit 0;
+          poetry -C autogpt_platform/backend install
+          '
+        always_run: true
        language: system
        pass_filenames: false
+        stages: [pre-commit, post-checkout]

      - id: poetry-install
        name: Check & Install dependencies - AutoGPT Platform - Libs
        alias: poetry-install-platform-libs
-        entry: poetry -C autogpt_platform/autogpt_libs install
-        files: ^autogpt_platform/autogpt_libs/poetry\.lock$
-        types: [file]
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^autogpt_platform/autogpt_libs/poetry\.lock$" || exit 0;
+          poetry -C autogpt_platform/autogpt_libs install
+          '
+        always_run: true
        language: system
        pass_filenames: false
+        stages: [pre-commit, post-checkout]
+
+      - id: pnpm-install
+        name: Check & Install dependencies - AutoGPT Platform - Frontend
+        alias: pnpm-install-platform-frontend
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^autogpt_platform/frontend/pnpm-lock\.yaml$" || exit 0;
+          pnpm --prefix autogpt_platform/frontend install
+          '
+        always_run: true
+        language: system
+        pass_filenames: false
+        stages: [pre-commit, post-checkout]

      - id: poetry-install
        name: Check & Install dependencies - Classic - AutoGPT
        alias: poetry-install-classic-autogpt
-        entry: poetry -C classic/original_autogpt install
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^classic/(original_autogpt|forge)/poetry\.lock$" || exit 0;
+          poetry -C classic/original_autogpt install
+          '
        # include forge source (since it's a path dependency)
-        files: ^classic/(original_autogpt|forge)/poetry\.lock$
-        types: [file]
+        always_run: true
        language: system
        pass_filenames: false
+        stages: [pre-commit, post-checkout]

      - id: poetry-install
        name: Check & Install dependencies - Classic - Forge
        alias: poetry-install-classic-forge
-        entry: poetry -C classic/forge install
-        files: ^classic/forge/poetry\.lock$
-        types: [file]
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^classic/forge/poetry\.lock$" || exit 0;
+          poetry -C classic/forge install
+          '
+        always_run: true
        language: system
        pass_filenames: false
+        stages: [pre-commit, post-checkout]

      - id: poetry-install
        name: Check & Install dependencies - Classic - Benchmark
        alias: poetry-install-classic-benchmark
-        entry: poetry -C classic/benchmark install
-        files: ^classic/benchmark/poetry\.lock$
-        types: [file]
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^classic/benchmark/poetry\.lock$" || exit 0;
+          poetry -C classic/benchmark install
+          '
+        always_run: true
        language: system
        pass_filenames: false
+        stages: [pre-commit, post-checkout]

  - repo: local
    # For proper type checking, Prisma client must be up-to-date.
@@ -76,12 +141,54 @@ repos:
      - id: prisma-generate
        name: Prisma Generate - AutoGPT Platform - Backend
        alias: prisma-generate-platform-backend
-        entry: bash -c 'cd autogpt_platform/backend && poetry run prisma generate'
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^autogpt_platform/((backend|autogpt_libs)/poetry\.lock|backend/schema\.prisma)$" || exit 0;
+          cd autogpt_platform/backend
+          && poetry run prisma generate
+          && poetry run gen-prisma-stub
+          '
        # include everything that triggers poetry install + the prisma schema
-        files: ^autogpt_platform/((backend|autogpt_libs)/poetry\.lock|backend/schema.prisma)$
-        types: [file]
+        always_run: true
        language: system
        pass_filenames: false
+        stages: [pre-commit, post-checkout]
+
+      - id: export-api-schema
+        name: Export API schema - AutoGPT Platform - Backend -> Frontend
+        alias: export-api-schema-platform
+        entry: >
+          bash -c '
+          cd autogpt_platform/backend
+          && poetry run export-api-schema --output ../frontend/src/app/api/openapi.json
+          && cd ../frontend
+          && pnpm prettier --write ./src/app/api/openapi.json
+          '
+        files: ^autogpt_platform/backend/
+        language: system
+        pass_filenames: false
+
+      - id: generate-api-client
+        name: Generate API client - AutoGPT Platform - Frontend
+        alias: generate-api-client-platform-frontend
+        entry: >
+          bash -c '
+          SCHEMA=autogpt_platform/frontend/src/app/api/openapi.json;
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --quiet "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF" -- "$SCHEMA" && exit 0
+          else
+            git diff --quiet HEAD -- "$SCHEMA" && exit 0
+          fi;
+          cd autogpt_platform/frontend && pnpm generate:api
+          '
+        always_run: true
+        language: system
+        pass_filenames: false
+        stages: [pre-commit, post-checkout]

  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.7.2
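
The install hooks in the `.pre-commit-config.yaml` diff above share one gate. They inspect the ref range when pre-commit exports `PRE_COMMIT_FROM_REF` and `PRE_COMMIT_TO_REF`, fall back to the staged diff otherwise, and skip the install unless a matching lock file changed. Below is a minimal Python sketch of that gate, for illustration only. The function names `diff_command`, `lockfile_changed`, and `should_install` are hypothetical; the hooks themselves remain the bash one-liners shown in the diff.

```python
import os
import re
import subprocess


def diff_command(env):
    """Return the git command whose output lists the files to inspect."""
    from_ref = env.get("PRE_COMMIT_FROM_REF")
    if from_ref:
        # A ref range is available: compare the two refs.
        to_ref = env.get("PRE_COMMIT_TO_REF", "")
        return ["git", "diff", "--name-only", from_ref, to_ref]
    # Plain commit: inspect what is staged.
    return ["git", "diff", "--cached", "--name-only"]


def lockfile_changed(files, pattern):
    """Mirror `grep -qE`: true if any changed path matches the pattern."""
    rx = re.compile(pattern)
    return any(rx.search(path) for path in files)


def should_install(pattern, env=os.environ):
    """Run the chosen git diff and report whether an install is needed."""
    result = subprocess.run(
        diff_command(env), capture_output=True, text=True, check=True
    )
    return lockfile_changed(result.stdout.splitlines(), pattern)
```

Because every hook sets `always_run: true`, the gate is what keeps the install from running on every commit, which is the job `files:` did before the change.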
--- a/autogpt_platform/.gitignore
+++ b/autogpt_platform/.gitignore
@@ -1,2 +1,3 @@
 *.ignore.*
-*.ign.*
+*.ign.*
+.application.logs
--- a/autogpt_platform/CLAUDE.md
+++ b/autogpt_platform/CLAUDE.md
@@ -45,19 +45,47 @@ AutoGPT Platform is a monorepo containing:
 - Backend/Frontend services use YAML anchors for consistent configuration
 - Supabase services (`db/docker/docker-compose.yml`) follow the same pattern

+### Branching Strategy
+
+- **`dev`** is the main development branch. All PRs should target `dev`.
+- **`master`** is the production branch. Only used for production releases.
+
 ### Creating Pull Requests

 - Create the PR against the `dev` branch of the repository.
 - Ensure the branch name is descriptive (e.g., `feature/add-new-block`)
 - Use conventional commit messages (see below)
 - Fill out the .github/PULL_REQUEST_TEMPLATE.md template as the PR description
+- Always pass the PR body with `--body-file` to avoid shell interpretation of backticks and special characters:
+  ```bash
+  PR_BODY=$(mktemp)
+  cat > "$PR_BODY" << 'PREOF'
+  ## Summary
+  - use `backticks` freely here
+  PREOF
+  gh pr create --title "..." --body-file "$PR_BODY" --base dev
+  rm "$PR_BODY"
+  ```
 - Run the github pre-commit hooks to ensure code quality.

+### Test-Driven Development (TDD)
+
+When fixing a bug or adding a feature, follow a test-first approach:
+
+1. **Write a failing test first** — create a test that reproduces the bug or validates the new behavior, marked with `@pytest.mark.xfail` (backend) or `.fixme` (Playwright). Run it to confirm it fails for the right reason.
+2. **Implement the fix/feature** — write the minimal code to make the test pass.
+3. **Remove the xfail marker** — once the test passes, remove the `xfail`/`.fixme` annotation and run the full test suite to confirm nothing else broke.
+
+This ensures every change is covered by a test and that the test actually validates the intended behavior.
+
 ### Reviewing/Revising Pull Requests

- When the user runs /pr-comments or tries to fetch them, also run gh api /repos/Significant-Gravitas/AutoGPT/pulls/[issuenum]/reviews to get the reviews
- Use gh api /repos/Significant-Gravitas/AutoGPT/pulls/[issuenum]/reviews/[review_id]/comments to get the review contents
- Use gh api /repos/Significant-Gravitas/AutoGPT/issues/9924/comments to get the pr specific comments
+Use `/pr-review` to review a PR or `/pr-address` to address comments.
+
+When fetching comments manually:
+- `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate` — top-level reviews
+- `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments --paginate` — inline review comments (always paginate to avoid missing comments beyond page 1)
+- `gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments` — PR conversation comments

 ### Conventional Commits

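
Step 1 of the test-first flow added to `CLAUDE.md` above might look like the following pytest sketch. The `slugify` function and its bug are hypothetical and are not part of the AutoGPT codebase; only the `@pytest.mark.xfail` marker comes from the guidelines.

```python
import pytest


def slugify(title: str) -> str:
    """Hypothetical function under test; it does not collapse repeated spaces."""
    return title.strip().lower().replace(" ", "-")


@pytest.mark.xfail(reason="slugify does not collapse repeated spaces yet")
def test_slugify_collapses_repeated_spaces():
    # Reproduces the bug: the current implementation emits one hyphen per space.
    assert slugify("Hello   World") == "hello-world"
```

Running pytest on this file should report the test as an expected failure. Once `slugify` is fixed, the marker is removed so the test runs as a normal passing test.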
--- a/autogpt_platform/analytics/queries/auth_activities.sql
+++ b/autogpt_platform/analytics/queries/auth_activities.sql
@@ -0,0 +1,40 @@
+-- =============================================================
+-- View: analytics.auth_activities
+-- Looker source alias: ds49  |  Charts: 1
+-- =============================================================
+-- DESCRIPTION
+--   Tracks authentication events (login, logout, SSO, password
+--   reset, etc.) from Supabase's internal audit log.
+--   Useful for monitoring sign-in patterns and detecting anomalies.
+--
+-- SOURCE TABLES
+--   auth.audit_log_entries  — Supabase internal auth event log
+--
+-- OUTPUT COLUMNS
+--   created_at      TIMESTAMPTZ  When the auth event occurred
+--   actor_id        TEXT         User ID who triggered the event
+--   actor_via_sso   TEXT         Whether the action was via SSO ('true'/'false')
+--   action          TEXT         Event type (e.g. 'login', 'logout', 'token_refreshed')
+--
+-- WINDOW
+--   Rolling 90 days from current date
+--
+-- EXAMPLE QUERIES
+--   -- Daily login counts
+--   SELECT DATE_TRUNC('day', created_at) AS day, COUNT(*) AS logins
+--   FROM analytics.auth_activities
+--   WHERE action = 'login'
+--   GROUP BY 1 ORDER BY 1;
+--
+--   -- SSO vs password login breakdown
+--   SELECT actor_via_sso, COUNT(*) FROM analytics.auth_activities
+--   WHERE action = 'login' GROUP BY 1;
+-- =============================================================
+
+SELECT
+    created_at,
+    payload->>'actor_id'      AS actor_id,
+    payload->>'actor_via_sso' AS actor_via_sso,
+    payload->>'action'        AS action
+FROM auth.audit_log_entries
+WHERE created_at >= NOW() - INTERVAL '90 days'
--- a/autogpt_platform/analytics/queries/graph_execution.sql
+++ b/autogpt_platform/analytics/queries/graph_execution.sql
@@ -0,0 +1,105 @@
+-- =============================================================
+-- View: analytics.graph_execution
+-- Looker source alias: ds16  |  Charts: 21
+-- =============================================================
+-- DESCRIPTION
+--   One row per agent graph execution (last 90 days).
+--   Unpacks the JSONB stats column into individual numeric columns
+--   and normalises the executionStatus — runs that failed due to
+--   insufficient credits are reclassified as 'NO_CREDITS' for
+--   easier filtering.  Error messages are scrubbed of IDs and URLs
+--   to allow safe grouping.
+--
+-- SOURCE TABLES
+--   platform.AgentGraphExecution  — Execution records
+--   platform.AgentGraph           — Agent graph metadata (for name)
+--   platform.LibraryAgent         — To flag possibly-AI (safe-mode) agents
+--
+-- OUTPUT COLUMNS
+--   id                TEXT         Execution UUID
+--   agentGraphId      TEXT         Agent graph UUID
+--   agentGraphVersion INT          Graph version number
+--   executionStatus   TEXT         COMPLETED | FAILED | NO_CREDITS | RUNNING | QUEUED | TERMINATED
+--   createdAt         TIMESTAMPTZ  When the execution was queued
+--   updatedAt         TIMESTAMPTZ  Last status update time
+--   userId            TEXT         Owner user UUID
+--   agentGraphName    TEXT         Human-readable agent name
+--   cputime           DECIMAL      Total CPU seconds consumed
+--   walltime          DECIMAL      Total wall-clock seconds
+--   node_count        DECIMAL      Number of nodes in the graph
+--   nodes_cputime     DECIMAL      CPU time across all nodes
+--   nodes_walltime    DECIMAL      Wall time across all nodes
+--   execution_cost    DECIMAL      Credit cost of this execution
+--   correctness_score FLOAT        AI correctness score (if available)
+--   possibly_ai       BOOLEAN      True if agent has sensitive_action_safe_mode enabled
+--   groupedErrorMessage TEXT       Scrubbed error string (IDs/URLs replaced with wildcards)
+--
+-- WINDOW
+--   Rolling 90 days (createdAt > CURRENT_DATE - 90 days)
+--
+-- EXAMPLE QUERIES
+--   -- Daily execution counts by status
+--   SELECT DATE_TRUNC('day', createdAt) AS day, executionStatus, COUNT(*)
+--   FROM analytics.graph_execution
+--   GROUP BY 1, 2 ORDER BY 1;
+--
+--   -- Average cost per execution by agent
+--   SELECT agentGraphName, AVG(execution_cost) AS avg_cost, COUNT(*) AS runs
+--   FROM analytics.graph_execution
+--   WHERE executionStatus = 'COMPLETED'
+--   GROUP BY 1 ORDER BY avg_cost DESC;
+--
+--   -- Top error messages
+--   SELECT groupedErrorMessage, COUNT(*) AS occurrences
+--   FROM analytics.graph_execution
+--   WHERE executionStatus = 'FAILED'
+--   GROUP BY 1 ORDER BY 2 DESC LIMIT 20;
+-- =============================================================
+
+SELECT
+    ge."id"                                                        AS id,
+    ge."agentGraphId"                                              AS agentGraphId,
+    ge."agentGraphVersion"                                         AS agentGraphVersion,
+    CASE
+        WHEN jsonb_exists(ge."stats"::jsonb, 'error')
+         AND (
+               (ge."stats"::jsonb->>'error') ILIKE '%insufficient balance%'
+            OR (ge."stats"::jsonb->>'error') ILIKE '%you have no credits left%'
+             )
+        THEN 'NO_CREDITS'
+        ELSE CAST(ge."executionStatus" AS TEXT)
+    END                                                            AS executionStatus,
+    ge."createdAt"                                                 AS createdAt,
+    ge."updatedAt"                                                 AS updatedAt,
+    ge."userId"                                                    AS userId,
+    g."name"                                                       AS agentGraphName,
+    (ge."stats"::jsonb->>'cputime')::decimal                       AS cputime,
+    (ge."stats"::jsonb->>'walltime')::decimal                      AS walltime,
+    (ge."stats"::jsonb->>'node_count')::decimal                    AS node_count,
+    (ge."stats"::jsonb->>'nodes_cputime')::decimal                 AS nodes_cputime,
+    (ge."stats"::jsonb->>'nodes_walltime')::decimal                AS nodes_walltime,
+    (ge."stats"::jsonb->>'cost')::decimal                          AS execution_cost,
+    (ge."stats"::jsonb->>'correctness_score')::float               AS correctness_score,
+    COALESCE(la.possibly_ai, FALSE)                                AS possibly_ai,
+    REGEXP_REPLACE(
+        REGEXP_REPLACE(
+            TRIM(BOTH '"' FROM ge."stats"::jsonb->>'error'),
+            '(https?://)([A-Za-z0-9.-]+)(:[0-9]+)?(/[^\s]*)?',
+            '\1\2/...', 'gi'
+        ),
+        '[a-zA-Z0-9_:-]*\d[a-zA-Z0-9_:-]*', '*', 'g'
+    )                                                              AS groupedErrorMessage
+FROM platform."AgentGraphExecution" ge
+LEFT JOIN platform."AgentGraph" g
+       ON ge."agentGraphId" = g."id"
+      AND ge."agentGraphVersion" = g."version"
+LEFT JOIN (
+    SELECT DISTINCT ON ("userId", "agentGraphId")
+           "userId", "agentGraphId",
+           ("settings"::jsonb->>'sensitive_action_safe_mode')::boolean AS possibly_ai
+    FROM platform."LibraryAgent"
+    WHERE "isDeleted"  = FALSE
+      AND "isArchived" = FALSE
+    ORDER BY "userId", "agentGraphId", "agentGraphVersion" DESC
+) la ON la."userId" = ge."userId" AND la."agentGraphId" = ge."agentGraphId"
+WHERE ge."createdAt" > CURRENT_DATE - INTERVAL '90 days'
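
The `groupedErrorMessage` expression in the query above scrubs an error string in three steps. It trims surrounding double quotes, reduces each URL to its scheme and host, and replaces every token that contains a digit with `*`. Below is a rough Python equivalent for experimenting with sample messages. The name `group_error_message` is hypothetical, and Python's `re` only approximates PostgreSQL's regex engine, so edge cases may differ.

```python
import re

# Same patterns as the SQL; the 'gi' flags become IGNORECASE plus re.sub's
# default replace-all behaviour.
URL_RE = re.compile(r"(https?://)([A-Za-z0-9.-]+)(:[0-9]+)?(/[^\s]*)?", re.IGNORECASE)
TOKEN_WITH_DIGIT_RE = re.compile(r"[a-zA-Z0-9_:-]*\d[a-zA-Z0-9_:-]*")


def group_error_message(raw: str) -> str:
    """Scrub IDs and URL paths so similar errors group together."""
    trimmed = raw.strip('"')                      # TRIM(BOTH '"' FROM ...)
    host_only = URL_RE.sub(r"\1\2/...", trimmed)  # keep scheme + host only
    return TOKEN_WITH_DIGIT_RE.sub("*", host_only)


if __name__ == "__main__":
    print(group_error_message('"Node 42 timed out"'))
    print(group_error_message("GET https://api.example.com:8080/v1/runs/abc-123 failed"))
```

A host name that itself contains a digit is partly wildcarded by the second pass, which is also how the SQL behaves because the two replacements run in the same order.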
--- a/autogpt_platform/analytics/queries/node_block_execution.sql
+++ b/autogpt_platform/analytics/queries/node_block_execution.sql
@@ -0,0 +1,101 @@
+-- =============================================================
+-- View: analytics.node_block_execution
+-- Looker source alias: ds14  |  Charts: 11
+-- =============================================================
+-- DESCRIPTION
+--   One row per node (block) execution (last 90 days).
+--   Unpacks stats JSONB and joins to identify which block type
+--   was run.  For failed nodes, joins the error output and
+--   scrubs it for safe grouping.
+--
+-- SOURCE TABLES
+--   platform.AgentNodeExecution              — Node execution records
+--   platform.AgentNode                       — Node → block mapping
+--   platform.AgentBlock                      — Block name/ID
+--   platform.AgentNodeExecutionInputOutput   — Error output values
+--
+-- OUTPUT COLUMNS
+--   id                    TEXT         Node execution UUID
+--   agentGraphExecutionId TEXT         Parent graph execution UUID
+--   agentNodeId           TEXT         Node UUID within the graph
+--   executionStatus       TEXT         COMPLETED | FAILED | QUEUED | RUNNING | TERMINATED
+--   addedTime             TIMESTAMPTZ  When the node execution record was created
+--   queuedTime            TIMESTAMPTZ  When it entered the queue
+--   startedTime           TIMESTAMPTZ  When execution started
+--   endedTime             TIMESTAMPTZ  When execution finished
+--   inputSize             BIGINT       Input payload size in bytes
+--   outputSize            BIGINT       Output payload size in bytes
+--   walltime              NUMERIC      Wall-clock seconds for this node
+--   cputime               NUMERIC      CPU seconds for this node
+--   llmRetryCount         INT          Number of LLM retries
+--   llmCallCount          INT          Number of LLM API calls made
+--   inputTokenCount       BIGINT       LLM input tokens consumed
+--   outputTokenCount      BIGINT       LLM output tokens produced
+--   blockName             TEXT         Human-readable block name (e.g. 'OpenAIBlock')
+--   blockId               TEXT         Block UUID
+--   groupedErrorMessage   TEXT         Scrubbed error (IDs/URLs wildcarded)
+--   errorMessage          TEXT         Raw error output (only set when FAILED)
+--
+-- WINDOW
+--   Rolling 90 days (addedTime > CURRENT_DATE - 90 days)
+--
+-- EXAMPLE QUERIES
+--   -- Most-used blocks by execution count
+--   SELECT blockName, COUNT(*) AS executions,
+--          COUNT(*) FILTER (WHERE executionStatus='FAILED') AS failures
+--   FROM analytics.node_block_execution
+--   GROUP BY 1 ORDER BY executions DESC LIMIT 20;
+--
+--   -- Average LLM token usage per block
+--   SELECT blockName,
+--          AVG(inputTokenCount) AS avg_input_tokens,
+--          AVG(outputTokenCount) AS avg_output_tokens
+--   FROM analytics.node_block_execution
+--   WHERE llmCallCount > 0
+--   GROUP BY 1 ORDER BY avg_input_tokens DESC;
+--
+--   -- Top failure reasons
+--   SELECT blockName, groupedErrorMessage, COUNT(*) AS count
+--   FROM analytics.node_block_execution
+--   WHERE executionStatus = 'FAILED'
+--   GROUP BY 1, 2 ORDER BY count DESC LIMIT 20;
+-- =============================================================
+
+SELECT
+    ne."id"                                                            AS id,
+    ne."agentGraphExecutionId"                                         AS agentGraphExecutionId,
+    ne."agentNodeId"                                                   AS agentNodeId,
+    CAST(ne."executionStatus" AS TEXT)                                 AS executionStatus,
+    ne."addedTime"                                                     AS addedTime,
+    ne."queuedTime"                                                    AS queuedTime,
+    ne."startedTime"                                                   AS startedTime,
+    ne."endedTime"                                                     AS endedTime,
+    (ne."stats"::jsonb->>'input_size')::bigint                         AS inputSize,
+    (ne."stats"::jsonb->>'output_size')::bigint                        AS outputSize,
+    (ne."stats"::jsonb->>'walltime')::numeric                          AS walltime,
+    (ne."stats"::jsonb->>'cputime')::numeric                           AS cputime,
+    (ne."stats"::jsonb->>'llm_retry_count')::int                       AS llmRetryCount,
+    (ne."stats"::jsonb->>'llm_call_count')::int                        AS llmCallCount,
+    (ne."stats"::jsonb->>'input_token_count')::bigint                  AS inputTokenCount,
+    (ne."stats"::jsonb->>'output_token_count')::bigint                 AS outputTokenCount,
+    b."name"                                                           AS blockName,
+    b."id"                                                             AS blockId,
+    REGEXP_REPLACE(
+        REGEXP_REPLACE(
+            TRIM(BOTH '"' FROM eio."data"::text),
+            '(https?://)([A-Za-z0-9.-]+)(:[0-9]+)?(/[^\s]*)?',
+            '\1\2/...', 'gi'
+        ),
+        '[a-zA-Z0-9_:-]*\d[a-zA-Z0-9_:-]*', '*', 'g'
+    )                                                                  AS groupedErrorMessage,
+    eio."data"                                                         AS errorMessage
+FROM platform."AgentNodeExecution" ne
+LEFT JOIN platform."AgentNode" nd
+       ON ne."agentNodeId" = nd."id"
+LEFT JOIN platform."AgentBlock" b
+       ON nd."agentBlockId" = b."id"
+LEFT JOIN platform."AgentNodeExecutionInputOutput" eio
+       ON eio."referencedByOutputExecId" = ne."id"
+      AND eio."name" = 'error'
+      AND ne."executionStatus" = 'FAILED'
+WHERE ne."addedTime" > CURRENT_DATE - INTERVAL '90 days'
--- a/autogpt_platform/analytics/queries/retention_agent.sql
+++ b/autogpt_platform/analytics/queries/retention_agent.sql
@@ -0,0 +1,97 @@
+-- =============================================================
+-- View: analytics.retention_agent
+-- Looker source alias: ds35  |  Charts: 2
+-- =============================================================
+-- DESCRIPTION
+--   Weekly cohort retention broken down per individual agent.
+--   Cohort = week of a user's first use of THAT specific agent.
+--   Tells you which agents keep users coming back vs. one-shot
+--   use. Only includes cohorts from the last 180 days.
+--
+-- SOURCE TABLES
+--   platform.AgentGraphExecution  — Execution records (user × agent × time)
+--   platform.AgentGraph           — Agent names
+--
+-- OUTPUT COLUMNS
+--   agent_id            TEXT   Agent graph UUID
+--   agent_label         TEXT   'AgentName [first8chars]'
+--   agent_label_n       TEXT   'AgentName [first8chars] (n=total_users)'
+--   cohort_week_start   DATE   Week users first ran this agent
+--   cohort_label        TEXT   ISO week label
+--   cohort_label_n      TEXT   ISO week label with cohort size
+--   user_lifetime_week  INT    Weeks since first use of this agent
+--   cohort_users        BIGINT Users in this cohort for this agent
+--   active_users        BIGINT Users who ran the agent again in week k
+--   retention_rate      FLOAT  active_users / cohort_users
+--   cohort_users_w0     BIGINT cohort_users only at week 0 (safe to SUM)
+--   agent_total_users   BIGINT Total users across all cohorts for this agent
+--
+-- EXAMPLE QUERIES
+--   -- Best-retained agents at week 2
+--   SELECT agent_label, AVG(retention_rate) AS w2_retention
+--   FROM analytics.retention_agent
+--   WHERE user_lifetime_week = 2 AND cohort_users >= 10
+--   GROUP BY 1 ORDER BY w2_retention DESC LIMIT 10;
+--
+--   -- Agents with most unique users
+--   SELECT DISTINCT agent_label, agent_total_users
+--   FROM analytics.retention_agent
+--   ORDER BY agent_total_users DESC LIMIT 20;
+-- =============================================================
+
+WITH params AS (SELECT 12::int AS max_weeks, (CURRENT_DATE - INTERVAL '180 days') AS cohort_start),
+events AS (
+  SELECT e."userId"::text AS user_id, e."agentGraphId" AS agent_id,
+         e."createdAt"::timestamptz AS created_at,
+         DATE_TRUNC('week', e."createdAt")::date AS week_start
+  FROM platform."AgentGraphExecution" e
+),
+first_use AS (
+  SELECT user_id, agent_id, MIN(created_at) AS first_use_at,
+         DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
+  FROM events GROUP BY 1,2
+  HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
+),
+activity_weeks AS (SELECT DISTINCT user_id, agent_id, week_start FROM events),
+user_week_age AS (
+  SELECT aw.user_id, aw.agent_id, fu.cohort_week_start,
+         ((aw.week_start - DATE_TRUNC('week',fu.first_use_at)::date)/7)::int AS user_lifetime_week
+  FROM activity_weeks aw JOIN first_use fu USING (user_id, agent_id)
+  WHERE aw.week_start >= DATE_TRUNC('week',fu.first_use_at)::date
+),
+active_counts AS (
+  SELECT agent_id, cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users
+  FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2,3
+),
+cohort_sizes AS (
+  SELECT agent_id, cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_use GROUP BY 1,2
+),
+cohort_caps AS (
+  SELECT cs.agent_id, cs.cohort_week_start, cs.cohort_users,
+         LEAST((SELECT max_weeks FROM params),
+               GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date-cs.cohort_week_start)/7)::int)) AS cap_weeks
+  FROM cohort_sizes cs
+),
+grid AS (
+  SELECT cc.agent_id, cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
+  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
+),
+agent_names AS (SELECT DISTINCT ON (g."id") g."id" AS agent_id, g."name" AS agent_name FROM platform."AgentGraph" g ORDER BY g."id", g."version" DESC),
+agent_total_users AS (SELECT agent_id, SUM(cohort_users) AS agent_total_users FROM cohort_sizes GROUP BY 1)
+SELECT
+  g.agent_id,
+  COALESCE(an.agent_name,'(unnamed)')||' ['||LEFT(g.agent_id::text,8)||']'  AS agent_label,
+  COALESCE(an.agent_name,'(unnamed)')||' ['||LEFT(g.agent_id::text,8)||'] (n='||COALESCE(atu.agent_total_users,0)||')' AS agent_label_n,
+  g.cohort_week_start,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')                               AS cohort_label,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')'  AS cohort_label_n,
+  g.user_lifetime_week, g.cohort_users,
+  COALESCE(ac.active_users,0)                                              AS active_users,
+  COALESCE(ac.active_users,0)::float / NULLIF(g.cohort_users,0)           AS retention_rate,
+  CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END         AS cohort_users_w0,
+  COALESCE(atu.agent_total_users,0)                                        AS agent_total_users
+FROM grid g
+LEFT JOIN active_counts     ac  ON ac.agent_id=g.agent_id AND ac.cohort_week_start=g.cohort_week_start AND ac.user_lifetime_week=g.user_lifetime_week
+LEFT JOIN agent_names       an  ON an.agent_id=g.agent_id
+LEFT JOIN agent_total_users atu ON atu.agent_id=g.agent_id
+ORDER BY agent_label, g.cohort_week_start, g.user_lifetime_week;
--- a/autogpt_platform/analytics/queries/retention_execution_daily.sql
+++ b/autogpt_platform/analytics/queries/retention_execution_daily.sql
@@ -0,0 +1,81 @@
+-- =============================================================
+-- View: analytics.retention_execution_daily
+-- Looker source alias: ds111  |  Charts: 1
+-- =============================================================
+-- DESCRIPTION
+--   Daily cohort retention based on agent executions.
+--   Cohort anchor = day of user's FIRST ever execution.
+--   Only includes cohorts from the last 90 days, up to day 30.
+--   Great for early engagement analysis (did users run another
+--   agent the next day?).
+--
+-- SOURCE TABLES
+--   platform.AgentGraphExecution  — Execution records
+--
+-- OUTPUT COLUMNS
+--   Same pattern as retention_login_daily.
+--   cohort_day_start = day of first execution (not first login)
+--
+-- EXAMPLE QUERIES
+--   -- Day-3 execution retention
+--   SELECT cohort_label, retention_rate_bounded AS d3_retention
+--   FROM analytics.retention_execution_daily
+--   WHERE user_lifetime_day = 3 ORDER BY cohort_day_start;
+-- =============================================================
+
+WITH params AS (SELECT 30::int AS max_days, (CURRENT_DATE - INTERVAL '90 days') AS cohort_start),
+events AS (
+  SELECT e."userId"::text AS user_id, e."createdAt"::timestamptz AS created_at,
+         DATE_TRUNC('day', e."createdAt")::date AS day_start
+  FROM platform."AgentGraphExecution" e WHERE e."userId" IS NOT NULL
+),
+first_exec AS (
+  SELECT user_id, MIN(created_at) AS first_exec_at,
+         DATE_TRUNC('day', MIN(created_at))::date AS cohort_day_start
+  FROM events GROUP BY 1
+  HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
+),
+activity_days AS (SELECT DISTINCT user_id, day_start FROM events),
+user_day_age AS (
+  SELECT ad.user_id, fe.cohort_day_start,
+         (ad.day_start - DATE_TRUNC('day',fe.first_exec_at)::date)::int AS user_lifetime_day
+  FROM activity_days ad JOIN first_exec fe USING (user_id)
+  WHERE ad.day_start >= DATE_TRUNC('day',fe.first_exec_at)::date
+),
+bounded_counts AS (
+  SELECT cohort_day_start, user_lifetime_day, COUNT(DISTINCT user_id) AS active_users_bounded
+  FROM user_day_age WHERE user_lifetime_day >= 0 GROUP BY 1,2
+),
+last_active AS (
+  SELECT cohort_day_start, user_id, MAX(user_lifetime_day) AS last_active_day FROM user_day_age GROUP BY 1,2
+),
+unbounded_counts AS (
+  SELECT la.cohort_day_start, gs AS user_lifetime_day, COUNT(*) AS retained_users_unbounded
+  FROM last_active la
+  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_day,(SELECT max_days FROM params))) gs
+  GROUP BY 1,2
+),
+cohort_sizes AS (SELECT cohort_day_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_exec GROUP BY 1),
+cohort_caps AS (
+  SELECT cs.cohort_day_start, cs.cohort_users,
+         LEAST((SELECT max_days FROM params), GREATEST(0,(CURRENT_DATE-cs.cohort_day_start)::int)) AS cap_days
+  FROM cohort_sizes cs
+),
+grid AS (
+  SELECT cc.cohort_day_start, gs AS user_lifetime_day, cc.cohort_users
+  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_days) gs
+)
+SELECT
+  g.cohort_day_start,
+  TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')                                AS cohort_label,
+  TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')||' (n='||g.cohort_users||')'   AS cohort_label_n,
+  g.user_lifetime_day, g.cohort_users,
+  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
+  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
+  CASE WHEN g.user_lifetime_day=0 THEN g.cohort_users ELSE 0 END          AS cohort_users_d0
+FROM grid g
+LEFT JOIN bounded_counts   b ON b.cohort_day_start=g.cohort_day_start AND b.user_lifetime_day=g.user_lifetime_day
+LEFT JOIN unbounded_counts u ON u.cohort_day_start=g.cohort_day_start AND u.user_lifetime_day=g.user_lifetime_day
+ORDER BY g.cohort_day_start, g.user_lifetime_day;
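
The daily retention query above reports two counts per cohort and lifetime day. The bounded count is users active exactly on day k. The unbounded count is users whose last activity falls on day k or later. Below is a small Python sketch of that distinction. The function name `retention` and its return shape are hypothetical, and it leaves out the query's 90-day cohort window and the cap on days that have not yet elapsed.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple


def retention(
    events: Iterable[Tuple[str, date]], max_days: int = 30
) -> Dict[date, Dict[int, Tuple[int, int, int]]]:
    """Map cohort day -> lifetime day -> (bounded, unbounded, cohort_users)."""
    days_by_user = defaultdict(set)
    for user_id, day in events:
        days_by_user[user_id].add(day)

    # Cohort = day of each user's first event; store lifetime-day offsets.
    cohorts = defaultdict(list)
    for days in days_by_user.values():
        first = min(days)
        cohorts[first].append({(d - first).days for d in days})

    result = {}
    for cohort_day, users in cohorts.items():
        rows = {}
        for k in range(max_days + 1):
            bounded = sum(1 for ages in users if k in ages)           # active on day k
            unbounded = sum(1 for ages in users if max(ages) >= k)    # active on day k or later
            rows[k] = (bounded, unbounded, len(users))
        result[cohort_day] = rows
    return result
```

The unbounded count never increases as k grows, while the bounded count can dip and recover, which is why the view exposes both rates.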
--- a/autogpt_platform/analytics/queries/retention_execution_weekly.sql
+++ b/autogpt_platform/analytics/queries/retention_execution_weekly.sql
@@ -0,0 +1,81 @@
+-- =============================================================
+-- View: analytics.retention_execution_weekly
+-- Looker source alias: ds92  |  Charts: 2
+-- =============================================================
+-- DESCRIPTION
+--   Weekly cohort retention based on agent executions.
+--   Cohort anchor = week of user's FIRST ever agent execution
+--   (not first login). Only includes cohorts from the last 180 days.
+--   Useful when you care about product engagement, not just visits.
+--
+-- SOURCE TABLES
+--   platform.AgentGraphExecution  — Execution records
+--
+-- OUTPUT COLUMNS
+--   Same pattern as retention_login_weekly.
+--   cohort_week_start = week of first execution (not first login)
+--
+-- EXAMPLE QUERIES
+--   -- Week-2 execution retention
+--   SELECT cohort_label, retention_rate_bounded
+--   FROM analytics.retention_execution_weekly
+--   WHERE user_lifetime_week = 2 ORDER BY cohort_week_start;
+-- =============================================================
+
+WITH params AS (SELECT 12::int AS max_weeks, (CURRENT_DATE - INTERVAL '180 days') AS cohort_start),
+events AS (
+  SELECT e."userId"::text AS user_id, e."createdAt"::timestamptz AS created_at,
+         DATE_TRUNC('week', e."createdAt")::date AS week_start
+  FROM platform."AgentGraphExecution" e WHERE e."userId" IS NOT NULL
+),
+first_exec AS (
+  SELECT user_id, MIN(created_at) AS first_exec_at,
+         DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
+  FROM events GROUP BY 1
+  HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
+),
+activity_weeks AS (SELECT DISTINCT user_id, week_start FROM events),
+user_week_age AS (
+  SELECT aw.user_id, fe.cohort_week_start,
+         ((aw.week_start - DATE_TRUNC('week',fe.first_exec_at)::date)/7)::int AS user_lifetime_week
+  FROM activity_weeks aw JOIN first_exec fe USING (user_id)
+  WHERE aw.week_start >= DATE_TRUNC('week',fe.first_exec_at)::date
+),
+bounded_counts AS (
+  SELECT cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users_bounded
+  FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2
+),
+last_active AS (
+  SELECT cohort_week_start, user_id, MAX(user_lifetime_week) AS last_active_week FROM user_week_age GROUP BY 1,2
+),
+unbounded_counts AS (
+  SELECT la.cohort_week_start, gs AS user_lifetime_week, COUNT(*) AS retained_users_unbounded
+  FROM last_active la
+  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_week,(SELECT max_weeks FROM params))) gs
+  GROUP BY 1,2
+),
+cohort_sizes AS (SELECT cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_exec GROUP BY 1),
+cohort_caps AS (
+  SELECT cs.cohort_week_start, cs.cohort_users,
+         LEAST((SELECT max_weeks FROM params),
+               GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date-cs.cohort_week_start)/7)::int)) AS cap_weeks
+  FROM cohort_sizes cs
+),
+grid AS (
+  SELECT cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
+  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
+)
+SELECT
+  g.cohort_week_start,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')                               AS cohort_label,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')'  AS cohort_label_n,
+  g.user_lifetime_week, g.cohort_users,
+  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
+  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
+  CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END         AS cohort_users_w0
+FROM grid g
+LEFT JOIN bounded_counts   b ON b.cohort_week_start=g.cohort_week_start AND b.user_lifetime_week=g.user_lifetime_week
+LEFT JOIN unbounded_counts u ON u.cohort_week_start=g.cohort_week_start AND u.user_lifetime_week=g.user_lifetime_week
+ORDER BY g.cohort_week_start, g.user_lifetime_week;
--- a/autogpt_platform/analytics/queries/retention_login_daily.sql
+++ b/autogpt_platform/analytics/queries/retention_login_daily.sql
@@ -0,0 +1,94 @@
+-- =============================================================
+-- View: analytics.retention_login_daily
+-- Looker source alias: ds112  |  Charts: 1
+-- =============================================================
+-- DESCRIPTION
+--   Daily cohort retention based on login sessions.
+--   Same logic as retention_login_weekly but at day granularity,
+--   showing up to day 30 for cohorts from the last 90 days.
+--   Useful for analysing early activation (days 1-7) in detail.
+--
+-- SOURCE TABLES
+--   auth.sessions  — Login session records
+--
+-- OUTPUT COLUMNS (same pattern as retention_login_weekly)
+--   cohort_day_start          DATE     First day the cohort logged in
+--   cohort_label              TEXT     Date string (e.g. '2025-03-01')
+--   cohort_label_n            TEXT     Date + cohort size (e.g. '2025-03-01 (n=12)')
+--   user_lifetime_day         INT      Days since first login (0 = signup day)
+--   cohort_users              BIGINT   Total users in cohort
+--   active_users_bounded      BIGINT   Users active on exactly day k
+--   retained_users_unbounded  BIGINT   Users active any time on/after day k
+--   retention_rate_bounded    FLOAT    bounded / cohort_users
+--   retention_rate_unbounded  FLOAT    unbounded / cohort_users
+--   cohort_users_d0           BIGINT   cohort_users only at day 0, else 0 (safe to SUM)
+--
+-- EXAMPLE QUERIES
+--   -- Day-1 retention rate (came back next day)
+--   SELECT cohort_label, retention_rate_bounded AS d1_retention
+--   FROM analytics.retention_login_daily
+--   WHERE user_lifetime_day = 1 ORDER BY cohort_day_start;
+--
+--   -- Average retention curve across all cohorts
+--   SELECT user_lifetime_day,
+--          SUM(active_users_bounded)::float / NULLIF(SUM(cohort_users_d0), 0) AS avg_retention
+--   FROM analytics.retention_login_daily
+--   GROUP BY 1 ORDER BY 1;
+-- =============================================================
+
+WITH params AS (SELECT 30::int AS max_days, (CURRENT_DATE - INTERVAL '90 days')::date AS cohort_start),
+events AS (
+  SELECT s.user_id::text AS user_id, s.created_at::timestamptz AS created_at,
+         DATE_TRUNC('day', s.created_at)::date AS day_start
+  FROM auth.sessions s WHERE s.user_id IS NOT NULL
+),
+first_login AS (
+  SELECT user_id, MIN(created_at) AS first_login_time,
+         DATE_TRUNC('day', MIN(created_at))::date AS cohort_day_start
+  FROM events GROUP BY 1
+  HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
+),
+activity_days AS (SELECT DISTINCT user_id, day_start FROM events),
+user_day_age AS (
+  SELECT ad.user_id, fl.cohort_day_start,
+         (ad.day_start - DATE_TRUNC('day', fl.first_login_time)::date)::int AS user_lifetime_day
+  FROM activity_days ad JOIN first_login fl USING (user_id)
+  WHERE ad.day_start >= DATE_TRUNC('day', fl.first_login_time)::date
+),
+bounded_counts AS (
+  SELECT cohort_day_start, user_lifetime_day, COUNT(DISTINCT user_id) AS active_users_bounded
+  FROM user_day_age WHERE user_lifetime_day >= 0 GROUP BY 1,2
+),
+last_active AS (
+  SELECT cohort_day_start, user_id, MAX(user_lifetime_day) AS last_active_day FROM user_day_age GROUP BY 1,2
+),
+unbounded_counts AS (
+  SELECT la.cohort_day_start, gs AS user_lifetime_day, COUNT(*) AS retained_users_unbounded
+  FROM last_active la
+  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_day,(SELECT max_days FROM params))) gs
+  GROUP BY 1,2
+),
+cohort_sizes AS (SELECT cohort_day_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_login GROUP BY 1),
+cohort_caps AS (
+  SELECT cs.cohort_day_start, cs.cohort_users,
+         LEAST((SELECT max_days FROM params), GREATEST(0,(CURRENT_DATE-cs.cohort_day_start)::int)) AS cap_days
+  FROM cohort_sizes cs
+),
+grid AS (
+  SELECT cc.cohort_day_start, gs AS user_lifetime_day, cc.cohort_users
+  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_days) gs
+)
+SELECT
+  g.cohort_day_start,
+  TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')                                  AS cohort_label,
+  TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')||' (n='||g.cohort_users||')'     AS cohort_label_n,
+  g.user_lifetime_day, g.cohort_users,
+  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
+  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
+  CASE WHEN g.user_lifetime_day=0 THEN g.cohort_users ELSE 0 END            AS cohort_users_d0
+FROM grid g
+LEFT JOIN bounded_counts   b ON b.cohort_day_start=g.cohort_day_start AND b.user_lifetime_day=g.user_lifetime_day
+LEFT JOIN unbounded_counts u ON u.cohort_day_start=g.cohort_day_start AND u.user_lifetime_day=g.user_lifetime_day
+ORDER BY g.cohort_day_start, g.user_lifetime_day;
--- a/autogpt_platform/analytics/queries/retention_login_onboarded_weekly.sql
+++ b/autogpt_platform/analytics/queries/retention_login_onboarded_weekly.sql
@@ -0,0 +1,96 @@
+-- =============================================================
+-- View: analytics.retention_login_onboarded_weekly
+-- Looker source alias: ds101  |  Charts: 2
+-- =============================================================
+-- DESCRIPTION
+--   Weekly cohort retention from login sessions, restricted to
+--   users who "onboarded" — defined as running at least one
+--   agent within 365 days of their first login.
+--   Filters out users who signed up but never activated,
+--   giving a cleaner view of engaged-user retention.
+--
+-- SOURCE TABLES
+--   auth.sessions                  — Login session records
+--   platform.AgentGraphExecution   — Used to identify onboarders
+--
+-- OUTPUT COLUMNS
+--   Same as retention_login_weekly (cohort_week_start, user_lifetime_week,
+--   retention_rate_bounded, retention_rate_unbounded, etc.)
+--   Only difference: cohort is filtered to onboarded users only.
+--
+-- EXAMPLE QUERIES
+--   -- Compare week-4 retention: all users vs onboarded only
+--   SELECT 'all_users' AS segment, AVG(retention_rate_bounded) AS w4_retention
+--   FROM analytics.retention_login_weekly WHERE user_lifetime_week = 4
+--   UNION ALL
+--   SELECT 'onboarded', AVG(retention_rate_bounded)
+--   FROM analytics.retention_login_onboarded_weekly WHERE user_lifetime_week = 4;
+-- =============================================================
+
+WITH params AS (SELECT 12::int AS max_weeks, 365::int AS onboarding_window_days),
+events AS (
+  SELECT s.user_id::text AS user_id, s.created_at::timestamptz AS created_at,
+         DATE_TRUNC('week', s.created_at)::date AS week_start
+  FROM auth.sessions s WHERE s.user_id IS NOT NULL
+),
+first_login_all AS (
+  SELECT user_id, MIN(created_at) AS first_login_time,
+         DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
+  FROM events GROUP BY 1
+),
+onboarders AS (
+  SELECT fl.user_id FROM first_login_all fl
+  WHERE EXISTS (
+    SELECT 1 FROM platform."AgentGraphExecution" e
+    WHERE e."userId"::text = fl.user_id
+      AND e."createdAt" >= fl.first_login_time
+      AND e."createdAt" < fl.first_login_time
+          + make_interval(days => (SELECT onboarding_window_days FROM params))
+  )
+),
+first_login AS (SELECT * FROM first_login_all WHERE user_id IN (SELECT user_id FROM onboarders)),
+activity_weeks AS (SELECT DISTINCT user_id, week_start FROM events),
+user_week_age AS (
+  SELECT aw.user_id, fl.cohort_week_start,
+         ((aw.week_start - DATE_TRUNC('week',fl.first_login_time)::date)/7)::int AS user_lifetime_week
+  FROM activity_weeks aw JOIN first_login fl USING (user_id)
+  WHERE aw.week_start >= DATE_TRUNC('week',fl.first_login_time)::date
+),
+bounded_counts AS (
+  SELECT cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users_bounded
+  FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2
+),
+last_active AS (
+  SELECT cohort_week_start, user_id, MAX(user_lifetime_week) AS last_active_week FROM user_week_age GROUP BY 1,2
+),
+unbounded_counts AS (
+  SELECT la.cohort_week_start, gs AS user_lifetime_week, COUNT(*) AS retained_users_unbounded
+  FROM last_active la
+  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_week,(SELECT max_weeks FROM params))) gs
+  GROUP BY 1,2
+),
+cohort_sizes AS (SELECT cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_login GROUP BY 1),
+cohort_caps AS (
+  SELECT cs.cohort_week_start, cs.cohort_users,
+         LEAST((SELECT max_weeks FROM params),
+               GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date-cs.cohort_week_start)/7)::int)) AS cap_weeks
+  FROM cohort_sizes cs
+),
+grid AS (
+  SELECT cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
+  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
+)
+SELECT
+  g.cohort_week_start,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')                               AS cohort_label,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')'  AS cohort_label_n,
+  g.user_lifetime_week, g.cohort_users,
+  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
+  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
+  CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END         AS cohort_users_w0
+FROM grid g
+LEFT JOIN bounded_counts   b ON b.cohort_week_start=g.cohort_week_start AND b.user_lifetime_week=g.user_lifetime_week
+LEFT JOIN unbounded_counts u ON u.cohort_week_start=g.cohort_week_start AND u.user_lifetime_week=g.user_lifetime_week
+ORDER BY g.cohort_week_start, g.user_lifetime_week;
--- a/autogpt_platform/analytics/queries/retention_login_weekly.sql
+++ b/autogpt_platform/analytics/queries/retention_login_weekly.sql
@@ -0,0 +1,103 @@
+-- =============================================================
+-- View: analytics.retention_login_weekly
+-- Looker source alias: ds83  |  Charts: 2
+-- =============================================================
+-- DESCRIPTION
+--   Weekly cohort retention based on login sessions.
+--   Users are grouped by the ISO week of their first ever login.
+--   For each cohort × lifetime-week combination, outputs both:
+--     - bounded rate: % active in exactly that week
+--     - unbounded rate: % who were ever active on or after that week
+--   Weeks are capped to the cohort's actual age (no future data points).
+--
+-- SOURCE TABLES
+--   auth.sessions  — Login session records
+--
+-- HOW TO READ THE OUTPUT
+--   cohort_week_start   The Monday of the week users first logged in
+--   user_lifetime_week  0 = signup week, 1 = one week later, etc.
+--   retention_rate_bounded   = active_users_bounded / cohort_users
+--   retention_rate_unbounded = retained_users_unbounded / cohort_users
+--
+-- OUTPUT COLUMNS
+--   cohort_week_start         DATE     First day of the cohort's signup week
+--   cohort_label              TEXT     ISO week label (e.g. '2025-W01')
+--   cohort_label_n            TEXT     ISO week label with cohort size (e.g. '2025-W01 (n=42)')
+--   user_lifetime_week        INT      Weeks since first login (0 = signup week)
+--   cohort_users              BIGINT   Total users in this cohort (denominator)
+--   active_users_bounded      BIGINT   Users active in exactly week k
+--   retained_users_unbounded  BIGINT   Users active any time on/after week k
+--   retention_rate_bounded    FLOAT    bounded active / cohort_users
+--   retention_rate_unbounded  FLOAT    unbounded retained / cohort_users
+--   cohort_users_w0           BIGINT   cohort_users only at week 0, else 0 (safe to SUM in pivot tables)
+--
+-- EXAMPLE QUERIES
+--   -- Week-1 retention rate per cohort
+--   SELECT cohort_label, retention_rate_bounded AS w1_retention
+--   FROM analytics.retention_login_weekly
+--   WHERE user_lifetime_week = 1
+--   ORDER BY cohort_week_start;
+--
+--   -- Overall average retention curve (all cohorts combined)
+--   SELECT user_lifetime_week,
+--          SUM(active_users_bounded)::float / NULLIF(SUM(cohort_users_w0), 0) AS avg_retention
+--   FROM analytics.retention_login_weekly
+--   GROUP BY 1 ORDER BY 1;
+-- =============================================================
+
+WITH params AS (SELECT 12::int AS max_weeks),
+events AS (
+  SELECT s.user_id::text AS user_id, s.created_at::timestamptz AS created_at,
+         DATE_TRUNC('week', s.created_at)::date AS week_start
+  FROM auth.sessions s WHERE s.user_id IS NOT NULL
+),
+first_login AS (
+  SELECT user_id, MIN(created_at) AS first_login_time,
+         DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
+  FROM events GROUP BY 1
+),
+activity_weeks AS (SELECT DISTINCT user_id, week_start FROM events),
+user_week_age AS (
+  SELECT aw.user_id, fl.cohort_week_start,
+         ((aw.week_start - DATE_TRUNC('week', fl.first_login_time)::date) / 7)::int AS user_lifetime_week
+  FROM activity_weeks aw JOIN first_login fl USING (user_id)
+  WHERE aw.week_start >= DATE_TRUNC('week', fl.first_login_time)::date
+),
+bounded_counts AS (
+  SELECT cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users_bounded
+  FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2
+),
+last_active AS (
+  SELECT cohort_week_start, user_id, MAX(user_lifetime_week) AS last_active_week FROM user_week_age GROUP BY 1,2
+),
+unbounded_counts AS (
+  SELECT la.cohort_week_start, gs AS user_lifetime_week, COUNT(*) AS retained_users_unbounded
+  FROM last_active la
+  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_week,(SELECT max_weeks FROM params))) gs
+  GROUP BY 1,2
+),
+cohort_sizes AS (SELECT cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_login GROUP BY 1),
+cohort_caps AS (
+  SELECT cs.cohort_week_start, cs.cohort_users,
+         LEAST((SELECT max_weeks FROM params),
+               GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date - cs.cohort_week_start)/7)::int)) AS cap_weeks
+  FROM cohort_sizes cs
+),
+grid AS (
+  SELECT cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
+  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
+)
+SELECT
+  g.cohort_week_start,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')                                    AS cohort_label,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')'       AS cohort_label_n,
+  g.user_lifetime_week, g.cohort_users,
+  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
+  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
+  CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END               AS cohort_users_w0
+FROM grid g
+LEFT JOIN bounded_counts   b ON b.cohort_week_start=g.cohort_week_start AND b.user_lifetime_week=g.user_lifetime_week
+LEFT JOIN unbounded_counts u ON u.cohort_week_start=g.cohort_week_start AND u.user_lifetime_week=g.user_lifetime_week
+ORDER BY g.cohort_week_start, g.user_lifetime_week
--- a/autogpt_platform/analytics/queries/user_block_spending.sql
+++ b/autogpt_platform/analytics/queries/user_block_spending.sql
@@ -0,0 +1,71 @@
+-- =============================================================
+-- View: analytics.user_block_spending
+-- Looker source alias: ds6  |  Charts: 5
+-- =============================================================
+-- DESCRIPTION
+--   One row per credit transaction (last 90 days).
+--   Shows how users spend credits broken down by block type,
+--   LLM provider and model.  Joins node execution stats for
+--   token-level detail.
+--
+-- SOURCE TABLES
+--   platform.CreditTransaction   — Credit debit/credit records
+--   platform.AgentNodeExecution  — Node execution stats (for token counts)
+--
+-- OUTPUT COLUMNS
+--   transactionKey        TEXT         Unique transaction identifier
+--   userId                TEXT         User who was charged
+--   amount                DECIMAL      Credit amount (positive = credit, negative = debit)
+--   negativeAmount        DECIMAL      amount * -1 (convenience for spend charts)
+--   transactionType       TEXT         Transaction type (e.g. 'USAGE', 'REFUND', 'TOP_UP')
+--   transactionTime       TIMESTAMPTZ  When the transaction was recorded
+--   blockId               TEXT         Block UUID that triggered the spend
+--   blockName             TEXT         Human-readable block name
+--   llm_provider          TEXT         LLM provider (e.g. 'openai', 'anthropic')
+--   llm_model             TEXT         Model name (e.g. 'gpt-4o', 'claude-3-5-sonnet')
+--   node_exec_id          TEXT         Linked node execution UUID
+--   llm_call_count        INT          LLM API calls made in that execution
+--   llm_retry_count       INT          LLM retries in that execution
+--   llm_input_token_count INT          Input tokens consumed
+--   llm_output_token_count INT         Output tokens produced
+--
+-- WINDOW
+--   Rolling 90 days (createdAt > CURRENT_DATE - 90 days)
+--
+-- EXAMPLE QUERIES
+--   -- Total spend per user (last 90 days); view aliases are unquoted, so they fold to lowercase
+--   SELECT userId, SUM(negativeAmount) AS total_spent
+--   FROM analytics.user_block_spending
+--   WHERE transactionType = 'USAGE'
+--   GROUP BY 1 ORDER BY total_spent DESC;
+--
+--   -- Spend by LLM provider + model
+--   SELECT "llm_provider", "llm_model",
+--          SUM(negativeAmount) AS total_cost,
+--          SUM("llm_input_token_count") AS input_tokens,
+--          SUM("llm_output_token_count") AS output_tokens
+--   FROM analytics.user_block_spending
+--   WHERE "llm_provider" IS NOT NULL
+--   GROUP BY 1, 2 ORDER BY total_cost DESC;
+-- =============================================================
+
+SELECT
+    c."transactionKey"                                        AS transactionKey,
+    c."userId"                                                AS userId,
+    c."amount"                                                AS amount,
+    c."amount" * -1                                           AS negativeAmount,
+    c."type"                                                  AS transactionType,
+    c."createdAt"                                             AS transactionTime,
+    c.metadata->>'block_id'                                   AS blockId,
+    c.metadata->>'block'                                      AS blockName,
+    c.metadata->'input'->'credentials'->>'provider'           AS llm_provider,
+    c.metadata->'input'->>'model'                             AS llm_model,
+    c.metadata->>'node_exec_id'                               AS node_exec_id,
+    (ne."stats"->>'llm_call_count')::int                       AS llm_call_count,
+    (ne."stats"->>'llm_retry_count')::int                      AS llm_retry_count,
+    (ne."stats"->>'input_token_count')::int                    AS llm_input_token_count,
+    (ne."stats"->>'output_token_count')::int                   AS llm_output_token_count
+FROM platform."CreditTransaction" c
+LEFT JOIN platform."AgentNodeExecution" ne
+       ON (c.metadata->>'node_exec_id') = ne."id"::text
+WHERE c."createdAt" > CURRENT_DATE - INTERVAL '90 days'
--- a/autogpt_platform/analytics/queries/user_onboarding.sql
+++ b/autogpt_platform/analytics/queries/user_onboarding.sql
@@ -0,0 +1,45 @@
+-- =============================================================
+-- View: analytics.user_onboarding
+-- Looker source alias: ds68  |  Charts: 3
+-- =============================================================
+-- DESCRIPTION
+--   One row per user onboarding record.  Contains the user's
+--   stated usage reason, selected integrations, completed
+--   onboarding steps and optional first agent selection.
+--   Full history (no date filter) since onboarding happens
+--   once per user.
+--
+-- SOURCE TABLES
+--   platform.UserOnboarding  — Onboarding state per user
+--
+-- OUTPUT COLUMNS
+--   id                            TEXT         Onboarding record UUID
+--   createdAt                     TIMESTAMPTZ  When onboarding started
+--   updatedAt                     TIMESTAMPTZ  Last update to onboarding state
+--   usageReason                   TEXT         Why user signed up (e.g. 'work', 'personal')
+--   integrations                  TEXT[]       Array of integration names the user selected
+--   userId                        TEXT         User UUID
+--   completedSteps                TEXT[]       Array of onboarding step enums completed
+--   selectedStoreListingVersionId TEXT         First marketplace agent the user chose (if any)
+--
+-- EXAMPLE QUERIES
+--   -- Usage reason breakdown
+--   SELECT "usageReason", COUNT(*) FROM analytics.user_onboarding GROUP BY 1;
+--
+--   -- Completions per step (user counts, not rates)
+--   SELECT step, COUNT(*) AS users_completed
+--   FROM analytics.user_onboarding
+--   CROSS JOIN LATERAL UNNEST("completedSteps") AS step
+--   GROUP BY 1 ORDER BY users_completed DESC;
+-- =============================================================
+
+SELECT
+    id,
+    "createdAt",
+    "updatedAt",
+    "usageReason",
+    integrations,
+    "userId",
+    "completedSteps",
+    "selectedStoreListingVersionId"
+FROM platform."UserOnboarding"
--- a/autogpt_platform/analytics/queries/user_onboarding_funnel.sql
+++ b/autogpt_platform/analytics/queries/user_onboarding_funnel.sql
@@ -0,0 +1,100 @@
+-- =============================================================
+-- View: analytics.user_onboarding_funnel
+-- Looker source alias: ds74  |  Charts: 1
+-- =============================================================
+-- DESCRIPTION
+--   Pre-aggregated onboarding funnel showing how many users
+--   completed each step and the conversion percentage from the
+--   previous step.  One row per onboarding step (all 22 steps
+--   always present, even with 0 completions — prevents sparse
+--   gaps from making LAG compare the wrong predecessors).
+--
+-- SOURCE TABLES
+--   platform.UserOnboarding  — Onboarding records with completedSteps array
+--
+-- OUTPUT COLUMNS
+--   step             TEXT     Onboarding step enum name (e.g. 'WELCOME', 'CONGRATS')
+--   step_order       INT      Numeric position in the funnel (1=first, 22=last)
+--   users_completed  BIGINT   Distinct users who completed this step
+--   pct_from_prev    NUMERIC  % of users from the previous step who reached this one
+--
+-- STEP ORDER
+--   1  WELCOME               9  MARKETPLACE_VISIT     17  SCHEDULE_AGENT
+--   2  USAGE_REASON         10  MARKETPLACE_ADD_AGENT  18  RUN_AGENTS
+--   3  INTEGRATIONS         11  MARKETPLACE_RUN_AGENT  19  RUN_3_DAYS
+--   4  AGENT_CHOICE         12  BUILDER_OPEN           20  TRIGGER_WEBHOOK
+--   5  AGENT_NEW_RUN        13  BUILDER_SAVE_AGENT     21  RUN_14_DAYS
+--   6  AGENT_INPUT          14  BUILDER_RUN_AGENT      22  RUN_AGENTS_100
+--   7  CONGRATS             15  VISIT_COPILOT
+--   8  GET_RESULTS          16  RE_RUN_AGENT
+--
+-- WINDOW
+--   Users who started onboarding in the last 90 days
+--
+-- EXAMPLE QUERIES
+--   -- Full funnel
+--   SELECT * FROM analytics.user_onboarding_funnel ORDER BY step_order;
+--
+--   -- Biggest drop-off point
+--   SELECT step, pct_from_prev FROM analytics.user_onboarding_funnel
+--   ORDER BY pct_from_prev ASC LIMIT 3;
+-- =============================================================
+
+WITH all_steps AS (
+  -- Complete ordered grid of all 22 steps so zero-completion steps
+  -- are always present, keeping LAG comparisons correct.
+  SELECT step_name, step_order
+  FROM (VALUES
+    ('WELCOME',               1),
+    ('USAGE_REASON',          2),
+    ('INTEGRATIONS',          3),
+    ('AGENT_CHOICE',          4),
+    ('AGENT_NEW_RUN',         5),
+    ('AGENT_INPUT',           6),
+    ('CONGRATS',              7),
+    ('GET_RESULTS',           8),
+    ('MARKETPLACE_VISIT',     9),
+    ('MARKETPLACE_ADD_AGENT', 10),
+    ('MARKETPLACE_RUN_AGENT', 11),
+    ('BUILDER_OPEN',          12),
+    ('BUILDER_SAVE_AGENT',    13),
+    ('BUILDER_RUN_AGENT',     14),
+    ('VISIT_COPILOT',         15),
+    ('RE_RUN_AGENT',          16),
+    ('SCHEDULE_AGENT',        17),
+    ('RUN_AGENTS',            18),
+    ('RUN_3_DAYS',            19),
+    ('TRIGGER_WEBHOOK',       20),
+    ('RUN_14_DAYS',           21),
+    ('RUN_AGENTS_100',        22)
+  ) AS t(step_name, step_order)
+),
+raw AS (
+  SELECT
+      u."userId",
+      step_txt::text AS step
+  FROM platform."UserOnboarding" u
+  CROSS JOIN LATERAL UNNEST(u."completedSteps") AS step_txt
+  WHERE u."createdAt" >= CURRENT_DATE - INTERVAL '90 days'
+),
+step_counts AS (
+  SELECT step, COUNT(DISTINCT "userId") AS users_completed
+  FROM raw GROUP BY step
+),
+funnel AS (
+  SELECT
+      a.step_name                          AS step,
+      a.step_order,
+      COALESCE(sc.users_completed, 0)      AS users_completed,
+      ROUND(
+        100.0 * COALESCE(sc.users_completed, 0)
+        / NULLIF(
+            LAG(COALESCE(sc.users_completed, 0)) OVER (ORDER BY a.step_order),
+            0
+          ),
+        2
+      )                                    AS pct_from_prev
+  FROM all_steps a
+  LEFT JOIN step_counts sc ON sc.step = a.step_name
+)
+SELECT * FROM funnel ORDER BY step_order
--- a/autogpt_platform/analytics/queries/user_onboarding_integration.sql
+++ b/autogpt_platform/analytics/queries/user_onboarding_integration.sql
@@ -0,0 +1,41 @@
+-- =============================================================
+-- View: analytics.user_onboarding_integration
+-- Looker source alias: ds75  |  Charts: 1
+-- =============================================================
+-- DESCRIPTION
+--   Pre-aggregated count of users who selected each integration
+--   during onboarding.  One row per integration type, sorted
+--   by popularity.
+--
+-- SOURCE TABLES
+--   platform.UserOnboarding  — integrations array column
+--
+-- OUTPUT COLUMNS
+--   integration            TEXT    Integration name (e.g. 'github', 'slack', 'notion')
+--   users_with_integration BIGINT  Distinct users who selected this integration
+--
+-- WINDOW
+--   Users who started onboarding in the last 90 days
+--
+-- EXAMPLE QUERIES
+--   -- Full integration popularity ranking
+--   SELECT * FROM analytics.user_onboarding_integration ORDER BY users_with_integration DESC;
+--
+--   -- Top 5 integrations
+--   SELECT * FROM analytics.user_onboarding_integration ORDER BY users_with_integration DESC LIMIT 5;
+-- =============================================================
+
+WITH exploded AS (
+  SELECT
+      u."userId" AS user_id,
+      UNNEST(u."integrations") AS integration
+  FROM platform."UserOnboarding" u
+  WHERE u."createdAt" >= CURRENT_DATE - INTERVAL '90 days'
+)
+SELECT
+    integration,
+    COUNT(DISTINCT user_id) AS users_with_integration
+FROM exploded
+WHERE integration IS NOT NULL AND integration <> ''
+GROUP BY integration
+ORDER BY users_with_integration DESC
--- a/autogpt_platform/analytics/queries/users_activities.sql
+++ b/autogpt_platform/analytics/queries/users_activities.sql
@@ -0,0 +1,145 @@
+-- =============================================================
+-- View: analytics.users_activities
+-- Looker source alias: ds56  |  Charts: 5
+-- =============================================================
+-- DESCRIPTION
+--   One row per user with lifetime activity summary.
+--   Joins login sessions with agent graphs, executions and
+--   node-level runs to give a full picture of how engaged
+--   each user is.  Includes a convenience flag for 7-day
+--   activation (did the user return at least 7 days after
+--   their first login?).
+--
+-- SOURCE TABLES
+--   auth.sessions                    — Login/session records
+--   platform.AgentGraph              — Graphs (agents) built by the user
+--   platform.AgentGraphExecution     — Agent run history
+--   platform.AgentNodeExecution      — Individual block execution history
+--
+-- PERFORMANCE NOTE
+--   Each CTE aggregates its own table independently by userId.
+--   This avoids the fan-out that occurs when driving every join
+--   from user_logins across the two largest tables
+--   (AgentGraphExecution and AgentNodeExecution).
+--
+-- OUTPUT COLUMNS
+--   user_id                   TEXT         Supabase user UUID
+--   first_login_time          TIMESTAMPTZ  First ever session created_at
+--   last_login_time           TIMESTAMPTZ  Most recent session created_at
+--   last_visit_time           TIMESTAMPTZ  Max of last refresh or login
+--   last_agent_save_time      TIMESTAMPTZ  Last time user saved an agent graph
+--   agent_count               BIGINT       Number of distinct active graphs built (0 if none)
+--   first_agent_run_time      TIMESTAMPTZ  First ever graph execution
+--   last_agent_run_time       TIMESTAMPTZ  Most recent graph execution
+--   unique_agent_runs         BIGINT       Distinct agent graphs ever run (0 if none)
+--   agent_runs                BIGINT       Total graph execution count (0 if none)
+--   node_execution_count      BIGINT       Total node executions across all runs
+--   node_execution_failed     BIGINT       Node executions with FAILED status
+--   node_execution_completed  BIGINT       Node executions with COMPLETED status
+--   node_execution_terminated BIGINT       Node executions with TERMINATED status
+--   node_execution_queued     BIGINT       Node executions with QUEUED status
+--   node_execution_running    BIGINT       Node executions with RUNNING status
+--   is_active_after_7d        INT          1=returned after day 7, 0=did not, NULL=too early to tell
+--   node_execution_incomplete BIGINT       Node executions with INCOMPLETE status
+--   node_execution_review     BIGINT       Node executions with REVIEW status
+--
+-- EXAMPLE QUERIES
+--   -- Users who ran at least one agent and returned after 7 days
+--   SELECT COUNT(*) FROM analytics.users_activities
+--   WHERE agent_runs > 0 AND is_active_after_7d = 1;
+--
+--   -- Top 10 most active users by agent runs
+--   SELECT user_id, agent_runs, node_execution_count
+--   FROM analytics.users_activities
+--   ORDER BY agent_runs DESC LIMIT 10;
+--
+--   -- 7-day activation rate
+--   SELECT
+--     SUM(CASE WHEN is_active_after_7d = 1 THEN 1 ELSE 0 END)::float
+--     / NULLIF(COUNT(CASE WHEN is_active_after_7d IS NOT NULL THEN 1 END), 0)
+--     AS activation_rate
+--   FROM analytics.users_activities;
+-- =============================================================
+
+WITH user_logins AS (
+  SELECT
+    user_id::text                                    AS user_id,
+    MIN(created_at)                                  AS first_login_time,
+    MAX(created_at)                                  AS last_login_time,
+    GREATEST(
+      MAX(refreshed_at)::timestamptz,
+      MAX(created_at)::timestamptz
+    )                                                AS last_visit_time
+  FROM auth.sessions
+  GROUP BY user_id
+),
+user_agents AS (
+  -- Aggregate AgentGraph directly by userId (no fan-out from user_logins)
+  SELECT
+    "userId"::text                AS user_id,
+    MAX("updatedAt")              AS last_agent_save_time,
+    COUNT(DISTINCT "id")          AS agent_count
+  FROM platform."AgentGraph"
+  WHERE "isActive"
+  GROUP BY "userId"
+),
+user_graph_runs AS (
+  -- Aggregate AgentGraphExecution directly by userId
+  SELECT
+    "userId"::text                        AS user_id,
+    MIN("createdAt")                      AS first_agent_run_time,
+    MAX("createdAt")                      AS last_agent_run_time,
+    COUNT(DISTINCT "agentGraphId")        AS unique_agent_runs,
+    COUNT("id")                           AS agent_runs
+  FROM platform."AgentGraphExecution"
+  GROUP BY "userId"
+),
+user_node_runs AS (
+  -- Aggregate AgentNodeExecution directly; resolve userId via a
+  -- single join to AgentGraphExecution instead of fanning out from
+  -- user_logins through both large tables.
+  SELECT
+    g."userId"::text                                                   AS user_id,
+    COUNT(*)                                                           AS node_execution_count,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'FAILED')             AS node_execution_failed,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'COMPLETED')          AS node_execution_completed,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'TERMINATED')         AS node_execution_terminated,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'QUEUED')             AS node_execution_queued,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'RUNNING')            AS node_execution_running,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'INCOMPLETE')         AS node_execution_incomplete,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'REVIEW')             AS node_execution_review
+  FROM platform."AgentNodeExecution" n
+  JOIN platform."AgentGraphExecution" g
+    ON g."id" = n."agentGraphExecutionId"
+  GROUP BY g."userId"
+)
+SELECT
+  ul.user_id,
+  ul.first_login_time,
+  ul.last_login_time,
+  ul.last_visit_time,
+  ua.last_agent_save_time,
+  COALESCE(ua.agent_count, 0)             AS agent_count,
+  gr.first_agent_run_time,
+  gr.last_agent_run_time,
+  COALESCE(gr.unique_agent_runs, 0)       AS unique_agent_runs,
+  COALESCE(gr.agent_runs, 0)              AS agent_runs,
+  COALESCE(nr.node_execution_count, 0)      AS node_execution_count,
+  COALESCE(nr.node_execution_failed, 0)     AS node_execution_failed,
+  COALESCE(nr.node_execution_completed, 0)  AS node_execution_completed,
+  COALESCE(nr.node_execution_terminated, 0) AS node_execution_terminated,
+  COALESCE(nr.node_execution_queued, 0)     AS node_execution_queued,
+  COALESCE(nr.node_execution_running, 0)    AS node_execution_running,
+  CASE
+    WHEN ul.first_login_time < NOW() - INTERVAL '7 days'
+     AND ul.last_visit_time  >= ul.first_login_time + INTERVAL '7 days' THEN 1
+    WHEN ul.first_login_time < NOW() - INTERVAL '7 days'
+     AND ul.last_visit_time  <  ul.first_login_time + INTERVAL '7 days' THEN 0
+    ELSE NULL
+  END AS is_active_after_7d,
+  COALESCE(nr.node_execution_incomplete, 0) AS node_execution_incomplete,
+  COALESCE(nr.node_execution_review, 0)     AS node_execution_review
+FROM user_logins ul
+LEFT JOIN user_agents     ua ON ul.user_id = ua.user_id
+LEFT JOIN user_graph_runs gr ON ul.user_id = gr.user_id
+LEFT JOIN user_node_runs  nr ON ul.user_id = nr.user_id
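Note on `is_active_after_7d`: the CASE expression above is three-valued, which is easy to misread. The sketch below restates it in Python under the assumption that both timestamps are non-null and timezone-aware. The function name and the explicit `now` parameter are hypothetical and exist only to make the logic testable.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SEVEN_DAYS = timedelta(days=7)


def is_active_after_7d(
    first_login: datetime, last_visit: datetime, now: datetime
) -> Optional[int]:
    """Return 1, 0, or None following the SQL CASE branches."""
    # ELSE NULL: the user has not yet had a full 7 days since first login.
    if first_login >= now - SEVEN_DAYS:
        return None
    # 1 if the latest visit falls on or after first login + 7 days, else 0.
    return 1 if last_visit >= first_login + SEVEN_DAYS else 0


now = datetime(2026, 4, 8, tzinfo=timezone.utc)
old_user = now - timedelta(days=30)
returned = is_active_after_7d(old_user, old_user + timedelta(days=10), now)
churned = is_active_after_7d(old_user, old_user + timedelta(days=2), now)
too_early = is_active_after_7d(now - timedelta(days=3), now, now)
```

Because users in the "too early" window yield `None`, they drop out of both the numerator and the denominator in the activation-rate example query rather than counting as inactive.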
--- a/autogpt_platform/autogpt_libs/poetry.lock
+++ b/autogpt_platform/autogpt_libs/poetry.lock
@@ -448,61 +448,61 @@ toml = ["tomli ; python_full_version <= \"3.11.0a6\""]

 [[package]]
 name = "cryptography"
-version = "46.0.4"
+version = "46.0.5"
 description = "cryptography is a package which provides cryptographic recipes and primitives to Python developers."
 optional = false
 python-versions = "!=3.9.0,!=3.9.1,>=3.8"
 groups = ["main"]
 files = [
-    {file = "cryptography-46.0.4-cp311-abi3-macosx_10_9_universal2.whl", hash = "sha256:281526e865ed4166009e235afadf3a4c4cba6056f99336a99efba65336fd5485"},
-    {file = "cryptography-46.0.4-cp311-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:5f14fba5bf6f4390d7ff8f086c566454bff0411f6d8aa7af79c88b6f9267aecc"},
-    {file = "cryptography-46.0.4-cp311-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:47bcd19517e6389132f76e2d5303ded6cf3f78903da2158a671be8de024f4cd0"},
-    {file = "cryptography-46.0.4-cp311-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:01df4f50f314fbe7009f54046e908d1754f19d0c6d3070df1e6268c5a4af09fa"},
-    {file = "cryptography-46.0.4-cp311-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:5aa3e463596b0087b3da0dbe2b2487e9fc261d25da85754e30e3b40637d61f81"},
-    {file = "cryptography-46.0.4-cp311-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:0a9ad24359fee86f131836a9ac3bffc9329e956624a2d379b613f8f8abaf5255"},
-    {file = "cryptography-46.0.4-cp311-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:dc1272e25ef673efe72f2096e92ae39dea1a1a450dd44918b15351f72c5a168e"},
-    {file = "cryptography-46.0.4-cp311-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:de0f5f4ec8711ebc555f54735d4c673fc34b65c44283895f1a08c2b49d2fd99c"},
-    {file = "cryptography-46.0.4-cp311-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:eeeb2e33d8dbcccc34d64651f00a98cb41b2dc69cef866771a5717e6734dfa32"},
-    {file = "cryptography-46.0.4-cp311-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:3d425eacbc9aceafd2cb429e42f4e5d5633c6f873f5e567077043ef1b9bbf616"},
-    {file = "cryptography-46.0.4-cp311-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:91627ebf691d1ea3976a031b61fb7bac1ccd745afa03602275dda443e11c8de0"},
-    {file = "cryptography-46.0.4-cp311-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:2d08bc22efd73e8854b0b7caff402d735b354862f1145d7be3b9c0f740fef6a0"},
-    {file = "cryptography-46.0.4-cp311-abi3-win32.whl", hash = "sha256:82a62483daf20b8134f6e92898da70d04d0ef9a75829d732ea1018678185f4f5"},
-    {file = "cryptography-46.0.4-cp311-abi3-win_amd64.whl", hash = "sha256:6225d3ebe26a55dbc8ead5ad1265c0403552a63336499564675b29eb3184c09b"},
-    {file = "cryptography-46.0.4-cp314-cp314t-macosx_10_9_universal2.whl", hash = "sha256:485e2b65d25ec0d901bca7bcae0f53b00133bf3173916d8e421f6fddde103908"},
-    {file = "cryptography-46.0.4-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:078e5f06bd2fa5aea5a324f2a09f914b1484f1d0c2a4d6a8a28c74e72f65f2da"},
-    {file = "cryptography-46.0.4-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:dce1e4f068f03008da7fa51cc7abc6ddc5e5de3e3d1550334eaf8393982a5829"},
-    {file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:2067461c80271f422ee7bdbe79b9b4be54a5162e90345f86a23445a0cf3fd8a2"},
-    {file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_28_ppc64le.whl", hash = "sha256:c92010b58a51196a5f41c3795190203ac52edfd5dc3ff99149b4659eba9d2085"},
-    {file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:829c2b12bbc5428ab02d6b7f7e9bbfd53e33efd6672d21341f2177470171ad8b"},
-    {file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_31_armv7l.whl", hash = "sha256:62217ba44bf81b30abaeda1488686a04a702a261e26f87db51ff61d9d3510abd"},
-    {file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_34_aarch64.whl", hash = "sha256:9c2da296c8d3415b93e6053f5a728649a87a48ce084a9aaf51d6e46c87c7f2d2"},
-    {file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_34_ppc64le.whl", hash = "sha256:9b34d8ba84454641a6bf4d6762d15847ecbd85c1316c0a7984e6e4e9f748ec2e"},
-    {file = "cryptography-46.0.4-cp314-cp314t-manylinux_2_34_x86_64.whl", hash = "sha256:df4a817fa7138dd0c96c8c8c20f04b8aaa1fac3bbf610913dcad8ea82e1bfd3f"},
-    {file = "cryptography-46.0.4-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:b1de0ebf7587f28f9190b9cb526e901bf448c9e6a99655d2b07fff60e8212a82"},
-    {file = "cryptography-46.0.4-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:9b4d17bc7bd7cdd98e3af40b441feaea4c68225e2eb2341026c84511ad246c0c"},
-    {file = "cryptography-46.0.4-cp314-cp314t-win32.whl", hash = "sha256:c411f16275b0dea722d76544a61d6421e2cc829ad76eec79280dbdc9ddf50061"},
-    {file = "cryptography-46.0.4-cp314-cp314t-win_amd64.whl", hash = "sha256:728fedc529efc1439eb6107b677f7f7558adab4553ef8669f0d02d42d7b959a7"},
-    {file = "cryptography-46.0.4-cp38-abi3-macosx_10_9_universal2.whl", hash = "sha256:a9556ba711f7c23f77b151d5798f3ac44a13455cc68db7697a1096e6d0563cab"},
-    {file = "cryptography-46.0.4-cp38-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:8bf75b0259e87fa70bddc0b8b4078b76e7fd512fd9afae6c1193bcf440a4dbef"},
-    {file = "cryptography-46.0.4-cp38-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:3c268a3490df22270955966ba236d6bc4a8f9b6e4ffddb78aac535f1a5ea471d"},
-    {file = "cryptography-46.0.4-cp38-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:812815182f6a0c1d49a37893a303b44eaac827d7f0d582cecfc81b6427f22973"},
-    {file = "cryptography-46.0.4-cp38-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:a90e43e3ef65e6dcf969dfe3bb40cbf5aef0d523dff95bfa24256be172a845f4"},
-    {file = "cryptography-46.0.4-cp38-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:a05177ff6296644ef2876fce50518dffb5bcdf903c85250974fc8bc85d54c0af"},
-    {file = "cryptography-46.0.4-cp38-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:daa392191f626d50f1b136c9b4cf08af69ca8279d110ea24f5c2700054d2e263"},
-    {file = "cryptography-46.0.4-cp38-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:e07ea39c5b048e085f15923511d8121e4a9dc45cee4e3b970ca4f0d338f23095"},
-    {file = "cryptography-46.0.4-cp38-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:d5a45ddc256f492ce42a4e35879c5e5528c09cd9ad12420828c972951d8e016b"},
-    {file = "cryptography-46.0.4-cp38-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:6bb5157bf6a350e5b28aee23beb2d84ae6f5be390b2f8ee7ea179cda077e1019"},
-    {file = "cryptography-46.0.4-cp38-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:dd5aba870a2c40f87a3af043e0dee7d9eb02d4aff88a797b48f2b43eff8c3ab4"},
-    {file = "cryptography-46.0.4-cp38-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:93d8291da8d71024379ab2cb0b5c57915300155ad42e07f76bea6ad838d7e59b"},
-    {file = "cryptography-46.0.4-cp38-abi3-win32.whl", hash = "sha256:0563655cb3c6d05fb2afe693340bc050c30f9f34e15763361cf08e94749401fc"},
-    {file = "cryptography-46.0.4-cp38-abi3-win_amd64.whl", hash = "sha256:fa0900b9ef9c49728887d1576fd8d9e7e3ea872fa9b25ef9b64888adc434e976"},
-    {file = "cryptography-46.0.4-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:766330cce7416c92b5e90c3bb71b1b79521760cdcfc3a6a1a182d4c9fab23d2b"},
-    {file = "cryptography-46.0.4-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:c236a44acfb610e70f6b3e1c3ca20ff24459659231ef2f8c48e879e2d32b73da"},
-    {file = "cryptography-46.0.4-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:8a15fb869670efa8f83cbffbc8753c1abf236883225aed74cd179b720ac9ec80"},
-    {file = "cryptography-46.0.4-pp311-pypy311_pp73-manylinux_2_34_aarch64.whl", hash = "sha256:fdc3daab53b212472f1524d070735b2f0c214239df131903bae1d598016fa822"},
-    {file = "cryptography-46.0.4-pp311-pypy311_pp73-manylinux_2_34_x86_64.whl", hash = "sha256:44cc0675b27cadb71bdbb96099cca1fa051cd11d2ade09e5cd3a2edb929ed947"},
-    {file = "cryptography-46.0.4-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:be8c01a7d5a55f9a47d1888162b76c8f49d62b234d88f0ff91a9fbebe32ffbc3"},
-    {file = "cryptography-46.0.4.tar.gz", hash = "sha256:bfd019f60f8abc2ed1b9be4ddc21cfef059c841d86d710bb69909a688cbb8f59"},
+    {file = "cryptography-46.0.5-cp311-abi3-macosx_10_9_universal2.whl", hash = "sha256:351695ada9ea9618b3500b490ad54c739860883df6c1f555e088eaf25b1bbaad"},
+    {file = "cryptography-46.0.5-cp311-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:c18ff11e86df2e28854939acde2d003f7984f721eba450b56a200ad90eeb0e6b"},
+    {file = "cryptography-46.0.5-cp311-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:4d7e3d356b8cd4ea5aff04f129d5f66ebdc7b6f8eae802b93739ed520c47c79b"},
+    {file = "cryptography-46.0.5-cp311-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:50bfb6925eff619c9c023b967d5b77a54e04256c4281b0e21336a130cd7fc263"},
+    {file = "cryptography-46.0.5-cp311-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:803812e111e75d1aa73690d2facc295eaefd4439be1023fefc4995eaea2af90d"},
+    {file = "cryptography-46.0.5-cp311-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:3ee190460e2fbe447175cda91b88b84ae8322a104fc27766ad09428754a618ed"},
+    {file = "cryptography-46.0.5-cp311-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:f145bba11b878005c496e93e257c1e88f154d278d2638e6450d17e0f31e558d2"},
+    {file = "cryptography-46.0.5-cp311-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:e9251e3be159d1020c4030bd2e5f84d6a43fe54b6c19c12f51cde9542a2817b2"},
+    {file = "cryptography-46.0.5-cp311-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:47fb8a66058b80e509c47118ef8a75d14c455e81ac369050f20ba0d23e77fee0"},
+    {file = "cryptography-46.0.5-cp311-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:4c3341037c136030cb46e4b1e17b7418ea4cbd9dd207e4a6f3b2b24e0d4ac731"},
+    {file = "cryptography-46.0.5-cp311-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:890bcb4abd5a2d3f852196437129eb3667d62630333aacc13dfd470fad3aaa82"},
+    {file = "cryptography-46.0.5-cp311-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:80a8d7bfdf38f87ca30a5391c0c9ce4ed2926918e017c29ddf643d0ed2778ea1"},
+    {file = "cryptography-46.0.5-cp311-abi3-win32.whl", hash = "sha256:60ee7e19e95104d4c03871d7d7dfb3d22ef8a9b9c6778c94e1c8fcc8365afd48"},
+    {file = "cryptography-46.0.5-cp311-abi3-win_amd64.whl", hash = "sha256:38946c54b16c885c72c4f59846be9743d699eee2b69b6988e0a00a01f46a61a4"},
+    {file = "cryptography-46.0.5-cp314-cp314t-macosx_10_9_universal2.whl", hash = "sha256:94a76daa32eb78d61339aff7952ea819b1734b46f73646a07decb40e5b3448e2"},
+    {file = "cryptography-46.0.5-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:5be7bf2fb40769e05739dd0046e7b26f9d4670badc7b032d6ce4db64dddc0678"},
+    {file = "cryptography-46.0.5-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:fe346b143ff9685e40192a4960938545c699054ba11d4f9029f94751e3f71d87"},
+    {file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:c69fd885df7d089548a42d5ec05be26050ebcd2283d89b3d30676eb32ff87dee"},
+    {file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_28_ppc64le.whl", hash = "sha256:8293f3dea7fc929ef7240796ba231413afa7b68ce38fd21da2995549f5961981"},
+    {file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:1abfdb89b41c3be0365328a410baa9df3ff8a9110fb75e7b52e66803ddabc9a9"},
+    {file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_31_armv7l.whl", hash = "sha256:d66e421495fdb797610a08f43b05269e0a5ea7f5e652a89bfd5a7d3c1dee3648"},
+    {file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_34_aarch64.whl", hash = "sha256:4e817a8920bfbcff8940ecfd60f23d01836408242b30f1a708d93198393a80b4"},
+    {file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_34_ppc64le.whl", hash = "sha256:68f68d13f2e1cb95163fa3b4db4bf9a159a418f5f6e7242564fc75fcae667fd0"},
+    {file = "cryptography-46.0.5-cp314-cp314t-manylinux_2_34_x86_64.whl", hash = "sha256:a3d1fae9863299076f05cb8a778c467578262fae09f9dc0ee9b12eb4268ce663"},
+    {file = "cryptography-46.0.5-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:c4143987a42a2397f2fc3b4d7e3a7d313fbe684f67ff443999e803dd75a76826"},
+    {file = "cryptography-46.0.5-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:7d731d4b107030987fd61a7f8ab512b25b53cef8f233a97379ede116f30eb67d"},
+    {file = "cryptography-46.0.5-cp314-cp314t-win32.whl", hash = "sha256:c3bcce8521d785d510b2aad26ae2c966092b7daa8f45dd8f44734a104dc0bc1a"},
+    {file = "cryptography-46.0.5-cp314-cp314t-win_amd64.whl", hash = "sha256:4d8ae8659ab18c65ced284993c2265910f6c9e650189d4e3f68445ef82a810e4"},
+    {file = "cryptography-46.0.5-cp38-abi3-macosx_10_9_universal2.whl", hash = "sha256:4108d4c09fbbf2789d0c926eb4152ae1760d5a2d97612b92d508d96c861e4d31"},
+    {file = "cryptography-46.0.5-cp38-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:7d1f30a86d2757199cb2d56e48cce14deddf1f9c95f1ef1b64ee91ea43fe2e18"},
+    {file = "cryptography-46.0.5-cp38-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:039917b0dc418bb9f6edce8a906572d69e74bd330b0b3fea4f79dab7f8ddd235"},
+    {file = "cryptography-46.0.5-cp38-abi3-manylinux_2_28_aarch64.whl", hash = "sha256:ba2a27ff02f48193fc4daeadf8ad2590516fa3d0adeeb34336b96f7fa64c1e3a"},
+    {file = "cryptography-46.0.5-cp38-abi3-manylinux_2_28_ppc64le.whl", hash = "sha256:61aa400dce22cb001a98014f647dc21cda08f7915ceb95df0c9eaf84b4b6af76"},
+    {file = "cryptography-46.0.5-cp38-abi3-manylinux_2_28_x86_64.whl", hash = "sha256:3ce58ba46e1bc2aac4f7d9290223cead56743fa6ab94a5d53292ffaac6a91614"},
+    {file = "cryptography-46.0.5-cp38-abi3-manylinux_2_31_armv7l.whl", hash = "sha256:420d0e909050490d04359e7fdb5ed7e667ca5c3c402b809ae2563d7e66a92229"},
+    {file = "cryptography-46.0.5-cp38-abi3-manylinux_2_34_aarch64.whl", hash = "sha256:582f5fcd2afa31622f317f80426a027f30dc792e9c80ffee87b993200ea115f1"},
+    {file = "cryptography-46.0.5-cp38-abi3-manylinux_2_34_ppc64le.whl", hash = "sha256:bfd56bb4b37ed4f330b82402f6f435845a5f5648edf1ad497da51a8452d5d62d"},
+    {file = "cryptography-46.0.5-cp38-abi3-manylinux_2_34_x86_64.whl", hash = "sha256:a3d507bb6a513ca96ba84443226af944b0f7f47dcc9a399d110cd6146481d24c"},
+    {file = "cryptography-46.0.5-cp38-abi3-musllinux_1_2_aarch64.whl", hash = "sha256:9f16fbdf4da055efb21c22d81b89f155f02ba420558db21288b3d0035bafd5f4"},
+    {file = "cryptography-46.0.5-cp38-abi3-musllinux_1_2_x86_64.whl", hash = "sha256:ced80795227d70549a411a4ab66e8ce307899fad2220ce5ab2f296e687eacde9"},
+    {file = "cryptography-46.0.5-cp38-abi3-win32.whl", hash = "sha256:02f547fce831f5096c9a567fd41bc12ca8f11df260959ecc7c3202555cc47a72"},
+    {file = "cryptography-46.0.5-cp38-abi3-win_amd64.whl", hash = "sha256:556e106ee01aa13484ce9b0239bca667be5004efb0aabbed28d353df86445595"},
+    {file = "cryptography-46.0.5-pp311-pypy311_pp73-macosx_11_0_arm64.whl", hash = "sha256:3b4995dc971c9fb83c25aa44cf45f02ba86f71ee600d81091c2f0cbae116b06c"},
+    {file = "cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_28_aarch64.whl", hash = "sha256:bc84e875994c3b445871ea7181d424588171efec3e185dced958dad9e001950a"},
+    {file = "cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_28_x86_64.whl", hash = "sha256:2ae6971afd6246710480e3f15824ed3029a60fc16991db250034efd0b9fb4356"},
+    {file = "cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_34_aarch64.whl", hash = "sha256:d861ee9e76ace6cf36a6a89b959ec08e7bc2493ee39d07ffe5acb23ef46d27da"},
+    {file = "cryptography-46.0.5-pp311-pypy311_pp73-manylinux_2_34_x86_64.whl", hash = "sha256:2b7a67c9cd56372f3249b39699f2ad479f6991e62ea15800973b956f4b73e257"},
+    {file = "cryptography-46.0.5-pp311-pypy311_pp73-win_amd64.whl", hash = "sha256:8456928655f856c6e1533ff59d5be76578a7157224dbd9ce6872f25055ab9ab7"},
+    {file = "cryptography-46.0.5.tar.gz", hash = "sha256:abace499247268e3757271b2f1e244b36b06f8515cf27c4d49468fc9eb16e93d"},
 ]

 [package.dependencies]
@@ -516,7 +516,7 @@ nox = ["nox[uv] (>=2024.4.15)"]
 pep8test = ["check-sdist", "click (>=8.0.1)", "mypy (>=1.14)", "ruff (>=0.11.11)"]
 sdist = ["build (>=1.0.0)"]
 ssh = ["bcrypt (>=3.1.5)"]
-test = ["certifi (>=2024)", "cryptography-vectors (==46.0.4)", "pretend (>=0.7)", "pytest (>=7.4.0)", "pytest-benchmark (>=4.0)", "pytest-cov (>=2.10.1)", "pytest-xdist (>=3.5.0)"]
+test = ["certifi (>=2024)", "cryptography-vectors (==46.0.5)", "pretend (>=0.7)", "pytest (>=7.4.0)", "pytest-benchmark (>=4.0)", "pytest-cov (>=2.10.1)", "pytest-xdist (>=3.5.0)"]
 test-randomorder = ["pytest-randomly"]

 [[package]]
@@ -570,24 +570,25 @@ tests = ["coverage", "coveralls", "dill", "mock", "nose"]

 [[package]]
 name = "fastapi"
-version = "0.128.0"
+version = "0.128.7"
 description = "FastAPI framework, high performance, easy to learn, fast to code, ready for production"
 optional = false
 python-versions = ">=3.9"
 groups = ["main"]
 files = [
-    {file = "fastapi-0.128.0-py3-none-any.whl", hash = "sha256:aebd93f9716ee3b4f4fcfe13ffb7cf308d99c9f3ab5622d8877441072561582d"},
-    {file = "fastapi-0.128.0.tar.gz", hash = "sha256:1cc179e1cef10a6be60ffe429f79b829dce99d8de32d7acb7e6c8dfdf7f2645a"},
+    {file = "fastapi-0.128.7-py3-none-any.whl", hash = "sha256:6bd9bd31cb7047465f2d3fa3ba3f33b0870b17d4eaf7cdb36d1576ab060ad662"},
+    {file = "fastapi-0.128.7.tar.gz", hash = "sha256:783c273416995486c155ad2c0e2b45905dedfaf20b9ef8d9f6a9124670639a24"},
 ]

 [package.dependencies]
 annotated-doc = ">=0.0.2"
 pydantic = ">=2.7.0"
-starlette = ">=0.40.0,<0.51.0"
+starlette = ">=0.40.0,<1.0.0"
 typing-extensions = ">=4.8.0"
+typing-inspection = ">=0.4.2"

 [package.extras]
-all = ["email-validator (>=2.0.0)", "fastapi-cli[standard] (>=0.0.8)", "httpx (>=0.23.0,<1.0.0)", "itsdangerous (>=1.1.0)", "jinja2 (>=3.1.5)", "orjson (>=3.2.1)", "pydantic-extra-types (>=2.0.0)", "pydantic-settings (>=2.0.0)", "python-multipart (>=0.0.18)", "pyyaml (>=5.3.1)", "ujson (>=4.0.1,!=4.0.2,!=4.1.0,!=4.2.0,!=4.3.0,!=5.0.0,!=5.1.0)", "uvicorn[standard] (>=0.12.0)"]
+all = ["email-validator (>=2.0.0)", "fastapi-cli[standard] (>=0.0.8)", "httpx (>=0.23.0,<1.0.0)", "itsdangerous (>=1.1.0)", "jinja2 (>=3.1.5)", "orjson (>=3.9.3)", "pydantic-extra-types (>=2.0.0)", "pydantic-settings (>=2.0.0)", "python-multipart (>=0.0.18)", "pyyaml (>=5.3.1)", "ujson (>=5.8.0)", "uvicorn[standard] (>=0.12.0)"]
 standard = ["email-validator (>=2.0.0)", "fastapi-cli[standard] (>=0.0.8)", "httpx (>=0.23.0,<1.0.0)", "jinja2 (>=3.1.5)", "pydantic-extra-types (>=2.0.0)", "pydantic-settings (>=2.0.0)", "python-multipart (>=0.0.18)", "uvicorn[standard] (>=0.12.0)"]
 standard-no-fastapi-cloud-cli = ["email-validator (>=2.0.0)", "fastapi-cli[standard-no-fastapi-cloud-cli] (>=0.0.8)", "httpx (>=0.23.0,<1.0.0)", "jinja2 (>=3.1.5)", "pydantic-extra-types (>=2.0.0)", "pydantic-settings (>=2.0.0)", "python-multipart (>=0.0.18)", "uvicorn[standard] (>=0.12.0)"]

@@ -1062,14 +1063,14 @@ urllib3 = ">=1.26.0,<3"

 [[package]]
 name = "launchdarkly-server-sdk"
-version = "9.14.1"
+version = "9.15.0"
 description = "LaunchDarkly SDK for Python"
 optional = false
-python-versions = ">=3.9"
+python-versions = ">=3.10"
 groups = ["main"]
 files = [
-    {file = "launchdarkly_server_sdk-9.14.1-py3-none-any.whl", hash = "sha256:a9e2bd9ecdef845cd631ae0d4334a1115e5b44257c42eb2349492be4bac7815c"},
-    {file = "launchdarkly_server_sdk-9.14.1.tar.gz", hash = "sha256:1df44baf0a0efa74d8c1dad7a00592b98bce7d19edded7f770da8dbc49922213"},
+    {file = "launchdarkly_server_sdk-9.15.0-py3-none-any.whl", hash = "sha256:c267e29bfa3fb5e2a06a208448ada6ed5557a2924979b8d79c970b45d227c668"},
+    {file = "launchdarkly_server_sdk-9.15.0.tar.gz", hash = "sha256:f31441b74bc1a69c381db57c33116509e407a2612628ad6dff0a7dbb39d5020b"},
 ]

 [package.dependencies]
@@ -1478,14 +1479,14 @@ testing = ["coverage", "pytest", "pytest-benchmark"]

 [[package]]
 name = "postgrest"
-version = "2.27.2"
+version = "2.28.0"
 description = "PostgREST client for Python. This library provides an ORM interface to PostgREST."
 optional = false
 python-versions = ">=3.9"
 groups = ["main"]
 files = [
-    {file = "postgrest-2.27.2-py3-none-any.whl", hash = "sha256:1666fef3de05ca097a314433dd5ae2f2d71c613cb7b233d0f468c4ffe37277da"},
-    {file = "postgrest-2.27.2.tar.gz", hash = "sha256:55407d530b5af3d64e883a71fec1f345d369958f723ce4a8ab0b7d169e313242"},
+    {file = "postgrest-2.28.0-py3-none-any.whl", hash = "sha256:7bca2f24dd1a1bf8a3d586c7482aba6cd41662da6733045fad585b63b7f7df75"},
+    {file = "postgrest-2.28.0.tar.gz", hash = "sha256:c36b38646d25ea4255321d3d924ce70f8d20ec7799cb42c1221d6a818d4f6515"},
 ]

 [package.dependencies]
@@ -2248,14 +2249,14 @@ cli = ["click (>=5.0)"]

 [[package]]
 name = "realtime"
-version = "2.27.2"
+version = "2.28.0"
 description = ""
 optional = false
 python-versions = ">=3.9"
 groups = ["main"]
 files = [
-    {file = "realtime-2.27.2-py3-none-any.whl", hash = "sha256:34a9cbb26a274e707e8fc9e3ee0a66de944beac0fe604dc336d1e985db2c830f"},
-    {file = "realtime-2.27.2.tar.gz", hash = "sha256:b960a90294d2cea1b3f1275ecb89204304728e08fff1c393cc1b3150739556b3"},
+    {file = "realtime-2.28.0-py3-none-any.whl", hash = "sha256:db1bd59bab9b1fcc9f9d3b1a073bed35bf4994d720e6751f10031a58d57a3836"},
+    {file = "realtime-2.28.0.tar.gz", hash = "sha256:d18cedcebd6a8f22fcd509bc767f639761eb218b7b2b6f14fc4205b6259b50fc"},
 ]

 [package.dependencies]
@@ -2436,14 +2437,14 @@ full = ["httpx (>=0.27.0,<0.29.0)", "itsdangerous", "jinja2", "python-multipart

 [[package]]
 name = "storage3"
-version = "2.27.2"
+version = "2.28.0"
 description = "Supabase Storage client for Python."
 optional = false
 python-versions = ">=3.9"
 groups = ["main"]
 files = [
-    {file = "storage3-2.27.2-py3-none-any.whl", hash = "sha256:e6f16e7a260729e7b1f46e9bf61746805a02e30f5e419ee1291007c432e3ec63"},
-    {file = "storage3-2.27.2.tar.gz", hash = "sha256:cb4807b7f86b4bb1272ac6fdd2f3cfd8ba577297046fa5f88557425200275af5"},
+    {file = "storage3-2.28.0-py3-none-any.whl", hash = "sha256:ecb50efd2ac71dabbdf97e99ad346eafa630c4c627a8e5a138ceb5fbbadae716"},
+    {file = "storage3-2.28.0.tar.gz", hash = "sha256:bc1d008aff67de7a0f2bd867baee7aadbcdb6f78f5a310b4f7a38e8c13c19865"},
 ]

 [package.dependencies]
@@ -2487,35 +2488,35 @@ python-dateutil = ">=2.6.0"

 [[package]]
 name = "supabase"
-version = "2.27.2"
+version = "2.28.0"
 description = "Supabase client for Python."
 optional = false
 python-versions = ">=3.9"
 groups = ["main"]
 files = [
-    {file = "supabase-2.27.2-py3-none-any.whl", hash = "sha256:d4dce00b3a418ee578017ec577c0e5be47a9a636355009c76f20ed2faa15bc54"},
-    {file = "supabase-2.27.2.tar.gz", hash = "sha256:2aed40e4f3454438822442a1e94a47be6694c2c70392e7ae99b51a226d4293f7"},
+    {file = "supabase-2.28.0-py3-none-any.whl", hash = "sha256:42776971c7d0ccca16034df1ab96a31c50228eb1eb19da4249ad2f756fc20272"},
+    {file = "supabase-2.28.0.tar.gz", hash = "sha256:aea299aaab2a2eed3c57e0be7fc035c6807214194cce795a3575add20268ece1"},
 ]

 [package.dependencies]
 httpx = ">=0.26,<0.29"
-postgrest = "2.27.2"
-realtime = "2.27.2"
-storage3 = "2.27.2"
-supabase-auth = "2.27.2"
-supabase-functions = "2.27.2"
+postgrest = "2.28.0"
+realtime = "2.28.0"
+storage3 = "2.28.0"
+supabase-auth = "2.28.0"
+supabase-functions = "2.28.0"
 yarl = ">=1.22.0"

 [[package]]
 name = "supabase-auth"
-version = "2.27.2"
+version = "2.28.0"
 description = "Python Client Library for Supabase Auth"
 optional = false
 python-versions = ">=3.9"
 groups = ["main"]
 files = [
-    {file = "supabase_auth-2.27.2-py3-none-any.whl", hash = "sha256:78ec25b11314d0a9527a7205f3b1c72560dccdc11b38392f80297ef98664ee91"},
-    {file = "supabase_auth-2.27.2.tar.gz", hash = "sha256:0f5bcc79b3677cb42e9d321f3c559070cfa40d6a29a67672cc8382fb7dc2fe97"},
+    {file = "supabase_auth-2.28.0-py3-none-any.whl", hash = "sha256:2ac85026cc285054c7fa6d41924f3a333e9ec298c013e5b5e1754039ba7caec9"},
+    {file = "supabase_auth-2.28.0.tar.gz", hash = "sha256:2bb8f18ff39934e44b28f10918db965659f3735cd6fbfcc022fe0b82dbf8233e"},
 ]

 [package.dependencies]
@@ -2525,14 +2526,14 @@ pyjwt = {version = ">=2.10.1", extras = ["crypto"]}

 [[package]]
 name = "supabase-functions"
-version = "2.27.2"
+version = "2.28.0"
 description = "Library for Supabase Functions"
 optional = false
 python-versions = ">=3.9"
 groups = ["main"]
 files = [
-    {file = "supabase_functions-2.27.2-py3-none-any.whl", hash = "sha256:db480efc669d0bca07605b9b6f167312af43121adcc842a111f79bea416ef754"},
-    {file = "supabase_functions-2.27.2.tar.gz", hash = "sha256:d0c8266207a94371cb3fd35ad3c7f025b78a97cf026861e04ccd35ac1775f80b"},
+    {file = "supabase_functions-2.28.0-py3-none-any.whl", hash = "sha256:30bf2d586f8df285faf0621bb5d5bb3ec3157234fc820553ca156f009475e4ae"},
+    {file = "supabase_functions-2.28.0.tar.gz", hash = "sha256:db3dddfc37aca5858819eb461130968473bd8c75bd284581013958526dac718b"},
 ]

 [package.dependencies]
@@ -2911,4 +2912,4 @@ type = ["pytest-mypy"]
 [metadata]
 lock-version = "2.1"
 python-versions = ">=3.10,<4.0"
-content-hash = "40eae94995dc0a388fa832ed4af9b6137f28d5b5ced3aaea70d5f91d4d9a179d"
+content-hash = "9619cae908ad38fa2c48016a58bcf4241f6f5793aa0e6cc140276e91c433cbbb"
--- a/autogpt_platform/autogpt_libs/pyproject.toml
+++ b/autogpt_platform/autogpt_libs/pyproject.toml
@@ -11,14 +11,14 @@ python = ">=3.10,<4.0"
 colorama = "^0.4.6"
 cryptography = "^46.0"
 expiringdict = "^1.2.2"
-fastapi = "^0.128.0"
+fastapi = "^0.128.7"
 google-cloud-logging = "^3.13.0"
-launchdarkly-server-sdk = "^9.14.1"
+launchdarkly-server-sdk = "^9.15.0"
 pydantic = "^2.12.5"
 pydantic-settings = "^2.12.0"
 pyjwt = { version = "^2.11.0", extras = ["crypto"] }
 redis = "^6.2.0"
-supabase = "^2.27.2"
+supabase = "^2.28.0"
 uvicorn = "^0.40.0"

 [tool.poetry.group.dev.dependencies]
--- a/autogpt_platform/backend/.env.default
+++ b/autogpt_platform/backend/.env.default
@@ -104,6 +104,12 @@ TWITTER_CLIENT_SECRET=
 # Make a new workspace for your OAuth APP -- trust me
 # https://linear.app/settings/api/applications/new
 # Callback URL: http://localhost:3000/auth/integrations/oauth_callback
+LINEAR_API_KEY=
+# Linear project and team IDs for the feature request tracker.
+# Find these in your Linear workspace URL: linear.app/<workspace>/project/<project-id>
+# and in team settings. Used by the chat copilot to file and search feature requests.
+LINEAR_FEATURE_REQUEST_PROJECT_ID=
+LINEAR_FEATURE_REQUEST_TEAM_ID=
 LINEAR_CLIENT_ID=
 LINEAR_CLIENT_SECRET=

@@ -184,5 +190,8 @@ ZEROBOUNCE_API_KEY=
 POSTHOG_API_KEY=
 POSTHOG_HOST=https://eu.i.posthog.com

+# Tally Form Integration (pre-populate business understanding on signup)
+TALLY_API_KEY=
+
 # Other Services
 AUTOMOD_API_KEY=
--- a/autogpt_platform/backend/CLAUDE.md
+++ b/autogpt_platform/backend/CLAUDE.md
@@ -58,10 +58,56 @@ poetry run pytest path/to/test.py --snapshot-update
 - **Authentication**: JWT-based with Supabase integration
 - **Security**: Cache protection middleware prevents sensitive data caching in browsers/proxies

+## Code Style
+
+- **Top-level imports only** — no local/inner imports (lazy imports only for heavy optional deps like `openpyxl`)
+- **No duck typing** — no `hasattr`/`getattr`/`isinstance` for type dispatch; use typed interfaces/unions/protocols
+- **Pydantic models** over dataclass/namedtuple/dict for structured data
+- **No linter suppressors** — no `# type: ignore`, `# noqa`, `# pyright: ignore`; fix the type/code
+- **List comprehensions** over manual loop-and-append
+- **Early return** — guard clauses first, avoid deep nesting
+- **f-strings vs printf syntax in log statements** — Use `%s` for deferred interpolation in `debug` statements, f-strings elsewhere for readability: `logger.debug("Processing %s items", count)`, `logger.info(f"Processing {count} items")`
+- **Sanitize error paths** — `os.path.basename()` in error messages to avoid leaking directory structure
+- **TOCTOU awareness** — avoid check-then-act patterns for file access and credit charging
+- **`Security()` vs `Depends()`** — use `Security()` for auth deps to get proper OpenAPI security spec
+- **Redis pipelines** — `transaction=True` for atomicity on multi-step operations
+- **`max(0, value)` guards** — for computed values that should never be negative
+- **SSE protocol** — `data:` lines for frontend-parsed events (must match Zod schema), `: comment` lines for heartbeats/status
+- **File length** — keep files under ~300 lines; if a file grows beyond this, split by responsibility (e.g. extract helpers, models, or a sub-module into a new file). Never keep appending to a long file.
+- **Function length** — keep functions under ~40 lines; extract named helpers when a function grows longer. Long functions are a sign of mixed concerns, not complexity.
+- **Top-down ordering** — define the main/public function or class first, then the helpers it uses below. A reader should encounter high-level logic before implementation details.
+
 ## Testing Approach

 - Uses pytest with snapshot testing for API responses
 - Test files are colocated with source files (`*_test.py`)
+- Mock at boundaries — mock where the symbol is **used**, not where it's **defined**
+- After refactoring, update mock targets to match new module paths
+- Use `AsyncMock` for async functions (`from unittest.mock import AsyncMock`)
+
+### Test-Driven Development (TDD)
+
+When fixing a bug or adding a feature, write the test **before** the implementation:
+
+```python
+# 1. Write a failing test marked xfail
+@pytest.mark.xfail(reason="Bug #1234: widget crashes on empty input")
+def test_widget_handles_empty_input():
+    result = widget.process("")
+    assert result == Widget.EMPTY_RESULT
+
+# 2. Run it — confirm it fails (XFAIL)
+# poetry run pytest path/to/test.py::test_widget_handles_empty_input -xvs
+
+# 3. Implement the fix
+
+# 4. Remove xfail, run again — confirm it passes
+def test_widget_handles_empty_input():
+    result = widget.process("")
+    assert result == Widget.EMPTY_RESULT
+```
+
+This catches regressions and proves the fix actually works. **Every bug fix should include a test that would have caught it.**

 ## Database Schema

@@ -157,6 +203,16 @@ yield "image_url", result_url
 3. Write tests alongside the route file
 4. Run `poetry run test` to verify

+## Workspace & Media Files
+
+**Read [Workspace & Media Architecture](../../docs/platform/workspace-media-architecture.md) when:**
+- Working on CoPilot file upload/download features
+- Building blocks that handle `MediaFileType` inputs/outputs
+- Modifying `WorkspaceManager` or `store_media_file()`
+- Debugging file persistence or virus scanning issues
+
+Covers: `WorkspaceManager` (persistent storage with session scoping), `store_media_file()` (media normalization pipeline), and responsibility boundaries for virus scanning and persistence.
+
 ## Security Implementation

 ### Cache Protection Middleware
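
Note on the logging guideline added to CLAUDE.md above (`%s` in `debug` statements, f-strings elsewhere): the sketch below illustrates why deferred interpolation matters when a log level is disabled. It is a minimal illustration under stated assumptions, not code from this repository; `CountingRepr` and the `style_demo` logger name are hypothetical.

```python
import logging


class CountingRepr:
    """Hypothetical helper that counts how often it is turned into a string."""

    def __init__(self) -> None:
        self.calls = 0

    def __str__(self) -> str:
        self.calls += 1
        return "expensive"


# Hypothetical logger configured so that DEBUG records are dropped.
logger = logging.getLogger("style_demo")
logger.setLevel(logging.INFO)
logger.addHandler(logging.NullHandler())
logger.propagate = False

# Deferred interpolation: the argument is only formatted if the record is emitted.
deferred = CountingRepr()
logger.debug("Processing %s", deferred)

# Eager interpolation: the f-string is built before logging checks the level.
eager = CountingRepr()
logger.debug(f"Processing {eager}")
```

With the level set to INFO, the `%s` call should never format its argument, while the f-string formats it once even though nothing is logged.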
--- a/autogpt_platform/backend/Dockerfile
+++ b/autogpt_platform/backend/Dockerfile
@@ -1,3 +1,5 @@
+# ============================ DEPENDENCY BUILDER ============================ #
+
 FROM debian:13-slim AS builder

 # Set environment variables
@@ -48,63 +50,128 @@ RUN poetry install --no-ansi --no-root
 # Generate Prisma client
 COPY autogpt_platform/backend/schema.prisma ./
 COPY autogpt_platform/backend/backend/data/partial_types.py ./backend/data/partial_types.py
-COPY autogpt_platform/backend/gen_prisma_types_stub.py ./
+COPY autogpt_platform/backend/scripts/gen_prisma_types_stub.py ./scripts/
 RUN poetry run prisma generate && poetry run gen-prisma-stub

-FROM debian:13-slim AS server_dependencies
+# =============================== DB MIGRATOR =============================== #
+
+# Lightweight migrate stage - only needs Prisma CLI, not full Python environment
+FROM debian:13-slim AS migrate
+
+WORKDIR /app/autogpt_platform/backend
+
+ENV DEBIAN_FRONTEND=noninteractive
+
+# Install only what's needed for prisma migrate: minimal Python for prisma-python (Node.js is copied from the builder below)
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    python3.13 \
+    python3-pip \
+    ca-certificates \
+    && rm -rf /var/lib/apt/lists/*
+
+# Copy Node.js from builder (needed for Prisma CLI)
+COPY --from=builder /usr/bin/node /usr/bin/node
+COPY --from=builder /usr/lib/node_modules /usr/lib/node_modules
+COPY --from=builder /usr/bin/npm /usr/bin/npm
+
+# Copy Prisma binaries
+COPY --from=builder /root/.cache/prisma-python/binaries /root/.cache/prisma-python/binaries
+
+# Install prisma-client-py directly (much smaller than copying full venv)
+RUN pip3 install "prisma>=0.15.0" --break-system-packages
+
+COPY autogpt_platform/backend/schema.prisma ./
+COPY autogpt_platform/backend/backend/data/partial_types.py ./backend/data/partial_types.py
+COPY autogpt_platform/backend/scripts/gen_prisma_types_stub.py ./scripts/
+COPY autogpt_platform/backend/migrations ./migrations
+
+# ============================== BACKEND SERVER ============================== #
+
+FROM debian:13-slim AS server

 WORKDIR /app

-ENV POETRY_HOME=/opt/poetry \
-    POETRY_NO_INTERACTION=1 \
-    POETRY_VIRTUALENVS_CREATE=true \
-    POETRY_VIRTUALENVS_IN_PROJECT=true \
-    DEBIAN_FRONTEND=noninteractive
-ENV PATH=/opt/poetry/bin:$PATH
+ENV DEBIAN_FRONTEND=noninteractive

-# Install Python, FFmpeg, and ImageMagick (required for video processing blocks)
-RUN apt-get update && apt-get install -y \
+# Install Python, FFmpeg, ImageMagick, and CLI tools for agent use.
+# bubblewrap provides OS-level sandbox (whitelist-only FS + no network)
+# for the bash_exec MCP tool (fallback when E2B is not configured).
+# Using --no-install-recommends saves ~650MB by skipping unnecessary deps like llvm, mesa, etc.
+RUN apt-get update && apt-get install -y --no-install-recommends \
    python3.13 \
    python3-pip \
    ffmpeg \
    imagemagick \
+    jq \
+    ripgrep \
+    tree \
+    bubblewrap \
    && rm -rf /var/lib/apt/lists/*

-# Copy only necessary files from builder
-COPY --from=builder /app /app
+# Copy poetry (build-time only, for `poetry install --only-root` to create entry points)
 COPY --from=builder /usr/local/lib/python3* /usr/local/lib/python3*
 COPY --from=builder /usr/local/bin/poetry /usr/local/bin/poetry
-# Copy Node.js installation for Prisma
+# Copy Node.js installation for Prisma and agent-browser.
+# npm/npx are symlinks in the builder (-> ../lib/node_modules/npm/bin/*-cli.js);
+# COPY resolves them to regular files, breaking require() paths.  Recreate as
+# proper symlinks so npm/npx can find their modules.
 COPY --from=builder /usr/bin/node /usr/bin/node
 COPY --from=builder /usr/lib/node_modules /usr/lib/node_modules
-COPY --from=builder /usr/bin/npm /usr/bin/npm
-COPY --from=builder /usr/bin/npx /usr/bin/npx
+RUN ln -s ../lib/node_modules/npm/bin/npm-cli.js /usr/bin/npm \
+    && ln -s ../lib/node_modules/npm/bin/npx-cli.js /usr/bin/npx
 COPY --from=builder /root/.cache/prisma-python/binaries /root/.cache/prisma-python/binaries

-ENV PATH="/app/autogpt_platform/backend/.venv/bin:$PATH"
+# Install agent-browser (Copilot browser tool) + Chromium.
+# On amd64: install runtime libs + run `agent-browser install` to download
+#   Chrome for Testing (pinned version, tested with Playwright).
+# On arm64: install system chromium package — Chrome for Testing has no ARM64
+#   binary. AGENT_BROWSER_EXECUTABLE_PATH is set at runtime by the entrypoint
+#   script (below) to redirect agent-browser to the system binary.
+ARG TARGETARCH
+RUN apt-get update \
+    && if [ "$TARGETARCH" = "arm64" ]; then \
+         apt-get install -y --no-install-recommends chromium fonts-liberation; \
+       else \
+         apt-get install -y --no-install-recommends \
+           libnss3 libnspr4 libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 \
+           libdbus-1-3 libxkbcommon0 libatspi2.0-0t64 libxcomposite1 libxdamage1 \
+           libxfixes3 libxrandr2 libgbm1 libasound2t64 libpango-1.0-0 libcairo2 \
+           libx11-6 libx11-xcb1 libxcb1 libxext6 libglib2.0-0t64 \
+           fonts-liberation libfontconfig1; \
+       fi \
+    && rm -rf /var/lib/apt/lists/* \
+    && npm install -g agent-browser \
+    && ([ "$TARGETARCH" = "arm64" ] || agent-browser install) \
+    && rm -rf /tmp/* /root/.npm

-RUN mkdir -p /app/autogpt_platform/autogpt_libs
-RUN mkdir -p /app/autogpt_platform/backend
-
-COPY autogpt_platform/autogpt_libs /app/autogpt_platform/autogpt_libs
-
-COPY autogpt_platform/backend/poetry.lock autogpt_platform/backend/pyproject.toml /app/autogpt_platform/backend/
+# On arm64 the system chromium is at /usr/bin/chromium; set
+# AGENT_BROWSER_EXECUTABLE_PATH so agent-browser's daemon uses it instead of
+# Chrome for Testing (which has no ARM64 binary). On amd64 the variable is left
+# unset so agent-browser uses the Chrome for Testing binary it downloaded above.
+RUN printf '#!/bin/sh\n[ -x /usr/bin/chromium ] && export AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium\nexec "$@"\n' \
+    > /usr/local/bin/entrypoint.sh \
+    && chmod +x /usr/local/bin/entrypoint.sh

 WORKDIR /app/autogpt_platform/backend

-FROM server_dependencies AS migrate
+# Copy only the .venv from builder (not the entire /app directory)
+# The .venv includes the generated Prisma client
+COPY --from=builder /app/autogpt_platform/backend/.venv ./.venv
+ENV PATH="/app/autogpt_platform/backend/.venv/bin:$PATH"

-# Migration stage only needs schema and migrations - much lighter than full backend
-COPY autogpt_platform/backend/schema.prisma /app/autogpt_platform/backend/
-COPY autogpt_platform/backend/backend/data/partial_types.py /app/autogpt_platform/backend/backend/data/partial_types.py
-COPY autogpt_platform/backend/migrations /app/autogpt_platform/backend/migrations
+# Copy dependency files + autogpt_libs (path dependency)
+COPY autogpt_platform/autogpt_libs /app/autogpt_platform/autogpt_libs
+COPY autogpt_platform/backend/poetry.lock autogpt_platform/backend/pyproject.toml ./

-FROM server_dependencies AS server
-
-COPY autogpt_platform/backend /app/autogpt_platform/backend
+# Copy backend code + docs (for Copilot docs search)
+COPY autogpt_platform/backend ./
 COPY docs /app/docs
-RUN poetry install --no-ansi --only-root
+# Install the project package to create entry point scripts in .venv/bin/
+# (e.g., rest, executor, ws, db, scheduler, notification - see [tool.poetry.scripts])
+RUN POETRY_VIRTUALENVS_CREATE=true POETRY_VIRTUALENVS_IN_PROJECT=true \
+    poetry install --no-ansi --only-root

 ENV PORT=8000

-CMD ["poetry", "run", "rest"]
+ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
+CMD ["rest"]
--- a/autogpt_platform/backend/backend/api/conftest.py
+++ b/autogpt_platform/backend/backend/api/conftest.py
@@ -1,4 +1,9 @@
-"""Common test fixtures for server tests."""
+"""Common test fixtures for server tests.
+
+Note: Common fixtures like test_user_id, admin_user_id, target_user_id,
+setup_test_user, and setup_admin_user are defined in the parent conftest.py
+(backend/conftest.py) and are available here automatically.
+"""

 import pytest
 from pytest_snapshot.plugin import Snapshot
@@ -11,54 +16,6 @@ def configured_snapshot(snapshot: Snapshot) -> Snapshot:
    return snapshot


-@pytest.fixture
-def test_user_id() -> str:
-    """Test user ID fixture."""
-    return "3e53486c-cf57-477e-ba2a-cb02dc828e1a"
-
-
-@pytest.fixture
-def admin_user_id() -> str:
-    """Admin user ID fixture."""
-    return "4e53486c-cf57-477e-ba2a-cb02dc828e1b"
-
-
-@pytest.fixture
-def target_user_id() -> str:
-    """Target user ID fixture."""
-    return "5e53486c-cf57-477e-ba2a-cb02dc828e1c"
-
-
-@pytest.fixture
-async def setup_test_user(test_user_id):
-    """Create test user in database before tests."""
-    from backend.data.user import get_or_create_user
-
-    # Create the test user in the database using JWT token format
-    user_data = {
-        "sub": test_user_id,
-        "email": "test@example.com",
-        "user_metadata": {"name": "Test User"},
-    }
-    await get_or_create_user(user_data)
-    return test_user_id
-
-
-@pytest.fixture
-async def setup_admin_user(admin_user_id):
-    """Create admin user in database before tests."""
-    from backend.data.user import get_or_create_user
-
-    # Create the admin user in the database using JWT token format
-    user_data = {
-        "sub": admin_user_id,
-        "email": "test-admin@example.com",
-        "user_metadata": {"name": "Test Admin"},
-    }
-    await get_or_create_user(user_data)
-    return admin_user_id
-
-
@pytest.fixture
 def mock_jwt_user(test_user_id):
    """Provide mock JWT payload for regular user testing."""
--- a/autogpt_platform/backend/backend/api/external/middleware.py
+++ b/autogpt_platform/backend/backend/api/external/middleware.py
@@ -88,20 +88,23 @@ async def require_auth(
    )


-def require_permission(permission: APIKeyPermission):
+def require_permission(*permissions: APIKeyPermission):
    """
-    Dependency function for checking specific permissions
+    Dependency function for checking required permissions.
+    All listed permissions must be present.
    (works with API keys and OAuth tokens)
    """

-    async def check_permission(
+    async def check_permissions(
        auth: APIAuthorizationInfo = Security(require_auth),
    ) -> APIAuthorizationInfo:
-        if permission not in auth.scopes:
+        missing = [p for p in permissions if p not in auth.scopes]
+        if missing:
            raise HTTPException(
                status_code=status.HTTP_403_FORBIDDEN,
-                detail=f"Missing required permission: {permission.value}",
+                detail="Missing required permission(s): "
+                f"{', '.join(p.value for p in missing)}",
            )
        return auth

-    return check_permission
+    return check_permissions
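
Note on the variadic `require_permission` above: the all-permissions-required check can be sketched without FastAPI as follows. This is an illustration only; the `Permission` enum is a hypothetical stand-in for `prisma.enums.APIKeyPermission`, and `missing_permissions` / `forbidden_detail` are hypothetical helper names that do not exist in the codebase.

```python
from enum import Enum
from typing import Iterable


class Permission(str, Enum):
    """Hypothetical stand-in for prisma.enums.APIKeyPermission."""

    WRITE_GRAPH = "WRITE_GRAPH"
    WRITE_LIBRARY = "WRITE_LIBRARY"
    READ_BLOCK = "READ_BLOCK"


def missing_permissions(
    required: Iterable[Permission], granted: Iterable[Permission]
) -> list[Permission]:
    """Return the required permissions absent from the granted scopes, in order."""
    granted_set = set(granted)
    return [p for p in required if p not in granted_set]


def forbidden_detail(missing: Iterable[Permission]) -> str:
    """Build a 403 detail message in the same shape as the diff above."""
    return "Missing required permission(s): " + ", ".join(p.value for p in missing)


# A key that holds only WRITE_GRAPH is rejected for an endpoint needing both scopes.
missing = missing_permissions(
    (Permission.WRITE_GRAPH, Permission.WRITE_LIBRARY),
    {Permission.WRITE_GRAPH},
)
detail = forbidden_detail(missing)
```

Reporting every missing scope at once, rather than failing on the first, lets an API client fix its key in a single round trip.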
--- a/autogpt_platform/backend/backend/api/external/v1/routes.py
+++ b/autogpt_platform/backend/backend/api/external/v1/routes.py
@@ -1,7 +1,7 @@
 import logging
 import urllib.parse
 from collections import defaultdict
-from typing import Annotated, Any, Literal, Optional, Sequence
+from typing import Annotated, Any, Optional, Sequence

 from fastapi import APIRouter, Body, HTTPException, Security
 from prisma.enums import AgentExecutionStatus, APIKeyPermission
@@ -9,15 +9,17 @@ from pydantic import BaseModel, Field
 from typing_extensions import TypedDict

 import backend.api.features.store.cache as store_cache
+import backend.api.features.store.db as store_db
 import backend.api.features.store.model as store_model
-import backend.data.block
-from backend.api.external.middleware import require_permission
+import backend.blocks
+from backend.api.external.middleware import require_auth, require_permission
 from backend.data import execution as execution_db
 from backend.data import graph as graph_db
 from backend.data import user as user_db
 from backend.data.auth.base import APIAuthorizationInfo
 from backend.data.block import BlockInput, CompletedBlockOutput
 from backend.executor.utils import add_graph_execution
+from backend.integrations.webhooks.graph_lifecycle_hooks import on_graph_activate
 from backend.util.settings import Settings

 from .integrations import integrations_router
@@ -67,7 +69,7 @@ async def get_user_info(
    dependencies=[Security(require_permission(APIKeyPermission.READ_BLOCK))],
 )
 async def get_graph_blocks() -> Sequence[dict[Any, Any]]:
-    blocks = [block() for block in backend.data.block.get_blocks().values()]
+    blocks = [block() for block in backend.blocks.get_blocks().values()]
    return [b.to_dict() for b in blocks if not b.disabled]


@@ -83,7 +85,7 @@ async def execute_graph_block(
        require_permission(APIKeyPermission.EXECUTE_BLOCK)
    ),
 ) -> CompletedBlockOutput:
-    obj = backend.data.block.get_block(block_id)
+    obj = backend.blocks.get_block(block_id)
    if not obj:
        raise HTTPException(status_code=404, detail=f"Block #{block_id} not found.")
    if obj.disabled:
@@ -95,6 +97,43 @@ async def execute_graph_block(
    return output


+@v1_router.post(
+    path="/graphs",
+    tags=["graphs"],
+    status_code=201,
+    dependencies=[
+        Security(
+            require_permission(
+                APIKeyPermission.WRITE_GRAPH, APIKeyPermission.WRITE_LIBRARY
+            )
+        )
+    ],
+)
+async def create_graph(
+    graph: graph_db.Graph,
+    auth: APIAuthorizationInfo = Security(
+        require_permission(APIKeyPermission.WRITE_GRAPH, APIKeyPermission.WRITE_LIBRARY)
+    ),
+) -> graph_db.GraphModel:
+    """
+    Create a new agent graph.
+
+    The graph will be validated and assigned a new ID.
+    It is automatically added to the user's library.
+    """
+    from backend.api.features.library import db as library_db
+
+    graph_model = graph_db.make_graph_model(graph, auth.user_id)
+    graph_model.reassign_ids(user_id=auth.user_id, reassign_graph_id=True)
+    graph_model.validate_graph(for_run=False)
+
+    await graph_db.create_graph(graph_model, user_id=auth.user_id)
+    await library_db.create_library_agent(graph_model, auth.user_id)
+    activated_graph = await on_graph_activate(graph_model, user_id=auth.user_id)
+
+    return activated_graph
+
+
@v1_router.post(
    path="/graphs/{graph_id}/execute/{graph_version}",
    tags=["graphs"],
@@ -192,13 +231,13 @@ async def get_graph_execution_results(
@v1_router.get(
    path="/store/agents",
    tags=["store"],
-    dependencies=[Security(require_permission(APIKeyPermission.READ_STORE))],
+    dependencies=[Security(require_auth)],  # data is public; auth required as anti-DDoS
    response_model=store_model.StoreAgentsResponse,
 )
 async def get_store_agents(
    featured: bool = False,
    creator: str | None = None,
-    sorted_by: Literal["rating", "runs", "name", "updated_at"] | None = None,
+    sorted_by: store_db.StoreAgentsSortOptions | None = None,
    search_query: str | None = None,
    category: str | None = None,
    page: int = 1,
@@ -240,7 +279,7 @@ async def get_store_agents(
@v1_router.get(
    path="/store/agents/{username}/{agent_name}",
    tags=["store"],
-    dependencies=[Security(require_permission(APIKeyPermission.READ_STORE))],
+    dependencies=[Security(require_auth)],  # data is public; auth required as anti-DDoS
    response_model=store_model.StoreAgentDetails,
 )
 async def get_store_agent(
@@ -268,13 +307,13 @@ async def get_store_agent(
@v1_router.get(
    path="/store/creators",
    tags=["store"],
-    dependencies=[Security(require_permission(APIKeyPermission.READ_STORE))],
+    dependencies=[Security(require_auth)],  # data is public; auth required as anti-DDoS
    response_model=store_model.CreatorsResponse,
 )
 async def get_store_creators(
    featured: bool = False,
    search_query: str | None = None,
-    sorted_by: Literal["agent_rating", "agent_runs", "num_agents"] | None = None,
+    sorted_by: store_db.StoreCreatorsSortOptions | None = None,
    page: int = 1,
    page_size: int = 20,
 ) -> store_model.CreatorsResponse:
@@ -310,7 +349,7 @@ async def get_store_creators(
@v1_router.get(
    path="/store/creators/{username}",
    tags=["store"],
-    dependencies=[Security(require_permission(APIKeyPermission.READ_STORE))],
+    dependencies=[Security(require_auth)],  # data is public; auth required as anti-DDoS
    response_model=store_model.CreatorDetails,
 )
 async def get_store_creator(
--- a/autogpt_platform/backend/backend/api/external/v1/tools.py
+++ b/autogpt_platform/backend/backend/api/external/v1/tools.py
@@ -15,9 +15,9 @@ from prisma.enums import APIKeyPermission
 from pydantic import BaseModel, Field

 from backend.api.external.middleware import require_permission
-from backend.api.features.chat.model import ChatSession
-from backend.api.features.chat.tools import find_agent_tool, run_agent_tool
-from backend.api.features.chat.tools.models import ToolResponseBase
+from backend.copilot.model import ChatSession
+from backend.copilot.tools import find_agent_tool, run_agent_tool
+from backend.copilot.tools.models import ToolResponseBase
 from backend.data.auth.base import APIAuthorizationInfo

 logger = logging.getLogger(__name__)
--- a/autogpt_platform/backend/backend/api/features/admin/store_admin_routes.py
+++ b/autogpt_platform/backend/backend/api/features/admin/store_admin_routes.py
@@ -24,14 +24,13 @@ router = fastapi.APIRouter(
@router.get(
    "/listings",
    summary="Get Admin Listings History",
-    response_model=store_model.StoreListingsWithVersionsResponse,
 )
 async def get_admin_listings_with_versions(
    status: typing.Optional[prisma.enums.SubmissionStatus] = None,
    search: typing.Optional[str] = None,
    page: int = 1,
    page_size: int = 20,
-):
+) -> store_model.StoreListingsWithVersionsAdminViewResponse:
    """
    Get store listings with their version history for admins.

@@ -45,36 +44,26 @@ async def get_admin_listings_with_versions(
        page_size: Number of items per page

    Returns:
-        StoreListingsWithVersionsResponse with listings and their versions
+        Paginated listings with their versions
    """
-    try:
-        listings = await store_db.get_admin_listings_with_versions(
-            status=status,
-            search_query=search,
-            page=page,
-            page_size=page_size,
-        )
-        return listings
-    except Exception as e:
-        logger.exception("Error getting admin listings with versions: %s", e)
-        return fastapi.responses.JSONResponse(
-            status_code=500,
-            content={
-                "detail": "An error occurred while retrieving listings with versions"
-            },
-        )
+    listings = await store_db.get_admin_listings_with_versions(
+        status=status,
+        search_query=search,
+        page=page,
+        page_size=page_size,
+    )
+    return listings


@router.post(
    "/submissions/{store_listing_version_id}/review",
    summary="Review Store Submission",
-    response_model=store_model.StoreSubmission,
 )
 async def review_submission(
    store_listing_version_id: str,
    request: store_model.ReviewSubmissionRequest,
    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
-):
+) -> store_model.StoreSubmissionAdminView:
    """
    Review a store listing submission.

@@ -84,31 +73,24 @@ async def review_submission(
        user_id: Authenticated admin user performing the review

    Returns:
-        StoreSubmission with updated review information
+        StoreSubmissionAdminView with updated review information
    """
-    try:
-        already_approved = await store_db.check_submission_already_approved(
-            store_listing_version_id=store_listing_version_id,
-        )
-        submission = await store_db.review_store_submission(
-            store_listing_version_id=store_listing_version_id,
-            is_approved=request.is_approved,
-            external_comments=request.comments,
-            internal_comments=request.internal_comments or "",
-            reviewer_id=user_id,
-        )
+    already_approved = await store_db.check_submission_already_approved(
+        store_listing_version_id=store_listing_version_id,
+    )
+    submission = await store_db.review_store_submission(
+        store_listing_version_id=store_listing_version_id,
+        is_approved=request.is_approved,
+        external_comments=request.comments,
+        internal_comments=request.internal_comments or "",
+        reviewer_id=user_id,
+    )

-        state_changed = already_approved != request.is_approved
-        # Clear caches when the request is approved as it updates what is shown on the store
-        if state_changed:
-            store_cache.clear_all_caches()
-        return submission
-    except Exception as e:
-        logger.exception("Error reviewing submission: %s", e)
-        return fastapi.responses.JSONResponse(
-            status_code=500,
-            content={"detail": "An error occurred while reviewing the submission"},
-        )
+    state_changed = already_approved != request.is_approved
+    # Clear caches whenever approval state changes, since store visibility can change
+    if state_changed:
+        store_cache.clear_all_caches()
+    return submission


@router.get(
--- a/autogpt_platform/backend/backend/api/features/builder/db.py
+++ b/autogpt_platform/backend/backend/api/features/builder/db.py
@@ -1,28 +1,33 @@
 import logging
 from dataclasses import dataclass
-from datetime import datetime, timedelta, timezone
 from difflib import SequenceMatcher
-from typing import Sequence
+from typing import Any, Sequence, get_args, get_origin

 import prisma
+from prisma.models import mv_suggested_blocks

 import backend.api.features.library.db as library_db
 import backend.api.features.library.model as library_model
 import backend.api.features.store.db as store_db
 import backend.api.features.store.model as store_model
-import backend.data.block
 from backend.blocks import load_all_blocks
+from backend.blocks._base import (
+    AnyBlockSchema,
+    BlockCategory,
+    BlockInfo,
+    BlockSchema,
+    BlockType,
+)
 from backend.blocks.llm import LlmModel
-from backend.data.block import AnyBlockSchema, BlockCategory, BlockInfo, BlockSchema
-from backend.data.db import query_raw_with_schema
 from backend.integrations.providers import ProviderName
 from backend.util.cache import cached
 from backend.util.models import Pagination
+from backend.util.text import split_camelcase

 from .model import (
    BlockCategoryResponse,
    BlockResponse,
-    BlockType,
+    BlockTypeFilter,
    CountResponse,
    FilterType,
    Provider,
@@ -37,6 +42,16 @@ MAX_LIBRARY_AGENT_RESULTS = 100
 MAX_MARKETPLACE_AGENT_RESULTS = 100
 MIN_SCORE_FOR_FILTERED_RESULTS = 10.0

+# Boost blocks over marketplace agents in search results
+BLOCK_SCORE_BOOST = 50.0
+
+# Block IDs to exclude from search results
+EXCLUDED_BLOCK_IDS = frozenset(
+    {
+        "e189baac-8c20-45a1-94a7-55177ea42565",  # AgentExecutorBlock
+    }
+)
+
 SearchResultItem = BlockInfo | library_model.LibraryAgent | store_model.StoreAgent


@@ -59,8 +74,8 @@ def get_block_categories(category_blocks: int = 3) -> list[BlockCategoryResponse

    for block_type in load_all_blocks().values():
        block: AnyBlockSchema = block_type()
-        # Skip disabled blocks
-        if block.disabled:
+        # Skip disabled and excluded blocks
+        if block.disabled or block.id in EXCLUDED_BLOCK_IDS:
            continue
        # Skip blocks that don't have categories (all should have at least one)
        if not block.categories:
@@ -88,7 +103,7 @@ def get_block_categories(category_blocks: int = 3) -> list[BlockCategoryResponse
 def get_blocks(
    *,
    category: str | None = None,
-    type: BlockType | None = None,
+    type: BlockTypeFilter | None = None,
    provider: ProviderName | None = None,
    page: int = 1,
    page_size: int = 50,
@@ -111,6 +126,9 @@ def get_blocks(
        # Skip disabled blocks
        if block.disabled:
            continue
+        # Skip excluded blocks
+        if block.id in EXCLUDED_BLOCK_IDS:
+            continue
        # Skip blocks that don't match the category
        if category and category not in {c.name.lower() for c in block.categories}:
            continue
@@ -250,14 +268,25 @@ async def _build_cached_search_results(
        "my_agents": 0,
    }

-    block_results, block_total, integration_total = _collect_block_results(
-        normalized_query=normalized_query,
-        include_blocks=include_blocks,
-        include_integrations=include_integrations,
-    )
-    scored_items.extend(block_results)
-    total_items["blocks"] = block_total
-    total_items["integrations"] = integration_total
+    # Use in-memory text search when a query is present, otherwise list all blocks
+    if (include_blocks or include_integrations) and normalized_query:
+        block_results, block_total, integration_total = await _text_search_blocks(
+            query=search_query,
+            include_blocks=include_blocks,
+            include_integrations=include_integrations,
+        )
+        scored_items.extend(block_results)
+        total_items["blocks"] = block_total
+        total_items["integrations"] = integration_total
+    elif include_blocks or include_integrations:
+        # No query - list all blocks using in-memory approach
+        block_results, block_total, integration_total = _collect_block_results(
+            include_blocks=include_blocks,
+            include_integrations=include_integrations,
+        )
+        scored_items.extend(block_results)
+        total_items["blocks"] = block_total
+        total_items["integrations"] = integration_total

    if include_library_agents:
        library_response = await library_db.list_library_agents(
@@ -302,10 +331,14 @@ async def _build_cached_search_results(

 def _collect_block_results(
    *,
-    normalized_query: str,
    include_blocks: bool,
    include_integrations: bool,
 ) -> tuple[list[_ScoredItem], int, int]:
+    """
+    Collect all blocks for listing (no search query).
+
+    All blocks get BLOCK_SCORE_BOOST to prioritize them over marketplace agents.
+    """
    results: list[_ScoredItem] = []
    block_count = 0
    integration_count = 0
@@ -318,6 +351,10 @@ def _collect_block_results(
        if block.disabled:
            continue

+        # Skip excluded blocks
+        if block.id in EXCLUDED_BLOCK_IDS:
+            continue
+
        block_info = block.get_info()
        credentials = list(block.input_schema.get_credentials_fields().values())
        is_integration = len(credentials) > 0
@@ -327,10 +364,6 @@ def _collect_block_results(
        if not is_integration and not include_blocks:
            continue

-        score = _score_block(block, block_info, normalized_query)
-        if not _should_include_item(score, normalized_query):
-            continue
-
        filter_type: FilterType = "integrations" if is_integration else "blocks"
        if is_integration:
            integration_count += 1
@@ -341,14 +374,86 @@ def _collect_block_results(
            _ScoredItem(
                item=block_info,
                filter_type=filter_type,
-                score=score,
-                sort_key=_get_item_name(block_info),
+                score=BLOCK_SCORE_BOOST,
+                sort_key=block_info.name.lower(),
            )
        )

    return results, block_count, integration_count


+async def _text_search_blocks(
+    *,
+    query: str,
+    include_blocks: bool,
+    include_integrations: bool,
+) -> tuple[list[_ScoredItem], int, int]:
+    """
+    Search blocks using in-memory text matching over the block registry.
+
+    All blocks are already loaded in memory, so this is fast and reliable
+    regardless of whether OpenAI embeddings are available.
+
+    Scoring:
+        - Base: text relevance via _score_primary_fields, plus BLOCK_SCORE_BOOST
+          to prioritize blocks over marketplace agents in combined results
+        - +20 if the block has an LlmModel field and the query matches an LLM model name
+    """
+    results: list[_ScoredItem] = []
+
+    if not include_blocks and not include_integrations:
+        return results, 0, 0
+
+    normalized_query = query.strip().lower()
+
+    all_results, _, _ = _collect_block_results(
+        include_blocks=include_blocks,
+        include_integrations=include_integrations,
+    )
+
+    all_blocks = load_all_blocks()
+
+    for item in all_results:
+        block_info = item.item
+        assert isinstance(block_info, BlockInfo)
+        name = split_camelcase(block_info.name).lower()
+
+        # Build rich description including input field descriptions,
+        # matching the searchable text that the embedding pipeline uses
+        desc_parts = [block_info.description or ""]
+        block_cls = all_blocks.get(block_info.id)
+        if block_cls is not None:
+            block: AnyBlockSchema = block_cls()
+            desc_parts += [
+                f"{f}: {info.description}"
+                for f, info in block.input_schema.model_fields.items()
+                if info.description
+            ]
+        description = " ".join(desc_parts).lower()
+
+        score = _score_primary_fields(name, description, normalized_query)
+
+        # Add LLM model match bonus
+        if block_cls is not None and _matches_llm_model(
+            block_cls().input_schema, normalized_query
+        ):
+            score += 20
+
+        if score >= MIN_SCORE_FOR_FILTERED_RESULTS:
+            results.append(
+                _ScoredItem(
+                    item=block_info,
+                    filter_type=item.filter_type,
+                    score=score + BLOCK_SCORE_BOOST,
+                    sort_key=name,
+                )
+            )
+
+    block_count = sum(1 for r in results if r.filter_type == "blocks")
+    integration_count = sum(1 for r in results if r.filter_type == "integrations")
+    return results, block_count, integration_count
+
+
 def _build_library_items(
    *,
    agents: list[library_model.LibraryAgent],
@@ -467,6 +572,8 @@ async def _get_static_counts():
        block: AnyBlockSchema = block_type()
        if block.disabled:
            continue
+        if block.id in EXCLUDED_BLOCK_IDS:
+            continue

        all_blocks += 1

@@ -493,47 +600,25 @@ async def _get_static_counts():
    }


+def _contains_type(annotation: Any, target: type) -> bool:
+    """Check if an annotation is or contains the target type (handles Optional/Union/Annotated)."""
+    if annotation is target:
+        return True
+    origin = get_origin(annotation)
+    if origin is None:
+        return False
+    return any(_contains_type(arg, target) for arg in get_args(annotation))
+
+
 def _matches_llm_model(schema_cls: type[BlockSchema], query: str) -> bool:
    for field in schema_cls.model_fields.values():
-        if field.annotation == LlmModel:
+        if _contains_type(field.annotation, LlmModel):
            # Check if query matches any value in llm_models
            if any(query in name for name in llm_models):
                return True
    return False


-def _score_block(
-    block: AnyBlockSchema,
-    block_info: BlockInfo,
-    normalized_query: str,
-) -> float:
-    if not normalized_query:
-        return 0.0
-
-    name = block_info.name.lower()
-    description = block_info.description.lower()
-    score = _score_primary_fields(name, description, normalized_query)
-
-    category_text = " ".join(
-        category.get("category", "").lower() for category in block_info.categories
-    )
-    score += _score_additional_field(category_text, normalized_query, 12, 6)
-
-    credentials_info = block.input_schema.get_credentials_fields_info().values()
-    provider_names = [
-        provider.value.lower()
-        for info in credentials_info
-        for provider in info.provider
-    ]
-    provider_text = " ".join(provider_names)
-    score += _score_additional_field(provider_text, normalized_query, 15, 6)
-
-    if _matches_llm_model(block.input_schema, normalized_query):
-        score += 20
-
-    return score
-
-
 def _score_library_agent(
    agent: library_model.LibraryAgent,
    normalized_query: str,
@@ -640,45 +725,32 @@ def _get_all_providers() -> dict[ProviderName, Provider]:
    return providers


-@cached(ttl_seconds=3600)
+@cached(ttl_seconds=3600, shared_cache=True)
 async def get_suggested_blocks(count: int = 5) -> list[BlockInfo]:
-    suggested_blocks = []
-    # Sum the number of executions for each block type
-    # Prisma cannot group by nested relations, so we do a raw query
-    # Calculate the cutoff timestamp
-    timestamp_threshold = datetime.now(timezone.utc) - timedelta(days=30)
+    """Return the most-executed blocks from the last 14 days.

-    results = await query_raw_with_schema(
-        """
-        SELECT
-            agent_node."agentBlockId" AS block_id,
-            COUNT(execution.id) AS execution_count
-        FROM {schema_prefix}"AgentNodeExecution" execution
-        JOIN {schema_prefix}"AgentNode" agent_node ON execution."agentNodeId" = agent_node.id
-        WHERE execution."endedTime" >= $1::timestamp
-        GROUP BY agent_node."agentBlockId"
-        ORDER BY execution_count DESC;
-        """,
-        timestamp_threshold,
-    )
+    Queries the mv_suggested_blocks materialized view (refreshed hourly via pg_cron)
+    and returns the top `count` blocks sorted by execution count, excluding
+    Input/Output/Agent block types and blocks in EXCLUDED_BLOCK_IDS.
+    """
+    results = await mv_suggested_blocks.prisma().find_many()

    # Get the top blocks based on execution count
-    # But ignore Input and Output blocks
+    # But ignore Input, Output, Agent, and excluded blocks
    blocks: list[tuple[BlockInfo, int]] = []
+    execution_counts = {row.block_id: row.execution_count for row in results}

    for block_type in load_all_blocks().values():
        block: AnyBlockSchema = block_type()
        if block.disabled or block.block_type in (
-            backend.data.block.BlockType.INPUT,
-            backend.data.block.BlockType.OUTPUT,
-            backend.data.block.BlockType.AGENT,
+            BlockType.INPUT,
+            BlockType.OUTPUT,
+            BlockType.AGENT,
        ):
            continue
-        # Find the execution count for this block
-        execution_count = next(
-            (row["execution_count"] for row in results if row["block_id"] == block.id),
-            0,
-        )
+        if block.id in EXCLUDED_BLOCK_IDS:
+            continue
+        execution_count = execution_counts.get(block.id, 0)
        blocks.append((block.get_info(), execution_count))
    # Sort blocks by execution count
    blocks.sort(key=lambda x: x[1], reverse=True)
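
Review note on `_contains_type` (added in the db.py diff above): the old check `field.annotation == LlmModel` only matches a bare annotation, so a field typed as `Optional[LlmModel]` or `Annotated[LlmModel, ...]` would not receive the LLM-model score bonus. The sketch below reproduces the recursive check in isolation. The `LlmModel` enum here is a hypothetical stand-in with a single made-up member, not the real class, and the function is renamed `contains_type` to make clear it is an illustration rather than the code under review.

```python
from enum import Enum
from typing import Annotated, Any, Optional, Union, get_args, get_origin


class LlmModel(str, Enum):
    """Hypothetical stand-in for the real LlmModel enum (assumption)."""

    EXAMPLE = "example-model"


def contains_type(annotation: Any, target: type) -> bool:
    """Return True if `annotation` is `target` or wraps it at any depth."""
    if annotation is target:
        return True
    # get_origin() is None for plain classes and non-type metadata,
    # which ends the recursion.
    if get_origin(annotation) is None:
        return False
    return any(contains_type(arg, target) for arg in get_args(annotation))


# A direct equality check misses wrapped annotations...
direct_match = Optional[LlmModel] == LlmModel

# ...while the recursive check finds the target inside the wrappers.
checks = {
    "bare": contains_type(LlmModel, LlmModel),
    "optional": contains_type(Optional[LlmModel], LlmModel),
    "union": contains_type(Union[int, LlmModel], LlmModel),
    "annotated": contains_type(Annotated[LlmModel, "meta"], LlmModel),
    "unrelated": contains_type(list[str], LlmModel),
}
print(direct_match, checks)
```

Only the `unrelated` entry is false; every wrapped form is detected, which is the behaviour `_matches_llm_model` relies on after this change.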
--- a/autogpt_platform/backend/backend/api/features/builder/model.py
+++ b/autogpt_platform/backend/backend/api/features/builder/model.py
@@ -4,7 +4,7 @@ from pydantic import BaseModel

 import backend.api.features.library.model as library_model
 import backend.api.features.store.model as store_model
-from backend.data.block import BlockInfo
+from backend.blocks._base import BlockInfo
 from backend.integrations.providers import ProviderName
 from backend.util.models import Pagination

@@ -15,7 +15,7 @@ FilterType = Literal[
    "my_agents",
 ]

-BlockType = Literal["all", "input", "action", "output"]
+BlockTypeFilter = Literal["all", "input", "action", "output"]


 class SearchEntry(BaseModel):
@@ -27,7 +27,6 @@ class SearchEntry(BaseModel):

 # Suggestions
 class SuggestionsResponse(BaseModel):
-    otto_suggestions: list[str]
    recent_searches: list[SearchEntry]
    providers: list[ProviderName]
    top_blocks: list[BlockInfo]
--- a/autogpt_platform/backend/backend/api/features/builder/routes.py
+++ b/autogpt_platform/backend/backend/api/features/builder/routes.py
@@ -1,5 +1,5 @@
 import logging
-from typing import Annotated, Sequence
+from typing import Annotated, Sequence, cast, get_args

 import fastapi
 from autogpt_libs.auth.dependencies import get_user_id, requires_user
@@ -10,6 +10,8 @@ from backend.util.models import Pagination
 from . import db as builder_db
 from . import model as builder_model

+VALID_FILTER_VALUES = get_args(builder_model.FilterType)
+
 logger = logging.getLogger(__name__)

 router = fastapi.APIRouter(
@@ -49,11 +51,6 @@ async def get_suggestions(
    Get all suggestions for the Blocks Menu.
    """
    return builder_model.SuggestionsResponse(
-        otto_suggestions=[
-            "What blocks do I need to get started?",
-            "Help me create a list",
-            "Help me feed my data to Google Maps",
-        ],
        recent_searches=await builder_db.get_recent_searches(user_id),
        providers=[
            ProviderName.TWITTER,
@@ -88,7 +85,7 @@ async def get_block_categories(
 )
 async def get_blocks(
    category: Annotated[str | None, fastapi.Query()] = None,
-    type: Annotated[builder_model.BlockType | None, fastapi.Query()] = None,
+    type: Annotated[builder_model.BlockTypeFilter | None, fastapi.Query()] = None,
    provider: Annotated[ProviderName | None, fastapi.Query()] = None,
    page: Annotated[int, fastapi.Query()] = 1,
    page_size: Annotated[int, fastapi.Query()] = 50,
@@ -151,7 +148,7 @@ async def get_providers(
 async def search(
    user_id: Annotated[str, fastapi.Security(get_user_id)],
    search_query: Annotated[str | None, fastapi.Query()] = None,
-    filter: Annotated[list[builder_model.FilterType] | None, fastapi.Query()] = None,
+    filter: Annotated[str | None, fastapi.Query()] = None,
    search_id: Annotated[str | None, fastapi.Query()] = None,
    by_creator: Annotated[list[str] | None, fastapi.Query()] = None,
    page: Annotated[int, fastapi.Query()] = 1,
@@ -160,9 +157,20 @@ async def search(
    """
    Search for blocks (including integrations), marketplace agents, and user library agents.
    """
-    # If no filters are provided, then we will return all types
-    if not filter:
-        filter = [
+    # Parse and validate filter parameter
+    filters: list[builder_model.FilterType]
+    if filter:
+        filter_values = [f.strip() for f in filter.split(",")]
+        invalid_filters = [f for f in filter_values if f not in VALID_FILTER_VALUES]
+        if invalid_filters:
+            raise fastapi.HTTPException(
+                status_code=400,
+                detail=f"Invalid filter value(s): {', '.join(invalid_filters)}. "
+                f"Valid values are: {', '.join(VALID_FILTER_VALUES)}",
+            )
+        filters = cast(list[builder_model.FilterType], filter_values)
+    else:
+        filters = [
            "blocks",
            "integrations",
            "marketplace_agents",
@@ -174,7 +182,7 @@ async def search(
    cached_results = await builder_db.get_sorted_search_results(
        user_id=user_id,
        search_query=search_query,
-        filters=filter,
+        filters=filters,
        by_creator=by_creator,
    )

@@ -196,7 +204,7 @@ async def search(
        user_id,
        builder_model.SearchEntry(
            search_query=search_query,
-            filter=filter,
+            filter=filters,
            by_creator=by_creator,
            search_id=search_id,
        ),
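
Review note on the new `filter` handling in `search` (routes.py diff above): the parameter changes from a repeated query parameter (`list[FilterType]`) to a single comma-separated string that is split, stripped, and validated against `get_args(FilterType)`. A minimal framework-free sketch of that parsing follows. The `FilterType` literal below is an assumed stand-in (the diff only shows part of its definition), `parse_filters` is a hypothetical helper name, and it raises `ValueError` where the route raises an HTTP 400.

```python
from typing import Literal, Optional, cast, get_args

# Assumed stand-in for builder_model.FilterType; the diff shows only part of it.
FilterType = Literal["blocks", "integrations", "marketplace_agents", "my_agents"]
VALID_FILTER_VALUES = get_args(FilterType)


def parse_filters(raw: Optional[str]) -> list[FilterType]:
    """Parse a comma-separated filter string; fall back to all types when empty."""
    if not raw:
        return list(VALID_FILTER_VALUES)

    values = [part.strip() for part in raw.split(",")]
    invalid = [value for value in values if value not in VALID_FILTER_VALUES]
    if invalid:
        # The route reports this as an HTTP 400; a plain exception keeps
        # the sketch independent of FastAPI.
        raise ValueError(
            f"Invalid filter value(s): {', '.join(invalid)}. "
            f"Valid values are: {', '.join(VALID_FILTER_VALUES)}"
        )
    return cast(list[FilterType], values)


print(parse_filters("blocks, integrations"))
print(parse_filters(None))
```

One behaviour worth confirming in review: with this approach a trailing comma (`"blocks,"`) yields an empty string, which is rejected as an invalid value rather than ignored.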
--- a/autogpt_platform/backend/backend/api/features/chat/completion_consumer.py
+++ b/autogpt_platform/backend/backend/api/features/chat/completion_consumer.py
@@ -1,368 +0,0 @@
-"""Redis Streams consumer for operation completion messages.
-
-This module provides a consumer (ChatCompletionConsumer) that listens for
-completion notifications (OperationCompleteMessage) from external services
-(like Agent Generator) and triggers the appropriate stream registry and
-chat service updates via process_operation_success/process_operation_failure.
-
-Why Redis Streams instead of RabbitMQ?
--------------------------------------
-While the project typically uses RabbitMQ for async task queues (e.g., execution
-queue), Redis Streams was chosen for chat completion notifications because:
-
-1. **Unified Infrastructure**: The SSE reconnection feature already uses Redis
-   Streams (via stream_registry) for message persistence and replay. Using Redis
-   Streams for completion notifications keeps all chat streaming infrastructure
-   in one system, simplifying operations and reducing cross-system coordination.
-
-2. **Message Replay**: Redis Streams support XREAD with arbitrary message IDs,
-   allowing consumers to replay missed messages after reconnection. This aligns
-   with the SSE reconnection pattern where clients can resume from last_message_id.
-
-3. **Consumer Groups with XAUTOCLAIM**: Redis consumer groups provide automatic
-   load balancing across pods with explicit message claiming (XAUTOCLAIM) for
-   recovering from dead consumers - ideal for the completion callback pattern.
-
-4. **Lower Latency**: For real-time SSE updates, Redis (already in-memory for
-   stream_registry) provides lower latency than an additional RabbitMQ hop.
-
-5. **Atomicity with Task State**: Completion processing often needs to update
-   task metadata stored in Redis. Keeping both in Redis enables simpler
-   transactional semantics without distributed coordination.
-
-The consumer uses Redis Streams with consumer groups for reliable message
-processing across multiple platform pods, with XAUTOCLAIM for reclaiming
-stale pending messages from dead consumers.
-"""
-
-import asyncio
-import logging
-import os
-import uuid
-from typing import Any
-
-import orjson
-from prisma import Prisma
-from pydantic import BaseModel
-from redis.exceptions import ResponseError
-
-from backend.data.redis_client import get_redis_async
-
-from . import stream_registry
-from .completion_handler import process_operation_failure, process_operation_success
-from .config import ChatConfig
-
-logger = logging.getLogger(__name__)
-config = ChatConfig()
-
-
-class OperationCompleteMessage(BaseModel):
-    """Message format for operation completion notifications."""
-
-    operation_id: str
-    task_id: str
-    success: bool
-    result: dict | str | None = None
-    error: str | None = None
-
-
-class ChatCompletionConsumer:
-    """Consumer for chat operation completion messages from Redis Streams.
-
-    This consumer initializes its own Prisma client in start() to ensure
-    database operations work correctly within this async context.
-
-    Uses Redis consumer groups to allow multiple platform pods to consume
-    messages reliably with automatic redelivery on failure.
-    """
-
-    def __init__(self):
-        self._consumer_task: asyncio.Task | None = None
-        self._running = False
-        self._prisma: Prisma | None = None
-        self._consumer_name = f"consumer-{uuid.uuid4().hex[:8]}"
-
-    async def start(self) -> None:
-        """Start the completion consumer."""
-        if self._running:
-            logger.warning("Completion consumer already running")
-            return
-
-        # Create consumer group if it doesn't exist
-        try:
-            redis = await get_redis_async()
-            await redis.xgroup_create(
-                config.stream_completion_name,
-                config.stream_consumer_group,
-                id="0",
-                mkstream=True,
-            )
-            logger.info(
-                f"Created consumer group '{config.stream_consumer_group}' "
-                f"on stream '{config.stream_completion_name}'"
-            )
-        except ResponseError as e:
-            if "BUSYGROUP" in str(e):
-                logger.debug(
-                    f"Consumer group '{config.stream_consumer_group}' already exists"
-                )
-            else:
-                raise
-
-        self._running = True
-        self._consumer_task = asyncio.create_task(self._consume_messages())
-        logger.info(
-            f"Chat completion consumer started (consumer: {self._consumer_name})"
-        )
-
-    async def _ensure_prisma(self) -> Prisma:
-        """Lazily initialize Prisma client on first use."""
-        if self._prisma is None:
-            database_url = os.getenv("DATABASE_URL", "postgresql://localhost:5432")
-            self._prisma = Prisma(datasource={"url": database_url})
-            await self._prisma.connect()
-            logger.info("[COMPLETION] Consumer Prisma client connected (lazy init)")
-        return self._prisma
-
-    async def stop(self) -> None:
-        """Stop the completion consumer."""
-        self._running = False
-
-        if self._consumer_task:
-            self._consumer_task.cancel()
-            try:
-                await self._consumer_task
-            except asyncio.CancelledError:
-                pass
-            self._consumer_task = None
-
-        if self._prisma:
-            await self._prisma.disconnect()
-            self._prisma = None
-            logger.info("[COMPLETION] Consumer Prisma client disconnected")
-
-        logger.info("Chat completion consumer stopped")
-
-    async def _consume_messages(self) -> None:
-        """Main message consumption loop with retry logic."""
-        max_retries = 10
-        retry_delay = 5  # seconds
-        retry_count = 0
-        block_timeout = 5000  # milliseconds
-
-        while self._running and retry_count < max_retries:
-            try:
-                redis = await get_redis_async()
-
-                # Reset retry count on successful connection
-                retry_count = 0
-
-                while self._running:
-                    # First, claim any stale pending messages from dead consumers
-                    # Redis does NOT auto-redeliver pending messages; we must explicitly
-                    # claim them using XAUTOCLAIM
-                    try:
-                        claimed_result = await redis.xautoclaim(
-                            name=config.stream_completion_name,
-                            groupname=config.stream_consumer_group,
-                            consumername=self._consumer_name,
-                            min_idle_time=config.stream_claim_min_idle_ms,
-                            start_id="0-0",
-                            count=10,
-                        )
-                        # xautoclaim returns: (next_start_id, [(id, data), ...], [deleted_ids])
-                        if claimed_result and len(claimed_result) >= 2:
-                            claimed_entries = claimed_result[1]
-                            if claimed_entries:
-                                logger.info(
-                                    f"Claimed {len(claimed_entries)} stale pending messages"
-                                )
-                                for entry_id, data in claimed_entries:
-                                    if not self._running:
-                                        return
-                                    await self._process_entry(redis, entry_id, data)
-                    except Exception as e:
-                        logger.warning(f"XAUTOCLAIM failed (non-fatal): {e}")
-
-                    # Read new messages from the stream
-                    messages = await redis.xreadgroup(
-                        groupname=config.stream_consumer_group,
-                        consumername=self._consumer_name,
-                        streams={config.stream_completion_name: ">"},
-                        block=block_timeout,
-                        count=10,
-                    )
-
-                    if not messages:
-                        continue
-
-                    for stream_name, entries in messages:
-                        for entry_id, data in entries:
-                            if not self._running:
-                                return
-                            await self._process_entry(redis, entry_id, data)
-
-            except asyncio.CancelledError:
-                logger.info("Consumer cancelled")
-                return
-            except Exception as e:
-                retry_count += 1
-                logger.error(
-                    f"Consumer error (retry {retry_count}/{max_retries}): {e}",
-                    exc_info=True,
-                )
-                if self._running and retry_count < max_retries:
-                    await asyncio.sleep(retry_delay)
-                else:
-                    logger.error("Max retries reached, stopping consumer")
-                    return
-
-    async def _process_entry(
-        self, redis: Any, entry_id: str, data: dict[str, Any]
-    ) -> None:
-        """Process a single stream entry and acknowledge it on success.
-
-        Args:
-            redis: Redis client connection
-            entry_id: The stream entry ID
-            data: The entry data dict
-        """
-        try:
-            # Handle the message
-            message_data = data.get("data")
-            if message_data:
-                await self._handle_message(
-                    message_data.encode()
-                    if isinstance(message_data, str)
-                    else message_data
-                )
-
-            # Acknowledge the message after successful processing
-            await redis.xack(
-                config.stream_completion_name,
-                config.stream_consumer_group,
-                entry_id,
-            )
-        except Exception as e:
-            logger.error(
-                f"Error processing completion message {entry_id}: {e}",
-                exc_info=True,
-            )
-            # Message remains in pending state and will be claimed by
-            # XAUTOCLAIM after min_idle_time expires
-
-    async def _handle_message(self, body: bytes) -> None:
-        """Handle a completion message using our own Prisma client."""
-        try:
-            data = orjson.loads(body)
-            message = OperationCompleteMessage(**data)
-        except Exception as e:
-            logger.error(f"Failed to parse completion message: {e}")
-            return
-
-        logger.info(
-            f"[COMPLETION] Received completion for operation {message.operation_id} "
-            f"(task_id={message.task_id}, success={message.success})"
-        )
-
-        # Find task in registry
-        task = await stream_registry.find_task_by_operation_id(message.operation_id)
-        if task is None:
-            task = await stream_registry.get_task(message.task_id)
-
-        if task is None:
-            logger.warning(
-                f"[COMPLETION] Task not found for operation {message.operation_id} "
-                f"(task_id={message.task_id})"
-            )
-            return
-
-        logger.info(
-            f"[COMPLETION] Found task: task_id={task.task_id}, "
-            f"session_id={task.session_id}, tool_call_id={task.tool_call_id}"
-        )
-
-        # Guard against empty task fields
-        if not task.task_id or not task.session_id or not task.tool_call_id:
-            logger.error(
-                f"[COMPLETION] Task has empty critical fields! "
-                f"task_id={task.task_id!r}, session_id={task.session_id!r}, "
-                f"tool_call_id={task.tool_call_id!r}"
-            )
-            return
-
-        if message.success:
-            await self._handle_success(task, message)
-        else:
-            await self._handle_failure(task, message)
-
-    async def _handle_success(
-        self,
-        task: stream_registry.ActiveTask,
-        message: OperationCompleteMessage,
-    ) -> None:
-        """Handle successful operation completion."""
-        prisma = await self._ensure_prisma()
-        await process_operation_success(task, message.result, prisma)
-
-    async def _handle_failure(
-        self,
-        task: stream_registry.ActiveTask,
-        message: OperationCompleteMessage,
-    ) -> None:
-        """Handle failed operation completion."""
-        prisma = await self._ensure_prisma()
-        await process_operation_failure(task, message.error, prisma)
-
-
-# Module-level consumer instance
-_consumer: ChatCompletionConsumer | None = None
-
-
-async def start_completion_consumer() -> None:
-    """Start the global completion consumer."""
-    global _consumer
-    if _consumer is None:
-        _consumer = ChatCompletionConsumer()
-    await _consumer.start()
-
-
-async def stop_completion_consumer() -> None:
-    """Stop the global completion consumer."""
-    global _consumer
-    if _consumer:
-        await _consumer.stop()
-        _consumer = None
-
-
-async def publish_operation_complete(
-    operation_id: str,
-    task_id: str,
-    success: bool,
-    result: dict | str | None = None,
-    error: str | None = None,
-) -> None:
-    """Publish an operation completion message to Redis Streams.
-
-    Args:
-        operation_id: The operation ID that completed.
-        task_id: The task ID associated with the operation.
-        success: Whether the operation succeeded.
-        result: The result data (for success).
-        error: The error message (for failure).
-    """
-    message = OperationCompleteMessage(
-        operation_id=operation_id,
-        task_id=task_id,
-        success=success,
-        result=result,
-        error=error,
-    )
-
-    redis = await get_redis_async()
-    await redis.xadd(
-        config.stream_completion_name,
-        {"data": message.model_dump_json()},
-        maxlen=config.stream_max_length,
-    )
-    logger.info(f"Published completion for operation {operation_id}")
--- a/autogpt_platform/backend/backend/api/features/chat/completion_handler.py
+++ b/autogpt_platform/backend/backend/api/features/chat/completion_handler.py
@@ -1,344 +0,0 @@
-"""Shared completion handling for operation success and failure.
-
-This module provides common logic for handling operation completion from both:
-- The Redis Streams consumer (completion_consumer.py)
-- The HTTP webhook endpoint (routes.py)
-"""
-
-import logging
-from typing import Any
-
-import orjson
-from prisma import Prisma
-
-from . import service as chat_service
-from . import stream_registry
-from .response_model import StreamError, StreamToolOutputAvailable
-from .tools.models import ErrorResponse
-
-logger = logging.getLogger(__name__)
-
-# Tools that produce agent_json that needs to be saved to library
-AGENT_GENERATION_TOOLS = {"create_agent", "edit_agent"}
-
-# Keys that should be stripped from agent_json when returning in error responses
-SENSITIVE_KEYS = frozenset(
-    {
-        "api_key",
-        "apikey",
-        "api_secret",
-        "password",
-        "secret",
-        "credentials",
-        "credential",
-        "token",
-        "access_token",
-        "refresh_token",
-        "private_key",
-        "privatekey",
-        "auth",
-        "authorization",
-    }
-)
-
-
-def _sanitize_agent_json(obj: Any) -> Any:
-    """Recursively sanitize agent_json by removing sensitive keys.
-
-    Args:
-        obj: The object to sanitize (dict, list, or primitive)
-
-    Returns:
-        Sanitized copy with sensitive keys removed/redacted
-    """
-    if isinstance(obj, dict):
-        return {
-            k: "[REDACTED]" if k.lower() in SENSITIVE_KEYS else _sanitize_agent_json(v)
-            for k, v in obj.items()
-        }
-    elif isinstance(obj, list):
-        return [_sanitize_agent_json(item) for item in obj]
-    else:
-        return obj
-
-
-class ToolMessageUpdateError(Exception):
-    """Raised when updating a tool message in the database fails."""
-
-    pass
-
-
-async def _update_tool_message(
-    session_id: str,
-    tool_call_id: str,
-    content: str,
-    prisma_client: Prisma | None,
-) -> None:
-    """Update tool message in database.
-
-    Args:
-        session_id: The session ID
-        tool_call_id: The tool call ID to update
-        content: The new content for the message
-        prisma_client: Optional Prisma client. If None, uses chat_service.
-
-    Raises:
-        ToolMessageUpdateError: If the database update fails. The caller should
-            handle this to avoid marking the task as completed with inconsistent state.
-    """
-    try:
-        if prisma_client:
-            # Use provided Prisma client (for consumer with its own connection)
-            updated_count = await prisma_client.chatmessage.update_many(
-                where={
-                    "sessionId": session_id,
-                    "toolCallId": tool_call_id,
-                },
-                data={"content": content},
-            )
-            # Check if any rows were updated - 0 means message not found
-            if updated_count == 0:
-                raise ToolMessageUpdateError(
-                    f"No message found with tool_call_id={tool_call_id} in session {session_id}"
-                )
-        else:
-            # Use service function (for webhook endpoint)
-            await chat_service._update_pending_operation(
-                session_id=session_id,
-                tool_call_id=tool_call_id,
-                result=content,
-            )
-    except ToolMessageUpdateError:
-        raise
-    except Exception as e:
-        logger.error(f"[COMPLETION] Failed to update tool message: {e}", exc_info=True)
-        raise ToolMessageUpdateError(
-            f"Failed to update tool message for tool_call_id={tool_call_id}: {e}"
-        ) from e
-
-
-def serialize_result(result: dict | list | str | int | float | bool | None) -> str:
-    """Serialize result to JSON string with sensible defaults.
-
-    Args:
-        result: The result to serialize. Can be a dict, list, string,
-            number, boolean, or None.
-
-    Returns:
-        JSON string representation of the result. Returns '{"status": "completed"}'
-        only when result is explicitly None.
-    """
-    if isinstance(result, str):
-        return result
-    if result is None:
-        return '{"status": "completed"}'
-    return orjson.dumps(result).decode("utf-8")
-
-
-async def _save_agent_from_result(
-    result: dict[str, Any],
-    user_id: str | None,
-    tool_name: str,
-) -> dict[str, Any]:
-    """Save agent to library if result contains agent_json.
-
-    Args:
-        result: The result dict that may contain agent_json
-        user_id: The user ID to save the agent for
-        tool_name: The tool name (create_agent or edit_agent)
-
-    Returns:
-        Updated result dict with saved agent details, or original result if no agent_json
-    """
-    if not user_id:
-        logger.warning("[COMPLETION] Cannot save agent: no user_id in task")
-        return result
-
-    agent_json = result.get("agent_json")
-    if not agent_json:
-        logger.warning(
-            f"[COMPLETION] {tool_name} completed but no agent_json in result"
-        )
-        return result
-
-    try:
-        from .tools.agent_generator import save_agent_to_library
-
-        is_update = tool_name == "edit_agent"
-        created_graph, library_agent = await save_agent_to_library(
-            agent_json, user_id, is_update=is_update
-        )
-
-        logger.info(
-            f"[COMPLETION] Saved agent '{created_graph.name}' to library "
-            f"(graph_id={created_graph.id}, library_agent_id={library_agent.id})"
-        )
-
-        # Return a response similar to AgentSavedResponse
-        return {
-            "type": "agent_saved",
-            "message": f"Agent '{created_graph.name}' has been saved to your library!",
-            "agent_id": created_graph.id,
-            "agent_name": created_graph.name,
-            "library_agent_id": library_agent.id,
-            "library_agent_link": f"/library/agents/{library_agent.id}",
-            "agent_page_link": f"/build?flowID={created_graph.id}",
-        }
-    except Exception as e:
-        logger.error(
-            f"[COMPLETION] Failed to save agent to library: {e}",
-            exc_info=True,
-        )
-        # Return error but don't fail the whole operation
-        # Sanitize agent_json to remove sensitive keys before returning
-        return {
-            "type": "error",
-            "message": f"Agent was generated but failed to save: {str(e)}",
-            "error": str(e),
-            "agent_json": _sanitize_agent_json(agent_json),
-        }
-
-
-async def process_operation_success(
-    task: stream_registry.ActiveTask,
-    result: dict | str | None,
-    prisma_client: Prisma | None = None,
-) -> None:
-    """Handle successful operation completion.
-
-    Publishes the result to the stream registry, updates the database,
-    generates LLM continuation, and marks the task as completed.
-
-    Args:
-        task: The active task that completed
-        result: The result data from the operation
-        prisma_client: Optional Prisma client for database operations.
-            If None, uses chat_service._update_pending_operation instead.
-
-    Raises:
-        ToolMessageUpdateError: If the database update fails. The task will be
-            marked as failed instead of completed to avoid inconsistent state.
-    """
-    # For agent generation tools, save the agent to library
-    if task.tool_name in AGENT_GENERATION_TOOLS and isinstance(result, dict):
-        result = await _save_agent_from_result(result, task.user_id, task.tool_name)
-
-    # Serialize result for output (only substitute default when result is exactly None)
-    result_output = result if result is not None else {"status": "completed"}
-    output_str = (
-        result_output
-        if isinstance(result_output, str)
-        else orjson.dumps(result_output).decode("utf-8")
-    )
-
-    # Publish result to stream registry
-    await stream_registry.publish_chunk(
-        task.task_id,
-        StreamToolOutputAvailable(
-            toolCallId=task.tool_call_id,
-            toolName=task.tool_name,
-            output=output_str,
-            success=True,
-        ),
-    )
-
-    # Update pending operation in database
-    # If this fails, we must not continue to mark the task as completed
-    result_str = serialize_result(result)
-    try:
-        await _update_tool_message(
-            session_id=task.session_id,
-            tool_call_id=task.tool_call_id,
-            content=result_str,
-            prisma_client=prisma_client,
-        )
-    except ToolMessageUpdateError:
-        # DB update failed - mark task as failed to avoid inconsistent state
-        logger.error(
-            f"[COMPLETION] DB update failed for task {task.task_id}, "
-            "marking as failed instead of completed"
-        )
-        await stream_registry.publish_chunk(
-            task.task_id,
-            StreamError(errorText="Failed to save operation result to database"),
-        )
-        await stream_registry.mark_task_completed(task.task_id, status="failed")
-        raise
-
-    # Generate LLM continuation with streaming
-    try:
-        await chat_service._generate_llm_continuation_with_streaming(
-            session_id=task.session_id,
-            user_id=task.user_id,
-            task_id=task.task_id,
-        )
-    except Exception as e:
-        logger.error(
-            f"[COMPLETION] Failed to generate LLM continuation: {e}",
-            exc_info=True,
-        )
-
-    # Mark task as completed and release Redis lock
-    await stream_registry.mark_task_completed(task.task_id, status="completed")
-    try:
-        await chat_service._mark_operation_completed(task.tool_call_id)
-    except Exception as e:
-        logger.error(f"[COMPLETION] Failed to mark operation completed: {e}")
-
-    logger.info(
-        f"[COMPLETION] Successfully processed completion for task {task.task_id}"
-    )
-
-
-async def process_operation_failure(
-    task: stream_registry.ActiveTask,
-    error: str | None,
-    prisma_client: Prisma | None = None,
-) -> None:
-    """Handle failed operation completion.
-
-    Publishes the error to the stream registry, updates the database with
-    the error response, and marks the task as failed.
-
-    Args:
-        task: The active task that failed
-        error: The error message from the operation
-        prisma_client: Optional Prisma client for database operations.
-            If None, uses chat_service._update_pending_operation instead.
-    """
-    error_msg = error or "Operation failed"
-
-    # Publish error to stream registry
-    await stream_registry.publish_chunk(
-        task.task_id,
-        StreamError(errorText=error_msg),
-    )
-
-    # Update pending operation with error
-    # If this fails, we still continue to mark the task as failed
-    error_response = ErrorResponse(
-        message=error_msg,
-        error=error,
-    )
-    try:
-        await _update_tool_message(
-            session_id=task.session_id,
-            tool_call_id=task.tool_call_id,
-            content=error_response.model_dump_json(),
-            prisma_client=prisma_client,
-        )
-    except ToolMessageUpdateError:
-        # DB update failed - log but continue with cleanup
-        logger.error(
-            f"[COMPLETION] DB update failed while processing failure for task {task.task_id}, "
-            "continuing with cleanup"
-        )
-
-    # Mark task as failed and release Redis lock
-    await stream_registry.mark_task_completed(task.task_id, status="failed")
-    try:
-        await chat_service._mark_operation_completed(task.tool_call_id)
-    except Exception as e:
-        logger.error(f"[COMPLETION] Failed to mark operation completed: {e}")
-
-    logger.info(f"[COMPLETION] Processed failure for task {task.task_id}: {error_msg}")
--- a/autogpt_platform/backend/backend/api/features/chat/config.py
+++ b/autogpt_platform/backend/backend/api/features/chat/config.py
@@ -1,146 +0,0 @@
-"""Configuration management for chat system."""
-
-import os
-
-from pydantic import Field, field_validator
-from pydantic_settings import BaseSettings
-
-
-class ChatConfig(BaseSettings):
-    """Configuration for the chat system."""
-
-    # OpenAI API Configuration
-    model: str = Field(
-        default="anthropic/claude-opus-4.6", description="Default model to use"
-    )
-    title_model: str = Field(
-        default="openai/gpt-4o-mini",
-        description="Model to use for generating session titles (should be fast/cheap)",
-    )
-    api_key: str | None = Field(default=None, description="OpenAI API key")
-    base_url: str | None = Field(
-        default="https://openrouter.ai/api/v1",
-        description="Base URL for API (e.g., for OpenRouter)",
-    )
-
-    # Session TTL Configuration - 12 hours
-    session_ttl: int = Field(default=43200, description="Session TTL in seconds")
-
-    # Streaming Configuration
-    max_context_messages: int = Field(
-        default=50, ge=1, le=200, description="Maximum context messages"
-    )
-
-    stream_timeout: int = Field(default=300, description="Stream timeout in seconds")
-    max_retries: int = Field(default=3, description="Maximum number of retries")
-    max_agent_runs: int = Field(default=30, description="Maximum number of agent runs")
-    max_agent_schedules: int = Field(
-        default=30, description="Maximum number of agent schedules"
-    )
-
-    # Long-running operation configuration
-    long_running_operation_ttl: int = Field(
-        default=600,
-        description="TTL in seconds for long-running operation tracking in Redis (safety net if pod dies)",
-    )
-
-    # Stream registry configuration for SSE reconnection
-    stream_ttl: int = Field(
-        default=3600,
-        description="TTL in seconds for stream data in Redis (1 hour)",
-    )
-    stream_max_length: int = Field(
-        default=10000,
-        description="Maximum number of messages to store per stream",
-    )
-
-    # Redis Streams configuration for completion consumer
-    stream_completion_name: str = Field(
-        default="chat:completions",
-        description="Redis Stream name for operation completions",
-    )
-    stream_consumer_group: str = Field(
-        default="chat_consumers",
-        description="Consumer group name for completion stream",
-    )
-    stream_claim_min_idle_ms: int = Field(
-        default=60000,
-        description="Minimum idle time in milliseconds before claiming pending messages from dead consumers",
-    )
-
-    # Redis key prefixes for stream registry
-    task_meta_prefix: str = Field(
-        default="chat:task:meta:",
-        description="Prefix for task metadata hash keys",
-    )
-    task_stream_prefix: str = Field(
-        default="chat:stream:",
-        description="Prefix for task message stream keys",
-    )
-    task_op_prefix: str = Field(
-        default="chat:task:op:",
-        description="Prefix for operation ID to task ID mapping keys",
-    )
-    internal_api_key: str | None = Field(
-        default=None,
-        description="API key for internal webhook callbacks (env: CHAT_INTERNAL_API_KEY)",
-    )
-
-    # Langfuse Prompt Management Configuration
-    # Note: Langfuse credentials are in Settings().secrets (settings.py)
-    langfuse_prompt_name: str = Field(
-        default="CoPilot Prompt",
-        description="Name of the prompt in Langfuse to fetch",
-    )
-
-    @field_validator("api_key", mode="before")
-    @classmethod
-    def get_api_key(cls, v):
-        """Get API key from environment if not provided."""
-        if v is None:
-            # Try to get from environment variables
-            # First check for CHAT_API_KEY (Pydantic prefix)
-            v = os.getenv("CHAT_API_KEY")
-            if not v:
-                # Fall back to OPEN_ROUTER_API_KEY
-                v = os.getenv("OPEN_ROUTER_API_KEY")
-            if not v:
-                # Fall back to OPENAI_API_KEY
-                v = os.getenv("OPENAI_API_KEY")
-        return v
-
-    @field_validator("base_url", mode="before")
-    @classmethod
-    def get_base_url(cls, v):
-        """Get base URL from environment if not provided."""
-        if v is None:
-            # Check for OpenRouter or custom base URL
-            v = os.getenv("CHAT_BASE_URL")
-            if not v:
-                v = os.getenv("OPENROUTER_BASE_URL")
-            if not v:
-                v = os.getenv("OPENAI_BASE_URL")
-            if not v:
-                v = "https://openrouter.ai/api/v1"
-        return v
-
-    @field_validator("internal_api_key", mode="before")
-    @classmethod
-    def get_internal_api_key(cls, v):
-        """Get internal API key from environment if not provided."""
-        if v is None:
-            v = os.getenv("CHAT_INTERNAL_API_KEY")
-        return v
-
-    # Prompt paths for different contexts
-    PROMPT_PATHS: dict[str, str] = {
-        "default": "prompts/chat_system.md",
-        "onboarding": "prompts/onboarding_system.md",
-    }
-
-    class Config:
-        """Pydantic config."""
-
-        env_file = ".env"
-        env_file_encoding = "utf-8"
-        extra = "ignore"  # Ignore extra environment variables
--- a/autogpt_platform/backend/backend/api/features/chat/db.py
+++ b/autogpt_platform/backend/backend/api/features/chat/db.py
@@ -1,288 +0,0 @@
-"""Database operations for chat sessions."""
-
-import asyncio
-import logging
-from datetime import UTC, datetime
-from typing import Any, cast
-
-from prisma.models import ChatMessage as PrismaChatMessage
-from prisma.models import ChatSession as PrismaChatSession
-from prisma.types import (
-    ChatMessageCreateInput,
-    ChatSessionCreateInput,
-    ChatSessionUpdateInput,
-    ChatSessionWhereInput,
-)
-
-from backend.data.db import transaction
-from backend.util.json import SafeJson
-
-logger = logging.getLogger(__name__)
-
-
-async def get_chat_session(session_id: str) -> PrismaChatSession | None:
-    """Get a chat session by ID from the database."""
-    session = await PrismaChatSession.prisma().find_unique(
-        where={"id": session_id},
-        include={"Messages": True},
-    )
-    if session and session.Messages:
-        # Sort messages by sequence in Python - Prisma Python client doesn't support
-        # order_by in include clauses (unlike Prisma JS), so we sort after fetching
-        session.Messages.sort(key=lambda m: m.sequence)
-    return session
-
-
-async def create_chat_session(
-    session_id: str,
-    user_id: str,
-) -> PrismaChatSession:
-    """Create a new chat session in the database."""
-    data = ChatSessionCreateInput(
-        id=session_id,
-        userId=user_id,
-        credentials=SafeJson({}),
-        successfulAgentRuns=SafeJson({}),
-        successfulAgentSchedules=SafeJson({}),
-    )
-    return await PrismaChatSession.prisma().create(data=data)
-
-
-async def update_chat_session(
-    session_id: str,
-    credentials: dict[str, Any] | None = None,
-    successful_agent_runs: dict[str, Any] | None = None,
-    successful_agent_schedules: dict[str, Any] | None = None,
-    total_prompt_tokens: int | None = None,
-    total_completion_tokens: int | None = None,
-    title: str | None = None,
-) -> PrismaChatSession | None:
-    """Update a chat session's metadata."""
-    data: ChatSessionUpdateInput = {"updatedAt": datetime.now(UTC)}
-
-    if credentials is not None:
-        data["credentials"] = SafeJson(credentials)
-    if successful_agent_runs is not None:
-        data["successfulAgentRuns"] = SafeJson(successful_agent_runs)
-    if successful_agent_schedules is not None:
-        data["successfulAgentSchedules"] = SafeJson(successful_agent_schedules)
-    if total_prompt_tokens is not None:
-        data["totalPromptTokens"] = total_prompt_tokens
-    if total_completion_tokens is not None:
-        data["totalCompletionTokens"] = total_completion_tokens
-    if title is not None:
-        data["title"] = title
-
-    session = await PrismaChatSession.prisma().update(
-        where={"id": session_id},
-        data=data,
-        include={"Messages": True},
-    )
-    if session and session.Messages:
-        # Sort in Python - Prisma Python doesn't support order_by in include clauses
-        session.Messages.sort(key=lambda m: m.sequence)
-    return session
-
-
-async def add_chat_message(
-    session_id: str,
-    role: str,
-    sequence: int,
-    content: str | None = None,
-    name: str | None = None,
-    tool_call_id: str | None = None,
-    refusal: str | None = None,
-    tool_calls: list[dict[str, Any]] | None = None,
-    function_call: dict[str, Any] | None = None,
-) -> PrismaChatMessage:
-    """Add a message to a chat session."""
-    # Build input dict dynamically rather than using ChatMessageCreateInput directly
-    # because Prisma's TypedDict validation rejects optional fields set to None.
-    # We only include fields that have values, then cast at the end.
-    data: dict[str, Any] = {
-        "Session": {"connect": {"id": session_id}},
-        "role": role,
-        "sequence": sequence,
-    }
-
-    # Add optional string fields
-    if content is not None:
-        data["content"] = content
-    if name is not None:
-        data["name"] = name
-    if tool_call_id is not None:
-        data["toolCallId"] = tool_call_id
-    if refusal is not None:
-        data["refusal"] = refusal
-
-    # Add optional JSON fields only when they have values
-    if tool_calls is not None:
-        data["toolCalls"] = SafeJson(tool_calls)
-    if function_call is not None:
-        data["functionCall"] = SafeJson(function_call)
-
-    # Run message create and session timestamp update in parallel for lower latency
-    _, message = await asyncio.gather(
-        PrismaChatSession.prisma().update(
-            where={"id": session_id},
-            data={"updatedAt": datetime.now(UTC)},
-        ),
-        PrismaChatMessage.prisma().create(data=cast(ChatMessageCreateInput, data)),
-    )
-    return message
-
-
-async def add_chat_messages_batch(
-    session_id: str,
-    messages: list[dict[str, Any]],
-    start_sequence: int,
-) -> list[PrismaChatMessage]:
-    """Add multiple messages to a chat session in a batch.
-
-    Uses a transaction for atomicity - if any message creation fails,
-    the entire batch is rolled back.
-    """
-    if not messages:
-        return []
-
-    created_messages = []
-
-    async with transaction() as tx:
-        for i, msg in enumerate(messages):
-            # Build input dict dynamically rather than using ChatMessageCreateInput
-            # directly because Prisma's TypedDict validation rejects optional fields
-            # set to None. We only include fields that have values, then cast.
-            data: dict[str, Any] = {
-                "Session": {"connect": {"id": session_id}},
-                "role": msg["role"],
-                "sequence": start_sequence + i,
-            }
-
-            # Add optional string fields
-            if msg.get("content") is not None:
-                data["content"] = msg["content"]
-            if msg.get("name") is not None:
-                data["name"] = msg["name"]
-            if msg.get("tool_call_id") is not None:
-                data["toolCallId"] = msg["tool_call_id"]
-            if msg.get("refusal") is not None:
-                data["refusal"] = msg["refusal"]
-
-            # Add optional JSON fields only when they have values
-            if msg.get("tool_calls") is not None:
-                data["toolCalls"] = SafeJson(msg["tool_calls"])
-            if msg.get("function_call") is not None:
-                data["functionCall"] = SafeJson(msg["function_call"])
-
-            created = await PrismaChatMessage.prisma(tx).create(
-                data=cast(ChatMessageCreateInput, data)
-            )
-            created_messages.append(created)
-
-        # Update session's updatedAt timestamp within the same transaction.
-        # Note: Token usage (total_prompt_tokens, total_completion_tokens) is updated
-        # separately via update_chat_session() after streaming completes.
-        await PrismaChatSession.prisma(tx).update(
-            where={"id": session_id},
-            data={"updatedAt": datetime.now(UTC)},
-        )
-
-    return created_messages
-
-
-async def get_user_chat_sessions(
-    user_id: str,
-    limit: int = 50,
-    offset: int = 0,
-) -> list[PrismaChatSession]:
-    """Get chat sessions for a user, ordered by most recent."""
-    return await PrismaChatSession.prisma().find_many(
-        where={"userId": user_id},
-        order={"updatedAt": "desc"},
-        take=limit,
-        skip=offset,
-    )
-
-
-async def get_user_session_count(user_id: str) -> int:
-    """Get the total number of chat sessions for a user."""
-    return await PrismaChatSession.prisma().count(where={"userId": user_id})
-
-
-async def delete_chat_session(session_id: str, user_id: str | None = None) -> bool:
-    """Delete a chat session and all its messages.
-
-    Args:
-        session_id: The session ID to delete.
-        user_id: If provided, validates that the session belongs to this user
-            before deletion. This prevents unauthorized deletion of other
-            users' sessions.
-
-    Returns:
-        True if deleted successfully, False otherwise.
-    """
-    try:
-        # Build typed where clause with optional user_id validation
-        where_clause: ChatSessionWhereInput = {"id": session_id}
-        if user_id is not None:
-            where_clause["userId"] = user_id
-
-        result = await PrismaChatSession.prisma().delete_many(where=where_clause)
-        if result == 0:
-            logger.warning(
-                f"No session deleted for {session_id} "
-                f"(user_id validation: {user_id is not None})"
-            )
-            return False
-        return True
-    except Exception as e:
-        logger.error(f"Failed to delete chat session {session_id}: {e}")
-        return False
-
-
-async def get_chat_session_message_count(session_id: str) -> int:
-    """Get the number of messages in a chat session."""
-    count = await PrismaChatMessage.prisma().count(where={"sessionId": session_id})
-    return count
-
-
-async def update_tool_message_content(
-    session_id: str,
-    tool_call_id: str,
-    new_content: str,
-) -> bool:
-    """Update the content of a tool message in chat history.
-
-    Used by background tasks to update pending operation messages with final results.
-
-    Args:
-        session_id: The chat session ID.
-        tool_call_id: The tool call ID to find the message.
-        new_content: The new content to set.
-
-    Returns:
-        True if a message was updated, False otherwise.
-    """
-    try:
-        result = await PrismaChatMessage.prisma().update_many(
-            where={
-                "sessionId": session_id,
-                "toolCallId": tool_call_id,
-            },
-            data={
-                "content": new_content,
-            },
-        )
-        if result == 0:
-            logger.warning(
-                f"No message found to update for session {session_id}, "
-                f"tool_call_id {tool_call_id}"
-            )
-            return False
-        return True
-    except Exception as e:
-        logger.error(
-            f"Failed to update tool message for session {session_id}, "
-            f"tool_call_id {tool_call_id}: {e}"
-        )
-        return False
--- a/autogpt_platform/backend/backend/api/features/chat/model_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/model_test.py
@@ -1,119 +0,0 @@
-import pytest
-
-from .model import (
-    ChatMessage,
-    ChatSession,
-    Usage,
-    get_chat_session,
-    upsert_chat_session,
-)
-
-messages = [
-    ChatMessage(content="Hello, how are you?", role="user"),
-    ChatMessage(
-        content="I'm fine, thank you!",
-        role="assistant",
-        tool_calls=[
-            {
-                "id": "t123",
-                "type": "function",
-                "function": {
-                    "name": "get_weather",
-                    "arguments": '{"city": "New York"}',
-                },
-            }
-        ],
-    ),
-    ChatMessage(
-        content="I'm using the tool to get the weather",
-        role="tool",
-        tool_call_id="t123",
-    ),
-]
-
-
-@pytest.mark.asyncio(loop_scope="session")
-async def test_chatsession_serialization_deserialization():
-    s = ChatSession.new(user_id="abc123")
-    s.messages = messages
-    s.usage = [Usage(prompt_tokens=100, completion_tokens=200, total_tokens=300)]
-    serialized = s.model_dump_json()
-    s2 = ChatSession.model_validate_json(serialized)
-    assert s2.model_dump() == s.model_dump()
-
-
-@pytest.mark.asyncio(loop_scope="session")
-async def test_chatsession_redis_storage(setup_test_user, test_user_id):
-
-    s = ChatSession.new(user_id=test_user_id)
-    s.messages = messages
-
-    s = await upsert_chat_session(s)
-
-    s2 = await get_chat_session(
-        session_id=s.session_id,
-        user_id=s.user_id,
-    )
-
-    assert s2 == s
-
-
-@pytest.mark.asyncio(loop_scope="session")
-async def test_chatsession_redis_storage_user_id_mismatch(
-    setup_test_user, test_user_id
-):
-
-    s = ChatSession.new(user_id=test_user_id)
-    s.messages = messages
-    s = await upsert_chat_session(s)
-
-    s2 = await get_chat_session(s.session_id, "different_user_id")
-
-    assert s2 is None
-
-
-@pytest.mark.asyncio(loop_scope="session")
-async def test_chatsession_db_storage(setup_test_user, test_user_id):
-    """Test that messages are correctly saved to and loaded from DB (not cache)."""
-    from backend.data.redis_client import get_redis_async
-
-    # Create session with messages including assistant message
-    s = ChatSession.new(user_id=test_user_id)
-    s.messages = messages  # Contains user, assistant, and tool messages
-    assert s.session_id is not None, "Session id is not set"
-    # Upsert to save to both cache and DB
-    s = await upsert_chat_session(s)
-
-    # Clear the Redis cache to force DB load
-    redis_key = f"chat:session:{s.session_id}"
-    async_redis = await get_redis_async()
-    await async_redis.delete(redis_key)
-
-    # Load from DB (cache was cleared)
-    s2 = await get_chat_session(
-        session_id=s.session_id,
-        user_id=s.user_id,
-    )
-
-    assert s2 is not None, "Session not found after loading from DB"
-    assert len(s2.messages) == len(
-        s.messages
-    ), f"Message count mismatch: expected {len(s.messages)}, got {len(s2.messages)}"
-
-    # Verify all roles are present
-    roles = [m.role for m in s2.messages]
-    assert "user" in roles, f"User message missing. Roles found: {roles}"
-    assert "assistant" in roles, f"Assistant message missing. Roles found: {roles}"
-    assert "tool" in roles, f"Tool message missing. Roles found: {roles}"
-
-    # Verify message content
-    for orig, loaded in zip(s.messages, s2.messages):
-        assert orig.role == loaded.role, f"Role mismatch: {orig.role} != {loaded.role}"
-        assert (
-            orig.content == loaded.content
-        ), f"Content mismatch for {orig.role}: {orig.content} != {loaded.content}"
-        if orig.tool_calls:
-            assert (
-                loaded.tool_calls is not None
-            ), f"Tool calls missing for {orig.role} message"
-            assert len(orig.tool_calls) == len(loaded.tool_calls)
--- a/autogpt_platform/backend/backend/api/features/chat/routes.py
+++ b/autogpt_platform/backend/backend/api/features/chat/routes.py
--- a/autogpt_platform/backend/backend/api/features/chat/routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/routes_test.py
@@ -0,0 +1,402 @@
+"""Tests for chat API routes: session title update, file attachment validation, usage, and rate limiting."""
+
+from datetime import UTC, datetime, timedelta
+from unittest.mock import AsyncMock
+
+import fastapi
+import fastapi.testclient
+import pytest
+import pytest_mock
+
+from backend.api.features.chat import routes as chat_routes
+
+app = fastapi.FastAPI()
+app.include_router(chat_routes.router)
+
+client = fastapi.testclient.TestClient(app)
+
+TEST_USER_ID = "3e53486c-cf57-477e-ba2a-cb02dc828e1a"
+
+
+@pytest.fixture(autouse=True)
+def setup_app_auth(mock_jwt_user):
+    """Setup auth overrides for all tests in this module"""
+    from autogpt_libs.auth.jwt_utils import get_jwt_payload
+
+    app.dependency_overrides[get_jwt_payload] = mock_jwt_user["get_jwt_payload"]
+    yield
+    app.dependency_overrides.clear()
+
+
+def _mock_update_session_title(
+    mocker: pytest_mock.MockerFixture, *, success: bool = True
+):
+    """Mock update_session_title."""
+    return mocker.patch(
+        "backend.api.features.chat.routes.update_session_title",
+        new_callable=AsyncMock,
+        return_value=success,
+    )
+
+
+# ─── Update title: success ─────────────────────────────────────────────
+
+
+def test_update_title_success(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    mock_update = _mock_update_session_title(mocker, success=True)
+
+    response = client.patch(
+        "/sessions/sess-1/title",
+        json={"title": "My project"},
+    )
+
+    assert response.status_code == 200
+    assert response.json() == {"status": "ok"}
+    mock_update.assert_called_once_with("sess-1", test_user_id, "My project")
+
+
+def test_update_title_trims_whitespace(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    mock_update = _mock_update_session_title(mocker, success=True)
+
+    response = client.patch(
+        "/sessions/sess-1/title",
+        json={"title": "  trimmed  "},
+    )
+
+    assert response.status_code == 200
+    mock_update.assert_called_once_with("sess-1", test_user_id, "trimmed")
+
+
+# ─── Update title: blank / whitespace-only → 422 ──────────────────────
+
+
+def test_update_title_blank_rejected(
+    test_user_id: str,
+) -> None:
+    """Whitespace-only titles must be rejected before hitting the DB."""
+    response = client.patch(
+        "/sessions/sess-1/title",
+        json={"title": "   "},
+    )
+
+    assert response.status_code == 422
+
+
+def test_update_title_empty_rejected(
+    test_user_id: str,
+) -> None:
+    response = client.patch(
+        "/sessions/sess-1/title",
+        json={"title": ""},
+    )
+
+    assert response.status_code == 422
+
+
+# ─── Update title: session not found or wrong user → 404 ──────────────
+
+
+def test_update_title_not_found(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    _mock_update_session_title(mocker, success=False)
+
+    response = client.patch(
+        "/sessions/sess-1/title",
+        json={"title": "New name"},
+    )
+
+    assert response.status_code == 404
+
+
+# ─── file_ids Pydantic validation ─────────────────────────────────────
+
+
+def test_stream_chat_rejects_too_many_file_ids():
+    """More than 20 file_ids should be rejected by Pydantic validation (422)."""
+    response = client.post(
+        "/sessions/sess-1/stream",
+        json={
+            "message": "hello",
+            "file_ids": [f"00000000-0000-0000-0000-{i:012d}" for i in range(21)],
+        },
+    )
+    assert response.status_code == 422
+
+
+def _mock_stream_internals(mocker: pytest_mock.MockerFixture):
+    """Mock the async internals of stream_chat_post so tests can exercise
+    validation and enrichment logic without needing Redis/RabbitMQ."""
+    mocker.patch(
+        "backend.api.features.chat.routes._validate_and_get_session",
+        return_value=None,
+    )
+    mocker.patch(
+        "backend.api.features.chat.routes.append_and_save_message",
+        return_value=None,
+    )
+    mock_registry = mocker.MagicMock()
+    mock_registry.create_session = mocker.AsyncMock(return_value=None)
+    mocker.patch(
+        "backend.api.features.chat.routes.stream_registry",
+        mock_registry,
+    )
+    mocker.patch(
+        "backend.api.features.chat.routes.enqueue_copilot_turn",
+        return_value=None,
+    )
+    mocker.patch(
+        "backend.api.features.chat.routes.track_user_message",
+        return_value=None,
+    )
+
+
+def test_stream_chat_accepts_20_file_ids(mocker: pytest_mock.MockerFixture):
+    """Exactly 20 file_ids should be accepted (not rejected by validation)."""
+    _mock_stream_internals(mocker)
+    # Patch workspace lookup as imported by the routes module
+    mocker.patch(
+        "backend.api.features.chat.routes.get_or_create_workspace",
+        return_value=type("W", (), {"id": "ws-1"})(),
+    )
+    mock_prisma = mocker.MagicMock()
+    mock_prisma.find_many = mocker.AsyncMock(return_value=[])
+    mocker.patch(
+        "prisma.models.UserWorkspaceFile.prisma",
+        return_value=mock_prisma,
+    )
+
+    response = client.post(
+        "/sessions/sess-1/stream",
+        json={
+            "message": "hello",
+            "file_ids": [f"00000000-0000-0000-0000-{i:012d}" for i in range(20)],
+        },
+    )
+    # Should get past validation — 200 streaming response expected
+    assert response.status_code == 200
+
+
+# ─── UUID format filtering ─────────────────────────────────────────────
+
+
+def test_file_ids_filters_invalid_uuids(mocker: pytest_mock.MockFixture):
+    """Non-UUID strings in file_ids should be silently filtered out
+    and NOT passed to the database query."""
+    _mock_stream_internals(mocker)
+    mocker.patch(
+        "backend.api.features.chat.routes.get_or_create_workspace",
+        return_value=type("W", (), {"id": "ws-1"})(),
+    )
+
+    mock_prisma = mocker.MagicMock()
+    mock_prisma.find_many = mocker.AsyncMock(return_value=[])
+    mocker.patch(
+        "prisma.models.UserWorkspaceFile.prisma",
+        return_value=mock_prisma,
+    )
+
+    valid_id = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
+    client.post(
+        "/sessions/sess-1/stream",
+        json={
+            "message": "hello",
+            "file_ids": [
+                valid_id,
+                "not-a-uuid",
+                "../../../etc/passwd",
+                "",
+            ],
+        },
+    )
+
+    # The find_many call should only receive the one valid UUID
+    mock_prisma.find_many.assert_called_once()
+    call_kwargs = mock_prisma.find_many.call_args[1]
+    assert call_kwargs["where"]["id"]["in"] == [valid_id]
+
+
+# ─── Cross-workspace file_ids ─────────────────────────────────────────
+
+
+def test_file_ids_scoped_to_workspace(mocker: pytest_mock.MockFixture):
+    """Batch query is scoped to the user's workspace and excludes deleted files."""
+    _mock_stream_internals(mocker)
+    mocker.patch(
+        "backend.api.features.chat.routes.get_or_create_workspace",
+        return_value=type("W", (), {"id": "my-workspace-id"})(),
+    )
+
+    mock_prisma = mocker.MagicMock()
+    mock_prisma.find_many = mocker.AsyncMock(return_value=[])
+    mocker.patch(
+        "prisma.models.UserWorkspaceFile.prisma",
+        return_value=mock_prisma,
+    )
+
+    fid = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
+    client.post(
+        "/sessions/sess-1/stream",
+        json={"message": "hi", "file_ids": [fid]},
+    )
+
+    call_kwargs = mock_prisma.find_many.call_args[1]
+    assert call_kwargs["where"]["workspaceId"] == "my-workspace-id"
+    assert call_kwargs["where"]["isDeleted"] is False
+
+
+# ─── Rate limit → 429 ─────────────────────────────────────────────────
+
+
+def test_stream_chat_returns_429_on_daily_rate_limit(mocker: pytest_mock.MockFixture):
+    """When check_rate_limit raises RateLimitExceeded for the daily limit, the endpoint returns 429."""
+    from backend.copilot.rate_limit import RateLimitExceeded
+
+    _mock_stream_internals(mocker)
+    # Ensure the rate-limit branch is entered by setting a non-zero limit.
+    mocker.patch.object(chat_routes.config, "daily_token_limit", 10000)
+    mocker.patch.object(chat_routes.config, "weekly_token_limit", 50000)
+    mocker.patch(
+        "backend.api.features.chat.routes.check_rate_limit",
+        side_effect=RateLimitExceeded("daily", datetime.now(UTC) + timedelta(hours=1)),
+    )
+
+    response = client.post(
+        "/sessions/sess-1/stream",
+        json={"message": "hello"},
+    )
+    assert response.status_code == 429
+    assert "daily" in response.json()["detail"].lower()
+
+
+def test_stream_chat_returns_429_on_weekly_rate_limit(mocker: pytest_mock.MockFixture):
+    """When check_rate_limit raises RateLimitExceeded for the weekly limit, the endpoint returns 429."""
+    from backend.copilot.rate_limit import RateLimitExceeded
+
+    _mock_stream_internals(mocker)
+    mocker.patch.object(chat_routes.config, "daily_token_limit", 10000)
+    mocker.patch.object(chat_routes.config, "weekly_token_limit", 50000)
+    resets_at = datetime.now(UTC) + timedelta(days=3)
+    mocker.patch(
+        "backend.api.features.chat.routes.check_rate_limit",
+        side_effect=RateLimitExceeded("weekly", resets_at),
+    )
+
+    response = client.post(
+        "/sessions/sess-1/stream",
+        json={"message": "hello"},
+    )
+    assert response.status_code == 429
+    detail = response.json()["detail"].lower()
+    assert "weekly" in detail
+    assert "resets in" in detail
+
+
+def test_stream_chat_429_includes_reset_time(mocker: pytest_mock.MockFixture):
+    """The 429 response detail should include the human-readable reset time."""
+    from backend.copilot.rate_limit import RateLimitExceeded
+
+    _mock_stream_internals(mocker)
+    mocker.patch.object(chat_routes.config, "daily_token_limit", 10000)
+    mocker.patch.object(chat_routes.config, "weekly_token_limit", 50000)
+    mocker.patch(
+        "backend.api.features.chat.routes.check_rate_limit",
+        side_effect=RateLimitExceeded(
+            "daily", datetime.now(UTC) + timedelta(hours=2, minutes=30)
+        ),
+    )
+
+    response = client.post(
+        "/sessions/sess-1/stream",
+        json={"message": "hello"},
+    )
+    assert response.status_code == 429
+    detail = response.json()["detail"]
+    assert "2h" in detail
+    assert "Resets in" in detail
+
+
+# ─── Usage endpoint ───────────────────────────────────────────────────
+
+
+def _mock_usage(
+    mocker: pytest_mock.MockerFixture,
+    *,
+    daily_used: int = 500,
+    weekly_used: int = 2000,
+) -> AsyncMock:
+    """Mock get_usage_status to return a predictable CoPilotUsageStatus."""
+    from backend.copilot.rate_limit import CoPilotUsageStatus, UsageWindow
+
+    resets_at = datetime.now(UTC) + timedelta(days=1)
+    status = CoPilotUsageStatus(
+        daily=UsageWindow(used=daily_used, limit=10000, resets_at=resets_at),
+        weekly=UsageWindow(used=weekly_used, limit=50000, resets_at=resets_at),
+    )
+    return mocker.patch(
+        "backend.api.features.chat.routes.get_usage_status",
+        new_callable=AsyncMock,
+        return_value=status,
+    )
+
+
+def test_usage_returns_daily_and_weekly(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    """GET /usage returns daily and weekly usage."""
+    mock_get = _mock_usage(mocker, daily_used=500, weekly_used=2000)
+
+    mocker.patch.object(chat_routes.config, "daily_token_limit", 10000)
+    mocker.patch.object(chat_routes.config, "weekly_token_limit", 50000)
+
+    response = client.get("/usage")
+
+    assert response.status_code == 200
+    data = response.json()
+    assert data["daily"]["used"] == 500
+    assert data["weekly"]["used"] == 2000
+
+    mock_get.assert_called_once_with(
+        user_id=test_user_id,
+        daily_token_limit=10000,
+        weekly_token_limit=50000,
+    )
+
+
+def test_usage_uses_config_limits(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    """The endpoint forwards daily_token_limit and weekly_token_limit from config."""
+    mock_get = _mock_usage(mocker)
+
+    mocker.patch.object(chat_routes.config, "daily_token_limit", 99999)
+    mocker.patch.object(chat_routes.config, "weekly_token_limit", 77777)
+
+    response = client.get("/usage")
+
+    assert response.status_code == 200
+    mock_get.assert_called_once_with(
+        user_id=test_user_id,
+        daily_token_limit=99999,
+        weekly_token_limit=77777,
+    )
+
+
+def test_usage_rejects_unauthenticated_request() -> None:
+    """GET /usage should return 401 when no valid JWT is provided."""
+    unauthenticated_app = fastapi.FastAPI()
+    unauthenticated_app.include_router(chat_routes.router)
+    unauthenticated_client = fastapi.testclient.TestClient(unauthenticated_app)
+
+    response = unauthenticated_client.get("/usage")
+
+    assert response.status_code == 401
--- a/autogpt_platform/backend/backend/api/features/chat/service.py
+++ b/autogpt_platform/backend/backend/api/features/chat/service.py
--- a/autogpt_platform/backend/backend/api/features/chat/service_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/service_test.py
@@ -1,82 +0,0 @@
-import logging
-from os import getenv
-
-import pytest
-
-from . import service as chat_service
-from .model import create_chat_session, get_chat_session, upsert_chat_session
-from .response_model import (
-    StreamError,
-    StreamFinish,
-    StreamTextDelta,
-    StreamToolOutputAvailable,
-)
-
-logger = logging.getLogger(__name__)
-
-
-@pytest.mark.asyncio(loop_scope="session")
-async def test_stream_chat_completion(setup_test_user, test_user_id):
-    """
-    Test the stream_chat_completion function.
-    """
-    api_key: str | None = getenv("OPEN_ROUTER_API_KEY")
-    if not api_key:
-        return pytest.skip("OPEN_ROUTER_API_KEY is not set, skipping test")
-
-    session = await create_chat_session(test_user_id)
-
-    has_errors = False
-    has_ended = False
-    assistant_message = ""
-    async for chunk in chat_service.stream_chat_completion(
-        session.session_id, "Hello, how are you?", user_id=session.user_id
-    ):
-        logger.info(chunk)
-        if isinstance(chunk, StreamError):
-            has_errors = True
-        if isinstance(chunk, StreamTextDelta):
-            assistant_message += chunk.delta
-        if isinstance(chunk, StreamFinish):
-            has_ended = True
-
-    assert has_ended, "Chat completion did not end"
-    assert not has_errors, "Error occurred while streaming chat completion"
-    assert assistant_message, "Assistant message is empty"
-
-
-@pytest.mark.asyncio(loop_scope="session")
-async def test_stream_chat_completion_with_tool_calls(setup_test_user, test_user_id):
-    """
-    Test the stream_chat_completion function.
-    """
-    api_key: str | None = getenv("OPEN_ROUTER_API_KEY")
-    if not api_key:
-        return pytest.skip("OPEN_ROUTER_API_KEY is not set, skipping test")
-
-    session = await create_chat_session(test_user_id)
-    session = await upsert_chat_session(session)
-
-    has_errors = False
-    has_ended = False
-    had_tool_calls = False
-    async for chunk in chat_service.stream_chat_completion(
-        session.session_id,
-        "Please find me an agent that can help me with my business. Use the query 'moneny printing agent'",
-        user_id=session.user_id,
-    ):
-        logger.info(chunk)
-        if isinstance(chunk, StreamError):
-            has_errors = True
-
-        if isinstance(chunk, StreamFinish):
-            has_ended = True
-        if isinstance(chunk, StreamToolOutputAvailable):
-            had_tool_calls = True
-
-    assert has_ended, "Chat completion did not end"
-    assert not has_errors, "Error occurred while streaming chat completion"
-    assert had_tool_calls, "Tool calls did not occur"
-    session = await get_chat_session(session.session_id)
-    assert session, "Session not found"
-    assert session.usage, "Usage is empty"
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/service.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/service.py
@@ -1,498 +0,0 @@
-"""External Agent Generator service client.
-
-This module provides a client for communicating with the external Agent Generator
-microservice. When AGENTGENERATOR_HOST is configured, the agent generation functions
-will delegate to the external service instead of using the built-in LLM-based implementation.
-"""
-
-import logging
-from typing import Any
-
-import httpx
-
-from backend.util.settings import Settings
-
-logger = logging.getLogger(__name__)
-
-
-def _create_error_response(
-    error_message: str,
-    error_type: str = "unknown",
-    details: dict[str, Any] | None = None,
-) -> dict[str, Any]:
-    """Create a standardized error response dict.
-
-    Args:
-        error_message: Human-readable error message
-        error_type: Machine-readable error type
-        details: Optional additional error details
-
-    Returns:
-        Error dict with type="error" and error details
-    """
-    response: dict[str, Any] = {
-        "type": "error",
-        "error": error_message,
-        "error_type": error_type,
-    }
-    if details:
-        response["details"] = details
-    return response
-
-
-def _classify_http_error(e: httpx.HTTPStatusError) -> tuple[str, str]:
-    """Classify an HTTP error into error_type and message.
-
-    Args:
-        e: The HTTP status error
-
-    Returns:
-        Tuple of (error_type, error_message)
-    """
-    status = e.response.status_code
-    if status == 429:
-        return "rate_limit", f"Agent Generator rate limited: {e}"
-    elif status == 503:
-        return "service_unavailable", f"Agent Generator unavailable: {e}"
-    elif status == 504 or status == 408:
-        return "timeout", f"Agent Generator timed out: {e}"
-    else:
-        return "http_error", f"HTTP error calling Agent Generator: {e}"
-
-
-def _classify_request_error(e: httpx.RequestError) -> tuple[str, str]:
-    """Classify a request error into error_type and message.
-
-    Args:
-        e: The request error
-
-    Returns:
-        Tuple of (error_type, error_message)
-    """
-    error_str = str(e).lower()
-    if "timeout" in error_str or "timed out" in error_str:
-        return "timeout", f"Agent Generator request timed out: {e}"
-    elif "connect" in error_str:
-        return "connection_error", f"Could not connect to Agent Generator: {e}"
-    else:
-        return "request_error", f"Request error calling Agent Generator: {e}"
-
-
-_client: httpx.AsyncClient | None = None
-_settings: Settings | None = None
-
-
-def _get_settings() -> Settings:
-    """Get or create settings singleton."""
-    global _settings
-    if _settings is None:
-        _settings = Settings()
-    return _settings
-
-
-def is_external_service_configured() -> bool:
-    """Check if external Agent Generator service is configured."""
-    settings = _get_settings()
-    return bool(settings.config.agentgenerator_host)
-
-
-def _get_base_url() -> str:
-    """Get the base URL for the external service."""
-    settings = _get_settings()
-    host = settings.config.agentgenerator_host
-    port = settings.config.agentgenerator_port
-    return f"http://{host}:{port}"
-
-
-def _get_client() -> httpx.AsyncClient:
-    """Get or create the HTTP client for the external service."""
-    global _client
-    if _client is None:
-        settings = _get_settings()
-        _client = httpx.AsyncClient(
-            base_url=_get_base_url(),
-            timeout=httpx.Timeout(settings.config.agentgenerator_timeout),
-        )
-    return _client
-
-
-async def decompose_goal_external(
-    description: str,
-    context: str = "",
-    library_agents: list[dict[str, Any]] | None = None,
-) -> dict[str, Any] | None:
-    """Call the external service to decompose a goal.
-
-    Args:
-        description: Natural language goal description
-        context: Additional context (e.g., answers to previous questions)
-        library_agents: User's library agents available for sub-agent composition
-
-    Returns:
-        Dict with either:
-        - {"type": "clarifying_questions", "questions": [...]}
-        - {"type": "instructions", "steps": [...]}
-        - {"type": "unachievable_goal", ...}
-        - {"type": "vague_goal", ...}
-        - {"type": "error", "error": "...", "error_type": "..."} on error
-        Or None on unexpected error
-    """
-    client = _get_client()
-
-    if context:
-        description = f"{description}\n\nAdditional context from user:\n{context}"
-
-    payload: dict[str, Any] = {"description": description}
-    if library_agents:
-        payload["library_agents"] = library_agents
-
-    try:
-        response = await client.post("/api/decompose-description", json=payload)
-        response.raise_for_status()
-        data = response.json()
-
-        if not data.get("success"):
-            error_msg = data.get("error", "Unknown error from Agent Generator")
-            error_type = data.get("error_type", "unknown")
-            logger.error(
-                f"Agent Generator decomposition failed: {error_msg} "
-                f"(type: {error_type})"
-            )
-            return _create_error_response(error_msg, error_type)
-
-        # Map the response to the expected format
-        response_type = data.get("type")
-        if response_type == "instructions":
-            return {"type": "instructions", "steps": data.get("steps", [])}
-        elif response_type == "clarifying_questions":
-            return {
-                "type": "clarifying_questions",
-                "questions": data.get("questions", []),
-            }
-        elif response_type == "unachievable_goal":
-            return {
-                "type": "unachievable_goal",
-                "reason": data.get("reason"),
-                "suggested_goal": data.get("suggested_goal"),
-            }
-        elif response_type == "vague_goal":
-            return {
-                "type": "vague_goal",
-                "suggested_goal": data.get("suggested_goal"),
-            }
-        elif response_type == "error":
-            # Pass through error from the service
-            return _create_error_response(
-                data.get("error", "Unknown error"),
-                data.get("error_type", "unknown"),
-            )
-        else:
-            logger.error(
-                f"Unknown response type from external service: {response_type}"
-            )
-            return _create_error_response(
-                f"Unknown response type from Agent Generator: {response_type}",
-                "invalid_response",
-            )
-
-    except httpx.HTTPStatusError as e:
-        error_type, error_msg = _classify_http_error(e)
-        logger.error(error_msg)
-        return _create_error_response(error_msg, error_type)
-    except httpx.RequestError as e:
-        error_type, error_msg = _classify_request_error(e)
-        logger.error(error_msg)
-        return _create_error_response(error_msg, error_type)
-    except Exception as e:
-        error_msg = f"Unexpected error calling Agent Generator: {e}"
-        logger.error(error_msg)
-        return _create_error_response(error_msg, "unexpected_error")
-
-
-async def generate_agent_external(
-    instructions: dict[str, Any],
-    library_agents: list[dict[str, Any]] | None = None,
-    operation_id: str | None = None,
-    task_id: str | None = None,
-) -> dict[str, Any] | None:
-    """Call the external service to generate an agent from instructions.
-
-    Args:
-        instructions: Structured instructions from decompose_goal
-        library_agents: User's library agents available for sub-agent composition
-        operation_id: Operation ID for async processing (enables Redis Streams callback)
-        task_id: Task ID for async processing (enables Redis Streams callback)
-
-    Returns:
-        Agent JSON dict, {"status": "accepted"} for async, or error dict {"type": "error", ...} on error
-    """
-    client = _get_client()
-
-    # Build request payload
-    payload: dict[str, Any] = {"instructions": instructions}
-    if library_agents:
-        payload["library_agents"] = library_agents
-    if operation_id and task_id:
-        payload["operation_id"] = operation_id
-        payload["task_id"] = task_id
-
-    try:
-        response = await client.post("/api/generate-agent", json=payload)
-
-        # Handle 202 Accepted for async processing
-        if response.status_code == 202:
-            logger.info(
-                f"Agent Generator accepted async request "
-                f"(operation_id={operation_id}, task_id={task_id})"
-            )
-            return {
-                "status": "accepted",
-                "operation_id": operation_id,
-                "task_id": task_id,
-            }
-
-        response.raise_for_status()
-        data = response.json()
-
-        if not data.get("success"):
-            error_msg = data.get("error", "Unknown error from Agent Generator")
-            error_type = data.get("error_type", "unknown")
-            logger.error(
-                f"Agent Generator generation failed: {error_msg} (type: {error_type})"
-            )
-            return _create_error_response(error_msg, error_type)
-
-        return data.get("agent_json")
-
-    except httpx.HTTPStatusError as e:
-        error_type, error_msg = _classify_http_error(e)
-        logger.error(error_msg)
-        return _create_error_response(error_msg, error_type)
-    except httpx.RequestError as e:
-        error_type, error_msg = _classify_request_error(e)
-        logger.error(error_msg)
-        return _create_error_response(error_msg, error_type)
-    except Exception as e:
-        error_msg = f"Unexpected error calling Agent Generator: {e}"
-        logger.error(error_msg)
-        return _create_error_response(error_msg, "unexpected_error")
-
-
-async def generate_agent_patch_external(
-    update_request: str,
-    current_agent: dict[str, Any],
-    library_agents: list[dict[str, Any]] | None = None,
-    operation_id: str | None = None,
-    task_id: str | None = None,
-) -> dict[str, Any] | None:
-    """Call the external service to generate a patch for an existing agent.
-
-    Args:
-        update_request: Natural language description of changes
-        current_agent: Current agent JSON
-        library_agents: User's library agents available for sub-agent composition
-        operation_id: Operation ID for async processing (enables Redis Streams callback)
-        task_id: Task ID for async processing (enables Redis Streams callback)
-
-    Returns:
-        Updated agent JSON, clarifying questions dict, {"status": "accepted"} for async, or error dict on error
-    """
-    client = _get_client()
-
-    # Build request payload
-    payload: dict[str, Any] = {
-        "update_request": update_request,
-        "current_agent_json": current_agent,
-    }
-    if library_agents:
-        payload["library_agents"] = library_agents
-    if operation_id and task_id:
-        payload["operation_id"] = operation_id
-        payload["task_id"] = task_id
-
-    try:
-        response = await client.post("/api/update-agent", json=payload)
-
-        # Handle 202 Accepted for async processing
-        if response.status_code == 202:
-            logger.info(
-                f"Agent Generator accepted async update request "
-                f"(operation_id={operation_id}, task_id={task_id})"
-            )
-            return {
-                "status": "accepted",
-                "operation_id": operation_id,
-                "task_id": task_id,
-            }
-
-        response.raise_for_status()
-        data = response.json()
-
-        if not data.get("success"):
-            error_msg = data.get("error", "Unknown error from Agent Generator")
-            error_type = data.get("error_type", "unknown")
-            logger.error(
-                f"Agent Generator patch generation failed: {error_msg} "
-                f"(type: {error_type})"
-            )
-            return _create_error_response(error_msg, error_type)
-
-        # Check if it's clarifying questions
-        if data.get("type") == "clarifying_questions":
-            return {
-                "type": "clarifying_questions",
-                "questions": data.get("questions", []),
-            }
-
-        # Check if it's an error passed through
-        if data.get("type") == "error":
-            return _create_error_response(
-                data.get("error", "Unknown error"),
-                data.get("error_type", "unknown"),
-            )
-
-        # Otherwise return the updated agent JSON
-        return data.get("agent_json")
-
-    except httpx.HTTPStatusError as e:
-        error_type, error_msg = _classify_http_error(e)
-        logger.error(error_msg)
-        return _create_error_response(error_msg, error_type)
-    except httpx.RequestError as e:
-        error_type, error_msg = _classify_request_error(e)
-        logger.error(error_msg)
-        return _create_error_response(error_msg, error_type)
-    except Exception as e:
-        error_msg = f"Unexpected error calling Agent Generator: {e}"
-        logger.error(error_msg)
-        return _create_error_response(error_msg, "unexpected_error")
-
-
-async def customize_template_external(
-    template_agent: dict[str, Any],
-    modification_request: str,
-    context: str = "",
-) -> dict[str, Any] | None:
-    """Call the external service to customize a template/marketplace agent.
-
-    Args:
-        template_agent: The template agent JSON to customize
-        modification_request: Natural language description of customizations
-        context: Additional context (e.g., answers to previous questions)
-
-    Returns:
-        Customized agent JSON, clarifying questions dict, or error dict on error
-    """
-    client = _get_client()
-
-    request = modification_request
-    if context:
-        request = f"{modification_request}\n\nAdditional context from user:\n{context}"
-
-    payload: dict[str, Any] = {
-        "template_agent_json": template_agent,
-        "modification_request": request,
-    }
-
-    try:
-        response = await client.post("/api/template-modification", json=payload)
-        response.raise_for_status()
-        data = response.json()
-
-        if not data.get("success"):
-            error_msg = data.get("error", "Unknown error from Agent Generator")
-            error_type = data.get("error_type", "unknown")
-            logger.error(
-                f"Agent Generator template customization failed: {error_msg} "
-                f"(type: {error_type})"
-            )
-            return _create_error_response(error_msg, error_type)
-
-        # Check if it's clarifying questions
-        if data.get("type") == "clarifying_questions":
-            return {
-                "type": "clarifying_questions",
-                "questions": data.get("questions", []),
-            }
-
-        # Check if it's an error passed through
-        if data.get("type") == "error":
-            return _create_error_response(
-                data.get("error", "Unknown error"),
-                data.get("error_type", "unknown"),
-            )
-
-        # Otherwise return the customized agent JSON
-        return data.get("agent_json")
-
-    except httpx.HTTPStatusError as e:
-        error_type, error_msg = _classify_http_error(e)
-        logger.error(error_msg)
-        return _create_error_response(error_msg, error_type)
-    except httpx.RequestError as e:
-        error_type, error_msg = _classify_request_error(e)
-        logger.error(error_msg)
-        return _create_error_response(error_msg, error_type)
-    except Exception as e:
-        error_msg = f"Unexpected error calling Agent Generator: {e}"
-        logger.error(error_msg)
-        return _create_error_response(error_msg, "unexpected_error")
-
-
-async def get_blocks_external() -> list[dict[str, Any]] | None:
-    """Get available blocks from the external service.
-
-    Returns:
-        List of block info dicts or None on error
-    """
-    client = _get_client()
-
-    try:
-        response = await client.get("/api/blocks")
-        response.raise_for_status()
-        data = response.json()
-
-        if not data.get("success"):
-            logger.error("External service returned error getting blocks")
-            return None
-
-        return data.get("blocks", [])
-
-    except httpx.HTTPStatusError as e:
-        logger.error(f"HTTP error getting blocks from external service: {e}")
-        return None
-    except httpx.RequestError as e:
-        logger.error(f"Request error getting blocks from external service: {e}")
-        return None
-    except Exception as e:
-        logger.error(f"Unexpected error getting blocks from external service: {e}")
-        return None
-
-
-async def health_check() -> bool:
-    """Check if the external service is healthy.
-
-    Returns:
-        True if healthy, False otherwise
-    """
-    if not is_external_service_configured():
-        return False
-
-    client = _get_client()
-
-    try:
-        response = await client.get("/health")
-        response.raise_for_status()
-        data = response.json()
-        return data.get("status") == "healthy" and data.get("blocks_loaded", False)
-    except Exception as e:
-        logger.warning(f"External agent generator health check failed: {e}")
-        return False
-
-
-async def close_client() -> None:
-    """Close the HTTP client."""
-    global _client
-    if _client is not None:
-        await _client.aclose()
-        _client = None
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_search.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_search.py
@@ -1,239 +0,0 @@
-"""Shared agent search functionality for find_agent and find_library_agent tools."""
-
-import logging
-import re
-from typing import Literal
-
-from backend.api.features.library import db as library_db
-from backend.api.features.store import db as store_db
-from backend.util.exceptions import DatabaseError, NotFoundError
-
-from .models import (
-    AgentInfo,
-    AgentsFoundResponse,
-    ErrorResponse,
-    NoResultsResponse,
-    ToolResponseBase,
-)
-
-logger = logging.getLogger(__name__)
-
-SearchSource = Literal["marketplace", "library"]
-
-_UUID_PATTERN = re.compile(
-    r"^[a-f0-9]{8}-[a-f0-9]{4}-4[a-f0-9]{3}-[89ab][a-f0-9]{3}-[a-f0-9]{12}$",
-    re.IGNORECASE,
-)
-
-
-def _is_uuid(text: str) -> bool:
-    """Check if text is a valid UUID v4."""
-    return bool(_UUID_PATTERN.match(text.strip()))
-
-
-async def _get_library_agent_by_id(user_id: str, agent_id: str) -> AgentInfo | None:
-    """Fetch a library agent by ID (library agent ID or graph_id).
-
-    Tries multiple lookup strategies:
-    1. First by graph_id (AgentGraph primary key)
-    2. Then by library agent ID (LibraryAgent primary key)
-
-    Args:
-        user_id: The user ID
-        agent_id: The ID to look up (can be graph_id or library agent ID)
-
-    Returns:
-        AgentInfo if found, None otherwise
-    """
-    try:
-        agent = await library_db.get_library_agent_by_graph_id(user_id, agent_id)
-        if agent:
-            logger.debug(f"Found library agent by graph_id: {agent.name}")
-            return AgentInfo(
-                id=agent.id,
-                name=agent.name,
-                description=agent.description or "",
-                source="library",
-                in_library=True,
-                creator=agent.creator_name,
-                status=agent.status.value,
-                can_access_graph=agent.can_access_graph,
-                has_external_trigger=agent.has_external_trigger,
-                new_output=agent.new_output,
-                graph_id=agent.graph_id,
-            )
-    except DatabaseError:
-        raise
-    except Exception as e:
-        logger.warning(
-            f"Could not fetch library agent by graph_id {agent_id}: {e}",
-            exc_info=True,
-        )
-
-    try:
-        agent = await library_db.get_library_agent(agent_id, user_id)
-        if agent:
-            logger.debug(f"Found library agent by library_id: {agent.name}")
-            return AgentInfo(
-                id=agent.id,
-                name=agent.name,
-                description=agent.description or "",
-                source="library",
-                in_library=True,
-                creator=agent.creator_name,
-                status=agent.status.value,
-                can_access_graph=agent.can_access_graph,
-                has_external_trigger=agent.has_external_trigger,
-                new_output=agent.new_output,
-                graph_id=agent.graph_id,
-            )
-    except NotFoundError:
-        logger.debug(f"Library agent not found by library_id: {agent_id}")
-    except DatabaseError:
-        raise
-    except Exception as e:
-        logger.warning(
-            f"Could not fetch library agent by library_id {agent_id}: {e}",
-            exc_info=True,
-        )
-
-    return None
-
-
-async def search_agents(
-    query: str,
-    source: SearchSource,
-    session_id: str | None,
-    user_id: str | None = None,
-) -> ToolResponseBase:
-    """
-    Search for agents in marketplace or user library.
-
-    Args:
-        query: Search query string
-        source: "marketplace" or "library"
-        session_id: Chat session ID
-        user_id: User ID (required for library search)
-
-    Returns:
-        AgentsFoundResponse, NoResultsResponse, or ErrorResponse
-    """
-    if not query:
-        return ErrorResponse(
-            message="Please provide a search query", session_id=session_id
-        )
-
-    if source == "library" and not user_id:
-        return ErrorResponse(
-            message="User authentication required to search library",
-            session_id=session_id,
-        )
-
-    agents: list[AgentInfo] = []
-    try:
-        if source == "marketplace":
-            logger.info(f"Searching marketplace for: {query}")
-            results = await store_db.get_store_agents(search_query=query, page_size=5)
-            for agent in results.agents:
-                agents.append(
-                    AgentInfo(
-                        id=f"{agent.creator}/{agent.slug}",
-                        name=agent.agent_name,
-                        description=agent.description or "",
-                        source="marketplace",
-                        in_library=False,
-                        creator=agent.creator,
-                        category="general",
-                        rating=agent.rating,
-                        runs=agent.runs,
-                        is_featured=False,
-                    )
-                )
-        else:
-            if _is_uuid(query):
-                logger.info(f"Query looks like UUID, trying direct lookup: {query}")
-                agent = await _get_library_agent_by_id(user_id, query)  # type: ignore[arg-type]
-                if agent:
-                    agents.append(agent)
-                    logger.info(f"Found agent by direct ID lookup: {agent.name}")
-
-            if not agents:
-                logger.info(f"Searching user library for: {query}")
-                results = await library_db.list_library_agents(
-                    user_id=user_id,  # type: ignore[arg-type]
-                    search_term=query,
-                    page_size=10,
-                )
-                for agent in results.agents:
-                    agents.append(
-                        AgentInfo(
-                            id=agent.id,
-                            name=agent.name,
-                            description=agent.description or "",
-                            source="library",
-                            in_library=True,
-                            creator=agent.creator_name,
-                            status=agent.status.value,
-                            can_access_graph=agent.can_access_graph,
-                            has_external_trigger=agent.has_external_trigger,
-                            new_output=agent.new_output,
-                            graph_id=agent.graph_id,
-                        )
-                    )
-        logger.info(f"Found {len(agents)} agents in {source}")
-    except NotFoundError:
-        pass
-    except DatabaseError as e:
-        logger.error(f"Error searching {source}: {e}", exc_info=True)
-        return ErrorResponse(
-            message=f"Failed to search {source}. Please try again.",
-            error=str(e),
-            session_id=session_id,
-        )
-
-    if not agents:
-        suggestions = (
-            [
-                "Try more general terms",
-                "Browse categories in the marketplace",
-                "Check spelling",
-            ]
-            if source == "marketplace"
-            else [
-                "Try different keywords",
-                "Use find_agent to search the marketplace",
-                "Check your library at /library",
-            ]
-        )
-        no_results_msg = (
-            f"No agents found matching '{query}'. Let the user know they can try different keywords or browse the marketplace. Also let them know you can create a custom agent for them based on their needs."
-            if source == "marketplace"
-            else f"No agents matching '{query}' found in your library. Let the user know you can create a custom agent for them based on their needs."
-        )
-        return NoResultsResponse(
-            message=no_results_msg, session_id=session_id, suggestions=suggestions
-        )
-
-    title = f"Found {len(agents)} agent{'s' if len(agents) != 1 else ''} "
-    title += (
-        f"for '{query}'"
-        if source == "marketplace"
-        else f"in your library for '{query}'"
-    )
-
-    message = (
-        "Now you have found some options for the user to choose from. "
-        "You can add a link to a recommended agent at: /marketplace/agent/{agent_id}. "
-        "Please ask the user if they would like to use any of these agents. Let the user know we can create a custom agent for them based on their needs."
-        if source == "marketplace"
-        else "Found agents in the user's library. You can provide a link to view an agent at: "
-        "/library/agents/{agent_id}. Use agent_output to get execution results, or run_agent to execute. Let the user know we can create a custom agent for them based on their needs."
-    )
-
-    return AgentsFoundResponse(
-        message=message,
-        title=title,
-        agents=agents,
-        count=len(agents),
-        session_id=session_id,
-    )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/base.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/base.py
@@ -1,129 +0,0 @@
-"""Base classes and shared utilities for chat tools."""
-
-import logging
-from typing import Any
-
-from openai.types.chat import ChatCompletionToolParam
-
-from backend.api.features.chat.model import ChatSession
-from backend.api.features.chat.response_model import StreamToolOutputAvailable
-
-from .models import ErrorResponse, NeedLoginResponse, ToolResponseBase
-
-logger = logging.getLogger(__name__)
-
-
-class BaseTool:
-    """Base class for all chat tools."""
-
-    @property
-    def name(self) -> str:
-        """Tool name for OpenAI function calling."""
-        raise NotImplementedError
-
-    @property
-    def description(self) -> str:
-        """Tool description for OpenAI."""
-        raise NotImplementedError
-
-    @property
-    def parameters(self) -> dict[str, Any]:
-        """Tool parameters schema for OpenAI."""
-        raise NotImplementedError
-
-    @property
-    def requires_auth(self) -> bool:
-        """Whether this tool requires authentication."""
-        return False
-
-    @property
-    def is_long_running(self) -> bool:
-        """Whether this tool is long-running and should execute in background.
-
-        Long-running tools (like agent generation) are executed via background
-        tasks to survive SSE disconnections. The result is persisted to chat
-        history and visible when the user refreshes.
-        """
-        return False
-
-    def as_openai_tool(self) -> ChatCompletionToolParam:
-        """Convert to OpenAI tool format."""
-        return ChatCompletionToolParam(
-            type="function",
-            function={
-                "name": self.name,
-                "description": self.description,
-                "parameters": self.parameters,
-            },
-        )
-
-    async def execute(
-        self,
-        user_id: str | None,
-        session: ChatSession,
-        tool_call_id: str,
-        **kwargs,
-    ) -> StreamToolOutputAvailable:
-        """Execute the tool with authentication check.
-
-        Args:
-            user_id: User ID (may be anonymous like "anon_123")
-            session: Chat session the tool call belongs to
-            **kwargs: Tool-specific parameters
-
-        Returns:
-            StreamToolOutputAvailable wrapping the tool's JSON-serialized response
-
-        """
-        if self.requires_auth and not user_id:
-            logger.error(
-                f"Attempted tool call for {self.name} but user not authenticated"
-            )
-            return StreamToolOutputAvailable(
-                toolCallId=tool_call_id,
-                toolName=self.name,
-                output=NeedLoginResponse(
-                    message=f"Please sign in to use {self.name}",
-                    session_id=session.session_id,
-                ).model_dump_json(),
-                success=False,
-            )
-
-        try:
-            result = await self._execute(user_id, session, **kwargs)
-            return StreamToolOutputAvailable(
-                toolCallId=tool_call_id,
-                toolName=self.name,
-                output=result.model_dump_json(),
-            )
-        except Exception as e:
-            logger.error(f"Error in {self.name}: {e}", exc_info=True)
-            return StreamToolOutputAvailable(
-                toolCallId=tool_call_id,
-                toolName=self.name,
-                output=ErrorResponse(
-                    message=f"An error occurred while executing {self.name}",
-                    error=str(e),
-                    session_id=session.session_id,
-                ).model_dump_json(),
-                success=False,
-            )
-
-    async def _execute(
-        self,
-        user_id: str | None,
-        session: ChatSession,
-        **kwargs,
-    ) -> ToolResponseBase:
-        """Internal execution logic to be implemented by subclasses.
-
-        Args:
-            user_id: User ID (authenticated or anonymous)
-            session: Chat session the tool call belongs to
-            **kwargs: Tool-specific parameters
-
-        Returns:
-            Pydantic response object
-
-        """
-        raise NotImplementedError
--- a/autogpt_platform/backend/backend/api/features/chat/tools/create_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/create_agent.py
@@ -1,335 +0,0 @@
-"""CreateAgentTool - Creates agents from natural language descriptions."""
-
-import logging
-from typing import Any
-
-from backend.api.features.chat.model import ChatSession
-
-from .agent_generator import (
-    AgentGeneratorNotConfiguredError,
-    decompose_goal,
-    enrich_library_agents_from_steps,
-    generate_agent,
-    get_all_relevant_agents_for_generation,
-    get_user_message_for_error,
-    save_agent_to_library,
-)
-from .base import BaseTool
-from .models import (
-    AgentPreviewResponse,
-    AgentSavedResponse,
-    AsyncProcessingResponse,
-    ClarificationNeededResponse,
-    ClarifyingQuestion,
-    ErrorResponse,
-    ToolResponseBase,
-)
-
-logger = logging.getLogger(__name__)
-
-
-class CreateAgentTool(BaseTool):
-    """Tool for creating agents from natural language descriptions."""
-
-    @property
-    def name(self) -> str:
-        return "create_agent"
-
-    @property
-    def description(self) -> str:
-        return (
-            "Create a new agent workflow from a natural language description. "
-            "First generates a preview, then saves to library if save=true."
-        )
-
-    @property
-    def requires_auth(self) -> bool:
-        return True
-
-    @property
-    def is_long_running(self) -> bool:
-        return True
-
-    @property
-    def parameters(self) -> dict[str, Any]:
-        return {
-            "type": "object",
-            "properties": {
-                "description": {
-                    "type": "string",
-                    "description": (
-                        "Natural language description of what the agent should do. "
-                        "Be specific about inputs, outputs, and the workflow steps."
-                    ),
-                },
-                "context": {
-                    "type": "string",
-                    "description": (
-                        "Additional context or answers to previous clarifying questions. "
-                        "Include any preferences or constraints mentioned by the user."
-                    ),
-                },
-                "save": {
-                    "type": "boolean",
-                    "description": (
-                        "Whether to save the agent to the user's library. "
-                        "Default is true. Set to false for preview only."
-                    ),
-                    "default": True,
-                },
-            },
-            "required": ["description"],
-        }
-
-    async def _execute(
-        self,
-        user_id: str | None,
-        session: ChatSession,
-        **kwargs,
-    ) -> ToolResponseBase:
-        """Execute the create_agent tool.
-
-        Flow:
-        1. Decompose the description into steps (may return clarifying questions)
-        2. Generate agent JSON (external service handles fixing and validation)
-        3. Preview or save based on the save parameter
-        """
-        description = kwargs.get("description", "").strip()
-        context = kwargs.get("context", "")
-        save = kwargs.get("save", True)
-        session_id = session.session_id if session else None
-
-        # Extract async processing params (passed by long-running tool handler)
-        operation_id = kwargs.get("_operation_id")
-        task_id = kwargs.get("_task_id")
-
-        if not description:
-            return ErrorResponse(
-                message="Please provide a description of what the agent should do.",
-                error="Missing description parameter",
-                session_id=session_id,
-            )
-
-        library_agents = None
-        if user_id:
-            try:
-                library_agents = await get_all_relevant_agents_for_generation(
-                    user_id=user_id,
-                    search_query=description,
-                    include_marketplace=True,
-                )
-                logger.debug(
-                    f"Found {len(library_agents)} relevant agents for sub-agent composition"
-                )
-            except Exception as e:
-                logger.warning(f"Failed to fetch library agents: {e}")
-
-        try:
-            decomposition_result = await decompose_goal(
-                description, context, library_agents
-            )
-        except AgentGeneratorNotConfiguredError:
-            return ErrorResponse(
-                message=(
-                    "Agent generation is not available. "
-                    "The Agent Generator service is not configured."
-                ),
-                error="service_not_configured",
-                session_id=session_id,
-            )
-
-        if decomposition_result is None:
-            return ErrorResponse(
-                message="Failed to analyze the goal. The agent generation service may be unavailable. Please try again.",
-                error="decomposition_failed",
-                details={"description": description[:100]},
-                session_id=session_id,
-            )
-
-        if decomposition_result.get("type") == "error":
-            error_msg = decomposition_result.get("error", "Unknown error")
-            error_type = decomposition_result.get("error_type", "unknown")
-            user_message = get_user_message_for_error(
-                error_type,
-                operation="analyze the goal",
-                llm_parse_message="The AI had trouble understanding this request. Please try rephrasing your goal.",
-            )
-            return ErrorResponse(
-                message=user_message,
-                error=f"decomposition_failed:{error_type}",
-                details={
-                    "description": description[:100],
-                    "service_error": error_msg,
-                    "error_type": error_type,
-                },
-                session_id=session_id,
-            )
-
-        if decomposition_result.get("type") == "clarifying_questions":
-            questions = decomposition_result.get("questions", [])
-            return ClarificationNeededResponse(
-                message=(
-                    "I need some more information to create this agent. "
-                    "Please answer the following questions:"
-                ),
-                questions=[
-                    ClarifyingQuestion(
-                        question=q.get("question", ""),
-                        keyword=q.get("keyword", ""),
-                        example=q.get("example"),
-                    )
-                    for q in questions
-                ],
-                session_id=session_id,
-            )
-
-        if decomposition_result.get("type") == "unachievable_goal":
-            suggested = decomposition_result.get("suggested_goal", "")
-            reason = decomposition_result.get("reason", "")
-            return ErrorResponse(
-                message=(
-                    f"This goal cannot be accomplished with the available blocks. "
-                    f"{reason} "
-                    f"Suggestion: {suggested}"
-                ),
-                error="unachievable_goal",
-                details={"suggested_goal": suggested, "reason": reason},
-                session_id=session_id,
-            )
-
-        if decomposition_result.get("type") == "vague_goal":
-            suggested = decomposition_result.get("suggested_goal", "")
-            return ErrorResponse(
-                message=(
-                    f"The goal is too vague to create a specific workflow. "
-                    f"Suggestion: {suggested}"
-                ),
-                error="vague_goal",
-                details={"suggested_goal": suggested},
-                session_id=session_id,
-            )
-
-        if user_id and library_agents is not None:
-            try:
-                library_agents = await enrich_library_agents_from_steps(
-                    user_id=user_id,
-                    decomposition_result=decomposition_result,
-                    existing_agents=library_agents,
-                    include_marketplace=True,
-                )
-                logger.debug(
-                    f"After enrichment: {len(library_agents)} total agents for sub-agent composition"
-                )
-            except Exception as e:
-                logger.warning(f"Failed to enrich library agents from steps: {e}")
-
-        try:
-            agent_json = await generate_agent(
-                decomposition_result,
-                library_agents,
-                operation_id=operation_id,
-                task_id=task_id,
-            )
-        except AgentGeneratorNotConfiguredError:
-            return ErrorResponse(
-                message=(
-                    "Agent generation is not available. "
-                    "The Agent Generator service is not configured."
-                ),
-                error="service_not_configured",
-                session_id=session_id,
-            )
-
-        if agent_json is None:
-            return ErrorResponse(
-                message="Failed to generate the agent. The agent generation service may be unavailable. Please try again.",
-                error="generation_failed",
-                details={"description": description[:100]},
-                session_id=session_id,
-            )
-
-        if isinstance(agent_json, dict) and agent_json.get("type") == "error":
-            error_msg = agent_json.get("error", "Unknown error")
-            error_type = agent_json.get("error_type", "unknown")
-            user_message = get_user_message_for_error(
-                error_type,
-                operation="generate the agent",
-                llm_parse_message="The AI had trouble generating the agent. Please try again or simplify your goal.",
-                validation_message=(
-                    "I wasn't able to create a valid agent for this request. "
-                    "The generated workflow had some structural issues. "
-                    "Please try simplifying your goal or breaking it into smaller steps."
-                ),
-                error_details=error_msg,
-            )
-            return ErrorResponse(
-                message=user_message,
-                error=f"generation_failed:{error_type}",
-                details={
-                    "description": description[:100],
-                    "service_error": error_msg,
-                    "error_type": error_type,
-                },
-                session_id=session_id,
-            )
-
-        # Check if Agent Generator accepted for async processing
-        if agent_json.get("status") == "accepted":
-            logger.info(
-                f"Agent generation delegated to async processing "
-                f"(operation_id={operation_id}, task_id={task_id})"
-            )
-            return AsyncProcessingResponse(
-                message="Agent generation started. You'll be notified when it's complete.",
-                operation_id=operation_id,
-                task_id=task_id,
-                session_id=session_id,
-            )
-
-        agent_name = agent_json.get("name", "Generated Agent")
-        agent_description = agent_json.get("description", "")
-        node_count = len(agent_json.get("nodes", []))
-        link_count = len(agent_json.get("links", []))
-
-        if not save:
-            return AgentPreviewResponse(
-                message=(
-                    f"I've generated an agent called '{agent_name}' with {node_count} blocks. "
-                    f"Review it and call create_agent with save=true to save it to your library."
-                ),
-                agent_json=agent_json,
-                agent_name=agent_name,
-                description=agent_description,
-                node_count=node_count,
-                link_count=link_count,
-                session_id=session_id,
-            )
-
-        if not user_id:
-            return ErrorResponse(
-                message="You must be logged in to save agents.",
-                error="auth_required",
-                session_id=session_id,
-            )
-
-        try:
-            created_graph, library_agent = await save_agent_to_library(
-                agent_json, user_id
-            )
-
-            return AgentSavedResponse(
-                message=f"Agent '{created_graph.name}' has been saved to your library!",
-                agent_id=created_graph.id,
-                agent_name=created_graph.name,
-                library_agent_id=library_agent.id,
-                library_agent_link=f"/library/agents/{library_agent.id}",
-                agent_page_link=f"/build?flowID={created_graph.id}",
-                session_id=session_id,
-            )
-        except Exception as e:
-            return ErrorResponse(
-                message=f"Failed to save the agent: {str(e)}",
-                error="save_failed",
-                details={"exception": str(e)},
-                session_id=session_id,
-            )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/customize_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/customize_agent.py
@@ -1,337 +0,0 @@
-"""CustomizeAgentTool - Customizes marketplace/template agents using natural language."""
-
-import logging
-from typing import Any
-
-from backend.api.features.chat.model import ChatSession
-from backend.api.features.store import db as store_db
-from backend.api.features.store.exceptions import AgentNotFoundError
-
-from .agent_generator import (
-    AgentGeneratorNotConfiguredError,
-    customize_template,
-    get_user_message_for_error,
-    graph_to_json,
-    save_agent_to_library,
-)
-from .base import BaseTool
-from .models import (
-    AgentPreviewResponse,
-    AgentSavedResponse,
-    ClarificationNeededResponse,
-    ClarifyingQuestion,
-    ErrorResponse,
-    ToolResponseBase,
-)
-
-logger = logging.getLogger(__name__)
-
-
-class CustomizeAgentTool(BaseTool):
-    """Tool for customizing marketplace/template agents using natural language."""
-
-    @property
-    def name(self) -> str:
-        return "customize_agent"
-
-    @property
-    def description(self) -> str:
-        return (
-            "Customize a marketplace or template agent using natural language. "
-            "Takes an existing agent from the marketplace and modifies it based on "
-            "the user's requirements before adding to their library."
-        )
-
-    @property
-    def requires_auth(self) -> bool:
-        return True
-
-    @property
-    def is_long_running(self) -> bool:
-        return True
-
-    @property
-    def parameters(self) -> dict[str, Any]:
-        return {
-            "type": "object",
-            "properties": {
-                "agent_id": {
-                    "type": "string",
-                    "description": (
-                        "The marketplace agent ID in format 'creator/slug' "
-                        "(e.g., 'autogpt/newsletter-writer'). "
-                        "Get this from find_agent results."
-                    ),
-                },
-                "modifications": {
-                    "type": "string",
-                    "description": (
-                        "Natural language description of how to customize the agent. "
-                        "Be specific about what changes you want to make."
-                    ),
-                },
-                "context": {
-                    "type": "string",
-                    "description": (
-                        "Additional context or answers to previous clarifying questions."
-                    ),
-                },
-                "save": {
-                    "type": "boolean",
-                    "description": (
-                        "Whether to save the customized agent to the user's library. "
-                        "Default is true. Set to false for preview only."
-                    ),
-                    "default": True,
-                },
-            },
-            "required": ["agent_id", "modifications"],
-        }
-
-    async def _execute(
-        self,
-        user_id: str | None,
-        session: ChatSession,
-        **kwargs,
-    ) -> ToolResponseBase:
-        """Execute the customize_agent tool.
-
-        Flow:
-        1. Parse the agent ID to get creator/slug
-        2. Fetch the template agent from the marketplace
-        3. Call customize_template with the modification request
-        4. Preview or save based on the save parameter
-        """
-        agent_id = kwargs.get("agent_id", "").strip()
-        modifications = kwargs.get("modifications", "").strip()
-        context = kwargs.get("context", "")
-        save = kwargs.get("save", True)
-        session_id = session.session_id if session else None
-
-        if not agent_id:
-            return ErrorResponse(
-                message="Please provide the marketplace agent ID (e.g., 'creator/agent-name').",
-                error="missing_agent_id",
-                session_id=session_id,
-            )
-
-        if not modifications:
-            return ErrorResponse(
-                message="Please describe how you want to customize this agent.",
-                error="missing_modifications",
-                session_id=session_id,
-            )
-
-        # Parse agent_id in format "creator/slug"
-        parts = [p.strip() for p in agent_id.split("/")]
-        if len(parts) != 2 or not parts[0] or not parts[1]:
-            return ErrorResponse(
-                message=(
-                    f"Invalid agent ID format: '{agent_id}'. "
-                    "Expected format is 'creator/agent-name' "
-                    "(e.g., 'autogpt/newsletter-writer')."
-                ),
-                error="invalid_agent_id_format",
-                session_id=session_id,
-            )
-
-        creator_username, agent_slug = parts
-
-        # Fetch the marketplace agent details
-        try:
-            agent_details = await store_db.get_store_agent_details(
-                username=creator_username, agent_name=agent_slug
-            )
-        except AgentNotFoundError:
-            return ErrorResponse(
-                message=(
-                    f"Could not find marketplace agent '{agent_id}'. "
-                    "Please check the agent ID and try again."
-                ),
-                error="agent_not_found",
-                session_id=session_id,
-            )
-        except Exception as e:
-            logger.error(f"Error fetching marketplace agent {agent_id}: {e}")
-            return ErrorResponse(
-                message="Failed to fetch the marketplace agent. Please try again.",
-                error="fetch_error",
-                session_id=session_id,
-            )
-
-        if not agent_details.store_listing_version_id:
-            return ErrorResponse(
-                message=(
-                    f"The agent '{agent_id}' does not have an available version. "
-                    "Please try a different agent."
-                ),
-                error="no_version_available",
-                session_id=session_id,
-            )
-
-        # Get the full agent graph
-        try:
-            graph = await store_db.get_agent(agent_details.store_listing_version_id)
-            template_agent = graph_to_json(graph)
-        except Exception as e:
-            logger.error(f"Error fetching agent graph for {agent_id}: {e}")
-            return ErrorResponse(
-                message="Failed to fetch the agent configuration. Please try again.",
-                error="graph_fetch_error",
-                session_id=session_id,
-            )
-
-        # Call customize_template
-        try:
-            result = await customize_template(
-                template_agent=template_agent,
-                modification_request=modifications,
-                context=context,
-            )
-        except AgentGeneratorNotConfiguredError:
-            return ErrorResponse(
-                message=(
-                    "Agent customization is not available. "
-                    "The Agent Generator service is not configured."
-                ),
-                error="service_not_configured",
-                session_id=session_id,
-            )
-        except Exception as e:
-            logger.error(f"Error calling customize_template for {agent_id}: {e}")
-            return ErrorResponse(
-                message=(
-                    "Failed to customize the agent due to a service error. "
-                    "Please try again."
-                ),
-                error="customization_service_error",
-                session_id=session_id,
-            )
-
-        if result is None:
-            return ErrorResponse(
-                message=(
-                    "Failed to customize the agent. "
-                    "The agent generation service may be unavailable or timed out. "
-                    "Please try again."
-                ),
-                error="customization_failed",
-                session_id=session_id,
-            )
-
-        # Handle error response
-        if isinstance(result, dict) and result.get("type") == "error":
-            error_msg = result.get("error", "Unknown error")
-            error_type = result.get("error_type", "unknown")
-            user_message = get_user_message_for_error(
-                error_type,
-                operation="customize the agent",
-                llm_parse_message=(
-                    "The AI had trouble customizing the agent. "
-                    "Please try again or simplify your request."
-                ),
-                validation_message=(
-                    "The customized agent failed validation. "
-                    "Please try rephrasing your request."
-                ),
-                error_details=error_msg,
-            )
-            return ErrorResponse(
-                message=user_message,
-                error=f"customization_failed:{error_type}",
-                session_id=session_id,
-            )
-
-        # Handle clarifying questions
-        if isinstance(result, dict) and result.get("type") == "clarifying_questions":
-            questions = result.get("questions") or []
-            if not isinstance(questions, list):
-                logger.error(
-                    f"Unexpected clarifying questions format: {type(questions)}"
-                )
-                questions = []
-            return ClarificationNeededResponse(
-                message=(
-                    "I need some more information to customize this agent. "
-                    "Please answer the following questions:"
-                ),
-                questions=[
-                    ClarifyingQuestion(
-                        question=q.get("question", ""),
-                        keyword=q.get("keyword", ""),
-                        example=q.get("example"),
-                    )
-                    for q in questions
-                    if isinstance(q, dict)
-                ],
-                session_id=session_id,
-            )
-
-        # Result should be the customized agent JSON
-        if not isinstance(result, dict):
-            logger.error(f"Unexpected customize_template response type: {type(result)}")
-            return ErrorResponse(
-                message="Failed to customize the agent due to an unexpected response.",
-                error="unexpected_response_type",
-                session_id=session_id,
-            )
-
-        customized_agent = result
-
-        agent_name = customized_agent.get(
-            "name", f"Customized {agent_details.agent_name}"
-        )
-        agent_description = customized_agent.get("description", "")
-        nodes = customized_agent.get("nodes")
-        links = customized_agent.get("links")
-        node_count = len(nodes) if isinstance(nodes, list) else 0
-        link_count = len(links) if isinstance(links, list) else 0
-
-        if not save:
-            return AgentPreviewResponse(
-                message=(
-                    f"I've customized the agent '{agent_details.agent_name}'. "
-                    f"The customized agent has {node_count} blocks. "
-                    f"Review it and call customize_agent with save=true to save it."
-                ),
-                agent_json=customized_agent,
-                agent_name=agent_name,
-                description=agent_description,
-                node_count=node_count,
-                link_count=link_count,
-                session_id=session_id,
-            )
-
-        if not user_id:
-            return ErrorResponse(
-                message="You must be logged in to save agents.",
-                error="auth_required",
-                session_id=session_id,
-            )
-
-        # Save to user's library
-        try:
-            created_graph, library_agent = await save_agent_to_library(
-                customized_agent, user_id, is_update=False
-            )
-
-            return AgentSavedResponse(
-                message=(
-                    f"Customized agent '{created_graph.name}' "
-                    f"(based on '{agent_details.agent_name}') "
-                    f"has been saved to your library!"
-                ),
-                agent_id=created_graph.id,
-                agent_name=created_graph.name,
-                library_agent_id=library_agent.id,
-                library_agent_link=f"/library/agents/{library_agent.id}",
-                agent_page_link=f"/build?flowID={created_graph.id}",
-                session_id=session_id,
-            )
-        except Exception as e:
-            logger.error(f"Error saving customized agent: {e}")
-            return ErrorResponse(
-                message="Failed to save the customized agent. Please try again.",
-                error="save_failed",
-                session_id=session_id,
-            )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/edit_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/edit_agent.py
@@ -1,284 +0,0 @@
-"""EditAgentTool - Edits existing agents using natural language."""
-
-import logging
-from typing import Any
-
-from backend.api.features.chat.model import ChatSession
-
-from .agent_generator import (
-    AgentGeneratorNotConfiguredError,
-    generate_agent_patch,
-    get_agent_as_json,
-    get_all_relevant_agents_for_generation,
-    get_user_message_for_error,
-    save_agent_to_library,
-)
-from .base import BaseTool
-from .models import (
-    AgentPreviewResponse,
-    AgentSavedResponse,
-    AsyncProcessingResponse,
-    ClarificationNeededResponse,
-    ClarifyingQuestion,
-    ErrorResponse,
-    ToolResponseBase,
-)
-
-logger = logging.getLogger(__name__)
-
-
-class EditAgentTool(BaseTool):
-    """Tool for editing existing agents using natural language."""
-
-    @property
-    def name(self) -> str:
-        return "edit_agent"
-
-    @property
-    def description(self) -> str:
-        return (
-            "Edit an existing agent from the user's library using natural language. "
-            "Generates updates to the agent while preserving unchanged parts."
-        )
-
-    @property
-    def requires_auth(self) -> bool:
-        return True
-
-    @property
-    def is_long_running(self) -> bool:
-        return True
-
-    @property
-    def parameters(self) -> dict[str, Any]:
-        return {
-            "type": "object",
-            "properties": {
-                "agent_id": {
-                    "type": "string",
-                    "description": (
-                        "The ID of the agent to edit. "
-                        "Can be a graph ID or library agent ID."
-                    ),
-                },
-                "changes": {
-                    "type": "string",
-                    "description": (
-                        "Natural language description of what changes to make. "
-                        "Be specific about what to add, remove, or modify."
-                    ),
-                },
-                "context": {
-                    "type": "string",
-                    "description": (
-                        "Additional context or answers to previous clarifying questions."
-                    ),
-                },
-                "save": {
-                    "type": "boolean",
-                    "description": (
-                        "Whether to save the changes. "
-                        "Default is true. Set to false for preview only."
-                    ),
-                    "default": True,
-                },
-            },
-            "required": ["agent_id", "changes"],
-        }
-
-    async def _execute(
-        self,
-        user_id: str | None,
-        session: ChatSession,
-        **kwargs,
-    ) -> ToolResponseBase:
-        """Execute the edit_agent tool.
-
-        Flow:
-        1. Fetch the current agent
-        2. Generate updated agent (external service handles fixing and validation)
-        3. Preview or save based on the save parameter
-        """
-        agent_id = kwargs.get("agent_id", "").strip()
-        changes = kwargs.get("changes", "").strip()
-        context = kwargs.get("context", "")
-        save = kwargs.get("save", True)
-        session_id = session.session_id if session else None
-
-        # Extract async processing params (passed by long-running tool handler)
-        operation_id = kwargs.get("_operation_id")
-        task_id = kwargs.get("_task_id")
-
-        if not agent_id:
-            return ErrorResponse(
-                message="Please provide the agent ID to edit.",
-                error="Missing agent_id parameter",
-                session_id=session_id,
-            )
-
-        if not changes:
-            return ErrorResponse(
-                message="Please describe what changes you want to make.",
-                error="Missing changes parameter",
-                session_id=session_id,
-            )
-
-        current_agent = await get_agent_as_json(agent_id, user_id)
-
-        if current_agent is None:
-            return ErrorResponse(
-                message=f"Could not find agent with ID '{agent_id}' in your library.",
-                error="agent_not_found",
-                session_id=session_id,
-            )
-
-        library_agents = None
-        if user_id:
-            try:
-                graph_id = current_agent.get("id")
-                library_agents = await get_all_relevant_agents_for_generation(
-                    user_id=user_id,
-                    search_query=changes,
-                    exclude_graph_id=graph_id,
-                    include_marketplace=True,
-                )
-                logger.debug(
-                    f"Found {len(library_agents)} relevant agents for sub-agent composition"
-                )
-            except Exception as e:
-                logger.warning(f"Failed to fetch library agents: {e}")
-
-        update_request = changes
-        if context:
-            update_request = f"{changes}\n\nAdditional context:\n{context}"
-
-        try:
-            result = await generate_agent_patch(
-                update_request,
-                current_agent,
-                library_agents,
-                operation_id=operation_id,
-                task_id=task_id,
-            )
-        except AgentGeneratorNotConfiguredError:
-            return ErrorResponse(
-                message=(
-                    "Agent editing is not available. "
-                    "The Agent Generator service is not configured."
-                ),
-                error="service_not_configured",
-                session_id=session_id,
-            )
-
-        if result is None:
-            return ErrorResponse(
-                message="Failed to generate changes. The agent generation service may be unavailable or timed out. Please try again.",
-                error="update_generation_failed",
-                details={"agent_id": agent_id, "changes": changes[:100]},
-                session_id=session_id,
-            )
-
-        # Check if Agent Generator accepted for async processing
-        if result.get("status") == "accepted":
-            logger.info(
-                f"Agent edit delegated to async processing "
-                f"(operation_id={operation_id}, task_id={task_id})"
-            )
-            return AsyncProcessingResponse(
-                message="Agent edit started. You'll be notified when it's complete.",
-                operation_id=operation_id,
-                task_id=task_id,
-                session_id=session_id,
-            )
-
-        # Check if the result is an error from the external service
-        if isinstance(result, dict) and result.get("type") == "error":
-            error_msg = result.get("error", "Unknown error")
-            error_type = result.get("error_type", "unknown")
-            user_message = get_user_message_for_error(
-                error_type,
-                operation="generate the changes",
-                llm_parse_message="The AI had trouble generating the changes. Please try again or simplify your request.",
-                validation_message="The generated changes failed validation. Please try rephrasing your request.",
-                error_details=error_msg,
-            )
-            return ErrorResponse(
-                message=user_message,
-                error=f"update_generation_failed:{error_type}",
-                details={
-                    "agent_id": agent_id,
-                    "changes": changes[:100],
-                    "service_error": error_msg,
-                    "error_type": error_type,
-                },
-                session_id=session_id,
-            )
-
-        if result.get("type") == "clarifying_questions":
-            questions = result.get("questions", [])
-            return ClarificationNeededResponse(
-                message=(
-                    "I need some more information about the changes. "
-                    "Please answer the following questions:"
-                ),
-                questions=[
-                    ClarifyingQuestion(
-                        question=q.get("question", ""),
-                        keyword=q.get("keyword", ""),
-                        example=q.get("example"),
-                    )
-                    for q in questions
-                ],
-                session_id=session_id,
-            )
-
-        updated_agent = result
-
-        agent_name = updated_agent.get("name", "Updated Agent")
-        agent_description = updated_agent.get("description", "")
-        node_count = len(updated_agent.get("nodes", []))
-        link_count = len(updated_agent.get("links", []))
-
-        if not save:
-            return AgentPreviewResponse(
-                message=(
-                    f"I've updated the agent. "
-                    f"The agent now has {node_count} blocks. "
-                    f"Review it and call edit_agent with save=true to save the changes."
-                ),
-                agent_json=updated_agent,
-                agent_name=agent_name,
-                description=agent_description,
-                node_count=node_count,
-                link_count=link_count,
-                session_id=session_id,
-            )
-
-        if not user_id:
-            return ErrorResponse(
-                message="You must be logged in to save agents.",
-                error="auth_required",
-                session_id=session_id,
-            )
-
-        try:
-            created_graph, library_agent = await save_agent_to_library(
-                updated_agent, user_id, is_update=True
-            )
-
-            return AgentSavedResponse(
-                message=f"Updated agent '{created_graph.name}' has been saved to your library!",
-                agent_id=created_graph.id,
-                agent_name=created_graph.name,
-                library_agent_id=library_agent.id,
-                library_agent_link=f"/library/agents/{library_agent.id}",
-                agent_page_link=f"/build?flowID={created_graph.id}",
-                session_id=session_id,
-            )
-        except Exception as e:
-            return ErrorResponse(
-                message=f"Failed to save the updated agent: {str(e)}",
-                error="save_failed",
-                details={"exception": str(e)},
-                session_id=session_id,
-            )
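Reviewer note: the removed `_execute` above branches on the shape of the value returned by `generate_agent_patch` (no result, accepted for async processing, service error, clarifying questions, or the updated agent JSON). The sketch below summarises that dispatch under stated assumptions: `classify_generator_result` is a hypothetical helper, not a function from the repository, and the returned labels are illustrative only.

```python
from typing import Any


def classify_generator_result(result: Any) -> str:
    """Hypothetical helper: label a generator result the way the removed code branches on it."""
    if result is None:
        # The removed code treats None as "service unavailable or timed out".
        return "unavailable"
    if not isinstance(result, dict):
        # Checking the type first guards every later .get() call.
        return "unexpected"
    if result.get("status") == "accepted":
        # The external service took the request for async processing.
        return "accepted"
    if result.get("type") == "error":
        return "error"
    if result.get("type") == "clarifying_questions":
        return "clarifying_questions"
    # Anything else is assumed to be the agent JSON itself.
    return "agent_json"


print(classify_generator_result({"status": "accepted"}))
print(classify_generator_result({"name": "My Agent", "nodes": [], "links": []}))
```

One design point worth noting: the removed `edit_agent.py` calls `result.get("status")` before any `isinstance` check, while the removed customize tool validates the type before treating the result as agent JSON. The sketch validates the type up front.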
--- a/autogpt_platform/backend/backend/api/features/chat/tools/find_block_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/find_block_test.py
@@ -1,139 +0,0 @@
-"""Tests for block filtering in FindBlockTool."""
-
-from unittest.mock import AsyncMock, MagicMock, patch
-
-import pytest
-
-from backend.api.features.chat.tools.find_block import (
-    COPILOT_EXCLUDED_BLOCK_IDS,
-    COPILOT_EXCLUDED_BLOCK_TYPES,
-    FindBlockTool,
-)
-from backend.api.features.chat.tools.models import BlockListResponse
-from backend.data.block import BlockType
-
-from ._test_data import make_session
-
-_TEST_USER_ID = "test-user-find-block"
-
-
-def make_mock_block(
-    block_id: str, name: str, block_type: BlockType, disabled: bool = False
-):
-    """Create a mock block for testing."""
-    mock = MagicMock()
-    mock.id = block_id
-    mock.name = name
-    mock.description = f"{name} description"
-    mock.block_type = block_type
-    mock.disabled = disabled
-    mock.input_schema = MagicMock()
-    mock.input_schema.jsonschema.return_value = {"properties": {}, "required": []}
-    mock.input_schema.get_credentials_fields.return_value = {}
-    mock.output_schema = MagicMock()
-    mock.output_schema.jsonschema.return_value = {}
-    mock.categories = []
-    return mock
-
-
-class TestFindBlockFiltering:
-    """Tests for block filtering in FindBlockTool."""
-
-    def test_excluded_block_types_contains_expected_types(self):
-        """Verify COPILOT_EXCLUDED_BLOCK_TYPES contains all graph-only types."""
-        assert BlockType.INPUT in COPILOT_EXCLUDED_BLOCK_TYPES
-        assert BlockType.OUTPUT in COPILOT_EXCLUDED_BLOCK_TYPES
-        assert BlockType.WEBHOOK in COPILOT_EXCLUDED_BLOCK_TYPES
-        assert BlockType.WEBHOOK_MANUAL in COPILOT_EXCLUDED_BLOCK_TYPES
-        assert BlockType.NOTE in COPILOT_EXCLUDED_BLOCK_TYPES
-        assert BlockType.HUMAN_IN_THE_LOOP in COPILOT_EXCLUDED_BLOCK_TYPES
-        assert BlockType.AGENT in COPILOT_EXCLUDED_BLOCK_TYPES
-
-    def test_excluded_block_ids_contains_smart_decision_maker(self):
-        """Verify SmartDecisionMakerBlock is in COPILOT_EXCLUDED_BLOCK_IDS."""
-        assert "3b191d9f-356f-482d-8238-ba04b6d18381" in COPILOT_EXCLUDED_BLOCK_IDS
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_excluded_block_type_filtered_from_results(self):
-        """Verify blocks with excluded BlockTypes are filtered from search results."""
-        session = make_session(user_id=_TEST_USER_ID)
-
-        # Mock search returns an INPUT block (excluded) and a STANDARD block (included)
-        search_results = [
-            {"content_id": "input-block-id", "score": 0.9},
-            {"content_id": "standard-block-id", "score": 0.8},
-        ]
-
-        input_block = make_mock_block("input-block-id", "Input Block", BlockType.INPUT)
-        standard_block = make_mock_block(
-            "standard-block-id", "HTTP Request", BlockType.STANDARD
-        )
-
-        def mock_get_block(block_id):
-            return {
-                "input-block-id": input_block,
-                "standard-block-id": standard_block,
-            }.get(block_id)
-
-        with patch(
-            "backend.api.features.chat.tools.find_block.unified_hybrid_search",
-            new_callable=AsyncMock,
-            return_value=(search_results, 2),
-        ):
-            with patch(
-                "backend.api.features.chat.tools.find_block.get_block",
-                side_effect=mock_get_block,
-            ):
-                tool = FindBlockTool()
-                response = await tool._execute(
-                    user_id=_TEST_USER_ID, session=session, query="test"
-                )
-
-        # Should only return the standard block, not the INPUT block
-        assert isinstance(response, BlockListResponse)
-        assert len(response.blocks) == 1
-        assert response.blocks[0].id == "standard-block-id"
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_excluded_block_id_filtered_from_results(self):
-        """Verify SmartDecisionMakerBlock is filtered from search results."""
-        session = make_session(user_id=_TEST_USER_ID)
-
-        smart_decision_id = "3b191d9f-356f-482d-8238-ba04b6d18381"
-        search_results = [
-            {"content_id": smart_decision_id, "score": 0.9},
-            {"content_id": "normal-block-id", "score": 0.8},
-        ]
-
-        # SmartDecisionMakerBlock has STANDARD type but is excluded by ID
-        smart_block = make_mock_block(
-            smart_decision_id, "Smart Decision Maker", BlockType.STANDARD
-        )
-        normal_block = make_mock_block(
-            "normal-block-id", "Normal Block", BlockType.STANDARD
-        )
-
-        def mock_get_block(block_id):
-            return {
-                smart_decision_id: smart_block,
-                "normal-block-id": normal_block,
-            }.get(block_id)
-
-        with patch(
-            "backend.api.features.chat.tools.find_block.unified_hybrid_search",
-            new_callable=AsyncMock,
-            return_value=(search_results, 2),
-        ):
-            with patch(
-                "backend.api.features.chat.tools.find_block.get_block",
-                side_effect=mock_get_block,
-            ):
-                tool = FindBlockTool()
-                response = await tool._execute(
-                    user_id=_TEST_USER_ID, session=session, query="decision"
-                )
-
-        # Should only return normal block, not SmartDecisionMakerBlock
-        assert isinstance(response, BlockListResponse)
-        assert len(response.blocks) == 1
-        assert response.blocks[0].id == "normal-block-id"
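Reviewer note: the removed tests assert that search results are filtered both by block type and by block ID. The implementation of `FindBlockTool` is not part of this hunk, so the following is only a sketch of the behaviour the tests describe. `BlockType` here is a stand-in enum with assumed values, and `Block` and `filter_copilot_blocks` are hypothetical names. Only the excluded type names and the SmartDecisionMakerBlock ID are taken from the tests.

```python
from dataclasses import dataclass
from enum import Enum


class BlockType(Enum):
    # Stand-in for backend.data.block.BlockType; the string values are assumed.
    STANDARD = "standard"
    INPUT = "input"
    OUTPUT = "output"
    WEBHOOK = "webhook"
    WEBHOOK_MANUAL = "webhook_manual"
    NOTE = "note"
    HUMAN_IN_THE_LOOP = "human_in_the_loop"
    AGENT = "agent"


# Types the removed test expects in COPILOT_EXCLUDED_BLOCK_TYPES.
EXCLUDED_TYPES = frozenset(
    {
        BlockType.INPUT,
        BlockType.OUTPUT,
        BlockType.WEBHOOK,
        BlockType.WEBHOOK_MANUAL,
        BlockType.NOTE,
        BlockType.HUMAN_IN_THE_LOOP,
        BlockType.AGENT,
    }
)

# ID the removed test expects in COPILOT_EXCLUDED_BLOCK_IDS (SmartDecisionMakerBlock).
EXCLUDED_IDS = frozenset({"3b191d9f-356f-482d-8238-ba04b6d18381"})


@dataclass(frozen=True)
class Block:
    """Hypothetical minimal block record for illustration."""

    id: str
    name: str
    block_type: BlockType


def filter_copilot_blocks(blocks: list) -> list:
    """Keep blocks whose type and ID are both outside the exclusion sets."""
    return [
        block
        for block in blocks
        if block.block_type not in EXCLUDED_TYPES and block.id not in EXCLUDED_IDS
    ]


candidates = [
    Block("input-block-id", "Input Block", BlockType.INPUT),
    Block("3b191d9f-356f-482d-8238-ba04b6d18381", "Smart Decision Maker", BlockType.STANDARD),
    Block("standard-block-id", "HTTP Request", BlockType.STANDARD),
]
visible = filter_copilot_blocks(candidates)
print([block.id for block in visible])
```

The ID-based set exists because, as the second removed test points out, SmartDecisionMakerBlock has the STANDARD type and would otherwise pass the type filter.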
--- a/autogpt_platform/backend/backend/api/features/chat/tools/helpers.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/helpers.py
@@ -1,29 +0,0 @@
-"""Shared helpers for chat tools."""
-
-from typing import Any
-
-
-def get_inputs_from_schema(
-    input_schema: dict[str, Any],
-    exclude_fields: set[str] | None = None,
-) -> list[dict[str, Any]]:
-    """Extract input field info from JSON schema."""
-    if not isinstance(input_schema, dict):
-        return []
-
-    exclude = exclude_fields or set()
-    properties = input_schema.get("properties", {})
-    required = set(input_schema.get("required", []))
-
-    return [
-        {
-            "name": name,
-            "title": schema.get("title", name),
-            "type": schema.get("type", "string"),
-            "description": schema.get("description", ""),
-            "required": name in required,
-            "default": schema.get("default"),
-        }
-        for name, schema in properties.items()
-        if name not in exclude
-    ]
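Reviewer note: for readers checking what callers lose with this deletion, the sketch below restates `get_inputs_from_schema` as it appears in the removed hunk and runs it on a made-up schema. The sample schema and its field names (`url`, `retries`, `credentials`) are assumptions for illustration, not data from the repository.

```python
from __future__ import annotations

from typing import Any


def get_inputs_from_schema(
    input_schema: dict[str, Any],
    exclude_fields: set[str] | None = None,
) -> list[dict[str, Any]]:
    """Extract input field info from JSON schema (restated from the removed helper)."""
    if not isinstance(input_schema, dict):
        return []

    exclude = exclude_fields or set()
    properties = input_schema.get("properties", {})
    required = set(input_schema.get("required", []))

    return [
        {
            "name": name,
            # The title falls back to the property name when the schema has none.
            "title": schema.get("title", name),
            "type": schema.get("type", "string"),
            "description": schema.get("description", ""),
            "required": name in required,
            "default": schema.get("default"),
        }
        for name, schema in properties.items()
        if name not in exclude
    ]


# Hypothetical schema used only to exercise the helper.
sample_schema = {
    "properties": {
        "url": {"type": "string", "title": "URL", "description": "Page to fetch"},
        "retries": {"type": "integer", "default": 3},
        "credentials": {"type": "object"},
    },
    "required": ["url"],
}

fields = get_inputs_from_schema(sample_schema, exclude_fields={"credentials"})
for field in fields:
    print(field["name"], field["type"], field["required"], field["default"])
```

Fields keep the insertion order of `properties`, and a non-dict schema yields an empty list rather than raising.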
--- a/autogpt_platform/backend/backend/api/features/chat/tools/models.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/models.py
@@ -1,423 +0,0 @@
-"""Pydantic models for tool responses."""
-
-from datetime import datetime
-from enum import Enum
-from typing import Any
-
-from pydantic import BaseModel, Field
-
-from backend.data.model import CredentialsMetaInput
-
-
-class ResponseType(str, Enum):
-    """Types of tool responses."""
-
-    AGENTS_FOUND = "agents_found"
-    AGENT_DETAILS = "agent_details"
-    SETUP_REQUIREMENTS = "setup_requirements"
-    EXECUTION_STARTED = "execution_started"
-    NEED_LOGIN = "need_login"
-    ERROR = "error"
-    NO_RESULTS = "no_results"
-    AGENT_OUTPUT = "agent_output"
-    UNDERSTANDING_UPDATED = "understanding_updated"
-    AGENT_PREVIEW = "agent_preview"
-    AGENT_SAVED = "agent_saved"
-    CLARIFICATION_NEEDED = "clarification_needed"
-    BLOCK_LIST = "block_list"
-    BLOCK_OUTPUT = "block_output"
-    DOC_SEARCH_RESULTS = "doc_search_results"
-    DOC_PAGE = "doc_page"
-    # Workspace response types
-    WORKSPACE_FILE_LIST = "workspace_file_list"
-    WORKSPACE_FILE_CONTENT = "workspace_file_content"
-    WORKSPACE_FILE_METADATA = "workspace_file_metadata"
-    WORKSPACE_FILE_WRITTEN = "workspace_file_written"
-    WORKSPACE_FILE_DELETED = "workspace_file_deleted"
-    # Long-running operation types
-    OPERATION_STARTED = "operation_started"
-    OPERATION_PENDING = "operation_pending"
-    OPERATION_IN_PROGRESS = "operation_in_progress"
-    # Input validation
-    INPUT_VALIDATION_ERROR = "input_validation_error"
-
-
-# Base response model
-class ToolResponseBase(BaseModel):
-    """Base model for all tool responses."""
-
-    type: ResponseType
-    message: str
-    session_id: str | None = None
-
-
-# Agent discovery models
-class AgentInfo(BaseModel):
-    """Information about an agent."""
-
-    id: str
-    name: str
-    description: str
-    source: str = Field(description="marketplace or library")
-    in_library: bool = False
-    creator: str | None = None
-    category: str | None = None
-    rating: float | None = None
-    runs: int | None = None
-    is_featured: bool | None = None
-    status: str | None = None
-    can_access_graph: bool | None = None
-    has_external_trigger: bool | None = None
-    new_output: bool | None = None
-    graph_id: str | None = None
-    inputs: dict[str, Any] | None = Field(
-        default=None,
-        description="Input schema for the agent, including field names, types, and defaults",
-    )
-
-
-class AgentsFoundResponse(ToolResponseBase):
-    """Response for find_agent tool."""
-
-    type: ResponseType = ResponseType.AGENTS_FOUND
-    title: str = "Available Agents"
-    agents: list[AgentInfo]
-    count: int
-    name: str = "agents_found"
-
-
-class NoResultsResponse(ToolResponseBase):
-    """Response when no agents found."""
-
-    type: ResponseType = ResponseType.NO_RESULTS
-    suggestions: list[str] = []
-    name: str = "no_results"
-
-
-# Agent details models
-class InputField(BaseModel):
-    """Input field specification."""
-
-    name: str
-    type: str = "string"
-    description: str = ""
-    required: bool = False
-    default: Any | None = None
-    options: list[Any] | None = None
-    format: str | None = None
-
-
-class ExecutionOptions(BaseModel):
-    """Available execution options for an agent."""
-
-    manual: bool = True
-    scheduled: bool = True
-    webhook: bool = False
-
-
-class AgentDetails(BaseModel):
-    """Detailed agent information."""
-
-    id: str
-    name: str
-    description: str
-    in_library: bool = False
-    inputs: dict[str, Any] = {}
-    credentials: list[CredentialsMetaInput] = []
-    execution_options: ExecutionOptions = Field(default_factory=ExecutionOptions)
-    trigger_info: dict[str, Any] | None = None
-
-
-class AgentDetailsResponse(ToolResponseBase):
-    """Response for get_details action."""
-
-    type: ResponseType = ResponseType.AGENT_DETAILS
-    agent: AgentDetails
-    user_authenticated: bool = False
-    graph_id: str | None = None
-    graph_version: int | None = None
-
-
-# Setup info models
-class UserReadiness(BaseModel):
-    """User readiness status."""
-
-    has_all_credentials: bool = False
-    missing_credentials: dict[str, Any] = {}
-    ready_to_run: bool = False
-
-
-class SetupInfo(BaseModel):
-    """Complete setup information."""
-
-    agent_id: str
-    agent_name: str
-    requirements: dict[str, list[Any]] = Field(
-        default_factory=lambda: {
-            "credentials": [],
-            "inputs": [],
-            "execution_modes": [],
-        },
-    )
-    user_readiness: UserReadiness = Field(default_factory=UserReadiness)
-
-
-class SetupRequirementsResponse(ToolResponseBase):
-    """Response for validate action."""
-
-    type: ResponseType = ResponseType.SETUP_REQUIREMENTS
-    setup_info: SetupInfo
-    graph_id: str | None = None
-    graph_version: int | None = None
-
-
-# Execution models
-class ExecutionStartedResponse(ToolResponseBase):
-    """Response for run/schedule actions."""
-
-    type: ResponseType = ResponseType.EXECUTION_STARTED
-    execution_id: str
-    graph_id: str
-    graph_name: str
-    library_agent_id: str | None = None
-    library_agent_link: str | None = None
-    status: str = "QUEUED"
-
-
-# Auth/error models
-class NeedLoginResponse(ToolResponseBase):
-    """Response when login is needed."""
-
-    type: ResponseType = ResponseType.NEED_LOGIN
-    agent_info: dict[str, Any] | None = None
-
-
-class ErrorResponse(ToolResponseBase):
-    """Response for errors."""
-
-    type: ResponseType = ResponseType.ERROR
-    error: str | None = None
-    details: dict[str, Any] | None = None
-
-
-class InputValidationErrorResponse(ToolResponseBase):
-    """Response when run_agent receives unknown input fields."""
-
-    type: ResponseType = ResponseType.INPUT_VALIDATION_ERROR
-    unrecognized_fields: list[str] = Field(
-        description="List of input field names that were not recognized"
-    )
-    inputs: dict[str, Any] = Field(
-        description="The agent's valid input schema for reference"
-    )
-    graph_id: str | None = None
-    graph_version: int | None = None
-
-
-# Agent output models
-class ExecutionOutputInfo(BaseModel):
-    """Summary of a single execution's outputs."""
-
-    execution_id: str
-    status: str
-    started_at: datetime | None = None
-    ended_at: datetime | None = None
-    outputs: dict[str, list[Any]]
-    inputs_summary: dict[str, Any] | None = None
-
-
-class AgentOutputResponse(ToolResponseBase):
-    """Response for agent_output tool."""
-
-    type: ResponseType = ResponseType.AGENT_OUTPUT
-    agent_name: str
-    agent_id: str
-    library_agent_id: str | None = None
-    library_agent_link: str | None = None
-    execution: ExecutionOutputInfo | None = None
-    available_executions: list[dict[str, Any]] | None = None
-    total_executions: int = 0
-
-
-# Business understanding models
-class UnderstandingUpdatedResponse(ToolResponseBase):
-    """Response for add_understanding tool."""
-
-    type: ResponseType = ResponseType.UNDERSTANDING_UPDATED
-    updated_fields: list[str] = Field(default_factory=list)
-    current_understanding: dict[str, Any] = Field(default_factory=dict)
-
-
-# Agent generation models
-class ClarifyingQuestion(BaseModel):
-    """A question that needs user clarification."""
-
-    question: str
-    keyword: str
-    example: str | None = None
-
-
-class AgentPreviewResponse(ToolResponseBase):
-    """Response for previewing a generated agent before saving."""
-
-    type: ResponseType = ResponseType.AGENT_PREVIEW
-    agent_json: dict[str, Any]
-    agent_name: str
-    description: str
-    node_count: int
-    link_count: int = 0
-
-
-class AgentSavedResponse(ToolResponseBase):
-    """Response when an agent is saved to the library."""
-
-    type: ResponseType = ResponseType.AGENT_SAVED
-    agent_id: str
-    agent_name: str
-    library_agent_id: str
-    library_agent_link: str
-    agent_page_link: str  # Link to the agent builder/editor page
-
-
-class ClarificationNeededResponse(ToolResponseBase):
-    """Response when the LLM needs more information from the user."""
-
-    type: ResponseType = ResponseType.CLARIFICATION_NEEDED
-    questions: list[ClarifyingQuestion] = Field(default_factory=list)
-
-
-# Documentation search models
-class DocSearchResult(BaseModel):
-    """A single documentation search result."""
-
-    title: str
-    path: str
-    section: str
-    snippet: str  # Short excerpt for UI display
-    score: float
-    doc_url: str | None = None
-
-
-class DocSearchResultsResponse(ToolResponseBase):
-    """Response for search_docs tool."""
-
-    type: ResponseType = ResponseType.DOC_SEARCH_RESULTS
-    results: list[DocSearchResult]
-    count: int
-    query: str
-
-
-class DocPageResponse(ToolResponseBase):
-    """Response for get_doc_page tool."""
-
-    type: ResponseType = ResponseType.DOC_PAGE
-    title: str
-    path: str
-    content: str  # Full document content
-    doc_url: str | None = None
-
-
-# Block models
-class BlockInputFieldInfo(BaseModel):
-    """Information about a block input field."""
-
-    name: str
-    type: str
-    description: str = ""
-    required: bool = False
-    default: Any | None = None
-
-
-class BlockInfoSummary(BaseModel):
-    """Summary of a block for search results."""
-
-    id: str
-    name: str
-    description: str
-    categories: list[str]
-    input_schema: dict[str, Any]
-    output_schema: dict[str, Any]
-    required_inputs: list[BlockInputFieldInfo] = Field(
-        default_factory=list,
-        description="List of required input fields for this block",
-    )
-
-
-class BlockListResponse(ToolResponseBase):
-    """Response for find_block tool."""
-
-    type: ResponseType = ResponseType.BLOCK_LIST
-    blocks: list[BlockInfoSummary]
-    count: int
-    query: str
-    usage_hint: str = Field(
-        default="To execute a block, call run_block with block_id set to the block's "
-        "'id' field and input_data containing the required fields from input_schema."
-    )
-
-
-class BlockOutputResponse(ToolResponseBase):
-    """Response for run_block tool."""
-
-    type: ResponseType = ResponseType.BLOCK_OUTPUT
-    block_id: str
-    block_name: str
-    outputs: dict[str, list[Any]]
-    success: bool = True
-
-
-# Long-running operation models
-class OperationStartedResponse(ToolResponseBase):
-    """Response when a long-running operation has been started in the background.
-
-    This is returned immediately to the client while the operation continues
-    to execute. The user can close the tab and check back later.
-
-    The task_id can be used to reconnect to the SSE stream via
-    GET /chat/tasks/{task_id}/stream?last_idx=0
-    """
-
-    type: ResponseType = ResponseType.OPERATION_STARTED
-    operation_id: str
-    tool_name: str
-    task_id: str | None = None  # For SSE reconnection
-
-
-class OperationPendingResponse(ToolResponseBase):
-    """Response stored in chat history while a long-running operation is executing.
-
-    This is persisted to the database so users see a pending state when they
-    refresh before the operation completes.
-    """
-
-    type: ResponseType = ResponseType.OPERATION_PENDING
-    operation_id: str
-    tool_name: str
-
-
-class OperationInProgressResponse(ToolResponseBase):
-    """Response when an operation is already in progress.
-
-    Returned for idempotency when the same tool_call_id is requested again
-    while the background task is still running.
-    """
-
-    type: ResponseType = ResponseType.OPERATION_IN_PROGRESS
-    tool_call_id: str
-
-
-class AsyncProcessingResponse(ToolResponseBase):
-    """Response when an operation has been delegated to async processing.
-
-    This is returned by tools when the external service accepts the request
-    for async processing (HTTP 202 Accepted). The Redis Streams completion
-    consumer will handle the result when the external service completes.
-
-    The status field is specifically "accepted" to allow the long-running tool
-    handler to detect this response and skip LLM continuation.
-    """
-
-    type: ResponseType = ResponseType.OPERATION_STARTED
-    status: str = "accepted"  # Must be "accepted" for detection
-    operation_id: str | None = None
-    task_id: str | None = None
--- a/autogpt_platform/backend/backend/api/features/chat/tools/run_block.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/run_block.py
@@ -1,355 +0,0 @@
-"""Tool for executing blocks directly."""
-
-import logging
-import uuid
-from collections import defaultdict
-from typing import Any
-
-from pydantic_core import PydanticUndefined
-
-from backend.api.features.chat.model import ChatSession
-from backend.api.features.chat.tools.find_block import (
-    COPILOT_EXCLUDED_BLOCK_IDS,
-    COPILOT_EXCLUDED_BLOCK_TYPES,
-)
-from backend.data.block import AnyBlockSchema, get_block
-from backend.data.execution import ExecutionContext
-from backend.data.model import CredentialsFieldInfo, CredentialsMetaInput
-from backend.data.workspace import get_or_create_workspace
-from backend.integrations.creds_manager import IntegrationCredentialsManager
-from backend.util.exceptions import BlockError
-
-from .base import BaseTool
-from .helpers import get_inputs_from_schema
-from .models import (
-    BlockOutputResponse,
-    ErrorResponse,
-    SetupInfo,
-    SetupRequirementsResponse,
-    ToolResponseBase,
-    UserReadiness,
-)
-from .utils import (
-    build_missing_credentials_from_field_info,
-    match_credentials_to_requirements,
-)
-
-logger = logging.getLogger(__name__)
-
-
-class RunBlockTool(BaseTool):
-    """Tool for executing a block and returning its outputs."""
-
-    @property
-    def name(self) -> str:
-        return "run_block"
-
-    @property
-    def description(self) -> str:
-        return (
-            "Execute a specific block with the provided input data. "
-            "IMPORTANT: You MUST call find_block first to get the block's 'id' - "
-            "do NOT guess or make up block IDs. "
-            "Use the 'id' from find_block results and provide input_data "
-            "matching the block's required_inputs."
-        )
-
-    @property
-    def parameters(self) -> dict[str, Any]:
-        return {
-            "type": "object",
-            "properties": {
-                "block_id": {
-                    "type": "string",
-                    "description": (
-                        "The block's 'id' field from find_block results. "
-                        "NEVER guess this - always get it from find_block first."
-                    ),
-                },
-                "input_data": {
-                    "type": "object",
-                    "description": (
-                        "Input values for the block. Use the 'required_inputs' field "
-                        "from find_block to see what fields are needed."
-                    ),
-                },
-            },
-            "required": ["block_id", "input_data"],
-        }
-
-    @property
-    def requires_auth(self) -> bool:
-        return True
-
-    async def _execute(
-        self,
-        user_id: str | None,
-        session: ChatSession,
-        **kwargs,
-    ) -> ToolResponseBase:
-        """Execute a block with the given input data.
-
-        Args:
-            user_id: User ID (required)
-            session: Chat session
-            block_id: Block UUID to execute
-            input_data: Input values for the block
-
-        Returns:
-            BlockOutputResponse: Block execution outputs
-            SetupRequirementsResponse: Missing credentials
-            ErrorResponse: Error message
-        """
-        block_id = kwargs.get("block_id", "").strip()
-        input_data = kwargs.get("input_data", {})
-        session_id = session.session_id
-
-        if not block_id:
-            return ErrorResponse(
-                message="Please provide a block_id",
-                session_id=session_id,
-            )
-
-        if not isinstance(input_data, dict):
-            return ErrorResponse(
-                message="input_data must be an object",
-                session_id=session_id,
-            )
-
-        if not user_id:
-            return ErrorResponse(
-                message="Authentication required",
-                session_id=session_id,
-            )
-
-        # Get the block
-        block = get_block(block_id)
-        if not block:
-            return ErrorResponse(
-                message=f"Block '{block_id}' not found",
-                session_id=session_id,
-            )
-        if block.disabled:
-            return ErrorResponse(
-                message=f"Block '{block_id}' is disabled",
-                session_id=session_id,
-            )
-
-        # Check if block is excluded from CoPilot (graph-only blocks)
-        if (
-            block.block_type in COPILOT_EXCLUDED_BLOCK_TYPES
-            or block.id in COPILOT_EXCLUDED_BLOCK_IDS
-        ):
-            return ErrorResponse(
-                message=(
-                    f"Block '{block.name}' cannot be run directly in CoPilot. "
-                    "This block is designed for use within graphs only."
-                ),
-                session_id=session_id,
-            )
-
-        logger.info(f"Executing block {block.name} ({block_id}) for user {user_id}")
-
-        creds_manager = IntegrationCredentialsManager()
-        matched_credentials, missing_credentials = (
-            await self._resolve_block_credentials(user_id, block, input_data)
-        )
-
-        if missing_credentials:
-            # Return setup requirements response with missing credentials
-            credentials_fields_info = block.input_schema.get_credentials_fields_info()
-            missing_creds_dict = build_missing_credentials_from_field_info(
-                credentials_fields_info, set(matched_credentials.keys())
-            )
-            missing_creds_list = list(missing_creds_dict.values())
-
-            return SetupRequirementsResponse(
-                message=(
-                    f"Block '{block.name}' requires credentials that are not configured. "
-                    "Please set up the required credentials before running this block."
-                ),
-                session_id=session_id,
-                setup_info=SetupInfo(
-                    agent_id=block_id,
-                    agent_name=block.name,
-                    user_readiness=UserReadiness(
-                        has_all_credentials=False,
-                        missing_credentials=missing_creds_dict,
-                        ready_to_run=False,
-                    ),
-                    requirements={
-                        "credentials": missing_creds_list,
-                        "inputs": self._get_inputs_list(block),
-                        "execution_modes": ["immediate"],
-                    },
-                ),
-                graph_id=None,
-                graph_version=None,
-            )
-
-        try:
-            # Get or create user's workspace for CoPilot file operations
-            workspace = await get_or_create_workspace(user_id)
-
-            # Generate synthetic IDs for CoPilot context
-            # Each chat session is treated as its own agent with one continuous run
-            # This means:
-            # - graph_id (agent) = session (memories scoped to session when limit_to_agent=True)
-            # - graph_exec_id (run) = session (memories scoped to session when limit_to_run=True)
-            # - node_exec_id = unique per block execution
-            synthetic_graph_id = f"copilot-session-{session.session_id}"
-            synthetic_graph_exec_id = f"copilot-session-{session.session_id}"
-            synthetic_node_id = f"copilot-node-{block_id}"
-            synthetic_node_exec_id = (
-                f"copilot-{session.session_id}-{uuid.uuid4().hex[:8]}"
-            )
-
-            # Create unified execution context with all required fields
-            execution_context = ExecutionContext(
-                # Execution identity
-                user_id=user_id,
-                graph_id=synthetic_graph_id,
-                graph_exec_id=synthetic_graph_exec_id,
-                graph_version=1,  # Versions are 1-indexed
-                node_id=synthetic_node_id,
-                node_exec_id=synthetic_node_exec_id,
-                # Workspace with session scoping
-                workspace_id=workspace.id,
-                session_id=session.session_id,
-            )
-
-            # Prepare kwargs for block execution
-            # Keep individual kwargs for backwards compatibility with existing blocks
-            exec_kwargs: dict[str, Any] = {
-                "user_id": user_id,
-                "execution_context": execution_context,
-                # Legacy: individual kwargs for blocks not yet using execution_context
-                "workspace_id": workspace.id,
-                "graph_exec_id": synthetic_graph_exec_id,
-                "node_exec_id": synthetic_node_exec_id,
-                "node_id": synthetic_node_id,
-                "graph_version": 1,  # Versions are 1-indexed
-                "graph_id": synthetic_graph_id,
-            }
-
-            for field_name, cred_meta in matched_credentials.items():
-                # Inject metadata into input_data (for validation)
-                if field_name not in input_data:
-                    input_data[field_name] = cred_meta.model_dump()
-
-                # Fetch actual credentials and pass as kwargs (for execution)
-                actual_credentials = await creds_manager.get(
-                    user_id, cred_meta.id, lock=False
-                )
-                if actual_credentials:
-                    exec_kwargs[field_name] = actual_credentials
-                else:
-                    return ErrorResponse(
-                        message=f"Failed to retrieve credentials for {field_name}",
-                        session_id=session_id,
-                    )
-
-            # Execute the block and collect outputs
-            outputs: dict[str, list[Any]] = defaultdict(list)
-            async for output_name, output_data in block.execute(
-                input_data,
-                **exec_kwargs,
-            ):
-                outputs[output_name].append(output_data)
-
-            return BlockOutputResponse(
-                message=f"Block '{block.name}' executed successfully",
-                block_id=block_id,
-                block_name=block.name,
-                outputs=dict(outputs),
-                success=True,
-                session_id=session_id,
-            )
-
-        except BlockError as e:
-            logger.warning(f"Block execution failed: {e}")
-            return ErrorResponse(
-                message=f"Block execution failed: {e}",
-                error=str(e),
-                session_id=session_id,
-            )
-        except Exception as e:
-            logger.error(f"Unexpected error executing block: {e}", exc_info=True)
-            return ErrorResponse(
-                message=f"Failed to execute block: {str(e)}",
-                error=str(e),
-                session_id=session_id,
-            )
-
-    async def _resolve_block_credentials(
-        self,
-        user_id: str,
-        block: AnyBlockSchema,
-        input_data: dict[str, Any] | None = None,
-    ) -> tuple[dict[str, CredentialsMetaInput], list[CredentialsMetaInput]]:
-        """
-        Resolve credentials for a block by matching user's available credentials.
-
-        Args:
-            user_id: User ID
-            block: Block to resolve credentials for
-            input_data: Input data for the block (used to determine provider via discriminator)
-
-        Returns:
-            tuple of (matched_credentials, missing_credentials) - matched credentials
-            are used for block execution, missing ones indicate setup requirements.
-        """
-        input_data = input_data or {}
-        requirements = self._resolve_discriminated_credentials(block, input_data)
-
-        if not requirements:
-            return {}, []
-
-        return await match_credentials_to_requirements(user_id, requirements)
-
-    def _get_inputs_list(self, block: AnyBlockSchema) -> list[dict[str, Any]]:
-        """Extract non-credential inputs from block schema."""
-        schema = block.input_schema.jsonschema()
-        credentials_fields = set(block.input_schema.get_credentials_fields().keys())
-        return get_inputs_from_schema(schema, exclude_fields=credentials_fields)
-
-    def _resolve_discriminated_credentials(
-        self,
-        block: AnyBlockSchema,
-        input_data: dict[str, Any],
-    ) -> dict[str, CredentialsFieldInfo]:
-        """Resolve credential requirements, applying discriminator logic where needed."""
-        credentials_fields_info = block.input_schema.get_credentials_fields_info()
-        if not credentials_fields_info:
-            return {}
-
-        resolved: dict[str, CredentialsFieldInfo] = {}
-
-        for field_name, field_info in credentials_fields_info.items():
-            effective_field_info = field_info
-
-            if field_info.discriminator and field_info.discriminator_mapping:
-                discriminator_value = input_data.get(field_info.discriminator)
-                if discriminator_value is None:
-                    field = block.input_schema.model_fields.get(
-                        field_info.discriminator
-                    )
-                    if field and field.default is not PydanticUndefined:
-                        discriminator_value = field.default
-
-                if (
-                    discriminator_value
-                    and discriminator_value in field_info.discriminator_mapping
-                ):
-                    effective_field_info = field_info.discriminate(discriminator_value)
-                    # For host-scoped credentials, add the discriminator value
-                    # (e.g., URL) so _credential_is_for_host can match it
-                    effective_field_info.discriminator_values.add(discriminator_value)
-                    logger.debug(
-                        f"Discriminated provider for {field_name}: "
-                        f"{discriminator_value} -> {effective_field_info.provider}"
-                    )
-
-            resolved[field_name] = effective_field_info
-
-        return resolved
--- a/autogpt_platform/backend/backend/api/features/chat/tools/run_block_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/run_block_test.py
@@ -1,106 +0,0 @@
-"""Tests for block execution guards in RunBlockTool."""
-
-from unittest.mock import MagicMock, patch
-
-import pytest
-
-from backend.api.features.chat.tools.models import ErrorResponse
-from backend.api.features.chat.tools.run_block import RunBlockTool
-from backend.data.block import BlockType
-
-from ._test_data import make_session
-
-_TEST_USER_ID = "test-user-run-block"
-
-
-def make_mock_block(
-    block_id: str, name: str, block_type: BlockType, disabled: bool = False
-):
-    """Create a mock block for testing."""
-    mock = MagicMock()
-    mock.id = block_id
-    mock.name = name
-    mock.block_type = block_type
-    mock.disabled = disabled
-    mock.input_schema = MagicMock()
-    mock.input_schema.jsonschema.return_value = {"properties": {}, "required": []}
-    mock.input_schema.get_credentials_fields_info.return_value = []
-    return mock
-
-
-class TestRunBlockFiltering:
-    """Tests for block execution guards in RunBlockTool."""
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_excluded_block_type_returns_error(self):
-        """Attempting to execute a block with excluded BlockType returns error."""
-        session = make_session(user_id=_TEST_USER_ID)
-
-        input_block = make_mock_block("input-block-id", "Input Block", BlockType.INPUT)
-
-        with patch(
-            "backend.api.features.chat.tools.run_block.get_block",
-            return_value=input_block,
-        ):
-            tool = RunBlockTool()
-            response = await tool._execute(
-                user_id=_TEST_USER_ID,
-                session=session,
-                block_id="input-block-id",
-                input_data={},
-            )
-
-        assert isinstance(response, ErrorResponse)
-        assert "cannot be run directly in CoPilot" in response.message
-        assert "designed for use within graphs only" in response.message
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_excluded_block_id_returns_error(self):
-        """Attempting to execute SmartDecisionMakerBlock returns error."""
-        session = make_session(user_id=_TEST_USER_ID)
-
-        smart_decision_id = "3b191d9f-356f-482d-8238-ba04b6d18381"
-        smart_block = make_mock_block(
-            smart_decision_id, "Smart Decision Maker", BlockType.STANDARD
-        )
-
-        with patch(
-            "backend.api.features.chat.tools.run_block.get_block",
-            return_value=smart_block,
-        ):
-            tool = RunBlockTool()
-            response = await tool._execute(
-                user_id=_TEST_USER_ID,
-                session=session,
-                block_id=smart_decision_id,
-                input_data={},
-            )
-
-        assert isinstance(response, ErrorResponse)
-        assert "cannot be run directly in CoPilot" in response.message
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_non_excluded_block_passes_guard(self):
-        """Non-excluded blocks pass the filtering guard (may fail later for other reasons)."""
-        session = make_session(user_id=_TEST_USER_ID)
-
-        standard_block = make_mock_block(
-            "standard-id", "HTTP Request", BlockType.STANDARD
-        )
-
-        with patch(
-            "backend.api.features.chat.tools.run_block.get_block",
-            return_value=standard_block,
-        ):
-            tool = RunBlockTool()
-            response = await tool._execute(
-                user_id=_TEST_USER_ID,
-                session=session,
-                block_id="standard-id",
-                input_data={},
-            )
-
-        # Should NOT be an ErrorResponse about CoPilot exclusion
-        # (may be other errors like missing credentials, but not the exclusion guard)
-        if isinstance(response, ErrorResponse):
-            assert "cannot be run directly in CoPilot" not in response.message
--- a/autogpt_platform/backend/backend/api/features/chat/tools/workspace_files.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/workspace_files.py
@@ -1,620 +0,0 @@
-"""CoPilot tools for workspace file operations."""
-
-import base64
-import logging
-from typing import Any, Optional
-
-from pydantic import BaseModel
-
-from backend.api.features.chat.model import ChatSession
-from backend.data.workspace import get_or_create_workspace
-from backend.util.settings import Config
-from backend.util.virus_scanner import scan_content_safe
-from backend.util.workspace import WorkspaceManager
-
-from .base import BaseTool
-from .models import ErrorResponse, ResponseType, ToolResponseBase
-
-logger = logging.getLogger(__name__)
-
-
-class WorkspaceFileInfoData(BaseModel):
-    """Data model for workspace file information (not a response itself)."""
-
-    file_id: str
-    name: str
-    path: str
-    mime_type: str
-    size_bytes: int
-
-
-class WorkspaceFileListResponse(ToolResponseBase):
-    """Response containing list of workspace files."""
-
-    type: ResponseType = ResponseType.WORKSPACE_FILE_LIST
-    files: list[WorkspaceFileInfoData]
-    total_count: int
-
-
-class WorkspaceFileContentResponse(ToolResponseBase):
-    """Response containing workspace file content (legacy, for small text files)."""
-
-    type: ResponseType = ResponseType.WORKSPACE_FILE_CONTENT
-    file_id: str
-    name: str
-    path: str
-    mime_type: str
-    content_base64: str
-
-
-class WorkspaceFileMetadataResponse(ToolResponseBase):
-    """Response containing workspace file metadata and download URL (prevents context bloat)."""
-
-    type: ResponseType = ResponseType.WORKSPACE_FILE_METADATA
-    file_id: str
-    name: str
-    path: str
-    mime_type: str
-    size_bytes: int
-    download_url: str
-    preview: str | None = None  # First 500 chars for text files
-
-
-class WorkspaceWriteResponse(ToolResponseBase):
-    """Response after writing a file to workspace."""
-
-    type: ResponseType = ResponseType.WORKSPACE_FILE_WRITTEN
-    file_id: str
-    name: str
-    path: str
-    size_bytes: int
-
-
-class WorkspaceDeleteResponse(ToolResponseBase):
-    """Response after deleting a file from workspace."""
-
-    type: ResponseType = ResponseType.WORKSPACE_FILE_DELETED
-    file_id: str
-    success: bool
-
-
-class ListWorkspaceFilesTool(BaseTool):
-    """Tool for listing files in user's workspace."""
-
-    @property
-    def name(self) -> str:
-        return "list_workspace_files"
-
-    @property
-    def description(self) -> str:
-        return (
-            "List files in the user's workspace. "
-            "Returns file names, paths, sizes, and metadata. "
-            "Optionally filter by path prefix."
-        )
-
-    @property
-    def parameters(self) -> dict[str, Any]:
-        return {
-            "type": "object",
-            "properties": {
-                "path_prefix": {
-                    "type": "string",
-                    "description": (
-                        "Optional path prefix to filter files "
-                        "(e.g., '/documents/' to list only files in documents folder). "
-                        "By default, only files from the current session are listed."
-                    ),
-                },
-                "limit": {
-                    "type": "integer",
-                    "description": "Maximum number of files to return (default 50, max 100)",
-                    "minimum": 1,
-                    "maximum": 100,
-                },
-                "include_all_sessions": {
-                    "type": "boolean",
-                    "description": (
-                        "If true, list files from all sessions. "
-                        "Default is false (only current session's files)."
-                    ),
-                },
-            },
-            "required": [],
-        }
-
-    @property
-    def requires_auth(self) -> bool:
-        return True
-
-    async def _execute(
-        self,
-        user_id: str | None,
-        session: ChatSession,
-        **kwargs,
-    ) -> ToolResponseBase:
-        session_id = session.session_id
-
-        if not user_id:
-            return ErrorResponse(
-                message="Authentication required",
-                session_id=session_id,
-            )
-
-        path_prefix: Optional[str] = kwargs.get("path_prefix")
-        limit = min(kwargs.get("limit", 50), 100)
-        include_all_sessions: bool = kwargs.get("include_all_sessions", False)
-
-        try:
-            workspace = await get_or_create_workspace(user_id)
-            # Pass session_id for session-scoped file access
-            manager = WorkspaceManager(user_id, workspace.id, session_id)
-
-            files = await manager.list_files(
-                path=path_prefix,
-                limit=limit,
-                include_all_sessions=include_all_sessions,
-            )
-            total = await manager.get_file_count(
-                path=path_prefix,
-                include_all_sessions=include_all_sessions,
-            )
-
-            file_infos = [
-                WorkspaceFileInfoData(
-                    file_id=f.id,
-                    name=f.name,
-                    path=f.path,
-                    mime_type=f.mimeType,
-                    size_bytes=f.sizeBytes,
-                )
-                for f in files
-            ]
-
-            scope_msg = "all sessions" if include_all_sessions else "current session"
-            return WorkspaceFileListResponse(
-                files=file_infos,
-                total_count=total,
-                message=f"Found {len(files)} files in workspace ({scope_msg})",
-                session_id=session_id,
-            )
-
-        except Exception as e:
-            logger.error(f"Error listing workspace files: {e}", exc_info=True)
-            return ErrorResponse(
-                message=f"Failed to list workspace files: {str(e)}",
-                error=str(e),
-                session_id=session_id,
-            )
-
-
-class ReadWorkspaceFileTool(BaseTool):
-    """Tool for reading file content from workspace."""
-
-    # Size threshold for returning full content vs metadata+URL
-    # Files larger than this return metadata with download URL to prevent context bloat
-    MAX_INLINE_SIZE_BYTES = 32 * 1024  # 32KB
-    # Preview size for text files
-    PREVIEW_SIZE = 500
-
-    @property
-    def name(self) -> str:
-        return "read_workspace_file"
-
-    @property
-    def description(self) -> str:
-        return (
-            "Read a file from the user's workspace. "
-            "Specify either file_id or path to identify the file. "
-            "For small text files, returns content directly. "
-            "For large or binary files, returns metadata and a download URL. "
-            "Paths are scoped to the current session by default. "
-            "Use /sessions/<session_id>/... for cross-session access."
-        )
-
-    @property
-    def parameters(self) -> dict[str, Any]:
-        return {
-            "type": "object",
-            "properties": {
-                "file_id": {
-                    "type": "string",
-                    "description": "The file's unique ID (from list_workspace_files)",
-                },
-                "path": {
-                    "type": "string",
-                    "description": (
-                        "The virtual file path (e.g., '/documents/report.pdf'). "
-                        "Scoped to current session by default."
-                    ),
-                },
-                "force_download_url": {
-                    "type": "boolean",
-                    "description": (
-                        "If true, always return metadata+URL instead of inline content. "
-                        "Default is false (auto-selects based on file size/type)."
-                    ),
-                },
-            },
-            "required": [],  # At least one must be provided
-        }
-
-    @property
-    def requires_auth(self) -> bool:
-        return True
-
-    def _is_text_mime_type(self, mime_type: str) -> bool:
-        """Check if the MIME type is a text-based type."""
-        text_types = [
-            "text/",
-            "application/json",
-            "application/xml",
-            "application/javascript",
-            "application/x-python",
-            "application/x-sh",
-        ]
-        return any(mime_type.startswith(t) for t in text_types)
-
-    async def _execute(
-        self,
-        user_id: str | None,
-        session: ChatSession,
-        **kwargs,
-    ) -> ToolResponseBase:
-        session_id = session.session_id
-
-        if not user_id:
-            return ErrorResponse(
-                message="Authentication required",
-                session_id=session_id,
-            )
-
-        file_id: Optional[str] = kwargs.get("file_id")
-        path: Optional[str] = kwargs.get("path")
-        force_download_url: bool = kwargs.get("force_download_url", False)
-
-        if not file_id and not path:
-            return ErrorResponse(
-                message="Please provide either file_id or path",
-                session_id=session_id,
-            )
-
-        try:
-            workspace = await get_or_create_workspace(user_id)
-            # Pass session_id for session-scoped file access
-            manager = WorkspaceManager(user_id, workspace.id, session_id)
-
-            # Get file info
-            if file_id:
-                file_info = await manager.get_file_info(file_id)
-                if file_info is None:
-                    return ErrorResponse(
-                        message=f"File not found: {file_id}",
-                        session_id=session_id,
-                    )
-                target_file_id = file_id
-            else:
-                # path is guaranteed to be non-None here due to the check above
-                assert path is not None
-                file_info = await manager.get_file_info_by_path(path)
-                if file_info is None:
-                    return ErrorResponse(
-                        message=f"File not found at path: {path}",
-                        session_id=session_id,
-                    )
-                target_file_id = file_info.id
-
-            # Decide whether to return inline content or metadata+URL
-            is_small_file = file_info.sizeBytes <= self.MAX_INLINE_SIZE_BYTES
-            is_text_file = self._is_text_mime_type(file_info.mimeType)
-
-            # Return inline content for small text files (unless force_download_url)
-            if is_small_file and is_text_file and not force_download_url:
-                content = await manager.read_file_by_id(target_file_id)
-                content_b64 = base64.b64encode(content).decode("utf-8")
-
-                return WorkspaceFileContentResponse(
-                    file_id=file_info.id,
-                    name=file_info.name,
-                    path=file_info.path,
-                    mime_type=file_info.mimeType,
-                    content_base64=content_b64,
-                    message=f"Successfully read file: {file_info.name}",
-                    session_id=session_id,
-                )
-
-            # Return metadata + workspace:// reference for large or binary files
-            # This prevents context bloat (100KB file = ~133KB as base64)
-            # Use workspace:// format so frontend urlTransform can add proxy prefix
-            download_url = f"workspace://{target_file_id}"
-
-            # Generate preview for text files
-            preview: str | None = None
-            if is_text_file:
-                try:
-                    content = await manager.read_file_by_id(target_file_id)
-                    preview_text = content[: self.PREVIEW_SIZE].decode(
-                        "utf-8", errors="replace"
-                    )
-                    if len(content) > self.PREVIEW_SIZE:
-                        preview_text += "..."
-                    preview = preview_text
-                except Exception:
-                    pass  # Preview is optional
-
-            return WorkspaceFileMetadataResponse(
-                file_id=file_info.id,
-                name=file_info.name,
-                path=file_info.path,
-                mime_type=file_info.mimeType,
-                size_bytes=file_info.sizeBytes,
-                download_url=download_url,
-                preview=preview,
-                message=f"File: {file_info.name} ({file_info.sizeBytes} bytes). Use download_url to retrieve content.",
-                session_id=session_id,
-            )
-
-        except FileNotFoundError as e:
-            return ErrorResponse(
-                message=str(e),
-                session_id=session_id,
-            )
-        except Exception as e:
-            logger.error(f"Error reading workspace file: {e}", exc_info=True)
-            return ErrorResponse(
-                message=f"Failed to read workspace file: {str(e)}",
-                error=str(e),
-                session_id=session_id,
-            )
-
-
-class WriteWorkspaceFileTool(BaseTool):
-    """Tool for writing files to workspace."""
-
-    @property
-    def name(self) -> str:
-        return "write_workspace_file"
-
-    @property
-    def description(self) -> str:
-        return (
-            "Write or create a file in the user's workspace. "
-            "Provide the content as a base64-encoded string. "
-            f"Maximum file size is {Config().max_file_size_mb}MB. "
-            "Files are saved to the current session's folder by default. "
-            "Use /sessions/<session_id>/... for cross-session access."
-        )
-
-    @property
-    def parameters(self) -> dict[str, Any]:
-        return {
-            "type": "object",
-            "properties": {
-                "filename": {
-                    "type": "string",
-                    "description": "Name for the file (e.g., 'report.pdf')",
-                },
-                "content_base64": {
-                    "type": "string",
-                    "description": "Base64-encoded file content",
-                },
-                "path": {
-                    "type": "string",
-                    "description": (
-                        "Optional virtual path where to save the file "
-                        "(e.g., '/documents/report.pdf'). "
-                        "Defaults to '/{filename}'. Scoped to current session."
-                    ),
-                },
-                "mime_type": {
-                    "type": "string",
-                    "description": (
-                        "Optional MIME type of the file. "
-                        "Auto-detected from filename if not provided."
-                    ),
-                },
-                "overwrite": {
-                    "type": "boolean",
-                    "description": "Whether to overwrite if file exists at path (default: false)",
-                },
-            },
-            "required": ["filename", "content_base64"],
-        }
-
-    @property
-    def requires_auth(self) -> bool:
-        return True
-
-    async def _execute(
-        self,
-        user_id: str | None,
-        session: ChatSession,
-        **kwargs,
-    ) -> ToolResponseBase:
-        session_id = session.session_id
-
-        if not user_id:
-            return ErrorResponse(
-                message="Authentication required",
-                session_id=session_id,
-            )
-
-        filename: str = kwargs.get("filename", "")
-        content_b64: str = kwargs.get("content_base64", "")
-        path: Optional[str] = kwargs.get("path")
-        mime_type: Optional[str] = kwargs.get("mime_type")
-        overwrite: bool = kwargs.get("overwrite", False)
-
-        if not filename:
-            return ErrorResponse(
-                message="Please provide a filename",
-                session_id=session_id,
-            )
-
-        if not content_b64:
-            return ErrorResponse(
-                message="Please provide content_base64",
-                session_id=session_id,
-            )
-
-        # Decode content
-        try:
-            content = base64.b64decode(content_b64)
-        except Exception:
-            return ErrorResponse(
-                message="Invalid base64-encoded content",
-                session_id=session_id,
-            )
-
-        # Check size
-        max_file_size = Config().max_file_size_mb * 1024 * 1024
-        if len(content) > max_file_size:
-            return ErrorResponse(
-                message=f"File too large. Maximum size is {Config().max_file_size_mb}MB",
-                session_id=session_id,
-            )
-
-        try:
-            # Virus scan
-            await scan_content_safe(content, filename=filename)
-
-            workspace = await get_or_create_workspace(user_id)
-            # Pass session_id for session-scoped file access
-            manager = WorkspaceManager(user_id, workspace.id, session_id)
-
-            file_record = await manager.write_file(
-                content=content,
-                filename=filename,
-                path=path,
-                mime_type=mime_type,
-                overwrite=overwrite,
-            )
-
-            return WorkspaceWriteResponse(
-                file_id=file_record.id,
-                name=file_record.name,
-                path=file_record.path,
-                size_bytes=file_record.sizeBytes,
-                message=f"Successfully wrote file: {file_record.name}",
-                session_id=session_id,
-            )
-
-        except ValueError as e:
-            return ErrorResponse(
-                message=str(e),
-                session_id=session_id,
-            )
-        except Exception as e:
-            logger.error(f"Error writing workspace file: {e}", exc_info=True)
-            return ErrorResponse(
-                message=f"Failed to write workspace file: {str(e)}",
-                error=str(e),
-                session_id=session_id,
-            )
-
-
-class DeleteWorkspaceFileTool(BaseTool):
-    """Tool for deleting files from workspace."""
-
-    @property
-    def name(self) -> str:
-        return "delete_workspace_file"
-
-    @property
-    def description(self) -> str:
-        return (
-            "Delete a file from the user's workspace. "
-            "Specify either file_id or path to identify the file. "
-            "Paths are scoped to the current session by default. "
-            "Use /sessions/<session_id>/... for cross-session access."
-        )
-
-    @property
-    def parameters(self) -> dict[str, Any]:
-        return {
-            "type": "object",
-            "properties": {
-                "file_id": {
-                    "type": "string",
-                    "description": "The file's unique ID (from list_workspace_files)",
-                },
-                "path": {
-                    "type": "string",
-                    "description": (
-                        "The virtual file path (e.g., '/documents/report.pdf'). "
-                        "Scoped to current session by default."
-                    ),
-                },
-            },
-            "required": [],  # At least one must be provided
-        }
-
-    @property
-    def requires_auth(self) -> bool:
-        return True
-
-    async def _execute(
-        self,
-        user_id: str | None,
-        session: ChatSession,
-        **kwargs,
-    ) -> ToolResponseBase:
-        session_id = session.session_id
-
-        if not user_id:
-            return ErrorResponse(
-                message="Authentication required",
-                session_id=session_id,
-            )
-
-        file_id: Optional[str] = kwargs.get("file_id")
-        path: Optional[str] = kwargs.get("path")
-
-        if not file_id and not path:
-            return ErrorResponse(
-                message="Please provide either file_id or path",
-                session_id=session_id,
-            )
-
-        try:
-            workspace = await get_or_create_workspace(user_id)
-            # Pass session_id for session-scoped file access
-            manager = WorkspaceManager(user_id, workspace.id, session_id)
-
-            # Determine the file_id to delete
-            target_file_id: str
-            if file_id:
-                target_file_id = file_id
-            else:
-                # path is guaranteed to be non-None here due to the check above
-                assert path is not None
-                file_info = await manager.get_file_info_by_path(path)
-                if file_info is None:
-                    return ErrorResponse(
-                        message=f"File not found at path: {path}",
-                        session_id=session_id,
-                    )
-                target_file_id = file_info.id
-
-            success = await manager.delete_file(target_file_id)
-
-            if not success:
-                return ErrorResponse(
-                    message=f"File not found: {target_file_id}",
-                    session_id=session_id,
-                )
-
-            return WorkspaceDeleteResponse(
-                file_id=target_file_id,
-                success=True,
-                message="File deleted successfully",
-                session_id=session_id,
-            )
-
-        except Exception as e:
-            logger.error(f"Error deleting workspace file: {e}", exc_info=True)
-            return ErrorResponse(
-                message=f"Failed to delete workspace file: {str(e)}",
-                error=str(e),
-                session_id=session_id,
-            )
--- a/autogpt_platform/backend/backend/api/features/executions/review/review_routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/executions/review/review_routes_test.py
@@ -638,7 +638,7 @@ async def test_process_review_action_auto_approve_creates_auto_approval_records(

    # Mock get_node_executions to return node_id mapping
    mock_get_node_executions = mocker.patch(
-        "backend.data.execution.get_node_executions"
+        "backend.api.features.executions.review.routes.get_node_executions"
    )
    mock_node_exec = mocker.Mock(spec=NodeExecutionResult)
    mock_node_exec.node_exec_id = "test_node_123"
@@ -936,7 +936,7 @@ async def test_process_review_action_auto_approve_only_applies_to_approved_revie

    # Mock get_node_executions to return node_id mapping
    mock_get_node_executions = mocker.patch(
-        "backend.data.execution.get_node_executions"
+        "backend.api.features.executions.review.routes.get_node_executions"
    )
    mock_node_exec = mocker.Mock(spec=NodeExecutionResult)
    mock_node_exec.node_exec_id = "node_exec_approved"
@@ -1148,7 +1148,7 @@ async def test_process_review_action_per_review_auto_approve_granularity(

    # Mock get_node_executions to return batch node data
    mock_get_node_executions = mocker.patch(
-        "backend.data.execution.get_node_executions"
+        "backend.api.features.executions.review.routes.get_node_executions"
    )
    # Create mock node executions for each review
    mock_node_execs = []
--- a/autogpt_platform/backend/backend/api/features/executions/review/routes.py
+++ b/autogpt_platform/backend/backend/api/features/executions/review/routes.py
@@ -6,10 +6,15 @@ import autogpt_libs.auth as autogpt_auth_lib
 from fastapi import APIRouter, HTTPException, Query, Security, status
 from prisma.enums import ReviewStatus

+from backend.copilot.constants import (
+    is_copilot_synthetic_id,
+    parse_node_id_from_exec_id,
+)
 from backend.data.execution import (
    ExecutionContext,
    ExecutionStatus,
    get_graph_execution_meta,
+    get_node_executions,
 )
 from backend.data.graph import get_graph_settings
 from backend.data.human_review import (
@@ -22,6 +27,7 @@ from backend.data.human_review import (
 )
 from backend.data.model import USER_TIMEZONE_NOT_SET
 from backend.data.user import get_user_by_id
+from backend.data.workspace import get_or_create_workspace
 from backend.executor.utils import add_graph_execution

 from .model import PendingHumanReviewModel, ReviewRequest, ReviewResponse
@@ -35,6 +41,38 @@ router = APIRouter(
 )


+async def _resolve_node_ids(
+    node_exec_ids: list[str],
+    graph_exec_id: str,
+    is_copilot: bool,
+) -> dict[str, str]:
+    """Resolve node_exec_id -> node_id for auto-approval records.
+
+    CoPilot synthetic IDs encode node_id in the format "{node_id}:{random}".
+    Graph executions look up node_id from NodeExecution records.
+    """
+    if not node_exec_ids:
+        return {}
+
+    if is_copilot:
+        return {neid: parse_node_id_from_exec_id(neid) for neid in node_exec_ids}
+
+    node_execs = await get_node_executions(
+        graph_exec_id=graph_exec_id, include_exec_data=False
+    )
+    node_exec_map = {ne.node_exec_id: ne.node_id for ne in node_execs}
+
+    result: dict[str, str] = {}
+    for neid in node_exec_ids:
+        if neid in node_exec_map:
+            result[neid] = node_exec_map[neid]
+        else:
+            logger.error(
+                f"Failed to resolve node_id for {neid}: Node execution not found."
+            )
+    return result
+
+
@router.get(
    "/pending",
    summary="Get Pending Reviews",
@@ -109,14 +147,16 @@ async def list_pending_reviews_for_execution(
    """

    # Verify user owns the graph execution before returning reviews
-    graph_exec = await get_graph_execution_meta(
-        user_id=user_id, execution_id=graph_exec_id
-    )
-    if not graph_exec:
-        raise HTTPException(
-            status_code=status.HTTP_404_NOT_FOUND,
-            detail=f"Graph execution #{graph_exec_id} not found",
+    # (CoPilot synthetic IDs don't have graph execution records)
+    if not is_copilot_synthetic_id(graph_exec_id):
+        graph_exec = await get_graph_execution_meta(
+            user_id=user_id, execution_id=graph_exec_id
        )
+        if not graph_exec:
+            raise HTTPException(
+                status_code=status.HTTP_404_NOT_FOUND,
+                detail=f"Graph execution #{graph_exec_id} not found",
+            )

    return await get_pending_reviews_for_execution(graph_exec_id, user_id)

@@ -159,30 +199,26 @@ async def process_review_action(
        )

    graph_exec_id = next(iter(graph_exec_ids))
+    is_copilot = is_copilot_synthetic_id(graph_exec_id)

-    # Validate execution status before processing reviews
-    graph_exec_meta = await get_graph_execution_meta(
-        user_id=user_id, execution_id=graph_exec_id
-    )
-
-    if not graph_exec_meta:
-        raise HTTPException(
-            status_code=status.HTTP_404_NOT_FOUND,
-            detail=f"Graph execution #{graph_exec_id} not found",
-        )
-
-    # Only allow processing reviews if execution is paused for review
-    # or incomplete (partial execution with some reviews already processed)
-    if graph_exec_meta.status not in (
-        ExecutionStatus.REVIEW,
-        ExecutionStatus.INCOMPLETE,
-    ):
-        raise HTTPException(
-            status_code=status.HTTP_409_CONFLICT,
-            detail=f"Cannot process reviews while execution status is {graph_exec_meta.status}. "
-            f"Reviews can only be processed when execution is paused (REVIEW status). "
-            f"Current status: {graph_exec_meta.status}",
+    # Validate execution status for graph executions (skip for CoPilot synthetic IDs)
+    if not is_copilot:
+        graph_exec_meta = await get_graph_execution_meta(
+            user_id=user_id, execution_id=graph_exec_id
        )
+        if not graph_exec_meta:
+            raise HTTPException(
+                status_code=status.HTTP_404_NOT_FOUND,
+                detail=f"Graph execution #{graph_exec_id} not found",
+            )
+        if graph_exec_meta.status not in (
+            ExecutionStatus.REVIEW,
+            ExecutionStatus.INCOMPLETE,
+        ):
+            raise HTTPException(
+                status_code=status.HTTP_409_CONFLICT,
+                detail=f"Cannot process reviews while execution status is {graph_exec_meta.status}",
+            )

    # Build review decisions map and track which reviews requested auto-approval
    # Auto-approved reviews use original data (no modifications allowed)
@@ -235,7 +271,7 @@ async def process_review_action(
            )
            return (node_id, False)

-    # Collect node_exec_ids that need auto-approval
+    # Collect node_exec_ids that need auto-approval and resolve their node_ids
    node_exec_ids_needing_auto_approval = [
        node_exec_id
        for node_exec_id, review_result in updated_reviews.items()
@@ -243,29 +279,16 @@ async def process_review_action(
        and auto_approve_requests.get(node_exec_id, False)
    ]

-    # Batch-fetch node executions to get node_ids
+    node_id_map = await _resolve_node_ids(
+        node_exec_ids_needing_auto_approval, graph_exec_id, is_copilot
+    )
+
+    # Deduplicate by node_id — one auto-approval per node
    nodes_needing_auto_approval: dict[str, Any] = {}
-    if node_exec_ids_needing_auto_approval:
-        from backend.data.execution import get_node_executions
-
-        node_execs = await get_node_executions(
-            graph_exec_id=graph_exec_id, include_exec_data=False
-        )
-        node_exec_map = {node_exec.node_exec_id: node_exec for node_exec in node_execs}
-
-        for node_exec_id in node_exec_ids_needing_auto_approval:
-            node_exec = node_exec_map.get(node_exec_id)
-            if node_exec:
-                review_result = updated_reviews[node_exec_id]
-                # Use the first approved review for this node (deduplicate by node_id)
-                if node_exec.node_id not in nodes_needing_auto_approval:
-                    nodes_needing_auto_approval[node_exec.node_id] = review_result
-            else:
-                logger.error(
-                    f"Failed to create auto-approval record for {node_exec_id}: "
-                    f"Node execution not found. This may indicate a race condition "
-                    f"or data inconsistency."
-                )
+    for node_exec_id in node_exec_ids_needing_auto_approval:
+        node_id = node_id_map.get(node_exec_id)
+        if node_id and node_id not in nodes_needing_auto_approval:
+            nodes_needing_auto_approval[node_id] = updated_reviews[node_exec_id]

    # Execute all auto-approval creations in parallel (deduplicated by node_id)
    auto_approval_results = await asyncio.gather(
@@ -280,13 +303,11 @@ async def process_review_action(
    auto_approval_failed_count = 0
    for result in auto_approval_results:
        if isinstance(result, Exception):
-            # Unexpected exception during auto-approval creation
            auto_approval_failed_count += 1
            logger.error(
                f"Unexpected exception during auto-approval creation: {result}"
            )
        elif isinstance(result, tuple) and len(result) == 2 and not result[1]:
-            # Auto-approval creation failed (returned False)
            auto_approval_failed_count += 1

    # Count results
@@ -301,30 +322,31 @@ async def process_review_action(
        if review.status == ReviewStatus.REJECTED
    )

-    # Resume execution only if ALL pending reviews for this execution have been processed
-    if updated_reviews:
+    # Resume graph execution only for real graph executions (not CoPilot)
+    # CoPilot sessions are resumed by the LLM retrying run_block with review_id
+    if not is_copilot and updated_reviews:
        still_has_pending = await has_pending_reviews_for_graph_exec(graph_exec_id)

        if not still_has_pending:
-            # Get the graph_id from any processed review
            first_review = next(iter(updated_reviews.values()))

            try:
-                # Fetch user and settings to build complete execution context
                user = await get_user_by_id(user_id)
                settings = await get_graph_settings(
                    user_id=user_id, graph_id=first_review.graph_id
                )

-                # Preserve user's timezone preference when resuming execution
                user_timezone = (
                    user.timezone if user.timezone != USER_TIMEZONE_NOT_SET else "UTC"
                )

+                workspace = await get_or_create_workspace(user_id)
+
                execution_context = ExecutionContext(
                    human_in_the_loop_safe_mode=settings.human_in_the_loop_safe_mode,
                    sensitive_action_safe_mode=settings.sensitive_action_safe_mode,
                    user_timezone=user_timezone,
+                    workspace_id=workspace.id,
                )

                await add_graph_execution(
--- a/autogpt_platform/backend/backend/api/features/integrations/router.py
+++ b/autogpt_platform/backend/backend/api/features/integrations/router.py
@@ -1,7 +1,7 @@
 import asyncio
 import logging
 from datetime import datetime, timedelta, timezone
-from typing import TYPE_CHECKING, Annotated, List, Literal
+from typing import TYPE_CHECKING, Annotated, Any, List, Literal

 from autogpt_libs.auth import get_user_id
 from fastapi import (
@@ -14,7 +14,7 @@ from fastapi import (
    Security,
    status,
 )
-from pydantic import BaseModel, Field, SecretStr
+from pydantic import BaseModel, Field, SecretStr, model_validator
 from starlette.status import HTTP_500_INTERNAL_SERVER_ERROR, HTTP_502_BAD_GATEWAY

 from backend.api.features.library.db import set_preset_webhook, update_preset
@@ -39,7 +39,11 @@ from backend.data.onboarding import OnboardingStep, complete_onboarding_step
 from backend.data.user import get_user_integrations
 from backend.executor.utils import add_graph_execution
 from backend.integrations.ayrshare import AyrshareClient, SocialPlatform
-from backend.integrations.creds_manager import IntegrationCredentialsManager
+from backend.integrations.credentials_store import provider_matches
+from backend.integrations.creds_manager import (
+    IntegrationCredentialsManager,
+    create_mcp_oauth_handler,
+)
 from backend.integrations.oauth import CREDENTIALS_BY_PROVIDER, HANDLERS_BY_NAME
 from backend.integrations.providers import ProviderName
 from backend.integrations.webhooks import get_webhook_manager
@@ -102,9 +106,37 @@ class CredentialsMetaResponse(BaseModel):
    scopes: list[str] | None
    username: str | None
    host: str | None = Field(
-        default=None, description="Host pattern for host-scoped credentials"
+        default=None,
+        description="Host pattern for host-scoped credentials, or MCP server URL for MCP credentials",
    )

+    @model_validator(mode="before")
+    @classmethod
+    def _normalize_provider(cls, data: Any) -> Any:
+        """Fix ``ProviderName.X`` format from Python 3.13 ``str(Enum)`` bug."""
+        if isinstance(data, dict):
+            prov = data.get("provider", "")
+            if isinstance(prov, str) and prov.startswith("ProviderName."):
+                member = prov.removeprefix("ProviderName.")
+                try:
+                    data = {**data, "provider": ProviderName[member].value}
+                except KeyError:
+                    pass
+        return data
+
+    @staticmethod
+    def get_host(cred: Credentials) -> str | None:
+        """Extract host from credential: HostScoped host or MCP server URL."""
+        if isinstance(cred, HostScopedCredentials):
+            return cred.host
+        if isinstance(cred, OAuth2Credentials) and cred.provider in (
+            ProviderName.MCP,
+            ProviderName.MCP.value,
+            "ProviderName.MCP",
+        ):
+            return (cred.metadata or {}).get("mcp_server_url")
+        return None
+

@router.post("/{provider}/callback", summary="Exchange OAuth code for tokens")
 async def callback(
@@ -179,9 +211,7 @@ async def callback(
        title=credentials.title,
        scopes=credentials.scopes,
        username=credentials.username,
-        host=(
-            credentials.host if isinstance(credentials, HostScopedCredentials) else None
-        ),
+        host=CredentialsMetaResponse.get_host(credentials),
    )


@@ -199,7 +229,7 @@ async def list_credentials(
            title=cred.title,
            scopes=cred.scopes if isinstance(cred, OAuth2Credentials) else None,
            username=cred.username if isinstance(cred, OAuth2Credentials) else None,
-            host=cred.host if isinstance(cred, HostScopedCredentials) else None,
+            host=CredentialsMetaResponse.get_host(cred),
        )
        for cred in credentials
    ]
@@ -222,7 +252,7 @@ async def list_credentials_by_provider(
            title=cred.title,
            scopes=cred.scopes if isinstance(cred, OAuth2Credentials) else None,
            username=cred.username if isinstance(cred, OAuth2Credentials) else None,
-            host=cred.host if isinstance(cred, HostScopedCredentials) else None,
+            host=CredentialsMetaResponse.get_host(cred),
        )
        for cred in credentials
    ]
@@ -322,7 +352,11 @@ async def delete_credentials(

    tokens_revoked = None
    if isinstance(creds, OAuth2Credentials):
-        handler = _get_provider_oauth_handler(request, provider)
+        if provider_matches(provider.value, ProviderName.MCP.value):
+            # MCP uses dynamic per-server OAuth — create handler from metadata
+            handler = create_mcp_oauth_handler(creds)
+        else:
+            handler = _get_provider_oauth_handler(request, provider)
        tokens_revoked = await handler.revoke_tokens(creds)

    return CredentialsDeletionResponse(revoked=tokens_revoked)
--- a/autogpt_platform/backend/backend/api/features/library/db.py
+++ b/autogpt_platform/backend/backend/api/features/library/db.py
--- a/autogpt_platform/backend/backend/api/features/library/db_test.py
+++ b/autogpt_platform/backend/backend/api/features/library/db_test.py
@@ -4,7 +4,6 @@ import prisma.enums
 import prisma.models
 import pytest

-import backend.api.features.store.exceptions
 from backend.data.db import connect
 from backend.data.includes import library_agent_include

@@ -144,6 +143,7 @@ async def test_add_agent_to_library(mocker):
    )

    mock_library_agent = mocker.patch("prisma.models.LibraryAgent.prisma")
+    mock_library_agent.return_value.find_first = mocker.AsyncMock(return_value=None)
    mock_library_agent.return_value.find_unique = mocker.AsyncMock(return_value=None)
    mock_library_agent.return_value.create = mocker.AsyncMock(
        return_value=mock_library_agent_data
@@ -178,7 +178,6 @@ async def test_add_agent_to_library(mocker):
                "agentGraphVersion": 1,
            }
        },
-        include={"AgentGraph": True},
    )
    # Check that create was called with the expected data including settings
    create_call_args = mock_library_agent.return_value.create.call_args
@@ -218,7 +217,7 @@ async def test_add_agent_to_library_not_found(mocker):
    )

    # Call function and verify exception
-    with pytest.raises(backend.api.features.store.exceptions.AgentNotFoundError):
+    with pytest.raises(db.NotFoundError):
        await db.add_store_agent_to_library("version123", "test-user")

    # Verify mock called correctly
--- a/autogpt_platform/backend/backend/api/features/library/exceptions.py
+++ b/autogpt_platform/backend/backend/api/features/library/exceptions.py
@@ -0,0 +1,10 @@
+class FolderValidationError(Exception):
+    """Raised when folder operations fail validation."""
+
+    pass
+
+
+class FolderAlreadyExistsError(FolderValidationError):
+    """Raised when a folder with the same name already exists in the location."""
+
+    pass
--- a/autogpt_platform/backend/backend/api/features/library/model.py
+++ b/autogpt_platform/backend/backend/api/features/library/model.py
@@ -6,9 +6,12 @@ import prisma.enums
 import prisma.models
 import pydantic

-from backend.data.block import BlockInput
 from backend.data.graph import GraphModel, GraphSettings, GraphTriggerInfo
-from backend.data.model import CredentialsMetaInput, is_credentials_field_name
+from backend.data.model import (
+    CredentialsMetaInput,
+    GraphInput,
+    is_credentials_field_name,
+)
 from backend.util.json import loads as json_loads
 from backend.util.models import Pagination

@@ -23,6 +26,95 @@ class LibraryAgentStatus(str, Enum):
    ERROR = "ERROR"


+# === Folder Models ===
+
+
+class LibraryFolder(pydantic.BaseModel):
+    """Represents a folder for organizing library agents."""
+
+    id: str
+    user_id: str
+    name: str
+    icon: str | None = None
+    color: str | None = None
+    parent_id: str | None = None
+    created_at: datetime.datetime
+    updated_at: datetime.datetime
+    agent_count: int = 0  # Direct agents in folder
+    subfolder_count: int = 0  # Direct child folders
+
+    @staticmethod
+    def from_db(
+        folder: prisma.models.LibraryFolder,
+        agent_count: int = 0,
+        subfolder_count: int = 0,
+    ) -> "LibraryFolder":
+        """Factory method that constructs a LibraryFolder from a Prisma model."""
+        return LibraryFolder(
+            id=folder.id,
+            user_id=folder.userId,
+            name=folder.name,
+            icon=folder.icon,
+            color=folder.color,
+            parent_id=folder.parentId,
+            created_at=folder.createdAt,
+            updated_at=folder.updatedAt,
+            agent_count=agent_count,
+            subfolder_count=subfolder_count,
+        )
+
+
+class LibraryFolderTree(LibraryFolder):
+    """Folder with nested children for tree view."""
+
+    children: list["LibraryFolderTree"] = pydantic.Field(default_factory=list)
+
+
+class FolderCreateRequest(pydantic.BaseModel):
+    """Request model for creating a folder."""
+
+    name: str = pydantic.Field(..., min_length=1, max_length=100)
+    icon: str | None = None
+    color: str | None = pydantic.Field(
+        None, pattern=r"^#[0-9A-Fa-f]{6}$", description="Hex color code (#RRGGBB)"
+    )
+    parent_id: str | None = None
+
+
+class FolderUpdateRequest(pydantic.BaseModel):
+    """Request model for updating a folder."""
+
+    name: str | None = pydantic.Field(None, min_length=1, max_length=100)
+    icon: str | None = None
+    color: str | None = None
+
+
+class FolderMoveRequest(pydantic.BaseModel):
+    """Request model for moving a folder to a new parent."""
+
+    target_parent_id: str | None = None  # None = move to root
+
+
+class BulkMoveAgentsRequest(pydantic.BaseModel):
+    """Request model for moving multiple agents to a folder."""
+
+    agent_ids: list[str]
+    folder_id: str | None = None  # None = move to root
+
+
+class FolderListResponse(pydantic.BaseModel):
+    """Response schema for a list of folders."""
+
+    folders: list[LibraryFolder]
+    pagination: Pagination
+
+
+class FolderTreeResponse(pydantic.BaseModel):
+    """Response schema for folder tree structure."""
+
+    tree: list[LibraryFolderTree]
+
+
 class MarketplaceListingCreator(pydantic.BaseModel):
    """Creator information for a marketplace listing."""

@@ -73,7 +165,6 @@ class LibraryAgent(pydantic.BaseModel):
    id: str
    graph_id: str
    graph_version: int
-    owner_user_id: str

    image_url: str | None

@@ -114,9 +205,14 @@ class LibraryAgent(pydantic.BaseModel):
        default_factory=list,
        description="List of recent executions with status, score, and summary",
    )
-    can_access_graph: bool
+    can_access_graph: bool = pydantic.Field(
+        description="Indicates whether the same user owns the corresponding graph"
+    )
    is_latest_version: bool
    is_favorite: bool
+    folder_id: str | None = None
+    folder_name: str | None = None  # Denormalized for display
+
    recommended_schedule_cron: str | None = None
    settings: GraphSettings = pydantic.Field(default_factory=GraphSettings)
    marketplace_listing: Optional["MarketplaceListing"] = None
@@ -229,7 +325,6 @@ class LibraryAgent(pydantic.BaseModel):
            id=agent.id,
            graph_id=agent.agentGraphId,
            graph_version=agent.agentGraphVersion,
-            owner_user_id=agent.userId,
            image_url=agent.imageUrl,
            creator_name=creator_name,
            creator_image_url=creator_image_url,
@@ -256,6 +351,8 @@ class LibraryAgent(pydantic.BaseModel):
            can_access_graph=can_access_graph,
            is_latest_version=is_latest_version,
            is_favorite=agent.isFavorite,
+            folder_id=agent.folderId,
+            folder_name=agent.Folder.name if agent.Folder else None,
            recommended_schedule_cron=agent.AgentGraph.recommendedScheduleCron,
            settings=_parse_settings(agent.settings),
            marketplace_listing=marketplace_listing_data,
@@ -323,7 +420,7 @@ class LibraryAgentPresetCreatable(pydantic.BaseModel):
    graph_id: str
    graph_version: int

-    inputs: BlockInput
+    inputs: GraphInput
    credentials: dict[str, CredentialsMetaInput]

    name: str
@@ -352,7 +449,7 @@ class LibraryAgentPresetUpdatable(pydantic.BaseModel):
    Request model used when updating a preset for a library agent.
    """

-    inputs: Optional[BlockInput] = None
+    inputs: Optional[GraphInput] = None
    credentials: Optional[dict[str, CredentialsMetaInput]] = None

    name: Optional[str] = None
@@ -395,7 +492,7 @@ class LibraryAgentPreset(LibraryAgentPresetCreatable):
                "Webhook must be included in AgentPreset query when webhookId is set"
            )

-        input_data: BlockInput = {}
+        input_data: GraphInput = {}
        input_credentials: dict[str, CredentialsMetaInput] = {}

        for preset_input in preset.InputPresets:
@@ -467,3 +564,7 @@ class LibraryAgentUpdateRequest(pydantic.BaseModel):
    settings: Optional[GraphSettings] = pydantic.Field(
        default=None, description="User-specific settings for this library agent"
    )
+    folder_id: Optional[str] = pydantic.Field(
+        default=None,
+        description="Folder ID to move agent to (None to move to root)",
+    )
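Review note: `LibraryFolderTree` nests folders through its `children` list, but the diff does not show how `get_folder_tree` assembles that structure. The sketch below is one plausible way to nest a flat folder list by `parent_id`. `FolderNode` and `build_tree` are hypothetical names used only for illustration and are not part of this change.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FolderNode:
    """Hypothetical stand-in for LibraryFolderTree (not the real model)."""

    id: str
    name: str
    parent_id: str | None = None
    children: list[FolderNode] = field(default_factory=list)


def build_tree(folders: list[FolderNode]) -> list[FolderNode]:
    """Nest a flat folder list under its parents and return the root folders."""
    by_id = {folder.id: folder for folder in folders}
    roots: list[FolderNode] = []
    for folder in folders:
        parent = by_id.get(folder.parent_id) if folder.parent_id else None
        if parent is None:
            # No parent, or the parent is absent from the list: treat as root.
            roots.append(folder)
        else:
            parent.children.append(folder)
    return roots
```

A single pass over the list is enough because every node is indexed by ID before any child is attached.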
--- a/autogpt_platform/backend/backend/api/features/library/routes/init.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes/init.py
@@ -1,9 +1,11 @@
 import fastapi

 from .agents import router as agents_router
+from .folders import router as folders_router
 from .presets import router as presets_router

 router = fastapi.APIRouter()

 router.include_router(presets_router)
+router.include_router(folders_router)
 router.include_router(agents_router)
--- a/autogpt_platform/backend/backend/api/features/library/routes/agents.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes/agents.py
@@ -41,6 +41,14 @@ async def list_library_agents(
        ge=1,
        description="Number of agents per page (must be >= 1)",
    ),
+    folder_id: Optional[str] = Query(
+        None,
+        description="Filter by folder ID",
+    ),
+    include_root_only: bool = Query(
+        False,
+        description="Only return agents without a folder (root-level agents)",
+    ),
 ) -> library_model.LibraryAgentResponse:
    """
    Get all agents in the user's library (both created and saved).
@@ -51,6 +59,8 @@ async def list_library_agents(
        sort_by=sort_by,
        page=page,
        page_size=page_size,
+        folder_id=folder_id,
+        include_root_only=include_root_only,
    )


@@ -168,6 +178,7 @@ async def update_library_agent(
        is_favorite=payload.is_favorite,
        is_archived=payload.is_archived,
        settings=payload.settings,
+        folder_id=payload.folder_id,
    )


--- a/autogpt_platform/backend/backend/api/features/library/routes/folders.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes/folders.py
@@ -0,0 +1,287 @@
+from typing import Optional
+
+import autogpt_libs.auth as autogpt_auth_lib
+from fastapi import APIRouter, Query, Security, status
+from fastapi.responses import Response
+
+from .. import db as library_db
+from .. import model as library_model
+
+router = APIRouter(
+    prefix="/folders",
+    tags=["library", "folders", "private"],
+    dependencies=[Security(autogpt_auth_lib.requires_user)],
+)
+
+
+@router.get(
+    "",
+    summary="List Library Folders",
+    response_model=library_model.FolderListResponse,
+    responses={
+        200: {"description": "List of folders"},
+        500: {"description": "Server error"},
+    },
+)
+async def list_folders(
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+    parent_id: Optional[str] = Query(
+        None,
+        description="Filter by parent folder ID. If not provided, returns root-level folders.",
+    ),
+    include_relations: bool = Query(
+        True,
+        description="Include agent and subfolder relations (for counts)",
+    ),
+) -> library_model.FolderListResponse:
+    """
+    List folders for the authenticated user.
+
+    Args:
+        user_id: ID of the authenticated user.
+        parent_id: Optional parent folder ID to filter by.
+        include_relations: Whether to include agent and subfolder relations for counts.
+
+    Returns:
+        A FolderListResponse containing folders.
+    """
+    folders = await library_db.list_folders(
+        user_id=user_id,
+        parent_id=parent_id,
+        include_relations=include_relations,
+    )
+    return library_model.FolderListResponse(
+        folders=folders,
+        pagination=library_model.Pagination(
+            total_items=len(folders),
+            total_pages=1,
+            current_page=1,
+            page_size=len(folders),
+        ),
+    )
+
+
+@router.get(
+    "/tree",
+    summary="Get Folder Tree",
+    response_model=library_model.FolderTreeResponse,
+    responses={
+        200: {"description": "Folder tree structure"},
+        500: {"description": "Server error"},
+    },
+)
+async def get_folder_tree(
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> library_model.FolderTreeResponse:
+    """
+    Get the full folder tree for the authenticated user.
+
+    Args:
+        user_id: ID of the authenticated user.
+
+    Returns:
+        A FolderTreeResponse containing the nested folder structure.
+    """
+    tree = await library_db.get_folder_tree(user_id=user_id)
+    return library_model.FolderTreeResponse(tree=tree)
+
+
+@router.get(
+    "/{folder_id}",
+    summary="Get Folder",
+    response_model=library_model.LibraryFolder,
+    responses={
+        200: {"description": "Folder details"},
+        404: {"description": "Folder not found"},
+        500: {"description": "Server error"},
+    },
+)
+async def get_folder(
+    folder_id: str,
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> library_model.LibraryFolder:
+    """
+    Get a specific folder.
+
+    Args:
+        folder_id: ID of the folder to retrieve.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        The requested LibraryFolder.
+    """
+    return await library_db.get_folder(folder_id=folder_id, user_id=user_id)
+
+
+@router.post(
+    "",
+    summary="Create Folder",
+    status_code=status.HTTP_201_CREATED,
+    response_model=library_model.LibraryFolder,
+    responses={
+        201: {"description": "Folder created successfully"},
+        400: {"description": "Validation error"},
+        404: {"description": "Parent folder not found"},
+        409: {"description": "Folder name conflict"},
+        500: {"description": "Server error"},
+    },
+)
+async def create_folder(
+    payload: library_model.FolderCreateRequest,
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> library_model.LibraryFolder:
+    """
+    Create a new folder.
+
+    Args:
+        payload: The folder creation request.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        The created LibraryFolder.
+    """
+    return await library_db.create_folder(
+        user_id=user_id,
+        name=payload.name,
+        parent_id=payload.parent_id,
+        icon=payload.icon,
+        color=payload.color,
+    )
+
+
+@router.patch(
+    "/{folder_id}",
+    summary="Update Folder",
+    response_model=library_model.LibraryFolder,
+    responses={
+        200: {"description": "Folder updated successfully"},
+        400: {"description": "Validation error"},
+        404: {"description": "Folder not found"},
+        409: {"description": "Folder name conflict"},
+        500: {"description": "Server error"},
+    },
+)
+async def update_folder(
+    folder_id: str,
+    payload: library_model.FolderUpdateRequest,
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> library_model.LibraryFolder:
+    """
+    Update a folder's properties.
+
+    Args:
+        folder_id: ID of the folder to update.
+        payload: The folder update request.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        The updated LibraryFolder.
+    """
+    return await library_db.update_folder(
+        folder_id=folder_id,
+        user_id=user_id,
+        name=payload.name,
+        icon=payload.icon,
+        color=payload.color,
+    )
+
+
+@router.post(
+    "/{folder_id}/move",
+    summary="Move Folder",
+    response_model=library_model.LibraryFolder,
+    responses={
+        200: {"description": "Folder moved successfully"},
+        400: {"description": "Validation error (circular reference)"},
+        404: {"description": "Folder or target parent not found"},
+        409: {"description": "Folder name conflict in target location"},
+        500: {"description": "Server error"},
+    },
+)
+async def move_folder(
+    folder_id: str,
+    payload: library_model.FolderMoveRequest,
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> library_model.LibraryFolder:
+    """
+    Move a folder to a new parent.
+
+    Args:
+        folder_id: ID of the folder to move.
+        payload: The move request with target parent.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        The moved LibraryFolder.
+    """
+    return await library_db.move_folder(
+        folder_id=folder_id,
+        user_id=user_id,
+        target_parent_id=payload.target_parent_id,
+    )
+
+
+@router.delete(
+    "/{folder_id}",
+    summary="Delete Folder",
+    status_code=status.HTTP_204_NO_CONTENT,
+    responses={
+        204: {"description": "Folder deleted successfully"},
+        404: {"description": "Folder not found"},
+        500: {"description": "Server error"},
+    },
+)
+async def delete_folder(
+    folder_id: str,
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> Response:
+    """
+    Soft-delete a folder and all its contents.
+
+    Args:
+        folder_id: ID of the folder to delete.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        204 No Content if successful.
+    """
+    await library_db.delete_folder(
+        folder_id=folder_id,
+        user_id=user_id,
+        soft_delete=True,
+    )
+    return Response(status_code=status.HTTP_204_NO_CONTENT)
+
+
+# === Bulk Agent Operations ===
+
+
+@router.post(
+    "/agents/bulk-move",
+    summary="Bulk Move Agents",
+    response_model=list[library_model.LibraryAgent],
+    responses={
+        200: {"description": "Agents moved successfully"},
+        404: {"description": "Folder not found"},
+        500: {"description": "Server error"},
+    },
+)
+async def bulk_move_agents(
+    payload: library_model.BulkMoveAgentsRequest,
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> list[library_model.LibraryAgent]:
+    """
+    Move multiple agents to a folder.
+
+    Args:
+        payload: The bulk move request with agent IDs and target folder.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        The updated LibraryAgents.
+    """
+    return await library_db.bulk_move_agents_to_folder(
+        agent_ids=payload.agent_ids,
+        folder_id=payload.folder_id,
+        user_id=user_id,
+    )
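Review note: the `move_folder` route documents a 400 for circular references, but the check itself lives in `library_db.move_folder`, which is not shown in this hunk. The sketch below illustrates the usual approach, assuming an in-memory map from folder ID to parent ID: walk up from the target parent and reject the move if the folder being moved appears among its ancestors. `would_create_cycle` and `parent_of` are hypothetical names.

```python
from __future__ import annotations


def would_create_cycle(
    parent_of: dict[str, str | None],
    folder_id: str,
    target_parent_id: str | None,
) -> bool:
    """Return True if moving folder_id under target_parent_id forms a loop."""
    if target_parent_id is None:
        # Moving to the root can never create a cycle.
        return False
    seen: set[str] = set()
    current: str | None = target_parent_id
    while current is not None and current not in seen:
        if current == folder_id:
            # The target is the folder itself or one of its descendants.
            return True
        seen.add(current)
        current = parent_of.get(current)
    return False
```

The `seen` set only guards against looping forever if the stored data already contains a cycle.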
--- a/autogpt_platform/backend/backend/api/features/library/routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes_test.py
@@ -42,7 +42,6 @@ async def test_get_library_agents_success(
                id="test-agent-1",
                graph_id="test-agent-1",
                graph_version=1,
-                owner_user_id=test_user_id,
                name="Test Agent 1",
                description="Test Description 1",
                image_url=None,
@@ -67,7 +66,6 @@ async def test_get_library_agents_success(
                id="test-agent-2",
                graph_id="test-agent-2",
                graph_version=1,
-                owner_user_id=test_user_id,
                name="Test Agent 2",
                description="Test Description 2",
                image_url=None,
@@ -115,6 +113,8 @@ async def test_get_library_agents_success(
        sort_by=library_model.LibraryAgentSort.UPDATED_AT,
        page=1,
        page_size=15,
+        folder_id=None,
+        include_root_only=False,
    )


@@ -129,7 +129,6 @@ async def test_get_favorite_library_agents_success(
                id="test-agent-1",
                graph_id="test-agent-1",
                graph_version=1,
-                owner_user_id=test_user_id,
                name="Favorite Agent 1",
                description="Test Favorite Description 1",
                image_url=None,
@@ -182,7 +181,6 @@ def test_add_agent_to_library_success(
        id="test-library-agent-id",
        graph_id="test-agent-1",
        graph_version=1,
-        owner_user_id=test_user_id,
        name="Test Agent 1",
        description="Test Description 1",
        image_url=None,
--- a/autogpt_platform/backend/backend/api/features/mcp/init.py
+++ b/autogpt_platform/backend/backend/api/features/mcp/init.py
--- a/autogpt_platform/backend/backend/api/features/mcp/routes.py
+++ b/autogpt_platform/backend/backend/api/features/mcp/routes.py
@@ -0,0 +1,511 @@
+"""
+MCP (Model Context Protocol) API routes.
+
+Provides endpoints for MCP tool discovery and OAuth authentication so the
+frontend can list available tools on an MCP server before placing a block.
+"""
+
+import logging
+from typing import Annotated, Any
+
+import fastapi
+from autogpt_libs.auth import get_user_id
+from fastapi import Security
+from pydantic import BaseModel, Field, SecretStr
+
+from backend.api.features.integrations.router import CredentialsMetaResponse
+from backend.blocks.mcp.client import MCPClient, MCPClientError
+from backend.blocks.mcp.helpers import (
+    auto_lookup_mcp_credential,
+    normalize_mcp_url,
+    server_host,
+)
+from backend.blocks.mcp.oauth import MCPOAuthHandler
+from backend.data.model import OAuth2Credentials
+from backend.integrations.creds_manager import IntegrationCredentialsManager
+from backend.integrations.providers import ProviderName
+from backend.util.request import HTTPClientError, Requests, validate_url_host
+from backend.util.settings import Settings
+
+logger = logging.getLogger(__name__)
+
+settings = Settings()
+router = fastapi.APIRouter(tags=["mcp"])
+creds_manager = IntegrationCredentialsManager()
+
+
+# ====================== Tool Discovery ====================== #
+
+
+class DiscoverToolsRequest(BaseModel):
+    """Request to discover tools on an MCP server."""
+
+    server_url: str = Field(description="URL of the MCP server")
+    auth_token: str | None = Field(
+        default=None,
+        description="Optional Bearer token for authenticated MCP servers",
+    )
+
+
+class MCPToolResponse(BaseModel):
+    """A single MCP tool returned by discovery."""
+
+    name: str
+    description: str
+    input_schema: dict[str, Any]
+
+
+class DiscoverToolsResponse(BaseModel):
+    """Response containing the list of tools available on an MCP server."""
+
+    tools: list[MCPToolResponse]
+    server_name: str | None = None
+    protocol_version: str | None = None
+
+
+@router.post(
+    "/discover-tools",
+    summary="Discover available tools on an MCP server",
+    response_model=DiscoverToolsResponse,
+)
+async def discover_tools(
+    request: DiscoverToolsRequest,
+    user_id: Annotated[str, Security(get_user_id)],
+) -> DiscoverToolsResponse:
+    """
+    Connect to an MCP server and return its available tools.
+
+    If the user has a stored MCP credential for this server URL, it will be
+    used automatically — no need to pass an explicit auth token.
+    """
+    # Validate URL to prevent SSRF — blocks loopback and private IP ranges.
+    try:
+        await validate_url_host(request.server_url)
+    except ValueError as e:
+        raise fastapi.HTTPException(status_code=400, detail=f"Invalid server URL: {e}")
+
+    auth_token = request.auth_token
+
+    # Auto-use stored MCP credential when no explicit token is provided.
+    if not auth_token:
+        best_cred = await auto_lookup_mcp_credential(
+            user_id, normalize_mcp_url(request.server_url)
+        )
+        if best_cred:
+            auth_token = best_cred.access_token.get_secret_value()
+
+    client = MCPClient(request.server_url, auth_token=auth_token)
+
+    try:
+        init_result = await client.initialize()
+        tools = await client.list_tools()
+    except HTTPClientError as e:
+        if e.status_code in (401, 403):
+            raise fastapi.HTTPException(
+                status_code=401,
+                detail="This MCP server requires authentication. "
+                "Please provide a valid auth token.",
+            )
+        raise fastapi.HTTPException(status_code=502, detail=str(e))
+    except MCPClientError as e:
+        raise fastapi.HTTPException(status_code=502, detail=str(e))
+    except Exception as e:
+        raise fastapi.HTTPException(
+            status_code=502,
+            detail=f"Failed to connect to MCP server: {e}",
+        )
+
+    return DiscoverToolsResponse(
+        tools=[
+            MCPToolResponse(
+                name=t.name,
+                description=t.description,
+                input_schema=t.input_schema,
+            )
+            for t in tools
+        ],
+        server_name=(
+            init_result.get("serverInfo", {}).get("name")
+            or server_host(request.server_url)
+            or "MCP"
+        ),
+        protocol_version=init_result.get("protocolVersion"),
+    )
+
+
+# ======================== OAuth Flow ======================== #
+
+
+class MCPOAuthLoginRequest(BaseModel):
+    """Request to start an OAuth flow for an MCP server."""
+
+    server_url: str = Field(description="URL of the MCP server that requires OAuth")
+
+
+class MCPOAuthLoginResponse(BaseModel):
+    """Response with the OAuth login URL for the user to authenticate."""
+
+    login_url: str
+    state_token: str
+
+
+@router.post(
+    "/oauth/login",
+    summary="Initiate OAuth login for an MCP server",
+)
+async def mcp_oauth_login(
+    request: MCPOAuthLoginRequest,
+    user_id: Annotated[str, Security(get_user_id)],
+) -> MCPOAuthLoginResponse:
+    """
+    Discover OAuth metadata from the MCP server and return a login URL.
+
+    1. Discovers the protected-resource metadata (RFC 9728)
+    2. Fetches the authorization server metadata (RFC 8414)
+    3. Performs Dynamic Client Registration (RFC 7591) if available
+    4. Returns the authorization URL for the frontend to open in a popup
+    """
+    # Validate URL to prevent SSRF — blocks loopback and private IP ranges.
+    try:
+        await validate_url_host(request.server_url)
+    except ValueError as e:
+        raise fastapi.HTTPException(status_code=400, detail=f"Invalid server URL: {e}")
+
+    # Normalize the URL so that credentials stored here are matched consistently
+    # by auto_lookup_mcp_credential (which also uses normalized URLs).
+    server_url = normalize_mcp_url(request.server_url)
+    client = MCPClient(server_url)
+
+    # Step 1: Discover protected-resource metadata (RFC 9728)
+    protected_resource = await client.discover_auth()
+
+    metadata: dict[str, Any] | None = None
+
+    if protected_resource and protected_resource.get("authorization_servers"):
+        auth_server_url = protected_resource["authorization_servers"][0]
+        resource_url = protected_resource.get("resource", server_url)
+
+        # Validate the auth server URL from metadata to prevent SSRF.
+        try:
+            await validate_url_host(auth_server_url)
+        except ValueError as e:
+            raise fastapi.HTTPException(
+                status_code=400,
+                detail=f"Invalid authorization server URL in metadata: {e}",
+            )
+
+        # Step 2a: Discover auth-server metadata (RFC 8414)
+        metadata = await client.discover_auth_server_metadata(auth_server_url)
+    else:
+        # Fallback: Some MCP servers (e.g. Linear) are their own auth server
+        # and serve OAuth metadata directly without protected-resource metadata.
+        # Don't assume a resource_url — omitting it lets the auth server choose
+        # the correct audience for the token (RFC 8707 resource is optional).
+        resource_url = None
+        metadata = await client.discover_auth_server_metadata(server_url)
+
+    if (
+        not metadata
+        or "authorization_endpoint" not in metadata
+        or "token_endpoint" not in metadata
+    ):
+        raise fastapi.HTTPException(
+            status_code=400,
+            detail="This MCP server does not advertise OAuth support. "
+            "You may need to provide an auth token manually.",
+        )
+
+    authorize_url = metadata["authorization_endpoint"]
+    token_url = metadata["token_endpoint"]
+    registration_endpoint = metadata.get("registration_endpoint")
+    revoke_url = metadata.get("revocation_endpoint")
+
+    # Step 3: Dynamic Client Registration (RFC 7591) if available
+    frontend_base_url = settings.config.frontend_base_url
+    if not frontend_base_url:
+        raise fastapi.HTTPException(
+            status_code=500,
+            detail="Frontend base URL is not configured.",
+        )
+    redirect_uri = f"{frontend_base_url}/auth/integrations/mcp_callback"
+
+    client_id = ""
+    client_secret = ""
+    if registration_endpoint:
+        # Validate the registration endpoint to prevent SSRF via metadata.
+        try:
+            await validate_url_host(registration_endpoint)
+        except ValueError:
+            pass  # Skip registration, fall back to default client_id
+        else:
+            reg_result = await _register_mcp_client(
+                registration_endpoint, redirect_uri, server_url
+            )
+            if reg_result:
+                client_id = reg_result.get("client_id", "")
+                client_secret = reg_result.get("client_secret", "")
+
+    if not client_id:
+        client_id = "autogpt-platform"
+
+    # Step 4: Store state token with OAuth metadata for the callback
+    scopes = (protected_resource or {}).get("scopes_supported") or metadata.get(
+        "scopes_supported", []
+    )
+    state_token, code_challenge = await creds_manager.store.store_state_token(
+        user_id,
+        ProviderName.MCP.value,
+        scopes,
+        state_metadata={
+            "authorize_url": authorize_url,
+            "token_url": token_url,
+            "revoke_url": revoke_url,
+            "resource_url": resource_url,
+            "server_url": server_url,
+            "client_id": client_id,
+            "client_secret": client_secret,
+        },
+    )
+
+    # Step 5: Build and return the login URL
+    handler = MCPOAuthHandler(
+        client_id=client_id,
+        client_secret=client_secret,
+        redirect_uri=redirect_uri,
+        authorize_url=authorize_url,
+        token_url=token_url,
+        resource_url=resource_url,
+    )
+    login_url = handler.get_login_url(
+        scopes, state_token, code_challenge=code_challenge
+    )
+
+    return MCPOAuthLoginResponse(login_url=login_url, state_token=state_token)
+
+
+class MCPOAuthCallbackRequest(BaseModel):
+    """Request to exchange an OAuth code for tokens."""
+
+    code: str = Field(description="Authorization code from OAuth callback")
+    state_token: str = Field(description="State token for CSRF verification")
+
+
+class MCPOAuthCallbackResponse(BaseModel):
+    """Response after successfully storing OAuth credentials."""
+
+    credential_id: str
+
+
+@router.post(
+    "/oauth/callback",
+    summary="Exchange OAuth code for MCP tokens",
+)
+async def mcp_oauth_callback(
+    request: MCPOAuthCallbackRequest,
+    user_id: Annotated[str, Security(get_user_id)],
+) -> CredentialsMetaResponse:
+    """
+    Exchange the authorization code for tokens and store the credential.
+
+    The frontend calls this after receiving the OAuth code from the popup.
+    On success, subsequent ``/discover-tools`` calls for the same server URL
+    will automatically use the stored credential.
+    """
+    valid_state = await creds_manager.store.verify_state_token(
+        user_id, request.state_token, ProviderName.MCP.value
+    )
+    if not valid_state:
+        raise fastapi.HTTPException(
+            status_code=400,
+            detail="Invalid or expired state token.",
+        )
+
+    meta = valid_state.state_metadata
+    frontend_base_url = settings.config.frontend_base_url
+    if not frontend_base_url:
+        raise fastapi.HTTPException(
+            status_code=500,
+            detail="Frontend base URL is not configured.",
+        )
+    redirect_uri = f"{frontend_base_url}/auth/integrations/mcp_callback"
+
+    handler = MCPOAuthHandler(
+        client_id=meta["client_id"],
+        client_secret=meta.get("client_secret", ""),
+        redirect_uri=redirect_uri,
+        authorize_url=meta["authorize_url"],
+        token_url=meta["token_url"],
+        revoke_url=meta.get("revoke_url"),
+        resource_url=meta.get("resource_url"),
+    )
+
+    try:
+        credentials = await handler.exchange_code_for_tokens(
+            request.code, valid_state.scopes, valid_state.code_verifier
+        )
+    except Exception as e:
+        raise fastapi.HTTPException(
+            status_code=400,
+            detail=f"OAuth token exchange failed: {e}",
+        )
+
+    # Enrich credential metadata for future lookup and token refresh
+    if credentials.metadata is None:
+        credentials.metadata = {}
+    credentials.metadata["mcp_server_url"] = meta["server_url"]
+    credentials.metadata["mcp_client_id"] = meta["client_id"]
+    credentials.metadata["mcp_client_secret"] = meta.get("client_secret", "")
+    credentials.metadata["mcp_token_url"] = meta["token_url"]
+    credentials.metadata["mcp_resource_url"] = meta.get("resource_url") or ""
+
+    hostname = server_host(meta["server_url"])
+    credentials.title = f"MCP: {hostname}"
+
+    # Remove old MCP credentials for the same server to prevent stale token buildup.
+    try:
+        old_creds = await creds_manager.store.get_creds_by_provider(
+            user_id, ProviderName.MCP.value
+        )
+        for old in old_creds:
+            if (
+                isinstance(old, OAuth2Credentials)
+                and (old.metadata or {}).get("mcp_server_url") == meta["server_url"]
+            ):
+                await creds_manager.store.delete_creds_by_id(user_id, old.id)
+                logger.info(
+                    "Removed old MCP credential %s for %s",
+                    old.id,
+                    server_host(meta["server_url"]),
+                )
+    except Exception:
+        logger.debug("Could not clean up old MCP credentials", exc_info=True)
+
+    await creds_manager.create(user_id, credentials)
+
+    return CredentialsMetaResponse(
+        id=credentials.id,
+        provider=credentials.provider,
+        type=credentials.type,
+        title=credentials.title,
+        scopes=credentials.scopes,
+        username=credentials.username,
+        host=credentials.metadata.get("mcp_server_url"),
+    )
+
+
+# ======================== Bearer Token ======================== #
+
+
+class MCPStoreTokenRequest(BaseModel):
+    """Request to store a bearer token for an MCP server that doesn't support OAuth."""
+
+    server_url: str = Field(
+        description="MCP server URL the token authenticates against"
+    )
+    token: SecretStr = Field(
+        min_length=1, description="Bearer token / API key for the MCP server"
+    )
+
+
+@router.post(
+    "/token",
+    summary="Store a bearer token for an MCP server",
+)
+async def mcp_store_token(
+    request: MCPStoreTokenRequest,
+    user_id: Annotated[str, Security(get_user_id)],
+) -> CredentialsMetaResponse:
+    """
+    Store a manually provided bearer token as an MCP credential.
+
+    Used by the Copilot MCPSetupCard when the server doesn't support the MCP
+    OAuth discovery flow (returns 400 from /oauth/login). Subsequent
+    ``run_mcp_tool`` calls will automatically pick up the token via
+    ``auto_lookup_mcp_credential``.
+    """
+    token = request.token.get_secret_value().strip()
+    if not token:
+        raise fastapi.HTTPException(status_code=422, detail="Token must not be blank.")
+
+    # Validate URL to prevent SSRF — blocks loopback and private IP ranges.
+    try:
+        await validate_url_host(request.server_url)
+    except ValueError as e:
+        raise fastapi.HTTPException(status_code=400, detail=f"Invalid server URL: {e}")
+
+    # Normalize URL so trailing-slash variants match existing credentials.
+    server_url = normalize_mcp_url(request.server_url)
+    hostname = server_host(server_url)
+
+    # Collect IDs of old credentials to clean up after successful create.
+    old_cred_ids: list[str] = []
+    try:
+        old_creds = await creds_manager.store.get_creds_by_provider(
+            user_id, ProviderName.MCP.value
+        )
+        old_cred_ids = [
+            old.id
+            for old in old_creds
+            if isinstance(old, OAuth2Credentials)
+            and normalize_mcp_url((old.metadata or {}).get("mcp_server_url", ""))
+            == server_url
+        ]
+    except Exception:
+        logger.debug("Could not query old MCP token credentials", exc_info=True)
+
+    credentials = OAuth2Credentials(
+        provider=ProviderName.MCP.value,
+        title=f"MCP: {hostname}",
+        access_token=SecretStr(token),
+        scopes=[],
+        metadata={"mcp_server_url": server_url},
+    )
+    await creds_manager.create(user_id, credentials)
+
+    # Only delete old credentials after the new one is safely stored.
+    for old_id in old_cred_ids:
+        try:
+            await creds_manager.store.delete_creds_by_id(user_id, old_id)
+        except Exception:
+            logger.debug("Could not clean up old MCP token credential", exc_info=True)
+
+    return CredentialsMetaResponse(
+        id=credentials.id,
+        provider=credentials.provider,
+        type=credentials.type,
+        title=credentials.title,
+        scopes=credentials.scopes,
+        username=credentials.username,
+        host=hostname,
+    )
+
+
+# ======================== Helpers ======================== #
+
+
+async def _register_mcp_client(
+    registration_endpoint: str,
+    redirect_uri: str,
+    server_url: str,
+) -> dict[str, Any] | None:
+    """Attempt Dynamic Client Registration (RFC 7591) with an MCP auth server."""
+    try:
+        response = await Requests(raise_for_status=True).post(
+            registration_endpoint,
+            json={
+                "client_name": "AutoGPT Platform",
+                "redirect_uris": [redirect_uri],
+                "grant_types": ["authorization_code"],
+                "response_types": ["code"],
+                "token_endpoint_auth_method": "client_secret_post",
+            },
+        )
+        data = response.json()
+        if isinstance(data, dict) and "client_id" in data:
+            return data
+        return None
+    except Exception as e:
+        logger.warning(
+            "Dynamic client registration failed for %s: %s", server_host(server_url), e
+        )
+        return None
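Reviewer note on the token-store route above: it snapshots the IDs of matching old credentials, stores the new credential, and only then deletes the old ones. The sketch below illustrates that ordering in isolation. `InMemoryCredentialStore` and `replace_credential` are hypothetical names invented for this illustration; they are not part of the backend and only mimic the shape of the real store.

```python
import asyncio


class InMemoryCredentialStore:
    """Hypothetical stand-in for the credential store (illustration only)."""

    def __init__(self) -> None:
        # Maps credential ID -> normalized MCP server URL.
        self.creds: dict[str, str] = {}

    async def create(self, cred_id: str, server_url: str) -> None:
        self.creds[cred_id] = server_url

    async def delete(self, cred_id: str) -> None:
        del self.creds[cred_id]


async def replace_credential(
    store: InMemoryCredentialStore, new_id: str, server_url: str
) -> list[str]:
    """Store a new credential, then remove older ones for the same server."""
    # 1. Snapshot the IDs of credentials that point at the same server.
    old_ids = [cid for cid, url in store.creds.items() if url == server_url]

    # 2. Create first: if this raises, the old credentials are still intact.
    await store.create(new_id, server_url)

    # 3. Best-effort cleanup: a failed delete leaves a stale entry behind,
    #    but never leaves the user without a working credential.
    removed: list[str] = []
    for cid in old_ids:
        try:
            await store.delete(cid)
            removed.append(cid)
        except Exception:
            continue
    return removed
```

Because the snapshot is taken before the create call, the new credential can never be deleted by its own cleanup pass, even though it matches the same server URL.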
--- a/autogpt_platform/backend/backend/api/features/mcp/test_routes.py
+++ b/autogpt_platform/backend/backend/api/features/mcp/test_routes.py
@@ -0,0 +1,572 @@
+"""Tests for MCP API routes.
+
+Uses httpx.AsyncClient with ASGITransport instead of fastapi.testclient.TestClient
+to avoid creating blocking portals that can corrupt pytest-asyncio's session event loop.
+"""
+
+from unittest.mock import AsyncMock, patch
+
+import fastapi
+import httpx
+import pytest
+import pytest_asyncio
+from autogpt_libs.auth import get_user_id
+from pydantic import SecretStr
+
+from backend.api.features.mcp.routes import router
+from backend.blocks.mcp.client import MCPClientError, MCPTool
+from backend.data.model import OAuth2Credentials
+from backend.util.request import HTTPClientError
+
+app = fastapi.FastAPI()
+app.include_router(router)
+app.dependency_overrides[get_user_id] = lambda: "test-user-id"
+
+
+@pytest_asyncio.fixture(scope="module")
+async def client():
+    transport = httpx.ASGITransport(app=app)
+    async with httpx.AsyncClient(transport=transport, base_url="http://test") as c:
+        yield c
+
+
+@pytest.fixture(autouse=True)
+def _bypass_ssrf_validation():
+    """Bypass validate_url_host in all route tests (test URLs don't resolve)."""
+    with patch(
+        "backend.api.features.mcp.routes.validate_url_host",
+        new_callable=AsyncMock,
+    ):
+        yield
+
+
+class TestDiscoverTools:
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_discover_tools_success(self, client):
+        mock_tools = [
+            MCPTool(
+                name="get_weather",
+                description="Get weather for a city",
+                input_schema={
+                    "type": "object",
+                    "properties": {"city": {"type": "string"}},
+                    "required": ["city"],
+                },
+            ),
+            MCPTool(
+                name="add_numbers",
+                description="Add two numbers",
+                input_schema={
+                    "type": "object",
+                    "properties": {
+                        "a": {"type": "number"},
+                        "b": {"type": "number"},
+                    },
+                },
+            ),
+        ]
+
+        with (
+            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
+            patch(
+                "backend.api.features.mcp.routes.auto_lookup_mcp_credential",
+                new_callable=AsyncMock,
+                return_value=None,
+            ),
+        ):
+            instance = MockClient.return_value
+            instance.initialize = AsyncMock(
+                return_value={
+                    "protocolVersion": "2025-03-26",
+                    "serverInfo": {"name": "test-server"},
+                }
+            )
+            instance.list_tools = AsyncMock(return_value=mock_tools)
+
+            response = await client.post(
+                "/discover-tools",
+                json={"server_url": "https://mcp.example.com/mcp"},
+            )
+
+        assert response.status_code == 200
+        data = response.json()
+        assert len(data["tools"]) == 2
+        assert data["tools"][0]["name"] == "get_weather"
+        assert data["tools"][1]["name"] == "add_numbers"
+        assert data["server_name"] == "test-server"
+        assert data["protocol_version"] == "2025-03-26"
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_discover_tools_with_auth_token(self, client):
+        with patch("backend.api.features.mcp.routes.MCPClient") as MockClient:
+            instance = MockClient.return_value
+            instance.initialize = AsyncMock(
+                return_value={"serverInfo": {}, "protocolVersion": "2025-03-26"}
+            )
+            instance.list_tools = AsyncMock(return_value=[])
+
+            response = await client.post(
+                "/discover-tools",
+                json={
+                    "server_url": "https://mcp.example.com/mcp",
+                    "auth_token": "my-secret-token",
+                },
+            )
+
+        assert response.status_code == 200
+        MockClient.assert_called_once_with(
+            "https://mcp.example.com/mcp",
+            auth_token="my-secret-token",
+        )
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_discover_tools_auto_uses_stored_credential(self, client):
+        """When no explicit token is given, stored MCP credentials are used."""
+        stored_cred = OAuth2Credentials(
+            provider="mcp",
+            title="MCP: example.com",
+            access_token=SecretStr("stored-token-123"),
+            refresh_token=None,
+            access_token_expires_at=None,
+            refresh_token_expires_at=None,
+            scopes=[],
+            metadata={"mcp_server_url": "https://mcp.example.com/mcp"},
+        )
+
+        with (
+            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
+            patch(
+                "backend.api.features.mcp.routes.auto_lookup_mcp_credential",
+                new_callable=AsyncMock,
+                return_value=stored_cred,
+            ),
+        ):
+            instance = MockClient.return_value
+            instance.initialize = AsyncMock(
+                return_value={"serverInfo": {}, "protocolVersion": "2025-03-26"}
+            )
+            instance.list_tools = AsyncMock(return_value=[])
+
+            response = await client.post(
+                "/discover-tools",
+                json={"server_url": "https://mcp.example.com/mcp"},
+            )
+
+        assert response.status_code == 200
+        MockClient.assert_called_once_with(
+            "https://mcp.example.com/mcp",
+            auth_token="stored-token-123",
+        )
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_discover_tools_mcp_error(self, client):
+        with (
+            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
+            patch(
+                "backend.api.features.mcp.routes.auto_lookup_mcp_credential",
+                new_callable=AsyncMock,
+                return_value=None,
+            ),
+        ):
+            instance = MockClient.return_value
+            instance.initialize = AsyncMock(
+                side_effect=MCPClientError("Connection refused")
+            )
+
+            response = await client.post(
+                "/discover-tools",
+                json={"server_url": "https://bad-server.example.com/mcp"},
+            )
+
+        assert response.status_code == 502
+        assert "Connection refused" in response.json()["detail"]
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_discover_tools_generic_error(self, client):
+        with (
+            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
+            patch(
+                "backend.api.features.mcp.routes.auto_lookup_mcp_credential",
+                new_callable=AsyncMock,
+                return_value=None,
+            ),
+        ):
+            instance = MockClient.return_value
+            instance.initialize = AsyncMock(side_effect=Exception("Network timeout"))
+
+            response = await client.post(
+                "/discover-tools",
+                json={"server_url": "https://timeout.example.com/mcp"},
+            )
+
+        assert response.status_code == 502
+        assert "Failed to connect" in response.json()["detail"]
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_discover_tools_auth_required(self, client):
+        with (
+            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
+            patch(
+                "backend.api.features.mcp.routes.auto_lookup_mcp_credential",
+                new_callable=AsyncMock,
+                return_value=None,
+            ),
+        ):
+            instance = MockClient.return_value
+            instance.initialize = AsyncMock(
+                side_effect=HTTPClientError("HTTP 401 Error: Unauthorized", 401)
+            )
+
+            response = await client.post(
+                "/discover-tools",
+                json={"server_url": "https://auth-server.example.com/mcp"},
+            )
+
+        assert response.status_code == 401
+        assert "requires authentication" in response.json()["detail"]
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_discover_tools_forbidden(self, client):
+        with (
+            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
+            patch(
+                "backend.api.features.mcp.routes.auto_lookup_mcp_credential",
+                new_callable=AsyncMock,
+                return_value=None,
+            ),
+        ):
+            instance = MockClient.return_value
+            instance.initialize = AsyncMock(
+                side_effect=HTTPClientError("HTTP 403 Error: Forbidden", 403)
+            )
+
+            response = await client.post(
+                "/discover-tools",
+                json={"server_url": "https://auth-server.example.com/mcp"},
+            )
+
+        assert response.status_code == 401
+        assert "requires authentication" in response.json()["detail"]
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_discover_tools_missing_url(self, client):
+        response = await client.post("/discover-tools", json={})
+        assert response.status_code == 422
+
+
+class TestOAuthLogin:
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_oauth_login_success(self, client):
+        with (
+            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
+            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
+            patch("backend.api.features.mcp.routes.settings") as mock_settings,
+            patch(
+                "backend.api.features.mcp.routes._register_mcp_client"
+            ) as mock_register,
+        ):
+            instance = MockClient.return_value
+            instance.discover_auth = AsyncMock(
+                return_value={
+                    "authorization_servers": ["https://auth.sentry.io"],
+                    "resource": "https://mcp.sentry.dev/mcp",
+                    "scopes_supported": ["openid"],
+                }
+            )
+            instance.discover_auth_server_metadata = AsyncMock(
+                return_value={
+                    "authorization_endpoint": "https://auth.sentry.io/authorize",
+                    "token_endpoint": "https://auth.sentry.io/token",
+                    "registration_endpoint": "https://auth.sentry.io/register",
+                }
+            )
+            mock_register.return_value = {
+                "client_id": "registered-client-id",
+                "client_secret": "registered-secret",
+            }
+            mock_cm.store.store_state_token = AsyncMock(
+                return_value=("state-token-123", "code-challenge-abc")
+            )
+            mock_settings.config.frontend_base_url = "http://localhost:3000"
+
+            response = await client.post(
+                "/oauth/login",
+                json={"server_url": "https://mcp.sentry.dev/mcp"},
+            )
+
+        assert response.status_code == 200
+        data = response.json()
+        assert "login_url" in data
+        assert data["state_token"] == "state-token-123"
+        assert "auth.sentry.io/authorize" in data["login_url"]
+        assert "registered-client-id" in data["login_url"]
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_oauth_login_no_oauth_support(self, client):
+        with patch("backend.api.features.mcp.routes.MCPClient") as MockClient:
+            instance = MockClient.return_value
+            instance.discover_auth = AsyncMock(return_value=None)
+            instance.discover_auth_server_metadata = AsyncMock(return_value=None)
+
+            response = await client.post(
+                "/oauth/login",
+                json={"server_url": "https://simple-server.example.com/mcp"},
+            )
+
+        assert response.status_code == 400
+        assert "does not advertise OAuth" in response.json()["detail"]
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_oauth_login_fallback_to_public_client(self, client):
+        """When DCR is unavailable, falls back to default public client ID."""
+        with (
+            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
+            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
+            patch("backend.api.features.mcp.routes.settings") as mock_settings,
+        ):
+            instance = MockClient.return_value
+            instance.discover_auth = AsyncMock(
+                return_value={
+                    "authorization_servers": ["https://auth.example.com"],
+                    "resource": "https://mcp.example.com/mcp",
+                }
+            )
+            instance.discover_auth_server_metadata = AsyncMock(
+                return_value={
+                    "authorization_endpoint": "https://auth.example.com/authorize",
+                    "token_endpoint": "https://auth.example.com/token",
+                    # No registration_endpoint
+                }
+            )
+            mock_cm.store.store_state_token = AsyncMock(
+                return_value=("state-abc", "challenge-xyz")
+            )
+            mock_settings.config.frontend_base_url = "http://localhost:3000"
+
+            response = await client.post(
+                "/oauth/login",
+                json={"server_url": "https://mcp.example.com/mcp"},
+            )
+
+        assert response.status_code == 200
+        data = response.json()
+        assert "autogpt-platform" in data["login_url"]
+
+
+class TestOAuthCallback:
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_oauth_callback_success(self, client):
+        mock_creds = OAuth2Credentials(
+            provider="mcp",
+            title=None,
+            access_token=SecretStr("access-token-xyz"),
+            refresh_token=None,
+            access_token_expires_at=None,
+            refresh_token_expires_at=None,
+            scopes=[],
+            metadata={
+                "mcp_token_url": "https://auth.sentry.io/token",
+                "mcp_resource_url": "https://mcp.sentry.dev/mcp",
+            },
+        )
+
+        with (
+            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
+            patch("backend.api.features.mcp.routes.settings") as mock_settings,
+            patch("backend.api.features.mcp.routes.MCPOAuthHandler") as MockHandler,
+        ):
+            mock_settings.config.frontend_base_url = "http://localhost:3000"
+
+            # Mock state verification
+            mock_state = AsyncMock()
+            mock_state.state_metadata = {
+                "authorize_url": "https://auth.sentry.io/authorize",
+                "token_url": "https://auth.sentry.io/token",
+                "client_id": "test-client-id",
+                "client_secret": "test-secret",
+                "server_url": "https://mcp.sentry.dev/mcp",
+            }
+            mock_state.scopes = ["openid"]
+            mock_state.code_verifier = "verifier-123"
+            mock_cm.store.verify_state_token = AsyncMock(return_value=mock_state)
+            mock_cm.create = AsyncMock()
+
+            handler_instance = MockHandler.return_value
+            handler_instance.exchange_code_for_tokens = AsyncMock(
+                return_value=mock_creds
+            )
+
+            # Mock old credential cleanup
+            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
+
+            response = await client.post(
+                "/oauth/callback",
+                json={"code": "auth-code-abc", "state_token": "state-token-123"},
+            )
+
+        assert response.status_code == 200
+        data = response.json()
+        assert "id" in data
+        assert data["provider"] == "mcp"
+        assert data["type"] == "oauth2"
+        mock_cm.create.assert_called_once()
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_oauth_callback_invalid_state(self, client):
+        with patch("backend.api.features.mcp.routes.creds_manager") as mock_cm:
+            mock_cm.store.verify_state_token = AsyncMock(return_value=None)
+
+            response = await client.post(
+                "/oauth/callback",
+                json={"code": "auth-code", "state_token": "bad-state"},
+            )
+
+        assert response.status_code == 400
+        assert "Invalid or expired" in response.json()["detail"]
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_oauth_callback_token_exchange_fails(self, client):
+        with (
+            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
+            patch("backend.api.features.mcp.routes.settings") as mock_settings,
+            patch("backend.api.features.mcp.routes.MCPOAuthHandler") as MockHandler,
+        ):
+            mock_settings.config.frontend_base_url = "http://localhost:3000"
+            mock_state = AsyncMock()
+            mock_state.state_metadata = {
+                "authorize_url": "https://auth.example.com/authorize",
+                "token_url": "https://auth.example.com/token",
+                "client_id": "cid",
+                "server_url": "https://mcp.example.com/mcp",
+            }
+            mock_state.scopes = []
+            mock_state.code_verifier = "v"
+            mock_cm.store.verify_state_token = AsyncMock(return_value=mock_state)
+
+            handler_instance = MockHandler.return_value
+            handler_instance.exchange_code_for_tokens = AsyncMock(
+                side_effect=RuntimeError("Token exchange failed")
+            )
+
+            response = await client.post(
+                "/oauth/callback",
+                json={"code": "bad-code", "state_token": "state"},
+            )
+
+        assert response.status_code == 400
+        assert "token exchange failed" in response.json()["detail"].lower()
+
+
+class TestStoreToken:
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_store_token_success(self, client):
+        with patch("backend.api.features.mcp.routes.creds_manager") as mock_cm:
+            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
+            mock_cm.create = AsyncMock()
+
+            response = await client.post(
+                "/token",
+                json={
+                    "server_url": "https://mcp.example.com/mcp",
+                    "token": "my-api-key-123",
+                },
+            )
+
+        assert response.status_code == 200
+        data = response.json()
+        assert data["provider"] == "mcp"
+        assert data["type"] == "oauth2"
+        assert data["host"] == "mcp.example.com"
+        mock_cm.create.assert_called_once()
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_store_token_blank_rejected(self, client):
+        """Blank token string (after stripping) should return 422."""
+        response = await client.post(
+            "/token",
+            json={
+                "server_url": "https://mcp.example.com/mcp",
+                "token": "   ",
+            },
+        )
+        # The token is stripped before validation, so min_length=1 rejects it
+        assert response.status_code == 422
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_store_token_replaces_old_credential(self, client):
+        old_cred = OAuth2Credentials(
+            provider="mcp",
+            title="MCP: mcp.example.com",
+            access_token=SecretStr("old-token"),
+            scopes=[],
+            metadata={"mcp_server_url": "https://mcp.example.com/mcp"},
+        )
+        with patch("backend.api.features.mcp.routes.creds_manager") as mock_cm:
+            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[old_cred])
+            mock_cm.create = AsyncMock()
+            mock_cm.store.delete_creds_by_id = AsyncMock()
+
+            response = await client.post(
+                "/token",
+                json={
+                    "server_url": "https://mcp.example.com/mcp",
+                    "token": "new-token",
+                },
+            )
+
+        assert response.status_code == 200
+        mock_cm.store.delete_creds_by_id.assert_called_once_with(
+            "test-user-id", old_cred.id
+        )
+
+
+class TestSSRFValidation:
+    """Verify validate_url_host is enforced on endpoints taking a server_url."""
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_discover_tools_ssrf_blocked(self, client):
+        with patch(
+            "backend.api.features.mcp.routes.validate_url_host",
+            new_callable=AsyncMock,
+            side_effect=ValueError("blocked loopback"),
+        ):
+            response = await client.post(
+                "/discover-tools",
+                json={"server_url": "http://localhost/mcp"},
+            )
+
+        assert response.status_code == 400
+        assert "blocked loopback" in response.json()["detail"].lower()
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_oauth_login_ssrf_blocked(self, client):
+        with patch(
+            "backend.api.features.mcp.routes.validate_url_host",
+            new_callable=AsyncMock,
+            side_effect=ValueError("blocked private IP"),
+        ):
+            response = await client.post(
+                "/oauth/login",
+                json={"server_url": "http://10.0.0.1/mcp"},
+            )
+
+        assert response.status_code == 400
+        assert "blocked private ip" in response.json()["detail"].lower()
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_store_token_ssrf_blocked(self, client):
+        with patch(
+            "backend.api.features.mcp.routes.validate_url_host",
+            new_callable=AsyncMock,
+            side_effect=ValueError("blocked loopback"),
+        ):
+            response = await client.post(
+                "/token",
+                json={
+                    "server_url": "http://127.0.0.1/mcp",
+                    "token": "some-token",
+                },
+            )
+
+        assert response.status_code == 400
+        assert "blocked loopback" in response.json()["detail"].lower()
--- a/autogpt_platform/backend/backend/api/features/otto/service.py
+++ b/autogpt_platform/backend/backend/api/features/otto/service.py
@@ -5,8 +5,8 @@ from typing import Optional
 import aiohttp
 from fastapi import HTTPException

+from backend.blocks import get_block
 from backend.data import graph as graph_db
-from backend.data.block import get_block
 from backend.util.settings import Settings

 from .models import ApiResponse, ChatRequest, GraphData
--- a/autogpt_platform/backend/backend/api/features/store/cache.py
+++ b/autogpt_platform/backend/backend/api/features/store/cache.py
@@ -1,5 +1,3 @@
-from typing import Literal
-
 from backend.util.cache import cached

 from . import db as store_db
@@ -23,7 +21,7 @@ def clear_all_caches():
 async def _get_cached_store_agents(
    featured: bool,
    creator: str | None,
-    sorted_by: Literal["rating", "runs", "name", "updated_at"] | None,
+    sorted_by: store_db.StoreAgentsSortOptions | None,
    search_query: str | None,
    category: str | None,
    page: int,
@@ -57,7 +55,7 @@ async def _get_cached_agent_details(
 async def _get_cached_store_creators(
    featured: bool,
    search_query: str | None,
-    sorted_by: Literal["agent_rating", "agent_runs", "num_agents"] | None,
+    sorted_by: store_db.StoreCreatorsSortOptions | None,
    page: int,
    page_size: int,
 ):
@@ -75,4 +73,4 @@ async def _get_cached_store_creators(
 @cached(maxsize=100, ttl_seconds=300, shared_cache=True)
 async def _get_cached_creator_details(username: str):
    """Cached helper to get creator details."""
-    return await store_db.get_store_creator_details(username=username.lower())
+    return await store_db.get_store_creator(username=username.lower())
--- a/autogpt_platform/backend/backend/api/features/store/content_handlers.py
+++ b/autogpt_platform/backend/backend/api/features/store/content_handlers.py
@@ -5,19 +5,40 @@ Pluggable system for different content sources (store agents, blocks, docs).
 Each handler knows how to fetch and process its content type for embedding.
 """

+from __future__ import annotations
+
+import asyncio
+import functools
+import itertools
 import logging
 from abc import ABC, abstractmethod
 from dataclasses import dataclass
 from pathlib import Path
-from typing import Any
+from typing import TYPE_CHECKING, Any, get_args, get_origin

 from prisma.enums import ContentType

+from backend.blocks import get_blocks
+from backend.blocks.llm import LlmModel
 from backend.data.db import query_raw_with_schema
+from backend.util.text import split_camelcase
+
+if TYPE_CHECKING:
+    from backend.blocks._base import AnyBlockSchema

 logger = logging.getLogger(__name__)


+def _contains_type(annotation: Any, target: type) -> bool:
+    """Check if an annotation is or contains the target type (handles Optional/Union/Annotated)."""
+    if annotation is target:
+        return True
+    origin = get_origin(annotation)
+    if origin is None:
+        return False
+    return any(_contains_type(arg, target) for arg in get_args(annotation))
+
+
 @dataclass
 class ContentItem:
    """Represents a piece of content to be embedded."""
@@ -143,6 +164,28 @@ class StoreAgentHandler(ContentHandler):
        }


+@functools.lru_cache(maxsize=1)
+def _get_enabled_blocks() -> dict[str, AnyBlockSchema]:
+    """Return ``{block_id: block_instance}`` for all enabled, instantiable blocks.
+
+    Disabled blocks are skipped silently; blocks that fail to instantiate are
+    skipped with a warning log, so callers never need their own try/except loop.
+
+    Results are cached for the process lifetime via ``lru_cache`` because
+    blocks are registered at import time and never change while running.
+    """
+    enabled: dict[str, AnyBlockSchema] = {}
+    for block_id, block_cls in get_blocks().items():
+        try:
+            instance = block_cls()
+        except Exception as e:
+            logger.warning(f"Skipping block {block_id}: init failed: {e}")
+            continue
+        if not instance.disabled:
+            enabled[block_id] = instance
+    return enabled
+
+
 class BlockHandler(ContentHandler):
    """Handler for block definitions (Python classes)."""

@@ -152,16 +195,14 @@ class BlockHandler(ContentHandler):

    async def get_missing_items(self, batch_size: int) -> list[ContentItem]:
        """Fetch blocks without embeddings."""
-        from backend.data.block import get_blocks
-
-        # Get all available blocks
-        all_blocks = get_blocks()
-
-        # Check which ones have embeddings
-        if not all_blocks:
+        # to_thread keeps the first (heavy) call off the event loop.  On
+        # subsequent calls the lru_cache makes this a dict lookup, so the
+        # thread-pool overhead is negligible compared to the DB queries below.
+        enabled = await asyncio.to_thread(_get_enabled_blocks)
+        if not enabled:
            return []

-        block_ids = list(all_blocks.keys())
+        block_ids = list(enabled.keys())

        # Query for existing embeddings
        placeholders = ",".join([f"${i+1}" for i in range(len(block_ids))])
@@ -176,57 +217,53 @@ class BlockHandler(ContentHandler):
        )

        existing_ids = {row["contentId"] for row in existing_result}
-        missing_blocks = [
-            (block_id, block_cls)
-            for block_id, block_cls in all_blocks.items()
-            if block_id not in existing_ids
-        ]

-        # Convert to ContentItem
+        # Convert to ContentItem — disabled filtering already done by
+        # _get_enabled_blocks so batch_size won't be exhausted by disabled blocks.
+        missing = ((bid, b) for bid, b in enabled.items() if bid not in existing_ids)
        items = []
-        for block_id, block_cls in missing_blocks[:batch_size]:
+        for block_id, block in itertools.islice(missing, batch_size):
            try:
-                block_instance = block_cls()
-
-                # Skip disabled blocks - they shouldn't be indexed
-                if block_instance.disabled:
-                    continue
-
                # Build searchable text from block metadata
-                parts = []
-                if hasattr(block_instance, "name") and block_instance.name:
-                    parts.append(block_instance.name)
-                if (
-                    hasattr(block_instance, "description")
-                    and block_instance.description
-                ):
-                    parts.append(block_instance.description)
-                if hasattr(block_instance, "categories") and block_instance.categories:
-                    # Convert BlockCategory enum to strings
-                    parts.append(
-                        " ".join(str(cat.value) for cat in block_instance.categories)
+                if not block.name:
+                    logger.warning(
+                        f"Block {block_id} has no name — using block_id as fallback"
                    )
+                display_name = split_camelcase(block.name) if block.name else ""
+                parts = []
+                if display_name:
+                    parts.append(display_name)
+                if block.description:
+                    parts.append(block.description)
+                if block.categories:
+                    parts.append(" ".join(str(cat.value) for cat in block.categories))

-                # Add input/output schema info
-                if hasattr(block_instance, "input_schema"):
-                    schema = block_instance.input_schema
-                    if hasattr(schema, "model_json_schema"):
-                        schema_dict = schema.model_json_schema()
-                        if "properties" in schema_dict:
-                            for prop_name, prop_info in schema_dict[
-                                "properties"
-                            ].items():
-                                if "description" in prop_info:
-                                    parts.append(
-                                        f"{prop_name}: {prop_info['description']}"
-                                    )
+                # Add input schema field descriptions
+                parts += [
+                    f"{field_name}: {field_info.description}"
+                    for field_name, field_info in block.input_schema.model_fields.items()
+                    if field_info.description
+                ]

                searchable_text = " ".join(parts)

-                # Convert categories set of enums to list of strings for JSON serialization
-                categories = getattr(block_instance, "categories", set())
                categories_list = (
-                    [cat.value for cat in categories] if categories else []
+                    [cat.value for cat in block.categories] if block.categories else []
+                )
+
+                # Extract provider names from credentials fields
+                credentials_info = block.input_schema.get_credentials_fields_info()
+                is_integration = len(credentials_info) > 0
+                provider_names = [
+                    provider.value.lower()
+                    for info in credentials_info.values()
+                    for provider in info.provider
+                ]
+
+                # Check if block has LlmModel field in input schema
+                has_llm_model_field = any(
+                    _contains_type(field.annotation, LlmModel)
+                    for field in block.input_schema.model_fields.values()
                )

                items.append(
@@ -235,10 +272,13 @@ class BlockHandler(ContentHandler):
                        content_type=ContentType.BLOCK,
                        searchable_text=searchable_text,
                        metadata={
-                            "name": getattr(block_instance, "name", ""),
+                            "name": display_name or block.name or block_id,
                            "categories": categories_list,
+                            "providers": provider_names,
+                            "has_llm_model_field": has_llm_model_field,
+                            "is_integration": is_integration,
                        },
-                        user_id=None,  # Blocks are public
+                        user_id=None,
                    )
                )
            except Exception as e:
@@ -249,22 +289,13 @@ class BlockHandler(ContentHandler):

    async def get_stats(self) -> dict[str, int]:
        """Get statistics about block embedding coverage."""
-        from backend.data.block import get_blocks
-
-        all_blocks = get_blocks()
-
-        # Filter out disabled blocks - they're not indexed
-        enabled_block_ids = [
-            block_id
-            for block_id, block_cls in all_blocks.items()
-            if not block_cls().disabled
-        ]
-        total_blocks = len(enabled_block_ids)
+        enabled = await asyncio.to_thread(_get_enabled_blocks)
+        total_blocks = len(enabled)

        if total_blocks == 0:
            return {"total": 0, "with_embeddings": 0, "without_embeddings": 0}

-        block_ids = enabled_block_ids
+        block_ids = list(enabled.keys())
        placeholders = ",".join([f"${i+1}" for i in range(len(block_ids))])

        embedded_result = await query_raw_with_schema(
--- a/autogpt_platform/backend/backend/api/features/store/content_handlers_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/content_handlers_test.py
@@ -1,7 +1,5 @@
 """
-E2E tests for content handlers (blocks, store agents, documentation).
-
-Tests the full flow: discovering content → generating embeddings → storing.
+Tests for content handlers (blocks, store agents, documentation).
 """

 from pathlib import Path
@@ -15,15 +13,103 @@ from backend.api.features.store.content_handlers import (
    BlockHandler,
    DocumentationHandler,
    StoreAgentHandler,
+    _get_enabled_blocks,
 )


+@pytest.fixture(autouse=True)
+def _clear_block_cache():
+    """Clear the lru_cache on _get_enabled_blocks before each test."""
+    _get_enabled_blocks.cache_clear()
+    yield
+    _get_enabled_blocks.cache_clear()
+
+
+# ---------------------------------------------------------------------------
+# Helper to build a mock block class that returns a pre-configured instance
+# ---------------------------------------------------------------------------
+
+
+def _make_block_class(
+    *,
+    name: str = "Block",
+    description: str = "",
+    disabled: bool = False,
+    categories: list[MagicMock] | None = None,
+    fields: dict[str, str] | None = None,
+    raise_on_init: Exception | None = None,
+) -> MagicMock:
+    cls = MagicMock()
+    if raise_on_init is not None:
+        cls.side_effect = raise_on_init
+        return cls
+    inst = MagicMock()
+    inst.name = name
+    inst.disabled = disabled
+    inst.description = description
+    inst.categories = categories or []
+    field_mocks = {
+        fname: MagicMock(description=fdesc) for fname, fdesc in (fields or {}).items()
+    }
+    inst.input_schema.model_fields = field_mocks
+    inst.input_schema.get_credentials_fields_info.return_value = {}
+    cls.return_value = inst
+    return cls
+
+
+# ---------------------------------------------------------------------------
+# _get_enabled_blocks
+# ---------------------------------------------------------------------------
+
+
+def test_get_enabled_blocks_filters_disabled():
+    """Disabled blocks are excluded."""
+    blocks = {
+        "enabled": _make_block_class(name="E", disabled=False),
+        "disabled": _make_block_class(name="D", disabled=True),
+    }
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        result = _get_enabled_blocks()
+    assert list(result.keys()) == ["enabled"]
+
+
+def test_get_enabled_blocks_skips_broken():
+    """Blocks that raise on init are skipped without crashing."""
+    blocks = {
+        "good": _make_block_class(name="Good"),
+        "bad": _make_block_class(raise_on_init=RuntimeError("boom")),
+    }
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        result = _get_enabled_blocks()
+    assert list(result.keys()) == ["good"]
+
+
+def test_get_enabled_blocks_cached():
+    """_get_enabled_blocks() calls get_blocks() only once across multiple calls."""
+    blocks = {"b1": _make_block_class(name="B1")}
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ) as mock_get_blocks:
+        result1 = _get_enabled_blocks()
+        result2 = _get_enabled_blocks()
+    assert result1 is result2
+    mock_get_blocks.assert_called_once()
+
+
+# ---------------------------------------------------------------------------
+# StoreAgentHandler
+# ---------------------------------------------------------------------------
+
+
 @pytest.mark.asyncio(loop_scope="session")
 async def test_store_agent_handler_get_missing_items(mocker):
    """Test StoreAgentHandler fetches approved agents without embeddings."""
    handler = StoreAgentHandler()

-    # Mock database query
    mock_missing = [
        {
            "id": "agent-1",
@@ -54,9 +140,7 @@ async def test_store_agent_handler_get_stats(mocker):
    """Test StoreAgentHandler returns correct stats."""
    handler = StoreAgentHandler()

-    # Mock approved count query
    mock_approved = [{"count": 50}]
-    # Mock embedded count query
    mock_embedded = [{"count": 30}]

    with patch(
@@ -70,73 +154,130 @@ async def test_store_agent_handler_get_stats(mocker):
        assert stats["without_embeddings"] == 20


+# ---------------------------------------------------------------------------
+# BlockHandler
+# ---------------------------------------------------------------------------
+
+
 @pytest.mark.asyncio(loop_scope="session")
-async def test_block_handler_get_missing_items(mocker):
+async def test_block_handler_get_missing_items():
    """Test BlockHandler discovers blocks without embeddings."""
    handler = BlockHandler()

-    # Mock get_blocks to return test blocks
-    mock_block_class = MagicMock()
-    mock_block_instance = MagicMock()
-    mock_block_instance.name = "Calculator Block"
-    mock_block_instance.description = "Performs calculations"
-    mock_block_instance.categories = [MagicMock(value="MATH")]
-    mock_block_instance.disabled = False
-    mock_block_instance.input_schema.model_json_schema.return_value = {
-        "properties": {"expression": {"description": "Math expression to evaluate"}}
+    blocks = {
+        "block-uuid-1": _make_block_class(
+            name="CalculatorBlock",
+            description="Performs calculations",
+            categories=[MagicMock(value="MATH")],
+            fields={"expression": "Math expression to evaluate"},
+        ),
    }
-    mock_block_class.return_value = mock_block_instance
-
-    mock_blocks = {"block-uuid-1": mock_block_class}
-
-    # Mock existing embeddings query (no embeddings exist)
-    mock_existing = []

    with patch(
-        "backend.data.block.get_blocks",
-        return_value=mock_blocks,
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
    ):
        with patch(
            "backend.api.features.store.content_handlers.query_raw_with_schema",
-            return_value=mock_existing,
+            return_value=[],
        ):
            items = await handler.get_missing_items(batch_size=10)

            assert len(items) == 1
            assert items[0].content_id == "block-uuid-1"
            assert items[0].content_type == ContentType.BLOCK
+            # CamelCase should be split in searchable text and metadata name
            assert "Calculator Block" in items[0].searchable_text
            assert "Performs calculations" in items[0].searchable_text
            assert "MATH" in items[0].searchable_text
            assert "expression: Math expression" in items[0].searchable_text
+            assert items[0].metadata["name"] == "Calculator Block"
            assert items[0].user_id is None


 @pytest.mark.asyncio(loop_scope="session")
-async def test_block_handler_get_stats(mocker):
+async def test_block_handler_get_missing_items_splits_camelcase():
+    """CamelCase block names are split for better search indexing."""
+    handler = BlockHandler()
+
+    blocks = {
+        "ai-block": _make_block_class(name="AITextGeneratorBlock"),
+    }
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=[],
+        ):
+            items = await handler.get_missing_items(batch_size=10)
+
+            assert len(items) == 1
+            assert "AI Text Generator Block" in items[0].searchable_text
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_get_missing_items_batch_size_zero():
+    """batch_size=0 returns an empty list; the DB is still queried to find missing IDs."""
+    handler = BlockHandler()
+
+    blocks = {"b1": _make_block_class(name="B1")}
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=[],
+        ) as mock_query:
+            items = await handler.get_missing_items(batch_size=0)
+            assert items == []
+            # DB query is still issued to learn which blocks lack embeddings;
+            # the empty result comes from itertools.islice limiting to 0 items.
+            mock_query.assert_called_once()
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_disabled_dont_exhaust_batch():
+    """Disabled blocks don't consume batch budget, so enabled blocks get indexed."""
+    handler = BlockHandler()
+
+    # 5 disabled + 3 enabled, batch_size=2
+    blocks = {
+        **{
+            f"dis-{i}": _make_block_class(name=f"D{i}", disabled=True) for i in range(5)
+        },
+        **{f"en-{i}": _make_block_class(name=f"E{i}") for i in range(3)},
+    }
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=[],
+        ):
+            items = await handler.get_missing_items(batch_size=2)
+
+            assert len(items) == 2
+            assert all(item.content_id.startswith("en-") for item in items)
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_get_stats():
    """Test BlockHandler returns correct stats."""
    handler = BlockHandler()

-    # Mock get_blocks - each block class returns an instance with disabled=False
-    def make_mock_block_class():
-        mock_class = MagicMock()
-        mock_instance = MagicMock()
-        mock_instance.disabled = False
-        mock_class.return_value = mock_instance
-        return mock_class
-
-    mock_blocks = {
-        "block-1": make_mock_block_class(),
-        "block-2": make_mock_block_class(),
-        "block-3": make_mock_block_class(),
+    blocks = {
+        "block-1": _make_block_class(name="B1"),
+        "block-2": _make_block_class(name="B2"),
+        "block-3": _make_block_class(name="B3"),
    }

-    # Mock embedded count query (2 blocks have embeddings)
    mock_embedded = [{"count": 2}]

    with patch(
-        "backend.data.block.get_blocks",
-        return_value=mock_blocks,
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
    ):
        with patch(
            "backend.api.features.store.content_handlers.query_raw_with_schema",
@@ -149,21 +290,123 @@ async def test_block_handler_get_stats(mocker):
            assert stats["without_embeddings"] == 1


+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_get_stats_skips_broken():
+    """get_stats skips broken blocks instead of crashing."""
+    handler = BlockHandler()
+
+    blocks = {
+        "good": _make_block_class(name="Good"),
+        "bad": _make_block_class(raise_on_init=RuntimeError("boom")),
+    }
+
+    mock_embedded = [{"count": 1}]
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=mock_embedded,
+        ):
+            stats = await handler.get_stats()
+
+            assert stats["total"] == 1  # only the good block
+            assert stats["with_embeddings"] == 1
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_handles_none_name():
+    """When block.name is None the fallback display name logic is used."""
+    handler = BlockHandler()
+
+    blocks = {
+        "none-name-block": _make_block_class(
+            name="placeholder",  # will be overridden to None below
+            description="A block with no name",
+        ),
+    }
+    # Override the name to None after construction so _make_block_class
+    # doesn't interfere with the mock wiring.
+    blocks["none-name-block"].return_value.name = None
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=[],
+        ):
+            items = await handler.get_missing_items(batch_size=10)
+
+            assert len(items) == 1
+            # display_name should be "" because block.name is None
+            # searchable_text should still contain the description
+            assert "A block with no name" in items[0].searchable_text
+            # metadata["name"] falls back to block_id when both display_name
+            # and block.name are falsy, ensuring it is always a non-empty string.
+            assert items[0].metadata["name"] == "none-name-block"
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_handles_empty_attributes():
+    """Test BlockHandler handles blocks with empty/falsy attribute values."""
+    handler = BlockHandler()
+
+    blocks = {"block-minimal": _make_block_class(name="Minimal Block")}
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=[],
+        ):
+            items = await handler.get_missing_items(batch_size=10)
+
+            assert len(items) == 1
+            assert items[0].searchable_text == "Minimal Block"
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_skips_failed_blocks():
+    """Test BlockHandler skips blocks that fail to instantiate."""
+    handler = BlockHandler()
+
+    blocks = {
+        "good-block": _make_block_class(name="Good Block", description="Works fine"),
+        "bad-block": _make_block_class(raise_on_init=Exception("Instantiation failed")),
+    }
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=[],
+        ):
+            items = await handler.get_missing_items(batch_size=10)
+
+            assert len(items) == 1
+            assert items[0].content_id == "good-block"
+
+
+# ---------------------------------------------------------------------------
+# DocumentationHandler
+# ---------------------------------------------------------------------------
+
+
 @pytest.mark.asyncio(loop_scope="session")
 async def test_documentation_handler_get_missing_items(tmp_path, mocker):
    """Test DocumentationHandler discovers docs without embeddings."""
    handler = DocumentationHandler()

-    # Create temporary docs directory with test files
    docs_root = tmp_path / "docs"
    docs_root.mkdir()
-
    (docs_root / "guide.md").write_text("# Getting Started\n\nThis is a guide.")
    (docs_root / "api.mdx").write_text("# API Reference\n\nAPI documentation.")

-    # Mock _get_docs_root to return temp dir
    with patch.object(handler, "_get_docs_root", return_value=docs_root):
-        # Mock existing embeddings query (no embeddings exist)
        with patch(
            "backend.api.features.store.content_handlers.query_raw_with_schema",
            return_value=[],
@@ -172,7 +415,6 @@ async def test_documentation_handler_get_missing_items(tmp_path, mocker):

            assert len(items) == 2

-            # Check guide.md (content_id format: doc_path::section_index)
            guide_item = next(
                (item for item in items if item.content_id == "guide.md::0"), None
            )
@@ -183,7 +425,6 @@ async def test_documentation_handler_get_missing_items(tmp_path, mocker):
            assert guide_item.metadata["doc_title"] == "Getting Started"
            assert guide_item.user_id is None

-            # Check api.mdx (content_id format: doc_path::section_index)
            api_item = next(
                (item for item in items if item.content_id == "api.mdx::0"), None
            )
@@ -196,14 +437,12 @@ async def test_documentation_handler_get_stats(tmp_path, mocker):
    """Test DocumentationHandler returns correct stats."""
    handler = DocumentationHandler()

-    # Create temporary docs directory
    docs_root = tmp_path / "docs"
    docs_root.mkdir()
    (docs_root / "doc1.md").write_text("# Doc 1")
    (docs_root / "doc2.md").write_text("# Doc 2")
    (docs_root / "doc3.mdx").write_text("# Doc 3")

-    # Mock embedded count query (1 doc has embedding)
    mock_embedded = [{"count": 1}]

    with patch.object(handler, "_get_docs_root", return_value=docs_root):
@@ -223,13 +462,11 @@ async def test_documentation_handler_title_extraction(tmp_path):
    """Test DocumentationHandler extracts title from markdown heading."""
    handler = DocumentationHandler()

-    # Test with heading
    doc_with_heading = tmp_path / "with_heading.md"
    doc_with_heading.write_text("# My Title\n\nContent here")
    title = handler._extract_doc_title(doc_with_heading)
    assert title == "My Title"

-    # Test without heading
    doc_without_heading = tmp_path / "no-heading.md"
    doc_without_heading.write_text("Just content, no heading")
    title = handler._extract_doc_title(doc_without_heading)
@@ -241,7 +478,6 @@ async def test_documentation_handler_markdown_chunking(tmp_path):
    """Test DocumentationHandler chunks markdown by headings."""
    handler = DocumentationHandler()

-    # Test document with multiple sections
    doc_with_sections = tmp_path / "sections.md"
    doc_with_sections.write_text(
        "# Document Title\n\n"
@@ -253,7 +489,6 @@ async def test_documentation_handler_markdown_chunking(tmp_path):
    )
    sections = handler._chunk_markdown_by_headings(doc_with_sections)

-    # Should have 3 sections: intro (with doc title), section one, section two
    assert len(sections) == 3
    assert sections[0].title == "Document Title"
    assert sections[0].index == 0
@@ -267,7 +502,6 @@ async def test_documentation_handler_markdown_chunking(tmp_path):
    assert sections[2].index == 2
    assert "Content for section two" in sections[2].content

-    # Test document without headings
    doc_no_sections = tmp_path / "no-sections.md"
    doc_no_sections.write_text("Just plain content without any headings.")
    sections = handler._chunk_markdown_by_headings(doc_no_sections)
@@ -281,21 +515,39 @@ async def test_documentation_handler_section_content_ids():
    """Test DocumentationHandler creates and parses section content IDs."""
    handler = DocumentationHandler()

-    # Test making content ID
    content_id = handler._make_section_content_id("docs/guide.md", 2)
    assert content_id == "docs/guide.md::2"

-    # Test parsing content ID
    doc_path, section_index = handler._parse_section_content_id("docs/guide.md::2")
    assert doc_path == "docs/guide.md"
    assert section_index == 2

-    # Test parsing legacy format (no section index)
    doc_path, section_index = handler._parse_section_content_id("docs/old-format.md")
    assert doc_path == "docs/old-format.md"
    assert section_index == 0


+@pytest.mark.asyncio(loop_scope="session")
+async def test_documentation_handler_missing_docs_directory():
+    """Test DocumentationHandler handles missing docs directory gracefully."""
+    handler = DocumentationHandler()
+
+    fake_path = Path("/nonexistent/docs")
+    with patch.object(handler, "_get_docs_root", return_value=fake_path):
+        items = await handler.get_missing_items(batch_size=10)
+        assert items == []
+
+        stats = await handler.get_stats()
+        assert stats["total"] == 0
+        assert stats["with_embeddings"] == 0
+        assert stats["without_embeddings"] == 0
+
+
+# ---------------------------------------------------------------------------
+# Registry
+# ---------------------------------------------------------------------------
+
+
 @pytest.mark.asyncio(loop_scope="session")
 async def test_content_handlers_registry():
    """Test all content types are registered."""
@@ -306,86 +558,3 @@ async def test_content_handlers_registry():
    assert isinstance(CONTENT_HANDLERS[ContentType.STORE_AGENT], StoreAgentHandler)
    assert isinstance(CONTENT_HANDLERS[ContentType.BLOCK], BlockHandler)
    assert isinstance(CONTENT_HANDLERS[ContentType.DOCUMENTATION], DocumentationHandler)
-
-
-@pytest.mark.asyncio(loop_scope="session")
-async def test_block_handler_handles_missing_attributes():
-    """Test BlockHandler gracefully handles blocks with missing attributes."""
-    handler = BlockHandler()
-
-    # Mock block with minimal attributes
-    mock_block_class = MagicMock()
-    mock_block_instance = MagicMock()
-    mock_block_instance.name = "Minimal Block"
-    mock_block_instance.disabled = False
-    # No description, categories, or schema
-    del mock_block_instance.description
-    del mock_block_instance.categories
-    del mock_block_instance.input_schema
-    mock_block_class.return_value = mock_block_instance
-
-    mock_blocks = {"block-minimal": mock_block_class}
-
-    with patch(
-        "backend.data.block.get_blocks",
-        return_value=mock_blocks,
-    ):
-        with patch(
-            "backend.api.features.store.content_handlers.query_raw_with_schema",
-            return_value=[],
-        ):
-            items = await handler.get_missing_items(batch_size=10)
-
-            assert len(items) == 1
-            assert items[0].searchable_text == "Minimal Block"
-
-
-@pytest.mark.asyncio(loop_scope="session")
-async def test_block_handler_skips_failed_blocks():
-    """Test BlockHandler skips blocks that fail to instantiate."""
-    handler = BlockHandler()
-
-    # Mock one good block and one bad block
-    good_block = MagicMock()
-    good_instance = MagicMock()
-    good_instance.name = "Good Block"
-    good_instance.description = "Works fine"
-    good_instance.categories = []
-    good_instance.disabled = False
-    good_block.return_value = good_instance
-
-    bad_block = MagicMock()
-    bad_block.side_effect = Exception("Instantiation failed")
-
-    mock_blocks = {"good-block": good_block, "bad-block": bad_block}
-
-    with patch(
-        "backend.data.block.get_blocks",
-        return_value=mock_blocks,
-    ):
-        with patch(
-            "backend.api.features.store.content_handlers.query_raw_with_schema",
-            return_value=[],
-        ):
-            items = await handler.get_missing_items(batch_size=10)
-
-            # Should only get the good block
-            assert len(items) == 1
-            assert items[0].content_id == "good-block"
-
-
-@pytest.mark.asyncio(loop_scope="session")
-async def test_documentation_handler_missing_docs_directory():
-    """Test DocumentationHandler handles missing docs directory gracefully."""
-    handler = DocumentationHandler()
-
-    # Mock _get_docs_root to return non-existent path
-    fake_path = Path("/nonexistent/docs")
-    with patch.object(handler, "_get_docs_root", return_value=fake_path):
-        items = await handler.get_missing_items(batch_size=10)
-        assert items == []
-
-        stats = await handler.get_stats()
-        assert stats["total"] == 0
-        assert stats["with_embeddings"] == 0
-        assert stats["without_embeddings"] == 0
--- a/autogpt_platform/backend/backend/api/features/store/db.py
+++ b/autogpt_platform/backend/backend/api/features/store/db.py
--- a/autogpt_platform/backend/backend/api/features/store/db_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/db_test.py
@@ -26,7 +26,7 @@ async def test_get_store_agents(mocker):
    mock_agents = [
        prisma.models.StoreAgent(
            listing_id="test-id",
-            storeListingVersionId="version123",
+            listing_version_id="version123",
            slug="test-agent",
            agent_name="Test Agent",
            agent_video=None,
@@ -40,11 +40,11 @@ async def test_get_store_agents(mocker):
            runs=10,
            rating=4.5,
            versions=["1.0"],
-            agentGraphVersions=["1"],
-            agentGraphId="test-graph-id",
+            graph_id="test-graph-id",
+            graph_versions=["1"],
            updated_at=datetime.now(),
            is_available=False,
-            useForOnboarding=False,
+            use_for_onboarding=False,
        )
    ]

@@ -68,10 +68,10 @@ async def test_get_store_agents(mocker):

 @pytest.mark.asyncio(loop_scope="session")
 async def test_get_store_agent_details(mocker):
-    # Mock data
+    # Mock data - StoreAgent view already contains the active version data
    mock_agent = prisma.models.StoreAgent(
        listing_id="test-id",
-        storeListingVersionId="version123",
+        listing_version_id="version123",
        slug="test-agent",
        agent_name="Test Agent",
        agent_video="video.mp4",
@@ -85,102 +85,38 @@ async def test_get_store_agent_details(mocker):
        runs=10,
        rating=4.5,
        versions=["1.0"],
-        agentGraphVersions=["1"],
-        agentGraphId="test-graph-id",
-        updated_at=datetime.now(),
-        is_available=False,
-        useForOnboarding=False,
-    )
-
-    # Mock active version agent (what we want to return for active version)
-    mock_active_agent = prisma.models.StoreAgent(
-        listing_id="test-id",
-        storeListingVersionId="active-version-id",
-        slug="test-agent",
-        agent_name="Test Agent Active",
-        agent_video="active_video.mp4",
-        agent_image=["active_image.jpg"],
-        featured=False,
-        creator_username="creator",
-        creator_avatar="avatar.jpg",
-        sub_heading="Test heading active",
-        description="Test description active",
-        categories=["test"],
-        runs=15,
-        rating=4.8,
-        versions=["1.0", "2.0"],
-        agentGraphVersions=["1", "2"],
-        agentGraphId="test-graph-id-active",
+        graph_id="test-graph-id",
+        graph_versions=["1"],
        updated_at=datetime.now(),
        is_available=True,
-        useForOnboarding=False,
+        use_for_onboarding=False,
    )

-    # Create a mock StoreListing result
-    mock_store_listing = mocker.MagicMock()
-    mock_store_listing.activeVersionId = "active-version-id"
-    mock_store_listing.hasApprovedVersion = True
-    mock_store_listing.ActiveVersion = mocker.MagicMock()
-    mock_store_listing.ActiveVersion.recommendedScheduleCron = None
-
-    # Mock StoreAgent prisma call - need to handle multiple calls
+    # Mock StoreAgent prisma call
    mock_store_agent = mocker.patch("prisma.models.StoreAgent.prisma")
-
-    # Set up side_effect to return different results for different calls
-    def mock_find_first_side_effect(*args, **kwargs):
-        where_clause = kwargs.get("where", {})
-        if "storeListingVersionId" in where_clause:
-            # Second call for active version
-            return mock_active_agent
-        else:
-            # First call for initial lookup
-            return mock_agent
-
-    mock_store_agent.return_value.find_first = mocker.AsyncMock(
-        side_effect=mock_find_first_side_effect
-    )
-
-    # Mock Profile prisma call
-    mock_profile = mocker.MagicMock()
-    mock_profile.userId = "user-id-123"
-    mock_profile_db = mocker.patch("prisma.models.Profile.prisma")
-    mock_profile_db.return_value.find_first = mocker.AsyncMock(
-        return_value=mock_profile
-    )
-
-    # Mock StoreListing prisma call
-    mock_store_listing_db = mocker.patch("prisma.models.StoreListing.prisma")
-    mock_store_listing_db.return_value.find_first = mocker.AsyncMock(
-        return_value=mock_store_listing
-    )
+    mock_store_agent.return_value.find_first = mocker.AsyncMock(return_value=mock_agent)

    # Call function
    result = await db.get_store_agent_details("creator", "test-agent")

-    # Verify results - should use active version data
+    # Verify results - constructed from the StoreAgent view
    assert result.slug == "test-agent"
-    assert result.agent_name == "Test Agent Active"  # From active version
-    assert result.active_version_id == "active-version-id"
+    assert result.agent_name == "Test Agent"
+    assert result.active_version_id == "version123"
    assert result.has_approved_version is True
-    assert (
-        result.store_listing_version_id == "active-version-id"
-    )  # Should be active version ID
+    assert result.store_listing_version_id == "version123"
+    assert result.graph_id == "test-graph-id"
+    assert result.runs == 10
+    assert result.rating == 4.5

-    # Verify mocks called correctly - now expecting 2 calls
-    assert mock_store_agent.return_value.find_first.call_count == 2
-
-    # Check the specific calls
-    calls = mock_store_agent.return_value.find_first.call_args_list
-    assert calls[0] == mocker.call(
+    # Verify single StoreAgent lookup
+    mock_store_agent.return_value.find_first.assert_called_once_with(
        where={"creator_username": "creator", "slug": "test-agent"}
    )
-    assert calls[1] == mocker.call(where={"storeListingVersionId": "active-version-id"})
-
-    mock_store_listing_db.return_value.find_first.assert_called_once()


 @pytest.mark.asyncio(loop_scope="session")
-async def test_get_store_creator_details(mocker):
+async def test_get_store_creator(mocker):
    # Mock data
    mock_creator_data = prisma.models.Creator(
        name="Test Creator",
@@ -202,7 +138,7 @@ async def test_get_store_creator_details(mocker):
    mock_creator.return_value.find_unique.return_value = mock_creator_data

    # Call function
-    result = await db.get_store_creator_details("creator")
+    result = await db.get_store_creator("creator")

    # Verify results
    assert result.username == "creator"
@@ -218,61 +154,110 @@ async def test_get_store_creator_details(mocker):

@pytest.mark.asyncio(loop_scope="session")
 async def test_create_store_submission(mocker):
-    # Mock data
+    now = datetime.now()
+
+    # Mock agent graph (with no pending submissions) and user with profile
+    mock_profile = prisma.models.Profile(
+        id="profile-id",
+        userId="user-id",
+        name="Test User",
+        username="testuser",
+        description="Test",
+        isFeatured=False,
+        links=[],
+        createdAt=now,
+        updatedAt=now,
+    )
+    mock_user = prisma.models.User(
+        id="user-id",
+        email="test@example.com",
+        createdAt=now,
+        updatedAt=now,
+        Profile=[mock_profile],
+        emailVerified=True,
+        metadata="{}",  # type: ignore[reportArgumentType]
+        integrations="",
+        maxEmailsPerDay=1,
+        notifyOnAgentRun=True,
+        notifyOnZeroBalance=True,
+        notifyOnLowBalance=True,
+        notifyOnBlockExecutionFailed=True,
+        notifyOnContinuousAgentError=True,
+        notifyOnDailySummary=True,
+        notifyOnWeeklySummary=True,
+        notifyOnMonthlySummary=True,
+        notifyOnAgentApproved=True,
+        notifyOnAgentRejected=True,
+        timezone="Europe/Delft",
+    )
    mock_agent = prisma.models.AgentGraph(
        id="agent-id",
        version=1,
        userId="user-id",
-        createdAt=datetime.now(),
+        createdAt=now,
        isActive=True,
+        StoreListingVersions=[],
+        User=mock_user,
    )

-    mock_listing = prisma.models.StoreListing(
+    # Mock the created StoreListingVersion (returned by create)
+    mock_store_listing_obj = prisma.models.StoreListing(
        id="listing-id",
-        createdAt=datetime.now(),
-        updatedAt=datetime.now(),
+        createdAt=now,
+        updatedAt=now,
        isDeleted=False,
        hasApprovedVersion=False,
        slug="test-agent",
        agentGraphId="agent-id",
-        agentGraphVersion=1,
        owningUserId="user-id",
-        Versions=[
-            prisma.models.StoreListingVersion(
-                id="version-id",
-                agentGraphId="agent-id",
-                agentGraphVersion=1,
-                name="Test Agent",
-                description="Test description",
-                createdAt=datetime.now(),
-                updatedAt=datetime.now(),
-                subHeading="Test heading",
-                imageUrls=["image.jpg"],
-                categories=["test"],
-                isFeatured=False,
-                isDeleted=False,
-                version=1,
-                storeListingId="listing-id",
-                submissionStatus=prisma.enums.SubmissionStatus.PENDING,
-                isAvailable=True,
-            )
-        ],
        useForOnboarding=False,
    )
+    mock_version = prisma.models.StoreListingVersion(
+        id="version-id",
+        agentGraphId="agent-id",
+        agentGraphVersion=1,
+        name="Test Agent",
+        description="Test description",
+        createdAt=now,
+        updatedAt=now,
+        subHeading="",
+        imageUrls=[],
+        categories=[],
+        isFeatured=False,
+        isDeleted=False,
+        version=1,
+        storeListingId="listing-id",
+        submissionStatus=prisma.enums.SubmissionStatus.PENDING,
+        isAvailable=True,
+        submittedAt=now,
+        StoreListing=mock_store_listing_obj,
+    )

    # Mock prisma calls
    mock_agent_graph = mocker.patch("prisma.models.AgentGraph.prisma")
    mock_agent_graph.return_value.find_first = mocker.AsyncMock(return_value=mock_agent)

-    mock_store_listing = mocker.patch("prisma.models.StoreListing.prisma")
-    mock_store_listing.return_value.find_first = mocker.AsyncMock(return_value=None)
-    mock_store_listing.return_value.create = mocker.AsyncMock(return_value=mock_listing)
+    # Mock transaction context manager
+    mock_tx = mocker.MagicMock()
+    mocker.patch(
+        "backend.api.features.store.db.transaction",
+        return_value=mocker.AsyncMock(
+            __aenter__=mocker.AsyncMock(return_value=mock_tx),
+            __aexit__=mocker.AsyncMock(return_value=False),
+        ),
+    )
+
+    mock_sl = mocker.patch("prisma.models.StoreListing.prisma")
+    mock_sl.return_value.find_unique = mocker.AsyncMock(return_value=None)
+
+    mock_slv = mocker.patch("prisma.models.StoreListingVersion.prisma")
+    mock_slv.return_value.create = mocker.AsyncMock(return_value=mock_version)

    # Call function
    result = await db.create_store_submission(
        user_id="user-id",
-        agent_id="agent-id",
-        agent_version=1,
+        graph_id="agent-id",
+        graph_version=1,
        slug="test-agent",
        name="Test Agent",
        description="Test description",
@@ -281,11 +266,11 @@ async def test_create_store_submission(mocker):
    # Verify results
    assert result.name == "Test Agent"
    assert result.description == "Test description"
-    assert result.store_listing_version_id == "version-id"
+    assert result.listing_version_id == "version-id"

    # Verify mocks called correctly
    mock_agent_graph.return_value.find_first.assert_called_once()
-    mock_store_listing.return_value.create.assert_called_once()
+    mock_slv.return_value.create.assert_called_once()


@pytest.mark.asyncio(loop_scope="session")
@@ -318,7 +303,6 @@ async def test_update_profile(mocker):
        description="Test description",
        links=["link1"],
        avatar_url="avatar.jpg",
-        is_featured=False,
    )

    # Call function
@@ -389,7 +373,7 @@ async def test_get_store_agents_with_search_and_filters_parameterized():
        creators=["creator1'; DROP TABLE Users; --", "creator2"],
        category="AI'; DELETE FROM StoreAgent; --",
        featured=True,
-        sorted_by="rating",
+        sorted_by=db.StoreAgentsSortOptions.RATING,
        page=1,
        page_size=20,
    )
--- a/autogpt_platform/backend/backend/api/features/store/embeddings.py
+++ b/autogpt_platform/backend/backend/api/features/store/embeddings.py
@@ -15,6 +15,7 @@ from prisma.enums import ContentType
 from tiktoken import encoding_for_model

 from backend.api.features.store.content_handlers import CONTENT_HANDLERS
+from backend.blocks import get_blocks
 from backend.data.db import execute_raw_with_schema, query_raw_with_schema
 from backend.util.clients import get_openai_client
 from backend.util.json import dumps
@@ -662,8 +663,6 @@ async def cleanup_orphaned_embeddings() -> dict[str, Any]:
                )
                current_ids = {row["id"] for row in valid_agents}
            elif content_type == ContentType.BLOCK:
-                from backend.data.block import get_blocks
-
                current_ids = set(get_blocks().keys())
            elif content_type == ContentType.DOCUMENTATION:
                # Use DocumentationHandler to get section-based content IDs
--- a/autogpt_platform/backend/backend/api/features/store/exceptions.py
+++ b/autogpt_platform/backend/backend/api/features/store/exceptions.py
@@ -57,12 +57,6 @@ class StoreError(ValueError):
    pass


-class AgentNotFoundError(NotFoundError):
-    """Raised when an agent is not found"""
-
-    pass
-
-
 class CreatorNotFoundError(NotFoundError):
    """Raised when a creator is not found"""

--- a/autogpt_platform/backend/backend/api/features/store/hybrid_search.py
+++ b/autogpt_platform/backend/backend/api/features/store/hybrid_search.py
@@ -31,12 +31,10 @@ logger = logging.getLogger(__name__)


 def tokenize(text: str) -> list[str]:
-    """Simple tokenizer for BM25 - lowercase and split on non-alphanumeric."""
+    """Tokenize text for BM25."""
    if not text:
        return []
-    # Lowercase and split on non-alphanumeric characters
-    tokens = re.findall(r"\b\w+\b", text.lower())
-    return tokens
+    return re.findall(r"\b\w+\b", text.lower())


 def bm25_rerank(
@@ -568,7 +566,7 @@ async def hybrid_search(
            SELECT uce."contentId" as "storeListingVersionId"
            FROM {{schema_prefix}}"UnifiedContentEmbedding" uce
            INNER JOIN {{schema_prefix}}"StoreAgent" sa
-                ON uce."contentId" = sa."storeListingVersionId"
+                ON uce."contentId" = sa.listing_version_id
            WHERE uce."contentType" = 'STORE_AGENT'::{{schema_prefix}}"ContentType"
            AND uce."userId" IS NULL
            AND uce.search @@ plainto_tsquery('english', {query_param})
@@ -582,7 +580,7 @@ async def hybrid_search(
                SELECT uce."contentId", uce.embedding
                FROM {{schema_prefix}}"UnifiedContentEmbedding" uce
                INNER JOIN {{schema_prefix}}"StoreAgent" sa
-                    ON uce."contentId" = sa."storeListingVersionId"
+                    ON uce."contentId" = sa.listing_version_id
                WHERE uce."contentType" = 'STORE_AGENT'::{{schema_prefix}}"ContentType"
                AND uce."userId" IS NULL
                AND {where_clause}
@@ -605,7 +603,7 @@ async def hybrid_search(
                sa.featured,
                sa.is_available,
                sa.updated_at,
-                sa."agentGraphId",
+                sa.graph_id,
                -- Searchable text for BM25 reranking
                COALESCE(sa.agent_name, '') || ' ' || COALESCE(sa.sub_heading, '') || ' ' || COALESCE(sa.description, '') as searchable_text,
                -- Semantic score
@@ -627,9 +625,9 @@ async def hybrid_search(
                sa.runs as popularity_raw
            FROM candidates c
            INNER JOIN {{schema_prefix}}"StoreAgent" sa
-                ON c."storeListingVersionId" = sa."storeListingVersionId"
+                ON c."storeListingVersionId" = sa.listing_version_id
            INNER JOIN {{schema_prefix}}"UnifiedContentEmbedding" uce
-                ON sa."storeListingVersionId" = uce."contentId"
+                ON sa.listing_version_id = uce."contentId"
                AND uce."contentType" = 'STORE_AGENT'::{{schema_prefix}}"ContentType"
        ),
        max_vals AS (
@@ -665,7 +663,7 @@ async def hybrid_search(
                featured,
                is_available,
                updated_at,
-                "agentGraphId",
+                graph_id,
                searchable_text,
                semantic_score,
                lexical_score,
--- a/autogpt_platform/backend/backend/api/features/store/hybrid_search_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/hybrid_search_test.py
@@ -14,9 +14,27 @@ from backend.api.features.store.hybrid_search import (
    HybridSearchWeights,
    UnifiedSearchWeights,
    hybrid_search,
+    tokenize,
    unified_hybrid_search,
 )

+# ---------------------------------------------------------------------------
+# tokenize (BM25)
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.parametrize(
+    "input_text, expected",
+    [
+        ("AITextGeneratorBlock", ["aitextgeneratorblock"]),
+        ("hello world", ["hello", "world"]),
+        ("", []),
+        ("HTTPRequest", ["httprequest"]),
+    ],
+)
+def test_tokenize(input_text: str, expected: list[str]):
+    assert tokenize(input_text) == expected
+

@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration