Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT into feat/ask-question-tool

feat(platform): add generic ask_question copilot tool (#12647)
### Why / What / How

**Why:** The copilot can ask clarifying questions in plain text, but that text gets collapsed into hidden "reasoning" UI when the LLM also calls tools in the same turn. This makes clarification questions invisible to users. The existing `ClarificationNeededResponse` model and `ClarificationQuestionsCard` UI component were built for this purpose but had no tool wiring them up.

**What:** Adds a generic `ask_question` tool that produces a visible, interactive clarification card instead of collapsible plain text. Unlike the agent-generation-specific `clarify_agent_request` proposed in #12601, this tool is workflow-agnostic — usable for agent building, editing, troubleshooting, or any flow needing user input.

**How:**
- Backend: New `AskQuestionTool` reuses existing `ClarificationNeededResponse` model. Registered in `TOOL_REGISTRY` and `ToolName` permissions.
- Frontend: New `AskQuestion/` renderer reuses `ClarificationQuestionsCard` from CreateAgent. Registered in `CUSTOM_TOOL_TYPES` (prevents collapse into reasoning) and `MessagePartRenderer`.
- Guide: `agent_generation_guide.md` updated to reference `ask_question` for the clarification step.

### Changes 🏗️

- **`copilot/tools/ask_question.py`** — New generic tool: takes `question`, optional `options[]` and `keyword`, returns `ClarificationNeededResponse`
- **`copilot/tools/__init__.py`** — Register `ask_question` in `TOOL_REGISTRY`
- **`copilot/permissions.py`** — Add `ask_question` to `ToolName` literal
- **`copilot/sdk/agent_generation_guide.md`** — Reference `ask_question` tool in clarification step
- **`ChatMessagesContainer/helpers.ts`** — Add `tool-ask_question` to `CUSTOM_TOOL_TYPES`
- **`MessagePartRenderer.tsx`** — Add switch case for `tool-ask_question`
- **`AskQuestion/AskQuestion.tsx`** — Renderer reusing `ClarificationQuestionsCard`
- **`AskQuestion/helpers.ts`** — Output parsing and animation text

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Backend format + pyright pass
  - [x] Frontend lint + types pass
  - [x] Pre-commit hooks pass
  - [ ] Manual test: copilot uses `ask_question` and card renders visibly (not collapsed)
2026-04-08 03:00:28 -04:00 · 2026-04-02 15:55:52 +02:00 · 2026-04-02 12:56:48 +00:00 · 2026-04-02 14:34:58 +02:00 · 2026-04-02 14:18:58 +02:00 · 2026-04-02 14:14:57 +02:00
896 changed files with 115546 additions and 36234 deletions
--- a/.agents/skills
+++ b/.agents/skills
@@ -0,0 +1 @@
+../.claude/skills
--- a/.claude/skills/open-pr/SKILL.md
+++ b/.claude/skills/open-pr/SKILL.md
@@ -0,0 +1,106 @@
+---
+name: open-pr
+description: Open a pull request with proper PR template, test coverage, and review workflow. Guides agents through creating a PR that follows repo conventions, ensures existing behaviors aren't broken, covers new behaviors with tests, and handles review via bot when local testing isn't possible. TRIGGER when user asks to "open a PR", "create a PR", "make a PR", "submit a PR", "open pull request", "push and create PR", or any variation of opening/submitting a pull request.
+user-invocable: true
+args: "[base-branch] — optional target branch (defaults to dev)."
+metadata:
+  author: autogpt-team
+  version: "1.0.0"
+---
+
+# Open a Pull Request
+
+## Step 1: Pre-flight checks
+
+Before opening the PR:
+
+1. Ensure all changes are committed
+2. Ensure the branch is pushed to the remote (`git push -u origin <branch>`)
+3. Run linters/formatters across the whole repo (not just changed files) and commit any fixes
+
+## Step 2: Test coverage
+
+**This is critical.** Before opening the PR, verify:
+
+### Existing behavior is not broken
+- Identify which modules/components your changes touch
+- Run the existing test suites for those areas
+- If tests fail, fix them before opening the PR — do not open a PR with known regressions
+
+### New behavior has test coverage
+- Every new feature, endpoint, or behavior change needs tests
+- If you added a new block, add tests for that block
+- If you changed API behavior, add or update API tests
+- If you changed frontend behavior, verify it doesn't break existing flows
+
+If you cannot run the full test suite locally, note which tests you ran and which you couldn't in the test plan.
+
+## Step 3: Create the PR using the repo template
+
+Read the canonical PR template at `.github/PULL_REQUEST_TEMPLATE.md` and use it **verbatim** as your PR body:
+
+1. Read the template: `cat .github/PULL_REQUEST_TEMPLATE.md`
+2. Preserve the exact section titles and formatting, including:
+   - `### Why / What / How`
+   - `### Changes 🏗️`
+   - `### Checklist 📋`
+3. Replace HTML comment prompts (`<!-- ... -->`) with actual content; do not leave them in
+4. **Do not pre-check boxes** — leave all checkboxes as `- [ ]` until each step is actually completed
+5. Do not alter the template structure, rename sections, or remove any checklist items
+
+**PR title must use conventional commit format** (e.g., `feat(backend): add new block`, `fix(frontend): resolve routing bug`, `dx(skills): update PR workflow`). See CLAUDE.md for the full list of scopes.
+
+Use `gh pr create` with the base branch (defaults to `dev` if no `[base-branch]` was provided). Use `--body-file` to avoid shell interpretation of backticks and special characters:
+
+```bash
+BASE_BRANCH="${BASE_BRANCH:-dev}"
+PR_BODY=$(mktemp)
+cat > "$PR_BODY" << 'PREOF'
+<filled-in template from .github/PULL_REQUEST_TEMPLATE.md>
+PREOF
+gh pr create --base "$BASE_BRANCH" --title "<type>(scope): short description" --body-file "$PR_BODY"
+rm "$PR_BODY"
+```
+
+## Step 4: Review workflow
+
+### If you have a workspace that allows testing (docker, running backend, etc.)
+- Run `/pr-test` to do E2E manual testing of the PR using docker compose, agent-browser, and API calls. This is the most thorough way to validate your changes before review.
+- After testing, run `/pr-review` to self-review the PR for correctness, security, code quality, and testing gaps before requesting human review.
+
+### If you do NOT have a workspace that allows testing
+This is common for agents running in worktrees without a full stack. In this case:
+
+1. Run `/pr-review` locally to catch obvious issues before pushing
+2. **Comment `/review` on the PR** after creating it to trigger the review bot
+3. **Poll for the review** rather than blindly waiting — check for new review comments every 30 seconds using `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate` and the GraphQL inline threads query. The bot typically responds within 30 minutes, but polling lets the agent react as soon as it arrives.
+4. Do NOT proceed or merge until the bot review comes back
+5. Address any issues the bot raises — use `/pr-address` which has a full polling loop with CI + comment tracking
+
+```bash
+# After creating the PR:
+PR_NUMBER=$(gh pr view --json number -q .number)
+gh pr comment "$PR_NUMBER" --body "/review"
+# Then use /pr-address to poll for and address the review when it arrives
+```
+
+## Step 5: Address review feedback
+
+Once the review bot or human reviewers leave comments:
+- Run `/pr-address` to address review comments. It will loop until CI is green and all comments are resolved.
+- Do not merge without human approval.
+
+## Related skills
+
+| Skill | When to use |
+|---|---|
+| `/pr-test` | E2E testing with docker compose, agent-browser, API calls — use when you have a running workspace |
+| `/pr-review` | Review for correctness, security, code quality — use before requesting human review |
+| `/pr-address` | Address reviewer comments and loop until CI green — use after reviews come in |
+
+## Step 6: Post-creation
+
+After the PR is created and review is triggered:
+- Share the PR URL with the user
+- If waiting on the review bot, let the user know the expected wait time (~30 min)
+- Do not merge without human approval
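
The title rule from Step 3 of the open-pr skill above can be checked locally before `gh pr create` runs. This is a minimal sketch and not part of the skill itself. The function name `is_conventional_title` is hypothetical, and the list of accepted types is an assumption: only `feat`, `fix`, and `dx` appear in the skill's examples, and CLAUDE.md remains the authoritative list of types and scopes.

```bash
# Hypothetical helper (not part of the skill): succeeds when the title looks
# like "<type>(<scope>): <description>".
# ASSUMPTION: the accepted types below are illustrative; CLAUDE.md is the
# authoritative source for types and scopes.
is_conventional_title() {
  printf '%s\n' "$1" |
    grep -Eq '^(feat|fix|dx|docs|refactor|test|chore|ci)\([a-z0-9/_-]+\): .+'
}

# Usage: guard the PR creation step.
if is_conventional_title "feat(backend): add new block"; then
  echo "title ok"
else
  echo "title does not follow conventional commit format" >&2
fi
```

Running the check before `gh pr create` surfaces a malformed title early, rather than after the PR is already open.
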
--- a/.claude/skills/pr-address/SKILL.md
+++ b/.claude/skills/pr-address/SKILL.md
@@ -0,0 +1,210 @@
+---
+name: pr-address
+description: Address PR review comments and loop until CI green and all comments resolved. TRIGGER when user asks to address comments, fix PR feedback, respond to reviewers, or babysit/monitor a PR.
+user-invocable: true
+argument-hint: "[PR number or URL] — if omitted, finds PR for current branch."
+metadata:
+  author: autogpt-team
+  version: "1.0.0"
+---
+
+# PR Address
+
+## Find the PR
+
+```bash
+gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT
+gh pr view {N}
+```
+
+## Read the PR description
+
+Understand the **Why / What / How** before addressing comments — you need context to make good fixes:
+
+```bash
+gh pr view {N} --json body --jq '.body'
+```
+
+## Fetch comments (all sources)
+
+### 1. Inline review threads — GraphQL (primary source of actionable items)
+
+Use GraphQL to fetch inline threads. It natively exposes `isResolved`, returns threads already grouped with all replies, and paginates via cursor — no manual thread reconstruction needed.
+
+```bash
+gh api graphql -f query='
+{
+  repository(owner: "Significant-Gravitas", name: "AutoGPT") {
+    pullRequest(number: {N}) {
+      reviewThreads(first: 100) {
+        pageInfo { hasNextPage endCursor }
+        nodes {
+          id
+          isResolved
+          path
+          comments(last: 1) {
+            nodes { databaseId body author { login } createdAt }
+          }
+        }
+      }
+    }
+  }
+}'
+```
+
+If `pageInfo.hasNextPage` is true, fetch subsequent pages by adding `after: "<endCursor>"` to `reviewThreads(first: 100, after: "...")` and repeat until `hasNextPage` is false.
+
+**Filter to unresolved threads only** — skip any thread where `isResolved: true`. `comments(last: 1)` returns the most recent comment in the thread — act on that; it reflects the reviewer's final ask. Use the thread `id` (Relay global ID) to track threads across polls.
+
+### 2. Top-level reviews — REST (MUST paginate)
+
+```bash
+gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate
+```
+
+**CRITICAL — always `--paginate`.** Reviews default to 30 per page. PRs can have 80–170+ reviews (mostly empty resolution events). Without pagination you miss reviews past position 30 — including `autogpt-reviewer`'s structured review which is typically posted after several CI runs and sits well beyond the first page.
+
+Two things to extract:
+- **Overall state**: look for `CHANGES_REQUESTED` or `APPROVED` reviews.
+- **Actionable feedback**: non-empty bodies only. Empty-body reviews are thread-resolution events — they indicate progress but have no feedback to act on.
+
+**Where each reviewer posts:**
+- `autogpt-reviewer` — posts detailed structured reviews ("Blockers", "Should Fix", "Nice to Have") as **top-level reviews**. Not present on every PR. Address ALL items.
+- `sentry[bot]` — posts bug predictions as **inline threads**. Fix real bugs, explain false positives.
+- `coderabbitai[bot]` — posts summaries as **top-level reviews** AND actionable items as **inline threads**. Address actionable items.
+- Human reviewers — can post in any source. Address ALL non-empty feedback.
+
+### 3. PR conversation comments — REST
+
+```bash
+gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments --paginate
+```
+
+Mostly contains: bot summaries (`coderabbitai[bot]`), CI/conflict detection (`github-actions[bot]`), and author status updates. Scan for non-empty messages from non-bot human reviewers that aren't the PR author — those are the ones that need a response.
+
+## For each unaddressed comment
+
+Address comments **one at a time**: fix → commit → push → inline reply → next.
+
+1. Read the referenced code, make the fix (or reply explaining why it's not needed)
+2. Commit and push the fix
+3. Reply **inline** (not as a new top-level comment) referencing the fixing commit — this is what resolves the conversation for bot reviewers (coderabbitai, sentry):
+
+| Comment type | How to reply |
+|---|---|
+| Inline review (`pulls/{N}/comments`) | `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID}/replies -f body="🤖 Fixed in <commit-sha>: <description>"` |
+| Conversation (`issues/{N}/comments`) | `gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments -f body="🤖 Fixed in <commit-sha>: <description>"` |
+
+## Format and commit
+
+After fixing, format the changed code:
+
+- **Backend** (from `autogpt_platform/backend/`): `poetry run format`
+- **Frontend** (from `autogpt_platform/frontend/`): `pnpm format && pnpm lint && pnpm types`
+
+If API routes changed, regenerate the frontend client:
+```bash
+cd autogpt_platform/backend && poetry run rest &
+REST_PID=$!
+trap "kill $REST_PID 2>/dev/null" EXIT
+WAIT=0; until curl -sf http://localhost:8006/health > /dev/null 2>&1; do sleep 1; WAIT=$((WAIT+1)); [ $WAIT -ge 60 ] && echo "Timed out" && exit 1; done
+cd ../frontend && pnpm generate:api:force
+kill $REST_PID 2>/dev/null; trap - EXIT
+```
+Never manually edit files in `src/app/api/__generated__/`.
+
+Then commit and **push immediately** — never batch commits without pushing. Each fix should be visible on GitHub right away so CI can start and reviewers can see progress.
+
+**Never push empty commits** (`git commit --allow-empty`) to re-trigger CI or bot checks. When a check fails, investigate the root cause (unchecked PR checklist, unaddressed review comments, code issues) and fix those directly. Empty commits add noise to git history.
+
+For backend commits in worktrees: `poetry run git commit` (pre-commit hooks).
+
+## The loop
+
+```text
+address comments → format → commit → push
+→ wait for CI (while addressing new comments) → fix failures → push
+→ re-check comments after CI settles
+→ repeat until: all comments addressed AND CI green AND no new comments arriving
+```
+
+### Polling for CI + new comments
+
+After pushing, poll for **both** CI status and new comments in a single loop. Do not use `gh pr checks --watch` — it blocks the tool and prevents reacting to new comments while CI is running.
+
+> **Note:** `gh pr checks --watch --fail-fast` is tempting but it blocks the entire Bash tool call, meaning the agent cannot check for or address new comments until CI fully completes. Always poll manually instead.
+
+**Polling loop — repeat every 30 seconds:**
+
+1. Check CI status:
+```bash
+gh pr checks {N} --repo Significant-Gravitas/AutoGPT --json bucket,name,link
+```
+   Parse the results: if every check has `bucket` of `"pass"` or `"skipping"`, CI is green. If any has `"fail"`, CI has failed. Otherwise CI is still pending.
+
+2. Check for merge conflicts:
+```bash
+gh pr view {N} --repo Significant-Gravitas/AutoGPT --json mergeable --jq '.mergeable'
+```
+   If the result is `"CONFLICTING"`, the PR has a merge conflict — see "Resolving merge conflicts" below. If `"UNKNOWN"`, GitHub is still computing mergeability — wait and re-check next poll.
+
+3. Check for new/changed comments (all three sources):
+
+   **Inline threads** — re-run the GraphQL query from "Fetch comments". For each unresolved thread, record `{thread_id, last_comment_databaseId}` as your baseline. On each poll, action is needed if:
+   - A new thread `id` appears that wasn't in the baseline (new thread), OR
+   - An existing thread's `last_comment_databaseId` has changed (new reply on existing thread)
+
+   **Conversation comments:**
+   ```bash
+   gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments --paginate
+   ```
+   Compare total count and newest `id` against baseline. Filter to non-empty, non-bot, non-author-update messages.
+
+   **Top-level reviews:**
+   ```bash
+   gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate
+   ```
+   Watch for new non-empty reviews (`CHANGES_REQUESTED` or `COMMENTED` with body). Compare total count and newest `id` against baseline.
+
+4. **React in this precedence order (first match wins):**
+
+| What happened | Action |
+|---|---|
+| Merge conflict detected | See "Resolving merge conflicts" below. |
+| Mergeability is `UNKNOWN` | GitHub is still computing mergeability. Sleep 30 seconds, then restart polling from the top. |
+| New comments detected | Address them (fix → commit → push → reply). After pushing, re-fetch all comments to update your baseline, then restart this polling loop from the top (new commits invalidate CI status). |
+| CI failed (bucket == "fail") | Get failed check links: `gh pr checks {N} --repo Significant-Gravitas/AutoGPT --json bucket,link --jq '.[] \| select(.bucket == "fail") \| .link'`. Extract run ID from link (format: `.../actions/runs/<run-id>/job/...`), read logs with `gh run view <run-id> --repo Significant-Gravitas/AutoGPT --log-failed`. Fix → commit → push → restart polling. |
+| CI green + no new comments | **Do not exit immediately.** Bots (coderabbitai, sentry) often post reviews shortly after CI settles. Continue polling for **2 more cycles (60s)** after CI goes green. Only exit after 2 consecutive green+quiet polls. |
+| CI pending + no new comments | Sleep 30 seconds, then poll again. |
+
+**The loop ends when:** CI fully green + all comments addressed + **2 consecutive polls with no new comments after CI settled.**
+
+### Resolving merge conflicts
+
+1. Identify the PR's target branch and remote:
+```bash
+gh pr view {N} --repo Significant-Gravitas/AutoGPT --json baseRefName --jq '.baseRefName'
+git remote -v   # find the remote pointing to Significant-Gravitas/AutoGPT (typically 'upstream' in forks, 'origin' for direct contributors)
+```
+
+2. Pull the latest base branch with a 3-way merge:
+```bash
+git pull {base-remote} {base-branch} --no-rebase
+```
+
+3. Resolve conflicting files, then verify no conflict markers remain:
+```bash
+if grep -R -n -E '^(<<<<<<<|=======|>>>>>>>)' <conflicted-files>; then
+  echo "Unresolved conflict markers found — resolve before proceeding."
+  exit 1
+fi
+```
+
+4. Stage and push:
+```bash
+git add <conflicted-files>
+git commit -m "Resolve merge conflicts with {base-branch}"
+git push
+```
+
+5. Restart the polling loop from the top — new commits reset CI status.
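
The CI classification rule from the pr-address polling loop above (green, failed, or pending) can be sketched as a small shell function. This is an illustration under stated assumptions, not code from the skill: `ci_state` is a hypothetical name, and it expects the `bucket` values to have already been extracted from the `gh pr checks ... --json bucket` output, one value per argument.

```bash
# Hypothetical helper: classify CI from a list of bucket values, one per
# argument, following the rule stated in the polling loop:
#   any "fail"               -> failed
#   all "pass" or "skipping" -> green
#   anything else            -> pending
# With no arguments it reports "green"; a caller may prefer to treat an
# empty check list as pending.
ci_state() {
  state="green"
  for bucket in "$@"; do
    case "$bucket" in
      fail) echo "failed"; return 0 ;;
      pass|skipping) ;;
      *) state="pending" ;;
    esac
  done
  echo "$state"
}

# Usage with literal values; in the real loop the arguments would come from
# the gh output.
ci_state pass skipping pass     # prints "green"
ci_state pass pending           # prints "pending"
ci_state pass pending fail      # prints "failed"
```

Checking for `fail` first mirrors the precedence table: a failure is actionable immediately, even while other checks are still running.
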
--- a/.claude/skills/pr-review/SKILL.md
+++ b/.claude/skills/pr-review/SKILL.md
@@ -0,0 +1,86 @@
+---
+name: pr-review
+description: Review a PR for correctness, security, code quality, and testing issues. TRIGGER when user asks to review a PR, check PR quality, or give feedback on a PR.
+user-invocable: true
+args: "[PR number or URL] — if omitted, finds PR for current branch."
+metadata:
+  author: autogpt-team
+  version: "1.0.0"
+---
+
+# PR Review
+
+## Find the PR
+
+```bash
+gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT
+gh pr view {N}
+```
+
+## Read the PR description
+
+Before reading code, understand the **why**, **what**, and **how** from the PR description:
+
+```bash
+gh pr view {N} --json body --jq '.body'
+```
+
+Every PR should have a Why / What / How structure. If any of these are missing, note it as feedback.
+
+## Read the diff
+
+```bash
+gh pr diff {N}
+```
+
+## Fetch existing review comments
+
+Before posting anything, fetch existing inline comments to avoid duplicates:
+
+```bash
+gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments --paginate
+gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate
+```
+
+## What to check
+
+**Description quality:** Does the PR description cover Why (motivation/problem), What (summary of changes), and How (approach/implementation details)? If any are missing, request them — you can't judge the approach without understanding the problem and intent.
+
+**Correctness:** logic errors, off-by-one, missing edge cases, race conditions (TOCTOU in file access, credit charging), error handling gaps, async correctness (missing `await`, unclosed resources).
+
+**Security:** input validation at boundaries, no injection (command, XSS, SQL), secrets not logged, file paths sanitized (`os.path.basename()` in error messages).
+
+**Code quality:** apply rules from backend/frontend CLAUDE.md files.
+
+**Architecture:** DRY, single responsibility, modular functions. `Security()` vs `Depends()` for FastAPI auth. `data:` for SSE events, `: comment` for heartbeats. `transaction=True` for Redis pipelines.
+
+**Testing:** edge cases covered, colocated `*_test.py` (backend) / `__tests__/` (frontend), mocks target where symbol is **used** not defined, `AsyncMock` for async.
+
+## Output format
+
+Every comment **must** be prefixed with `🤖` and a criticality badge:
+
+| Tier | Badge | Meaning |
+|---|---|---|
+| Blocker | `🔴 **Blocker**` | Must fix before merge |
+| Should Fix | `🟠 **Should Fix**` | Important improvement |
+| Nice to Have | `🟡 **Nice to Have**` | Minor suggestion |
+| Nit | `🔵 **Nit**` | Style / wording |
+
+Example: `🤖 🔴 **Blocker**: Missing error handling for X — suggest wrapping in try/except.`
+
+## Post inline comments
+
+For each finding, post an inline comment on the PR (do not just write a local report):
+
+```bash
+# Get the latest commit SHA for the PR
+COMMIT_SHA=$(gh api repos/Significant-Gravitas/AutoGPT/pulls/{N} --jq '.head.sha')
+
+# Post an inline comment on a specific file/line
+gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments \
+  -f body="🤖 🔴 **Blocker**: <description>" \
+  -f commit_id="$COMMIT_SHA" \
+  -f path="<file path>" \
+  -F line=<line number>
+```
--- a/.claude/skills/pr-test/SKILL.md
+++ b/.claude/skills/pr-test/SKILL.md
@@ -0,0 +1,754 @@
+---
+name: pr-test
+description: "E2E manual testing of PRs/branches using docker compose, agent-browser, and API calls. TRIGGER when user asks to manually test a PR, test a feature end-to-end, or run integration tests against a running system."
+user-invocable: true
+argument-hint: "[worktree path or PR number] — tests the PR in the given worktree. Optional flags: --fix (auto-fix issues found)"
+metadata:
+  author: autogpt-team
+  version: "2.0.0"
+---
+
+# Manual E2E Test
+
+Test a PR/branch end-to-end by building the full platform, interacting via browser and API, capturing screenshots, and reporting results.
+
+## Critical Requirements
+
+These are NON-NEGOTIABLE. Every test run MUST satisfy ALL of the following:
+
+### 1. Screenshots at Every Step
+- Take a screenshot at EVERY significant test step — not just at the end
+- Every test scenario MUST have at least one BEFORE and one AFTER screenshot
+- Name screenshots sequentially: `{NN}-{action}-{state}.png` (e.g., `01-credits-before.png`, `02-credits-after.png`)
+- If a screenshot is missing for a scenario, the test is INCOMPLETE — go back and take it
+
+### 2. Screenshots MUST Be Posted to PR
+- Push ALL screenshots to a temp branch `test-screenshots/pr-{N}`
+- Post a PR comment with ALL screenshots embedded inline using GitHub raw URLs
+- This is NOT optional — every test run MUST end with a PR comment containing screenshots
+- If screenshot upload fails, retry. If it still fails, list failed files and require manual drag-and-drop/paste attachment in the PR comment
+
+### 3. State Verification with Before/After Evidence
+- For EVERY state-changing operation (API call, user action), capture the state BEFORE and AFTER
+- Log the actual API response values (e.g., `credits_before=100, credits_after=95`)
+- Screenshot MUST show the relevant UI state change
+- Compare expected vs actual values explicitly — do not just eyeball it
+
+### 4. Negative Test Cases Are Mandatory
+- Test at least ONE negative case per feature (e.g., insufficient credits, invalid input, unauthorized access)
+- Verify error messages are user-friendly and accurate
+- Verify the system state did NOT change after a rejected operation
+
+### 5. Test Report Must Include Full Evidence
+Each test scenario in the report MUST have:
+- **Steps**: What was done (exact commands or UI actions)
+- **Expected**: What should happen
+- **Actual**: What actually happened
+- **API Evidence**: Before/after API response values for state-changing operations
+- **Screenshot Evidence**: Before/after screenshots with explanations
+
+## State Manipulation for Realistic Testing
+
+When testing features that depend on specific states (rate limits, credits, quotas):
+
+1. **Use Redis CLI to set counters directly:**
+   ```bash
+   # Find the Redis container
+   REDIS_CONTAINER=$(docker ps --format '{{.Names}}' | grep redis | head -1)
+   # Set a key with expiry
+   docker exec $REDIS_CONTAINER redis-cli SET key value EX ttl
+   # Example: Set rate limit counter to near-limit
+   docker exec $REDIS_CONTAINER redis-cli SET "rate_limit:user:test@test.com" 99 EX 3600
+   # Example: Check current value
+   docker exec $REDIS_CONTAINER redis-cli GET "rate_limit:user:test@test.com"
+   ```
+
+2. **Use API calls to check before/after state:**
+   ```bash
+   # BEFORE: Record current state
+   BEFORE=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/credits | jq '.credits')
+   echo "Credits BEFORE: $BEFORE"
+
+   # Perform the action...
+
+   # AFTER: Record new state and compare
+   AFTER=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/credits | jq '.credits')
+   echo "Credits AFTER: $AFTER"
+   echo "Delta: $(( BEFORE - AFTER ))"
+   ```
+
+3. **Take screenshots BEFORE and AFTER state changes** — the UI must reflect the backend state change
+
+4. **Never rely on mocked/injected browser state** — always use real backend state. Do NOT use `agent-browser eval` to fake UI state. The backend must be the source of truth.
+
+5. **Use direct DB queries when needed:**
+   ```bash
+   # Query via Supabase's PostgREST or docker exec into the DB
+   docker exec supabase-db psql -U supabase_admin -d postgres -c "SELECT credits FROM user_credits WHERE user_id = '...';"
+   ```
+
+6. **After every API test, verify the state change actually persisted:**
+   ```bash
+   # Example: After a credits purchase, verify DB matches API
+   API_CREDITS=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/credits | jq '.credits')
+   DB_CREDITS=$(docker exec supabase-db psql -U supabase_admin -d postgres -t -c "SELECT credits FROM user_credits WHERE user_id = '...';" | tr -d ' ')
+   [ "$API_CREDITS" = "$DB_CREDITS" ] && echo "CONSISTENT" || echo "MISMATCH: API=$API_CREDITS DB=$DB_CREDITS"
+   ```
+
+## Arguments
+
+- `$ARGUMENTS` — worktree path (e.g. `$REPO_ROOT`) or PR number
+- If `--fix` flag is present, auto-fix bugs found and push fixes (like pr-address loop)
+
+## Step 0: Resolve the target
+
+```bash
+# If argument is a PR number, find its worktree
+gh pr view {N} --json headRefName --jq '.headRefName'
+# If argument is a path, use it directly
+```
+
+Determine:
+- `REPO_ROOT` — the root repo directory: `git -C "$WORKTREE_PATH" worktree list | head -1 | awk '{print $1}'` (or `git rev-parse --show-toplevel` if not a worktree)
+- `WORKTREE_PATH` — the worktree directory
+- `PLATFORM_DIR` — `$WORKTREE_PATH/autogpt_platform`
+- `BACKEND_DIR` — `$PLATFORM_DIR/backend`
+- `FRONTEND_DIR` — `$PLATFORM_DIR/frontend`
+- `PR_NUMBER` — the PR number (from `gh pr list --head $(git branch --show-current)`)
+- `PR_TITLE` — the PR title, slugified (e.g. "Add copilot permissions" → "add-copilot-permissions")
+- `RESULTS_DIR` — `$REPO_ROOT/test-results/PR-{PR_NUMBER}-{slugified-title}`
+
+Create the results directory:
+```bash
+PR_NUMBER=$(cd $WORKTREE_PATH && gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT --json number --jq '.[0].number')
+PR_TITLE=$(cd $WORKTREE_PATH && gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT --json title --jq '.[0].title' | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9]/-/g' | sed 's/--*/-/g' | sed 's/^-//;s/-$//' | head -c 50)
+RESULTS_DIR="$REPO_ROOT/test-results/PR-${PR_NUMBER}-${PR_TITLE}"
+mkdir -p "$RESULTS_DIR"
+```
+
+**Test user credentials** (for logging into the UI or verifying results manually):
+- Email: `test@test.com`
+- Password: `testtest123`
+
+## Step 1: Understand the PR
+
+Before testing, understand what changed:
+
+```bash
+cd $WORKTREE_PATH
+
+# Read PR description to understand the WHY
+gh pr view {N} --json body --jq '.body'
+
+git log --oneline dev..HEAD | head -20
+git diff dev --stat
+```
+
+Read the PR description (Why / What / How) and changed files to understand:
+1. **Why** does this PR exist? What problem does it solve?
+2. **What** feature/fix does this PR implement?
+3. **How** does it work? What's the approach?
+4. What components are affected? (backend, frontend, copilot, executor, etc.)
+5. What are the key user-facing behaviors to test?
+
+## Step 2: Write test scenarios
+
+Based on the PR analysis, write a test plan to `$RESULTS_DIR/test-plan.md`:
+
+```markdown
+# Test Plan: PR #{N} — {title}
+
+## Scenarios
+1. [Scenario name] — [what to verify]
+2. ...
+
+## API Tests (if applicable)
+1. [Endpoint] — [expected behavior]
+   - Before state: [what to check before]
+   - After state: [what to verify changed]
+
+## UI Tests (if applicable)
+1. [Page/component] — [interaction to test]
+   - Screenshot before: [what to capture]
+   - Screenshot after: [what to capture]
+
+## Negative Tests (REQUIRED — at least one per feature)
+1. [What should NOT happen] — [how to trigger it]
+   - Expected error: [what error message/code]
+   - State unchanged: [what to verify did NOT change]
+```
+
+**Be critical** — include edge cases, error paths, and security checks. Every scenario MUST specify what screenshots to take and what state to verify.
+
+## Step 3: Environment setup
+
+### 3a. Copy .env files from the root worktree
+
+The root worktree (`$REPO_ROOT`) has the canonical `.env` files with all API keys. Copy them to the target worktree:
+
+```bash
+# CRITICAL: .env files are NOT checked into git. They must be copied manually.
+cp $REPO_ROOT/autogpt_platform/.env $PLATFORM_DIR/.env
+cp $REPO_ROOT/autogpt_platform/backend/.env $BACKEND_DIR/.env
+cp $REPO_ROOT/autogpt_platform/frontend/.env $FRONTEND_DIR/.env
+```
+
+### 3b. Configure copilot authentication
+
+The copilot needs an LLM API to function. Two approaches (try subscription first):
+
+#### Option 1: Subscription mode (preferred — uses your Claude Max/Pro subscription)
+
+The `claude_agent_sdk` Python package **bundles its own Claude CLI binary** — no need to install `@anthropic-ai/claude-code` via npm. The backend auto-provisions credentials from environment variables on startup.
+
+Run the helper script to extract tokens from your host and auto-update `backend/.env` (works on macOS, Linux, and Windows/WSL):
+
+```bash
+# Extracts OAuth tokens and writes CLAUDE_CODE_OAUTH_TOKEN + CLAUDE_CODE_REFRESH_TOKEN into .env
+bash $BACKEND_DIR/scripts/refresh_claude_token.sh --env-file $BACKEND_DIR/.env
+```
+
+**How it works:** The script reads the OAuth token from:
+- **macOS**: system keychain (`"Claude Code-credentials"`)
+- **Linux/WSL**: `~/.claude/.credentials.json`
+- **Windows**: `%APPDATA%/claude/.credentials.json`
+
+It sets `CLAUDE_CODE_OAUTH_TOKEN`, `CLAUDE_CODE_REFRESH_TOKEN`, and `CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true` in the `.env` file. On container startup, the backend auto-provisions `~/.claude/.credentials.json` inside the container from these env vars. The SDK's bundled CLI then authenticates using that file. No `claude login`, no npm install needed.
+
+**Note:** The OAuth token expires (~24h). If copilot returns auth errors, re-run the script and restart: `$BACKEND_DIR/scripts/refresh_claude_token.sh --env-file $BACKEND_DIR/.env && docker compose up -d copilot_executor`
+
+#### Option 2: OpenRouter API key mode (fallback)
+
+If subscription mode doesn't work, switch to API key mode using OpenRouter:
+
+```bash
+# In $BACKEND_DIR/.env, ensure these are set:
+CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=false
+CHAT_API_KEY=<value of OPEN_ROUTER_API_KEY from the same .env>
+CHAT_BASE_URL=https://openrouter.ai/api/v1
+CHAT_USE_CLAUDE_AGENT_SDK=true
+```
+
+Use `perl` to update these values in place:
+```bash
+# -f2- keeps any '=' characters that appear inside the key value
+ORKEY=$(grep "^OPEN_ROUTER_API_KEY=" $BACKEND_DIR/.env | cut -d= -f2-)
+[ -n "$ORKEY" ] || { echo "ERROR: OPEN_ROUTER_API_KEY is missing in $BACKEND_DIR/.env"; exit 1; }
+perl -i -pe 's/CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true/CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=false/' $BACKEND_DIR/.env
+# Add or update CHAT_API_KEY and CHAT_BASE_URL
+grep -q "^CHAT_API_KEY=" $BACKEND_DIR/.env && perl -i -pe "s|^CHAT_API_KEY=.*|CHAT_API_KEY=$ORKEY|" $BACKEND_DIR/.env || echo "CHAT_API_KEY=$ORKEY" >> $BACKEND_DIR/.env
+grep -q "^CHAT_BASE_URL=" $BACKEND_DIR/.env && perl -i -pe 's|^CHAT_BASE_URL=.*|CHAT_BASE_URL=https://openrouter.ai/api/v1|' $BACKEND_DIR/.env || echo "CHAT_BASE_URL=https://openrouter.ai/api/v1" >> $BACKEND_DIR/.env
+```
+
+### 3c. Stop conflicting containers
+
+```bash
+# Stop any running app containers (keep infra: supabase, redis, rabbitmq, clamav)
+docker ps --format "{{.Names}}" | grep -E "rest_server|executor|copilot|websocket|database_manager|scheduler|notification|frontend|migrate" | while read name; do
+  docker stop "$name" 2>/dev/null
+done
+```
+
+### 3e. Build and start
+
+```bash
+cd $PLATFORM_DIR && docker compose build --no-cache 2>&1 | tail -20
+if [ ${PIPESTATUS[0]} -ne 0 ]; then echo "ERROR: Docker build failed"; exit 1; fi
+
+cd $PLATFORM_DIR && docker compose up -d 2>&1 | tail -20
+if [ ${PIPESTATUS[0]} -ne 0 ]; then echo "ERROR: Docker compose up failed"; exit 1; fi
+```
+
+**Note:** The build above uses `--no-cache` because Docker BuildKit may reuse cached `COPY` layers from a previous build on a different branch, which leaves the container running old code (e.g. missing PR changes).
+
+**Expected time: 3-8 minutes** for build, 5-10 minutes with `--no-cache`.
+
+### 3f. Wait for services to be ready
+
+```bash
+# Poll until backend and frontend respond
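+READY=false
+for i in $(seq 1 60); do
+  BACKEND=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8006/docs 2>/dev/null)
+  FRONTEND=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:3000 2>/dev/null)
+  if [ "$BACKEND" = "200" ] && [ "$FRONTEND" = "200" ]; then
+    echo "Services ready"
+    READY=true
+    break
+  fi
+  sleep 5
+done
+# 60 attempts x 5s = 5 minutes; fail loudly instead of continuing against a dead stack
+[ "$READY" = true ] || { echo "ERROR: Services not ready after 5 minutes"; exit 1; }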
+```
+
+
+### 3h. Create test user and get auth token
+
+```bash
+ANON_KEY=$(grep "NEXT_PUBLIC_SUPABASE_ANON_KEY=" $FRONTEND_DIR/.env | sed 's/.*NEXT_PUBLIC_SUPABASE_ANON_KEY=//' | tr -d '[:space:]')
+
+# Signup (idempotent — returns "User already registered" if exists)
+RESULT=$(curl -s -X POST 'http://localhost:8000/auth/v1/signup' \
+  -H "apikey: $ANON_KEY" \
+  -H 'Content-Type: application/json' \
+  -d '{"email":"test@test.com","password":"testtest123"}')
+
+# If "Database error finding user", restart supabase-auth and retry
+if echo "$RESULT" | grep -q "Database error"; then
+  docker restart supabase-auth && sleep 5
+  curl -s -X POST 'http://localhost:8000/auth/v1/signup' \
+    -H "apikey: $ANON_KEY" \
+    -H 'Content-Type: application/json' \
+    -d '{"email":"test@test.com","password":"testtest123"}'
+fi
+
+# Get auth token
+TOKEN=$(curl -s -X POST 'http://localhost:8000/auth/v1/token?grant_type=password' \
+  -H "apikey: $ANON_KEY" \
+  -H 'Content-Type: application/json' \
+  -d '{"email":"test@test.com","password":"testtest123"}' | jq -r '.access_token // ""')
+```
+
+**Use this token for ALL API calls:**
+```bash
+curl -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/...
+```
+
+## Step 4: Run tests
+
+### Service ports reference
+
+| Service | Port | URL |
+|---------|------|-----|
+| Frontend | 3000 | http://localhost:3000 |
+| Backend REST | 8006 | http://localhost:8006 |
+| Supabase Auth (via Kong) | 8000 | http://localhost:8000 |
+| Executor | 8002 | http://localhost:8002 |
+| Copilot Executor | 8008 | http://localhost:8008 |
+| WebSocket | 8001 | http://localhost:8001 |
+| Database Manager | 8005 | http://localhost:8005 |
+| Redis | 6379 | localhost:6379 |
+| RabbitMQ | 5672 | localhost:5672 |
+
+### API testing
+
+Use `curl` with the auth token for backend API tests. **For EVERY API call that changes state, record before/after values:**
+
+```bash
+# Example: List agents
+curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/graphs | jq . | head -20
+
+# Example: Create an agent
+curl -s -X POST http://localhost:8006/api/graphs \
+  -H "Authorization: Bearer $TOKEN" \
+  -H 'Content-Type: application/json' \
+  -d '{...}' | jq .
+
+# Example: Run an agent
+curl -s -X POST "http://localhost:8006/api/graphs/{graph_id}/execute" \
+  -H "Authorization: Bearer $TOKEN" \
+  -H 'Content-Type: application/json' \
+  -d '{"data": {...}}'
+
+# Example: Get execution results
+curl -s -H "Authorization: Bearer $TOKEN" \
+  "http://localhost:8006/api/graphs/{graph_id}/executions/{exec_id}" | jq .
+```
+
+**State verification pattern (use for EVERY state-changing API call):**
+```bash
+# 1. Record BEFORE state
+BEFORE_STATE=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/{resource} | jq '{relevant_fields}')
+echo "BEFORE: $BEFORE_STATE"
+
+# 2. Perform the action
+ACTION_RESULT=$(curl -s -X POST ... | jq .)
+echo "ACTION RESULT: $ACTION_RESULT"
+
+# 3. Record AFTER state
+AFTER_STATE=$(curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8006/api/{resource} | jq '{relevant_fields}')
+echo "AFTER: $AFTER_STATE"
+
+# 4. Log the comparison
+echo "=== STATE CHANGE VERIFICATION ==="
+echo "Before: $BEFORE_STATE"
+echo "After: $AFTER_STATE"
+echo "Expected change: {describe what should have changed}"
+```
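+
+The before/after comparison in step 4 can also be automated. A minimal sketch, assuming both states were captured as strings as in the pattern above; `verify_state_change` is a hypothetical helper, not part of the repo:
+
+```bash
+# Hypothetical helper: fail when the state did not change, otherwise print both states.
+verify_state_change() {
+  local before="$1" after="$2"
+  if [ "$before" = "$after" ]; then
+    echo "NO CHANGE"
+    return 1
+  fi
+  echo "CHANGED"
+  echo "before: $before"
+  echo "after:  $after"
+}
+
+# Usage: verify_state_change "$BEFORE_STATE" "$AFTER_STATE"
+```
+
+A plain string comparison is enough here because both states come from the same `jq` filter; it says that something changed, not that the change is the expected one.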
+
+### Browser testing with agent-browser
+
+```bash
+# Close any existing session
+agent-browser close 2>/dev/null || true
+
+# Use --session-name to persist cookies across navigations
+# This means login only needs to happen once per test session
+agent-browser --session-name pr-test open 'http://localhost:3000/login' --timeout 15000
+
+# Get interactive elements
+agent-browser --session-name pr-test snapshot | grep "textbox\|button"
+
+# Login
+agent-browser --session-name pr-test fill {email_ref} "test@test.com"
+agent-browser --session-name pr-test fill {password_ref} "testtest123"
+agent-browser --session-name pr-test click {login_button_ref}
+sleep 5
+
+# Dismiss cookie banner if present
+agent-browser --session-name pr-test click 'text=Accept All' 2>/dev/null || true
+
+# Navigate — cookies are preserved so login persists
+agent-browser --session-name pr-test open 'http://localhost:3000/copilot' --timeout 10000
+
+# Take screenshot
+agent-browser --session-name pr-test screenshot $RESULTS_DIR/01-page.png
+
+# Interact with elements
+agent-browser --session-name pr-test fill {ref} "text"
+agent-browser --session-name pr-test press "Enter"
+agent-browser --session-name pr-test click {ref}
+agent-browser --session-name pr-test click 'text=Button Text'
+
+# Read page content
+agent-browser --session-name pr-test snapshot | grep "text:"
+```
+
+**Key pages:**
+- `/copilot` — CoPilot chat (for testing copilot features)
+- `/build` — Agent builder (for testing block/node features)
+- `/build?flowID={id}` — Specific agent in builder
+- `/library` — Agent library (for testing listing/import features)
+- `/library/agents/{id}` — Agent detail with run history
+- `/marketplace` — Marketplace
+
+### Checking logs
+
+```bash
+# Backend REST server
+docker logs autogpt_platform-rest_server-1 2>&1 | tail -30
+
+# Executor (runs agent graphs)
+docker logs autogpt_platform-executor-1 2>&1 | tail -30
+
+# Copilot executor (runs copilot chat sessions)
+docker logs autogpt_platform-copilot_executor-1 2>&1 | tail -30
+
+# Frontend
+docker logs autogpt_platform-frontend-1 2>&1 | tail -30
+
+# Filter for errors
+docker logs autogpt_platform-executor-1 2>&1 | grep -i "error\|exception\|traceback" | tail -20
+```
+
+### Copilot chat testing
+
+The copilot uses SSE streaming. To test via API:
+
+```bash
+# Create a session
+SESSION_ID=$(curl -s -X POST 'http://localhost:8006/api/chat/sessions' \
+  -H "Authorization: Bearer $TOKEN" \
+  -H 'Content-Type: application/json' \
+  -d '{}' | jq -r '.id // .session_id // ""')
+
+# Stream a message (SSE - will stream chunks)
+curl -N -X POST "http://localhost:8006/api/chat/sessions/$SESSION_ID/stream" \
+  -H "Authorization: Bearer $TOKEN" \
+  -H 'Content-Type: application/json' \
+  -d '{"message": "Hello, what can you help me with?"}' \
+  --max-time 60 2>/dev/null | head -50
+```
+
+Or test via browser (preferred for UI verification):
+```bash
+agent-browser --session-name pr-test open 'http://localhost:3000/copilot' --timeout 10000
+# ... fill chat input and press Enter, wait 20-30s for response
+```
+
+## Step 5: Record results and take screenshots
+
+**Take a screenshot at EVERY significant test step** — before and after interactions, on success, and on failure. This is NON-NEGOTIABLE.
+
+**Required screenshot pattern for each test scenario:**
+```bash
+# BEFORE the action
+agent-browser --session-name pr-test screenshot $RESULTS_DIR/{NN}-{scenario}-before.png
+
+# Perform the action...
+
+# AFTER the action
+agent-browser --session-name pr-test screenshot $RESULTS_DIR/{NN}-{scenario}-after.png
+```
+
+**Naming convention:**
+```bash
+# Examples:
+# $RESULTS_DIR/01-login-page-before.png
+# $RESULTS_DIR/02-login-page-after.png
+# $RESULTS_DIR/03-credits-page-before.png
+# $RESULTS_DIR/04-credits-purchase-after.png
+# $RESULTS_DIR/05-negative-insufficient-credits.png
+# $RESULTS_DIR/06-error-state.png
+```
+
+**Minimum requirements:**
+- At least TWO screenshots per test scenario (before + after)
+- At least ONE screenshot for each negative test case showing the error state
+- If a test fails, screenshot the failure state AND any error logs visible in the UI
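+
+To keep the `{NN}` prefix sequential without tracking it by hand, a small counter can build the paths. A minimal sketch; `next_shot_path`, `SHOT_COUNTER`, and `SHOT_PATH` are hypothetical names, not part of the repo:
+
+```bash
+# Hypothetical helper: build the next screenshot path as {NN}-{scenario}-{phase}.png.
+# The result is stored in SHOT_PATH rather than printed, because calling the function
+# inside $(...) would run it in a subshell and lose the counter increment.
+SHOT_COUNTER=0
+next_shot_path() {
+  SHOT_COUNTER=$((SHOT_COUNTER + 1))
+  SHOT_PATH=$(printf '%s/%02d-%s-%s.png' "$RESULTS_DIR" "$SHOT_COUNTER" "$1" "$2")
+}
+
+# Usage (assumes $RESULTS_DIR is set as in the earlier steps):
+# next_shot_path login before
+# agent-browser --session-name pr-test screenshot "$SHOT_PATH"
+```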
+
+## Step 6: Show results to user with screenshots
+
+**CRITICAL: After all tests complete, you MUST show every screenshot to the user using the Read tool, with an explanation of what each screenshot shows.** This is the most important part of the test report — the user needs to visually verify the results.
+
+For each screenshot:
+1. Use the `Read` tool to display the PNG file (Claude can read images)
+2. Write a 1-2 sentence explanation below it describing:
+   - What page/state is being shown
+   - What the screenshot proves (which test scenario it validates)
+   - Any notable details visible in the UI
+
+Format the output like this:
+
+```markdown
+### Screenshot 1: {descriptive title}
+[Read the PNG file here]
+
+**What it shows:** {1-2 sentence explanation of what this screenshot proves}
+
+---
+```
+
+After showing all screenshots, output a **detailed** summary table:
+
+| # | Scenario | Result | API Evidence | Screenshot Evidence |
+|---|----------|--------|-------------|-------------------|
+| 1 | {name} | PASS/FAIL | Before: X, After: Y | 01-before.png, 02-after.png |
+| 2 | ... | ... | ... | ... |
+
+**IMPORTANT:** As you show each screenshot and record test results, persist them in shell variables for Step 7:
+
+```bash
+# Build these variables during Step 6 — they are required by Step 7's script
+# NOTE: declare -A requires Bash 4.0+. Most Linux distributions ship bash 5.x, but macOS
+# ships Bash 3.2 as /bin/bash (zsh is the default shell), so use Homebrew bash 5.x there.
+# If running on Bash <4, use a plain variable with a lookup function instead.
+declare -A SCREENSHOT_EXPLANATIONS=(
+  ["01-login-page.png"]="Shows the login page loaded successfully with SSO options visible."
+  ["02-builder-with-block.png"]="The builder canvas displays the newly added block connected to the trigger."
+  # ... one entry per screenshot, using the same explanations you showed the user above
+)
+
+TEST_RESULTS_TABLE="| 1 | Login flow | PASS | N/A | 01-login-before.png, 02-login-after.png |
+| 2 | Credits purchase | PASS | Before: 100, After: 95 | 03-credits-before.png, 04-credits-after.png |
+| 3 | Insufficient credits (negative) | PASS | Credits: 0, rejected | 05-insufficient-credits-error.png |"
+# ... one row per test scenario with actual results
+```
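+
+For the Bash <4 case mentioned in the note, the lookup function could look like the sketch below. `screenshot_explanation` is a hypothetical name, and the entries are the same illustrative examples used above:
+
+```bash
+# Fallback for Bash < 4 (no associative arrays): a lookup function backed by a case statement.
+screenshot_explanation() {
+  case "$1" in
+    "01-login-page.png") echo "Shows the login page loaded successfully with SSO options visible." ;;
+    "02-builder-with-block.png") echo "The builder canvas displays the newly added block connected to the trigger." ;;
+    *) echo "" ;;  # unknown file: empty result, so the caller can detect the gap
+  esac
+}
+```
+
+With this fallback, any place that reads `${SCREENSHOT_EXPLANATIONS[$BASENAME]}` would call `$(screenshot_explanation "$BASENAME")` instead.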
+
+## Step 7: Post test report as PR comment with screenshots
+
+Upload screenshots to the PR using the GitHub Git API (no local git operations — safe for worktrees), then post a comment with inline images and per-screenshot explanations.
+
+**This step is MANDATORY. Every test run MUST post a PR comment with screenshots. No exceptions.**
+
+```bash
+# Upload screenshots via GitHub Git API (creates blobs, tree, commit, and ref remotely)
+REPO="Significant-Gravitas/AutoGPT"
+SCREENSHOTS_BRANCH="test-screenshots/pr-${PR_NUMBER}"
+SCREENSHOTS_DIR="test-screenshots/PR-${PR_NUMBER}"
+
+# Step 1: Create blobs for each screenshot and build tree JSON
+# Retry each blob upload up to 3 times. If still failing, list them at end of report.
+shopt -s nullglob
+SCREENSHOT_FILES=("$RESULTS_DIR"/*.png)
+if [ ${#SCREENSHOT_FILES[@]} -eq 0 ]; then
+  echo "ERROR: No screenshots found in $RESULTS_DIR. Test run is incomplete."
+  exit 1
+fi
+TREE_JSON='['
+FIRST=true
+FAILED_UPLOADS=()
+for img in "${SCREENSHOT_FILES[@]}"; do
+  BASENAME=$(basename "$img")
+  B64=$(base64 < "$img")
+  BLOB_SHA=""
+  for attempt in 1 2 3; do
+    BLOB_SHA=$(gh api "repos/${REPO}/git/blobs" -f content="$B64" -f encoding="base64" --jq '.sha' 2>/dev/null || true)
+    [ -n "$BLOB_SHA" ] && break
+    sleep 1
+  done
+  if [ -z "$BLOB_SHA" ]; then
+    FAILED_UPLOADS+=("$img")
+    continue
+  fi
+  if [ "$FIRST" = true ]; then FIRST=false; else TREE_JSON+=','; fi
+  TREE_JSON+="{\"path\":\"${SCREENSHOTS_DIR}/${BASENAME}\",\"mode\":\"100644\",\"type\":\"blob\",\"sha\":\"${BLOB_SHA}\"}"
+done
+TREE_JSON+=']'
+
+# Step 2: Create tree, commit, and branch ref
+TREE_SHA=$(echo "$TREE_JSON" | jq -c '{tree: .}' | gh api "repos/${REPO}/git/trees" --input - --jq '.sha')
+COMMIT_SHA=$(gh api "repos/${REPO}/git/commits" \
+  -f message="test: add E2E test screenshots for PR #${PR_NUMBER}" \
+  -f tree="$TREE_SHA" \
+  --jq '.sha')
+gh api "repos/${REPO}/git/refs" \
+  -f ref="refs/heads/${SCREENSHOTS_BRANCH}" \
+  -f sha="$COMMIT_SHA" 2>/dev/null \
+  || gh api "repos/${REPO}/git/refs/heads/${SCREENSHOTS_BRANCH}" \
+    -X PATCH -f sha="$COMMIT_SHA" -f force=true
+```
+
+Then post the comment with **inline images AND explanations for each screenshot**:
+
+```bash
+REPO_URL="https://raw.githubusercontent.com/${REPO}/${SCREENSHOTS_BRANCH}"
+
+# Build image markdown using uploaded image URLs; skip FAILED_UPLOADS (listed separately)
+
+IMAGE_MARKDOWN=""
+for img in "${SCREENSHOT_FILES[@]}"; do
+  BASENAME=$(basename "$img")
+  TITLE=$(echo "${BASENAME%.png}" | sed 's/^[0-9]*-//' | sed 's/-/ /g' | awk '{for(i=1;i<=NF;i++) $i=toupper(substr($i,1,1)) tolower(substr($i,2))}1')
+  # Skip images that failed to upload — they will be listed at the end
+  IS_FAILED=false
+  for failed in "${FAILED_UPLOADS[@]}"; do
+    [ "$(basename "$failed")" = "$BASENAME" ] && IS_FAILED=true && break
+  done
+  if [ "$IS_FAILED" = true ]; then
+    continue
+  fi
+  EXPLANATION="${SCREENSHOT_EXPLANATIONS[$BASENAME]}"
+  if [ -z "$EXPLANATION" ]; then
+    echo "ERROR: Missing screenshot explanation for $BASENAME. Add it to SCREENSHOT_EXPLANATIONS in Step 6."
+    exit 1
+  fi
+  IMAGE_MARKDOWN="${IMAGE_MARKDOWN}
+### ${TITLE}
+![${BASENAME}](${REPO_URL}/${SCREENSHOTS_DIR}/${BASENAME})
+${EXPLANATION}
+"
+done
+
+# Write comment body to file to avoid shell interpretation issues with special characters
+COMMENT_FILE=$(mktemp)
+# If any uploads failed, append a section listing them with instructions
+FAILED_SECTION=""
+if [ ${#FAILED_UPLOADS[@]} -gt 0 ]; then
+  FAILED_SECTION="
+## ⚠️ Failed Screenshot Uploads
+The following screenshots could not be uploaded via the GitHub API after 3 retries.
+**To add them:** drag-and-drop or paste these files into a PR comment manually:
+"
+  for failed in "${FAILED_UPLOADS[@]}"; do
+    FAILED_SECTION="${FAILED_SECTION}
+- \`$(basename "$failed")\` (local path: \`$failed\`)"
+  done
+  FAILED_SECTION="${FAILED_SECTION}
+
+**Run status:** INCOMPLETE until the files above are manually attached and visible inline in the PR."
+fi
+
+cat > "$COMMENT_FILE" <<INNEREOF
+## E2E Test Report
+
+| # | Scenario | Result | API Evidence | Screenshot Evidence |
+|---|----------|--------|-------------|-------------------|
+${TEST_RESULTS_TABLE}
+
+${IMAGE_MARKDOWN}
+${FAILED_SECTION}
+INNEREOF
+
+gh api "repos/${REPO}/issues/$PR_NUMBER/comments" -F body=@"$COMMENT_FILE"
+rm -f "$COMMENT_FILE"
+```
+
+**The PR comment MUST include:**
+1. A summary table of all scenarios with PASS/FAIL and before/after API evidence
+2. Every successfully uploaded screenshot rendered inline; any failed uploads listed with manual attachment instructions
+3. A 1-2 sentence explanation below each screenshot describing what it proves
+
+This approach uses the GitHub Git API to create blobs, trees, commits, and refs entirely server-side. No local `git checkout` or `git push` — safe for worktrees and won't interfere with the PR branch.
+
+## Fix mode (--fix flag)
+
+When `--fix` is present, the standard is HIGHER. Do not just note issues — FIX them immediately.
+
+### Fix protocol for EVERY issue found (including UX issues):
+
+1. **Identify** the root cause in the code — read the relevant source files
+2. **Write a failing test first** (TDD): For backend bugs, write a test marked with `pytest.mark.xfail(reason="...")`. For frontend/Playwright bugs, write a test with `.fixme` annotation. Run it to confirm it fails as expected.
+3. **Screenshot** the broken state: `agent-browser screenshot $RESULTS_DIR/{NN}-broken-{description}.png`
+4. **Fix** the code in the worktree
+5. **Rebuild** ONLY the affected service (not the whole stack):
+   ```bash
+   cd $PLATFORM_DIR && docker compose up --build -d {service_name}
+   # e.g., docker compose up --build -d rest_server
+   # e.g., docker compose up --build -d frontend
+   ```
+6. **Wait** for the service to be ready (poll health endpoint)
+7. **Re-test** the same scenario
+8. **Screenshot** the fixed state: `agent-browser screenshot $RESULTS_DIR/{NN}-fixed-{description}.png`
+9. **Remove the xfail/fixme marker** from the test written in step 2, and verify it passes
+10. **Verify** the fix did not break other scenarios (run a quick smoke test)
+11. **Commit and push** immediately:
+    ```bash
+    cd $WORKTREE_PATH
+    git add -A
+    git commit -m "fix: {description of fix}"
+    git push
+    ```
+12. **Continue** to the next test scenario
+
+### Fix loop (like pr-address)
+
+```text
+test scenario → find issue (bug OR UX problem) → screenshot broken state
+→ fix code → rebuild affected service only → re-test → screenshot fixed state
+→ verify no regressions → commit + push
+→ repeat for next scenario
+→ after ALL scenarios pass, run full re-test to verify everything together
+```
+
+**Key differences from non-fix mode:**
+- UX issues count as bugs — fix them (bad alignment, confusing labels, missing loading states)
+- Every fix MUST have a before/after screenshot pair proving it works
+- Commit after EACH fix, not in a batch at the end
+- The final re-test must produce a clean set of all-passing screenshots
+
+## Known issues and workarounds
+
+### Problem: "Database error finding user" on signup
+**Cause:** Supabase auth service schema cache is stale after migration.
+**Fix:** `docker restart supabase-auth && sleep 5` then retry signup.
+
+### Problem: Copilot returns auth errors in subscription mode
+**Cause:** `CHAT_USE_CLAUDE_CODE_SUBSCRIPTION=true` but `CLAUDE_CODE_OAUTH_TOKEN` is not set or expired.
+**Fix:** Re-run `refresh_claude_token.sh` to re-extract the OAuth token (see step 3b, Option 1; it works on macOS, Linux, and Windows/WSL) and recreate the container (`docker compose up -d copilot_executor`). The backend auto-provisions `~/.claude/.credentials.json` from the env vars on startup. No `npm install` or `claude login` is needed, because the SDK bundles its own CLI binary.
+
+### Problem: agent-browser can't find chromium
+**Cause:** The Dockerfile auto-provisions system chromium on all architectures (including ARM64). If your branch is behind `dev`, this may not be present yet.
+**Fix:** Check if chromium exists: `which chromium || which chromium-browser`. If missing, install it: `apt-get install -y chromium` and set `AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium` in the container environment.
+
+### Problem: agent-browser selector matches multiple elements
+**Cause:** `text=X` matches all elements containing that text.
+**Fix:** Use `agent-browser snapshot` to get specific `ref=eNN` references, then use those: `agent-browser click eNN`.
+
+### Problem: Frontend shows cookie banner blocking interaction
+**Fix:** `agent-browser click 'text=Accept All'` before other interactions.
+
+### Problem: Container loses npm packages after rebuild
+**Cause:** `docker compose up --build` rebuilds the image, losing runtime installs.
+**Fix:** Add packages to the Dockerfile instead of installing at runtime.
+
+### Problem: Services not starting after `docker compose up`
+**Fix:** Wait and check health: `docker compose ps`. Common cause: migration hasn't finished. Check: `docker logs autogpt_platform-migrate-1 2>&1 | tail -5`. If supabase-db isn't healthy: `docker restart supabase-db && sleep 10`.
+
+### Problem: Docker uses cached layers with old code (PR changes not visible)
+**Cause:** `docker compose up --build` reuses cached `COPY` layers from previous builds. If the PR branch changes Python files but the previous build already cached that layer from `dev`, the container runs `dev` code.
+**Fix:** Always use `docker compose build --no-cache` for the first build of a PR branch. Subsequent rebuilds within the same branch can use `--build`.
+
+### Problem: `agent-browser open` loses login session
+**Cause:** Without session persistence, `agent-browser open` starts fresh.
+**Fix:** Use `--session-name pr-test` on ALL agent-browser commands. This auto-saves/restores cookies and localStorage across navigations. Alternatively, use `agent-browser eval "window.location.href = '...'"` to navigate within the same context.
+
+### Problem: Supabase auth returns "Database error querying schema"
+**Cause:** The database schema changed (migration ran) but supabase-auth has a stale schema cache.
+**Fix:** `docker restart supabase-db && sleep 10 && docker restart supabase-auth && sleep 8`. If user data was lost, re-signup.
--- a/.claude/skills/setup-repo/SKILL.md
+++ b/.claude/skills/setup-repo/SKILL.md
@@ -0,0 +1,195 @@
+---
+name: setup-repo
+description: Initialize a worktree-based repo layout for parallel development. Creates a main worktree, a reviews worktree for PR reviews, and N numbered work branches. Handles .env creation, dependency installation, and branchlet config. TRIGGER when user asks to set up the repo from scratch, initialize worktrees, bootstrap their dev environment, "setup repo", "setup worktrees", "initialize dev environment", "set up branches", or when a freshly cloned repo has no sibling worktrees.
+user-invocable: true
+args: "No arguments — interactive setup via prompts."
+metadata:
+  author: autogpt-team
+  version: "1.0.0"
+---
+
+# Repository Setup
+
+This skill sets up a worktree-based development layout from a freshly cloned repo. It creates:
+- A **main** worktree (the primary checkout)
+- A **reviews** worktree (for PR reviews)
+- **N work branches** (branch1..branchN) for parallel development
+
+## Step 1: Identify the repo
+
+Determine the repo root and parent directory:
+
+```bash
+ROOT=$(git rev-parse --show-toplevel)
+REPO_NAME=$(basename "$ROOT")
+PARENT=$(dirname "$ROOT")
+```
+
+Detect if the repo is already inside a worktree layout by counting sibling worktrees (not just checking the directory name, which could be anything):
+
+```bash
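+# Count worktrees that live under $PARENT. The count includes $ROOT itself,
+# so a value greater than 1 means at least one sibling worktree exists.
+# -F treats $PARENT as a literal string rather than a regex.
+SIBLING_COUNT=$(git worktree list --porcelain 2>/dev/null | grep "^worktree " | grep -cF "$PARENT/" || true)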
+if [ "$SIBLING_COUNT" -gt 1 ]; then
+  echo "INFO: Existing worktree layout detected at $PARENT ($SIBLING_COUNT worktrees)"
+  # Use $ROOT as-is; skip renaming/restructuring
+else
+  echo "INFO: Fresh clone detected, proceeding with setup"
+fi
+```
+
+## Step 2: Ask the user questions
+
+Use AskUserQuestion to gather setup preferences:
+
+1. **How many parallel work branches do you need?** (Options: 4, 8, 16, or custom)
+   - These become `branch1` through `branchN`
+2. **Which branch should be the base?** (Options: origin/master, origin/dev, or custom)
+   - All work branches and reviews will start from this
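+
+The later steps read `$COUNT` and a `<base-branch>` placeholder, so the answers need to land in shell variables first. A minimal sketch, assuming the names `COUNT` and `BASE` (where `BASE` stands in for `<base-branch>`); the defaults and `is_positive_int` are illustrative, not part of the skill:
+
+```bash
+# Hypothetical glue: store the answers in variables for the following steps.
+COUNT="${COUNT:-4}"          # answer to question 1 (illustrative default)
+BASE="${BASE:-origin/dev}"   # answer to question 2 (illustrative default)
+
+# Reject anything that is not a positive integer before it reaches `seq`.
+is_positive_int() {
+  case "$1" in
+    ''|*[!0-9]*) return 1 ;;
+    *) [ "$1" -gt 0 ] ;;
+  esac
+}
+
+is_positive_int "$COUNT" || { echo "ERROR: COUNT must be a positive integer, got '$COUNT'"; exit 1; }
+```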
+
+## Step 3: Fetch and set up branches
+
+```bash
+cd "$ROOT"
+git fetch origin
+
+# Create the reviews branch from base (skip if already exists)
+if git show-ref --verify --quiet refs/heads/reviews; then
+  echo "INFO: Branch 'reviews' already exists, skipping"
+else
+  git branch reviews <base-branch>
+fi
+
+# Create numbered work branches from base (skip if already exists)
+for i in $(seq 1 "$COUNT"); do
+  if git show-ref --verify --quiet "refs/heads/branch$i"; then
+    echo "INFO: Branch 'branch$i' already exists, skipping"
+  else
+    git branch "branch$i" <base-branch>
+  fi
+done
+```
+
+## Step 4: Create worktrees
+
+Create worktrees as siblings to the main checkout:
+
+```bash
+if [ -d "$PARENT/reviews" ]; then
+  echo "INFO: Worktree '$PARENT/reviews' already exists, skipping"
+else
+  git worktree add "$PARENT/reviews" reviews
+fi
+
+for i in $(seq 1 "$COUNT"); do
+  if [ -d "$PARENT/branch$i" ]; then
+    echo "INFO: Worktree '$PARENT/branch$i' already exists, skipping"
+  else
+    git worktree add "$PARENT/branch$i" "branch$i"
+  fi
+done
+```
+
+## Step 5: Set up environment files
+
+**Do NOT assume .env files exist.** For each worktree (including main if needed):
+
+1. Check if `.env` exists in the source worktree for each path
+2. If `.env` exists, copy it
+3. If only `.env.default` or `.env.example` exists, copy that as `.env`
+4. If neither exists, warn the user and list which env files are missing
+
+Env file locations to check (same as the `/worktree` skill — keep these in sync):
+- `autogpt_platform/.env`
+- `autogpt_platform/backend/.env`
+- `autogpt_platform/frontend/.env`
+
+> **Note:** This env copying logic intentionally mirrors the `/worktree` skill's approach. If you update the path list or fallback logic here, update `/worktree` as well.
+
+```bash
+SOURCE="$ROOT"
+WORKTREES="reviews"
+for i in $(seq 1 "$COUNT"); do WORKTREES="$WORKTREES branch$i"; done
+
+FOUND_ANY_ENV=0
+for wt in $WORKTREES; do
+  TARGET="$PARENT/$wt"
+  for envpath in autogpt_platform autogpt_platform/backend autogpt_platform/frontend; do
+    if [ -f "$SOURCE/$envpath/.env" ]; then
+      FOUND_ANY_ENV=1
+      cp "$SOURCE/$envpath/.env" "$TARGET/$envpath/.env"
+    elif [ -f "$SOURCE/$envpath/.env.default" ]; then
+      FOUND_ANY_ENV=1
+      cp "$SOURCE/$envpath/.env.default" "$TARGET/$envpath/.env"
+      echo "NOTE: $wt/$envpath/.env was created from .env.default — you may need to edit it"
+    elif [ -f "$SOURCE/$envpath/.env.example" ]; then
+      FOUND_ANY_ENV=1
+      cp "$SOURCE/$envpath/.env.example" "$TARGET/$envpath/.env"
+      echo "NOTE: $wt/$envpath/.env was created from .env.example — you may need to edit it"
+    else
+      echo "WARNING: No .env, .env.default, or .env.example found at $SOURCE/$envpath/"
+    fi
+  done
+done
+
+if [ "$FOUND_ANY_ENV" -eq 0 ]; then
+  echo "WARNING: No environment files or templates were found in the source worktree."
+  # Use AskUserQuestion to confirm: "Continue setup without env files?"
+  # If the user declines, stop here and let them set up .env files first.
+fi
+```
+
+## Step 6: Copy branchlet config
+
+Copy `.branchlet.json` from main to each worktree so branchlet can manage sub-worktrees:
+
+```bash
+if [ -f "$ROOT/.branchlet.json" ]; then
+  for wt in $WORKTREES; do
+    cp "$ROOT/.branchlet.json" "$PARENT/$wt/.branchlet.json"
+  done
+fi
+```
+
+## Step 7: Install dependencies
+
+Install deps in all worktrees. Run these sequentially per worktree:
+
+```bash
+for wt in $WORKTREES; do
+  TARGET="$PARENT/$wt"
+  echo "=== Installing deps for $wt ==="
+  (cd "$TARGET/autogpt_platform/autogpt_libs" && poetry install) &&
+  (cd "$TARGET/autogpt_platform/backend" && poetry install && poetry run prisma generate) &&
+  (cd "$TARGET/autogpt_platform/frontend" && pnpm install) &&
+  echo "=== Done: $wt ===" ||
+  echo "=== FAILED: $wt ==="
+done
+```
+
+This is slow. Run in background if possible and notify when complete.
+
+## Step 8: Verify and report
+
+After setup, verify and report to the user:
+
+```bash
+git worktree list
+```
+
+Summarize:
+- Number of worktrees created
+- Which env files were copied vs created from defaults vs missing
+- Any warnings or errors encountered
+
+## Final directory layout
+
+```text
+parent/
+  main/              # Primary checkout (already exists)
+  reviews/           # PR review worktree
+  branch1/           # Work branch 1
+  branch2/           # Work branch 2
+  ...
+  branchN/           # Work branch N
+```
--- a/.claude/skills/worktree/SKILL.md
+++ b/.claude/skills/worktree/SKILL.md
@@ -0,0 +1,85 @@
+---
+name: worktree
+description: Set up a new git worktree for parallel development. Creates the worktree, copies .env files, installs dependencies, and generates Prisma client. TRIGGER when user asks to set up a worktree, work on a branch in isolation, or needs a separate environment for a branch or PR.
+user-invocable: true
+args: "[name] — optional worktree name (e.g., 'AutoGPT7'). If omitted, uses next available AutoGPT<N>."
+metadata:
+  author: autogpt-team
+  version: "3.0.0"
+---
+
+# Worktree Setup
+
+## Create the worktree
+
+Derive paths from the git toplevel. If a name is provided as argument, use it. Otherwise, check `git worktree list` and pick the next `AutoGPT<N>`.
+
+```bash
+ROOT=$(git rev-parse --show-toplevel)
+PARENT=$(dirname "$ROOT")
+
+# From an existing branch
+git worktree add "$PARENT/<NAME>" <branch-name>
+
+# From a new branch off dev
+git worktree add -b <new-branch> "$PARENT/<NAME>" dev
+```
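+
+Picking the next `AutoGPT<N>` can be scripted. A minimal sketch, assuming "next" means the highest existing number plus one; `next_worktree_name` is a hypothetical helper, not part of the repo:
+
+```bash
+# Hypothetical helper: read worktree paths on stdin, print the next free AutoGPT<N> name.
+next_worktree_name() {
+  local max=0 n path
+  while IFS= read -r path; do
+    # Extract N from a directory named exactly AutoGPT<N>; other names yield an empty string.
+    n=$(basename "$path" | sed -n 's/^AutoGPT\([1-9][0-9]*\)$/\1/p')
+    if [ -n "$n" ] && [ "$n" -gt "$max" ]; then max="$n"; fi
+  done
+  echo "AutoGPT$((max + 1))"
+}
+
+# Usage: feed it the paths from `git worktree list --porcelain`.
+# NAME=$(git worktree list --porcelain | sed -n 's/^worktree //p' | next_worktree_name)
+```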
+
+## Copy environment files
+
+Copy `.env` from the root worktree. Falls back to `.env.default` if `.env` doesn't exist.
+
+```bash
+ROOT=$(git rev-parse --show-toplevel)
+TARGET="$(dirname "$ROOT")/<NAME>"
+
+for envpath in autogpt_platform/backend autogpt_platform/frontend autogpt_platform; do
+  if [ -f "$ROOT/$envpath/.env" ]; then
+    cp "$ROOT/$envpath/.env" "$TARGET/$envpath/.env"
+  elif [ -f "$ROOT/$envpath/.env.default" ]; then
+    cp "$ROOT/$envpath/.env.default" "$TARGET/$envpath/.env"
+  fi
+done
+```
+
+## Install dependencies
+
+```bash
+TARGET="$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
+cd "$TARGET/autogpt_platform/autogpt_libs" && poetry install
+cd "$TARGET/autogpt_platform/backend" && poetry install && poetry run prisma generate
+cd "$TARGET/autogpt_platform/frontend" && pnpm install
+```
+
+Replace `<NAME>` with the actual worktree name (e.g., `AutoGPT7`).
+
+## Running the app (optional)
+
+Backend uses ports: 8001, 8002, 8003, 8005, 8006, 8007, 8008. Free them first if needed:
+
+```bash
+TARGET="$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
+for port in 8001 8002 8003 8005 8006 8007 8008; do
+  lsof -ti :$port | xargs kill -9 2>/dev/null || true
+done
+cd "$TARGET/autogpt_platform/backend" && poetry run app
+```
+
+## CoPilot testing
+
+SDK mode spawns a Claude subprocess — won't work inside Claude Code. Set `CHAT_USE_CLAUDE_AGENT_SDK=false` in `backend/.env` to use baseline mode.
+
+## Cleanup
+
+```bash
+# Replace <NAME> with the actual worktree name (e.g., AutoGPT7)
+git worktree remove "$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
+```
+
+## Alternative: Branchlet (optional)
+
+If [branchlet](https://www.npmjs.com/package/branchlet) is installed:
+
+```bash
+branchlet create -n <name> -s <source-branch> -b <new-branch>
+```
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -1,8 +1,12 @@
-<!-- Clearly explain the need for these changes: -->
+### Why / What / How
+
+<!-- Why: Why does this PR exist? What problem does it solve, or what's broken/missing without it? -->
+<!-- What: What does this PR change? Summarize the changes at a high level. -->
+<!-- How: How does it work? Describe the approach, key implementation details, or architecture decisions. -->

 ### Changes 🏗️

-<!-- Concisely describe all of the changes made in this pull request: -->
+<!-- List the key changes. Keep it higher level than the diff but specific enough to highlight what's new/modified. -->

 ### Checklist 📋

--- a/.github/workflows/platform-backend-ci.yml
+++ b/.github/workflows/platform-backend-ci.yml
@@ -5,12 +5,14 @@ on:
    branches: [master, dev, ci-test*]
    paths:
      - ".github/workflows/platform-backend-ci.yml"
+      - ".github/workflows/scripts/get_package_version_from_lockfile.py"
      - "autogpt_platform/backend/**"
      - "autogpt_platform/autogpt_libs/**"
  pull_request:
    branches: [master, dev, release-*]
    paths:
      - ".github/workflows/platform-backend-ci.yml"
+      - ".github/workflows/scripts/get_package_version_from_lockfile.py"
      - "autogpt_platform/backend/**"
      - "autogpt_platform/autogpt_libs/**"
  merge_group:
@@ -25,10 +27,91 @@ defaults:
    working-directory: autogpt_platform/backend

 jobs:
+  lint:
+    permissions:
+      contents: read
+    timeout-minutes: 10
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v6
+
+      - name: Set up Python 3.12
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Set up Python dependency cache
+        uses: actions/cache@v5
+        with:
+          path: ~/.cache/pypoetry
+          key: poetry-${{ runner.os }}-py3.12-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
+
+      - name: Install Poetry
+        run: |
+          HEAD_POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
+          echo "Using Poetry version ${HEAD_POETRY_VERSION}"
+          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
+
+      - name: Install Python dependencies
+        run: poetry install
+
+      - name: Run Linters
+        run: poetry run lint --skip-pyright
+
+    env:
+      CI: true
+      PLAIN_OUTPUT: True
+
+  type-check:
+    permissions:
+      contents: read
+    timeout-minutes: 10
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.11", "3.12", "3.13"]
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v6
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Set up Python dependency cache
+        uses: actions/cache@v5
+        with:
+          path: ~/.cache/pypoetry
+          key: poetry-${{ runner.os }}-py${{ matrix.python-version }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
+
+      - name: Install Poetry
+        run: |
+          HEAD_POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
+          echo "Using Poetry version ${HEAD_POETRY_VERSION}"
+          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
+
+      - name: Install Python dependencies
+        run: poetry install
+
+      - name: Generate Prisma Client
+        run: poetry run prisma generate && poetry run gen-prisma-stub
+
+      - name: Run Pyright
+        run: poetry run pyright --pythonversion ${{ matrix.python-version }}
+
+    env:
+      CI: true
+      PLAIN_OUTPUT: True
+
  test:
    permissions:
      contents: read
-    timeout-minutes: 30
+    timeout-minutes: 15
    strategy:
      fail-fast: false
      matrix:
@@ -96,9 +179,9 @@ jobs:
        uses: actions/cache@v5
        with:
          path: ~/.cache/pypoetry
-          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
+          key: poetry-${{ runner.os }}-py${{ matrix.python-version }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}

-      - name: Install Poetry (Unix)
+      - name: Install Poetry
        run: |
          # Extract Poetry version from backend/poetry.lock
          HEAD_POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
@@ -156,22 +239,22 @@ jobs:
          echo "Waiting for ClamAV daemon to start..."
          max_attempts=60
          attempt=0
-          
+
          until nc -z localhost 3310 || [ $attempt -eq $max_attempts ]; do
            echo "ClamAV is unavailable - sleeping (attempt $((attempt+1))/$max_attempts)"
            sleep 5
            attempt=$((attempt+1))
          done
-          
+
          if [ $attempt -eq $max_attempts ]; then
            echo "ClamAV failed to start after $((max_attempts*5)) seconds"
            echo "Checking ClamAV service logs..."
            docker logs $(docker ps -q --filter "ancestor=clamav/clamav-debian:latest") 2>&1 | tail -50 || echo "No ClamAV container found"
            exit 1
          fi
-          
+
          echo "ClamAV is ready!"
-          
+
          # Verify ClamAV is responsive
          echo "Testing ClamAV connection..."
          timeout 10 bash -c 'echo "PING" | nc localhost 3310' || {
@@ -186,18 +269,13 @@ jobs:
          DATABASE_URL: ${{ steps.supabase.outputs.DB_URL }}
          DIRECT_URL: ${{ steps.supabase.outputs.DB_URL }}

-      - id: lint
-        name: Run Linter
-        run: poetry run lint
-
-      - name: Run pytest with coverage
+      - name: Run pytest
        run: |
          if [[ "${{ runner.debug }}" == "1" ]]; then
            poetry run pytest -s -vv -o log_cli=true -o log_cli_level=DEBUG
          else
            poetry run pytest -s -vv
          fi
-        if: success() || (failure() && steps.lint.outcome == 'failure')
        env:
          LOG_LEVEL: ${{ runner.debug && 'DEBUG' || 'INFO' }}
          DATABASE_URL: ${{ steps.supabase.outputs.DB_URL }}
@@ -209,6 +287,12 @@ jobs:
          REDIS_PORT: "6379"
          ENCRYPTION_KEY: "dvziYgz0KSK8FENhju0ZYi8-fRTfAdlz6YLhdB_jhNw=" # DO NOT USE IN PRODUCTION!!

+      # - name: Upload coverage reports to Codecov
+      #   uses: codecov/codecov-action@v4
+      #   with:
+      #     token: ${{ secrets.CODECOV_TOKEN }}
+      #     flags: backend,${{ runner.os }}
+
    env:
      CI: true
      PLAIN_OUTPUT: True
@@ -222,9 +306,3 @@ jobs:
      # the backend service, docker composes, and examples
      RABBITMQ_DEFAULT_USER: "rabbitmq_user_default"
      RABBITMQ_DEFAULT_PASS: "k0VMxyIJF9S35f3x2uaw5IWAl6Y536O7"
-
-      # - name: Upload coverage reports to Codecov
-      #   uses: codecov/codecov-action@v4
-      #   with:
-      #     token: ${{ secrets.CODECOV_TOKEN }}
-      #     flags: backend,${{ runner.os }}
--- a/.github/workflows/platform-frontend-ci.yml
+++ b/.github/workflows/platform-frontend-ci.yml
@@ -120,175 +120,6 @@ jobs:
          token: ${{ secrets.GITHUB_TOKEN }}
          exitOnceUploaded: true

-  e2e_test:
-    name: end-to-end tests
-    runs-on: big-boi
-
-    steps:
-      - name: Checkout repository
-        uses: actions/checkout@v6
-        with:
-          submodules: recursive
-
-      - name: Set up Platform - Copy default supabase .env
-        run: |
-          cp ../.env.default ../.env
-
-      - name: Set up Platform - Copy backend .env and set OpenAI API key
-        run: |
-          cp ../backend/.env.default ../backend/.env
-          echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
-        env:
-          # Used by E2E test data script to generate embeddings for approved store agents
-          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
-
-      - name: Set up Platform - Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3
-        with:
-          driver: docker-container
-          driver-opts: network=host
-
-      - name: Set up Platform - Expose GHA cache to docker buildx CLI
-        uses: crazy-max/ghaction-github-runtime@v3
-
-      - name: Set up Platform - Build Docker images (with cache)
-        working-directory: autogpt_platform
-        run: |
-          pip install pyyaml
-
-          # Resolve extends and generate a flat compose file that bake can understand
-          docker compose -f docker-compose.yml config > docker-compose.resolved.yml
-
-          # Add cache configuration to the resolved compose file
-          python ../.github/workflows/scripts/docker-ci-fix-compose-build-cache.py \
-            --source docker-compose.resolved.yml \
-            --cache-from "type=gha" \
-            --cache-to "type=gha,mode=max" \
-            --backend-hash "${{ hashFiles('autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/poetry.lock', 'autogpt_platform/backend/backend') }}" \
-            --frontend-hash "${{ hashFiles('autogpt_platform/frontend/Dockerfile', 'autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/src') }}" \
-            --git-ref "${{ github.ref }}"
-
-          # Build with bake using the resolved compose file (now includes cache config)
-          docker buildx bake --allow=fs.read=.. -f docker-compose.resolved.yml --load
-        env:
-          NEXT_PUBLIC_PW_TEST: true
-
-      - name: Set up tests - Cache E2E test data
-        id: e2e-data-cache
-        uses: actions/cache@v5
-        with:
-          path: /tmp/e2e_test_data.sql
-          key: e2e-test-data-${{ hashFiles('autogpt_platform/backend/test/e2e_test_data.py', 'autogpt_platform/backend/migrations/**', '.github/workflows/platform-frontend-ci.yml') }}
-
-      - name: Set up Platform - Start Supabase DB + Auth
-        run: |
-          docker compose -f ../docker-compose.resolved.yml up -d db auth --no-build
-          echo "Waiting for database to be ready..."
-          timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done'
-          echo "Waiting for auth service to be ready..."
-          timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -c "SELECT 1 FROM auth.users LIMIT 1" 2>/dev/null; do sleep 2; done' || echo "Auth schema check timeout, continuing..."
-
-      - name: Set up Platform - Run migrations
-        run: |
-          echo "Running migrations..."
-          docker compose -f ../docker-compose.resolved.yml run --rm migrate
-          echo "✅ Migrations completed"
-        env:
-          NEXT_PUBLIC_PW_TEST: true
-
-      - name: Set up tests - Load cached E2E test data
-        if: steps.e2e-data-cache.outputs.cache-hit == 'true'
-        run: |
-          echo "✅ Found cached E2E test data, restoring..."
-          {
-            echo "SET session_replication_role = 'replica';"
-            cat /tmp/e2e_test_data.sql
-            echo "SET session_replication_role = 'origin';"
-          } | docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -b
-          # Refresh materialized views after restore
-          docker compose -f ../docker-compose.resolved.yml exec -T db \
-            psql -U postgres -d postgres -b -c "SET search_path TO platform; SELECT refresh_store_materialized_views();" || true
-
-          echo "✅ E2E test data restored from cache"
-
-      - name: Set up Platform - Start (all other services)
-        run: |
-          docker compose -f ../docker-compose.resolved.yml up -d --no-build
-          echo "Waiting for rest_server to be ready..."
-          timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
-        env:
-          NEXT_PUBLIC_PW_TEST: true
-
-      - name: Set up tests - Create E2E test data
-        if: steps.e2e-data-cache.outputs.cache-hit != 'true'
-        run: |
-          echo "Creating E2E test data..."
-          docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.resolved.yml ps -q rest_server):/tmp/e2e_test_data.py
-          docker compose -f ../docker-compose.resolved.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
-            echo "❌ E2E test data creation failed!"
-            docker compose -f ../docker-compose.resolved.yml logs --tail=50 rest_server
-            exit 1
-          }
-
-          # Dump auth.users + platform schema for cache (two separate dumps)
-          echo "Dumping database for cache..."
-          {
-            docker compose -f ../docker-compose.resolved.yml exec -T db \
-              pg_dump -U postgres --data-only --column-inserts \
-              --table='auth.users' postgres
-            docker compose -f ../docker-compose.resolved.yml exec -T db \
-              pg_dump -U postgres --data-only --column-inserts \
-              --schema=platform \
-              --exclude-table='platform._prisma_migrations' \
-              --exclude-table='platform.apscheduler_jobs' \
-              --exclude-table='platform.apscheduler_jobs_batched_notifications' \
-              postgres
-          } > /tmp/e2e_test_data.sql
-
-          echo "✅ Database dump created for caching ($(wc -l < /tmp/e2e_test_data.sql) lines)"
-
-      - name: Set up tests - Enable corepack
-        run: corepack enable
-
-      - name: Set up tests - Set up Node
-        uses: actions/setup-node@v6
-        with:
-          node-version: "22.18.0"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
-
-      - name: Set up tests - Install dependencies
-        run: pnpm install --frozen-lockfile
-
-      - name: Set up tests - Install browser 'chromium'
-        run: pnpm playwright install --with-deps chromium
-
-      - name: Run Playwright tests
-        run: pnpm test:no-build
-        continue-on-error: false
-
-      - name: Upload Playwright report
-        if: always()
-        uses: actions/upload-artifact@v4
-        with:
-          name: playwright-report
-          path: playwright-report
-          if-no-files-found: ignore
-          retention-days: 3
-
-      - name: Upload Playwright test results
-        if: always()
-        uses: actions/upload-artifact@v4
-        with:
-          name: playwright-test-results
-          path: test-results
-          if-no-files-found: ignore
-          retention-days: 3
-
-      - name: Print Final Docker Compose logs
-        if: always()
-        run: docker compose -f ../docker-compose.resolved.yml logs
-
  integration_test:
    runs-on: ubuntu-latest
    needs: setup
--- a/.github/workflows/platform-fullstack-ci.yml
+++ b/.github/workflows/platform-fullstack-ci.yml
@@ -1,14 +1,18 @@
-name: AutoGPT Platform - Frontend CI
+name: AutoGPT Platform - Full-stack CI

 on:
  push:
    branches: [master, dev]
    paths:
      - ".github/workflows/platform-fullstack-ci.yml"
+      - ".github/workflows/scripts/docker-ci-fix-compose-build-cache.py"
+      - ".github/workflows/scripts/get_package_version_from_lockfile.py"
      - "autogpt_platform/**"
  pull_request:
    paths:
      - ".github/workflows/platform-fullstack-ci.yml"
+      - ".github/workflows/scripts/docker-ci-fix-compose-build-cache.py"
+      - ".github/workflows/scripts/get_package_version_from_lockfile.py"
      - "autogpt_platform/**"
  merge_group:

@@ -24,42 +28,28 @@ defaults:
 jobs:
  setup:
    runs-on: ubuntu-latest
-    outputs:
-      cache-key: ${{ steps.cache-key.outputs.key }}

    steps:
      - name: Checkout repository
        uses: actions/checkout@v6

-      - name: Set up Node.js
-        uses: actions/setup-node@v6
-        with:
-          node-version: "22.18.0"
-
      - name: Enable corepack
        run: corepack enable

-      - name: Generate cache key
-        id: cache-key
-        run: echo "key=${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}" >> $GITHUB_OUTPUT
-
-      - name: Cache dependencies
-        uses: actions/cache@v5
+      - name: Set up Node
+        uses: actions/setup-node@v6
        with:
-          path: ~/.pnpm-store
-          key: ${{ steps.cache-key.outputs.key }}
-          restore-keys: |
-            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
-            ${{ runner.os }}-pnpm-
+          node-version: "22.18.0"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml

-      - name: Install dependencies
+      - name: Install dependencies to populate cache
        run: pnpm install --frozen-lockfile

-  types:
-    runs-on: big-boi
+  check-api-types:
+    name: check API types
+    runs-on: ubuntu-latest
    needs: setup
-    strategy:
-      fail-fast: false

    steps:
      - name: Checkout repository
@@ -67,70 +57,256 @@ jobs:
        with:
          submodules: recursive

-      - name: Set up Node.js
+      # ------------------------ Backend setup ------------------------
+
+      - name: Set up Backend - Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Set up Backend - Install Poetry
+        working-directory: autogpt_platform/backend
+        run: |
+          POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
+          echo "Installing Poetry version ${POETRY_VERSION}"
+          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$POETRY_VERSION python3 -
+
+      - name: Set up Backend - Set up dependency cache
+        uses: actions/cache@v5
+        with:
+          path: ~/.cache/pypoetry
+          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
+
+      - name: Set up Backend - Install dependencies
+        working-directory: autogpt_platform/backend
+        run: poetry install
+
+      - name: Set up Backend - Generate Prisma client
+        working-directory: autogpt_platform/backend
+        run: poetry run prisma generate && poetry run gen-prisma-stub
+
+      - name: Set up Frontend - Export OpenAPI schema from Backend
+        working-directory: autogpt_platform/backend
+        run: poetry run export-api-schema --output ../frontend/src/app/api/openapi.json
+
+      # ------------------------ Frontend setup ------------------------
+
+      - name: Set up Frontend - Enable corepack
+        run: corepack enable
+
+      - name: Set up Frontend - Set up Node
        uses: actions/setup-node@v6
        with:
          node-version: "22.18.0"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml

-      - name: Enable corepack
-        run: corepack enable
-
-      - name: Copy default supabase .env
-        run: |
-          cp ../.env.default ../.env
-
-      - name: Copy backend .env
-        run: |
-          cp ../backend/.env.default ../backend/.env
-
-      - name: Run docker compose
-        run: |
-          docker compose -f ../docker-compose.yml --profile local up -d deps_backend
-
-      - name: Restore dependencies cache
-        uses: actions/cache@v5
-        with:
-          path: ~/.pnpm-store
-          key: ${{ needs.setup.outputs.cache-key }}
-          restore-keys: |
-            ${{ runner.os }}-pnpm-
-
-      - name: Install dependencies
+      - name: Set up Frontend - Install dependencies
        run: pnpm install --frozen-lockfile

-      - name: Setup .env
-        run: cp .env.default .env
-
-      - name: Wait for services to be ready
-        run: |
-          echo "Waiting for rest_server to be ready..."
-          timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
-          echo "Waiting for database to be ready..."
-          timeout 60 sh -c 'until docker compose -f ../docker-compose.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done' || echo "Database ready check timeout, continuing..."
-
-      - name: Generate API queries
-        run: pnpm generate:api:force
+      - name: Set up Frontend - Format OpenAPI schema
+        id: format-schema
+        run: pnpm prettier --write ./src/app/api/openapi.json

      - name: Check for API schema changes
        run: |
          if ! git diff --exit-code src/app/api/openapi.json; then
            echo "❌ API schema changes detected in src/app/api/openapi.json"
            echo ""
-            echo "The openapi.json file has been modified after running 'pnpm generate:api-all'."
+            echo "The openapi.json file has been modified after exporting the API schema."
            echo "This usually means changes have been made in the BE endpoints without updating the Frontend."
            echo "The API schema is now out of sync with the Front-end queries."
            echo ""
            echo "To fix this:"
-            echo "1. Pull the backend 'docker compose pull && docker compose up -d --build --force-recreate'"
-            echo "2. Run 'pnpm generate:api' locally"
-            echo "3. Run 'pnpm types' locally"
-            echo "4. Fix any TypeScript errors that may have been introduced"
-            echo "5. Commit and push your changes"
+            printf '\nIn the backend directory:\n'
+            echo "1. Run 'poetry run export-api-schema --output ../frontend/src/app/api/openapi.json'"
+            printf '\nIn the frontend directory:\n'
+            echo "2. Run 'pnpm prettier --write src/app/api/openapi.json'"
+            echo "3. Run 'pnpm generate:api'"
+            echo "4. Run 'pnpm types'"
+            echo "5. Fix any TypeScript errors that may have been introduced"
+            echo "6. Commit and push your changes"
            echo ""
            exit 1
          else
            echo "✅ No API schema changes detected"
          fi

-      - name: Run Typescript checks
+      - name: Set up Frontend - Generate API client
+        id: generate-api-client
+        run: pnpm orval --config ./orval.config.ts
+        # Continue with type generation & check even if there are schema changes
+        if: success() || (steps.format-schema.outcome == 'success')
+
+      - name: Check for TypeScript errors
        run: pnpm types
+        if: success() || (steps.generate-api-client.outcome == 'success')
+
+  e2e_test:
+    name: end-to-end tests
+    runs-on: big-boi
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v6
+        with:
+          submodules: recursive
+
+      - name: Set up Platform - Copy default supabase .env
+        run: |
+          cp ../.env.default ../.env
+
+      - name: Set up Platform - Copy backend .env and set OpenAI API key
+        run: |
+          cp ../backend/.env.default ../backend/.env
+          echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
+        env:
+          # Used by E2E test data script to generate embeddings for approved store agents
+          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+
+      - name: Set up Platform - Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+        with:
+          driver: docker-container
+          driver-opts: network=host
+
+      - name: Set up Platform - Expose GHA cache to docker buildx CLI
+        uses: crazy-max/ghaction-github-runtime@v4
+
+      - name: Set up Platform - Build Docker images (with cache)
+        working-directory: autogpt_platform
+        run: |
+          pip install pyyaml
+
+          # Resolve extends and generate a flat compose file that bake can understand
+          docker compose -f docker-compose.yml config > docker-compose.resolved.yml
+
+          # Add cache configuration to the resolved compose file
+          python ../.github/workflows/scripts/docker-ci-fix-compose-build-cache.py \
+            --source docker-compose.resolved.yml \
+            --cache-from "type=gha" \
+            --cache-to "type=gha,mode=max" \
+            --backend-hash "${{ hashFiles('autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/poetry.lock', 'autogpt_platform/backend/backend/**') }}" \
+            --frontend-hash "${{ hashFiles('autogpt_platform/frontend/Dockerfile', 'autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/src/**') }}" \
+            --git-ref "${{ github.ref }}"
+
+          # Build with bake using the resolved compose file (now includes cache config)
+          docker buildx bake --allow=fs.read=.. -f docker-compose.resolved.yml --load
+        env:
+          NEXT_PUBLIC_PW_TEST: true
+
+      - name: Set up tests - Cache E2E test data
+        id: e2e-data-cache
+        uses: actions/cache@v5
+        with:
+          path: /tmp/e2e_test_data.sql
+          key: e2e-test-data-${{ hashFiles('autogpt_platform/backend/test/e2e_test_data.py', 'autogpt_platform/backend/migrations/**', '.github/workflows/platform-fullstack-ci.yml') }}
+
+      - name: Set up Platform - Start Supabase DB + Auth
+        run: |
+          docker compose -f ../docker-compose.resolved.yml up -d db auth --no-build
+          echo "Waiting for database to be ready..."
+          timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done'
+          echo "Waiting for auth service to be ready..."
+          timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -c "SELECT 1 FROM auth.users LIMIT 1" 2>/dev/null; do sleep 2; done' || echo "Auth schema check timeout, continuing..."
+
+      - name: Set up Platform - Run migrations
+        run: |
+          echo "Running migrations..."
+          docker compose -f ../docker-compose.resolved.yml run --rm migrate
+          echo "✅ Migrations completed"
+        env:
+          NEXT_PUBLIC_PW_TEST: true
+
+      - name: Set up tests - Load cached E2E test data
+        if: steps.e2e-data-cache.outputs.cache-hit == 'true'
+        run: |
+          echo "✅ Found cached E2E test data, restoring..."
+          {
+            echo "SET session_replication_role = 'replica';"
+            cat /tmp/e2e_test_data.sql
+            echo "SET session_replication_role = 'origin';"
+          } | docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -b
+          # Refresh materialized views after restore
+          docker compose -f ../docker-compose.resolved.yml exec -T db \
+            psql -U postgres -d postgres -b -c "SET search_path TO platform; SELECT refresh_store_materialized_views();" || true
+
+          echo "✅ E2E test data restored from cache"
+
+      - name: Set up Platform - Start (all other services)
+        run: |
+          docker compose -f ../docker-compose.resolved.yml up -d --no-build
+          echo "Waiting for rest_server to be ready..."
+          timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
+        env:
+          NEXT_PUBLIC_PW_TEST: true
+
+      - name: Set up tests - Create E2E test data
+        if: steps.e2e-data-cache.outputs.cache-hit != 'true'
+        run: |
+          echo "Creating E2E test data..."
+          docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.resolved.yml ps -q rest_server):/tmp/e2e_test_data.py
+          docker compose -f ../docker-compose.resolved.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
+            echo "❌ E2E test data creation failed!"
+            docker compose -f ../docker-compose.resolved.yml logs --tail=50 rest_server
+            exit 1
+          }
+
+          # Dump auth.users + platform schema for cache (two separate dumps)
+          echo "Dumping database for cache..."
+          {
+            docker compose -f ../docker-compose.resolved.yml exec -T db \
+              pg_dump -U postgres --data-only --column-inserts \
+              --table='auth.users' postgres
+            docker compose -f ../docker-compose.resolved.yml exec -T db \
+              pg_dump -U postgres --data-only --column-inserts \
+              --schema=platform \
+              --exclude-table='platform._prisma_migrations' \
+              --exclude-table='platform.apscheduler_jobs' \
+              --exclude-table='platform.apscheduler_jobs_batched_notifications' \
+              postgres
+          } > /tmp/e2e_test_data.sql
+
+          echo "✅ Database dump created for caching ($(wc -l < /tmp/e2e_test_data.sql) lines)"
+
+      - name: Set up tests - Enable corepack
+        run: corepack enable
+
+      - name: Set up tests - Set up Node
+        uses: actions/setup-node@v6
+        with:
+          node-version: "22.18.0"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+
+      - name: Set up tests - Install dependencies
+        run: pnpm install --frozen-lockfile
+
+      - name: Set up tests - Install browser 'chromium'
+        run: pnpm playwright install --with-deps chromium
+
+      - name: Run Playwright tests
+        run: pnpm test:no-build
+        continue-on-error: false
+
+      - name: Upload Playwright report
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: playwright-report
+          path: autogpt_platform/frontend/playwright-report
+          if-no-files-found: ignore
+          retention-days: 3
+
+      - name: Upload Playwright test results
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: playwright-test-results
+          path: autogpt_platform/frontend/test-results
+          if-no-files-found: ignore
+          retention-days: 3
+
+      - name: Print Final Docker Compose logs
+        if: always()
+        run: docker compose -f ../docker-compose.resolved.yml logs
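
Editorial note on the readiness checks in the `e2e_test` job above: they all share one shape, polling a command every two seconds inside `timeout 60`. The sketch below shows that bounded-polling pattern as a reusable function. It is an illustration only and not part of the PR; the name `wait_until` and its argument order are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): run a command repeatedly until it
# succeeds, giving up once <timeout_s> seconds have elapsed.
# Usage: wait_until <timeout_s> <interval_s> <command> [args...]
wait_until() {
  local timeout_s="$1" interval_s="$2"
  shift 2

  local deadline=$(( $(date +%s) + timeout_s ))

  until "$@"; do
    # Stop polling once the deadline has passed.
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval_s"
  done
}
```

A call such as `wait_until 60 2 curl -f http://localhost:8006/health || echo "timeout, continuing..."` would mirror the rest_server check. In the workflow as written, the database wait fails the step on timeout, while the auth and rest_server waits only log a message and continue. Keeping the `|| echo ...` at the call site leaves that choice explicit for each check.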
--- a/.gitignore
+++ b/.gitignore
@@ -180,4 +180,6 @@ autogpt_platform/backend/settings.py
 .claude/settings.local.json
 CLAUDE.local.md
 /autogpt_platform/backend/logs
-.next
+.next
+# Implementation plans (generated by AI agents)
+plans/
--- a/.nvmrc
+++ b/.nvmrc
@@ -0,0 +1 @@
+22
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -1,3 +1,10 @@
+default_install_hook_types:
+  - pre-commit
+  - pre-push
+  - post-checkout
+
+default_stages: [pre-commit]
+
 repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0
@@ -17,6 +24,7 @@ repos:
        name: Detect secrets
        description: Detects high entropy strings that are likely to be passwords.
        files: ^autogpt_platform/
+        exclude: pnpm-lock\.yaml$
        stages: [pre-push]

  - repo: local
@@ -26,49 +34,106 @@ repos:
      - id: poetry-install
        name: Check & Install dependencies - AutoGPT Platform - Backend
        alias: poetry-install-platform-backend
-        entry: poetry -C autogpt_platform/backend install
        # include autogpt_libs source (since it's a path dependency)
-        files: ^autogpt_platform/(backend|autogpt_libs)/poetry\.lock$
-        types: [file]
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^autogpt_platform/(backend|autogpt_libs)/poetry\.lock$" || exit 0;
+          poetry -C autogpt_platform/backend install
+          '
+        always_run: true
        language: system
        pass_filenames: false
+        stages: [pre-commit, post-checkout]

      - id: poetry-install
        name: Check & Install dependencies - AutoGPT Platform - Libs
        alias: poetry-install-platform-libs
-        entry: poetry -C autogpt_platform/autogpt_libs install
-        files: ^autogpt_platform/autogpt_libs/poetry\.lock$
-        types: [file]
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^autogpt_platform/autogpt_libs/poetry\.lock$" || exit 0;
+          poetry -C autogpt_platform/autogpt_libs install
+          '
+        always_run: true
        language: system
        pass_filenames: false
+        stages: [pre-commit, post-checkout]
+
+      - id: pnpm-install
+        name: Check & Install dependencies - AutoGPT Platform - Frontend
+        alias: pnpm-install-platform-frontend
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^autogpt_platform/frontend/pnpm-lock\.yaml$" || exit 0;
+          pnpm --prefix autogpt_platform/frontend install
+          '
+        always_run: true
+        language: system
+        pass_filenames: false
+        stages: [pre-commit, post-checkout]

      - id: poetry-install
        name: Check & Install dependencies - Classic - AutoGPT
        alias: poetry-install-classic-autogpt
-        entry: poetry -C classic/original_autogpt install
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^classic/(original_autogpt|forge)/poetry\.lock$" || exit 0;
+          poetry -C classic/original_autogpt install
+          '
        # include forge source (since it's a path dependency)
-        files: ^classic/(original_autogpt|forge)/poetry\.lock$
-        types: [file]
+        always_run: true
        language: system
        pass_filenames: false
+        stages: [pre-commit, post-checkout]

      - id: poetry-install
        name: Check & Install dependencies - Classic - Forge
        alias: poetry-install-classic-forge
-        entry: poetry -C classic/forge install
-        files: ^classic/forge/poetry\.lock$
-        types: [file]
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^classic/forge/poetry\.lock$" || exit 0;
+          poetry -C classic/forge install
+          '
+        always_run: true
        language: system
        pass_filenames: false
+        stages: [pre-commit, post-checkout]

      - id: poetry-install
        name: Check & Install dependencies - Classic - Benchmark
        alias: poetry-install-classic-benchmark
-        entry: poetry -C classic/benchmark install
-        files: ^classic/benchmark/poetry\.lock$
-        types: [file]
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^classic/benchmark/poetry\.lock$" || exit 0;
+          poetry -C classic/benchmark install
+          '
+        always_run: true
        language: system
        pass_filenames: false
+        stages: [pre-commit, post-checkout]

  - repo: local
    # For proper type checking, Prisma client must be up-to-date.
@@ -76,12 +141,54 @@ repos:
      - id: prisma-generate
        name: Prisma Generate - AutoGPT Platform - Backend
        alias: prisma-generate-platform-backend
-        entry: bash -c 'cd autogpt_platform/backend && poetry run prisma generate'
+        entry: >
+          bash -c '
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --name-only "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF"
+          else
+            git diff --cached --name-only
+          fi | grep -qE "^autogpt_platform/((backend|autogpt_libs)/poetry\.lock|backend/schema\.prisma)$" || exit 0;
+          cd autogpt_platform/backend
+          && poetry run prisma generate
+          && poetry run gen-prisma-stub
+          '
        # include everything that triggers poetry install + the prisma schema
-        files: ^autogpt_platform/((backend|autogpt_libs)/poetry\.lock|backend/schema.prisma)$
-        types: [file]
+        always_run: true
        language: system
        pass_filenames: false
+        stages: [pre-commit, post-checkout]
+
+      - id: export-api-schema
+        name: Export API schema - AutoGPT Platform - Backend -> Frontend
+        alias: export-api-schema-platform
+        entry: >
+          bash -c '
+          cd autogpt_platform/backend
+          && poetry run export-api-schema --output ../frontend/src/app/api/openapi.json
+          && cd ../frontend
+          && pnpm prettier --write ./src/app/api/openapi.json
+          '
+        files: ^autogpt_platform/backend/
+        language: system
+        pass_filenames: false
+
+      - id: generate-api-client
+        name: Generate API client - AutoGPT Platform - Frontend
+        alias: generate-api-client-platform-frontend
+        entry: >
+          bash -c '
+          SCHEMA=autogpt_platform/frontend/src/app/api/openapi.json;
+          if [ -n "$PRE_COMMIT_FROM_REF" ]; then
+            git diff --quiet "$PRE_COMMIT_FROM_REF" "$PRE_COMMIT_TO_REF" -- "$SCHEMA" && exit 0
+          else
+            git diff --quiet HEAD -- "$SCHEMA" && exit 0
+          fi;
+          cd autogpt_platform/frontend && pnpm generate:api
+          '
+        always_run: true
+        language: system
+        pass_filenames: false
+        stages: [pre-commit, post-checkout]

  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.7.2
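
The install hooks in this file share one gate: list the changed files, then exit early unless a relevant lockfile is among them. The following Python sketch restates that decision as a reading aid only. The names `diff_command`, `should_run`, and `BACKEND_LOCKS` are hypothetical and do not exist in the repository, and the sketch only builds the `git` argument list rather than executing it. It assumes `PRE_COMMIT_FROM_REF` and `PRE_COMMIT_TO_REF` are provided by pre-commit for ref-based stages such as `post-checkout`.

```python
import re
from typing import Iterable, List, Mapping


def diff_command(env: Mapping[str, str]) -> List[str]:
    """Return the git command a hook would use to list changed files."""
    from_ref = env.get("PRE_COMMIT_FROM_REF", "")
    if from_ref:  # mirrors `[ -n "$PRE_COMMIT_FROM_REF" ]`
        to_ref = env.get("PRE_COMMIT_TO_REF", "")
        return ["git", "diff", "--name-only", from_ref, to_ref]
    # No ref range available: fall back to the staged changes.
    return ["git", "diff", "--cached", "--name-only"]


def should_run(changed_files: Iterable[str], pattern: str) -> bool:
    """Mirror `grep -qE PATTERN`: true if any changed path matches."""
    return any(re.search(pattern, path) for path in changed_files)


# Same pattern as the platform backend install hook.
BACKEND_LOCKS = r"^autogpt_platform/(backend|autogpt_libs)/poetry\.lock$"
```

Because `always_run: true` is set, the hook body runs on every invocation, so the gate above is what keeps an unrelated commit or checkout from triggering a dependency install.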
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,6 +1,6 @@
 # AutoGPT Platform Contribution Guide

-This guide provides context for Codex when updating the **autogpt_platform** folder.
+This guide provides context for coding agents when updating the **autogpt_platform** folder.

 ## Directory overview

--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1 @@
+@AGENTS.md
--- a/README.md
+++ b/README.md
@@ -83,13 +83,13 @@ The AutoGPT frontend is where users interact with our powerful AI automation pla

   **Agent Builder:** For those who want to customize, our intuitive, low-code interface allows you to design and configure your own AI agents. 
   
-   **Workflow Management:** Build, modify, and optimize your automation workflows with ease. You build your agent by connecting blocks, where each block     performs a single action.
+   **Workflow Management:** Build, modify, and optimize your automation workflows with ease. You build your agent by connecting blocks, where each block performs a single action.
   
   **Deployment Controls:** Manage the lifecycle of your agents, from testing to production.
   
   **Ready-to-Use Agents:** Don't want to build? Simply select from our library of pre-configured agents and put them to work immediately.
   
-   **Agent Interaction:** Whether you've built your own or are using pre-configured agents, easily run and interact with them through our user-friendly      interface.
+   **Agent Interaction:** Whether you've built your own or are using pre-configured agents, easily run and interact with them through our user-friendly interface.

   **Monitoring and Analytics:** Keep track of your agents' performance and gain insights to continually improve your automation processes.

--- a/autogpt_platform/.gitignore
+++ b/autogpt_platform/.gitignore
@@ -1,2 +1,3 @@
 *.ignore.*
-*.ign.*
+*.ign.*
+.application.logs
--- a/autogpt_platform/AGENTS.md
+++ b/autogpt_platform/AGENTS.md
@@ -0,0 +1,120 @@
+# AutoGPT Platform
+
+This file provides guidance to coding agents when working with code in this repository.
+
+## Repository Overview
+
+AutoGPT Platform is a monorepo containing:
+
+- **Backend** (`backend`): Python FastAPI server with async support
+- **Frontend** (`frontend`): Next.js React application
+- **Shared Libraries** (`autogpt_libs`): Common Python utilities
+
+## Component Documentation
+
+- **Backend**: See @backend/AGENTS.md for backend-specific commands, architecture, and development tasks
+- **Frontend**: See @frontend/AGENTS.md for frontend-specific commands, architecture, and development patterns
+
+## Key Concepts
+
+1. **Agent Graphs**: Workflow definitions stored as JSON, executed by the backend
+2. **Blocks**: Reusable components in `backend/backend/blocks/` that perform specific tasks
+3. **Integrations**: OAuth and API connections stored per user
+4. **Store**: Marketplace for sharing agent templates
+5. **Virus Scanning**: ClamAV integration for file upload security
+
+### Environment Configuration
+
+#### Configuration Files
+
+- **Backend**: `backend/.env.default` (defaults) → `backend/.env` (user overrides)
+- **Frontend**: `frontend/.env.default` (defaults) → `frontend/.env` (user overrides)
+- **Platform**: `.env.default` (Supabase/shared defaults) → `.env` (user overrides)
+
+#### Docker Environment Loading Order
+
+1. `.env.default` files provide base configuration (tracked in git)
+2. `.env` files provide user-specific overrides (gitignored)
+3. Docker Compose `environment:` sections provide service-specific overrides
+4. Shell environment variables have highest precedence
+
+#### Key Points
+
+- All services use hardcoded defaults in docker-compose files (no `${VARIABLE}` substitutions)
+- The `env_file` directive loads variables INTO containers at runtime
+- Backend/Frontend services use YAML anchors for consistent configuration
+- Supabase services (`db/docker/docker-compose.yml`) follow the same pattern
+
+### Branching Strategy
+
+- **`dev`** is the main development branch. All PRs should target `dev`.
+- **`master`** is the production branch. Only used for production releases.
+
+### Creating Pull Requests
+
+- Create the PR against the `dev` branch of the repository.
+- **Split PRs by concern** — each PR should have a single clear purpose. For example, "usage tracking" and "credit charging" should be separate PRs even if related. Combining multiple concerns makes it harder for reviewers to understand what belongs to what.
+- Ensure the branch name is descriptive (e.g., `feature/add-new-block`)
+- Use conventional commit messages (see below)
+- **Structure the PR description with Why / What / How** — Why: the motivation (what problem it solves, what's broken/missing without it); What: high-level summary of changes; How: approach, key implementation details, or architecture decisions. Reviewers need all three to judge whether the approach fits the problem.
+- Fill out the .github/PULL_REQUEST_TEMPLATE.md template as the PR description
+- Always use `--body-file` to pass PR body — avoids shell interpretation of backticks and special characters:
+  ```bash
+  PR_BODY=$(mktemp)
+  cat > "$PR_BODY" << 'PREOF'
+  ## Summary
+  - use `backticks` freely here
+  PREOF
+  gh pr create --title "..." --body-file "$PR_BODY" --base dev
+  rm "$PR_BODY"
+  ```
+- Run the GitHub pre-commit hooks to ensure code quality.
+
+### Test-Driven Development (TDD)
+
+When fixing a bug or adding a feature, follow a test-first approach:
+
+1. **Write a failing test first** — create a test that reproduces the bug or validates the new behavior, marked with `@pytest.mark.xfail` (backend) or `.fixme` (Playwright). Run it to confirm it fails for the right reason.
+2. **Implement the fix/feature** — write the minimal code to make the test pass.
+3. **Remove the xfail marker** — once the test passes, remove the `xfail`/`.fixme` annotation and run the full test suite to confirm nothing else broke.
+
+This ensures every change is covered by a test and that the test actually validates the intended behavior.
+
+### Reviewing/Revising Pull Requests
+
+Use `/pr-review` to review a PR or `/pr-address` to address comments.
+
+When fetching comments manually:
+- `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate` — top-level reviews
+- `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments --paginate` — inline review comments (always paginate to avoid missing comments beyond page 1)
+- `gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments` — PR conversation comments
+
+### Conventional Commits
+
+Use this format for commit messages and Pull Request titles:
+
+**Conventional Commit Types:**
+
+- `feat`: Introduces a new feature to the codebase
+- `fix`: Patches a bug in the codebase
+- `refactor`: Code change that neither fixes a bug nor adds a feature; also applies to removing features
+- `ci`: Changes to CI configuration
+- `docs`: Documentation-only changes
+- `dx`: Improvements to the developer experience
+
+**Recommended Base Scopes:**
+
+- `platform`: Changes affecting both frontend and backend
+- `frontend`
+- `backend`
+- `infra`
+- `blocks`: Modifications/additions of individual blocks
+
+**Subscope Examples:**
+
+- `backend/executor`
+- `backend/db`
+- `frontend/builder` (includes changes to the block UI component)
+- `infra/prod`
+
+Use these scopes and subscopes for clarity and consistency in commit messages.
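
The commit types and scopes listed in the guide above can be checked mechanically. The sketch below is one possible validator, not something shipped in the repository. It assumes the usual `type(scope): summary` title shape (the guide lists types and scopes but does not spell out the punctuation), and it treats the recommended base scopes as required, which is stricter than the guide. The names `TYPES`, `BASE_SCOPES`, and `check_title` are hypothetical.

```python
import re

# Types and base scopes copied from the guide.
TYPES = ("feat", "fix", "refactor", "ci", "docs", "dx")
BASE_SCOPES = ("platform", "frontend", "backend", "infra", "blocks")

# Assumed title shape: type, optional (scope or scope/subscope), ": ", summary.
TITLE_RE = re.compile(
    r"^(?P<type>[a-z]+)"
    r"(?:\((?P<scope>[a-z_]+(?:/[a-z_-]+)?)\))?"
    r": (?P<summary>\S.*)$"
)


def check_title(title: str) -> bool:
    """Return True if the title uses a listed type and, if present, a listed base scope."""
    match = TITLE_RE.match(title)
    if match is None or match.group("type") not in TYPES:
        return False
    scope = match.group("scope")
    # A subscope such as backend/executor is accepted when its base scope is listed.
    return scope is None or scope.split("/")[0] in BASE_SCOPES
```

For example, `check_title("fix(backend/executor): handle empty graph")` passes, while a title with an unlisted type such as `feature(...)` is rejected.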
--- a/autogpt_platform/CLAUDE.md
+++ b/autogpt_platform/CLAUDE.md
@@ -1,95 +1 @@
-# CLAUDE.md
-
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-
-## Repository Overview
-
-AutoGPT Platform is a monorepo containing:
-
-- **Backend** (`backend`): Python FastAPI server with async support
-- **Frontend** (`frontend`): Next.js React application
-- **Shared Libraries** (`autogpt_libs`): Common Python utilities
-
-## Component Documentation
-
-- **Backend**: See @backend/CLAUDE.md for backend-specific commands, architecture, and development tasks
-- **Frontend**: See @frontend/CLAUDE.md for frontend-specific commands, architecture, and development patterns
-
-## Key Concepts
-
-1. **Agent Graphs**: Workflow definitions stored as JSON, executed by the backend
-2. **Blocks**: Reusable components in `backend/backend/blocks/` that perform specific tasks
-3. **Integrations**: OAuth and API connections stored per user
-4. **Store**: Marketplace for sharing agent templates
-5. **Virus Scanning**: ClamAV integration for file upload security
-
-### Environment Configuration
-
-#### Configuration Files
-
-- **Backend**: `backend/.env.default` (defaults) → `backend/.env` (user overrides)
-- **Frontend**: `frontend/.env.default` (defaults) → `frontend/.env` (user overrides)
-- **Platform**: `.env.default` (Supabase/shared defaults) → `.env` (user overrides)
-
-#### Docker Environment Loading Order
-
-1. `.env.default` files provide base configuration (tracked in git)
-2. `.env` files provide user-specific overrides (gitignored)
-3. Docker Compose `environment:` sections provide service-specific overrides
-4. Shell environment variables have highest precedence
-
-#### Key Points
-
-- All services use hardcoded defaults in docker-compose files (no `${VARIABLE}` substitutions)
-- The `env_file` directive loads variables INTO containers at runtime
-- Backend/Frontend services use YAML anchors for consistent configuration
-- Supabase services (`db/docker/docker-compose.yml`) follow the same pattern
-
-### Branching Strategy
-
-- **`dev`** is the main development branch. All PRs should target `dev`.
-- **`master`** is the production branch. Only used for production releases.
-
-### Creating Pull Requests
-
-- Create the PR against the `dev` branch of the repository.
-- Ensure the branch name is descriptive (e.g., `feature/add-new-block`)
-- Use conventional commit messages (see below)
-- Fill out the .github/PULL_REQUEST_TEMPLATE.md template as the PR description
-- Run the github pre-commit hooks to ensure code quality.
-
-### Reviewing/Revising Pull Requests
-
-- When the user runs /pr-comments or tries to fetch them, also run gh api /repos/Significant-Gravitas/AutoGPT/pulls/[issuenum]/reviews to get the reviews
-- Use gh api /repos/Significant-Gravitas/AutoGPT/pulls/[issuenum]/reviews/[review_id]/comments to get the review contents
-- Use gh api /repos/Significant-Gravitas/AutoGPT/issues/9924/comments to get the pr specific comments
-
-### Conventional Commits
-
-Use this format for commit messages and Pull Request titles:
-
-**Conventional Commit Types:**
-
-- `feat`: Introduces a new feature to the codebase
-- `fix`: Patches a bug in the codebase
-- `refactor`: Code change that neither fixes a bug nor adds a feature; also applies to removing features
-- `ci`: Changes to CI configuration
-- `docs`: Documentation-only changes
-- `dx`: Improvements to the developer experience
-
-**Recommended Base Scopes:**
-
-- `platform`: Changes affecting both frontend and backend
-- `frontend`
-- `backend`
-- `infra`
-- `blocks`: Modifications/additions of individual blocks
-
-**Subscope Examples:**
-
-- `backend/executor`
-- `backend/db`
-- `frontend/builder` (includes changes to the block UI component)
-- `infra/prod`
-
-Use these scopes and subscopes for clarity and consistency in commit messages.
+@AGENTS.md
--- a/autogpt_platform/analytics/queries/auth_activities.sql
+++ b/autogpt_platform/analytics/queries/auth_activities.sql
@@ -0,0 +1,40 @@
+-- =============================================================
+-- View: analytics.auth_activities
+-- Looker source alias: ds49  |  Charts: 1
+-- =============================================================
+-- DESCRIPTION
+--   Tracks authentication events (login, logout, SSO, password
+--   reset, etc.) from Supabase's internal audit log.
+--   Useful for monitoring sign-in patterns and detecting anomalies.
+--
+-- SOURCE TABLES
+--   auth.audit_log_entries  — Supabase internal auth event log
+--
+-- OUTPUT COLUMNS
+--   created_at      TIMESTAMPTZ  When the auth event occurred
+--   actor_id        TEXT         User ID who triggered the event
+--   actor_via_sso   TEXT         Whether the action was via SSO ('true'/'false')
+--   action          TEXT         Event type (e.g. 'login', 'logout', 'token_refreshed')
+--
+-- WINDOW
+--   Rolling 90 days from current date
+--
+-- EXAMPLE QUERIES
+--   -- Daily login counts
+--   SELECT DATE_TRUNC('day', created_at) AS day, COUNT(*) AS logins
+--   FROM analytics.auth_activities
+--   WHERE action = 'login'
+--   GROUP BY 1 ORDER BY 1;
+--
+--   -- SSO vs password login breakdown
+--   SELECT actor_via_sso, COUNT(*) FROM analytics.auth_activities
+--   WHERE action = 'login' GROUP BY 1;
+-- =============================================================
+
+SELECT
+    created_at,
+    payload->>'actor_id'      AS actor_id,
+    payload->>'actor_via_sso' AS actor_via_sso,
+    payload->>'action'        AS action
+FROM auth.audit_log_entries
+WHERE created_at >= NOW() - INTERVAL '90 days'
--- a/autogpt_platform/analytics/queries/graph_execution.sql
+++ b/autogpt_platform/analytics/queries/graph_execution.sql
@@ -0,0 +1,105 @@
+-- =============================================================
+-- View: analytics.graph_execution
+-- Looker source alias: ds16  |  Charts: 21
+-- =============================================================
+-- DESCRIPTION
+--   One row per agent graph execution (last 90 days).
+--   Unpacks the JSONB stats column into individual numeric columns
+--   and normalises the executionStatus — runs that failed due to
+--   insufficient credits are reclassified as 'NO_CREDITS' for
+--   easier filtering.  Error messages are scrubbed of IDs and URLs
+--   to allow safe grouping.
+--
+-- SOURCE TABLES
+--   platform.AgentGraphExecution  — Execution records
+--   platform.AgentGraph           — Agent graph metadata (for name)
+--   platform.LibraryAgent         — To flag possibly-AI (safe-mode) agents
+--
+-- OUTPUT COLUMNS
+--   id                TEXT         Execution UUID
+--   agentGraphId      TEXT         Agent graph UUID
+--   agentGraphVersion INT          Graph version number
+--   executionStatus   TEXT         COMPLETED | FAILED | NO_CREDITS | RUNNING | QUEUED | TERMINATED
+--   createdAt         TIMESTAMPTZ  When the execution was queued
+--   updatedAt         TIMESTAMPTZ  Last status update time
+--   userId            TEXT         Owner user UUID
+--   agentGraphName    TEXT         Human-readable agent name
+--   cputime           DECIMAL      Total CPU seconds consumed
+--   walltime          DECIMAL      Total wall-clock seconds
+--   node_count        DECIMAL      Number of nodes in the graph
+--   nodes_cputime     DECIMAL      CPU time across all nodes
+--   nodes_walltime    DECIMAL      Wall time across all nodes
+--   execution_cost    DECIMAL      Credit cost of this execution
+--   correctness_score FLOAT        AI correctness score (if available)
+--   possibly_ai       BOOLEAN      True if agent has sensitive_action_safe_mode enabled
+--   groupedErrorMessage TEXT       Scrubbed error string (IDs/URLs replaced with wildcards)
+--
+-- WINDOW
+--   Rolling 90 days (createdAt > CURRENT_DATE - 90 days)
+--
+-- EXAMPLE QUERIES
+--   -- Daily execution counts by status
+--   SELECT DATE_TRUNC('day', "createdAt") AS day, "executionStatus", COUNT(*)
+--   FROM analytics.graph_execution
+--   GROUP BY 1, 2 ORDER BY 1;
+--
+--   -- Average cost per execution by agent
+--   SELECT "agentGraphName", AVG("execution_cost") AS avg_cost, COUNT(*) AS runs
+--   FROM analytics.graph_execution
+--   WHERE "executionStatus" = 'COMPLETED'
+--   GROUP BY 1 ORDER BY avg_cost DESC;
+--
+--   -- Top error messages
+--   SELECT "groupedErrorMessage", COUNT(*) AS occurrences
+--   FROM analytics.graph_execution
+--   WHERE "executionStatus" = 'FAILED'
+--   GROUP BY 1 ORDER BY 2 DESC LIMIT 20;
+-- =============================================================
+
+SELECT
+    ge."id"                                                        AS id,
+    ge."agentGraphId"                                              AS agentGraphId,
+    ge."agentGraphVersion"                                         AS agentGraphVersion,
+    CASE
+        WHEN jsonb_exists(ge."stats"::jsonb, 'error')
+         AND (
+               (ge."stats"::jsonb->>'error') ILIKE '%insufficient balance%'
+            OR (ge."stats"::jsonb->>'error') ILIKE '%you have no credits left%'
+             )
+        THEN 'NO_CREDITS'
+        ELSE CAST(ge."executionStatus" AS TEXT)
+    END                                                            AS executionStatus,
+    ge."createdAt"                                                 AS createdAt,
+    ge."updatedAt"                                                 AS updatedAt,
+    ge."userId"                                                    AS userId,
+    g."name"                                                       AS agentGraphName,
+    (ge."stats"::jsonb->>'cputime')::decimal                       AS cputime,
+    (ge."stats"::jsonb->>'walltime')::decimal                      AS walltime,
+    (ge."stats"::jsonb->>'node_count')::decimal                    AS node_count,
+    (ge."stats"::jsonb->>'nodes_cputime')::decimal                 AS nodes_cputime,
+    (ge."stats"::jsonb->>'nodes_walltime')::decimal                AS nodes_walltime,
+    (ge."stats"::jsonb->>'cost')::decimal                          AS execution_cost,
+    (ge."stats"::jsonb->>'correctness_score')::float               AS correctness_score,
+    COALESCE(la.possibly_ai, FALSE)                                AS possibly_ai,
+    REGEXP_REPLACE(
+        REGEXP_REPLACE(
+            TRIM(BOTH '"' FROM ge."stats"::jsonb->>'error'),
+            '(https?://)([A-Za-z0-9.-]+)(:[0-9]+)?(/[^\s]*)?',
+            '\1\2/...', 'gi'
+        ),
+        '[a-zA-Z0-9_:-]*\d[a-zA-Z0-9_:-]*', '*', 'g'
+    )                                                              AS groupedErrorMessage
+FROM platform."AgentGraphExecution" ge
+LEFT JOIN platform."AgentGraph" g
+       ON ge."agentGraphId" = g."id"
+      AND ge."agentGraphVersion" = g."version"
+LEFT JOIN (
+    SELECT DISTINCT ON ("userId", "agentGraphId")
+           "userId", "agentGraphId",
+           ("settings"::jsonb->>'sensitive_action_safe_mode')::boolean AS possibly_ai
+    FROM platform."LibraryAgent"
+    WHERE "isDeleted"  = FALSE
+      AND "isArchived" = FALSE
+    ORDER BY "userId", "agentGraphId", "agentGraphVersion" DESC
+) la ON la."userId" = ge."userId" AND la."agentGraphId" = ge."agentGraphId"
+WHERE ge."createdAt" > CURRENT_DATE - INTERVAL '90 days'
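
The `CASE` expression in this view reclassifies executions whose error text mentions exhausted credits. A rough Python equivalent is sketched below to make the rule easier to test in isolation. `normalise_status` and `NO_CREDIT_MARKERS` are hypothetical names, the lower-casing stands in for `ILIKE` on ASCII text, and the sketch only handles string errors, whereas the SQL `->>` operator would also stringify other JSON values.

```python
from typing import Any, Mapping

# Substrings taken from the two ILIKE patterns in the view.
NO_CREDIT_MARKERS = ("insufficient balance", "you have no credits left")


def normalise_status(execution_status: str, stats: Mapping[str, Any]) -> str:
    """Return 'NO_CREDITS' when stats['error'] mentions exhausted credits."""
    error = stats.get("error")
    if isinstance(error, str):
        lowered = error.lower()  # case-insensitive substring match, like ILIKE '%...%'
        if any(marker in lowered for marker in NO_CREDIT_MARKERS):
            return "NO_CREDITS"
    # Otherwise the stored status passes through unchanged.
    return execution_status
```

Note that, as in the SQL, the check does not look at the stored status first, so any execution with a matching error string is reported as `NO_CREDITS`.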
--- a/autogpt_platform/analytics/queries/node_block_execution.sql
+++ b/autogpt_platform/analytics/queries/node_block_execution.sql
@@ -0,0 +1,101 @@
+-- =============================================================
+-- View: analytics.node_block_execution
+-- Looker source alias: ds14  |  Charts: 11
+-- =============================================================
+-- DESCRIPTION
+--   One row per node (block) execution (last 90 days).
+--   Unpacks stats JSONB and joins to identify which block type
+--   was run.  For failed nodes, joins the error output and
+--   scrubs it for safe grouping.
+--
+-- SOURCE TABLES
+--   platform.AgentNodeExecution              — Node execution records
+--   platform.AgentNode                       — Node → block mapping
+--   platform.AgentBlock                      — Block name/ID
+--   platform.AgentNodeExecutionInputOutput   — Error output values
+--
+-- OUTPUT COLUMNS
+--   id                    TEXT         Node execution UUID
+--   agentGraphExecutionId TEXT         Parent graph execution UUID
+--   agentNodeId           TEXT         Node UUID within the graph
+--   executionStatus       TEXT         COMPLETED | FAILED | QUEUED | RUNNING | TERMINATED
+--   addedTime             TIMESTAMPTZ  When the node was queued
+--   queuedTime            TIMESTAMPTZ  When it entered the queue
+--   startedTime           TIMESTAMPTZ  When execution started
+--   endedTime             TIMESTAMPTZ  When execution finished
+--   inputSize             BIGINT       Input payload size in bytes
+--   outputSize            BIGINT       Output payload size in bytes
+--   walltime              NUMERIC      Wall-clock seconds for this node
+--   cputime               NUMERIC      CPU seconds for this node
+--   llmRetryCount         INT          Number of LLM retries
+--   llmCallCount          INT          Number of LLM API calls made
+--   inputTokenCount       BIGINT       LLM input tokens consumed
+--   outputTokenCount      BIGINT       LLM output tokens produced
+--   blockName             TEXT         Human-readable block name (e.g. 'OpenAIBlock')
+--   blockId               TEXT         Block UUID
+--   groupedErrorMessage   TEXT         Scrubbed error (IDs/URLs wildcarded)
+--   errorMessage          TEXT         Raw error output (only set when FAILED)
+--
+-- WINDOW
+--   Rolling 90 days (addedTime > CURRENT_DATE - 90 days)
+--
+-- EXAMPLE QUERIES
+--   -- Most-used blocks by execution count
+--   SELECT "blockName", COUNT(*) AS executions,
+--          COUNT(*) FILTER (WHERE "executionStatus"='FAILED') AS failures
+--   FROM analytics.node_block_execution
+--   GROUP BY 1 ORDER BY executions DESC LIMIT 20;
+--
+--   -- Average LLM token usage per block
+--   SELECT "blockName",
+--          AVG("inputTokenCount") AS avg_input_tokens,
+--          AVG("outputTokenCount") AS avg_output_tokens
+--   FROM analytics.node_block_execution
+--   WHERE "llmCallCount" > 0
+--   GROUP BY 1 ORDER BY avg_input_tokens DESC;
+--
+--   -- Top failure reasons
+--   SELECT "blockName", "groupedErrorMessage", COUNT(*) AS count
+--   FROM analytics.node_block_execution
+--   WHERE "executionStatus" = 'FAILED'
+--   GROUP BY 1, 2 ORDER BY count DESC LIMIT 20;
+-- =============================================================
+
+SELECT
+    ne."id"                                                            AS id,
+    ne."agentGraphExecutionId"                                         AS agentGraphExecutionId,
+    ne."agentNodeId"                                                   AS agentNodeId,
+    CAST(ne."executionStatus" AS TEXT)                                 AS executionStatus,
+    ne."addedTime"                                                     AS addedTime,
+    ne."queuedTime"                                                    AS queuedTime,
+    ne."startedTime"                                                   AS startedTime,
+    ne."endedTime"                                                     AS endedTime,
+    (ne."stats"::jsonb->>'input_size')::bigint                         AS inputSize,
+    (ne."stats"::jsonb->>'output_size')::bigint                        AS outputSize,
+    (ne."stats"::jsonb->>'walltime')::numeric                          AS walltime,
+    (ne."stats"::jsonb->>'cputime')::numeric                           AS cputime,
+    (ne."stats"::jsonb->>'llm_retry_count')::int                       AS llmRetryCount,
+    (ne."stats"::jsonb->>'llm_call_count')::int                        AS llmCallCount,
+    (ne."stats"::jsonb->>'input_token_count')::bigint                  AS inputTokenCount,
+    (ne."stats"::jsonb->>'output_token_count')::bigint                 AS outputTokenCount,
+    b."name"                                                           AS blockName,
+    b."id"                                                             AS blockId,
+    REGEXP_REPLACE(
+        REGEXP_REPLACE(
+            TRIM(BOTH '"' FROM eio."data"::text),
+            '(https?://)([A-Za-z0-9.-]+)(:[0-9]+)?(/[^\s]*)?',
+            '\1\2/...', 'gi'
+        ),
+        '[a-zA-Z0-9_:-]*\d[a-zA-Z0-9_:-]*', '*', 'g'
+    )                                                                  AS groupedErrorMessage,
+    eio."data"                                                         AS errorMessage
+FROM platform."AgentNodeExecution" ne
+LEFT JOIN platform."AgentNode" nd
+       ON ne."agentNodeId" = nd."id"
+LEFT JOIN platform."AgentBlock" b
+       ON nd."agentBlockId" = b."id"
+LEFT JOIN platform."AgentNodeExecutionInputOutput" eio
+       ON eio."referencedByOutputExecId" = ne."id"
+      AND eio."name" = 'error'
+      AND ne."executionStatus" = 'FAILED'
+WHERE ne."addedTime" > CURRENT_DATE - INTERVAL '90 days'
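
The nested `REGEXP_REPLACE` calls that produce `groupedErrorMessage` can be approximated in Python to see what the scrubbed strings look like. This is a sketch under assumptions: Python's `re` module is close to, but not identical with, PostgreSQL's regex engine (for instance, `\d` in Python also matches non-ASCII digits), and `scrub_error`, `URL_RE`, and `ID_RE` are hypothetical names.

```python
import re

# Step 1 pattern: scheme, host, optional port, optional path (case-insensitive).
URL_RE = re.compile(r"(https?://)([A-Za-z0-9.-]+)(:[0-9]+)?(/[^\s]*)?", re.IGNORECASE)
# Step 2 pattern: any identifier-like token that contains at least one digit.
ID_RE = re.compile(r"[a-zA-Z0-9_:-]*\d[a-zA-Z0-9_:-]*")


def scrub_error(raw: str) -> str:
    """Collapse URLs to scheme plus host and replace digit-bearing tokens with '*'."""
    text = raw.strip('"')                 # TRIM(BOTH '"' FROM ...)
    text = URL_RE.sub(r"\1\2/...", text)  # drop the port and path, keep scheme and host
    return ID_RE.sub("*", text)           # wildcard IDs, counts, and similar tokens
```

With these patterns, two errors that differ only in an ID or a URL path end up with the same scrubbed string, which is what makes `GROUP BY "groupedErrorMessage"` useful. A host name that itself contains a digit is partially wildcarded by the second step.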
--- a/autogpt_platform/analytics/queries/retention_agent.sql
+++ b/autogpt_platform/analytics/queries/retention_agent.sql
@@ -0,0 +1,97 @@
+-- =============================================================
+-- View: analytics.retention_agent
+-- Looker source alias: ds35  |  Charts: 2
+-- =============================================================
+-- DESCRIPTION
+--   Weekly cohort retention broken down per individual agent.
+--   Cohort = week of a user's first use of THAT specific agent.
+--   Tells you which agents keep users coming back vs. one-shot
+--   use. Only includes cohorts from the last 180 days.
+--
+-- SOURCE TABLES
+--   platform.AgentGraphExecution  — Execution records (user × agent × time)
+--   platform.AgentGraph           — Agent names
+--
+-- OUTPUT COLUMNS
+--   agent_id            TEXT   Agent graph UUID
+--   agent_label         TEXT   'AgentName [first8chars]'
+--   agent_label_n       TEXT   'AgentName [first8chars] (n=total_users)'
+--   cohort_week_start   DATE   Week users first ran this agent
+--   cohort_label        TEXT   ISO week label
+--   cohort_label_n      TEXT   ISO week label with cohort size
+--   user_lifetime_week  INT    Weeks since first use of this agent
+--   cohort_users        BIGINT Users in this cohort for this agent
+--   active_users        BIGINT Users who ran the agent again in week k
+--   retention_rate      FLOAT  active_users / cohort_users
+--   cohort_users_w0     BIGINT cohort_users only at week 0 (safe to SUM)
+--   agent_total_users   BIGINT Total users across all cohorts for this agent
+--
+-- EXAMPLE QUERIES
+--   -- Best-retained agents at week 2
+--   SELECT agent_label, AVG(retention_rate) AS w2_retention
+--   FROM analytics.retention_agent
+--   WHERE user_lifetime_week = 2 AND cohort_users >= 10
+--   GROUP BY 1 ORDER BY w2_retention DESC LIMIT 10;
+--
+--   -- Agents with most unique users
+--   SELECT DISTINCT agent_label, agent_total_users
+--   FROM analytics.retention_agent
+--   ORDER BY agent_total_users DESC LIMIT 20;
+-- =============================================================
+
+WITH params AS (SELECT 12::int AS max_weeks, (CURRENT_DATE - INTERVAL '180 days') AS cohort_start),
+events AS (
+  SELECT e."userId"::text AS user_id, e."agentGraphId" AS agent_id,
+         e."createdAt"::timestamptz AS created_at,
+         DATE_TRUNC('week', e."createdAt")::date AS week_start
+  FROM platform."AgentGraphExecution" e WHERE e."userId" IS NOT NULL
+),
+first_use AS (
+  SELECT user_id, agent_id, MIN(created_at) AS first_use_at,
+         DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
+  FROM events GROUP BY 1,2
+  HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
+),
+activity_weeks AS (SELECT DISTINCT user_id, agent_id, week_start FROM events),
+user_week_age AS (
+  SELECT aw.user_id, aw.agent_id, fu.cohort_week_start,
+         ((aw.week_start - DATE_TRUNC('week',fu.first_use_at)::date)/7)::int AS user_lifetime_week
+  FROM activity_weeks aw JOIN first_use fu USING (user_id, agent_id)
+  WHERE aw.week_start >= DATE_TRUNC('week',fu.first_use_at)::date
+),
+active_counts AS (
+  SELECT agent_id, cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users
+  FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2,3
+),
+cohort_sizes AS (
+  SELECT agent_id, cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_use GROUP BY 1,2
+),
+cohort_caps AS (
+  SELECT cs.agent_id, cs.cohort_week_start, cs.cohort_users,
+         LEAST((SELECT max_weeks FROM params),
+               GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date-cs.cohort_week_start)/7)::int)) AS cap_weeks
+  FROM cohort_sizes cs
+),
+grid AS (
+  SELECT cc.agent_id, cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
+  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
+),
+agent_names AS (SELECT DISTINCT ON (g."id") g."id" AS agent_id, g."name" AS agent_name FROM platform."AgentGraph" g ORDER BY g."id", g."version" DESC),
+agent_total_users AS (SELECT agent_id, SUM(cohort_users) AS agent_total_users FROM cohort_sizes GROUP BY 1)
+SELECT
+  g.agent_id,
+  COALESCE(an.agent_name,'(unnamed)')||' ['||LEFT(g.agent_id::text,8)||']'  AS agent_label,
+  COALESCE(an.agent_name,'(unnamed)')||' ['||LEFT(g.agent_id::text,8)||'] (n='||COALESCE(atu.agent_total_users,0)||')' AS agent_label_n,
+  g.cohort_week_start,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')                               AS cohort_label,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')'  AS cohort_label_n,
+  g.user_lifetime_week, g.cohort_users,
+  COALESCE(ac.active_users,0)                                              AS active_users,
+  COALESCE(ac.active_users,0)::float / NULLIF(g.cohort_users,0)           AS retention_rate,
+  CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END         AS cohort_users_w0,
+  COALESCE(atu.agent_total_users,0)                                        AS agent_total_users
+FROM grid g
+LEFT JOIN active_counts     ac  ON ac.agent_id=g.agent_id AND ac.cohort_week_start=g.cohort_week_start AND ac.user_lifetime_week=g.user_lifetime_week
+LEFT JOIN agent_names       an  ON an.agent_id=g.agent_id
+LEFT JOIN agent_total_users atu ON atu.agent_id=g.agent_id
+ORDER BY agent_label, g.cohort_week_start, g.user_lifetime_week;
--- a/autogpt_platform/analytics/queries/retention_execution_daily.sql
+++ b/autogpt_platform/analytics/queries/retention_execution_daily.sql
@@ -0,0 +1,81 @@
+-- =============================================================
+-- View: analytics.retention_execution_daily
+-- Looker source alias: ds111  |  Charts: 1
+-- =============================================================
+-- DESCRIPTION
+--   Daily cohort retention based on agent executions.
+--   Cohort anchor = day of user's FIRST ever execution.
+--   Only includes cohorts from the last 90 days, up to day 30.
+--   Great for early engagement analysis (did users run an
+--   agent again the next day?).
+--
+-- SOURCE TABLES
+--   platform.AgentGraphExecution  — Execution records
+--
+-- OUTPUT COLUMNS
+--   Same pattern as retention_login_daily.
+--   cohort_day_start = day of first execution (not first login)
+--
+-- EXAMPLE QUERIES
+--   -- Day-3 execution retention
+--   SELECT cohort_label, retention_rate_bounded AS d3_retention
+--   FROM analytics.retention_execution_daily
+--   WHERE user_lifetime_day = 3 ORDER BY cohort_day_start;
+-- =============================================================
+
+WITH params AS (SELECT 30::int AS max_days, (CURRENT_DATE - INTERVAL '90 days') AS cohort_start),
+events AS (
+  SELECT e."userId"::text AS user_id, e."createdAt"::timestamptz AS created_at,
+         DATE_TRUNC('day', e."createdAt")::date AS day_start
+  FROM platform."AgentGraphExecution" e WHERE e."userId" IS NOT NULL
+),
+first_exec AS (
+  SELECT user_id, MIN(created_at) AS first_exec_at,
+         DATE_TRUNC('day', MIN(created_at))::date AS cohort_day_start
+  FROM events GROUP BY 1
+  HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
+),
+activity_days AS (SELECT DISTINCT user_id, day_start FROM events),
+user_day_age AS (
+  SELECT ad.user_id, fe.cohort_day_start,
+         (ad.day_start - DATE_TRUNC('day',fe.first_exec_at)::date)::int AS user_lifetime_day
+  FROM activity_days ad JOIN first_exec fe USING (user_id)
+  WHERE ad.day_start >= DATE_TRUNC('day',fe.first_exec_at)::date
+),
+bounded_counts AS (
+  SELECT cohort_day_start, user_lifetime_day, COUNT(DISTINCT user_id) AS active_users_bounded
+  FROM user_day_age WHERE user_lifetime_day >= 0 GROUP BY 1,2
+),
+last_active AS (
+  SELECT cohort_day_start, user_id, MAX(user_lifetime_day) AS last_active_day FROM user_day_age GROUP BY 1,2
+),
+unbounded_counts AS (
+  SELECT la.cohort_day_start, gs AS user_lifetime_day, COUNT(*) AS retained_users_unbounded
+  FROM last_active la
+  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_day,(SELECT max_days FROM params))) gs
+  GROUP BY 1,2
+),
+cohort_sizes AS (SELECT cohort_day_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_exec GROUP BY 1),
+cohort_caps AS (
+  SELECT cs.cohort_day_start, cs.cohort_users,
+         LEAST((SELECT max_days FROM params), GREATEST(0,(CURRENT_DATE-cs.cohort_day_start)::int)) AS cap_days
+  FROM cohort_sizes cs
+),
+grid AS (
+  SELECT cc.cohort_day_start, gs AS user_lifetime_day, cc.cohort_users
+  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_days) gs
+)
+SELECT
+  g.cohort_day_start,
+  TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')                                AS cohort_label,
+  TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')||' (n='||g.cohort_users||')'   AS cohort_label_n,
+  g.user_lifetime_day, g.cohort_users,
+  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
+  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
+  CASE WHEN g.user_lifetime_day=0 THEN g.cohort_users ELSE 0 END          AS cohort_users_d0
+FROM grid g
+LEFT JOIN bounded_counts   b ON b.cohort_day_start=g.cohort_day_start AND b.user_lifetime_day=g.user_lifetime_day
+LEFT JOIN unbounded_counts u ON u.cohort_day_start=g.cohort_day_start AND u.user_lifetime_day=g.user_lifetime_day
+ORDER BY g.cohort_day_start, g.user_lifetime_day;
--- a/autogpt_platform/analytics/queries/retention_execution_weekly.sql
+++ b/autogpt_platform/analytics/queries/retention_execution_weekly.sql
@@ -0,0 +1,81 @@
+-- =============================================================
+-- View: analytics.retention_execution_weekly
+-- Looker source alias: ds92  |  Charts: 2
+-- =============================================================
+-- DESCRIPTION
+--   Weekly cohort retention based on agent executions.
+--   Cohort anchor = week of user's FIRST ever agent execution
+--   (not first login). Only includes cohorts from the last 180 days.
+--   Useful when you care about product engagement, not just visits.
+--
+-- SOURCE TABLES
+--   platform.AgentGraphExecution  — Execution records
+--
+-- OUTPUT COLUMNS
+--   Same pattern as retention_login_weekly.
+--   cohort_week_start = week of first execution (not first login)
+--
+-- EXAMPLE QUERIES
+--   -- Week-2 execution retention
+--   SELECT cohort_label, retention_rate_bounded
+--   FROM analytics.retention_execution_weekly
+--   WHERE user_lifetime_week = 2 ORDER BY cohort_week_start;
+-- =============================================================
+
+WITH params AS (SELECT 12::int AS max_weeks, (CURRENT_DATE - INTERVAL '180 days') AS cohort_start),
+events AS (
+  SELECT e."userId"::text AS user_id, e."createdAt"::timestamptz AS created_at,
+         DATE_TRUNC('week', e."createdAt")::date AS week_start
+  FROM platform."AgentGraphExecution" e WHERE e."userId" IS NOT NULL
+),
+first_exec AS (
+  SELECT user_id, MIN(created_at) AS first_exec_at,
+         DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
+  FROM events GROUP BY 1
+  HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
+),
+activity_weeks AS (SELECT DISTINCT user_id, week_start FROM events),
+user_week_age AS (
+  SELECT aw.user_id, fe.cohort_week_start,
+         ((aw.week_start - DATE_TRUNC('week',fe.first_exec_at)::date)/7)::int AS user_lifetime_week
+  FROM activity_weeks aw JOIN first_exec fe USING (user_id)
+  WHERE aw.week_start >= DATE_TRUNC('week',fe.first_exec_at)::date
+),
+bounded_counts AS (
+  SELECT cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users_bounded
+  FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2
+),
+last_active AS (
+  SELECT cohort_week_start, user_id, MAX(user_lifetime_week) AS last_active_week FROM user_week_age GROUP BY 1,2
+),
+unbounded_counts AS (
+  SELECT la.cohort_week_start, gs AS user_lifetime_week, COUNT(*) AS retained_users_unbounded
+  FROM last_active la
+  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_week,(SELECT max_weeks FROM params))) gs
+  GROUP BY 1,2
+),
+cohort_sizes AS (SELECT cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_exec GROUP BY 1),
+cohort_caps AS (
+  SELECT cs.cohort_week_start, cs.cohort_users,
+         LEAST((SELECT max_weeks FROM params),
+               GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date-cs.cohort_week_start)/7)::int)) AS cap_weeks
+  FROM cohort_sizes cs
+),
+grid AS (
+  SELECT cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
+  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
+)
+SELECT
+  g.cohort_week_start,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')                               AS cohort_label,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')'  AS cohort_label_n,
+  g.user_lifetime_week, g.cohort_users,
+  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
+  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
+  CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END         AS cohort_users_w0
+FROM grid g
+LEFT JOIN bounded_counts   b ON b.cohort_week_start=g.cohort_week_start AND b.user_lifetime_week=g.user_lifetime_week
+LEFT JOIN unbounded_counts u ON u.cohort_week_start=g.cohort_week_start AND u.user_lifetime_week=g.user_lifetime_week
+ORDER BY g.cohort_week_start, g.user_lifetime_week;
--- a/autogpt_platform/analytics/queries/retention_login_daily.sql
+++ b/autogpt_platform/analytics/queries/retention_login_daily.sql
@@ -0,0 +1,94 @@
+-- =============================================================
+-- View: analytics.retention_login_daily
+-- Looker source alias: ds112  |  Charts: 1
+-- =============================================================
+-- DESCRIPTION
+--   Daily cohort retention based on login sessions.
+--   Same logic as retention_login_weekly but at day granularity,
+--   showing up to day 30 for cohorts from the last 90 days.
+--   Useful for analysing early activation (days 1-7) in detail.
+--
+-- SOURCE TABLES
+--   auth.sessions  — Login session records
+--
+-- OUTPUT COLUMNS (same pattern as retention_login_weekly)
+--   cohort_day_start          DATE     First day the cohort logged in
+--   cohort_label              TEXT     Date string (e.g. '2025-03-01')
+--   cohort_label_n            TEXT     Date + cohort size (e.g. '2025-03-01 (n=12)')
+--   user_lifetime_day         INT      Days since first login (0 = signup day)
+--   cohort_users              BIGINT   Total users in cohort
+--   active_users_bounded      BIGINT   Users active on exactly day k
+--   retained_users_unbounded  BIGINT   Users active any time on/after day k
+--   retention_rate_bounded    FLOAT    bounded / cohort_users
+--   retention_rate_unbounded  FLOAT    unbounded / cohort_users
+--   cohort_users_d0           BIGINT   cohort_users only at day 0, else 0 (safe to SUM)
+--
+-- EXAMPLE QUERIES
+--   -- Day-1 retention rate (came back next day)
+--   SELECT cohort_label, retention_rate_bounded AS d1_retention
+--   FROM analytics.retention_login_daily
+--   WHERE user_lifetime_day = 1 ORDER BY cohort_day_start;
+--
+--   -- Average retention curve across all cohorts
+--   SELECT user_lifetime_day,
+--          SUM(active_users_bounded)::float / NULLIF(SUM(cohort_users_d0), 0) AS avg_retention
+--   FROM analytics.retention_login_daily
+--   GROUP BY 1 ORDER BY 1;
+-- =============================================================
+
+WITH params AS (SELECT 30::int AS max_days, (CURRENT_DATE - INTERVAL '90 days')::date AS cohort_start),
+events AS (
+  SELECT s.user_id::text AS user_id, s.created_at::timestamptz AS created_at,
+         DATE_TRUNC('day', s.created_at)::date AS day_start
+  FROM auth.sessions s WHERE s.user_id IS NOT NULL
+),
+first_login AS (
+  SELECT user_id, MIN(created_at) AS first_login_time,
+         DATE_TRUNC('day', MIN(created_at))::date AS cohort_day_start
+  FROM events GROUP BY 1
+  HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
+),
+activity_days AS (SELECT DISTINCT user_id, day_start FROM events),
+user_day_age AS (
+  SELECT ad.user_id, fl.cohort_day_start,
+         (ad.day_start - DATE_TRUNC('day', fl.first_login_time)::date)::int AS user_lifetime_day
+  FROM activity_days ad JOIN first_login fl USING (user_id)
+  WHERE ad.day_start >= DATE_TRUNC('day', fl.first_login_time)::date
+),
+bounded_counts AS (
+  SELECT cohort_day_start, user_lifetime_day, COUNT(DISTINCT user_id) AS active_users_bounded
+  FROM user_day_age WHERE user_lifetime_day >= 0 GROUP BY 1,2
+),
+last_active AS (
+  SELECT cohort_day_start, user_id, MAX(user_lifetime_day) AS last_active_day FROM user_day_age GROUP BY 1,2
+),
+unbounded_counts AS (
+  SELECT la.cohort_day_start, gs AS user_lifetime_day, COUNT(*) AS retained_users_unbounded
+  FROM last_active la
+  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_day,(SELECT max_days FROM params))) gs
+  GROUP BY 1,2
+),
+cohort_sizes AS (SELECT cohort_day_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_login GROUP BY 1),
+cohort_caps AS (
+  SELECT cs.cohort_day_start, cs.cohort_users,
+         LEAST((SELECT max_days FROM params), GREATEST(0,(CURRENT_DATE-cs.cohort_day_start)::int)) AS cap_days
+  FROM cohort_sizes cs
+),
+grid AS (
+  SELECT cc.cohort_day_start, gs AS user_lifetime_day, cc.cohort_users
+  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_days) gs
+)
+SELECT
+  g.cohort_day_start,
+  TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')                                  AS cohort_label,
+  TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')||' (n='||g.cohort_users||')'     AS cohort_label_n,
+  g.user_lifetime_day, g.cohort_users,
+  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
+  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
+  CASE WHEN g.user_lifetime_day=0 THEN g.cohort_users ELSE 0 END            AS cohort_users_d0
+FROM grid g
+LEFT JOIN bounded_counts   b ON b.cohort_day_start=g.cohort_day_start AND b.user_lifetime_day=g.user_lifetime_day
+LEFT JOIN unbounded_counts u ON u.cohort_day_start=g.cohort_day_start AND u.user_lifetime_day=g.user_lifetime_day
+ORDER BY g.cohort_day_start, g.user_lifetime_day;
--- a/autogpt_platform/analytics/queries/retention_login_onboarded_weekly.sql
+++ b/autogpt_platform/analytics/queries/retention_login_onboarded_weekly.sql
@@ -0,0 +1,96 @@
+-- =============================================================
+-- View: analytics.retention_login_onboarded_weekly
+-- Looker source alias: ds101  |  Charts: 2
+-- =============================================================
+-- DESCRIPTION
+--   Weekly cohort retention from login sessions, restricted to
+--   users who "onboarded" — defined as running at least one
+--   agent within 365 days of their first login.
+--   Filters out users who signed up but never activated,
+--   giving a cleaner view of engaged-user retention.
+--
+-- SOURCE TABLES
+--   auth.sessions                  — Login session records
+--   platform.AgentGraphExecution   — Used to identify onboarders
+--
+-- OUTPUT COLUMNS
+--   Same as retention_login_weekly (cohort_week_start, user_lifetime_week,
+--   retention_rate_bounded, retention_rate_unbounded, etc.)
+--   Only difference: cohort is filtered to onboarded users only.
+--
+-- EXAMPLE QUERIES
+--   -- Compare week-4 retention: all users vs onboarded only
+--   SELECT 'all_users' AS segment, AVG(retention_rate_bounded) AS w4_retention
+--   FROM analytics.retention_login_weekly WHERE user_lifetime_week = 4
+--   UNION ALL
+--   SELECT 'onboarded', AVG(retention_rate_bounded)
+--   FROM analytics.retention_login_onboarded_weekly WHERE user_lifetime_week = 4;
+-- =============================================================
+
+WITH params AS (SELECT 12::int AS max_weeks, 365::int AS onboarding_window_days),
+events AS (
+  SELECT s.user_id::text AS user_id, s.created_at::timestamptz AS created_at,
+         DATE_TRUNC('week', s.created_at)::date AS week_start
+  FROM auth.sessions s WHERE s.user_id IS NOT NULL
+),
+first_login_all AS (
+  SELECT user_id, MIN(created_at) AS first_login_time,
+         DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
+  FROM events GROUP BY 1
+),
+onboarders AS (
+  SELECT fl.user_id FROM first_login_all fl
+  WHERE EXISTS (
+    SELECT 1 FROM platform."AgentGraphExecution" e
+    WHERE e."userId"::text = fl.user_id
+      AND e."createdAt" >= fl.first_login_time
+      AND e."createdAt" < fl.first_login_time
+          + make_interval(days => (SELECT onboarding_window_days FROM params))
+  )
+),
+first_login AS (SELECT * FROM first_login_all WHERE user_id IN (SELECT user_id FROM onboarders)),
+activity_weeks AS (SELECT DISTINCT user_id, week_start FROM events),
+user_week_age AS (
+  SELECT aw.user_id, fl.cohort_week_start,
+         ((aw.week_start - DATE_TRUNC('week',fl.first_login_time)::date)/7)::int AS user_lifetime_week
+  FROM activity_weeks aw JOIN first_login fl USING (user_id)
+  WHERE aw.week_start >= DATE_TRUNC('week',fl.first_login_time)::date
+),
+bounded_counts AS (
+  SELECT cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users_bounded
+  FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2
+),
+last_active AS (
+  SELECT cohort_week_start, user_id, MAX(user_lifetime_week) AS last_active_week FROM user_week_age GROUP BY 1,2
+),
+unbounded_counts AS (
+  SELECT la.cohort_week_start, gs AS user_lifetime_week, COUNT(*) AS retained_users_unbounded
+  FROM last_active la
+  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_week,(SELECT max_weeks FROM params))) gs
+  GROUP BY 1,2
+),
+cohort_sizes AS (SELECT cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_login GROUP BY 1),
+cohort_caps AS (
+  SELECT cs.cohort_week_start, cs.cohort_users,
+         LEAST((SELECT max_weeks FROM params),
+               GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date-cs.cohort_week_start)/7)::int)) AS cap_weeks
+  FROM cohort_sizes cs
+),
+grid AS (
+  SELECT cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
+  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
+)
+SELECT
+  g.cohort_week_start,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')                               AS cohort_label,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')'  AS cohort_label_n,
+  g.user_lifetime_week, g.cohort_users,
+  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
+  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
+  CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END         AS cohort_users_w0
+FROM grid g
+LEFT JOIN bounded_counts   b ON b.cohort_week_start=g.cohort_week_start AND b.user_lifetime_week=g.user_lifetime_week
+LEFT JOIN unbounded_counts u ON u.cohort_week_start=g.cohort_week_start AND u.user_lifetime_week=g.user_lifetime_week
+ORDER BY g.cohort_week_start, g.user_lifetime_week;
--- a/autogpt_platform/analytics/queries/retention_login_weekly.sql
+++ b/autogpt_platform/analytics/queries/retention_login_weekly.sql
@@ -0,0 +1,103 @@
+-- =============================================================
+-- View: analytics.retention_login_weekly
+-- Looker source alias: ds83  |  Charts: 2
+-- =============================================================
+-- DESCRIPTION
+--   Weekly cohort retention based on login sessions.
+--   Users are grouped by the ISO week of their first ever login.
+--   For each cohort × lifetime-week combination, outputs both:
+--     - bounded rate: % active in exactly that week
+--     - unbounded rate: % who were ever active on or after that week
+--   Weeks are capped to the cohort's actual age (no future data points).
+--
+-- SOURCE TABLES
+--   auth.sessions  — Login session records
+--
+-- HOW TO READ THE OUTPUT
+--   cohort_week_start   The Monday of the week users first logged in
+--   user_lifetime_week  0 = signup week, 1 = one week later, etc.
+--   retention_rate_bounded   = active_users_bounded / cohort_users
+--   retention_rate_unbounded = retained_users_unbounded / cohort_users
+--
+-- OUTPUT COLUMNS
+--   cohort_week_start         DATE     First day of the cohort's signup week
+--   cohort_label              TEXT     ISO week label (e.g. '2025-W01')
+--   cohort_label_n            TEXT     ISO week label with cohort size (e.g. '2025-W01 (n=42)')
+--   user_lifetime_week        INT      Weeks since first login (0 = signup week)
+--   cohort_users              BIGINT   Total users in this cohort (denominator)
+--   active_users_bounded      BIGINT   Users active in exactly week k
+--   retained_users_unbounded  BIGINT   Users active any time on/after week k
+--   retention_rate_bounded    FLOAT    bounded active / cohort_users
+--   retention_rate_unbounded  FLOAT    unbounded retained / cohort_users
+--   cohort_users_w0           BIGINT   cohort_users only at week 0, else 0 (safe to SUM in pivot tables)
+--
+-- EXAMPLE QUERIES
+--   -- Week-1 retention rate per cohort
+--   SELECT cohort_label, retention_rate_bounded AS w1_retention
+--   FROM analytics.retention_login_weekly
+--   WHERE user_lifetime_week = 1
+--   ORDER BY cohort_week_start;
+--
+--   -- Overall average retention curve (all cohorts combined)
+--   SELECT user_lifetime_week,
+--          SUM(active_users_bounded)::float / NULLIF(SUM(cohort_users_w0), 0) AS avg_retention
+--   FROM analytics.retention_login_weekly
+--   GROUP BY 1 ORDER BY 1;
+-- =============================================================
+
+WITH params AS (SELECT 12::int AS max_weeks),
+events AS (
+  SELECT s.user_id::text AS user_id, s.created_at::timestamptz AS created_at,
+         DATE_TRUNC('week', s.created_at)::date AS week_start
+  FROM auth.sessions s WHERE s.user_id IS NOT NULL
+),
+first_login AS (
+  SELECT user_id, MIN(created_at) AS first_login_time,
+         DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
+  FROM events GROUP BY 1
+),
+activity_weeks AS (SELECT DISTINCT user_id, week_start FROM events),
+user_week_age AS (
+  SELECT aw.user_id, fl.cohort_week_start,
+         ((aw.week_start - DATE_TRUNC('week', fl.first_login_time)::date) / 7)::int AS user_lifetime_week
+  FROM activity_weeks aw JOIN first_login fl USING (user_id)
+  WHERE aw.week_start >= DATE_TRUNC('week', fl.first_login_time)::date
+),
+bounded_counts AS (
+  SELECT cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users_bounded
+  FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2
+),
+last_active AS (
+  SELECT cohort_week_start, user_id, MAX(user_lifetime_week) AS last_active_week FROM user_week_age GROUP BY 1,2
+),
+unbounded_counts AS (
+  SELECT la.cohort_week_start, gs AS user_lifetime_week, COUNT(*) AS retained_users_unbounded
+  FROM last_active la
+  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_week,(SELECT max_weeks FROM params))) gs
+  GROUP BY 1,2
+),
+cohort_sizes AS (SELECT cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_login GROUP BY 1),
+cohort_caps AS (
+  SELECT cs.cohort_week_start, cs.cohort_users,
+         LEAST((SELECT max_weeks FROM params),
+               GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date - cs.cohort_week_start)/7)::int)) AS cap_weeks
+  FROM cohort_sizes cs
+),
+grid AS (
+  SELECT cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
+  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
+)
+SELECT
+  g.cohort_week_start,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')                                    AS cohort_label,
+  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')'       AS cohort_label_n,
+  g.user_lifetime_week, g.cohort_users,
+  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
+  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
+  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
+  CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END               AS cohort_users_w0
+FROM grid g
+LEFT JOIN bounded_counts   b ON b.cohort_week_start=g.cohort_week_start AND b.user_lifetime_week=g.user_lifetime_week
+LEFT JOIN unbounded_counts u ON u.cohort_week_start=g.cohort_week_start AND u.user_lifetime_week=g.user_lifetime_week
+ORDER BY g.cohort_week_start, g.user_lifetime_week;
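
Reviewer note on the bounded and unbounded rates defined in the retention views above. This is a minimal Python sketch on a made-up three-user cohort, not the view's implementation. The function name `retention_curve`, the user ids, and the activity sets are all hypothetical. It assumes every user is active in week 0 and, unlike the views, it does not cap weeks to the cohort's actual age.

```python
def retention_curve(active_weeks_by_user, max_weeks):
    """Return {week: (bounded_rate, unbounded_rate)} for one cohort.

    active_weeks_by_user maps a user id to the set of lifetime weeks in
    which that user was active (week 0 is assumed to be present).
    """
    cohort_users = len(active_weeks_by_user)
    if cohort_users == 0:
        return {}
    curve = {}
    for k in range(max_weeks + 1):
        # Bounded: active in exactly week k.
        bounded = sum(1 for weeks in active_weeks_by_user.values() if k in weeks)
        # Unbounded: last active week is k or later.
        unbounded = sum(
            1 for weeks in active_weeks_by_user.values() if max(weeks) >= k
        )
        curve[k] = (bounded / cohort_users, unbounded / cohort_users)
    return curve


# Hypothetical cohort: ids and activity weeks are illustrative only.
cohort = {
    "u1": {0},        # never returned
    "u2": {0, 1, 3},  # skipped week 2, came back in week 3
    "u3": {0, 2},
}
curve = retention_curve(cohort, max_weeks=3)
for week, (bounded, unbounded) in curve.items():
    print(week, round(bounded, 2), round(unbounded, 2))
```

In week 2 only `u3` is active, but `u2` still counts as retained in the unbounded rate because it returns in week 3. The unbounded curve therefore never rises from one week to the next, while the bounded curve can.
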
--- a/autogpt_platform/analytics/queries/user_block_spending.sql
+++ b/autogpt_platform/analytics/queries/user_block_spending.sql
@@ -0,0 +1,71 @@
+-- =============================================================
+-- View: analytics.user_block_spending
+-- Looker source alias: ds6  |  Charts: 5
+-- =============================================================
+-- DESCRIPTION
+--   One row per credit transaction (last 90 days).
+--   Shows how users spend credits broken down by block type,
+--   LLM provider and model.  Joins node execution stats for
+--   token-level detail.
+--
+-- SOURCE TABLES
+--   platform.CreditTransaction   — Credit debit/credit records
+--   platform.AgentNodeExecution  — Node execution stats (for token counts)
+--
+-- OUTPUT COLUMNS
+--   transactionKey        TEXT         Unique transaction identifier
+--   userId                TEXT         User who was charged
+--   amount                DECIMAL      Credit amount (positive = credit, negative = debit)
+--   negativeAmount        DECIMAL      amount * -1 (convenience for spend charts)
+--   transactionType       TEXT         Transaction type (e.g. 'USAGE', 'REFUND', 'TOP_UP')
+--   transactionTime       TIMESTAMPTZ  When the transaction was recorded
+--   blockId               TEXT         Block UUID that triggered the spend
+--   blockName             TEXT         Human-readable block name
+--   llm_provider          TEXT         LLM provider (e.g. 'openai', 'anthropic')
+--   llm_model             TEXT         Model name (e.g. 'gpt-4o', 'claude-3-5-sonnet')
+--   node_exec_id          TEXT         Linked node execution UUID
+--   llm_call_count        INT          LLM API calls made in that execution
+--   llm_retry_count       INT          LLM retries in that execution
+--   llm_input_token_count INT          Input tokens consumed
+--   llm_output_token_count INT         Output tokens produced
+--
+-- WINDOW
+--   Rolling 90 days (createdAt > CURRENT_DATE - 90 days)
+--
+-- EXAMPLE QUERIES
+--   -- Total spend per user (last 90 days)
+--   SELECT "userId", SUM("negativeAmount") AS total_spent
+--   FROM analytics.user_block_spending
+--   WHERE "transactionType" = 'USAGE'
+--   GROUP BY 1 ORDER BY total_spent DESC;
+--
+--   -- Spend by LLM provider + model
+--   SELECT "llm_provider", "llm_model",
+--          SUM("negativeAmount") AS total_cost,
+--          SUM("llm_input_token_count") AS input_tokens,
+--          SUM("llm_output_token_count") AS output_tokens
+--   FROM analytics.user_block_spending
+--   WHERE "llm_provider" IS NOT NULL
+--   GROUP BY 1, 2 ORDER BY total_cost DESC;
+-- =============================================================
+
+SELECT
+    c."transactionKey"                                        AS "transactionKey",
+    c."userId"                                                AS "userId",
+    c."amount"                                                AS "amount",
+    c."amount" * -1                                           AS "negativeAmount",
+    c."type"                                                  AS "transactionType",
+    c."createdAt"                                             AS "transactionTime",
+    c.metadata->>'block_id'                                   AS "blockId",
+    c.metadata->>'block'                                      AS "blockName",
+    c.metadata->'input'->'credentials'->>'provider'           AS llm_provider,
+    c.metadata->'input'->>'model'                             AS llm_model,
+    c.metadata->>'node_exec_id'                               AS node_exec_id,
+    (ne."stats"->>'llm_call_count')::int                       AS llm_call_count,
+    (ne."stats"->>'llm_retry_count')::int                      AS llm_retry_count,
+    (ne."stats"->>'input_token_count')::int                    AS llm_input_token_count,
+    (ne."stats"->>'output_token_count')::int                   AS llm_output_token_count
+FROM platform."CreditTransaction" c
+LEFT JOIN platform."AgentNodeExecution" ne
+       ON (c.metadata->>'node_exec_id') = ne."id"::text
+WHERE c."createdAt" > CURRENT_DATE - INTERVAL '90 days'
--- a/autogpt_platform/analytics/queries/user_onboarding.sql
+++ b/autogpt_platform/analytics/queries/user_onboarding.sql
@@ -0,0 +1,45 @@
+-- =============================================================
+-- View: analytics.user_onboarding
+-- Looker source alias: ds68  |  Charts: 3
+-- =============================================================
+-- DESCRIPTION
+--   One row per user onboarding record.  Contains the user's
+--   stated usage reason, selected integrations, completed
+--   onboarding steps and optional first agent selection.
+--   Full history (no date filter) since onboarding happens
+--   once per user.
+--
+-- SOURCE TABLES
+--   platform.UserOnboarding  — Onboarding state per user
+--
+-- OUTPUT COLUMNS
+--   id                            TEXT         Onboarding record UUID
+--   createdAt                     TIMESTAMPTZ  When onboarding started
+--   updatedAt                     TIMESTAMPTZ  Last update to onboarding state
+--   usageReason                   TEXT         Why user signed up (e.g. 'work', 'personal')
+--   integrations                  TEXT[]       Array of integration names the user selected
+--   userId                        TEXT         User UUID
+--   completedSteps                TEXT[]       Array of onboarding step enums completed
+--   selectedStoreListingVersionId TEXT         First marketplace agent the user chose (if any)
+--
+-- EXAMPLE QUERIES
+--   -- Usage reason breakdown
+--   SELECT "usageReason", COUNT(*) FROM analytics.user_onboarding GROUP BY 1;
+--
+--   -- Completion rate per step
+--   SELECT step, COUNT(*) AS users_completed
+--   FROM analytics.user_onboarding
+--   CROSS JOIN LATERAL UNNEST("completedSteps") AS step
+--   GROUP BY 1 ORDER BY users_completed DESC;
+-- =============================================================
+
+SELECT
+    id,
+    "createdAt",
+    "updatedAt",
+    "usageReason",
+    integrations,
+    "userId",
+    "completedSteps",
+    "selectedStoreListingVersionId"
+FROM platform."UserOnboarding"
--- a/autogpt_platform/analytics/queries/user_onboarding_funnel.sql
+++ b/autogpt_platform/analytics/queries/user_onboarding_funnel.sql
@@ -0,0 +1,100 @@
+-- =============================================================
+-- View: analytics.user_onboarding_funnel
+-- Looker source alias: ds74  |  Charts: 1
+-- =============================================================
+-- DESCRIPTION
+--   Pre-aggregated onboarding funnel showing how many users
+--   completed each step and the drop-off percentage from the
+--   previous step.  One row per onboarding step (all 22 steps
+--   always present, even with 0 completions — prevents sparse
+--   gaps from making LAG compare the wrong predecessors).
+--
+-- SOURCE TABLES
+--   platform.UserOnboarding  — Onboarding records with completedSteps array
+--
+-- OUTPUT COLUMNS
+--   step             TEXT     Onboarding step enum name (e.g. 'WELCOME', 'CONGRATS')
+--   step_order       INT      Numeric position in the funnel (1=first, 22=last)
+--   users_completed  BIGINT   Distinct users who completed this step
+--   pct_from_prev    NUMERIC  users_completed as a % of the previous step's users_completed (NULL for step 1 or when the previous step has 0 users)
+--
+-- STEP ORDER
+--   1  WELCOME               9  MARKETPLACE_VISIT     17  SCHEDULE_AGENT
+--   2  USAGE_REASON         10  MARKETPLACE_ADD_AGENT  18  RUN_AGENTS
+--   3  INTEGRATIONS         11  MARKETPLACE_RUN_AGENT  19  RUN_3_DAYS
+--   4  AGENT_CHOICE         12  BUILDER_OPEN           20  TRIGGER_WEBHOOK
+--   5  AGENT_NEW_RUN        13  BUILDER_SAVE_AGENT     21  RUN_14_DAYS
+--   6  AGENT_INPUT          14  BUILDER_RUN_AGENT      22  RUN_AGENTS_100
+--   7  CONGRATS             15  VISIT_COPILOT
+--   8  GET_RESULTS          16  RE_RUN_AGENT
+--
+-- WINDOW
+--   Users who started onboarding in the last 90 days
+--
+-- EXAMPLE QUERIES
+--   -- Full funnel
+--   SELECT * FROM analytics.user_onboarding_funnel ORDER BY step_order;
+--
+--   -- Biggest drop-off point
+--   SELECT step, pct_from_prev FROM analytics.user_onboarding_funnel
+--   ORDER BY pct_from_prev ASC LIMIT 3;
+-- =============================================================
+
+WITH all_steps AS (
+  -- Complete ordered grid of all 22 steps so zero-completion steps
+  -- are always present, keeping LAG comparisons correct.
+  SELECT step_name, step_order
+  FROM (VALUES
+    ('WELCOME',               1),
+    ('USAGE_REASON',          2),
+    ('INTEGRATIONS',          3),
+    ('AGENT_CHOICE',          4),
+    ('AGENT_NEW_RUN',         5),
+    ('AGENT_INPUT',           6),
+    ('CONGRATS',              7),
+    ('GET_RESULTS',           8),
+    ('MARKETPLACE_VISIT',     9),
+    ('MARKETPLACE_ADD_AGENT', 10),
+    ('MARKETPLACE_RUN_AGENT', 11),
+    ('BUILDER_OPEN',          12),
+    ('BUILDER_SAVE_AGENT',    13),
+    ('BUILDER_RUN_AGENT',     14),
+    ('VISIT_COPILOT',         15),
+    ('RE_RUN_AGENT',          16),
+    ('SCHEDULE_AGENT',        17),
+    ('RUN_AGENTS',            18),
+    ('RUN_3_DAYS',            19),
+    ('TRIGGER_WEBHOOK',       20),
+    ('RUN_14_DAYS',           21),
+    ('RUN_AGENTS_100',        22)
+  ) AS t(step_name, step_order)
+),
+raw AS (
+  SELECT
+      u."userId",
+      step_txt::text AS step
+  FROM platform."UserOnboarding" u
+  CROSS JOIN LATERAL UNNEST(u."completedSteps") AS step_txt
+  WHERE u."createdAt" >= CURRENT_DATE - INTERVAL '90 days'
+),
+step_counts AS (
+  SELECT step, COUNT(DISTINCT "userId") AS users_completed
+  FROM raw GROUP BY step
+),
+funnel AS (
+  SELECT
+      a.step_name                          AS step,
+      a.step_order,
+      COALESCE(sc.users_completed, 0)      AS users_completed,
+      ROUND(
+        100.0 * COALESCE(sc.users_completed, 0)
+        / NULLIF(
+            LAG(COALESCE(sc.users_completed, 0)) OVER (ORDER BY a.step_order),
+            0
+          ),
+        2
+      )                                    AS pct_from_prev
+  FROM all_steps a
+  LEFT JOIN step_counts sc ON sc.step = a.step_name
+)
+SELECT * FROM funnel ORDER BY step_order
--- a/autogpt_platform/analytics/queries/user_onboarding_integration.sql
+++ b/autogpt_platform/analytics/queries/user_onboarding_integration.sql
@@ -0,0 +1,41 @@
+-- =============================================================
+-- View: analytics.user_onboarding_integration
+-- Looker source alias: ds75  |  Charts: 1
+-- =============================================================
+-- DESCRIPTION
+--   Pre-aggregated count of users who selected each integration
+--   during onboarding.  One row per integration type, sorted
+--   by popularity.
+--
+-- SOURCE TABLES
+--   platform.UserOnboarding  — integrations array column
+--
+-- OUTPUT COLUMNS
+--   integration            TEXT    Integration name (e.g. 'github', 'slack', 'notion')
+--   users_with_integration BIGINT  Distinct users who selected this integration
+--
+-- WINDOW
+--   Users who started onboarding in the last 90 days
+--
+-- EXAMPLE QUERIES
+--   -- Full integration popularity ranking
+--   SELECT * FROM analytics.user_onboarding_integration;
+--
+--   -- Top 5 integrations
+--   SELECT * FROM analytics.user_onboarding_integration ORDER BY users_with_integration DESC LIMIT 5;
+-- =============================================================
+
+WITH exploded AS (
+  SELECT
+      u."userId" AS user_id,
+      UNNEST(u."integrations") AS integration
+  FROM platform."UserOnboarding" u
+  WHERE u."createdAt" >= CURRENT_DATE - INTERVAL '90 days'
+)
+SELECT
+    integration,
+    COUNT(DISTINCT user_id) AS users_with_integration
+FROM exploded
+WHERE integration IS NOT NULL AND integration <> ''
+GROUP BY integration
+ORDER BY users_with_integration DESC
--- a/autogpt_platform/analytics/queries/users_activities.sql
+++ b/autogpt_platform/analytics/queries/users_activities.sql
@@ -0,0 +1,145 @@
+-- =============================================================
+-- View: analytics.users_activities
+-- Looker source alias: ds56  |  Charts: 5
+-- =============================================================
+-- DESCRIPTION
+--   One row per user with lifetime activity summary.
+--   Joins login sessions with agent graphs, executions and
+--   node-level runs to give a full picture of how engaged
+--   each user is.  Includes a convenience flag for 7-day
+--   activation (did the user return at least 7 days after
+--   their first login?).
+--
+-- SOURCE TABLES
+--   auth.sessions                    — Login/session records
+--   platform.AgentGraph              — Graphs (agents) built by the user
+--   platform.AgentGraphExecution     — Agent run history
+--   platform.AgentNodeExecution      — Individual block execution history
+--
+-- PERFORMANCE NOTE
+--   Each CTE aggregates its own table independently by userId.
+--   This avoids the fan-out that occurs when driving every join
+--   from user_logins across the two largest tables
+--   (AgentGraphExecution and AgentNodeExecution).
+--
+-- OUTPUT COLUMNS
+--   user_id                   TEXT         Supabase user UUID
+--   first_login_time          TIMESTAMPTZ  First ever session created_at
+--   last_login_time           TIMESTAMPTZ  Most recent session created_at
+--   last_visit_time           TIMESTAMPTZ  Max of last refresh or login
+--   last_agent_save_time      TIMESTAMPTZ  Last time user saved an agent graph
+--   agent_count               BIGINT       Number of distinct active graphs built (0 if none)
+--   first_agent_run_time      TIMESTAMPTZ  First ever graph execution
+--   last_agent_run_time       TIMESTAMPTZ  Most recent graph execution
+--   unique_agent_runs         BIGINT       Distinct agent graphs ever run (0 if none)
+--   agent_runs                BIGINT       Total graph execution count (0 if none)
+--   node_execution_count      BIGINT       Total node executions across all runs
+--   node_execution_failed     BIGINT       Node executions with FAILED status
+--   node_execution_completed  BIGINT       Node executions with COMPLETED status
+--   node_execution_terminated BIGINT       Node executions with TERMINATED status
+--   node_execution_queued     BIGINT       Node executions with QUEUED status
+--   node_execution_running    BIGINT       Node executions with RUNNING status
+--   is_active_after_7d        INT          1=returned after day 7, 0=did not, NULL=too early to tell
+--   node_execution_incomplete BIGINT       Node executions with INCOMPLETE status
+--   node_execution_review     BIGINT       Node executions with REVIEW status
+--
+-- EXAMPLE QUERIES
+--   -- Users who ran at least one agent and returned after 7 days
+--   SELECT COUNT(*) FROM analytics.users_activities
+--   WHERE agent_runs > 0 AND is_active_after_7d = 1;
+--
+--   -- Top 10 most active users by agent runs
+--   SELECT user_id, agent_runs, node_execution_count
+--   FROM analytics.users_activities
+--   ORDER BY agent_runs DESC LIMIT 10;
+--
+--   -- 7-day activation rate
+--   SELECT
+--     SUM(CASE WHEN is_active_after_7d = 1 THEN 1 ELSE 0 END)::float
+--     / NULLIF(COUNT(CASE WHEN is_active_after_7d IS NOT NULL THEN 1 END), 0)
+--     AS activation_rate
+--   FROM analytics.users_activities;
+-- =============================================================
+
+WITH user_logins AS (
+  SELECT
+    user_id::text                                    AS user_id,
+    MIN(created_at)                                  AS first_login_time,
+    MAX(created_at)                                  AS last_login_time,
+    GREATEST(
+      MAX(refreshed_at)::timestamptz,
+      MAX(created_at)::timestamptz
+    )                                                AS last_visit_time
+  FROM auth.sessions
+  GROUP BY user_id
+),
+user_agents AS (
+  -- Aggregate AgentGraph directly by userId (no fan-out from user_logins)
+  SELECT
+    "userId"::text                AS user_id,
+    MAX("updatedAt")              AS last_agent_save_time,
+    COUNT(DISTINCT "id")          AS agent_count
+  FROM platform."AgentGraph"
+  WHERE "isActive"
+  GROUP BY "userId"
+),
+user_graph_runs AS (
+  -- Aggregate AgentGraphExecution directly by userId
+  SELECT
+    "userId"::text                        AS user_id,
+    MIN("createdAt")                      AS first_agent_run_time,
+    MAX("createdAt")                      AS last_agent_run_time,
+    COUNT(DISTINCT "agentGraphId")        AS unique_agent_runs,
+    COUNT("id")                           AS agent_runs
+  FROM platform."AgentGraphExecution"
+  GROUP BY "userId"
+),
+user_node_runs AS (
+  -- Aggregate AgentNodeExecution directly; resolve userId via a
+  -- single join to AgentGraphExecution instead of fanning out from
+  -- user_logins through both large tables.
+  SELECT
+    g."userId"::text                                                   AS user_id,
+    COUNT(*)                                                           AS node_execution_count,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'FAILED')             AS node_execution_failed,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'COMPLETED')          AS node_execution_completed,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'TERMINATED')         AS node_execution_terminated,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'QUEUED')             AS node_execution_queued,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'RUNNING')            AS node_execution_running,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'INCOMPLETE')         AS node_execution_incomplete,
+    COUNT(*) FILTER (WHERE n."executionStatus" = 'REVIEW')             AS node_execution_review
+  FROM platform."AgentNodeExecution" n
+  JOIN platform."AgentGraphExecution" g
+    ON g."id" = n."agentGraphExecutionId"
+  GROUP BY g."userId"
+)
+SELECT
+  ul.user_id,
+  ul.first_login_time,
+  ul.last_login_time,
+  ul.last_visit_time,
+  ua.last_agent_save_time,
+  COALESCE(ua.agent_count, 0)             AS agent_count,
+  gr.first_agent_run_time,
+  gr.last_agent_run_time,
+  COALESCE(gr.unique_agent_runs, 0)       AS unique_agent_runs,
+  COALESCE(gr.agent_runs, 0)              AS agent_runs,
+  COALESCE(nr.node_execution_count, 0)      AS node_execution_count,
+  COALESCE(nr.node_execution_failed, 0)     AS node_execution_failed,
+  COALESCE(nr.node_execution_completed, 0)  AS node_execution_completed,
+  COALESCE(nr.node_execution_terminated, 0) AS node_execution_terminated,
+  COALESCE(nr.node_execution_queued, 0)     AS node_execution_queued,
+  COALESCE(nr.node_execution_running, 0)    AS node_execution_running,
+  CASE
+    WHEN ul.first_login_time < NOW() - INTERVAL '7 days'
+     AND ul.last_visit_time  >= ul.first_login_time + INTERVAL '7 days' THEN 1
+    WHEN ul.first_login_time < NOW() - INTERVAL '7 days'
+     AND ul.last_visit_time  <  ul.first_login_time + INTERVAL '7 days' THEN 0
+    ELSE NULL
+  END AS is_active_after_7d,
+  COALESCE(nr.node_execution_incomplete, 0) AS node_execution_incomplete,
+  COALESCE(nr.node_execution_review, 0)     AS node_execution_review
+FROM user_logins ul
+LEFT JOIN user_agents     ua ON ul.user_id = ua.user_id
+LEFT JOIN user_graph_runs gr ON ul.user_id = gr.user_id
+LEFT JOIN user_node_runs  nr ON ul.user_id = nr.user_id
--- a/autogpt_platform/autogpt_libs/poetry.lock
+++ b/autogpt_platform/autogpt_libs/poetry.lock
@@ -1,4 +1,4 @@
-# This file is automatically @generated by Poetry 2.1.1 and should not be changed by hand.
+# This file is automatically @generated by Poetry 2.2.1 and should not be changed by hand.

 [[package]]
 name = "annotated-doc"
@@ -67,7 +67,7 @@ description = "Backport of asyncio.Runner, a context manager that controls event
 optional = false
 python-versions = "<3.11,>=3.8"
 groups = ["dev"]
-markers = "python_version < \"3.11\""
+markers = "python_version == \"3.10\""
 files = [
    {file = "backports_asyncio_runner-1.2.0-py3-none-any.whl", hash = "sha256:0da0a936a8aeb554eccb426dc55af3ba63bcdc69fa1a600b5bb305413a4477b5"},
    {file = "backports_asyncio_runner-1.2.0.tar.gz", hash = "sha256:a5aa7b2b7d8f8bfcaa2b57313f70792df84e32a2a746f585213373f900b42162"},
@@ -541,7 +541,7 @@ description = "Backport of PEP 654 (exception groups)"
 optional = false
 python-versions = ">=3.7"
 groups = ["main", "dev"]
-markers = "python_version < \"3.11\""
+markers = "python_version == \"3.10\""
 files = [
    {file = "exceptiongroup-1.3.0-py3-none-any.whl", hash = "sha256:4d111e6e0c13d0644cad6ddaa7ed0261a0b36971f6d23e7ec9b4b9097da78a10"},
    {file = "exceptiongroup-1.3.0.tar.gz", hash = "sha256:b241f5885f560bc56a59ee63ca4c6a8bfa46ae4ad651af316d4e81817bb9fd88"},
@@ -2181,14 +2181,14 @@ testing = ["coverage (>=6.2)", "hypothesis (>=5.7.1)"]

 [[package]]
 name = "pytest-cov"
-version = "7.0.0"
+version = "7.1.0"
 description = "Pytest plugin for measuring coverage."
 optional = false
 python-versions = ">=3.9"
 groups = ["dev"]
 files = [
-    {file = "pytest_cov-7.0.0-py3-none-any.whl", hash = "sha256:3b8e9558b16cc1479da72058bdecf8073661c7f57f7d3c5f22a1c23507f2d861"},
-    {file = "pytest_cov-7.0.0.tar.gz", hash = "sha256:33c97eda2e049a0c5298e91f519302a1334c26ac65c1a483d6206fd458361af1"},
+    {file = "pytest_cov-7.1.0-py3-none-any.whl", hash = "sha256:a0461110b7865f9a271aa1b51e516c9a95de9d696734a2f71e3e78f46e1d4678"},
+    {file = "pytest_cov-7.1.0.tar.gz", hash = "sha256:30674f2b5f6351aa09702a9c8c364f6a01c27aae0c1366ae8016160d1efc56b2"},
 ]

 [package.dependencies]
@@ -2342,30 +2342,30 @@ pyasn1 = ">=0.1.3"

 [[package]]
 name = "ruff"
-version = "0.15.0"
+version = "0.15.7"
 description = "An extremely fast Python linter and code formatter, written in Rust."
 optional = false
 python-versions = ">=3.7"
 groups = ["dev"]
 files = [
-    {file = "ruff-0.15.0-py3-none-linux_armv6l.whl", hash = "sha256:aac4ebaa612a82b23d45964586f24ae9bc23ca101919f5590bdb368d74ad5455"},
-    {file = "ruff-0.15.0-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:dcd4be7cc75cfbbca24a98d04d0b9b36a270d0833241f776b788d59f4142b14d"},
-    {file = "ruff-0.15.0-py3-none-macosx_11_0_arm64.whl", hash = "sha256:d747e3319b2bce179c7c1eaad3d884dc0a199b5f4d5187620530adf9105268ce"},
-    {file = "ruff-0.15.0-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:650bd9c56ae03102c51a5e4b554d74d825ff3abe4db22b90fd32d816c2e90621"},
-    {file = "ruff-0.15.0-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:a6664b7eac559e3048223a2da77769c2f92b43a6dfd4720cef42654299a599c9"},
-    {file = "ruff-0.15.0-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:6f811f97b0f092b35320d1556f3353bf238763420ade5d9e62ebd2b73f2ff179"},
-    {file = "ruff-0.15.0-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:761ec0a66680fab6454236635a39abaf14198818c8cdf691e036f4bc0f406b2d"},
-    {file = "ruff-0.15.0-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:940f11c2604d317e797b289f4f9f3fa5555ffe4fb574b55ed006c3d9b6f0eb78"},
-    {file = "ruff-0.15.0-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bcbca3d40558789126da91d7ef9a7c87772ee107033db7191edefa34e2c7f1b4"},
-    {file = "ruff-0.15.0-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:9a121a96db1d75fa3eb39c4539e607f628920dd72ff1f7c5ee4f1b768ac62d6e"},
-    {file = "ruff-0.15.0-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:5298d518e493061f2eabd4abd067c7e4fb89e2f63291c94332e35631c07c3662"},
-    {file = "ruff-0.15.0-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:afb6e603d6375ff0d6b0cee563fa21ab570fd15e65c852cb24922cef25050cf1"},
-    {file = "ruff-0.15.0-py3-none-musllinux_1_2_i686.whl", hash = "sha256:77e515f6b15f828b94dc17d2b4ace334c9ddb7d9468c54b2f9ed2b9c1593ef16"},
-    {file = "ruff-0.15.0-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:6f6e80850a01eb13b3e42ee0ebdf6e4497151b48c35051aab51c101266d187a3"},
-    {file = "ruff-0.15.0-py3-none-win32.whl", hash = "sha256:238a717ef803e501b6d51e0bdd0d2c6e8513fe9eec14002445134d3907cd46c3"},
-    {file = "ruff-0.15.0-py3-none-win_amd64.whl", hash = "sha256:dd5e4d3301dc01de614da3cdffc33d4b1b96fb89e45721f1598e5532ccf78b18"},
-    {file = "ruff-0.15.0-py3-none-win_arm64.whl", hash = "sha256:c480d632cc0ca3f0727acac8b7d053542d9e114a462a145d0b00e7cd658c515a"},
-    {file = "ruff-0.15.0.tar.gz", hash = "sha256:6bdea47cdbea30d40f8f8d7d69c0854ba7c15420ec75a26f463290949d7f7e9a"},
+    {file = "ruff-0.15.7-py3-none-linux_armv6l.whl", hash = "sha256:a81cc5b6910fb7dfc7c32d20652e50fa05963f6e13ead3c5915c41ac5d16668e"},
+    {file = "ruff-0.15.7-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:722d165bd52403f3bdabc0ce9e41fc47070ac56d7a91b4e0d097b516a53a3477"},
+    {file = "ruff-0.15.7-py3-none-macosx_11_0_arm64.whl", hash = "sha256:7fbc2448094262552146cbe1b9643a92f66559d3761f1ad0656d4991491af49e"},
+    {file = "ruff-0.15.7-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:6b39329b60eba44156d138275323cc726bbfbddcec3063da57caa8a8b1d50adf"},
+    {file = "ruff-0.15.7-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:87768c151808505f2bfc93ae44e5f9e7c8518943e5074f76ac21558ef5627c85"},
+    {file = "ruff-0.15.7-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:fb0511670002c6c529ec66c0e30641c976c8963de26a113f3a30456b702468b0"},
+    {file = "ruff-0.15.7-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:e0d19644f801849229db8345180a71bee5407b429dd217f853ec515e968a6912"},
+    {file = "ruff-0.15.7-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:4806d8e09ef5e84eb19ba833d0442f7e300b23fe3f0981cae159a248a10f0036"},
+    {file = "ruff-0.15.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:dce0896488562f09a27b9c91b1f58a097457143931f3c4d519690dea54e624c5"},
+    {file = "ruff-0.15.7-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:1852ce241d2bc89e5dc823e03cff4ce73d816b5c6cdadd27dbfe7b03217d2a12"},
+    {file = "ruff-0.15.7-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:5f3e4b221fb4bd293f79912fc5e93a9063ebd6d0dcbd528f91b89172a9b8436c"},
+    {file = "ruff-0.15.7-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:b15e48602c9c1d9bdc504b472e90b90c97dc7d46c7028011ae67f3861ceba7b4"},
+    {file = "ruff-0.15.7-py3-none-musllinux_1_2_i686.whl", hash = "sha256:1b4705e0e85cedc74b0a23cf6a179dbb3df184cb227761979cc76c0440b5ab0d"},
+    {file = "ruff-0.15.7-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:112c1fa316a558bb34319282c1200a8bf0495f1b735aeb78bfcb2991e6087580"},
+    {file = "ruff-0.15.7-py3-none-win32.whl", hash = "sha256:6d39e2d3505b082323352f733599f28169d12e891f7dd407f2d4f54b4c2886de"},
+    {file = "ruff-0.15.7-py3-none-win_amd64.whl", hash = "sha256:4d53d712ddebcd7dace1bc395367aec12c057aacfe9adbb6d832302575f4d3a1"},
+    {file = "ruff-0.15.7-py3-none-win_arm64.whl", hash = "sha256:18e8d73f1c3fdf27931497972250340f92e8c861722161a9caeb89a58ead6ed2"},
+    {file = "ruff-0.15.7.tar.gz", hash = "sha256:04f1ae61fc20fe0b148617c324d9d009b5f63412c0b16474f3d5f1a1a665f7ac"},
 ]

 [[package]]
@@ -2564,7 +2564,7 @@ description = "A lil' TOML parser"
 optional = false
 python-versions = ">=3.8"
 groups = ["dev"]
-markers = "python_version < \"3.11\""
+markers = "python_version == \"3.10\""
 files = [
    {file = "tomli-2.2.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:678e4fa69e4575eb77d103de3df8a895e1591b48e740211bd1067378c69e8249"},
    {file = "tomli-2.2.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:023aa114dd824ade0100497eb2318602af309e5a55595f76b626d6d9f3b7b0a6"},
@@ -2912,4 +2912,4 @@ type = ["pytest-mypy"]
 [metadata]
 lock-version = "2.1"
 python-versions = ">=3.10,<4.0"
-content-hash = "9619cae908ad38fa2c48016a58bcf4241f6f5793aa0e6cc140276e91c433cbbb"
+content-hash = "e0936a065565550afed18f6298b7e04e814b44100def7049f1a0d68662624a39"
--- a/autogpt_platform/autogpt_libs/pyproject.toml
+++ b/autogpt_platform/autogpt_libs/pyproject.toml
@@ -26,8 +26,8 @@ pyright = "^1.1.408"
 pytest = "^8.4.1"
 pytest-asyncio = "^1.3.0"
 pytest-mock = "^3.15.1"
-pytest-cov = "^7.0.0"
-ruff = "^0.15.0"
+pytest-cov = "^7.1.0"
+ruff = "^0.15.7"

 [build-system]
 requires = ["poetry-core"]
--- a/autogpt_platform/backend/.env.default
+++ b/autogpt_platform/backend/.env.default
@@ -178,6 +178,7 @@ SMTP_USERNAME=
 SMTP_PASSWORD=

 # Business & Marketing Tools
+AGENTMAIL_API_KEY=
 APOLLO_API_KEY=
 ENRICHLAYER_API_KEY=
 AYRSHARE_API_KEY=
@@ -190,5 +191,8 @@ ZEROBOUNCE_API_KEY=
 POSTHOG_API_KEY=
 POSTHOG_HOST=https://eu.i.posthog.com

+# Tally Form Integration (pre-populate business understanding on signup)
+TALLY_API_KEY=
+
 # Other Services
 AUTOMOD_API_KEY=
--- a/autogpt_platform/backend/AGENTS.md
+++ b/autogpt_platform/backend/AGENTS.md
@@ -0,0 +1,227 @@
+# Backend
+
+This file provides guidance to coding agents when working with the backend.
+
+## Essential Commands
+
+To run anything that depends on Python packages, you MUST use `poetry run ...`.
+
+```bash
+# Install dependencies
+poetry install
+
+# Run database migrations
+poetry run prisma migrate dev
+
+# Start all services (database, redis, rabbitmq, clamav)
+docker compose up -d
+
+# Run the backend as a whole
+poetry run app
+
+# Run tests
+poetry run test
+
+# Run specific test
+poetry run pytest path/to/test_file.py::test_function_name
+
+# Run block tests (tests that validate all blocks work correctly)
+poetry run pytest backend/blocks/test/test_block.py -xvs
+
+# Run tests for a specific block (e.g., GetCurrentTimeBlock)
+poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[GetCurrentTimeBlock]' -xvs
+
+# Lint and format
+# Prefer `format` when you just want autofixes applied; it reports only the errors that can't be autofixed
+poetry run format  # Black + isort
+poetry run lint    # ruff
+```
+
+More details can be found in @TESTING.md
+
+### Creating/Updating Snapshots
+
+When you first write a test or when the expected output changes:
+
+```bash
+poetry run pytest path/to/test.py --snapshot-update
+```
+
+⚠️ **Important**: Always review snapshot changes before committing! Use `git diff` to verify the changes are expected.
+
+## Architecture
+
+- **API Layer**: FastAPI with REST and WebSocket endpoints
+- **Database**: PostgreSQL with Prisma ORM, includes pgvector for embeddings
+- **Queue System**: RabbitMQ for async task processing
+- **Execution Engine**: Separate executor service processes agent workflows
+- **Authentication**: JWT-based with Supabase integration
+- **Security**: Cache protection middleware prevents sensitive data caching in browsers/proxies
+
+## Code Style
+
+- **Top-level imports only** — no local/inner imports (lazy imports only for heavy optional deps like `openpyxl`)
+- **Absolute imports** — use `from backend.module import ...` for cross-package imports. Single-dot relative (`from .sibling import ...`) is acceptable for sibling modules within the same package (e.g., blocks). Avoid double-dot relative imports (`from ..parent import ...`) — use the absolute path instead
+- **No duck typing** — no `hasattr`/`getattr`/`isinstance` for type dispatch; use typed interfaces/unions/protocols
+- **Pydantic models** over dataclass/namedtuple/dict for structured data
+- **No linter suppressors** — no `# type: ignore`, `# noqa`, `# pyright: ignore`; fix the type/code
+- **List comprehensions** over manual loop-and-append
+- **Early return** — guard clauses first, avoid deep nesting
+- **f-strings vs printf syntax in log statements** — Use `%s` for deferred interpolation in `debug` statements, f-strings elsewhere for readability: `logger.debug("Processing %s items", count)`, `logger.info(f"Processing {count} items")`
+- **Sanitize error paths** — `os.path.basename()` in error messages to avoid leaking directory structure
+- **TOCTOU awareness** — avoid check-then-act patterns for file access and credit charging
+- **`Security()` vs `Depends()`** — use `Security()` for auth deps to get proper OpenAPI security spec
+- **Redis pipelines** — `transaction=True` for atomicity on multi-step operations
+- **`max(0, value)` guards** — for computed values that should never be negative
+- **SSE protocol** — `data:` lines for frontend-parsed events (must match Zod schema), `: comment` lines for heartbeats/status
+- **File length** — keep files under ~300 lines; if a file grows beyond this, split by responsibility (e.g. extract helpers, models, or a sub-module into a new file). Never keep appending to a long file.
+- **Function length** — keep functions under ~40 lines; extract named helpers when a function grows longer. Long functions are a sign of mixed concerns, not complexity.
+- **Top-down ordering** — define the main/public function or class first, then the helpers it uses below. A reader should encounter high-level logic before implementation details.
+
+## Testing Approach
+
+- Uses pytest with snapshot testing for API responses
+- Test files are colocated with source files (`*_test.py`)
+- Mock at boundaries — mock where the symbol is **used**, not where it's **defined**
+- After refactoring, update mock targets to match new module paths
+- Use `AsyncMock` for async functions (`from unittest.mock import AsyncMock`)
+
+### Test-Driven Development (TDD)
+
+When fixing a bug or adding a feature, write the test **before** the implementation:
+
+```python
+# 1. Write a failing test marked xfail
+@pytest.mark.xfail(reason="Bug #1234: widget crashes on empty input")
+def test_widget_handles_empty_input():
+    result = widget.process("")
+    assert result == Widget.EMPTY_RESULT
+
+# 2. Run it — confirm it fails (XFAIL)
+# poetry run pytest path/to/test.py::test_widget_handles_empty_input -xvs
+
+# 3. Implement the fix
+
+# 4. Remove xfail, run again — confirm it passes
+def test_widget_handles_empty_input():
+    result = widget.process("")
+    assert result == Widget.EMPTY_RESULT
+```
+
+This catches regressions and proves the fix actually works. **Every bug fix should include a test that would have caught it.**
+
+## Database Schema
+
+Key models (defined in `schema.prisma`):
+
+- `User`: Authentication and profile data
+- `AgentGraph`: Workflow definitions with version control
+- `AgentGraphExecution`: Execution history and results
+- `AgentNode`: Individual nodes in a workflow
+- `StoreListing`: Marketplace listings for sharing agents
+
+## Environment Configuration
+
+- **Backend**: `.env.default` (defaults) → `.env` (user overrides)
+
+## Common Development Tasks
+
+### Adding a new block
+
+Follow the comprehensive [Block SDK Guide](@../../docs/platform/block-sdk-guide.md) which covers:
+
+- Provider configuration with `ProviderBuilder`
+- Block schema definition
+- Authentication (API keys, OAuth, webhooks)
+- Testing and validation
+- File organization
+
+Quick steps:
+
+1. Create a new file in `backend/blocks/`
+2. Configure provider using `ProviderBuilder` in `_config.py`
+3. Inherit from `Block` base class
+4. Define input/output schemas using `BlockSchema`
+5. Implement async `run` method
+6. Generate a unique block ID using `uuid.uuid4()`
+7. Test with `poetry run pytest backend/blocks/test/test_block.py`
+
+Note: when creating several new blocks, analyze each block's interface and consider whether the blocks would connect well in a graph-based editor or struggle to connect productively.
+For example, do the outputs of one block tie well to the inputs of the others?
+
+If you get any pushback or hit complex block conditions, check the new_blocks guide in the docs.
+
+#### Handling files in blocks with `store_media_file()`
+
+When blocks need to work with files (images, videos, documents), use `store_media_file()` from `backend.util.file`. The `return_format` parameter determines what you get back:
+
+| Format | Use When | Returns |
+|--------|----------|---------|
+| `"for_local_processing"` | Processing with local tools (ffmpeg, MoviePy, PIL) | Local file path (e.g., `"image.png"`) |
+| `"for_external_api"` | Sending content to external APIs (Replicate, OpenAI) | Data URI (e.g., `"data:image/png;base64,..."`) |
+| `"for_block_output"` | Returning output from your block | Smart: `workspace://` in CoPilot, data URI in graphs |
+
+**Examples:**
+
+```python
+# INPUT: Need to process file locally with ffmpeg
+local_path = await store_media_file(
+    file=input_data.video,
+    execution_context=execution_context,
+    return_format="for_local_processing",
+)
+# local_path = "video.mp4" - use with Path/ffmpeg/etc
+
+# INPUT: Need to send to external API like Replicate
+image_b64 = await store_media_file(
+    file=input_data.image,
+    execution_context=execution_context,
+    return_format="for_external_api",
+)
+# image_b64 = "data:image/png;base64,iVBORw0..." - send to API
+
+# OUTPUT: Returning result from block
+result_url = await store_media_file(
+    file=generated_image_url,
+    execution_context=execution_context,
+    return_format="for_block_output",
+)
+yield "image_url", result_url
+# In CoPilot: result_url = "workspace://abc123"
+# In graphs:  result_url = "data:image/png;base64,..."
+```
+
+**Key points:**
+
+- `for_block_output` is the ONLY format that auto-adapts to execution context
+- Always use `for_block_output` for block outputs unless you have a specific reason not to
+- Never hardcode workspace checks - let `for_block_output` handle it
+
+### Modifying the API
+
+1. Update route in `backend/api/features/`
+2. Add/update Pydantic models in same directory
+3. Write tests alongside the route file
+4. Run `poetry run test` to verify
+
+## Workspace & Media Files
+
+**Read [Workspace & Media Architecture](../../docs/platform/workspace-media-architecture.md) when:**
+- Working on CoPilot file upload/download features
+- Building blocks that handle `MediaFileType` inputs/outputs
+- Modifying `WorkspaceManager` or `store_media_file()`
+- Debugging file persistence or virus scanning issues
+
+Covers: `WorkspaceManager` (persistent storage with session scoping), `store_media_file()` (media normalization pipeline), and responsibility boundaries for virus scanning and persistence.
+
+## Security Implementation
+
+### Cache Protection Middleware
+
+- Located in `backend/api/middleware/security.py`
+- Default behavior: Disables caching for ALL endpoints with `Cache-Control: no-store, no-cache, must-revalidate, private`
+- Uses an allow list approach - only explicitly permitted paths can be cached
+- Cacheable paths include: static assets (`static/*`, `_next/static/*`), health checks, public store pages, documentation
+- Prevents sensitive data (auth tokens, API keys, user data) from being cached by browsers/proxies
+- To allow caching for a new endpoint, add it to `CACHEABLE_PATHS` in the middleware
+- Applied to both main API server and external API applications
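
Note on the cache protection bullets above: a minimal sketch of the allow-list idea, not the middleware itself. The pattern list and the `cache_control_for` helper are hypothetical; only the header value is taken from the text.

```python
from fnmatch import fnmatch

# Header value quoted in the guide for non-cacheable responses.
NO_CACHE = "no-store, no-cache, must-revalidate, private"

# Hypothetical allow list; the real entries live in CACHEABLE_PATHS.
CACHEABLE_PATTERNS = ["/static/*", "/_next/static/*", "/health"]


def cache_control_for(path: str):
    """Return the Cache-Control value to force, or None if the path may be cached."""
    if any(fnmatch(path, pattern) for pattern in CACHEABLE_PATTERNS):
        return None  # explicitly allowed: leave caching headers alone
    return NO_CACHE  # default: deny caching for everything else
```

Because the default branch denies caching, a new endpoint stays protected until someone deliberately adds it to the list.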
--- a/autogpt_platform/backend/CLAUDE.md
+++ b/autogpt_platform/backend/CLAUDE.md
@@ -1,170 +1 @@
-# CLAUDE.md - Backend
-
-This file provides guidance to Claude Code when working with the backend.
-
-## Essential Commands
-
-To run something with Python package dependencies you MUST use `poetry run ...`.
-
-```bash
-# Install dependencies
-poetry install
-
-# Run database migrations
-poetry run prisma migrate dev
-
-# Start all services (database, redis, rabbitmq, clamav)
-docker compose up -d
-
-# Run the backend as a whole
-poetry run app
-
-# Run tests
-poetry run test
-
-# Run specific test
-poetry run pytest path/to/test_file.py::test_function_name
-
-# Run block tests (tests that validate all blocks work correctly)
-poetry run pytest backend/blocks/test/test_block.py -xvs
-
-# Run tests for a specific block (e.g., GetCurrentTimeBlock)
-poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[GetCurrentTimeBlock]' -xvs
-
-# Lint and format
-# prefer format if you want to just "fix" it and only get the errors that can't be autofixed
-poetry run format  # Black + isort
-poetry run lint    # ruff
-```
-
-More details can be found in @TESTING.md
-
-### Creating/Updating Snapshots
-
-When you first write a test or when the expected output changes:
-
-```bash
-poetry run pytest path/to/test.py --snapshot-update
-```
-
-⚠️ **Important**: Always review snapshot changes before committing! Use `git diff` to verify the changes are expected.
-
-## Architecture
-
-- **API Layer**: FastAPI with REST and WebSocket endpoints
-- **Database**: PostgreSQL with Prisma ORM, includes pgvector for embeddings
-- **Queue System**: RabbitMQ for async task processing
-- **Execution Engine**: Separate executor service processes agent workflows
-- **Authentication**: JWT-based with Supabase integration
-- **Security**: Cache protection middleware prevents sensitive data caching in browsers/proxies
-
-## Testing Approach
-
-- Uses pytest with snapshot testing for API responses
-- Test files are colocated with source files (`*_test.py`)
-
-## Database Schema
-
-Key models (defined in `schema.prisma`):
-
-- `User`: Authentication and profile data
-- `AgentGraph`: Workflow definitions with version control
-- `AgentGraphExecution`: Execution history and results
-- `AgentNode`: Individual nodes in a workflow
-- `StoreListing`: Marketplace listings for sharing agents
-
-## Environment Configuration
-
-- **Backend**: `.env.default` (defaults) → `.env` (user overrides)
-
-## Common Development Tasks
-
-### Adding a new block
-
-Follow the comprehensive [Block SDK Guide](@../../docs/content/platform/block-sdk-guide.md) which covers:
-
-- Provider configuration with `ProviderBuilder`
-- Block schema definition
-- Authentication (API keys, OAuth, webhooks)
-- Testing and validation
-- File organization
-
-Quick steps:
-
-1. Create new file in `backend/blocks/`
-2. Configure provider using `ProviderBuilder` in `_config.py`
-3. Inherit from `Block` base class
-4. Define input/output schemas using `BlockSchema`
-5. Implement async `run` method
-6. Generate unique block ID using `uuid.uuid4()`
-7. Test with `poetry run pytest backend/blocks/test/test_block.py`
-
-Note: when making many new blocks analyze the interfaces for each of these blocks and picture if they would go well together in a graph-based editor or would they struggle to connect productively?
-ex: do the inputs and outputs tie well together?
-
-If you get any pushback or hit complex block conditions check the new_blocks guide in the docs.
-
-#### Handling files in blocks with `store_media_file()`
-
-When blocks need to work with files (images, videos, documents), use `store_media_file()` from `backend.util.file`. The `return_format` parameter determines what you get back:
-
-| Format | Use When | Returns |
-|--------|----------|---------|
-| `"for_local_processing"` | Processing with local tools (ffmpeg, MoviePy, PIL) | Local file path (e.g., `"image.png"`) |
-| `"for_external_api"` | Sending content to external APIs (Replicate, OpenAI) | Data URI (e.g., `"data:image/png;base64,..."`) |
-| `"for_block_output"` | Returning output from your block | Smart: `workspace://` in CoPilot, data URI in graphs |
-
-**Examples:**
-
-```python
-# INPUT: Need to process file locally with ffmpeg
-local_path = await store_media_file(
-    file=input_data.video,
-    execution_context=execution_context,
-    return_format="for_local_processing",
-)
-# local_path = "video.mp4" - use with Path/ffmpeg/etc
-
-# INPUT: Need to send to external API like Replicate
-image_b64 = await store_media_file(
-    file=input_data.image,
-    execution_context=execution_context,
-    return_format="for_external_api",
-)
-# image_b64 = "data:image/png;base64,iVBORw0..." - send to API
-
-# OUTPUT: Returning result from block
-result_url = await store_media_file(
-    file=generated_image_url,
-    execution_context=execution_context,
-    return_format="for_block_output",
-)
-yield "image_url", result_url
-# In CoPilot: result_url = "workspace://abc123"
-# In graphs:  result_url = "data:image/png;base64,..."
-```
-
-**Key points:**
-
-- `for_block_output` is the ONLY format that auto-adapts to execution context
-- Always use `for_block_output` for block outputs unless you have a specific reason not to
-- Never hardcode workspace checks - let `for_block_output` handle it
-
-### Modifying the API
-
-1. Update route in `backend/api/features/`
-2. Add/update Pydantic models in same directory
-3. Write tests alongside the route file
-4. Run `poetry run test` to verify
-
-## Security Implementation
-
-### Cache Protection Middleware
-
-- Located in `backend/api/middleware/security.py`
-- Default behavior: Disables caching for ALL endpoints with `Cache-Control: no-store, no-cache, must-revalidate, private`
-- Uses an allow list approach - only explicitly permitted paths can be cached
-- Cacheable paths include: static assets (`static/*`, `_next/static/*`), health checks, public store pages, documentation
-- Prevents sensitive data (auth tokens, API keys, user data) from being cached by browsers/proxies
-- To allow caching for a new endpoint, add it to `CACHEABLE_PATHS` in the middleware
-- Applied to both main API server and external API applications
+@AGENTS.md
--- a/autogpt_platform/backend/Dockerfile
+++ b/autogpt_platform/backend/Dockerfile
@@ -50,7 +50,7 @@ RUN poetry install --no-ansi --no-root
 # Generate Prisma client
 COPY autogpt_platform/backend/schema.prisma ./
 COPY autogpt_platform/backend/backend/data/partial_types.py ./backend/data/partial_types.py
-COPY autogpt_platform/backend/gen_prisma_types_stub.py ./
+COPY autogpt_platform/backend/scripts/gen_prisma_types_stub.py ./scripts/
 RUN poetry run prisma generate && poetry run gen-prisma-stub

 # =============================== DB MIGRATOR =============================== #
@@ -82,7 +82,7 @@ RUN pip3 install prisma>=0.15.0 --break-system-packages

 COPY autogpt_platform/backend/schema.prisma ./
 COPY autogpt_platform/backend/backend/data/partial_types.py ./backend/data/partial_types.py
-COPY autogpt_platform/backend/gen_prisma_types_stub.py ./
+COPY autogpt_platform/backend/scripts/gen_prisma_types_stub.py ./scripts/
 COPY autogpt_platform/backend/migrations ./migrations

 # ============================== BACKEND SERVER ============================== #
@@ -95,7 +95,7 @@ ENV DEBIAN_FRONTEND=noninteractive

 # Install Python, FFmpeg, ImageMagick, and CLI tools for agent use.
 # bubblewrap provides OS-level sandbox (whitelist-only FS + no network)
-# for the bash_exec MCP tool.
+# for the bash_exec MCP tool (fallback when E2B is not configured).
 # Using --no-install-recommends saves ~650MB by skipping unnecessary deps like llvm, mesa, etc.
 RUN apt-get update && apt-get install -y --no-install-recommends \
    python3.13 \
@@ -111,13 +111,31 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
 # Copy poetry (build-time only, for `poetry install --only-root` to create entry points)
 COPY --from=builder /usr/local/lib/python3* /usr/local/lib/python3*
 COPY --from=builder /usr/local/bin/poetry /usr/local/bin/poetry
-# Copy Node.js installation for Prisma
+# Copy Node.js installation for Prisma and agent-browser.
+# npm/npx are symlinks in the builder (-> ../lib/node_modules/npm/bin/*-cli.js);
+# COPY resolves them to regular files, breaking require() paths.  Recreate as
+# proper symlinks so npm/npx can find their modules.
 COPY --from=builder /usr/bin/node /usr/bin/node
 COPY --from=builder /usr/lib/node_modules /usr/lib/node_modules
-COPY --from=builder /usr/bin/npm /usr/bin/npm
-COPY --from=builder /usr/bin/npx /usr/bin/npx
+RUN ln -s ../lib/node_modules/npm/bin/npm-cli.js /usr/bin/npm \
+    && ln -s ../lib/node_modules/npm/bin/npx-cli.js /usr/bin/npx
 COPY --from=builder /root/.cache/prisma-python/binaries /root/.cache/prisma-python/binaries

+# Install agent-browser (Copilot browser tool) using the system chromium package.
+# Chrome for Testing (the binary agent-browser downloads via `agent-browser install`)
+# has no ARM64 builds, so we use the distro-packaged chromium instead — verified to
+# work with agent-browser via Docker tests on arm64; amd64 is validated in CI.
+# Note: system chromium tracks the Debian package schedule rather than a pinned
+# Chrome for Testing release. If agent-browser requires a specific Chrome version,
+# verify compatibility against the chromium package version in the base image.
+RUN apt-get update \
+    && apt-get install -y --no-install-recommends chromium fonts-liberation \
+    && rm -rf /var/lib/apt/lists/* \
+    && npm install -g agent-browser \
+    && rm -rf /tmp/* /root/.npm
+
+ENV AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
+
 WORKDIR /app/autogpt_platform/backend

 # Copy only the .venv from builder (not the entire /app directory)
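
Note on the npm/npx symlink comment in the Dockerfile hunk above: the same effect can be reproduced outside Docker. This is only an analogy, assuming a POSIX filesystem; `shutil.copy` stands in for `COPY`, and `copy_follows_symlink` is a hypothetical helper.

```python
import os
import shutil
import tempfile


def copy_follows_symlink():
    """Copy a relative symlink and report whether source and copy are links."""
    with tempfile.TemporaryDirectory() as root:
        # Recreate the layout: bin/npm -> ../lib/node_modules/npm/bin/npm-cli.js
        cli_dir = os.path.join(root, "lib", "node_modules", "npm", "bin")
        os.makedirs(cli_dir)
        with open(os.path.join(cli_dir, "npm-cli.js"), "w") as handle:
            handle.write("// cli entry point\n")

        bin_dir = os.path.join(root, "bin")
        os.makedirs(bin_dir)
        link = os.path.join(bin_dir, "npm")
        os.symlink("../lib/node_modules/npm/bin/npm-cli.js", link)

        # shutil.copy follows symlinks by default, so the copy is a regular
        # file holding the script's content, detached from its directory.
        copied = os.path.join(root, "npm-copy")
        shutil.copy(link, copied)
        return os.path.islink(link), os.path.islink(copied)
```

A script copied this way no longer sits next to its modules, which is why the hunk recreates the links with `ln -s` instead of copying them.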
--- a/autogpt_platform/backend/backend/api/external/middleware.py
+++ b/autogpt_platform/backend/backend/api/external/middleware.py
@@ -88,20 +88,23 @@ async def require_auth(
    )


-def require_permission(permission: APIKeyPermission):
+def require_permission(*permissions: APIKeyPermission):
    """
-    Dependency function for checking specific permissions
+    Dependency function for checking required permissions.
+    All listed permissions must be present.
    (works with API keys and OAuth tokens)
    """

-    async def check_permission(
+    async def check_permissions(
        auth: APIAuthorizationInfo = Security(require_auth),
    ) -> APIAuthorizationInfo:
-        if permission not in auth.scopes:
+        missing = [p for p in permissions if p not in auth.scopes]
+        if missing:
            raise HTTPException(
                status_code=status.HTTP_403_FORBIDDEN,
-                detail=f"Missing required permission: {permission.value}",
+                detail=f"Missing required permission(s): "
+                f"{', '.join(p.value for p in missing)}",
            )
        return auth

-    return check_permission
+    return check_permissions
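
Note on the variadic `require_permission` above: a framework-free sketch of the "all listed permissions must be present" rule. `Permission`, `PermissionDenied`, and the module-level `check_permissions` are hypothetical stand-ins for `APIKeyPermission`, the HTTP 403, and the nested dependency.

```python
from enum import Enum


class Permission(Enum):
    # Hypothetical stand-in for prisma.enums.APIKeyPermission.
    WRITE_GRAPH = "WRITE_GRAPH"
    WRITE_LIBRARY = "WRITE_LIBRARY"
    READ_INTEGRATIONS = "READ_INTEGRATIONS"


class PermissionDenied(Exception):
    """Stand-in for the HTTP 403 raised by the real dependency."""


def missing_permissions(required, scopes):
    """Return the required permissions absent from scopes, in declaration order."""
    return [p for p in required if p not in scopes]


def check_permissions(scopes, *required):
    """Raise if any required permission is missing; otherwise return scopes."""
    missing = missing_permissions(required, scopes)
    if missing:
        raise PermissionDenied(
            "Missing required permission(s): "
            + ", ".join(p.value for p in missing)
        )
    return scopes
```

One consequence of the list-comprehension form: calling it with no required permissions never raises, so an empty `require_permission()` would behave like plain authentication.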
--- a/autogpt_platform/backend/backend/api/external/v1/integrations.py
+++ b/autogpt_platform/backend/backend/api/external/v1/integrations.py
@@ -18,14 +18,22 @@ from pydantic import BaseModel, Field, SecretStr

 from backend.api.external.middleware import require_permission
 from backend.api.features.integrations.models import get_all_provider_names
+from backend.api.features.integrations.router import (
+    CredentialsMetaResponse,
+    to_meta_response,
+)
 from backend.data.auth.base import APIAuthorizationInfo
 from backend.data.model import (
    APIKeyCredentials,
    Credentials,
    CredentialsType,
    HostScopedCredentials,
-    OAuth2Credentials,
    UserPasswordCredentials,
+    is_sdk_default,
+)
+from backend.integrations.credentials_store import (
+    is_system_credential,
+    provider_matches,
 )
 from backend.integrations.creds_manager import IntegrationCredentialsManager
 from backend.integrations.oauth import CREDENTIALS_BY_PROVIDER, HANDLERS_BY_NAME
@@ -91,18 +99,6 @@ class OAuthCompleteResponse(BaseModel):
    )


-class CredentialSummary(BaseModel):
-    """Summary of a credential without sensitive data."""
-
-    id: str
-    provider: str
-    type: CredentialsType
-    title: Optional[str] = None
-    scopes: Optional[list[str]] = None
-    username: Optional[str] = None
-    host: Optional[str] = None
-
-
 class ProviderInfo(BaseModel):
    """Information about an integration provider."""

@@ -473,12 +469,12 @@ async def complete_oauth(
    )


-@integrations_router.get("/credentials", response_model=list[CredentialSummary])
+@integrations_router.get("/credentials", response_model=list[CredentialsMetaResponse])
 async def list_credentials(
    auth: APIAuthorizationInfo = Security(
        require_permission(APIKeyPermission.READ_INTEGRATIONS)
    ),
-) -> list[CredentialSummary]:
+) -> list[CredentialsMetaResponse]:
    """
    List all credentials for the authenticated user.

@@ -486,28 +482,19 @@ async def list_credentials(
    """
    credentials = await creds_manager.store.get_all_creds(auth.user_id)
    return [
-        CredentialSummary(
-            id=cred.id,
-            provider=cred.provider,
-            type=cred.type,
-            title=cred.title,
-            scopes=cred.scopes if isinstance(cred, OAuth2Credentials) else None,
-            username=cred.username if isinstance(cred, OAuth2Credentials) else None,
-            host=cred.host if isinstance(cred, HostScopedCredentials) else None,
-        )
-        for cred in credentials
+        to_meta_response(cred) for cred in credentials if not is_sdk_default(cred.id)
    ]


 @integrations_router.get(
-    "/{provider}/credentials", response_model=list[CredentialSummary]
+    "/{provider}/credentials", response_model=list[CredentialsMetaResponse]
 )
 async def list_credentials_by_provider(
    provider: Annotated[str, Path(title="The provider to list credentials for")],
    auth: APIAuthorizationInfo = Security(
        require_permission(APIKeyPermission.READ_INTEGRATIONS)
    ),
-) -> list[CredentialSummary]:
+) -> list[CredentialsMetaResponse]:
    """
    List credentials for a specific provider.
    """
@@ -515,16 +502,7 @@ async def list_credentials_by_provider(
        auth.user_id, provider
    )
    return [
-        CredentialSummary(
-            id=cred.id,
-            provider=cred.provider,
-            type=cred.type,
-            title=cred.title,
-            scopes=cred.scopes if isinstance(cred, OAuth2Credentials) else None,
-            username=cred.username if isinstance(cred, OAuth2Credentials) else None,
-            host=cred.host if isinstance(cred, HostScopedCredentials) else None,
-        )
-        for cred in credentials
+        to_meta_response(cred) for cred in credentials if not is_sdk_default(cred.id)
    ]


@@ -597,11 +575,11 @@ async def create_credential(
    # Store credentials
    try:
        await creds_manager.create(auth.user_id, credentials)
-    except Exception as e:
-        logger.error(f"Failed to store credentials: {e}")
+    except Exception:
+        logger.exception("Failed to store credentials")
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
-            detail=f"Failed to store credentials: {str(e)}",
+            detail="Failed to store credentials",
        )

    logger.info(f"Created {request.type} credentials for provider {provider}")
@@ -639,15 +617,23 @@ async def delete_credential(
    use the main API's delete endpoint which handles webhook cleanup and
    token revocation.
    """
+    if is_sdk_default(cred_id):
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND, detail="Credentials not found"
+        )
+    if is_system_credential(cred_id):
+        raise HTTPException(
+            status_code=status.HTTP_403_FORBIDDEN,
+            detail="System-managed credentials cannot be deleted",
+        )
    creds = await creds_manager.store.get_creds_by_id(auth.user_id, cred_id)
    if not creds:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND, detail="Credentials not found"
        )
-    if creds.provider != provider:
+    if not provider_matches(creds.provider, provider):
        raise HTTPException(
-            status_code=status.HTTP_404_NOT_FOUND,
-            detail="Credentials do not match the specified provider",
+            status_code=status.HTTP_404_NOT_FOUND, detail="Credentials not found"
        )

    await creds_manager.delete(auth.user_id, cred_id)
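
Note on the `delete_credential` guards above: the order of checks decides which status a caller sees. A minimal sketch of that ordering, assuming set-based stand-ins for `is_sdk_default` and `is_system_credential` and plain equality in place of `provider_matches`; the IDs and the `delete_error` helper are hypothetical.

```python
from http import HTTPStatus

# Hypothetical ID sets; the real predicates are imported in the hunk above.
SDK_DEFAULT_IDS = {"example-sdk-default"}
SYSTEM_IDS = {"example-system-cred"}


def delete_error(cred_id, requested_provider, stored):
    """Return the error status for a delete request, or None if it may proceed.

    `stored` maps credential ID -> provider for the calling user.
    """
    if cred_id in SDK_DEFAULT_IDS:
        return HTTPStatus.NOT_FOUND  # hidden entirely, same as "does not exist"
    if cred_id in SYSTEM_IDS:
        return HTTPStatus.FORBIDDEN  # visible but protected
    provider = stored.get(cred_id)
    if provider is None:
        return HTTPStatus.NOT_FOUND
    if provider != requested_provider:  # the diff uses provider_matches() here
        return HTTPStatus.NOT_FOUND  # same response as a missing credential
    return None
```

Returning the same 404 for a provider mismatch as for a missing credential means the response does not reveal that the ID exists under another provider.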
--- a/autogpt_platform/backend/backend/api/external/v1/routes.py
+++ b/autogpt_platform/backend/backend/api/external/v1/routes.py
@@ -1,7 +1,7 @@
 import logging
 import urllib.parse
 from collections import defaultdict
-from typing import Annotated, Any, Literal, Optional, Sequence
+from typing import Annotated, Any, Optional, Sequence

 from fastapi import APIRouter, Body, HTTPException, Security
 from prisma.enums import AgentExecutionStatus, APIKeyPermission
@@ -9,15 +9,17 @@ from pydantic import BaseModel, Field
 from typing_extensions import TypedDict

 import backend.api.features.store.cache as store_cache
+import backend.api.features.store.db as store_db
 import backend.api.features.store.model as store_model
 import backend.blocks
-from backend.api.external.middleware import require_permission
+from backend.api.external.middleware import require_auth, require_permission
 from backend.data import execution as execution_db
 from backend.data import graph as graph_db
 from backend.data import user as user_db
 from backend.data.auth.base import APIAuthorizationInfo
 from backend.data.block import BlockInput, CompletedBlockOutput
 from backend.executor.utils import add_graph_execution
+from backend.integrations.webhooks.graph_lifecycle_hooks import on_graph_activate
 from backend.util.settings import Settings

 from .integrations import integrations_router
@@ -95,6 +97,43 @@ async def execute_graph_block(
    return output


+@v1_router.post(
+    path="/graphs",
+    tags=["graphs"],
+    status_code=201,
+    dependencies=[
+        Security(
+            require_permission(
+                APIKeyPermission.WRITE_GRAPH, APIKeyPermission.WRITE_LIBRARY
+            )
+        )
+    ],
+)
+async def create_graph(
+    graph: graph_db.Graph,
+    auth: APIAuthorizationInfo = Security(
+        require_permission(APIKeyPermission.WRITE_GRAPH, APIKeyPermission.WRITE_LIBRARY)
+    ),
+) -> graph_db.GraphModel:
+    """
+    Create a new agent graph.
+
+    The graph will be validated and assigned a new ID.
+    It is automatically added to the user's library.
+    """
+    from backend.api.features.library import db as library_db
+
+    graph_model = graph_db.make_graph_model(graph, auth.user_id)
+    graph_model.reassign_ids(user_id=auth.user_id, reassign_graph_id=True)
+    graph_model.validate_graph(for_run=False)
+
+    await graph_db.create_graph(graph_model, user_id=auth.user_id)
+    await library_db.create_library_agent(graph_model, auth.user_id)
+    activated_graph = await on_graph_activate(graph_model, user_id=auth.user_id)
+
+    return activated_graph
+
+
 @v1_router.post(
    path="/graphs/{graph_id}/execute/{graph_version}",
    tags=["graphs"],
@@ -192,13 +231,13 @@ async def get_graph_execution_results(
 @v1_router.get(
    path="/store/agents",
    tags=["store"],
-    dependencies=[Security(require_permission(APIKeyPermission.READ_STORE))],
+    dependencies=[Security(require_auth)],  # data is public; auth required as anti-DDoS
    response_model=store_model.StoreAgentsResponse,
 )
 async def get_store_agents(
    featured: bool = False,
    creator: str | None = None,
-    sorted_by: Literal["rating", "runs", "name", "updated_at"] | None = None,
+    sorted_by: store_db.StoreAgentsSortOptions | None = None,
    search_query: str | None = None,
    category: str | None = None,
    page: int = 1,
@@ -240,7 +279,7 @@ async def get_store_agents(
 @v1_router.get(
    path="/store/agents/{username}/{agent_name}",
    tags=["store"],
-    dependencies=[Security(require_permission(APIKeyPermission.READ_STORE))],
+    dependencies=[Security(require_auth)],  # data is public; auth required as anti-DDoS
    response_model=store_model.StoreAgentDetails,
 )
 async def get_store_agent(
@@ -268,13 +307,13 @@ async def get_store_agent(
 @v1_router.get(
    path="/store/creators",
    tags=["store"],
-    dependencies=[Security(require_permission(APIKeyPermission.READ_STORE))],
+    dependencies=[Security(require_auth)],  # data is public; auth required as anti-DDoS
    response_model=store_model.CreatorsResponse,
 )
 async def get_store_creators(
    featured: bool = False,
    search_query: str | None = None,
-    sorted_by: Literal["agent_rating", "agent_runs", "num_agents"] | None = None,
+    sorted_by: store_db.StoreCreatorsSortOptions | None = None,
    page: int = 1,
    page_size: int = 20,
 ) -> store_model.CreatorsResponse:
@@ -310,7 +349,7 @@ async def get_store_creators(
 @v1_router.get(
    path="/store/creators/{username}",
    tags=["store"],
-    dependencies=[Security(require_permission(APIKeyPermission.READ_STORE))],
+    dependencies=[Security(require_auth)],  # data is public; auth required as anti-DDoS
    response_model=store_model.CreatorDetails,
 )
 async def get_store_creator(
--- a/autogpt_platform/backend/backend/api/external/v1/tools.py
+++ b/autogpt_platform/backend/backend/api/external/v1/tools.py
@@ -72,7 +72,7 @@ class RunAgentRequest(BaseModel):

 def _create_ephemeral_session(user_id: str) -> ChatSession:
    """Create an ephemeral session for stateless API requests."""
-    return ChatSession.new(user_id)
+    return ChatSession.new(user_id, dry_run=False)


 @tools_router.post(
--- a/autogpt_platform/backend/backend/api/features/admin/rate_limit_admin_routes.py
+++ b/autogpt_platform/backend/backend/api/features/admin/rate_limit_admin_routes.py
@@ -0,0 +1,146 @@
+"""Admin endpoints for checking and resetting user CoPilot rate limit usage."""
+
+import logging
+from typing import Optional
+
+from autogpt_libs.auth import get_user_id, requires_admin_user
+from fastapi import APIRouter, Body, HTTPException, Security
+from pydantic import BaseModel
+
+from backend.copilot.config import ChatConfig
+from backend.copilot.rate_limit import (
+    get_global_rate_limits,
+    get_usage_status,
+    reset_user_usage,
+)
+from backend.data.user import get_user_by_email, get_user_email_by_id
+
+logger = logging.getLogger(__name__)
+
+config = ChatConfig()
+
+router = APIRouter(
+    prefix="/admin",
+    tags=["copilot", "admin"],
+    dependencies=[Security(requires_admin_user)],
+)
+
+
+class UserRateLimitResponse(BaseModel):
+    user_id: str
+    user_email: Optional[str] = None
+    daily_token_limit: int
+    weekly_token_limit: int
+    daily_tokens_used: int
+    weekly_tokens_used: int
+
+
+async def _resolve_user_id(
+    user_id: Optional[str], email: Optional[str]
+) -> tuple[str, Optional[str]]:
+    """Resolve a user_id and email from the provided parameters.
+
+    Returns (user_id, email). Accepts either user_id or email; at least one
+    must be provided.  When both are provided, ``email`` takes precedence.
+    """
+    if email:
+        user = await get_user_by_email(email)
+        if not user:
+            raise HTTPException(
+                status_code=404, detail="No user found with the provided email."
+            )
+        return user.id, email
+
+    if not user_id:
+        raise HTTPException(
+            status_code=400,
+            detail="Either user_id or email query parameter is required.",
+        )
+
+    # We have a user_id; try to look up their email for display purposes.
+    # This is non-critical -- a failure should not block the response.
+    try:
+        resolved_email = await get_user_email_by_id(user_id)
+    except Exception:
+        logger.warning("Failed to resolve email for user %s", user_id, exc_info=True)
+        resolved_email = None
+    return user_id, resolved_email
+
+
+@router.get(
+    "/rate_limit",
+    response_model=UserRateLimitResponse,
+    summary="Get User Rate Limit",
+)
+async def get_user_rate_limit(
+    user_id: Optional[str] = None,
+    email: Optional[str] = None,
+    admin_user_id: str = Security(get_user_id),
+) -> UserRateLimitResponse:
+    """Get a user's current usage and effective rate limits. Admin-only.
+
+    Accepts either ``user_id`` or ``email`` as a query parameter.
+    When ``email`` is provided the user is looked up by email first.
+    """
+    resolved_id, resolved_email = await _resolve_user_id(user_id, email)
+
+    logger.info("Admin %s checking rate limit for user %s", admin_user_id, resolved_id)
+
+    daily_limit, weekly_limit = await get_global_rate_limits(
+        resolved_id, config.daily_token_limit, config.weekly_token_limit
+    )
+    usage = await get_usage_status(resolved_id, daily_limit, weekly_limit)
+
+    return UserRateLimitResponse(
+        user_id=resolved_id,
+        user_email=resolved_email,
+        daily_token_limit=daily_limit,
+        weekly_token_limit=weekly_limit,
+        daily_tokens_used=usage.daily.used,
+        weekly_tokens_used=usage.weekly.used,
+    )
+
+
+@router.post(
+    "/rate_limit/reset",
+    response_model=UserRateLimitResponse,
+    summary="Reset User Rate Limit Usage",
+)
+async def reset_user_rate_limit(
+    user_id: str = Body(embed=True),
+    reset_weekly: bool = Body(False, embed=True),
+    admin_user_id: str = Security(get_user_id),
+) -> UserRateLimitResponse:
+    """Reset a user's daily usage counter (and optionally weekly). Admin-only."""
+    logger.info(
+        "Admin %s resetting rate limit for user %s (reset_weekly=%s)",
+        admin_user_id,
+        user_id,
+        reset_weekly,
+    )
+
+    try:
+        await reset_user_usage(user_id, reset_weekly=reset_weekly)
+    except Exception as e:
+        logger.exception("Failed to reset user usage")
+        raise HTTPException(status_code=500, detail="Failed to reset usage") from e
+
+    daily_limit, weekly_limit = await get_global_rate_limits(
+        user_id, config.daily_token_limit, config.weekly_token_limit
+    )
+    usage = await get_usage_status(user_id, daily_limit, weekly_limit)
+
+    try:
+        resolved_email = await get_user_email_by_id(user_id)
+    except Exception:
+        logger.warning("Failed to resolve email for user %s", user_id, exc_info=True)
+        resolved_email = None
+
+    return UserRateLimitResponse(
+        user_id=user_id,
+        user_email=resolved_email,
+        daily_token_limit=daily_limit,
+        weekly_token_limit=weekly_limit,
+        daily_tokens_used=usage.daily.used,
+        weekly_tokens_used=usage.weekly.used,
+    )
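
Note on `_resolve_user_id` above: a synchronous sketch of its precedence rules, assuming an in-memory table in place of `get_user_by_email`. `USERS_BY_EMAIL` and `resolve_lookup` are hypothetical, and status codes are returned rather than raised to keep the example self-contained.

```python
# Hypothetical user table standing in for the database lookup.
USERS_BY_EMAIL = {"target@example.com": "user-123"}


def resolve_lookup(user_id=None, email=None):
    """Return (status, user_id), mirroring the precedence in the route helper."""
    if email:  # email wins whenever it is non-empty
        found = USERS_BY_EMAIL.get(email)
        if found is None:
            return 404, None  # unknown email
        return 200, found
    if not user_id:
        return 400, None  # neither identifier supplied
    return 200, user_id  # fall back to the ID as given
```

Since the check is a truthiness test, an empty `email` string is treated the same as an omitted one and the lookup falls through to `user_id`.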
--- a/autogpt_platform/backend/backend/api/features/admin/rate_limit_admin_routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/admin/rate_limit_admin_routes_test.py
@@ -0,0 +1,263 @@
+import json
+from types import SimpleNamespace
+from unittest.mock import AsyncMock
+
+import fastapi
+import fastapi.testclient
+import pytest
+import pytest_mock
+from autogpt_libs.auth.jwt_utils import get_jwt_payload
+from pytest_snapshot.plugin import Snapshot
+
+from backend.copilot.rate_limit import CoPilotUsageStatus, UsageWindow
+
+from .rate_limit_admin_routes import router as rate_limit_admin_router
+
+app = fastapi.FastAPI()
+app.include_router(rate_limit_admin_router)
+
+client = fastapi.testclient.TestClient(app)
+
+_MOCK_MODULE = "backend.api.features.admin.rate_limit_admin_routes"
+
+_TARGET_EMAIL = "target@example.com"
+
+
+@pytest.fixture(autouse=True)
+def setup_app_admin_auth(mock_jwt_admin):
+    """Setup admin auth overrides for all tests in this module"""
+    app.dependency_overrides[get_jwt_payload] = mock_jwt_admin["get_jwt_payload"]
+    yield
+    app.dependency_overrides.clear()
+
+
+def _mock_usage_status(
+    daily_used: int = 500_000, weekly_used: int = 3_000_000
+) -> CoPilotUsageStatus:
+    from datetime import UTC, datetime, timedelta
+
+    now = datetime.now(UTC)
+    return CoPilotUsageStatus(
+        daily=UsageWindow(
+            used=daily_used, limit=2_500_000, resets_at=now + timedelta(hours=6)
+        ),
+        weekly=UsageWindow(
+            used=weekly_used, limit=12_500_000, resets_at=now + timedelta(days=3)
+        ),
+    )
+
+
+def _patch_rate_limit_deps(
+    mocker: pytest_mock.MockerFixture,
+    target_user_id: str,
+    daily_used: int = 500_000,
+    weekly_used: int = 3_000_000,
+):
+    """Patch the common rate-limit + user-lookup dependencies."""
+    mocker.patch(
+        f"{_MOCK_MODULE}.get_global_rate_limits",
+        new_callable=AsyncMock,
+        return_value=(2_500_000, 12_500_000),
+    )
+    mocker.patch(
+        f"{_MOCK_MODULE}.get_usage_status",
+        new_callable=AsyncMock,
+        return_value=_mock_usage_status(daily_used=daily_used, weekly_used=weekly_used),
+    )
+    mocker.patch(
+        f"{_MOCK_MODULE}.get_user_email_by_id",
+        new_callable=AsyncMock,
+        return_value=_TARGET_EMAIL,
+    )
+
+
+def test_get_rate_limit(
+    mocker: pytest_mock.MockerFixture,
+    configured_snapshot: Snapshot,
+    target_user_id: str,
+) -> None:
+    """Test getting rate limit and usage for a user."""
+    _patch_rate_limit_deps(mocker, target_user_id)
+
+    response = client.get("/admin/rate_limit", params={"user_id": target_user_id})
+
+    assert response.status_code == 200
+    data = response.json()
+    assert data["user_id"] == target_user_id
+    assert data["user_email"] == _TARGET_EMAIL
+    assert data["daily_token_limit"] == 2_500_000
+    assert data["weekly_token_limit"] == 12_500_000
+    assert data["daily_tokens_used"] == 500_000
+    assert data["weekly_tokens_used"] == 3_000_000
+
+    configured_snapshot.assert_match(
+        json.dumps(data, indent=2, sort_keys=True) + "\n",
+        "get_rate_limit",
+    )
+
+
+def test_get_rate_limit_by_email(
+    mocker: pytest_mock.MockerFixture,
+    target_user_id: str,
+) -> None:
+    """Test looking up rate limits via email instead of user_id."""
+    _patch_rate_limit_deps(mocker, target_user_id)
+
+    mock_user = SimpleNamespace(id=target_user_id, email=_TARGET_EMAIL)
+    mocker.patch(
+        f"{_MOCK_MODULE}.get_user_by_email",
+        new_callable=AsyncMock,
+        return_value=mock_user,
+    )
+
+    response = client.get("/admin/rate_limit", params={"email": _TARGET_EMAIL})
+
+    assert response.status_code == 200
+    data = response.json()
+    assert data["user_id"] == target_user_id
+    assert data["user_email"] == _TARGET_EMAIL
+    assert data["daily_token_limit"] == 2_500_000
+
+
+def test_get_rate_limit_by_email_not_found(
+    mocker: pytest_mock.MockerFixture,
+) -> None:
+    """Test that looking up a non-existent email returns 404."""
+    mocker.patch(
+        f"{_MOCK_MODULE}.get_user_by_email",
+        new_callable=AsyncMock,
+        return_value=None,
+    )
+
+    response = client.get("/admin/rate_limit", params={"email": "nobody@example.com"})
+
+    assert response.status_code == 404
+
+
+def test_get_rate_limit_no_params() -> None:
+    """Test that omitting both user_id and email returns 400."""
+    response = client.get("/admin/rate_limit")
+    assert response.status_code == 400
+
+
+def test_reset_user_usage_daily_only(
+    mocker: pytest_mock.MockerFixture,
+    configured_snapshot: Snapshot,
+    target_user_id: str,
+) -> None:
+    """Test resetting only daily usage (default behaviour)."""
+    mock_reset = mocker.patch(
+        f"{_MOCK_MODULE}.reset_user_usage",
+        new_callable=AsyncMock,
+    )
+    _patch_rate_limit_deps(mocker, target_user_id, daily_used=0, weekly_used=3_000_000)
+
+    response = client.post(
+        "/admin/rate_limit/reset",
+        json={"user_id": target_user_id},
+    )
+
+    assert response.status_code == 200
+    data = response.json()
+    assert data["daily_tokens_used"] == 0
+    # Weekly is untouched
+    assert data["weekly_tokens_used"] == 3_000_000
+
+    mock_reset.assert_awaited_once_with(target_user_id, reset_weekly=False)
+
+    configured_snapshot.assert_match(
+        json.dumps(data, indent=2, sort_keys=True) + "\n",
+        "reset_user_usage_daily_only",
+    )
+
+
+def test_reset_user_usage_daily_and_weekly(
+    mocker: pytest_mock.MockerFixture,
+    configured_snapshot: Snapshot,
+    target_user_id: str,
+) -> None:
+    """Test resetting both daily and weekly usage."""
+    mock_reset = mocker.patch(
+        f"{_MOCK_MODULE}.reset_user_usage",
+        new_callable=AsyncMock,
+    )
+    _patch_rate_limit_deps(mocker, target_user_id, daily_used=0, weekly_used=0)
+
+    response = client.post(
+        "/admin/rate_limit/reset",
+        json={"user_id": target_user_id, "reset_weekly": True},
+    )
+
+    assert response.status_code == 200
+    data = response.json()
+    assert data["daily_tokens_used"] == 0
+    assert data["weekly_tokens_used"] == 0
+
+    mock_reset.assert_awaited_once_with(target_user_id, reset_weekly=True)
+
+    configured_snapshot.assert_match(
+        json.dumps(data, indent=2, sort_keys=True) + "\n",
+        "reset_user_usage_daily_and_weekly",
+    )
+
+
+def test_reset_user_usage_redis_failure(
+    mocker: pytest_mock.MockerFixture,
+    target_user_id: str,
+) -> None:
+    """Test that Redis failure on reset returns 500."""
+    mocker.patch(
+        f"{_MOCK_MODULE}.reset_user_usage",
+        new_callable=AsyncMock,
+        side_effect=Exception("Redis connection refused"),
+    )
+
+    response = client.post(
+        "/admin/rate_limit/reset",
+        json={"user_id": target_user_id},
+    )
+
+    assert response.status_code == 500
+
+
+def test_get_rate_limit_email_lookup_failure(
+    mocker: pytest_mock.MockerFixture,
+    target_user_id: str,
+) -> None:
+    """Test that failing to resolve a user email degrades gracefully."""
+    mocker.patch(
+        f"{_MOCK_MODULE}.get_global_rate_limits",
+        new_callable=AsyncMock,
+        return_value=(2_500_000, 12_500_000),
+    )
+    mocker.patch(
+        f"{_MOCK_MODULE}.get_usage_status",
+        new_callable=AsyncMock,
+        return_value=_mock_usage_status(),
+    )
+    mocker.patch(
+        f"{_MOCK_MODULE}.get_user_email_by_id",
+        new_callable=AsyncMock,
+        side_effect=Exception("DB connection lost"),
+    )
+
+    response = client.get("/admin/rate_limit", params={"user_id": target_user_id})
+
+    assert response.status_code == 200
+    data = response.json()
+    assert data["user_id"] == target_user_id
+    assert data["user_email"] is None
+
+
+def test_admin_endpoints_require_admin_role(mock_jwt_user) -> None:
+    """Test that rate limit admin endpoints require admin role."""
+    app.dependency_overrides[get_jwt_payload] = mock_jwt_user["get_jwt_payload"]
+
+    response = client.get("/admin/rate_limit", params={"user_id": "test"})
+    assert response.status_code == 403
+
+    response = client.post(
+        "/admin/rate_limit/reset",
+        json={"user_id": "test"},
+    )
+    assert response.status_code == 403
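
The `test_get_rate_limit_email_lookup_failure` case above relies on the route treating a failed email lookup as non-fatal. A minimal, standard-library-only sketch of that pattern follows. `resolve_email` is a hypothetical helper written for illustration and is not the route's actual implementation; only `AsyncMock` and its `side_effect` / `return_value` arguments are taken from the tests.

```python
import asyncio
from typing import Awaitable, Callable, Optional
from unittest.mock import AsyncMock


async def resolve_email(
    lookup: Callable[[str], Awaitable[Optional[str]]], user_id: str
) -> Optional[str]:
    """Hypothetical helper: return the user's email, or None if the lookup fails."""
    try:
        return await lookup(user_id)
    except Exception:
        # Degrade gracefully: a failed lookup should not fail the whole request.
        return None


# Happy path: the mocked coroutine returns an email.
ok_lookup = AsyncMock(return_value="target@example.com")
ok_email = asyncio.run(resolve_email(ok_lookup, "user-1"))

# Failure path: the mocked coroutine raises, and the helper returns None.
failing_lookup = AsyncMock(side_effect=Exception("DB connection lost"))
failed_email = asyncio.run(resolve_email(failing_lookup, "user-1"))

print(ok_email, failed_email)
```

Passing the lookup in as a parameter keeps the sketch self-contained; the real tests achieve the same substitution by patching names on `_MOCK_MODULE`.
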
--- a/autogpt_platform/backend/backend/api/features/admin/store_admin_routes.py
+++ b/autogpt_platform/backend/backend/api/features/admin/store_admin_routes.py
@@ -7,6 +7,8 @@ import fastapi
 import fastapi.responses
 import prisma.enums

+import backend.api.features.library.db as library_db
+import backend.api.features.library.model as library_model
 import backend.api.features.store.cache as store_cache
 import backend.api.features.store.db as store_db
 import backend.api.features.store.model as store_model
@@ -24,14 +26,13 @@ router = fastapi.APIRouter(
 @router.get(
    "/listings",
    summary="Get Admin Listings History",
-    response_model=store_model.StoreListingsWithVersionsResponse,
 )
 async def get_admin_listings_with_versions(
    status: typing.Optional[prisma.enums.SubmissionStatus] = None,
    search: typing.Optional[str] = None,
    page: int = 1,
    page_size: int = 20,
-):
+) -> store_model.StoreListingsWithVersionsAdminViewResponse:
    """
    Get store listings with their version history for admins.

@@ -45,36 +46,26 @@ async def get_admin_listings_with_versions(
        page_size: Number of items per page

    Returns:
-        StoreListingsWithVersionsResponse with listings and their versions
+        Paginated listings with their versions
    """
-    try:
-        listings = await store_db.get_admin_listings_with_versions(
-            status=status,
-            search_query=search,
-            page=page,
-            page_size=page_size,
-        )
-        return listings
-    except Exception as e:
-        logger.exception("Error getting admin listings with versions: %s", e)
-        return fastapi.responses.JSONResponse(
-            status_code=500,
-            content={
-                "detail": "An error occurred while retrieving listings with versions"
-            },
-        )
+    listings = await store_db.get_admin_listings_with_versions(
+        status=status,
+        search_query=search,
+        page=page,
+        page_size=page_size,
+    )
+    return listings


 @router.post(
    "/submissions/{store_listing_version_id}/review",
    summary="Review Store Submission",
-    response_model=store_model.StoreSubmission,
 )
 async def review_submission(
    store_listing_version_id: str,
    request: store_model.ReviewSubmissionRequest,
    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
-):
+) -> store_model.StoreSubmissionAdminView:
    """
    Review a store listing submission.

@@ -84,31 +75,24 @@ async def review_submission(
        user_id: Authenticated admin user performing the review

    Returns:
-        StoreSubmission with updated review information
+        StoreSubmissionAdminView with updated review information
    """
-    try:
-        already_approved = await store_db.check_submission_already_approved(
-            store_listing_version_id=store_listing_version_id,
-        )
-        submission = await store_db.review_store_submission(
-            store_listing_version_id=store_listing_version_id,
-            is_approved=request.is_approved,
-            external_comments=request.comments,
-            internal_comments=request.internal_comments or "",
-            reviewer_id=user_id,
-        )
+    already_approved = await store_db.check_submission_already_approved(
+        store_listing_version_id=store_listing_version_id,
+    )
+    submission = await store_db.review_store_submission(
+        store_listing_version_id=store_listing_version_id,
+        is_approved=request.is_approved,
+        external_comments=request.comments,
+        internal_comments=request.internal_comments or "",
+        reviewer_id=user_id,
+    )

-        state_changed = already_approved != request.is_approved
-        # Clear caches when the request is approved as it updates what is shown on the store
-        if state_changed:
-            store_cache.clear_all_caches()
-        return submission
-    except Exception as e:
-        logger.exception("Error reviewing submission: %s", e)
-        return fastapi.responses.JSONResponse(
-            status_code=500,
-            content={"detail": "An error occurred while reviewing the submission"},
-        )
+    state_changed = already_approved != request.is_approved
+    # Clear caches whenever approval state changes, since store visibility can change
+    if state_changed:
+        store_cache.clear_all_caches()
+    return submission


 @router.get(
@@ -150,3 +134,40 @@ async def admin_download_agent_file(
        return fastapi.responses.FileResponse(
            tmp_file.name, filename=file_name, media_type="application/json"
        )
+
+
+@router.get(
+    "/submissions/{store_listing_version_id}/preview",
+    summary="Admin Preview Submission Listing",
+)
+async def admin_preview_submission(
+    store_listing_version_id: str,
+) -> store_model.StoreAgentDetails:
+    """
+    Preview a marketplace submission as it would appear on the listing page.
+    Bypasses the APPROVED-only StoreAgent view so admins can preview pending
+    submissions before approving.
+    """
+    return await store_db.get_store_agent_details_as_admin(store_listing_version_id)
+
+
+@router.post(
+    "/submissions/{store_listing_version_id}/add-to-library",
+    summary="Admin Add Pending Agent to Library",
+    status_code=201,
+)
+async def admin_add_agent_to_library(
+    store_listing_version_id: str,
+    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
+) -> library_model.LibraryAgent:
+    """
+    Add a pending marketplace agent to the admin's library for review.
+    Uses admin-level access to bypass marketplace APPROVED-only checks.
+
+    The builder can load the graph because get_graph() checks library
+    membership as a fallback: "you added it, you keep it."
+    """
+    return await library_db.add_store_agent_to_library_as_admin(
+        store_listing_version_id=store_listing_version_id,
+        user_id=user_id,
+    )
--- a/autogpt_platform/backend/backend/api/features/admin/store_admin_routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/admin/store_admin_routes_test.py
@@ -0,0 +1,335 @@
+"""Tests for admin store routes and the bypass logic they depend on.
+
+Tests are organized by what they protect:
+- SECRT-2162: get_graph_as_admin bypasses ownership/marketplace checks
+- SECRT-2167 security: admin endpoints reject non-admin users
+- SECRT-2167 bypass: preview queries StoreListingVersion (not StoreAgent view),
+  and add-to-library uses get_graph_as_admin (not get_graph)
+"""
+
+from datetime import datetime, timezone
+from unittest.mock import AsyncMock, MagicMock, patch
+
+import fastapi
+import fastapi.responses
+import fastapi.testclient
+import pytest
+import pytest_mock
+from autogpt_libs.auth.jwt_utils import get_jwt_payload
+
+from backend.data.graph import get_graph_as_admin
+from backend.util.exceptions import NotFoundError
+
+from .store_admin_routes import router as store_admin_router
+
+# Shared constants
+ADMIN_USER_ID = "admin-user-id"
+CREATOR_USER_ID = "other-creator-id"
+GRAPH_ID = "test-graph-id"
+GRAPH_VERSION = 3
+SLV_ID = "test-store-listing-version-id"
+
+
+def _make_mock_graph(user_id: str = CREATOR_USER_ID) -> MagicMock:
+    graph = MagicMock()
+    graph.userId = user_id
+    graph.id = GRAPH_ID
+    graph.version = GRAPH_VERSION
+    graph.Nodes = []
+    return graph
+
+
+# ---- SECRT-2162: get_graph_as_admin bypasses ownership checks ---- #
+
+
+@pytest.mark.asyncio
+async def test_admin_can_access_pending_agent_not_owned() -> None:
+    """get_graph_as_admin must return a graph even when the admin doesn't own
+    it and it's not APPROVED in the marketplace."""
+    mock_graph = _make_mock_graph()
+    mock_graph_model = MagicMock(name="GraphModel")
+
+    with (
+        patch("backend.data.graph.AgentGraph.prisma") as mock_prisma,
+        patch(
+            "backend.data.graph.GraphModel.from_db",
+            return_value=mock_graph_model,
+        ),
+    ):
+        mock_prisma.return_value.find_first = AsyncMock(return_value=mock_graph)
+
+        result = await get_graph_as_admin(
+            graph_id=GRAPH_ID,
+            version=GRAPH_VERSION,
+            user_id=ADMIN_USER_ID,
+            for_export=False,
+        )
+
+    assert result is mock_graph_model
+
+
+@pytest.mark.asyncio
+async def test_admin_download_pending_agent_with_subagents() -> None:
+    """get_graph_as_admin with for_export=True must call get_sub_graphs
+    and pass sub_graphs to GraphModel.from_db."""
+    mock_graph = _make_mock_graph()
+    mock_sub_graph = MagicMock(name="SubGraph")
+    mock_graph_model = MagicMock(name="GraphModel")
+
+    with (
+        patch("backend.data.graph.AgentGraph.prisma") as mock_prisma,
+        patch(
+            "backend.data.graph.get_sub_graphs",
+            new_callable=AsyncMock,
+            return_value=[mock_sub_graph],
+        ) as mock_get_sub,
+        patch(
+            "backend.data.graph.GraphModel.from_db",
+            return_value=mock_graph_model,
+        ) as mock_from_db,
+    ):
+        mock_prisma.return_value.find_first = AsyncMock(return_value=mock_graph)
+
+        result = await get_graph_as_admin(
+            graph_id=GRAPH_ID,
+            version=GRAPH_VERSION,
+            user_id=ADMIN_USER_ID,
+            for_export=True,
+        )
+
+    assert result is mock_graph_model
+    mock_get_sub.assert_awaited_once_with(mock_graph)
+    mock_from_db.assert_called_once_with(
+        graph=mock_graph,
+        sub_graphs=[mock_sub_graph],
+        for_export=True,
+    )
+
+
+# ---- SECRT-2167 security: admin endpoints reject non-admin users ---- #
+
+app = fastapi.FastAPI()
+app.include_router(store_admin_router)
+
+
+@app.exception_handler(NotFoundError)
+async def _not_found_handler(
+    request: fastapi.Request, exc: NotFoundError
+) -> fastapi.responses.JSONResponse:
+    return fastapi.responses.JSONResponse(status_code=404, content={"detail": str(exc)})
+
+
+client = fastapi.testclient.TestClient(app)
+
+
+@pytest.fixture(autouse=True)
+def setup_app_admin_auth(mock_jwt_admin):
+    """Set up admin auth overrides for all route tests in this module."""
+    app.dependency_overrides[get_jwt_payload] = mock_jwt_admin["get_jwt_payload"]
+    yield
+    app.dependency_overrides.clear()
+
+
+def test_preview_requires_admin(mock_jwt_user) -> None:
+    """Non-admin users must get 403 on the preview endpoint."""
+    app.dependency_overrides[get_jwt_payload] = mock_jwt_user["get_jwt_payload"]
+    response = client.get(f"/admin/submissions/{SLV_ID}/preview")
+    assert response.status_code == 403
+
+
+def test_add_to_library_requires_admin(mock_jwt_user) -> None:
+    """Non-admin users must get 403 on the add-to-library endpoint."""
+    app.dependency_overrides[get_jwt_payload] = mock_jwt_user["get_jwt_payload"]
+    response = client.post(f"/admin/submissions/{SLV_ID}/add-to-library")
+    assert response.status_code == 403
+
+
+def test_preview_nonexistent_submission(
+    mocker: pytest_mock.MockerFixture,
+) -> None:
+    """Preview of a nonexistent submission returns 404."""
+    mocker.patch(
+        "backend.api.features.admin.store_admin_routes.store_db"
+        ".get_store_agent_details_as_admin",
+        side_effect=NotFoundError("not found"),
+    )
+    response = client.get(f"/admin/submissions/{SLV_ID}/preview")
+    assert response.status_code == 404
+
+
+# ---- SECRT-2167 bypass: verify the right data sources are used ---- #
+
+
+@pytest.mark.asyncio
+async def test_preview_queries_store_listing_version_not_store_agent() -> None:
+    """get_store_agent_details_as_admin must query StoreListingVersion
+    directly (not the APPROVED-only StoreAgent view). This is THE test that
+    prevents the bypass from being accidentally reverted."""
+    from backend.api.features.store.db import get_store_agent_details_as_admin
+
+    mock_slv = MagicMock()
+    mock_slv.id = SLV_ID
+    mock_slv.name = "Test Agent"
+    mock_slv.subHeading = "Short desc"
+    mock_slv.description = "Long desc"
+    mock_slv.videoUrl = None
+    mock_slv.agentOutputDemoUrl = None
+    mock_slv.imageUrls = ["https://example.com/img.png"]
+    mock_slv.instructions = None
+    mock_slv.categories = ["productivity"]
+    mock_slv.version = 1
+    mock_slv.agentGraphId = GRAPH_ID
+    mock_slv.agentGraphVersion = GRAPH_VERSION
+    mock_slv.updatedAt = datetime(2026, 3, 24, tzinfo=timezone.utc)
+    mock_slv.recommendedScheduleCron = "0 9 * * *"
+
+    mock_listing = MagicMock()
+    mock_listing.id = "listing-id"
+    mock_listing.slug = "test-agent"
+    mock_listing.activeVersionId = SLV_ID
+    mock_listing.hasApprovedVersion = False
+    mock_listing.CreatorProfile = MagicMock(username="creator", avatarUrl="")
+    mock_slv.StoreListing = mock_listing
+
+    with (
+        patch(
+            "backend.api.features.store.db.prisma.models.StoreListingVersion.prisma",
+        ) as mock_slv_prisma,
+        patch(
+            "backend.api.features.store.db.prisma.models.StoreAgent.prisma",
+        ) as mock_store_agent_prisma,
+    ):
+        mock_slv_prisma.return_value.find_unique = AsyncMock(return_value=mock_slv)
+
+        result = await get_store_agent_details_as_admin(SLV_ID)
+
+    # Verify it queried StoreListingVersion (not the APPROVED-only StoreAgent)
+    mock_slv_prisma.return_value.find_unique.assert_awaited_once()
+    await_args = mock_slv_prisma.return_value.find_unique.await_args
+    assert await_args is not None
+    assert await_args.kwargs["where"] == {"id": SLV_ID}
+
+    # Verify the APPROVED-only StoreAgent view was NOT touched
+    mock_store_agent_prisma.assert_not_called()
+
+    # Verify the result has the right data
+    assert result.agent_name == "Test Agent"
+    assert result.agent_image == ["https://example.com/img.png"]
+    assert result.has_approved_version is False
+    assert result.runs == 0
+    assert result.rating == 0.0
+
+
+@pytest.mark.asyncio
+async def test_resolve_graph_admin_uses_get_graph_as_admin() -> None:
+    """resolve_graph_for_library(admin=True) must call get_graph_as_admin,
+    not get_graph. This is THE test that prevents the add-to-library bypass
+    from being accidentally reverted."""
+    from backend.api.features.library._add_to_library import resolve_graph_for_library
+
+    mock_slv = MagicMock()
+    mock_slv.AgentGraph = MagicMock(id=GRAPH_ID, version=GRAPH_VERSION)
+    mock_graph_model = MagicMock(name="GraphModel")
+
+    with (
+        patch(
+            "backend.api.features.library._add_to_library.prisma.models"
+            ".StoreListingVersion.prisma",
+        ) as mock_prisma,
+        patch(
+            "backend.api.features.library._add_to_library.graph_db"
+            ".get_graph_as_admin",
+            new_callable=AsyncMock,
+            return_value=mock_graph_model,
+        ) as mock_admin,
+        patch(
+            "backend.api.features.library._add_to_library.graph_db.get_graph",
+            new_callable=AsyncMock,
+        ) as mock_regular,
+    ):
+        mock_prisma.return_value.find_unique = AsyncMock(return_value=mock_slv)
+
+        result = await resolve_graph_for_library(SLV_ID, ADMIN_USER_ID, admin=True)
+
+    assert result is mock_graph_model
+    mock_admin.assert_awaited_once_with(
+        graph_id=GRAPH_ID, version=GRAPH_VERSION, user_id=ADMIN_USER_ID
+    )
+    mock_regular.assert_not_awaited()
+
+
+@pytest.mark.asyncio
+async def test_resolve_graph_regular_uses_get_graph() -> None:
+    """resolve_graph_for_library(admin=False) must call get_graph,
+    not get_graph_as_admin. Ensures the non-admin path is preserved."""
+    from backend.api.features.library._add_to_library import resolve_graph_for_library
+
+    mock_slv = MagicMock()
+    mock_slv.AgentGraph = MagicMock(id=GRAPH_ID, version=GRAPH_VERSION)
+    mock_graph_model = MagicMock(name="GraphModel")
+
+    with (
+        patch(
+            "backend.api.features.library._add_to_library.prisma.models"
+            ".StoreListingVersion.prisma",
+        ) as mock_prisma,
+        patch(
+            "backend.api.features.library._add_to_library.graph_db"
+            ".get_graph_as_admin",
+            new_callable=AsyncMock,
+        ) as mock_admin,
+        patch(
+            "backend.api.features.library._add_to_library.graph_db.get_graph",
+            new_callable=AsyncMock,
+            return_value=mock_graph_model,
+        ) as mock_regular,
+    ):
+        mock_prisma.return_value.find_unique = AsyncMock(return_value=mock_slv)
+
+        result = await resolve_graph_for_library(SLV_ID, "regular-user-id", admin=False)
+
+    assert result is mock_graph_model
+    mock_regular.assert_awaited_once_with(
+        graph_id=GRAPH_ID, version=GRAPH_VERSION, user_id="regular-user-id"
+    )
+    mock_admin.assert_not_awaited()
+
+
+# ---- Library membership grants graph access (product decision) ---- #
+
+
+@pytest.mark.asyncio
+async def test_library_member_can_view_pending_agent_in_builder() -> None:
+    """After adding a pending agent to their library, the user should be
+    able to load the graph in the builder via get_graph()."""
+    mock_graph = _make_mock_graph()
+    mock_graph_model = MagicMock(name="GraphModel")
+    mock_library_agent = MagicMock()
+    mock_library_agent.AgentGraph = mock_graph
+
+    with (
+        patch("backend.data.graph.AgentGraph.prisma") as mock_ag_prisma,
+        patch(
+            "backend.data.graph.StoreListingVersion.prisma",
+        ) as mock_slv_prisma,
+        patch("backend.data.graph.LibraryAgent.prisma") as mock_lib_prisma,
+        patch(
+            "backend.data.graph.GraphModel.from_db",
+            return_value=mock_graph_model,
+        ),
+    ):
+        mock_ag_prisma.return_value.find_first = AsyncMock(return_value=None)
+        mock_slv_prisma.return_value.find_first = AsyncMock(return_value=None)
+        mock_lib_prisma.return_value.find_first = AsyncMock(
+            return_value=mock_library_agent
+        )
+
+        from backend.data.graph import get_graph
+
+        result = await get_graph(
+            graph_id=GRAPH_ID,
+            version=GRAPH_VERSION,
+            user_id=ADMIN_USER_ID,
+        )
+
+    assert result is mock_graph_model, "Library membership should grant graph access"
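
The two `resolve_graph_for_library` tests above pin down which loader each code path awaits, rather than only checking the return value. The same idea can be sketched with the standard library alone. `resolve_graph` below is a hypothetical stand-in for the dispatcher, and the loaders are injected as arguments purely so the example runs without the backend package; the real tests patch `graph_db.get_graph` and `graph_db.get_graph_as_admin` instead.

```python
import asyncio
from unittest.mock import AsyncMock


async def resolve_graph(graph_id, user_id, *, admin, get_graph, get_graph_as_admin):
    """Hypothetical dispatcher: pick the admin or regular loader, then await it."""
    loader = get_graph_as_admin if admin else get_graph
    return await loader(graph_id=graph_id, user_id=user_id)


admin_loader = AsyncMock(return_value="admin-graph")
regular_loader = AsyncMock(return_value="regular-graph")

result = asyncio.run(
    resolve_graph(
        "test-graph-id",
        "admin-user-id",
        admin=True,
        get_graph=regular_loader,
        get_graph_as_admin=admin_loader,
    )
)

# The admin path must await the admin loader exactly once, with these kwargs...
admin_loader.assert_awaited_once_with(
    graph_id="test-graph-id", user_id="admin-user-id"
)
# ...and must never touch the regular loader.
regular_loader.assert_not_awaited()
```

Asserting on the loader that must not run is what guards against the bypass being silently reverted: a test that checked only `result` would still pass if both loaders returned the same object.
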
--- a/autogpt_platform/backend/backend/api/features/builder/db.py
+++ b/autogpt_platform/backend/backend/api/features/builder/db.py
@@ -1,10 +1,10 @@
 import logging
 from dataclasses import dataclass
-from datetime import datetime, timedelta, timezone
 from difflib import SequenceMatcher
-from typing import Sequence
+from typing import Any, Sequence, get_args, get_origin

 import prisma
+from prisma.models import mv_suggested_blocks

 import backend.api.features.library.db as library_db
 import backend.api.features.library.model as library_model
@@ -19,10 +19,10 @@ from backend.blocks._base import (
    BlockType,
 )
 from backend.blocks.llm import LlmModel
-from backend.data.db import query_raw_with_schema
 from backend.integrations.providers import ProviderName
 from backend.util.cache import cached
 from backend.util.models import Pagination
+from backend.util.text import split_camelcase

 from .model import (
    BlockCategoryResponse,
@@ -42,6 +42,16 @@ MAX_LIBRARY_AGENT_RESULTS = 100
 MAX_MARKETPLACE_AGENT_RESULTS = 100
 MIN_SCORE_FOR_FILTERED_RESULTS = 10.0

+# Boost blocks over marketplace agents in search results
+BLOCK_SCORE_BOOST = 50.0
+
+# Block IDs to exclude from search results
+EXCLUDED_BLOCK_IDS = frozenset(
+    {
+        "e189baac-8c20-45a1-94a7-55177ea42565",  # AgentExecutorBlock
+    }
+)
+
 SearchResultItem = BlockInfo | library_model.LibraryAgent | store_model.StoreAgent


@@ -64,8 +74,8 @@ def get_block_categories(category_blocks: int = 3) -> list[BlockCategoryResponse

    for block_type in load_all_blocks().values():
        block: AnyBlockSchema = block_type()
-        # Skip disabled blocks
-        if block.disabled:
+        # Skip disabled and excluded blocks
+        if block.disabled or block.id in EXCLUDED_BLOCK_IDS:
            continue
        # Skip blocks that don't have categories (all should have at least one)
        if not block.categories:
@@ -116,6 +126,9 @@ def get_blocks(
        # Skip disabled blocks
        if block.disabled:
            continue
+        # Skip excluded blocks
+        if block.id in EXCLUDED_BLOCK_IDS:
+            continue
        # Skip blocks that don't match the category
        if category and category not in {c.name.lower() for c in block.categories}:
            continue
@@ -255,14 +268,25 @@ async def _build_cached_search_results(
        "my_agents": 0,
    }

-    block_results, block_total, integration_total = _collect_block_results(
-        normalized_query=normalized_query,
-        include_blocks=include_blocks,
-        include_integrations=include_integrations,
-    )
-    scored_items.extend(block_results)
-    total_items["blocks"] = block_total
-    total_items["integrations"] = integration_total
+    # Use in-memory text search when a query is present, otherwise list all blocks
+    if (include_blocks or include_integrations) and normalized_query:
+        block_results, block_total, integration_total = await _text_search_blocks(
+            query=search_query,
+            include_blocks=include_blocks,
+            include_integrations=include_integrations,
+        )
+        scored_items.extend(block_results)
+        total_items["blocks"] = block_total
+        total_items["integrations"] = integration_total
+    elif include_blocks or include_integrations:
+        # No query - list all blocks using in-memory approach
+        block_results, block_total, integration_total = _collect_block_results(
+            include_blocks=include_blocks,
+            include_integrations=include_integrations,
+        )
+        scored_items.extend(block_results)
+        total_items["blocks"] = block_total
+        total_items["integrations"] = integration_total

    if include_library_agents:
        library_response = await library_db.list_library_agents(
@@ -307,10 +331,14 @@ async def _build_cached_search_results(

 def _collect_block_results(
    *,
-    normalized_query: str,
    include_blocks: bool,
    include_integrations: bool,
 ) -> tuple[list[_ScoredItem], int, int]:
+    """
+    Collect all blocks for listing (no search query).
+
+    All blocks get BLOCK_SCORE_BOOST to prioritize them over marketplace agents.
+    """
    results: list[_ScoredItem] = []
    block_count = 0
    integration_count = 0
@@ -323,6 +351,10 @@ def _collect_block_results(
        if block.disabled:
            continue

+        # Skip excluded blocks
+        if block.id in EXCLUDED_BLOCK_IDS:
+            continue
+
        block_info = block.get_info()
        credentials = list(block.input_schema.get_credentials_fields().values())
        is_integration = len(credentials) > 0
@@ -332,10 +364,6 @@ def _collect_block_results(
        if not is_integration and not include_blocks:
            continue

-        score = _score_block(block, block_info, normalized_query)
-        if not _should_include_item(score, normalized_query):
-            continue
-
        filter_type: FilterType = "integrations" if is_integration else "blocks"
        if is_integration:
            integration_count += 1
@@ -346,14 +374,86 @@ def _collect_block_results(
            _ScoredItem(
                item=block_info,
                filter_type=filter_type,
-                score=score,
-                sort_key=_get_item_name(block_info),
+                score=BLOCK_SCORE_BOOST,
+                sort_key=block_info.name.lower(),
            )
        )

    return results, block_count, integration_count


+async def _text_search_blocks(
+    *,
+    query: str,
+    include_blocks: bool,
+    include_integrations: bool,
+) -> tuple[list[_ScoredItem], int, int]:
+    """
+    Search blocks using in-memory text matching over the block registry.
+
+    All blocks are already loaded in memory, so this is fast and reliable
+    regardless of whether OpenAI embeddings are available.
+
+    Scoring:
+        - Base: text relevance via _score_primary_fields, plus BLOCK_SCORE_BOOST
+          to prioritize blocks over marketplace agents in combined results
+        - +20 if the block has an LlmModel field and the query matches an LLM model name
+    """
+    results: list[_ScoredItem] = []
+
+    if not include_blocks and not include_integrations:
+        return results, 0, 0
+
+    normalized_query = query.strip().lower()
+
+    all_results, _, _ = _collect_block_results(
+        include_blocks=include_blocks,
+        include_integrations=include_integrations,
+    )
+
+    all_blocks = load_all_blocks()
+
+    for item in all_results:
+        block_info = item.item
+        assert isinstance(block_info, BlockInfo)
+        name = split_camelcase(block_info.name).lower()
+
+        # Build rich description including input field descriptions,
+        # matching the searchable text that the embedding pipeline uses
+        desc_parts = [block_info.description or ""]
+        block_cls = all_blocks.get(block_info.id)
+        if block_cls is not None:
+            block: AnyBlockSchema = block_cls()
+            desc_parts += [
+                f"{f}: {info.description}"
+                for f, info in block.input_schema.model_fields.items()
+                if info.description
+            ]
+        description = " ".join(desc_parts).lower()
+
+        score = _score_primary_fields(name, description, normalized_query)
+
+        # Add LLM model match bonus
+        if block_cls is not None and _matches_llm_model(
+            block_cls().input_schema, normalized_query
+        ):
+            score += 20
+
+        if score >= MIN_SCORE_FOR_FILTERED_RESULTS:
+            results.append(
+                _ScoredItem(
+                    item=block_info,
+                    filter_type=item.filter_type,
+                    score=score + BLOCK_SCORE_BOOST,
+                    sort_key=name,
+                )
+            )
+
+    block_count = sum(1 for r in results if r.filter_type == "blocks")
+    integration_count = sum(1 for r in results if r.filter_type == "integrations")
+    return results, block_count, integration_count
+
+
 def _build_library_items(
    *,
    agents: list[library_model.LibraryAgent],
@@ -472,6 +572,8 @@ async def _get_static_counts():
        block: AnyBlockSchema = block_type()
        if block.disabled:
            continue
+        if block.id in EXCLUDED_BLOCK_IDS:
+            continue

        all_blocks += 1

@@ -498,47 +600,25 @@ async def _get_static_counts():
    }


+def _contains_type(annotation: Any, target: type) -> bool:
+    """Check if an annotation is or contains the target type (handles Optional/Union/Annotated)."""
+    if annotation is target:
+        return True
+    origin = get_origin(annotation)
+    if origin is None:
+        return False
+    return any(_contains_type(arg, target) for arg in get_args(annotation))
+
+
 def _matches_llm_model(schema_cls: type[BlockSchema], query: str) -> bool:
    for field in schema_cls.model_fields.values():
-        if field.annotation == LlmModel:
+        if _contains_type(field.annotation, LlmModel):
            # Check if query matches any value in llm_models
            if any(query in name for name in llm_models):
                return True
    return False


-def _score_block(
-    block: AnyBlockSchema,
-    block_info: BlockInfo,
-    normalized_query: str,
-) -> float:
-    if not normalized_query:
-        return 0.0
-
-    name = block_info.name.lower()
-    description = block_info.description.lower()
-    score = _score_primary_fields(name, description, normalized_query)
-
-    category_text = " ".join(
-        category.get("category", "").lower() for category in block_info.categories
-    )
-    score += _score_additional_field(category_text, normalized_query, 12, 6)
-
-    credentials_info = block.input_schema.get_credentials_fields_info().values()
-    provider_names = [
-        provider.value.lower()
-        for info in credentials_info
-        for provider in info.provider
-    ]
-    provider_text = " ".join(provider_names)
-    score += _score_additional_field(provider_text, normalized_query, 15, 6)
-
-    if _matches_llm_model(block.input_schema, normalized_query):
-        score += 20
-
-    return score
-
-
 def _score_library_agent(
    agent: library_model.LibraryAgent,
    normalized_query: str,
@@ -645,31 +725,20 @@ def _get_all_providers() -> dict[ProviderName, Provider]:
    return providers


-@cached(ttl_seconds=3600)
+@cached(ttl_seconds=3600, shared_cache=True)
 async def get_suggested_blocks(count: int = 5) -> list[BlockInfo]:
-    suggested_blocks = []
-    # Sum the number of executions for each block type
-    # Prisma cannot group by nested relations, so we do a raw query
-    # Calculate the cutoff timestamp
-    timestamp_threshold = datetime.now(timezone.utc) - timedelta(days=30)
+    """Return the most-executed blocks from the last 14 days.

-    results = await query_raw_with_schema(
-        """
-        SELECT
-            agent_node."agentBlockId" AS block_id,
-            COUNT(execution.id) AS execution_count
-        FROM {schema_prefix}"AgentNodeExecution" execution
-        JOIN {schema_prefix}"AgentNode" agent_node ON execution."agentNodeId" = agent_node.id
-        WHERE execution."endedTime" >= $1::timestamp
-        GROUP BY agent_node."agentBlockId"
-        ORDER BY execution_count DESC;
-        """,
-        timestamp_threshold,
-    )
+    Queries the mv_suggested_blocks materialized view (refreshed hourly via pg_cron)
+    and returns the top `count` blocks sorted by execution count, excluding
+    Input/Output/Agent block types and blocks in EXCLUDED_BLOCK_IDS.
+    """
+    results = await mv_suggested_blocks.prisma().find_many()

    # Get the top blocks based on execution count
-    # But ignore Input and Output blocks
+    # But ignore Input, Output, Agent, and excluded blocks
    blocks: list[tuple[BlockInfo, int]] = []
+    execution_counts = {row.block_id: row.execution_count for row in results}

    for block_type in load_all_blocks().values():
        block: AnyBlockSchema = block_type()
@@ -679,11 +748,9 @@ async def get_suggested_blocks(count: int = 5) -> list[BlockInfo]:
            BlockType.AGENT,
        ):
            continue
-        # Find the execution count for this block
-        execution_count = next(
-            (row["execution_count"] for row in results if row["block_id"] == block.id),
-            0,
-        )
+        if block.id in EXCLUDED_BLOCK_IDS:
+            continue
+        execution_count = execution_counts.get(block.id, 0)
        blocks.append((block.get_info(), execution_count))
    # Sort blocks by execution count
    blocks.sort(key=lambda x: x[1], reverse=True)
--- a/autogpt_platform/backend/backend/api/features/builder/model.py
+++ b/autogpt_platform/backend/backend/api/features/builder/model.py
@@ -27,7 +27,6 @@ class SearchEntry(BaseModel):

 # Suggestions
 class SuggestionsResponse(BaseModel):
-    otto_suggestions: list[str]
    recent_searches: list[SearchEntry]
    providers: list[ProviderName]
    top_blocks: list[BlockInfo]
--- a/autogpt_platform/backend/backend/api/features/builder/routes.py
+++ b/autogpt_platform/backend/backend/api/features/builder/routes.py
@@ -1,5 +1,5 @@
 import logging
-from typing import Annotated, Sequence
+from typing import Annotated, Sequence, cast, get_args

 import fastapi
 from autogpt_libs.auth.dependencies import get_user_id, requires_user
@@ -10,6 +10,8 @@ from backend.util.models import Pagination
 from . import db as builder_db
 from . import model as builder_model

+VALID_FILTER_VALUES = get_args(builder_model.FilterType)
+
 logger = logging.getLogger(__name__)

 router = fastapi.APIRouter(
@@ -49,11 +51,6 @@ async def get_suggestions(
    Get all suggestions for the Blocks Menu.
    """
    return builder_model.SuggestionsResponse(
-        otto_suggestions=[
-            "What blocks do I need to get started?",
-            "Help me create a list",
-            "Help me feed my data to Google Maps",
-        ],
        recent_searches=await builder_db.get_recent_searches(user_id),
        providers=[
            ProviderName.TWITTER,
@@ -151,7 +148,7 @@ async def get_providers(
 async def search(
    user_id: Annotated[str, fastapi.Security(get_user_id)],
    search_query: Annotated[str | None, fastapi.Query()] = None,
-    filter: Annotated[list[builder_model.FilterType] | None, fastapi.Query()] = None,
+    filter: Annotated[str | None, fastapi.Query()] = None,
    search_id: Annotated[str | None, fastapi.Query()] = None,
    by_creator: Annotated[list[str] | None, fastapi.Query()] = None,
    page: Annotated[int, fastapi.Query()] = 1,
@@ -160,9 +157,20 @@ async def search(
    """
    Search for blocks (including integrations), marketplace agents, and user library agents.
    """
-    # If no filters are provided, then we will return all types
-    if not filter:
-        filter = [
+    # Parse and validate filter parameter
+    filters: list[builder_model.FilterType]
+    if filter:
+        filter_values = [f.strip() for f in filter.split(",")]
+        invalid_filters = [f for f in filter_values if f not in VALID_FILTER_VALUES]
+        if invalid_filters:
+            raise fastapi.HTTPException(
+                status_code=400,
+                detail=f"Invalid filter value(s): {', '.join(invalid_filters)}. "
+                f"Valid values are: {', '.join(VALID_FILTER_VALUES)}",
+            )
+        filters = cast(list[builder_model.FilterType], filter_values)
+    else:
+        filters = [
            "blocks",
            "integrations",
            "marketplace_agents",
@@ -174,7 +182,7 @@ async def search(
    cached_results = await builder_db.get_sorted_search_results(
        user_id=user_id,
        search_query=search_query,
-        filters=filter,
+        filters=filters,
        by_creator=by_creator,
    )

@@ -196,7 +204,7 @@ async def search(
        user_id,
        builder_model.SearchEntry(
            search_query=search_query,
-            filter=filter,
+            filter=filters,
            by_creator=by_creator,
            search_id=search_id,
        ),
--- a/autogpt_platform/backend/backend/api/features/chat/routes.py
+++ b/autogpt_platform/backend/backend/api/features/chat/routes.py
--- a/autogpt_platform/backend/backend/api/features/chat/routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/routes_test.py
@@ -0,0 +1,528 @@
+"""Tests for chat API routes: session title update, file attachment validation, usage, rate limiting, suggested prompts, and session creation."""
+
+from datetime import UTC, datetime, timedelta
+from unittest.mock import AsyncMock, MagicMock
+
+import fastapi
+import fastapi.testclient
+import pytest
+import pytest_mock
+
+from backend.api.features.chat import routes as chat_routes
+
+app = fastapi.FastAPI()
+app.include_router(chat_routes.router)
+
+client = fastapi.testclient.TestClient(app)
+
+TEST_USER_ID = "3e53486c-cf57-477e-ba2a-cb02dc828e1a"
+
+
+@pytest.fixture(autouse=True)
+def setup_app_auth(mock_jwt_user):
+    """Set up auth overrides for all tests in this module."""
+    from autogpt_libs.auth.jwt_utils import get_jwt_payload
+
+    app.dependency_overrides[get_jwt_payload] = mock_jwt_user["get_jwt_payload"]
+    yield
+    app.dependency_overrides.clear()
+
+
+def _mock_update_session_title(
+    mocker: pytest_mock.MockerFixture, *, success: bool = True
+):
+    """Mock update_session_title."""
+    return mocker.patch(
+        "backend.api.features.chat.routes.update_session_title",
+        new_callable=AsyncMock,
+        return_value=success,
+    )
+
+
+# ─── Update title: success ─────────────────────────────────────────────
+
+
+def test_update_title_success(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    mock_update = _mock_update_session_title(mocker, success=True)
+
+    response = client.patch(
+        "/sessions/sess-1/title",
+        json={"title": "My project"},
+    )
+
+    assert response.status_code == 200
+    assert response.json() == {"status": "ok"}
+    mock_update.assert_called_once_with("sess-1", test_user_id, "My project")
+
+
+def test_update_title_trims_whitespace(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    mock_update = _mock_update_session_title(mocker, success=True)
+
+    response = client.patch(
+        "/sessions/sess-1/title",
+        json={"title": "  trimmed  "},
+    )
+
+    assert response.status_code == 200
+    mock_update.assert_called_once_with("sess-1", test_user_id, "trimmed")
+
+
+# ─── Update title: blank / whitespace-only → 422 ──────────────────────
+
+
+def test_update_title_blank_rejected(
+    test_user_id: str,
+) -> None:
+    """Whitespace-only titles must be rejected before hitting the DB."""
+    response = client.patch(
+        "/sessions/sess-1/title",
+        json={"title": "   "},
+    )
+
+    assert response.status_code == 422
+
+
+def test_update_title_empty_rejected(
+    test_user_id: str,
+) -> None:
+    response = client.patch(
+        "/sessions/sess-1/title",
+        json={"title": ""},
+    )
+
+    assert response.status_code == 422
+
+
+# ─── Update title: session not found or wrong user → 404 ──────────────
+
+
+def test_update_title_not_found(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    _mock_update_session_title(mocker, success=False)
+
+    response = client.patch(
+        "/sessions/sess-1/title",
+        json={"title": "New name"},
+    )
+
+    assert response.status_code == 404
+
+
+# ─── file_ids Pydantic validation ─────────────────────────────────────
+
+
+def test_stream_chat_rejects_too_many_file_ids():
+    """More than 20 file_ids should be rejected by Pydantic validation (422)."""
+    response = client.post(
+        "/sessions/sess-1/stream",
+        json={
+            "message": "hello",
+            "file_ids": [f"00000000-0000-0000-0000-{i:012d}" for i in range(21)],
+        },
+    )
+    assert response.status_code == 422
+
+
+def _mock_stream_internals(mocker: pytest_mock.MockFixture):
+    """Mock the async internals of stream_chat_post so tests can exercise
+    validation and enrichment logic without needing Redis/RabbitMQ."""
+    mocker.patch(
+        "backend.api.features.chat.routes._validate_and_get_session",
+        return_value=None,
+    )
+    mocker.patch(
+        "backend.api.features.chat.routes.append_and_save_message",
+        return_value=None,
+    )
+    mock_registry = mocker.MagicMock()
+    mock_registry.create_session = mocker.AsyncMock(return_value=None)
+    mocker.patch(
+        "backend.api.features.chat.routes.stream_registry",
+        mock_registry,
+    )
+    mocker.patch(
+        "backend.api.features.chat.routes.enqueue_copilot_turn",
+        return_value=None,
+    )
+    mocker.patch(
+        "backend.api.features.chat.routes.track_user_message",
+        return_value=None,
+    )
+
+
+def test_stream_chat_accepts_20_file_ids(mocker: pytest_mock.MockFixture):
+    """Exactly 20 file_ids should be accepted (not rejected by validation)."""
+    _mock_stream_internals(mocker)
+    # Patch workspace lookup as imported by the routes module
+    mocker.patch(
+        "backend.api.features.chat.routes.get_or_create_workspace",
+        return_value=type("W", (), {"id": "ws-1"})(),
+    )
+    mock_prisma = mocker.MagicMock()
+    mock_prisma.find_many = mocker.AsyncMock(return_value=[])
+    mocker.patch(
+        "prisma.models.UserWorkspaceFile.prisma",
+        return_value=mock_prisma,
+    )
+
+    response = client.post(
+        "/sessions/sess-1/stream",
+        json={
+            "message": "hello",
+            "file_ids": [f"00000000-0000-0000-0000-{i:012d}" for i in range(20)],
+        },
+    )
+    # Should get past validation — 200 streaming response expected
+    assert response.status_code == 200
+
+
+# ─── UUID format filtering ─────────────────────────────────────────────
+
+
+def test_file_ids_filters_invalid_uuids(mocker: pytest_mock.MockFixture):
+    """Non-UUID strings in file_ids should be silently filtered out
+    and NOT passed to the database query."""
+    _mock_stream_internals(mocker)
+    mocker.patch(
+        "backend.api.features.chat.routes.get_or_create_workspace",
+        return_value=type("W", (), {"id": "ws-1"})(),
+    )
+
+    mock_prisma = mocker.MagicMock()
+    mock_prisma.find_many = mocker.AsyncMock(return_value=[])
+    mocker.patch(
+        "prisma.models.UserWorkspaceFile.prisma",
+        return_value=mock_prisma,
+    )
+
+    valid_id = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
+    client.post(
+        "/sessions/sess-1/stream",
+        json={
+            "message": "hello",
+            "file_ids": [
+                valid_id,
+                "not-a-uuid",
+                "../../../etc/passwd",
+                "",
+            ],
+        },
+    )
+
+    # The find_many call should only receive the one valid UUID
+    mock_prisma.find_many.assert_called_once()
+    call_kwargs = mock_prisma.find_many.call_args[1]
+    assert call_kwargs["where"]["id"]["in"] == [valid_id]
+
+
+# ─── Cross-workspace file_ids ─────────────────────────────────────────
+
+
+def test_file_ids_scoped_to_workspace(mocker: pytest_mock.MockFixture):
+    """The batch query should scope to the user's workspace."""
+    _mock_stream_internals(mocker)
+    mocker.patch(
+        "backend.api.features.chat.routes.get_or_create_workspace",
+        return_value=type("W", (), {"id": "my-workspace-id"})(),
+    )
+
+    mock_prisma = mocker.MagicMock()
+    mock_prisma.find_many = mocker.AsyncMock(return_value=[])
+    mocker.patch(
+        "prisma.models.UserWorkspaceFile.prisma",
+        return_value=mock_prisma,
+    )
+
+    fid = "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
+    client.post(
+        "/sessions/sess-1/stream",
+        json={"message": "hi", "file_ids": [fid]},
+    )
+
+    call_kwargs = mock_prisma.find_many.call_args[1]
+    assert call_kwargs["where"]["workspaceId"] == "my-workspace-id"
+    assert call_kwargs["where"]["isDeleted"] is False
+
+
+# ─── Rate limit → 429 ─────────────────────────────────────────────────
+
+
+def test_stream_chat_returns_429_on_daily_rate_limit(mocker: pytest_mock.MockFixture):
+    """When check_rate_limit raises RateLimitExceeded for the daily limit, the endpoint returns 429."""
+    from backend.copilot.rate_limit import RateLimitExceeded
+
+    _mock_stream_internals(mocker)
+    # Ensure the rate-limit branch is entered by setting a non-zero limit.
+    mocker.patch.object(chat_routes.config, "daily_token_limit", 10000)
+    mocker.patch.object(chat_routes.config, "weekly_token_limit", 50000)
+    mocker.patch(
+        "backend.api.features.chat.routes.check_rate_limit",
+        side_effect=RateLimitExceeded("daily", datetime.now(UTC) + timedelta(hours=1)),
+    )
+
+    response = client.post(
+        "/sessions/sess-1/stream",
+        json={"message": "hello"},
+    )
+    assert response.status_code == 429
+    assert "daily" in response.json()["detail"].lower()
+
+
+def test_stream_chat_returns_429_on_weekly_rate_limit(mocker: pytest_mock.MockFixture):
+    """When check_rate_limit raises RateLimitExceeded for the weekly limit, the endpoint returns 429."""
+    from backend.copilot.rate_limit import RateLimitExceeded
+
+    _mock_stream_internals(mocker)
+    mocker.patch.object(chat_routes.config, "daily_token_limit", 10000)
+    mocker.patch.object(chat_routes.config, "weekly_token_limit", 50000)
+    resets_at = datetime.now(UTC) + timedelta(days=3)
+    mocker.patch(
+        "backend.api.features.chat.routes.check_rate_limit",
+        side_effect=RateLimitExceeded("weekly", resets_at),
+    )
+
+    response = client.post(
+        "/sessions/sess-1/stream",
+        json={"message": "hello"},
+    )
+    assert response.status_code == 429
+    detail = response.json()["detail"].lower()
+    assert "weekly" in detail
+    assert "resets in" in detail
+
+
+def test_stream_chat_429_includes_reset_time(mocker: pytest_mock.MockFixture):
+    """The 429 response detail should include the human-readable reset time."""
+    from backend.copilot.rate_limit import RateLimitExceeded
+
+    _mock_stream_internals(mocker)
+    mocker.patch.object(chat_routes.config, "daily_token_limit", 10000)
+    mocker.patch.object(chat_routes.config, "weekly_token_limit", 50000)
+    mocker.patch(
+        "backend.api.features.chat.routes.check_rate_limit",
+        side_effect=RateLimitExceeded(
+            "daily", datetime.now(UTC) + timedelta(hours=2, minutes=30)
+        ),
+    )
+
+    response = client.post(
+        "/sessions/sess-1/stream",
+        json={"message": "hello"},
+    )
+    assert response.status_code == 429
+    detail = response.json()["detail"]
+    assert "2h" in detail
+    assert "Resets in" in detail
+
+
+# ─── Usage endpoint ───────────────────────────────────────────────────
+
+
+def _mock_usage(
+    mocker: pytest_mock.MockerFixture,
+    *,
+    daily_used: int = 500,
+    weekly_used: int = 2000,
+) -> AsyncMock:
+    """Mock get_usage_status to return a predictable CoPilotUsageStatus."""
+    from backend.copilot.rate_limit import CoPilotUsageStatus, UsageWindow
+
+    resets_at = datetime.now(UTC) + timedelta(days=1)
+    status = CoPilotUsageStatus(
+        daily=UsageWindow(used=daily_used, limit=10000, resets_at=resets_at),
+        weekly=UsageWindow(used=weekly_used, limit=50000, resets_at=resets_at),
+    )
+    return mocker.patch(
+        "backend.api.features.chat.routes.get_usage_status",
+        new_callable=AsyncMock,
+        return_value=status,
+    )
+
+
+def test_usage_returns_daily_and_weekly(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    """GET /usage returns daily and weekly usage."""
+    mock_get = _mock_usage(mocker, daily_used=500, weekly_used=2000)
+
+    mocker.patch.object(chat_routes.config, "daily_token_limit", 10000)
+    mocker.patch.object(chat_routes.config, "weekly_token_limit", 50000)
+
+    response = client.get("/usage")
+
+    assert response.status_code == 200
+    data = response.json()
+    assert data["daily"]["used"] == 500
+    assert data["weekly"]["used"] == 2000
+
+    mock_get.assert_called_once_with(
+        user_id=test_user_id,
+        daily_token_limit=10000,
+        weekly_token_limit=50000,
+        rate_limit_reset_cost=chat_routes.config.rate_limit_reset_cost,
+    )
+
+
+def test_usage_uses_config_limits(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    """The endpoint forwards daily_token_limit and weekly_token_limit from config."""
+    mock_get = _mock_usage(mocker)
+
+    mocker.patch.object(chat_routes.config, "daily_token_limit", 99999)
+    mocker.patch.object(chat_routes.config, "weekly_token_limit", 77777)
+    mocker.patch.object(chat_routes.config, "rate_limit_reset_cost", 500)
+
+    response = client.get("/usage")
+
+    assert response.status_code == 200
+    mock_get.assert_called_once_with(
+        user_id=test_user_id,
+        daily_token_limit=99999,
+        weekly_token_limit=77777,
+        rate_limit_reset_cost=500,
+    )
+
+
+def test_usage_rejects_unauthenticated_request() -> None:
+    """GET /usage should return 401 when no valid JWT is provided."""
+    unauthenticated_app = fastapi.FastAPI()
+    unauthenticated_app.include_router(chat_routes.router)
+    unauthenticated_client = fastapi.testclient.TestClient(unauthenticated_app)
+
+    response = unauthenticated_client.get("/usage")
+
+    assert response.status_code == 401
+
+
+# ─── Suggested prompts endpoint ──────────────────────────────────────
+
+
+def _mock_get_business_understanding(
+    mocker: pytest_mock.MockerFixture,
+    *,
+    return_value=None,
+):
+    """Mock get_business_understanding."""
+    return mocker.patch(
+        "backend.api.features.chat.routes.get_business_understanding",
+        new_callable=AsyncMock,
+        return_value=return_value,
+    )
+
+
+def test_suggested_prompts_returns_themes(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    """User with themed prompts gets them back as themes list."""
+    mock_understanding = MagicMock()
+    mock_understanding.suggested_prompts = {
+        "Learn": ["L1", "L2"],
+        "Create": ["C1"],
+    }
+    _mock_get_business_understanding(mocker, return_value=mock_understanding)
+
+    response = client.get("/suggested-prompts")
+
+    assert response.status_code == 200
+    data = response.json()
+    assert "themes" in data
+    themes_by_name = {t["name"]: t["prompts"] for t in data["themes"]}
+    assert themes_by_name["Learn"] == ["L1", "L2"]
+    assert themes_by_name["Create"] == ["C1"]
+
+
+def test_suggested_prompts_no_understanding(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    """User with no understanding gets empty themes list."""
+    _mock_get_business_understanding(mocker, return_value=None)
+
+    response = client.get("/suggested-prompts")
+
+    assert response.status_code == 200
+    assert response.json() == {"themes": []}
+
+
+def test_suggested_prompts_empty_prompts(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    """User with understanding but empty prompts gets empty themes list."""
+    mock_understanding = MagicMock()
+    mock_understanding.suggested_prompts = {}
+    _mock_get_business_understanding(mocker, return_value=mock_understanding)
+
+    response = client.get("/suggested-prompts")
+
+    assert response.status_code == 200
+    assert response.json() == {"themes": []}
+
+
+# ─── Create session: dry_run contract ─────────────────────────────────
+
+
+def _mock_create_chat_session(mocker: pytest_mock.MockerFixture):
+    """Mock create_chat_session to return a fake session."""
+    from backend.copilot.model import ChatSession
+
+    async def _fake_create(user_id: str, *, dry_run: bool):
+        return ChatSession.new(user_id, dry_run=dry_run)
+
+    return mocker.patch(
+        "backend.api.features.chat.routes.create_chat_session",
+        new_callable=AsyncMock,
+        side_effect=_fake_create,
+    )
+
+
+def test_create_session_dry_run_true(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    """Sending ``{"dry_run": true}`` sets metadata.dry_run to True."""
+    _mock_create_chat_session(mocker)
+
+    response = client.post("/sessions", json={"dry_run": True})
+
+    assert response.status_code == 200
+    assert response.json()["metadata"]["dry_run"] is True
+
+
+def test_create_session_dry_run_default_false(
+    mocker: pytest_mock.MockerFixture,
+    test_user_id: str,
+) -> None:
+    """Empty body defaults dry_run to False."""
+    _mock_create_chat_session(mocker)
+
+    response = client.post("/sessions")
+
+    assert response.status_code == 200
+    assert response.json()["metadata"]["dry_run"] is False
+
+
+def test_create_session_rejects_nested_metadata(
+    test_user_id: str,
+) -> None:
+    """Sending ``{"metadata": {"dry_run": true}}`` must return 422, not silently
+    default to ``dry_run=False``. This guards against the common mistake of
+    nesting dry_run inside metadata instead of providing it at the top level."""
+    response = client.post(
+        "/sessions",
+        json={"metadata": {"dry_run": True}},
+    )
+
+    assert response.status_code == 422
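
Review note: `test_file_ids_filters_invalid_uuids` expects non-UUID entries in `file_ids` to be dropped before the database query is built. The route implementation is not part of this section, so the following is only a sketch of one way to get that behaviour, using the standard library `uuid` module. `keep_valid_uuids` is a hypothetical helper name, and the real route may use a different check (for example a stricter regular expression).

```python
import uuid


def keep_valid_uuids(file_ids: list[str]) -> list[str]:
    """Return only the entries that parse as UUIDs, preserving order."""
    valid: list[str] = []
    for candidate in file_ids:
        try:
            uuid.UUID(candidate)
        except ValueError:
            continue  # silently drop anything that is not a UUID
        valid.append(candidate)
    return valid


# Same inputs as the test: one valid ID plus three strings that must be dropped.
filtered = keep_valid_uuids(
    ["aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "not-a-uuid", "../../../etc/passwd", ""]
)
```

Note that `uuid.UUID` also accepts some non-canonical spellings (braces, a `urn:uuid:` prefix, missing hyphens), so a route that needs the canonical form only would have to compare against `str(uuid.UUID(candidate))` or use a pattern match instead.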
--- a/autogpt_platform/backend/backend/api/features/executions/review/review_routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/executions/review/review_routes_test.py
@@ -638,7 +638,7 @@ async def test_process_review_action_auto_approve_creates_auto_approval_records(

    # Mock get_node_executions to return node_id mapping
    mock_get_node_executions = mocker.patch(
-        "backend.data.execution.get_node_executions"
+        "backend.api.features.executions.review.routes.get_node_executions"
    )
    mock_node_exec = mocker.Mock(spec=NodeExecutionResult)
    mock_node_exec.node_exec_id = "test_node_123"
@@ -936,7 +936,7 @@ async def test_process_review_action_auto_approve_only_applies_to_approved_revie

    # Mock get_node_executions to return node_id mapping
    mock_get_node_executions = mocker.patch(
-        "backend.data.execution.get_node_executions"
+        "backend.api.features.executions.review.routes.get_node_executions"
    )
    mock_node_exec = mocker.Mock(spec=NodeExecutionResult)
    mock_node_exec.node_exec_id = "node_exec_approved"
@@ -1148,7 +1148,7 @@ async def test_process_review_action_per_review_auto_approve_granularity(

    # Mock get_node_executions to return batch node data
    mock_get_node_executions = mocker.patch(
-        "backend.data.execution.get_node_executions"
+        "backend.api.features.executions.review.routes.get_node_executions"
    )
    # Create mock node executions for each review
    mock_node_execs = []
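
Review note: the three patch-target changes above follow from `routes.py` now importing `get_node_executions` at module level instead of inside the function body. Once a name is bound in the consuming module, `mock.patch` has to replace it there, not in the module that defines it. A small self-contained demonstration follows, using two throwaway modules; `demo_data` and `demo_routes` are hypothetical names standing in for `backend.data.execution` and the review routes module.

```python
import sys
import types
from unittest import mock

# Throwaway module playing the role of the defining module.
demo_data = types.ModuleType("demo_data")
demo_data.get_items = lambda: ["real"]
sys.modules["demo_data"] = demo_data

# Throwaway module that imports the function at module level and uses it.
demo_routes = types.ModuleType("demo_routes")
exec(
    "from demo_data import get_items\n"
    "def handler():\n"
    "    return get_items()\n",
    demo_routes.__dict__,
)
sys.modules["demo_routes"] = demo_routes

# Patching the defining module leaves the name already bound in demo_routes untouched.
with mock.patch("demo_data.get_items", return_value=["fake"]):
    patched_at_source = demo_routes.handler()

# Patching the name in the module that uses it is what the handler actually sees.
with mock.patch("demo_routes.get_items", return_value=["fake"]):
    patched_at_use = demo_routes.handler()
```

With the earlier function-local import, the lookup happened on every call, which is why patching `backend.data.execution.get_node_executions` used to work.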
--- a/autogpt_platform/backend/backend/api/features/executions/review/routes.py
+++ b/autogpt_platform/backend/backend/api/features/executions/review/routes.py
@@ -6,10 +6,15 @@ import autogpt_libs.auth as autogpt_auth_lib
 from fastapi import APIRouter, HTTPException, Query, Security, status
 from prisma.enums import ReviewStatus

+from backend.copilot.constants import (
+    is_copilot_synthetic_id,
+    parse_node_id_from_exec_id,
+)
 from backend.data.execution import (
    ExecutionContext,
    ExecutionStatus,
    get_graph_execution_meta,
+    get_node_executions,
 )
 from backend.data.graph import get_graph_settings
 from backend.data.human_review import (
@@ -22,6 +27,7 @@ from backend.data.human_review import (
 )
 from backend.data.model import USER_TIMEZONE_NOT_SET
 from backend.data.user import get_user_by_id
+from backend.data.workspace import get_or_create_workspace
 from backend.executor.utils import add_graph_execution

 from .model import PendingHumanReviewModel, ReviewRequest, ReviewResponse
@@ -35,6 +41,38 @@ router = APIRouter(
 )


+async def _resolve_node_ids(
+    node_exec_ids: list[str],
+    graph_exec_id: str,
+    is_copilot: bool,
+) -> dict[str, str]:
+    """Resolve node_exec_id -> node_id for auto-approval records.
+
+    CoPilot synthetic IDs encode node_id in the format "{node_id}:{random}".
+    Graph executions look up node_id from NodeExecution records.
+    """
+    if not node_exec_ids:
+        return {}
+
+    if is_copilot:
+        return {neid: parse_node_id_from_exec_id(neid) for neid in node_exec_ids}
+
+    node_execs = await get_node_executions(
+        graph_exec_id=graph_exec_id, include_exec_data=False
+    )
+    node_exec_map = {ne.node_exec_id: ne.node_id for ne in node_execs}
+
+    result = {}
+    for neid in node_exec_ids:
+        if neid in node_exec_map:
+            result[neid] = node_exec_map[neid]
+        else:
+            logger.error(
+                f"Failed to resolve node_id for {neid}: Node execution not found."
+            )
+    return result
+
+
 @router.get(
    "/pending",
    summary="Get Pending Reviews",
@@ -109,14 +147,16 @@ async def list_pending_reviews_for_execution(
    """

    # Verify user owns the graph execution before returning reviews
-    graph_exec = await get_graph_execution_meta(
-        user_id=user_id, execution_id=graph_exec_id
-    )
-    if not graph_exec:
-        raise HTTPException(
-            status_code=status.HTTP_404_NOT_FOUND,
-            detail=f"Graph execution #{graph_exec_id} not found",
+    # (CoPilot synthetic IDs don't have graph execution records)
+    if not is_copilot_synthetic_id(graph_exec_id):
+        graph_exec = await get_graph_execution_meta(
+            user_id=user_id, execution_id=graph_exec_id
        )
+        if not graph_exec:
+            raise HTTPException(
+                status_code=status.HTTP_404_NOT_FOUND,
+                detail=f"Graph execution #{graph_exec_id} not found",
+            )

    return await get_pending_reviews_for_execution(graph_exec_id, user_id)

@@ -159,30 +199,26 @@ async def process_review_action(
        )

    graph_exec_id = next(iter(graph_exec_ids))
+    is_copilot = is_copilot_synthetic_id(graph_exec_id)

-    # Validate execution status before processing reviews
-    graph_exec_meta = await get_graph_execution_meta(
-        user_id=user_id, execution_id=graph_exec_id
-    )
-
-    if not graph_exec_meta:
-        raise HTTPException(
-            status_code=status.HTTP_404_NOT_FOUND,
-            detail=f"Graph execution #{graph_exec_id} not found",
-        )
-
-    # Only allow processing reviews if execution is paused for review
-    # or incomplete (partial execution with some reviews already processed)
-    if graph_exec_meta.status not in (
-        ExecutionStatus.REVIEW,
-        ExecutionStatus.INCOMPLETE,
-    ):
-        raise HTTPException(
-            status_code=status.HTTP_409_CONFLICT,
-            detail=f"Cannot process reviews while execution status is {graph_exec_meta.status}. "
-            f"Reviews can only be processed when execution is paused (REVIEW status). "
-            f"Current status: {graph_exec_meta.status}",
+    # Validate execution status for graph executions (skip for CoPilot synthetic IDs)
+    if not is_copilot:
+        graph_exec_meta = await get_graph_execution_meta(
+            user_id=user_id, execution_id=graph_exec_id
        )
+        if not graph_exec_meta:
+            raise HTTPException(
+                status_code=status.HTTP_404_NOT_FOUND,
+                detail=f"Graph execution #{graph_exec_id} not found",
+            )
+        if graph_exec_meta.status not in (
+            ExecutionStatus.REVIEW,
+            ExecutionStatus.INCOMPLETE,
+        ):
+            raise HTTPException(
+                status_code=status.HTTP_409_CONFLICT,
+                detail=f"Cannot process reviews while execution status is {graph_exec_meta.status}",
+            )

    # Build review decisions map and track which reviews requested auto-approval
    # Auto-approved reviews use original data (no modifications allowed)
@@ -235,7 +271,7 @@ async def process_review_action(
            )
            return (node_id, False)

-    # Collect node_exec_ids that need auto-approval
+    # Collect node_exec_ids that need auto-approval and resolve their node_ids
    node_exec_ids_needing_auto_approval = [
        node_exec_id
        for node_exec_id, review_result in updated_reviews.items()
@@ -243,29 +279,16 @@ async def process_review_action(
        and auto_approve_requests.get(node_exec_id, False)
    ]

-    # Batch-fetch node executions to get node_ids
+    node_id_map = await _resolve_node_ids(
+        node_exec_ids_needing_auto_approval, graph_exec_id, is_copilot
+    )
+
+    # Deduplicate by node_id — one auto-approval per node
    nodes_needing_auto_approval: dict[str, Any] = {}
-    if node_exec_ids_needing_auto_approval:
-        from backend.data.execution import get_node_executions
-
-        node_execs = await get_node_executions(
-            graph_exec_id=graph_exec_id, include_exec_data=False
-        )
-        node_exec_map = {node_exec.node_exec_id: node_exec for node_exec in node_execs}
-
-        for node_exec_id in node_exec_ids_needing_auto_approval:
-            node_exec = node_exec_map.get(node_exec_id)
-            if node_exec:
-                review_result = updated_reviews[node_exec_id]
-                # Use the first approved review for this node (deduplicate by node_id)
-                if node_exec.node_id not in nodes_needing_auto_approval:
-                    nodes_needing_auto_approval[node_exec.node_id] = review_result
-            else:
-                logger.error(
-                    f"Failed to create auto-approval record for {node_exec_id}: "
-                    f"Node execution not found. This may indicate a race condition "
-                    f"or data inconsistency."
-                )
+    for node_exec_id in node_exec_ids_needing_auto_approval:
+        node_id = node_id_map.get(node_exec_id)
+        if node_id and node_id not in nodes_needing_auto_approval:
+            nodes_needing_auto_approval[node_id] = updated_reviews[node_exec_id]

    # Execute all auto-approval creations in parallel (deduplicated by node_id)
    auto_approval_results = await asyncio.gather(
@@ -280,13 +303,11 @@ async def process_review_action(
    auto_approval_failed_count = 0
    for result in auto_approval_results:
        if isinstance(result, Exception):
-            # Unexpected exception during auto-approval creation
            auto_approval_failed_count += 1
            logger.error(
                f"Unexpected exception during auto-approval creation: {result}"
            )
        elif isinstance(result, tuple) and len(result) == 2 and not result[1]:
-            # Auto-approval creation failed (returned False)
            auto_approval_failed_count += 1

    # Count results
@@ -301,30 +322,31 @@ async def process_review_action(
        if review.status == ReviewStatus.REJECTED
    )

-    # Resume execution only if ALL pending reviews for this execution have been processed
-    if updated_reviews:
+    # Resume only real graph executions, once all pending reviews are processed.
+    # CoPilot sessions resume when the LLM retries run_block with the review_id.
+    if not is_copilot and updated_reviews:
        still_has_pending = await has_pending_reviews_for_graph_exec(graph_exec_id)

        if not still_has_pending:
-            # Get the graph_id from any processed review
            first_review = next(iter(updated_reviews.values()))

            try:
-                # Fetch user and settings to build complete execution context
                user = await get_user_by_id(user_id)
                settings = await get_graph_settings(
                    user_id=user_id, graph_id=first_review.graph_id
                )

-                # Preserve user's timezone preference when resuming execution
                user_timezone = (
                    user.timezone if user.timezone != USER_TIMEZONE_NOT_SET else "UTC"
                )

+                workspace = await get_or_create_workspace(user_id)
+
                execution_context = ExecutionContext(
                    human_in_the_loop_safe_mode=settings.human_in_the_loop_safe_mode,
                    sensitive_action_safe_mode=settings.sensitive_action_safe_mode,
                    user_timezone=user_timezone,
+                    workspace_id=workspace.id,
                )

                await add_graph_execution(
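
Reviewer sketch (not part of the patch): the refactored loop in `process_review_action` keeps the first approved review per `node_id`. The snippet below reproduces that first-wins deduplication in isolation, assuming plain dicts stand in for the review records and for the mapping returned by `_resolve_node_ids`, whose body is not shown in this hunk. The function name `dedupe_by_node_id` is hypothetical.

```python
from typing import Any


def dedupe_by_node_id(
    node_exec_ids: list[str],
    node_id_map: dict[str, str],
    reviews: dict[str, Any],
) -> dict[str, Any]:
    """Keep the first review seen for each node_id; skip unresolved IDs."""
    by_node: dict[str, Any] = {}
    for node_exec_id in node_exec_ids:
        node_id = node_id_map.get(node_exec_id)
        # Executions without a resolved node_id are skipped silently here.
        if node_id and node_id not in by_node:
            by_node[node_id] = reviews[node_exec_id]
    return by_node


if __name__ == "__main__":
    result = dedupe_by_node_id(
        ["exec-1", "exec-2", "exec-3"],
        {"exec-1": "node-a", "exec-2": "node-a"},  # exec-3 has no mapping
        {"exec-1": "first", "exec-2": "second", "exec-3": "third"},
    )
    print(result)  # {'node-a': 'first'}
```

The removed code logged an error when a node execution could not be found. The new loop, as written in this hunk, drops such entries without logging; whether `_resolve_node_ids` logs the miss itself cannot be told from the lines shown.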
--- a/autogpt_platform/backend/backend/api/features/integrations/conftest.py
+++ b/autogpt_platform/backend/backend/api/features/integrations/conftest.py
@@ -0,0 +1,13 @@
+"""Override session-scoped fixtures so unit tests run without the server."""
+
+import pytest
+
+
+@pytest.fixture(scope="session")
+def server():
+    yield None
+
+
+@pytest.fixture(scope="session", autouse=True)
+def graph_cleanup():
+    yield
--- a/autogpt_platform/backend/backend/api/features/integrations/router.py
+++ b/autogpt_platform/backend/backend/api/features/integrations/router.py
@@ -34,16 +34,21 @@ from backend.data.model import (
    HostScopedCredentials,
    OAuth2Credentials,
    UserIntegrations,
+    is_sdk_default,
 )
 from backend.data.onboarding import OnboardingStep, complete_onboarding_step
 from backend.data.user import get_user_integrations
 from backend.executor.utils import add_graph_execution
 from backend.integrations.ayrshare import AyrshareClient, SocialPlatform
-from backend.integrations.credentials_store import provider_matches
+from backend.integrations.credentials_store import (
+    is_system_credential,
+    provider_matches,
+)
 from backend.integrations.creds_manager import (
    IntegrationCredentialsManager,
    create_mcp_oauth_handler,
 )
+from backend.integrations.managed_credentials import ensure_managed_credentials
 from backend.integrations.oauth import CREDENTIALS_BY_PROVIDER, HANDLERS_BY_NAME
 from backend.integrations.providers import ProviderName
 from backend.integrations.webhooks import get_webhook_manager
@@ -109,6 +114,7 @@ class CredentialsMetaResponse(BaseModel):
        default=None,
        description="Host pattern for host-scoped or MCP server URL for MCP credentials",
    )
+    is_managed: bool = False

    @model_validator(mode="before")
    @classmethod
@@ -138,6 +144,19 @@ class CredentialsMetaResponse(BaseModel):
        return None


+def to_meta_response(cred: Credentials) -> CredentialsMetaResponse:
+    return CredentialsMetaResponse(
+        id=cred.id,
+        provider=cred.provider,
+        type=cred.type,
+        title=cred.title,
+        scopes=cred.scopes if isinstance(cred, OAuth2Credentials) else None,
+        username=cred.username if isinstance(cred, OAuth2Credentials) else None,
+        host=CredentialsMetaResponse.get_host(cred),
+        is_managed=cred.is_managed,
+    )
+
+
 @router.post("/{provider}/callback", summary="Exchange OAuth code for tokens")
 async def callback(
    provider: Annotated[
@@ -204,34 +223,20 @@ async def callback(
        f"and provider {provider.value}"
    )

-    return CredentialsMetaResponse(
-        id=credentials.id,
-        provider=credentials.provider,
-        type=credentials.type,
-        title=credentials.title,
-        scopes=credentials.scopes,
-        username=credentials.username,
-        host=(CredentialsMetaResponse.get_host(credentials)),
-    )
+    return to_meta_response(credentials)


 @router.get("/credentials", summary="List Credentials")
 async def list_credentials(
    user_id: Annotated[str, Security(get_user_id)],
 ) -> list[CredentialsMetaResponse]:
+    # Fire-and-forget: provision missing managed credentials in the background.
+    # The credential appears on the next page load; listing is never blocked.
+    asyncio.create_task(ensure_managed_credentials(user_id, creds_manager.store))
    credentials = await creds_manager.store.get_all_creds(user_id)

    return [
-        CredentialsMetaResponse(
-            id=cred.id,
-            provider=cred.provider,
-            type=cred.type,
-            title=cred.title,
-            scopes=cred.scopes if isinstance(cred, OAuth2Credentials) else None,
-            username=cred.username if isinstance(cred, OAuth2Credentials) else None,
-            host=CredentialsMetaResponse.get_host(cred),
-        )
-        for cred in credentials
+        to_meta_response(cred) for cred in credentials if not is_sdk_default(cred.id)
    ]


@@ -242,19 +247,11 @@ async def list_credentials_by_provider(
    ],
    user_id: Annotated[str, Security(get_user_id)],
 ) -> list[CredentialsMetaResponse]:
+    asyncio.create_task(ensure_managed_credentials(user_id, creds_manager.store))
    credentials = await creds_manager.store.get_creds_by_provider(user_id, provider)

    return [
-        CredentialsMetaResponse(
-            id=cred.id,
-            provider=cred.provider,
-            type=cred.type,
-            title=cred.title,
-            scopes=cred.scopes if isinstance(cred, OAuth2Credentials) else None,
-            username=cred.username if isinstance(cred, OAuth2Credentials) else None,
-            host=CredentialsMetaResponse.get_host(cred),
-        )
-        for cred in credentials
+        to_meta_response(cred) for cred in credentials if not is_sdk_default(cred.id)
    ]


@@ -267,18 +264,21 @@ async def get_credential(
    ],
    cred_id: Annotated[str, Path(title="The ID of the credentials to retrieve")],
    user_id: Annotated[str, Security(get_user_id)],
-) -> Credentials:
+) -> CredentialsMetaResponse:
+    if is_sdk_default(cred_id):
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND, detail="Credentials not found"
+        )
    credential = await creds_manager.get(user_id, cred_id)
    if not credential:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND, detail="Credentials not found"
        )
-    if credential.provider != provider:
+    if not provider_matches(credential.provider, provider):
        raise HTTPException(
-            status_code=status.HTTP_404_NOT_FOUND,
-            detail="Credentials do not match the specified provider",
+            status_code=status.HTTP_404_NOT_FOUND, detail="Credentials not found"
        )
-    return credential
+    return to_meta_response(credential)


 @router.post("/{provider}/credentials", status_code=201, summary="Create Credentials")
@@ -288,16 +288,22 @@ async def create_credentials(
        ProviderName, Path(title="The provider to create credentials for")
    ],
    credentials: Credentials,
-) -> Credentials:
+) -> CredentialsMetaResponse:
+    if is_sdk_default(credentials.id):
+        raise HTTPException(
+            status_code=status.HTTP_403_FORBIDDEN,
+            detail="Cannot create credentials with a reserved ID",
+        )
    credentials.provider = provider
    try:
        await creds_manager.create(user_id, credentials)
-    except Exception as e:
+    except Exception:
+        logger.exception("Failed to store credentials")
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
-            detail=f"Failed to store credentials: {str(e)}",
+            detail="Failed to store credentials",
        )
-    return credentials
+    return to_meta_response(credentials)


 class CredentialsDeletionResponse(BaseModel):
@@ -332,15 +338,29 @@ async def delete_credentials(
        bool, Query(title="Whether to proceed if any linked webhooks are still in use")
    ] = False,
 ) -> CredentialsDeletionResponse | CredentialsDeletionNeedsConfirmationResponse:
+    if is_sdk_default(cred_id):
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND, detail="Credentials not found"
+        )
+    if is_system_credential(cred_id):
+        raise HTTPException(
+            status_code=status.HTTP_403_FORBIDDEN,
+            detail="System-managed credentials cannot be deleted",
+        )
    creds = await creds_manager.store.get_creds_by_id(user_id, cred_id)
    if not creds:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND, detail="Credentials not found"
        )
-    if creds.provider != provider:
+    if not provider_matches(creds.provider, provider):
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
-            detail="Credentials do not match the specified provider",
+            detail="Credentials not found",
+        )
+    if creds.is_managed:
+        raise HTTPException(
+            status_code=status.HTTP_403_FORBIDDEN,
+            detail="AutoGPT-managed credentials cannot be deleted",
        )

    try:
--- a/autogpt_platform/backend/backend/api/features/integrations/router_test.py
+++ b/autogpt_platform/backend/backend/api/features/integrations/router_test.py
@@ -0,0 +1,570 @@
+"""Tests for credentials API security: no secret leakage, SDK defaults filtered."""
+
+from contextlib import asynccontextmanager
+from unittest.mock import AsyncMock, MagicMock, patch
+
+import fastapi
+import fastapi.testclient
+import pytest
+from pydantic import SecretStr
+
+from backend.api.features.integrations.router import router
+from backend.data.model import (
+    APIKeyCredentials,
+    HostScopedCredentials,
+    OAuth2Credentials,
+    UserPasswordCredentials,
+)
+
+app = fastapi.FastAPI()
+app.include_router(router)
+client = fastapi.testclient.TestClient(app)
+
+TEST_USER_ID = "test-user-id"
+
+
+def _make_api_key_cred(cred_id: str = "cred-123", provider: str = "openai"):
+    return APIKeyCredentials(
+        id=cred_id,
+        provider=provider,
+        title="My API Key",
+        api_key=SecretStr("sk-secret-key-value"),
+    )
+
+
+def _make_oauth2_cred(cred_id: str = "cred-456", provider: str = "github"):
+    return OAuth2Credentials(
+        id=cred_id,
+        provider=provider,
+        title="My OAuth",
+        access_token=SecretStr("ghp_secret_token"),
+        refresh_token=SecretStr("ghp_refresh_secret"),
+        scopes=["repo", "user"],
+        username="testuser",
+    )
+
+
+def _make_user_password_cred(cred_id: str = "cred-789", provider: str = "openai"):
+    return UserPasswordCredentials(
+        id=cred_id,
+        provider=provider,
+        title="My Login",
+        username=SecretStr("admin"),
+        password=SecretStr("s3cret-pass"),
+    )
+
+
+def _make_host_scoped_cred(cred_id: str = "cred-host", provider: str = "openai"):
+    return HostScopedCredentials(
+        id=cred_id,
+        provider=provider,
+        title="Host Cred",
+        host="https://api.example.com",
+        headers={"Authorization": SecretStr("Bearer top-secret")},
+    )
+
+
+def _make_sdk_default_cred(provider: str = "openai"):
+    return APIKeyCredentials(
+        id=f"{provider}-default",
+        provider=provider,
+        title=f"{provider} (default)",
+        api_key=SecretStr("sk-platform-secret-key"),
+    )
+
+
+@pytest.fixture(autouse=True)
+def setup_auth(mock_jwt_user):
+    from autogpt_libs.auth.jwt_utils import get_jwt_payload
+
+    app.dependency_overrides[get_jwt_payload] = mock_jwt_user["get_jwt_payload"]
+    yield
+    app.dependency_overrides.clear()
+
+
+class TestGetCredentialReturnsMetaOnly:
+    """GET /{provider}/credentials/{cred_id} must not return secrets."""
+
+    def test_api_key_credential_no_secret(self):
+        cred = _make_api_key_cred()
+        with (
+            patch.object(router, "dependencies", []),
+            patch("backend.api.features.integrations.router.creds_manager") as mock_mgr,
+        ):
+            mock_mgr.get = AsyncMock(return_value=cred)
+            resp = client.get("/openai/credentials/cred-123")
+
+        assert resp.status_code == 200
+        data = resp.json()
+        assert data["id"] == "cred-123"
+        assert data["provider"] == "openai"
+        assert data["type"] == "api_key"
+        assert "api_key" not in data
+        assert "sk-secret-key-value" not in str(data)
+
+    def test_oauth2_credential_no_secret(self):
+        cred = _make_oauth2_cred()
+        with patch(
+            "backend.api.features.integrations.router.creds_manager"
+        ) as mock_mgr:
+            mock_mgr.get = AsyncMock(return_value=cred)
+            resp = client.get("/github/credentials/cred-456")
+
+        assert resp.status_code == 200
+        data = resp.json()
+        assert data["id"] == "cred-456"
+        assert data["scopes"] == ["repo", "user"]
+        assert data["username"] == "testuser"
+        assert "access_token" not in data
+        assert "refresh_token" not in data
+        assert "ghp_" not in str(data)
+
+    def test_user_password_credential_no_secret(self):
+        cred = _make_user_password_cred()
+        with patch(
+            "backend.api.features.integrations.router.creds_manager"
+        ) as mock_mgr:
+            mock_mgr.get = AsyncMock(return_value=cred)
+            resp = client.get("/openai/credentials/cred-789")
+
+        assert resp.status_code == 200
+        data = resp.json()
+        assert data["id"] == "cred-789"
+        assert "password" not in data
+        assert "username" not in data or data["username"] is None
+        assert "s3cret-pass" not in str(data)
+        assert "admin" not in str(data)
+
+    def test_host_scoped_credential_no_secret(self):
+        cred = _make_host_scoped_cred()
+        with patch(
+            "backend.api.features.integrations.router.creds_manager"
+        ) as mock_mgr:
+            mock_mgr.get = AsyncMock(return_value=cred)
+            resp = client.get("/openai/credentials/cred-host")
+
+        assert resp.status_code == 200
+        data = resp.json()
+        assert data["id"] == "cred-host"
+        assert data["host"] == "https://api.example.com"
+        assert "headers" not in data
+        assert "top-secret" not in str(data)
+
+    def test_get_credential_wrong_provider_returns_404(self):
+        """Provider mismatch should return generic 404, not leak credential existence."""
+        cred = _make_api_key_cred(provider="openai")
+        with patch(
+            "backend.api.features.integrations.router.creds_manager"
+        ) as mock_mgr:
+            mock_mgr.get = AsyncMock(return_value=cred)
+            resp = client.get("/github/credentials/cred-123")
+
+        assert resp.status_code == 404
+        assert resp.json()["detail"] == "Credentials not found"
+
+    def test_list_credentials_no_secrets(self):
+        """List endpoint must not leak secrets in any credential."""
+        creds = [_make_api_key_cred(), _make_oauth2_cred()]
+        with patch(
+            "backend.api.features.integrations.router.creds_manager"
+        ) as mock_mgr:
+            mock_mgr.store.get_all_creds = AsyncMock(return_value=creds)
+            resp = client.get("/credentials")
+
+        assert resp.status_code == 200
+        raw = str(resp.json())
+        assert "sk-secret-key-value" not in raw
+        assert "ghp_secret_token" not in raw
+        assert "ghp_refresh_secret" not in raw
+
+
+class TestSdkDefaultCredentialsNotAccessible:
+    """SDK default credentials (ID ending in '-default') must be hidden."""
+
+    def test_get_sdk_default_returns_404(self):
+        with patch(
+            "backend.api.features.integrations.router.creds_manager"
+        ) as mock_mgr:
+            mock_mgr.get = AsyncMock()
+            resp = client.get("/openai/credentials/openai-default")
+
+        assert resp.status_code == 404
+        mock_mgr.get.assert_not_called()
+
+    def test_list_credentials_excludes_sdk_defaults(self):
+        user_cred = _make_api_key_cred()
+        sdk_cred = _make_sdk_default_cred("openai")
+        with patch(
+            "backend.api.features.integrations.router.creds_manager"
+        ) as mock_mgr:
+            mock_mgr.store.get_all_creds = AsyncMock(return_value=[user_cred, sdk_cred])
+            resp = client.get("/credentials")
+
+        assert resp.status_code == 200
+        data = resp.json()
+        ids = [c["id"] for c in data]
+        assert "cred-123" in ids
+        assert "openai-default" not in ids
+
+    def test_list_by_provider_excludes_sdk_defaults(self):
+        user_cred = _make_api_key_cred()
+        sdk_cred = _make_sdk_default_cred("openai")
+        with patch(
+            "backend.api.features.integrations.router.creds_manager"
+        ) as mock_mgr:
+            mock_mgr.store.get_creds_by_provider = AsyncMock(
+                return_value=[user_cred, sdk_cred]
+            )
+            resp = client.get("/openai/credentials")
+
+        assert resp.status_code == 200
+        data = resp.json()
+        ids = [c["id"] for c in data]
+        assert "cred-123" in ids
+        assert "openai-default" not in ids
+
+    def test_delete_sdk_default_returns_404(self):
+        with patch(
+            "backend.api.features.integrations.router.creds_manager"
+        ) as mock_mgr:
+            mock_mgr.store.get_creds_by_id = AsyncMock()
+            resp = client.request("DELETE", "/openai/credentials/openai-default")
+
+        assert resp.status_code == 404
+        mock_mgr.store.get_creds_by_id.assert_not_called()
+
+
+class TestCreateCredentialNoSecretInResponse:
+    """POST /{provider}/credentials must not return secrets."""
+
+    def test_create_api_key_no_secret_in_response(self):
+        with patch(
+            "backend.api.features.integrations.router.creds_manager"
+        ) as mock_mgr:
+            mock_mgr.create = AsyncMock()
+            resp = client.post(
+                "/openai/credentials",
+                json={
+                    "id": "new-cred",
+                    "provider": "openai",
+                    "type": "api_key",
+                    "title": "New Key",
+                    "api_key": "sk-newsecret",
+                },
+            )
+
+        assert resp.status_code == 201
+        data = resp.json()
+        assert data["id"] == "new-cred"
+        assert "api_key" not in data
+        assert "sk-newsecret" not in str(data)
+
+    def test_create_with_sdk_default_id_rejected(self):
+        with patch(
+            "backend.api.features.integrations.router.creds_manager"
+        ) as mock_mgr:
+            mock_mgr.create = AsyncMock()
+            resp = client.post(
+                "/openai/credentials",
+                json={
+                    "id": "openai-default",
+                    "provider": "openai",
+                    "type": "api_key",
+                    "title": "Sneaky",
+                    "api_key": "sk-evil",
+                },
+            )
+
+        assert resp.status_code == 403
+        mock_mgr.create.assert_not_called()
+
+
+class TestManagedCredentials:
+    """AutoGPT-managed credentials cannot be deleted by users."""
+
+    def test_delete_is_managed_returns_403(self):
+        cred = APIKeyCredentials(
+            id="managed-cred-1",
+            provider="agent_mail",
+            title="AgentMail (managed by AutoGPT)",
+            api_key=SecretStr("sk-managed-key"),
+            is_managed=True,
+        )
+        with patch(
+            "backend.api.features.integrations.router.creds_manager"
+        ) as mock_mgr:
+            mock_mgr.store.get_creds_by_id = AsyncMock(return_value=cred)
+            resp = client.request("DELETE", "/agent_mail/credentials/managed-cred-1")
+
+        assert resp.status_code == 403
+        assert "AutoGPT-managed" in resp.json()["detail"]
+
+    def test_list_credentials_includes_is_managed_field(self):
+        managed = APIKeyCredentials(
+            id="managed-1",
+            provider="agent_mail",
+            title="AgentMail (managed)",
+            api_key=SecretStr("sk-key"),
+            is_managed=True,
+        )
+        regular = APIKeyCredentials(
+            id="regular-1",
+            provider="openai",
+            title="My Key",
+            api_key=SecretStr("sk-key"),
+        )
+        with patch(
+            "backend.api.features.integrations.router.creds_manager"
+        ) as mock_mgr:
+            mock_mgr.store.get_all_creds = AsyncMock(return_value=[managed, regular])
+            resp = client.get("/credentials")
+
+        assert resp.status_code == 200
+        data = resp.json()
+        managed_cred = next(c for c in data if c["id"] == "managed-1")
+        regular_cred = next(c for c in data if c["id"] == "regular-1")
+        assert managed_cred["is_managed"] is True
+        assert regular_cred["is_managed"] is False
+
+
+# ---------------------------------------------------------------------------
+# Managed credential provisioning infrastructure
+# ---------------------------------------------------------------------------
+
+
+def _make_managed_cred(
+    provider: str = "agent_mail", pod_id: str = "pod-abc"
+) -> APIKeyCredentials:
+    return APIKeyCredentials(
+        id="managed-auto",
+        provider=provider,
+        title="AgentMail (managed by AutoGPT)",
+        api_key=SecretStr("sk-pod-key"),
+        is_managed=True,
+        metadata={"pod_id": pod_id},
+    )
+
+
+def _make_store_mock(**kwargs) -> MagicMock:
+    """Store mock whose awaited ``locks()`` has a no-op async ``locked()``."""
+
+    @asynccontextmanager
+    async def _noop_locked(key):
+        yield
+
+    locks_obj = MagicMock()
+    locks_obj.locked = _noop_locked
+
+    store = MagicMock(**kwargs)
+    store.locks = AsyncMock(return_value=locks_obj)
+    return store
+
+
+class TestEnsureManagedCredentials:
+    """Unit tests for ensure_managed_credentials in managed_credentials.py."""
+
+    @pytest.mark.asyncio
+    async def test_provisions_when_missing(self):
+        """Provider.provision() is called when no managed credential exists."""
+        from backend.integrations.managed_credentials import (
+            _PROVIDERS,
+            _provisioned_users,
+            ensure_managed_credentials,
+        )
+
+        cred = _make_managed_cred()
+        provider = MagicMock()
+        provider.provider_name = "test_provider"
+        provider.is_available = AsyncMock(return_value=True)
+        provider.provision = AsyncMock(return_value=cred)
+
+        store = _make_store_mock()
+        store.has_managed_credential = AsyncMock(return_value=False)
+        store.add_managed_credential = AsyncMock()
+
+        saved = dict(_PROVIDERS)
+        _PROVIDERS.clear()
+        _PROVIDERS["test_provider"] = provider
+        _provisioned_users.pop("user-1", None)
+        try:
+            await ensure_managed_credentials("user-1", store)
+        finally:
+            _PROVIDERS.clear()
+            _PROVIDERS.update(saved)
+            _provisioned_users.pop("user-1", None)
+
+        provider.provision.assert_awaited_once_with("user-1")
+        store.add_managed_credential.assert_awaited_once_with("user-1", cred)
+
+    @pytest.mark.asyncio
+    async def test_skips_when_already_exists(self):
+        """Provider.provision() is NOT called when managed credential exists."""
+        from backend.integrations.managed_credentials import (
+            _PROVIDERS,
+            _provisioned_users,
+            ensure_managed_credentials,
+        )
+
+        provider = MagicMock()
+        provider.provider_name = "test_provider"
+        provider.is_available = AsyncMock(return_value=True)
+        provider.provision = AsyncMock()
+
+        store = _make_store_mock()
+        store.has_managed_credential = AsyncMock(return_value=True)
+
+        saved = dict(_PROVIDERS)
+        _PROVIDERS.clear()
+        _PROVIDERS["test_provider"] = provider
+        _provisioned_users.pop("user-1", None)
+        try:
+            await ensure_managed_credentials("user-1", store)
+        finally:
+            _PROVIDERS.clear()
+            _PROVIDERS.update(saved)
+            _provisioned_users.pop("user-1", None)
+
+        provider.provision.assert_not_awaited()
+
+    @pytest.mark.asyncio
+    async def test_skips_when_unavailable(self):
+        """Provider.provision() is NOT called when provider is not available."""
+        from backend.integrations.managed_credentials import (
+            _PROVIDERS,
+            _provisioned_users,
+            ensure_managed_credentials,
+        )
+
+        provider = MagicMock()
+        provider.provider_name = "test_provider"
+        provider.is_available = AsyncMock(return_value=False)
+        provider.provision = AsyncMock()
+
+        store = _make_store_mock()
+        store.has_managed_credential = AsyncMock()
+
+        saved = dict(_PROVIDERS)
+        _PROVIDERS.clear()
+        _PROVIDERS["test_provider"] = provider
+        _provisioned_users.pop("user-1", None)
+        try:
+            await ensure_managed_credentials("user-1", store)
+        finally:
+            _PROVIDERS.clear()
+            _PROVIDERS.update(saved)
+            _provisioned_users.pop("user-1", None)
+
+        provider.provision.assert_not_awaited()
+        store.has_managed_credential.assert_not_awaited()
+
+    @pytest.mark.asyncio
+    async def test_provision_failure_does_not_propagate(self):
+        """A failed provision is logged but does not raise."""
+        from backend.integrations.managed_credentials import (
+            _PROVIDERS,
+            _provisioned_users,
+            ensure_managed_credentials,
+        )
+
+        provider = MagicMock()
+        provider.provider_name = "test_provider"
+        provider.is_available = AsyncMock(return_value=True)
+        provider.provision = AsyncMock(side_effect=RuntimeError("boom"))
+
+        store = _make_store_mock()
+        store.has_managed_credential = AsyncMock(return_value=False)
+
+        saved = dict(_PROVIDERS)
+        _PROVIDERS.clear()
+        _PROVIDERS["test_provider"] = provider
+        _provisioned_users.pop("user-1", None)
+        try:
+            await ensure_managed_credentials("user-1", store)
+        finally:
+            _PROVIDERS.clear()
+            _PROVIDERS.update(saved)
+            _provisioned_users.pop("user-1", None)
+
+        # No exception raised — provisioning failure is swallowed.
+
+
+class TestCleanupManagedCredentials:
+    """Unit tests for cleanup_managed_credentials."""
+
+    @pytest.mark.asyncio
+    async def test_calls_deprovision_for_managed_creds(self):
+        from backend.integrations.managed_credentials import (
+            _PROVIDERS,
+            cleanup_managed_credentials,
+        )
+
+        cred = _make_managed_cred()
+        provider = MagicMock()
+        provider.provider_name = "agent_mail"
+        provider.deprovision = AsyncMock()
+
+        store = MagicMock()
+        store.get_all_creds = AsyncMock(return_value=[cred])
+
+        saved = dict(_PROVIDERS)
+        _PROVIDERS.clear()
+        _PROVIDERS["agent_mail"] = provider
+        try:
+            await cleanup_managed_credentials("user-1", store)
+        finally:
+            _PROVIDERS.clear()
+            _PROVIDERS.update(saved)
+
+        provider.deprovision.assert_awaited_once_with("user-1", cred)
+
+    @pytest.mark.asyncio
+    async def test_skips_non_managed_creds(self):
+        from backend.integrations.managed_credentials import (
+            _PROVIDERS,
+            cleanup_managed_credentials,
+        )
+
+        regular = _make_api_key_cred()
+        provider = MagicMock()
+        provider.provider_name = "openai"
+        provider.deprovision = AsyncMock()
+
+        store = MagicMock()
+        store.get_all_creds = AsyncMock(return_value=[regular])
+
+        saved = dict(_PROVIDERS)
+        _PROVIDERS.clear()
+        _PROVIDERS["openai"] = provider
+        try:
+            await cleanup_managed_credentials("user-1", store)
+        finally:
+            _PROVIDERS.clear()
+            _PROVIDERS.update(saved)
+
+        provider.deprovision.assert_not_awaited()
+
+    @pytest.mark.asyncio
+    async def test_deprovision_failure_does_not_propagate(self):
+        from backend.integrations.managed_credentials import (
+            _PROVIDERS,
+            cleanup_managed_credentials,
+        )
+
+        cred = _make_managed_cred()
+        provider = MagicMock()
+        provider.provider_name = "agent_mail"
+        provider.deprovision = AsyncMock(side_effect=RuntimeError("boom"))
+
+        store = MagicMock()
+        store.get_all_creds = AsyncMock(return_value=[cred])
+
+        saved = dict(_PROVIDERS)
+        _PROVIDERS.clear()
+        _PROVIDERS["agent_mail"] = provider
+        try:
+            await cleanup_managed_credentials("user-1", store)
+        finally:
+            _PROVIDERS.clear()
+            _PROVIDERS.update(saved)
+
+        # No exception raised — cleanup failure is swallowed.
--- a/autogpt_platform/backend/backend/api/features/library/_add_to_library.py
+++ b/autogpt_platform/backend/backend/api/features/library/_add_to_library.py
@@ -0,0 +1,120 @@
+"""Shared logic for adding store agents to a user's library.
+
+Both `add_store_agent_to_library` and `add_store_agent_to_library_as_admin`
+delegate to these helpers so the duplication-prone create/restore/dedup
+logic lives in exactly one place.
+"""
+
+import logging
+
+import prisma.errors
+import prisma.models
+
+import backend.api.features.library.model as library_model
+import backend.data.graph as graph_db
+from backend.data.graph import GraphModel, GraphSettings
+from backend.data.includes import library_agent_include
+from backend.util.exceptions import NotFoundError
+from backend.util.json import SafeJson
+
+logger = logging.getLogger(__name__)
+
+
+async def resolve_graph_for_library(
+    store_listing_version_id: str,
+    user_id: str,
+    *,
+    admin: bool,
+) -> GraphModel:
+    """Look up a StoreListingVersion and resolve its graph.
+
+    When ``admin=True``, uses ``get_graph_as_admin`` to bypass the marketplace
+    APPROVED-only check.  Otherwise uses the regular ``get_graph``.
+    """
+    slv = await prisma.models.StoreListingVersion.prisma().find_unique(
+        where={"id": store_listing_version_id}, include={"AgentGraph": True}
+    )
+    if not slv or not slv.AgentGraph:
+        raise NotFoundError(
+            f"Store listing version {store_listing_version_id} not found or invalid"
+        )
+
+    ag = slv.AgentGraph
+    if admin:
+        graph_model = await graph_db.get_graph_as_admin(
+            graph_id=ag.id, version=ag.version, user_id=user_id
+        )
+    else:
+        graph_model = await graph_db.get_graph(
+            graph_id=ag.id, version=ag.version, user_id=user_id
+        )
+
+    if not graph_model:
+        raise NotFoundError(f"Graph #{ag.id} v{ag.version} not found or accessible")
+    return graph_model
+
+
+async def add_graph_to_library(
+    store_listing_version_id: str,
+    graph_model: GraphModel,
+    user_id: str,
+) -> library_model.LibraryAgent:
+    """Check existing / restore soft-deleted / create new LibraryAgent.
+
+    Uses a create-then-catch-UniqueViolationError-then-update pattern on
+    the (userId, agentGraphId, agentGraphVersion) composite unique constraint.
+    This is more robust than ``upsert`` because Prisma's upsert atomicity
+    guarantees are not well-documented for all versions.
+    """
+    settings_json = SafeJson(GraphSettings.from_graph(graph_model).model_dump())
+    _include = library_agent_include(
+        user_id, include_nodes=False, include_executions=False
+    )
+
+    try:
+        added_agent = await prisma.models.LibraryAgent.prisma().create(
+            data={
+                "User": {"connect": {"id": user_id}},
+                "AgentGraph": {
+                    "connect": {
+                        "graphVersionId": {
+                            "id": graph_model.id,
+                            "version": graph_model.version,
+                        }
+                    }
+                },
+                "isCreatedByUser": False,
+                "useGraphIsActiveVersion": False,
+                "settings": settings_json,
+            },
+            include=_include,
+        )
+    except prisma.errors.UniqueViolationError:
+        # Already exists — update to restore if previously soft-deleted/archived
+        added_agent = await prisma.models.LibraryAgent.prisma().update(
+            where={
+                "userId_agentGraphId_agentGraphVersion": {
+                    "userId": user_id,
+                    "agentGraphId": graph_model.id,
+                    "agentGraphVersion": graph_model.version,
+                }
+            },
+            data={
+                "isDeleted": False,
+                "isArchived": False,
+                "settings": settings_json,
+            },
+            include=_include,
+        )
+        if added_agent is None:
+            raise NotFoundError(
+                f"LibraryAgent for graph #{graph_model.id} "
+                f"v{graph_model.version} not found after UniqueViolationError"
+            )
+
+    logger.debug(
+        f"Added graph #{graph_model.id} v{graph_model.version} "
+        f"for store listing version #{store_listing_version_id} "
+        f"to library for user #{user_id}"
+    )
+    return library_model.LibraryAgent.from_db(added_agent)
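Note on the create-then-catch-then-update pattern in `add_graph_to_library`: the sketch below reproduces the same control flow against an in-memory SQLite table, so it can be run without Prisma. The table, column names, and `add_to_library` helper are illustrative assumptions and not the project's schema; `sqlite3.IntegrityError` plays the role of `UniqueViolationError`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE library_agent (
        user_id TEXT NOT NULL,
        graph_id TEXT NOT NULL,
        graph_version INTEGER NOT NULL,
        is_deleted INTEGER NOT NULL DEFAULT 0,
        is_archived INTEGER NOT NULL DEFAULT 0,
        UNIQUE (user_id, graph_id, graph_version)
    )
    """
)


def add_to_library(user_id: str, graph_id: str, version: int) -> str:
    """Insert a row; on a unique violation, restore the existing row instead."""
    key = (user_id, graph_id, version)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO library_agent (user_id, graph_id, graph_version) "
                "VALUES (?, ?, ?)",
                key,
            )
        return "created"
    except sqlite3.IntegrityError:
        # The row already exists: clear the soft-delete and archive flags.
        with conn:
            conn.execute(
                "UPDATE library_agent SET is_deleted = 0, is_archived = 0 "
                "WHERE user_id = ? AND graph_id = ? AND graph_version = ?",
                key,
            )
        return "restored"


first = add_to_library("user-1", "graph-1", 2)
# Simulate a soft delete, then add the same agent again.
with conn:
    conn.execute("UPDATE library_agent SET is_deleted = 1")
second = add_to_library("user-1", "graph-1", 2)
flags = conn.execute(
    "SELECT is_deleted, is_archived FROM library_agent"
).fetchone()
```

The second call takes the fallback branch and leaves a single row with both flags cleared, which is the behaviour the unit tests in `_add_to_library_test.py` assert on through mocks.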
--- a/autogpt_platform/backend/backend/api/features/library/_add_to_library_test.py
+++ b/autogpt_platform/backend/backend/api/features/library/_add_to_library_test.py
@@ -0,0 +1,80 @@
+from unittest.mock import AsyncMock, MagicMock, patch
+
+import prisma.errors
+import pytest
+
+from ._add_to_library import add_graph_to_library
+
+
+@pytest.mark.asyncio
+async def test_add_graph_to_library_create_new_agent() -> None:
+    """When no matching LibraryAgent exists, create inserts a new one."""
+    graph_model = MagicMock(id="graph-id", version=2, nodes=[])
+    created_agent = MagicMock(name="CreatedLibraryAgent")
+    converted_agent = MagicMock(name="ConvertedLibraryAgent")
+
+    with (
+        patch(
+            "backend.api.features.library._add_to_library.prisma.models.LibraryAgent.prisma"
+        ) as mock_prisma,
+        patch(
+            "backend.api.features.library._add_to_library.library_model.LibraryAgent.from_db",
+            return_value=converted_agent,
+        ) as mock_from_db,
+    ):
+        mock_prisma.return_value.create = AsyncMock(return_value=created_agent)
+
+        result = await add_graph_to_library("slv-id", graph_model, "user-id")
+
+    assert result is converted_agent
+    mock_from_db.assert_called_once_with(created_agent)
+    # Verify create was called with correct data
+    create_call = mock_prisma.return_value.create.call_args
+    create_data = create_call.kwargs["data"]
+    assert create_data["User"] == {"connect": {"id": "user-id"}}
+    assert create_data["AgentGraph"] == {
+        "connect": {"graphVersionId": {"id": "graph-id", "version": 2}}
+    }
+    assert create_data["isCreatedByUser"] is False
+    assert create_data["useGraphIsActiveVersion"] is False
+
+
+@pytest.mark.asyncio
+async def test_add_graph_to_library_unique_violation_updates_existing() -> None:
+    """UniqueViolationError on create falls back to update."""
+    graph_model = MagicMock(id="graph-id", version=2, nodes=[])
+    updated_agent = MagicMock(name="UpdatedLibraryAgent")
+    converted_agent = MagicMock(name="ConvertedLibraryAgent")
+
+    with (
+        patch(
+            "backend.api.features.library._add_to_library.prisma.models.LibraryAgent.prisma"
+        ) as mock_prisma,
+        patch(
+            "backend.api.features.library._add_to_library.library_model.LibraryAgent.from_db",
+            return_value=converted_agent,
+        ) as mock_from_db,
+    ):
+        mock_prisma.return_value.create = AsyncMock(
+            side_effect=prisma.errors.UniqueViolationError(
+                MagicMock(), message="unique constraint"
+            )
+        )
+        mock_prisma.return_value.update = AsyncMock(return_value=updated_agent)
+
+        result = await add_graph_to_library("slv-id", graph_model, "user-id")
+
+    assert result is converted_agent
+    mock_from_db.assert_called_once_with(updated_agent)
+    # Verify update was called with correct where and data
+    update_call = mock_prisma.return_value.update.call_args
+    assert update_call.kwargs["where"] == {
+        "userId_agentGraphId_agentGraphVersion": {
+            "userId": "user-id",
+            "agentGraphId": "graph-id",
+            "agentGraphVersion": 2,
+        }
+    }
+    update_data = update_call.kwargs["data"]
+    assert update_data["isDeleted"] is False
+    assert update_data["isArchived"] is False
--- a/autogpt_platform/backend/backend/api/features/library/db.py
+++ b/autogpt_platform/backend/backend/api/features/library/db.py
--- a/autogpt_platform/backend/backend/api/features/library/db_test.py
+++ b/autogpt_platform/backend/backend/api/features/library/db_test.py
@@ -1,10 +1,11 @@
+from contextlib import asynccontextmanager
 from datetime import datetime
+from unittest.mock import AsyncMock, MagicMock, patch

 import prisma.enums
 import prisma.models
 import pytest

-import backend.api.features.store.exceptions
 from backend.data.db import connect
 from backend.data.includes import library_agent_include

@@ -86,10 +87,6 @@ async def test_get_library_agents(mocker):
 async def test_add_agent_to_library(mocker):
    await connect()

-    # Mock the transaction context
-    mock_transaction = mocker.patch("backend.api.features.library.db.transaction")
-    mock_transaction.return_value.__aenter__ = mocker.AsyncMock(return_value=None)
-    mock_transaction.return_value.__aexit__ = mocker.AsyncMock(return_value=None)
    # Mock data
    mock_store_listing_data = prisma.models.StoreListingVersion(
        id="version123",
@@ -144,14 +141,18 @@ async def test_add_agent_to_library(mocker):
    )

    mock_library_agent = mocker.patch("prisma.models.LibraryAgent.prisma")
-    mock_library_agent.return_value.find_unique = mocker.AsyncMock(return_value=None)
    mock_library_agent.return_value.create = mocker.AsyncMock(
        return_value=mock_library_agent_data
    )

-    # Mock graph_db.get_graph function that's called to check for HITL blocks
-    mock_graph_db = mocker.patch("backend.api.features.library.db.graph_db")
+    # Mock graph_db.get_graph function that's called in resolve_graph_for_library
+    # (lives in _add_to_library.py after refactor, not db.py)
+    mock_graph_db = mocker.patch(
+        "backend.api.features.library._add_to_library.graph_db"
+    )
    mock_graph_model = mocker.Mock()
+    mock_graph_model.id = "agent1"
+    mock_graph_model.version = 1
    mock_graph_model.nodes = (
        []
    )  # Empty list so _has_human_in_the_loop_blocks returns False
@@ -170,38 +171,27 @@ async def test_add_agent_to_library(mocker):
    mock_store_listing_version.return_value.find_unique.assert_called_once_with(
        where={"id": "version123"}, include={"AgentGraph": True}
    )
-    mock_library_agent.return_value.find_unique.assert_called_once_with(
-        where={
-            "userId_agentGraphId_agentGraphVersion": {
-                "userId": "test-user",
-                "agentGraphId": "agent1",
-                "agentGraphVersion": 1,
-            }
-        },
-        include={"AgentGraph": True},
-    )
    # Check that create was called with the expected data including settings
    create_call_args = mock_library_agent.return_value.create.call_args
    assert create_call_args is not None

-    # Verify the main structure
-    expected_data = {
+    # Verify the create data structure
+    create_data = create_call_args.kwargs["data"]
+    expected_create = {
        "User": {"connect": {"id": "test-user"}},
        "AgentGraph": {"connect": {"graphVersionId": {"id": "agent1", "version": 1}}},
        "isCreatedByUser": False,
+        "useGraphIsActiveVersion": False,
    }
-
-    actual_data = create_call_args[1]["data"]
-    # Check that all expected fields are present
-    for key, value in expected_data.items():
-        assert actual_data[key] == value
+    for key, value in expected_create.items():
+        assert create_data[key] == value

    # Check that settings field is present and is a SafeJson object
-    assert "settings" in actual_data
-    assert hasattr(actual_data["settings"], "__class__")  # Should be a SafeJson object
+    assert "settings" in create_data
+    assert hasattr(create_data["settings"], "__class__")  # Should be a SafeJson object

    # Check include parameter
-    assert create_call_args[1]["include"] == library_agent_include(
+    assert create_call_args.kwargs["include"] == library_agent_include(
        "test-user", include_nodes=False, include_executions=False
    )

@@ -218,10 +208,148 @@ async def test_add_agent_to_library_not_found(mocker):
    )

    # Call function and verify exception
-    with pytest.raises(backend.api.features.store.exceptions.AgentNotFoundError):
+    with pytest.raises(db.NotFoundError):
        await db.add_store_agent_to_library("version123", "test-user")

    # Verify mock called correctly
    mock_store_listing_version.return_value.find_unique.assert_called_once_with(
        where={"id": "version123"}, include={"AgentGraph": True}
    )
+
+
+@pytest.mark.asyncio
+async def test_get_library_agent_by_graph_id_excludes_archived(mocker):
+    mock_library_agent = mocker.patch("prisma.models.LibraryAgent.prisma")
+    mock_library_agent.return_value.find_first = mocker.AsyncMock(return_value=None)
+
+    result = await db.get_library_agent_by_graph_id("test-user", "agent1", 7)
+
+    assert result is None
+    mock_library_agent.return_value.find_first.assert_called_once()
+    where = mock_library_agent.return_value.find_first.call_args.kwargs["where"]
+    assert where == {
+        "agentGraphId": "agent1",
+        "userId": "test-user",
+        "isDeleted": False,
+        "isArchived": False,
+        "agentGraphVersion": 7,
+    }
+
+
+@pytest.mark.asyncio
+async def test_get_library_agent_by_graph_id_can_include_archived(mocker):
+    mock_library_agent = mocker.patch("prisma.models.LibraryAgent.prisma")
+    mock_library_agent.return_value.find_first = mocker.AsyncMock(return_value=None)
+
+    result = await db.get_library_agent_by_graph_id(
+        "test-user",
+        "agent1",
+        7,
+        include_archived=True,
+    )
+
+    assert result is None
+    mock_library_agent.return_value.find_first.assert_called_once()
+    where = mock_library_agent.return_value.find_first.call_args.kwargs["where"]
+    assert where == {
+        "agentGraphId": "agent1",
+        "userId": "test-user",
+        "isDeleted": False,
+        "agentGraphVersion": 7,
+    }
+
+
+@pytest.mark.asyncio
+async def test_update_graph_in_library_allows_archived_library_agent(mocker):
+    graph = mocker.Mock(id="graph-id")
+    existing_version = mocker.Mock(version=1, is_active=True)
+    graph_model = mocker.Mock()
+    created_graph = mocker.Mock(id="graph-id", version=2, is_active=False)
+    current_library_agent = mocker.Mock()
+    updated_library_agent = mocker.Mock()
+
+    mocker.patch(
+        "backend.api.features.library.db.graph_db.get_graph_all_versions",
+        new=mocker.AsyncMock(return_value=[existing_version]),
+    )
+    mocker.patch(
+        "backend.api.features.library.db.graph_db.make_graph_model",
+        return_value=graph_model,
+    )
+    mocker.patch(
+        "backend.api.features.library.db.graph_db.create_graph",
+        new=mocker.AsyncMock(return_value=created_graph),
+    )
+    mock_get_library_agent = mocker.patch(
+        "backend.api.features.library.db.get_library_agent_by_graph_id",
+        new=mocker.AsyncMock(return_value=current_library_agent),
+    )
+    mock_update_library_agent = mocker.patch(
+        "backend.api.features.library.db.update_library_agent_version_and_settings",
+        new=mocker.AsyncMock(return_value=updated_library_agent),
+    )
+
+    result_graph, result_library_agent = await db.update_graph_in_library(
+        graph,
+        "test-user",
+    )
+
+    assert result_graph is created_graph
+    assert result_library_agent is updated_library_agent
+    assert graph.version == 2
+    graph_model.reassign_ids.assert_called_once_with(
+        user_id="test-user", reassign_graph_id=False
+    )
+    mock_get_library_agent.assert_awaited_once_with(
+        "test-user",
+        "graph-id",
+        include_archived=True,
+    )
+    mock_update_library_agent.assert_awaited_once_with("test-user", created_graph)
+
+
+@pytest.mark.asyncio
+async def test_create_library_agent_uses_upsert():
+    """create_library_agent should use upsert (not create) to handle duplicates."""
+    mock_graph = MagicMock()
+    mock_graph.id = "graph-1"
+    mock_graph.version = 1
+    mock_graph.user_id = "user-1"
+    mock_graph.nodes = []
+    mock_graph.sub_graphs = []
+
+    mock_upserted = MagicMock(name="UpsertedLibraryAgent")
+
+    @asynccontextmanager
+    async def fake_tx():
+        yield None
+
+    with (
+        patch("backend.api.features.library.db.transaction", fake_tx),
+        patch("prisma.models.LibraryAgent.prisma") as mock_prisma,
+        patch(
+            "backend.api.features.library.db.add_generated_agent_image",
+            new=AsyncMock(),
+        ),
+        patch(
+            "backend.api.features.library.model.LibraryAgent.from_db",
+            return_value=MagicMock(),
+        ),
+    ):
+        mock_prisma.return_value.upsert = AsyncMock(return_value=mock_upserted)
+
+        result = await db.create_library_agent(mock_graph, "user-1")
+
+    assert len(result) == 1
+    upsert_call = mock_prisma.return_value.upsert.call_args
+    assert upsert_call is not None
+    # Verify the upsert where clause uses the composite unique key
+    where = upsert_call.kwargs["where"]
+    assert "userId_agentGraphId_agentGraphVersion" in where
+    # Verify the upsert data has both create and update branches
+    data = upsert_call.kwargs["data"]
+    assert "create" in data
+    assert "update" in data
+    # Verify update branch restores soft-deleted/archived agents
+    assert data["update"]["isDeleted"] is False
+    assert data["update"]["isArchived"] is False
--- a/autogpt_platform/backend/backend/api/features/library/exceptions.py
+++ b/autogpt_platform/backend/backend/api/features/library/exceptions.py
@@ -0,0 +1,10 @@
+class FolderValidationError(Exception):
+    """Raised when folder operations fail validation."""
+
+    pass
+
+
+class FolderAlreadyExistsError(FolderValidationError):
+    """Raised when a folder with the same name already exists in the location."""
+
+    pass
--- a/autogpt_platform/backend/backend/api/features/library/model.py
+++ b/autogpt_platform/backend/backend/api/features/library/model.py
@@ -26,6 +26,95 @@ class LibraryAgentStatus(str, Enum):
    ERROR = "ERROR"


+# === Folder Models ===
+
+
+class LibraryFolder(pydantic.BaseModel):
+    """Represents a folder for organizing library agents."""
+
+    id: str
+    user_id: str
+    name: str
+    icon: str | None = None
+    color: str | None = None
+    parent_id: str | None = None
+    created_at: datetime.datetime
+    updated_at: datetime.datetime
+    agent_count: int = 0  # Direct agents in folder
+    subfolder_count: int = 0  # Direct child folders
+
+    @staticmethod
+    def from_db(
+        folder: prisma.models.LibraryFolder,
+        agent_count: int = 0,
+        subfolder_count: int = 0,
+    ) -> "LibraryFolder":
+        """Factory method that constructs a LibraryFolder from a Prisma model."""
+        return LibraryFolder(
+            id=folder.id,
+            user_id=folder.userId,
+            name=folder.name,
+            icon=folder.icon,
+            color=folder.color,
+            parent_id=folder.parentId,
+            created_at=folder.createdAt,
+            updated_at=folder.updatedAt,
+            agent_count=agent_count,
+            subfolder_count=subfolder_count,
+        )
+
+
+class LibraryFolderTree(LibraryFolder):
+    """Folder with nested children for tree view."""
+
+    children: list["LibraryFolderTree"] = []
+
+
+class FolderCreateRequest(pydantic.BaseModel):
+    """Request model for creating a folder."""
+
+    name: str = pydantic.Field(..., min_length=1, max_length=100)
+    icon: str | None = None
+    color: str | None = pydantic.Field(
+        None, pattern=r"^#[0-9A-Fa-f]{6}$", description="Hex color code (#RRGGBB)"
+    )
+    parent_id: str | None = None
+
+
+class FolderUpdateRequest(pydantic.BaseModel):
+    """Request model for updating a folder."""
+
+    name: str | None = pydantic.Field(None, min_length=1, max_length=100)
+    icon: str | None = None
+    color: str | None = None
+
+
+class FolderMoveRequest(pydantic.BaseModel):
+    """Request model for moving a folder to a new parent."""
+
+    target_parent_id: str | None = None  # None = move to root
+
+
+class BulkMoveAgentsRequest(pydantic.BaseModel):
+    """Request model for moving multiple agents to a folder."""
+
+    agent_ids: list[str]
+    folder_id: str | None = None  # None = move to root
+
+
+class FolderListResponse(pydantic.BaseModel):
+    """Response schema for a list of folders."""
+
+    folders: list[LibraryFolder]
+    pagination: Pagination
+
+
+class FolderTreeResponse(pydantic.BaseModel):
+    """Response schema for folder tree structure."""
+
+    tree: list[LibraryFolderTree]
+
+
 class MarketplaceListingCreator(pydantic.BaseModel):
    """Creator information for a marketplace listing."""

@@ -76,7 +165,6 @@ class LibraryAgent(pydantic.BaseModel):
    id: str
    graph_id: str
    graph_version: int
-    owner_user_id: str

    image_url: str | None

@@ -117,9 +205,14 @@ class LibraryAgent(pydantic.BaseModel):
        default_factory=list,
        description="List of recent executions with status, score, and summary",
    )
-    can_access_graph: bool
+    can_access_graph: bool = pydantic.Field(
+        description="Indicates whether the same user owns the corresponding graph"
+    )
    is_latest_version: bool
    is_favorite: bool
+    folder_id: str | None = None
+    folder_name: str | None = None  # Denormalized for display
+
    recommended_schedule_cron: str | None = None
    settings: GraphSettings = pydantic.Field(default_factory=GraphSettings)
    marketplace_listing: Optional["MarketplaceListing"] = None
@@ -232,7 +325,6 @@ class LibraryAgent(pydantic.BaseModel):
            id=agent.id,
            graph_id=agent.agentGraphId,
            graph_version=agent.agentGraphVersion,
-            owner_user_id=agent.userId,
            image_url=agent.imageUrl,
            creator_name=creator_name,
            creator_image_url=creator_image_url,
@@ -259,6 +351,8 @@ class LibraryAgent(pydantic.BaseModel):
            can_access_graph=can_access_graph,
            is_latest_version=is_latest_version,
            is_favorite=agent.isFavorite,
+            folder_id=agent.folderId,
+            folder_name=agent.Folder.name if agent.Folder else None,
            recommended_schedule_cron=agent.AgentGraph.recommendedScheduleCron,
            settings=_parse_settings(agent.settings),
            marketplace_listing=marketplace_listing_data,
@@ -470,3 +564,7 @@ class LibraryAgentUpdateRequest(pydantic.BaseModel):
    settings: Optional[GraphSettings] = pydantic.Field(
        default=None, description="User-specific settings for this library agent"
    )
+    folder_id: Optional[str] = pydantic.Field(
+        default=None,
+        description="Folder ID to move agent to (None to move to root)",
+    )
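Note on `LibraryFolderTree`: the model only declares the nested `children` field, and the code that fills it lives in `db.py`, which is not shown in this diff. As a rough sketch of how a flat folder list with `parent_id` links could be turned into such a tree, assuming nothing about the real `get_folder_tree` implementation, the following uses a simplified, hypothetical `FolderNode` dataclass instead of the pydantic model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FolderNode:
    """Hypothetical, simplified stand-in for LibraryFolderTree."""

    id: str
    name: str
    parent_id: str | None = None
    children: list[FolderNode] = field(default_factory=list)


def build_tree(folders: list[FolderNode]) -> list[FolderNode]:
    """Link each folder to its parent and return the root-level folders."""
    by_id = {f.id: f for f in folders}
    roots: list[FolderNode] = []
    for folder in folders:
        parent = by_id.get(folder.parent_id) if folder.parent_id else None
        if parent is None:
            # No parent, or the parent is missing from the input: treat as root.
            roots.append(folder)
        else:
            parent.children.append(folder)
    return roots


flat = [
    FolderNode(id="a", name="Work"),
    FolderNode(id="b", name="Reports", parent_id="a"),
    FolderNode(id="c", name="Personal"),
]
tree = build_tree(flat)
```

A single pass over an id-indexed dictionary keeps the construction linear in the number of folders and preserves the input order among siblings.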
--- a/autogpt_platform/backend/backend/api/features/library/routes/__init__.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes/__init__.py
@@ -1,9 +1,11 @@
 import fastapi

 from .agents import router as agents_router
+from .folders import router as folders_router
 from .presets import router as presets_router

 router = fastapi.APIRouter()

 router.include_router(presets_router)
+router.include_router(folders_router)
 router.include_router(agents_router)
--- a/autogpt_platform/backend/backend/api/features/library/routes/agents.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes/agents.py
@@ -41,6 +41,14 @@ async def list_library_agents(
        ge=1,
        description="Number of agents per page (must be >= 1)",
    ),
+    folder_id: Optional[str] = Query(
+        None,
+        description="Filter by folder ID",
+    ),
+    include_root_only: bool = Query(
+        False,
+        description="Only return agents without a folder (root-level agents)",
+    ),
 ) -> library_model.LibraryAgentResponse:
    """
    Get all agents in the user's library (both created and saved).
@@ -51,6 +59,8 @@ async def list_library_agents(
        sort_by=sort_by,
        page=page,
        page_size=page_size,
+        folder_id=folder_id,
+        include_root_only=include_root_only,
    )


@@ -168,6 +178,7 @@ async def update_library_agent(
        is_favorite=payload.is_favorite,
        is_archived=payload.is_archived,
        settings=payload.settings,
+        folder_id=payload.folder_id,
    )


--- a/autogpt_platform/backend/backend/api/features/library/routes/folders.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes/folders.py
@@ -0,0 +1,287 @@
+from typing import Optional
+
+import autogpt_libs.auth as autogpt_auth_lib
+from fastapi import APIRouter, Query, Security, status
+from fastapi.responses import Response
+
+from .. import db as library_db
+from .. import model as library_model
+
+router = APIRouter(
+    prefix="/folders",
+    tags=["library", "folders", "private"],
+    dependencies=[Security(autogpt_auth_lib.requires_user)],
+)
+
+
+@router.get(
+    "",
+    summary="List Library Folders",
+    response_model=library_model.FolderListResponse,
+    responses={
+        200: {"description": "List of folders"},
+        500: {"description": "Server error"},
+    },
+)
+async def list_folders(
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+    parent_id: Optional[str] = Query(
+        None,
+        description="Filter by parent folder ID. If not provided, returns root-level folders.",
+    ),
+    include_relations: bool = Query(
+        True,
+        description="Include agent and subfolder relations (for counts)",
+    ),
+) -> library_model.FolderListResponse:
+    """
+    List folders for the authenticated user.
+
+    Args:
+        user_id: ID of the authenticated user.
+        parent_id: Optional parent folder ID to filter by.
+        include_relations: Whether to include agent and subfolder relations for counts.
+
+    Returns:
+        A FolderListResponse containing folders.
+    """
+    folders = await library_db.list_folders(
+        user_id=user_id,
+        parent_id=parent_id,
+        include_relations=include_relations,
+    )
+    return library_model.FolderListResponse(
+        folders=folders,
+        pagination=library_model.Pagination(
+            total_items=len(folders),
+            total_pages=1,
+            current_page=1,
+            page_size=len(folders),
+        ),
+    )
+
+
+@router.get(
+    "/tree",
+    summary="Get Folder Tree",
+    response_model=library_model.FolderTreeResponse,
+    responses={
+        200: {"description": "Folder tree structure"},
+        500: {"description": "Server error"},
+    },
+)
+async def get_folder_tree(
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> library_model.FolderTreeResponse:
+    """
+    Get the full folder tree for the authenticated user.
+
+    Args:
+        user_id: ID of the authenticated user.
+
+    Returns:
+        A FolderTreeResponse containing the nested folder structure.
+    """
+    tree = await library_db.get_folder_tree(user_id=user_id)
+    return library_model.FolderTreeResponse(tree=tree)
+
+
+@router.get(
+    "/{folder_id}",
+    summary="Get Folder",
+    response_model=library_model.LibraryFolder,
+    responses={
+        200: {"description": "Folder details"},
+        404: {"description": "Folder not found"},
+        500: {"description": "Server error"},
+    },
+)
+async def get_folder(
+    folder_id: str,
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> library_model.LibraryFolder:
+    """
+    Get a specific folder.
+
+    Args:
+        folder_id: ID of the folder to retrieve.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        The requested LibraryFolder.
+    """
+    return await library_db.get_folder(folder_id=folder_id, user_id=user_id)
+
+
+@router.post(
+    "",
+    summary="Create Folder",
+    status_code=status.HTTP_201_CREATED,
+    response_model=library_model.LibraryFolder,
+    responses={
+        201: {"description": "Folder created successfully"},
+        400: {"description": "Validation error"},
+        404: {"description": "Parent folder not found"},
+        409: {"description": "Folder name conflict"},
+        500: {"description": "Server error"},
+    },
+)
+async def create_folder(
+    payload: library_model.FolderCreateRequest,
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> library_model.LibraryFolder:
+    """
+    Create a new folder.
+
+    Args:
+        payload: The folder creation request.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        The created LibraryFolder.
+    """
+    return await library_db.create_folder(
+        user_id=user_id,
+        name=payload.name,
+        parent_id=payload.parent_id,
+        icon=payload.icon,
+        color=payload.color,
+    )
+
+
+@router.patch(
+    "/{folder_id}",
+    summary="Update Folder",
+    response_model=library_model.LibraryFolder,
+    responses={
+        200: {"description": "Folder updated successfully"},
+        400: {"description": "Validation error"},
+        404: {"description": "Folder not found"},
+        409: {"description": "Folder name conflict"},
+        500: {"description": "Server error"},
+    },
+)
+async def update_folder(
+    folder_id: str,
+    payload: library_model.FolderUpdateRequest,
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> library_model.LibraryFolder:
+    """
+    Update a folder's properties.
+
+    Args:
+        folder_id: ID of the folder to update.
+        payload: The folder update request.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        The updated LibraryFolder.
+    """
+    return await library_db.update_folder(
+        folder_id=folder_id,
+        user_id=user_id,
+        name=payload.name,
+        icon=payload.icon,
+        color=payload.color,
+    )
+
+
+@router.post(
+    "/{folder_id}/move",
+    summary="Move Folder",
+    response_model=library_model.LibraryFolder,
+    responses={
+        200: {"description": "Folder moved successfully"},
+        400: {"description": "Validation error (circular reference)"},
+        404: {"description": "Folder or target parent not found"},
+        409: {"description": "Folder name conflict in target location"},
+        500: {"description": "Server error"},
+    },
+)
+async def move_folder(
+    folder_id: str,
+    payload: library_model.FolderMoveRequest,
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> library_model.LibraryFolder:
+    """
+    Move a folder to a new parent.
+
+    Args:
+        folder_id: ID of the folder to move.
+        payload: The move request with target parent.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        The moved LibraryFolder.
+    """
+    return await library_db.move_folder(
+        folder_id=folder_id,
+        user_id=user_id,
+        target_parent_id=payload.target_parent_id,
+    )
+
+
+@router.delete(
+    "/{folder_id}",
+    summary="Delete Folder",
+    status_code=status.HTTP_204_NO_CONTENT,
+    responses={
+        204: {"description": "Folder deleted successfully"},
+        404: {"description": "Folder not found"},
+        500: {"description": "Server error"},
+    },
+)
+async def delete_folder(
+    folder_id: str,
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> Response:
+    """
+    Soft-delete a folder and all its contents.
+
+    Args:
+        folder_id: ID of the folder to delete.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        204 No Content if successful.
+    """
+    await library_db.delete_folder(
+        folder_id=folder_id,
+        user_id=user_id,
+        soft_delete=True,
+    )
+    return Response(status_code=status.HTTP_204_NO_CONTENT)
+
+
+# === Bulk Agent Operations ===
+
+
+@router.post(
+    "/agents/bulk-move",
+    summary="Bulk Move Agents",
+    response_model=list[library_model.LibraryAgent],
+    responses={
+        200: {"description": "Agents moved successfully"},
+        404: {"description": "Folder not found"},
+        500: {"description": "Server error"},
+    },
+)
+async def bulk_move_agents(
+    payload: library_model.BulkMoveAgentsRequest,
+    user_id: str = Security(autogpt_auth_lib.get_user_id),
+) -> list[library_model.LibraryAgent]:
+    """
+    Move multiple agents to a folder.
+
+    Args:
+        payload: The bulk move request with agent IDs and target folder.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        The updated LibraryAgents.
+    """
+    return await library_db.bulk_move_agents_to_folder(
+        agent_ids=payload.agent_ids,
+        folder_id=payload.folder_id,
+        user_id=user_id,
+    )
--- a/autogpt_platform/backend/backend/api/features/library/routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes_test.py
@@ -42,7 +42,6 @@ async def test_get_library_agents_success(
                id="test-agent-1",
                graph_id="test-agent-1",
                graph_version=1,
-                owner_user_id=test_user_id,
                name="Test Agent 1",
                description="Test Description 1",
                image_url=None,
@@ -67,7 +66,6 @@ async def test_get_library_agents_success(
                id="test-agent-2",
                graph_id="test-agent-2",
                graph_version=1,
-                owner_user_id=test_user_id,
                name="Test Agent 2",
                description="Test Description 2",
                image_url=None,
@@ -115,6 +113,8 @@ async def test_get_library_agents_success(
        sort_by=library_model.LibraryAgentSort.UPDATED_AT,
        page=1,
        page_size=15,
+        folder_id=None,
+        include_root_only=False,
    )


@@ -129,7 +129,6 @@ async def test_get_favorite_library_agents_success(
                id="test-agent-1",
                graph_id="test-agent-1",
                graph_version=1,
-                owner_user_id=test_user_id,
                name="Favorite Agent 1",
                description="Test Favorite Description 1",
                image_url=None,
@@ -182,7 +181,6 @@ def test_add_agent_to_library_success(
        id="test-library-agent-id",
        graph_id="test-agent-1",
        graph_version=1,
-        owner_user_id=test_user_id,
        name="Test Agent 1",
        description="Test Description 1",
        image_url=None,
--- a/autogpt_platform/backend/backend/api/features/mcp/routes.py
+++ b/autogpt_platform/backend/backend/api/features/mcp/routes.py
@@ -7,20 +7,24 @@ frontend can list available tools on an MCP server before placing a block.

 import logging
 from typing import Annotated, Any
-from urllib.parse import urlparse

 import fastapi
 from autogpt_libs.auth import get_user_id
 from fastapi import Security
-from pydantic import BaseModel, Field
+from pydantic import BaseModel, Field, SecretStr

 from backend.api.features.integrations.router import CredentialsMetaResponse
 from backend.blocks.mcp.client import MCPClient, MCPClientError
+from backend.blocks.mcp.helpers import (
+    auto_lookup_mcp_credential,
+    normalize_mcp_url,
+    server_host,
+)
 from backend.blocks.mcp.oauth import MCPOAuthHandler
 from backend.data.model import OAuth2Credentials
 from backend.integrations.creds_manager import IntegrationCredentialsManager
 from backend.integrations.providers import ProviderName
-from backend.util.request import HTTPClientError, Requests
+from backend.util.request import HTTPClientError, Requests, validate_url_host
 from backend.util.settings import Settings

 logger = logging.getLogger(__name__)
@@ -74,32 +78,20 @@ async def discover_tools(
    If the user has a stored MCP credential for this server URL, it will be
    used automatically — no need to pass an explicit auth token.
    """
+    # Validate URL to prevent SSRF — blocks loopback and private IP ranges.
+    try:
+        await validate_url_host(request.server_url)
+    except ValueError as e:
+        raise fastapi.HTTPException(status_code=400, detail=f"Invalid server URL: {e}")
+
    auth_token = request.auth_token

    # Auto-use stored MCP credential when no explicit token is provided.
    if not auth_token:
-        mcp_creds = await creds_manager.store.get_creds_by_provider(
-            user_id, ProviderName.MCP.value
+        best_cred = await auto_lookup_mcp_credential(
+            user_id, normalize_mcp_url(request.server_url)
        )
-        # Find the freshest credential for this server URL
-        best_cred: OAuth2Credentials | None = None
-        for cred in mcp_creds:
-            if (
-                isinstance(cred, OAuth2Credentials)
-                and (cred.metadata or {}).get("mcp_server_url") == request.server_url
-            ):
-                if best_cred is None or (
-                    (cred.access_token_expires_at or 0)
-                    > (best_cred.access_token_expires_at or 0)
-                ):
-                    best_cred = cred
        if best_cred:
-            # Refresh the token if expired before using it
-            best_cred = await creds_manager.refresh_if_needed(user_id, best_cred)
-            logger.info(
-                f"Using MCP credential {best_cred.id} for {request.server_url}, "
-                f"expires_at={best_cred.access_token_expires_at}"
-            )
            auth_token = best_cred.access_token.get_secret_value()

    client = MCPClient(request.server_url, auth_token=auth_token)
@@ -134,7 +126,7 @@ async def discover_tools(
        ],
        server_name=(
            init_result.get("serverInfo", {}).get("name")
-            or urlparse(request.server_url).hostname
+            or server_host(request.server_url)
            or "MCP"
        ),
        protocol_version=init_result.get("protocolVersion"),
@@ -173,7 +165,16 @@ async def mcp_oauth_login(
    3. Performs Dynamic Client Registration (RFC 7591) if available
    4. Returns the authorization URL for the frontend to open in a popup
    """
-    client = MCPClient(request.server_url)
+    # Validate URL to prevent SSRF — blocks loopback and private IP ranges.
+    try:
+        await validate_url_host(request.server_url)
+    except ValueError as e:
+        raise fastapi.HTTPException(status_code=400, detail=f"Invalid server URL: {e}")
+
+    # Normalize the URL so that credentials stored here are matched consistently
+    # by auto_lookup_mcp_credential (which also uses normalized URLs).
+    server_url = normalize_mcp_url(request.server_url)
+    client = MCPClient(server_url)

    # Step 1: Discover protected-resource metadata (RFC 9728)
    protected_resource = await client.discover_auth()
@@ -182,7 +183,16 @@ async def mcp_oauth_login(

    if protected_resource and protected_resource.get("authorization_servers"):
        auth_server_url = protected_resource["authorization_servers"][0]
-        resource_url = protected_resource.get("resource", request.server_url)
+        resource_url = protected_resource.get("resource", server_url)
+
+        # Validate the auth server URL from metadata to prevent SSRF.
+        try:
+            await validate_url_host(auth_server_url)
+        except ValueError as e:
+            raise fastapi.HTTPException(
+                status_code=400,
+                detail=f"Invalid authorization server URL in metadata: {e}",
+            )

        # Step 2a: Discover auth-server metadata (RFC 8414)
        metadata = await client.discover_auth_server_metadata(auth_server_url)
@@ -192,7 +202,7 @@ async def mcp_oauth_login(
        # Don't assume a resource_url — omitting it lets the auth server choose
        # the correct audience for the token (RFC 8707 resource is optional).
        resource_url = None
-        metadata = await client.discover_auth_server_metadata(request.server_url)
+        metadata = await client.discover_auth_server_metadata(server_url)

    if (
        not metadata
@@ -222,12 +232,18 @@ async def mcp_oauth_login(
    client_id = ""
    client_secret = ""
    if registration_endpoint:
-        reg_result = await _register_mcp_client(
-            registration_endpoint, redirect_uri, request.server_url
-        )
-        if reg_result:
-            client_id = reg_result.get("client_id", "")
-            client_secret = reg_result.get("client_secret", "")
+        # Validate the registration endpoint to prevent SSRF via metadata.
+        try:
+            await validate_url_host(registration_endpoint)
+        except ValueError:
+            pass  # Skip registration, fall back to default client_id
+        else:
+            reg_result = await _register_mcp_client(
+                registration_endpoint, redirect_uri, server_url
+            )
+            if reg_result:
+                client_id = reg_result.get("client_id", "")
+                client_secret = reg_result.get("client_secret", "")

    if not client_id:
        client_id = "autogpt-platform"
@@ -245,7 +261,7 @@ async def mcp_oauth_login(
            "token_url": token_url,
            "revoke_url": revoke_url,
            "resource_url": resource_url,
-            "server_url": request.server_url,
+            "server_url": server_url,
            "client_id": client_id,
            "client_secret": client_secret,
        },
@@ -342,7 +358,7 @@ async def mcp_oauth_callback(
    credentials.metadata["mcp_token_url"] = meta["token_url"]
    credentials.metadata["mcp_resource_url"] = meta.get("resource_url", "")

-    hostname = urlparse(meta["server_url"]).hostname or meta["server_url"]
+    hostname = server_host(meta["server_url"])
    credentials.title = f"MCP: {hostname}"

    # Remove old MCP credentials for the same server to prevent stale token buildup.
@@ -357,7 +373,9 @@ async def mcp_oauth_callback(
            ):
                await creds_manager.store.delete_creds_by_id(user_id, old.id)
                logger.info(
-                    f"Removed old MCP credential {old.id} for {meta['server_url']}"
+                    "Removed old MCP credential %s for %s",
+                    old.id,
+                    server_host(meta["server_url"]),
                )
    except Exception:
        logger.debug("Could not clean up old MCP credentials", exc_info=True)
@@ -375,6 +393,93 @@ async def mcp_oauth_callback(
    )


+# ======================== Bearer Token ======================== #
+
+
+class MCPStoreTokenRequest(BaseModel):
+    """Request to store a bearer token for an MCP server that doesn't support OAuth."""
+
+    server_url: str = Field(
+        description="MCP server URL the token authenticates against"
+    )
+    token: SecretStr = Field(
+        min_length=1, description="Bearer token / API key for the MCP server"
+    )
+
+
+@router.post(
+    "/token",
+    summary="Store a bearer token for an MCP server",
+)
+async def mcp_store_token(
+    request: MCPStoreTokenRequest,
+    user_id: Annotated[str, Security(get_user_id)],
+) -> CredentialsMetaResponse:
+    """
+    Store a manually provided bearer token as an MCP credential.
+
+    Used by the Copilot MCPSetupCard when the server doesn't support the MCP
+    OAuth discovery flow (returns 400 from /oauth/login).  Subsequent
+    ``run_mcp_tool`` calls will automatically pick up the token via
+    ``auto_lookup_mcp_credential``.
+    """
+    token = request.token.get_secret_value().strip()
+    if not token:
+        raise fastapi.HTTPException(status_code=422, detail="Token must not be blank.")
+
+    # Validate URL to prevent SSRF — blocks loopback and private IP ranges.
+    try:
+        await validate_url_host(request.server_url)
+    except ValueError as e:
+        raise fastapi.HTTPException(status_code=400, detail=f"Invalid server URL: {e}")
+
+    # Normalize URL so trailing-slash variants match existing credentials.
+    server_url = normalize_mcp_url(request.server_url)
+    hostname = server_host(server_url)
+
+    # Collect IDs of old credentials to clean up after successful create.
+    old_cred_ids: list[str] = []
+    try:
+        old_creds = await creds_manager.store.get_creds_by_provider(
+            user_id, ProviderName.MCP.value
+        )
+        old_cred_ids = [
+            old.id
+            for old in old_creds
+            if isinstance(old, OAuth2Credentials)
+            and normalize_mcp_url((old.metadata or {}).get("mcp_server_url", ""))
+            == server_url
+        ]
+    except Exception:
+        logger.debug("Could not query old MCP token credentials", exc_info=True)
+
+    credentials = OAuth2Credentials(
+        provider=ProviderName.MCP.value,
+        title=f"MCP: {hostname}",
+        access_token=SecretStr(token),
+        scopes=[],
+        metadata={"mcp_server_url": server_url},
+    )
+    await creds_manager.create(user_id, credentials)
+
+    # Only delete old credentials after the new one is safely stored.
+    for old_id in old_cred_ids:
+        try:
+            await creds_manager.store.delete_creds_by_id(user_id, old_id)
+        except Exception:
+            logger.debug("Could not clean up old MCP token credential", exc_info=True)
+
+    return CredentialsMetaResponse(
+        id=credentials.id,
+        provider=credentials.provider,
+        type=credentials.type,
+        title=credentials.title,
+        scopes=credentials.scopes,
+        username=credentials.username,
+        host=hostname,
+    )
+
+
 # ======================== Helpers ======================== #


@@ -400,5 +505,7 @@ async def _register_mcp_client(
            return data
        return None
    except Exception as e:
-        logger.warning(f"Dynamic client registration failed for {server_url}: {e}")
+        logger.warning(
+            "Dynamic client registration failed for %s: %s", server_host(server_url), e
+        )
        return None
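The route changes above rest on two ideas: server URLs are normalized before they are compared, and when several stored credentials match, the freshest one wins (as the removed inline loop did). A minimal sketch of that matching logic follows. `StoredCred`, `normalize_url`, and `pick_freshest` are hypothetical stand-ins, not the real `OAuth2Credentials`, `normalize_mcp_url`, or `auto_lookup_mcp_credential`. The assumption that normalization only trims whitespace and trailing slashes is inferred from the "trailing-slash variants" comment in this diff.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StoredCred:
    """Hypothetical stand-in for OAuth2Credentials."""

    id: str
    access_token_expires_at: int | None = None
    metadata: dict[str, str] = field(default_factory=dict)


def normalize_url(url: str) -> str:
    # Assumption: normalization trims whitespace and trailing slashes only.
    return url.strip().rstrip("/")


def pick_freshest(creds: list[StoredCred], server_url: str) -> StoredCred | None:
    """Return the matching credential with the latest expiry, if any."""
    target = normalize_url(server_url)
    matches = [
        c
        for c in creds
        if normalize_url(c.metadata.get("mcp_server_url", "")) == target
    ]
    # Credentials without an expiry sort as 0, mirroring the removed loop.
    return max(matches, key=lambda c: c.access_token_expires_at or 0, default=None)
```

Normalizing both the stored value and the lookup key means a credential saved for `https://host/mcp/` is still found when the request uses `https://host/mcp`.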
--- a/autogpt_platform/backend/backend/api/features/mcp/test_routes.py
+++ b/autogpt_platform/backend/backend/api/features/mcp/test_routes.py
@@ -11,9 +11,11 @@ import httpx
 import pytest
 import pytest_asyncio
 from autogpt_libs.auth import get_user_id
+from pydantic import SecretStr

 from backend.api.features.mcp.routes import router
 from backend.blocks.mcp.client import MCPClientError, MCPTool
+from backend.data.model import OAuth2Credentials
 from backend.util.request import HTTPClientError

 app = fastapi.FastAPI()
@@ -28,6 +30,16 @@ async def client():
        yield c


+@pytest.fixture(autouse=True)
+def _bypass_ssrf_validation():
+    """Bypass validate_url_host in all route tests (test URLs don't resolve)."""
+    with patch(
+        "backend.api.features.mcp.routes.validate_url_host",
+        new_callable=AsyncMock,
+    ):
+        yield
+
+
 class TestDiscoverTools:
    @pytest.mark.asyncio(loop_scope="session")
    async def test_discover_tools_success(self, client):
@@ -56,9 +68,12 @@ class TestDiscoverTools:

        with (
            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
+            patch(
+                "backend.api.features.mcp.routes.auto_lookup_mcp_credential",
+                new_callable=AsyncMock,
+                return_value=None,
+            ),
        ):
-            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
            instance = MockClient.return_value
            instance.initialize = AsyncMock(
                return_value={
@@ -107,10 +122,6 @@ class TestDiscoverTools:
    @pytest.mark.asyncio(loop_scope="session")
    async def test_discover_tools_auto_uses_stored_credential(self, client):
        """When no explicit token is given, stored MCP credentials are used."""
-        from pydantic import SecretStr
-
-        from backend.data.model import OAuth2Credentials
-
        stored_cred = OAuth2Credentials(
            provider="mcp",
            title="MCP: example.com",
@@ -124,10 +135,12 @@ class TestDiscoverTools:

        with (
            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
+            patch(
+                "backend.api.features.mcp.routes.auto_lookup_mcp_credential",
+                new_callable=AsyncMock,
+                return_value=stored_cred,
+            ),
        ):
-            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[stored_cred])
-            mock_cm.refresh_if_needed = AsyncMock(return_value=stored_cred)
            instance = MockClient.return_value
            instance.initialize = AsyncMock(
                return_value={"serverInfo": {}, "protocolVersion": "2025-03-26"}
@@ -149,9 +162,12 @@ class TestDiscoverTools:
    async def test_discover_tools_mcp_error(self, client):
        with (
            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
+            patch(
+                "backend.api.features.mcp.routes.auto_lookup_mcp_credential",
+                new_callable=AsyncMock,
+                return_value=None,
+            ),
        ):
-            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
            instance = MockClient.return_value
            instance.initialize = AsyncMock(
                side_effect=MCPClientError("Connection refused")
@@ -169,9 +185,12 @@ class TestDiscoverTools:
    async def test_discover_tools_generic_error(self, client):
        with (
            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
+            patch(
+                "backend.api.features.mcp.routes.auto_lookup_mcp_credential",
+                new_callable=AsyncMock,
+                return_value=None,
+            ),
        ):
-            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
            instance = MockClient.return_value
            instance.initialize = AsyncMock(side_effect=Exception("Network timeout"))

@@ -187,9 +206,12 @@ class TestDiscoverTools:
    async def test_discover_tools_auth_required(self, client):
        with (
            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
+            patch(
+                "backend.api.features.mcp.routes.auto_lookup_mcp_credential",
+                new_callable=AsyncMock,
+                return_value=None,
+            ),
        ):
-            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
            instance = MockClient.return_value
            instance.initialize = AsyncMock(
                side_effect=HTTPClientError("HTTP 401 Error: Unauthorized", 401)
@@ -207,9 +229,12 @@ class TestDiscoverTools:
    async def test_discover_tools_forbidden(self, client):
        with (
            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
+            patch(
+                "backend.api.features.mcp.routes.auto_lookup_mcp_credential",
+                new_callable=AsyncMock,
+                return_value=None,
+            ),
        ):
-            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
            instance = MockClient.return_value
            instance.initialize = AsyncMock(
                side_effect=HTTPClientError("HTTP 403 Error: Forbidden", 403)
@@ -331,10 +356,6 @@ class TestOAuthLogin:
 class TestOAuthCallback:
    @pytest.mark.asyncio(loop_scope="session")
    async def test_oauth_callback_success(self, client):
-        from pydantic import SecretStr
-
-        from backend.data.model import OAuth2Credentials
-
        mock_creds = OAuth2Credentials(
            provider="mcp",
            title=None,
@@ -434,3 +455,118 @@ class TestOAuthCallback:

        assert response.status_code == 400
        assert "token exchange failed" in response.json()["detail"].lower()
+
+
+class TestStoreToken:
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_store_token_success(self, client):
+        with patch("backend.api.features.mcp.routes.creds_manager") as mock_cm:
+            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
+            mock_cm.create = AsyncMock()
+
+            response = await client.post(
+                "/token",
+                json={
+                    "server_url": "https://mcp.example.com/mcp",
+                    "token": "my-api-key-123",
+                },
+            )
+
+        assert response.status_code == 200
+        data = response.json()
+        assert data["provider"] == "mcp"
+        assert data["type"] == "oauth2"
+        assert data["host"] == "mcp.example.com"
+        mock_cm.create.assert_called_once()
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_store_token_blank_rejected(self, client):
+        """Blank token string (after stripping) should return 422."""
+        response = await client.post(
+            "/token",
+            json={
+                "server_url": "https://mcp.example.com/mcp",
+                "token": "   ",
+            },
+        )
+        # Whitespace passes min_length=1; the route's strip() check rejects it
+        assert response.status_code == 422
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_store_token_replaces_old_credential(self, client):
+        old_cred = OAuth2Credentials(
+            provider="mcp",
+            title="MCP: mcp.example.com",
+            access_token=SecretStr("old-token"),
+            scopes=[],
+            metadata={"mcp_server_url": "https://mcp.example.com/mcp"},
+        )
+        with patch("backend.api.features.mcp.routes.creds_manager") as mock_cm:
+            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[old_cred])
+            mock_cm.create = AsyncMock()
+            mock_cm.store.delete_creds_by_id = AsyncMock()
+
+            response = await client.post(
+                "/token",
+                json={
+                    "server_url": "https://mcp.example.com/mcp",
+                    "token": "new-token",
+                },
+            )
+
+        assert response.status_code == 200
+        mock_cm.store.delete_creds_by_id.assert_called_once_with(
+            "test-user-id", old_cred.id
+        )
+
+
+class TestSSRFValidation:
+    """Verify validate_url_host is enforced wherever a server_url is accepted."""
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_discover_tools_ssrf_blocked(self, client):
+        with patch(
+            "backend.api.features.mcp.routes.validate_url_host",
+            new_callable=AsyncMock,
+            side_effect=ValueError("blocked loopback"),
+        ):
+            response = await client.post(
+                "/discover-tools",
+                json={"server_url": "http://localhost/mcp"},
+            )
+
+        assert response.status_code == 400
+        assert "blocked loopback" in response.json()["detail"].lower()
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_oauth_login_ssrf_blocked(self, client):
+        with patch(
+            "backend.api.features.mcp.routes.validate_url_host",
+            new_callable=AsyncMock,
+            side_effect=ValueError("blocked private IP"),
+        ):
+            response = await client.post(
+                "/oauth/login",
+                json={"server_url": "http://10.0.0.1/mcp"},
+            )
+
+        assert response.status_code == 400
+        assert "blocked private ip" in response.json()["detail"].lower()
+
+    @pytest.mark.asyncio(loop_scope="session")
+    async def test_store_token_ssrf_blocked(self, client):
+        with patch(
+            "backend.api.features.mcp.routes.validate_url_host",
+            new_callable=AsyncMock,
+            side_effect=ValueError("blocked loopback"),
+        ):
+            response = await client.post(
+                "/token",
+                json={
+                    "server_url": "http://127.0.0.1/mcp",
+                    "token": "some-token",
+                },
+            )
+
+        assert response.status_code == 400
+        assert "blocked loopback" in response.json()["detail"].lower()
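The SSRF tests above patch `validate_url_host` to raise `ValueError`, so they do not show what such a check inspects. The following is a rough sketch of the kind of host check involved, using only the standard library. `check_literal_host` is a hypothetical helper, not the real `validate_url_host`: it handles only IP literals and `localhost`, whereas a production check would presumably also resolve DNS names and validate every resolved address.

```python
import ipaddress
from urllib.parse import urlparse


def check_literal_host(url: str) -> None:
    """Raise ValueError for URLs whose host is an obviously internal address.

    Hypothetical sketch: covers IP literals and "localhost" only.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    host = parsed.hostname
    if not host:
        raise ValueError("URL has no host")
    if host == "localhost":
        raise ValueError("blocked loopback host")
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return  # Not an IP literal; DNS resolution is out of scope here.
    if ip.is_loopback or ip.is_private or ip.is_link_local:
        raise ValueError(f"blocked internal address: {ip}")
```

Raising `ValueError` matches the contract the routes rely on: each handler catches it and converts it into an HTTP 400 response.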
--- a/autogpt_platform/backend/backend/api/features/oauth_test.py
+++ b/autogpt_platform/backend/backend/api/features/oauth_test.py
@@ -12,6 +12,7 @@ Tests cover:
 5. Complete OAuth flow end-to-end
 """

+import asyncio
 import base64
 import hashlib
 import secrets
@@ -58,14 +59,27 @@ async def test_user(server, test_user_id: str):

    yield test_user_id

-    # Cleanup - delete in correct order due to foreign key constraints
-    await PrismaOAuthAccessToken.prisma().delete_many(where={"userId": test_user_id})
-    await PrismaOAuthRefreshToken.prisma().delete_many(where={"userId": test_user_id})
-    await PrismaOAuthAuthorizationCode.prisma().delete_many(
-        where={"userId": test_user_id}
-    )
-    await PrismaOAuthApplication.prisma().delete_many(where={"ownerId": test_user_id})
-    await PrismaUser.prisma().delete(where={"id": test_user_id})
+    # Cleanup - delete in correct order due to foreign key constraints.
+    # Wrap in try/except because the event loop or Prisma engine may already
+    # be closed during session teardown on Python 3.12+.
+    try:
+        await asyncio.gather(
+            PrismaOAuthAccessToken.prisma().delete_many(where={"userId": test_user_id}),
+            PrismaOAuthRefreshToken.prisma().delete_many(
+                where={"userId": test_user_id}
+            ),
+            PrismaOAuthAuthorizationCode.prisma().delete_many(
+                where={"userId": test_user_id}
+            ),
+        )
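+        # Applications reference the user, so delete them before the user
+        # instead of racing the two deletes in one gather().
+        await PrismaOAuthApplication.prisma().delete_many(
+            where={"ownerId": test_user_id}
+        )
+        await PrismaUser.prisma().delete(where={"id": test_user_id})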
+    except RuntimeError:
+        pass
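The fixture teardown above depends on dependent rows being deleted before the rows they reference. As a small illustration of the ordering rules involved: coroutines passed to a single `asyncio.gather()` run concurrently, so only the boundary between separate `await` statements guarantees order. The `delete` coroutine below is a hypothetical stand-in for a Prisma delete call, with sleeps simulating database latency.

```python
import asyncio


async def delete(name: str, delay: float, log: list[str]) -> None:
    # Hypothetical stand-in for a Prisma delete call.
    await asyncio.sleep(delay)
    log.append(name)


async def teardown() -> list[str]:
    log: list[str] = []
    # Independent rows: completion order inside one gather() is not guaranteed.
    await asyncio.gather(
        delete("access_tokens", 0.02, log),
        delete("refresh_tokens", 0.01, log),
    )
    # Dependent rows: separate awaits guarantee applications go before the user.
    await delete("applications", 0.01, log)
    await delete("user", 0.0, log)
    return log


order = asyncio.run(teardown())
```

Here the two token deletes may finish in either order, but `applications` always completes before `user`.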


@pytest_asyncio.fixture
--- a/autogpt_platform/backend/backend/api/features/store/cache.py
+++ b/autogpt_platform/backend/backend/api/features/store/cache.py
@@ -1,5 +1,3 @@
-from typing import Literal
-
 from backend.util.cache import cached

 from . import db as store_db
@@ -23,7 +21,7 @@ def clear_all_caches():
 async def _get_cached_store_agents(
    featured: bool,
    creator: str | None,
-    sorted_by: Literal["rating", "runs", "name", "updated_at"] | None,
+    sorted_by: store_db.StoreAgentsSortOptions | None,
    search_query: str | None,
    category: str | None,
    page: int,
@@ -57,7 +55,7 @@ async def _get_cached_agent_details(
 async def _get_cached_store_creators(
    featured: bool,
    search_query: str | None,
-    sorted_by: Literal["agent_rating", "agent_runs", "num_agents"] | None,
+    sorted_by: store_db.StoreCreatorsSortOptions | None,
    page: int,
    page_size: int,
 ):
@@ -75,4 +73,4 @@ async def _get_cached_store_creators(
@cached(maxsize=100, ttl_seconds=300, shared_cache=True)
 async def _get_cached_creator_details(username: str):
    """Cached helper to get creator details."""
-    return await store_db.get_store_creator_details(username=username.lower())
+    return await store_db.get_store_creator(username=username.lower())
--- a/autogpt_platform/backend/backend/api/features/store/content_handlers.py
+++ b/autogpt_platform/backend/backend/api/features/store/content_handlers.py
@@ -5,19 +5,40 @@ Pluggable system for different content sources (store agents, blocks, docs).
 Each handler knows how to fetch and process its content type for embedding.
 """

+from __future__ import annotations
+
+import asyncio
+import functools
+import itertools
 import logging
 from abc import ABC, abstractmethod
 from dataclasses import dataclass
 from pathlib import Path
-from typing import Any
+from typing import TYPE_CHECKING, Any, get_args, get_origin

 from prisma.enums import ContentType

+from backend.blocks import get_blocks
+from backend.blocks.llm import LlmModel
 from backend.data.db import query_raw_with_schema
+from backend.util.text import split_camelcase
+
+if TYPE_CHECKING:
+    from backend.blocks._base import AnyBlockSchema

 logger = logging.getLogger(__name__)


+def _contains_type(annotation: Any, target: type) -> bool:
+    """Check if an annotation is or contains the target type (handles Optional/Union/Annotated)."""
+    if annotation is target:
+        return True
+    origin = get_origin(annotation)
+    if origin is None:
+        return False
+    return any(_contains_type(arg, target) for arg in get_args(annotation))
+
+
@dataclass
 class ContentItem:
    """Represents a piece of content to be embedded."""
@@ -143,6 +164,28 @@ class StoreAgentHandler(ContentHandler):
        }


+@functools.lru_cache(maxsize=1)
+def _get_enabled_blocks() -> dict[str, AnyBlockSchema]:
+    """Return ``{block_id: block_instance}`` for all enabled, instantiable blocks.
+
+    Disabled blocks are skipped, and blocks that fail to instantiate are
+    skipped with a warning log, so callers never need their own try/except loop.
+
+    Results are cached for the process lifetime via ``lru_cache`` because
+    blocks are registered at import time and never change while running.
+    """
+    enabled: dict[str, AnyBlockSchema] = {}
+    for block_id, block_cls in get_blocks().items():
+        try:
+            instance = block_cls()
+        except Exception as e:
+            logger.warning(f"Skipping block {block_id}: init failed: {e}")
+            continue
+        if not instance.disabled:
+            enabled[block_id] = instance
+    return enabled
+
+
 class BlockHandler(ContentHandler):
    """Handler for block definitions (Python classes)."""

@@ -152,16 +195,14 @@ class BlockHandler(ContentHandler):

    async def get_missing_items(self, batch_size: int) -> list[ContentItem]:
        """Fetch blocks without embeddings."""
-        from backend.blocks import get_blocks
-
-        # Get all available blocks
-        all_blocks = get_blocks()
-
-        # Check which ones have embeddings
-        if not all_blocks:
+        # to_thread keeps the first (heavy) call off the event loop.  On
+        # subsequent calls the lru_cache makes this a dict lookup, so the
+        # thread-pool overhead is negligible compared to the DB queries below.
+        enabled = await asyncio.to_thread(_get_enabled_blocks)
+        if not enabled:
            return []

-        block_ids = list(all_blocks.keys())
+        block_ids = list(enabled.keys())

        # Query for existing embeddings
        placeholders = ",".join([f"${i+1}" for i in range(len(block_ids))])
@@ -176,57 +217,53 @@ class BlockHandler(ContentHandler):
        )

        existing_ids = {row["contentId"] for row in existing_result}
-        missing_blocks = [
-            (block_id, block_cls)
-            for block_id, block_cls in all_blocks.items()
-            if block_id not in existing_ids
-        ]

-        # Convert to ContentItem
+        # Convert to ContentItem — disabled filtering already done by
+        # _get_enabled_blocks so batch_size won't be exhausted by disabled blocks.
+        missing = ((bid, b) for bid, b in enabled.items() if bid not in existing_ids)
        items = []
-        for block_id, block_cls in missing_blocks[:batch_size]:
+        for block_id, block in itertools.islice(missing, batch_size):
            try:
-                block_instance = block_cls()
-
-                # Skip disabled blocks - they shouldn't be indexed
-                if block_instance.disabled:
-                    continue
-
                # Build searchable text from block metadata
-                parts = []
-                if hasattr(block_instance, "name") and block_instance.name:
-                    parts.append(block_instance.name)
-                if (
-                    hasattr(block_instance, "description")
-                    and block_instance.description
-                ):
-                    parts.append(block_instance.description)
-                if hasattr(block_instance, "categories") and block_instance.categories:
-                    # Convert BlockCategory enum to strings
-                    parts.append(
-                        " ".join(str(cat.value) for cat in block_instance.categories)
+                if not block.name:
+                    logger.warning(
+                        f"Block {block_id} has no name; using block_id as metadata name"
                    )
+                display_name = split_camelcase(block.name) if block.name else ""
+                parts = []
+                if display_name:
+                    parts.append(display_name)
+                if block.description:
+                    parts.append(block.description)
+                if block.categories:
+                    parts.append(" ".join(str(cat.value) for cat in block.categories))

-                # Add input/output schema info
-                if hasattr(block_instance, "input_schema"):
-                    schema = block_instance.input_schema
-                    if hasattr(schema, "model_json_schema"):
-                        schema_dict = schema.model_json_schema()
-                        if "properties" in schema_dict:
-                            for prop_name, prop_info in schema_dict[
-                                "properties"
-                            ].items():
-                                if "description" in prop_info:
-                                    parts.append(
-                                        f"{prop_name}: {prop_info['description']}"
-                                    )
+                # Add input schema field descriptions
+                parts += [
+                    f"{field_name}: {field_info.description}"
+                    for field_name, field_info in block.input_schema.model_fields.items()
+                    if field_info.description
+                ]

                searchable_text = " ".join(parts)

-                # Convert categories set of enums to list of strings for JSON serialization
-                categories = getattr(block_instance, "categories", set())
                categories_list = (
-                    [cat.value for cat in categories] if categories else []
+                    [cat.value for cat in block.categories] if block.categories else []
+                )
+
+                # Extract provider names from credentials fields
+                credentials_info = block.input_schema.get_credentials_fields_info()
+                is_integration = len(credentials_info) > 0
+                provider_names = [
+                    provider.value.lower()
+                    for info in credentials_info.values()
+                    for provider in info.provider
+                ]
+
+                # Check if block has LlmModel field in input schema
+                has_llm_model_field = any(
+                    _contains_type(field.annotation, LlmModel)
+                    for field in block.input_schema.model_fields.values()
                )

                items.append(
@@ -235,10 +272,13 @@ class BlockHandler(ContentHandler):
                        content_type=ContentType.BLOCK,
                        searchable_text=searchable_text,
                        metadata={
-                            "name": getattr(block_instance, "name", ""),
+                            "name": display_name or block.name or block_id,
                            "categories": categories_list,
+                            "providers": provider_names,
+                            "has_llm_model_field": has_llm_model_field,
+                            "is_integration": is_integration,
                        },
-                        user_id=None,  # Blocks are public
+                        user_id=None,
                    )
                )
            except Exception as e:
@@ -249,22 +289,13 @@ class BlockHandler(ContentHandler):

    async def get_stats(self) -> dict[str, int]:
        """Get statistics about block embedding coverage."""
-        from backend.blocks import get_blocks
-
-        all_blocks = get_blocks()
-
-        # Filter out disabled blocks - they're not indexed
-        enabled_block_ids = [
-            block_id
-            for block_id, block_cls in all_blocks.items()
-            if not block_cls().disabled
-        ]
-        total_blocks = len(enabled_block_ids)
+        enabled = await asyncio.to_thread(_get_enabled_blocks)
+        total_blocks = len(enabled)

        if total_blocks == 0:
            return {"total": 0, "with_embeddings": 0, "without_embeddings": 0}

-        block_ids = enabled_block_ids
+        block_ids = list(enabled.keys())
        placeholders = ",".join([f"${i+1}" for i in range(len(block_ids))])

        embedded_result = await query_raw_with_schema(
--- a/autogpt_platform/backend/backend/api/features/store/content_handlers_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/content_handlers_test.py
@@ -1,7 +1,5 @@
 """
-E2E tests for content handlers (blocks, store agents, documentation).
-
-Tests the full flow: discovering content → generating embeddings → storing.
+Tests for content handlers (blocks, store agents, documentation).
 """

 from pathlib import Path
@@ -15,15 +13,103 @@ from backend.api.features.store.content_handlers import (
    BlockHandler,
    DocumentationHandler,
    StoreAgentHandler,
+    _get_enabled_blocks,
 )


+@pytest.fixture(autouse=True)
+def _clear_block_cache():
+    """Clear the lru_cache on _get_enabled_blocks before each test."""
+    _get_enabled_blocks.cache_clear()
+    yield
+    _get_enabled_blocks.cache_clear()
+
+
+# ---------------------------------------------------------------------------
+# Helper to build a mock block class that returns a pre-configured instance
+# ---------------------------------------------------------------------------
+
+
+def _make_block_class(
+    *,
+    name: str = "Block",
+    description: str = "",
+    disabled: bool = False,
+    categories: list[MagicMock] | None = None,
+    fields: dict[str, str] | None = None,
+    raise_on_init: Exception | None = None,
+) -> MagicMock:
+    cls = MagicMock()
+    if raise_on_init is not None:
+        cls.side_effect = raise_on_init
+        return cls
+    inst = MagicMock()
+    inst.name = name
+    inst.disabled = disabled
+    inst.description = description
+    inst.categories = categories or []
+    field_mocks = {
+        fname: MagicMock(description=fdesc) for fname, fdesc in (fields or {}).items()
+    }
+    inst.input_schema.model_fields = field_mocks
+    inst.input_schema.get_credentials_fields_info.return_value = {}
+    cls.return_value = inst
+    return cls
+
+
+# ---------------------------------------------------------------------------
+# _get_enabled_blocks
+# ---------------------------------------------------------------------------
+
+
+def test_get_enabled_blocks_filters_disabled():
+    """Disabled blocks are excluded."""
+    blocks = {
+        "enabled": _make_block_class(name="E", disabled=False),
+        "disabled": _make_block_class(name="D", disabled=True),
+    }
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        result = _get_enabled_blocks()
+    assert list(result.keys()) == ["enabled"]
+
+
+def test_get_enabled_blocks_skips_broken():
+    """Blocks that raise on init are skipped without crashing."""
+    blocks = {
+        "good": _make_block_class(name="Good"),
+        "bad": _make_block_class(raise_on_init=RuntimeError("boom")),
+    }
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        result = _get_enabled_blocks()
+    assert list(result.keys()) == ["good"]
+
+
+def test_get_enabled_blocks_cached():
+    """_get_enabled_blocks() calls get_blocks() only once across multiple calls."""
+    blocks = {"b1": _make_block_class(name="B1")}
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ) as mock_get_blocks:
+        result1 = _get_enabled_blocks()
+        result2 = _get_enabled_blocks()
+    assert result1 is result2
+    mock_get_blocks.assert_called_once()
+
+
+# ---------------------------------------------------------------------------
+# StoreAgentHandler
+# ---------------------------------------------------------------------------
+
+
 @pytest.mark.asyncio(loop_scope="session")
 async def test_store_agent_handler_get_missing_items(mocker):
    """Test StoreAgentHandler fetches approved agents without embeddings."""
    handler = StoreAgentHandler()

-    # Mock database query
    mock_missing = [
        {
            "id": "agent-1",
@@ -54,9 +140,7 @@ async def test_store_agent_handler_get_stats(mocker):
    """Test StoreAgentHandler returns correct stats."""
    handler = StoreAgentHandler()

-    # Mock approved count query
    mock_approved = [{"count": 50}]
-    # Mock embedded count query
    mock_embedded = [{"count": 30}]

    with patch(
@@ -70,73 +154,130 @@ async def test_store_agent_handler_get_stats(mocker):
        assert stats["without_embeddings"] == 20


+# ---------------------------------------------------------------------------
+# BlockHandler
+# ---------------------------------------------------------------------------
+
+
 @pytest.mark.asyncio(loop_scope="session")
-async def test_block_handler_get_missing_items(mocker):
+async def test_block_handler_get_missing_items():
    """Test BlockHandler discovers blocks without embeddings."""
    handler = BlockHandler()

-    # Mock get_blocks to return test blocks
-    mock_block_class = MagicMock()
-    mock_block_instance = MagicMock()
-    mock_block_instance.name = "Calculator Block"
-    mock_block_instance.description = "Performs calculations"
-    mock_block_instance.categories = [MagicMock(value="MATH")]
-    mock_block_instance.disabled = False
-    mock_block_instance.input_schema.model_json_schema.return_value = {
-        "properties": {"expression": {"description": "Math expression to evaluate"}}
+    blocks = {
+        "block-uuid-1": _make_block_class(
+            name="CalculatorBlock",
+            description="Performs calculations",
+            categories=[MagicMock(value="MATH")],
+            fields={"expression": "Math expression to evaluate"},
+        ),
    }
-    mock_block_class.return_value = mock_block_instance
-
-    mock_blocks = {"block-uuid-1": mock_block_class}
-
-    # Mock existing embeddings query (no embeddings exist)
-    mock_existing = []

    with patch(
-        "backend.blocks.get_blocks",
-        return_value=mock_blocks,
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
    ):
        with patch(
            "backend.api.features.store.content_handlers.query_raw_with_schema",
-            return_value=mock_existing,
+            return_value=[],
        ):
            items = await handler.get_missing_items(batch_size=10)

            assert len(items) == 1
            assert items[0].content_id == "block-uuid-1"
            assert items[0].content_type == ContentType.BLOCK
+            # CamelCase should be split in searchable text and metadata name
            assert "Calculator Block" in items[0].searchable_text
            assert "Performs calculations" in items[0].searchable_text
            assert "MATH" in items[0].searchable_text
            assert "expression: Math expression" in items[0].searchable_text
+            assert items[0].metadata["name"] == "Calculator Block"
            assert items[0].user_id is None


 @pytest.mark.asyncio(loop_scope="session")
-async def test_block_handler_get_stats(mocker):
+async def test_block_handler_get_missing_items_splits_camelcase():
+    """CamelCase block names are split for better search indexing."""
+    handler = BlockHandler()
+
+    blocks = {
+        "ai-block": _make_block_class(name="AITextGeneratorBlock"),
+    }
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=[],
+        ):
+            items = await handler.get_missing_items(batch_size=10)
+
+            assert len(items) == 1
+            assert "AI Text Generator Block" in items[0].searchable_text
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_get_missing_items_batch_size_zero():
+    """batch_size=0 returns an empty list; the DB is still queried to find missing IDs."""
+    handler = BlockHandler()
+
+    blocks = {"b1": _make_block_class(name="B1")}
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=[],
+        ) as mock_query:
+            items = await handler.get_missing_items(batch_size=0)
+            assert items == []
+            # DB query is still issued to learn which blocks lack embeddings;
+            # the empty result comes from itertools.islice limiting to 0 items.
+            mock_query.assert_called_once()
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_disabled_dont_exhaust_batch():
+    """Disabled blocks don't consume batch budget, so enabled blocks get indexed."""
+    handler = BlockHandler()
+
+    # 5 disabled + 3 enabled, batch_size=2
+    blocks = {
+        **{
+            f"dis-{i}": _make_block_class(name=f"D{i}", disabled=True) for i in range(5)
+        },
+        **{f"en-{i}": _make_block_class(name=f"E{i}") for i in range(3)},
+    }
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=[],
+        ):
+            items = await handler.get_missing_items(batch_size=2)
+
+            assert len(items) == 2
+            assert all(item.content_id.startswith("en-") for item in items)
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_get_stats():
    """Test BlockHandler returns correct stats."""
    handler = BlockHandler()

-    # Mock get_blocks - each block class returns an instance with disabled=False
-    def make_mock_block_class():
-        mock_class = MagicMock()
-        mock_instance = MagicMock()
-        mock_instance.disabled = False
-        mock_class.return_value = mock_instance
-        return mock_class
-
-    mock_blocks = {
-        "block-1": make_mock_block_class(),
-        "block-2": make_mock_block_class(),
-        "block-3": make_mock_block_class(),
+    blocks = {
+        "block-1": _make_block_class(name="B1"),
+        "block-2": _make_block_class(name="B2"),
+        "block-3": _make_block_class(name="B3"),
    }

-    # Mock embedded count query (2 blocks have embeddings)
    mock_embedded = [{"count": 2}]

    with patch(
-        "backend.blocks.get_blocks",
-        return_value=mock_blocks,
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
    ):
        with patch(
            "backend.api.features.store.content_handlers.query_raw_with_schema",
@@ -149,21 +290,123 @@ async def test_block_handler_get_stats(mocker):
            assert stats["without_embeddings"] == 1


+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_get_stats_skips_broken():
+    """get_stats skips broken blocks instead of crashing."""
+    handler = BlockHandler()
+
+    blocks = {
+        "good": _make_block_class(name="Good"),
+        "bad": _make_block_class(raise_on_init=RuntimeError("boom")),
+    }
+
+    mock_embedded = [{"count": 1}]
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=mock_embedded,
+        ):
+            stats = await handler.get_stats()
+
+            assert stats["total"] == 1  # only the good block
+            assert stats["with_embeddings"] == 1
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_handles_none_name():
+    """When block.name is None, metadata["name"] falls back to the block ID."""
+    handler = BlockHandler()
+
+    blocks = {
+        "none-name-block": _make_block_class(
+            name="placeholder",  # will be overridden to None below
+            description="A block with no name",
+        ),
+    }
+    # Override the name to None after construction so _make_block_class
+    # doesn't interfere with the mock wiring.
+    blocks["none-name-block"].return_value.name = None
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=[],
+        ):
+            items = await handler.get_missing_items(batch_size=10)
+
+            assert len(items) == 1
+            # display_name should be "" because block.name is None
+            # searchable_text should still contain the description
+            assert "A block with no name" in items[0].searchable_text
+            # metadata["name"] falls back to block_id when both display_name
+            # and block.name are falsy, ensuring it is always a non-empty string.
+            assert items[0].metadata["name"] == "none-name-block"
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_handles_empty_attributes():
+    """Test BlockHandler handles blocks with empty/falsy attribute values."""
+    handler = BlockHandler()
+
+    blocks = {"block-minimal": _make_block_class(name="Minimal Block")}
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=[],
+        ):
+            items = await handler.get_missing_items(batch_size=10)
+
+            assert len(items) == 1
+            assert items[0].searchable_text == "Minimal Block"
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_block_handler_skips_failed_blocks():
+    """Test BlockHandler skips blocks that fail to instantiate."""
+    handler = BlockHandler()
+
+    blocks = {
+        "good-block": _make_block_class(name="Good Block", description="Works fine"),
+        "bad-block": _make_block_class(raise_on_init=Exception("Instantiation failed")),
+    }
+
+    with patch(
+        "backend.api.features.store.content_handlers.get_blocks", return_value=blocks
+    ):
+        with patch(
+            "backend.api.features.store.content_handlers.query_raw_with_schema",
+            return_value=[],
+        ):
+            items = await handler.get_missing_items(batch_size=10)
+
+            assert len(items) == 1
+            assert items[0].content_id == "good-block"
+
+
+# ---------------------------------------------------------------------------
+# DocumentationHandler
+# ---------------------------------------------------------------------------
+
+
 @pytest.mark.asyncio(loop_scope="session")
 async def test_documentation_handler_get_missing_items(tmp_path, mocker):
    """Test DocumentationHandler discovers docs without embeddings."""
    handler = DocumentationHandler()

-    # Create temporary docs directory with test files
    docs_root = tmp_path / "docs"
    docs_root.mkdir()
-
    (docs_root / "guide.md").write_text("# Getting Started\n\nThis is a guide.")
    (docs_root / "api.mdx").write_text("# API Reference\n\nAPI documentation.")

-    # Mock _get_docs_root to return temp dir
    with patch.object(handler, "_get_docs_root", return_value=docs_root):
-        # Mock existing embeddings query (no embeddings exist)
        with patch(
            "backend.api.features.store.content_handlers.query_raw_with_schema",
            return_value=[],
@@ -172,7 +415,6 @@ async def test_documentation_handler_get_missing_items(tmp_path, mocker):

            assert len(items) == 2

-            # Check guide.md (content_id format: doc_path::section_index)
            guide_item = next(
                (item for item in items if item.content_id == "guide.md::0"), None
            )
@@ -183,7 +425,6 @@ async def test_documentation_handler_get_missing_items(tmp_path, mocker):
            assert guide_item.metadata["doc_title"] == "Getting Started"
            assert guide_item.user_id is None

-            # Check api.mdx (content_id format: doc_path::section_index)
            api_item = next(
                (item for item in items if item.content_id == "api.mdx::0"), None
            )
@@ -196,14 +437,12 @@ async def test_documentation_handler_get_stats(tmp_path, mocker):
    """Test DocumentationHandler returns correct stats."""
    handler = DocumentationHandler()

-    # Create temporary docs directory
    docs_root = tmp_path / "docs"
    docs_root.mkdir()
    (docs_root / "doc1.md").write_text("# Doc 1")
    (docs_root / "doc2.md").write_text("# Doc 2")
    (docs_root / "doc3.mdx").write_text("# Doc 3")

-    # Mock embedded count query (1 doc has embedding)
    mock_embedded = [{"count": 1}]

    with patch.object(handler, "_get_docs_root", return_value=docs_root):
@@ -223,13 +462,11 @@ async def test_documentation_handler_title_extraction(tmp_path):
    """Test DocumentationHandler extracts title from markdown heading."""
    handler = DocumentationHandler()

-    # Test with heading
    doc_with_heading = tmp_path / "with_heading.md"
    doc_with_heading.write_text("# My Title\n\nContent here")
    title = handler._extract_doc_title(doc_with_heading)
    assert title == "My Title"

-    # Test without heading
    doc_without_heading = tmp_path / "no-heading.md"
    doc_without_heading.write_text("Just content, no heading")
    title = handler._extract_doc_title(doc_without_heading)
@@ -241,7 +478,6 @@ async def test_documentation_handler_markdown_chunking(tmp_path):
    """Test DocumentationHandler chunks markdown by headings."""
    handler = DocumentationHandler()

-    # Test document with multiple sections
    doc_with_sections = tmp_path / "sections.md"
    doc_with_sections.write_text(
        "# Document Title\n\n"
@@ -253,7 +489,6 @@ async def test_documentation_handler_markdown_chunking(tmp_path):
    )
    sections = handler._chunk_markdown_by_headings(doc_with_sections)

-    # Should have 3 sections: intro (with doc title), section one, section two
    assert len(sections) == 3
    assert sections[0].title == "Document Title"
    assert sections[0].index == 0
@@ -267,7 +502,6 @@ async def test_documentation_handler_markdown_chunking(tmp_path):
    assert sections[2].index == 2
    assert "Content for section two" in sections[2].content

-    # Test document without headings
    doc_no_sections = tmp_path / "no-sections.md"
    doc_no_sections.write_text("Just plain content without any headings.")
    sections = handler._chunk_markdown_by_headings(doc_no_sections)
@@ -281,21 +515,39 @@ async def test_documentation_handler_section_content_ids():
    """Test DocumentationHandler creates and parses section content IDs."""
    handler = DocumentationHandler()

-    # Test making content ID
    content_id = handler._make_section_content_id("docs/guide.md", 2)
    assert content_id == "docs/guide.md::2"

-    # Test parsing content ID
    doc_path, section_index = handler._parse_section_content_id("docs/guide.md::2")
    assert doc_path == "docs/guide.md"
    assert section_index == 2

-    # Test parsing legacy format (no section index)
    doc_path, section_index = handler._parse_section_content_id("docs/old-format.md")
    assert doc_path == "docs/old-format.md"
    assert section_index == 0


+@pytest.mark.asyncio(loop_scope="session")
+async def test_documentation_handler_missing_docs_directory():
+    """Test DocumentationHandler handles missing docs directory gracefully."""
+    handler = DocumentationHandler()
+
+    fake_path = Path("/nonexistent/docs")
+    with patch.object(handler, "_get_docs_root", return_value=fake_path):
+        items = await handler.get_missing_items(batch_size=10)
+        assert items == []
+
+        stats = await handler.get_stats()
+        assert stats["total"] == 0
+        assert stats["with_embeddings"] == 0
+        assert stats["without_embeddings"] == 0
+
+
+# ---------------------------------------------------------------------------
+# Registry
+# ---------------------------------------------------------------------------
+
+
 @pytest.mark.asyncio(loop_scope="session")
 async def test_content_handlers_registry():
    """Test all content types are registered."""
@@ -306,86 +558,3 @@ async def test_content_handlers_registry():
    assert isinstance(CONTENT_HANDLERS[ContentType.STORE_AGENT], StoreAgentHandler)
    assert isinstance(CONTENT_HANDLERS[ContentType.BLOCK], BlockHandler)
    assert isinstance(CONTENT_HANDLERS[ContentType.DOCUMENTATION], DocumentationHandler)
-
-
-@pytest.mark.asyncio(loop_scope="session")
-async def test_block_handler_handles_missing_attributes():
-    """Test BlockHandler gracefully handles blocks with missing attributes."""
-    handler = BlockHandler()
-
-    # Mock block with minimal attributes
-    mock_block_class = MagicMock()
-    mock_block_instance = MagicMock()
-    mock_block_instance.name = "Minimal Block"
-    mock_block_instance.disabled = False
-    # No description, categories, or schema
-    del mock_block_instance.description
-    del mock_block_instance.categories
-    del mock_block_instance.input_schema
-    mock_block_class.return_value = mock_block_instance
-
-    mock_blocks = {"block-minimal": mock_block_class}
-
-    with patch(
-        "backend.blocks.get_blocks",
-        return_value=mock_blocks,
-    ):
-        with patch(
-            "backend.api.features.store.content_handlers.query_raw_with_schema",
-            return_value=[],
-        ):
-            items = await handler.get_missing_items(batch_size=10)
-
-            assert len(items) == 1
-            assert items[0].searchable_text == "Minimal Block"
-
-
-@pytest.mark.asyncio(loop_scope="session")
-async def test_block_handler_skips_failed_blocks():
-    """Test BlockHandler skips blocks that fail to instantiate."""
-    handler = BlockHandler()
-
-    # Mock one good block and one bad block
-    good_block = MagicMock()
-    good_instance = MagicMock()
-    good_instance.name = "Good Block"
-    good_instance.description = "Works fine"
-    good_instance.categories = []
-    good_instance.disabled = False
-    good_block.return_value = good_instance
-
-    bad_block = MagicMock()
-    bad_block.side_effect = Exception("Instantiation failed")
-
-    mock_blocks = {"good-block": good_block, "bad-block": bad_block}
-
-    with patch(
-        "backend.blocks.get_blocks",
-        return_value=mock_blocks,
-    ):
-        with patch(
-            "backend.api.features.store.content_handlers.query_raw_with_schema",
-            return_value=[],
-        ):
-            items = await handler.get_missing_items(batch_size=10)
-
-            # Should only get the good block
-            assert len(items) == 1
-            assert items[0].content_id == "good-block"
-
-
-@pytest.mark.asyncio(loop_scope="session")
-async def test_documentation_handler_missing_docs_directory():
-    """Test DocumentationHandler handles missing docs directory gracefully."""
-    handler = DocumentationHandler()
-
-    # Mock _get_docs_root to return non-existent path
-    fake_path = Path("/nonexistent/docs")
-    with patch.object(handler, "_get_docs_root", return_value=fake_path):
-        items = await handler.get_missing_items(batch_size=10)
-        assert items == []
-
-        stats = await handler.get_stats()
-        assert stats["total"] == 0
-        assert stats["with_embeddings"] == 0
-        assert stats["without_embeddings"] == 0
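
Review note on the batch-budget fix that `test_block_handler_disabled_dont_exhaust_batch` covers. The sketch below contrasts slicing before filtering (the removed behaviour) with filtering before slicing (the new behaviour). It is a simplified illustration, assuming blocks are plain dicts with a `disabled` flag; `pick_batch_old` and `pick_batch_new` are hypothetical names, not functions from the PR.

```python
import itertools


def pick_batch_old(blocks: dict, existing: set, batch_size: int) -> list[str]:
    """Slice first, then drop disabled blocks: the batch can come back short."""
    missing = [(bid, b) for bid, b in blocks.items() if bid not in existing]
    return [bid for bid, b in missing[:batch_size] if not b["disabled"]]


def pick_batch_new(blocks: dict, existing: set, batch_size: int) -> list[str]:
    """Drop disabled blocks first, then lazily take up to batch_size items."""
    enabled = {bid: b for bid, b in blocks.items() if not b["disabled"]}
    missing = (bid for bid in enabled if bid not in existing)
    return list(itertools.islice(missing, batch_size))


# Same shape as the test: 5 disabled blocks followed by 3 enabled ones.
blocks = {
    **{f"dis-{i}": {"disabled": True} for i in range(5)},
    **{f"en-{i}": {"disabled": False} for i in range(3)},
}

if __name__ == "__main__":
    print(pick_batch_old(blocks, set(), 2))
    print(pick_batch_new(blocks, set(), 2))
```

With the old ordering, the two slots in the batch are spent on disabled blocks and nothing is indexed on that pass. With the new ordering, the slice only ever sees enabled blocks, and `itertools.islice` stops consuming the generator as soon as the batch is full.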
--- a/autogpt_platform/backend/backend/api/features/store/db.py
+++ b/autogpt_platform/backend/backend/api/features/store/db.py
--- a/autogpt_platform/backend/backend/api/features/store/db_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/db_test.py
@@ -26,7 +26,7 @@ async def test_get_store_agents(mocker):
    mock_agents = [
        prisma.models.StoreAgent(
            listing_id="test-id",
-            storeListingVersionId="version123",
+            listing_version_id="version123",
            slug="test-agent",
            agent_name="Test Agent",
            agent_video=None,
@@ -40,11 +40,11 @@ async def test_get_store_agents(mocker):
            runs=10,
            rating=4.5,
            versions=["1.0"],
-            agentGraphVersions=["1"],
-            agentGraphId="test-graph-id",
+            graph_id="test-graph-id",
+            graph_versions=["1"],
            updated_at=datetime.now(),
            is_available=False,
-            useForOnboarding=False,
+            use_for_onboarding=False,
        )
    ]

@@ -68,10 +68,10 @@ async def test_get_store_agents(mocker):

 @pytest.mark.asyncio(loop_scope="session")
 async def test_get_store_agent_details(mocker):
-    # Mock data
+    # Mock data - StoreAgent view already contains the active version data
    mock_agent = prisma.models.StoreAgent(
        listing_id="test-id",
-        storeListingVersionId="version123",
+        listing_version_id="version123",
        slug="test-agent",
        agent_name="Test Agent",
        agent_video="video.mp4",
@@ -85,102 +85,38 @@ async def test_get_store_agent_details(mocker):
        runs=10,
        rating=4.5,
        versions=["1.0"],
-        agentGraphVersions=["1"],
-        agentGraphId="test-graph-id",
-        updated_at=datetime.now(),
-        is_available=False,
-        useForOnboarding=False,
-    )
-
-    # Mock active version agent (what we want to return for active version)
-    mock_active_agent = prisma.models.StoreAgent(
-        listing_id="test-id",
-        storeListingVersionId="active-version-id",
-        slug="test-agent",
-        agent_name="Test Agent Active",
-        agent_video="active_video.mp4",
-        agent_image=["active_image.jpg"],
-        featured=False,
-        creator_username="creator",
-        creator_avatar="avatar.jpg",
-        sub_heading="Test heading active",
-        description="Test description active",
-        categories=["test"],
-        runs=15,
-        rating=4.8,
-        versions=["1.0", "2.0"],
-        agentGraphVersions=["1", "2"],
-        agentGraphId="test-graph-id-active",
+        graph_id="test-graph-id",
+        graph_versions=["1"],
        updated_at=datetime.now(),
        is_available=True,
-        useForOnboarding=False,
+        use_for_onboarding=False,
    )

-    # Create a mock StoreListing result
-    mock_store_listing = mocker.MagicMock()
-    mock_store_listing.activeVersionId = "active-version-id"
-    mock_store_listing.hasApprovedVersion = True
-    mock_store_listing.ActiveVersion = mocker.MagicMock()
-    mock_store_listing.ActiveVersion.recommendedScheduleCron = None
-
-    # Mock StoreAgent prisma call - need to handle multiple calls
+    # Mock StoreAgent prisma call
    mock_store_agent = mocker.patch("prisma.models.StoreAgent.prisma")
-
-    # Set up side_effect to return different results for different calls
-    def mock_find_first_side_effect(*args, **kwargs):
-        where_clause = kwargs.get("where", {})
-        if "storeListingVersionId" in where_clause:
-            # Second call for active version
-            return mock_active_agent
-        else:
-            # First call for initial lookup
-            return mock_agent
-
-    mock_store_agent.return_value.find_first = mocker.AsyncMock(
-        side_effect=mock_find_first_side_effect
-    )
-
-    # Mock Profile prisma call
-    mock_profile = mocker.MagicMock()
-    mock_profile.userId = "user-id-123"
-    mock_profile_db = mocker.patch("prisma.models.Profile.prisma")
-    mock_profile_db.return_value.find_first = mocker.AsyncMock(
-        return_value=mock_profile
-    )
-
-    # Mock StoreListing prisma call
-    mock_store_listing_db = mocker.patch("prisma.models.StoreListing.prisma")
-    mock_store_listing_db.return_value.find_first = mocker.AsyncMock(
-        return_value=mock_store_listing
-    )
+    mock_store_agent.return_value.find_first = mocker.AsyncMock(return_value=mock_agent)

    # Call function
    result = await db.get_store_agent_details("creator", "test-agent")

-    # Verify results - should use active version data
+    # Verify results - constructed from the StoreAgent view
    assert result.slug == "test-agent"
-    assert result.agent_name == "Test Agent Active"  # From active version
-    assert result.active_version_id == "active-version-id"
+    assert result.agent_name == "Test Agent"
+    assert result.active_version_id == "version123"
    assert result.has_approved_version is True
-    assert (
-        result.store_listing_version_id == "active-version-id"
-    )  # Should be active version ID
+    assert result.store_listing_version_id == "version123"
+    assert result.graph_id == "test-graph-id"
+    assert result.runs == 10
+    assert result.rating == 4.5

-    # Verify mocks called correctly - now expecting 2 calls
-    assert mock_store_agent.return_value.find_first.call_count == 2
-
-    # Check the specific calls
-    calls = mock_store_agent.return_value.find_first.call_args_list
-    assert calls[0] == mocker.call(
+    # Verify single StoreAgent lookup
+    mock_store_agent.return_value.find_first.assert_called_once_with(
        where={"creator_username": "creator", "slug": "test-agent"}
    )
-    assert calls[1] == mocker.call(where={"storeListingVersionId": "active-version-id"})
-
-    mock_store_listing_db.return_value.find_first.assert_called_once()


@pytest.mark.asyncio(loop_scope="session")
-async def test_get_store_creator_details(mocker):
+async def test_get_store_creator(mocker):
    # Mock data
    mock_creator_data = prisma.models.Creator(
        name="Test Creator",
@@ -202,7 +138,7 @@ async def test_get_store_creator_details(mocker):
    mock_creator.return_value.find_unique.return_value = mock_creator_data

    # Call function
-    result = await db.get_store_creator_details("creator")
+    result = await db.get_store_creator("creator")

    # Verify results
    assert result.username == "creator"
@@ -218,61 +154,110 @@ async def test_get_store_creator_details(mocker):

@pytest.mark.asyncio(loop_scope="session")
 async def test_create_store_submission(mocker):
-    # Mock data
+    now = datetime.now()
+
+    # Mock agent graph (with no pending submissions) and user with profile
+    mock_profile = prisma.models.Profile(
+        id="profile-id",
+        userId="user-id",
+        name="Test User",
+        username="testuser",
+        description="Test",
+        isFeatured=False,
+        links=[],
+        createdAt=now,
+        updatedAt=now,
+    )
+    mock_user = prisma.models.User(
+        id="user-id",
+        email="test@example.com",
+        createdAt=now,
+        updatedAt=now,
+        Profile=[mock_profile],
+        emailVerified=True,
+        metadata="{}",  # type: ignore[reportArgumentType]
+        integrations="",
+        maxEmailsPerDay=1,
+        notifyOnAgentRun=True,
+        notifyOnZeroBalance=True,
+        notifyOnLowBalance=True,
+        notifyOnBlockExecutionFailed=True,
+        notifyOnContinuousAgentError=True,
+        notifyOnDailySummary=True,
+        notifyOnWeeklySummary=True,
+        notifyOnMonthlySummary=True,
+        notifyOnAgentApproved=True,
+        notifyOnAgentRejected=True,
+        timezone="Europe/Delft",
+    )
    mock_agent = prisma.models.AgentGraph(
        id="agent-id",
        version=1,
        userId="user-id",
-        createdAt=datetime.now(),
+        createdAt=now,
        isActive=True,
+        StoreListingVersions=[],
+        User=mock_user,
    )

-    mock_listing = prisma.models.StoreListing(
+    # Mock the StoreListing and the created StoreListingVersion (returned by create)
+    mock_store_listing_obj = prisma.models.StoreListing(
        id="listing-id",
-        createdAt=datetime.now(),
-        updatedAt=datetime.now(),
+        createdAt=now,
+        updatedAt=now,
        isDeleted=False,
        hasApprovedVersion=False,
        slug="test-agent",
        agentGraphId="agent-id",
-        agentGraphVersion=1,
        owningUserId="user-id",
-        Versions=[
-            prisma.models.StoreListingVersion(
-                id="version-id",
-                agentGraphId="agent-id",
-                agentGraphVersion=1,
-                name="Test Agent",
-                description="Test description",
-                createdAt=datetime.now(),
-                updatedAt=datetime.now(),
-                subHeading="Test heading",
-                imageUrls=["image.jpg"],
-                categories=["test"],
-                isFeatured=False,
-                isDeleted=False,
-                version=1,
-                storeListingId="listing-id",
-                submissionStatus=prisma.enums.SubmissionStatus.PENDING,
-                isAvailable=True,
-            )
-        ],
        useForOnboarding=False,
    )
+    mock_version = prisma.models.StoreListingVersion(
+        id="version-id",
+        agentGraphId="agent-id",
+        agentGraphVersion=1,
+        name="Test Agent",
+        description="Test description",
+        createdAt=now,
+        updatedAt=now,
+        subHeading="",
+        imageUrls=[],
+        categories=[],
+        isFeatured=False,
+        isDeleted=False,
+        version=1,
+        storeListingId="listing-id",
+        submissionStatus=prisma.enums.SubmissionStatus.PENDING,
+        isAvailable=True,
+        submittedAt=now,
+        StoreListing=mock_store_listing_obj,
+    )

    # Mock prisma calls
    mock_agent_graph = mocker.patch("prisma.models.AgentGraph.prisma")
    mock_agent_graph.return_value.find_first = mocker.AsyncMock(return_value=mock_agent)

-    mock_store_listing = mocker.patch("prisma.models.StoreListing.prisma")
-    mock_store_listing.return_value.find_first = mocker.AsyncMock(return_value=None)
-    mock_store_listing.return_value.create = mocker.AsyncMock(return_value=mock_listing)
+    # Mock transaction context manager
+    mock_tx = mocker.MagicMock()
+    mocker.patch(
+        "backend.api.features.store.db.transaction",
+        return_value=mocker.AsyncMock(
+            __aenter__=mocker.AsyncMock(return_value=mock_tx),
+            __aexit__=mocker.AsyncMock(return_value=False),
+        ),
+    )
+
+    mock_sl = mocker.patch("prisma.models.StoreListing.prisma")
+    mock_sl.return_value.find_unique = mocker.AsyncMock(return_value=None)
+
+    mock_slv = mocker.patch("prisma.models.StoreListingVersion.prisma")
+    mock_slv.return_value.create = mocker.AsyncMock(return_value=mock_version)

    # Call function
    result = await db.create_store_submission(
        user_id="user-id",
-        agent_id="agent-id",
-        agent_version=1,
+        graph_id="agent-id",
+        graph_version=1,
        slug="test-agent",
        name="Test Agent",
        description="Test description",
@@ -281,11 +266,11 @@ async def test_create_store_submission(mocker):
    # Verify results
    assert result.name == "Test Agent"
    assert result.description == "Test description"
-    assert result.store_listing_version_id == "version-id"
+    assert result.listing_version_id == "version-id"

    # Verify mocks called correctly
    mock_agent_graph.return_value.find_first.assert_called_once()
-    mock_store_listing.return_value.create.assert_called_once()
+    mock_slv.return_value.create.assert_called_once()


@pytest.mark.asyncio(loop_scope="session")
@@ -318,7 +303,6 @@ async def test_update_profile(mocker):
        description="Test description",
        links=["link1"],
        avatar_url="avatar.jpg",
-        is_featured=False,
    )

    # Call function
@@ -389,7 +373,7 @@ async def test_get_store_agents_with_search_and_filters_parameterized():
        creators=["creator1'; DROP TABLE Users; --", "creator2"],
        category="AI'; DELETE FROM StoreAgent; --",
        featured=True,
-        sorted_by="rating",
+        sorted_by=db.StoreAgentsSortOptions.RATING,
        page=1,
        page_size=20,
    )
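
The transaction mock in `test_create_store_submission` above relies on the mocked context manager answering `__aenter__` and `__aexit__` as awaitables. A minimal standalone sketch of that pattern follows. It uses only the standard library's `unittest.mock` rather than pytest-mock's `mocker`, and `create_in_transaction`, `transaction_factory`, and `tx.record` are hypothetical names made up for illustration, not part of the commit:

```python
import asyncio
from unittest.mock import MagicMock


async def create_in_transaction(transaction_factory, payload: dict):
    """Hypothetical stand-in for code that does its work inside `async with transaction()`."""
    async with transaction_factory() as tx:
        tx.record(payload)  # hypothetical call made on the transaction object
        return tx


def run_demo():
    mock_tx = MagicMock()

    # MagicMock pre-configures __aenter__/__aexit__ as AsyncMock (Python 3.8+),
    # so only their return values need to be set.
    context_manager = MagicMock()
    context_manager.__aenter__.return_value = mock_tx
    context_manager.__aexit__.return_value = False  # falsy: do not swallow exceptions

    # The factory plays the role of the patched `transaction` callable.
    transaction_factory = MagicMock(return_value=context_manager)

    result = asyncio.run(
        create_in_transaction(transaction_factory, {"slug": "test-agent"})
    )
    return result, mock_tx, context_manager


if __name__ == "__main__":
    result, mock_tx, _ = run_demo()
    print(result is mock_tx)
```

The object bound by `as tx` is whatever `__aenter__` resolves to, which is why the test can hand back a plain `MagicMock` and never touch a real database transaction.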
--- a/autogpt_platform/backend/backend/api/features/store/embeddings.py
+++ b/autogpt_platform/backend/backend/api/features/store/embeddings.py
@@ -15,6 +15,7 @@ from prisma.enums import ContentType
 from tiktoken import encoding_for_model

 from backend.api.features.store.content_handlers import CONTENT_HANDLERS
+from backend.blocks import get_blocks
 from backend.data.db import execute_raw_with_schema, query_raw_with_schema
 from backend.util.clients import get_openai_client
 from backend.util.json import dumps
@@ -662,8 +663,6 @@ async def cleanup_orphaned_embeddings() -> dict[str, Any]:
                )
                current_ids = {row["id"] for row in valid_agents}
            elif content_type == ContentType.BLOCK:
-                from backend.blocks import get_blocks
-
                current_ids = set(get_blocks().keys())
            elif content_type == ContentType.DOCUMENTATION:
                # Use DocumentationHandler to get section-based content IDs
--- a/autogpt_platform/backend/backend/api/features/store/exceptions.py
+++ b/autogpt_platform/backend/backend/api/features/store/exceptions.py
@@ -57,12 +57,6 @@ class StoreError(ValueError):
    pass


-class AgentNotFoundError(NotFoundError):
-    """Raised when an agent is not found"""
-
-    pass
-
-
 class CreatorNotFoundError(NotFoundError):
    """Raised when a creator is not found"""

--- a/autogpt_platform/backend/backend/api/features/store/hybrid_search.py
+++ b/autogpt_platform/backend/backend/api/features/store/hybrid_search.py
@@ -31,12 +31,10 @@ logger = logging.getLogger(__name__)


 def tokenize(text: str) -> list[str]:
-    """Simple tokenizer for BM25 - lowercase and split on non-alphanumeric."""
+    """Tokenize text for BM25."""
    if not text:
        return []
-    # Lowercase and split on non-alphanumeric characters
-    tokens = re.findall(r"\b\w+\b", text.lower())
-    return tokens
+    return re.findall(r"\b\w+\b", text.lower())


 def bm25_rerank(
@@ -568,7 +566,7 @@ async def hybrid_search(
            SELECT uce."contentId" as "storeListingVersionId"
            FROM {{schema_prefix}}"UnifiedContentEmbedding" uce
            INNER JOIN {{schema_prefix}}"StoreAgent" sa
-                ON uce."contentId" = sa."storeListingVersionId"
+                ON uce."contentId" = sa.listing_version_id
            WHERE uce."contentType" = 'STORE_AGENT'::{{schema_prefix}}"ContentType"
            AND uce."userId" IS NULL
            AND uce.search @@ plainto_tsquery('english', {query_param})
@@ -582,7 +580,7 @@ async def hybrid_search(
                SELECT uce."contentId", uce.embedding
                FROM {{schema_prefix}}"UnifiedContentEmbedding" uce
                INNER JOIN {{schema_prefix}}"StoreAgent" sa
-                    ON uce."contentId" = sa."storeListingVersionId"
+                    ON uce."contentId" = sa.listing_version_id
                WHERE uce."contentType" = 'STORE_AGENT'::{{schema_prefix}}"ContentType"
                AND uce."userId" IS NULL
                AND {where_clause}
@@ -605,7 +603,7 @@ async def hybrid_search(
                sa.featured,
                sa.is_available,
                sa.updated_at,
-                sa."agentGraphId",
+                sa.graph_id,
                -- Searchable text for BM25 reranking
                COALESCE(sa.agent_name, '') || ' ' || COALESCE(sa.sub_heading, '') || ' ' || COALESCE(sa.description, '') as searchable_text,
                -- Semantic score
@@ -627,9 +625,9 @@ async def hybrid_search(
                sa.runs as popularity_raw
            FROM candidates c
            INNER JOIN {{schema_prefix}}"StoreAgent" sa
-                ON c."storeListingVersionId" = sa."storeListingVersionId"
+                ON c."storeListingVersionId" = sa.listing_version_id
            INNER JOIN {{schema_prefix}}"UnifiedContentEmbedding" uce
-                ON sa."storeListingVersionId" = uce."contentId"
+                ON sa.listing_version_id = uce."contentId"
                AND uce."contentType" = 'STORE_AGENT'::{{schema_prefix}}"ContentType"
        ),
        max_vals AS (
@@ -665,7 +663,7 @@ async def hybrid_search(
                featured,
                is_available,
                updated_at,
-                "agentGraphId",
+                graph_id,
                searchable_text,
                semantic_score,
                lexical_score,
--- a/autogpt_platform/backend/backend/api/features/store/hybrid_search_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/hybrid_search_test.py
@@ -14,9 +14,27 @@ from backend.api.features.store.hybrid_search import (
    HybridSearchWeights,
    UnifiedSearchWeights,
    hybrid_search,
+    tokenize,
    unified_hybrid_search,
 )

+# ---------------------------------------------------------------------------
+# tokenize (BM25)
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.parametrize(
+    "input_text, expected",
+    [
+        ("AITextGeneratorBlock", ["aitextgeneratorblock"]),
+        ("hello world", ["hello", "world"]),
+        ("", []),
+        ("HTTPRequest", ["httprequest"]),
+    ],
+)
+def test_tokenize(input_text: str, expected: list[str]):
+    assert tokenize(input_text) == expected
+

@pytest.mark.asyncio(loop_scope="session")
@pytest.mark.integration
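
The behaviour pinned down by the parametrized `test_tokenize` cases can be reproduced in isolation. A minimal sketch, assuming the same regex as the `tokenize` shown in the `hybrid_search.py` diff; the input with punctuation and an underscore is illustrative and not one of the commit's test cases:

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and return runs of word characters (same regex as the diff)."""
    if not text:
        return []
    return re.findall(r"\b\w+\b", text.lower())


# CamelCase identifiers are not split: the whole name becomes a single token.
print(tokenize("AITextGeneratorBlock"))

# Punctuation separates tokens, while underscores count as word characters
# (illustrative input, not taken from the commit).
print(tokenize("send_email, then wait!"))
```

Because a CamelCase block name stays one token, a query term such as `generator` is not equal to the token produced for `AITextGeneratorBlock`; the new test cases document this rather than change it.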
--- a/autogpt_platform/backend/backend/api/features/store/model.py
+++ b/autogpt_platform/backend/backend/api/features/store/model.py
@@ -1,11 +1,14 @@
 import datetime
-from typing import List
+from typing import TYPE_CHECKING, List, Self

 import prisma.enums
 import pydantic

 from backend.util.models import Pagination

+if TYPE_CHECKING:
+    import prisma.models
+

 class ChangelogEntry(pydantic.BaseModel):
    version: str
@@ -13,9 +16,9 @@ class ChangelogEntry(pydantic.BaseModel):
    date: datetime.datetime


-class MyAgent(pydantic.BaseModel):
-    agent_id: str
-    agent_version: int
+class MyUnpublishedAgent(pydantic.BaseModel):
+    graph_id: str
+    graph_version: int
    agent_name: str
    agent_image: str | None = None
    description: str
@@ -23,8 +26,8 @@ class MyAgent(pydantic.BaseModel):
    recommended_schedule_cron: str | None = None


-class MyAgentsResponse(pydantic.BaseModel):
-    agents: list[MyAgent]
+class MyUnpublishedAgentsResponse(pydantic.BaseModel):
+    agents: list[MyUnpublishedAgent]
    pagination: Pagination


@@ -40,6 +43,21 @@ class StoreAgent(pydantic.BaseModel):
    rating: float
    agent_graph_id: str

+    @classmethod
+    def from_db(cls, agent: "prisma.models.StoreAgent") -> "StoreAgent":
+        return cls(
+            slug=agent.slug,
+            agent_name=agent.agent_name,
+            agent_image=agent.agent_image[0] if agent.agent_image else "",
+            creator=agent.creator_username or "Needs Profile",
+            creator_avatar=agent.creator_avatar or "",
+            sub_heading=agent.sub_heading,
+            description=agent.description,
+            runs=agent.runs,
+            rating=agent.rating,
+            agent_graph_id=agent.graph_id,
+        )
+

 class StoreAgentsResponse(pydantic.BaseModel):
    agents: list[StoreAgent]
@@ -62,81 +80,192 @@ class StoreAgentDetails(pydantic.BaseModel):
    runs: int
    rating: float
    versions: list[str]
-    agentGraphVersions: list[str]
-    agentGraphId: str
+    graph_id: str
+    graph_versions: list[str]
    last_updated: datetime.datetime
    recommended_schedule_cron: str | None = None

-    active_version_id: str | None = None
-    has_approved_version: bool = False
+    active_version_id: str
+    has_approved_version: bool

    # Optional changelog data when include_changelog=True
    changelog: list[ChangelogEntry] | None = None

-
-class Creator(pydantic.BaseModel):
-    name: str
-    username: str
-    description: str
-    avatar_url: str
-    num_agents: int
-    agent_rating: float
-    agent_runs: int
-    is_featured: bool
-
-
-class CreatorsResponse(pydantic.BaseModel):
-    creators: List[Creator]
-    pagination: Pagination
-
-
-class CreatorDetails(pydantic.BaseModel):
-    name: str
-    username: str
-    description: str
-    links: list[str]
-    avatar_url: str
-    agent_rating: float
-    agent_runs: int
-    top_categories: list[str]
+    @classmethod
+    def from_db(cls, agent: "prisma.models.StoreAgent") -> "StoreAgentDetails":
+        return cls(
+            store_listing_version_id=agent.listing_version_id,
+            slug=agent.slug,
+            agent_name=agent.agent_name,
+            agent_video=agent.agent_video or "",
+            agent_output_demo=agent.agent_output_demo or "",
+            agent_image=agent.agent_image,
+            creator=agent.creator_username or "",
+            creator_avatar=agent.creator_avatar or "",
+            sub_heading=agent.sub_heading,
+            description=agent.description,
+            categories=agent.categories,
+            runs=agent.runs,
+            rating=agent.rating,
+            versions=agent.versions,
+            graph_id=agent.graph_id,
+            graph_versions=agent.graph_versions,
+            last_updated=agent.updated_at,
+            recommended_schedule_cron=agent.recommended_schedule_cron,
+            active_version_id=agent.listing_version_id,
+            has_approved_version=True,  # StoreAgent view only has approved agents
+        )


 class Profile(pydantic.BaseModel):
-    name: str
+    """Marketplace user profile (only attributes that the user can update)"""
+
    username: str
+    name: str
    description: str
+    avatar_url: str | None
    links: list[str]
-    avatar_url: str
-    is_featured: bool = False
+
+
+class ProfileDetails(Profile):
+    """Marketplace user profile (including read-only fields)"""
+
+    is_featured: bool
+
+    @classmethod
+    def from_db(cls, profile: "prisma.models.Profile") -> "ProfileDetails":
+        return cls(
+            name=profile.name,
+            username=profile.username,
+            avatar_url=profile.avatarUrl,
+            description=profile.description,
+            links=profile.links,
+            is_featured=profile.isFeatured,
+        )
+
+
+class CreatorDetails(ProfileDetails):
+    """Marketplace creator profile details, including aggregated stats"""
+
+    num_agents: int
+    agent_runs: int
+    agent_rating: float
+    top_categories: list[str]
+
+    @classmethod
+    def from_db(cls, creator: "prisma.models.Creator") -> "CreatorDetails":  # type: ignore[override]
+        return cls(
+            name=creator.name,
+            username=creator.username,
+            avatar_url=creator.avatar_url,
+            description=creator.description,
+            links=creator.links,
+            is_featured=creator.is_featured,
+            num_agents=creator.num_agents,
+            agent_runs=creator.agent_runs,
+            agent_rating=creator.agent_rating,
+            top_categories=creator.top_categories,
+        )
+
+
+class CreatorsResponse(pydantic.BaseModel):
+    creators: List[CreatorDetails]
+    pagination: Pagination


 class StoreSubmission(pydantic.BaseModel):
+    # From StoreListing:
    listing_id: str
-    agent_id: str
-    agent_version: int
+    user_id: str
+    slug: str
+
+    # From StoreListingVersion:
+    listing_version_id: str
+    listing_version: int
+    graph_id: str
+    graph_version: int
    name: str
    sub_heading: str
-    slug: str
    description: str
-    instructions: str | None = None
+    instructions: str | None
+    categories: list[str]
    image_urls: list[str]
-    date_submitted: datetime.datetime
-    status: prisma.enums.SubmissionStatus
-    runs: int
-    rating: float
-    store_listing_version_id: str | None = None
-    version: int | None = None  # Actual version number from the database
+    video_url: str | None
+    agent_output_demo_url: str | None

+    submitted_at: datetime.datetime | None
+    changes_summary: str | None
+    status: prisma.enums.SubmissionStatus
+    reviewed_at: datetime.datetime | None = None
    reviewer_id: str | None = None
    review_comments: str | None = None  # External comments visible to creator
-    internal_comments: str | None = None  # Private notes for admin use only
-    reviewed_at: datetime.datetime | None = None
-    changes_summary: str | None = None

-    # Additional fields for editing
-    video_url: str | None = None
-    agent_output_demo_url: str | None = None
-    categories: list[str] = []
+    # Aggregated from AgentGraphExecutions and StoreListingReviews:
+    run_count: int = 0
+    review_count: int = 0
+    review_avg_rating: float = 0.0
+
+    @classmethod
+    def from_db(cls, _sub: "prisma.models.StoreSubmission") -> Self:
+        """Construct from the StoreSubmission Prisma view."""
+        return cls(
+            listing_id=_sub.listing_id,
+            user_id=_sub.user_id,
+            slug=_sub.slug,
+            listing_version_id=_sub.listing_version_id,
+            listing_version=_sub.listing_version,
+            graph_id=_sub.graph_id,
+            graph_version=_sub.graph_version,
+            name=_sub.name,
+            sub_heading=_sub.sub_heading,
+            description=_sub.description,
+            instructions=_sub.instructions,
+            categories=_sub.categories,
+            image_urls=_sub.image_urls,
+            video_url=_sub.video_url,
+            agent_output_demo_url=_sub.agent_output_demo_url,
+            submitted_at=_sub.submitted_at,
+            changes_summary=_sub.changes_summary,
+            status=_sub.status,
+            reviewed_at=_sub.reviewed_at,
+            reviewer_id=_sub.reviewer_id,
+            review_comments=_sub.review_comments,
+            run_count=_sub.run_count,
+            review_count=_sub.review_count,
+            review_avg_rating=_sub.review_avg_rating,
+        )
+
+    @classmethod
+    def from_listing_version(cls, _lv: "prisma.models.StoreListingVersion") -> Self:
+        """
+        Construct from the StoreListingVersion Prisma model (with StoreListing included)
+        """
+        if not (_l := _lv.StoreListing):
+            raise ValueError("StoreListingVersion must have included StoreListing")
+
+        return cls(
+            listing_id=_l.id,
+            user_id=_l.owningUserId,
+            slug=_l.slug,
+            listing_version_id=_lv.id,
+            listing_version=_lv.version,
+            graph_id=_lv.agentGraphId,
+            graph_version=_lv.agentGraphVersion,
+            name=_lv.name,
+            sub_heading=_lv.subHeading,
+            description=_lv.description,
+            instructions=_lv.instructions,
+            categories=_lv.categories,
+            image_urls=_lv.imageUrls,
+            video_url=_lv.videoUrl,
+            agent_output_demo_url=_lv.agentOutputDemoUrl,
+            submitted_at=_lv.submittedAt,
+            changes_summary=_lv.changesSummary,
+            status=_lv.submissionStatus,
+            reviewed_at=_lv.reviewedAt,
+            reviewer_id=_lv.reviewerId,
+            review_comments=_lv.reviewComments,
+        )


 class StoreSubmissionsResponse(pydantic.BaseModel):
@@ -144,33 +273,12 @@ class StoreSubmissionsResponse(pydantic.BaseModel):
    pagination: Pagination


-class StoreListingWithVersions(pydantic.BaseModel):
-    """A store listing with its version history"""
-
-    listing_id: str
-    slug: str
-    agent_id: str
-    agent_version: int
-    active_version_id: str | None = None
-    has_approved_version: bool = False
-    creator_email: str | None = None
-    latest_version: StoreSubmission | None = None
-    versions: list[StoreSubmission] = []
-
-
-class StoreListingsWithVersionsResponse(pydantic.BaseModel):
-    """Response model for listings with version history"""
-
-    listings: list[StoreListingWithVersions]
-    pagination: Pagination
-
-
 class StoreSubmissionRequest(pydantic.BaseModel):
-    agent_id: str = pydantic.Field(
-        ..., min_length=1, description="Agent ID cannot be empty"
+    graph_id: str = pydantic.Field(
+        ..., min_length=1, description="Graph ID cannot be empty"
    )
-    agent_version: int = pydantic.Field(
-        ..., gt=0, description="Agent version must be greater than 0"
+    graph_version: int = pydantic.Field(
+        ..., gt=0, description="Graph version must be greater than 0"
    )
    slug: str
    name: str
@@ -198,12 +306,42 @@ class StoreSubmissionEditRequest(pydantic.BaseModel):
    recommended_schedule_cron: str | None = None


-class ProfileDetails(pydantic.BaseModel):
-    name: str
-    username: str
-    description: str
-    links: list[str]
-    avatar_url: str | None = None
+class StoreSubmissionAdminView(StoreSubmission):
+    internal_comments: str | None  # Private admin notes
+
+    @classmethod
+    def from_db(cls, _sub: "prisma.models.StoreSubmission") -> Self:
+        return cls(
+            **StoreSubmission.from_db(_sub).model_dump(),
+            internal_comments=_sub.internal_comments,
+        )
+
+    @classmethod
+    def from_listing_version(cls, _lv: "prisma.models.StoreListingVersion") -> Self:
+        return cls(
+            **StoreSubmission.from_listing_version(_lv).model_dump(),
+            internal_comments=_lv.internalComments,
+        )
+
+
+class StoreListingWithVersionsAdminView(pydantic.BaseModel):
+    """A store listing with its version history"""
+
+    listing_id: str
+    graph_id: str
+    slug: str
+    active_listing_version_id: str | None = None
+    has_approved_version: bool = False
+    creator_email: str | None = None
+    latest_version: StoreSubmissionAdminView | None = None
+    versions: list[StoreSubmissionAdminView] = []
+
+
+class StoreListingsWithVersionsAdminViewResponse(pydantic.BaseModel):
+    """Response model for listings with version history"""
+
+    listings: list[StoreListingWithVersionsAdminView]
+    pagination: Pagination


 class StoreReview(pydantic.BaseModel):
--- a/autogpt_platform/backend/backend/api/features/store/model_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/model_test.py
@@ -1,203 +0,0 @@
-import datetime
-
-import prisma.enums
-
-from . import model as store_model
-
-
-def test_pagination():
-    pagination = store_model.Pagination(
-        total_items=100, total_pages=5, current_page=2, page_size=20
-    )
-    assert pagination.total_items == 100
-    assert pagination.total_pages == 5
-    assert pagination.current_page == 2
-    assert pagination.page_size == 20
-
-
-def test_store_agent():
-    agent = store_model.StoreAgent(
-        slug="test-agent",
-        agent_name="Test Agent",
-        agent_image="test.jpg",
-        creator="creator1",
-        creator_avatar="avatar.jpg",
-        sub_heading="Test subheading",
-        description="Test description",
-        runs=50,
-        rating=4.5,
-        agent_graph_id="test-graph-id",
-    )
-    assert agent.slug == "test-agent"
-    assert agent.agent_name == "Test Agent"
-    assert agent.runs == 50
-    assert agent.rating == 4.5
-    assert agent.agent_graph_id == "test-graph-id"
-
-
-def test_store_agents_response():
-    response = store_model.StoreAgentsResponse(
-        agents=[
-            store_model.StoreAgent(
-                slug="test-agent",
-                agent_name="Test Agent",
-                agent_image="test.jpg",
-                creator="creator1",
-                creator_avatar="avatar.jpg",
-                sub_heading="Test subheading",
-                description="Test description",
-                runs=50,
-                rating=4.5,
-                agent_graph_id="test-graph-id",
-            )
-        ],
-        pagination=store_model.Pagination(
-            total_items=1, total_pages=1, current_page=1, page_size=20
-        ),
-    )
-    assert len(response.agents) == 1
-    assert response.pagination.total_items == 1
-
-
-def test_store_agent_details():
-    details = store_model.StoreAgentDetails(
-        store_listing_version_id="version123",
-        slug="test-agent",
-        agent_name="Test Agent",
-        agent_video="video.mp4",
-        agent_output_demo="demo.mp4",
-        agent_image=["image1.jpg", "image2.jpg"],
-        creator="creator1",
-        creator_avatar="avatar.jpg",
-        sub_heading="Test subheading",
-        description="Test description",
-        categories=["cat1", "cat2"],
-        runs=50,
-        rating=4.5,
-        versions=["1.0", "2.0"],
-        agentGraphVersions=["1", "2"],
-        agentGraphId="test-graph-id",
-        last_updated=datetime.datetime.now(),
-    )
-    assert details.slug == "test-agent"
-    assert len(details.agent_image) == 2
-    assert len(details.categories) == 2
-    assert len(details.versions) == 2
-
-
-def test_creator():
-    creator = store_model.Creator(
-        agent_rating=4.8,
-        agent_runs=1000,
-        name="Test Creator",
-        username="creator1",
-        description="Test description",
-        avatar_url="avatar.jpg",
-        num_agents=5,
-        is_featured=False,
-    )
-    assert creator.name == "Test Creator"
-    assert creator.num_agents == 5
-
-
-def test_creators_response():
-    response = store_model.CreatorsResponse(
-        creators=[
-            store_model.Creator(
-                agent_rating=4.8,
-                agent_runs=1000,
-                name="Test Creator",
-                username="creator1",
-                description="Test description",
-                avatar_url="avatar.jpg",
-                num_agents=5,
-                is_featured=False,
-            )
-        ],
-        pagination=store_model.Pagination(
-            total_items=1, total_pages=1, current_page=1, page_size=20
-        ),
-    )
-    assert len(response.creators) == 1
-    assert response.pagination.total_items == 1
-
-
-def test_creator_details():
-    details = store_model.CreatorDetails(
-        name="Test Creator",
-        username="creator1",
-        description="Test description",
-        links=["link1.com", "link2.com"],
-        avatar_url="avatar.jpg",
-        agent_rating=4.8,
-        agent_runs=1000,
-        top_categories=["cat1", "cat2"],
-    )
-    assert details.name == "Test Creator"
-    assert len(details.links) == 2
-    assert details.agent_rating == 4.8
-    assert len(details.top_categories) == 2
-
-
-def test_store_submission():
-    submission = store_model.StoreSubmission(
-        listing_id="listing123",
-        agent_id="agent123",
-        agent_version=1,
-        sub_heading="Test subheading",
-        name="Test Agent",
-        slug="test-agent",
-        description="Test description",
-        image_urls=["image1.jpg", "image2.jpg"],
-        date_submitted=datetime.datetime(2023, 1, 1),
-        status=prisma.enums.SubmissionStatus.PENDING,
-        runs=50,
-        rating=4.5,
-    )
-    assert submission.name == "Test Agent"
-    assert len(submission.image_urls) == 2
-    assert submission.status == prisma.enums.SubmissionStatus.PENDING
-
-
-def test_store_submissions_response():
-    response = store_model.StoreSubmissionsResponse(
-        submissions=[
-            store_model.StoreSubmission(
-                listing_id="listing123",
-                agent_id="agent123",
-                agent_version=1,
-                sub_heading="Test subheading",
-                name="Test Agent",
-                slug="test-agent",
-                description="Test description",
-                image_urls=["image1.jpg"],
-                date_submitted=datetime.datetime(2023, 1, 1),
-                status=prisma.enums.SubmissionStatus.PENDING,
-                runs=50,
-                rating=4.5,
-            )
-        ],
-        pagination=store_model.Pagination(
-            total_items=1, total_pages=1, current_page=1, page_size=20
-        ),
-    )
-    assert len(response.submissions) == 1
-    assert response.pagination.total_items == 1
-
-
-def test_store_submission_request():
-    request = store_model.StoreSubmissionRequest(
-        agent_id="agent123",
-        agent_version=1,
-        slug="test-agent",
-        name="Test Agent",
-        sub_heading="Test subheading",
-        video_url="video.mp4",
-        image_urls=["image1.jpg", "image2.jpg"],
-        description="Test description",
-        categories=["cat1", "cat2"],
-    )
-    assert request.agent_id == "agent123"
-    assert request.agent_version == 1
-    assert len(request.image_urls) == 2
-    assert len(request.categories) == 2
--- a/autogpt_platform/backend/backend/api/features/store/routes.py
+++ b/autogpt_platform/backend/backend/api/features/store/routes.py
@@ -1,16 +1,16 @@
 import logging
-import tempfile
-import typing
 import urllib.parse
-from typing import Literal

 import autogpt_libs.auth
 import fastapi
 import fastapi.responses
 import prisma.enums
+from fastapi import Query, Security
+from pydantic import BaseModel

 import backend.data.graph
 import backend.util.json
+from backend.util.exceptions import NotFoundError
 from backend.util.models import Pagination

 from . import cache as store_cache
@@ -34,22 +34,15 @@ router = fastapi.APIRouter()
    "/profile",
    summary="Get user profile",
    tags=["store", "private"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
-    response_model=store_model.ProfileDetails,
+    dependencies=[Security(autogpt_libs.auth.requires_user)],
 )
 async def get_profile(
-    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
-):
-    """
-    Get the profile details for the authenticated user.
-    Cached for 1 hour per user.
-    """
+    user_id: str = Security(autogpt_libs.auth.get_user_id),
+) -> store_model.ProfileDetails:
+    """Get the profile details for the authenticated user."""
    profile = await store_db.get_user_profile(user_id)
    if profile is None:
-        return fastapi.responses.JSONResponse(
-            status_code=404,
-            content={"detail": "Profile not found"},
-        )
+        raise NotFoundError("User does not have a profile yet")
    return profile


@@ -57,98 +50,17 @@ async def get_profile(
    "/profile",
    summary="Update user profile",
    tags=["store", "private"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
-    response_model=store_model.CreatorDetails,
+    dependencies=[Security(autogpt_libs.auth.requires_user)],
 )
 async def update_or_create_profile(
    profile: store_model.Profile,
-    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
-):
-    """
-    Update the store profile for the authenticated user.
-
-    Args:
-        profile (Profile): The updated profile details
-        user_id (str): ID of the authenticated user
-
-    Returns:
-        CreatorDetails: The updated profile
-
-    Raises:
-        HTTPException: If there is an error updating the profile
-    """
+    user_id: str = Security(autogpt_libs.auth.get_user_id),
+) -> store_model.ProfileDetails:
+    """Update the store profile for the authenticated user."""
    updated_profile = await store_db.update_profile(user_id=user_id, profile=profile)
    return updated_profile


-##############################################
-############### Agent Endpoints ##############
-##############################################
-
-
-@router.get(
-    "/agents",
-    summary="List store agents",
-    tags=["store", "public"],
-    response_model=store_model.StoreAgentsResponse,
-)
-async def get_agents(
-    featured: bool = False,
-    creator: str | None = None,
-    sorted_by: Literal["rating", "runs", "name", "updated_at"] | None = None,
-    search_query: str | None = None,
-    category: str | None = None,
-    page: int = 1,
-    page_size: int = 20,
-):
-    """
-    Get a paginated list of agents from the store with optional filtering and sorting.
-
-    Args:
-        featured (bool, optional): Filter to only show featured agents. Defaults to False.
-        creator (str | None, optional): Filter agents by creator username. Defaults to None.
-        sorted_by (str | None, optional): Sort agents by "runs" or "rating". Defaults to None.
-        search_query (str | None, optional): Search agents by name, subheading and description. Defaults to None.
-        category (str | None, optional): Filter agents by category. Defaults to None.
-        page (int, optional): Page number for pagination. Defaults to 1.
-        page_size (int, optional): Number of agents per page. Defaults to 20.
-
-    Returns:
-        StoreAgentsResponse: Paginated list of agents matching the filters
-
-    Raises:
-        HTTPException: If page or page_size are less than 1
-
-    Used for:
-    - Home Page Featured Agents
-    - Home Page Top Agents
-    - Search Results
-    - Agent Details - Other Agents By Creator
-    - Agent Details - Similar Agents
-    - Creator Details - Agents By Creator
-    """
-    if page < 1:
-        raise fastapi.HTTPException(
-            status_code=422, detail="Page must be greater than 0"
-        )
-
-    if page_size < 1:
-        raise fastapi.HTTPException(
-            status_code=422, detail="Page size must be greater than 0"
-        )
-
-    agents = await store_cache._get_cached_store_agents(
-        featured=featured,
-        creator=creator,
-        sorted_by=sorted_by,
-        search_query=search_query,
-        category=category,
-        page=page,
-        page_size=page_size,
-    )
-    return agents
-
-
 ##############################################
 ############### Search Endpoints #############
 ##############################################
@@ -158,60 +70,30 @@ async def get_agents(
    "/search",
    summary="Unified search across all content types",
    tags=["store", "public"],
-    response_model=store_model.UnifiedSearchResponse,
 )
 async def unified_search(
    query: str,
-    content_types: list[str] | None = fastapi.Query(
+    content_types: list[prisma.enums.ContentType] | None = Query(
        default=None,
-        description="Content types to search: STORE_AGENT, BLOCK, DOCUMENTATION. If not specified, searches all.",
+        description="Content types to search. If not specified, searches all.",
    ),
-    page: int = 1,
-    page_size: int = 20,
-    user_id: str | None = fastapi.Security(
+    page: int = Query(ge=1, default=1),
+    page_size: int = Query(ge=1, default=20),
+    user_id: str | None = Security(
        autogpt_libs.auth.get_optional_user_id, use_cache=False
    ),
-):
+) -> store_model.UnifiedSearchResponse:
    """
-    Search across all content types (store agents, blocks, documentation) using hybrid search.
+    Search across all content types (marketplace agents, blocks, documentation)
+    using hybrid search.

    Combines semantic (embedding-based) and lexical (text-based) search for best results.
-
-    Args:
-        query: The search query string
-        content_types: Optional list of content types to filter by (STORE_AGENT, BLOCK, DOCUMENTATION)
-        page: Page number for pagination (default 1)
-        page_size: Number of results per page (default 20)
-        user_id: Optional authenticated user ID (for user-scoped content in future)
-
-    Returns:
-        UnifiedSearchResponse: Paginated list of search results with relevance scores
    """
-    if page < 1:
-        raise fastapi.HTTPException(
-            status_code=422, detail="Page must be greater than 0"
-        )
-
-    if page_size < 1:
-        raise fastapi.HTTPException(
-            status_code=422, detail="Page size must be greater than 0"
-        )
-
-    # Convert string content types to enum
-    content_type_enums: list[prisma.enums.ContentType] | None = None
-    if content_types:
-        try:
-            content_type_enums = [prisma.enums.ContentType(ct) for ct in content_types]
-        except ValueError as e:
-            raise fastapi.HTTPException(
-                status_code=422,
-                detail=f"Invalid content type. Valid values: STORE_AGENT, BLOCK, DOCUMENTATION. Error: {e}",
-            )

    # Perform unified hybrid search
    results, total = await store_hybrid_search.unified_hybrid_search(
        query=query,
-        content_types=content_type_enums,
+        content_types=content_types,
        user_id=user_id,
        page=page,
        page_size=page_size,
@@ -245,22 +127,69 @@ async def unified_search(
    )


+##############################################
+############### Agent Endpoints ##############
+##############################################
+
+
+@router.get(
+    "/agents",
+    summary="List store agents",
+    tags=["store", "public"],
+)
+async def get_agents(
+    featured: bool = Query(
+        default=False, description="Filter to only show featured agents"
+    ),
+    creator: str | None = Query(
+        default=None, description="Filter agents by creator username"
+    ),
+    category: str | None = Query(default=None, description="Filter agents by category"),
+    search_query: str | None = Query(
+        default=None, description="Literal + semantic search on names and descriptions"
+    ),
+    sorted_by: store_db.StoreAgentsSortOptions | None = Query(
+        default=None,
+        description="Property to sort results by. Ignored if search_query is provided.",
+    ),
+    page: int = Query(ge=1, default=1),
+    page_size: int = Query(ge=1, default=20),
+) -> store_model.StoreAgentsResponse:
+    """
+    Get a paginated list of agents from the marketplace,
+    with optional filtering and sorting.
+
+    Used for:
+    - Home Page Featured Agents
+    - Home Page Top Agents
+    - Search Results
+    - Agent Details - Other Agents By Creator
+    - Agent Details - Similar Agents
+    - Creator Details - Agents By Creator
+    """
+    agents = await store_cache._get_cached_store_agents(
+        featured=featured,
+        creator=creator,
+        sorted_by=sorted_by,
+        search_query=search_query,
+        category=category,
+        page=page,
+        page_size=page_size,
+    )
+    return agents
+
+
@router.get(
    "/agents/{username}/{agent_name}",
    summary="Get specific agent",
    tags=["store", "public"],
-    response_model=store_model.StoreAgentDetails,
 )
-async def get_agent(
+async def get_agent_by_name(
    username: str,
    agent_name: str,
-    include_changelog: bool = fastapi.Query(default=False),
-):
-    """
-    This is only used on the AgentDetails Page.
-
-    It returns the store listing agents details.
-    """
+    include_changelog: bool = Query(default=False),
+) -> store_model.StoreAgentDetails:
+    """Get details of a marketplace agent."""
    username = urllib.parse.unquote(username).lower()
    # URL decode the agent name since it comes from the URL path
    agent_name = urllib.parse.unquote(agent_name).lower()
@@ -270,76 +199,79 @@ async def get_agent(
    return agent


-@router.get(
-    "/graph/{store_listing_version_id}",
-    summary="Get agent graph",
-    tags=["store"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
-)
-async def get_graph_meta_by_store_listing_version_id(
-    store_listing_version_id: str,
-) -> backend.data.graph.GraphModelWithoutNodes:
-    """
-    Get Agent Graph from Store Listing Version ID.
-    """
-    graph = await store_db.get_available_graph(store_listing_version_id)
-    return graph
-
-
-@router.get(
-    "/agents/{store_listing_version_id}",
-    summary="Get agent by version",
-    tags=["store"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
-    response_model=store_model.StoreAgentDetails,
-)
-async def get_store_agent(store_listing_version_id: str):
-    """
-    Get Store Agent Details from Store Listing Version ID.
-    """
-    agent = await store_db.get_store_agent_by_version_id(store_listing_version_id)
-
-    return agent
-
-
@router.post(
    "/agents/{username}/{agent_name}/review",
    summary="Create agent review",
    tags=["store"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
-    response_model=store_model.StoreReview,
+    dependencies=[Security(autogpt_libs.auth.requires_user)],
 )
-async def create_review(
+async def post_user_review_for_agent(
    username: str,
    agent_name: str,
    review: store_model.StoreReviewCreate,
-    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
-):
-    """
-    Create a review for a store agent.
-
-    Args:
-        username: Creator's username
-        agent_name: Name/slug of the agent
-        review: Review details including score and optional comments
-        user_id: ID of authenticated user creating the review
-
-    Returns:
-        The created review
-    """
+    user_id: str = Security(autogpt_libs.auth.get_user_id),
+) -> store_model.StoreReview:
+    """Post a user review on a marketplace agent listing."""
    username = urllib.parse.unquote(username).lower()
    agent_name = urllib.parse.unquote(agent_name).lower()
-    # Create the review
+
    created_review = await store_db.create_store_review(
        user_id=user_id,
        store_listing_version_id=review.store_listing_version_id,
        score=review.score,
        comments=review.comments,
    )
-
    return created_review


+@router.get(
+    "/listings/versions/{store_listing_version_id}",
+    summary="Get agent by version",
+    tags=["store"],
+    dependencies=[Security(autogpt_libs.auth.requires_user)],
+)
+async def get_agent_by_listing_version(
+    store_listing_version_id: str,
+) -> store_model.StoreAgentDetails:
+    agent = await store_db.get_store_agent_by_version_id(store_listing_version_id)
+    return agent
+
+
+@router.get(
+    "/listings/versions/{store_listing_version_id}/graph",
+    summary="Get agent graph",
+    tags=["store"],
+    dependencies=[Security(autogpt_libs.auth.requires_user)],
+)
+async def get_graph_meta_by_store_listing_version_id(
+    store_listing_version_id: str,
+) -> backend.data.graph.GraphModelWithoutNodes:
+    """Get outline of graph belonging to a specific marketplace listing version."""
+    graph = await store_db.get_available_graph(store_listing_version_id)
+    return graph
+
+
+@router.get(
+    "/listings/versions/{store_listing_version_id}/graph/download",
+    summary="Download agent file",
+    tags=["store", "public"],
+)
+async def download_agent_file(
+    store_listing_version_id: str,
+) -> fastapi.responses.Response:
+    """Download agent graph file for a specific marketplace listing version."""
+    graph_data = await store_db.get_agent(store_listing_version_id)
+    file_name = f"agent_{graph_data.id}_v{graph_data.version or 'latest'}.json"
+
+    return fastapi.responses.Response(
+        content=backend.util.json.dumps(graph_data),
+        media_type="application/json",
+        headers={
+            "Content-Disposition": f'attachment; filename="{file_name}"',
+        },
+    )
+
+
 ##############################################
 ############# Creator Endpoints #############
 ##############################################
@@ -349,37 +281,19 @@ async def create_review(
    "/creators",
    summary="List store creators",
    tags=["store", "public"],
-    response_model=store_model.CreatorsResponse,
 )
 async def get_creators(
-    featured: bool = False,
-    search_query: str | None = None,
-    sorted_by: Literal["agent_rating", "agent_runs", "num_agents"] | None = None,
-    page: int = 1,
-    page_size: int = 20,
-):
-    """
-    This is needed for:
-    - Home Page Featured Creators
-    - Search Results Page
-
-    ---
-
-    To support this functionality we need:
-    - featured: bool - to limit the list to just featured agents
-    - search_query: str - vector search based on the creators profile description.
-    - sorted_by: [agent_rating, agent_runs] -
-    """
-    if page < 1:
-        raise fastapi.HTTPException(
-            status_code=422, detail="Page must be greater than 0"
-        )
-
-    if page_size < 1:
-        raise fastapi.HTTPException(
-            status_code=422, detail="Page size must be greater than 0"
-        )
-
+    featured: bool = Query(
+        default=False, description="Filter to only show featured creators"
+    ),
+    search_query: str | None = Query(
+        default=None, description="Literal + semantic search on names and descriptions"
+    ),
+    sorted_by: store_db.StoreCreatorsSortOptions | None = None,
+    page: int = Query(ge=1, default=1),
+    page_size: int = Query(ge=1, default=20),
+) -> store_model.CreatorsResponse:
+    """List or search marketplace creators."""
    creators = await store_cache._get_cached_store_creators(
        featured=featured,
        search_query=search_query,
@@ -391,18 +305,12 @@ async def get_creators(


@router.get(
-    "/creator/{username}",
+    "/creators/{username}",
    summary="Get creator details",
    tags=["store", "public"],
-    response_model=store_model.CreatorDetails,
 )
-async def get_creator(
-    username: str,
-):
-    """
-    Get the details of a creator.
-    - Creator Details Page
-    """
+async def get_creator(username: str) -> store_model.CreatorDetails:
+    """Get details on a marketplace creator."""
    username = urllib.parse.unquote(username).lower()
    creator = await store_cache._get_cached_creator_details(username=username)
    return creator
@@ -414,20 +322,17 @@ async def get_creator(


@router.get(
-    "/myagents",
+    "/my-unpublished-agents",
    summary="Get my agents",
    tags=["store", "private"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
-    response_model=store_model.MyAgentsResponse,
+    dependencies=[Security(autogpt_libs.auth.requires_user)],
 )
-async def get_my_agents(
-    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
-    page: typing.Annotated[int, fastapi.Query(ge=1)] = 1,
-    page_size: typing.Annotated[int, fastapi.Query(ge=1)] = 20,
-):
-    """
-    Get user's own agents.
-    """
+async def get_my_unpublished_agents(
+    user_id: str = Security(autogpt_libs.auth.get_user_id),
+    page: int = Query(ge=1, default=1),
+    page_size: int = Query(ge=1, default=20),
+) -> store_model.MyUnpublishedAgentsResponse:
+    """List the authenticated user's unpublished agents."""
    agents = await store_db.get_my_agents(user_id, page=page, page_size=page_size)
    return agents

@@ -436,28 +341,17 @@ async def get_my_agents(
    "/submissions/{submission_id}",
    summary="Delete store submission",
    tags=["store", "private"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
-    response_model=bool,
+    dependencies=[Security(autogpt_libs.auth.requires_user)],
 )
 async def delete_submission(
    submission_id: str,
-    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
-):
-    """
-    Delete a store listing submission.
-
-    Args:
-        user_id (str): ID of the authenticated user
-        submission_id (str): ID of the submission to be deleted
-
-    Returns:
-        bool: True if the submission was successfully deleted, False otherwise
-    """
+    user_id: str = Security(autogpt_libs.auth.get_user_id),
+) -> bool:
+    """Delete a marketplace listing submission."""
    result = await store_db.delete_store_submission(
        user_id=user_id,
        submission_id=submission_id,
    )
-
    return result


@@ -465,37 +359,14 @@ async def delete_submission(
    "/submissions",
    summary="List my submissions",
    tags=["store", "private"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
-    response_model=store_model.StoreSubmissionsResponse,
+    dependencies=[Security(autogpt_libs.auth.requires_user)],
 )
 async def get_submissions(
-    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
-    page: int = 1,
-    page_size: int = 20,
-):
-    """
-    Get a paginated list of store submissions for the authenticated user.
-
-    Args:
-        user_id (str): ID of the authenticated user
-        page (int, optional): Page number for pagination. Defaults to 1.
-        page_size (int, optional): Number of submissions per page. Defaults to 20.
-
-    Returns:
-        StoreListingsResponse: Paginated list of store submissions
-
-    Raises:
-        HTTPException: If page or page_size are less than 1
-    """
-    if page < 1:
-        raise fastapi.HTTPException(
-            status_code=422, detail="Page must be greater than 0"
-        )
-
-    if page_size < 1:
-        raise fastapi.HTTPException(
-            status_code=422, detail="Page size must be greater than 0"
-        )
+    user_id: str = Security(autogpt_libs.auth.get_user_id),
+    page: int = Query(ge=1, default=1),
+    page_size: int = Query(ge=1, default=20),
+) -> store_model.StoreSubmissionsResponse:
+    """List the authenticated user's marketplace listing submissions."""
    listings = await store_db.get_store_submissions(
        user_id=user_id,
        page=page,
@@ -508,30 +379,17 @@ async def get_submissions(
    "/submissions",
    summary="Create store submission",
    tags=["store", "private"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
-    response_model=store_model.StoreSubmission,
+    dependencies=[Security(autogpt_libs.auth.requires_user)],
 )
 async def create_submission(
    submission_request: store_model.StoreSubmissionRequest,
-    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
-):
-    """
-    Create a new store listing submission.
-
-    Args:
-        submission_request (StoreSubmissionRequest): The submission details
-        user_id (str): ID of the authenticated user submitting the listing
-
-    Returns:
-        StoreSubmission: The created store submission
-
-    Raises:
-        HTTPException: If there is an error creating the submission
-    """
+    user_id: str = Security(autogpt_libs.auth.get_user_id),
+) -> store_model.StoreSubmission:
+    """Submit a new marketplace listing for review."""
    result = await store_db.create_store_submission(
        user_id=user_id,
-        agent_id=submission_request.agent_id,
-        agent_version=submission_request.agent_version,
+        graph_id=submission_request.graph_id,
+        graph_version=submission_request.graph_version,
        slug=submission_request.slug,
        name=submission_request.name,
        video_url=submission_request.video_url,
@@ -544,7 +402,6 @@ async def create_submission(
        changes_summary=submission_request.changes_summary or "Initial Submission",
        recommended_schedule_cron=submission_request.recommended_schedule_cron,
    )
-
    return result


@@ -552,28 +409,14 @@ async def create_submission(
    "/submissions/{store_listing_version_id}",
    summary="Edit store submission",
    tags=["store", "private"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
-    response_model=store_model.StoreSubmission,
+    dependencies=[Security(autogpt_libs.auth.requires_user)],
 )
 async def edit_submission(
    store_listing_version_id: str,
    submission_request: store_model.StoreSubmissionEditRequest,
-    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
-):
-    """
-    Edit an existing store listing submission.
-
-    Args:
-        store_listing_version_id (str): ID of the store listing version to edit
-        submission_request (StoreSubmissionRequest): The updated submission details
-        user_id (str): ID of the authenticated user editing the listing
-
-    Returns:
-        StoreSubmission: The updated store submission
-
-    Raises:
-        HTTPException: If there is an error editing the submission
-    """
+    user_id: str = Security(autogpt_libs.auth.get_user_id),
+) -> store_model.StoreSubmission:
+    """Update a pending marketplace listing submission."""
    result = await store_db.edit_store_submission(
        user_id=user_id,
        store_listing_version_id=store_listing_version_id,
@@ -588,7 +431,6 @@ async def edit_submission(
        changes_summary=submission_request.changes_summary,
        recommended_schedule_cron=submission_request.recommended_schedule_cron,
    )
-
    return result


@@ -596,115 +438,61 @@ async def edit_submission(
    "/submissions/media",
    summary="Upload submission media",
    tags=["store", "private"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
+    dependencies=[Security(autogpt_libs.auth.requires_user)],
 )
 async def upload_submission_media(
    file: fastapi.UploadFile,
-    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
-):
-    """
-    Upload media (images/videos) for a store listing submission.
-
-    Args:
-        file (UploadFile): The media file to upload
-        user_id (str): ID of the authenticated user uploading the media
-
-    Returns:
-        str: URL of the uploaded media file
-
-    Raises:
-        HTTPException: If there is an error uploading the media
-    """
+    user_id: str = Security(autogpt_libs.auth.get_user_id),
+) -> str:
+    """Upload media for a marketplace listing submission."""
    media_url = await store_media.upload_media(user_id=user_id, file=file)
    return media_url


+class ImageURLResponse(BaseModel):
+    image_url: str
+
+
@router.post(
    "/submissions/generate_image",
    summary="Generate submission image",
    tags=["store", "private"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
+    dependencies=[Security(autogpt_libs.auth.requires_user)],
 )
 async def generate_image(
-    agent_id: str,
-    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
-) -> fastapi.responses.Response:
+    graph_id: str,
+    user_id: str = Security(autogpt_libs.auth.get_user_id),
+) -> ImageURLResponse:
    """
-    Generate an image for a store listing submission.
-
-    Args:
-        agent_id (str): ID of the agent to generate an image for
-        user_id (str): ID of the authenticated user
-
-    Returns:
-        JSONResponse: JSON containing the URL of the generated image
+    Generate an image for a marketplace listing submission based on the properties
+    of a given graph.
    """
-    agent = await backend.data.graph.get_graph(
-        graph_id=agent_id, version=None, user_id=user_id
+    graph = await backend.data.graph.get_graph(
+        graph_id=graph_id, version=None, user_id=user_id
    )

-    if not agent:
-        raise fastapi.HTTPException(
-            status_code=404, detail=f"Agent with ID {agent_id} not found"
-        )
+    if not graph:
+        raise NotFoundError(f"Agent graph #{graph_id} not found")
    # Use .jpeg here since we are generating JPEG images
-    filename = f"agent_{agent_id}.jpeg"
+    filename = f"agent_{graph_id}.jpeg"

    existing_url = await store_media.check_media_exists(user_id, filename)
    if existing_url:
-        logger.info(f"Using existing image for agent {agent_id}")
-        return fastapi.responses.JSONResponse(content={"image_url": existing_url})
+        logger.info(f"Using existing image for agent graph {graph_id}")
+        return ImageURLResponse(image_url=existing_url)
    # Generate agent image as JPEG
-    image = await store_image_gen.generate_agent_image(agent=agent)
+    image = await store_image_gen.generate_agent_image(agent=graph)

    # Create UploadFile with the correct filename and content_type
    image_file = fastapi.UploadFile(
        file=image,
        filename=filename,
    )
-
    image_url = await store_media.upload_media(
        user_id=user_id, file=image_file, use_file_name=True
    )

-    return fastapi.responses.JSONResponse(content={"image_url": image_url})
-
-
-@router.get(
-    "/download/agents/{store_listing_version_id}",
-    summary="Download agent file",
-    tags=["store", "public"],
-)
-async def download_agent_file(
-    store_listing_version_id: str = fastapi.Path(
-        ..., description="The ID of the agent to download"
-    ),
-) -> fastapi.responses.FileResponse:
-    """
-    Download the agent file by streaming its content.
-
-    Args:
-        store_listing_version_id (str): The ID of the agent to download
-
-    Returns:
-        StreamingResponse: A streaming response containing the agent's graph data.
-
-    Raises:
-        HTTPException: If the agent is not found or an unexpected error occurs.
-    """
-    graph_data = await store_db.get_agent(store_listing_version_id)
-    file_name = f"agent_{graph_data.id}_v{graph_data.version or 'latest'}.json"
-
-    # Sending graph as a stream (similar to marketplace v1)
-    with tempfile.NamedTemporaryFile(
-        mode="w", suffix=".json", delete=False
-    ) as tmp_file:
-        tmp_file.write(backend.util.json.dumps(graph_data))
-        tmp_file.flush()
-
-        return fastapi.responses.FileResponse(
-            tmp_file.name, filename=file_name, media_type="application/json"
-        )
+    return ImageURLResponse(image_url=image_url)


 ##############################################
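
The `download_agent_file` change above drops the temp-file `FileResponse` in favor of an in-memory `Response` with an explicit `Content-Disposition` header. The following is a minimal, stdlib-only sketch of how that file name and header are assembled. `build_download_headers` is a hypothetical helper used only for illustration (it is not part of the codebase), and it takes a plain ID and version instead of the graph object returned by `store_db.get_agent`.

```python
from typing import Optional


def build_download_headers(graph_id: str, version: Optional[int]) -> dict[str, str]:
    """Hypothetical helper mirroring the header logic in download_agent_file."""
    # Fall back to "latest" when the version is None or otherwise falsy,
    # matching the `graph_data.version or 'latest'` expression in the diff.
    file_name = f"agent_{graph_id}_v{version or 'latest'}.json"
    # "attachment" asks the client to save the body under file_name
    # instead of rendering it inline.
    return {"Content-Disposition": f'attachment; filename="{file_name}"'}


if __name__ == "__main__":
    print(build_download_headers("test-graph-id", 2))
    print(build_download_headers("test-graph-id", None))
```

Building the response body in memory avoids leaving a temporary file on disk, which the removed `tempfile.NamedTemporaryFile(..., delete=False)` call would have done.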
--- a/autogpt_platform/backend/backend/api/features/store/routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/routes_test.py
@@ -8,6 +8,8 @@ import pytest
 import pytest_mock
 from pytest_snapshot.plugin import Snapshot

+from backend.api.features.store.db import StoreAgentsSortOptions
+
 from . import model as store_model
 from . import routes as store_routes

@@ -196,7 +198,7 @@ def test_get_agents_sorted(
    mock_db_call.assert_called_once_with(
        featured=False,
        creators=None,
-        sorted_by="runs",
+        sorted_by=StoreAgentsSortOptions.RUNS,
        search_query=None,
        category=None,
        page=1,
@@ -380,9 +382,11 @@ def test_get_agent_details(
        runs=100,
        rating=4.5,
        versions=["1.0.0", "1.1.0"],
-        agentGraphVersions=["1", "2"],
-        agentGraphId="test-graph-id",
+        graph_versions=["1", "2"],
+        graph_id="test-graph-id",
        last_updated=FIXED_NOW,
+        active_version_id="test-version-id",
+        has_approved_version=True,
    )
    mock_db_call = mocker.patch("backend.api.features.store.db.get_store_agent_details")
    mock_db_call.return_value = mocked_value
@@ -435,15 +439,17 @@ def test_get_creators_pagination(
 ) -> None:
    mocked_value = store_model.CreatorsResponse(
        creators=[
-            store_model.Creator(
+            store_model.CreatorDetails(
                name=f"Creator {i}",
                username=f"creator{i}",
-                description=f"Creator {i} description",
                avatar_url=f"avatar{i}.jpg",
-                num_agents=1,
-                agent_rating=4.5,
-                agent_runs=100,
+                description=f"Creator {i} description",
+                links=[f"user{i}.link.com"],
                is_featured=False,
+                num_agents=1,
+                agent_runs=100,
+                agent_rating=4.5,
+                top_categories=["cat1", "cat2", "cat3"],
            )
            for i in range(5)
        ],
@@ -496,19 +502,19 @@ def test_get_creator_details(
    mocked_value = store_model.CreatorDetails(
        name="Test User",
        username="creator1",
+        avatar_url="avatar.jpg",
        description="Test creator description",
        links=["link1.com", "link2.com"],
-        avatar_url="avatar.jpg",
-        agent_rating=4.8,
+        is_featured=True,
+        num_agents=5,
        agent_runs=1000,
+        agent_rating=4.8,
        top_categories=["category1", "category2"],
    )
-    mock_db_call = mocker.patch(
-        "backend.api.features.store.db.get_store_creator_details"
-    )
+    mock_db_call = mocker.patch("backend.api.features.store.db.get_store_creator")
    mock_db_call.return_value = mocked_value

-    response = client.get("/creator/creator1")
+    response = client.get("/creators/creator1")
    assert response.status_code == 200

    data = store_model.CreatorDetails.model_validate(response.json())
@@ -528,19 +534,26 @@ def test_get_submissions_success(
        submissions=[
            store_model.StoreSubmission(
                listing_id="test-listing-id",
-                name="Test Agent",
-                description="Test agent description",
-                image_urls=["test.jpg"],
-                date_submitted=FIXED_NOW,
-                status=prisma.enums.SubmissionStatus.APPROVED,
-                runs=50,
-                rating=4.2,
-                agent_id="test-agent-id",
-                agent_version=1,
-                sub_heading="Test agent subheading",
+                user_id="test-user-id",
                slug="test-agent",
-                video_url="test.mp4",
+                listing_version_id="test-version-id",
+                listing_version=1,
+                graph_id="test-agent-id",
+                graph_version=1,
+                name="Test Agent",
+                sub_heading="Test agent subheading",
+                description="Test agent description",
+                instructions="Click the button!",
                categories=["test-category"],
+                image_urls=["test.jpg"],
+                video_url="test.mp4",
+                agent_output_demo_url="demo_video.mp4",
+                submitted_at=FIXED_NOW,
+                changes_summary="Initial Submission",
+                status=prisma.enums.SubmissionStatus.APPROVED,
+                run_count=50,
+                review_count=5,
+                review_avg_rating=4.2,
            )
        ],
        pagination=store_model.Pagination(
--- a/autogpt_platform/backend/backend/api/features/store/test_cache_delete.py
+++ b/autogpt_platform/backend/backend/api/features/store/test_cache_delete.py
@@ -11,6 +11,7 @@ import pytest
 from backend.util.models import Pagination

 from . import cache as store_cache
+from .db import StoreAgentsSortOptions
 from .model import StoreAgent, StoreAgentsResponse


@@ -215,7 +216,7 @@ class TestCacheDeletion:
            await store_cache._get_cached_store_agents(
                featured=True,
                creator="testuser",
-                sorted_by="rating",
+                sorted_by=StoreAgentsSortOptions.RATING,
                search_query="AI assistant",
                category="productivity",
                page=2,
@@ -227,7 +228,7 @@ class TestCacheDeletion:
            deleted = store_cache._get_cached_store_agents.cache_delete(
                featured=True,
                creator="testuser",
-                sorted_by="rating",
+                sorted_by=StoreAgentsSortOptions.RATING,
                search_query="AI assistant",
                category="productivity",
                page=2,
@@ -239,7 +240,7 @@ class TestCacheDeletion:
            deleted = store_cache._get_cached_store_agents.cache_delete(
                featured=True,
                creator="testuser",
-                sorted_by="rating",
+                sorted_by=StoreAgentsSortOptions.RATING,
                search_query="AI assistant",
                category="productivity",
                page=2,
--- a/autogpt_platform/backend/backend/api/features/store/text_utils.py
+++ b/autogpt_platform/backend/backend/api/features/store/text_utils.py
@@ -0,0 +1,5 @@
+"""Backward-compatibility shim — ``split_camelcase`` now lives in backend.util.text."""
+
+from backend.util.text import split_camelcase  # noqa: F401
+
+__all__ = ["split_camelcase"]
--- a/autogpt_platform/backend/backend/api/features/store/text_utils_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/text_utils_test.py
@@ -0,0 +1,49 @@
+"""Tests for split_camelcase (now in backend.util.text)."""
+
+import pytest
+
+from backend.util.text import split_camelcase
+
+# ---------------------------------------------------------------------------
+# split_camelcase
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.parametrize(
+    "input_text, expected",
+    [
+        ("AITextGeneratorBlock", "AI Text Generator Block"),
+        ("HTTPRequestBlock", "HTTP Request Block"),
+        ("simpleWord", "simple Word"),
+        ("already spaced", "already spaced"),
+        ("XMLParser", "XML Parser"),
+        ("getHTTPResponse", "get HTTP Response"),
+        ("Block", "Block"),
+        ("", ""),
+        ("OAuth2Block", "OAuth2 Block"),
+        ("IOError", "IO Error"),
+        ("getHTTPSResponse", "get HTTPS Response"),
+        # Known limitation: single-letter uppercase prefixes are NOT split.
+        # "ABlock" stays "ABlock" because the algorithm requires the left
+        # part of an uppercase run to retain at least 2 uppercase chars.
+        ("ABlock", "ABlock"),
+        # Digit-to-uppercase transitions
+        ("Base64Encoder", "Base64 Encoder"),
+        ("UTF8Decoder", "UTF8 Decoder"),
+        # Pure digits — no camelCase boundaries to split
+        ("123", "123"),
+        # Known limitation: single-letter uppercase segments after digits
+        # are not split from the following word.  "3D" is only 1 uppercase
+        # char so the uppercase-run rule cannot fire, producing "3 DRenderer"
+        # rather than the ideal "3D Renderer".
+        ("3DRenderer", "3 DRenderer"),
+        # Exception list — compound terms that should stay together
+        ("YouTubeBlock", "YouTube Block"),
+        ("OpenAIBlock", "OpenAI Block"),
+        ("AutoGPTAgent", "AutoGPT Agent"),
+        ("GitHubIntegration", "GitHub Integration"),
+        ("LinkedInBlock", "LinkedIn Block"),
+    ],
+)
+def test_split_camelcase(input_text: str, expected: str):
+    assert split_camelcase(input_text) == expected
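
Note on the parametrized cases above: the real `split_camelcase` lives in `backend.util.text` and is not shown in this diff. The following is only a minimal sketch of one way to satisfy the listed cases, assuming two regex passes plus a re-merge step for the compound terms. The names `split_camelcase_sketch`, `_split`, and `_EXCEPTIONS` are hypothetical, and the exception list is inferred from the tests, not taken from the actual module.

```python
import re

# Hypothetical exception list, inferred only from the test cases in this diff.
_EXCEPTIONS = ("YouTube", "OpenAI", "AutoGPT", "GitHub", "LinkedIn")


def _split(text: str) -> str:
    # Pass 1: a lowercase letter or digit followed by an uppercase letter.
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    # Pass 2: an uppercase run of at least 2 chars followed by a capitalised
    # word. The {2,} bound is why "ABlock" and "DRenderer" stay unsplit.
    return re.sub(r"([A-Z]{2,})([A-Z][a-z])", r"\1 \2", text)


def split_camelcase_sketch(text: str) -> str:
    result = _split(text)
    # Re-merge compound terms that the two passes pulled apart.
    for term in _EXCEPTIONS:
        result = result.replace(_split(term), term)
    return result
```

The re-merge step is a plain substring replacement, so it could also join unrelated adjacent words that happen to match a split exception term; an actual implementation may guard against that differently.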
--- a/autogpt_platform/backend/backend/api/features/v1.py
+++ b/autogpt_platform/backend/backend/api/features/v1.py
@@ -126,6 +126,9 @@ v1_router = APIRouter()
 ########################################################


+_tally_background_tasks: set[asyncio.Task] = set()
+
+
@v1_router.post(
    "/auth/user",
    summary="Get or create user",
@@ -134,6 +137,24 @@ v1_router = APIRouter()
 )
 async def get_or_create_user_route(user_data: dict = Security(get_jwt_payload)):
    user = await get_or_create_user(user_data)
+
+    # Fire-and-forget: populate business understanding from Tally form.
+    # We use created_at proximity instead of an is_new flag because
+    # get_or_create_user is cached — a separate is_new return value would be
+    # unreliable on repeated calls within the cache TTL.
+    age_seconds = (datetime.now(timezone.utc) - user.created_at).total_seconds()
+    if age_seconds < 30:
+        try:
+            from backend.data.tally import populate_understanding_from_tally
+
+            task = asyncio.create_task(
+                populate_understanding_from_tally(user.id, user.email)
+            )
+            _tally_background_tasks.add(task)
+            task.add_done_callback(_tally_background_tasks.discard)
+        except Exception:
+            logger.debug("Failed to start Tally population task", exc_info=True)
+
    return user.model_dump()


@@ -428,7 +449,6 @@ async def execute_graph_block(
 async def upload_file(
    user_id: Annotated[str, Security(get_user_id)],
    file: UploadFile = File(...),
-    provider: str = "gcs",
    expiration_hours: int = 24,
 ) -> UploadFileResponse:
    """
@@ -491,7 +511,6 @@ async def upload_file(
    storage_path = await cloud_storage.store_file(
        content=content,
        filename=file_name,
-        provider=provider,
        expiration_hours=expiration_hours,
        user_id=user_id,
    )
@@ -573,6 +592,11 @@ async def fulfill_checkout(user_id: Annotated[str, Security(get_user_id)]):
 async def configure_user_auto_top_up(
    request: AutoTopUpConfig, user_id: Annotated[str, Security(get_user_id)]
 ) -> str:
+    """Configure auto top-up settings and perform an immediate top-up if needed.
+
+    Raises HTTPException(422) if the request parameters are invalid or if
+    the credit top-up fails with one of the known validation errors.
+    """
    if request.threshold < 0:
        raise HTTPException(status_code=422, detail="Threshold must be greater than 0")
    if request.amount < 500 and request.amount != 0:
@@ -587,10 +611,20 @@ async def configure_user_auto_top_up(
    user_credit_model = await get_user_credit_model(user_id)
    current_balance = await user_credit_model.get_credits(user_id)

-    if current_balance < request.threshold:
-        await user_credit_model.top_up_credits(user_id, request.amount)
-    else:
-        await user_credit_model.top_up_credits(user_id, 0)
+    try:
+        if current_balance < request.threshold:
+            await user_credit_model.top_up_credits(user_id, request.amount)
+        else:
+            await user_credit_model.top_up_credits(user_id, 0)
+    except ValueError as e:
+        known_messages = (
+            "must not be negative",
+            "already exists for user",
+            "No payment method found",
+        )
+        if any(msg in str(e) for msg in known_messages):
+            raise HTTPException(status_code=422, detail=str(e))
+        raise

    await set_auto_top_up(
        user_id, AutoTopUpConfig(threshold=request.threshold, amount=request.amount)
@@ -946,14 +980,16 @@ async def execute_graph(
    source: Annotated[GraphExecutionSource | None, Body(embed=True)] = None,
    graph_version: Optional[int] = None,
    preset_id: Optional[str] = None,
+    dry_run: Annotated[bool, Body(embed=True)] = False,
 ) -> execution_db.GraphExecutionMeta:
-    user_credit_model = await get_user_credit_model(user_id)
-    current_balance = await user_credit_model.get_credits(user_id)
-    if current_balance <= 0:
-        raise HTTPException(
-            status_code=402,
-            detail="Insufficient balance to execute the agent. Please top up your account.",
-        )
+    if not dry_run:
+        user_credit_model = await get_user_credit_model(user_id)
+        current_balance = await user_credit_model.get_credits(user_id)
+        if current_balance <= 0:
+            raise HTTPException(
+                status_code=402,
+                detail="Insufficient balance to execute the agent. Please top up your account.",
+            )

    try:
        result = await execution_utils.add_graph_execution(
@@ -963,6 +999,7 @@ async def execute_graph(
            preset_id=preset_id,
            graph_version=graph_version,
            graph_credentials_inputs=credentials_inputs,
+            dry_run=dry_run,
        )
        # Record successful graph execution
        record_graph_execution(graph_id=graph_id, status="success", user_id=user_id)
--- a/autogpt_platform/backend/backend/api/features/v1_test.py
+++ b/autogpt_platform/backend/backend/api/features/v1_test.py
@@ -1,5 +1,5 @@
 import json
-from datetime import datetime
+from datetime import datetime, timezone
 from io import BytesIO
 from unittest.mock import AsyncMock, Mock, patch

@@ -43,6 +43,7 @@ def test_get_or_create_user_route(
 ) -> None:
    """Test get or create user endpoint"""
    mock_user = Mock()
+    mock_user.created_at = datetime.now(timezone.utc)
    mock_user.model_dump.return_value = {
        "id": test_user_id,
        "email": "test@example.com",
@@ -514,7 +515,6 @@ async def test_upload_file_success(test_user_id: str):
        result = await upload_file(
            file=upload_file_mock,
            user_id=test_user_id,
-            provider="gcs",
            expiration_hours=24,
        )

@@ -532,7 +532,6 @@ async def test_upload_file_success(test_user_id: str):
        mock_handler.store_file.assert_called_once_with(
            content=file_content,
            filename="test.txt",
-            provider="gcs",
            expiration_hours=24,
            user_id=test_user_id,
        )
--- a/autogpt_platform/backend/backend/api/features/workspace/routes.py
+++ b/autogpt_platform/backend/backend/api/features/workspace/routes.py
@@ -3,15 +3,29 @@ Workspace API routes for managing user file storage.
 """

 import logging
+import os
 import re
 from typing import Annotated
 from urllib.parse import quote

 import fastapi
 from autogpt_libs.auth.dependencies import get_user_id, requires_user
+from fastapi import Query, UploadFile
 from fastapi.responses import Response
+from pydantic import BaseModel

-from backend.data.workspace import WorkspaceFile, get_workspace, get_workspace_file
+from backend.data.workspace import (
+    WorkspaceFile,
+    count_workspace_files,
+    get_or_create_workspace,
+    get_workspace,
+    get_workspace_file,
+    get_workspace_total_size,
+    soft_delete_workspace_file,
+)
+from backend.util.settings import Config
+from backend.util.virus_scanner import scan_content_safe
+from backend.util.workspace import WorkspaceManager
 from backend.util.workspace_storage import get_workspace_storage


@@ -98,6 +112,25 @@ async def _create_file_download_response(file: WorkspaceFile) -> Response:
            raise


+class UploadFileResponse(BaseModel):
+    file_id: str
+    name: str
+    path: str
+    mime_type: str
+    size_bytes: int
+
+
+class DeleteFileResponse(BaseModel):
+    deleted: bool
+
+
+class StorageUsageResponse(BaseModel):
+    used_bytes: int
+    limit_bytes: int
+    used_percent: float
+    file_count: int
+
+
@router.get(
    "/files/{file_id}/download",
    summary="Download file by ID",
@@ -120,3 +153,151 @@ async def download_file(
        raise fastapi.HTTPException(status_code=404, detail="File not found")

    return await _create_file_download_response(file)
+
+
+@router.delete(
+    "/files/{file_id}",
+    summary="Delete a workspace file",
+)
+async def delete_workspace_file(
+    user_id: Annotated[str, fastapi.Security(get_user_id)],
+    file_id: str,
+) -> DeleteFileResponse:
+    """
+    Soft-delete a workspace file and attempt to remove it from storage.
+
+    Used when a user clears a file input in the builder.
+    """
+    workspace = await get_workspace(user_id)
+    if workspace is None:
+        raise fastapi.HTTPException(status_code=404, detail="Workspace not found")
+
+    manager = WorkspaceManager(user_id, workspace.id)
+    deleted = await manager.delete_file(file_id)
+    if not deleted:
+        raise fastapi.HTTPException(status_code=404, detail="File not found")
+
+    return DeleteFileResponse(deleted=True)
+
+
+@router.post(
+    "/files/upload",
+    summary="Upload file to workspace",
+)
+async def upload_file(
+    user_id: Annotated[str, fastapi.Security(get_user_id)],
+    file: UploadFile,
+    session_id: str | None = Query(default=None),
+    overwrite: bool = Query(default=False),
+) -> UploadFileResponse:
+    """
+    Upload a file to the user's workspace.
+
+    Files are stored in session-scoped paths when session_id is provided,
+    so the agent's session-scoped tools can discover them automatically.
+    """
+    config = Config()
+
+    # Sanitize filename — strip any directory components
+    filename = os.path.basename(file.filename or "upload") or "upload"
+
+    # Read file content with early abort on size limit
+    max_file_bytes = config.max_file_size_mb * 1024 * 1024
+    chunks: list[bytes] = []
+    total_size = 0
+    while chunk := await file.read(64 * 1024):  # 64KB chunks
+        total_size += len(chunk)
+        if total_size > max_file_bytes:
+            raise fastapi.HTTPException(
+                status_code=413,
+                detail=f"File exceeds maximum size of {config.max_file_size_mb} MB",
+            )
+        chunks.append(chunk)
+    content = b"".join(chunks)
+
+    # Get or create workspace
+    workspace = await get_or_create_workspace(user_id)
+
+    # Pre-write storage cap check (soft check — final enforcement is post-write)
+    storage_limit_bytes = config.max_workspace_storage_mb * 1024 * 1024
+    current_usage = await get_workspace_total_size(workspace.id)
+    if storage_limit_bytes and current_usage + len(content) > storage_limit_bytes:
+        used_percent = (current_usage / storage_limit_bytes) * 100
+        raise fastapi.HTTPException(
+            status_code=413,
+            detail={
+                "message": "Storage limit exceeded",
+                "used_bytes": current_usage,
+                "limit_bytes": storage_limit_bytes,
+                "used_percent": round(used_percent, 1),
+            },
+        )
+
+    # Warn at 80% usage
+    if (
+        storage_limit_bytes
+        and (usage_ratio := (current_usage + len(content)) / storage_limit_bytes) >= 0.8
+    ):
+        logger.warning(
+            f"User {user_id} workspace storage at {usage_ratio * 100:.1f}% "
+            f"({current_usage + len(content)} / {storage_limit_bytes} bytes)"
+        )
+
+    # Virus scan
+    await scan_content_safe(content, filename=filename)
+
+    # Write file via WorkspaceManager
+    manager = WorkspaceManager(user_id, workspace.id, session_id)
+    try:
+        workspace_file = await manager.write_file(
+            content, filename, overwrite=overwrite
+        )
+    except ValueError as e:
+        raise fastapi.HTTPException(status_code=409, detail=str(e)) from e
+
+    # Post-write storage check — eliminates TOCTOU race on the quota.
+    # If a concurrent upload pushed us over the limit, undo this write.
+    new_total = await get_workspace_total_size(workspace.id)
+    if storage_limit_bytes and new_total > storage_limit_bytes:
+        await soft_delete_workspace_file(workspace_file.id, workspace.id)
+        raise fastapi.HTTPException(
+            status_code=413,
+            detail={
+                "message": "Storage limit exceeded (concurrent upload)",
+                "used_bytes": new_total,
+                "limit_bytes": storage_limit_bytes,
+            },
+        )
+
+    return UploadFileResponse(
+        file_id=workspace_file.id,
+        name=workspace_file.name,
+        path=workspace_file.path,
+        mime_type=workspace_file.mime_type,
+        size_bytes=workspace_file.size_bytes,
+    )
+
+
+@router.get(
+    "/storage/usage",
+    summary="Get workspace storage usage",
+)
+async def get_storage_usage(
+    user_id: Annotated[str, fastapi.Security(get_user_id)],
+) -> StorageUsageResponse:
+    """
+    Get storage usage information for the user's workspace.
+    """
+    config = Config()
+    workspace = await get_or_create_workspace(user_id)
+
+    used_bytes = await get_workspace_total_size(workspace.id)
+    file_count = await count_workspace_files(workspace.id)
+    limit_bytes = config.max_workspace_storage_mb * 1024 * 1024
+
+    return StorageUsageResponse(
+        used_bytes=used_bytes,
+        limit_bytes=limit_bytes,
+        used_percent=round((used_bytes / limit_bytes) * 100, 1) if limit_bytes else 0,
+        file_count=file_count,
+    )
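
Note on the chunked read in `upload_file`: the loop aborts as soon as the running total passes the limit, so an oversized upload is never fully buffered. The following standalone sketch isolates that loop under stated assumptions: `read_with_limit`, `FileTooLargeError`, and `FakeUpload` are hypothetical stand-ins (the route itself raises `fastapi.HTTPException` with status 413 and reads from a FastAPI `UploadFile`).

```python
import asyncio
import io


class FileTooLargeError(Exception):
    """Hypothetical stand-in for the HTTP 413 raised by the route."""


async def read_with_limit(reader, max_bytes: int, chunk_size: int = 64 * 1024) -> bytes:
    """Read reader in chunks, aborting once more than max_bytes were seen."""
    chunks: list[bytes] = []
    total = 0
    while chunk := await reader.read(chunk_size):
        total += len(chunk)
        if total > max_bytes:
            raise FileTooLargeError(f"exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


class FakeUpload:
    """Hypothetical async reader; only the read(n) coroutine is assumed."""

    def __init__(self, data: bytes) -> None:
        self._buf = io.BytesIO(data)

    async def read(self, n: int) -> bytes:
        return self._buf.read(n)
```

With `max_bytes=0`, any non-empty payload is rejected on the first chunk, which is what `test_upload_exceeds_max_file_size` relies on when it sets `max_file_size_mb = 0`.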
--- a/autogpt_platform/backend/backend/api/features/workspace/routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/workspace/routes_test.py
@@ -0,0 +1,359 @@
+"""Tests for workspace file upload, download, and delete routes."""
+
+import io
+from datetime import datetime, timezone
+
+import fastapi
+import fastapi.testclient
+import pytest
+import pytest_mock
+
+from backend.api.features.workspace import routes as workspace_routes
+from backend.data.workspace import WorkspaceFile
+
+app = fastapi.FastAPI()
+app.include_router(workspace_routes.router)
+
+
+@app.exception_handler(ValueError)
+async def _value_error_handler(
+    request: fastapi.Request, exc: ValueError
+) -> fastapi.responses.JSONResponse:
+    """Mirror the production ValueError → 400 mapping from rest_api.py."""
+    return fastapi.responses.JSONResponse(status_code=400, content={"detail": str(exc)})
+
+
+client = fastapi.testclient.TestClient(app)
+
+TEST_USER_ID = "3e53486c-cf57-477e-ba2a-cb02dc828e1a"
+
+MOCK_WORKSPACE = type("W", (), {"id": "ws-1"})()
+
+_NOW = datetime(2023, 1, 1, tzinfo=timezone.utc)
+
+MOCK_FILE = WorkspaceFile(
+    id="file-aaa-bbb",
+    workspace_id="ws-1",
+    created_at=_NOW,
+    updated_at=_NOW,
+    name="hello.txt",
+    path="/session/hello.txt",
+    mime_type="text/plain",
+    size_bytes=13,
+    storage_path="local://hello.txt",
+)
+
+
+@pytest.fixture(autouse=True)
+def setup_app_auth(mock_jwt_user):
+    from autogpt_libs.auth.jwt_utils import get_jwt_payload
+
+    app.dependency_overrides[get_jwt_payload] = mock_jwt_user["get_jwt_payload"]
+    yield
+    app.dependency_overrides.clear()
+
+
+def _upload(
+    filename: str = "hello.txt",
+    content: bytes = b"Hello, world!",
+    content_type: str = "text/plain",
+):
+    """Helper to POST a file upload."""
+    return client.post(
+        "/files/upload?session_id=sess-1",
+        files={"file": (filename, io.BytesIO(content), content_type)},
+    )
+
+
+# ---- Happy path ----
+
+
+def test_upload_happy_path(mocker: pytest_mock.MockFixture):
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_or_create_workspace",
+        return_value=MOCK_WORKSPACE,
+    )
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_workspace_total_size",
+        return_value=0,
+    )
+    mocker.patch(
+        "backend.api.features.workspace.routes.scan_content_safe",
+        return_value=None,
+    )
+    mock_manager = mocker.MagicMock()
+    mock_manager.write_file = mocker.AsyncMock(return_value=MOCK_FILE)
+    mocker.patch(
+        "backend.api.features.workspace.routes.WorkspaceManager",
+        return_value=mock_manager,
+    )
+
+    response = _upload()
+    assert response.status_code == 200
+    data = response.json()
+    assert data["file_id"] == "file-aaa-bbb"
+    assert data["name"] == "hello.txt"
+    assert data["size_bytes"] == 13
+
+
+# ---- Per-file size limit ----
+
+
+def test_upload_exceeds_max_file_size(mocker: pytest_mock.MockFixture):
+    """Files larger than max_file_size_mb should be rejected with 413."""
+    cfg = mocker.patch("backend.api.features.workspace.routes.Config")
+    cfg.return_value.max_file_size_mb = 0  # 0 MB → any content is too big
+    cfg.return_value.max_workspace_storage_mb = 500
+
+    response = _upload(content=b"x" * 1024)
+    assert response.status_code == 413
+
+
+# ---- Storage quota exceeded ----
+
+
+def test_upload_storage_quota_exceeded(mocker: pytest_mock.MockFixture):
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_or_create_workspace",
+        return_value=MOCK_WORKSPACE,
+    )
+    # Current usage already at limit
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_workspace_total_size",
+        return_value=500 * 1024 * 1024,
+    )
+
+    response = _upload()
+    assert response.status_code == 413
+    assert "Storage limit exceeded" in response.text
+
+
+# ---- Post-write quota race (B2) ----
+
+
+def test_upload_post_write_quota_race(mocker: pytest_mock.MockFixture):
+    """If a concurrent upload tips the total over the limit after write,
+    the file should be soft-deleted and 413 returned."""
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_or_create_workspace",
+        return_value=MOCK_WORKSPACE,
+    )
+    # Pre-write check passes (under limit), but post-write check fails
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_workspace_total_size",
+        side_effect=[0, 600 * 1024 * 1024],  # first call OK, second over limit
+    )
+    mocker.patch(
+        "backend.api.features.workspace.routes.scan_content_safe",
+        return_value=None,
+    )
+    mock_manager = mocker.MagicMock()
+    mock_manager.write_file = mocker.AsyncMock(return_value=MOCK_FILE)
+    mocker.patch(
+        "backend.api.features.workspace.routes.WorkspaceManager",
+        return_value=mock_manager,
+    )
+    mock_delete = mocker.patch(
+        "backend.api.features.workspace.routes.soft_delete_workspace_file",
+        return_value=None,
+    )
+
+    response = _upload()
+    assert response.status_code == 413
+    mock_delete.assert_called_once_with("file-aaa-bbb", "ws-1")
+
+
+# ---- Any extension accepted (no allowlist) ----
+
+
+def test_upload_any_extension(mocker: pytest_mock.MockFixture):
+    """Any file extension should be accepted — ClamAV is the security layer."""
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_or_create_workspace",
+        return_value=MOCK_WORKSPACE,
+    )
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_workspace_total_size",
+        return_value=0,
+    )
+    mocker.patch(
+        "backend.api.features.workspace.routes.scan_content_safe",
+        return_value=None,
+    )
+    mock_manager = mocker.MagicMock()
+    mock_manager.write_file = mocker.AsyncMock(return_value=MOCK_FILE)
+    mocker.patch(
+        "backend.api.features.workspace.routes.WorkspaceManager",
+        return_value=mock_manager,
+    )
+
+    response = _upload(filename="data.xyz", content=b"arbitrary")
+    assert response.status_code == 200
+
+
+# ---- Virus scan rejection ----
+
+
+def test_upload_blocked_by_virus_scan(mocker: pytest_mock.MockFixture):
+    """Files flagged by ClamAV should be rejected and never written to storage."""
+    from backend.api.features.store.exceptions import VirusDetectedError
+
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_or_create_workspace",
+        return_value=MOCK_WORKSPACE,
+    )
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_workspace_total_size",
+        return_value=0,
+    )
+    mocker.patch(
+        "backend.api.features.workspace.routes.scan_content_safe",
+        side_effect=VirusDetectedError("Eicar-Test-Signature"),
+    )
+    mock_manager = mocker.MagicMock()
+    mock_manager.write_file = mocker.AsyncMock(return_value=MOCK_FILE)
+    mocker.patch(
+        "backend.api.features.workspace.routes.WorkspaceManager",
+        return_value=mock_manager,
+    )
+
+    response = _upload(filename="evil.exe", content=b"X5O!P%@AP...")
+    assert response.status_code == 400
+    assert "Virus detected" in response.text
+    mock_manager.write_file.assert_not_called()
+
+
+# ---- No file extension ----
+
+
+def test_upload_file_without_extension(mocker: pytest_mock.MockFixture):
+    """Files without an extension should be accepted and stored as-is."""
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_or_create_workspace",
+        return_value=MOCK_WORKSPACE,
+    )
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_workspace_total_size",
+        return_value=0,
+    )
+    mocker.patch(
+        "backend.api.features.workspace.routes.scan_content_safe",
+        return_value=None,
+    )
+    mock_manager = mocker.MagicMock()
+    mock_manager.write_file = mocker.AsyncMock(return_value=MOCK_FILE)
+    mocker.patch(
+        "backend.api.features.workspace.routes.WorkspaceManager",
+        return_value=mock_manager,
+    )
+
+    response = _upload(
+        filename="Makefile",
+        content=b"all:\n\techo hello",
+        content_type="application/octet-stream",
+    )
+    assert response.status_code == 200
+    mock_manager.write_file.assert_called_once()
+    assert mock_manager.write_file.call_args[0][1] == "Makefile"
+
+
+# ---- Filename sanitization (SF5) ----
+
+
+def test_upload_strips_path_components(mocker: pytest_mock.MockFixture):
+    """Path-traversal filenames should be reduced to their basename."""
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_or_create_workspace",
+        return_value=MOCK_WORKSPACE,
+    )
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_workspace_total_size",
+        return_value=0,
+    )
+    mocker.patch(
+        "backend.api.features.workspace.routes.scan_content_safe",
+        return_value=None,
+    )
+    mock_manager = mocker.MagicMock()
+    mock_manager.write_file = mocker.AsyncMock(return_value=MOCK_FILE)
+    mocker.patch(
+        "backend.api.features.workspace.routes.WorkspaceManager",
+        return_value=mock_manager,
+    )
+
+    # Filename with traversal
+    _upload(filename="../../etc/passwd.txt")
+
+    # write_file should have been called with just the basename
+    mock_manager.write_file.assert_called_once()
+    call_args = mock_manager.write_file.call_args
+    assert call_args[0][1] == "passwd.txt"
+
+
+# ---- Download ----
+
+
+def test_download_file_not_found(mocker: pytest_mock.MockFixture):
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_workspace",
+        return_value=MOCK_WORKSPACE,
+    )
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_workspace_file",
+        return_value=None,
+    )
+
+    response = client.get("/files/some-file-id/download")
+    assert response.status_code == 404
+
+
+# ---- Delete ----
+
+
+def test_delete_file_success(mocker: pytest_mock.MockFixture):
+    """Deleting an existing file should return {"deleted": true}."""
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_workspace",
+        return_value=MOCK_WORKSPACE,
+    )
+    mock_manager = mocker.MagicMock()
+    mock_manager.delete_file = mocker.AsyncMock(return_value=True)
+    mocker.patch(
+        "backend.api.features.workspace.routes.WorkspaceManager",
+        return_value=mock_manager,
+    )
+
+    response = client.delete("/files/file-aaa-bbb")
+    assert response.status_code == 200
+    assert response.json() == {"deleted": True}
+    mock_manager.delete_file.assert_called_once_with("file-aaa-bbb")
+
+
+def test_delete_file_not_found(mocker: pytest_mock.MockFixture):
+    """Deleting a non-existent file should return 404."""
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_workspace",
+        return_value=MOCK_WORKSPACE,
+    )
+    mock_manager = mocker.MagicMock()
+    mock_manager.delete_file = mocker.AsyncMock(return_value=False)
+    mocker.patch(
+        "backend.api.features.workspace.routes.WorkspaceManager",
+        return_value=mock_manager,
+    )
+
+    response = client.delete("/files/nonexistent-id")
+    assert response.status_code == 404
+    assert "File not found" in response.text
+
+
+def test_delete_file_no_workspace(mocker: pytest_mock.MockFixture):
+    """Deleting when user has no workspace should return 404."""
+    mocker.patch(
+        "backend.api.features.workspace.routes.get_workspace",
+        return_value=None,
+    )
+
+    response = client.delete("/files/file-aaa-bbb")
+    assert response.status_code == 404
+    assert "Workspace not found" in response.text
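
Reviewer note: the delete tests above replace `WorkspaceManager` with a `MagicMock` whose `delete_file` is an `AsyncMock`, then assert on both the response and the recorded call. The sketch below reproduces that pattern with the standard library only. `delete_file_route` and `run_case` are hypothetical stand-ins for illustration, not the route code from this PR.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def delete_file_route(manager, file_id: str) -> tuple[int, dict]:
    """Hypothetical stand-in for the route: 200 on success, 404 otherwise."""
    deleted = await manager.delete_file(file_id)
    if not deleted:
        return 404, {"detail": "File not found"}
    return 200, {"deleted": True}


def run_case(return_value: bool) -> tuple[int, dict, MagicMock]:
    """Run the stand-in route against a mocked manager and return the results."""
    manager = MagicMock()
    # AsyncMock makes delete_file awaitable and still records its call arguments.
    manager.delete_file = AsyncMock(return_value=return_value)
    status, body = asyncio.run(delete_file_route(manager, "file-aaa-bbb"))
    return status, body, manager
```

Because `AsyncMock` records calls like any other mock, `manager.delete_file.assert_called_once_with("file-aaa-bbb")` works after the coroutine has been awaited.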
--- a/autogpt_platform/backend/backend/api/model.py
+++ b/autogpt_platform/backend/backend/api/model.py
@@ -94,3 +94,8 @@ class NotificationPayload(pydantic.BaseModel):

 class OnboardingNotificationPayload(NotificationPayload):
    step: OnboardingStep | None
+
+
+class CopilotCompletionPayload(NotificationPayload):
+    session_id: str
+    status: Literal["completed", "failed"]
--- a/autogpt_platform/backend/backend/api/rest_api.py
+++ b/autogpt_platform/backend/backend/api/rest_api.py
@@ -18,6 +18,7 @@ from prisma.errors import PrismaError

 import backend.api.features.admin.credit_admin_routes
 import backend.api.features.admin.execution_analytics_routes
+import backend.api.features.admin.rate_limit_admin_routes
 import backend.api.features.admin.store_admin_routes
 import backend.api.features.builder
 import backend.api.features.builder.routes
@@ -41,11 +42,11 @@ import backend.data.user
 import backend.integrations.webhooks.utils
 import backend.util.service
 import backend.util.settings
-from backend.blocks.llm import DEFAULT_LLM_MODEL
-from backend.copilot.completion_consumer import (
-    start_completion_consumer,
-    stop_completion_consumer,
+from backend.api.features.library.exceptions import (
+    FolderAlreadyExistsError,
+    FolderValidationError,
 )
+from backend.blocks.llm import DEFAULT_LLM_MODEL
 from backend.data.model import Credentials
 from backend.integrations.providers import ProviderName
 from backend.monitoring.instrumentation import instrument_fastapi
@@ -55,6 +56,7 @@ from backend.util.exceptions import (
    MissingConfigError,
    NotAuthorizedError,
    NotFoundError,
+    PreconditionFailed,
 )
 from backend.util.feature_flag import initialize_launchdarkly, shutdown_launchdarkly
 from backend.util.service import UnhealthyServiceError
@@ -116,6 +118,11 @@ async def lifespan_context(app: fastapi.FastAPI):

    AutoRegistry.patch_integrations()

+    # Register managed credential providers (e.g. AgentMail)
+    from backend.integrations.managed_providers import register_all
+
+    register_all()
+
    await backend.data.block.initialize_blocks()

    await backend.data.user.migrate_and_encrypt_user_integrations()
@@ -123,21 +130,9 @@ async def lifespan_context(app: fastapi.FastAPI):
    await backend.data.graph.migrate_llm_models(DEFAULT_LLM_MODEL)
    await backend.integrations.webhooks.utils.migrate_legacy_triggered_graphs()

-    # Start chat completion consumer for Redis Streams notifications
-    try:
-        await start_completion_consumer()
-    except Exception as e:
-        logger.warning(f"Could not start chat completion consumer: {e}")
-
    with launch_darkly_context():
        yield

-    # Stop chat completion consumer
-    try:
-        await stop_completion_consumer()
-    except Exception as e:
-        logger.warning(f"Error stopping chat completion consumer: {e}")
-
    try:
        await shutdown_cloud_storage_handler()
    except Exception as e:
@@ -221,13 +216,22 @@ instrument_fastapi(
 def handle_internal_http_error(status_code: int = 500, log_error: bool = True):
    def handler(request: fastapi.Request, exc: Exception):
        if log_error:
-            logger.exception(
-                "%s %s failed. Investigate and resolve the underlying issue: %s",
-                request.method,
-                request.url.path,
-                exc,
-                exc_info=exc,
-            )
+            if status_code >= 500:
+                logger.exception(
+                    "%s %s failed. Investigate and resolve the underlying issue: %s",
+                    request.method,
+                    request.url.path,
+                    exc,
+                    exc_info=exc,
+                )
+            else:
+                logger.warning(
+                    "%s %s failed with %d: %s",
+                    request.method,
+                    request.url.path,
+                    status_code,
+                    exc,
+                )

        hint = (
            "Adjust the request and retry."
@@ -277,12 +281,15 @@ async def validation_error_handler(


 app.add_exception_handler(PrismaError, handle_internal_http_error(500))
-app.add_exception_handler(NotFoundError, handle_internal_http_error(404, False))
-app.add_exception_handler(NotAuthorizedError, handle_internal_http_error(403, False))
+app.add_exception_handler(FolderAlreadyExistsError, handle_internal_http_error(409))
+app.add_exception_handler(FolderValidationError, handle_internal_http_error(400))
+app.add_exception_handler(NotFoundError, handle_internal_http_error(404))
+app.add_exception_handler(NotAuthorizedError, handle_internal_http_error(403))
 app.add_exception_handler(RequestValidationError, validation_error_handler)
 app.add_exception_handler(pydantic.ValidationError, validation_error_handler)
 app.add_exception_handler(MissingConfigError, handle_internal_http_error(503))
 app.add_exception_handler(ValueError, handle_internal_http_error(400))
+app.add_exception_handler(PreconditionFailed, handle_internal_http_error(428))
 app.add_exception_handler(Exception, handle_internal_http_error(500))

 app.include_router(backend.api.features.v1.v1_router, tags=["v1"], prefix="/api")
@@ -317,6 +324,11 @@ app.include_router(
    tags=["v2", "admin"],
    prefix="/api/executions",
 )
+app.include_router(
+    backend.api.features.admin.rate_limit_admin_routes.router,
+    tags=["v2", "admin"],
+    prefix="/api/copilot",
+)
 app.include_router(
    backend.api.features.executions.review.routes.router,
    tags=["v2", "executions", "review"],
@@ -527,8 +539,11 @@ class AgentServer(backend.util.service.AppProcess):
        user_id: str,
        provider: ProviderName,
        credentials: Credentials,
-    ) -> Credentials:
-        from .features.integrations.router import create_credentials, get_credential
+    ):
+        from backend.api.features.integrations.router import (
+            create_credentials,
+            get_credential,
+        )

        try:
            return await create_credentials(
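
Reviewer note: the `handle_internal_http_error` hunk in this file now picks the log level from the status code, and `NotFoundError` and `NotAuthorizedError` no longer pass `log_error=False`, so they are logged at warning level. Below is a minimal sketch of that branching using only the standard `logging` module. `make_handler`, its simplified `(method, path, exc)` signature, and the logger name are assumptions for illustration; the real handler receives a `fastapi.Request` and builds a response.

```python
import logging

logger = logging.getLogger("sketch.http_errors")


def make_handler(status_code: int = 500, log_error: bool = True):
    """Return a handler that logs 5xx with a traceback and other codes as warnings."""

    def handler(method: str, path: str, exc: Exception) -> int:
        if log_error:
            if status_code >= 500:
                # Server-side fault: attach the exception so the traceback is kept.
                logger.error("%s %s failed: %s", method, path, exc, exc_info=exc)
            else:
                # Client-side fault: a single warning line without a traceback.
                logger.warning(
                    "%s %s failed with %d: %s", method, path, status_code, exc
                )
        return status_code

    return handler
```

With this split, expected 4xx responses stay visible in the logs without the traceback noise reserved for 5xx failures.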
--- a/autogpt_platform/backend/backend/app.py
+++ b/autogpt_platform/backend/backend/app.py
@@ -24,7 +24,7 @@ def run_processes(*processes: "AppProcess", **kwargs):
        # Run the last process in the foreground.
        processes[-1].start(background=False, **kwargs)
    finally:
-        for process in processes:
+        for process in reversed(processes):
            try:
                process.stop()
            except Exception as e:
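
Reviewer note: the hunk above stops processes in reverse order. The sketch below shows the effect with a hypothetical `Proc` class that records the order of `stop()` calls. The rationale (later processes may depend on earlier ones) is an assumption, not something stated in the diff.

```python
class Proc:
    """Hypothetical stand-in for AppProcess that records when it is stopped."""

    def __init__(self, name: str, log: list[str]):
        self.name = name
        self.log = log

    def stop(self) -> None:
        self.log.append(self.name)


def stop_all(processes: list[Proc]) -> list[str]:
    """Stop processes last-started-first and collect error messages instead of raising."""
    errors: list[str] = []
    for process in reversed(processes):
        try:
            process.stop()
        except Exception as e:
            # One failing process must not prevent the others from stopping.
            errors.append(f"{process.name}: {e}")
    return errors
```

Started in the order `a`, `b`, `c`, the processes are stopped as `c`, `b`, `a`.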
--- a/autogpt_platform/backend/backend/blocks/_base.py
+++ b/autogpt_platform/backend/backend/blocks/_base.py
@@ -418,6 +418,8 @@ class BlockWebhookConfig(BlockManualWebhookConfig):


 class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
+    _optimized_description: ClassVar[str | None] = None
+
    def __init__(
        self,
        id: str = "",
@@ -470,6 +472,8 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
        self.block_type = block_type
        self.webhook_config = webhook_config
        self.is_sensitive_action = is_sensitive_action
+        # Read from ClassVar set by initialize_blocks()
+        self.optimized_description: str | None = type(self)._optimized_description
        self.execution_stats: "NodeExecutionStats" = NodeExecutionStats()

        if self.webhook_config:
@@ -620,6 +624,7 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
        graph_id: str,
        graph_version: int,
        execution_context: "ExecutionContext",
+        is_graph_execution: bool = True,
        **kwargs,
    ) -> tuple[bool, BlockInput]:
        """
@@ -648,6 +653,7 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
            graph_version=graph_version,
            block_name=self.name,
            editable=True,
+            is_graph_execution=is_graph_execution,
        )

        if decision is None:
--- a/autogpt_platform/backend/backend/blocks/agent_mail/_config.py
+++ b/autogpt_platform/backend/backend/blocks/agent_mail/_config.py
@@ -0,0 +1,33 @@
+"""
+Shared configuration for all AgentMail blocks.
+"""
+
+from agentmail import AsyncAgentMail
+
+from backend.sdk import APIKeyCredentials, ProviderBuilder, SecretStr
+
+agent_mail = (
+    ProviderBuilder("agent_mail")
+    .with_api_key("AGENTMAIL_API_KEY", "AgentMail API Key")
+    .build()
+)
+
+TEST_CREDENTIALS = APIKeyCredentials(
+    id="01234567-89ab-cdef-0123-456789abcdef",
+    provider="agent_mail",
+    title="Mock AgentMail API Key",
+    api_key=SecretStr("mock-agentmail-api-key"),
+    expires_at=None,
+)
+
+TEST_CREDENTIALS_INPUT = {
+    "id": TEST_CREDENTIALS.id,
+    "provider": TEST_CREDENTIALS.provider,
+    "type": TEST_CREDENTIALS.type,
+    "title": TEST_CREDENTIALS.title,
+}
+
+
+def _client(credentials: APIKeyCredentials) -> AsyncAgentMail:
+    """Create an AsyncAgentMail client from credentials."""
+    return AsyncAgentMail(api_key=credentials.api_key.get_secret_value())
--- a/autogpt_platform/backend/backend/blocks/agent_mail/attachments.py
+++ b/autogpt_platform/backend/backend/blocks/agent_mail/attachments.py
@@ -0,0 +1,211 @@
+"""
+AgentMail Attachment blocks — download file attachments from messages and threads.
+
+Attachments are files associated with messages (PDFs, CSVs, images, etc.).
+To send attachments, include them in the attachments parameter when using
+AgentMailSendMessageBlock or AgentMailReplyToMessageBlock.
+
+To download, first get the attachment_id from a message's attachments array,
+then use these blocks to retrieve the file content as base64.
+"""
+
+import base64
+
+from backend.sdk import (
+    APIKeyCredentials,
+    Block,
+    BlockCategory,
+    BlockOutput,
+    BlockSchemaInput,
+    BlockSchemaOutput,
+    CredentialsMetaInput,
+    SchemaField,
+)
+
+from ._config import TEST_CREDENTIALS, TEST_CREDENTIALS_INPUT, _client, agent_mail
+
+
+class AgentMailGetMessageAttachmentBlock(Block):
+    """
+    Download a file attachment from a specific email message.
+
+    Retrieves the raw file content and returns it as base64-encoded data.
+    First get the attachment_id from a message object's attachments array,
+    then use this block to download the file.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        inbox_id: str = SchemaField(
+            description="Inbox ID or email address the message belongs to"
+        )
+        message_id: str = SchemaField(
+            description="Message ID containing the attachment"
+        )
+        attachment_id: str = SchemaField(
+            description="Attachment ID to download (from the message's attachments array)"
+        )
+
+    class Output(BlockSchemaOutput):
+        content_base64: str = SchemaField(
+            description="File content encoded as a base64 string. Decode with base64.b64decode() to get raw bytes."
+        )
+        attachment_id: str = SchemaField(
+            description="The attachment ID that was downloaded"
+        )
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="a283ffc4-8087-4c3d-9135-8f26b86742ec",
+            description="Download a file attachment from an email message. Returns base64-encoded file content.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "inbox_id": "test-inbox",
+                "message_id": "test-msg",
+                "attachment_id": "test-attach",
+            },
+            test_output=[
+                ("content_base64", "dGVzdA=="),
+                ("attachment_id", "test-attach"),
+            ],
+            test_mock={
+                "get_attachment": lambda *a, **kw: b"test",
+            },
+        )
+
+    @staticmethod
+    async def get_attachment(
+        credentials: APIKeyCredentials,
+        inbox_id: str,
+        message_id: str,
+        attachment_id: str,
+    ):
+        client = _client(credentials)
+        return await client.inboxes.messages.get_attachment(
+            inbox_id=inbox_id,
+            message_id=message_id,
+            attachment_id=attachment_id,
+        )
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            data = await self.get_attachment(
+                credentials=credentials,
+                inbox_id=input_data.inbox_id,
+                message_id=input_data.message_id,
+                attachment_id=input_data.attachment_id,
+            )
+            if isinstance(data, bytes):
+                encoded = base64.b64encode(data).decode()
+            elif isinstance(data, str):
+                encoded = base64.b64encode(data.encode("utf-8")).decode()
+            else:
+                raise TypeError(
+                    f"Unexpected attachment data type: {type(data).__name__}"
+                )
+
+            yield "content_base64", encoded
+            yield "attachment_id", input_data.attachment_id
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailGetThreadAttachmentBlock(Block):
+    """
+    Download a file attachment from a conversation thread.
+
+    Same as GetMessageAttachment but looks up by thread ID instead of
+    message ID. Useful when you know the thread but not the specific
+    message containing the attachment.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        inbox_id: str = SchemaField(
+            description="Inbox ID or email address the thread belongs to"
+        )
+        thread_id: str = SchemaField(description="Thread ID containing the attachment")
+        attachment_id: str = SchemaField(
+            description="Attachment ID to download (from a message's attachments array within the thread)"
+        )
+
+    class Output(BlockSchemaOutput):
+        content_base64: str = SchemaField(
+            description="File content encoded as a base64 string. Decode with base64.b64decode() to get raw bytes."
+        )
+        attachment_id: str = SchemaField(
+            description="The attachment ID that was downloaded"
+        )
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="06b6a4c4-9d71-4992-9e9c-cf3b352763b5",
+            description="Download a file attachment from a conversation thread. Returns base64-encoded file content.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "inbox_id": "test-inbox",
+                "thread_id": "test-thread",
+                "attachment_id": "test-attach",
+            },
+            test_output=[
+                ("content_base64", "dGVzdA=="),
+                ("attachment_id", "test-attach"),
+            ],
+            test_mock={
+                "get_attachment": lambda *a, **kw: b"test",
+            },
+        )
+
+    @staticmethod
+    async def get_attachment(
+        credentials: APIKeyCredentials,
+        inbox_id: str,
+        thread_id: str,
+        attachment_id: str,
+    ):
+        client = _client(credentials)
+        return await client.inboxes.threads.get_attachment(
+            inbox_id=inbox_id,
+            thread_id=thread_id,
+            attachment_id=attachment_id,
+        )
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            data = await self.get_attachment(
+                credentials=credentials,
+                inbox_id=input_data.inbox_id,
+                thread_id=input_data.thread_id,
+                attachment_id=input_data.attachment_id,
+            )
+            if isinstance(data, bytes):
+                encoded = base64.b64encode(data).decode()
+            elif isinstance(data, str):
+                encoded = base64.b64encode(data.encode("utf-8")).decode()
+            else:
+                raise TypeError(
+                    f"Unexpected attachment data type: {type(data).__name__}"
+                )
+
+            yield "content_base64", encoded
+            yield "attachment_id", input_data.attachment_id
+        except Exception as e:
+            yield "error", str(e)
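
Reviewer note: both attachment blocks above repeat the same bytes-or-str base64 branch in `run`. A shared helper could carry that logic once; the sketch below is one possible shape. The name `encode_attachment` is hypothetical and is not part of this PR.

```python
import base64


def encode_attachment(data: object) -> str:
    """Return attachment content as a base64 string; accepts bytes or str."""
    if isinstance(data, bytes):
        return base64.b64encode(data).decode()
    if isinstance(data, str):
        # Text payloads are encoded as UTF-8 first, matching the blocks above.
        return base64.b64encode(data.encode("utf-8")).decode()
    raise TypeError(f"Unexpected attachment data type: {type(data).__name__}")
```

For the mocked payload `b"test"` this returns `"dGVzdA=="`, the value the blocks' `test_output` expects, and `base64.b64decode` recovers the original bytes.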
--- a/autogpt_platform/backend/backend/blocks/agent_mail/drafts.py
+++ b/autogpt_platform/backend/backend/blocks/agent_mail/drafts.py
@@ -0,0 +1,678 @@
+"""
+AgentMail Draft blocks — create, get, list, update, send, and delete drafts.
+
+A Draft is an unsent message that can be reviewed, edited, and sent later.
+Drafts enable human-in-the-loop review, scheduled sending (via send_at),
+and complex multi-step email composition workflows.
+"""
+
+from typing import Optional
+
+from backend.sdk import (
+    APIKeyCredentials,
+    Block,
+    BlockCategory,
+    BlockOutput,
+    BlockSchemaInput,
+    BlockSchemaOutput,
+    CredentialsMetaInput,
+    SchemaField,
+)
+
+from ._config import TEST_CREDENTIALS, TEST_CREDENTIALS_INPUT, _client, agent_mail
+
+
+class AgentMailCreateDraftBlock(Block):
+    """
+    Create a draft email in an AgentMail inbox for review or scheduled sending.
+
+    Drafts let agents prepare emails without sending immediately. Use send_at
+    to schedule automatic sending at a future time (ISO 8601 format).
+    Scheduled drafts are auto-labeled 'scheduled' and can be cancelled by
+    deleting the draft.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        inbox_id: str = SchemaField(
+            description="Inbox ID or email address to create the draft in"
+        )
+        to: list[str] = SchemaField(
+            description="Recipient email addresses (e.g. ['user@example.com'])"
+        )
+        subject: str = SchemaField(description="Email subject line", default="")
+        text: str = SchemaField(description="Plain text body of the draft", default="")
+        html: str = SchemaField(
+            description="Rich HTML body of the draft", default="", advanced=True
+        )
+        cc: list[str] = SchemaField(
+            description="CC recipient email addresses",
+            default_factory=list,
+            advanced=True,
+        )
+        bcc: list[str] = SchemaField(
+            description="BCC recipient email addresses",
+            default_factory=list,
+            advanced=True,
+        )
+        in_reply_to: str = SchemaField(
+            description="Message ID this draft replies to, for threading follow-up drafts",
+            default="",
+            advanced=True,
+        )
+        send_at: str = SchemaField(
+            description="Schedule automatic sending at this ISO 8601 datetime (e.g. '2025-01-15T09:00:00Z'). Leave empty for manual send.",
+            default="",
+            advanced=True,
+        )
+
+    class Output(BlockSchemaOutput):
+        draft_id: str = SchemaField(
+            description="Unique identifier of the created draft"
+        )
+        send_status: str = SchemaField(
+            description="Send status of the created draft: 'scheduled' if send_at was set, empty otherwise. Possible values: scheduled, sending, failed.",
+            default="",
+        )
+        result: dict = SchemaField(
+            description="Complete draft object with all metadata"
+        )
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="25ac9086-69fd-48b8-b910-9dbe04b8f3bd",
+            description="Create a draft email for review or scheduled sending. Use send_at for automatic future delivery.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "inbox_id": "test-inbox",
+                "to": ["user@example.com"],
+            },
+            test_output=[
+                ("draft_id", "mock-draft-id"),
+                ("send_status", ""),
+                ("result", dict),
+            ],
+            test_mock={
+                "create_draft": lambda *a, **kw: type(
+                    "Draft",
+                    (),
+                    {
+                        "draft_id": "mock-draft-id",
+                        "send_status": "",
+                        "model_dump": lambda self: {"draft_id": "mock-draft-id"},
+                    },
+                )(),
+            },
+        )
+
+    @staticmethod
+    async def create_draft(credentials: APIKeyCredentials, inbox_id: str, **params):
+        client = _client(credentials)
+        return await client.inboxes.drafts.create(inbox_id, **params)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            params: dict = {"to": input_data.to}
+            if input_data.subject:
+                params["subject"] = input_data.subject
+            if input_data.text:
+                params["text"] = input_data.text
+            if input_data.html:
+                params["html"] = input_data.html
+            if input_data.cc:
+                params["cc"] = input_data.cc
+            if input_data.bcc:
+                params["bcc"] = input_data.bcc
+            if input_data.in_reply_to:
+                params["in_reply_to"] = input_data.in_reply_to
+            if input_data.send_at:
+                params["send_at"] = input_data.send_at
+
+            draft = await self.create_draft(credentials, input_data.inbox_id, **params)
+            result = draft.model_dump()
+
+            yield "draft_id", draft.draft_id
+            yield "send_status", draft.send_status or ""
+            yield "result", result
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailGetDraftBlock(Block):
+    """
+    Retrieve a specific draft from an AgentMail inbox.
+
+    Returns the draft contents including recipients, subject, body, and
+    scheduled send status. Use this to review a draft before approving it.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        inbox_id: str = SchemaField(
+            description="Inbox ID or email address the draft belongs to"
+        )
+        draft_id: str = SchemaField(description="Draft ID to retrieve")
+
+    class Output(BlockSchemaOutput):
+        draft_id: str = SchemaField(description="Unique identifier of the draft")
+        subject: str = SchemaField(description="Draft subject line", default="")
+        send_status: str = SchemaField(
+            description="Scheduled send status: 'scheduled', 'sending', 'failed', or empty",
+            default="",
+        )
+        send_at: str = SchemaField(
+            description="Scheduled send time (ISO 8601) if set", default=""
+        )
+        result: dict = SchemaField(description="Complete draft object with all fields")
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="8e57780d-dc25-43d4-a0f4-1f02877b09fb",
+            description="Retrieve a draft email to review its contents, recipients, and scheduled send status.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "inbox_id": "test-inbox",
+                "draft_id": "test-draft",
+            },
+            test_output=[
+                ("draft_id", "test-draft"),
+                ("subject", ""),
+                ("send_status", ""),
+                ("send_at", ""),
+                ("result", dict),
+            ],
+            test_mock={
+                "get_draft": lambda *a, **kw: type(
+                    "Draft",
+                    (),
+                    {
+                        "draft_id": "test-draft",
+                        "subject": "",
+                        "send_status": "",
+                        "send_at": "",
+                        "model_dump": lambda self: {"draft_id": "test-draft"},
+                    },
+                )(),
+            },
+        )
+
+    @staticmethod
+    async def get_draft(credentials: APIKeyCredentials, inbox_id: str, draft_id: str):
+        client = _client(credentials)
+        return await client.inboxes.drafts.get(inbox_id=inbox_id, draft_id=draft_id)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            draft = await self.get_draft(
+                credentials, input_data.inbox_id, input_data.draft_id
+            )
+            result = draft.model_dump()
+
+            yield "draft_id", draft.draft_id
+            yield "subject", draft.subject or ""
+            yield "send_status", draft.send_status or ""
+            yield "send_at", draft.send_at or ""
+            yield "result", result
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailListDraftsBlock(Block):
+    """
+    List all drafts in an AgentMail inbox with optional label filtering.
+
+    Use labels=['scheduled'] to find all drafts queued for future sending.
+    Useful for building approval dashboards or monitoring pending outreach.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        inbox_id: str = SchemaField(
+            description="Inbox ID or email address to list drafts from"
+        )
+        limit: int = SchemaField(
+            description="Maximum number of drafts to return per page (1-100)",
+            default=20,
+            advanced=True,
+        )
+        page_token: str = SchemaField(
+            description="Token from a previous response to fetch the next page",
+            default="",
+            advanced=True,
+        )
+        labels: list[str] = SchemaField(
+            description="Filter drafts by labels (e.g. ['scheduled'] for pending sends)",
+            default_factory=list,
+            advanced=True,
+        )
+
+    class Output(BlockSchemaOutput):
+        drafts: list[dict] = SchemaField(
+            description="List of draft objects with subject, recipients, send_status, etc."
+        )
+        count: int = SchemaField(description="Number of drafts returned")
+        next_page_token: str = SchemaField(
+            description="Token for the next page. Empty if no more results.",
+            default="",
+        )
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="e84883b7-7c39-4c5c-88e8-0a72b078ea63",
+            description="List drafts in an AgentMail inbox. Filter by labels=['scheduled'] to find pending sends.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "inbox_id": "test-inbox",
+            },
+            test_output=[
+                ("drafts", []),
+                ("count", 0),
+                ("next_page_token", ""),
+            ],
+            test_mock={
+                "list_drafts": lambda *a, **kw: type(
+                    "Resp",
+                    (),
+                    {
+                        "drafts": [],
+                        "count": 0,
+                        "next_page_token": "",
+                    },
+                )(),
+            },
+        )
+
+    @staticmethod
+    async def list_drafts(credentials: APIKeyCredentials, inbox_id: str, **params):
+        client = _client(credentials)
+        return await client.inboxes.drafts.list(inbox_id, **params)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            params: dict = {"limit": input_data.limit}
+            if input_data.page_token:
+                params["page_token"] = input_data.page_token
+            if input_data.labels:
+                params["labels"] = input_data.labels
+
+            response = await self.list_drafts(
+                credentials, input_data.inbox_id, **params
+            )
+            drafts = [d.model_dump() for d in response.drafts]
+
+            yield "drafts", drafts
+            yield "count", response.count
+            yield "next_page_token", response.next_page_token or ""
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailUpdateDraftBlock(Block):
+    """
+    Update an existing draft's content, recipients, or scheduled send time.
+
+    Use this to reschedule a draft (change send_at), modify recipients,
+    or edit the subject/body before sending. To cancel a scheduled send,
+    delete the draft instead.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        inbox_id: str = SchemaField(
+            description="Inbox ID or email address the draft belongs to"
+        )
+        draft_id: str = SchemaField(description="Draft ID to update")
+        to: Optional[list[str]] = SchemaField(
+            description="Updated recipient email addresses (replaces existing list). Omit to keep current value.",
+            default=None,
+        )
+        subject: Optional[str] = SchemaField(
+            description="Updated subject line. Omit to keep current value.",
+            default=None,
+        )
+        text: Optional[str] = SchemaField(
+            description="Updated plain text body. Omit to keep current value.",
+            default=None,
+        )
+        html: Optional[str] = SchemaField(
+            description="Updated HTML body. Omit to keep current value.",
+            default=None,
+            advanced=True,
+        )
+        send_at: Optional[str] = SchemaField(
+            description="Reschedule: new ISO 8601 send time (e.g. '2025-01-20T14:00:00Z'). Omit to keep current value.",
+            default=None,
+            advanced=True,
+        )
+
+    class Output(BlockSchemaOutput):
+        draft_id: str = SchemaField(description="The updated draft ID")
+        send_status: str = SchemaField(description="Updated send status", default="")
+        result: dict = SchemaField(description="Complete updated draft object")
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="351f6e51-695a-421a-9032-46a587b10336",
+            description="Update a draft's content, recipients, or scheduled send time. Use to reschedule or edit before sending.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "inbox_id": "test-inbox",
+                "draft_id": "test-draft",
+            },
+            test_output=[
+                ("draft_id", "test-draft"),
+                ("send_status", ""),
+                ("result", dict),
+            ],
+            test_mock={
+                "update_draft": lambda *a, **kw: type(
+                    "Draft",
+                    (),
+                    {
+                        "draft_id": "test-draft",
+                        "send_status": "",
+                        "model_dump": lambda self: {"draft_id": "test-draft"},
+                    },
+                )(),
+            },
+        )
+
+    @staticmethod
+    async def update_draft(
+        credentials: APIKeyCredentials, inbox_id: str, draft_id: str, **params
+    ):
+        client = _client(credentials)
+        return await client.inboxes.drafts.update(
+            inbox_id=inbox_id, draft_id=draft_id, **params
+        )
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            params: dict = {}
+            if input_data.to is not None:
+                params["to"] = input_data.to
+            if input_data.subject is not None:
+                params["subject"] = input_data.subject
+            if input_data.text is not None:
+                params["text"] = input_data.text
+            if input_data.html is not None:
+                params["html"] = input_data.html
+            if input_data.send_at is not None:
+                params["send_at"] = input_data.send_at
+
+            draft = await self.update_draft(
+                credentials, input_data.inbox_id, input_data.draft_id, **params
+            )
+            result = draft.model_dump()
+
+            yield "draft_id", draft.draft_id
+            yield "send_status", draft.send_status or ""
+            yield "result", result
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailSendDraftBlock(Block):
+    """
+    Send a draft immediately, converting it into a delivered message.
+
+    The draft is deleted after successful sending and becomes a regular
+    message with a message_id. Use this for human-in-the-loop approval
+    workflows: agent creates draft, human reviews, then this block sends it.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        inbox_id: str = SchemaField(
+            description="Inbox ID or email address the draft belongs to"
+        )
+        draft_id: str = SchemaField(description="Draft ID to send now")
+
+    class Output(BlockSchemaOutput):
+        message_id: str = SchemaField(
+            description="Message ID of the now-sent email (draft is deleted)"
+        )
+        thread_id: str = SchemaField(
+            description="Thread ID the sent message belongs to"
+        )
+        result: dict = SchemaField(description="Complete sent message object")
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="37c39e83-475d-4b3d-843a-d923d001b85a",
+            description="Send a draft immediately, converting it into a delivered message. The draft is deleted after sending.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            is_sensitive_action=True,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "inbox_id": "test-inbox",
+                "draft_id": "test-draft",
+            },
+            test_output=[
+                ("message_id", "mock-msg-id"),
+                ("thread_id", "mock-thread-id"),
+                ("result", dict),
+            ],
+            test_mock={
+                "send_draft": lambda *a, **kw: type(
+                    "Msg",
+                    (),
+                    {
+                        "message_id": "mock-msg-id",
+                        "thread_id": "mock-thread-id",
+                        "model_dump": lambda self: {"message_id": "mock-msg-id"},
+                    },
+                )(),
+            },
+        )
+
+    @staticmethod
+    async def send_draft(credentials: APIKeyCredentials, inbox_id: str, draft_id: str):
+        client = _client(credentials)
+        return await client.inboxes.drafts.send(inbox_id=inbox_id, draft_id=draft_id)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            msg = await self.send_draft(
+                credentials, input_data.inbox_id, input_data.draft_id
+            )
+            result = msg.model_dump()
+
+            yield "message_id", msg.message_id
+            yield "thread_id", msg.thread_id or ""
+            yield "result", result
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailDeleteDraftBlock(Block):
+    """
+    Delete a draft from an AgentMail inbox. Also cancels any scheduled send.
+
+    If the draft was scheduled with send_at, deleting it cancels the
+    scheduled delivery. This is the way to cancel a scheduled email.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        inbox_id: str = SchemaField(
+            description="Inbox ID or email address the draft belongs to"
+        )
+        draft_id: str = SchemaField(
+            description="Draft ID to delete (also cancels scheduled sends)"
+        )
+
+    class Output(BlockSchemaOutput):
+        success: bool = SchemaField(
+            description="True if the draft was successfully deleted/cancelled"
+        )
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="9023eb99-3e2f-4def-808b-d9c584b3d9e7",
+            description="Delete a draft or cancel a scheduled email. Removes the draft permanently.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            is_sensitive_action=True,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "inbox_id": "test-inbox",
+                "draft_id": "test-draft",
+            },
+            test_output=[("success", True)],
+            test_mock={
+                "delete_draft": lambda *a, **kw: None,
+            },
+        )
+
+    @staticmethod
+    async def delete_draft(
+        credentials: APIKeyCredentials, inbox_id: str, draft_id: str
+    ):
+        client = _client(credentials)
+        await client.inboxes.drafts.delete(inbox_id=inbox_id, draft_id=draft_id)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            await self.delete_draft(
+                credentials, input_data.inbox_id, input_data.draft_id
+            )
+            yield "success", True
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailListOrgDraftsBlock(Block):
+    """
+    List all drafts across every inbox in your organization.
+
+    Returns drafts from all inboxes in one query. Perfect for building
+    a central approval dashboard where a human supervisor can review
+    and approve any draft created by any agent.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        limit: int = SchemaField(
+            description="Maximum number of drafts to return per page (1-100)",
+            default=20,
+            advanced=True,
+        )
+        page_token: str = SchemaField(
+            description="Token from a previous response to fetch the next page",
+            default="",
+            advanced=True,
+        )
+
+    class Output(BlockSchemaOutput):
+        drafts: list[dict] = SchemaField(
+            description="List of draft objects from all inboxes in the organization"
+        )
+        count: int = SchemaField(description="Number of drafts returned")
+        next_page_token: str = SchemaField(
+            description="Token for the next page. Empty if no more results.",
+            default="",
+        )
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="ed7558ae-3a07-45f5-af55-a25fe88c9971",
+            description="List all drafts across every inbox in your organization. Use for central approval dashboards.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={"credentials": TEST_CREDENTIALS_INPUT},
+            test_output=[
+                ("drafts", []),
+                ("count", 0),
+                ("next_page_token", ""),
+            ],
+            test_mock={
+                "list_org_drafts": lambda *a, **kw: type(
+                    "Resp",
+                    (),
+                    {
+                        "drafts": [],
+                        "count": 0,
+                        "next_page_token": "",
+                    },
+                )(),
+            },
+        )
+
+    @staticmethod
+    async def list_org_drafts(credentials: APIKeyCredentials, **params):
+        client = _client(credentials)
+        return await client.drafts.list(**params)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            params: dict = {"limit": input_data.limit}
+            if input_data.page_token:
+                params["page_token"] = input_data.page_token
+
+            response = await self.list_org_drafts(credentials, **params)
+            drafts = [d.model_dump() for d in response.drafts]
+
+            yield "drafts", drafts
+            yield "count", (c if (c := response.count) is not None else len(drafts))
+            yield "next_page_token", response.next_page_token or ""
+        except Exception as e:
+            yield "error", str(e)
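
Note on the partial-update pattern in `AgentMailUpdateDraftBlock.run`: the block forwards a field only when it is not `None`, so an explicit empty string still reaches the API while an omitted field is left untouched. The sketch below isolates that rule. `build_update_params` is a hypothetical helper written for illustration, not a function from this PR or from the AgentMail SDK.

```python
from typing import Any, Optional


def build_update_params(**fields: Optional[Any]) -> dict[str, Any]:
    """Keep only the fields the caller set; None means 'leave unchanged'.

    Hypothetical helper that mirrors the `is not None` checks in the block.
    """
    return {name: value for name, value in fields.items() if value is not None}


# An empty string is a real value and is kept; None entries are dropped.
params = build_update_params(
    to=None, subject="Revised subject", text="", html=None, send_at=None
)
print(params)
```

This differs from the truthiness checks (`if input_data.username:`) used by the create-inbox block, where an empty string means "not provided".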
--- a/autogpt_platform/backend/backend/blocks/agent_mail/inbox.py
+++ b/autogpt_platform/backend/backend/blocks/agent_mail/inbox.py
@@ -0,0 +1,414 @@
+"""
+AgentMail Inbox blocks — create, get, list, update, and delete inboxes.
+
+An Inbox is a fully programmable email account for AI agents. Each inbox gets
+a unique email address and can send, receive, and manage emails via the
+AgentMail API. You can create thousands of inboxes on demand.
+"""
+
+from agentmail.inboxes.types import CreateInboxRequest
+
+from backend.sdk import (
+    APIKeyCredentials,
+    Block,
+    BlockCategory,
+    BlockOutput,
+    BlockSchemaInput,
+    BlockSchemaOutput,
+    CredentialsMetaInput,
+    SchemaField,
+)
+
+from ._config import TEST_CREDENTIALS, TEST_CREDENTIALS_INPUT, _client, agent_mail
+
+
+class AgentMailCreateInboxBlock(Block):
+    """
+    Create a new email inbox for an AI agent via AgentMail.
+
+    Each inbox gets a unique email address (e.g. username@agentmail.to).
+    If username and domain are not provided, AgentMail auto-generates them.
+    Use custom domains by specifying the domain field.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        username: str = SchemaField(
+            description="Local part of the email address (e.g. 'support' for support@domain.com). Leave empty to auto-generate.",
+            default="",
+            advanced=False,
+        )
+        domain: str = SchemaField(
+            description="Email domain (e.g. 'mydomain.com'). Defaults to agentmail.to if empty.",
+            default="",
+            advanced=False,
+        )
+        display_name: str = SchemaField(
+            description="Friendly name shown in the 'From' field of sent emails (e.g. 'Support Agent')",
+            default="",
+            advanced=False,
+        )
+
+    class Output(BlockSchemaOutput):
+        inbox_id: str = SchemaField(
+            description="Unique identifier for the created inbox (also the email address)"
+        )
+        email_address: str = SchemaField(
+            description="Full email address of the inbox (e.g. support@agentmail.to)"
+        )
+        result: dict = SchemaField(
+            description="Complete inbox object with all metadata"
+        )
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="7a8ac219-c6ec-4eec-a828-81af283ce04c",
+            description="Create a new email inbox for an AI agent via AgentMail. Each inbox gets a unique address and can send/receive emails.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={"credentials": TEST_CREDENTIALS_INPUT},
+            test_output=[
+                ("inbox_id", "mock-inbox-id"),
+                ("email_address", "mock-inbox-id"),
+                ("result", dict),
+            ],
+            test_mock={
+                "create_inbox": lambda *a, **kw: type(
+                    "Inbox",
+                    (),
+                    {
+                        "inbox_id": "mock-inbox-id",
+                        "model_dump": lambda self: {"inbox_id": "mock-inbox-id"},
+                    },
+                )(),
+            },
+        )
+
+    @staticmethod
+    async def create_inbox(credentials: APIKeyCredentials, **params):
+        client = _client(credentials)
+        return await client.inboxes.create(request=CreateInboxRequest(**params))
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            params: dict = {}
+            if input_data.username:
+                params["username"] = input_data.username
+            if input_data.domain:
+                params["domain"] = input_data.domain
+            if input_data.display_name:
+                params["display_name"] = input_data.display_name
+
+            inbox = await self.create_inbox(credentials, **params)
+            result = inbox.model_dump()
+
+            yield "inbox_id", inbox.inbox_id
+            yield "email_address", inbox.inbox_id
+            yield "result", result
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailGetInboxBlock(Block):
+    """
+    Retrieve details of an existing AgentMail inbox by its ID or email address.
+
+    Returns the inbox metadata including email address, display name, and
+    configuration. Use this to check if an inbox exists or get its properties.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        inbox_id: str = SchemaField(
+            description="Inbox ID or email address to look up (e.g. 'support@agentmail.to')"
+        )
+
+    class Output(BlockSchemaOutput):
+        inbox_id: str = SchemaField(description="Unique identifier of the inbox")
+        email_address: str = SchemaField(description="Full email address of the inbox")
+        display_name: str = SchemaField(
+            description="Friendly name shown in the 'From' field", default=""
+        )
+        result: dict = SchemaField(
+            description="Complete inbox object with all metadata"
+        )
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="b858f62b-6c12-4736-aaf2-dbc5a9281320",
+            description="Retrieve details of an existing AgentMail inbox including its email address, display name, and configuration.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "inbox_id": "test-inbox",
+            },
+            test_output=[
+                ("inbox_id", "test-inbox"),
+                ("email_address", "test-inbox"),
+                ("display_name", ""),
+                ("result", dict),
+            ],
+            test_mock={
+                "get_inbox": lambda *a, **kw: type(
+                    "Inbox",
+                    (),
+                    {
+                        "inbox_id": "test-inbox",
+                        "display_name": "",
+                        "model_dump": lambda self: {"inbox_id": "test-inbox"},
+                    },
+                )(),
+            },
+        )
+
+    @staticmethod
+    async def get_inbox(credentials: APIKeyCredentials, inbox_id: str):
+        client = _client(credentials)
+        return await client.inboxes.get(inbox_id=inbox_id)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            inbox = await self.get_inbox(credentials, input_data.inbox_id)
+            result = inbox.model_dump()
+
+            yield "inbox_id", inbox.inbox_id
+            yield "email_address", inbox.inbox_id
+            yield "display_name", inbox.display_name or ""
+            yield "result", result
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailListInboxesBlock(Block):
+    """
+    List all email inboxes in your AgentMail organization.
+
+    Returns a paginated list of all inboxes with their metadata.
+    Use page_token for pagination when you have many inboxes.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        limit: int = SchemaField(
+            description="Maximum number of inboxes to return per page (1-100)",
+            default=20,
+            advanced=True,
+        )
+        page_token: str = SchemaField(
+            description="Token from a previous response to fetch the next page of results",
+            default="",
+            advanced=True,
+        )
+
+    class Output(BlockSchemaOutput):
+        inboxes: list[dict] = SchemaField(
+            description="List of inbox objects, each containing inbox_id, email_address, display_name, etc."
+        )
+        count: int = SchemaField(
+            description="Number of inboxes returned in this page"
+        )
+        next_page_token: str = SchemaField(
+            description="Token to pass as page_token to get the next page. Empty if no more results.",
+            default="",
+        )
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="cfd84a06-2121-4cef-8d14-8badf52d22f0",
+            description="List all email inboxes in your AgentMail organization with pagination support.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={"credentials": TEST_CREDENTIALS_INPUT},
+            test_output=[
+                ("inboxes", []),
+                ("count", 0),
+                ("next_page_token", ""),
+            ],
+            test_mock={
+                "list_inboxes": lambda *a, **kw: type(
+                    "Resp",
+                    (),
+                    {
+                        "inboxes": [],
+                        "count": 0,
+                        "next_page_token": "",
+                    },
+                )(),
+            },
+        )
+
+    @staticmethod
+    async def list_inboxes(credentials: APIKeyCredentials, **params):
+        client = _client(credentials)
+        return await client.inboxes.list(**params)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            params: dict = {"limit": input_data.limit}
+            if input_data.page_token:
+                params["page_token"] = input_data.page_token
+
+            response = await self.list_inboxes(credentials, **params)
+            inboxes = [i.model_dump() for i in response.inboxes]
+
+            yield "inboxes", inboxes
+            yield "count", (c if (c := response.count) is not None else len(inboxes))
+            yield "next_page_token", response.next_page_token or ""
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailUpdateInboxBlock(Block):
+    """
+    Update the display name of an existing AgentMail inbox.
+
+    Changes the friendly name shown in the 'From' field when emails are sent
+    from this inbox. The email address itself cannot be changed.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        inbox_id: str = SchemaField(
+            description="Inbox ID or email address to update (e.g. 'support@agentmail.to')"
+        )
+        display_name: str = SchemaField(
+            description="New display name for the inbox (e.g. 'Customer Support Bot')"
+        )
+
+    class Output(BlockSchemaOutput):
+        inbox_id: str = SchemaField(description="The updated inbox ID")
+        result: dict = SchemaField(
+            description="Complete updated inbox object with all metadata"
+        )
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="59b49f59-a6d1-4203-94c0-3908adac50b6",
+            description="Update the display name of an AgentMail inbox. Changes the 'From' name shown when emails are sent.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "inbox_id": "test-inbox",
+                "display_name": "Updated",
+            },
+            test_output=[
+                ("inbox_id", "test-inbox"),
+                ("result", dict),
+            ],
+            test_mock={
+                "update_inbox": lambda *a, **kw: type(
+                    "Inbox",
+                    (),
+                    {
+                        "inbox_id": "test-inbox",
+                        "model_dump": lambda self: {"inbox_id": "test-inbox"},
+                    },
+                )(),
+            },
+        )
+
+    @staticmethod
+    async def update_inbox(credentials: APIKeyCredentials, inbox_id: str, **params):
+        client = _client(credentials)
+        return await client.inboxes.update(inbox_id=inbox_id, **params)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            inbox = await self.update_inbox(
+                credentials,
+                input_data.inbox_id,
+                display_name=input_data.display_name,
+            )
+            result = inbox.model_dump()
+
+            yield "inbox_id", inbox.inbox_id
+            yield "result", result
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailDeleteInboxBlock(Block):
+    """
+    Permanently delete an AgentMail inbox and all its data.
+
+    This removes the inbox, all its messages, threads, and drafts.
+    This action cannot be undone. The email address will no longer
+    receive or send emails.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        inbox_id: str = SchemaField(
+            description="Inbox ID or email address to permanently delete"
+        )
+
+    class Output(BlockSchemaOutput):
+        success: bool = SchemaField(
+            description="True if the inbox was successfully deleted"
+        )
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="ade970ae-8428-4a7b-9278-b52054dbf535",
+            description="Permanently delete an AgentMail inbox and all its messages, threads, and drafts. This action cannot be undone.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            is_sensitive_action=True,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "inbox_id": "test-inbox",
+            },
+            test_output=[("success", True)],
+            test_mock={
+                "delete_inbox": lambda *a, **kw: None,
+            },
+        )
+
+    @staticmethod
+    async def delete_inbox(credentials: APIKeyCredentials, inbox_id: str):
+        client = _client(credentials)
+        await client.inboxes.delete(inbox_id=inbox_id)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            await self.delete_inbox(credentials, input_data.inbox_id)
+            yield "success", True
+        except Exception as e:
+            yield "error", str(e)
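
Note on pagination: the list blocks return one page plus a `next_page_token`, which is empty when there are no more results. A caller that wants every inbox could loop as sketched below. This is a minimal sketch under stated assumptions: `collect_all`, `FetchPage`, and `fake_fetch_page` are hypothetical names, and the fake stands in for one run of the list block rather than calling the real AgentMail API.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical stand-in for one run of a list block: takes a page_token and
# returns the block's "inboxes" and "next_page_token" outputs as a dict.
FetchPage = Callable[[str], Awaitable[dict[str, Any]]]


async def collect_all(fetch_page: FetchPage, max_pages: int = 100) -> list[dict]:
    """Follow next_page_token until it comes back empty or max_pages is hit."""
    items: list[dict] = []
    token = ""
    for _ in range(max_pages):
        page = await fetch_page(token)
        items.extend(page["inboxes"])
        token = page.get("next_page_token") or ""
        if not token:
            break
    return items


async def fake_fetch_page(token: str) -> dict[str, Any]:
    """In-memory fake with two pages, used only to exercise the loop."""
    pages = {
        "": {
            "inboxes": [{"inbox_id": "a"}, {"inbox_id": "b"}],
            "next_page_token": "p2",
        },
        "p2": {"inboxes": [{"inbox_id": "c"}], "next_page_token": ""},
    }
    return pages[token]


all_inboxes = asyncio.run(collect_all(fake_fetch_page))
print([inbox["inbox_id"] for inbox in all_inboxes])
```

The `max_pages` cap is a defensive choice so that a server that keeps returning a token cannot cause an unbounded loop.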
--- a/autogpt_platform/backend/backend/blocks/agent_mail/lists.py
+++ b/autogpt_platform/backend/backend/blocks/agent_mail/lists.py
@@ -0,0 +1,384 @@
+"""
+AgentMail List blocks — manage allow/block lists for email filtering.
+
+Lists let you control which email addresses and domains your agents can
+send to or receive from. There are four list types based on two dimensions:
+direction (send/receive) and type (allow/block).
+
+- receive + allow: Only accept emails from these addresses/domains
+- receive + block: Reject emails from these addresses/domains
+- send + allow: Only send emails to these addresses/domains
+- send + block: Prevent sending emails to these addresses/domains
+"""
+
+from enum import Enum
+
+from backend.sdk import (
+    APIKeyCredentials,
+    Block,
+    BlockCategory,
+    BlockOutput,
+    BlockSchemaInput,
+    BlockSchemaOutput,
+    CredentialsMetaInput,
+    SchemaField,
+)
+
+from ._config import TEST_CREDENTIALS, TEST_CREDENTIALS_INPUT, _client, agent_mail
+
+
+class ListDirection(str, Enum):
+    SEND = "send"
+    RECEIVE = "receive"
+
+
+class ListType(str, Enum):
+    ALLOW = "allow"
+    BLOCK = "block"
+
+
+class AgentMailListEntriesBlock(Block):
+    """
+    List all entries in an AgentMail allow/block list.
+
+    Retrieves email addresses and domains that are currently allowed
+    or blocked for sending or receiving. Use direction and list_type
+    to select which of the four lists to query.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        direction: ListDirection = SchemaField(
+            description="'send' to filter outgoing emails, 'receive' to filter incoming emails"
+        )
+        list_type: ListType = SchemaField(
+            description="'allow' for whitelist (only permit these), 'block' for blacklist (reject these)"
+        )
+        limit: int = SchemaField(
+            description="Maximum number of entries to return per page",
+            default=20,
+            advanced=True,
+        )
+        page_token: str = SchemaField(
+            description="Token from a previous response to fetch the next page",
+            default="",
+            advanced=True,
+        )
+
+    class Output(BlockSchemaOutput):
+        entries: list[dict] = SchemaField(
+            description="List of entries, each with an email address or domain"
+        )
+        count: int = SchemaField(description="Number of entries returned")
+        next_page_token: str = SchemaField(
+            description="Token for the next page. Empty if no more results.",
+            default="",
+        )
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="01489100-35da-45aa-8a01-9540ba0e9a21",
+            description="List all entries in an AgentMail allow/block list. Choose send/receive direction and allow/block type.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "direction": "receive",
+                "list_type": "block",
+            },
+            test_output=[
+                ("entries", []),
+                ("count", 0),
+                ("next_page_token", ""),
+            ],
+            test_mock={
+                "list_entries": lambda *a, **kw: type(
+                    "Resp",
+                    (),
+                    {
+                        "entries": [],
+                        "count": 0,
+                        "next_page_token": "",
+                    },
+                )(),
+            },
+        )
+
+    @staticmethod
+    async def list_entries(
+        credentials: APIKeyCredentials, direction: str, list_type: str, **params
+    ):
+        client = _client(credentials)
+        return await client.lists.list(direction, list_type, **params)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            params: dict = {"limit": input_data.limit}
+            if input_data.page_token:
+                params["page_token"] = input_data.page_token
+
+            response = await self.list_entries(
+                credentials,
+                input_data.direction.value,
+                input_data.list_type.value,
+                **params,
+            )
+            entries = [e.model_dump() for e in response.entries]
+
+            yield "entries", entries
+            # Use the count from the response; fall back to len(entries) if it is None.
+            count = response.count
+            yield "count", count if count is not None else len(entries)
+            yield "next_page_token", response.next_page_token or ""
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailCreateListEntryBlock(Block):
+    """
+    Add an email address or domain to an AgentMail allow/block list.
+
+    Entries can be full email addresses (e.g. 'partner@example.com') or
+    entire domains (e.g. 'example.com'). For block lists, you can optionally
+    provide a reason (e.g. 'spam', 'competitor').
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        direction: ListDirection = SchemaField(
+            description="'send' for outgoing email rules, 'receive' for incoming email rules"
+        )
+        list_type: ListType = SchemaField(
+            description="'allow' to whitelist, 'block' to blacklist"
+        )
+        entry: str = SchemaField(
+            description="Email address (user@example.com) or domain (example.com) to add"
+        )
+        reason: str = SchemaField(
+            description="Reason for blocking (only used with block lists, e.g. 'spam', 'competitor')",
+            default="",
+            advanced=True,
+        )
+
+    class Output(BlockSchemaOutput):
+        entry: str = SchemaField(
+            description="The email address or domain that was added"
+        )
+        result: dict = SchemaField(description="Complete entry object")
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="b6650a0a-b113-40cf-8243-ff20f684f9b8",
+            description="Add an email address or domain to an allow/block list. Block spam senders or whitelist trusted domains.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            is_sensitive_action=True,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "direction": "receive",
+                "list_type": "block",
+                "entry": "spam@example.com",
+            },
+            test_output=[
+                ("entry", "spam@example.com"),
+                ("result", dict),
+            ],
+            test_mock={
+                "create_entry": lambda *a, **kw: type(
+                    "Entry",
+                    (),
+                    {
+                        "model_dump": lambda self: {"entry": "spam@example.com"},
+                    },
+                )(),
+            },
+        )
+
+    @staticmethod
+    async def create_entry(
+        credentials: APIKeyCredentials, direction: str, list_type: str, **params
+    ):
+        client = _client(credentials)
+        return await client.lists.create(direction, list_type, **params)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            params: dict = {"entry": input_data.entry}
+            # A reason is only used with block lists, so omit it for allow lists.
+            if input_data.reason and input_data.list_type == ListType.BLOCK:
+                params["reason"] = input_data.reason
+
+            result = await self.create_entry(
+                credentials,
+                input_data.direction.value,
+                input_data.list_type.value,
+                **params,
+            )
+            result_dict = result.model_dump()
+
+            yield "entry", input_data.entry
+            yield "result", result_dict
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailGetListEntryBlock(Block):
+    """
+    Check if an email address or domain exists in an AgentMail allow/block list.
+
+    Returns the entry details if found. Use this to verify whether a specific
+    address or domain is currently allowed or blocked.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        direction: ListDirection = SchemaField(
+            description="'send' for outgoing rules, 'receive' for incoming rules"
+        )
+        list_type: ListType = SchemaField(
+            description="'allow' for whitelist, 'block' for blacklist"
+        )
+        entry: str = SchemaField(description="Email address or domain to look up")
+
+    class Output(BlockSchemaOutput):
+        entry: str = SchemaField(
+            description="The email address or domain that was found"
+        )
+        result: dict = SchemaField(description="Complete entry object with metadata")
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="fb117058-ab27-40d1-9231-eb1dd526fc7a",
+            description="Check if an email address or domain is in an allow/block list. Verify filtering rules.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "direction": "receive",
+                "list_type": "block",
+                "entry": "spam@example.com",
+            },
+            test_output=[
+                ("entry", "spam@example.com"),
+                ("result", dict),
+            ],
+            test_mock={
+                "get_entry": lambda *a, **kw: type(
+                    "Entry",
+                    (),
+                    {
+                        "model_dump": lambda self: {"entry": "spam@example.com"},
+                    },
+                )(),
+            },
+        )
+
+    @staticmethod
+    async def get_entry(
+        credentials: APIKeyCredentials, direction: str, list_type: str, entry: str
+    ):
+        client = _client(credentials)
+        return await client.lists.get(direction, list_type, entry=entry)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            result = await self.get_entry(
+                credentials,
+                input_data.direction.value,
+                input_data.list_type.value,
+                input_data.entry,
+            )
+            result_dict = result.model_dump()
+
+            yield "entry", input_data.entry
+            yield "result", result_dict
+        except Exception as e:
+            yield "error", str(e)
+
+
+class AgentMailDeleteListEntryBlock(Block):
+    """
+    Remove an email address or domain from an AgentMail allow/block list.
+
+    After removal, the address/domain will no longer be filtered by this list.
+    """
+
+    class Input(BlockSchemaInput):
+        credentials: CredentialsMetaInput = agent_mail.credentials_field(
+            description="AgentMail API key from https://console.agentmail.to"
+        )
+        direction: ListDirection = SchemaField(
+            description="'send' for outgoing rules, 'receive' for incoming rules"
+        )
+        list_type: ListType = SchemaField(
+            description="'allow' for whitelist, 'block' for blacklist"
+        )
+        entry: str = SchemaField(
+            description="Email address or domain to remove from the list"
+        )
+
+    class Output(BlockSchemaOutput):
+        success: bool = SchemaField(
+            description="True if the entry was successfully removed"
+        )
+        error: str = SchemaField(description="Error message if the operation failed")
+
+    def __init__(self):
+        super().__init__(
+            id="2b8d57f1-1c9e-470f-a70b-5991c80fad5f",
+            description="Remove an email address or domain from an allow/block list to stop filtering it.",
+            categories={BlockCategory.COMMUNICATION},
+            input_schema=self.Input,
+            output_schema=self.Output,
+            is_sensitive_action=True,
+            test_credentials=TEST_CREDENTIALS,
+            test_input={
+                "credentials": TEST_CREDENTIALS_INPUT,
+                "direction": "receive",
+                "list_type": "block",
+                "entry": "spam@example.com",
+            },
+            test_output=[("success", True)],
+            test_mock={
+                "delete_entry": lambda *a, **kw: None,
+            },
+        )
+
+    @staticmethod
+    async def delete_entry(
+        credentials: APIKeyCredentials, direction: str, list_type: str, entry: str
+    ):
+        client = _client(credentials)
+        await client.lists.delete(direction, list_type, entry=entry)
+
+    async def run(
+        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+    ) -> BlockOutput:
+        try:
+            await self.delete_entry(
+                credentials,
+                input_data.direction.value,
+                input_data.list_type.value,
+                input_data.entry,
+            )
+            yield "success", True
+        except Exception as e:
+            yield "error", str(e)
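
The output handling in `AgentMailListEntriesBlock.run` above can be sketched in isolation. This is a minimal illustration, not the block's actual test harness: `list_entries_stub`, `resolve_count`, and `collect_outputs` are hypothetical names, and the stub response (a `SimpleNamespace` with `count` and `next_page_token` set to `None`) is an assumption chosen to exercise both fallbacks.

```python
import asyncio
from types import SimpleNamespace


def resolve_count(reported, entries):
    """Prefer the reported count; fall back to len(entries) when it is None."""
    return reported if reported is not None else len(entries)


async def list_entries_stub(*args, **kwargs):
    # Hypothetical stand-in for the mocked `list_entries` call; it ignores
    # its arguments and returns a response with no count and no page token.
    return SimpleNamespace(entries=[], count=None, next_page_token=None)


async def collect_outputs():
    response = await list_entries_stub("receive", "block", limit=10)
    entries = list(response.entries)
    # Mirror the (name, value) pairs that the block yields.
    return [
        ("entries", entries),
        ("count", resolve_count(response.count, entries)),
        ("next_page_token", response.next_page_token or ""),
    ]


outputs = asyncio.run(collect_outputs())
```

Testing `is not None` rather than truthiness keeps a legitimately reported count of `0` from being replaced by `len(entries)`.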