refactor(backend/copilot): move imports to module level

- Move KEY_WORKFLOWS and TOOL_REGISTRY imports to top of file
- Better code organization following Python conventions
test(backend/copilot): add tests for auto-generated tool documentation
2026-03-17 03:00:27 -04:00 · 2026-03-06 23:15:39 +07:00 · 2026-03-06 23:15:39 +07:00 · 2026-03-06 23:10:42 +07:00 · 2026-03-06 23:10:42 +07:00 · 2026-03-06 23:10:42 +07:00
377 changed files with 9004 additions and 34708 deletions
--- a/.claude/skills/pr-address/SKILL.md
+++ b/.claude/skills/pr-address/SKILL.md
@@ -1,79 +0,0 @@
---
-name: pr-address
-description: Address PR review comments and loop until CI green and all comments resolved. TRIGGER when user asks to address comments, fix PR feedback, respond to reviewers, or babysit/monitor a PR.
-user-invocable: true
-args: "[PR number or URL] — if omitted, finds PR for current branch."
-metadata:
-  author: autogpt-team
-  version: "1.0.0"
---
-
-# PR Address
-
-## Find the PR
-
-```bash
-gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT
-gh pr view {N}
-```
-
-## Fetch comments (all sources)
-
-```bash
-gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews       # top-level reviews
-gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments      # inline review comments
-gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments     # PR conversation comments
-```
-
-**Bots to watch for:**
-- `autogpt-reviewer` — posts "Blockers", "Should Fix", "Nice to Have". Address ALL of them.
-- `sentry[bot]` — bug predictions. Fix real bugs, explain false positives.
-- `coderabbitai[bot]` — automated review. Address actionable items.
-
-## For each unaddressed comment
-
-Address comments **one at a time**: fix → commit → push → inline reply → next.
-
-1. Read the referenced code, make the fix (or reply explaining why it's not needed)
-2. Commit and push the fix
-3. Reply **inline** (not as a new top-level comment) referencing the fixing commit — this is what resolves the conversation for bot reviewers (coderabbitai, sentry):
-
-| Comment type | How to reply |
-|---|---|
-| Inline review (`pulls/{N}/comments`) | `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID}/replies -f body="Fixed in <commit-sha>: <description>"` |
-| Conversation (`issues/{N}/comments`) | `gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments -f body="Fixed in <commit-sha>: <description>"` |
-
-## Format and commit
-
-After fixing, format the changed code:
-
-- **Backend** (from `autogpt_platform/backend/`): `poetry run format`
-- **Frontend** (from `autogpt_platform/frontend/`): `pnpm format && pnpm lint && pnpm types`
-
-If API routes changed, regenerate the frontend client:
-```bash
-cd autogpt_platform/backend && poetry run rest &
-REST_PID=$!
-trap "kill $REST_PID 2>/dev/null" EXIT
-WAIT=0; until curl -sf http://localhost:8006/health > /dev/null 2>&1; do sleep 1; WAIT=$((WAIT+1)); [ $WAIT -ge 60 ] && echo "Timed out" && exit 1; done
-cd ../frontend && pnpm generate:api:force
-kill $REST_PID 2>/dev/null; trap - EXIT
-```
-Never manually edit files in `src/app/api/__generated__/`.
-
-Then commit and **push immediately** — never batch commits without pushing.
-
-For backend commits in worktrees: `poetry run git commit` (pre-commit hooks).
-
-## The loop
-
-```text
-address comments → format → commit → push
-→ re-check comments → fix new ones → push
-→ wait for CI → re-check comments after CI settles
-→ repeat until: all comments addressed AND CI green AND no new comments arriving
-```
-
-While CI runs, stay productive: run local tests, address remaining comments.
-
-**The loop ends when:** CI fully green + all comments addressed + no new comments since CI settled.
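
The reply table in the removed pr-address skill maps each comment type to a different `gh api` endpoint. A minimal sketch of that mapping, assuming a hypothetical helper `reply_command` that only builds the argument list and never runs it; the endpoints and the `Fixed in <commit-sha>: <description>` body are taken from the table, everything else is illustrative:

```python
REPO = "Significant-Gravitas/AutoGPT"


def reply_command(kind, pr_number, commit_sha, description, comment_id=None):
    """Build the `gh api` argv for replying to a review comment.

    Hypothetical helper: `kind` is "inline" or "conversation" (names chosen
    here for illustration, not taken from the skill file).
    """
    body = f"Fixed in {commit_sha}: {description}"
    if kind == "inline":
        # Inline review comments are answered on the comment's own thread.
        if comment_id is None:
            raise ValueError("inline replies need the comment ID")
        endpoint = f"repos/{REPO}/pulls/{pr_number}/comments/{comment_id}/replies"
    elif kind == "conversation":
        # Conversation comments live on the issue endpoint of the PR.
        endpoint = f"repos/{REPO}/issues/{pr_number}/comments"
    else:
        raise ValueError(f"unknown comment kind: {kind}")
    return ["gh", "api", endpoint, "-f", f"body={body}"]
```

Returning an argument list rather than a shell string keeps the description free of quoting problems if it is later passed to `subprocess.run`.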
--- a/.claude/skills/pr-review/SKILL.md
+++ b/.claude/skills/pr-review/SKILL.md
@@ -1,74 +0,0 @@
---
-name: pr-review
-description: Review a PR for correctness, security, code quality, and testing issues. TRIGGER when user asks to review a PR, check PR quality, or give feedback on a PR.
-user-invocable: true
-args: "[PR number or URL] — if omitted, finds PR for current branch."
-metadata:
-  author: autogpt-team
-  version: "1.0.0"
---
-
-# PR Review
-
-## Find the PR
-
-```bash
-gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT
-gh pr view {N}
-```
-
-## Read the diff
-
-```bash
-gh pr diff {N}
-```
-
-## Fetch existing review comments
-
-Before posting anything, fetch existing inline comments to avoid duplicates:
-
-```bash
-gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments
-gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews
-```
-
-## What to check
-
-**Correctness:** logic errors, off-by-one, missing edge cases, race conditions (TOCTOU in file access, credit charging), error handling gaps, async correctness (missing `await`, unclosed resources).
-
-**Security:** input validation at boundaries, no injection (command, XSS, SQL), secrets not logged, file paths sanitized (`os.path.basename()` in error messages).
-
-**Code quality:** apply rules from backend/frontend CLAUDE.md files.
-
-**Architecture:** DRY, single responsibility, modular functions. `Security()` vs `Depends()` for FastAPI auth. `data:` for SSE events, `: comment` for heartbeats. `transaction=True` for Redis pipelines.
-
-**Testing:** edge cases covered, colocated `*_test.py` (backend) / `__tests__/` (frontend), mocks target where symbol is **used** not defined, `AsyncMock` for async.
-
-## Output format
-
-Every comment **must** be prefixed with `🤖` and a criticality badge:
-
-| Tier | Badge | Meaning |
-|---|---|---|
-| Blocker | `🔴 **Blocker**` | Must fix before merge |
-| Should Fix | `🟠 **Should Fix**` | Important improvement |
-| Nice to Have | `🟡 **Nice to Have**` | Minor suggestion |
-| Nit | `🔵 **Nit**` | Style / wording |
-
-Example: `🤖 🔴 **Blocker**: Missing error handling for X — suggest wrapping in try/except.`
-
-## Post inline comments
-
-For each finding, post an inline comment on the PR (do not just write a local report):
-
-```bash
-# Get the latest commit SHA for the PR
-COMMIT_SHA=$(gh api repos/Significant-Gravitas/AutoGPT/pulls/{N} --jq '.head.sha')
-
-# Post an inline comment on a specific file/line
-gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments \
-  -f body="🤖 🔴 **Blocker**: <description>" \
-  -f commit_id="$COMMIT_SHA" \
-  -f path="<file path>" \
-  -F line=<line number>
-```
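
The output format in the removed pr-review skill prefixes every comment with `🤖` and a criticality badge. A small sketch of that rule, assuming a hypothetical `format_comment` helper; the tier names and badges are copied from the table, the function itself is not part of the repository:

```python
# Tier -> badge, copied from the table in the removed SKILL.md.
BADGES = {
    "Blocker": "🔴 **Blocker**",
    "Should Fix": "🟠 **Should Fix**",
    "Nice to Have": "🟡 **Nice to Have**",
    "Nit": "🔵 **Nit**",
}


def format_comment(tier, message):
    """Return a review comment with the robot prefix and tier badge."""
    if tier not in BADGES:
        raise ValueError(f"unknown tier: {tier}")
    return f"🤖 {BADGES[tier]}: {message}"
```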
--- a/.claude/skills/worktree/SKILL.md
+++ b/.claude/skills/worktree/SKILL.md
@@ -1,85 +0,0 @@
---
-name: worktree
-description: Set up a new git worktree for parallel development. Creates the worktree, copies .env files, installs dependencies, and generates Prisma client. TRIGGER when user asks to set up a worktree, work on a branch in isolation, or needs a separate environment for a branch or PR.
-user-invocable: true
-args: "[name] — optional worktree name (e.g., 'AutoGPT7'). If omitted, uses next available AutoGPT<N>."
-metadata:
-  author: autogpt-team
-  version: "3.0.0"
---
-
-# Worktree Setup
-
-## Create the worktree
-
-Derive paths from the git toplevel. If a name is provided as argument, use it. Otherwise, check `git worktree list` and pick the next `AutoGPT<N>`.
-
-```bash
-ROOT=$(git rev-parse --show-toplevel)
-PARENT=$(dirname "$ROOT")
-
-# From an existing branch
-git worktree add "$PARENT/<NAME>" <branch-name>
-
-# From a new branch off dev
-git worktree add -b <new-branch> "$PARENT/<NAME>" dev
-```
-
-## Copy environment files
-
-Copy `.env` from the root worktree. Falls back to `.env.default` if `.env` doesn't exist.
-
-```bash
-ROOT=$(git rev-parse --show-toplevel)
-TARGET="$(dirname "$ROOT")/<NAME>"
-
-for envpath in autogpt_platform/backend autogpt_platform/frontend autogpt_platform; do
-  if [ -f "$ROOT/$envpath/.env" ]; then
-    cp "$ROOT/$envpath/.env" "$TARGET/$envpath/.env"
-  elif [ -f "$ROOT/$envpath/.env.default" ]; then
-    cp "$ROOT/$envpath/.env.default" "$TARGET/$envpath/.env"
-  fi
-done
-```
-
-## Install dependencies
-
-```bash
-TARGET="$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
-cd "$TARGET/autogpt_platform/autogpt_libs" && poetry install
-cd "$TARGET/autogpt_platform/backend" && poetry install && poetry run prisma generate
-cd "$TARGET/autogpt_platform/frontend" && pnpm install
-```
-
-Replace `<NAME>` with the actual worktree name (e.g., `AutoGPT7`).
-
-## Running the app (optional)
-
-Backend uses ports: 8001, 8002, 8003, 8005, 8006, 8007, 8008. Free them first if needed:
-
-```bash
-TARGET="$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
-for port in 8001 8002 8003 8005 8006 8007 8008; do
-  lsof -ti :$port | xargs kill -9 2>/dev/null || true
-done
-cd "$TARGET/autogpt_platform/backend" && poetry run app
-```
-
-## CoPilot testing
-
-SDK mode spawns a Claude subprocess — won't work inside Claude Code. Set `CHAT_USE_CLAUDE_AGENT_SDK=false` in `backend/.env` to use baseline mode.
-
-## Cleanup
-
-```bash
-# Replace <NAME> with the actual worktree name (e.g., AutoGPT7)
-git worktree remove "$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
-```
-
-## Alternative: Branchlet (optional)
-
-If [branchlet](https://www.npmjs.com/package/branchlet) is installed:
-
-```bash
-branchlet create -n <name> -s <source-branch> -b <new-branch>
-```
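
The removed worktree skill says to pick the next `AutoGPT<N>` name when none is given, without defining "next". One possible reading, sketched under the assumption that it means one more than the highest existing number; `next_worktree_name` is a hypothetical helper and takes directory names that the caller has already extracted from `git worktree list`:

```python
import re


def next_worktree_name(existing, prefix="AutoGPT"):
    """Return prefix + (highest existing N + 1), or prefix + "1" if none exist.

    Assumption: names without a numeric suffix (e.g. the root "AutoGPT"
    checkout) are ignored rather than counted as N = 0 or 1.
    """
    pattern = re.compile(rf"^{re.escape(prefix)}(\d+)$")
    numbers = [int(m.group(1)) for name in existing if (m := pattern.match(name))]
    return f"{prefix}{max(numbers, default=0) + 1}"
```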
--- a/autogpt_platform/CLAUDE.md
+++ b/autogpt_platform/CLAUDE.md
@@ -60,12 +60,9 @@ AutoGPT Platform is a monorepo containing:

 ### Reviewing/Revising Pull Requests

-Use `/pr-review` to review a PR or `/pr-address` to address comments.
-
-When fetching comments manually:
-- `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews` — top-level reviews
-- `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments` — inline review comments
-- `gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments` — PR conversation comments
+- When the user runs /pr-comments or tries to fetch them, also run gh api /repos/Significant-Gravitas/AutoGPT/pulls/[issuenum]/reviews to get the reviews
+- Use gh api /repos/Significant-Gravitas/AutoGPT/pulls/[issuenum]/reviews/[review_id]/comments to get the review contents
+- Use gh api /repos/Significant-Gravitas/AutoGPT/issues/9924/comments to get the pr specific comments

 ### Conventional Commits

--- a/autogpt_platform/analytics/queries/auth_activities.sql
+++ b/autogpt_platform/analytics/queries/auth_activities.sql
@@ -1,40 +0,0 @@
-- =============================================================
-- View: analytics.auth_activities
-- Looker source alias: ds49  |  Charts: 1
-- =============================================================
-- DESCRIPTION
--   Tracks authentication events (login, logout, SSO, password
--   reset, etc.) from Supabase's internal audit log.
--   Useful for monitoring sign-in patterns and detecting anomalies.
--
-- SOURCE TABLES
--   auth.audit_log_entries  — Supabase internal auth event log
--
-- OUTPUT COLUMNS
--   created_at      TIMESTAMPTZ  When the auth event occurred
--   actor_id        TEXT         User ID who triggered the event
--   actor_via_sso   TEXT         Whether the action was via SSO ('true'/'false')
--   action          TEXT         Event type (e.g. 'login', 'logout', 'token_refreshed')
--
-- WINDOW
--   Rolling 90 days from current date
--
-- EXAMPLE QUERIES
--   -- Daily login counts
--   SELECT DATE_TRUNC('day', created_at) AS day, COUNT(*) AS logins
--   FROM analytics.auth_activities
--   WHERE action = 'login'
--   GROUP BY 1 ORDER BY 1;
--
--   -- SSO vs password login breakdown
--   SELECT actor_via_sso, COUNT(*) FROM analytics.auth_activities
--   WHERE action = 'login' GROUP BY 1;
-- =============================================================
-
-SELECT
-    created_at,
-    payload->>'actor_id'      AS actor_id,
-    payload->>'actor_via_sso' AS actor_via_sso,
-    payload->>'action'        AS action
-FROM auth.audit_log_entries
-WHERE created_at >= NOW() - INTERVAL '90 days'
--- a/autogpt_platform/analytics/queries/graph_execution.sql
+++ b/autogpt_platform/analytics/queries/graph_execution.sql
@@ -1,105 +0,0 @@
-- =============================================================
-- View: analytics.graph_execution
-- Looker source alias: ds16  |  Charts: 21
-- =============================================================
-- DESCRIPTION
--   One row per agent graph execution (last 90 days).
--   Unpacks the JSONB stats column into individual numeric columns
--   and normalises the executionStatus — runs that failed due to
--   insufficient credits are reclassified as 'NO_CREDITS' for
--   easier filtering.  Error messages are scrubbed of IDs and URLs
--   to allow safe grouping.
--
-- SOURCE TABLES
--   platform.AgentGraphExecution  — Execution records
--   platform.AgentGraph           — Agent graph metadata (for name)
--   platform.LibraryAgent         — To flag possibly-AI (safe-mode) agents
--
-- OUTPUT COLUMNS
--   id                TEXT         Execution UUID
--   agentGraphId      TEXT         Agent graph UUID
--   agentGraphVersion INT          Graph version number
--   executionStatus   TEXT         COMPLETED | FAILED | NO_CREDITS | RUNNING | QUEUED | TERMINATED
--   createdAt         TIMESTAMPTZ  When the execution was queued
--   updatedAt         TIMESTAMPTZ  Last status update time
--   userId            TEXT         Owner user UUID
--   agentGraphName    TEXT         Human-readable agent name
--   cputime           DECIMAL      Total CPU seconds consumed
--   walltime          DECIMAL      Total wall-clock seconds
--   node_count        DECIMAL      Number of nodes in the graph
--   nodes_cputime     DECIMAL      CPU time across all nodes
--   nodes_walltime    DECIMAL      Wall time across all nodes
--   execution_cost    DECIMAL      Credit cost of this execution
--   correctness_score FLOAT        AI correctness score (if available)
--   possibly_ai       BOOLEAN      True if agent has sensitive_action_safe_mode enabled
--   groupedErrorMessage TEXT       Scrubbed error string (IDs/URLs replaced with wildcards)
--
-- WINDOW
--   Rolling 90 days (createdAt > CURRENT_DATE - 90 days)
--
-- EXAMPLE QUERIES
--   -- Daily execution counts by status
--   SELECT DATE_TRUNC('day', "createdAt") AS day, "executionStatus", COUNT(*)
--   FROM analytics.graph_execution
--   GROUP BY 1, 2 ORDER BY 1;
--
--   -- Average cost per execution by agent
--   SELECT "agentGraphName", AVG("execution_cost") AS avg_cost, COUNT(*) AS runs
--   FROM analytics.graph_execution
--   WHERE "executionStatus" = 'COMPLETED'
--   GROUP BY 1 ORDER BY avg_cost DESC;
--
--   -- Top error messages
--   SELECT "groupedErrorMessage", COUNT(*) AS occurrences
--   FROM analytics.graph_execution
--   WHERE "executionStatus" = 'FAILED'
--   GROUP BY 1 ORDER BY 2 DESC LIMIT 20;
-- =============================================================
-
-SELECT
-    ge."id"                                                        AS id,
-    ge."agentGraphId"                                              AS agentGraphId,
-    ge."agentGraphVersion"                                         AS agentGraphVersion,
-    CASE
-        WHEN jsonb_exists(ge."stats"::jsonb, 'error')
-         AND (
-               (ge."stats"::jsonb->>'error') ILIKE '%insufficient balance%'
-            OR (ge."stats"::jsonb->>'error') ILIKE '%you have no credits left%'
-             )
-        THEN 'NO_CREDITS'
-        ELSE CAST(ge."executionStatus" AS TEXT)
-    END                                                            AS executionStatus,
-    ge."createdAt"                                                 AS createdAt,
-    ge."updatedAt"                                                 AS updatedAt,
-    ge."userId"                                                    AS userId,
-    g."name"                                                       AS agentGraphName,
-    (ge."stats"::jsonb->>'cputime')::decimal                       AS cputime,
-    (ge."stats"::jsonb->>'walltime')::decimal                      AS walltime,
-    (ge."stats"::jsonb->>'node_count')::decimal                    AS node_count,
-    (ge."stats"::jsonb->>'nodes_cputime')::decimal                 AS nodes_cputime,
-    (ge."stats"::jsonb->>'nodes_walltime')::decimal                AS nodes_walltime,
-    (ge."stats"::jsonb->>'cost')::decimal                          AS execution_cost,
-    (ge."stats"::jsonb->>'correctness_score')::float               AS correctness_score,
-    COALESCE(la.possibly_ai, FALSE)                                AS possibly_ai,
-    REGEXP_REPLACE(
-        REGEXP_REPLACE(
-            TRIM(BOTH '"' FROM ge."stats"::jsonb->>'error'),
-            '(https?://)([A-Za-z0-9.-]+)(:[0-9]+)?(/[^\s]*)?',
-            '\1\2/...', 'gi'
-        ),
-        '[a-zA-Z0-9_:-]*\d[a-zA-Z0-9_:-]*', '*', 'g'
-    )                                                              AS groupedErrorMessage
-FROM platform."AgentGraphExecution" ge
-LEFT JOIN platform."AgentGraph" g
-       ON ge."agentGraphId" = g."id"
-      AND ge."agentGraphVersion" = g."version"
-LEFT JOIN (
-    SELECT DISTINCT ON ("userId", "agentGraphId")
-           "userId", "agentGraphId",
-           ("settings"::jsonb->>'sensitive_action_safe_mode')::boolean AS possibly_ai
-    FROM platform."LibraryAgent"
-    WHERE "isDeleted"  = FALSE
-      AND "isArchived" = FALSE
-    ORDER BY "userId", "agentGraphId", "agentGraphVersion" DESC
-) la ON la."userId" = ge."userId" AND la."agentGraphId" = ge."agentGraphId"
-WHERE ge."createdAt" > CURRENT_DATE - INTERVAL '90 days'
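
The `groupedErrorMessage` column in the removed view is built from two nested `REGEXP_REPLACE` calls: the first trims URLs down to scheme and host, the second replaces any token containing a digit with `*`. A rough Python equivalent for experimenting with the patterns, assuming Python's `re` behaves like PostgreSQL's regex engine for these particular expressions; `scrub_error` is a hypothetical name:

```python
import re

# Same patterns as the SQL; the URL pattern is case-insensitive ('gi' flags).
URL_RE = re.compile(r"(https?://)([A-Za-z0-9.-]+)(:[0-9]+)?(/[^\s]*)?", re.IGNORECASE)
ID_RE = re.compile(r"[a-zA-Z0-9_:-]*\d[a-zA-Z0-9_:-]*")


def scrub_error(message: str) -> str:
    """Approximate the view's error scrubbing for grouping purposes."""
    trimmed = message.strip('"')                    # TRIM(BOTH '"' FROM ...)
    without_paths = URL_RE.sub(r"\1\2/...", trimmed)  # keep scheme + host only
    return ID_RE.sub("*", without_paths)            # wildcard digit-bearing tokens
```

Because the second pattern matches any run of identifier characters that contains a digit, hostnames with digits are wildcarded as well, which is worth keeping in mind when reading grouped messages.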
--- a/autogpt_platform/analytics/queries/node_block_execution.sql
+++ b/autogpt_platform/analytics/queries/node_block_execution.sql
@@ -1,101 +0,0 @@
-- =============================================================
-- View: analytics.node_block_execution
-- Looker source alias: ds14  |  Charts: 11
-- =============================================================
-- DESCRIPTION
--   One row per node (block) execution (last 90 days).
--   Unpacks stats JSONB and joins to identify which block type
--   was run.  For failed nodes, joins the error output and
--   scrubs it for safe grouping.
--
-- SOURCE TABLES
--   platform.AgentNodeExecution              — Node execution records
--   platform.AgentNode                       — Node → block mapping
--   platform.AgentBlock                      — Block name/ID
--   platform.AgentNodeExecutionInputOutput   — Error output values
--
-- OUTPUT COLUMNS
--   id                    TEXT         Node execution UUID
--   agentGraphExecutionId TEXT         Parent graph execution UUID
--   agentNodeId           TEXT         Node UUID within the graph
--   executionStatus       TEXT         COMPLETED | FAILED | QUEUED | RUNNING | TERMINATED
--   addedTime             TIMESTAMPTZ  When the node was queued
--   queuedTime            TIMESTAMPTZ  When it entered the queue
--   startedTime           TIMESTAMPTZ  When execution started
--   endedTime             TIMESTAMPTZ  When execution finished
--   inputSize             BIGINT       Input payload size in bytes
--   outputSize            BIGINT       Output payload size in bytes
--   walltime              NUMERIC      Wall-clock seconds for this node
--   cputime               NUMERIC      CPU seconds for this node
--   llmRetryCount         INT          Number of LLM retries
--   llmCallCount          INT          Number of LLM API calls made
--   inputTokenCount       BIGINT       LLM input tokens consumed
--   outputTokenCount      BIGINT       LLM output tokens produced
--   blockName             TEXT         Human-readable block name (e.g. 'OpenAIBlock')
--   blockId               TEXT         Block UUID
--   groupedErrorMessage   TEXT         Scrubbed error (IDs/URLs wildcarded)
--   errorMessage          TEXT         Raw error output (only set when FAILED)
--
-- WINDOW
--   Rolling 90 days (addedTime > CURRENT_DATE - 90 days)
--
-- EXAMPLE QUERIES
--   -- Most-used blocks by execution count
--   SELECT "blockName", COUNT(*) AS executions,
--          COUNT(*) FILTER (WHERE "executionStatus"='FAILED') AS failures
--   FROM analytics.node_block_execution
--   GROUP BY 1 ORDER BY executions DESC LIMIT 20;
--
--   -- Average LLM token usage per block
--   SELECT "blockName",
--          AVG("inputTokenCount") AS avg_input_tokens,
--          AVG("outputTokenCount") AS avg_output_tokens
--   FROM analytics.node_block_execution
--   WHERE "llmCallCount" > 0
--   GROUP BY 1 ORDER BY avg_input_tokens DESC;
--
--   -- Top failure reasons
--   SELECT "blockName", "groupedErrorMessage", COUNT(*) AS count
--   FROM analytics.node_block_execution
--   WHERE "executionStatus" = 'FAILED'
--   GROUP BY 1, 2 ORDER BY count DESC LIMIT 20;
-- =============================================================
-
-SELECT
-    ne."id"                                                            AS id,
-    ne."agentGraphExecutionId"                                         AS agentGraphExecutionId,
-    ne."agentNodeId"                                                   AS agentNodeId,
-    CAST(ne."executionStatus" AS TEXT)                                 AS executionStatus,
-    ne."addedTime"                                                     AS addedTime,
-    ne."queuedTime"                                                    AS queuedTime,
-    ne."startedTime"                                                   AS startedTime,
-    ne."endedTime"                                                     AS endedTime,
-    (ne."stats"::jsonb->>'input_size')::bigint                         AS inputSize,
-    (ne."stats"::jsonb->>'output_size')::bigint                        AS outputSize,
-    (ne."stats"::jsonb->>'walltime')::numeric                          AS walltime,
-    (ne."stats"::jsonb->>'cputime')::numeric                           AS cputime,
-    (ne."stats"::jsonb->>'llm_retry_count')::int                       AS llmRetryCount,
-    (ne."stats"::jsonb->>'llm_call_count')::int                        AS llmCallCount,
-    (ne."stats"::jsonb->>'input_token_count')::bigint                  AS inputTokenCount,
-    (ne."stats"::jsonb->>'output_token_count')::bigint                 AS outputTokenCount,
-    b."name"                                                           AS blockName,
-    b."id"                                                             AS blockId,
-    REGEXP_REPLACE(
-        REGEXP_REPLACE(
-            TRIM(BOTH '"' FROM eio."data"::text),
-            '(https?://)([A-Za-z0-9.-]+)(:[0-9]+)?(/[^\s]*)?',
-            '\1\2/...', 'gi'
-        ),
-        '[a-zA-Z0-9_:-]*\d[a-zA-Z0-9_:-]*', '*', 'g'
-    )                                                                  AS groupedErrorMessage,
-    eio."data"                                                         AS errorMessage
-FROM platform."AgentNodeExecution" ne
-LEFT JOIN platform."AgentNode" nd
-       ON ne."agentNodeId" = nd."id"
-LEFT JOIN platform."AgentBlock" b
-       ON nd."agentBlockId" = b."id"
-LEFT JOIN platform."AgentNodeExecutionInputOutput" eio
-       ON eio."referencedByOutputExecId" = ne."id"
-      AND eio."name" = 'error'
-      AND ne."executionStatus" = 'FAILED'
-WHERE ne."addedTime" > CURRENT_DATE - INTERVAL '90 days'
--- a/autogpt_platform/analytics/queries/retention_agent.sql
+++ b/autogpt_platform/analytics/queries/retention_agent.sql
@@ -1,97 +0,0 @@
-- =============================================================
-- View: analytics.retention_agent
-- Looker source alias: ds35  |  Charts: 2
-- =============================================================
-- DESCRIPTION
--   Weekly cohort retention broken down per individual agent.
--   Cohort = week of a user's first use of THAT specific agent.
--   Tells you which agents keep users coming back vs. one-shot
--   use. Only includes cohorts from the last 180 days.
--
-- SOURCE TABLES
--   platform.AgentGraphExecution  — Execution records (user × agent × time)
--   platform.AgentGraph           — Agent names
--
-- OUTPUT COLUMNS
--   agent_id            TEXT   Agent graph UUID
--   agent_label         TEXT   'AgentName [first8chars]'
--   agent_label_n       TEXT   'AgentName [first8chars] (n=total_users)'
--   cohort_week_start   DATE   Week users first ran this agent
--   cohort_label        TEXT   ISO week label
--   cohort_label_n      TEXT   ISO week label with cohort size
--   user_lifetime_week  INT    Weeks since first use of this agent
--   cohort_users        BIGINT Users in this cohort for this agent
--   active_users        BIGINT Users who ran the agent again in week k
--   retention_rate      FLOAT  active_users / cohort_users
--   cohort_users_w0     BIGINT cohort_users only at week 0 (safe to SUM)
--   agent_total_users   BIGINT Total users across all cohorts for this agent
--
-- EXAMPLE QUERIES
--   -- Best-retained agents at week 2
--   SELECT agent_label, AVG(retention_rate) AS w2_retention
--   FROM analytics.retention_agent
--   WHERE user_lifetime_week = 2 AND cohort_users >= 10
--   GROUP BY 1 ORDER BY w2_retention DESC LIMIT 10;
--
--   -- Agents with most unique users
--   SELECT DISTINCT agent_label, agent_total_users
--   FROM analytics.retention_agent
--   ORDER BY agent_total_users DESC LIMIT 20;
-- =============================================================
-
-WITH params AS (SELECT 12::int AS max_weeks, (CURRENT_DATE - INTERVAL '180 days') AS cohort_start),
-events AS (
-  SELECT e."userId"::text AS user_id, e."agentGraphId" AS agent_id,
-         e."createdAt"::timestamptz AS created_at,
-         DATE_TRUNC('week', e."createdAt")::date AS week_start
-  FROM platform."AgentGraphExecution" e
-),
-first_use AS (
-  SELECT user_id, agent_id, MIN(created_at) AS first_use_at,
-         DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
-  FROM events GROUP BY 1,2
-  HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
-),
-activity_weeks AS (SELECT DISTINCT user_id, agent_id, week_start FROM events),
-user_week_age AS (
-  SELECT aw.user_id, aw.agent_id, fu.cohort_week_start,
-         ((aw.week_start - DATE_TRUNC('week',fu.first_use_at)::date)/7)::int AS user_lifetime_week
-  FROM activity_weeks aw JOIN first_use fu USING (user_id, agent_id)
-  WHERE aw.week_start >= DATE_TRUNC('week',fu.first_use_at)::date
-),
-active_counts AS (
-  SELECT agent_id, cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users
-  FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2,3
-),
-cohort_sizes AS (
-  SELECT agent_id, cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_use GROUP BY 1,2
-),
-cohort_caps AS (
-  SELECT cs.agent_id, cs.cohort_week_start, cs.cohort_users,
-         LEAST((SELECT max_weeks FROM params),
-               GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date-cs.cohort_week_start)/7)::int)) AS cap_weeks
-  FROM cohort_sizes cs
-),
-grid AS (
-  SELECT cc.agent_id, cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
-  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
-),
-agent_names AS (SELECT DISTINCT ON (g."id") g."id" AS agent_id, g."name" AS agent_name FROM platform."AgentGraph" g ORDER BY g."id", g."version" DESC),
-agent_total_users AS (SELECT agent_id, SUM(cohort_users) AS agent_total_users FROM cohort_sizes GROUP BY 1)
-SELECT
-  g.agent_id,
-  COALESCE(an.agent_name,'(unnamed)')||' ['||LEFT(g.agent_id::text,8)||']'  AS agent_label,
-  COALESCE(an.agent_name,'(unnamed)')||' ['||LEFT(g.agent_id::text,8)||'] (n='||COALESCE(atu.agent_total_users,0)||')' AS agent_label_n,
-  g.cohort_week_start,
-  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')                               AS cohort_label,
-  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')'  AS cohort_label_n,
-  g.user_lifetime_week, g.cohort_users,
-  COALESCE(ac.active_users,0)                                              AS active_users,
-  COALESCE(ac.active_users,0)::float / NULLIF(g.cohort_users,0)           AS retention_rate,
-  CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END         AS cohort_users_w0,
-  COALESCE(atu.agent_total_users,0)                                        AS agent_total_users
-FROM grid g
-LEFT JOIN active_counts     ac  ON ac.agent_id=g.agent_id AND ac.cohort_week_start=g.cohort_week_start AND ac.user_lifetime_week=g.user_lifetime_week
-LEFT JOIN agent_names       an  ON an.agent_id=g.agent_id
-LEFT JOIN agent_total_users atu ON atu.agent_id=g.agent_id
-ORDER BY agent_label, g.cohort_week_start, g.user_lifetime_week;
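
The core of the removed `retention_agent` view is: find each user's first use of an agent, assign the user to that week's cohort, then count distinct users active in each later week. A simplified in-memory sketch of that calculation, assuming events arrive as a list of `(user_id, agent_id, date)` tuples; it deliberately omits the 180-day cohort window, the 12-week cap, and the zero-filled grid that the SQL produces, and the function names are hypothetical:

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Monday of the week containing d, like DATE_TRUNC('week', ...)."""
    return d - timedelta(days=d.weekday())


def weekly_retention(events):
    """Map (agent_id, cohort_week_start, user_lifetime_week) -> retention_rate.

    `events` must be a list (it is iterated twice).
    """
    # First use per (user, agent) defines the cohort.
    first_use = {}
    for user, agent, day in events:
        key = (user, agent)
        if key not in first_use or day < first_use[key]:
            first_use[key] = day

    cohort_users = defaultdict(set)
    for (user, agent), day in first_use.items():
        cohort_users[(agent, week_start(day))].add(user)

    # Distinct users active in week k since their cohort week.
    active = defaultdict(set)
    for user, agent, day in events:
        cohort = week_start(first_use[(user, agent)])
        k = (week_start(day) - cohort).days // 7
        active[(agent, cohort, k)].add(user)

    return {
        key: len(users) / len(cohort_users[key[:2]])
        for key, users in active.items()
    }
```

Unlike the view, weeks with no activity are simply absent from the result instead of appearing with a rate of 0.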
--- a/autogpt_platform/analytics/queries/retention_execution_daily.sql
+++ b/autogpt_platform/analytics/queries/retention_execution_daily.sql
@@ -1,81 +0,0 @@
-- =============================================================
-- View: analytics.retention_execution_daily
-- Looker source alias: ds111  |  Charts: 1
-- =============================================================
-- DESCRIPTION
--   Daily cohort retention based on agent executions.
--   Cohort anchor = day of user's FIRST ever execution.
--   Only includes cohorts from the last 90 days, up to day 30.
--   Great for early engagement analysis (did users run another
--   agent the next day?).
--
-- SOURCE TABLES
--   platform.AgentGraphExecution  — Execution records
--
-- OUTPUT COLUMNS
--   Same pattern as retention_login_daily.
--   cohort_day_start = day of first execution (not first login)
--
-- EXAMPLE QUERIES
--   -- Day-3 execution retention
--   SELECT cohort_label, retention_rate_bounded AS d3_retention
--   FROM analytics.retention_execution_daily
--   WHERE user_lifetime_day = 3 ORDER BY cohort_day_start;
-- =============================================================
-
-WITH params AS (SELECT 30::int AS max_days, (CURRENT_DATE - INTERVAL '90 days') AS cohort_start),
-events AS (
-  SELECT e."userId"::text AS user_id, e."createdAt"::timestamptz AS created_at,
-         DATE_TRUNC('day', e."createdAt")::date AS day_start
-  FROM platform."AgentGraphExecution" e WHERE e."userId" IS NOT NULL
-),
-first_exec AS (
-  SELECT user_id, MIN(created_at) AS first_exec_at,
-         DATE_TRUNC('day', MIN(created_at))::date AS cohort_day_start
-  FROM events GROUP BY 1
-  HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
-),
-activity_days AS (SELECT DISTINCT user_id, day_start FROM events),
-user_day_age AS (
-  SELECT ad.user_id, fe.cohort_day_start,
-         (ad.day_start - DATE_TRUNC('day',fe.first_exec_at)::date)::int AS user_lifetime_day
-  FROM activity_days ad JOIN first_exec fe USING (user_id)
-  WHERE ad.day_start >= DATE_TRUNC('day',fe.first_exec_at)::date
-),
-bounded_counts AS (
-  SELECT cohort_day_start, user_lifetime_day, COUNT(DISTINCT user_id) AS active_users_bounded
-  FROM user_day_age WHERE user_lifetime_day >= 0 GROUP BY 1,2
-),
-last_active AS (
-  SELECT cohort_day_start, user_id, MAX(user_lifetime_day) AS last_active_day FROM user_day_age GROUP BY 1,2
-),
-unbounded_counts AS (
-  SELECT la.cohort_day_start, gs AS user_lifetime_day, COUNT(*) AS retained_users_unbounded
-  FROM last_active la
-  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_day,(SELECT max_days FROM params))) gs
-  GROUP BY 1,2
-),
-cohort_sizes AS (SELECT cohort_day_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_exec GROUP BY 1),
-cohort_caps AS (
-  SELECT cs.cohort_day_start, cs.cohort_users,
-         LEAST((SELECT max_days FROM params), GREATEST(0,(CURRENT_DATE-cs.cohort_day_start)::int)) AS cap_days
-  FROM cohort_sizes cs
-),
-grid AS (
-  SELECT cc.cohort_day_start, gs AS user_lifetime_day, cc.cohort_users
-  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_days) gs
-)
-SELECT
-  g.cohort_day_start,
-  TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')                                AS cohort_label,
-  TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')||' (n='||g.cohort_users||')'   AS cohort_label_n,
-  g.user_lifetime_day, g.cohort_users,
-  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
-  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
-  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
-  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
-  CASE WHEN g.user_lifetime_day=0 THEN g.cohort_users ELSE 0 END          AS cohort_users_d0
-FROM grid g
-LEFT JOIN bounded_counts   b ON b.cohort_day_start=g.cohort_day_start AND b.user_lifetime_day=g.user_lifetime_day
-LEFT JOIN unbounded_counts u ON u.cohort_day_start=g.cohort_day_start AND u.user_lifetime_day=g.user_lifetime_day
-ORDER BY g.cohort_day_start, g.user_lifetime_day;
--- a/autogpt_platform/analytics/queries/retention_execution_weekly.sql
+++ b/autogpt_platform/analytics/queries/retention_execution_weekly.sql
@@ -1,81 +0,0 @@
-- =============================================================
-- View: analytics.retention_execution_weekly
-- Looker source alias: ds92  |  Charts: 2
-- =============================================================
-- DESCRIPTION
--   Weekly cohort retention based on agent executions.
--   Cohort anchor = week of user's FIRST ever agent execution
--   (not first login). Only includes cohorts from the last 180 days.
--   Useful when you care about product engagement, not just visits.
--
-- SOURCE TABLES
--   platform.AgentGraphExecution  — Execution records
--
-- OUTPUT COLUMNS
--   Same pattern as retention_login_weekly.
--   cohort_week_start = week of first execution (not first login)
--
-- EXAMPLE QUERIES
--   -- Week-2 execution retention
--   SELECT cohort_label, retention_rate_bounded
--   FROM analytics.retention_execution_weekly
--   WHERE user_lifetime_week = 2 ORDER BY cohort_week_start;
-- =============================================================
-
-WITH params AS (SELECT 12::int AS max_weeks, (CURRENT_DATE - INTERVAL '180 days') AS cohort_start),
-events AS (
-  SELECT e."userId"::text AS user_id, e."createdAt"::timestamptz AS created_at,
-         DATE_TRUNC('week', e."createdAt")::date AS week_start
-  FROM platform."AgentGraphExecution" e WHERE e."userId" IS NOT NULL
-),
-first_exec AS (
-  SELECT user_id, MIN(created_at) AS first_exec_at,
-         DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
-  FROM events GROUP BY 1
-  HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
-),
-activity_weeks AS (SELECT DISTINCT user_id, week_start FROM events),
-user_week_age AS (
-  SELECT aw.user_id, fe.cohort_week_start,
-         ((aw.week_start - DATE_TRUNC('week',fe.first_exec_at)::date)/7)::int AS user_lifetime_week
-  FROM activity_weeks aw JOIN first_exec fe USING (user_id)
-  WHERE aw.week_start >= DATE_TRUNC('week',fe.first_exec_at)::date
-),
-bounded_counts AS (
-  SELECT cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users_bounded
-  FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2
-),
-last_active AS (
-  SELECT cohort_week_start, user_id, MAX(user_lifetime_week) AS last_active_week FROM user_week_age GROUP BY 1,2
-),
-unbounded_counts AS (
-  SELECT la.cohort_week_start, gs AS user_lifetime_week, COUNT(*) AS retained_users_unbounded
-  FROM last_active la
-  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_week,(SELECT max_weeks FROM params))) gs
-  GROUP BY 1,2
-),
-cohort_sizes AS (SELECT cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_exec GROUP BY 1),
-cohort_caps AS (
-  SELECT cs.cohort_week_start, cs.cohort_users,
-         LEAST((SELECT max_weeks FROM params),
-               GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date-cs.cohort_week_start)/7)::int)) AS cap_weeks
-  FROM cohort_sizes cs
-),
-grid AS (
-  SELECT cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
-  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
-)
-SELECT
-  g.cohort_week_start,
-  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')                               AS cohort_label,
-  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')'  AS cohort_label_n,
-  g.user_lifetime_week, g.cohort_users,
-  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
-  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
-  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
-  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
-  CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END         AS cohort_users_w0
-FROM grid g
-LEFT JOIN bounded_counts   b ON b.cohort_week_start=g.cohort_week_start AND b.user_lifetime_week=g.user_lifetime_week
-LEFT JOIN unbounded_counts u ON u.cohort_week_start=g.cohort_week_start AND u.user_lifetime_week=g.user_lifetime_week
-ORDER BY g.cohort_week_start, g.user_lifetime_week;
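A small sketch of the `user_lifetime_week` arithmetic used in the query above, with two hypothetical literal dates standing in for a first execution and a later activity date. In PostgreSQL, subtracting one `date` from another yields an integer number of days, and because both operands are truncated to the Monday of their week, dividing by 7 is exact.

```sql
-- Hypothetical dates: first execution on Wed 2025-03-05, later activity on Wed 2025-03-19.
SELECT DATE_TRUNC('week', DATE '2025-03-05')::date AS cohort_week_start,   -- Monday 2025-03-03
       DATE_TRUNC('week', DATE '2025-03-19')::date AS activity_week_start, -- Monday 2025-03-17
       (DATE_TRUNC('week', DATE '2025-03-19')::date
        - DATE_TRUNC('week', DATE '2025-03-05')::date) / 7 AS user_lifetime_week;  -- 14 days / 7 = 2
```

Truncating both sides to the week start means the index does not depend on which weekday the first execution fell on.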
--- a/autogpt_platform/analytics/queries/retention_login_daily.sql
+++ b/autogpt_platform/analytics/queries/retention_login_daily.sql
@@ -1,94 +0,0 @@
-- =============================================================
-- View: analytics.retention_login_daily
-- Looker source alias: ds112  |  Charts: 1
-- =============================================================
-- DESCRIPTION
--   Daily cohort retention based on login sessions.
--   Same logic as retention_login_weekly but at day granularity,
--   showing up to day 30 for cohorts from the last 90 days.
--   Useful for analysing early activation (days 1-7) in detail.
--
-- SOURCE TABLES
--   auth.sessions  — Login session records
--
-- OUTPUT COLUMNS (same pattern as retention_login_weekly)
--   cohort_day_start          DATE     First day the cohort logged in
--   cohort_label              TEXT     Date string (e.g. '2025-03-01')
--   cohort_label_n            TEXT     Date + cohort size (e.g. '2025-03-01 (n=12)')
--   user_lifetime_day         INT      Days since first login (0 = signup day)
--   cohort_users              BIGINT   Total users in cohort
--   active_users_bounded      BIGINT   Users active on exactly day k
--   retained_users_unbounded  BIGINT   Users active any time on/after day k
--   retention_rate_bounded    FLOAT    bounded / cohort_users
--   retention_rate_unbounded  FLOAT    unbounded / cohort_users
--   cohort_users_d0           BIGINT   cohort_users only at day 0, else 0 (safe to SUM)
--
-- EXAMPLE QUERIES
--   -- Day-1 retention rate (came back next day)
--   SELECT cohort_label, retention_rate_bounded AS d1_retention
--   FROM analytics.retention_login_daily
--   WHERE user_lifetime_day = 1 ORDER BY cohort_day_start;
--
--   -- Average retention curve across all cohorts
--   SELECT user_lifetime_day,
--          SUM(active_users_bounded)::float / NULLIF(SUM(cohort_users), 0) AS avg_retention
--   FROM analytics.retention_login_daily
--   GROUP BY 1 ORDER BY 1;
-- =============================================================
-
-WITH params AS (SELECT 30::int AS max_days, (CURRENT_DATE - INTERVAL '90 days')::date AS cohort_start),
-events AS (
-  SELECT s.user_id::text AS user_id, s.created_at::timestamptz AS created_at,
-         DATE_TRUNC('day', s.created_at)::date AS day_start
-  FROM auth.sessions s WHERE s.user_id IS NOT NULL
-),
-first_login AS (
-  SELECT user_id, MIN(created_at) AS first_login_time,
-         DATE_TRUNC('day', MIN(created_at))::date AS cohort_day_start
-  FROM events GROUP BY 1
-  HAVING MIN(created_at) >= (SELECT cohort_start FROM params)
-),
-activity_days AS (SELECT DISTINCT user_id, day_start FROM events),
-user_day_age AS (
-  SELECT ad.user_id, fl.cohort_day_start,
-         (ad.day_start - DATE_TRUNC('day', fl.first_login_time)::date)::int AS user_lifetime_day
-  FROM activity_days ad JOIN first_login fl USING (user_id)
-  WHERE ad.day_start >= DATE_TRUNC('day', fl.first_login_time)::date
-),
-bounded_counts AS (
-  SELECT cohort_day_start, user_lifetime_day, COUNT(DISTINCT user_id) AS active_users_bounded
-  FROM user_day_age WHERE user_lifetime_day >= 0 GROUP BY 1,2
-),
-last_active AS (
-  SELECT cohort_day_start, user_id, MAX(user_lifetime_day) AS last_active_day FROM user_day_age GROUP BY 1,2
-),
-unbounded_counts AS (
-  SELECT la.cohort_day_start, gs AS user_lifetime_day, COUNT(*) AS retained_users_unbounded
-  FROM last_active la
-  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_day,(SELECT max_days FROM params))) gs
-  GROUP BY 1,2
-),
-cohort_sizes AS (SELECT cohort_day_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_login GROUP BY 1),
-cohort_caps AS (
-  SELECT cs.cohort_day_start, cs.cohort_users,
-         LEAST((SELECT max_days FROM params), GREATEST(0,(CURRENT_DATE-cs.cohort_day_start)::int)) AS cap_days
-  FROM cohort_sizes cs
-),
-grid AS (
-  SELECT cc.cohort_day_start, gs AS user_lifetime_day, cc.cohort_users
-  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_days) gs
-)
-SELECT
-  g.cohort_day_start,
-  TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')                                  AS cohort_label,
-  TO_CHAR(g.cohort_day_start,'YYYY-MM-DD')||' (n='||g.cohort_users||')'     AS cohort_label_n,
-  g.user_lifetime_day, g.cohort_users,
-  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
-  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
-  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
-  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
-  CASE WHEN g.user_lifetime_day=0 THEN g.cohort_users ELSE 0 END            AS cohort_users_d0
-FROM grid g
-LEFT JOIN bounded_counts   b ON b.cohort_day_start=g.cohort_day_start AND b.user_lifetime_day=g.user_lifetime_day
-LEFT JOIN unbounded_counts u ON u.cohort_day_start=g.cohort_day_start AND u.user_lifetime_day=g.user_lifetime_day
-ORDER BY g.cohort_day_start, g.user_lifetime_day;
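A minimal sketch of why the header calls `cohort_users_d0` "safe to SUM". The rows below are hypothetical stand-ins for the view's output (two cohorts, two lifetime days each); the CTE names `view_rows` and `with_d0` are made up for illustration. Because `cohort_users` repeats on every lifetime-day row, summing it across all rows counts each cohort once per day, whereas the day-0 column counts it once.

```sql
WITH view_rows(cohort_day_start, user_lifetime_day, cohort_users) AS (
  -- Hypothetical output rows, not real data.
  VALUES (DATE '2025-03-01', 0, 10),
         (DATE '2025-03-01', 1, 10),
         (DATE '2025-03-02', 0, 20),
         (DATE '2025-03-02', 1, 20)
),
with_d0 AS (
  SELECT *,
         CASE WHEN user_lifetime_day = 0 THEN cohort_users ELSE 0 END AS cohort_users_d0
  FROM view_rows
)
SELECT SUM(cohort_users)    AS naive_total,   -- 60: each cohort counted once per lifetime day
       SUM(cohort_users_d0) AS cohort_total   -- 30: each cohort counted exactly once
FROM with_d0;
```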
--- a/autogpt_platform/analytics/queries/retention_login_onboarded_weekly.sql
+++ b/autogpt_platform/analytics/queries/retention_login_onboarded_weekly.sql
@@ -1,96 +0,0 @@
-- =============================================================
-- View: analytics.retention_login_onboarded_weekly
-- Looker source alias: ds101  |  Charts: 2
-- =============================================================
-- DESCRIPTION
--   Weekly cohort retention from login sessions, restricted to
--   users who "onboarded" — defined as running at least one
--   agent within 365 days of their first login.
--   Filters out users who signed up but never activated,
--   giving a cleaner view of engaged-user retention.
--
-- SOURCE TABLES
--   auth.sessions                  — Login session records
--   platform.AgentGraphExecution   — Used to identify onboarders
--
-- OUTPUT COLUMNS
--   Same as retention_login_weekly (cohort_week_start, user_lifetime_week,
--   retention_rate_bounded, retention_rate_unbounded, etc.)
--   Only difference: cohort is filtered to onboarded users only.
--
-- EXAMPLE QUERIES
--   -- Compare week-4 retention: all users vs onboarded only
--   SELECT 'all_users' AS segment, AVG(retention_rate_bounded) AS w4_retention
--   FROM analytics.retention_login_weekly WHERE user_lifetime_week = 4
--   UNION ALL
--   SELECT 'onboarded', AVG(retention_rate_bounded)
--   FROM analytics.retention_login_onboarded_weekly WHERE user_lifetime_week = 4;
-- =============================================================
-
-WITH params AS (SELECT 12::int AS max_weeks, 365::int AS onboarding_window_days),
-events AS (
-  SELECT s.user_id::text AS user_id, s.created_at::timestamptz AS created_at,
-         DATE_TRUNC('week', s.created_at)::date AS week_start
-  FROM auth.sessions s WHERE s.user_id IS NOT NULL
-),
-first_login_all AS (
-  SELECT user_id, MIN(created_at) AS first_login_time,
-         DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
-  FROM events GROUP BY 1
-),
-onboarders AS (
-  SELECT fl.user_id FROM first_login_all fl
-  WHERE EXISTS (
-    SELECT 1 FROM platform."AgentGraphExecution" e
-    WHERE e."userId"::text = fl.user_id
-      AND e."createdAt" >= fl.first_login_time
-      AND e."createdAt" < fl.first_login_time
-          + make_interval(days => (SELECT onboarding_window_days FROM params))
-  )
-),
-first_login AS (SELECT * FROM first_login_all WHERE user_id IN (SELECT user_id FROM onboarders)),
-activity_weeks AS (SELECT DISTINCT user_id, week_start FROM events),
-user_week_age AS (
-  SELECT aw.user_id, fl.cohort_week_start,
-         ((aw.week_start - DATE_TRUNC('week',fl.first_login_time)::date)/7)::int AS user_lifetime_week
-  FROM activity_weeks aw JOIN first_login fl USING (user_id)
-  WHERE aw.week_start >= DATE_TRUNC('week',fl.first_login_time)::date
-),
-bounded_counts AS (
-  SELECT cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users_bounded
-  FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2
-),
-last_active AS (
-  SELECT cohort_week_start, user_id, MAX(user_lifetime_week) AS last_active_week FROM user_week_age GROUP BY 1,2
-),
-unbounded_counts AS (
-  SELECT la.cohort_week_start, gs AS user_lifetime_week, COUNT(*) AS retained_users_unbounded
-  FROM last_active la
-  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_week,(SELECT max_weeks FROM params))) gs
-  GROUP BY 1,2
-),
-cohort_sizes AS (SELECT cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_login GROUP BY 1),
-cohort_caps AS (
-  SELECT cs.cohort_week_start, cs.cohort_users,
-         LEAST((SELECT max_weeks FROM params),
-               GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date-cs.cohort_week_start)/7)::int)) AS cap_weeks
-  FROM cohort_sizes cs
-),
-grid AS (
-  SELECT cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
-  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
-)
-SELECT
-  g.cohort_week_start,
-  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')                               AS cohort_label,
-  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')'  AS cohort_label_n,
-  g.user_lifetime_week, g.cohort_users,
-  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
-  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
-  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
-  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
-  CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END         AS cohort_users_w0
-FROM grid g
-LEFT JOIN bounded_counts   b ON b.cohort_week_start=g.cohort_week_start AND b.user_lifetime_week=g.user_lifetime_week
-LEFT JOIN unbounded_counts u ON u.cohort_week_start=g.cohort_week_start AND u.user_lifetime_week=g.user_lifetime_week
-ORDER BY g.cohort_week_start, g.user_lifetime_week;
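A minimal sketch of the `onboarders` filter from the query above, run against hypothetical inline data instead of `auth.sessions` and `platform."AgentGraphExecution"`. The CTE `executions` and all timestamps are assumptions for illustration; the 365-day window is written as a literal rather than read from `params`.

```sql
WITH first_login_all(user_id, first_login_time) AS (
  VALUES ('a', TIMESTAMPTZ '2025-01-01 00:00+00'),
         ('b', TIMESTAMPTZ '2025-01-01 00:00+00'),
         ('c', TIMESTAMPTZ '2025-01-01 00:00+00')
),
executions(user_id, created_at) AS (
  VALUES ('a', TIMESTAMPTZ '2025-01-05 00:00+00'),  -- inside the window
         ('b', TIMESTAMPTZ '2024-12-31 00:00+00')   -- before first login, so it does not qualify
)
SELECT fl.user_id
FROM first_login_all fl
WHERE EXISTS (
  SELECT 1 FROM executions e
  WHERE e.user_id = fl.user_id
    AND e.created_at >= fl.first_login_time
    AND e.created_at <  fl.first_login_time + make_interval(days => 365)
);
-- Only user 'a' is returned: 'b' ran an agent before logging in, 'c' never ran one.
```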
--- a/autogpt_platform/analytics/queries/retention_login_weekly.sql
+++ b/autogpt_platform/analytics/queries/retention_login_weekly.sql
@@ -1,103 +0,0 @@
-- =============================================================
-- View: analytics.retention_login_weekly
-- Looker source alias: ds83  |  Charts: 2
-- =============================================================
-- DESCRIPTION
--   Weekly cohort retention based on login sessions.
--   Users are grouped by the ISO week of their first ever login.
--   For each cohort × lifetime-week combination, outputs both:
--     - bounded rate: % active in exactly that week
--     - unbounded rate: % who were ever active on or after that week
--   Weeks are capped to the cohort's actual age (no future data points).
--
-- SOURCE TABLES
--   auth.sessions  — Login session records
--
-- HOW TO READ THE OUTPUT
--   cohort_week_start   The Monday of the week users first logged in
--   user_lifetime_week  0 = signup week, 1 = one week later, etc.
--   retention_rate_bounded   = active_users_bounded / cohort_users
--   retention_rate_unbounded = retained_users_unbounded / cohort_users
--
-- OUTPUT COLUMNS
--   cohort_week_start         DATE     First day of the cohort's signup week
--   cohort_label              TEXT     ISO week label (e.g. '2025-W01')
--   cohort_label_n            TEXT     ISO week label with cohort size (e.g. '2025-W01 (n=42)')
--   user_lifetime_week        INT      Weeks since first login (0 = signup week)
--   cohort_users              BIGINT   Total users in this cohort (denominator)
--   active_users_bounded      BIGINT   Users active in exactly week k
--   retained_users_unbounded  BIGINT   Users active any time on/after week k
--   retention_rate_bounded    FLOAT    bounded active / cohort_users
--   retention_rate_unbounded  FLOAT    unbounded retained / cohort_users
--   cohort_users_w0           BIGINT   cohort_users only at week 0, else 0 (safe to SUM in pivot tables)
--
-- EXAMPLE QUERIES
--   -- Week-1 retention rate per cohort
--   SELECT cohort_label, retention_rate_bounded AS w1_retention
--   FROM analytics.retention_login_weekly
--   WHERE user_lifetime_week = 1
--   ORDER BY cohort_week_start;
--
--   -- Overall average retention curve (all cohorts combined)
--   SELECT user_lifetime_week,
--          SUM(active_users_bounded)::float / NULLIF(SUM(cohort_users), 0) AS avg_retention
--   FROM analytics.retention_login_weekly
--   GROUP BY 1 ORDER BY 1;
-- =============================================================
-
-WITH params AS (SELECT 12::int AS max_weeks),
-events AS (
-  SELECT s.user_id::text AS user_id, s.created_at::timestamptz AS created_at,
-         DATE_TRUNC('week', s.created_at)::date AS week_start
-  FROM auth.sessions s WHERE s.user_id IS NOT NULL
-),
-first_login AS (
-  SELECT user_id, MIN(created_at) AS first_login_time,
-         DATE_TRUNC('week', MIN(created_at))::date AS cohort_week_start
-  FROM events GROUP BY 1
-),
-activity_weeks AS (SELECT DISTINCT user_id, week_start FROM events),
-user_week_age AS (
-  SELECT aw.user_id, fl.cohort_week_start,
-         ((aw.week_start - DATE_TRUNC('week', fl.first_login_time)::date) / 7)::int AS user_lifetime_week
-  FROM activity_weeks aw JOIN first_login fl USING (user_id)
-  WHERE aw.week_start >= DATE_TRUNC('week', fl.first_login_time)::date
-),
-bounded_counts AS (
-  SELECT cohort_week_start, user_lifetime_week, COUNT(DISTINCT user_id) AS active_users_bounded
-  FROM user_week_age WHERE user_lifetime_week >= 0 GROUP BY 1,2
-),
-last_active AS (
-  SELECT cohort_week_start, user_id, MAX(user_lifetime_week) AS last_active_week FROM user_week_age GROUP BY 1,2
-),
-unbounded_counts AS (
-  SELECT la.cohort_week_start, gs AS user_lifetime_week, COUNT(*) AS retained_users_unbounded
-  FROM last_active la
-  CROSS JOIN LATERAL generate_series(0, LEAST(la.last_active_week,(SELECT max_weeks FROM params))) gs
-  GROUP BY 1,2
-),
-cohort_sizes AS (SELECT cohort_week_start, COUNT(DISTINCT user_id) AS cohort_users FROM first_login GROUP BY 1),
-cohort_caps AS (
-  SELECT cs.cohort_week_start, cs.cohort_users,
-         LEAST((SELECT max_weeks FROM params),
-               GREATEST(0,((DATE_TRUNC('week',CURRENT_DATE)::date - cs.cohort_week_start)/7)::int)) AS cap_weeks
-  FROM cohort_sizes cs
-),
-grid AS (
-  SELECT cc.cohort_week_start, gs AS user_lifetime_week, cc.cohort_users
-  FROM cohort_caps cc CROSS JOIN LATERAL generate_series(0, cc.cap_weeks) gs
-)
-SELECT
-  g.cohort_week_start,
-  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')                                    AS cohort_label,
-  TO_CHAR(g.cohort_week_start,'IYYY-"W"IW')||' (n='||g.cohort_users||')'       AS cohort_label_n,
-  g.user_lifetime_week, g.cohort_users,
-  COALESCE(b.active_users_bounded,0)     AS active_users_bounded,
-  COALESCE(u.retained_users_unbounded,0) AS retained_users_unbounded,
-  CASE WHEN g.cohort_users>0 THEN COALESCE(b.active_users_bounded,0)::float/g.cohort_users END    AS retention_rate_bounded,
-  CASE WHEN g.cohort_users>0 THEN COALESCE(u.retained_users_unbounded,0)::float/g.cohort_users END AS retention_rate_unbounded,
-  CASE WHEN g.user_lifetime_week=0 THEN g.cohort_users ELSE 0 END               AS cohort_users_w0
-FROM grid g
-LEFT JOIN bounded_counts   b ON b.cohort_week_start=g.cohort_week_start AND b.user_lifetime_week=g.user_lifetime_week
-LEFT JOIN unbounded_counts u ON u.cohort_week_start=g.cohort_week_start AND u.user_lifetime_week=g.user_lifetime_week
-ORDER BY g.cohort_week_start, g.user_lifetime_week;
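A minimal sketch of the bounded versus unbounded counts described in the header above, using a hypothetical three-user cohort in an inline `VALUES` list rather than `auth.sessions`. The CTE names `activity` and `weeks` are made up for illustration, and correlated subqueries replace the view's joins to keep the example short.

```sql
-- Hypothetical data: one row per (user, lifetime week) in which the user was active.
WITH activity(user_id, user_lifetime_week) AS (
  VALUES ('a', 0), ('a', 2), ('a', 3),
         ('b', 0), ('b', 3),
         ('c', 0)
),
last_active AS (
  SELECT user_id, MAX(user_lifetime_week) AS last_active_week
  FROM activity GROUP BY 1
),
weeks AS (SELECT generate_series(0, 3) AS wk)
SELECT w.wk AS user_lifetime_week,
       -- bounded: users with activity in exactly this week
       (SELECT COUNT(DISTINCT a.user_id) FROM activity a
         WHERE a.user_lifetime_week = w.wk)        AS active_users_bounded,
       -- unbounded: users whose last activity falls on or after this week
       (SELECT COUNT(*) FROM last_active la
         WHERE la.last_active_week >= w.wk)        AS retained_users_unbounded
FROM weeks w
ORDER BY 1;
```

In week 1 nobody is active, so the bounded count is 0, yet users `a` and `b` both return later, so the unbounded count is 2. The unbounded count can never rise from one week to the next, while the bounded count can dip and recover.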
--- a/autogpt_platform/analytics/queries/user_block_spending.sql
+++ b/autogpt_platform/analytics/queries/user_block_spending.sql
@@ -1,71 +0,0 @@
-- =============================================================
-- View: analytics.user_block_spending
-- Looker source alias: ds6  |  Charts: 5
-- =============================================================
-- DESCRIPTION
--   One row per credit transaction (last 90 days).
--   Shows how users spend credits broken down by block type,
--   LLM provider and model.  Joins node execution stats for
--   token-level detail.
--
-- SOURCE TABLES
--   platform.CreditTransaction   — Credit debit/credit records
--   platform.AgentNodeExecution  — Node execution stats (for token counts)
--
-- OUTPUT COLUMNS
--   transactionKey        TEXT         Unique transaction identifier
--   userId                TEXT         User who was charged
--   amount                DECIMAL      Credit amount (positive = credit, negative = debit)
--   negativeAmount        DECIMAL      amount * -1 (convenience for spend charts)
--   transactionType       TEXT         Transaction type (e.g. 'USAGE', 'REFUND', 'TOP_UP')
--   transactionTime       TIMESTAMPTZ  When the transaction was recorded
--   blockId               TEXT         Block UUID that triggered the spend
--   blockName             TEXT         Human-readable block name
--   llm_provider          TEXT         LLM provider (e.g. 'openai', 'anthropic')
--   llm_model             TEXT         Model name (e.g. 'gpt-4o', 'claude-3-5-sonnet')
--   node_exec_id          TEXT         Linked node execution UUID
--   llm_call_count        INT          LLM API calls made in that execution
--   llm_retry_count       INT          LLM retries in that execution
--   llm_input_token_count INT          Input tokens consumed
--   llm_output_token_count INT         Output tokens produced
--
-- WINDOW
--   Rolling 90 days (createdAt > CURRENT_DATE - 90 days)
--
-- EXAMPLE QUERIES
--   -- Total spend per user (last 90 days)
--   SELECT "userId", SUM("negativeAmount") AS total_spent
--   FROM analytics.user_block_spending
--   WHERE "transactionType" = 'USAGE'
--   GROUP BY 1 ORDER BY total_spent DESC;
--
--   -- Spend by LLM provider + model
--   SELECT "llm_provider", "llm_model",
--          SUM("negativeAmount") AS total_cost,
--          SUM("llm_input_token_count") AS input_tokens,
--          SUM("llm_output_token_count") AS output_tokens
--   FROM analytics.user_block_spending
--   WHERE "llm_provider" IS NOT NULL
--   GROUP BY 1, 2 ORDER BY total_cost DESC;
-- =============================================================
-
-SELECT
-    -- Mixed-case aliases are quoted; unquoted, PostgreSQL would fold them to lowercase.
-    c."transactionKey"                                        AS "transactionKey",
-    c."userId"                                                AS "userId",
-    c."amount"                                                AS amount,
-    c."amount" * -1                                           AS "negativeAmount",
-    c."type"                                                  AS "transactionType",
-    c."createdAt"                                             AS "transactionTime",
-    c.metadata->>'block_id'                                   AS "blockId",
-    c.metadata->>'block'                                      AS "blockName",
-    c.metadata->'input'->'credentials'->>'provider'           AS llm_provider,
-    c.metadata->'input'->>'model'                             AS llm_model,
-    c.metadata->>'node_exec_id'                               AS node_exec_id,
-    (ne."stats"->>'llm_call_count')::int                       AS llm_call_count,
-    (ne."stats"->>'llm_retry_count')::int                      AS llm_retry_count,
-    (ne."stats"->>'input_token_count')::int                    AS llm_input_token_count,
-    (ne."stats"->>'output_token_count')::int                   AS llm_output_token_count
-FROM platform."CreditTransaction" c
-LEFT JOIN platform."AgentNodeExecution" ne
-       ON (c.metadata->>'node_exec_id') = ne."id"::text
-WHERE c."createdAt" > CURRENT_DATE - INTERVAL '90 days'
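A minimal sketch of the JSON path extraction the view above uses for `blockName`, `llm_provider` and `llm_model`. The metadata document is a hypothetical example (including the block name `AITextGenerator`), cast to `jsonb` here as an assumption; the same operators exist for `json`.

```sql
SELECT m->>'block'                            AS block_name,    -- ->> returns text
       m->'input'->'credentials'->>'provider' AS llm_provider,  -- -> walks nested objects
       m->'input'->>'model'                   AS llm_model,
       m->'input'->'missing'->>'provider'     AS absent         -- NULL, not an error
FROM (VALUES (
  '{"block":"AITextGenerator","input":{"model":"gpt-4o","credentials":{"provider":"openai"}}}'::jsonb
)) AS t(m);
```

A missing key anywhere along the path yields NULL, which is why non-LLM transactions can be filtered out with `llm_provider IS NOT NULL`.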
--- a/autogpt_platform/analytics/queries/user_onboarding.sql
+++ b/autogpt_platform/analytics/queries/user_onboarding.sql
@@ -1,45 +0,0 @@
-- =============================================================
-- View: analytics.user_onboarding
-- Looker source alias: ds68  |  Charts: 3
-- =============================================================
-- DESCRIPTION
--   One row per user onboarding record.  Contains the user's
--   stated usage reason, selected integrations, completed
--   onboarding steps and optional first agent selection.
--   Full history (no date filter) since onboarding happens
--   once per user.
--
-- SOURCE TABLES
--   platform.UserOnboarding  — Onboarding state per user
--
-- OUTPUT COLUMNS
--   id                            TEXT         Onboarding record UUID
--   createdAt                     TIMESTAMPTZ  When onboarding started
--   updatedAt                     TIMESTAMPTZ  Last update to onboarding state
--   usageReason                   TEXT         Why user signed up (e.g. 'work', 'personal')
--   integrations                  TEXT[]       Array of integration names the user selected
--   userId                        TEXT         User UUID
--   completedSteps                TEXT[]       Array of onboarding step enums completed
--   selectedStoreListingVersionId TEXT         First marketplace agent the user chose (if any)
--
-- EXAMPLE QUERIES
--   -- Usage reason breakdown
--   SELECT "usageReason", COUNT(*) FROM analytics.user_onboarding GROUP BY 1;
--
--   -- Completion rate per step
--   SELECT step, COUNT(*) AS users_completed
--   FROM analytics.user_onboarding
--   CROSS JOIN LATERAL UNNEST("completedSteps") AS step
--   GROUP BY 1 ORDER BY users_completed DESC;
-- =============================================================
-
-SELECT
-    id,
-    "createdAt",
-    "updatedAt",
-    "usageReason",
-    integrations,
-    "userId",
-    "completedSteps",
-    "selectedStoreListingVersionId"
-FROM platform."UserOnboarding"
--- a/autogpt_platform/analytics/queries/user_onboarding_funnel.sql
+++ b/autogpt_platform/analytics/queries/user_onboarding_funnel.sql
@@ -1,100 +0,0 @@
-- =============================================================
-- View: analytics.user_onboarding_funnel
-- Looker source alias: ds74  |  Charts: 1
-- =============================================================
-- DESCRIPTION
--   Pre-aggregated onboarding funnel showing how many users
--   completed each step and the conversion percentage from the
--   previous step.  One row per onboarding step (all 22 steps
--   always present, even with 0 completions — prevents sparse
--   gaps from making LAG compare the wrong predecessors).
--
-- SOURCE TABLES
--   platform.UserOnboarding  — Onboarding records with completedSteps array
--
-- OUTPUT COLUMNS
--   step             TEXT     Onboarding step enum name (e.g. 'WELCOME', 'CONGRATS')
--   step_order       INT      Numeric position in the funnel (1=first, 22=last)
--   users_completed  BIGINT   Distinct users who completed this step
--   pct_from_prev    NUMERIC  % of users from the previous step who reached this one
--
-- STEP ORDER
--   1  WELCOME               9  MARKETPLACE_VISIT     17  SCHEDULE_AGENT
--   2  USAGE_REASON         10  MARKETPLACE_ADD_AGENT  18  RUN_AGENTS
--   3  INTEGRATIONS         11  MARKETPLACE_RUN_AGENT  19  RUN_3_DAYS
--   4  AGENT_CHOICE         12  BUILDER_OPEN           20  TRIGGER_WEBHOOK
--   5  AGENT_NEW_RUN        13  BUILDER_SAVE_AGENT     21  RUN_14_DAYS
--   6  AGENT_INPUT          14  BUILDER_RUN_AGENT      22  RUN_AGENTS_100
--   7  CONGRATS             15  VISIT_COPILOT
--   8  GET_RESULTS          16  RE_RUN_AGENT
--
-- WINDOW
--   Users who started onboarding in the last 90 days
--
-- EXAMPLE QUERIES
--   -- Full funnel
--   SELECT * FROM analytics.user_onboarding_funnel ORDER BY step_order;
--
--   -- Biggest drop-off point
--   SELECT step, pct_from_prev FROM analytics.user_onboarding_funnel
--   ORDER BY pct_from_prev ASC LIMIT 3;
-- =============================================================
-
-WITH all_steps AS (
-  -- Complete ordered grid of all 22 steps so zero-completion steps
-  -- are always present, keeping LAG comparisons correct.
-  SELECT step_name, step_order
-  FROM (VALUES
-    ('WELCOME',               1),
-    ('USAGE_REASON',          2),
-    ('INTEGRATIONS',          3),
-    ('AGENT_CHOICE',          4),
-    ('AGENT_NEW_RUN',         5),
-    ('AGENT_INPUT',           6),
-    ('CONGRATS',              7),
-    ('GET_RESULTS',           8),
-    ('MARKETPLACE_VISIT',     9),
-    ('MARKETPLACE_ADD_AGENT', 10),
-    ('MARKETPLACE_RUN_AGENT', 11),
-    ('BUILDER_OPEN',          12),
-    ('BUILDER_SAVE_AGENT',    13),
-    ('BUILDER_RUN_AGENT',     14),
-    ('VISIT_COPILOT',         15),
-    ('RE_RUN_AGENT',          16),
-    ('SCHEDULE_AGENT',        17),
-    ('RUN_AGENTS',            18),
-    ('RUN_3_DAYS',            19),
-    ('TRIGGER_WEBHOOK',       20),
-    ('RUN_14_DAYS',           21),
-    ('RUN_AGENTS_100',        22)
-  ) AS t(step_name, step_order)
-),
-raw AS (
-  SELECT
-      u."userId",
-      step_txt::text AS step
-  FROM platform."UserOnboarding" u
-  CROSS JOIN LATERAL UNNEST(u."completedSteps") AS step_txt
-  WHERE u."createdAt" >= CURRENT_DATE - INTERVAL '90 days'
-),
-step_counts AS (
-  SELECT step, COUNT(DISTINCT "userId") AS users_completed
-  FROM raw GROUP BY step
-),
-funnel AS (
-  SELECT
-      a.step_name                          AS step,
-      a.step_order,
-      COALESCE(sc.users_completed, 0)      AS users_completed,
-      ROUND(
-        100.0 * COALESCE(sc.users_completed, 0)
-        / NULLIF(
-            LAG(COALESCE(sc.users_completed, 0)) OVER (ORDER BY a.step_order),
-            0
-          ),
-        2
-      )                                    AS pct_from_prev
-  FROM all_steps a
-  LEFT JOIN step_counts sc ON sc.step = a.step_name
-)
-SELECT * FROM funnel ORDER BY step_order
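A minimal sketch of why the full step grid matters for `LAG`, using a hypothetical three-step funnel in which the middle step has no completions. The CTE names `counts` and `grid` and the numbers are made up for illustration.

```sql
WITH counts(step, users_completed) AS (
  VALUES ('WELCOME', 100), ('INTEGRATIONS', 40)   -- USAGE_REASON has no completions, so no row
),
grid(step, step_order) AS (
  VALUES ('WELCOME', 1), ('USAGE_REASON', 2), ('INTEGRATIONS', 3)
)
SELECT g.step,
       COALESCE(c.users_completed, 0) AS users_completed,
       ROUND(100.0 * COALESCE(c.users_completed, 0)
             / NULLIF(LAG(COALESCE(c.users_completed, 0)) OVER (ORDER BY g.step_order), 0),
             2) AS pct_from_prev
FROM grid g
LEFT JOIN counts c ON c.step = g.step
ORDER BY g.step_order;
```

With the grid, `USAGE_REASON` appears with 0 users and `INTEGRATIONS` gets a NULL percentage because its predecessor is 0. Without the grid, `LAG` would pair `INTEGRATIONS` with `WELCOME` and report 40.00, hiding the empty step.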
--- a/autogpt_platform/analytics/queries/user_onboarding_integration.sql
+++ b/autogpt_platform/analytics/queries/user_onboarding_integration.sql
@@ -1,41 +0,0 @@
-- =============================================================
-- View: analytics.user_onboarding_integration
-- Looker source alias: ds75  |  Charts: 1
-- =============================================================
-- DESCRIPTION
--   Pre-aggregated count of users who selected each integration
--   during onboarding.  One row per integration type, sorted
--   by popularity.
--
-- SOURCE TABLES
--   platform.UserOnboarding  — integrations array column
--
-- OUTPUT COLUMNS
--   integration            TEXT    Integration name (e.g. 'github', 'slack', 'notion')
--   users_with_integration BIGINT  Distinct users who selected this integration
--
-- WINDOW
--   Users who started onboarding in the last 90 days
--
-- EXAMPLE QUERIES
--   -- Full integration popularity ranking
--   SELECT * FROM analytics.user_onboarding_integration;
--
--   -- Top 5 integrations
--   SELECT * FROM analytics.user_onboarding_integration
--   ORDER BY users_with_integration DESC LIMIT 5;
-- =============================================================
-
-WITH exploded AS (
-  SELECT
-      u."userId" AS user_id,
-      UNNEST(u."integrations") AS integration
-  FROM platform."UserOnboarding" u
-  WHERE u."createdAt" >= CURRENT_DATE - INTERVAL '90 days'
-)
-SELECT
-    integration,
-    COUNT(DISTINCT user_id) AS users_with_integration
-FROM exploded
-WHERE integration IS NOT NULL AND integration <> ''
-GROUP BY integration
-ORDER BY users_with_integration DESC
--- a/autogpt_platform/analytics/queries/users_activities.sql
+++ b/autogpt_platform/analytics/queries/users_activities.sql
@@ -1,145 +0,0 @@
-- =============================================================
-- View: analytics.users_activities
-- Looker source alias: ds56  |  Charts: 5
-- =============================================================
-- DESCRIPTION
--   One row per user with lifetime activity summary.
--   Joins login sessions with agent graphs, executions and
--   node-level runs to give a full picture of how engaged
--   each user is.  Includes a convenience flag for 7-day
--   activation (did the user return at least 7 days after
--   their first login?).
--
-- SOURCE TABLES
--   auth.sessions                    — Login/session records
--   platform.AgentGraph              — Graphs (agents) built by the user
--   platform.AgentGraphExecution     — Agent run history
--   platform.AgentNodeExecution      — Individual block execution history
--
-- PERFORMANCE NOTE
--   Each CTE aggregates its own table independently by userId.
--   This avoids the fan-out that occurs when driving every join
--   from user_logins across the two largest tables
--   (AgentGraphExecution and AgentNodeExecution).
--
-- OUTPUT COLUMNS
--   user_id                   TEXT         Supabase user UUID
--   first_login_time          TIMESTAMPTZ  First ever session created_at
--   last_login_time           TIMESTAMPTZ  Most recent session created_at
--   last_visit_time           TIMESTAMPTZ  Max of last refresh or login
--   last_agent_save_time      TIMESTAMPTZ  Last time user saved an agent graph
--   agent_count               BIGINT       Number of distinct active graphs built (0 if none)
--   first_agent_run_time      TIMESTAMPTZ  First ever graph execution
--   last_agent_run_time       TIMESTAMPTZ  Most recent graph execution
--   unique_agent_runs         BIGINT       Distinct agent graphs ever run (0 if none)
--   agent_runs                BIGINT       Total graph execution count (0 if none)
--   node_execution_count      BIGINT       Total node executions across all runs
--   node_execution_failed     BIGINT       Node executions with FAILED status
--   node_execution_completed  BIGINT       Node executions with COMPLETED status
--   node_execution_terminated BIGINT       Node executions with TERMINATED status
--   node_execution_queued     BIGINT       Node executions with QUEUED status
--   node_execution_running    BIGINT       Node executions with RUNNING status
--   is_active_after_7d        INT          1=returned after day 7, 0=did not, NULL=too early to tell
--   node_execution_incomplete BIGINT       Node executions with INCOMPLETE status
--   node_execution_review     BIGINT       Node executions with REVIEW status
--
-- EXAMPLE QUERIES
--   -- Users who ran at least one agent and returned after 7 days
--   SELECT COUNT(*) FROM analytics.users_activities
--   WHERE agent_runs > 0 AND is_active_after_7d = 1;
--
--   -- Top 10 most active users by agent runs
--   SELECT user_id, agent_runs, node_execution_count
--   FROM analytics.users_activities
--   ORDER BY agent_runs DESC LIMIT 10;
--
--   -- 7-day activation rate
--   SELECT
--     SUM(CASE WHEN is_active_after_7d = 1 THEN 1 ELSE 0 END)::float
--     / NULLIF(COUNT(CASE WHEN is_active_after_7d IS NOT NULL THEN 1 END), 0)
--     AS activation_rate
--   FROM analytics.users_activities;
-- =============================================================
-
-WITH user_logins AS (
-  SELECT
-    user_id::text                                    AS user_id,
-    MIN(created_at)                                  AS first_login_time,
-    MAX(created_at)                                  AS last_login_time,
-    GREATEST(
-      MAX(refreshed_at)::timestamptz,
-      MAX(created_at)::timestamptz
-    )                                                AS last_visit_time
-  FROM auth.sessions
-  GROUP BY user_id
-),
-user_agents AS (
-  -- Aggregate AgentGraph directly by userId (no fan-out from user_logins)
-  SELECT
-    "userId"::text                AS user_id,
-    MAX("updatedAt")              AS last_agent_save_time,
-    COUNT(DISTINCT "id")          AS agent_count
-  FROM platform."AgentGraph"
-  WHERE "isActive"
-  GROUP BY "userId"
-),
-user_graph_runs AS (
-  -- Aggregate AgentGraphExecution directly by userId
-  SELECT
-    "userId"::text                        AS user_id,
-    MIN("createdAt")                      AS first_agent_run_time,
-    MAX("createdAt")                      AS last_agent_run_time,
-    COUNT(DISTINCT "agentGraphId")        AS unique_agent_runs,
-    COUNT("id")                           AS agent_runs
-  FROM platform."AgentGraphExecution"
-  GROUP BY "userId"
-),
-user_node_runs AS (
-  -- Aggregate AgentNodeExecution directly; resolve userId via a
-  -- single join to AgentGraphExecution instead of fanning out from
-  -- user_logins through both large tables.
-  SELECT
-    g."userId"::text                                                   AS user_id,
-    COUNT(*)                                                           AS node_execution_count,
-    COUNT(*) FILTER (WHERE n."executionStatus" = 'FAILED')             AS node_execution_failed,
-    COUNT(*) FILTER (WHERE n."executionStatus" = 'COMPLETED')          AS node_execution_completed,
-    COUNT(*) FILTER (WHERE n."executionStatus" = 'TERMINATED')         AS node_execution_terminated,
-    COUNT(*) FILTER (WHERE n."executionStatus" = 'QUEUED')             AS node_execution_queued,
-    COUNT(*) FILTER (WHERE n."executionStatus" = 'RUNNING')            AS node_execution_running,
-    COUNT(*) FILTER (WHERE n."executionStatus" = 'INCOMPLETE')         AS node_execution_incomplete,
-    COUNT(*) FILTER (WHERE n."executionStatus" = 'REVIEW')             AS node_execution_review
-  FROM platform."AgentNodeExecution" n
-  JOIN platform."AgentGraphExecution" g
-    ON g."id" = n."agentGraphExecutionId"
-  GROUP BY g."userId"
-)
-SELECT
-  ul.user_id,
-  ul.first_login_time,
-  ul.last_login_time,
-  ul.last_visit_time,
-  ua.last_agent_save_time,
-  COALESCE(ua.agent_count, 0)             AS agent_count,
-  gr.first_agent_run_time,
-  gr.last_agent_run_time,
-  COALESCE(gr.unique_agent_runs, 0)       AS unique_agent_runs,
-  COALESCE(gr.agent_runs, 0)              AS agent_runs,
-  COALESCE(nr.node_execution_count, 0)      AS node_execution_count,
-  COALESCE(nr.node_execution_failed, 0)     AS node_execution_failed,
-  COALESCE(nr.node_execution_completed, 0)  AS node_execution_completed,
-  COALESCE(nr.node_execution_terminated, 0) AS node_execution_terminated,
-  COALESCE(nr.node_execution_queued, 0)     AS node_execution_queued,
-  COALESCE(nr.node_execution_running, 0)    AS node_execution_running,
-  CASE
-    WHEN ul.first_login_time < NOW() - INTERVAL '7 days'
-     AND ul.last_visit_time  >= ul.first_login_time + INTERVAL '7 days' THEN 1
-    WHEN ul.first_login_time < NOW() - INTERVAL '7 days'
-     AND ul.last_visit_time  <  ul.first_login_time + INTERVAL '7 days' THEN 0
-    ELSE NULL
-  END AS is_active_after_7d,
-  COALESCE(nr.node_execution_incomplete, 0) AS node_execution_incomplete,
-  COALESCE(nr.node_execution_review, 0)     AS node_execution_review
-FROM user_logins ul
-LEFT JOIN user_agents     ua ON ul.user_id = ua.user_id
-LEFT JOIN user_graph_runs gr ON ul.user_id = gr.user_id
-LEFT JOIN user_node_runs  nr ON ul.user_id = nr.user_id
--- a/autogpt_platform/backend/.env.default
+++ b/autogpt_platform/backend/.env.default
@@ -37,10 +37,6 @@ JWT_VERIFY_KEY=your-super-secret-jwt-token-with-at-least-32-characters-long
 ENCRYPTION_KEY=dvziYgz0KSK8FENhju0ZYi8-fRTfAdlz6YLhdB_jhNw=
 UNSUBSCRIBE_SECRET_KEY=HlP8ivStJjmbf6NKi78m_3FnOogut0t5ckzjsIqeaio=

-## ===== SIGNUP / INVITE GATE ===== ##
-# Set to true to require an invite before users can sign up
-ENABLE_INVITE_GATE=false
-
 ## ===== IMPORTANT OPTIONAL CONFIGURATION ===== ##
 # Platform URLs (set these for webhooks and OAuth to work)
 PLATFORM_BASE_URL=http://localhost:8000
--- a/autogpt_platform/backend/CLAUDE.md
+++ b/autogpt_platform/backend/CLAUDE.md
@@ -58,31 +58,10 @@ poetry run pytest path/to/test.py --snapshot-update
 - **Authentication**: JWT-based with Supabase integration
 - **Security**: Cache protection middleware prevents sensitive data caching in browsers/proxies

-## Code Style
-
-- **Top-level imports only** — no local/inner imports (lazy imports only for heavy optional deps like `openpyxl`)
-- **No duck typing** — no `hasattr`/`getattr`/`isinstance` for type dispatch; use typed interfaces/unions/protocols
-- **Pydantic models** over dataclass/namedtuple/dict for structured data
-- **No linter suppressors** — no `# type: ignore`, `# noqa`, `# pyright: ignore`; fix the type/code
-- **List comprehensions** over manual loop-and-append
-- **Early return** — guard clauses first, avoid deep nesting
-- **Lazy `%s` logging** — `logger.info("Processing %s items", count)` not `logger.info(f"Processing {count} items")`
-- **Sanitize error paths** — `os.path.basename()` in error messages to avoid leaking directory structure
-- **TOCTOU awareness** — avoid check-then-act patterns for file access and credit charging
-- **`Security()` vs `Depends()`** — use `Security()` for auth deps to get proper OpenAPI security spec
-- **Redis pipelines** — `transaction=True` for atomicity on multi-step operations
-- **`max(0, value)` guards** — for computed values that should never be negative
-- **SSE protocol** — `data:` lines for frontend-parsed events (must match Zod schema), `: comment` lines for heartbeats/status
-- **File length** — keep files under ~300 lines; if a file grows beyond this, split by responsibility (e.g. extract helpers, models, or a sub-module into a new file). Never keep appending to a long file.
-- **Function length** — keep functions under ~40 lines; extract named helpers when a function grows longer. Long functions are a sign of mixed concerns, not complexity.
-
 ## Testing Approach

 - Uses pytest with snapshot testing for API responses
 - Test files are colocated with source files (`*_test.py`)
-- Mock at boundaries — mock where the symbol is **used**, not where it's **defined**
-- After refactoring, update mock targets to match new module paths
-- Use `AsyncMock` for async functions (`from unittest.mock import AsyncMock`)

 ## Database Schema

--- a/autogpt_platform/backend/backend/api/external/v1/routes.py
+++ b/autogpt_platform/backend/backend/api/external/v1/routes.py
@@ -1,7 +1,7 @@
 import logging
 import urllib.parse
 from collections import defaultdict
-from typing import Annotated, Any, Optional, Sequence
+from typing import Annotated, Any, Literal, Optional, Sequence

 from fastapi import APIRouter, Body, HTTPException, Security
 from prisma.enums import AgentExecutionStatus, APIKeyPermission
@@ -9,10 +9,9 @@ from pydantic import BaseModel, Field
 from typing_extensions import TypedDict

 import backend.api.features.store.cache as store_cache
-import backend.api.features.store.db as store_db
 import backend.api.features.store.model as store_model
 import backend.blocks
-from backend.api.external.middleware import require_auth, require_permission
+from backend.api.external.middleware import require_permission
 from backend.data import execution as execution_db
 from backend.data import graph as graph_db
 from backend.data import user as user_db
@@ -231,13 +230,13 @@ async def get_graph_execution_results(
 @v1_router.get(
    path="/store/agents",
    tags=["store"],
-    dependencies=[Security(require_auth)],  # data is public; auth required as anti-DDoS
+    dependencies=[Security(require_permission(APIKeyPermission.READ_STORE))],
    response_model=store_model.StoreAgentsResponse,
 )
 async def get_store_agents(
    featured: bool = False,
    creator: str | None = None,
-    sorted_by: store_db.StoreAgentsSortOptions | None = None,
+    sorted_by: Literal["rating", "runs", "name", "updated_at"] | None = None,
    search_query: str | None = None,
    category: str | None = None,
    page: int = 1,
@@ -279,7 +278,7 @@ async def get_store_agents(
 @v1_router.get(
    path="/store/agents/{username}/{agent_name}",
    tags=["store"],
-    dependencies=[Security(require_auth)],  # data is public; auth required as anti-DDoS
+    dependencies=[Security(require_permission(APIKeyPermission.READ_STORE))],
    response_model=store_model.StoreAgentDetails,
 )
 async def get_store_agent(
@@ -307,13 +306,13 @@ async def get_store_agent(
 @v1_router.get(
    path="/store/creators",
    tags=["store"],
-    dependencies=[Security(require_auth)],  # data is public; auth required as anti-DDoS
+    dependencies=[Security(require_permission(APIKeyPermission.READ_STORE))],
    response_model=store_model.CreatorsResponse,
 )
 async def get_store_creators(
    featured: bool = False,
    search_query: str | None = None,
-    sorted_by: store_db.StoreCreatorsSortOptions | None = None,
+    sorted_by: Literal["agent_rating", "agent_runs", "num_agents"] | None = None,
    page: int = 1,
    page_size: int = 20,
 ) -> store_model.CreatorsResponse:
@@ -349,7 +348,7 @@ async def get_store_creators(
 @v1_router.get(
    path="/store/creators/{username}",
    tags=["store"],
-    dependencies=[Security(require_auth)],  # data is public; auth required as anti-DDoS
+    dependencies=[Security(require_permission(APIKeyPermission.READ_STORE))],
    response_model=store_model.CreatorDetails,
 )
 async def get_store_creator(
--- a/autogpt_platform/backend/backend/api/features/admin/model.py
+++ b/autogpt_platform/backend/backend/api/features/admin/model.py
@@ -1,17 +1,8 @@
-from __future__ import annotations
-
-from datetime import datetime
-from typing import TYPE_CHECKING, Any, Literal, Optional
-
-import prisma.enums
-from pydantic import BaseModel, EmailStr
+from pydantic import BaseModel

 from backend.data.model import UserTransaction
 from backend.util.models import Pagination

-if TYPE_CHECKING:
-    from backend.data.invited_user import BulkInvitedUsersResult, InvitedUserRecord
-

 class UserHistoryResponse(BaseModel):
    """Response model for listings with version history"""
@@ -23,70 +14,3 @@ class UserHistoryResponse(BaseModel):
 class AddUserCreditsResponse(BaseModel):
    new_balance: int
    transaction_key: str
-
-
-class CreateInvitedUserRequest(BaseModel):
-    email: EmailStr
-    name: Optional[str] = None
-
-
-class InvitedUserResponse(BaseModel):
-    id: str
-    email: str
-    status: prisma.enums.InvitedUserStatus
-    auth_user_id: Optional[str] = None
-    name: Optional[str] = None
-    tally_understanding: Optional[dict[str, Any]] = None
-    tally_status: prisma.enums.TallyComputationStatus
-    tally_computed_at: Optional[datetime] = None
-    tally_error: Optional[str] = None
-    created_at: datetime
-    updated_at: datetime
-
-    @classmethod
-    def from_record(cls, record: InvitedUserRecord) -> InvitedUserResponse:
-        return cls.model_validate(record.model_dump())
-
-
-class InvitedUsersResponse(BaseModel):
-    invited_users: list[InvitedUserResponse]
-    pagination: Pagination
-
-
-class BulkInvitedUserRowResponse(BaseModel):
-    row_number: int
-    email: Optional[str] = None
-    name: Optional[str] = None
-    status: Literal["CREATED", "SKIPPED", "ERROR"]
-    message: str
-    invited_user: Optional[InvitedUserResponse] = None
-
-
-class BulkInvitedUsersResponse(BaseModel):
-    created_count: int
-    skipped_count: int
-    error_count: int
-    results: list[BulkInvitedUserRowResponse]
-
-    @classmethod
-    def from_result(cls, result: BulkInvitedUsersResult) -> BulkInvitedUsersResponse:
-        return cls(
-            created_count=result.created_count,
-            skipped_count=result.skipped_count,
-            error_count=result.error_count,
-            results=[
-                BulkInvitedUserRowResponse(
-                    row_number=row.row_number,
-                    email=row.email,
-                    name=row.name,
-                    status=row.status,
-                    message=row.message,
-                    invited_user=(
-                        InvitedUserResponse.from_record(row.invited_user)
-                        if row.invited_user is not None
-                        else None
-                    ),
-                )
-                for row in result.results
-            ],
-        )
--- a/autogpt_platform/backend/backend/api/features/admin/store_admin_routes.py
+++ b/autogpt_platform/backend/backend/api/features/admin/store_admin_routes.py
@@ -24,13 +24,14 @@ router = fastapi.APIRouter(
 @router.get(
    "/listings",
    summary="Get Admin Listings History",
+    response_model=store_model.StoreListingsWithVersionsResponse,
 )
 async def get_admin_listings_with_versions(
    status: typing.Optional[prisma.enums.SubmissionStatus] = None,
    search: typing.Optional[str] = None,
    page: int = 1,
    page_size: int = 20,
-) -> store_model.StoreListingsWithVersionsAdminViewResponse:
+):
    """
    Get store listings with their version history for admins.

@@ -44,26 +45,36 @@ async def get_admin_listings_with_versions(
        page_size: Number of items per page

    Returns:
-        Paginated listings with their versions
+        StoreListingsWithVersionsResponse with listings and their versions
    """
-    listings = await store_db.get_admin_listings_with_versions(
-        status=status,
-        search_query=search,
-        page=page,
-        page_size=page_size,
-    )
-    return listings
+    try:
+        listings = await store_db.get_admin_listings_with_versions(
+            status=status,
+            search_query=search,
+            page=page,
+            page_size=page_size,
+        )
+        return listings
+    except Exception as e:
+        logger.exception("Error getting admin listings with versions: %s", e)
+        return fastapi.responses.JSONResponse(
+            status_code=500,
+            content={
+                "detail": "An error occurred while retrieving listings with versions"
+            },
+        )


 @router.post(
    "/submissions/{store_listing_version_id}/review",
    summary="Review Store Submission",
+    response_model=store_model.StoreSubmission,
 )
 async def review_submission(
    store_listing_version_id: str,
    request: store_model.ReviewSubmissionRequest,
    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
-) -> store_model.StoreSubmissionAdminView:
+):
    """
    Review a store listing submission.

@@ -73,24 +84,31 @@ async def review_submission(
        user_id: Authenticated admin user performing the review

    Returns:
-        StoreSubmissionAdminView with updated review information
+        StoreSubmission with updated review information
    """
-    already_approved = await store_db.check_submission_already_approved(
-        store_listing_version_id=store_listing_version_id,
-    )
-    submission = await store_db.review_store_submission(
-        store_listing_version_id=store_listing_version_id,
-        is_approved=request.is_approved,
-        external_comments=request.comments,
-        internal_comments=request.internal_comments or "",
-        reviewer_id=user_id,
-    )
+    try:
+        already_approved = await store_db.check_submission_already_approved(
+            store_listing_version_id=store_listing_version_id,
+        )
+        submission = await store_db.review_store_submission(
+            store_listing_version_id=store_listing_version_id,
+            is_approved=request.is_approved,
+            external_comments=request.comments,
+            internal_comments=request.internal_comments or "",
+            reviewer_id=user_id,
+        )

-    state_changed = already_approved != request.is_approved
-    # Clear caches whenever approval state changes, since store visibility can change
-    if state_changed:
-        store_cache.clear_all_caches()
-    return submission
+        state_changed = already_approved != request.is_approved
+        # Clear caches when the request is approved as it updates what is shown on the store
+        if state_changed:
+            store_cache.clear_all_caches()
+        return submission
+    except Exception as e:
+        logger.exception("Error reviewing submission: %s", e)
+        return fastapi.responses.JSONResponse(
+            status_code=500,
+            content={"detail": "An error occurred while reviewing the submission"},
+        )


 @router.get(
--- a/autogpt_platform/backend/backend/api/features/admin/user_admin_routes.py
+++ b/autogpt_platform/backend/backend/api/features/admin/user_admin_routes.py
@@ -1,137 +0,0 @@
-import logging
-import math
-
-from autogpt_libs.auth import get_user_id, requires_admin_user
-from fastapi import APIRouter, File, Query, Security, UploadFile
-
-from backend.data.invited_user import (
-    bulk_create_invited_users_from_file,
-    create_invited_user,
-    list_invited_users,
-    retry_invited_user_tally,
-    revoke_invited_user,
-)
-from backend.data.tally import mask_email
-from backend.util.models import Pagination
-
-from .model import (
-    BulkInvitedUsersResponse,
-    CreateInvitedUserRequest,
-    InvitedUserResponse,
-    InvitedUsersResponse,
-)
-
-logger = logging.getLogger(__name__)
-
-
-router = APIRouter(
-    prefix="/admin",
-    tags=["users", "admin"],
-    dependencies=[Security(requires_admin_user)],
-)
-
-
-@router.get(
-    "/invited-users",
-    response_model=InvitedUsersResponse,
-    summary="List Invited Users",
-)
-async def get_invited_users(
-    admin_user_id: str = Security(get_user_id),
-    page: int = Query(1, ge=1),
-    page_size: int = Query(50, ge=1, le=200),
-) -> InvitedUsersResponse:
-    logger.info("Admin user %s requested invited users", admin_user_id)
-    invited_users, total = await list_invited_users(page=page, page_size=page_size)
-    return InvitedUsersResponse(
-        invited_users=[InvitedUserResponse.from_record(iu) for iu in invited_users],
-        pagination=Pagination(
-            total_items=total,
-            total_pages=max(1, math.ceil(total / page_size)),
-            current_page=page,
-            page_size=page_size,
-        ),
-    )
-
-
-@router.post(
-    "/invited-users",
-    response_model=InvitedUserResponse,
-    summary="Create Invited User",
-)
-async def create_invited_user_route(
-    request: CreateInvitedUserRequest,
-    admin_user_id: str = Security(get_user_id),
-) -> InvitedUserResponse:
-    logger.info(
-        "Admin user %s creating invited user for %s",
-        admin_user_id,
-        mask_email(request.email),
-    )
-    invited_user = await create_invited_user(request.email, request.name)
-    logger.info(
-        "Admin user %s created invited user %s",
-        admin_user_id,
-        invited_user.id,
-    )
-    return InvitedUserResponse.from_record(invited_user)
-
-
-@router.post(
-    "/invited-users/bulk",
-    response_model=BulkInvitedUsersResponse,
-    summary="Bulk Create Invited Users",
-    operation_id="postV2BulkCreateInvitedUsers",
-)
-async def bulk_create_invited_users_route(
-    file: UploadFile = File(...),
-    admin_user_id: str = Security(get_user_id),
-) -> BulkInvitedUsersResponse:
-    logger.info(
-        "Admin user %s bulk invited users from %s",
-        admin_user_id,
-        file.filename or "<unnamed>",
-    )
-    content = await file.read()
-    result = await bulk_create_invited_users_from_file(file.filename, content)
-    return BulkInvitedUsersResponse.from_result(result)
-
-
-@router.post(
-    "/invited-users/{invited_user_id}/revoke",
-    response_model=InvitedUserResponse,
-    summary="Revoke Invited User",
-)
-async def revoke_invited_user_route(
-    invited_user_id: str,
-    admin_user_id: str = Security(get_user_id),
-) -> InvitedUserResponse:
-    logger.info(
-        "Admin user %s revoking invited user %s", admin_user_id, invited_user_id
-    )
-    invited_user = await revoke_invited_user(invited_user_id)
-    logger.info("Admin user %s revoked invited user %s", admin_user_id, invited_user_id)
-    return InvitedUserResponse.from_record(invited_user)
-
-
-@router.post(
-    "/invited-users/{invited_user_id}/retry-tally",
-    response_model=InvitedUserResponse,
-    summary="Retry Invited User Tally",
-)
-async def retry_invited_user_tally_route(
-    invited_user_id: str,
-    admin_user_id: str = Security(get_user_id),
-) -> InvitedUserResponse:
-    logger.info(
-        "Admin user %s retrying Tally seed for invited user %s",
-        admin_user_id,
-        invited_user_id,
-    )
-    invited_user = await retry_invited_user_tally(invited_user_id)
-    logger.info(
-        "Admin user %s retried Tally seed for invited user %s",
-        admin_user_id,
-        invited_user_id,
-    )
-    return InvitedUserResponse.from_record(invited_user)
--- a/autogpt_platform/backend/backend/api/features/admin/user_admin_routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/admin/user_admin_routes_test.py
@@ -1,168 +0,0 @@
-from datetime import datetime, timezone
-from unittest.mock import AsyncMock
-
-import fastapi
-import fastapi.testclient
-import prisma.enums
-import pytest
-import pytest_mock
-from autogpt_libs.auth.jwt_utils import get_jwt_payload
-
-from backend.data.invited_user import (
-    BulkInvitedUserRowResult,
-    BulkInvitedUsersResult,
-    InvitedUserRecord,
-)
-
-from .user_admin_routes import router as user_admin_router
-
-app = fastapi.FastAPI()
-app.include_router(user_admin_router)
-
-client = fastapi.testclient.TestClient(app)
-
-
-@pytest.fixture(autouse=True)
-def setup_app_admin_auth(mock_jwt_admin):
-    app.dependency_overrides[get_jwt_payload] = mock_jwt_admin["get_jwt_payload"]
-    yield
-    app.dependency_overrides.clear()
-
-
-def _sample_invited_user() -> InvitedUserRecord:
-    now = datetime.now(timezone.utc)
-    return InvitedUserRecord(
-        id="invite-1",
-        email="invited@example.com",
-        status=prisma.enums.InvitedUserStatus.INVITED,
-        auth_user_id=None,
-        name="Invited User",
-        tally_understanding=None,
-        tally_status=prisma.enums.TallyComputationStatus.PENDING,
-        tally_computed_at=None,
-        tally_error=None,
-        created_at=now,
-        updated_at=now,
-    )
-
-
-def _sample_bulk_invited_users_result() -> BulkInvitedUsersResult:
-    return BulkInvitedUsersResult(
-        created_count=1,
-        skipped_count=1,
-        error_count=0,
-        results=[
-            BulkInvitedUserRowResult(
-                row_number=1,
-                email="invited@example.com",
-                name=None,
-                status="CREATED",
-                message="Invite created",
-                invited_user=_sample_invited_user(),
-            ),
-            BulkInvitedUserRowResult(
-                row_number=2,
-                email="duplicate@example.com",
-                name=None,
-                status="SKIPPED",
-                message="An invited user with this email already exists",
-                invited_user=None,
-            ),
-        ],
-    )
-
-
-def test_get_invited_users(
-    mocker: pytest_mock.MockerFixture,
-) -> None:
-    mocker.patch(
-        "backend.api.features.admin.user_admin_routes.list_invited_users",
-        AsyncMock(return_value=([_sample_invited_user()], 1)),
-    )
-
-    response = client.get("/admin/invited-users")
-
-    assert response.status_code == 200
-    data = response.json()
-    assert len(data["invited_users"]) == 1
-    assert data["invited_users"][0]["email"] == "invited@example.com"
-    assert data["invited_users"][0]["status"] == "INVITED"
-    assert data["pagination"]["total_items"] == 1
-    assert data["pagination"]["current_page"] == 1
-    assert data["pagination"]["page_size"] == 50
-
-
-def test_create_invited_user(
-    mocker: pytest_mock.MockerFixture,
-) -> None:
-    mocker.patch(
-        "backend.api.features.admin.user_admin_routes.create_invited_user",
-        AsyncMock(return_value=_sample_invited_user()),
-    )
-
-    response = client.post(
-        "/admin/invited-users",
-        json={"email": "invited@example.com", "name": "Invited User"},
-    )
-
-    assert response.status_code == 200
-    data = response.json()
-    assert data["email"] == "invited@example.com"
-    assert data["name"] == "Invited User"
-
-
-def test_bulk_create_invited_users(
-    mocker: pytest_mock.MockerFixture,
-) -> None:
-    mocker.patch(
-        "backend.api.features.admin.user_admin_routes.bulk_create_invited_users_from_file",
-        AsyncMock(return_value=_sample_bulk_invited_users_result()),
-    )
-
-    response = client.post(
-        "/admin/invited-users/bulk",
-        files={
-            "file": ("invites.txt", b"invited@example.com\nduplicate@example.com\n")
-        },
-    )
-
-    assert response.status_code == 200
-    data = response.json()
-    assert data["created_count"] == 1
-    assert data["skipped_count"] == 1
-    assert data["results"][0]["status"] == "CREATED"
-    assert data["results"][1]["status"] == "SKIPPED"
-
-
-def test_revoke_invited_user(
-    mocker: pytest_mock.MockerFixture,
-) -> None:
-    revoked = _sample_invited_user().model_copy(
-        update={"status": prisma.enums.InvitedUserStatus.REVOKED}
-    )
-    mocker.patch(
-        "backend.api.features.admin.user_admin_routes.revoke_invited_user",
-        AsyncMock(return_value=revoked),
-    )
-
-    response = client.post("/admin/invited-users/invite-1/revoke")
-
-    assert response.status_code == 200
-    assert response.json()["status"] == "REVOKED"
-
-
-def test_retry_invited_user_tally(
-    mocker: pytest_mock.MockerFixture,
-) -> None:
-    retried = _sample_invited_user().model_copy(
-        update={"tally_status": prisma.enums.TallyComputationStatus.RUNNING}
-    )
-    mocker.patch(
-        "backend.api.features.admin.user_admin_routes.retry_invited_user_tally",
-        AsyncMock(return_value=retried),
-    )
-
-    response = client.post("/admin/invited-users/invite-1/retry-tally")
-
-    assert response.status_code == 200
-    assert response.json()["tally_status"] == "RUNNING"
--- a/autogpt_platform/backend/backend/api/features/chat/routes.py
+++ b/autogpt_platform/backend/backend/api/features/chat/routes.py
@@ -11,7 +11,7 @@ from autogpt_libs import auth
 from fastapi import APIRouter, Depends, HTTPException, Query, Response, Security
 from fastapi.responses import StreamingResponse
 from prisma.models import UserWorkspaceFile
-from pydantic import BaseModel, Field, field_validator
+from pydantic import BaseModel, Field

 from backend.copilot import service as chat_service
 from backend.copilot import stream_registry
@@ -25,10 +25,8 @@ from backend.copilot.model import (
    delete_chat_session,
    get_chat_session,
    get_user_sessions,
-    update_session_title,
 )
 from backend.copilot.response_model import StreamError, StreamFinish, StreamHeartbeat
-from backend.copilot.tools.e2b_sandbox import kill_sandbox
 from backend.copilot.tools.models import (
    AgentDetailsResponse,
    AgentOutputResponse,
@@ -53,8 +51,6 @@ from backend.copilot.tools.models import (
    UnderstandingUpdatedResponse,
 )
 from backend.copilot.tracking import track_user_message
-from backend.data.redis_client import get_redis_async
-from backend.data.understanding import get_business_understanding
 from backend.data.workspace import get_or_create_workspace
 from backend.util.exceptions import NotFoundError

@@ -129,7 +125,6 @@ class SessionSummaryResponse(BaseModel):
    created_at: str
    updated_at: str
    title: str | None = None
-    is_processing: bool


 class ListSessionsResponse(BaseModel):
@@ -146,20 +141,6 @@ class CancelSessionResponse(BaseModel):
    reason: str | None = None


-class UpdateSessionTitleRequest(BaseModel):
-    """Request model for updating a session's title."""
-
-    title: str
-
-    @field_validator("title")
-    @classmethod
-    def title_must_not_be_blank(cls, v: str) -> str:
-        stripped = v.strip()
-        if not stripped:
-            raise ValueError("Title must not be blank")
-        return stripped
-
-
 # ========== Routes ==========


@@ -188,28 +169,6 @@ async def list_sessions(
    """
    sessions, total_count = await get_user_sessions(user_id, limit, offset)

-    # Batch-check Redis for active stream status on each session
-    processing_set: set[str] = set()
-    if sessions:
-        try:
-            redis = await get_redis_async()
-            pipe = redis.pipeline(transaction=False)
-            for session in sessions:
-                pipe.hget(
-                    f"{config.session_meta_prefix}{session.session_id}",
-                    "status",
-                )
-            statuses = await pipe.execute()
-            processing_set = {
-                session.session_id
-                for session, st in zip(sessions, statuses)
-                if st == "running"
-            }
-        except Exception:
-            logger.warning(
-                "Failed to fetch processing status from Redis; " "defaulting to empty"
-            )
-
    return ListSessionsResponse(
        sessions=[
            SessionSummaryResponse(
@@ -217,7 +176,6 @@ async def list_sessions(
                created_at=session.started_at.isoformat(),
                updated_at=session.updated_at.isoformat(),
                title=session.title,
-                is_processing=session.session_id in processing_set,
            )
            for session in sessions
        ],
@@ -292,12 +250,12 @@ async def delete_session(
        )

    # Best-effort cleanup of the E2B sandbox (if any).
-    # sandbox_id is in Redis; kill_sandbox() fetches it from there.
-    e2b_cfg = ChatConfig()
-    if e2b_cfg.e2b_active:
-        assert e2b_cfg.e2b_api_key  # guaranteed by e2b_active check
+    config = ChatConfig()
+    if config.use_e2b_sandbox and config.e2b_api_key:
+        from backend.copilot.tools.e2b_sandbox import kill_sandbox
+
        try:
-            await kill_sandbox(session_id, e2b_cfg.e2b_api_key)
+            await kill_sandbox(session_id, config.e2b_api_key)
        except Exception:
            logger.warning(
                "[E2B] Failed to kill sandbox for session %s", session_id[:12]
@@ -306,43 +264,6 @@ async def delete_session(
    return Response(status_code=204)


-@router.patch(
-    "/sessions/{session_id}/title",
-    summary="Update session title",
-    dependencies=[Security(auth.requires_user)],
-    status_code=200,
-    responses={404: {"description": "Session not found or access denied"}},
-)
-async def update_session_title_route(
-    session_id: str,
-    request: UpdateSessionTitleRequest,
-    user_id: Annotated[str, Security(auth.get_user_id)],
-) -> dict:
-    """
-    Update the title of a chat session.
-
-    Allows the user to rename their chat session.
-
-    Args:
-        session_id: The session ID to update.
-        request: Request body containing the new title.
-        user_id: The authenticated user's ID.
-
-    Returns:
-        dict: Status of the update.
-
-    Raises:
-        HTTPException: 404 if session not found or not owned by user.
-    """
-    success = await update_session_title(session_id, user_id, request.title)
-    if not success:
-        raise HTTPException(
-            status_code=404,
-            detail=f"Session {session_id} not found or access denied",
-        )
-    return {"status": "ok"}
-
-
 @router.get(
    "/sessions/{session_id}",
 )
@@ -832,6 +753,7 @@ async def resume_session_stream(
 @router.patch(
    "/sessions/{session_id}/assign-user",
    dependencies=[Security(auth.requires_user)],
+    status_code=200,
 )
 async def session_assign_user(
    session_id: str,
@@ -854,36 +776,6 @@ async def session_assign_user(
    return {"status": "ok"}


-# ========== Suggested Prompts ==========
-
-
-class SuggestedPromptsResponse(BaseModel):
-    """Response model for user-specific suggested prompts."""
-
-    prompts: list[str]
-
-
-@router.get(
-    "/suggested-prompts",
-    dependencies=[Security(auth.requires_user)],
-)
-async def get_suggested_prompts(
-    user_id: Annotated[str, Security(auth.get_user_id)],
-) -> SuggestedPromptsResponse:
-    """
-    Get LLM-generated suggested prompts for the authenticated user.
-
-    Returns personalized quick-action prompts based on the user's
-    business understanding. Returns an empty list if no custom prompts
-    are available.
-    """
-    understanding = await get_business_understanding(user_id)
-    if understanding is None:
-        return SuggestedPromptsResponse(prompts=[])
-
-    return SuggestedPromptsResponse(prompts=understanding.suggested_prompts)
-
-
 # ========== Configuration ==========


--- a/autogpt_platform/backend/backend/api/features/chat/routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/routes_test.py
@@ -1,6 +1,4 @@
-"""Tests for chat API routes: session title update, file attachment validation, and suggested prompts."""
-
-from unittest.mock import AsyncMock, MagicMock
+"""Tests for chat route file_ids validation and enrichment."""

 import fastapi
 import fastapi.testclient
@@ -19,7 +17,6 @@ TEST_USER_ID = "3e53486c-cf57-477e-ba2a-cb02dc828e1a"

 @pytest.fixture(autouse=True)
 def setup_app_auth(mock_jwt_user):
-    """Setup auth overrides for all tests in this module"""
    from autogpt_libs.auth.jwt_utils import get_jwt_payload

    app.dependency_overrides[get_jwt_payload] = mock_jwt_user["get_jwt_payload"]
@@ -27,95 +24,7 @@ def setup_app_auth(mock_jwt_user):
    app.dependency_overrides.clear()


-def _mock_update_session_title(
-    mocker: pytest_mock.MockerFixture, *, success: bool = True
-):
-    """Mock update_session_title."""
-    return mocker.patch(
-        "backend.api.features.chat.routes.update_session_title",
-        new_callable=AsyncMock,
-        return_value=success,
-    )
-
-
-# ─── Update title: success ─────────────────────────────────────────────
-
-
-def test_update_title_success(
-    mocker: pytest_mock.MockerFixture,
-    test_user_id: str,
-) -> None:
-    mock_update = _mock_update_session_title(mocker, success=True)
-
-    response = client.patch(
-        "/sessions/sess-1/title",
-        json={"title": "My project"},
-    )
-
-    assert response.status_code == 200
-    assert response.json() == {"status": "ok"}
-    mock_update.assert_called_once_with("sess-1", test_user_id, "My project")
-
-
-def test_update_title_trims_whitespace(
-    mocker: pytest_mock.MockerFixture,
-    test_user_id: str,
-) -> None:
-    mock_update = _mock_update_session_title(mocker, success=True)
-
-    response = client.patch(
-        "/sessions/sess-1/title",
-        json={"title": "  trimmed  "},
-    )
-
-    assert response.status_code == 200
-    mock_update.assert_called_once_with("sess-1", test_user_id, "trimmed")
-
-
-# ─── Update title: blank / whitespace-only → 422 ──────────────────────
-
-
-def test_update_title_blank_rejected(
-    test_user_id: str,
-) -> None:
-    """Whitespace-only titles must be rejected before hitting the DB."""
-    response = client.patch(
-        "/sessions/sess-1/title",
-        json={"title": "   "},
-    )
-
-    assert response.status_code == 422
-
-
-def test_update_title_empty_rejected(
-    test_user_id: str,
-) -> None:
-    response = client.patch(
-        "/sessions/sess-1/title",
-        json={"title": ""},
-    )
-
-    assert response.status_code == 422
-
-
-# ─── Update title: session not found or wrong user → 404 ──────────────
-
-
-def test_update_title_not_found(
-    mocker: pytest_mock.MockerFixture,
-    test_user_id: str,
-) -> None:
-    _mock_update_session_title(mocker, success=False)
-
-    response = client.patch(
-        "/sessions/sess-1/title",
-        json={"title": "New name"},
-    )
-
-    assert response.status_code == 404
-
-
-# ─── file_ids Pydantic validation ─────────────────────────────────────
+# ---- file_ids Pydantic validation (B1) ----


 def test_stream_chat_rejects_too_many_file_ids():
@@ -183,7 +92,7 @@ def test_stream_chat_accepts_20_file_ids(mocker: pytest_mock.MockFixture):
    assert response.status_code == 200


-# ─── UUID format filtering ─────────────────────────────────────────────
+# ---- UUID format filtering ----


 def test_file_ids_filters_invalid_uuids(mocker: pytest_mock.MockFixture):
@@ -222,7 +131,7 @@ def test_file_ids_filters_invalid_uuids(mocker: pytest_mock.MockFixture):
    assert call_kwargs["where"]["id"]["in"] == [valid_id]


-# ─── Cross-workspace file_ids ─────────────────────────────────────────
+# ---- Cross-workspace file_ids ----


 def test_file_ids_scoped_to_workspace(mocker: pytest_mock.MockFixture):
@@ -249,62 +158,3 @@ def test_file_ids_scoped_to_workspace(mocker: pytest_mock.MockFixture):
    call_kwargs = mock_prisma.find_many.call_args[1]
    assert call_kwargs["where"]["workspaceId"] == "my-workspace-id"
    assert call_kwargs["where"]["isDeleted"] is False
-
-
-# ─── Suggested prompts endpoint ──────────────────────────────────────
-
-
-def _mock_get_business_understanding(
-    mocker: pytest_mock.MockerFixture,
-    *,
-    return_value=None,
-):
-    """Mock get_business_understanding."""
-    return mocker.patch(
-        "backend.api.features.chat.routes.get_business_understanding",
-        new_callable=AsyncMock,
-        return_value=return_value,
-    )
-
-
-def test_suggested_prompts_returns_prompts(
-    mocker: pytest_mock.MockerFixture,
-    test_user_id: str,
-) -> None:
-    """User with understanding and prompts gets them back."""
-    mock_understanding = MagicMock()
-    mock_understanding.suggested_prompts = ["Do X", "Do Y", "Do Z"]
-    _mock_get_business_understanding(mocker, return_value=mock_understanding)
-
-    response = client.get("/suggested-prompts")
-
-    assert response.status_code == 200
-    assert response.json() == {"prompts": ["Do X", "Do Y", "Do Z"]}
-
-
-def test_suggested_prompts_no_understanding(
-    mocker: pytest_mock.MockerFixture,
-    test_user_id: str,
-) -> None:
-    """User with no understanding gets empty list."""
-    _mock_get_business_understanding(mocker, return_value=None)
-
-    response = client.get("/suggested-prompts")
-
-    assert response.status_code == 200
-    assert response.json() == {"prompts": []}
-
-
-def test_suggested_prompts_empty_prompts(
-    mocker: pytest_mock.MockerFixture,
-    test_user_id: str,
-) -> None:
-    """User with understanding but no prompts gets empty list."""
-    mock_understanding = MagicMock()
-    mock_understanding.suggested_prompts = []
-    _mock_get_business_understanding(mocker, return_value=mock_understanding)
-
-    response = client.get("/suggested-prompts")
-
-    assert response.status_code == 200
-    assert response.json() == {"prompts": []}
--- a/autogpt_platform/backend/backend/api/features/executions/review/review_routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/executions/review/review_routes_test.py
@@ -638,7 +638,7 @@ async def test_process_review_action_auto_approve_creates_auto_approval_records(

    # Mock get_node_executions to return node_id mapping
    mock_get_node_executions = mocker.patch(
-        "backend.api.features.executions.review.routes.get_node_executions"
+        "backend.data.execution.get_node_executions"
    )
    mock_node_exec = mocker.Mock(spec=NodeExecutionResult)
    mock_node_exec.node_exec_id = "test_node_123"
@@ -936,7 +936,7 @@ async def test_process_review_action_auto_approve_only_applies_to_approved_revie

    # Mock get_node_executions to return node_id mapping
    mock_get_node_executions = mocker.patch(
-        "backend.api.features.executions.review.routes.get_node_executions"
+        "backend.data.execution.get_node_executions"
    )
    mock_node_exec = mocker.Mock(spec=NodeExecutionResult)
    mock_node_exec.node_exec_id = "node_exec_approved"
@@ -1148,7 +1148,7 @@ async def test_process_review_action_per_review_auto_approve_granularity(

    # Mock get_node_executions to return batch node data
    mock_get_node_executions = mocker.patch(
-        "backend.api.features.executions.review.routes.get_node_executions"
+        "backend.data.execution.get_node_executions"
    )
    # Create mock node executions for each review
    mock_node_execs = []
--- a/autogpt_platform/backend/backend/api/features/executions/review/routes.py
+++ b/autogpt_platform/backend/backend/api/features/executions/review/routes.py
@@ -6,15 +6,10 @@ import autogpt_libs.auth as autogpt_auth_lib
 from fastapi import APIRouter, HTTPException, Query, Security, status
 from prisma.enums import ReviewStatus

-from backend.copilot.constants import (
-    is_copilot_synthetic_id,
-    parse_node_id_from_exec_id,
-)
 from backend.data.execution import (
    ExecutionContext,
    ExecutionStatus,
    get_graph_execution_meta,
-    get_node_executions,
 )
 from backend.data.graph import get_graph_settings
 from backend.data.human_review import (
@@ -41,38 +36,6 @@ router = APIRouter(
 )


-async def _resolve_node_ids(
-    node_exec_ids: list[str],
-    graph_exec_id: str,
-    is_copilot: bool,
-) -> dict[str, str]:
-    """Resolve node_exec_id -> node_id for auto-approval records.
-
-    CoPilot synthetic IDs encode node_id in the format "{node_id}:{random}".
-    Graph executions look up node_id from NodeExecution records.
-    """
-    if not node_exec_ids:
-        return {}
-
-    if is_copilot:
-        return {neid: parse_node_id_from_exec_id(neid) for neid in node_exec_ids}
-
-    node_execs = await get_node_executions(
-        graph_exec_id=graph_exec_id, include_exec_data=False
-    )
-    node_exec_map = {ne.node_exec_id: ne.node_id for ne in node_execs}
-
-    result = {}
-    for neid in node_exec_ids:
-        if neid in node_exec_map:
-            result[neid] = node_exec_map[neid]
-        else:
-            logger.error(
-                f"Failed to resolve node_id for {neid}: Node execution not found."
-            )
-    return result
-
-
 @router.get(
    "/pending",
    summary="Get Pending Reviews",
@@ -147,16 +110,14 @@ async def list_pending_reviews_for_execution(
    """

    # Verify user owns the graph execution before returning reviews
-    # (CoPilot synthetic IDs don't have graph execution records)
-    if not is_copilot_synthetic_id(graph_exec_id):
-        graph_exec = await get_graph_execution_meta(
-            user_id=user_id, execution_id=graph_exec_id
+    graph_exec = await get_graph_execution_meta(
+        user_id=user_id, execution_id=graph_exec_id
+    )
+    if not graph_exec:
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail=f"Graph execution #{graph_exec_id} not found",
        )
-        if not graph_exec:
-            raise HTTPException(
-                status_code=status.HTTP_404_NOT_FOUND,
-                detail=f"Graph execution #{graph_exec_id} not found",
-            )

    return await get_pending_reviews_for_execution(graph_exec_id, user_id)

@@ -199,26 +160,30 @@ async def process_review_action(
        )

    graph_exec_id = next(iter(graph_exec_ids))
-    is_copilot = is_copilot_synthetic_id(graph_exec_id)

-    # Validate execution status for graph executions (skip for CoPilot synthetic IDs)
-    if not is_copilot:
-        graph_exec_meta = await get_graph_execution_meta(
-            user_id=user_id, execution_id=graph_exec_id
+    # Validate execution status before processing reviews
+    graph_exec_meta = await get_graph_execution_meta(
+        user_id=user_id, execution_id=graph_exec_id
+    )
+
+    if not graph_exec_meta:
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail=f"Graph execution #{graph_exec_id} not found",
+        )
+
+    # Only allow processing reviews if execution is paused for review
+    # or incomplete (partial execution with some reviews already processed)
+    if graph_exec_meta.status not in (
+        ExecutionStatus.REVIEW,
+        ExecutionStatus.INCOMPLETE,
+    ):
+        raise HTTPException(
+            status_code=status.HTTP_409_CONFLICT,
+            detail=f"Cannot process reviews while execution status is {graph_exec_meta.status}. "
+            "Reviews can only be processed when execution is paused for review "
+            "(REVIEW) or partially processed (INCOMPLETE).",
        )
-        if not graph_exec_meta:
-            raise HTTPException(
-                status_code=status.HTTP_404_NOT_FOUND,
-                detail=f"Graph execution #{graph_exec_id} not found",
-            )
-        if graph_exec_meta.status not in (
-            ExecutionStatus.REVIEW,
-            ExecutionStatus.INCOMPLETE,
-        ):
-            raise HTTPException(
-                status_code=status.HTTP_409_CONFLICT,
-                detail=f"Cannot process reviews while execution status is {graph_exec_meta.status}",
-            )

    # Build review decisions map and track which reviews requested auto-approval
    # Auto-approved reviews use original data (no modifications allowed)
@@ -271,7 +236,7 @@ async def process_review_action(
            )
            return (node_id, False)

-    # Collect node_exec_ids that need auto-approval and resolve their node_ids
+    # Collect node_exec_ids that need auto-approval
    node_exec_ids_needing_auto_approval = [
        node_exec_id
        for node_exec_id, review_result in updated_reviews.items()
@@ -279,16 +244,29 @@ async def process_review_action(
        and auto_approve_requests.get(node_exec_id, False)
    ]

-    node_id_map = await _resolve_node_ids(
-        node_exec_ids_needing_auto_approval, graph_exec_id, is_copilot
-    )
-
-    # Deduplicate by node_id — one auto-approval per node
+    # Batch-fetch node executions to get node_ids
    nodes_needing_auto_approval: dict[str, Any] = {}
-    for node_exec_id in node_exec_ids_needing_auto_approval:
-        node_id = node_id_map.get(node_exec_id)
-        if node_id and node_id not in nodes_needing_auto_approval:
-            nodes_needing_auto_approval[node_id] = updated_reviews[node_exec_id]
+    if node_exec_ids_needing_auto_approval:
+        from backend.data.execution import get_node_executions
+
+        node_execs = await get_node_executions(
+            graph_exec_id=graph_exec_id, include_exec_data=False
+        )
+        node_exec_map = {node_exec.node_exec_id: node_exec for node_exec in node_execs}
+
+        for node_exec_id in node_exec_ids_needing_auto_approval:
+            node_exec = node_exec_map.get(node_exec_id)
+            if node_exec:
+                review_result = updated_reviews[node_exec_id]
+                # Use the first approved review for this node (deduplicate by node_id)
+                if node_exec.node_id not in nodes_needing_auto_approval:
+                    nodes_needing_auto_approval[node_exec.node_id] = review_result
+            else:
+                logger.error(
+                    f"Failed to create auto-approval record for {node_exec_id}: "
+                    "Node execution not found. This may indicate a race condition "
+                    "or data inconsistency."
+                )

    # Execute all auto-approval creations in parallel (deduplicated by node_id)
    auto_approval_results = await asyncio.gather(
@@ -303,11 +281,13 @@ async def process_review_action(
    auto_approval_failed_count = 0
    for result in auto_approval_results:
        if isinstance(result, Exception):
+            # Unexpected exception during auto-approval creation
            auto_approval_failed_count += 1
            logger.error(
                f"Unexpected exception during auto-approval creation: {result}"
            )
        elif isinstance(result, tuple) and len(result) == 2 and not result[1]:
+            # Auto-approval creation failed (returned False)
            auto_approval_failed_count += 1

    # Count results
@@ -322,20 +302,22 @@ async def process_review_action(
        if review.status == ReviewStatus.REJECTED
    )

-    # Resume graph execution only for real graph executions (not CoPilot)
-    # CoPilot sessions are resumed by the LLM retrying run_block with review_id
-    if not is_copilot and updated_reviews:
+    # Resume execution only if ALL pending reviews for this execution have been processed
+    if updated_reviews:
        still_has_pending = await has_pending_reviews_for_graph_exec(graph_exec_id)

        if not still_has_pending:
+            # Get the graph_id from any processed review
            first_review = next(iter(updated_reviews.values()))

            try:
+                # Fetch user and settings to build complete execution context
                user = await get_user_by_id(user_id)
                settings = await get_graph_settings(
                    user_id=user_id, graph_id=first_review.graph_id
                )

+                # Preserve user's timezone preference when resuming execution
                user_timezone = (
                    user.timezone if user.timezone != USER_TIMEZONE_NOT_SET else "UTC"
                )
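Note on the auto-approval change in review/routes.py above: the batch lookup plus first-wins deduplication by node_id can be sketched in isolation as follows. This is a minimal sketch under stated assumptions, not the backend code: `NodeExec` and `pick_reviews_per_node` are hypothetical names, and reviews are treated as opaque values.

```python
import logging
from dataclasses import dataclass
from typing import Any

logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class NodeExec:
    """Hypothetical stand-in for a node execution record."""

    node_exec_id: str
    node_id: str


def pick_reviews_per_node(
    node_exec_ids: list[str],
    node_execs: list[NodeExec],
    reviews: dict[str, Any],
) -> dict[str, Any]:
    """Map node_id -> first review seen for that node; log unknown exec IDs."""
    # Build the lookup once instead of querying per node_exec_id
    node_exec_map = {ne.node_exec_id: ne for ne in node_execs}
    per_node: dict[str, Any] = {}
    for node_exec_id in node_exec_ids:
        node_exec = node_exec_map.get(node_exec_id)
        if node_exec is None:
            logger.error("Node execution %s not found", node_exec_id)
            continue
        # setdefault keeps the first review for a node and ignores later ones
        per_node.setdefault(node_exec.node_id, reviews[node_exec_id])
    return per_node
```

With two executions of the same node, only the first review is kept, which matches the "one auto-approval per node" intent stated in the diff comments.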
--- a/autogpt_platform/backend/backend/api/features/library/db.py
+++ b/autogpt_platform/backend/backend/api/features/library/db.py
@@ -8,6 +8,7 @@ import prisma.errors
 import prisma.models
 import prisma.types

+import backend.api.features.store.exceptions as store_exceptions
 import backend.api.features.store.image_gen as store_image_gen
 import backend.api.features.store.media as store_media
 import backend.data.graph as graph_db
@@ -250,7 +251,7 @@ async def get_library_agent(id: str, user_id: str) -> library_model.LibraryAgent
        The requested LibraryAgent.

    Raises:
-        NotFoundError: If the specified agent does not exist.
+        AgentNotFoundError: If the specified agent does not exist.
        DatabaseError: If there's an error during retrieval.
    """
    library_agent = await prisma.models.LibraryAgent.prisma().find_first(
@@ -397,7 +398,6 @@ async def create_library_agent(
    hitl_safe_mode: bool = True,
    sensitive_action_safe_mode: bool = False,
    create_library_agents_for_sub_graphs: bool = True,
-    folder_id: str | None = None,
 ) -> list[library_model.LibraryAgent]:
    """
    Adds an agent to the user's library (LibraryAgent table).
@@ -414,18 +414,12 @@ async def create_library_agent(
        If the graph has sub-graphs, the parent graph will always be the first entry in the list.

    Raises:
-        NotFoundError: If the specified agent does not exist.
+        AgentNotFoundError: If the specified agent does not exist.
        DatabaseError: If there's an error during creation or if image generation fails.
    """
    logger.info(
        f"Creating library agent for graph #{graph.id} v{graph.version}; user:<redacted>"
    )
-
-    # Authorization: FK only checks existence, not ownership.
-    # Verify the folder belongs to this user to prevent cross-user nesting.
-    if folder_id:
-        await get_folder(folder_id, user_id)
-
    graph_entries = (
        [graph, *graph.sub_graphs] if create_library_agents_for_sub_graphs else [graph]
    )
@@ -438,6 +432,7 @@ async def create_library_agent(
                        isCreatedByUser=(user_id == user_id),
                        useGraphIsActiveVersion=True,
                        User={"connect": {"id": user_id}},
+                        # Creator={"connect": {"id": user_id}},
                        AgentGraph={
                            "connect": {
                                "graphVersionId": {
@@ -453,11 +448,6 @@ async def create_library_agent(
                                sensitive_action_safe_mode=sensitive_action_safe_mode,
                            ).model_dump()
                        ),
-                        **(
-                            {"Folder": {"connect": {"id": folder_id}}}
-                            if folder_id and graph_entry is graph
-                            else {}
-                        ),
                    ),
                    include=library_agent_include(
                        user_id, include_nodes=False, include_executions=False
@@ -539,7 +529,6 @@ async def update_agent_version_in_library(
 async def create_graph_in_library(
    graph: graph_db.Graph,
    user_id: str,
-    folder_id: str | None = None,
 ) -> tuple[graph_db.GraphModel, library_model.LibraryAgent]:
    """Create a new graph and add it to the user's library."""
    graph.version = 1
@@ -553,7 +542,6 @@ async def create_graph_in_library(
        user_id=user_id,
        sensitive_action_safe_mode=True,
        create_library_agents_for_sub_graphs=False,
-        folder_id=folder_id,
    )

    if created_graph.is_active:
@@ -829,7 +817,7 @@ async def add_store_agent_to_library(
        The newly created LibraryAgent if successfully added, the existing corresponding one if any.

    Raises:
-        NotFoundError: If the store listing or associated agent is not found.
+        AgentNotFoundError: If the store listing or associated agent is not found.
        DatabaseError: If there's an issue creating the LibraryAgent record.
    """
    logger.debug(
@@ -844,7 +832,7 @@ async def add_store_agent_to_library(
    )
    if not store_listing_version or not store_listing_version.AgentGraph:
        logger.warning(f"Store listing version not found: {store_listing_version_id}")
-        raise NotFoundError(
+        raise store_exceptions.AgentNotFoundError(
            f"Store listing version {store_listing_version_id} not found or invalid"
        )

@@ -858,7 +846,7 @@ async def add_store_agent_to_library(
        include_subgraphs=False,
    )
    if not graph_model:
-        raise NotFoundError(
+        raise store_exceptions.AgentNotFoundError(
            f"Graph #{graph.id} v{graph.version} not found or accessible"
        )

@@ -1493,67 +1481,6 @@ async def bulk_move_agents_to_folder(
    return [library_model.LibraryAgent.from_db(agent) for agent in agents]


-def collect_tree_ids(
-    nodes: list[library_model.LibraryFolderTree],
-    visited: set[str] | None = None,
-) -> list[str]:
-    """Collect all folder IDs from a folder tree."""
-    if visited is None:
-        visited = set()
-    ids: list[str] = []
-    for n in nodes:
-        if n.id in visited:
-            continue
-        visited.add(n.id)
-        ids.append(n.id)
-        ids.extend(collect_tree_ids(n.children, visited))
-    return ids
-
-
-async def get_folder_agent_summaries(
-    user_id: str, folder_id: str
-) -> list[dict[str, str | None]]:
-    """Get a lightweight list of agents in a folder (id, name, description)."""
-    all_agents: list[library_model.LibraryAgent] = []
-    for page in itertools.count(1):
-        resp = await list_library_agents(
-            user_id=user_id, folder_id=folder_id, page=page
-        )
-        all_agents.extend(resp.agents)
-        if page >= resp.pagination.total_pages:
-            break
-    return [
-        {"id": a.id, "name": a.name, "description": a.description} for a in all_agents
-    ]
-
-
-async def get_root_agent_summaries(
-    user_id: str,
-) -> list[dict[str, str | None]]:
-    """Get a lightweight list of root-level agents (folderId IS NULL)."""
-    all_agents: list[library_model.LibraryAgent] = []
-    for page in itertools.count(1):
-        resp = await list_library_agents(
-            user_id=user_id, include_root_only=True, page=page
-        )
-        all_agents.extend(resp.agents)
-        if page >= resp.pagination.total_pages:
-            break
-    return [
-        {"id": a.id, "name": a.name, "description": a.description} for a in all_agents
-    ]
-
-
-async def get_folder_agents_map(
-    user_id: str, folder_ids: list[str]
-) -> dict[str, list[dict[str, str | None]]]:
-    """Get agent summaries for multiple folders concurrently."""
-    results = await asyncio.gather(
-        *(get_folder_agent_summaries(user_id, fid) for fid in folder_ids)
-    )
-    return dict(zip(folder_ids, results))
-
-
 ##############################################
 ########### Presets DB Functions #############
 ##############################################
--- a/autogpt_platform/backend/backend/api/features/library/db_test.py
+++ b/autogpt_platform/backend/backend/api/features/library/db_test.py
@@ -4,6 +4,7 @@ import prisma.enums
 import prisma.models
 import pytest

+import backend.api.features.store.exceptions
 from backend.data.db import connect
 from backend.data.includes import library_agent_include

@@ -217,7 +218,7 @@ async def test_add_agent_to_library_not_found(mocker):
    )

    # Call function and verify exception
-    with pytest.raises(db.NotFoundError):
+    with pytest.raises(backend.api.features.store.exceptions.AgentNotFoundError):
        await db.add_store_agent_to_library("version123", "test-user")

    # Verify mock called correctly
--- a/autogpt_platform/backend/backend/api/features/library/model.py
+++ b/autogpt_platform/backend/backend/api/features/library/model.py
@@ -165,6 +165,7 @@ class LibraryAgent(pydantic.BaseModel):
    id: str
    graph_id: str
    graph_version: int
+    owner_user_id: str

    image_url: str | None

@@ -205,9 +206,7 @@ class LibraryAgent(pydantic.BaseModel):
        default_factory=list,
        description="List of recent executions with status, score, and summary",
    )
-    can_access_graph: bool = pydantic.Field(
-        description="Indicates whether the same user owns the corresponding graph"
-    )
+    can_access_graph: bool
    is_latest_version: bool
    is_favorite: bool
    folder_id: str | None = None
@@ -325,6 +324,7 @@ class LibraryAgent(pydantic.BaseModel):
            id=agent.id,
            graph_id=agent.agentGraphId,
            graph_version=agent.agentGraphVersion,
+            owner_user_id=agent.userId,
            image_url=agent.imageUrl,
            creator_name=creator_name,
            creator_image_url=creator_image_url,
--- a/autogpt_platform/backend/backend/api/features/library/routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes_test.py
@@ -42,6 +42,7 @@ async def test_get_library_agents_success(
                id="test-agent-1",
                graph_id="test-agent-1",
                graph_version=1,
+                owner_user_id=test_user_id,
                name="Test Agent 1",
                description="Test Description 1",
                image_url=None,
@@ -66,6 +67,7 @@ async def test_get_library_agents_success(
                id="test-agent-2",
                graph_id="test-agent-2",
                graph_version=1,
+                owner_user_id=test_user_id,
                name="Test Agent 2",
                description="Test Description 2",
                image_url=None,
@@ -129,6 +131,7 @@ async def test_get_favorite_library_agents_success(
                id="test-agent-1",
                graph_id="test-agent-1",
                graph_version=1,
+                owner_user_id=test_user_id,
                name="Favorite Agent 1",
                description="Test Favorite Description 1",
                image_url=None,
@@ -181,6 +184,7 @@ def test_add_agent_to_library_success(
        id="test-library-agent-id",
        graph_id="test-agent-1",
        graph_version=1,
+        owner_user_id=test_user_id,
        name="Test Agent 1",
        description="Test Description 1",
        image_url=None,
--- a/autogpt_platform/backend/backend/api/features/mcp/routes.py
+++ b/autogpt_platform/backend/backend/api/features/mcp/routes.py
@@ -24,7 +24,7 @@ from backend.blocks.mcp.oauth import MCPOAuthHandler
 from backend.data.model import OAuth2Credentials
 from backend.integrations.creds_manager import IntegrationCredentialsManager
 from backend.integrations.providers import ProviderName
-from backend.util.request import HTTPClientError, Requests, validate_url_host
+from backend.util.request import HTTPClientError, Requests, validate_url
 from backend.util.settings import Settings

 logger = logging.getLogger(__name__)
@@ -80,7 +80,7 @@ async def discover_tools(
    """
    # Validate URL to prevent SSRF — blocks loopback and private IP ranges.
    try:
-        await validate_url_host(request.server_url)
+        await validate_url(request.server_url, trusted_origins=[])
    except ValueError as e:
        raise fastapi.HTTPException(status_code=400, detail=f"Invalid server URL: {e}")

@@ -167,7 +167,7 @@ async def mcp_oauth_login(
    """
    # Validate URL to prevent SSRF — blocks loopback and private IP ranges.
    try:
-        await validate_url_host(request.server_url)
+        await validate_url(request.server_url, trusted_origins=[])
    except ValueError as e:
        raise fastapi.HTTPException(status_code=400, detail=f"Invalid server URL: {e}")

@@ -187,7 +187,7 @@ async def mcp_oauth_login(

        # Validate the auth server URL from metadata to prevent SSRF.
        try:
-            await validate_url_host(auth_server_url)
+            await validate_url(auth_server_url, trusted_origins=[])
        except ValueError as e:
            raise fastapi.HTTPException(
                status_code=400,
@@ -234,7 +234,7 @@ async def mcp_oauth_login(
    if registration_endpoint:
        # Validate the registration endpoint to prevent SSRF via metadata.
        try:
-            await validate_url_host(registration_endpoint)
+            await validate_url(registration_endpoint, trusted_origins=[])
        except ValueError:
            pass  # Skip registration, fall back to default client_id
        else:
@@ -429,7 +429,7 @@ async def mcp_store_token(

    # Validate URL to prevent SSRF — blocks loopback and private IP ranges.
    try:
-        await validate_url_host(request.server_url)
+        await validate_url(request.server_url, trusted_origins=[])
    except ValueError as e:
        raise fastapi.HTTPException(status_code=400, detail=f"Invalid server URL: {e}")

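Note on the hunks above: each one swaps `validate_url_host(...)` for `validate_url(..., trusted_origins=[])` at a point where a user-supplied or metadata-supplied URL is about to be fetched. The body of `validate_url` is not part of this diff, so what follows is only a minimal sketch of the kind of check the code comments describe (blocking loopback and private IP ranges), written with the standard library. The names `is_blocked_ip` and `reject_unsafe_literal_host` are hypothetical, and unlike a production validator this sketch does not resolve hostnames through DNS.

```python
import ipaddress
from urllib.parse import urlparse


def is_blocked_ip(address: str) -> bool:
    """Return True for loopback, private, link-local, and other non-public IPs."""
    ip = ipaddress.ip_address(address)  # raises ValueError if not an IP literal
    return (
        ip.is_loopback
        or ip.is_private
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def reject_unsafe_literal_host(url: str) -> None:
    """Raise ValueError if the scheme is unsupported or the host is a blocked IP literal.

    Hypothetical helper: it only inspects IP literals in the URL. A real SSRF
    guard would also resolve hostnames and check every resolved address.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    host = parsed.hostname
    if not host:
        raise ValueError("missing host")
    try:
        blocked = is_blocked_ip(host)
    except ValueError:
        # Host is a name, not an IP literal; nothing more this sketch can check.
        return
    if blocked:
        raise ValueError(f"blocked address: {host}")
```

As in the route handlers in the diff, a caller would wrap the check in `try`/`except ValueError` and translate the failure into an HTTP 400 response.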
--- a/autogpt_platform/backend/backend/api/features/mcp/test_routes.py
+++ b/autogpt_platform/backend/backend/api/features/mcp/test_routes.py
@@ -32,9 +32,9 @@ async def client():

 @pytest.fixture(autouse=True)
 def _bypass_ssrf_validation():
-    """Bypass validate_url_host in all route tests (test URLs don't resolve)."""
+    """Bypass validate_url in all route tests (test URLs don't resolve)."""
    with patch(
-        "backend.api.features.mcp.routes.validate_url_host",
+        "backend.api.features.mcp.routes.validate_url",
        new_callable=AsyncMock,
    ):
        yield
@@ -521,12 +521,12 @@ class TestStoreToken:


 class TestSSRFValidation:
-    """Verify that validate_url_host is enforced on all endpoints."""
+    """Verify that validate_url is enforced on all endpoints."""

    @pytest.mark.asyncio(loop_scope="session")
    async def test_discover_tools_ssrf_blocked(self, client):
        with patch(
-            "backend.api.features.mcp.routes.validate_url_host",
+            "backend.api.features.mcp.routes.validate_url",
            new_callable=AsyncMock,
            side_effect=ValueError("blocked loopback"),
        ):
@@ -541,7 +541,7 @@ class TestSSRFValidation:
    @pytest.mark.asyncio(loop_scope="session")
    async def test_oauth_login_ssrf_blocked(self, client):
        with patch(
-            "backend.api.features.mcp.routes.validate_url_host",
+            "backend.api.features.mcp.routes.validate_url",
            new_callable=AsyncMock,
            side_effect=ValueError("blocked private IP"),
        ):
@@ -556,7 +556,7 @@ class TestSSRFValidation:
    @pytest.mark.asyncio(loop_scope="session")
    async def test_store_token_ssrf_blocked(self, client):
        with patch(
-            "backend.api.features.mcp.routes.validate_url_host",
+            "backend.api.features.mcp.routes.validate_url",
            new_callable=AsyncMock,
            side_effect=ValueError("blocked loopback"),
        ):
--- a/autogpt_platform/backend/backend/api/features/store/cache.py
+++ b/autogpt_platform/backend/backend/api/features/store/cache.py
@@ -1,3 +1,5 @@
+from typing import Literal
+
 from backend.util.cache import cached

 from . import db as store_db
@@ -21,7 +23,7 @@ def clear_all_caches():
 async def _get_cached_store_agents(
    featured: bool,
    creator: str | None,
-    sorted_by: store_db.StoreAgentsSortOptions | None,
+    sorted_by: Literal["rating", "runs", "name", "updated_at"] | None,
    search_query: str | None,
    category: str | None,
    page: int,
@@ -55,7 +57,7 @@ async def _get_cached_agent_details(
 async def _get_cached_store_creators(
    featured: bool,
    search_query: str | None,
-    sorted_by: store_db.StoreCreatorsSortOptions | None,
+    sorted_by: Literal["agent_rating", "agent_runs", "num_agents"] | None,
    page: int,
    page_size: int,
 ):
@@ -73,4 +75,4 @@ async def _get_cached_store_creators(
 @cached(maxsize=100, ttl_seconds=300, shared_cache=True)
 async def _get_cached_creator_details(username: str):
    """Cached helper to get creator details."""
-    return await store_db.get_store_creator(username=username.lower())
+    return await store_db.get_store_creator_details(username=username.lower())
--- a/autogpt_platform/backend/backend/api/features/store/db.py
+++ b/autogpt_platform/backend/backend/api/features/store/db.py
--- a/autogpt_platform/backend/backend/api/features/store/db_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/db_test.py
@@ -26,7 +26,7 @@ async def test_get_store_agents(mocker):
    mock_agents = [
        prisma.models.StoreAgent(
            listing_id="test-id",
-            listing_version_id="version123",
+            storeListingVersionId="version123",
            slug="test-agent",
            agent_name="Test Agent",
            agent_video=None,
@@ -40,11 +40,11 @@ async def test_get_store_agents(mocker):
            runs=10,
            rating=4.5,
            versions=["1.0"],
-            graph_id="test-graph-id",
-            graph_versions=["1"],
+            agentGraphVersions=["1"],
+            agentGraphId="test-graph-id",
            updated_at=datetime.now(),
            is_available=False,
-            use_for_onboarding=False,
+            useForOnboarding=False,
        )
    ]

@@ -68,10 +68,10 @@ async def test_get_store_agents(mocker):

 @pytest.mark.asyncio(loop_scope="session")
 async def test_get_store_agent_details(mocker):
-    # Mock data - StoreAgent view already contains the active version data
+    # Mock data
    mock_agent = prisma.models.StoreAgent(
        listing_id="test-id",
-        listing_version_id="version123",
+        storeListingVersionId="version123",
        slug="test-agent",
        agent_name="Test Agent",
        agent_video="video.mp4",
@@ -85,38 +85,102 @@ async def test_get_store_agent_details(mocker):
        runs=10,
        rating=4.5,
        versions=["1.0"],
-        graph_id="test-graph-id",
-        graph_versions=["1"],
+        agentGraphVersions=["1"],
+        agentGraphId="test-graph-id",
        updated_at=datetime.now(),
-        is_available=True,
-        use_for_onboarding=False,
+        is_available=False,
+        useForOnboarding=False,
    )

-    # Mock StoreAgent prisma call
+    # Mock active version agent (what we want to return for active version)
+    mock_active_agent = prisma.models.StoreAgent(
+        listing_id="test-id",
+        storeListingVersionId="active-version-id",
+        slug="test-agent",
+        agent_name="Test Agent Active",
+        agent_video="active_video.mp4",
+        agent_image=["active_image.jpg"],
+        featured=False,
+        creator_username="creator",
+        creator_avatar="avatar.jpg",
+        sub_heading="Test heading active",
+        description="Test description active",
+        categories=["test"],
+        runs=15,
+        rating=4.8,
+        versions=["1.0", "2.0"],
+        agentGraphVersions=["1", "2"],
+        agentGraphId="test-graph-id-active",
+        updated_at=datetime.now(),
+        is_available=True,
+        useForOnboarding=False,
+    )
+
+    # Create a mock StoreListing result
+    mock_store_listing = mocker.MagicMock()
+    mock_store_listing.activeVersionId = "active-version-id"
+    mock_store_listing.hasApprovedVersion = True
+    mock_store_listing.ActiveVersion = mocker.MagicMock()
+    mock_store_listing.ActiveVersion.recommendedScheduleCron = None
+
+    # Mock StoreAgent prisma call - need to handle multiple calls
    mock_store_agent = mocker.patch("prisma.models.StoreAgent.prisma")
-    mock_store_agent.return_value.find_first = mocker.AsyncMock(return_value=mock_agent)
+
+    # Set up side_effect to return different results for different calls
+    def mock_find_first_side_effect(*args, **kwargs):
+        where_clause = kwargs.get("where", {})
+        if "storeListingVersionId" in where_clause:
+            # Second call for active version
+            return mock_active_agent
+        else:
+            # First call for initial lookup
+            return mock_agent
+
+    mock_store_agent.return_value.find_first = mocker.AsyncMock(
+        side_effect=mock_find_first_side_effect
+    )
+
+    # Mock Profile prisma call
+    mock_profile = mocker.MagicMock()
+    mock_profile.userId = "user-id-123"
+    mock_profile_db = mocker.patch("prisma.models.Profile.prisma")
+    mock_profile_db.return_value.find_first = mocker.AsyncMock(
+        return_value=mock_profile
+    )
+
+    # Mock StoreListing prisma call
+    mock_store_listing_db = mocker.patch("prisma.models.StoreListing.prisma")
+    mock_store_listing_db.return_value.find_first = mocker.AsyncMock(
+        return_value=mock_store_listing
+    )

    # Call function
    result = await db.get_store_agent_details("creator", "test-agent")

-    # Verify results - constructed from the StoreAgent view
+    # Verify results - should use active version data
    assert result.slug == "test-agent"
-    assert result.agent_name == "Test Agent"
-    assert result.active_version_id == "version123"
+    assert result.agent_name == "Test Agent Active"  # From active version
+    assert result.active_version_id == "active-version-id"
    assert result.has_approved_version is True
-    assert result.store_listing_version_id == "version123"
-    assert result.graph_id == "test-graph-id"
-    assert result.runs == 10
-    assert result.rating == 4.5
+    assert (
+        result.store_listing_version_id == "active-version-id"
+    )  # Should be active version ID

-    # Verify single StoreAgent lookup
-    mock_store_agent.return_value.find_first.assert_called_once_with(
+    # Verify mocks called correctly - now expecting 2 calls
+    assert mock_store_agent.return_value.find_first.call_count == 2
+
+    # Check the specific calls
+    calls = mock_store_agent.return_value.find_first.call_args_list
+    assert calls[0] == mocker.call(
        where={"creator_username": "creator", "slug": "test-agent"}
    )
+    assert calls[1] == mocker.call(where={"storeListingVersionId": "active-version-id"})
+
+    mock_store_listing_db.return_value.find_first.assert_called_once()


 @pytest.mark.asyncio(loop_scope="session")
-async def test_get_store_creator(mocker):
+async def test_get_store_creator_details(mocker):
    # Mock data
    mock_creator_data = prisma.models.Creator(
        name="Test Creator",
@@ -138,7 +202,7 @@ async def test_get_store_creator(mocker):
    mock_creator.return_value.find_unique.return_value = mock_creator_data

    # Call function
-    result = await db.get_store_creator("creator")
+    result = await db.get_store_creator_details("creator")

    # Verify results
    assert result.username == "creator"
@@ -154,110 +218,61 @@ async def test_get_store_creator(mocker):

 @pytest.mark.asyncio(loop_scope="session")
 async def test_create_store_submission(mocker):
-    now = datetime.now()
-
-    # Mock agent graph (with no pending submissions) and user with profile
-    mock_profile = prisma.models.Profile(
-        id="profile-id",
-        userId="user-id",
-        name="Test User",
-        username="testuser",
-        description="Test",
-        isFeatured=False,
-        links=[],
-        createdAt=now,
-        updatedAt=now,
-    )
-    mock_user = prisma.models.User(
-        id="user-id",
-        email="test@example.com",
-        createdAt=now,
-        updatedAt=now,
-        Profile=[mock_profile],
-        emailVerified=True,
-        metadata="{}",  # type: ignore[reportArgumentType]
-        integrations="",
-        maxEmailsPerDay=1,
-        notifyOnAgentRun=True,
-        notifyOnZeroBalance=True,
-        notifyOnLowBalance=True,
-        notifyOnBlockExecutionFailed=True,
-        notifyOnContinuousAgentError=True,
-        notifyOnDailySummary=True,
-        notifyOnWeeklySummary=True,
-        notifyOnMonthlySummary=True,
-        notifyOnAgentApproved=True,
-        notifyOnAgentRejected=True,
-        timezone="Europe/Delft",
-    )
+    # Mock data
    mock_agent = prisma.models.AgentGraph(
        id="agent-id",
        version=1,
        userId="user-id",
-        createdAt=now,
+        createdAt=datetime.now(),
        isActive=True,
-        StoreListingVersions=[],
-        User=mock_user,
    )

-    # Mock the created StoreListingVersion (returned by create)
-    mock_store_listing_obj = prisma.models.StoreListing(
+    mock_listing = prisma.models.StoreListing(
        id="listing-id",
-        createdAt=now,
-        updatedAt=now,
+        createdAt=datetime.now(),
+        updatedAt=datetime.now(),
        isDeleted=False,
        hasApprovedVersion=False,
        slug="test-agent",
        agentGraphId="agent-id",
-        owningUserId="user-id",
-        useForOnboarding=False,
-    )
-    mock_version = prisma.models.StoreListingVersion(
-        id="version-id",
-        agentGraphId="agent-id",
        agentGraphVersion=1,
-        name="Test Agent",
-        description="Test description",
-        createdAt=now,
-        updatedAt=now,
-        subHeading="",
-        imageUrls=[],
-        categories=[],
-        isFeatured=False,
-        isDeleted=False,
-        version=1,
-        storeListingId="listing-id",
-        submissionStatus=prisma.enums.SubmissionStatus.PENDING,
-        isAvailable=True,
-        submittedAt=now,
-        StoreListing=mock_store_listing_obj,
+        owningUserId="user-id",
+        Versions=[
+            prisma.models.StoreListingVersion(
+                id="version-id",
+                agentGraphId="agent-id",
+                agentGraphVersion=1,
+                name="Test Agent",
+                description="Test description",
+                createdAt=datetime.now(),
+                updatedAt=datetime.now(),
+                subHeading="Test heading",
+                imageUrls=["image.jpg"],
+                categories=["test"],
+                isFeatured=False,
+                isDeleted=False,
+                version=1,
+                storeListingId="listing-id",
+                submissionStatus=prisma.enums.SubmissionStatus.PENDING,
+                isAvailable=True,
+            )
+        ],
+        useForOnboarding=False,
    )

    # Mock prisma calls
    mock_agent_graph = mocker.patch("prisma.models.AgentGraph.prisma")
    mock_agent_graph.return_value.find_first = mocker.AsyncMock(return_value=mock_agent)

-    # Mock transaction context manager
-    mock_tx = mocker.MagicMock()
-    mocker.patch(
-        "backend.api.features.store.db.transaction",
-        return_value=mocker.AsyncMock(
-            __aenter__=mocker.AsyncMock(return_value=mock_tx),
-            __aexit__=mocker.AsyncMock(return_value=False),
-        ),
-    )
-
-    mock_sl = mocker.patch("prisma.models.StoreListing.prisma")
-    mock_sl.return_value.find_unique = mocker.AsyncMock(return_value=None)
-
-    mock_slv = mocker.patch("prisma.models.StoreListingVersion.prisma")
-    mock_slv.return_value.create = mocker.AsyncMock(return_value=mock_version)
+    mock_store_listing = mocker.patch("prisma.models.StoreListing.prisma")
+    mock_store_listing.return_value.find_first = mocker.AsyncMock(return_value=None)
+    mock_store_listing.return_value.create = mocker.AsyncMock(return_value=mock_listing)

    # Call function
    result = await db.create_store_submission(
        user_id="user-id",
-        graph_id="agent-id",
-        graph_version=1,
+        agent_id="agent-id",
+        agent_version=1,
        slug="test-agent",
        name="Test Agent",
        description="Test description",
@@ -266,11 +281,11 @@ async def test_create_store_submission(mocker):
    # Verify results
    assert result.name == "Test Agent"
    assert result.description == "Test description"
-    assert result.listing_version_id == "version-id"
+    assert result.store_listing_version_id == "version-id"

    # Verify mocks called correctly
    mock_agent_graph.return_value.find_first.assert_called_once()
-    mock_slv.return_value.create.assert_called_once()
+    mock_store_listing.return_value.create.assert_called_once()


 @pytest.mark.asyncio(loop_scope="session")
@@ -303,6 +318,7 @@ async def test_update_profile(mocker):
        description="Test description",
        links=["link1"],
        avatar_url="avatar.jpg",
+        is_featured=False,
    )

    # Call function
@@ -373,7 +389,7 @@ async def test_get_store_agents_with_search_and_filters_parameterized():
        creators=["creator1'; DROP TABLE Users; --", "creator2"],
        category="AI'; DELETE FROM StoreAgent; --",
        featured=True,
-        sorted_by=db.StoreAgentsSortOptions.RATING,
+        sorted_by="rating",
        page=1,
        page_size=20,
    )
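Note on `test_get_store_agent_details` above: the test gives one `AsyncMock` a `side_effect` function so that the same `find_first` attribute returns different rows depending on the `where` clause, then checks both recorded calls. The sketch below isolates that pattern with the standard library only. It uses `unittest.mock.call` in place of pytest-mock's `mocker.call`, and the helper name `make_find_first` and the placeholder return values are assumptions made for illustration.

```python
import asyncio
from unittest.mock import AsyncMock, call


def make_find_first(initial, active):
    """Build an AsyncMock whose result depends on the `where` keyword argument."""

    def side_effect(*args, **kwargs):
        where = kwargs.get("where", {})
        if "storeListingVersionId" in where:
            # Lookup by version ID: return the active-version row.
            return active
        # Any other lookup: return the initial row.
        return initial

    return AsyncMock(side_effect=side_effect)


async def demo():
    find_first = make_find_first("initial-row", "active-row")
    first = await find_first(
        where={"creator_username": "creator", "slug": "test-agent"}
    )
    second = await find_first(where={"storeListingVersionId": "active-version-id"})
    return first, second, find_first.call_args_list


first, second, calls = asyncio.run(demo())
```

Dispatching on the arguments rather than on call order keeps the mock correct even if the function under test later reorders its two queries.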
--- a/autogpt_platform/backend/backend/api/features/store/exceptions.py
+++ b/autogpt_platform/backend/backend/api/features/store/exceptions.py
@@ -57,6 +57,12 @@ class StoreError(ValueError):
    pass


+class AgentNotFoundError(NotFoundError):
+    """Raised when an agent is not found"""
+
+    pass
+
+
 class CreatorNotFoundError(NotFoundError):
    """Raised when a creator is not found"""

--- a/autogpt_platform/backend/backend/api/features/store/hybrid_search.py
+++ b/autogpt_platform/backend/backend/api/features/store/hybrid_search.py
@@ -568,7 +568,7 @@ async def hybrid_search(
            SELECT uce."contentId" as "storeListingVersionId"
            FROM {{schema_prefix}}"UnifiedContentEmbedding" uce
            INNER JOIN {{schema_prefix}}"StoreAgent" sa
-                ON uce."contentId" = sa.listing_version_id
+                ON uce."contentId" = sa."storeListingVersionId"
            WHERE uce."contentType" = 'STORE_AGENT'::{{schema_prefix}}"ContentType"
            AND uce."userId" IS NULL
            AND uce.search @@ plainto_tsquery('english', {query_param})
@@ -582,7 +582,7 @@ async def hybrid_search(
                SELECT uce."contentId", uce.embedding
                FROM {{schema_prefix}}"UnifiedContentEmbedding" uce
                INNER JOIN {{schema_prefix}}"StoreAgent" sa
-                    ON uce."contentId" = sa.listing_version_id
+                    ON uce."contentId" = sa."storeListingVersionId"
                WHERE uce."contentType" = 'STORE_AGENT'::{{schema_prefix}}"ContentType"
                AND uce."userId" IS NULL
                AND {where_clause}
@@ -605,7 +605,7 @@ async def hybrid_search(
                sa.featured,
                sa.is_available,
                sa.updated_at,
-                sa.graph_id,
+                sa."agentGraphId",
                -- Searchable text for BM25 reranking
                COALESCE(sa.agent_name, '') || ' ' || COALESCE(sa.sub_heading, '') || ' ' || COALESCE(sa.description, '') as searchable_text,
                -- Semantic score
@@ -627,9 +627,9 @@ async def hybrid_search(
                sa.runs as popularity_raw
            FROM candidates c
            INNER JOIN {{schema_prefix}}"StoreAgent" sa
-                ON c."storeListingVersionId" = sa.listing_version_id
+                ON c."storeListingVersionId" = sa."storeListingVersionId"
            INNER JOIN {{schema_prefix}}"UnifiedContentEmbedding" uce
-                ON sa.listing_version_id = uce."contentId"
+                ON sa."storeListingVersionId" = uce."contentId"
                AND uce."contentType" = 'STORE_AGENT'::{{schema_prefix}}"ContentType"
        ),
        max_vals AS (
@@ -665,7 +665,7 @@ async def hybrid_search(
                featured,
                is_available,
                updated_at,
-                graph_id,
+                "agentGraphId",
                searchable_text,
                semantic_score,
                lexical_score,
--- a/autogpt_platform/backend/backend/api/features/store/model.py
+++ b/autogpt_platform/backend/backend/api/features/store/model.py
@@ -1,14 +1,11 @@
 import datetime
-from typing import TYPE_CHECKING, List, Self
+from typing import List

 import prisma.enums
 import pydantic

 from backend.util.models import Pagination

-if TYPE_CHECKING:
-    import prisma.models
-

 class ChangelogEntry(pydantic.BaseModel):
    version: str
@@ -16,9 +13,9 @@ class ChangelogEntry(pydantic.BaseModel):
    date: datetime.datetime


-class MyUnpublishedAgent(pydantic.BaseModel):
-    graph_id: str
-    graph_version: int
+class MyAgent(pydantic.BaseModel):
+    agent_id: str
+    agent_version: int
    agent_name: str
    agent_image: str | None = None
    description: str
@@ -26,8 +23,8 @@ class MyUnpublishedAgent(pydantic.BaseModel):
    recommended_schedule_cron: str | None = None


-class MyUnpublishedAgentsResponse(pydantic.BaseModel):
-    agents: list[MyUnpublishedAgent]
+class MyAgentsResponse(pydantic.BaseModel):
+    agents: list[MyAgent]
    pagination: Pagination


@@ -43,21 +40,6 @@ class StoreAgent(pydantic.BaseModel):
    rating: float
    agent_graph_id: str

-    @classmethod
-    def from_db(cls, agent: "prisma.models.StoreAgent") -> "StoreAgent":
-        return cls(
-            slug=agent.slug,
-            agent_name=agent.agent_name,
-            agent_image=agent.agent_image[0] if agent.agent_image else "",
-            creator=agent.creator_username or "Needs Profile",
-            creator_avatar=agent.creator_avatar or "",
-            sub_heading=agent.sub_heading,
-            description=agent.description,
-            runs=agent.runs,
-            rating=agent.rating,
-            agent_graph_id=agent.graph_id,
-        )
-

 class StoreAgentsResponse(pydantic.BaseModel):
    agents: list[StoreAgent]
@@ -80,192 +62,81 @@ class StoreAgentDetails(pydantic.BaseModel):
    runs: int
    rating: float
    versions: list[str]
-    graph_id: str
-    graph_versions: list[str]
+    agentGraphVersions: list[str]
+    agentGraphId: str
    last_updated: datetime.datetime
    recommended_schedule_cron: str | None = None

-    active_version_id: str
-    has_approved_version: bool
+    active_version_id: str | None = None
+    has_approved_version: bool = False

    # Optional changelog data when include_changelog=True
    changelog: list[ChangelogEntry] | None = None

-    @classmethod
-    def from_db(cls, agent: "prisma.models.StoreAgent") -> "StoreAgentDetails":
-        return cls(
-            store_listing_version_id=agent.listing_version_id,
-            slug=agent.slug,
-            agent_name=agent.agent_name,
-            agent_video=agent.agent_video or "",
-            agent_output_demo=agent.agent_output_demo or "",
-            agent_image=agent.agent_image,
-            creator=agent.creator_username or "",
-            creator_avatar=agent.creator_avatar or "",
-            sub_heading=agent.sub_heading,
-            description=agent.description,
-            categories=agent.categories,
-            runs=agent.runs,
-            rating=agent.rating,
-            versions=agent.versions,
-            graph_id=agent.graph_id,
-            graph_versions=agent.graph_versions,
-            last_updated=agent.updated_at,
-            recommended_schedule_cron=agent.recommended_schedule_cron,
-            active_version_id=agent.listing_version_id,
-            has_approved_version=True,  # StoreAgent view only has approved agents
-        )

-
-class Profile(pydantic.BaseModel):
-    """Marketplace user profile (only attributes that the user can update)"""
-
-    username: str
+class Creator(pydantic.BaseModel):
    name: str
+    username: str
    description: str
-    avatar_url: str | None
-    links: list[str]
-
-
-class ProfileDetails(Profile):
-    """Marketplace user profile (including read-only fields)"""
-
-    is_featured: bool
-
-    @classmethod
-    def from_db(cls, profile: "prisma.models.Profile") -> "ProfileDetails":
-        return cls(
-            name=profile.name,
-            username=profile.username,
-            avatar_url=profile.avatarUrl,
-            description=profile.description,
-            links=profile.links,
-            is_featured=profile.isFeatured,
-        )
-
-
-class CreatorDetails(ProfileDetails):
-    """Marketplace creator profile details, including aggregated stats"""
-
+    avatar_url: str
    num_agents: int
-    agent_runs: int
    agent_rating: float
-    top_categories: list[str]
-
-    @classmethod
-    def from_db(cls, creator: "prisma.models.Creator") -> "CreatorDetails":  # type: ignore[override]
-        return cls(
-            name=creator.name,
-            username=creator.username,
-            avatar_url=creator.avatar_url,
-            description=creator.description,
-            links=creator.links,
-            is_featured=creator.is_featured,
-            num_agents=creator.num_agents,
-            agent_runs=creator.agent_runs,
-            agent_rating=creator.agent_rating,
-            top_categories=creator.top_categories,
-        )
+    agent_runs: int
+    is_featured: bool


 class CreatorsResponse(pydantic.BaseModel):
-    creators: List[CreatorDetails]
+    creators: List[Creator]
    pagination: Pagination


-class StoreSubmission(pydantic.BaseModel):
-    # From StoreListing:
-    listing_id: str
-    user_id: str
-    slug: str
+class CreatorDetails(pydantic.BaseModel):
+    name: str
+    username: str
+    description: str
+    links: list[str]
+    avatar_url: str
+    agent_rating: float
+    agent_runs: int
+    top_categories: list[str]

-    # From StoreListingVersion:
-    listing_version_id: str
-    listing_version: int
-    graph_id: str
-    graph_version: int
+
+class Profile(pydantic.BaseModel):
+    name: str
+    username: str
+    description: str
+    links: list[str]
+    avatar_url: str
+    is_featured: bool = False
+
+
+class StoreSubmission(pydantic.BaseModel):
+    listing_id: str
+    agent_id: str
+    agent_version: int
    name: str
    sub_heading: str
+    slug: str
    description: str
-    instructions: str | None
-    categories: list[str]
+    instructions: str | None = None
    image_urls: list[str]
-    video_url: str | None
-    agent_output_demo_url: str | None
-
-    submitted_at: datetime.datetime | None
-    changes_summary: str | None
+    date_submitted: datetime.datetime
    status: prisma.enums.SubmissionStatus
-    reviewed_at: datetime.datetime | None = None
+    runs: int
+    rating: float
+    store_listing_version_id: str | None = None
+    version: int | None = None  # Actual version number from the database
+
    reviewer_id: str | None = None
    review_comments: str | None = None  # External comments visible to creator
+    internal_comments: str | None = None  # Private notes for admin use only
+    reviewed_at: datetime.datetime | None = None
+    changes_summary: str | None = None

-    # Aggregated from AgentGraphExecutions and StoreListingReviews:
-    run_count: int = 0
-    review_count: int = 0
-    review_avg_rating: float = 0.0
-
-    @classmethod
-    def from_db(cls, _sub: "prisma.models.StoreSubmission") -> Self:
-        """Construct from the StoreSubmission Prisma view."""
-        return cls(
-            listing_id=_sub.listing_id,
-            user_id=_sub.user_id,
-            slug=_sub.slug,
-            listing_version_id=_sub.listing_version_id,
-            listing_version=_sub.listing_version,
-            graph_id=_sub.graph_id,
-            graph_version=_sub.graph_version,
-            name=_sub.name,
-            sub_heading=_sub.sub_heading,
-            description=_sub.description,
-            instructions=_sub.instructions,
-            categories=_sub.categories,
-            image_urls=_sub.image_urls,
-            video_url=_sub.video_url,
-            agent_output_demo_url=_sub.agent_output_demo_url,
-            submitted_at=_sub.submitted_at,
-            changes_summary=_sub.changes_summary,
-            status=_sub.status,
-            reviewed_at=_sub.reviewed_at,
-            reviewer_id=_sub.reviewer_id,
-            review_comments=_sub.review_comments,
-            run_count=_sub.run_count,
-            review_count=_sub.review_count,
-            review_avg_rating=_sub.review_avg_rating,
-        )
-
-    @classmethod
-    def from_listing_version(cls, _lv: "prisma.models.StoreListingVersion") -> Self:
-        """
-        Construct from the StoreListingVersion Prisma model (with StoreListing included)
-        """
-        if not (_l := _lv.StoreListing):
-            raise ValueError("StoreListingVersion must have included StoreListing")
-
-        return cls(
-            listing_id=_l.id,
-            user_id=_l.owningUserId,
-            slug=_l.slug,
-            listing_version_id=_lv.id,
-            listing_version=_lv.version,
-            graph_id=_lv.agentGraphId,
-            graph_version=_lv.agentGraphVersion,
-            name=_lv.name,
-            sub_heading=_lv.subHeading,
-            description=_lv.description,
-            instructions=_lv.instructions,
-            categories=_lv.categories,
-            image_urls=_lv.imageUrls,
-            video_url=_lv.videoUrl,
-            agent_output_demo_url=_lv.agentOutputDemoUrl,
-            submitted_at=_lv.submittedAt,
-            changes_summary=_lv.changesSummary,
-            status=_lv.submissionStatus,
-            reviewed_at=_lv.reviewedAt,
-            reviewer_id=_lv.reviewerId,
-            review_comments=_lv.reviewComments,
-        )
+    # Additional fields for editing
+    video_url: str | None = None
+    agent_output_demo_url: str | None = None
+    categories: list[str] = []


 class StoreSubmissionsResponse(pydantic.BaseModel):
@@ -273,12 +144,33 @@ class StoreSubmissionsResponse(pydantic.BaseModel):
    pagination: Pagination


+class StoreListingWithVersions(pydantic.BaseModel):
+    """A store listing with its version history"""
+
+    listing_id: str
+    slug: str
+    agent_id: str
+    agent_version: int
+    active_version_id: str | None = None
+    has_approved_version: bool = False
+    creator_email: str | None = None
+    latest_version: StoreSubmission | None = None
+    versions: list[StoreSubmission] = []
+
+
+class StoreListingsWithVersionsResponse(pydantic.BaseModel):
+    """Response model for listings with version history"""
+
+    listings: list[StoreListingWithVersions]
+    pagination: Pagination
+
+
 class StoreSubmissionRequest(pydantic.BaseModel):
-    graph_id: str = pydantic.Field(
-        ..., min_length=1, description="Graph ID cannot be empty"
+    agent_id: str = pydantic.Field(
+        ..., min_length=1, description="Agent ID cannot be empty"
    )
-    graph_version: int = pydantic.Field(
-        ..., gt=0, description="Graph version must be greater than 0"
+    agent_version: int = pydantic.Field(
+        ..., gt=0, description="Agent version must be greater than 0"
    )
    slug: str
    name: str
@@ -306,42 +198,12 @@ class StoreSubmissionEditRequest(pydantic.BaseModel):
    recommended_schedule_cron: str | None = None


-class StoreSubmissionAdminView(StoreSubmission):
-    internal_comments: str | None  # Private admin notes
-
-    @classmethod
-    def from_db(cls, _sub: "prisma.models.StoreSubmission") -> Self:
-        return cls(
-            **StoreSubmission.from_db(_sub).model_dump(),
-            internal_comments=_sub.internal_comments,
-        )
-
-    @classmethod
-    def from_listing_version(cls, _lv: "prisma.models.StoreListingVersion") -> Self:
-        return cls(
-            **StoreSubmission.from_listing_version(_lv).model_dump(),
-            internal_comments=_lv.internalComments,
-        )
-
-
-class StoreListingWithVersionsAdminView(pydantic.BaseModel):
-    """A store listing with its version history"""
-
-    listing_id: str
-    graph_id: str
-    slug: str
-    active_listing_version_id: str | None = None
-    has_approved_version: bool = False
-    creator_email: str | None = None
-    latest_version: StoreSubmissionAdminView | None = None
-    versions: list[StoreSubmissionAdminView] = []
-
-
-class StoreListingsWithVersionsAdminViewResponse(pydantic.BaseModel):
-    """Response model for listings with version history"""
-
-    listings: list[StoreListingWithVersionsAdminView]
-    pagination: Pagination
+class ProfileDetails(pydantic.BaseModel):
+    name: str
+    username: str
+    description: str
+    links: list[str]
+    avatar_url: str | None = None


 class StoreReview(pydantic.BaseModel):
--- a/autogpt_platform/backend/backend/api/features/store/model_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/model_test.py
@@ -0,0 +1,203 @@
+import datetime
+
+import prisma.enums
+
+from . import model as store_model
+
+
+def test_pagination():
+    pagination = store_model.Pagination(
+        total_items=100, total_pages=5, current_page=2, page_size=20
+    )
+    assert pagination.total_items == 100
+    assert pagination.total_pages == 5
+    assert pagination.current_page == 2
+    assert pagination.page_size == 20
+
+
+def test_store_agent():
+    agent = store_model.StoreAgent(
+        slug="test-agent",
+        agent_name="Test Agent",
+        agent_image="test.jpg",
+        creator="creator1",
+        creator_avatar="avatar.jpg",
+        sub_heading="Test subheading",
+        description="Test description",
+        runs=50,
+        rating=4.5,
+        agent_graph_id="test-graph-id",
+    )
+    assert agent.slug == "test-agent"
+    assert agent.agent_name == "Test Agent"
+    assert agent.runs == 50
+    assert agent.rating == 4.5
+    assert agent.agent_graph_id == "test-graph-id"
+
+
+def test_store_agents_response():
+    response = store_model.StoreAgentsResponse(
+        agents=[
+            store_model.StoreAgent(
+                slug="test-agent",
+                agent_name="Test Agent",
+                agent_image="test.jpg",
+                creator="creator1",
+                creator_avatar="avatar.jpg",
+                sub_heading="Test subheading",
+                description="Test description",
+                runs=50,
+                rating=4.5,
+                agent_graph_id="test-graph-id",
+            )
+        ],
+        pagination=store_model.Pagination(
+            total_items=1, total_pages=1, current_page=1, page_size=20
+        ),
+    )
+    assert len(response.agents) == 1
+    assert response.pagination.total_items == 1
+
+
+def test_store_agent_details():
+    details = store_model.StoreAgentDetails(
+        store_listing_version_id="version123",
+        slug="test-agent",
+        agent_name="Test Agent",
+        agent_video="video.mp4",
+        agent_output_demo="demo.mp4",
+        agent_image=["image1.jpg", "image2.jpg"],
+        creator="creator1",
+        creator_avatar="avatar.jpg",
+        sub_heading="Test subheading",
+        description="Test description",
+        categories=["cat1", "cat2"],
+        runs=50,
+        rating=4.5,
+        versions=["1.0", "2.0"],
+        agentGraphVersions=["1", "2"],
+        agentGraphId="test-graph-id",
+        last_updated=datetime.datetime.now(),
+    )
+    assert details.slug == "test-agent"
+    assert len(details.agent_image) == 2
+    assert len(details.categories) == 2
+    assert len(details.versions) == 2
+
+
+def test_creator():
+    creator = store_model.Creator(
+        agent_rating=4.8,
+        agent_runs=1000,
+        name="Test Creator",
+        username="creator1",
+        description="Test description",
+        avatar_url="avatar.jpg",
+        num_agents=5,
+        is_featured=False,
+    )
+    assert creator.name == "Test Creator"
+    assert creator.num_agents == 5
+
+
+def test_creators_response():
+    response = store_model.CreatorsResponse(
+        creators=[
+            store_model.Creator(
+                agent_rating=4.8,
+                agent_runs=1000,
+                name="Test Creator",
+                username="creator1",
+                description="Test description",
+                avatar_url="avatar.jpg",
+                num_agents=5,
+                is_featured=False,
+            )
+        ],
+        pagination=store_model.Pagination(
+            total_items=1, total_pages=1, current_page=1, page_size=20
+        ),
+    )
+    assert len(response.creators) == 1
+    assert response.pagination.total_items == 1
+
+
+def test_creator_details():
+    details = store_model.CreatorDetails(
+        name="Test Creator",
+        username="creator1",
+        description="Test description",
+        links=["link1.com", "link2.com"],
+        avatar_url="avatar.jpg",
+        agent_rating=4.8,
+        agent_runs=1000,
+        top_categories=["cat1", "cat2"],
+    )
+    assert details.name == "Test Creator"
+    assert len(details.links) == 2
+    assert details.agent_rating == 4.8
+    assert len(details.top_categories) == 2
+
+
+def test_store_submission():
+    submission = store_model.StoreSubmission(
+        listing_id="listing123",
+        agent_id="agent123",
+        agent_version=1,
+        sub_heading="Test subheading",
+        name="Test Agent",
+        slug="test-agent",
+        description="Test description",
+        image_urls=["image1.jpg", "image2.jpg"],
+        date_submitted=datetime.datetime(2023, 1, 1),
+        status=prisma.enums.SubmissionStatus.PENDING,
+        runs=50,
+        rating=4.5,
+    )
+    assert submission.name == "Test Agent"
+    assert len(submission.image_urls) == 2
+    assert submission.status == prisma.enums.SubmissionStatus.PENDING
+
+
+def test_store_submissions_response():
+    response = store_model.StoreSubmissionsResponse(
+        submissions=[
+            store_model.StoreSubmission(
+                listing_id="listing123",
+                agent_id="agent123",
+                agent_version=1,
+                sub_heading="Test subheading",
+                name="Test Agent",
+                slug="test-agent",
+                description="Test description",
+                image_urls=["image1.jpg"],
+                date_submitted=datetime.datetime(2023, 1, 1),
+                status=prisma.enums.SubmissionStatus.PENDING,
+                runs=50,
+                rating=4.5,
+            )
+        ],
+        pagination=store_model.Pagination(
+            total_items=1, total_pages=1, current_page=1, page_size=20
+        ),
+    )
+    assert len(response.submissions) == 1
+    assert response.pagination.total_items == 1
+
+
+def test_store_submission_request():
+    request = store_model.StoreSubmissionRequest(
+        agent_id="agent123",
+        agent_version=1,
+        slug="test-agent",
+        name="Test Agent",
+        sub_heading="Test subheading",
+        video_url="video.mp4",
+        image_urls=["image1.jpg", "image2.jpg"],
+        description="Test description",
+        categories=["cat1", "cat2"],
+    )
+    assert request.agent_id == "agent123"
+    assert request.agent_version == 1
+    assert len(request.image_urls) == 2
+    assert len(request.categories) == 2
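
Review note: the tests above only cover the happy path. The `min_length=1` and `gt=0` constraints on `StoreSubmissionRequest.agent_id` and `agent_version` could also be exercised with rejected inputs. The sketch below is a minimal illustration, not part of this change: `SubmissionRequestSketch` and `is_rejected` are hypothetical names, and the model is a trimmed stand-in that copies only the two constrained fields from the diff.

```python
import pydantic


class SubmissionRequestSketch(pydantic.BaseModel):
    """Hypothetical stand-in for StoreSubmissionRequest (two fields only)."""

    # Same constraints as in the diff: non-empty ID, strictly positive version.
    agent_id: str = pydantic.Field(
        ..., min_length=1, description="Agent ID cannot be empty"
    )
    agent_version: int = pydantic.Field(
        ..., gt=0, description="Agent version must be greater than 0"
    )


def is_rejected(**fields) -> bool:
    """Return True if pydantic refuses to build the model from `fields`."""
    try:
        SubmissionRequestSketch(**fields)
    except pydantic.ValidationError:
        return True
    return False


# An empty ID and a non-positive version should both fail validation,
# while a well-formed request should pass.
print(is_rejected(agent_id="", agent_version=1))
print(is_rejected(agent_id="agent123", agent_version=0))
print(is_rejected(agent_id="agent123", agent_version=1))
```

In the real test module the same checks would presumably target `store_model.StoreSubmissionRequest` directly, with the remaining required fields filled in as in `test_store_submission_request`.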
--- a/autogpt_platform/backend/backend/api/features/store/routes.py
+++ b/autogpt_platform/backend/backend/api/features/store/routes.py
@@ -1,17 +1,16 @@
 import logging
 import tempfile
+import typing
 import urllib.parse
+from typing import Literal

 import autogpt_libs.auth
 import fastapi
 import fastapi.responses
 import prisma.enums
-from fastapi import Query, Security
-from pydantic import BaseModel

 import backend.data.graph
 import backend.util.json
-from backend.util.exceptions import NotFoundError
 from backend.util.models import Pagination

 from . import cache as store_cache
@@ -35,15 +34,22 @@ router = fastapi.APIRouter()
    "/profile",
    summary="Get user profile",
    tags=["store", "private"],
-    dependencies=[Security(autogpt_libs.auth.requires_user)],
+    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
+    response_model=store_model.ProfileDetails,
 )
 async def get_profile(
-    user_id: str = Security(autogpt_libs.auth.get_user_id),
-) -> store_model.ProfileDetails:
-    """Get the profile details for the authenticated user."""
+    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
+):
+    """
+    Get the profile details for the authenticated user.
+    Cached for 1 hour per user.
+    """
    profile = await store_db.get_user_profile(user_id)
    if profile is None:
-        raise NotFoundError("User does not have a profile yet")
+        return fastapi.responses.JSONResponse(
+            status_code=404,
+            content={"detail": "Profile not found"},
+        )
    return profile


@@ -51,17 +57,98 @@ async def get_profile(
    "/profile",
    summary="Update user profile",
    tags=["store", "private"],
-    dependencies=[Security(autogpt_libs.auth.requires_user)],
+    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
+    response_model=store_model.CreatorDetails,
 )
 async def update_or_create_profile(
    profile: store_model.Profile,
-    user_id: str = Security(autogpt_libs.auth.get_user_id),
-) -> store_model.ProfileDetails:
-    """Update the store profile for the authenticated user."""
+    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
+):
+    """
+    Update the store profile for the authenticated user.
+
+    Args:
+        profile (Profile): The updated profile details
+        user_id (str): ID of the authenticated user
+
+    Returns:
+        CreatorDetails: The updated profile
+
+    Raises:
+        HTTPException: If there is an error updating the profile
+    """
    updated_profile = await store_db.update_profile(user_id=user_id, profile=profile)
    return updated_profile


+##############################################
+############### Agent Endpoints ##############
+##############################################
+
+
+@router.get(
+    "/agents",
+    summary="List store agents",
+    tags=["store", "public"],
+    response_model=store_model.StoreAgentsResponse,
+)
+async def get_agents(
+    featured: bool = False,
+    creator: str | None = None,
+    sorted_by: Literal["rating", "runs", "name", "updated_at"] | None = None,
+    search_query: str | None = None,
+    category: str | None = None,
+    page: int = 1,
+    page_size: int = 20,
+):
+    """
+    Get a paginated list of agents from the store with optional filtering and sorting.
+
+    Args:
+        featured (bool, optional): Filter to only show featured agents. Defaults to False.
+        creator (str | None, optional): Filter agents by creator username. Defaults to None.
+        sorted_by (str | None, optional): Sort agents by "rating", "runs", "name", or "updated_at". Defaults to None.
+        search_query (str | None, optional): Search agents by name, subheading and description. Defaults to None.
+        category (str | None, optional): Filter agents by category. Defaults to None.
+        page (int, optional): Page number for pagination. Defaults to 1.
+        page_size (int, optional): Number of agents per page. Defaults to 20.
+
+    Returns:
+        StoreAgentsResponse: Paginated list of agents matching the filters
+
+    Raises:
+        HTTPException: If page or page_size are less than 1
+
+    Used for:
+    - Home Page Featured Agents
+    - Home Page Top Agents
+    - Search Results
+    - Agent Details - Other Agents By Creator
+    - Agent Details - Similar Agents
+    - Creator Details - Agents By Creator
+    """
+    if page < 1:
+        raise fastapi.HTTPException(
+            status_code=422, detail="Page must be greater than 0"
+        )
+
+    if page_size < 1:
+        raise fastapi.HTTPException(
+            status_code=422, detail="Page size must be greater than 0"
+        )
+
+    agents = await store_cache._get_cached_store_agents(
+        featured=featured,
+        creator=creator,
+        sorted_by=sorted_by,
+        search_query=search_query,
+        category=category,
+        page=page,
+        page_size=page_size,
+    )
+    return agents
+
+
 ##############################################
 ############### Search Endpoints #############
 ##############################################
@@ -71,30 +158,60 @@ async def update_or_create_profile(
    "/search",
    summary="Unified search across all content types",
    tags=["store", "public"],
+    response_model=store_model.UnifiedSearchResponse,
 )
 async def unified_search(
    query: str,
-    content_types: list[prisma.enums.ContentType] | None = Query(
+    content_types: list[str] | None = fastapi.Query(
        default=None,
-        description="Content types to search. If not specified, searches all.",
+        description="Content types to search: STORE_AGENT, BLOCK, DOCUMENTATION. If not specified, searches all.",
    ),
-    page: int = Query(ge=1, default=1),
-    page_size: int = Query(ge=1, default=20),
-    user_id: str | None = Security(
+    page: int = 1,
+    page_size: int = 20,
+    user_id: str | None = fastapi.Security(
        autogpt_libs.auth.get_optional_user_id, use_cache=False
    ),
-) -> store_model.UnifiedSearchResponse:
+):
    """
-    Search across all content types (marketplace agents, blocks, documentation)
-    using hybrid search.
+    Search across all content types (store agents, blocks, documentation) using hybrid search.

    Combines semantic (embedding-based) and lexical (text-based) search for best results.
+
+    Args:
+        query: The search query string
+        content_types: Optional list of content types to filter by (STORE_AGENT, BLOCK, DOCUMENTATION)
+        page: Page number for pagination (default 1)
+        page_size: Number of results per page (default 20)
+        user_id: Optional authenticated user ID (for user-scoped content in the future)
+
+    Returns:
+        UnifiedSearchResponse: Paginated list of search results with relevance scores
    """
+    if page < 1:
+        raise fastapi.HTTPException(
+            status_code=422, detail="Page must be greater than 0"
+        )
+
+    if page_size < 1:
+        raise fastapi.HTTPException(
+            status_code=422, detail="Page size must be greater than 0"
+        )
+
+    # Convert string content types to enum
+    content_type_enums: list[prisma.enums.ContentType] | None = None
+    if content_types:
+        try:
+            content_type_enums = [prisma.enums.ContentType(ct) for ct in content_types]
+        except ValueError as e:
+            raise fastapi.HTTPException(
+                status_code=422,
+                detail=f"Invalid content type. Valid values: STORE_AGENT, BLOCK, DOCUMENTATION. Error: {e}",
+            )

    # Perform unified hybrid search
    results, total = await store_hybrid_search.unified_hybrid_search(
        query=query,
-        content_types=content_types,
+        content_types=content_type_enums,
        user_id=user_id,
        page=page,
        page_size=page_size,
@@ -128,69 +245,22 @@ async def unified_search(
    )


-##############################################
-############### Agent Endpoints ##############
-##############################################
-
-
-@router.get(
-    "/agents",
-    summary="List store agents",
-    tags=["store", "public"],
-)
-async def get_agents(
-    featured: bool = Query(
-        default=False, description="Filter to only show featured agents"
-    ),
-    creator: str | None = Query(
-        default=None, description="Filter agents by creator username"
-    ),
-    category: str | None = Query(default=None, description="Filter agents by category"),
-    search_query: str | None = Query(
-        default=None, description="Literal + semantic search on names and descriptions"
-    ),
-    sorted_by: store_db.StoreAgentsSortOptions | None = Query(
-        default=None,
-        description="Property to sort results by. Ignored if search_query is provided.",
-    ),
-    page: int = Query(ge=1, default=1),
-    page_size: int = Query(ge=1, default=20),
-) -> store_model.StoreAgentsResponse:
-    """
-    Get a paginated list of agents from the marketplace,
-    with optional filtering and sorting.
-
-    Used for:
-    - Home Page Featured Agents
-    - Home Page Top Agents
-    - Search Results
-    - Agent Details - Other Agents By Creator
-    - Agent Details - Similar Agents
-    - Creator Details - Agents By Creator
-    """
-    agents = await store_cache._get_cached_store_agents(
-        featured=featured,
-        creator=creator,
-        sorted_by=sorted_by,
-        search_query=search_query,
-        category=category,
-        page=page,
-        page_size=page_size,
-    )
-    return agents
-
-
@router.get(
    "/agents/{username}/{agent_name}",
    summary="Get specific agent",
    tags=["store", "public"],
+    response_model=store_model.StoreAgentDetails,
 )
-async def get_agent_by_name(
+async def get_agent(
    username: str,
    agent_name: str,
-    include_changelog: bool = Query(default=False),
-) -> store_model.StoreAgentDetails:
-    """Get details of a marketplace agent"""
+    include_changelog: bool = fastapi.Query(default=False),
+):
+    """
+    This is only used on the AgentDetails Page.
+
+    It returns the store listing agent's details.
+    """
    username = urllib.parse.unquote(username).lower()
    # URL decode the agent name since it comes from the URL path
    agent_name = urllib.parse.unquote(agent_name).lower()
@@ -200,82 +270,76 @@ async def get_agent_by_name(
    return agent


+@router.get(
+    "/graph/{store_listing_version_id}",
+    summary="Get agent graph",
+    tags=["store"],
+    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
+)
+async def get_graph_meta_by_store_listing_version_id(
+    store_listing_version_id: str,
+) -> backend.data.graph.GraphModelWithoutNodes:
+    """
+    Get Agent Graph from Store Listing Version ID.
+    """
+    graph = await store_db.get_available_graph(store_listing_version_id)
+    return graph
+
+
+@router.get(
+    "/agents/{store_listing_version_id}",
+    summary="Get agent by version",
+    tags=["store"],
+    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
+    response_model=store_model.StoreAgentDetails,
+)
+async def get_store_agent(store_listing_version_id: str):
+    """
+    Get Store Agent Details from Store Listing Version ID.
+    """
+    agent = await store_db.get_store_agent_by_version_id(store_listing_version_id)
+
+    return agent
+
+
@router.post(
    "/agents/{username}/{agent_name}/review",
    summary="Create agent review",
    tags=["store"],
-    dependencies=[Security(autogpt_libs.auth.requires_user)],
+    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
+    response_model=store_model.StoreReview,
 )
-async def post_user_review_for_agent(
+async def create_review(
    username: str,
    agent_name: str,
    review: store_model.StoreReviewCreate,
-    user_id: str = Security(autogpt_libs.auth.get_user_id),
-) -> store_model.StoreReview:
-    """Post a user review on a marketplace agent listing"""
+    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
+):
+    """
+    Create a review for a store agent.
+
+    Args:
+        username: Creator's username
+        agent_name: Name/slug of the agent
+        review: Review details including score and optional comments
+        user_id: ID of authenticated user creating the review
+
+    Returns:
+        The created review
+    """
    username = urllib.parse.unquote(username).lower()
    agent_name = urllib.parse.unquote(agent_name).lower()
-
+    # Create the review
    created_review = await store_db.create_store_review(
        user_id=user_id,
        store_listing_version_id=review.store_listing_version_id,
        score=review.score,
        comments=review.comments,
    )
+
    return created_review


-@router.get(
-    "/listings/versions/{store_listing_version_id}",
-    summary="Get agent by version",
-    tags=["store"],
-    dependencies=[Security(autogpt_libs.auth.requires_user)],
-)
-async def get_agent_by_listing_version(
-    store_listing_version_id: str,
-) -> store_model.StoreAgentDetails:
-    agent = await store_db.get_store_agent_by_version_id(store_listing_version_id)
-    return agent
-
-
-@router.get(
-    "/listings/versions/{store_listing_version_id}/graph",
-    summary="Get agent graph",
-    tags=["store"],
-    dependencies=[Security(autogpt_libs.auth.requires_user)],
-)
-async def get_graph_meta_by_store_listing_version_id(
-    store_listing_version_id: str,
-) -> backend.data.graph.GraphModelWithoutNodes:
-    """Get outline of graph belonging to a specific marketplace listing version"""
-    graph = await store_db.get_available_graph(store_listing_version_id)
-    return graph
-
-
-@router.get(
-    "/listings/versions/{store_listing_version_id}/graph/download",
-    summary="Download agent file",
-    tags=["store", "public"],
-)
-async def download_agent_file(
-    store_listing_version_id: str,
-) -> fastapi.responses.FileResponse:
-    """Download agent graph file for a specific marketplace listing version"""
-    graph_data = await store_db.get_agent(store_listing_version_id)
-    file_name = f"agent_{graph_data.id}_v{graph_data.version or 'latest'}.json"
-
-    # Sending graph as a stream (similar to marketplace v1)
-    with tempfile.NamedTemporaryFile(
-        mode="w", suffix=".json", delete=False
-    ) as tmp_file:
-        tmp_file.write(backend.util.json.dumps(graph_data))
-        tmp_file.flush()
-
-        return fastapi.responses.FileResponse(
-            tmp_file.name, filename=file_name, media_type="application/json"
-        )
-
-
 ##############################################
 ############# Creator Endpoints #############
 ##############################################
@@ -285,19 +349,37 @@ async def download_agent_file(
    "/creators",
    summary="List store creators",
    tags=["store", "public"],
+    response_model=store_model.CreatorsResponse,
 )
 async def get_creators(
-    featured: bool = Query(
-        default=False, description="Filter to only show featured creators"
-    ),
-    search_query: str | None = Query(
-        default=None, description="Literal + semantic search on names and descriptions"
-    ),
-    sorted_by: store_db.StoreCreatorsSortOptions | None = None,
-    page: int = Query(ge=1, default=1),
-    page_size: int = Query(ge=1, default=20),
-) -> store_model.CreatorsResponse:
-    """List or search marketplace creators"""
+    featured: bool = False,
+    search_query: str | None = None,
+    sorted_by: Literal["agent_rating", "agent_runs", "num_agents"] | None = None,
+    page: int = 1,
+    page_size: int = 20,
+):
+    """
+    This is needed for:
+    - Home Page Featured Creators
+    - Search Results Page
+
+    ---
+
+    To support this functionality we need:
+    - featured: bool - to limit the list to just featured creators
+    - search_query: str - vector search based on the creator's profile description.
+    - sorted_by: agent_rating, agent_runs, or num_agents - the field to sort creators by
+    """
+    if page < 1:
+        raise fastapi.HTTPException(
+            status_code=422, detail="Page must be greater than 0"
+        )
+
+    if page_size < 1:
+        raise fastapi.HTTPException(
+            status_code=422, detail="Page size must be greater than 0"
+        )
+
    creators = await store_cache._get_cached_store_creators(
        featured=featured,
        search_query=search_query,
@@ -309,12 +391,18 @@ async def get_creators(


@router.get(
-    "/creators/{username}",
+    "/creator/{username}",
    summary="Get creator details",
    tags=["store", "public"],
+    response_model=store_model.CreatorDetails,
 )
-async def get_creator(username: str) -> store_model.CreatorDetails:
-    """Get details on a marketplace creator"""
+async def get_creator(
+    username: str,
+):
+    """
+    Get the details of a creator.
+    - Creator Details Page
+    """
    username = urllib.parse.unquote(username).lower()
    creator = await store_cache._get_cached_creator_details(username=username)
    return creator
@@ -326,17 +414,20 @@ async def get_creator(username: str) -> store_model.CreatorDetails:


@router.get(
-    "/my-unpublished-agents",
+    "/myagents",
    summary="Get my agents",
    tags=["store", "private"],
-    dependencies=[Security(autogpt_libs.auth.requires_user)],
+    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
+    response_model=store_model.MyAgentsResponse,
 )
-async def get_my_unpublished_agents(
-    user_id: str = Security(autogpt_libs.auth.get_user_id),
-    page: int = Query(ge=1, default=1),
-    page_size: int = Query(ge=1, default=20),
-) -> store_model.MyUnpublishedAgentsResponse:
-    """List the authenticated user's unpublished agents"""
+async def get_my_agents(
+    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
+    page: typing.Annotated[int, fastapi.Query(ge=1)] = 1,
+    page_size: typing.Annotated[int, fastapi.Query(ge=1)] = 20,
+):
+    """
+    Get user's own agents.
+    """
    agents = await store_db.get_my_agents(user_id, page=page, page_size=page_size)
    return agents

@@ -345,17 +436,28 @@ async def get_my_unpublished_agents(
    "/submissions/{submission_id}",
    summary="Delete store submission",
    tags=["store", "private"],
-    dependencies=[Security(autogpt_libs.auth.requires_user)],
+    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
+    response_model=bool,
 )
 async def delete_submission(
    submission_id: str,
-    user_id: str = Security(autogpt_libs.auth.get_user_id),
-) -> bool:
-    """Delete a marketplace listing submission"""
+    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
+):
+    """
+    Delete a store listing submission.
+
+    Args:
+        user_id (str): ID of the authenticated user
+        submission_id (str): ID of the submission to be deleted
+
+    Returns:
+        bool: True if the submission was successfully deleted, False otherwise
+    """
    result = await store_db.delete_store_submission(
        user_id=user_id,
        submission_id=submission_id,
    )
+
    return result


@@ -363,14 +465,37 @@ async def delete_submission(
    "/submissions",
    summary="List my submissions",
    tags=["store", "private"],
-    dependencies=[Security(autogpt_libs.auth.requires_user)],
+    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
+    response_model=store_model.StoreSubmissionsResponse,
 )
 async def get_submissions(
-    user_id: str = Security(autogpt_libs.auth.get_user_id),
-    page: int = Query(ge=1, default=1),
-    page_size: int = Query(ge=1, default=20),
-) -> store_model.StoreSubmissionsResponse:
-    """List the authenticated user's marketplace listing submissions"""
+    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
+    page: int = 1,
+    page_size: int = 20,
+):
+    """
+    Get a paginated list of store submissions for the authenticated user.
+
+    Args:
+        user_id (str): ID of the authenticated user
+        page (int, optional): Page number for pagination. Defaults to 1.
+        page_size (int, optional): Number of submissions per page. Defaults to 20.
+
+    Returns:
+        StoreSubmissionsResponse: Paginated list of store submissions
+
+    Raises:
+        HTTPException: If page or page_size are less than 1
+    """
+    if page < 1:
+        raise fastapi.HTTPException(
+            status_code=422, detail="Page must be greater than 0"
+        )
+
+    if page_size < 1:
+        raise fastapi.HTTPException(
+            status_code=422, detail="Page size must be greater than 0"
+        )
    listings = await store_db.get_store_submissions(
        user_id=user_id,
        page=page,
@@ -383,17 +508,30 @@ async def get_submissions(
    "/submissions",
    summary="Create store submission",
    tags=["store", "private"],
-    dependencies=[Security(autogpt_libs.auth.requires_user)],
+    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
+    response_model=store_model.StoreSubmission,
 )
 async def create_submission(
    submission_request: store_model.StoreSubmissionRequest,
-    user_id: str = Security(autogpt_libs.auth.get_user_id),
-) -> store_model.StoreSubmission:
-    """Submit a new marketplace listing for review"""
+    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
+):
+    """
+    Create a new store listing submission.
+
+    Args:
+        submission_request (StoreSubmissionRequest): The submission details
+        user_id (str): ID of the authenticated user submitting the listing
+
+    Returns:
+        StoreSubmission: The created store submission
+
+    Raises:
+        HTTPException: If there is an error creating the submission
+    """
    result = await store_db.create_store_submission(
        user_id=user_id,
-        graph_id=submission_request.graph_id,
-        graph_version=submission_request.graph_version,
+        agent_id=submission_request.agent_id,
+        agent_version=submission_request.agent_version,
        slug=submission_request.slug,
        name=submission_request.name,
        video_url=submission_request.video_url,
@@ -406,6 +544,7 @@ async def create_submission(
        changes_summary=submission_request.changes_summary or "Initial Submission",
        recommended_schedule_cron=submission_request.recommended_schedule_cron,
    )
+
    return result


@@ -413,14 +552,28 @@ async def create_submission(
    "/submissions/{store_listing_version_id}",
    summary="Edit store submission",
    tags=["store", "private"],
-    dependencies=[Security(autogpt_libs.auth.requires_user)],
+    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
+    response_model=store_model.StoreSubmission,
 )
 async def edit_submission(
    store_listing_version_id: str,
    submission_request: store_model.StoreSubmissionEditRequest,
-    user_id: str = Security(autogpt_libs.auth.get_user_id),
-) -> store_model.StoreSubmission:
-    """Update a pending marketplace listing submission"""
+    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
+):
+    """
+    Edit an existing store listing submission.
+
+    Args:
+        store_listing_version_id (str): ID of the store listing version to edit
+        submission_request (StoreSubmissionEditRequest): The updated submission details
+        user_id (str): ID of the authenticated user editing the listing
+
+    Returns:
+        StoreSubmission: The updated store submission
+
+    Raises:
+        HTTPException: If there is an error editing the submission
+    """
    result = await store_db.edit_store_submission(
        user_id=user_id,
        store_listing_version_id=store_listing_version_id,
@@ -435,6 +588,7 @@ async def edit_submission(
        changes_summary=submission_request.changes_summary,
        recommended_schedule_cron=submission_request.recommended_schedule_cron,
    )
+
    return result


@@ -442,61 +596,115 @@ async def edit_submission(
    "/submissions/media",
    summary="Upload submission media",
    tags=["store", "private"],
-    dependencies=[Security(autogpt_libs.auth.requires_user)],
+    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
 )
 async def upload_submission_media(
    file: fastapi.UploadFile,
-    user_id: str = Security(autogpt_libs.auth.get_user_id),
-) -> str:
-    """Upload media for a marketplace listing submission"""
+    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
+):
+    """
+    Upload media (images/videos) for a store listing submission.
+
+    Args:
+        file (UploadFile): The media file to upload
+        user_id (str): ID of the authenticated user uploading the media
+
+    Returns:
+        str: URL of the uploaded media file
+
+    Raises:
+        HTTPException: If there is an error uploading the media
+    """
    media_url = await store_media.upload_media(user_id=user_id, file=file)
    return media_url


-class ImageURLResponse(BaseModel):
-    image_url: str
-
-
@router.post(
    "/submissions/generate_image",
    summary="Generate submission image",
    tags=["store", "private"],
-    dependencies=[Security(autogpt_libs.auth.requires_user)],
+    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
 )
 async def generate_image(
-    graph_id: str,
-    user_id: str = Security(autogpt_libs.auth.get_user_id),
-) -> ImageURLResponse:
+    agent_id: str,
+    user_id: str = fastapi.Security(autogpt_libs.auth.get_user_id),
+) -> fastapi.responses.Response:
    """
-    Generate an image for a marketplace listing submission based on the properties
-    of a given graph.
+    Generate an image for a store listing submission.
+
+    Args:
+        agent_id (str): ID of the agent to generate an image for
+        user_id (str): ID of the authenticated user
+
+    Returns:
+        JSONResponse: JSON containing the URL of the generated image
    """
-    graph = await backend.data.graph.get_graph(
-        graph_id=graph_id, version=None, user_id=user_id
+    agent = await backend.data.graph.get_graph(
+        graph_id=agent_id, version=None, user_id=user_id
    )

-    if not graph:
-        raise NotFoundError(f"Agent graph #{graph_id} not found")
+    if not agent:
+        raise fastapi.HTTPException(
+            status_code=404, detail=f"Agent with ID {agent_id} not found"
+        )
    # Use .jpeg here since we are generating JPEG images
-    filename = f"agent_{graph_id}.jpeg"
+    filename = f"agent_{agent_id}.jpeg"

    existing_url = await store_media.check_media_exists(user_id, filename)
    if existing_url:
-        logger.info(f"Using existing image for agent graph {graph_id}")
-        return ImageURLResponse(image_url=existing_url)
+        logger.info(f"Using existing image for agent {agent_id}")
+        return fastapi.responses.JSONResponse(content={"image_url": existing_url})
    # Generate agent image as JPEG
-    image = await store_image_gen.generate_agent_image(agent=graph)
+    image = await store_image_gen.generate_agent_image(agent=agent)

    # Create UploadFile with the correct filename and content_type
    image_file = fastapi.UploadFile(
        file=image,
        filename=filename,
    )
+
    image_url = await store_media.upload_media(
        user_id=user_id, file=image_file, use_file_name=True
    )

-    return ImageURLResponse(image_url=image_url)
+    return fastapi.responses.JSONResponse(content={"image_url": image_url})
+
+
+@router.get(
+    "/download/agents/{store_listing_version_id}",
+    summary="Download agent file",
+    tags=["store", "public"],
+)
+async def download_agent_file(
+    store_listing_version_id: str = fastapi.Path(
+        ..., description="The ID of the store listing version to download"
+    ),
+) -> fastapi.responses.FileResponse:
+    """
+    Download the agent's graph as a JSON file.
+
+    Args:
+        store_listing_version_id (str): ID of the store listing version to download
+
+    Returns:
+        FileResponse: A file response containing the agent's graph data as JSON.
+
+    Raises:
+        HTTPException: If the agent is not found or an unexpected error occurs.
+    """
+    graph_data = await store_db.get_agent(store_listing_version_id)
+    file_name = f"agent_{graph_data.id}_v{graph_data.version or 'latest'}.json"
+
+    # Write the graph to a temporary JSON file and serve it (similar to marketplace v1)
+    with tempfile.NamedTemporaryFile(
+        mode="w", suffix=".json", delete=False
+    ) as tmp_file:
+        tmp_file.write(backend.util.json.dumps(graph_data))
+        tmp_file.flush()
+
+        return fastapi.responses.FileResponse(
+            tmp_file.name, filename=file_name, media_type="application/json"
+        )


 ##############################################
--- a/autogpt_platform/backend/backend/api/features/store/routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/routes_test.py
@@ -8,8 +8,6 @@ import pytest
 import pytest_mock
 from pytest_snapshot.plugin import Snapshot

-from backend.api.features.store.db import StoreAgentsSortOptions
-
 from . import model as store_model
 from . import routes as store_routes

@@ -198,7 +196,7 @@ def test_get_agents_sorted(
    mock_db_call.assert_called_once_with(
        featured=False,
        creators=None,
-        sorted_by=StoreAgentsSortOptions.RUNS,
+        sorted_by="runs",
        search_query=None,
        category=None,
        page=1,
@@ -382,11 +380,9 @@ def test_get_agent_details(
        runs=100,
        rating=4.5,
        versions=["1.0.0", "1.1.0"],
-        graph_versions=["1", "2"],
-        graph_id="test-graph-id",
+        agentGraphVersions=["1", "2"],
+        agentGraphId="test-graph-id",
        last_updated=FIXED_NOW,
-        active_version_id="test-version-id",
-        has_approved_version=True,
    )
    mock_db_call = mocker.patch("backend.api.features.store.db.get_store_agent_details")
    mock_db_call.return_value = mocked_value
@@ -439,17 +435,15 @@ def test_get_creators_pagination(
 ) -> None:
    mocked_value = store_model.CreatorsResponse(
        creators=[
-            store_model.CreatorDetails(
+            store_model.Creator(
                name=f"Creator {i}",
                username=f"creator{i}",
-                avatar_url=f"avatar{i}.jpg",
                description=f"Creator {i} description",
-                links=[f"user{i}.link.com"],
-                is_featured=False,
+                avatar_url=f"avatar{i}.jpg",
                num_agents=1,
-                agent_runs=100,
                agent_rating=4.5,
-                top_categories=["cat1", "cat2", "cat3"],
+                agent_runs=100,
+                is_featured=False,
            )
            for i in range(5)
        ],
@@ -502,19 +496,19 @@ def test_get_creator_details(
    mocked_value = store_model.CreatorDetails(
        name="Test User",
        username="creator1",
-        avatar_url="avatar.jpg",
        description="Test creator description",
        links=["link1.com", "link2.com"],
-        is_featured=True,
-        num_agents=5,
-        agent_runs=1000,
+        avatar_url="avatar.jpg",
        agent_rating=4.8,
+        agent_runs=1000,
        top_categories=["category1", "category2"],
    )
-    mock_db_call = mocker.patch("backend.api.features.store.db.get_store_creator")
+    mock_db_call = mocker.patch(
+        "backend.api.features.store.db.get_store_creator_details"
+    )
    mock_db_call.return_value = mocked_value

-    response = client.get("/creators/creator1")
+    response = client.get("/creator/creator1")
    assert response.status_code == 200

    data = store_model.CreatorDetails.model_validate(response.json())
@@ -534,26 +528,19 @@ def test_get_submissions_success(
        submissions=[
            store_model.StoreSubmission(
                listing_id="test-listing-id",
-                user_id="test-user-id",
-                slug="test-agent",
-                listing_version_id="test-version-id",
-                listing_version=1,
-                graph_id="test-agent-id",
-                graph_version=1,
                name="Test Agent",
-                sub_heading="Test agent subheading",
                description="Test agent description",
-                instructions="Click the button!",
-                categories=["test-category"],
                image_urls=["test.jpg"],
-                video_url="test.mp4",
-                agent_output_demo_url="demo_video.mp4",
-                submitted_at=FIXED_NOW,
-                changes_summary="Initial Submission",
+                date_submitted=FIXED_NOW,
                status=prisma.enums.SubmissionStatus.APPROVED,
-                run_count=50,
-                review_count=5,
-                review_avg_rating=4.2,
+                runs=50,
+                rating=4.2,
+                agent_id="test-agent-id",
+                agent_version=1,
+                sub_heading="Test agent subheading",
+                slug="test-agent",
+                video_url="test.mp4",
+                categories=["test-category"],
            )
        ],
        pagination=store_model.Pagination(
--- a/autogpt_platform/backend/backend/api/features/store/test_cache_delete.py
+++ b/autogpt_platform/backend/backend/api/features/store/test_cache_delete.py
@@ -11,7 +11,6 @@ import pytest
 from backend.util.models import Pagination

 from . import cache as store_cache
-from .db import StoreAgentsSortOptions
 from .model import StoreAgent, StoreAgentsResponse


@@ -216,7 +215,7 @@ class TestCacheDeletion:
            await store_cache._get_cached_store_agents(
                featured=True,
                creator="testuser",
-                sorted_by=StoreAgentsSortOptions.RATING,
+                sorted_by="rating",
                search_query="AI assistant",
                category="productivity",
                page=2,
@@ -228,7 +227,7 @@ class TestCacheDeletion:
            deleted = store_cache._get_cached_store_agents.cache_delete(
                featured=True,
                creator="testuser",
-                sorted_by=StoreAgentsSortOptions.RATING,
+                sorted_by="rating",
                search_query="AI assistant",
                category="productivity",
                page=2,
@@ -240,7 +239,7 @@ class TestCacheDeletion:
            deleted = store_cache._get_cached_store_agents.cache_delete(
                featured=True,
                creator="testuser",
-                sorted_by=StoreAgentsSortOptions.RATING,
+                sorted_by="rating",
                search_query="AI assistant",
                category="productivity",
                page=2,
--- a/autogpt_platform/backend/backend/api/features/v1.py
+++ b/autogpt_platform/backend/backend/api/features/v1.py
@@ -55,7 +55,6 @@ from backend.data.credit import (
    set_auto_top_up,
 )
 from backend.data.graph import GraphSettings
-from backend.data.invited_user import get_or_activate_user
 from backend.data.model import CredentialsMetaInput, UserOnboarding
 from backend.data.notifications import NotificationPreference, NotificationPreferenceDTO
 from backend.data.onboarding import (
@@ -71,6 +70,7 @@ from backend.data.onboarding import (
    update_user_onboarding,
 )
 from backend.data.user import (
+    get_or_create_user,
    get_user_by_id,
    get_user_notification_preference,
    update_user_email,
@@ -136,10 +136,12 @@ _tally_background_tasks: set[asyncio.Task] = set()
    dependencies=[Security(requires_user)],
 )
 async def get_or_create_user_route(user_data: dict = Security(get_jwt_payload)):
-    user = await get_or_activate_user(user_data)
+    user = await get_or_create_user(user_data)

-    # Fire-and-forget: backfill Tally understanding when invite pre-seeding did
-    # not produce a stored result before first activation.
+    # Fire-and-forget: populate business understanding from Tally form.
+    # We use created_at proximity instead of an is_new flag because
+    # get_or_create_user is cached — a separate is_new return value would be
+    # unreliable on repeated calls within the cache TTL.
    age_seconds = (datetime.now(timezone.utc) - user.created_at).total_seconds()
    if age_seconds < 30:
        try:
@@ -163,8 +165,7 @@ async def get_or_create_user_route(user_data: dict = Security(get_jwt_payload)):
    dependencies=[Security(requires_user)],
 )
 async def update_user_email_route(
-    user_id: Annotated[str, Security(get_user_id)],
-    email: str = Body(...),
+    user_id: Annotated[str, Security(get_user_id)], email: str = Body(...)
 ) -> dict[str, str]:
    await update_user_email(user_id, email)

@@ -178,16 +179,10 @@ async def update_user_email_route(
    dependencies=[Security(requires_user)],
 )
 async def get_user_timezone_route(
-    user_id: Annotated[str, Security(get_user_id)],
+    user_data: dict = Security(get_jwt_payload),
 ) -> TimezoneResponse:
    """Get user timezone setting."""
-    try:
-        user = await get_user_by_id(user_id)
-    except ValueError:
-        raise HTTPException(
-            status_code=HTTP_404_NOT_FOUND,
-            detail="User not found. Please complete activation via /auth/user first.",
-        )
+    user = await get_or_create_user(user_data)
    return TimezoneResponse(timezone=user.timezone)


@@ -198,8 +193,7 @@ async def get_user_timezone_route(
    dependencies=[Security(requires_user)],
 )
 async def update_user_timezone_route(
-    user_id: Annotated[str, Security(get_user_id)],
-    request: UpdateTimezoneRequest,
+    user_id: Annotated[str, Security(get_user_id)], request: UpdateTimezoneRequest
 ) -> TimezoneResponse:
    """Update user timezone. The timezone should be a valid IANA timezone identifier."""
    user = await update_user_timezone(user_id, str(request.timezone))
@@ -455,6 +449,7 @@ async def execute_graph_block(
 async def upload_file(
    user_id: Annotated[str, Security(get_user_id)],
    file: UploadFile = File(...),
+    provider: str = "gcs",
    expiration_hours: int = 24,
 ) -> UploadFileResponse:
    """
@@ -517,6 +512,7 @@ async def upload_file(
    storage_path = await cloud_storage.store_file(
        content=content,
        filename=file_name,
+        provider=provider,
        expiration_hours=expiration_hours,
        user_id=user_id,
    )
--- a/autogpt_platform/backend/backend/api/features/v1_test.py
+++ b/autogpt_platform/backend/backend/api/features/v1_test.py
@@ -51,7 +51,7 @@ def test_get_or_create_user_route(
    }

    mocker.patch(
-        "backend.api.features.v1.get_or_activate_user",
+        "backend.api.features.v1.get_or_create_user",
        return_value=mock_user,
    )

@@ -515,6 +515,7 @@ async def test_upload_file_success(test_user_id: str):
        result = await upload_file(
            file=upload_file_mock,
            user_id=test_user_id,
+            provider="gcs",
            expiration_hours=24,
        )

@@ -532,6 +533,7 @@ async def test_upload_file_success(test_user_id: str):
        mock_handler.store_file.assert_called_once_with(
            content=file_content,
            filename="test.txt",
+            provider="gcs",
            expiration_hours=24,
            user_id=test_user_id,
        )
--- a/autogpt_platform/backend/backend/api/model.py
+++ b/autogpt_platform/backend/backend/api/model.py
@@ -94,8 +94,3 @@ class NotificationPayload(pydantic.BaseModel):

 class OnboardingNotificationPayload(NotificationPayload):
    step: OnboardingStep | None
-
-
-class CopilotCompletionPayload(NotificationPayload):
-    session_id: str
-    status: Literal["completed", "failed"]
--- a/autogpt_platform/backend/backend/api/rest_api.py
+++ b/autogpt_platform/backend/backend/api/rest_api.py
@@ -19,7 +19,6 @@ from prisma.errors import PrismaError
 import backend.api.features.admin.credit_admin_routes
 import backend.api.features.admin.execution_analytics_routes
 import backend.api.features.admin.store_admin_routes
-import backend.api.features.admin.user_admin_routes
 import backend.api.features.builder
 import backend.api.features.builder.routes
 import backend.api.features.chat.routes as chat_routes
@@ -56,7 +55,6 @@ from backend.util.exceptions import (
    MissingConfigError,
    NotAuthorizedError,
    NotFoundError,
-    PreconditionFailed,
 )
 from backend.util.feature_flag import initialize_launchdarkly, shutdown_launchdarkly
 from backend.util.service import UnhealthyServiceError
@@ -277,7 +275,6 @@ app.add_exception_handler(RequestValidationError, validation_error_handler)
 app.add_exception_handler(pydantic.ValidationError, validation_error_handler)
 app.add_exception_handler(MissingConfigError, handle_internal_http_error(503))
 app.add_exception_handler(ValueError, handle_internal_http_error(400))
-app.add_exception_handler(PreconditionFailed, handle_internal_http_error(428))
 app.add_exception_handler(Exception, handle_internal_http_error(500))

 app.include_router(backend.api.features.v1.v1_router, tags=["v1"], prefix="/api")
@@ -312,11 +309,6 @@ app.include_router(
    tags=["v2", "admin"],
    prefix="/api/executions",
 )
-app.include_router(
-    backend.api.features.admin.user_admin_routes.router,
-    tags=["v2", "admin"],
-    prefix="/api/users",
-)
 app.include_router(
    backend.api.features.executions.review.routes.router,
    tags=["v2", "executions", "review"],
--- a/autogpt_platform/backend/backend/blocks/_base.py
+++ b/autogpt_platform/backend/backend/blocks/_base.py
@@ -418,8 +418,6 @@ class BlockWebhookConfig(BlockManualWebhookConfig):


 class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
-    _optimized_description: ClassVar[str | None] = None
-
    def __init__(
        self,
        id: str = "",
@@ -472,8 +470,6 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
        self.block_type = block_type
        self.webhook_config = webhook_config
        self.is_sensitive_action = is_sensitive_action
-        # Read from ClassVar set by initialize_blocks()
-        self.optimized_description: str | None = type(self)._optimized_description
        self.execution_stats: "NodeExecutionStats" = NodeExecutionStats()

        if self.webhook_config:
@@ -624,7 +620,6 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
        graph_id: str,
        graph_version: int,
        execution_context: "ExecutionContext",
-        is_graph_execution: bool = True,
        **kwargs,
    ) -> tuple[bool, BlockInput]:
        """
@@ -653,7 +648,6 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
            graph_version=graph_version,
            block_name=self.name,
            editable=True,
-            is_graph_execution=is_graph_execution,
        )

        if decision is None:
--- a/autogpt_platform/backend/backend/blocks/basic.py
+++ b/autogpt_platform/backend/backend/blocks/basic.py
@@ -126,7 +126,7 @@ class PrintToConsoleBlock(Block):
            output_schema=PrintToConsoleBlock.Output,
            test_input={"text": "Hello, World!"},
            is_sensitive_action=True,
-            disabled=True,
+            disabled=True,  # Disabled per Nick Tindle's request (OPEN-3000)
            test_output=[
                ("output", "Hello, World!"),
                ("status", "printed"),
--- a/autogpt_platform/backend/backend/blocks/code_executor.py
+++ b/autogpt_platform/backend/backend/blocks/code_executor.py
@@ -142,7 +142,7 @@ class BaseE2BExecutorMixin:
                start_timestamp = ts_result.stdout.strip() if ts_result.stdout else None

            # Execute the code
-            execution = await sandbox.run_code(  # type: ignore[attr-defined]
+            execution = await sandbox.run_code(
                code,
                language=language.value,
                on_error=lambda e: sandbox.kill(),  # Kill the sandbox on error
--- a/autogpt_platform/backend/backend/blocks/email_block.py
+++ b/autogpt_platform/backend/backend/blocks/email_block.py
@@ -96,7 +96,6 @@ class SendEmailBlock(Block):
            test_credentials=TEST_CREDENTIALS,
            test_output=[("status", "Email sent successfully")],
            test_mock={"send_email": lambda *args, **kwargs: "Email sent successfully"},
-            is_sensitive_action=True,
        )

    @staticmethod
--- a/autogpt_platform/backend/backend/blocks/github/_utils.py
+++ b/autogpt_platform/backend/backend/blocks/github/_utils.py
@@ -1,3 +0,0 @@
-def github_repo_path(repo_url: str) -> str:
-    """Extract 'owner/repo' from a GitHub repository URL."""
-    return repo_url.replace("https://github.com/", "")
--- a/autogpt_platform/backend/backend/blocks/github/commits.py
+++ b/autogpt_platform/backend/backend/blocks/github/commits.py
@@ -1,408 +0,0 @@
-import asyncio
-from enum import StrEnum
-from urllib.parse import quote
-
-from typing_extensions import TypedDict
-
-from backend.blocks._base import (
-    Block,
-    BlockCategory,
-    BlockOutput,
-    BlockSchemaInput,
-    BlockSchemaOutput,
-)
-from backend.data.execution import ExecutionContext
-from backend.data.model import SchemaField
-from backend.util.file import parse_data_uri, resolve_media_content
-from backend.util.type import MediaFileType
-
-from ._api import get_api
-from ._auth import (
-    TEST_CREDENTIALS,
-    TEST_CREDENTIALS_INPUT,
-    GithubCredentials,
-    GithubCredentialsField,
-    GithubCredentialsInput,
-)
-from ._utils import github_repo_path
-
-
-class GithubListCommitsBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository",
-            placeholder="https://github.com/owner/repo",
-        )
-        branch: str = SchemaField(
-            description="Branch name to list commits from",
-            default="main",
-        )
-        per_page: int = SchemaField(
-            description="Number of commits to return (max 100)",
-            default=30,
-            ge=1,
-            le=100,
-        )
-        page: int = SchemaField(
-            description="Page number for pagination",
-            default=1,
-            ge=1,
-        )
-
-    class Output(BlockSchemaOutput):
-        class CommitItem(TypedDict):
-            sha: str
-            message: str
-            author: str
-            date: str
-            url: str
-
-        commit: CommitItem = SchemaField(
-            title="Commit", description="A commit with its details"
-        )
-        commits: list[CommitItem] = SchemaField(
-            description="List of commits with their details"
-        )
-        error: str = SchemaField(description="Error message if listing commits failed")
-
-    def __init__(self):
-        super().__init__(
-            id="8b13f579-d8b6-4dc2-a140-f770428805de",
-            description="This block lists commits on a branch in a GitHub repository.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubListCommitsBlock.Input,
-            output_schema=GithubListCommitsBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "branch": "main",
-                "per_page": 30,
-                "page": 1,
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[
-                (
-                    "commits",
-                    [
-                        {
-                            "sha": "abc123",
-                            "message": "Initial commit",
-                            "author": "octocat",
-                            "date": "2024-01-01T00:00:00Z",
-                            "url": "https://github.com/owner/repo/commit/abc123",
-                        }
-                    ],
-                ),
-                (
-                    "commit",
-                    {
-                        "sha": "abc123",
-                        "message": "Initial commit",
-                        "author": "octocat",
-                        "date": "2024-01-01T00:00:00Z",
-                        "url": "https://github.com/owner/repo/commit/abc123",
-                    },
-                ),
-            ],
-            test_mock={
-                "list_commits": lambda *args, **kwargs: [
-                    {
-                        "sha": "abc123",
-                        "message": "Initial commit",
-                        "author": "octocat",
-                        "date": "2024-01-01T00:00:00Z",
-                        "url": "https://github.com/owner/repo/commit/abc123",
-                    }
-                ]
-            },
-        )
-
-    @staticmethod
-    async def list_commits(
-        credentials: GithubCredentials,
-        repo_url: str,
-        branch: str,
-        per_page: int,
-        page: int,
-    ) -> list[Output.CommitItem]:
-        api = get_api(credentials)
-        commits_url = repo_url + "/commits"
-        params = {"sha": branch, "per_page": str(per_page), "page": str(page)}
-        response = await api.get(commits_url, params=params)
-        data = response.json()
-        repo_path = github_repo_path(repo_url)
-        return [
-            GithubListCommitsBlock.Output.CommitItem(
-                sha=c["sha"],
-                message=c["commit"]["message"],
-                author=(c["commit"].get("author") or {}).get("name", "Unknown"),
-                date=(c["commit"].get("author") or {}).get("date", ""),
-                url=f"https://github.com/{repo_path}/commit/{c['sha']}",
-            )
-            for c in data
-        ]
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            commits = await self.list_commits(
-                credentials,
-                input_data.repo_url,
-                input_data.branch,
-                input_data.per_page,
-                input_data.page,
-            )
-            yield "commits", commits
-            for commit in commits:
-                yield "commit", commit
-        except Exception as e:
-            yield "error", str(e)
-
-
-class FileOperation(StrEnum):
-    """File operations for GithubMultiFileCommitBlock.
-
-    UPSERT creates or overwrites a file (the Git Trees API does not distinguish
-    between creation and update — the blob is placed at the given path regardless
-    of whether a file already exists there).
-
-    DELETE removes a file from the tree.
-    """
-
-    UPSERT = "upsert"
-    DELETE = "delete"
-
-
-class FileOperationInput(TypedDict):
-    path: str
-    # MediaFileType is a str NewType — no runtime breakage for existing callers.
-    content: MediaFileType
-    operation: FileOperation
-
-
-class GithubMultiFileCommitBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository",
-            placeholder="https://github.com/owner/repo",
-        )
-        branch: str = SchemaField(
-            description="Branch to commit to",
-            placeholder="feature-branch",
-        )
-        commit_message: str = SchemaField(
-            description="Commit message",
-            placeholder="Add new feature",
-        )
-        files: list[FileOperationInput] = SchemaField(
-            description=(
-                "List of file operations. Each item has: "
-                "'path' (file path), 'content' (file content, ignored for delete), "
-                "'operation' (upsert/delete)"
-            ),
-        )
-
-    class Output(BlockSchemaOutput):
-        sha: str = SchemaField(description="SHA of the new commit")
-        url: str = SchemaField(description="URL of the new commit")
-        error: str = SchemaField(description="Error message if the commit failed")
-
-    def __init__(self):
-        super().__init__(
-            id="389eee51-a95e-4230-9bed-92167a327802",
-            description=(
-                "This block creates a single commit with multiple file "
-                "upsert/delete operations using the Git Trees API."
-            ),
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubMultiFileCommitBlock.Input,
-            output_schema=GithubMultiFileCommitBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "branch": "feature",
-                "commit_message": "Add files",
-                "files": [
-                    {
-                        "path": "src/new.py",
-                        "content": "print('hello')",
-                        "operation": "upsert",
-                    },
-                    {
-                        "path": "src/old.py",
-                        "content": "",
-                        "operation": "delete",
-                    },
-                ],
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[
-                ("sha", "newcommitsha"),
-                ("url", "https://github.com/owner/repo/commit/newcommitsha"),
-            ],
-            test_mock={
-                "multi_file_commit": lambda *args, **kwargs: (
-                    "newcommitsha",
-                    "https://github.com/owner/repo/commit/newcommitsha",
-                )
-            },
-        )
-
-    @staticmethod
-    async def multi_file_commit(
-        credentials: GithubCredentials,
-        repo_url: str,
-        branch: str,
-        commit_message: str,
-        files: list[FileOperationInput],
-    ) -> tuple[str, str]:
-        api = get_api(credentials)
-        safe_branch = quote(branch, safe="")
-
-        # 1. Get the latest commit SHA for the branch
-        ref_url = repo_url + f"/git/refs/heads/{safe_branch}"
-        response = await api.get(ref_url)
-        ref_data = response.json()
-        latest_commit_sha = ref_data["object"]["sha"]
-
-        # 2. Get the tree SHA of the latest commit
-        commit_url = repo_url + f"/git/commits/{latest_commit_sha}"
-        response = await api.get(commit_url)
-        commit_data = response.json()
-        base_tree_sha = commit_data["tree"]["sha"]
-
-        # 3. Build tree entries for each file operation (blobs created concurrently)
-        async def _create_blob(content: str, encoding: str = "utf-8") -> str:
-            blob_url = repo_url + "/git/blobs"
-            blob_response = await api.post(
-                blob_url,
-                json={"content": content, "encoding": encoding},
-            )
-            return blob_response.json()["sha"]
-
-        tree_entries: list[dict] = []
-        upsert_files = []
-        for file_op in files:
-            path = file_op["path"]
-            operation = FileOperation(file_op.get("operation", "upsert"))
-
-            if operation == FileOperation.DELETE:
-                tree_entries.append(
-                    {
-                        "path": path,
-                        "mode": "100644",
-                        "type": "blob",
-                        "sha": None,  # null SHA = delete
-                    }
-                )
-            else:
-                upsert_files.append((path, file_op.get("content", "")))
-
-        # Create all blobs concurrently. Data URIs (from store_media_file)
-        # are sent as base64 blobs to preserve binary content.
-        if upsert_files:
-
-            async def _make_blob(content: str) -> str:
-                parsed = parse_data_uri(content)
-                if parsed is not None:
-                    _, b64_payload = parsed
-                    return await _create_blob(b64_payload, encoding="base64")
-                return await _create_blob(content)
-
-            blob_shas = await asyncio.gather(
-                *[_make_blob(content) for _, content in upsert_files]
-            )
-            for (path, _), blob_sha in zip(upsert_files, blob_shas):
-                tree_entries.append(
-                    {
-                        "path": path,
-                        "mode": "100644",
-                        "type": "blob",
-                        "sha": blob_sha,
-                    }
-                )
-
-        # 4. Create a new tree
-        tree_url = repo_url + "/git/trees"
-        tree_response = await api.post(
-            tree_url,
-            json={"base_tree": base_tree_sha, "tree": tree_entries},
-        )
-        new_tree_sha = tree_response.json()["sha"]
-
-        # 5. Create a new commit
-        new_commit_url = repo_url + "/git/commits"
-        commit_response = await api.post(
-            new_commit_url,
-            json={
-                "message": commit_message,
-                "tree": new_tree_sha,
-                "parents": [latest_commit_sha],
-            },
-        )
-        new_commit_sha = commit_response.json()["sha"]
-
-        # 6. Update the branch reference
-        try:
-            await api.patch(
-                ref_url,
-                json={"sha": new_commit_sha},
-            )
-        except Exception as e:
-            raise RuntimeError(
-                f"Commit {new_commit_sha} was created but failed to update "
-                f"ref heads/{branch}: {e}. "
-                f"You can recover by manually updating the branch to {new_commit_sha}."
-            ) from e
-
-        repo_path = github_repo_path(repo_url)
-        commit_web_url = f"https://github.com/{repo_path}/commit/{new_commit_sha}"
-        return new_commit_sha, commit_web_url
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        execution_context: ExecutionContext,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            # Resolve media references (workspace://, data:, URLs) to data
-            # URIs so _make_blob can send binary content correctly.
-            resolved_files: list[FileOperationInput] = []
-            for file_op in input_data.files:
-                content = file_op.get("content", "")
-                operation = FileOperation(file_op.get("operation", "upsert"))
-                if operation != FileOperation.DELETE:
-                    content = await resolve_media_content(
-                        MediaFileType(content),
-                        execution_context,
-                        return_format="for_external_api",
-                    )
-                resolved_files.append(
-                    FileOperationInput(
-                        path=file_op["path"],
-                        content=MediaFileType(content),
-                        operation=operation,
-                    )
-                )
-
-            sha, url = await self.multi_file_commit(
-                credentials,
-                input_data.repo_url,
-                input_data.branch,
-                input_data.commit_message,
-                resolved_files,
-            )
-            yield "sha", sha
-            yield "url", url
-        except Exception as e:
-            yield "error", str(e)
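Note on the removed multi-file commit helper: it folds upserts and deletes into a single tree payload before creating the commit. Below is a minimal sketch of just that step, assuming the blob SHAs have already been created. `TreeEntry`, `build_tree_entries`, and the parameter names are hypothetical and do not appear in the codebase.

```python
from typing import Optional, TypedDict


class TreeEntry(TypedDict):
    path: str
    mode: str
    type: str
    sha: Optional[str]


def build_tree_entries(
    deleted_paths: list[str], blob_shas: dict[str, str]
) -> list[TreeEntry]:
    """Build the "tree" list for a create-tree request (hypothetical helper)."""
    entries: list[TreeEntry] = []
    # A null SHA marks the path for removal relative to the base tree.
    for path in deleted_paths:
        entries.append({"path": path, "mode": "100644", "type": "blob", "sha": None})
    # Each upserted path points at a blob that was created beforehand.
    for path, sha in blob_shas.items():
        entries.append({"path": path, "mode": "100644", "type": "blob", "sha": sha})
    return entries


entries = build_tree_entries(["old.txt"], {"README.md": "abc123"})
```

The resulting list would be sent together with `base_tree`, so paths not mentioned in it are carried over unchanged.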
--- a/autogpt_platform/backend/backend/blocks/github/pull_requests.py
+++ b/autogpt_platform/backend/backend/blocks/github/pull_requests.py
@@ -1,5 +1,4 @@
 import re
-from typing import Literal

 from typing_extensions import TypedDict

@@ -21,8 +20,6 @@ from ._auth import (
    GithubCredentialsInput,
 )

-MergeMethod = Literal["merge", "squash", "rebase"]
-

 class GithubListPullRequestsBlock(Block):
    class Input(BlockSchemaInput):
@@ -561,109 +558,12 @@ class GithubListPRReviewersBlock(Block):
            yield "reviewer", reviewer


-class GithubMergePullRequestBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        pr_url: str = SchemaField(
-            description="URL of the GitHub pull request",
-            placeholder="https://github.com/owner/repo/pull/1",
-        )
-        merge_method: MergeMethod = SchemaField(
-            description="Merge method to use: merge, squash, or rebase",
-            default="merge",
-        )
-        commit_title: str = SchemaField(
-            description="Title for the merge commit (optional, used for merge and squash)",
-            default="",
-        )
-        commit_message: str = SchemaField(
-            description="Message for the merge commit (optional, used for merge and squash)",
-            default="",
-        )
-
-    class Output(BlockSchemaOutput):
-        sha: str = SchemaField(description="SHA of the merge commit")
-        merged: bool = SchemaField(description="Whether the PR was merged")
-        message: str = SchemaField(description="Merge status message")
-        error: str = SchemaField(description="Error message if the merge failed")
-
-    def __init__(self):
-        super().__init__(
-            id="77456c22-33d8-4fd4-9eef-50b46a35bb48",
-            description="This block merges a pull request using merge, squash, or rebase.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubMergePullRequestBlock.Input,
-            output_schema=GithubMergePullRequestBlock.Output,
-            test_input={
-                "pr_url": "https://github.com/owner/repo/pull/1",
-                "merge_method": "squash",
-                "commit_title": "",
-                "commit_message": "",
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[
-                ("sha", "abc123"),
-                ("merged", True),
-                ("message", "Pull Request successfully merged"),
-            ],
-            test_mock={
-                "merge_pr": lambda *args, **kwargs: (
-                    "abc123",
-                    True,
-                    "Pull Request successfully merged",
-                )
-            },
-            is_sensitive_action=True,
-        )
-
-    @staticmethod
-    async def merge_pr(
-        credentials: GithubCredentials,
-        pr_url: str,
-        merge_method: MergeMethod,
-        commit_title: str,
-        commit_message: str,
-    ) -> tuple[str, bool, str]:
-        api = get_api(credentials)
-        merge_url = prepare_pr_api_url(pr_url=pr_url, path="merge")
-        data: dict[str, str] = {"merge_method": merge_method}
-        if commit_title:
-            data["commit_title"] = commit_title
-        if commit_message:
-            data["commit_message"] = commit_message
-        response = await api.put(merge_url, json=data)
-        result = response.json()
-        return result["sha"], result["merged"], result["message"]
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            sha, merged, message = await self.merge_pr(
-                credentials,
-                input_data.pr_url,
-                input_data.merge_method,
-                input_data.commit_title,
-                input_data.commit_message,
-            )
-            yield "sha", sha
-            yield "merged", merged
-            yield "message", message
-        except Exception as e:
-            yield "error", str(e)
-
-
 def prepare_pr_api_url(pr_url: str, path: str) -> str:
    # Pattern to capture the base repository URL and the pull request number
-    pattern = r"^(?:(https?)://)?([^/]+/[^/]+/[^/]+)/pull/(\d+)"
+    pattern = r"^(?:https?://)?([^/]+/[^/]+/[^/]+)/pull/(\d+)"
    match = re.match(pattern, pr_url)
    if not match:
        return pr_url

-    scheme, base_url, pr_number = match.groups()
-    return f"{scheme or 'https'}://{base_url}/pulls/{pr_number}/{path}"
+    base_url, pr_number = match.groups()
+    return f"{base_url}/pulls/{pr_number}/{path}"
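Note on the `prepare_pr_api_url` change: the new pattern no longer captures the scheme, so the returned string starts at the host. The standalone sketch below copies the added lines to show the resulting values. Whether the API client re-adds a scheme or rewrites the host is not visible in this diff, so treat that part as an open assumption.

```python
import re

# Pattern copied from the added line; the scheme is matched but not captured.
PR_PATTERN = r"^(?:https?://)?([^/]+/[^/]+/[^/]+)/pull/(\d+)"


def prepare_pr_api_url(pr_url: str, path: str) -> str:
    match = re.match(PR_PATTERN, pr_url)
    if not match:
        # Inputs that do not look like a PR URL are returned untouched.
        return pr_url
    base_url, pr_number = match.groups()
    return f"{base_url}/pulls/{pr_number}/{path}"


with_scheme = prepare_pr_api_url("https://github.com/owner/repo/pull/1", "merge")
without_scheme = prepare_pr_api_url("github.com/owner/repo/pull/1", "merge")
unmatched = prepare_pr_api_url("not-a-pr-url", "merge")
```

Both matching inputs yield the same scheme-less string, whereas the removed version would have prefixed `https://` by default.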
--- a/autogpt_platform/backend/backend/blocks/github/repo.py
+++ b/autogpt_platform/backend/backend/blocks/github/repo.py
@@ -1,3 +1,5 @@
+import base64
+
 from typing_extensions import TypedDict

 from backend.blocks._base import (
@@ -17,7 +19,6 @@ from ._auth import (
    GithubCredentialsField,
    GithubCredentialsInput,
 )
-from ._utils import github_repo_path


 class GithubListTagsBlock(Block):
@@ -88,7 +89,7 @@ class GithubListTagsBlock(Block):
        tags_url = repo_url + "/tags"
        response = await api.get(tags_url)
        data = response.json()
-        repo_path = github_repo_path(repo_url)
+        repo_path = repo_url.replace("https://github.com/", "")
        tags: list[GithubListTagsBlock.Output.TagItem] = [
            {
                "name": tag["name"],
@@ -114,6 +115,101 @@ class GithubListTagsBlock(Block):
            yield "tag", tag


+class GithubListBranchesBlock(Block):
+    class Input(BlockSchemaInput):
+        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
+        repo_url: str = SchemaField(
+            description="URL of the GitHub repository",
+            placeholder="https://github.com/owner/repo",
+        )
+
+    class Output(BlockSchemaOutput):
+        class BranchItem(TypedDict):
+            name: str
+            url: str
+
+        branch: BranchItem = SchemaField(
+            title="Branch",
+            description="Branch with its name and file tree browser URL",
+        )
+        branches: list[BranchItem] = SchemaField(
+            description="List of branches with their name and file tree browser URL"
+        )
+
+    def __init__(self):
+        super().__init__(
+            id="74243e49-2bec-4916-8bf4-db43d44aead5",
+            description="This block lists all branches for a specified GitHub repository.",
+            categories={BlockCategory.DEVELOPER_TOOLS},
+            input_schema=GithubListBranchesBlock.Input,
+            output_schema=GithubListBranchesBlock.Output,
+            test_input={
+                "repo_url": "https://github.com/owner/repo",
+                "credentials": TEST_CREDENTIALS_INPUT,
+            },
+            test_credentials=TEST_CREDENTIALS,
+            test_output=[
+                (
+                    "branches",
+                    [
+                        {
+                            "name": "main",
+                            "url": "https://github.com/owner/repo/tree/main",
+                        }
+                    ],
+                ),
+                (
+                    "branch",
+                    {
+                        "name": "main",
+                        "url": "https://github.com/owner/repo/tree/main",
+                    },
+                ),
+            ],
+            test_mock={
+                "list_branches": lambda *args, **kwargs: [
+                    {
+                        "name": "main",
+                        "url": "https://github.com/owner/repo/tree/main",
+                    }
+                ]
+            },
+        )
+
+    @staticmethod
+    async def list_branches(
+        credentials: GithubCredentials, repo_url: str
+    ) -> list[Output.BranchItem]:
+        api = get_api(credentials)
+        branches_url = repo_url + "/branches"
+        response = await api.get(branches_url)
+        data = response.json()
+        repo_path = repo_url.replace("https://github.com/", "")
+        branches: list[GithubListBranchesBlock.Output.BranchItem] = [
+            {
+                "name": branch["name"],
+                "url": f"https://github.com/{repo_path}/tree/{branch['name']}",
+            }
+            for branch in data
+        ]
+        return branches
+
+    async def run(
+        self,
+        input_data: Input,
+        *,
+        credentials: GithubCredentials,
+        **kwargs,
+    ) -> BlockOutput:
+        branches = await self.list_branches(
+            credentials,
+            input_data.repo_url,
+        )
+        yield "branches", branches
+        for branch in branches:
+            yield "branch", branch
+
+
 class GithubListDiscussionsBlock(Block):
    class Input(BlockSchemaInput):
        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
@@ -187,7 +283,7 @@ class GithubListDiscussionsBlock(Block):
    ) -> list[Output.DiscussionItem]:
        api = get_api(credentials)
        # GitHub GraphQL API endpoint is different; we'll use api.post with custom URL
-        repo_path = github_repo_path(repo_url)
+        repo_path = repo_url.replace("https://github.com/", "")
        owner, repo = repo_path.split("/")
        query = """
        query($owner: String!, $repo: String!, $num: Int!) {
@@ -320,6 +416,564 @@ class GithubListReleasesBlock(Block):
            yield "release", release


+class GithubReadFileBlock(Block):
+    class Input(BlockSchemaInput):
+        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
+        repo_url: str = SchemaField(
+            description="URL of the GitHub repository",
+            placeholder="https://github.com/owner/repo",
+        )
+        file_path: str = SchemaField(
+            description="Path to the file in the repository",
+            placeholder="path/to/file",
+        )
+        branch: str = SchemaField(
+            description="Branch to read from",
+            placeholder="branch_name",
+            default="master",
+        )
+
+    class Output(BlockSchemaOutput):
+        text_content: str = SchemaField(
+            description="Content of the file (decoded as UTF-8 text)"
+        )
+        raw_content: str = SchemaField(
+            description="Raw base64-encoded content of the file"
+        )
+        size: int = SchemaField(description="The size of the file (in bytes)")
+
+    def __init__(self):
+        super().__init__(
+            id="87ce6c27-5752-4bbc-8e26-6da40a3dcfd3",
+            description="This block reads the content of a specified file from a GitHub repository.",
+            categories={BlockCategory.DEVELOPER_TOOLS},
+            input_schema=GithubReadFileBlock.Input,
+            output_schema=GithubReadFileBlock.Output,
+            test_input={
+                "repo_url": "https://github.com/owner/repo",
+                "file_path": "path/to/file",
+                "branch": "master",
+                "credentials": TEST_CREDENTIALS_INPUT,
+            },
+            test_credentials=TEST_CREDENTIALS,
+            test_output=[
+                ("raw_content", "RmlsZSBjb250ZW50"),
+                ("text_content", "File content"),
+                ("size", 13),
+            ],
+            test_mock={"read_file": lambda *args, **kwargs: ("RmlsZSBjb250ZW50", 13)},
+        )
+
+    @staticmethod
+    async def read_file(
+        credentials: GithubCredentials, repo_url: str, file_path: str, branch: str
+    ) -> tuple[str, int]:
+        api = get_api(credentials)
+        content_url = repo_url + f"/contents/{file_path}?ref={branch}"
+        response = await api.get(content_url)
+        data = response.json()
+
+        if isinstance(data, list):
+            # Multiple entries of different types exist at this path
+            if not (file := next((f for f in data if f["type"] == "file"), None)):
+                raise TypeError("Not a file")
+            data = file
+
+        if data["type"] != "file":
+            raise TypeError("Not a file")
+
+        return data["content"], data["size"]
+
+    async def run(
+        self,
+        input_data: Input,
+        *,
+        credentials: GithubCredentials,
+        **kwargs,
+    ) -> BlockOutput:
+        content, size = await self.read_file(
+            credentials,
+            input_data.repo_url,
+            input_data.file_path,
+            input_data.branch,
+        )
+        yield "raw_content", content
+        yield "text_content", base64.b64decode(content).decode("utf-8")
+        yield "size", size
+
+
+class GithubReadFolderBlock(Block):
+    class Input(BlockSchemaInput):
+        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
+        repo_url: str = SchemaField(
+            description="URL of the GitHub repository",
+            placeholder="https://github.com/owner/repo",
+        )
+        folder_path: str = SchemaField(
+            description="Path to the folder in the repository",
+            placeholder="path/to/folder",
+        )
+        branch: str = SchemaField(
+            description="Branch name to read from (defaults to master)",
+            placeholder="branch_name",
+            default="master",
+        )
+
+    class Output(BlockSchemaOutput):
+        class DirEntry(TypedDict):
+            name: str
+            path: str
+
+        class FileEntry(TypedDict):
+            name: str
+            path: str
+            size: int
+
+        file: FileEntry = SchemaField(description="Files in the folder")
+        dir: DirEntry = SchemaField(description="Directories in the folder")
+        error: str = SchemaField(
+            description="Error message if reading the folder failed"
+        )
+
+    def __init__(self):
+        super().__init__(
+            id="1355f863-2db3-4d75-9fba-f91e8a8ca400",
+            description="This block reads the content of a specified folder from a GitHub repository.",
+            categories={BlockCategory.DEVELOPER_TOOLS},
+            input_schema=GithubReadFolderBlock.Input,
+            output_schema=GithubReadFolderBlock.Output,
+            test_input={
+                "repo_url": "https://github.com/owner/repo",
+                "folder_path": "path/to/folder",
+                "branch": "master",
+                "credentials": TEST_CREDENTIALS_INPUT,
+            },
+            test_credentials=TEST_CREDENTIALS,
+            test_output=[
+                (
+                    "file",
+                    {
+                        "name": "file1.txt",
+                        "path": "path/to/folder/file1.txt",
+                        "size": 1337,
+                    },
+                ),
+                ("dir", {"name": "dir2", "path": "path/to/folder/dir2"}),
+            ],
+            test_mock={
+                "read_folder": lambda *args, **kwargs: (
+                    [
+                        {
+                            "name": "file1.txt",
+                            "path": "path/to/folder/file1.txt",
+                            "size": 1337,
+                        }
+                    ],
+                    [{"name": "dir2", "path": "path/to/folder/dir2"}],
+                )
+            },
+        )
+
+    @staticmethod
+    async def read_folder(
+        credentials: GithubCredentials, repo_url: str, folder_path: str, branch: str
+    ) -> tuple[list[Output.FileEntry], list[Output.DirEntry]]:
+        api = get_api(credentials)
+        contents_url = repo_url + f"/contents/{folder_path}?ref={branch}"
+        response = await api.get(contents_url)
+        data = response.json()
+
+        if not isinstance(data, list):
+            raise TypeError("Not a folder")
+
+        files: list[GithubReadFolderBlock.Output.FileEntry] = [
+            GithubReadFolderBlock.Output.FileEntry(
+                name=entry["name"],
+                path=entry["path"],
+                size=entry["size"],
+            )
+            for entry in data
+            if entry["type"] == "file"
+        ]
+
+        dirs: list[GithubReadFolderBlock.Output.DirEntry] = [
+            GithubReadFolderBlock.Output.DirEntry(
+                name=entry["name"],
+                path=entry["path"],
+            )
+            for entry in data
+            if entry["type"] == "dir"
+        ]
+
+        return files, dirs
+
+    async def run(
+        self,
+        input_data: Input,
+        *,
+        credentials: GithubCredentials,
+        **kwargs,
+    ) -> BlockOutput:
+        files, dirs = await self.read_folder(
+            credentials,
+            input_data.repo_url,
+            input_data.folder_path.lstrip("/"),
+            input_data.branch,
+        )
+        for file in files:
+            yield "file", file
+        for dir in dirs:
+            yield "dir", dir
+
+
+class GithubMakeBranchBlock(Block):
+    class Input(BlockSchemaInput):
+        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
+        repo_url: str = SchemaField(
+            description="URL of the GitHub repository",
+            placeholder="https://github.com/owner/repo",
+        )
+        new_branch: str = SchemaField(
+            description="Name of the new branch",
+            placeholder="new_branch_name",
+        )
+        source_branch: str = SchemaField(
+            description="Name of the source branch",
+            placeholder="source_branch_name",
+        )
+
+    class Output(BlockSchemaOutput):
+        status: str = SchemaField(description="Status of the branch creation operation")
+        error: str = SchemaField(
+            description="Error message if the branch creation failed"
+        )
+
+    def __init__(self):
+        super().__init__(
+            id="944cc076-95e7-4d1b-b6b6-b15d8ee5448d",
+            description="This block creates a new branch from a specified source branch.",
+            categories={BlockCategory.DEVELOPER_TOOLS},
+            input_schema=GithubMakeBranchBlock.Input,
+            output_schema=GithubMakeBranchBlock.Output,
+            test_input={
+                "repo_url": "https://github.com/owner/repo",
+                "new_branch": "new_branch_name",
+                "source_branch": "source_branch_name",
+                "credentials": TEST_CREDENTIALS_INPUT,
+            },
+            test_credentials=TEST_CREDENTIALS,
+            test_output=[("status", "Branch created successfully")],
+            test_mock={
+                "create_branch": lambda *args, **kwargs: "Branch created successfully"
+            },
+        )
+
+    @staticmethod
+    async def create_branch(
+        credentials: GithubCredentials,
+        repo_url: str,
+        new_branch: str,
+        source_branch: str,
+    ) -> str:
+        api = get_api(credentials)
+        ref_url = repo_url + f"/git/refs/heads/{source_branch}"
+        response = await api.get(ref_url)
+        data = response.json()
+        sha = data["object"]["sha"]
+
+        # Create the new branch
+        new_ref_url = repo_url + "/git/refs"
+        data = {
+            "ref": f"refs/heads/{new_branch}",
+            "sha": sha,
+        }
+        await api.post(new_ref_url, json=data)
+        return "Branch created successfully"
+
+    async def run(
+        self,
+        input_data: Input,
+        *,
+        credentials: GithubCredentials,
+        **kwargs,
+    ) -> BlockOutput:
+        status = await self.create_branch(
+            credentials,
+            input_data.repo_url,
+            input_data.new_branch,
+            input_data.source_branch,
+        )
+        yield "status", status
+
+
+class GithubDeleteBranchBlock(Block):
+    class Input(BlockSchemaInput):
+        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
+        repo_url: str = SchemaField(
+            description="URL of the GitHub repository",
+            placeholder="https://github.com/owner/repo",
+        )
+        branch: str = SchemaField(
+            description="Name of the branch to delete",
+            placeholder="branch_name",
+        )
+
+    class Output(BlockSchemaOutput):
+        status: str = SchemaField(description="Status of the branch deletion operation")
+        error: str = SchemaField(
+            description="Error message if the branch deletion failed"
+        )
+
+    def __init__(self):
+        super().__init__(
+            id="0d4130f7-e0ab-4d55-adc3-0a40225e80f4",
+            description="This block deletes a specified branch.",
+            categories={BlockCategory.DEVELOPER_TOOLS},
+            input_schema=GithubDeleteBranchBlock.Input,
+            output_schema=GithubDeleteBranchBlock.Output,
+            test_input={
+                "repo_url": "https://github.com/owner/repo",
+                "branch": "branch_name",
+                "credentials": TEST_CREDENTIALS_INPUT,
+            },
+            test_credentials=TEST_CREDENTIALS,
+            test_output=[("status", "Branch deleted successfully")],
+            test_mock={
+                "delete_branch": lambda *args, **kwargs: "Branch deleted successfully"
+            },
+        )
+
+    @staticmethod
+    async def delete_branch(
+        credentials: GithubCredentials, repo_url: str, branch: str
+    ) -> str:
+        api = get_api(credentials)
+        ref_url = repo_url + f"/git/refs/heads/{branch}"
+        await api.delete(ref_url)
+        return "Branch deleted successfully"
+
+    async def run(
+        self,
+        input_data: Input,
+        *,
+        credentials: GithubCredentials,
+        **kwargs,
+    ) -> BlockOutput:
+        status = await self.delete_branch(
+            credentials,
+            input_data.repo_url,
+            input_data.branch,
+        )
+        yield "status", status
+
+
+class GithubCreateFileBlock(Block):
+    class Input(BlockSchemaInput):
+        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
+        repo_url: str = SchemaField(
+            description="URL of the GitHub repository",
+            placeholder="https://github.com/owner/repo",
+        )
+        file_path: str = SchemaField(
+            description="Path where the file should be created",
+            placeholder="path/to/file.txt",
+        )
+        content: str = SchemaField(
+            description="Content to write to the file",
+            placeholder="File content here",
+        )
+        branch: str = SchemaField(
+            description="Branch where the file should be created",
+            default="main",
+        )
+        commit_message: str = SchemaField(
+            description="Message for the commit",
+            default="Create new file",
+        )
+
+    class Output(BlockSchemaOutput):
+        url: str = SchemaField(description="URL of the created file")
+        sha: str = SchemaField(description="SHA of the commit")
+        error: str = SchemaField(
+            description="Error message if the file creation failed"
+        )
+
+    def __init__(self):
+        super().__init__(
+            id="8fd132ac-b917-428a-8159-d62893e8a3fe",
+            description="This block creates a new file in a GitHub repository.",
+            categories={BlockCategory.DEVELOPER_TOOLS},
+            input_schema=GithubCreateFileBlock.Input,
+            output_schema=GithubCreateFileBlock.Output,
+            test_input={
+                "repo_url": "https://github.com/owner/repo",
+                "file_path": "test/file.txt",
+                "content": "Test content",
+                "branch": "main",
+                "commit_message": "Create test file",
+                "credentials": TEST_CREDENTIALS_INPUT,
+            },
+            test_credentials=TEST_CREDENTIALS,
+            test_output=[
+                ("url", "https://github.com/owner/repo/blob/main/test/file.txt"),
+                ("sha", "abc123"),
+            ],
+            test_mock={
+                "create_file": lambda *args, **kwargs: (
+                    "https://github.com/owner/repo/blob/main/test/file.txt",
+                    "abc123",
+                )
+            },
+        )
+
+    @staticmethod
+    async def create_file(
+        credentials: GithubCredentials,
+        repo_url: str,
+        file_path: str,
+        content: str,
+        branch: str,
+        commit_message: str,
+    ) -> tuple[str, str]:
+        api = get_api(credentials)
+        contents_url = repo_url + f"/contents/{file_path}"
+        content_base64 = base64.b64encode(content.encode()).decode()
+        data = {
+            "message": commit_message,
+            "content": content_base64,
+            "branch": branch,
+        }
+        response = await api.put(contents_url, json=data)
+        data = response.json()
+        return data["content"]["html_url"], data["commit"]["sha"]
+
+    async def run(
+        self,
+        input_data: Input,
+        *,
+        credentials: GithubCredentials,
+        **kwargs,
+    ) -> BlockOutput:
+        try:
+            url, sha = await self.create_file(
+                credentials,
+                input_data.repo_url,
+                input_data.file_path,
+                input_data.content,
+                input_data.branch,
+                input_data.commit_message,
+            )
+            yield "url", url
+            yield "sha", sha
+        except Exception as e:
+            yield "error", str(e)
+
+
+class GithubUpdateFileBlock(Block):
+    class Input(BlockSchemaInput):
+        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
+        repo_url: str = SchemaField(
+            description="URL of the GitHub repository",
+            placeholder="https://github.com/owner/repo",
+        )
+        file_path: str = SchemaField(
+            description="Path to the file to update",
+            placeholder="path/to/file.txt",
+        )
+        content: str = SchemaField(
+            description="New content for the file",
+            placeholder="Updated content here",
+        )
+        branch: str = SchemaField(
+            description="Branch containing the file",
+            default="main",
+        )
+        commit_message: str = SchemaField(
+            description="Message for the commit",
+            default="Update file",
+        )
+
+    class Output(BlockSchemaOutput):
+        url: str = SchemaField(description="URL of the updated file")
+        sha: str = SchemaField(description="SHA of the commit")
+
+    def __init__(self):
+        super().__init__(
+            id="30be12a4-57cb-4aa4-baf5-fcc68d136076",
+            description="This block updates an existing file in a GitHub repository.",
+            categories={BlockCategory.DEVELOPER_TOOLS},
+            input_schema=GithubUpdateFileBlock.Input,
+            output_schema=GithubUpdateFileBlock.Output,
+            test_input={
+                "repo_url": "https://github.com/owner/repo",
+                "file_path": "test/file.txt",
+                "content": "Updated content",
+                "branch": "main",
+                "commit_message": "Update test file",
+                "credentials": TEST_CREDENTIALS_INPUT,
+            },
+            test_credentials=TEST_CREDENTIALS,
+            test_output=[
+                ("url", "https://github.com/owner/repo/blob/main/test/file.txt"),
+                ("sha", "def456"),
+            ],
+            test_mock={
+                "update_file": lambda *args, **kwargs: (
+                    "https://github.com/owner/repo/blob/main/test/file.txt",
+                    "def456",
+                )
+            },
+        )
+
+    @staticmethod
+    async def update_file(
+        credentials: GithubCredentials,
+        repo_url: str,
+        file_path: str,
+        content: str,
+        branch: str,
+        commit_message: str,
+    ) -> tuple[str, str]:
+        api = get_api(credentials)
+        contents_url = repo_url + f"/contents/{file_path}"
+        params = {"ref": branch}
+        response = await api.get(contents_url, params=params)
+        data = response.json()
+
+        # Convert new content to base64
+        content_base64 = base64.b64encode(content.encode()).decode()
+        data = {
+            "message": commit_message,
+            "content": content_base64,
+            "sha": data["sha"],
+            "branch": branch,
+        }
+        response = await api.put(contents_url, json=data)
+        data = response.json()
+        return data["content"]["html_url"], data["commit"]["sha"]
+
+    async def run(
+        self,
+        input_data: Input,
+        *,
+        credentials: GithubCredentials,
+        **kwargs,
+    ) -> BlockOutput:
+        try:
+            url, sha = await self.update_file(
+                credentials,
+                input_data.repo_url,
+                input_data.file_path,
+                input_data.content,
+                input_data.branch,
+                input_data.commit_message,
+            )
+            yield "url", url
+            yield "sha", sha
+        except Exception as e:
+            yield "error", str(e)
+
+
 class GithubCreateRepositoryBlock(Block):
    class Input(BlockSchemaInput):
        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
@@ -449,7 +1103,7 @@ class GithubListStargazersBlock(Block):

    def __init__(self):
        super().__init__(
-            id="e96d01ec-b55e-4a99-8ce8-c8776dce850b",  # Generated unique UUID
+            id="a4b9c2d1-e5f6-4g7h-8i9j-0k1l2m3n4o5p",  # Generated unique UUID
            description="This block lists all users who have starred a specified GitHub repository.",
            categories={BlockCategory.DEVELOPER_TOOLS},
            input_schema=GithubListStargazersBlock.Input,
@@ -518,230 +1172,3 @@ class GithubListStargazersBlock(Block):
        yield "stargazers", stargazers
        for stargazer in stargazers:
            yield "stargazer", stargazer
-
-
-class GithubGetRepositoryInfoBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository",
-            placeholder="https://github.com/owner/repo",
-        )
-
-    class Output(BlockSchemaOutput):
-        name: str = SchemaField(description="Repository name")
-        full_name: str = SchemaField(description="Full repository name (owner/repo)")
-        description: str = SchemaField(description="Repository description")
-        default_branch: str = SchemaField(description="Default branch name (e.g. main)")
-        private: bool = SchemaField(description="Whether the repository is private")
-        html_url: str = SchemaField(description="Web URL of the repository")
-        clone_url: str = SchemaField(description="Git clone URL")
-        stars: int = SchemaField(description="Number of stars")
-        forks: int = SchemaField(description="Number of forks")
-        open_issues: int = SchemaField(description="Number of open issues")
-        error: str = SchemaField(
-            description="Error message if fetching repo info failed"
-        )
-
-    def __init__(self):
-        super().__init__(
-            id="59d4f241-968a-4040-95da-348ac5c5ce27",
-            description="This block retrieves metadata about a GitHub repository.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubGetRepositoryInfoBlock.Input,
-            output_schema=GithubGetRepositoryInfoBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[
-                ("name", "repo"),
-                ("full_name", "owner/repo"),
-                ("description", "A test repo"),
-                ("default_branch", "main"),
-                ("private", False),
-                ("html_url", "https://github.com/owner/repo"),
-                ("clone_url", "https://github.com/owner/repo.git"),
-                ("stars", 42),
-                ("forks", 5),
-                ("open_issues", 3),
-            ],
-            test_mock={
-                "get_repo_info": lambda *args, **kwargs: {
-                    "name": "repo",
-                    "full_name": "owner/repo",
-                    "description": "A test repo",
-                    "default_branch": "main",
-                    "private": False,
-                    "html_url": "https://github.com/owner/repo",
-                    "clone_url": "https://github.com/owner/repo.git",
-                    "stargazers_count": 42,
-                    "forks_count": 5,
-                    "open_issues_count": 3,
-                }
-            },
-        )
-
-    @staticmethod
-    async def get_repo_info(credentials: GithubCredentials, repo_url: str) -> dict:
-        api = get_api(credentials)
-        response = await api.get(repo_url)
-        return response.json()
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            data = await self.get_repo_info(credentials, input_data.repo_url)
-            yield "name", data["name"]
-            yield "full_name", data["full_name"]
-            yield "description", data.get("description", "") or ""
-            yield "default_branch", data["default_branch"]
-            yield "private", data["private"]
-            yield "html_url", data["html_url"]
-            yield "clone_url", data["clone_url"]
-            yield "stars", data["stargazers_count"]
-            yield "forks", data["forks_count"]
-            yield "open_issues", data["open_issues_count"]
-        except Exception as e:
-            yield "error", str(e)
-
-
-class GithubForkRepositoryBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository to fork",
-            placeholder="https://github.com/owner/repo",
-        )
-        organization: str = SchemaField(
-            description="Organization to fork into (leave empty to fork to your account)",
-            default="",
-        )
-
-    class Output(BlockSchemaOutput):
-        url: str = SchemaField(description="URL of the forked repository")
-        clone_url: str = SchemaField(description="Git clone URL of the fork")
-        full_name: str = SchemaField(description="Full name of the fork (owner/repo)")
-        error: str = SchemaField(description="Error message if the fork failed")
-
-    def __init__(self):
-        super().__init__(
-            id="a439f2f4-835f-4dae-ba7b-0205ffa70be6",
-            description="This block forks a GitHub repository to your account or an organization.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubForkRepositoryBlock.Input,
-            output_schema=GithubForkRepositoryBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "organization": "",
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[
-                ("url", "https://github.com/myuser/repo"),
-                ("clone_url", "https://github.com/myuser/repo.git"),
-                ("full_name", "myuser/repo"),
-            ],
-            test_mock={
-                "fork_repo": lambda *args, **kwargs: (
-                    "https://github.com/myuser/repo",
-                    "https://github.com/myuser/repo.git",
-                    "myuser/repo",
-                )
-            },
-        )
-
-    @staticmethod
-    async def fork_repo(
-        credentials: GithubCredentials,
-        repo_url: str,
-        organization: str,
-    ) -> tuple[str, str, str]:
-        api = get_api(credentials)
-        forks_url = repo_url + "/forks"
-        data: dict[str, str] = {}
-        if organization:
-            data["organization"] = organization
-        response = await api.post(forks_url, json=data)
-        result = response.json()
-        return result["html_url"], result["clone_url"], result["full_name"]
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            url, clone_url, full_name = await self.fork_repo(
-                credentials,
-                input_data.repo_url,
-                input_data.organization,
-            )
-            yield "url", url
-            yield "clone_url", clone_url
-            yield "full_name", full_name
-        except Exception as e:
-            yield "error", str(e)
-
-
-class GithubStarRepositoryBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository to star",
-            placeholder="https://github.com/owner/repo",
-        )
-
-    class Output(BlockSchemaOutput):
-        status: str = SchemaField(description="Status of the star operation")
-        error: str = SchemaField(description="Error message if starring failed")
-
-    def __init__(self):
-        super().__init__(
-            id="bd700764-53e3-44dd-a969-d1854088458f",
-            description="This block stars a GitHub repository.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubStarRepositoryBlock.Input,
-            output_schema=GithubStarRepositoryBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[("status", "Repository starred successfully")],
-            test_mock={
-                "star_repo": lambda *args, **kwargs: "Repository starred successfully"
-            },
-        )
-
-    @staticmethod
-    async def star_repo(credentials: GithubCredentials, repo_url: str) -> str:
-        api = get_api(credentials, convert_urls=False)
-        repo_path = github_repo_path(repo_url)
-        owner, repo = repo_path.split("/")
-        await api.put(
-            f"https://api.github.com/user/starred/{owner}/{repo}",
-            headers={"Content-Length": "0"},
-        )
-        return "Repository starred successfully"
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            status = await self.star_repo(credentials, input_data.repo_url)
-            yield "status", status
-        except Exception as e:
-            yield "error", str(e)
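Note on `update_file` (added earlier in this file's diff): the update is a two-step flow that first fetches the existing file to obtain its blob SHA, then sends a PUT whose body carries the base64-encoded content together with that SHA. The sketch below isolates only the payload construction. `build_update_payload` is a hypothetical helper introduced for illustration and is not part of the commit; the field names mirror the dictionary built in `update_file`.

```python
import base64


def build_update_payload(
    content: str, current_sha: str, branch: str, commit_message: str
) -> dict[str, str]:
    """Build the JSON body for a file update (hypothetical helper, not in the commit)."""
    return {
        "message": commit_message,
        # The file body is sent base64-encoded, never as raw text.
        "content": base64.b64encode(content.encode()).decode(),
        # SHA of the blob being replaced, taken from the preceding GET.
        "sha": current_sha,
        "branch": branch,
    }


# "abc123" is a placeholder SHA used only for this example.
payload = build_update_payload("Updated content", "abc123", "main", "Update test file")
```

Keeping the payload in a pure function like this would make the encoding step testable without mocking the HTTP client, which is one possible design and not necessarily the one the authors intend.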
--- a/autogpt_platform/backend/backend/blocks/github/repo_branches.py
+++ b/autogpt_platform/backend/backend/blocks/github/repo_branches.py
@@ -1,452 +0,0 @@
-from urllib.parse import quote
-
-from typing_extensions import TypedDict
-
-from backend.blocks._base import (
-    Block,
-    BlockCategory,
-    BlockOutput,
-    BlockSchemaInput,
-    BlockSchemaOutput,
-)
-from backend.data.model import SchemaField
-
-from ._api import get_api
-from ._auth import (
-    TEST_CREDENTIALS,
-    TEST_CREDENTIALS_INPUT,
-    GithubCredentials,
-    GithubCredentialsField,
-    GithubCredentialsInput,
-)
-from ._utils import github_repo_path
-
-
-class GithubListBranchesBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository",
-            placeholder="https://github.com/owner/repo",
-        )
-        per_page: int = SchemaField(
-            description="Number of branches to return per page (max 100)",
-            default=30,
-            ge=1,
-            le=100,
-        )
-        page: int = SchemaField(
-            description="Page number for pagination",
-            default=1,
-            ge=1,
-        )
-
-    class Output(BlockSchemaOutput):
-        class BranchItem(TypedDict):
-            name: str
-            url: str
-
-        branch: BranchItem = SchemaField(
-            title="Branch",
-            description="Branches with their name and file tree browser URL",
-        )
-        branches: list[BranchItem] = SchemaField(
-            description="List of branches with their name and file tree browser URL"
-        )
-        error: str = SchemaField(description="Error message if listing branches failed")
-
-    def __init__(self):
-        super().__init__(
-            id="74243e49-2bec-4916-8bf4-db43d44aead5",
-            description="This block lists all branches for a specified GitHub repository.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubListBranchesBlock.Input,
-            output_schema=GithubListBranchesBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "per_page": 30,
-                "page": 1,
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[
-                (
-                    "branches",
-                    [
-                        {
-                            "name": "main",
-                            "url": "https://github.com/owner/repo/tree/main",
-                        }
-                    ],
-                ),
-                (
-                    "branch",
-                    {
-                        "name": "main",
-                        "url": "https://github.com/owner/repo/tree/main",
-                    },
-                ),
-            ],
-            test_mock={
-                "list_branches": lambda *args, **kwargs: [
-                    {
-                        "name": "main",
-                        "url": "https://github.com/owner/repo/tree/main",
-                    }
-                ]
-            },
-        )
-
-    @staticmethod
-    async def list_branches(
-        credentials: GithubCredentials, repo_url: str, per_page: int, page: int
-    ) -> list[Output.BranchItem]:
-        api = get_api(credentials)
-        branches_url = repo_url + "/branches"
-        response = await api.get(
-            branches_url, params={"per_page": str(per_page), "page": str(page)}
-        )
-        data = response.json()
-        repo_path = github_repo_path(repo_url)
-        branches: list[GithubListBranchesBlock.Output.BranchItem] = [
-            {
-                "name": branch["name"],
-                "url": f"https://github.com/{repo_path}/tree/{branch['name']}",
-            }
-            for branch in data
-        ]
-        return branches
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            branches = await self.list_branches(
-                credentials,
-                input_data.repo_url,
-                input_data.per_page,
-                input_data.page,
-            )
-            yield "branches", branches
-            for branch in branches:
-                yield "branch", branch
-        except Exception as e:
-            yield "error", str(e)
-
-
-class GithubMakeBranchBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository",
-            placeholder="https://github.com/owner/repo",
-        )
-        new_branch: str = SchemaField(
-            description="Name of the new branch",
-            placeholder="new_branch_name",
-        )
-        source_branch: str = SchemaField(
-            description="Name of the source branch",
-            placeholder="source_branch_name",
-        )
-
-    class Output(BlockSchemaOutput):
-        status: str = SchemaField(description="Status of the branch creation operation")
-        error: str = SchemaField(
-            description="Error message if the branch creation failed"
-        )
-
-    def __init__(self):
-        super().__init__(
-            id="944cc076-95e7-4d1b-b6b6-b15d8ee5448d",
-            description="This block creates a new branch from a specified source branch.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubMakeBranchBlock.Input,
-            output_schema=GithubMakeBranchBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "new_branch": "new_branch_name",
-                "source_branch": "source_branch_name",
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[("status", "Branch created successfully")],
-            test_mock={
-                "create_branch": lambda *args, **kwargs: "Branch created successfully"
-            },
-        )
-
-    @staticmethod
-    async def create_branch(
-        credentials: GithubCredentials,
-        repo_url: str,
-        new_branch: str,
-        source_branch: str,
-    ) -> str:
-        api = get_api(credentials)
-        ref_url = repo_url + f"/git/refs/heads/{quote(source_branch, safe='')}"
-        response = await api.get(ref_url)
-        data = response.json()
-        sha = data["object"]["sha"]
-
-        # Create the new branch
-        new_ref_url = repo_url + "/git/refs"
-        data = {
-            "ref": f"refs/heads/{new_branch}",
-            "sha": sha,
-        }
-        response = await api.post(new_ref_url, json=data)
-        return "Branch created successfully"
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            status = await self.create_branch(
-                credentials,
-                input_data.repo_url,
-                input_data.new_branch,
-                input_data.source_branch,
-            )
-            yield "status", status
-        except Exception as e:
-            yield "error", str(e)
-
-
-class GithubDeleteBranchBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository",
-            placeholder="https://github.com/owner/repo",
-        )
-        branch: str = SchemaField(
-            description="Name of the branch to delete",
-            placeholder="branch_name",
-        )
-
-    class Output(BlockSchemaOutput):
-        status: str = SchemaField(description="Status of the branch deletion operation")
-        error: str = SchemaField(
-            description="Error message if the branch deletion failed"
-        )
-
-    def __init__(self):
-        super().__init__(
-            id="0d4130f7-e0ab-4d55-adc3-0a40225e80f4",
-            description="This block deletes a specified branch.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubDeleteBranchBlock.Input,
-            output_schema=GithubDeleteBranchBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "branch": "branch_name",
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[("status", "Branch deleted successfully")],
-            test_mock={
-                "delete_branch": lambda *args, **kwargs: "Branch deleted successfully"
-            },
-            is_sensitive_action=True,
-        )
-
-    @staticmethod
-    async def delete_branch(
-        credentials: GithubCredentials, repo_url: str, branch: str
-    ) -> str:
-        api = get_api(credentials)
-        ref_url = repo_url + f"/git/refs/heads/{quote(branch, safe='')}"
-        await api.delete(ref_url)
-        return "Branch deleted successfully"
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            status = await self.delete_branch(
-                credentials,
-                input_data.repo_url,
-                input_data.branch,
-            )
-            yield "status", status
-        except Exception as e:
-            yield "error", str(e)
-
-
-class GithubCompareBranchesBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository",
-            placeholder="https://github.com/owner/repo",
-        )
-        base: str = SchemaField(
-            description="Base branch or commit SHA",
-            placeholder="main",
-        )
-        head: str = SchemaField(
-            description="Head branch or commit SHA to compare against base",
-            placeholder="feature-branch",
-        )
-
-    class Output(BlockSchemaOutput):
-        class FileChange(TypedDict):
-            filename: str
-            status: str
-            additions: int
-            deletions: int
-            patch: str
-
-        status: str = SchemaField(
-            description="Comparison status: ahead, behind, diverged, or identical"
-        )
-        ahead_by: int = SchemaField(
-            description="Number of commits head is ahead of base"
-        )
-        behind_by: int = SchemaField(
-            description="Number of commits head is behind base"
-        )
-        total_commits: int = SchemaField(
-            description="Total number of commits in the comparison"
-        )
-        diff: str = SchemaField(description="Unified diff of all file changes")
-        file: FileChange = SchemaField(
-            title="Changed File", description="A changed file with its diff"
-        )
-        files: list[FileChange] = SchemaField(
-            description="List of changed files with their diffs"
-        )
-        error: str = SchemaField(description="Error message if comparison failed")
-
-    def __init__(self):
-        super().__init__(
-            id="2e4faa8c-6086-4546-ba77-172d1d560186",
-            description="This block compares two branches or commits in a GitHub repository.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubCompareBranchesBlock.Input,
-            output_schema=GithubCompareBranchesBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "base": "main",
-                "head": "feature",
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[
-                ("status", "ahead"),
-                ("ahead_by", 2),
-                ("behind_by", 0),
-                ("total_commits", 2),
-                ("diff", "+++ b/file.py\n+new line"),
-                (
-                    "files",
-                    [
-                        {
-                            "filename": "file.py",
-                            "status": "modified",
-                            "additions": 1,
-                            "deletions": 0,
-                            "patch": "+new line",
-                        }
-                    ],
-                ),
-                (
-                    "file",
-                    {
-                        "filename": "file.py",
-                        "status": "modified",
-                        "additions": 1,
-                        "deletions": 0,
-                        "patch": "+new line",
-                    },
-                ),
-            ],
-            test_mock={
-                "compare_branches": lambda *args, **kwargs: {
-                    "status": "ahead",
-                    "ahead_by": 2,
-                    "behind_by": 0,
-                    "total_commits": 2,
-                    "files": [
-                        {
-                            "filename": "file.py",
-                            "status": "modified",
-                            "additions": 1,
-                            "deletions": 0,
-                            "patch": "+new line",
-                        }
-                    ],
-                }
-            },
-        )
-
-    @staticmethod
-    async def compare_branches(
-        credentials: GithubCredentials,
-        repo_url: str,
-        base: str,
-        head: str,
-    ) -> dict:
-        api = get_api(credentials)
-        safe_base = quote(base, safe="")
-        safe_head = quote(head, safe="")
-        compare_url = repo_url + f"/compare/{safe_base}...{safe_head}"
-        response = await api.get(compare_url)
-        return response.json()
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            data = await self.compare_branches(
-                credentials,
-                input_data.repo_url,
-                input_data.base,
-                input_data.head,
-            )
-            yield "status", data["status"]
-            yield "ahead_by", data["ahead_by"]
-            yield "behind_by", data["behind_by"]
-            yield "total_commits", data["total_commits"]
-
-            files: list[GithubCompareBranchesBlock.Output.FileChange] = [
-                GithubCompareBranchesBlock.Output.FileChange(
-                    filename=f["filename"],
-                    status=f["status"],
-                    additions=f["additions"],
-                    deletions=f["deletions"],
-                    patch=f.get("patch", ""),
-                )
-                for f in data.get("files", [])
-            ]
-
-            # Build unified diff
-            diff_parts = []
-            for f in data.get("files", []):
-                patch = f.get("patch", "")
-                if patch:
-                    diff_parts.append(f"+++ b/{f['filename']}\n{patch}")
-            yield "diff", "\n".join(diff_parts)
-
-            yield "files", files
-            for file in files:
-                yield "file", file
-        except Exception as e:
-            yield "error", str(e)
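Note on the removed branch blocks above: each one passes branch names through `quote(..., safe='')` before placing them in a URL, so a name such as `feature/login` stays a single path segment. The sketch below shows what that call produces. `encode_ref` and `encode_path` are hypothetical names used only here, and the `safe="/"` variant is included purely for contrast.

```python
from urllib.parse import quote


def encode_ref(ref: str) -> str:
    """Percent-encode a branch name or SHA, including any slashes (hypothetical helper)."""
    return quote(ref, safe="")


def encode_path(path: str) -> str:
    """Percent-encode a repository path but keep '/' as a separator (hypothetical helper)."""
    return quote(path, safe="/")


print(encode_ref("feature/login"))         # feature%2Flogin
print(encode_path("path/to/my file.txt"))  # path/to/my%20file.txt
```

With `safe=''` the slash in a branch name is escaped, while `safe='/'` leaves directory separators intact and escapes only characters such as spaces.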
--- a/autogpt_platform/backend/backend/blocks/github/repo_files.py
+++ b/autogpt_platform/backend/backend/blocks/github/repo_files.py
@@ -1,720 +0,0 @@
-import base64
-from urllib.parse import quote
-
-from typing_extensions import TypedDict
-
-from backend.blocks._base import (
-    Block,
-    BlockCategory,
-    BlockOutput,
-    BlockSchemaInput,
-    BlockSchemaOutput,
-)
-from backend.data.model import SchemaField
-
-from ._api import get_api
-from ._auth import (
-    TEST_CREDENTIALS,
-    TEST_CREDENTIALS_INPUT,
-    GithubCredentials,
-    GithubCredentialsField,
-    GithubCredentialsInput,
-)
-
-
-class GithubReadFileBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository",
-            placeholder="https://github.com/owner/repo",
-        )
-        file_path: str = SchemaField(
-            description="Path to the file in the repository",
-            placeholder="path/to/file",
-        )
-        branch: str = SchemaField(
-            description="Branch to read from",
-            placeholder="branch_name",
-            default="main",
-        )
-
-    class Output(BlockSchemaOutput):
-        text_content: str = SchemaField(
-            description="Content of the file (decoded as UTF-8 text)"
-        )
-        raw_content: str = SchemaField(
-            description="Raw base64-encoded content of the file"
-        )
-        size: int = SchemaField(description="The size of the file (in bytes)")
-        error: str = SchemaField(description="Error message if reading the file failed")
-
-    def __init__(self):
-        super().__init__(
-            id="87ce6c27-5752-4bbc-8e26-6da40a3dcfd3",
-            description="This block reads the content of a specified file from a GitHub repository.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubReadFileBlock.Input,
-            output_schema=GithubReadFileBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "file_path": "path/to/file",
-                "branch": "main",
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[
-                ("raw_content", "RmlsZSBjb250ZW50"),
-                ("text_content", "File content"),
-                ("size", 13),
-            ],
-            test_mock={"read_file": lambda *args, **kwargs: ("RmlsZSBjb250ZW50", 13)},
-        )
-
-    @staticmethod
-    async def read_file(
-        credentials: GithubCredentials, repo_url: str, file_path: str, branch: str
-    ) -> tuple[str, int]:
-        api = get_api(credentials)
-        content_url = (
-            repo_url
-            + f"/contents/{quote(file_path, safe='')}?ref={quote(branch, safe='')}"
-        )
-        response = await api.get(content_url)
-        data = response.json()
-
-        if isinstance(data, list):
-            # Multiple entries of different types exist at this path
-            if not (file := next((f for f in data if f["type"] == "file"), None)):
-                raise TypeError("Not a file")
-            data = file
-
-        if data["type"] != "file":
-            raise TypeError("Not a file")
-
-        return data["content"], data["size"]
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            content, size = await self.read_file(
-                credentials,
-                input_data.repo_url,
-                input_data.file_path,
-                input_data.branch,
-            )
-            yield "raw_content", content
-            yield "text_content", base64.b64decode(content).decode("utf-8")
-            yield "size", size
-        except Exception as e:
-            yield "error", str(e)
-
-
-class GithubReadFolderBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository",
-            placeholder="https://github.com/owner/repo",
-        )
-        folder_path: str = SchemaField(
-            description="Path to the folder in the repository",
-            placeholder="path/to/folder",
-        )
-        branch: str = SchemaField(
-            description="Branch name to read from (defaults to main)",
-            placeholder="branch_name",
-            default="main",
-        )
-
-    class Output(BlockSchemaOutput):
-        class DirEntry(TypedDict):
-            name: str
-            path: str
-
-        class FileEntry(TypedDict):
-            name: str
-            path: str
-            size: int
-
-        file: FileEntry = SchemaField(description="Files in the folder")
-        dir: DirEntry = SchemaField(description="Directories in the folder")
-        error: str = SchemaField(
-            description="Error message if reading the folder failed"
-        )
-
-    def __init__(self):
-        super().__init__(
-            id="1355f863-2db3-4d75-9fba-f91e8a8ca400",
-            description="This block reads the content of a specified folder from a GitHub repository.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubReadFolderBlock.Input,
-            output_schema=GithubReadFolderBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "folder_path": "path/to/folder",
-                "branch": "main",
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[
-                (
-                    "file",
-                    {
-                        "name": "file1.txt",
-                        "path": "path/to/folder/file1.txt",
-                        "size": 1337,
-                    },
-                ),
-                ("dir", {"name": "dir2", "path": "path/to/folder/dir2"}),
-            ],
-            test_mock={
-                "read_folder": lambda *args, **kwargs: (
-                    [
-                        {
-                            "name": "file1.txt",
-                            "path": "path/to/folder/file1.txt",
-                            "size": 1337,
-                        }
-                    ],
-                    [{"name": "dir2", "path": "path/to/folder/dir2"}],
-                )
-            },
-        )
-
-    @staticmethod
-    async def read_folder(
-        credentials: GithubCredentials, repo_url: str, folder_path: str, branch: str
-    ) -> tuple[list[Output.FileEntry], list[Output.DirEntry]]:
-        api = get_api(credentials)
-        contents_url = (
-            repo_url
-            + f"/contents/{quote(folder_path, safe='/')}?ref={quote(branch, safe='')}"
-        )
-        response = await api.get(contents_url)
-        data = response.json()
-
-        if not isinstance(data, list):
-            raise TypeError("Not a folder")
-
-        files: list[GithubReadFolderBlock.Output.FileEntry] = [
-            GithubReadFolderBlock.Output.FileEntry(
-                name=entry["name"],
-                path=entry["path"],
-                size=entry["size"],
-            )
-            for entry in data
-            if entry["type"] == "file"
-        ]
-
-        dirs: list[GithubReadFolderBlock.Output.DirEntry] = [
-            GithubReadFolderBlock.Output.DirEntry(
-                name=entry["name"],
-                path=entry["path"],
-            )
-            for entry in data
-            if entry["type"] == "dir"
-        ]
-
-        return files, dirs
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            files, dirs = await self.read_folder(
-                credentials,
-                input_data.repo_url,
-                input_data.folder_path.lstrip("/"),
-                input_data.branch,
-            )
-            for file in files:
-                yield "file", file
-            for dir in dirs:
-                yield "dir", dir
-        except Exception as e:
-            yield "error", str(e)
-
-
-class GithubCreateFileBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository",
-            placeholder="https://github.com/owner/repo",
-        )
-        file_path: str = SchemaField(
-            description="Path where the file should be created",
-            placeholder="path/to/file.txt",
-        )
-        content: str = SchemaField(
-            description="Content to write to the file",
-            placeholder="File content here",
-        )
-        branch: str = SchemaField(
-            description="Branch where the file should be created",
-            default="main",
-        )
-        commit_message: str = SchemaField(
-            description="Message for the commit",
-            default="Create new file",
-        )
-
-    class Output(BlockSchemaOutput):
-        url: str = SchemaField(description="URL of the created file")
-        sha: str = SchemaField(description="SHA of the commit")
-        error: str = SchemaField(
-            description="Error message if the file creation failed"
-        )
-
-    def __init__(self):
-        super().__init__(
-            id="8fd132ac-b917-428a-8159-d62893e8a3fe",
-            description="This block creates a new file in a GitHub repository.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubCreateFileBlock.Input,
-            output_schema=GithubCreateFileBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "file_path": "test/file.txt",
-                "content": "Test content",
-                "branch": "main",
-                "commit_message": "Create test file",
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[
-                ("url", "https://github.com/owner/repo/blob/main/test/file.txt"),
-                ("sha", "abc123"),
-            ],
-            test_mock={
-                "create_file": lambda *args, **kwargs: (
-                    "https://github.com/owner/repo/blob/main/test/file.txt",
-                    "abc123",
-                )
-            },
-        )
-
-    @staticmethod
-    async def create_file(
-        credentials: GithubCredentials,
-        repo_url: str,
-        file_path: str,
-        content: str,
-        branch: str,
-        commit_message: str,
-    ) -> tuple[str, str]:
-        api = get_api(credentials)
-        contents_url = repo_url + f"/contents/{quote(file_path, safe='/')}"
-        content_base64 = base64.b64encode(content.encode()).decode()
-        data = {
-            "message": commit_message,
-            "content": content_base64,
-            "branch": branch,
-        }
-        response = await api.put(contents_url, json=data)
-        data = response.json()
-        return data["content"]["html_url"], data["commit"]["sha"]
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            url, sha = await self.create_file(
-                credentials,
-                input_data.repo_url,
-                input_data.file_path,
-                input_data.content,
-                input_data.branch,
-                input_data.commit_message,
-            )
-            yield "url", url
-            yield "sha", sha
-        except Exception as e:
-            yield "error", str(e)
-
-
-class GithubUpdateFileBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository",
-            placeholder="https://github.com/owner/repo",
-        )
-        file_path: str = SchemaField(
-            description="Path to the file to update",
-            placeholder="path/to/file.txt",
-        )
-        content: str = SchemaField(
-            description="New content for the file",
-            placeholder="Updated content here",
-        )
-        branch: str = SchemaField(
-            description="Branch containing the file",
-            default="main",
-        )
-        commit_message: str = SchemaField(
-            description="Message for the commit",
-            default="Update file",
-        )
-
-    class Output(BlockSchemaOutput):
-        url: str = SchemaField(description="URL of the updated file")
-        sha: str = SchemaField(description="SHA of the commit")
-
-    def __init__(self):
-        super().__init__(
-            id="30be12a4-57cb-4aa4-baf5-fcc68d136076",
-            description="This block updates an existing file in a GitHub repository.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubUpdateFileBlock.Input,
-            output_schema=GithubUpdateFileBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "file_path": "test/file.txt",
-                "content": "Updated content",
-                "branch": "main",
-                "commit_message": "Update test file",
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[
-                ("url", "https://github.com/owner/repo/blob/main/test/file.txt"),
-                ("sha", "def456"),
-            ],
-            test_mock={
-                "update_file": lambda *args, **kwargs: (
-                    "https://github.com/owner/repo/blob/main/test/file.txt",
-                    "def456",
-                )
-            },
-        )
-
-    @staticmethod
-    async def update_file(
-        credentials: GithubCredentials,
-        repo_url: str,
-        file_path: str,
-        content: str,
-        branch: str,
-        commit_message: str,
-    ) -> tuple[str, str]:
-        api = get_api(credentials)
-        contents_url = repo_url + f"/contents/{quote(file_path, safe='/')}"
-        params = {"ref": branch}
-        response = await api.get(contents_url, params=params)
-        data = response.json()
-
-        # Convert new content to base64
-        content_base64 = base64.b64encode(content.encode()).decode()
-        data = {
-            "message": commit_message,
-            "content": content_base64,
-            "sha": data["sha"],
-            "branch": branch,
-        }
-        response = await api.put(contents_url, json=data)
-        data = response.json()
-        return data["content"]["html_url"], data["commit"]["sha"]
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            url, sha = await self.update_file(
-                credentials,
-                input_data.repo_url,
-                input_data.file_path,
-                input_data.content,
-                input_data.branch,
-                input_data.commit_message,
-            )
-            yield "url", url
-            yield "sha", sha
-        except Exception as e:
-            yield "error", str(e)
-
-
-class GithubSearchCodeBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        query: str = SchemaField(
-            description="Search query (GitHub code search syntax)",
-            placeholder="className language:python",
-        )
-        repo: str = SchemaField(
-            description="Restrict search to a repository (owner/repo format, optional)",
-            default="",
-            placeholder="owner/repo",
-        )
-        per_page: int = SchemaField(
-            description="Number of results to return (max 100)",
-            default=30,
-            ge=1,
-            le=100,
-        )
-
-    class Output(BlockSchemaOutput):
-        class SearchResult(TypedDict):
-            name: str
-            path: str
-            repository: str
-            url: str
-            score: float
-
-        result: SearchResult = SchemaField(
-            title="Result", description="A code search result"
-        )
-        results: list[SearchResult] = SchemaField(
-            description="List of code search results"
-        )
-        total_count: int = SchemaField(description="Total number of matching results")
-        error: str = SchemaField(description="Error message if search failed")
-
-    def __init__(self):
-        super().__init__(
-            id="47f94891-a2b1-4f1c-b5f2-573c043f721e",
-            description="This block searches for code in GitHub repositories.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubSearchCodeBlock.Input,
-            output_schema=GithubSearchCodeBlock.Output,
-            test_input={
-                "query": "addClass",
-                "repo": "owner/repo",
-                "per_page": 30,
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[
-                ("total_count", 1),
-                (
-                    "results",
-                    [
-                        {
-                            "name": "file.py",
-                            "path": "src/file.py",
-                            "repository": "owner/repo",
-                            "url": "https://github.com/owner/repo/blob/main/src/file.py",
-                            "score": 1.0,
-                        }
-                    ],
-                ),
-                (
-                    "result",
-                    {
-                        "name": "file.py",
-                        "path": "src/file.py",
-                        "repository": "owner/repo",
-                        "url": "https://github.com/owner/repo/blob/main/src/file.py",
-                        "score": 1.0,
-                    },
-                ),
-            ],
-            test_mock={
-                "search_code": lambda *args, **kwargs: (
-                    1,
-                    [
-                        {
-                            "name": "file.py",
-                            "path": "src/file.py",
-                            "repository": "owner/repo",
-                            "url": "https://github.com/owner/repo/blob/main/src/file.py",
-                            "score": 1.0,
-                        }
-                    ],
-                )
-            },
-        )
-
-    @staticmethod
-    async def search_code(
-        credentials: GithubCredentials,
-        query: str,
-        repo: str,
-        per_page: int,
-    ) -> tuple[int, list[Output.SearchResult]]:
-        api = get_api(credentials, convert_urls=False)
-        full_query = f"{query} repo:{repo}" if repo else query
-        params = {"q": full_query, "per_page": str(per_page)}
-        response = await api.get("https://api.github.com/search/code", params=params)
-        data = response.json()
-        results: list[GithubSearchCodeBlock.Output.SearchResult] = [
-            GithubSearchCodeBlock.Output.SearchResult(
-                name=item["name"],
-                path=item["path"],
-                repository=item["repository"]["full_name"],
-                url=item["html_url"],
-                score=item["score"],
-            )
-            for item in data["items"]
-        ]
-        return data["total_count"], results
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            total_count, results = await self.search_code(
-                credentials,
-                input_data.query,
-                input_data.repo,
-                input_data.per_page,
-            )
-            yield "total_count", total_count
-            yield "results", results
-            for result in results:
-                yield "result", result
-        except Exception as e:
-            yield "error", str(e)
-
-
-class GithubGetRepositoryTreeBlock(Block):
-    class Input(BlockSchemaInput):
-        credentials: GithubCredentialsInput = GithubCredentialsField("repo")
-        repo_url: str = SchemaField(
-            description="URL of the GitHub repository",
-            placeholder="https://github.com/owner/repo",
-        )
-        branch: str = SchemaField(
-            description="Branch name to get the tree from",
-            default="main",
-        )
-        recursive: bool = SchemaField(
-            description="Whether to recursively list the entire tree",
-            default=True,
-        )
-
-    class Output(BlockSchemaOutput):
-        class TreeEntry(TypedDict):
-            path: str
-            type: str
-            size: int
-            sha: str
-
-        entry: TreeEntry = SchemaField(
-            title="Tree Entry", description="A file or directory in the tree"
-        )
-        entries: list[TreeEntry] = SchemaField(
-            description="List of all files and directories in the tree"
-        )
-        truncated: bool = SchemaField(
-            description="Whether the tree was truncated due to size"
-        )
-        error: str = SchemaField(description="Error message if getting tree failed")
-
-    def __init__(self):
-        super().__init__(
-            id="89c5c0ec-172e-4001-a32c-bdfe4d0c9e81",
-            description="This block lists the entire file tree of a GitHub repository recursively.",
-            categories={BlockCategory.DEVELOPER_TOOLS},
-            input_schema=GithubGetRepositoryTreeBlock.Input,
-            output_schema=GithubGetRepositoryTreeBlock.Output,
-            test_input={
-                "repo_url": "https://github.com/owner/repo",
-                "branch": "main",
-                "recursive": True,
-                "credentials": TEST_CREDENTIALS_INPUT,
-            },
-            test_credentials=TEST_CREDENTIALS,
-            test_output=[
-                ("truncated", False),
-                (
-                    "entries",
-                    [
-                        {
-                            "path": "src/main.py",
-                            "type": "blob",
-                            "size": 1234,
-                            "sha": "abc123",
-                        }
-                    ],
-                ),
-                (
-                    "entry",
-                    {
-                        "path": "src/main.py",
-                        "type": "blob",
-                        "size": 1234,
-                        "sha": "abc123",
-                    },
-                ),
-            ],
-            test_mock={
-                "get_tree": lambda *args, **kwargs: (
-                    False,
-                    [
-                        {
-                            "path": "src/main.py",
-                            "type": "blob",
-                            "size": 1234,
-                            "sha": "abc123",
-                        }
-                    ],
-                )
-            },
-        )
-
-    @staticmethod
-    async def get_tree(
-        credentials: GithubCredentials,
-        repo_url: str,
-        branch: str,
-        recursive: bool,
-    ) -> tuple[bool, list[Output.TreeEntry]]:
-        api = get_api(credentials)
-        tree_url = repo_url + f"/git/trees/{quote(branch, safe='')}"
-        params = {"recursive": "1"} if recursive else {}
-        response = await api.get(tree_url, params=params)
-        data = response.json()
-        entries: list[GithubGetRepositoryTreeBlock.Output.TreeEntry] = [
-            GithubGetRepositoryTreeBlock.Output.TreeEntry(
-                path=item["path"],
-                type=item["type"],
-                size=item.get("size", 0),
-                sha=item["sha"],
-            )
-            for item in data["tree"]
-        ]
-        return data.get("truncated", False), entries
-
-    async def run(
-        self,
-        input_data: Input,
-        *,
-        credentials: GithubCredentials,
-        **kwargs,
-    ) -> BlockOutput:
-        try:
-            truncated, entries = await self.get_tree(
-                credentials,
-                input_data.repo_url,
-                input_data.branch,
-                input_data.recursive,
-            )
-            yield "truncated", truncated
-            yield "entries", entries
-            for entry in entries:
-                yield "entry", entry
-        except Exception as e:
-            yield "error", str(e)
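Reviewer note: the removed file blocks above all follow the same pattern for the GitHub contents endpoint. Content is base64-encoded before a PUT, an update also carries the `sha` of the existing file, and a read decodes the returned base64 back to UTF-8 text. Below is a minimal sketch of that round trip with no network access. The helper names `build_put_payload` and `decode_content` are hypothetical and do not appear in the removed code.

```python
import base64
from typing import Optional


def build_put_payload(
    content: str,
    commit_message: str,
    branch: str,
    sha: Optional[str] = None,
) -> dict:
    """Build the JSON body for a contents PUT (hypothetical helper).

    A create omits `sha`; an update includes the sha of the file being replaced.
    """
    payload = {
        "message": commit_message,
        "content": base64.b64encode(content.encode()).decode(),
        "branch": branch,
    }
    if sha is not None:
        payload["sha"] = sha
    return payload


def decode_content(encoded: str) -> str:
    """Decode base64 file content to text (hypothetical helper).

    base64.b64decode discards non-alphabet characters such as newlines by
    default, so line-wrapped base64 decodes without extra cleanup.
    """
    return base64.b64decode(encoded).decode("utf-8")


create_body = build_put_payload("Test content", "Create test file", "main")
update_body = build_put_payload("Updated content", "Update test file", "main", sha="abc123")
```

In the removed `update_file`, the `sha` came from a preceding GET on the same contents URL; the sketch takes it as a plain argument instead.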
--- a/autogpt_platform/backend/backend/blocks/github/test_github_blocks.py
+++ b/autogpt_platform/backend/backend/blocks/github/test_github_blocks.py
@@ -1,125 +0,0 @@
-import inspect
-
-import pytest
-
-from backend.blocks.github._auth import TEST_CREDENTIALS, TEST_CREDENTIALS_INPUT
-from backend.blocks.github.commits import FileOperation, GithubMultiFileCommitBlock
-from backend.blocks.github.pull_requests import (
-    GithubMergePullRequestBlock,
-    prepare_pr_api_url,
-)
-from backend.data.execution import ExecutionContext
-from backend.util.exceptions import BlockExecutionError
-
-# ── prepare_pr_api_url tests ──
-
-
-class TestPreparePrApiUrl:
-    def test_https_scheme_preserved(self):
-        result = prepare_pr_api_url("https://github.com/owner/repo/pull/42", "merge")
-        assert result == "https://github.com/owner/repo/pulls/42/merge"
-
-    def test_http_scheme_preserved(self):
-        result = prepare_pr_api_url("http://github.com/owner/repo/pull/1", "files")
-        assert result == "http://github.com/owner/repo/pulls/1/files"
-
-    def test_no_scheme_defaults_to_https(self):
-        result = prepare_pr_api_url("github.com/owner/repo/pull/5", "merge")
-        assert result == "https://github.com/owner/repo/pulls/5/merge"
-
-    def test_reviewers_path(self):
-        result = prepare_pr_api_url(
-            "https://github.com/owner/repo/pull/99", "requested_reviewers"
-        )
-        assert result == "https://github.com/owner/repo/pulls/99/requested_reviewers"
-
-    def test_invalid_url_returned_as_is(self):
-        url = "https://example.com/not-a-pr"
-        assert prepare_pr_api_url(url, "merge") == url
-
-    def test_empty_string(self):
-        assert prepare_pr_api_url("", "merge") == ""
-
-
-# ── Error-path block tests ──
-# When a block's run() yields ("error", msg), _execute() converts it to a
-# BlockExecutionError. We call block.execute() directly (not execute_block_test,
-# which returns early on empty test_output).
-
-
-def _mock_block(block, mocks: dict):
-    """Apply mocks to a block's static methods, wrapping sync mocks as async."""
-    for name, mock_fn in mocks.items():
-        original = getattr(block, name)
-        if inspect.iscoroutinefunction(original):
-
-            async def async_mock(*args, _fn=mock_fn, **kwargs):
-                return _fn(*args, **kwargs)
-
-            setattr(block, name, async_mock)
-        else:
-            setattr(block, name, mock_fn)
-
-
-def _raise(exc: Exception):
-    """Helper that returns a callable which raises the given exception."""
-
-    def _raiser(*args, **kwargs):
-        raise exc
-
-    return _raiser
-
-
-@pytest.mark.asyncio
-async def test_merge_pr_error_path():
-    block = GithubMergePullRequestBlock()
-    _mock_block(block, {"merge_pr": _raise(RuntimeError("PR not mergeable"))})
-    input_data = {
-        "pr_url": "https://github.com/owner/repo/pull/1",
-        "merge_method": "squash",
-        "commit_title": "",
-        "commit_message": "",
-        "credentials": TEST_CREDENTIALS_INPUT,
-    }
-    with pytest.raises(BlockExecutionError, match="PR not mergeable"):
-        async for _ in block.execute(input_data, credentials=TEST_CREDENTIALS):
-            pass
-
-
-@pytest.mark.asyncio
-async def test_multi_file_commit_error_path():
-    block = GithubMultiFileCommitBlock()
-    _mock_block(block, {"multi_file_commit": _raise(RuntimeError("ref update failed"))})
-    input_data = {
-        "repo_url": "https://github.com/owner/repo",
-        "branch": "feature",
-        "commit_message": "test",
-        "files": [{"path": "a.py", "content": "x", "operation": "upsert"}],
-        "credentials": TEST_CREDENTIALS_INPUT,
-    }
-    with pytest.raises(BlockExecutionError, match="ref update failed"):
-        async for _ in block.execute(
-            input_data,
-            credentials=TEST_CREDENTIALS,
-            execution_context=ExecutionContext(),
-        ):
-            pass
-
-
-# ── FileOperation enum tests ──
-
-
-class TestFileOperation:
-    def test_upsert_value(self):
-        assert FileOperation.UPSERT == "upsert"
-
-    def test_delete_value(self):
-        assert FileOperation.DELETE == "delete"
-
-    def test_invalid_value_raises(self):
-        with pytest.raises(ValueError):
-            FileOperation("create")
-
-    def test_invalid_value_raises_typo(self):
-        with pytest.raises(ValueError):
-            FileOperation("upser")
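Reviewer note: the removed `_mock_block` helper wraps a synchronous mock in an `async def` whenever the method it replaces is a coroutine function, and binds the mock through a default argument (`_fn=mock_fn`) so that each wrapper keeps its own callable rather than the loop variable's final value. Below is a self-contained sketch of that technique. `FakeBlock`, `fetch`, and `demo` are hypothetical stand-ins, not names from the repository.

```python
import asyncio
import inspect


class FakeBlock:
    """Hypothetical stand-in for a block with one async static method."""

    @staticmethod
    async def fetch(value: int) -> int:
        return value


def mock_block(block, mocks: dict) -> None:
    """Replace attributes on `block`, wrapping sync mocks of async methods."""
    for name, mock_fn in mocks.items():
        original = getattr(block, name)
        if inspect.iscoroutinefunction(original):
            # The default argument captures the current mock_fn at definition
            # time, avoiding the late-binding closure pitfall inside loops.
            async def async_mock(*args, _fn=mock_fn, **kwargs):
                return _fn(*args, **kwargs)

            setattr(block, name, async_mock)
        else:
            setattr(block, name, mock_fn)


def raiser(exc: Exception):
    """Return a callable that raises the given exception when called."""

    def _raiser(*args, **kwargs):
        raise exc

    return _raiser


async def demo():
    block = FakeBlock()
    mock_block(block, {"fetch": lambda value: value * 2})
    doubled = await block.fetch(21)

    mock_block(block, {"fetch": raiser(RuntimeError("boom"))})
    try:
        await block.fetch(1)
        message = ""
    except RuntimeError as exc:
        message = str(exc)
    return doubled, message
```

Because the mock is stored on the instance, it is called without an implicit `self`, which matches how the static methods in the removed tests were invoked.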
--- a/autogpt_platform/backend/backend/blocks/google/gmail.py
+++ b/autogpt_platform/backend/backend/blocks/google/gmail.py
@@ -241,8 +241,8 @@ class GmailBase(Block, ABC):
                    h.ignore_links = False
                    h.ignore_images = True
                    return h.handle(html_content)
-                except Exception:
-                    # Keep extraction resilient if html2text is unavailable or fails.
+                except ImportError:
+                    # Fallback: return raw HTML if html2text is not available
                    return html_content

        # Handle content stored as attachment
--- a/autogpt_platform/backend/backend/blocks/helpers/review.py
+++ b/autogpt_platform/backend/backend/blocks/helpers/review.py
@@ -67,7 +67,6 @@ class HITLReviewHelper:
        graph_version: int,
        block_name: str = "Block",
        editable: bool = False,
-        is_graph_execution: bool = True,
    ) -> Optional[ReviewResult]:
        """
        Handle a review request for a block that requires human review.
@@ -144,11 +143,10 @@ class HITLReviewHelper:
            logger.info(
                f"Block {block_name} pausing execution for node {node_exec_id} - awaiting human review"
            )
-            if is_graph_execution:
-                await HITLReviewHelper.update_node_execution_status(
-                    exec_id=node_exec_id,
-                    status=ExecutionStatus.REVIEW,
-                )
+            await HITLReviewHelper.update_node_execution_status(
+                exec_id=node_exec_id,
+                status=ExecutionStatus.REVIEW,
+            )
            return None  # Signal that execution should pause

        # Mark review as processed if not already done
@@ -170,7 +168,6 @@ class HITLReviewHelper:
        graph_version: int,
        block_name: str = "Block",
        editable: bool = False,
-        is_graph_execution: bool = True,
    ) -> Optional[ReviewDecision]:
        """
        Handle a review request and return the decision in a single call.
@@ -200,7 +197,6 @@ class HITLReviewHelper:
            graph_version=graph_version,
            block_name=block_name,
            editable=editable,
-            is_graph_execution=is_graph_execution,
        )

        if review_result is None:
--- a/autogpt_platform/backend/backend/blocks/jina/search.py
+++ b/autogpt_platform/backend/backend/blocks/jina/search.py
@@ -17,7 +17,7 @@ from backend.blocks.jina._auth import (
 from backend.blocks.search import GetRequest
 from backend.data.model import SchemaField
 from backend.util.exceptions import BlockExecutionError
-from backend.util.request import HTTPClientError, HTTPServerError, validate_url_host
+from backend.util.request import HTTPClientError, HTTPServerError, validate_url


 class SearchTheWebBlock(Block, GetRequest):
@@ -112,7 +112,7 @@ class ExtractWebsiteContentBlock(Block, GetRequest):
    ) -> BlockOutput:
        if input_data.raw_content:
            try:
-                parsed_url, _, _ = await validate_url_host(input_data.url)
+                parsed_url, _, _ = await validate_url(input_data.url, [])
                url = parsed_url.geturl()
            except ValueError as e:
                yield "error", f"Invalid URL: {e}"
--- a/autogpt_platform/backend/backend/blocks/llm.py
+++ b/autogpt_platform/backend/backend/blocks/llm.py
@@ -31,14 +31,10 @@ from backend.data.model import (
 )
 from backend.integrations.providers import ProviderName
 from backend.util import json
-from backend.util.clients import OPENROUTER_BASE_URL
 from backend.util.logging import TruncatedLogger
 from backend.util.prompt import compress_context, estimate_token_count
-from backend.util.request import validate_url_host
-from backend.util.settings import Settings
 from backend.util.text import TextFormatter

-settings = Settings()
 logger = TruncatedLogger(logging.getLogger(__name__), "[LLM-Block]")
 fmt = TextFormatter(autoescape=False)

@@ -120,7 +116,6 @@ class LlmModel(str, Enum, metaclass=LlmModelMeta):
    CLAUDE_4_5_SONNET = "claude-sonnet-4-5-20250929"
    CLAUDE_4_5_HAIKU = "claude-haiku-4-5-20251001"
    CLAUDE_4_6_OPUS = "claude-opus-4-6"
-    CLAUDE_4_6_SONNET = "claude-sonnet-4-6"
    CLAUDE_3_HAIKU = "claude-3-haiku-20240307"
    # AI/ML API models
    AIML_API_QWEN2_5_72B = "Qwen/Qwen2.5-72B-Instruct-Turbo"
@@ -140,31 +135,19 @@ class LlmModel(str, Enum, metaclass=LlmModelMeta):
    # OpenRouter models
    OPENAI_GPT_OSS_120B = "openai/gpt-oss-120b"
    OPENAI_GPT_OSS_20B = "openai/gpt-oss-20b"
-    GEMINI_2_5_PRO_PREVIEW = "google/gemini-2.5-pro-preview-03-25"
-    GEMINI_2_5_PRO = "google/gemini-2.5-pro"
-    GEMINI_3_1_PRO_PREVIEW = "google/gemini-3.1-pro-preview"
-    GEMINI_3_FLASH_PREVIEW = "google/gemini-3-flash-preview"
+    GEMINI_2_5_PRO = "google/gemini-2.5-pro-preview-03-25"
+    GEMINI_3_PRO_PREVIEW = "google/gemini-3-pro-preview"
    GEMINI_2_5_FLASH = "google/gemini-2.5-flash"
    GEMINI_2_0_FLASH = "google/gemini-2.0-flash-001"
-    GEMINI_3_1_FLASH_LITE_PREVIEW = "google/gemini-3.1-flash-lite-preview"
    GEMINI_2_5_FLASH_LITE_PREVIEW = "google/gemini-2.5-flash-lite-preview-06-17"
    GEMINI_2_0_FLASH_LITE = "google/gemini-2.0-flash-lite-001"
    MISTRAL_NEMO = "mistralai/mistral-nemo"
-    MISTRAL_LARGE_3 = "mistralai/mistral-large-2512"
-    MISTRAL_MEDIUM_3_1 = "mistralai/mistral-medium-3.1"
-    MISTRAL_SMALL_3_2 = "mistralai/mistral-small-3.2-24b-instruct"
-    CODESTRAL = "mistralai/codestral-2508"
    COHERE_COMMAND_R_08_2024 = "cohere/command-r-08-2024"
    COHERE_COMMAND_R_PLUS_08_2024 = "cohere/command-r-plus-08-2024"
-    COHERE_COMMAND_A_03_2025 = "cohere/command-a-03-2025"
-    COHERE_COMMAND_A_TRANSLATE_08_2025 = "cohere/command-a-translate-08-2025"
-    COHERE_COMMAND_A_REASONING_08_2025 = "cohere/command-a-reasoning-08-2025"
-    COHERE_COMMAND_A_VISION_07_2025 = "cohere/command-a-vision-07-2025"
    DEEPSEEK_CHAT = "deepseek/deepseek-chat"  # Actually: DeepSeek V3
    DEEPSEEK_R1_0528 = "deepseek/deepseek-r1-0528"
    PERPLEXITY_SONAR = "perplexity/sonar"
    PERPLEXITY_SONAR_PRO = "perplexity/sonar-pro"
-    PERPLEXITY_SONAR_REASONING_PRO = "perplexity/sonar-reasoning-pro"
    PERPLEXITY_SONAR_DEEP_RESEARCH = "perplexity/sonar-deep-research"
    NOUSRESEARCH_HERMES_3_LLAMA_3_1_405B = "nousresearch/hermes-3-llama-3.1-405b"
    NOUSRESEARCH_HERMES_3_LLAMA_3_1_70B = "nousresearch/hermes-3-llama-3.1-70b"
@@ -172,11 +155,9 @@ class LlmModel(str, Enum, metaclass=LlmModelMeta):
    AMAZON_NOVA_MICRO_V1 = "amazon/nova-micro-v1"
    AMAZON_NOVA_PRO_V1 = "amazon/nova-pro-v1"
    MICROSOFT_WIZARDLM_2_8X22B = "microsoft/wizardlm-2-8x22b"
-    MICROSOFT_PHI_4 = "microsoft/phi-4"
    GRYPHE_MYTHOMAX_L2_13B = "gryphe/mythomax-l2-13b"
    META_LLAMA_4_SCOUT = "meta-llama/llama-4-scout"
    META_LLAMA_4_MAVERICK = "meta-llama/llama-4-maverick"
-    GROK_3 = "x-ai/grok-3"
    GROK_4 = "x-ai/grok-4"
    GROK_4_FAST = "x-ai/grok-4-fast"
    GROK_4_1_FAST = "x-ai/grok-4.1-fast"
@@ -293,9 +274,6 @@ MODEL_METADATA = {
    LlmModel.CLAUDE_4_6_OPUS: ModelMetadata(
        "anthropic", 200000, 128000, "Claude Opus 4.6", "Anthropic", "Anthropic", 3
    ),  # claude-opus-4-6
-    LlmModel.CLAUDE_4_6_SONNET: ModelMetadata(
-        "anthropic", 200000, 64000, "Claude Sonnet 4.6", "Anthropic", "Anthropic", 3
-    ),  # claude-sonnet-4-6
    LlmModel.CLAUDE_4_5_OPUS: ModelMetadata(
        "anthropic", 200000, 64000, "Claude Opus 4.5", "Anthropic", "Anthropic", 3
    ),  # claude-opus-4-5-20251101
@@ -354,41 +332,17 @@ MODEL_METADATA = {
        "ollama", 32768, None, "Dolphin Mistral Latest", "Ollama", "Mistral AI", 1
    ),
    # https://openrouter.ai/models
-    LlmModel.GEMINI_2_5_PRO_PREVIEW: ModelMetadata(
+    LlmModel.GEMINI_2_5_PRO: ModelMetadata(
        "open_router",
-        1048576,
-        65536,
+        1050000,
+        8192,
        "Gemini 2.5 Pro Preview 03.25",
        "OpenRouter",
        "Google",
        2,
    ),
-    LlmModel.GEMINI_2_5_PRO: ModelMetadata(
-        "open_router",
-        1048576,
-        65536,
-        "Gemini 2.5 Pro",
-        "OpenRouter",
-        "Google",
-        2,
-    ),
-    LlmModel.GEMINI_3_1_PRO_PREVIEW: ModelMetadata(
-        "open_router",
-        1048576,
-        65536,
-        "Gemini 3.1 Pro Preview",
-        "OpenRouter",
-        "Google",
-        2,
-    ),
-    LlmModel.GEMINI_3_FLASH_PREVIEW: ModelMetadata(
-        "open_router",
-        1048576,
-        65536,
-        "Gemini 3 Flash Preview",
-        "OpenRouter",
-        "Google",
-        1,
+    LlmModel.GEMINI_3_PRO_PREVIEW: ModelMetadata(
+        "open_router", 1048576, 65535, "Gemini 3 Pro Preview", "OpenRouter", "Google", 2
    ),
    LlmModel.GEMINI_2_5_FLASH: ModelMetadata(
        "open_router", 1048576, 65535, "Gemini 2.5 Flash", "OpenRouter", "Google", 1
@@ -396,15 +350,6 @@ MODEL_METADATA = {
    LlmModel.GEMINI_2_0_FLASH: ModelMetadata(
        "open_router", 1048576, 8192, "Gemini 2.0 Flash 001", "OpenRouter", "Google", 1
    ),
-    LlmModel.GEMINI_3_1_FLASH_LITE_PREVIEW: ModelMetadata(
-        "open_router",
-        1048576,
-        65536,
-        "Gemini 3.1 Flash Lite Preview",
-        "OpenRouter",
-        "Google",
-        1,
-    ),
    LlmModel.GEMINI_2_5_FLASH_LITE_PREVIEW: ModelMetadata(
        "open_router",
        1048576,
@@ -426,78 +371,12 @@ MODEL_METADATA = {
    LlmModel.MISTRAL_NEMO: ModelMetadata(
        "open_router", 128000, 4096, "Mistral Nemo", "OpenRouter", "Mistral AI", 1
    ),
-    LlmModel.MISTRAL_LARGE_3: ModelMetadata(
-        "open_router",
-        262144,
-        None,
-        "Mistral Large 3 2512",
-        "OpenRouter",
-        "Mistral AI",
-        2,
-    ),
-    LlmModel.MISTRAL_MEDIUM_3_1: ModelMetadata(
-        "open_router",
-        131072,
-        None,
-        "Mistral Medium 3.1",
-        "OpenRouter",
-        "Mistral AI",
-        2,
-    ),
-    LlmModel.MISTRAL_SMALL_3_2: ModelMetadata(
-        "open_router",
-        131072,
-        131072,
-        "Mistral Small 3.2 24B",
-        "OpenRouter",
-        "Mistral AI",
-        1,
-    ),
-    LlmModel.CODESTRAL: ModelMetadata(
-        "open_router",
-        256000,
-        None,
-        "Codestral 2508",
-        "OpenRouter",
-        "Mistral AI",
-        1,
-    ),
    LlmModel.COHERE_COMMAND_R_08_2024: ModelMetadata(
        "open_router", 128000, 4096, "Command R 08.2024", "OpenRouter", "Cohere", 1
    ),
    LlmModel.COHERE_COMMAND_R_PLUS_08_2024: ModelMetadata(
        "open_router", 128000, 4096, "Command R Plus 08.2024", "OpenRouter", "Cohere", 2
    ),
-    LlmModel.COHERE_COMMAND_A_03_2025: ModelMetadata(
-        "open_router", 256000, 8192, "Command A 03.2025", "OpenRouter", "Cohere", 2
-    ),
-    LlmModel.COHERE_COMMAND_A_TRANSLATE_08_2025: ModelMetadata(
-        "open_router",
-        128000,
-        8192,
-        "Command A Translate 08.2025",
-        "OpenRouter",
-        "Cohere",
-        2,
-    ),
-    LlmModel.COHERE_COMMAND_A_REASONING_08_2025: ModelMetadata(
-        "open_router",
-        256000,
-        32768,
-        "Command A Reasoning 08.2025",
-        "OpenRouter",
-        "Cohere",
-        3,
-    ),
-    LlmModel.COHERE_COMMAND_A_VISION_07_2025: ModelMetadata(
-        "open_router",
-        128000,
-        8192,
-        "Command A Vision 07.2025",
-        "OpenRouter",
-        "Cohere",
-        2,
-    ),
    LlmModel.DEEPSEEK_CHAT: ModelMetadata(
        "open_router", 64000, 2048, "DeepSeek Chat", "OpenRouter", "DeepSeek", 1
    ),
@@ -510,15 +389,6 @@ MODEL_METADATA = {
    LlmModel.PERPLEXITY_SONAR_PRO: ModelMetadata(
        "open_router", 200000, 8000, "Sonar Pro", "OpenRouter", "Perplexity", 2
    ),
-    LlmModel.PERPLEXITY_SONAR_REASONING_PRO: ModelMetadata(
-        "open_router",
-        128000,
-        8000,
-        "Sonar Reasoning Pro",
-        "OpenRouter",
-        "Perplexity",
-        2,
-    ),
    LlmModel.PERPLEXITY_SONAR_DEEP_RESEARCH: ModelMetadata(
        "open_router",
        128000,
@@ -564,9 +434,6 @@ MODEL_METADATA = {
    LlmModel.MICROSOFT_WIZARDLM_2_8X22B: ModelMetadata(
        "open_router", 65536, 4096, "WizardLM 2 8x22B", "OpenRouter", "Microsoft", 1
    ),
-    LlmModel.MICROSOFT_PHI_4: ModelMetadata(
-        "open_router", 16384, 16384, "Phi-4", "OpenRouter", "Microsoft", 1
-    ),
    LlmModel.GRYPHE_MYTHOMAX_L2_13B: ModelMetadata(
        "open_router", 4096, 4096, "MythoMax L2 13B", "OpenRouter", "Gryphe", 1
    ),
@@ -576,15 +443,6 @@ MODEL_METADATA = {
    LlmModel.META_LLAMA_4_MAVERICK: ModelMetadata(
        "open_router", 1048576, 1000000, "Llama 4 Maverick", "OpenRouter", "Meta", 1
    ),
-    LlmModel.GROK_3: ModelMetadata(
-        "open_router",
-        131072,
-        131072,
-        "Grok 3",
-        "OpenRouter",
-        "xAI",
-        2,
-    ),
    LlmModel.GROK_4: ModelMetadata(
        "open_router", 256000, 256000, "Grok 4", "OpenRouter", "xAI", 3
    ),
@@ -942,11 +800,6 @@ async def llm_call(
        if tools:
            raise ValueError("Ollama does not support tools.")

-        # Validate user-provided Ollama host to prevent SSRF etc.
-        await validate_url_host(
-            ollama_host, trusted_hostnames=[settings.config.ollama_host]
-        )
-
        client = ollama.AsyncClient(host=ollama_host)
        sys_messages = [p["content"] for p in prompt if p["role"] == "system"]
        usr_messages = [p["content"] for p in prompt if p["role"] != "system"]
@@ -968,7 +821,7 @@ async def llm_call(
    elif provider == "open_router":
        tools_param = tools if tools else openai.NOT_GIVEN
        client = openai.AsyncOpenAI(
-            base_url=OPENROUTER_BASE_URL,
+            base_url="https://openrouter.ai/api/v1",
            api_key=credentials.api_key.get_secret_value(),
        )

--- a/autogpt_platform/backend/backend/blocks/perplexity.py
+++ b/autogpt_platform/backend/backend/blocks/perplexity.py
@@ -4,7 +4,7 @@ from enum import Enum
 from typing import Any, Literal

 import openai
-from pydantic import SecretStr, field_validator
+from pydantic import SecretStr

 from backend.blocks._base import (
    Block,
@@ -13,7 +13,6 @@ from backend.blocks._base import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
-from backend.data.block import BlockInput
 from backend.data.model import (
    APIKeyCredentials,
    CredentialsField,
@@ -22,7 +21,6 @@ from backend.data.model import (
    SchemaField,
 )
 from backend.integrations.providers import ProviderName
-from backend.util.clients import OPENROUTER_BASE_URL
 from backend.util.logging import TruncatedLogger

 logger = TruncatedLogger(logging.getLogger(__name__), "[Perplexity-Block]")
@@ -36,20 +34,6 @@ class PerplexityModel(str, Enum):
    SONAR_DEEP_RESEARCH = "perplexity/sonar-deep-research"


-def _sanitize_perplexity_model(value: Any) -> PerplexityModel:
-    """Return a valid PerplexityModel, falling back to SONAR for invalid values."""
-    if isinstance(value, PerplexityModel):
-        return value
-    try:
-        return PerplexityModel(value)
-    except ValueError:
-        logger.warning(
-            f"Invalid PerplexityModel '{value}', "
-            f"falling back to {PerplexityModel.SONAR.value}"
-        )
-        return PerplexityModel.SONAR
-
-
 PerplexityCredentials = CredentialsMetaInput[
    Literal[ProviderName.OPEN_ROUTER], Literal["api_key"]
 ]
@@ -88,25 +72,6 @@ class PerplexityBlock(Block):
            advanced=False,
        )
        credentials: PerplexityCredentials = PerplexityCredentialsField()
-
-        @field_validator("model", mode="before")
-        @classmethod
-        def fallback_invalid_model(cls, v: Any) -> PerplexityModel:
-            """Fall back to SONAR if the model value is not a valid
-            PerplexityModel (e.g. an OpenAI model ID set by the agent
-            generator)."""
-            return _sanitize_perplexity_model(v)
-
-        @classmethod
-        def validate_data(cls, data: BlockInput) -> str | None:
-            """Sanitize the model field before JSON schema validation so that
-            invalid values are replaced with the default instead of raising a
-            BlockInputError."""
-            model_value = data.get("model")
-            if model_value is not None:
-                data["model"] = _sanitize_perplexity_model(model_value).value
-            return super().validate_data(data)
-
        system_prompt: str = SchemaField(
            title="System Prompt",
            default="",
@@ -171,7 +136,7 @@ class PerplexityBlock(Block):
    ) -> dict[str, Any]:
        """Call Perplexity via OpenRouter and extract annotations."""
        client = openai.AsyncOpenAI(
-            base_url=OPENROUTER_BASE_URL,
+            base_url="https://openrouter.ai/api/v1",
            api_key=credentials.api_key.get_secret_value(),
        )

--- a/autogpt_platform/backend/backend/blocks/reddit.py
+++ b/autogpt_platform/backend/backend/blocks/reddit.py
@@ -2232,7 +2232,6 @@ class DeleteRedditPostBlock(Block):
                ("post_id", "abc123"),
            ],
            test_mock={"delete_post": lambda creds, post_id: True},
-            is_sensitive_action=True,
        )

    @staticmethod
@@ -2291,7 +2290,6 @@ class DeleteRedditCommentBlock(Block):
                ("comment_id", "xyz789"),
            ],
            test_mock={"delete_comment": lambda creds, comment_id: True},
-            is_sensitive_action=True,
        )

    @staticmethod
--- a/autogpt_platform/backend/backend/blocks/slant3d/order.py
+++ b/autogpt_platform/backend/backend/blocks/slant3d/order.py
@@ -72,7 +72,6 @@ class Slant3DCreateOrderBlock(Slant3DBlockBase):
                "_make_request": lambda *args, **kwargs: {"orderId": "314144241"},
                "_convert_to_color": lambda *args, **kwargs: "black",
            },
-            is_sensitive_action=True,
        )

    async def run(
--- a/autogpt_platform/backend/backend/blocks/stagehand/blocks.py
+++ b/autogpt_platform/backend/backend/blocks/stagehand/blocks.py
@@ -83,8 +83,7 @@ class StagehandRecommendedLlmModel(str, Enum):
    GPT41_MINI = "gpt-4.1-mini-2025-04-14"

    # Anthropic
-    CLAUDE_4_5_SONNET = "claude-sonnet-4-5-20250929"  # Keep for backwards compat
-    CLAUDE_4_6_SONNET = "claude-sonnet-4-6"
+    CLAUDE_4_5_SONNET = "claude-sonnet-4-5-20250929"

    @property
    def provider_name(self) -> str:
@@ -138,7 +137,7 @@ class StagehandObserveBlock(Block):
        model: StagehandRecommendedLlmModel = SchemaField(
            title="LLM Model",
            description="LLM to use for Stagehand (provider is inferred)",
-            default=StagehandRecommendedLlmModel.CLAUDE_4_6_SONNET,
+            default=StagehandRecommendedLlmModel.CLAUDE_4_5_SONNET,
            advanced=False,
        )
        model_credentials: AICredentials = AICredentialsField()
@@ -228,7 +227,7 @@ class StagehandActBlock(Block):
        model: StagehandRecommendedLlmModel = SchemaField(
            title="LLM Model",
            description="LLM to use for Stagehand (provider is inferred)",
-            default=StagehandRecommendedLlmModel.CLAUDE_4_6_SONNET,
+            default=StagehandRecommendedLlmModel.CLAUDE_4_5_SONNET,
            advanced=False,
        )
        model_credentials: AICredentials = AICredentialsField()
@@ -325,7 +324,7 @@ class StagehandExtractBlock(Block):
        model: StagehandRecommendedLlmModel = SchemaField(
            title="LLM Model",
            description="LLM to use for Stagehand (provider is inferred)",
-            default=StagehandRecommendedLlmModel.CLAUDE_4_6_SONNET,
+            default=StagehandRecommendedLlmModel.CLAUDE_4_5_SONNET,
            advanced=False,
        )
        model_credentials: AICredentials = AICredentialsField()
--- a/autogpt_platform/backend/backend/blocks/system/store_operations.py
+++ b/autogpt_platform/backend/backend/blocks/system/store_operations.py
@@ -1,8 +1,8 @@
 import logging
+from typing import Literal

 from pydantic import BaseModel

-from backend.api.features.store.db import StoreAgentsSortOptions
 from backend.blocks._base import (
    Block,
    BlockCategory,
@@ -176,8 +176,8 @@ class SearchStoreAgentsBlock(Block):
        category: str | None = SchemaField(
            description="Filter by category", default=None
        )
-        sort_by: StoreAgentsSortOptions = SchemaField(
-            description="How to sort the results", default=StoreAgentsSortOptions.RATING
+        sort_by: Literal["rating", "runs", "name", "updated_at"] = SchemaField(
+            description="How to sort the results", default="rating"
        )
        limit: int = SchemaField(
            description="Maximum number of results to return", default=10, ge=1, le=100
@@ -278,7 +278,7 @@ class SearchStoreAgentsBlock(Block):
        self,
        query: str | None = None,
        category: str | None = None,
-        sort_by: StoreAgentsSortOptions = StoreAgentsSortOptions.RATING,
+        sort_by: Literal["rating", "runs", "name", "updated_at"] = "rating",
        limit: int = 10,
    ) -> SearchAgentsResponse:
        """
--- a/autogpt_platform/backend/backend/blocks/test/test_perplexity.py
+++ b/autogpt_platform/backend/backend/blocks/test/test_perplexity.py
@@ -1,81 +0,0 @@
-"""Unit tests for PerplexityBlock model fallback behavior."""
-
-import pytest
-
-from backend.blocks.perplexity import (
-    TEST_CREDENTIALS_INPUT,
-    PerplexityBlock,
-    PerplexityModel,
-)
-
-
-def _make_input(**overrides) -> dict:
-    defaults = {
-        "prompt": "test query",
-        "credentials": TEST_CREDENTIALS_INPUT,
-    }
-    defaults.update(overrides)
-    return defaults
-
-
-class TestPerplexityModelFallback:
-    """Tests for fallback_invalid_model field_validator."""
-
-    def test_invalid_model_falls_back_to_sonar(self):
-        inp = PerplexityBlock.Input(**_make_input(model="gpt-5.2-2025-12-11"))
-        assert inp.model == PerplexityModel.SONAR
-
-    def test_another_invalid_model_falls_back_to_sonar(self):
-        inp = PerplexityBlock.Input(**_make_input(model="gpt-4o"))
-        assert inp.model == PerplexityModel.SONAR
-
-    def test_valid_model_string_is_kept(self):
-        inp = PerplexityBlock.Input(**_make_input(model="perplexity/sonar-pro"))
-        assert inp.model == PerplexityModel.SONAR_PRO
-
-    def test_valid_enum_value_is_kept(self):
-        inp = PerplexityBlock.Input(
-            **_make_input(model=PerplexityModel.SONAR_DEEP_RESEARCH)
-        )
-        assert inp.model == PerplexityModel.SONAR_DEEP_RESEARCH
-
-    def test_default_model_when_omitted(self):
-        inp = PerplexityBlock.Input(**_make_input())
-        assert inp.model == PerplexityModel.SONAR
-
-    @pytest.mark.parametrize(
-        "model_value",
-        [
-            "perplexity/sonar",
-            "perplexity/sonar-pro",
-            "perplexity/sonar-deep-research",
-        ],
-    )
-    def test_all_valid_models_accepted(self, model_value: str):
-        inp = PerplexityBlock.Input(**_make_input(model=model_value))
-        assert inp.model.value == model_value
-
-
-class TestPerplexityValidateData:
-    """Tests for validate_data which runs during block execution (before
-    Pydantic instantiation). Invalid models must be sanitized here so
-    JSON schema validation does not reject them."""
-
-    def test_invalid_model_sanitized_before_schema_validation(self):
-        data = _make_input(model="gpt-5.2-2025-12-11")
-        error = PerplexityBlock.Input.validate_data(data)
-        assert error is None
-        assert data["model"] == PerplexityModel.SONAR.value
-
-    def test_valid_model_unchanged_by_validate_data(self):
-        data = _make_input(model="perplexity/sonar-pro")
-        error = PerplexityBlock.Input.validate_data(data)
-        assert error is None
-        assert data["model"] == "perplexity/sonar-pro"
-
-    def test_missing_model_uses_default(self):
-        data = _make_input()  # no model key
-        error = PerplexityBlock.Input.validate_data(data)
-        assert error is None
-        inp = PerplexityBlock.Input(**data)
-        assert inp.model == PerplexityModel.SONAR
--- a/autogpt_platform/backend/backend/blocks/test/test_store_operations.py
+++ b/autogpt_platform/backend/backend/blocks/test/test_store_operations.py
@@ -2,7 +2,6 @@ from unittest.mock import MagicMock

 import pytest

-from backend.api.features.store.db import StoreAgentsSortOptions
 from backend.blocks.system.library_operations import (
    AddToLibraryFromStoreBlock,
    LibraryAgent,
@@ -122,10 +121,7 @@ async def test_search_store_agents_block(mocker):
    )

    input_data = block.Input(
-        query="test",
-        category="productivity",
-        sort_by=StoreAgentsSortOptions.RATING,  # type: ignore[reportArgumentType]
-        limit=10,
+        query="test", category="productivity", sort_by="rating", limit=10
    )

    outputs = {}
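Review note: the `store_operations.py` hunks above replace the `StoreAgentsSortOptions` enum with an inline `Literal["rating", "runs", "name", "updated_at"]`, written out twice (the schema field and the method signature). A minimal sketch, assuming one wanted a single definition plus a runtime check; `SortBy` and `coerce_sort_by` are hypothetical names and are not part of this change:

```python
from typing import Literal, get_args

# Hypothetical alias; the diff spells this Literal out inline in two places.
SortBy = Literal["rating", "runs", "name", "updated_at"]


def coerce_sort_by(value: str) -> SortBy:
    """Return value unchanged if it is an allowed sort key, else raise ValueError."""
    allowed = get_args(SortBy)  # tuple of the literal strings
    if value not in allowed:
        raise ValueError(f"sort_by must be one of {allowed}, got {value!r}")
    return value  # type: ignore[return-value]
```

With an alias like this, the allowed values would live in one place, and a plain string from a caller could be checked before it reaches the store query.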
--- a/autogpt_platform/backend/backend/copilot/baseline/service.py
+++ b/autogpt_platform/backend/backend/copilot/baseline/service.py
@@ -22,7 +22,6 @@ from backend.copilot.model import (
    update_session_title,
    upsert_chat_session,
 )
-from backend.copilot.prompting import get_baseline_supplement
 from backend.copilot.response_model import (
    StreamBaseResponse,
    StreamError,
@@ -63,8 +62,8 @@ async def _update_title_async(
    """Generate and persist a session title in the background."""
    try:
        title = await _generate_session_title(message, user_id, session_id)
-        if title and user_id:
-            await update_session_title(session_id, user_id, title, only_if_empty=True)
+        if title:
+            await update_session_title(session_id, title)
    except Exception as e:
        logger.warning("[Baseline] Failed to update session title: %s", e)

@@ -177,17 +176,14 @@ async def stream_chat_completion_baseline(
    # changes from concurrent chats updating business understanding.
    is_first_turn = len(session.messages) <= 1
    if is_first_turn:
-        base_system_prompt, _ = await _build_system_prompt(
+        system_prompt, _ = await _build_system_prompt(
            user_id, has_conversation_history=False
        )
    else:
-        base_system_prompt, _ = await _build_system_prompt(
+        system_prompt, _ = await _build_system_prompt(
            user_id=None, has_conversation_history=True
        )

-    # Append tool documentation and technical notes
-    system_prompt = base_system_prompt + get_baseline_supplement()
-
    # Compress context if approaching the model's token limit
    messages_for_context = await _compress_session_messages(session.messages)

--- a/autogpt_platform/backend/backend/copilot/config.py
+++ b/autogpt_platform/backend/backend/copilot/config.py
@@ -1,13 +1,10 @@
 """Configuration management for chat system."""

 import os
-from typing import Literal

 from pydantic import Field, field_validator
 from pydantic_settings import BaseSettings

-from backend.util.clients import OPENROUTER_BASE_URL
-

 class ChatConfig(BaseSettings):
    """Configuration for the chat system."""
@@ -22,7 +19,7 @@ class ChatConfig(BaseSettings):
    )
    api_key: str | None = Field(default=None, description="OpenAI API key")
    base_url: str | None = Field(
-        default=OPENROUTER_BASE_URL,
+        default="https://openrouter.ai/api/v1",
        description="Base URL for API (e.g., for OpenRouter)",
    )

@@ -115,37 +112,9 @@ class ChatConfig(BaseSettings):
        description="E2B sandbox template to use for copilot sessions.",
    )
    e2b_sandbox_timeout: int = Field(
-        default=300,  # 5 min safety net — explicit per-turn pause is the primary mechanism
-        description="E2B sandbox running-time timeout (seconds). "
-        "E2B timeout is wall-clock (not idle). Explicit per-turn pause is the primary "
-        "mechanism; this is the safety net.",
+        default=43200,  # 12 hours — same as session_ttl
+        description="E2B sandbox keepalive timeout in seconds.",
    )
-    e2b_sandbox_on_timeout: Literal["kill", "pause"] = Field(
-        default="pause",
-        description="E2B lifecycle action on timeout: 'pause' (default, free) or 'kill'.",
-    )
-
-    @property
-    def e2b_active(self) -> bool:
-        """True when E2B is enabled and the API key is present.
-
-        Single source of truth for "should we use E2B right now?".
-        Prefer this over combining ``use_e2b_sandbox`` and ``e2b_api_key``
-        separately at call sites.
-        """
-        return self.use_e2b_sandbox and bool(self.e2b_api_key)
-
-    @property
-    def active_e2b_api_key(self) -> str | None:
-        """Return the E2B API key when E2B is enabled and configured, else None.
-
-        Combines the ``use_e2b_sandbox`` flag check and key presence into one.
-        Use in callers::
-
-            if api_key := config.active_e2b_api_key:
-                # E2B is active; api_key is narrowed to str
-        """
-        return self.e2b_api_key if self.e2b_active else None

    @field_validator("use_e2b_sandbox", mode="before")
    @classmethod
@@ -195,7 +164,7 @@ class ChatConfig(BaseSettings):
            if not v:
                v = os.getenv("OPENAI_BASE_URL")
            if not v:
-                v = OPENROUTER_BASE_URL
+                v = "https://openrouter.ai/api/v1"
        return v

    @field_validator("use_claude_agent_sdk", mode="before")
--- a/autogpt_platform/backend/backend/copilot/config_test.py
+++ b/autogpt_platform/backend/backend/copilot/config_test.py
@@ -1,38 +0,0 @@
-"""Unit tests for ChatConfig."""
-
-import pytest
-
-from .config import ChatConfig
-
-# Env vars that the ChatConfig validators read — must be cleared so they don't
-# override the explicit constructor values we pass in each test.
-_E2B_ENV_VARS = (
-    "CHAT_USE_E2B_SANDBOX",
-    "CHAT_E2B_API_KEY",
-    "E2B_API_KEY",
-)
-
-
-@pytest.fixture(autouse=True)
-def _clean_e2b_env(monkeypatch: pytest.MonkeyPatch) -> None:
-    for var in _E2B_ENV_VARS:
-        monkeypatch.delenv(var, raising=False)
-
-
-class TestE2BActive:
-    """Tests for the e2b_active property — single source of truth for E2B usage."""
-
-    def test_both_enabled_and_key_present_returns_true(self):
-        """e2b_active is True when use_e2b_sandbox=True and e2b_api_key is set."""
-        cfg = ChatConfig(use_e2b_sandbox=True, e2b_api_key="test-key")
-        assert cfg.e2b_active is True
-
-    def test_enabled_but_missing_key_returns_false(self):
-        """e2b_active is False when use_e2b_sandbox=True but e2b_api_key is absent."""
-        cfg = ChatConfig(use_e2b_sandbox=True, e2b_api_key=None)
-        assert cfg.e2b_active is False
-
-    def test_disabled_returns_false(self):
-        """e2b_active is False when use_e2b_sandbox=False regardless of key."""
-        cfg = ChatConfig(use_e2b_sandbox=False, e2b_api_key="test-key")
-        assert cfg.e2b_active is False
--- a/autogpt_platform/backend/backend/copilot/constants.py
+++ b/autogpt_platform/backend/backend/copilot/constants.py
@@ -6,32 +6,6 @@
 COPILOT_ERROR_PREFIX = "[__COPILOT_ERROR_f7a1__]"  # Renders as ErrorCard
 COPILOT_SYSTEM_PREFIX = "[__COPILOT_SYSTEM_e3b0__]"  # Renders as system info message

-# Prefix for all synthetic IDs generated by CoPilot block execution.
-# Used to distinguish CoPilot-generated records from real graph execution records
-# in PendingHumanReview and other tables.
-COPILOT_SYNTHETIC_ID_PREFIX = "copilot-"
-
-# Sub-prefixes for session-scoped and node-scoped synthetic IDs.
-COPILOT_SESSION_PREFIX = f"{COPILOT_SYNTHETIC_ID_PREFIX}session-"
-COPILOT_NODE_PREFIX = f"{COPILOT_SYNTHETIC_ID_PREFIX}node-"
-
-# Separator used in synthetic node_exec_id to encode node_id.
-# Format: "{node_id}:{random_hex}" — extract node_id via rsplit(":", 1)[0]
-COPILOT_NODE_EXEC_ID_SEPARATOR = ":"
-
 # Compaction notice messages shown to users.
 COMPACTION_DONE_MSG = "Earlier messages were summarized to fit within context limits."
 COMPACTION_TOOL_NAME = "context_compaction"
-
-
-def is_copilot_synthetic_id(id_value: str) -> bool:
-    """Check if an ID is a CoPilot synthetic ID (not from a real graph execution)."""
-    return id_value.startswith(COPILOT_SYNTHETIC_ID_PREFIX)
-
-
-def parse_node_id_from_exec_id(node_exec_id: str) -> str:
-    """Extract node_id from a synthetic node_exec_id.
-
-    Format: "{node_id}:{random_hex}" → returns "{node_id}".
-    """
-    return node_exec_id.rsplit(COPILOT_NODE_EXEC_ID_SEPARATOR, 1)[0]
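Review note: the removed helpers encode a synthetic `node_exec_id` as `"{node_id}:{random_hex}"` and recover the `node_id` with `rsplit(":", 1)[0]`. The sketch below illustrates that round trip under stated assumptions: `make_node_exec_id` is a hypothetical constructor (the diff shows only the parsing side), and the 8-byte hex suffix is an arbitrary choice.

```python
import secrets

SEPARATOR = ":"  # same value as COPILOT_NODE_EXEC_ID_SEPARATOR in the removed code


def make_node_exec_id(node_id: str) -> str:
    """Hypothetical constructor: append a random hex suffix after the separator."""
    return f"{node_id}{SEPARATOR}{secrets.token_hex(8)}"


def parse_node_id(node_exec_id: str) -> str:
    """Split on the last separator only, so a node_id containing ':' survives."""
    return node_exec_id.rsplit(SEPARATOR, 1)[0]
```

Because the split is taken from the right, a `node_id` that itself contains `:` is preserved. An ID with no separator at all is returned unchanged, so callers that depend on the format may want to check for the separator first.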
--- a/autogpt_platform/backend/backend/copilot/context.py
+++ b/autogpt_platform/backend/backend/copilot/context.py
@@ -1,128 +0,0 @@
-"""Shared execution context for copilot SDK tool handlers.
-
-All context variables and their accessors live here so that
-``tool_adapter``, ``file_ref``, and ``e2b_file_tools`` can import them
-without creating circular dependencies.
-"""
-
-import os
-import re
-from contextvars import ContextVar
-from typing import TYPE_CHECKING
-
-from backend.copilot.model import ChatSession
-from backend.data.db_accessors import workspace_db
-from backend.util.workspace import WorkspaceManager
-
-if TYPE_CHECKING:
-    from e2b import AsyncSandbox
-
-# Allowed base directory for the Read tool.
-_SDK_PROJECTS_DIR = os.path.realpath(os.path.expanduser("~/.claude/projects"))
-
-# Encoded project-directory name for the current session (e.g.
-# "-private-tmp-copilot-<uuid>").  Set by set_execution_context() so path
-# validation can scope tool-results reads to the current session.
-_current_project_dir: ContextVar[str] = ContextVar("_current_project_dir", default="")
-
-_current_user_id: ContextVar[str | None] = ContextVar("current_user_id", default=None)
-_current_session: ContextVar[ChatSession | None] = ContextVar(
-    "current_session", default=None
-)
-_current_sandbox: ContextVar["AsyncSandbox | None"] = ContextVar(
-    "_current_sandbox", default=None
-)
-_current_sdk_cwd: ContextVar[str] = ContextVar("_current_sdk_cwd", default="")
-
-
-def _encode_cwd_for_cli(cwd: str) -> str:
-    """Encode a working directory path the same way the Claude CLI does."""
-    return re.sub(r"[^a-zA-Z0-9]", "-", os.path.realpath(cwd))
-
-
-def set_execution_context(
-    user_id: str | None,
-    session: ChatSession,
-    sandbox: "AsyncSandbox | None" = None,
-    sdk_cwd: str | None = None,
-) -> None:
-    """Set per-turn context variables used by file-resolution tool handlers."""
-    _current_user_id.set(user_id)
-    _current_session.set(session)
-    _current_sandbox.set(sandbox)
-    _current_sdk_cwd.set(sdk_cwd or "")
-    _current_project_dir.set(_encode_cwd_for_cli(sdk_cwd) if sdk_cwd else "")
-
-
-def get_execution_context() -> tuple[str | None, ChatSession | None]:
-    """Return the current (user_id, session) pair for the active request."""
-    return _current_user_id.get(), _current_session.get()
-
-
-def get_current_sandbox() -> "AsyncSandbox | None":
-    """Return the E2B sandbox for the current session, or None if not active."""
-    return _current_sandbox.get()
-
-
-def get_sdk_cwd() -> str:
-    """Return the SDK working directory for the current session (empty string if unset)."""
-    return _current_sdk_cwd.get()
-
-
-E2B_WORKDIR = "/home/user"
-
-
-def resolve_sandbox_path(path: str) -> str:
-    """Normalise *path* to an absolute sandbox path under ``/home/user``.
-
-    Raises :class:`ValueError` if the resolved path escapes the sandbox.
-    """
-    candidate = path if os.path.isabs(path) else os.path.join(E2B_WORKDIR, path)
-    normalized = os.path.normpath(candidate)
-    if normalized != E2B_WORKDIR and not normalized.startswith(E2B_WORKDIR + "/"):
-        raise ValueError(f"Path must be within {E2B_WORKDIR}: {path}")
-    return normalized
-
-
-async def get_workspace_manager(user_id: str, session_id: str) -> WorkspaceManager:
-    """Create a session-scoped :class:`WorkspaceManager`.
-
-    Placed here (rather than in ``tools/workspace_files``) so that modules
-    like ``sdk/file_ref`` can import it without triggering the heavy
-    ``tools/__init__`` import chain.
-    """
-    workspace = await workspace_db().get_or_create_workspace(user_id)
-    return WorkspaceManager(user_id, workspace.id, session_id)
-
-
-def is_allowed_local_path(path: str, sdk_cwd: str | None = None) -> bool:
-    """Return True if *path* is within an allowed host-filesystem location.
-
-    Allowed:
-    - Files under *sdk_cwd* (``/tmp/copilot-<session>/``)
-    - Files under ``~/.claude/projects/<encoded-cwd>/tool-results/`` (SDK tool-results)
-    """
-    if not path:
-        return False
-
-    if path.startswith("~"):
-        resolved = os.path.realpath(os.path.expanduser(path))
-    elif not os.path.isabs(path) and sdk_cwd:
-        resolved = os.path.realpath(os.path.join(sdk_cwd, path))
-    else:
-        resolved = os.path.realpath(path)
-
-    if sdk_cwd:
-        norm_cwd = os.path.realpath(sdk_cwd)
-        if resolved == norm_cwd or resolved.startswith(norm_cwd + os.sep):
-            return True
-
-    encoded = _current_project_dir.get("")
-    if encoded:
-        tool_results_dir = os.path.join(_SDK_PROJECTS_DIR, encoded, "tool-results")
-        if resolved == tool_results_dir or resolved.startswith(
-            tool_results_dir + os.sep
-        ):
-            return True
-
-    return False
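Review note: the removed `resolve_sandbox_path` confines paths to `/home/user` by normalising the path and then comparing prefixes with a trailing `/`. The sketch below restates that check using `posixpath` (an assumption made so the behaviour is the same on any host OS; the removed code uses `os.path`) and shows why the trailing separator matters. The function name `resolve` is hypothetical.

```python
import posixpath

WORKDIR = "/home/user"  # same value as E2B_WORKDIR in the removed module


def resolve(path: str) -> str:
    """Normalise path to an absolute path under WORKDIR, or raise ValueError."""
    candidate = path if posixpath.isabs(path) else posixpath.join(WORKDIR, path)
    normalized = posixpath.normpath(candidate)  # collapses ".." and "." lexically
    # Comparing against WORKDIR + "/" rejects siblings such as /home/user2.
    if normalized != WORKDIR and not normalized.startswith(WORKDIR + "/"):
        raise ValueError(f"Path must be within {WORKDIR}: {path}")
    return normalized
```

Note that `normpath` is purely lexical and does not follow symlinks. The removed `is_allowed_local_path` uses `os.path.realpath` for host-filesystem paths, which does resolve them.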
--- a/autogpt_platform/backend/backend/copilot/context_test.py
+++ b/autogpt_platform/backend/backend/copilot/context_test.py
@@ -1,163 +0,0 @@
-"""Tests for context.py — execution context variables and path helpers."""
-
-from __future__ import annotations
-
-import os
-import tempfile
-from unittest.mock import MagicMock
-
-import pytest
-
-from backend.copilot.context import (
-    _SDK_PROJECTS_DIR,
-    _current_project_dir,
-    get_current_sandbox,
-    get_execution_context,
-    get_sdk_cwd,
-    is_allowed_local_path,
-    resolve_sandbox_path,
-    set_execution_context,
-)
-
-
-def _make_session() -> MagicMock:
-    s = MagicMock()
-    s.session_id = "test-session"
-    return s
-
-
-# ---------------------------------------------------------------------------
-# Context variable getters
-# ---------------------------------------------------------------------------
-
-
-def test_get_execution_context_defaults():
-    """get_execution_context returns (None, session) when user_id is not set."""
-    set_execution_context(None, _make_session())
-    user_id, session = get_execution_context()
-    assert user_id is None
-    assert session is not None
-
-
-def test_set_and_get_execution_context():
-    """set_execution_context stores user_id and session."""
-    mock_session = _make_session()
-    set_execution_context("user-abc", mock_session)
-    user_id, session = get_execution_context()
-    assert user_id == "user-abc"
-    assert session is mock_session
-
-
-def test_get_current_sandbox_none_by_default():
-    """get_current_sandbox returns None when no sandbox is set."""
-    set_execution_context("u1", _make_session(), sandbox=None)
-    assert get_current_sandbox() is None
-
-
-def test_get_current_sandbox_returns_set_value():
-    """get_current_sandbox returns the sandbox set via set_execution_context."""
-    mock_sandbox = MagicMock()
-    set_execution_context("u1", _make_session(), sandbox=mock_sandbox)
-    assert get_current_sandbox() is mock_sandbox
-
-
-def test_get_sdk_cwd_empty_when_not_set():
-    """get_sdk_cwd returns empty string when sdk_cwd is not set."""
-    set_execution_context("u1", _make_session(), sdk_cwd=None)
-    assert get_sdk_cwd() == ""
-
-
-def test_get_sdk_cwd_returns_set_value():
-    """get_sdk_cwd returns the value set via set_execution_context."""
-    set_execution_context("u1", _make_session(), sdk_cwd="/tmp/copilot-test")
-    assert get_sdk_cwd() == "/tmp/copilot-test"
-
-
-# ---------------------------------------------------------------------------
-# is_allowed_local_path
-# ---------------------------------------------------------------------------
-
-
-def test_is_allowed_local_path_empty():
-    assert not is_allowed_local_path("")
-
-
-def test_is_allowed_local_path_inside_sdk_cwd():
-    with tempfile.TemporaryDirectory() as cwd:
-        path = os.path.join(cwd, "file.txt")
-        assert is_allowed_local_path(path, cwd)
-
-
-def test_is_allowed_local_path_sdk_cwd_itself():
-    with tempfile.TemporaryDirectory() as cwd:
-        assert is_allowed_local_path(cwd, cwd)
-
-
-def test_is_allowed_local_path_outside_sdk_cwd():
-    with tempfile.TemporaryDirectory() as cwd:
-        assert not is_allowed_local_path("/etc/passwd", cwd)
-
-
-def test_is_allowed_local_path_no_sdk_cwd_no_project_dir():
-    """Without sdk_cwd or project_dir, all paths are rejected."""
-    _current_project_dir.set("")
-    assert not is_allowed_local_path("/tmp/some-file.txt", sdk_cwd=None)
-
-
-def test_is_allowed_local_path_tool_results_dir():
-    """Files under the tool-results directory for the current project are allowed."""
-    encoded = "test-encoded-dir"
-    tool_results_dir = os.path.join(_SDK_PROJECTS_DIR, encoded, "tool-results")
-    path = os.path.join(tool_results_dir, "output.txt")
-
-    _current_project_dir.set(encoded)
-    try:
-        assert is_allowed_local_path(path, sdk_cwd=None)
-    finally:
-        _current_project_dir.set("")
-
-
-def test_is_allowed_local_path_sibling_of_tool_results_is_rejected():
-    """A path adjacent to tool-results/ but not inside it is rejected."""
-    encoded = "test-encoded-dir"
-    sibling_path = os.path.join(_SDK_PROJECTS_DIR, encoded, "other-dir", "file.txt")
-
-    _current_project_dir.set(encoded)
-    try:
-        assert not is_allowed_local_path(sibling_path, sdk_cwd=None)
-    finally:
-        _current_project_dir.set("")
-
-
-# ---------------------------------------------------------------------------
-# resolve_sandbox_path
-# ---------------------------------------------------------------------------
-
-
-def test_resolve_sandbox_path_absolute_valid():
-    assert (
-        resolve_sandbox_path("/home/user/project/main.py")
-        == "/home/user/project/main.py"
-    )
-
-
-def test_resolve_sandbox_path_relative():
-    assert resolve_sandbox_path("project/main.py") == "/home/user/project/main.py"
-
-
-def test_resolve_sandbox_path_workdir_itself():
-    assert resolve_sandbox_path("/home/user") == "/home/user"
-
-
-def test_resolve_sandbox_path_normalizes_dots():
-    assert resolve_sandbox_path("/home/user/a/../b") == "/home/user/b"
-
-
-def test_resolve_sandbox_path_escape_raises():
-    with pytest.raises(ValueError, match="/home/user"):
-        resolve_sandbox_path("/home/user/../../etc/passwd")
-
-
-def test_resolve_sandbox_path_absolute_outside_raises():
-    with pytest.raises(ValueError, match="/home/user"):
-        resolve_sandbox_path("/etc/passwd")
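Review note: the removed tests above pin down the contract of `resolve_sandbox_path` without showing its body. The following is a minimal sketch consistent with those assertions, assuming a fixed sandbox working directory of `/home/user` and pure string normalization with no symlink resolution. The function name, the constant, and the error text are hypothetical and are not the code removed from `backend.copilot.context`.

```python
import posixpath

# Assumption: the sandbox working directory implied by the removed tests.
SANDBOX_WORKDIR = "/home/user"


def resolve_sandbox_path_sketch(path: str, workdir: str = SANDBOX_WORKDIR) -> str:
    """Hypothetical re-implementation for illustration only."""
    # Relative paths are interpreted relative to the sandbox working directory.
    candidate = path if posixpath.isabs(path) else posixpath.join(workdir, path)
    # normpath collapses "." and ".." segments without touching the filesystem.
    normalized = posixpath.normpath(candidate)
    # Accept the workdir itself or anything strictly beneath it. Comparing
    # against workdir + "/" keeps "/home/username" from passing as a prefix.
    if normalized != workdir and not normalized.startswith(workdir + "/"):
        raise ValueError(f"Path must be within {workdir}: {path}")
    return normalized
```

Normalizing before the containment check is what makes `/home/user/../../etc/passwd` fail. Checking the raw string prefix would accept it.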
--- a/autogpt_platform/backend/backend/copilot/db.py
+++ b/autogpt_platform/backend/backend/copilot/db.py
@@ -81,35 +81,6 @@ async def update_chat_session(
    return ChatSession.from_db(session) if session else None


-async def update_chat_session_title(
-    session_id: str,
-    user_id: str,
-    title: str,
-    *,
-    only_if_empty: bool = False,
-) -> bool:
-    """Update the title of a chat session, scoped to the owning user.
-
-    Always filters by (session_id, user_id) so callers cannot mutate another
-    user's session even when they know the session_id.
-
-    Args:
-        only_if_empty: When True, uses an atomic ``UPDATE WHERE title IS NULL``
-            guard so auto-generated titles never overwrite a user-set title.
-
-    Returns True if a row was updated, False otherwise (session not found,
-    wrong user, or — when only_if_empty — title was already set).
-    """
-    where: ChatSessionWhereInput = {"id": session_id, "userId": user_id}
-    if only_if_empty:
-        where["title"] = None
-    result = await PrismaChatSession.prisma().update_many(
-        where=where,
-        data={"title": title, "updatedAt": datetime.now(UTC)},
-    )
-    return result > 0
-
-
 async def add_chat_message(
    session_id: str,
    role: str,
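Review note: the removed `update_chat_session_title` relied on a single conditional `UPDATE` so that the "only if empty" check and the write happen atomically. A minimal sketch of that pattern, using the standard-library `sqlite3` module rather than Prisma; the table and column names are assumptions made for the illustration.

```python
import sqlite3

# Assumption: a simplified stand-in schema, not the real ChatSession table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chat_session (id TEXT PRIMARY KEY, user_id TEXT, title TEXT)"
)
conn.execute("INSERT INTO chat_session VALUES ('s1', 'u1', NULL)")


def set_title(
    conn: sqlite3.Connection,
    session_id: str,
    user_id: str,
    title: str,
    *,
    only_if_empty: bool = False,
) -> bool:
    """Return True if a row was updated, False otherwise."""
    # Always scope by owner so a known session id is not enough to rename.
    sql = "UPDATE chat_session SET title = ? WHERE id = ? AND user_id = ?"
    if only_if_empty:
        # The guard lives in the WHERE clause, so there is no read-then-write gap.
        sql += " AND title IS NULL"
    cursor = conn.execute(sql, (title, session_id, user_id))
    return cursor.rowcount > 0
```

The replacement `update_session_title(session_id, title)` later in this diff drops both the owner filter and the empty-title guard, so callers that depended on either would need to enforce them elsewhere.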
--- a/autogpt_platform/backend/backend/copilot/integration_creds.py
+++ b/autogpt_platform/backend/backend/copilot/integration_creds.py
@@ -1,162 +0,0 @@
-"""Integration credential lookup with per-process TTL cache.
-
-Provides token retrieval for connected integrations so that copilot tools
-(e.g. bash_exec) can inject auth tokens into the execution environment without
-hitting the database on every command.
-
-Cache semantics (handled automatically by TTLCache):
-- Token found → cached for _TOKEN_CACHE_TTL (5 min).  Avoids repeated DB hits
-  for users who have credentials and are running many bash commands.
-- No credentials found → cached for _NULL_CACHE_TTL (60 s).  Avoids a DB hit
-  on every E2B command for users who haven't connected an account yet, while
-  still picking up a newly-connected account within one minute.
-
-Both caches are bounded to _CACHE_MAX_SIZE entries; cachetools evicts the
-least-recently-used entry when the limit is reached.
-
-Multi-worker note: both caches are in-process only.  Each worker/replica
-maintains its own independent cache, so a credential fetch may be duplicated
-across processes.  This is acceptable for the current goal (reduce DB hits per
-session per-process), but if cache efficiency across replicas becomes important
-a shared cache (e.g. Redis) should be used instead.
-"""
-
-import logging
-from typing import cast
-
-from cachetools import TTLCache
-
-from backend.data.model import APIKeyCredentials, OAuth2Credentials
-from backend.integrations.creds_manager import (
-    IntegrationCredentialsManager,
-    register_creds_changed_hook,
-)
-
-logger = logging.getLogger(__name__)
-
-# Maps provider slug → env var names to inject when the provider is connected.
-# Add new providers here when adding integration support.
-# NOTE: keep in sync with connect_integration._PROVIDER_INFO — both registries
-# must be updated when adding a new provider.
-PROVIDER_ENV_VARS: dict[str, list[str]] = {
-    "github": ["GH_TOKEN", "GITHUB_TOKEN"],
-}
-
-_TOKEN_CACHE_TTL = 300.0  # seconds — for found tokens
-_NULL_CACHE_TTL = 60.0  # seconds — for "not connected" results
-_CACHE_MAX_SIZE = 10_000
-
-# (user_id, provider) → token string.  TTLCache handles expiry + eviction.
-# Thread-safety note: TTLCache is NOT thread-safe, but that is acceptable here
-# because all callers (get_provider_token, invalidate_user_provider_cache) run
-# exclusively on the asyncio event loop.  There are no await points between a
-# cache read and its corresponding write within any function, so no concurrent
-# coroutine can interleave.  If ThreadPoolExecutor workers are ever added to
-# this path, a threading.RLock should be wrapped around these caches.
-_token_cache: TTLCache[tuple[str, str], str] = TTLCache(
-    maxsize=_CACHE_MAX_SIZE, ttl=_TOKEN_CACHE_TTL
-)
-# Separate cache for "no credentials" results with a shorter TTL.
-_null_cache: TTLCache[tuple[str, str], bool] = TTLCache(
-    maxsize=_CACHE_MAX_SIZE, ttl=_NULL_CACHE_TTL
-)
-
-
-def invalidate_user_provider_cache(user_id: str, provider: str) -> None:
-    """Remove the cached entry for *user_id*/*provider* from both caches.
-
-    Call this after storing new credentials so that the next
-    ``get_provider_token()`` call performs a fresh DB lookup instead of
-    serving a stale TTL-cached result.
-    """
-    key = (user_id, provider)
-    _token_cache.pop(key, None)
-    _null_cache.pop(key, None)
-
-
-# Register this module's cache-bust function with the credentials manager so
-# that any create/update/delete operation immediately evicts stale cache
-# entries.  This avoids a lazy import inside creds_manager and eliminates the
-# circular-import risk.
-register_creds_changed_hook(invalidate_user_provider_cache)
-
-# Module-level singleton to avoid re-instantiating IntegrationCredentialsManager
-# on every cache-miss call to get_provider_token().
-_manager = IntegrationCredentialsManager()
-
-
-async def get_provider_token(user_id: str, provider: str) -> str | None:
-    """Return the user's access token for *provider*, or ``None`` if not connected.
-
-    OAuth2 tokens are preferred (refreshed if needed); API keys are the fallback.
-    Found tokens are cached for _TOKEN_CACHE_TTL (5 min).  "Not connected" results
-    are cached for _NULL_CACHE_TTL (60 s) to avoid a DB hit on every bash_exec
-    command for users who haven't connected yet, while still picking up a
-    newly-connected account within one minute.
-    """
-    cache_key = (user_id, provider)
-
-    if cache_key in _null_cache:
-        return None
-    if cached := _token_cache.get(cache_key):
-        return cached
-
-    manager = _manager
-    try:
-        creds_list = await manager.store.get_creds_by_provider(user_id, provider)
-    except Exception:
-        logger.debug("Failed to fetch %s credentials for user %s", provider, user_id)
-        return None
-
-    # Pass 1: prefer OAuth2 (carry scope info, refreshable via token endpoint).
-    # Sort so broader-scoped tokens come first: a token with "repo" scope covers
-    # full git access, while a public-data-only token lacks push/pull permission.
-    # lock=False — background injection; not worth a distributed lock acquisition.
-    oauth2_creds = sorted(
-        [c for c in creds_list if c.type == "oauth2"],
-        key=lambda c: 0 if "repo" in (cast(OAuth2Credentials, c).scopes or []) else 1,
-    )
-    for creds in oauth2_creds:
-        if creds.type == "oauth2":
-            try:
-                fresh = await manager.refresh_if_needed(
-                    user_id, cast(OAuth2Credentials, creds), lock=False
-                )
-                token = fresh.access_token.get_secret_value()
-            except Exception:
-                logger.warning(
-                    "Failed to refresh %s OAuth token for user %s; "
-                    "falling back to potentially stale token",
-                    provider,
-                    user_id,
-                )
-                token = cast(OAuth2Credentials, creds).access_token.get_secret_value()
-            _token_cache[cache_key] = token
-            return token
-
-    # Pass 2: fall back to API key (no expiry, no refresh needed).
-    for creds in creds_list:
-        if creds.type == "api_key":
-            token = cast(APIKeyCredentials, creds).api_key.get_secret_value()
-            _token_cache[cache_key] = token
-            return token
-
-    # No credentials found — cache to avoid repeated DB hits.
-    _null_cache[cache_key] = True
-    return None
-
-
-async def get_integration_env_vars(user_id: str) -> dict[str, str]:
-    """Return env vars for all providers the user has connected.
-
-    Iterates :data:`PROVIDER_ENV_VARS`, fetches each token, and builds a flat
-    ``{env_var: token}`` dict ready to pass to a subprocess or E2B sandbox.
-    Only providers with a stored credential contribute entries.
-    """
-    env: dict[str, str] = {}
-    for provider, var_names in PROVIDER_ENV_VARS.items():
-        token = await get_provider_token(user_id, provider)
-        if token:
-            for var in var_names:
-                env[var] = token
-    return env
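Review note: the removed module's docstring describes two expiry windows, a long one for found tokens and a short one for "not connected" results. A minimal standard-library sketch of that idea follows. It is not the removed `cachetools.TTLCache` code: the class and method names are hypothetical, and the size bound and eviction policy are left out.

```python
import time
from typing import Callable, Optional

Key = tuple[str, str]  # (user_id, provider)


class TwoTierTTLCache:
    """Hypothetical stand-in for the _token_cache/_null_cache pair."""

    def __init__(
        self,
        hit_ttl: float = 300.0,
        miss_ttl: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._hit_ttl = hit_ttl
        self._miss_ttl = miss_ttl
        self._clock = clock
        # key -> (expiry timestamp, token, or None meaning "not connected")
        self._entries: dict[Key, tuple[float, Optional[str]]] = {}

    def lookup(self, key: Key) -> tuple[bool, Optional[str]]:
        """Return (found, value); found is False on a miss or an expired entry."""
        entry = self._entries.get(key)
        if entry is None or entry[0] <= self._clock():
            self._entries.pop(key, None)
            return False, None
        return True, entry[1]

    def store(self, key: Key, token: Optional[str]) -> None:
        # Negative results expire sooner so a new connection is noticed quickly.
        ttl = self._hit_ttl if token is not None else self._miss_ttl
        self._entries[key] = (self._clock() + ttl, token)

    def invalidate(self, key: Key) -> None:
        self._entries.pop(key, None)
```

Injecting the clock is only a testing convenience; it lets expiry be exercised without sleeping.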
--- a/autogpt_platform/backend/backend/copilot/integration_creds_test.py
+++ b/autogpt_platform/backend/backend/copilot/integration_creds_test.py
@@ -1,193 +0,0 @@
-"""Tests for integration_creds — TTL cache and token lookup paths."""
-
-from unittest.mock import AsyncMock, MagicMock, patch
-
-import pytest
-from pydantic import SecretStr
-
-from backend.copilot.integration_creds import (
-    _NULL_CACHE_TTL,
-    _TOKEN_CACHE_TTL,
-    PROVIDER_ENV_VARS,
-    _null_cache,
-    _token_cache,
-    get_integration_env_vars,
-    get_provider_token,
-    invalidate_user_provider_cache,
-)
-from backend.data.model import APIKeyCredentials, OAuth2Credentials
-
-_USER = "user-integration-creds-test"
-_PROVIDER = "github"
-
-
-def _make_api_key_creds(key: str = "test-api-key") -> APIKeyCredentials:
-    return APIKeyCredentials(
-        id="creds-api-key",
-        provider=_PROVIDER,
-        api_key=SecretStr(key),
-        title="Test API Key",
-        expires_at=None,
-    )
-
-
-def _make_oauth2_creds(token: str = "test-oauth-token") -> OAuth2Credentials:
-    return OAuth2Credentials(
-        id="creds-oauth2",
-        provider=_PROVIDER,
-        title="Test OAuth",
-        access_token=SecretStr(token),
-        refresh_token=SecretStr("test-refresh"),
-        access_token_expires_at=None,
-        refresh_token_expires_at=None,
-        scopes=[],
-    )
-
-
-@pytest.fixture(autouse=True)
-def clear_caches():
-    """Ensure clean caches before and after every test."""
-    _token_cache.clear()
-    _null_cache.clear()
-    yield
-    _token_cache.clear()
-    _null_cache.clear()
-
-
-class TestInvalidateUserProviderCache:
-    def test_removes_token_entry(self):
-        key = (_USER, _PROVIDER)
-        _token_cache[key] = "tok"
-        invalidate_user_provider_cache(_USER, _PROVIDER)
-        assert key not in _token_cache
-
-    def test_removes_null_entry(self):
-        key = (_USER, _PROVIDER)
-        _null_cache[key] = True
-        invalidate_user_provider_cache(_USER, _PROVIDER)
-        assert key not in _null_cache
-
-    def test_noop_when_key_not_cached(self):
-        # Should not raise even when there is no cache entry.
-        invalidate_user_provider_cache("no-such-user", _PROVIDER)
-
-    def test_only_removes_targeted_key(self):
-        other_key = ("other-user", _PROVIDER)
-        _token_cache[other_key] = "other-tok"
-        invalidate_user_provider_cache(_USER, _PROVIDER)
-        assert other_key in _token_cache
-
-
-class TestGetProviderToken:
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_returns_cached_token_without_db_hit(self):
-        _token_cache[(_USER, _PROVIDER)] = "cached-tok"
-
-        mock_manager = MagicMock()
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result == "cached-tok"
-        mock_manager.store.get_creds_by_provider.assert_not_called()
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_returns_none_for_null_cached_provider(self):
-        _null_cache[(_USER, _PROVIDER)] = True
-
-        mock_manager = MagicMock()
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result is None
-        mock_manager.store.get_creds_by_provider.assert_not_called()
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_api_key_creds_returned_and_cached(self):
-        api_creds = _make_api_key_creds("my-api-key")
-        mock_manager = MagicMock()
-        mock_manager.store.get_creds_by_provider = AsyncMock(return_value=[api_creds])
-
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result == "my-api-key"
-        assert _token_cache.get((_USER, _PROVIDER)) == "my-api-key"
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_oauth2_preferred_over_api_key(self):
-        oauth_creds = _make_oauth2_creds("oauth-tok")
-        api_creds = _make_api_key_creds("api-tok")
-        mock_manager = MagicMock()
-        mock_manager.store.get_creds_by_provider = AsyncMock(
-            return_value=[api_creds, oauth_creds]
-        )
-        mock_manager.refresh_if_needed = AsyncMock(return_value=oauth_creds)
-
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result == "oauth-tok"
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_oauth2_refresh_failure_falls_back_to_stale_token(self):
-        oauth_creds = _make_oauth2_creds("stale-oauth-tok")
-        mock_manager = MagicMock()
-        mock_manager.store.get_creds_by_provider = AsyncMock(return_value=[oauth_creds])
-        mock_manager.refresh_if_needed = AsyncMock(side_effect=RuntimeError("network"))
-
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result == "stale-oauth-tok"
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_no_credentials_caches_null_entry(self):
-        mock_manager = MagicMock()
-        mock_manager.store.get_creds_by_provider = AsyncMock(return_value=[])
-
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result is None
-        assert _null_cache.get((_USER, _PROVIDER)) is True
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_db_exception_returns_none_without_caching(self):
-        mock_manager = MagicMock()
-        mock_manager.store.get_creds_by_provider = AsyncMock(
-            side_effect=RuntimeError("db down")
-        )
-
-        with patch("backend.copilot.integration_creds._manager", mock_manager):
-            result = await get_provider_token(_USER, _PROVIDER)
-
-        assert result is None
-        # DB errors are not cached — next call will retry
-        assert (_USER, _PROVIDER) not in _token_cache
-        assert (_USER, _PROVIDER) not in _null_cache
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_null_cache_has_shorter_ttl_than_token_cache(self):
-        """Verify the TTL constants are set correctly for each cache."""
-        assert _null_cache.ttl == _NULL_CACHE_TTL
-        assert _token_cache.ttl == _TOKEN_CACHE_TTL
-        assert _NULL_CACHE_TTL < _TOKEN_CACHE_TTL
-
-
-class TestGetIntegrationEnvVars:
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_injects_all_env_vars_for_provider(self):
-        _token_cache[(_USER, "github")] = "gh-tok"
-
-        result = await get_integration_env_vars(_USER)
-
-        for var in PROVIDER_ENV_VARS["github"]:
-            assert result[var] == "gh-tok"
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_empty_dict_when_no_credentials(self):
-        _null_cache[(_USER, "github")] = True
-
-        result = await get_integration_env_vars(_USER)
-
-        assert result == {}
--- a/autogpt_platform/backend/backend/copilot/model.py
+++ b/autogpt_platform/backend/backend/copilot/model.py
@@ -469,16 +469,8 @@ async def upsert_chat_session(
            )
            db_error = e

-        # Save to cache (best-effort, even if DB failed).
-        # Title updates (update_session_title) run *outside* this lock because
-        # they only touch the title field, not messages.  So a concurrent rename
-        # or auto-title may have written a newer title to Redis while this
-        # upsert was in progress.  Always prefer the cached title to avoid
-        # overwriting it with the stale in-memory copy.
+        # Save to cache (best-effort, even if DB failed)
        try:
-            existing_cached = await _get_session_from_cache(session.session_id)
-            if existing_cached and existing_cached.title:
-                session = session.model_copy(update={"title": existing_cached.title})
            await cache_chat_session(session)
        except Exception as e:
            # If DB succeeded but cache failed, raise cache error
@@ -693,48 +685,30 @@ async def delete_chat_session(session_id: str, user_id: str | None = None) -> bo
    return True


-async def update_session_title(
-    session_id: str,
-    user_id: str,
-    title: str,
-    *,
-    only_if_empty: bool = False,
-) -> bool:
-    """Update the title of a chat session, scoped to the owning user.
+async def update_session_title(session_id: str, title: str) -> bool:
+    """Update only the title of a chat session.

-    Lightweight operation that doesn't touch messages, avoiding race conditions
-    with concurrent message updates.
+    This is a lightweight operation that doesn't touch messages, avoiding
+    race conditions with concurrent message updates. Use this for background
+    title generation instead of upsert_chat_session.

    Args:
        session_id: The session ID to update.
-        user_id: Owning user — the DB query filters on this.
        title: The new title to set.
-        only_if_empty: When True, uses an atomic ``UPDATE WHERE title IS NULL``
-            so auto-generated titles never overwrite a user-set title.

    Returns:
-        True if updated successfully, False otherwise (not found, wrong user,
-        or — when only_if_empty — title was already set).
+        True if updated successfully, False otherwise.
    """
    try:
-        updated = await chat_db().update_chat_session_title(
-            session_id, user_id, title, only_if_empty=only_if_empty
-        )
-        if not updated:
+        result = await chat_db().update_chat_session(session_id=session_id, title=title)
+        if result is None:
+            logger.warning(f"Session {session_id} not found for title update")
            return False

-        # Update title in cache if it exists (instead of invalidating).
-        # This prevents race conditions where cache invalidation causes
-        # the frontend to see stale DB data while streaming is still in progress.
-        try:
-            cached = await _get_session_from_cache(session_id)
-            if cached:
-                cached.title = title
-                await cache_chat_session(cached)
-        except Exception as e:
-            logger.warning(
-                f"Cache title update failed for session {session_id} (non-critical): {e}"
-            )
+        # Invalidate the cache so the next access reloads from DB with the
+        # updated title. This avoids a read-modify-write on the full session
+        # blob, which could overwrite concurrent message updates.
+        await invalidate_session_cache(session_id)

        return True
    except Exception as e:
--- a/autogpt_platform/backend/backend/copilot/optimize_blocks.py
+++ b/autogpt_platform/backend/backend/copilot/optimize_blocks.py
@@ -1,138 +0,0 @@
-"""Scheduler job to generate LLM-optimized block descriptions.
-
-Runs periodically to rewrite block descriptions into concise, actionable
-summaries that help the copilot LLM pick the right blocks during agent
-generation.
-"""
-
-import asyncio
-import logging
-
-from backend.blocks import get_blocks
-from backend.util.clients import get_database_manager_client, get_openai_client
-
-logger = logging.getLogger(__name__)
-
-SYSTEM_PROMPT = (
-    "You are a technical writer for an automation platform. "
-    "Rewrite the following block description to be concise (under 50 words), "
-    "informative, and actionable. Focus on what the block does and when to "
-    "use it. Output ONLY the rewritten description, nothing else. "
-    "Do not use markdown formatting."
-)
-
-# Rate-limit delay between sequential LLM calls (seconds)
-_RATE_LIMIT_DELAY = 0.5
-# Maximum tokens for optimized description generation
-_MAX_DESCRIPTION_TOKENS = 150
-# Model for generating optimized descriptions (fast, cheap)
-_MODEL = "gpt-4o-mini"
-
-
-async def _optimize_descriptions(blocks: list[dict[str, str]]) -> dict[str, str]:
-    """Call the shared OpenAI client to rewrite each block description."""
-    client = get_openai_client()
-    if client is None:
-        logger.error(
-            "No OpenAI client configured, skipping block description optimization"
-        )
-        return {}
-
-    results: dict[str, str] = {}
-    for block in blocks:
-        block_id = block["id"]
-        block_name = block["name"]
-        description = block["description"]
-
-        try:
-            response = await client.chat.completions.create(
-                model=_MODEL,
-                messages=[
-                    {"role": "system", "content": SYSTEM_PROMPT},
-                    {
-                        "role": "user",
-                        "content": f"Block name: {block_name}\nDescription: {description}",
-                    },
-                ],
-                max_tokens=_MAX_DESCRIPTION_TOKENS,
-            )
-            optimized = (response.choices[0].message.content or "").strip()
-            if optimized:
-                results[block_id] = optimized
-                logger.debug("Optimized description for %s", block_name)
-            else:
-                logger.warning("Empty response for block %s", block_name)
-        except Exception:
-            logger.warning(
-                "Failed to optimize description for %s", block_name, exc_info=True
-            )
-
-        await asyncio.sleep(_RATE_LIMIT_DELAY)
-
-    return results
-
-
-def optimize_block_descriptions() -> dict[str, int]:
-    """Generate optimized descriptions for blocks that don't have one yet.
-
-    Uses the shared OpenAI client to rewrite block descriptions into concise
-    summaries suitable for agent generation prompts.
-
-    Returns:
-        Dict with counts: processed, success, failed, skipped.
-    """
-    db_client = get_database_manager_client()
-
-    blocks = db_client.get_blocks_needing_optimization()
-    if not blocks:
-        logger.info("All blocks already have optimized descriptions")
-        return {"processed": 0, "success": 0, "failed": 0, "skipped": 0}
-
-    logger.info("Found %d blocks needing optimized descriptions", len(blocks))
-
-    non_empty = [b for b in blocks if b.get("description", "").strip()]
-    skipped = len(blocks) - len(non_empty)
-
-    new_descriptions = asyncio.run(_optimize_descriptions(non_empty))
-
-    stats = {
-        "processed": len(non_empty),
-        "success": len(new_descriptions),
-        "failed": len(non_empty) - len(new_descriptions),
-        "skipped": skipped,
-    }
-
-    logger.info(
-        "Block description optimization complete: "
-        "%d/%d succeeded, %d failed, %d skipped",
-        stats["success"],
-        stats["processed"],
-        stats["failed"],
-        stats["skipped"],
-    )
-
-    if new_descriptions:
-        for block_id, optimized in new_descriptions.items():
-            db_client.update_block_optimized_description(block_id, optimized)
-
-        # Update in-memory descriptions first so the cache rebuilds with fresh data.
-        try:
-            block_classes = get_blocks()
-            for block_id, optimized in new_descriptions.items():
-                if block_id in block_classes:
-                    block_classes[block_id]._optimized_description = optimized
-            logger.info(
-                "Updated %d in-memory block descriptions", len(new_descriptions)
-            )
-        except Exception:
-            logger.warning(
-                "Could not update in-memory block descriptions", exc_info=True
-            )
-
-        from backend.copilot.tools.agent_generator.blocks import (
-            reset_block_caches,  # local to avoid circular import
-        )
-
-        reset_block_caches()
-
-    return stats
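Review note: the removed `_optimize_descriptions` loop combines three behaviours: strictly sequential calls, a fixed delay after every item, and per-item error isolation. A minimal sketch of that shape with the LLM call replaced by an injected coroutine; `rewrite_all` and its parameters are hypothetical names, not part of the removed module.

```python
import asyncio
from typing import Awaitable, Callable


async def rewrite_all(
    items: dict[str, str],
    rewrite: Callable[[str], Awaitable[str]],
    delay: float = 0.5,
) -> dict[str, str]:
    """Rewrite each item in turn; failed or empty results are left out."""
    results: dict[str, str] = {}
    for item_id, text in items.items():
        try:
            rewritten = (await rewrite(text)).strip()
            if rewritten:
                results[item_id] = rewritten
        except Exception:
            # One failure must not abort the batch; the item is simply skipped.
            pass
        # Sleep after every item, including failures, to respect rate limits.
        await asyncio.sleep(delay)
    return results
```

Because the sleep sits outside the `try`, two items produce two sleeps, which matches the `sleep_mock.call_count == 2` assertion in the removed test file.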
--- a/autogpt_platform/backend/backend/copilot/optimize_blocks_test.py
+++ b/autogpt_platform/backend/backend/copilot/optimize_blocks_test.py
@@ -1,91 +0,0 @@
-"""Unit tests for optimize_blocks._optimize_descriptions."""
-
-import asyncio
-from unittest.mock import AsyncMock, MagicMock, patch
-
-from backend.copilot.optimize_blocks import _RATE_LIMIT_DELAY, _optimize_descriptions
-
-
-def _make_client_response(text: str) -> MagicMock:
-    """Build a minimal mock that looks like an OpenAI ChatCompletion response."""
-    choice = MagicMock()
-    choice.message.content = text
-    response = MagicMock()
-    response.choices = [choice]
-    return response
-
-
-def _run(coro):
-    return asyncio.get_event_loop().run_until_complete(coro)
-
-
-class TestOptimizeDescriptions:
-    """Tests for _optimize_descriptions async function."""
-
-    def test_returns_empty_when_no_client(self):
-        with patch(
-            "backend.copilot.optimize_blocks.get_openai_client", return_value=None
-        ):
-            result = _run(
-                _optimize_descriptions([{"id": "b1", "name": "B", "description": "d"}])
-            )
-        assert result == {}
-
-    def test_success_single_block(self):
-        client = MagicMock()
-        client.chat.completions.create = AsyncMock(
-            return_value=_make_client_response("Short desc.")
-        )
-        blocks = [{"id": "b1", "name": "MyBlock", "description": "A block."}]
-
-        with (
-            patch(
-                "backend.copilot.optimize_blocks.get_openai_client", return_value=client
-            ),
-            patch(
-                "backend.copilot.optimize_blocks.asyncio.sleep", new_callable=AsyncMock
-            ),
-        ):
-            result = _run(_optimize_descriptions(blocks))
-
-        assert result == {"b1": "Short desc."}
-        client.chat.completions.create.assert_called_once()
-
-    def test_skips_block_on_exception(self):
-        client = MagicMock()
-        client.chat.completions.create = AsyncMock(side_effect=Exception("API error"))
-        blocks = [{"id": "b1", "name": "MyBlock", "description": "A block."}]
-
-        with (
-            patch(
-                "backend.copilot.optimize_blocks.get_openai_client", return_value=client
-            ),
-            patch(
-                "backend.copilot.optimize_blocks.asyncio.sleep", new_callable=AsyncMock
-            ),
-        ):
-            result = _run(_optimize_descriptions(blocks))
-
-        assert result == {}
-
-    def test_sleeps_between_blocks(self):
-        client = MagicMock()
-        client.chat.completions.create = AsyncMock(
-            return_value=_make_client_response("desc")
-        )
-        blocks = [
-            {"id": "b1", "name": "B1", "description": "d1"},
-            {"id": "b2", "name": "B2", "description": "d2"},
-        ]
-        sleep_mock = AsyncMock()
-
-        with (
-            patch(
-                "backend.copilot.optimize_blocks.get_openai_client", return_value=client
-            ),
-            patch("backend.copilot.optimize_blocks.asyncio.sleep", sleep_mock),
-        ):
-            _run(_optimize_descriptions(blocks))
-
-        assert sleep_mock.call_count == 2
-        sleep_mock.assert_called_with(_RATE_LIMIT_DELAY)
--- a/autogpt_platform/backend/backend/copilot/prompt_constants.py
+++ b/autogpt_platform/backend/backend/copilot/prompt_constants.py
@@ -0,0 +1,29 @@
+"""Prompt constants for CoPilot - workflow guidance and supplementary documentation.
+
+This module contains workflow patterns and guidance that supplement the main system prompt.
+These are appended dynamically to the prompt along with auto-generated tool documentation.
+"""
+
+# Workflow guidance for key tool patterns
+# This is appended after the auto-generated tool list to provide usage patterns
+KEY_WORKFLOWS = """
+
+## KEY WORKFLOWS
+
+### MCP Integration Workflow
+When using `run_mcp_tool`:
+1. **Known servers** (use directly): Notion (https://mcp.notion.com/mcp), Linear (https://mcp.linear.app/mcp), Stripe (https://mcp.stripe.com), Intercom (https://mcp.intercom.com/mcp), Cloudflare (https://mcp.cloudflare.com/mcp), Atlassian (https://mcp.atlassian.com/mcp)
+2. **Unknown servers**: Use `web_search("{{service}} MCP server URL")` to find the endpoint
+3. **Discovery**: Call `run_mcp_tool(server_url)` to see available tools
+4. **Execution**: Call `run_mcp_tool(server_url, tool_name, tool_arguments)`
+5. **Authentication**: If credentials needed, user will be prompted. When they confirm, retry immediately with same arguments.
+
+### Agent Creation Workflow
+When using `create_agent`:
+1. Always check `find_library_agent` first for existing solutions
+2. Call `create_agent` with description
+3. **If `suggested_goal` returned**: Present to user, ask for confirmation, call again with suggested goal if accepted
+4. **If `clarifying_questions` returned**: After user answers, call again with original description AND answers in `context` parameter
+
+### Folder Management
+Use folder tools (`create_folder`, `list_folders`, `move_agents_to_folder`) to organize agents in the user's library for better discoverability."""
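Review note on the `{{service}}` placeholder in `KEY_WORKFLOWS`: the diff does not show whether this constant is passed through `str.format()` before being appended to the prompt, so the following is only a sketch of the two possibilities. `SNIPPET` is a hypothetical stand-in for one line of the constant, not code from the PR.

```python
# Hypothetical excerpt standing in for one line of KEY_WORKFLOWS.
SNIPPET = 'Use `web_search("{{service}} MCP server URL")` to find the endpoint'

# Case 1: the constant is concatenated verbatim.
# The doubled braces reach the model unchanged.
verbatim = "system prompt\n" + SNIPPET

# Case 2: the constant goes through str.format().
# Each doubled brace collapses to a single literal brace.
formatted = SNIPPET.format()

print(verbatim)
print(formatted)
```

If the constant is only ever concatenated, single braces would render the placeholder as presumably intended; if it is formatted, the doubled braces are needed.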
--- a/autogpt_platform/backend/backend/copilot/prompting.py
+++ b/autogpt_platform/backend/backend/copilot/prompting.py
@@ -1,285 +0,0 @@
-"""Centralized prompt building logic for CoPilot.
-
-This module contains all prompt construction functions and constants,
-handling the distinction between:
- SDK mode vs Baseline mode (tool documentation needs)
- Local mode vs E2B mode (storage/filesystem differences)
-"""
-
-from backend.copilot.tools import TOOL_REGISTRY
-
-# Shared technical notes that apply to both SDK and baseline modes
-_SHARED_TOOL_NOTES = """\
-
-### Sharing files with the user
-After saving a file to the persistent workspace with `write_workspace_file`,
-share it with the user by embedding the `download_url` from the response in
-your message as a Markdown link or image:
-
- **Any file** — shows as a clickable download link:
-  `[report.csv](workspace://file_id#text/csv)`
- **Image** — renders inline in chat:
-  `![chart](workspace://file_id#image/png)`
- **Video** — renders inline in chat with player controls:
-  `![recording](workspace://file_id#video/mp4)`
-
-The `download_url` field in the `write_workspace_file` response is already
-in the correct format — paste it directly after the `(` in the Markdown.
-
-### Passing file content to tools — @@agptfile: references
-Instead of copying large file contents into a tool argument, pass a file
-reference and the platform will load the content for you.
-
-Syntax: `@@agptfile:<uri>[<start>-<end>]`
-
- `<uri>` **must** start with `workspace://` or `/` (absolute path):
-  - `workspace://<file_id>` — workspace file by ID
-  - `workspace:///<path>` — workspace file by virtual path
-  - `/absolute/local/path` — ephemeral or sdk_cwd file
-  - E2B sandbox absolute path (e.g. `/home/user/script.py`)
- `[<start>-<end>]` is an optional 1-indexed inclusive line range.
- URIs that do not start with `workspace://` or `/` are **not** expanded.
-
-Examples:
-```
-@@agptfile:workspace://abc123
-@@agptfile:workspace://abc123[10-50]
-@@agptfile:workspace:///reports/q1.md
-@@agptfile:/tmp/copilot-<session>/output.py[1-80]
-@@agptfile:/home/user/script.py
-```
-
-You can embed a reference inside any string argument, or use it as the entire
-value.  Multiple references in one argument are all expanded.
-
-**Structured data**: When the **entire** argument value is a single file
-reference (no surrounding text), the platform automatically parses the file
-content based on its extension or MIME type.  Supported formats: JSON, JSONL,
-CSV, TSV, YAML, TOML, Parquet, and Excel (.xlsx — first sheet only).
-For example, pass `@@agptfile:workspace://<id>` where the file is a `.csv` and
-the rows will be parsed into `list[list[str]]` automatically.  If the format is
-unrecognised or parsing fails, the content is returned as a plain string.
-Legacy `.xls` files are **not** supported — only the modern `.xlsx` format.
-
-**Type coercion**: The platform also coerces expanded values to match the
-block's expected input types.  For example, if a block expects `list[list[str]]`
-and the expanded value is a JSON string, it will be parsed into the correct type.
-
-### Media file inputs (format: "file")
-Some block inputs accept media files — their schema shows `"format": "file"`.
-These fields accept:
- **`workspace://<file_id>`** or **`workspace://<file_id>#<mime>`** — preferred
-  for large files (images, videos, PDFs). The platform passes the reference
-  directly to the block without reading the content into memory.
- **`data:<mime>;base64,<payload>`** — inline base64 data URI, suitable for
-  small files only.
-
-When a block input has `format: "file"`, **pass the `workspace://` URI
-directly as the value** (do NOT wrap it in `@@agptfile:`). This avoids large
-payloads in tool arguments and preserves binary content (images, videos)
-that would be corrupted by text encoding.
-
-Example — committing an image file to GitHub:
-```json
-{
-  "files": [{
-    "path": "docs/hero.png",
-    "content": "workspace://abc123#image/png",
-    "operation": "upsert"
-  }]
-}
-```
-
-### Sub-agent tasks
- When using the Task tool, NEVER set `run_in_background` to true.
-  All tasks must run in the foreground.
-"""
-
-# E2B-only notes — E2B has full internet access so gh CLI works there.
-# Not shown in local (bubblewrap) mode: --unshare-net blocks all network.
-_E2B_TOOL_NOTES = """
-### GitHub CLI (`gh`) and git
- If the user has connected their GitHub account, both `gh` and `git` are
-  pre-authenticated — use them directly without any manual login step.
-  `git` HTTPS operations (clone, push, pull) work automatically.
- If the token changes mid-session (e.g. user reconnects with a new token),
-  run `gh auth setup-git` to re-register the credential helper.
- If `gh` or `git` fails with an authentication error (e.g. "authentication
-  required", "could not read Username", or exit code 128), call
-  `connect_integration(provider="github")` to surface the GitHub credentials
-  setup card so the user can connect their account. Once connected, retry
-  the operation.
- For operations that need broader access (e.g. private org repos, GitHub
-  Actions), pass the required scopes: e.g.
-  `connect_integration(provider="github", scopes=["repo", "read:org"])`.
-"""
-
-
-# Environment-specific supplement templates
-def _build_storage_supplement(
-    working_dir: str,
-    sandbox_type: str,
-    storage_system_1_name: str,
-    storage_system_1_characteristics: list[str],
-    storage_system_1_persistence: list[str],
-    file_move_name_1_to_2: str,
-    file_move_name_2_to_1: str,
-    extra_notes: str = "",
-) -> str:
-    """Build storage/filesystem supplement for a specific environment.
-
-    Template function handles all formatting (bullets, indentation, markdown).
-    Callers provide clean data as lists of strings.
-
-    Args:
-        working_dir: Working directory path
-        sandbox_type: Description of bash_exec sandbox
-        storage_system_1_name: Name of primary storage (ephemeral or cloud)
-        storage_system_1_characteristics: List of characteristic descriptions
-        storage_system_1_persistence: List of persistence behavior descriptions
-        file_move_name_1_to_2: Direction label for primary→persistent
-        file_move_name_2_to_1: Direction label for persistent→primary
-        extra_notes: Environment-specific notes appended after shared notes
-    """
-    # Format lists as bullet points with proper indentation
-    characteristics = "\n".join(f"   - {c}" for c in storage_system_1_characteristics)
-    persistence = "\n".join(f"   - {p}" for p in storage_system_1_persistence)
-
-    return f"""
-
-## Tool notes
-
-### Shell commands
- The SDK built-in Bash tool is NOT available.  Use the `bash_exec` MCP tool
-  for shell commands — it runs {sandbox_type}.
-
-### Working directory
- Your working directory is: `{working_dir}`
- All SDK file tools AND `bash_exec` operate on the same filesystem
- Use relative paths or absolute paths under `{working_dir}` for all file operations
-
-### Two storage systems — CRITICAL to understand
-
-1. **{storage_system_1_name}** (`{working_dir}`):
-{characteristics}
-{persistence}
-
-2. **Persistent workspace** (cloud storage):
-   - Files here **survive across sessions indefinitely**
-
-### Moving files between storages
- **{file_move_name_1_to_2}**: Copy to persistent workspace
- **{file_move_name_2_to_1}**: Download for processing
-
-### File persistence
-Important files (code, configs, outputs) should be saved to workspace to ensure they persist.
-{_SHARED_TOOL_NOTES}{extra_notes}"""
-
-
-# Pre-built supplements for common environments
-def _get_local_storage_supplement(cwd: str) -> str:
-    """Local ephemeral storage (files lost between turns).
-
-    Network is isolated (bubblewrap --unshare-net), so internet-dependent CLIs
-    like gh will not work — no integration env-var notes are included.
-    """
-    return _build_storage_supplement(
-        working_dir=cwd,
-        sandbox_type="in a network-isolated sandbox",
-        storage_system_1_name="Ephemeral working directory",
-        storage_system_1_characteristics=[
-            "Shared by SDK Read/Write/Edit/Glob/Grep tools AND `bash_exec`",
-        ],
-        storage_system_1_persistence=[
-            "Files here are **lost between turns** — do NOT rely on them persisting",
-            "Use for temporary work: running scripts, processing data, etc.",
-        ],
-        file_move_name_1_to_2="Ephemeral → Persistent",
-        file_move_name_2_to_1="Persistent → Ephemeral",
-    )
-
-
-def _get_cloud_sandbox_supplement() -> str:
-    """Cloud persistent sandbox (files survive across turns in session).
-
-    E2B has full internet access, so integration tokens (GH_TOKEN etc.) are
-    injected per command in bash_exec — include the CLI guidance notes.
-    """
-    return _build_storage_supplement(
-        working_dir="/home/user",
-        sandbox_type="in a cloud sandbox with full internet access",
-        storage_system_1_name="Cloud sandbox",
-        storage_system_1_characteristics=[
-            "Shared by all file tools AND `bash_exec` — same filesystem",
-            "Full Linux environment with internet access",
-        ],
-        storage_system_1_persistence=[
-            "Files **persist across turns** within the current session",
-            "Lost when the session expires (12 h inactivity)",
-        ],
-        file_move_name_1_to_2="Sandbox → Persistent",
-        file_move_name_2_to_1="Persistent → Sandbox",
-        extra_notes=_E2B_TOOL_NOTES,
-    )
-
-
-def _generate_tool_documentation() -> str:
-    """Auto-generate tool documentation from TOOL_REGISTRY.
-
-    NOTE: This is ONLY used in baseline mode (direct OpenAI API).
-    SDK mode doesn't need it since Claude gets tool schemas automatically.
-
-    This generates a complete list of available tools with their descriptions,
-    ensuring the documentation stays in sync with the actual tool implementations.
-    All workflow guidance is now embedded in individual tool descriptions.
-
-    Only documents tools that are available in the current environment
-    (checked via tool.is_available property).
-    """
-    docs = "\n## AVAILABLE TOOLS\n\n"
-
-    # Sort tools alphabetically for consistent output
-    # Filter by is_available to match get_available_tools() behavior
-    for name in sorted(TOOL_REGISTRY.keys()):
-        tool = TOOL_REGISTRY[name]
-        if not tool.is_available:
-            continue
-        schema = tool.as_openai_tool()
-        desc = schema["function"].get("description", "No description available")
-        # Format as bullet list with tool name in code style
-        docs += f"- **`{name}`**: {desc}\n"
-
-    return docs
-
-
-def get_sdk_supplement(use_e2b: bool, cwd: str = "") -> str:
-    """Get the supplement for SDK mode (Claude Agent SDK).
-
-    SDK mode does NOT include tool documentation because Claude automatically
-    receives tool schemas from the SDK. Only includes technical notes about
-    storage systems and execution environment.
-
-    Args:
-        use_e2b: Whether E2B cloud sandbox is being used
-        cwd: Current working directory (only used in local_storage mode)
-
-    Returns:
-        The supplement string to append to the system prompt
-    """
-    if use_e2b:
-        return _get_cloud_sandbox_supplement()
-    return _get_local_storage_supplement(cwd)
-
-
-def get_baseline_supplement() -> str:
-    """Get the supplement for baseline mode (direct OpenAI API).
-
-    Baseline mode INCLUDES auto-generated tool documentation because the
-    direct API doesn't automatically provide tool schemas to Claude.
-    Also includes shared technical notes (but NOT SDK-specific environment details).
-
-    Returns:
-        The supplement string to append to the system prompt
-    """
-    tool_docs = _generate_tool_documentation()
-    return tool_docs + _SHARED_TOOL_NOTES
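Review note: the removed `_generate_tool_documentation` walks a registry, skips unavailable tools, and emits one bullet per tool. A minimal self-contained sketch of that pattern follows. `DemoTool` and `DEMO_REGISTRY` are hypothetical stand-ins for the real tool classes and `TOOL_REGISTRY`, which are not shown in this diff.

```python
from dataclasses import dataclass


@dataclass
class DemoTool:
    """Hypothetical stand-in for a CoPilot tool (not the real class)."""

    name: str
    description: str
    is_available: bool = True

    def as_openai_tool(self) -> dict:
        # Mirrors the shape read by the removed function: schema["function"]["description"].
        return {
            "type": "function",
            "function": {"name": self.name, "description": self.description},
        }


def generate_tool_documentation(registry: dict[str, DemoTool]) -> str:
    """Build a Markdown bullet list of available tools, sorted by name."""
    lines = ["", "## AVAILABLE TOOLS", ""]
    for name in sorted(registry):
        tool = registry[name]
        if not tool.is_available:
            continue  # undocumented when the tool cannot be used
        desc = tool.as_openai_tool()["function"].get(
            "description", "No description available"
        )
        lines.append(f"- **`{name}`**: {desc}")
    return "\n".join(lines) + "\n"


DEMO_REGISTRY = {
    "run_block": DemoTool("run_block", "Run a block."),
    "find_block": DemoTool("find_block", "Search for blocks."),
    "hidden": DemoTool("hidden", "Not exposed.", is_available=False),
}
docs = generate_tool_documentation(DEMO_REGISTRY)
print(docs)
```

Collecting lines and joining once avoids repeated string concatenation while producing the same layout as the original loop.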
--- a/autogpt_platform/backend/backend/copilot/sdk/__init__.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/__init__.py
@@ -3,45 +3,12 @@
 This module provides the integration layer between the Claude Agent SDK
 and the existing CoPilot tool system, enabling drop-in replacement of
 the current LLM orchestration with the battle-tested Claude Agent SDK.
-
-Submodule imports are deferred via PEP 562 ``__getattr__`` to break a
-circular import cycle::
-
-    sdk/__init__ → tool_adapter → copilot.tools (TOOL_REGISTRY)
-    copilot.tools → run_block → sdk.file_ref  (no cycle here, but…)
-    sdk/__init__ → service → copilot.prompting → copilot.tools  (cycle!)
-
-``tool_adapter`` uses ``TOOL_REGISTRY`` at **module level** to build the
-static ``COPILOT_TOOL_NAMES`` list, so the import cannot be deferred to
-function scope without a larger refactor (moving tool-name registration
-to a separate lightweight module).  The lazy-import pattern here is the
-least invasive way to break the cycle while keeping module-level constants
-intact.
 """

-from typing import Any
+from .service import stream_chat_completion_sdk
+from .tool_adapter import create_copilot_mcp_server

 __all__ = [
    "stream_chat_completion_sdk",
    "create_copilot_mcp_server",
 ]
-
-# Dispatch table for PEP 562 lazy imports.  Each entry is a (module, attr)
-# pair so new exports can be added without touching __getattr__ itself.
-_LAZY_IMPORTS: dict[str, tuple[str, str]] = {
-    "stream_chat_completion_sdk": (".service", "stream_chat_completion_sdk"),
-    "create_copilot_mcp_server": (".tool_adapter", "create_copilot_mcp_server"),
-}
-
-
-def __getattr__(name: str) -> Any:
-    entry = _LAZY_IMPORTS.get(name)
-    if entry is not None:
-        module_path, attr = entry
-        import importlib
-
-        module = importlib.import_module(module_path, package=__name__)
-        value = getattr(module, attr)
-        globals()[name] = value
-        return value
-    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
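Review note: the removed code relied on PEP 562, where a module-level `__getattr__` is consulted only for names missing from the module namespace. The sketch below reproduces the dispatch-table idea in a self-contained way. It builds a throwaway module object with `types.ModuleType` and points the table at standard-library targets (`math.sqrt`, `json.dumps`); those targets and the helper name `make_lazy_module` are illustrative assumptions, not part of the PR.

```python
import importlib
import types

# (module, attribute) pairs, resolved on first access only.
LAZY_TABLE = {
    "sqrt": ("math", "sqrt"),
    "dumps": ("json", "dumps"),
}


def make_lazy_module(name: str, table: dict[str, tuple[str, str]]) -> types.ModuleType:
    """Create a module whose attributes are imported lazily (PEP 562)."""
    mod = types.ModuleType(name)

    def __getattr__(attr: str):
        entry = table.get(attr)
        if entry is None:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        module_path, target = entry
        value = getattr(importlib.import_module(module_path), target)
        # Cache in the module namespace so __getattr__ is not called again.
        setattr(mod, attr, value)
        return value

    # PEP 562 looks up __getattr__ in the module's own __dict__.
    mod.__getattr__ = __getattr__
    return mod


lazy = make_lazy_module("lazy_demo", LAZY_TABLE)
cached_before = "sqrt" in vars(lazy)
root = lazy.sqrt(9)
cached_after = "sqrt" in vars(lazy)
```

With the eager `from .service import ...` form introduced by this diff, the import cycle described in the removed docstring must no longer exist; the deletion of `copilot/prompting.py` in the same change appears to be what removes it.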
--- a/autogpt_platform/backend/backend/copilot/sdk/agent_generation_guide.md
+++ b/autogpt_platform/backend/backend/copilot/sdk/agent_generation_guide.md
@@ -1,155 +0,0 @@
-## Agent Generation Guide
-
-You can create, edit, and customize agents directly. You ARE the brain —
-generate the agent JSON yourself using block schemas, then validate and save.
-
-### Workflow for Creating/Editing Agents
-
-1. **Discover blocks**: Call `find_block(query, include_schemas=true)` to
-   search for relevant blocks. This returns block IDs, names, descriptions,
-   and full input/output schemas.
-2. **Find library agents**: Call `find_library_agent` to discover reusable
-   agents that can be composed as sub-agents via `AgentExecutorBlock`.
-3. **Generate JSON**: Build the agent JSON using block schemas:
-   - Use block IDs from step 1 as `block_id` in nodes
-   - Wire outputs to inputs using links
-   - Set design-time config in `input_default`
-   - Use `AgentInputBlock` for values the user provides at runtime
-4. **Write to workspace**: Save the JSON to a workspace file so the user
-   can review it: `write_workspace_file(filename="agent.json", content=...)`
-5. **Validate**: Call `validate_agent_graph` with the agent JSON to check
-   for errors
-6. **Fix if needed**: Call `fix_agent_graph` to auto-fix common issues,
-   or fix manually based on the error descriptions. Iterate until valid.
-7. **Save**: Call `create_agent` (new) or `edit_agent` (existing) with
-   the final `agent_json`
-
-### Agent JSON Structure
-
-```json
-{
-  "id": "<UUID v4>",        // auto-generated if omitted
-  "version": 1,
-  "is_active": true,
-  "name": "Agent Name",
-  "description": "What the agent does",
-  "nodes": [
-    {
-      "id": "<UUID v4>",
-      "block_id": "<block UUID from find_block>",
-      "input_default": {
-        "field_name": "design-time value"
-      },
-      "metadata": {
-        "position": {"x": 0, "y": 0},
-        "customized_name": "Optional display name"
-      }
-    }
-  ],
-  "links": [
-    {
-      "id": "<UUID v4>",
-      "source_id": "<source node UUID>",
-      "source_name": "output_field_name",
-      "sink_id": "<sink node UUID>",
-      "sink_name": "input_field_name",
-      "is_static": false
-    }
-  ]
-}
-```
-
-### REQUIRED: AgentInputBlock and AgentOutputBlock
-
-Every agent MUST include at least one AgentInputBlock and one AgentOutputBlock.
-These define the agent's interface — what it accepts and what it produces.
-
-**AgentInputBlock** (ID: `c0a8e994-ebf1-4a9c-a4d8-89d09c86741b`):
- Defines a user-facing input field on the agent
- Required `input_default` fields: `name` (str), `value` (default: null)
- Optional: `title`, `description`, `placeholder_values` (for dropdowns)
- Output: `result` — the user-provided value at runtime
- Create one AgentInputBlock per distinct input the agent needs
-
-**AgentOutputBlock** (ID: `363ae599-353e-4804-937e-b2ee3cef3da4`):
- Defines a user-facing output displayed after the agent runs
- Required `input_default` fields: `name` (str)
- The `value` input should be linked from another block's output
- Optional: `title`, `description`, `format` (Jinja2 template)
- Create one AgentOutputBlock per distinct result to show the user
-
-Without these blocks, the agent has no interface and the user cannot provide
-inputs or see outputs. NEVER skip them.
-
-### Key Rules
-
- **Name & description**: Include `name` and `description` in the agent JSON
-  when creating a new agent, or when editing and the agent's purpose changed.
-  Without these the agent gets a generic default name.
- **Design-time vs runtime**: `input_default` = values known at build time.
-  For user-provided values, create an `AgentInputBlock` node and link its
-  output to the consuming block's input.
- **Credentials**: Do NOT require credentials upfront. Users configure
-  credentials later in the platform UI after the agent is saved.
- **Node spacing**: Position nodes with at least 800 X-units between them.
- **Nested properties**: Use `parentField_#_childField` notation in link
-  sink_name/source_name to access nested object fields.
- **is_static links**: Set `is_static: true` when the link carries a
-  design-time constant (matches a field in inputSchema with a default).
- **ConditionBlock**: Needs a `StoreValueBlock` wired to its `value2` input.
- **Prompt templates**: Use `{{variable}}` (double curly braces) for
-  literal braces in prompt strings — single `{` and `}` are for
-  template variables.
- **AgentExecutorBlock**: When composing sub-agents, set `graph_id` and
-  `graph_version` in input_default, and wire inputs/outputs to match
-  the sub-agent's schema.
-
-### Using Sub-Agents (AgentExecutorBlock)
-
-To compose agents using other agents as sub-agents:
-1. Call `find_library_agent` to find the sub-agent — the response includes
-   `graph_id`, `graph_version`, `input_schema`, and `output_schema`
-2. Create an `AgentExecutorBlock` node (ID: `e189baac-8c20-45a1-94a7-55177ea42565`)
-3. Set `input_default`:
-   - `graph_id`: from the library agent's `graph_id`
-   - `graph_version`: from the library agent's `graph_version`
-   - `input_schema`: from the library agent's `input_schema` (JSON Schema)
-   - `output_schema`: from the library agent's `output_schema` (JSON Schema)
-   - `user_id`: leave as `""` (filled at runtime)
-   - `inputs`: `{}` (populated by links at runtime)
-4. Wire inputs: link to sink names matching the sub-agent's `input_schema`
-   property names (e.g., if input_schema has a `"url"` property, use
-   `"url"` as the sink_name)
-5. Wire outputs: link from source names matching the sub-agent's
-   `output_schema` property names
-6. Pass `library_agent_ids` to `create_agent`/`customize_agent` with
-   the library agent IDs used, so the fixer can validate schemas
-
-### Using MCP Tools (MCPToolBlock)
-
-To use an MCP (Model Context Protocol) tool as a node in the agent:
-1. The user must specify which MCP server URL and tool name they want
-2. Create an `MCPToolBlock` node (ID: `a0a4b1c2-d3e4-4f56-a7b8-c9d0e1f2a3b4`)
-3. Set `input_default`:
-   - `server_url`: the MCP server URL (e.g. `"https://mcp.example.com/sse"`)
-   - `selected_tool`: the tool name on that server
-   - `tool_input_schema`: JSON Schema for the tool's inputs
-   - `tool_arguments`: `{}` (populated by links or hardcoded values)
-4. The block requires MCP credentials — the user configures these in the
-   platform UI after the agent is saved
-5. Wire inputs using the tool argument field name directly as the sink_name
-   (e.g., `query`, NOT `tool_arguments_#_query`). The execution engine
-   automatically collects top-level fields matching tool_input_schema into
-   tool_arguments.
-6. Output: `result` (the tool's return value) and `error` (error message)
-
-### Example: Simple AI Text Processor
-
-A minimal agent with input, processing, and output:
- Node 1: `AgentInputBlock` (ID: `c0a8e994-ebf1-4a9c-a4d8-89d09c86741b`,
-  input_default: {"name": "user_text", "title": "Text to process"},
-  output: "result")
- Node 2: `AITextGeneratorBlock` (input: "prompt" linked from Node 1's "result")
- Node 3: `AgentOutputBlock` (ID: `363ae599-353e-4804-937e-b2ee3cef3da4`,
-  input_default: {"name": "summary", "title": "Summary"},
-  input: "value" linked from Node 2's output)
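Review note: the three-node example in the removed guide can be sketched as a small builder. The two block IDs are the ones quoted in the guide. The text generator's block ID is left as the guide's own placeholder, and its output name `response` is an assumption, since the guide only says "Node 2's output". The helper names are hypothetical.

```python
import json
import uuid

AGENT_INPUT_BLOCK_ID = "c0a8e994-ebf1-4a9c-a4d8-89d09c86741b"
AGENT_OUTPUT_BLOCK_ID = "363ae599-353e-4804-937e-b2ee3cef3da4"
TEXT_GENERATOR_BLOCK_ID = "<block UUID from find_block>"  # placeholder, not a real ID


def make_node(block_id: str, input_default: dict, x: int) -> dict:
    """One graph node; x spacing follows the guide's 800-unit rule."""
    return {
        "id": str(uuid.uuid4()),
        "block_id": block_id,
        "input_default": input_default,
        "metadata": {"position": {"x": x, "y": 0}},
    }


def make_link(source: dict, source_name: str, sink: dict, sink_name: str) -> dict:
    """Wire one node's output to another node's input."""
    return {
        "id": str(uuid.uuid4()),
        "source_id": source["id"],
        "source_name": source_name,
        "sink_id": sink["id"],
        "sink_name": sink_name,
        "is_static": False,
    }


def build_text_processor() -> dict:
    inp = make_node(AGENT_INPUT_BLOCK_ID, {"name": "user_text", "title": "Text to process"}, 0)
    gen = make_node(TEXT_GENERATOR_BLOCK_ID, {}, 800)
    out = make_node(AGENT_OUTPUT_BLOCK_ID, {"name": "summary", "title": "Summary"}, 1600)
    return {
        "name": "Simple AI Text Processor",
        "description": "Processes user text with an AI block.",
        "nodes": [inp, gen, out],
        "links": [
            make_link(inp, "result", gen, "prompt"),
            make_link(gen, "response", out, "value"),  # "response" is assumed
        ],
    }


agent = build_text_processor()
agent_json = json.dumps(agent, indent=2)
```

The resulting `agent_json` is what the guide's workflow would pass to validation before saving.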
--- a/autogpt_platform/backend/backend/copilot/sdk/compaction.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/compaction.py
@@ -11,7 +11,7 @@ persistence, and the ``CompactionTracker`` state machine.
 import asyncio
 import logging
 import uuid
-from dataclasses import dataclass, field
+from collections.abc import Callable

 from ..constants import COMPACTION_DONE_MSG, COMPACTION_TOOL_NAME
 from ..model import ChatMessage, ChatSession
@@ -27,19 +27,6 @@ from ..response_model import (
 logger = logging.getLogger(__name__)


-@dataclass
-class CompactionResult:
-    """Result of emit_end_if_ready — bundles events with compaction metadata.
-
-    Eliminates the need for separate ``compaction_just_ended`` checks,
-    preventing TOCTOU races between the emit call and the flag read.
-    """
-
-    events: list[StreamBaseResponse] = field(default_factory=list)
-    just_ended: bool = False
-    transcript_path: str = ""
-
-
 # ---------------------------------------------------------------------------
 # Event builders (private — use CompactionTracker or compaction_events)
 # ---------------------------------------------------------------------------
@@ -190,22 +177,11 @@ class CompactionTracker:
        self._start_emitted = False
        self._done = False
        self._tool_call_id = ""
-        self._transcript_path: str = ""

-    def on_compact(self, transcript_path: str = "") -> None:
-        """Callback for the PreCompact hook. Stores transcript_path."""
-        if (
-            self._transcript_path
-            and transcript_path
-            and self._transcript_path != transcript_path
-        ):
-            logger.warning(
-                "[Compaction] Overwriting transcript_path %s -> %s",
-                self._transcript_path,
-                transcript_path,
-            )
-        self._transcript_path = transcript_path
-        self._compact_start.set()
+    @property
+    def on_compact(self) -> Callable[[], None]:
+        """Callback for the PreCompact hook."""
+        return self._compact_start.set

    # ------------------------------------------------------------------
    # Pre-query compaction
@@ -225,7 +201,6 @@ class CompactionTracker:
        self._done = False
        self._start_emitted = False
        self._tool_call_id = ""
-        self._transcript_path = ""

    def emit_start_if_ready(self) -> list[StreamBaseResponse]:
        """If the PreCompact hook fired, emit start events (spinning tool)."""
@@ -236,20 +211,15 @@ class CompactionTracker:
            return _start_events(self._tool_call_id)
        return []

-    async def emit_end_if_ready(self, session: ChatSession) -> CompactionResult:
-        """If compaction is in progress, emit end events and persist.
-
-        Returns a ``CompactionResult`` with ``just_ended=True`` and the
-        captured ``transcript_path`` when a compaction cycle completes.
-        This avoids a separate flag check (TOCTOU-safe).
-        """
+    async def emit_end_if_ready(self, session: ChatSession) -> list[StreamBaseResponse]:
+        """If compaction is in progress, emit end events and persist."""
        # Yield so pending hook tasks can set compact_start
        await asyncio.sleep(0)

        if self._done:
-            return CompactionResult()
+            return []
        if not self._start_emitted and not self._compact_start.is_set():
-            return CompactionResult()
+            return []

        if self._start_emitted:
            # Close the open spinner
@@ -262,12 +232,8 @@ class CompactionTracker:
                COMPACTION_DONE_MSG, tool_call_id=persist_id
            )

-        transcript_path = self._transcript_path
        self._compact_start.clear()
        self._start_emitted = False
        self._done = True
-        self._transcript_path = ""
        _persist(session, persist_id, COMPACTION_DONE_MSG)
-        return CompactionResult(
-            events=done_events, just_ended=True, transcript_path=transcript_path
-        )
+        return done_events
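Review note: after this change `on_compact` is a property that returns the event's bound `set` method, so `tracker.on_compact()` still works at call sites, but the callback no longer accepts a `transcript_path` argument. A minimal sketch of that behavior follows; `MiniTracker` is a hypothetical reduction of `CompactionTracker`, keeping only the event.

```python
import asyncio
from collections.abc import Callable


class MiniTracker:
    """Hypothetical reduction of CompactionTracker: only the event is kept."""

    def __init__(self) -> None:
        self._compact_start = asyncio.Event()

    @property
    def on_compact(self) -> Callable[[], None]:
        """Callback for the PreCompact hook: the event's bound set()."""
        return self._compact_start.set


tracker = MiniTracker()

# The property can be handed to a hook as a plain zero-argument callable...
hook = tracker.on_compact
hook()
fired = tracker._compact_start.is_set()

# ...but passing a path, as the removed tests did, now raises TypeError.
try:
    tracker.on_compact("/path/1")
    accepts_path = True
except TypeError:
    accepts_path = False
```

This is consistent with the removal of the tests that called `tracker.on_compact("/path/1")` further down in this diff.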
--- a/autogpt_platform/backend/backend/copilot/sdk/compaction_test.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/compaction_test.py
@@ -195,11 +195,10 @@ class TestCompactionTracker:
        session = _make_session()
        tracker.on_compact()
        tracker.emit_start_if_ready()
-        result = await tracker.emit_end_if_ready(session)
-        assert result.just_ended is True
-        assert len(result.events) == 2
-        assert isinstance(result.events[0], StreamToolOutputAvailable)
-        assert isinstance(result.events[1], StreamFinishStep)
+        evts = await tracker.emit_end_if_ready(session)
+        assert len(evts) == 2
+        assert isinstance(evts[0], StreamToolOutputAvailable)
+        assert isinstance(evts[1], StreamFinishStep)
        # Should persist
        assert len(session.messages) == 2

@@ -211,32 +210,28 @@ class TestCompactionTracker:
        session = _make_session()
        tracker.on_compact()
        # Don't call emit_start_if_ready
-        result = await tracker.emit_end_if_ready(session)
-        assert result.just_ended is True
-        assert len(result.events) == 5  # Full self-contained event
-        assert isinstance(result.events[0], StreamStartStep)
+        evts = await tracker.emit_end_if_ready(session)
+        assert len(evts) == 5  # Full self-contained event
+        assert isinstance(evts[0], StreamStartStep)
        assert len(session.messages) == 2

    @pytest.mark.asyncio
-    async def test_emit_end_no_op_when_no_new_compaction(self):
+    async def test_emit_end_no_op_when_done(self):
        tracker = CompactionTracker()
        session = _make_session()
        tracker.on_compact()
        tracker.emit_start_if_ready()
-        result1 = await tracker.emit_end_if_ready(session)
-        assert result1.just_ended is True
-        # Second call should be no-op (no new on_compact)
-        result2 = await tracker.emit_end_if_ready(session)
-        assert result2.just_ended is False
-        assert result2.events == []
+        await tracker.emit_end_if_ready(session)
+        # Second call should be no-op
+        evts = await tracker.emit_end_if_ready(session)
+        assert evts == []

    @pytest.mark.asyncio
    async def test_emit_end_no_op_when_nothing_happened(self):
        tracker = CompactionTracker()
        session = _make_session()
-        result = await tracker.emit_end_if_ready(session)
-        assert result.just_ended is False
-        assert result.events == []
+        evts = await tracker.emit_end_if_ready(session)
+        assert evts == []

    def test_emit_pre_query(self):
        tracker = CompactionTracker()
@@ -251,29 +246,20 @@ class TestCompactionTracker:
        tracker._done = True
        tracker._start_emitted = True
        tracker._tool_call_id = "old"
-        tracker._transcript_path = "/some/path"
        tracker.reset_for_query()
        assert tracker._done is False
        assert tracker._start_emitted is False
        assert tracker._tool_call_id == ""
-        assert tracker._transcript_path == ""

    @pytest.mark.asyncio
-    async def test_pre_query_blocks_sdk_compaction_until_reset(self):
-        """After pre-query compaction, SDK compaction is blocked until
-        reset_for_query is called."""
+    async def test_pre_query_blocks_sdk_compaction(self):
+        """After pre-query compaction, SDK compaction events are suppressed."""
        tracker = CompactionTracker()
        session = _make_session()
        tracker.emit_pre_query(session)
        tracker.on_compact()
-        # _done is True so emit_start_if_ready is blocked
        evts = tracker.emit_start_if_ready()
-        assert evts == []
-        # Reset clears _done, allowing subsequent compaction
-        tracker.reset_for_query()
-        tracker.on_compact()
-        evts = tracker.emit_start_if_ready()
-        assert len(evts) == 3
+        assert evts == []  # _done blocks it

    @pytest.mark.asyncio
    async def test_reset_allows_new_compaction(self):
@@ -293,9 +279,9 @@ class TestCompactionTracker:
        session = _make_session()
        tracker.on_compact()
        start_evts = tracker.emit_start_if_ready()
-        result = await tracker.emit_end_if_ready(session)
+        end_evts = await tracker.emit_end_if_ready(session)
        start_evt = start_evts[1]
-        end_evt = result.events[0]
+        end_evt = end_evts[0]
        assert isinstance(start_evt, StreamToolInputStart)
        assert isinstance(end_evt, StreamToolOutputAvailable)
        assert start_evt.toolCallId == end_evt.toolCallId
@@ -303,105 +289,3 @@ class TestCompactionTracker:
        tool_calls = session.messages[0].tool_calls
        assert tool_calls is not None
        assert tool_calls[0]["id"] == start_evt.toolCallId
-
-    @pytest.mark.asyncio
-    async def test_multiple_compactions_within_query(self):
-        """Two mid-stream compactions within a single query both trigger."""
-        tracker = CompactionTracker()
-        session = _make_session()
-
-        # First compaction cycle
-        tracker.on_compact("/path/1")
-        tracker.emit_start_if_ready()
-        result1 = await tracker.emit_end_if_ready(session)
-        assert result1.just_ended is True
-        assert len(result1.events) == 2
-        assert result1.transcript_path == "/path/1"
-
-        # Second compaction cycle (should NOT be blocked — _done resets
-        # because emit_end_if_ready sets it True, but the next on_compact
-        # + emit_start_if_ready checks !_done which IS True now.
-        # So we need reset_for_query between queries, but within a single
-        # query multiple compactions work because _done blocks emit_start
-        # until the next message arrives, at which point emit_end detects it)
-        #
-        # Actually: _done=True blocks emit_start_if_ready, so we need
-        # the stream loop to reset. In practice service.py doesn't call
-        # reset between compactions within the same query — let's verify
-        # the actual behavior.
-        tracker.on_compact("/path/2")
-        # _done is True from first compaction, so start is blocked
-        start_evts = tracker.emit_start_if_ready()
-        assert start_evts == []
-        # But emit_end returns no-op because _done is True
-        result2 = await tracker.emit_end_if_ready(session)
-        assert result2.just_ended is False
-
-    @pytest.mark.asyncio
-    async def test_multiple_compactions_with_intervening_message(self):
-        """Multiple compactions work when the stream loop processes messages between them.
-
-        In the real service.py flow:
-        1. PreCompact fires → on_compact()
-        2. emit_start shows spinner
-        3. Next message arrives → emit_end completes compaction (_done=True)
-        4. Stream continues processing messages...
-        5. If a second PreCompact fires, _done=True blocks emit_start
-        6. But the next message triggers emit_end, which sees _done=True → no-op
-        7. The stream loop needs to detect this and handle accordingly
-
-        The actual flow for multiple compactions within a query requires
-        _done to be cleared between them. The service.py code uses
-        CompactionResult.just_ended to trigger replace_entries, and _done
-        stays True until reset_for_query.
-        """
-        tracker = CompactionTracker()
-        session = _make_session()
-
-        # First compaction
-        tracker.on_compact("/path/1")
-        tracker.emit_start_if_ready()
-        result1 = await tracker.emit_end_if_ready(session)
-        assert result1.just_ended is True
-        assert result1.transcript_path == "/path/1"
-
-        # Simulate reset between queries
-        tracker.reset_for_query()
-
-        # Second compaction in new query
-        tracker.on_compact("/path/2")
-        start_evts = tracker.emit_start_if_ready()
-        assert len(start_evts) == 3
-        result2 = await tracker.emit_end_if_ready(session)
-        assert result2.just_ended is True
-        assert result2.transcript_path == "/path/2"
-
-    def test_on_compact_stores_transcript_path(self):
-        tracker = CompactionTracker()
-        tracker.on_compact("/some/path.jsonl")
-        assert tracker._transcript_path == "/some/path.jsonl"
-
-    @pytest.mark.asyncio
-    async def test_emit_end_returns_transcript_path(self):
-        """CompactionResult includes the transcript_path from on_compact."""
-        tracker = CompactionTracker()
-        session = _make_session()
-        tracker.on_compact("/my/session.jsonl")
-        tracker.emit_start_if_ready()
-        result = await tracker.emit_end_if_ready(session)
-        assert result.just_ended is True
-        assert result.transcript_path == "/my/session.jsonl"
-        # transcript_path is cleared after emit_end
-        assert tracker._transcript_path == ""
-
-    @pytest.mark.asyncio
-    async def test_emit_end_clears_transcript_path(self):
-        """After emit_end, _transcript_path is reset so it doesn't leak to
-        subsequent non-compaction emit_end calls."""
-        tracker = CompactionTracker()
-        session = _make_session()
-        tracker.on_compact("/first/path.jsonl")
-        tracker.emit_start_if_ready()
-        await tracker.emit_end_if_ready(session)
-        # After compaction, _transcript_path is cleared
-        assert tracker._transcript_path == ""
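Note on the tests above: both the kept and the removed tests rely on the tracker being one-shot per query. Once a compaction has ended, `_done` blocks further start events until `reset_for_query()` is called. The toy model below is only a sketch of that state machine. `ToyCompactionTracker`, its string events and its synchronous `emit_end_if_ready()` are hypothetical stand-ins, not the real `CompactionTracker`, which is async and takes a session. The counts of three start events and two end events are taken from the assertions in this diff.

```python
from dataclasses import dataclass


@dataclass
class ToyCompactionTracker:
    """Hypothetical stand-in for illustration; not the real CompactionTracker."""

    _pending: bool = False  # a compaction was signalled but not yet shown
    _started: bool = False  # start events were emitted, end events are due
    _done: bool = False     # a compaction already completed in this query

    def on_compact(self) -> None:
        self._pending = True

    def emit_start_if_ready(self) -> list[str]:
        # Blocked once a compaction has completed, until reset_for_query().
        if self._pending and not self._started and not self._done:
            self._pending = False
            self._started = True
            return ["start-step", "tool-input-start", "tool-input-available"]
        return []

    def emit_end_if_ready(self) -> list[str]:
        # Only closes a compaction that was actually started.
        if self._started:
            self._started = False
            self._done = True
            return ["tool-output-available", "finish-step"]
        return []

    def reset_for_query(self) -> None:
        self._pending = self._started = self._done = False


tracker = ToyCompactionTracker()
tracker.on_compact()
first_start = tracker.emit_start_if_ready()
first_end = tracker.emit_end_if_ready()

tracker.on_compact()
blocked_start = tracker.emit_start_if_ready()  # _done blocks it

tracker.reset_for_query()
tracker.on_compact()
second_start = tracker.emit_start_if_ready()
```

In this sketch a second compaction inside the same query is silently dropped, which matches what the removed `test_multiple_compactions_within_query` asserted.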
--- a/autogpt_platform/backend/backend/copilot/sdk/e2b_file_tools.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/e2b_file_tools.py
@@ -8,6 +8,8 @@ SDK-internal paths (``~/.claude/projects/…/tool-results/``) are handled
 by the separate ``Read`` MCP tool registered in ``tool_adapter.py``.
 """

+from __future__ import annotations
+
 import itertools
 import json
 import logging
@@ -15,23 +17,36 @@ import os
 import shlex
 from typing import Any, Callable

-from backend.copilot.context import (
-    E2B_WORKDIR,
-    get_current_sandbox,
-    get_sdk_cwd,
-    is_allowed_local_path,
-    resolve_sandbox_path,
-)
+from backend.copilot.tools.e2b_sandbox import E2B_WORKDIR

 logger = logging.getLogger(__name__)


-def _get_sandbox():
+# Lazy imports to break circular dependency with tool_adapter.
+
+
+def _get_sandbox():  # type: ignore[return]
+    from .tool_adapter import get_current_sandbox  # noqa: E402
+
    return get_current_sandbox()


 def _is_allowed_local(path: str) -> bool:
-    return is_allowed_local_path(path, get_sdk_cwd())
+    from .tool_adapter import is_allowed_local_path  # noqa: E402
+
+    return is_allowed_local_path(path)
+
+
+def _resolve_remote(path: str) -> str:
+    """Normalise *path* to an absolute sandbox path under ``/home/user``.
+
+    Raises :class:`ValueError` if the resolved path escapes the sandbox.
+    """
+    candidate = path if os.path.isabs(path) else os.path.join(E2B_WORKDIR, path)
+    normalized = os.path.normpath(candidate)
+    if normalized != E2B_WORKDIR and not normalized.startswith(E2B_WORKDIR + "/"):
+        raise ValueError(f"Path must be within {E2B_WORKDIR}: {path}")
+    return normalized


 def _mcp(text: str, *, error: bool = False) -> dict[str, Any]:
@@ -48,7 +63,7 @@ def _get_sandbox_and_path(
    if sandbox is None:
        return _mcp("No E2B sandbox available", error=True)
    try:
-        remote = resolve_sandbox_path(file_path)
+        remote = _resolve_remote(file_path)
    except ValueError as exc:
        return _mcp(str(exc), error=True)
    return sandbox, remote
@@ -58,7 +73,6 @@ def _get_sandbox_and_path(


 async def _handle_read_file(args: dict[str, Any]) -> dict[str, Any]:
-    """Read lines from a sandbox file, falling back to the local host for SDK-internal paths."""
    file_path: str = args.get("file_path", "")
    offset: int = max(0, int(args.get("offset", 0)))
    limit: int = max(1, int(args.get("limit", 2000)))
@@ -90,7 +104,6 @@ async def _handle_read_file(args: dict[str, Any]) -> dict[str, Any]:


 async def _handle_write_file(args: dict[str, Any]) -> dict[str, Any]:
-    """Write content to a sandbox file, creating parent directories as needed."""
    file_path: str = args.get("file_path", "")
    content: str = args.get("content", "")

@@ -114,7 +127,6 @@ async def _handle_write_file(args: dict[str, Any]) -> dict[str, Any]:


 async def _handle_edit_file(args: dict[str, Any]) -> dict[str, Any]:
-    """Replace a substring in a sandbox file, with optional replace-all support."""
    file_path: str = args.get("file_path", "")
    old_string: str = args.get("old_string", "")
    new_string: str = args.get("new_string", "")
@@ -160,7 +172,6 @@ async def _handle_edit_file(args: dict[str, Any]) -> dict[str, Any]:


 async def _handle_glob(args: dict[str, Any]) -> dict[str, Any]:
-    """Find files matching a name pattern inside the sandbox using ``find``."""
    pattern: str = args.get("pattern", "")
    path: str = args.get("path", "")

@@ -172,7 +183,7 @@ async def _handle_glob(args: dict[str, Any]) -> dict[str, Any]:
        return _mcp("No E2B sandbox available", error=True)

    try:
-        search_dir = resolve_sandbox_path(path) if path else E2B_WORKDIR
+        search_dir = _resolve_remote(path) if path else E2B_WORKDIR
    except ValueError as exc:
        return _mcp(str(exc), error=True)

@@ -187,7 +198,6 @@ async def _handle_glob(args: dict[str, Any]) -> dict[str, Any]:


 async def _handle_grep(args: dict[str, Any]) -> dict[str, Any]:
-    """Search file contents by regex inside the sandbox using ``grep -rn``."""
    pattern: str = args.get("pattern", "")
    path: str = args.get("path", "")
    include: str = args.get("include", "")
@@ -200,7 +210,7 @@ async def _handle_grep(args: dict[str, Any]) -> dict[str, Any]:
        return _mcp("No E2B sandbox available", error=True)

    try:
-        search_dir = resolve_sandbox_path(path) if path else E2B_WORKDIR
+        search_dir = _resolve_remote(path) if path else E2B_WORKDIR
    except ValueError as exc:
        return _mcp(str(exc), error=True)

@@ -228,7 +238,7 @@ def _read_local(file_path: str, offset: int, limit: int) -> dict[str, Any]:
        return _mcp(f"Path not allowed: {file_path}", error=True)
    expanded = os.path.realpath(os.path.expanduser(file_path))
    try:
-        with open(expanded, encoding="utf-8", errors="replace") as fh:
+        with open(expanded) as fh:
            selected = list(itertools.islice(fh, offset, offset + limit))
        numbered = "".join(
            f"{i + offset + 1:>6}\t{line}" for i, line in enumerate(selected)
--- a/autogpt_platform/backend/backend/copilot/sdk/e2b_file_tools_test.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/e2b_file_tools_test.py
@@ -7,60 +7,59 @@ import os

 import pytest

-from backend.copilot.context import _current_project_dir
-
-from .e2b_file_tools import _read_local, resolve_sandbox_path
+from .e2b_file_tools import _read_local, _resolve_remote
+from .tool_adapter import _current_project_dir

 _SDK_PROJECTS_DIR = os.path.realpath(os.path.expanduser("~/.claude/projects"))


 # ---------------------------------------------------------------------------
-# resolve_sandbox_path — sandbox path normalisation & boundary enforcement
+# _resolve_remote — sandbox path normalisation & boundary enforcement
 # ---------------------------------------------------------------------------


-class TestResolveSandboxPath:
+class TestResolveRemote:
    def test_relative_path_resolved(self):
-        assert resolve_sandbox_path("src/main.py") == "/home/user/src/main.py"
+        assert _resolve_remote("src/main.py") == "/home/user/src/main.py"

    def test_absolute_within_sandbox(self):
-        assert resolve_sandbox_path("/home/user/file.txt") == "/home/user/file.txt"
+        assert _resolve_remote("/home/user/file.txt") == "/home/user/file.txt"

    def test_workdir_itself(self):
-        assert resolve_sandbox_path("/home/user") == "/home/user"
+        assert _resolve_remote("/home/user") == "/home/user"

    def test_relative_dotslash(self):
-        assert resolve_sandbox_path("./README.md") == "/home/user/README.md"
+        assert _resolve_remote("./README.md") == "/home/user/README.md"

    def test_traversal_blocked(self):
        with pytest.raises(ValueError, match="must be within /home/user"):
-            resolve_sandbox_path("../../etc/passwd")
+            _resolve_remote("../../etc/passwd")

    def test_absolute_traversal_blocked(self):
        with pytest.raises(ValueError, match="must be within /home/user"):
-            resolve_sandbox_path("/home/user/../../etc/passwd")
+            _resolve_remote("/home/user/../../etc/passwd")

    def test_absolute_outside_sandbox_blocked(self):
        with pytest.raises(ValueError, match="must be within /home/user"):
-            resolve_sandbox_path("/etc/passwd")
+            _resolve_remote("/etc/passwd")

    def test_root_blocked(self):
        with pytest.raises(ValueError, match="must be within /home/user"):
-            resolve_sandbox_path("/")
+            _resolve_remote("/")

    def test_home_other_user_blocked(self):
        with pytest.raises(ValueError, match="must be within /home/user"):
-            resolve_sandbox_path("/home/other/file.txt")
+            _resolve_remote("/home/other/file.txt")

    def test_deep_nested_allowed(self):
-        assert resolve_sandbox_path("a/b/c/d/e.txt") == "/home/user/a/b/c/d/e.txt"
+        assert _resolve_remote("a/b/c/d/e.txt") == "/home/user/a/b/c/d/e.txt"

    def test_trailing_slash_normalised(self):
-        assert resolve_sandbox_path("src/") == "/home/user/src"
+        assert _resolve_remote("src/") == "/home/user/src"

    def test_double_dots_within_sandbox_ok(self):
        """Path that resolves back within /home/user is allowed."""
-        assert resolve_sandbox_path("a/b/../c.txt") == "/home/user/a/c.txt"
+        assert _resolve_remote("a/b/../c.txt") == "/home/user/a/c.txt"


 # ---------------------------------------------------------------------------
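Note on `_resolve_remote`: the boundary check exercised by `TestResolveRemote` can be tried in isolation. The sketch below restates the same logic as a standalone function. `resolve_in_sandbox` and `SANDBOX_ROOT` are hypothetical names, and `/home/user` is assumed from the error messages in the tests. It uses `posixpath` rather than `os.path` so that it behaves the same on any host, on the assumption that sandbox paths are always POSIX paths.

```python
import posixpath

# Assumption: the sandbox working directory is /home/user, as the tests imply.
SANDBOX_ROOT = "/home/user"


def resolve_in_sandbox(path: str, root: str = SANDBOX_ROOT) -> str:
    """Return an absolute, normalised path under *root*, or raise ValueError."""
    candidate = path if posixpath.isabs(path) else posixpath.join(root, path)
    normalized = posixpath.normpath(candidate)
    # Compare against root + "/" so /home/username is not mistaken for /home/user.
    if normalized != root and not normalized.startswith(root + "/"):
        raise ValueError(f"Path must be within {root}: {path}")
    return normalized


def is_rejected(path: str) -> bool:
    """Helper for the examples: True when the path escapes the sandbox."""
    try:
        resolve_in_sandbox(path)
    except ValueError:
        return True
    return False


inside = resolve_in_sandbox("a/b/../c.txt")
escaped = is_rejected("/home/user/../../etc/passwd")
sibling = is_rejected("/home/username/file.txt")
```

`normpath` is purely lexical and does not follow symlinks, so a check of this shape says nothing about a symlink inside the sandbox that points outside it.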
--- a/autogpt_platform/backend/backend/copilot/sdk/e2e_compaction_test.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/e2e_compaction_test.py
@@ -1,531 +0,0 @@
-"""End-to-end compaction flow test.
-
-Simulates the full service.py compaction lifecycle using real-format
-JSONL session files — no SDK subprocess needed. Exercises:
-
-  1. TranscriptBuilder loads a "downloaded" transcript
-  2. User query appended, assistant response streamed
-  3. PreCompact hook fires → CompactionTracker.on_compact()
-  4. Next message → emit_start_if_ready() yields spinner events
-  5. Message after that → emit_end_if_ready() returns CompactionResult
-  6. read_compacted_entries() reads the CLI session file
-  7. TranscriptBuilder.replace_entries() syncs state
-  8. More messages appended post-compaction
-  9. to_jsonl() exports full state for upload
-  10. Fresh builder loads the export — roundtrip verified
-"""
-
-import asyncio
-
-from backend.copilot.model import ChatSession
-from backend.copilot.response_model import (
-    StreamFinishStep,
-    StreamStartStep,
-    StreamToolInputAvailable,
-    StreamToolInputStart,
-    StreamToolOutputAvailable,
-)
-from backend.copilot.sdk.compaction import CompactionTracker
-from backend.copilot.sdk.transcript import (
-    read_compacted_entries,
-    strip_progress_entries,
-)
-from backend.copilot.sdk.transcript_builder import TranscriptBuilder
-from backend.util import json
-
-
-def _make_jsonl(*entries: dict) -> str:
-    return "\n".join(json.dumps(e) for e in entries) + "\n"
-
-
-def _run(coro):
-    """Run an async coroutine synchronously."""
-    return asyncio.run(coro)
-
-
-# ---------------------------------------------------------------------------
-# Fixtures: realistic CLI session file content
-# ---------------------------------------------------------------------------
-
-# Pre-compaction conversation
-USER_1 = {
-    "type": "user",
-    "uuid": "u1",
-    "message": {"role": "user", "content": "What files are in this project?"},
-}
-ASST_1_THINKING = {
-    "type": "assistant",
-    "uuid": "a1-think",
-    "parentUuid": "u1",
-    "message": {
-        "role": "assistant",
-        "id": "msg_sdk_aaa",
-        "type": "message",
-        "content": [{"type": "thinking", "thinking": "Let me look at the files..."}],
-        "stop_reason": None,
-        "stop_sequence": None,
-    },
-}
-ASST_1_TOOL = {
-    "type": "assistant",
-    "uuid": "a1-tool",
-    "parentUuid": "u1",
-    "message": {
-        "role": "assistant",
-        "id": "msg_sdk_aaa",
-        "type": "message",
-        "content": [
-            {
-                "type": "tool_use",
-                "id": "tu1",
-                "name": "Bash",
-                "input": {"command": "ls"},
-            }
-        ],
-        "stop_reason": "tool_use",
-        "stop_sequence": None,
-    },
-}
-TOOL_RESULT_1 = {
-    "type": "user",
-    "uuid": "tr1",
-    "parentUuid": "a1-tool",
-    "message": {
-        "role": "user",
-        "content": [
-            {
-                "type": "tool_result",
-                "tool_use_id": "tu1",
-                "content": "file1.py\nfile2.py",
-            }
-        ],
-    },
-}
-ASST_1_TEXT = {
-    "type": "assistant",
-    "uuid": "a1-text",
-    "parentUuid": "tr1",
-    "message": {
-        "role": "assistant",
-        "id": "msg_sdk_bbb",
-        "type": "message",
-        "content": [{"type": "text", "text": "I found file1.py and file2.py."}],
-        "stop_reason": "end_turn",
-        "stop_sequence": None,
-    },
-}
-# Progress entries (should be stripped during upload)
-PROGRESS_1 = {
-    "type": "progress",
-    "uuid": "prog1",
-    "parentUuid": "a1-tool",
-    "data": {"type": "bash_progress", "stdout": "running ls..."},
-}
-# Second user message
-USER_2 = {
-    "type": "user",
-    "uuid": "u2",
-    "parentUuid": "a1-text",
-    "message": {"role": "user", "content": "Show me file1.py"},
-}
-ASST_2 = {
-    "type": "assistant",
-    "uuid": "a2",
-    "parentUuid": "u2",
-    "message": {
-        "role": "assistant",
-        "id": "msg_sdk_ccc",
-        "type": "message",
-        "content": [{"type": "text", "text": "Here is file1.py content..."}],
-        "stop_reason": "end_turn",
-        "stop_sequence": None,
-    },
-}
-
-# --- Compaction summary (written by CLI after context compaction) ---
-COMPACT_SUMMARY = {
-    "type": "summary",
-    "uuid": "cs1",
-    "isCompactSummary": True,
-    "message": {
-        "role": "user",
-        "content": (
-            "Summary: User asked about project files. Found file1.py and file2.py. "
-            "User then asked to see file1.py."
-        ),
-    },
-}
-
-# Post-compaction assistant response
-POST_COMPACT_ASST = {
-    "type": "assistant",
-    "uuid": "a3",
-    "parentUuid": "cs1",
-    "message": {
-        "role": "assistant",
-        "id": "msg_sdk_ddd",
-        "type": "message",
-        "content": [{"type": "text", "text": "Here is the content of file1.py..."}],
-        "stop_reason": "end_turn",
-        "stop_sequence": None,
-    },
-}
-
-# Post-compaction user follow-up
-USER_3 = {
-    "type": "user",
-    "uuid": "u3",
-    "parentUuid": "a3",
-    "message": {"role": "user", "content": "Now show file2.py"},
-}
-ASST_3 = {
-    "type": "assistant",
-    "uuid": "a4",
-    "parentUuid": "u3",
-    "message": {
-        "role": "assistant",
-        "id": "msg_sdk_eee",
-        "type": "message",
-        "content": [{"type": "text", "text": "Here is file2.py..."}],
-        "stop_reason": "end_turn",
-        "stop_sequence": None,
-    },
-}
-
-
-# ---------------------------------------------------------------------------
-# E2E test
-# ---------------------------------------------------------------------------
-
-
-class TestCompactionE2E:
-    def _write_session_file(self, session_dir, entries):
-        """Write a CLI session JSONL file."""
-        path = session_dir / "session.jsonl"
-        path.write_text(_make_jsonl(*entries))
-        return path
-
-    def test_full_compaction_lifecycle(self, tmp_path, monkeypatch):
-        """Simulate the complete service.py compaction flow.
-
-        Timeline:
-        1. Previous turn uploaded transcript with [USER_1, ASST_1, USER_2, ASST_2]
-        2. Current turn: download → load_previous
-        3. User sends "Now show file2.py" → append_user
-        4. SDK starts streaming response
-        5. Mid-stream: PreCompact hook fires (context too large)
-        6. CLI writes compaction summary to session file
-        7. Next SDK message → emit_start (spinner)
-        8. Following message → emit_end (CompactionResult)
-        9. read_compacted_entries reads the session file
-        10. replace_entries syncs TranscriptBuilder
-        11. More assistant messages appended
-        12. Export → upload → next turn downloads it
-        """
-        # --- Setup CLI projects directory ---
-        config_dir = tmp_path / "config"
-        projects_dir = config_dir / "projects"
-        session_dir = projects_dir / "proj"
-        session_dir.mkdir(parents=True)
-        monkeypatch.setenv("CLAUDE_CONFIG_DIR", str(config_dir))
-
-        # --- Step 1-2: Load "downloaded" transcript from previous turn ---
-        previous_transcript = _make_jsonl(
-            USER_1,
-            ASST_1_THINKING,
-            ASST_1_TOOL,
-            TOOL_RESULT_1,
-            ASST_1_TEXT,
-            USER_2,
-            ASST_2,
-        )
-        builder = TranscriptBuilder()
-        builder.load_previous(previous_transcript)
-        assert builder.entry_count == 7
-
-        # --- Step 3: User sends new query ---
-        builder.append_user("Now show file2.py")
-        assert builder.entry_count == 8
-
-        # --- Step 4: SDK starts streaming ---
-        builder.append_assistant(
-            [{"type": "thinking", "thinking": "Let me read file2.py..."}],
-            model="claude-sonnet-4-20250514",
-        )
-        assert builder.entry_count == 9
-
-        # --- Step 5-6: PreCompact fires, CLI writes session file ---
-        session_file = self._write_session_file(
-            session_dir,
-            [
-                USER_1,
-                ASST_1_THINKING,
-                ASST_1_TOOL,
-                PROGRESS_1,
-                TOOL_RESULT_1,
-                ASST_1_TEXT,
-                USER_2,
-                ASST_2,
-                COMPACT_SUMMARY,
-                POST_COMPACT_ASST,
-                USER_3,
-                ASST_3,
-            ],
-        )
-
-        # --- Step 7: CompactionTracker receives PreCompact hook ---
-        tracker = CompactionTracker()
-        session = ChatSession.new(user_id="test-user")
-        tracker.on_compact(str(session_file))
-
-        # --- Step 8: Next SDK message arrives → emit_start ---
-        start_events = tracker.emit_start_if_ready()
-        assert len(start_events) == 3
-        assert isinstance(start_events[0], StreamStartStep)
-        assert isinstance(start_events[1], StreamToolInputStart)
-        assert isinstance(start_events[2], StreamToolInputAvailable)
-
-        # Verify tool_call_id is set
-        tool_call_id = start_events[1].toolCallId
-        assert tool_call_id.startswith("compaction-")
-
-        # --- Step 9: Following message → emit_end ---
-        result = _run(tracker.emit_end_if_ready(session))
-        assert result.just_ended is True
-        assert result.transcript_path == str(session_file)
-        assert len(result.events) == 2
-        assert isinstance(result.events[0], StreamToolOutputAvailable)
-        assert isinstance(result.events[1], StreamFinishStep)
-        # Verify same tool_call_id
-        assert result.events[0].toolCallId == tool_call_id
-
-        # Session should have compaction messages persisted
-        assert len(session.messages) == 2
-        assert session.messages[0].role == "assistant"
-        assert session.messages[1].role == "tool"
-
-        # --- Step 10: read_compacted_entries + replace_entries ---
-        compacted = read_compacted_entries(str(session_file))
-        assert compacted is not None
-        # Should have: COMPACT_SUMMARY + POST_COMPACT_ASST + USER_3 + ASST_3
-        assert len(compacted) == 4
-        assert compacted[0]["uuid"] == "cs1"
-        assert compacted[0]["isCompactSummary"] is True
-
-        # Replace builder state with compacted entries
-        old_count = builder.entry_count
-        builder.replace_entries(compacted)
-        assert builder.entry_count == 4  # Only compacted entries
-        assert builder.entry_count < old_count  # Compaction reduced entries
-
-        # --- Step 11: More assistant messages after compaction ---
-        builder.append_assistant(
-            [{"type": "text", "text": "Here is file2.py:\n\ndef hello():\n    pass"}],
-            model="claude-sonnet-4-20250514",
-            stop_reason="end_turn",
-        )
-        assert builder.entry_count == 5
-
-        # --- Step 12: Export for upload ---
-        output = builder.to_jsonl()
-        assert output  # Not empty
-        output_entries = [json.loads(line) for line in output.strip().split("\n")]
-        assert len(output_entries) == 5
-
-        # Verify structure:
-        # [COMPACT_SUMMARY, POST_COMPACT_ASST, USER_3, ASST_3, new_assistant]
-        assert output_entries[0]["type"] == "summary"
-        assert output_entries[0].get("isCompactSummary") is True
-        assert output_entries[0]["uuid"] == "cs1"
-        assert output_entries[1]["uuid"] == "a3"
-        assert output_entries[2]["uuid"] == "u3"
-        assert output_entries[3]["uuid"] == "a4"
-        assert output_entries[4]["type"] == "assistant"
-
-        # Verify parent chain is intact
-        assert output_entries[1]["parentUuid"] == "cs1"  # a3 → cs1
-        assert output_entries[2]["parentUuid"] == "a3"  # u3 → a3
-        assert output_entries[3]["parentUuid"] == "u3"  # a4 → u3
-        assert output_entries[4]["parentUuid"] == "a4"  # new → a4
-
-        # --- Step 13: Roundtrip — next turn loads this export ---
-        builder2 = TranscriptBuilder()
-        builder2.load_previous(output)
-        assert builder2.entry_count == 5
-
-        # isCompactSummary survives roundtrip
-        output2 = builder2.to_jsonl()
-        first_entry = json.loads(output2.strip().split("\n")[0])
-        assert first_entry.get("isCompactSummary") is True
-
-        # Can append more messages
-        builder2.append_user("What about file3.py?")
-        assert builder2.entry_count == 6
-        final_output = builder2.to_jsonl()
-        last_entry = json.loads(final_output.strip().split("\n")[-1])
-        assert last_entry["type"] == "user"
-        # Parented to the last entry from previous turn
-        assert last_entry["parentUuid"] == output_entries[-1]["uuid"]
-
-    def test_double_compaction_within_session(self, tmp_path, monkeypatch):
-        """Two compactions in the same session (across reset_for_query)."""
-        config_dir = tmp_path / "config"
-        projects_dir = config_dir / "projects"
-        session_dir = projects_dir / "proj"
-        session_dir.mkdir(parents=True)
-        monkeypatch.setenv("CLAUDE_CONFIG_DIR", str(config_dir))
-
-        tracker = CompactionTracker()
-        session = ChatSession.new(user_id="test")
-        builder = TranscriptBuilder()
-
-        # --- First query with compaction ---
-        builder.append_user("first question")
-        builder.append_assistant([{"type": "text", "text": "first answer"}])
-
-        # Write session file for first compaction
-        first_summary = {
-            "type": "summary",
-            "uuid": "cs-first",
-            "isCompactSummary": True,
-            "message": {"role": "user", "content": "First compaction summary"},
-        }
-        first_post = {
-            "type": "assistant",
-            "uuid": "a-first",
-            "parentUuid": "cs-first",
-            "message": {"role": "assistant", "content": "first post-compact"},
-        }
-        file1 = session_dir / "session1.jsonl"
-        file1.write_text(_make_jsonl(first_summary, first_post))
-
-        tracker.on_compact(str(file1))
-        tracker.emit_start_if_ready()
-        result1 = _run(tracker.emit_end_if_ready(session))
-        assert result1.just_ended is True
-
-        compacted1 = read_compacted_entries(str(file1))
-        assert compacted1 is not None
-        builder.replace_entries(compacted1)
-        assert builder.entry_count == 2
-
-        # --- Reset for second query ---
-        tracker.reset_for_query()
-
-        # --- Second query with compaction ---
-        builder.append_user("second question")
-        builder.append_assistant([{"type": "text", "text": "second answer"}])
-
-        second_summary = {
-            "type": "summary",
-            "uuid": "cs-second",
-            "isCompactSummary": True,
-            "message": {"role": "user", "content": "Second compaction summary"},
-        }
-        second_post = {
-            "type": "assistant",
-            "uuid": "a-second",
-            "parentUuid": "cs-second",
-            "message": {"role": "assistant", "content": "second post-compact"},
-        }
-        file2 = session_dir / "session2.jsonl"
-        file2.write_text(_make_jsonl(second_summary, second_post))
-
-        tracker.on_compact(str(file2))
-        tracker.emit_start_if_ready()
-        result2 = _run(tracker.emit_end_if_ready(session))
-        assert result2.just_ended is True
-
-        compacted2 = read_compacted_entries(str(file2))
-        assert compacted2 is not None
-        builder.replace_entries(compacted2)
-        assert builder.entry_count == 2  # Only second compaction entries
-
-        # Export and verify
-        output = builder.to_jsonl()
-        entries = [json.loads(line) for line in output.strip().split("\n")]
-        assert entries[0]["uuid"] == "cs-second"
-        assert entries[0].get("isCompactSummary") is True
-
-    def test_strip_progress_then_load_then_compact_roundtrip(
-        self, tmp_path, monkeypatch
-    ):
-        """Full pipeline: strip → load → compact → replace → export → reload.
-
-        This tests the exact sequence that happens across two turns:
-        Turn 1: SDK produces transcript with progress entries
-        Upload: strip_progress_entries removes progress, upload to cloud
-        Turn 2: Download → load_previous → compaction fires → replace → export
-        Turn 3: Download the Turn 2 export → load_previous (roundtrip)
-        """
-        config_dir = tmp_path / "config"
-        projects_dir = config_dir / "projects"
-        session_dir = projects_dir / "proj"
-        session_dir.mkdir(parents=True)
-        monkeypatch.setenv("CLAUDE_CONFIG_DIR", str(config_dir))
-
-        # --- Turn 1: SDK produces raw transcript ---
-        raw_content = _make_jsonl(
-            USER_1,
-            ASST_1_THINKING,
-            ASST_1_TOOL,
-            PROGRESS_1,
-            TOOL_RESULT_1,
-            ASST_1_TEXT,
-            USER_2,
-            ASST_2,
-        )
-
-        # Strip progress for upload
-        stripped = strip_progress_entries(raw_content)
-        stripped_entries = [
-            json.loads(line) for line in stripped.strip().split("\n") if line.strip()
-        ]
-        # Progress should be gone
-        assert not any(e.get("type") == "progress" for e in stripped_entries)
-        assert len(stripped_entries) == 7  # 8 - 1 progress
-
-        # --- Turn 2: Download stripped, load, compaction happens ---
-        builder = TranscriptBuilder()
-        builder.load_previous(stripped)
-        assert builder.entry_count == 7
-
-        builder.append_user("Now show file2.py")
-        builder.append_assistant(
-            [{"type": "text", "text": "Reading file2.py..."}],
-            model="claude-sonnet-4-20250514",
-        )
-
-        # CLI writes session file with compaction
-        session_file = self._write_session_file(
-            session_dir,
-            [
-                USER_1,
-                ASST_1_TOOL,
-                TOOL_RESULT_1,
-                ASST_1_TEXT,
-                USER_2,
-                ASST_2,
-                COMPACT_SUMMARY,
-                POST_COMPACT_ASST,
-            ],
-        )
-
-        compacted = read_compacted_entries(str(session_file))
-        assert compacted is not None
-        builder.replace_entries(compacted)
-
-        # Append post-compaction message
-        builder.append_user("Thanks!")
-        output = builder.to_jsonl()
-
-        # --- Turn 3: Fresh load of Turn 2 export ---
-        builder3 = TranscriptBuilder()
-        builder3.load_previous(output)
-        # Should have: compact_summary + post_compact_asst + "Thanks!"
-        assert builder3.entry_count == 3
-
-        # Compact summary survived the full pipeline
-        first = json.loads(builder3.to_jsonl().strip().split("\n")[0])
-        assert first.get("isCompactSummary") is True
-        assert first["type"] == "summary"
--- a/autogpt_platform/backend/backend/copilot/sdk/file_ref.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/file_ref.py
@@ -1,715 +0,0 @@
-"""File reference protocol for tool call inputs.
-
-Allows the LLM to pass a file reference instead of embedding large content
-inline.  The processor expands ``@@agptfile:<uri>[<start>-<end>]`` tokens in tool
-arguments before the tool is executed.
-
-Protocol
--------
-
-    @@agptfile:<uri>[<start>-<end>]
-
-``<uri>`` (required)
-    - ``workspace://<file_id>`` — workspace file by ID
-    - ``workspace://<file_id>#<mime>`` — same, MIME hint is ignored for reads
-    - ``workspace:///<path>`` — workspace file by virtual path
-    - ``/absolute/local/path`` — ephemeral or sdk_cwd file (validated by
-      :func:`~backend.copilot.sdk.tool_adapter.is_allowed_local_path`)
-    - Any absolute path that resolves inside the E2B sandbox
-      (``/home/user/...``) when a sandbox is active
-
-``[<start>-<end>]`` (optional)
-    Line range, 1-indexed inclusive.  Examples: ``[1-100]``, ``[50-200]``.
-    Omit to read the entire file.
-
-Examples
--------
-    @@agptfile:workspace://abc123
-    @@agptfile:workspace://abc123[10-50]
-    @@agptfile:workspace:///reports/q1.md
-    @@agptfile:/tmp/copilot-<session>/output.py[1-80]
-    @@agptfile:/home/user/script.sh
-"""
-
-import itertools
-import logging
-import os
-import re
-from dataclasses import dataclass
-from typing import Any
-
-from backend.copilot.context import (
-    get_current_sandbox,
-    get_sdk_cwd,
-    get_workspace_manager,
-    is_allowed_local_path,
-    resolve_sandbox_path,
-)
-from backend.copilot.model import ChatSession
-from backend.util.file import parse_workspace_uri
-from backend.util.file_content_parser import (
-    BINARY_FORMATS,
-    MIME_TO_FORMAT,
-    PARSE_EXCEPTIONS,
-    infer_format_from_uri,
-    parse_file_content,
-)
-from backend.util.type import MediaFileType
-
-
-class FileRefExpansionError(Exception):
-    """Raised when a ``@@agptfile:`` reference in tool call args fails to resolve.
-
-    Separating this from inline substitution lets callers (e.g. the MCP tool
-    wrapper) block tool execution and surface a helpful error to the model
-    rather than passing an ``[file-ref error: …]`` string as actual input.
-    """
-
-
-logger = logging.getLogger(__name__)
-
-FILE_REF_PREFIX = "@@agptfile:"
-
-# Matches:  @@agptfile:<uri>[start-end]?
-#   Group 1 – URI; must start with '/' (absolute path) or 'workspace://'
-#   Group 2 – start line (optional)
-#   Group 3 – end line (optional)
-_FILE_REF_RE = re.compile(
-    re.escape(FILE_REF_PREFIX) + r"((?:workspace://|/)[^\[\s]*)(?:\[(\d+)-(\d+)\])?"
-)
-
-# Maximum characters returned for a single file reference expansion.
-_MAX_EXPAND_CHARS = 200_000
-# Maximum total characters across all @@agptfile: expansions in one string.
-_MAX_TOTAL_EXPAND_CHARS = 1_000_000
-# Maximum raw byte size for bare ref structured parsing (10 MB).
-_MAX_BARE_REF_BYTES = 10_000_000
-
-
-@dataclass
-class FileRef:
-    uri: str
-    start_line: int | None  # 1-indexed, inclusive
-    end_line: int | None  # 1-indexed, inclusive
-
-
-# ---------------------------------------------------------------------------
-# Public API  (top-down: main functions first, helpers below)
-# ---------------------------------------------------------------------------
-
-
-def parse_file_ref(text: str) -> FileRef | None:
-    """Return a :class:`FileRef` if *text* is a bare file reference token.
-
-    A "bare token" means the entire string matches the ``@@agptfile:...`` pattern
-    (after stripping whitespace).  Use :func:`expand_file_refs_in_string` to
-    expand references embedded in larger strings.
-    """
-    m = _FILE_REF_RE.fullmatch(text.strip())
-    if not m:
-        return None
-    start = int(m.group(2)) if m.group(2) else None
-    end = int(m.group(3)) if m.group(3) else None
-    if start is not None and start < 1:
-        return None
-    if end is not None and end < 1:
-        return None
-    if start is not None and end is not None and end < start:
-        return None
-    return FileRef(uri=m.group(1), start_line=start, end_line=end)
-
-
-async def read_file_bytes(
-    uri: str,
-    user_id: str | None,
-    session: ChatSession,
-) -> bytes:
-    """Resolve *uri* to raw bytes using workspace, local, or E2B path logic.
-
-    Raises :class:`ValueError` if the URI cannot be resolved.
-    """
-    # Strip MIME fragment (e.g. workspace://id#mime) before dispatching.
-    plain = uri.split("#")[0] if uri.startswith("workspace://") else uri
-
-    if plain.startswith("workspace://"):
-        if not user_id:
-            raise ValueError("workspace:// file references require authentication")
-        manager = await get_workspace_manager(user_id, session.session_id)
-        ws = parse_workspace_uri(plain)
-        try:
-            data = await (
-                manager.read_file(ws.file_ref)
-                if ws.is_path
-                else manager.read_file_by_id(ws.file_ref)
-            )
-        except FileNotFoundError as exc:
-            raise ValueError(f"File not found: {plain}") from exc
-        except (PermissionError, OSError) as exc:
-            raise ValueError(f"Failed to read {plain}: {exc}") from exc
-        except (AttributeError, TypeError, RuntimeError) as exc:
-            # AttributeError/TypeError: workspace manager returned an
-            # unexpected type or interface; RuntimeError: async runtime issues.
-            logger.warning("Unexpected error reading %s: %s", plain, exc)
-            raise ValueError(f"Failed to read {plain}: {exc}") from exc
-        # NOTE: Workspace API does not support pre-read size checks;
-        # the full file is loaded before the size guard below.
-        if len(data) > _MAX_BARE_REF_BYTES:
-            raise ValueError(
-                f"File too large ({len(data)} bytes, limit {_MAX_BARE_REF_BYTES})"
-            )
-        return data
-
-    if is_allowed_local_path(plain, get_sdk_cwd()):
-        resolved = os.path.realpath(os.path.expanduser(plain))
-        try:
-            # Read with a one-byte overshoot to detect files that exceed the limit
-            # without a separate os.path.getsize call (avoids TOCTOU race).
-            with open(resolved, "rb") as fh:
-                data = fh.read(_MAX_BARE_REF_BYTES + 1)
-            if len(data) > _MAX_BARE_REF_BYTES:
-                raise ValueError(
-                    f"File too large (>{_MAX_BARE_REF_BYTES} bytes, "
-                    f"limit {_MAX_BARE_REF_BYTES})"
-                )
-            return data
-        except FileNotFoundError as exc:
-            raise ValueError(f"File not found: {plain}") from exc
-        except OSError as exc:
-            raise ValueError(f"Failed to read {plain}: {exc}") from exc
-
-    sandbox = get_current_sandbox()
-    if sandbox is not None:
-        try:
-            remote = resolve_sandbox_path(plain)
-        except ValueError as exc:
-            raise ValueError(
-                f"Path is not allowed (not in workspace, sdk_cwd, or sandbox): {plain}"
-            ) from exc
-        try:
-            data = bytes(await sandbox.files.read(remote, format="bytes"))
-        except (FileNotFoundError, OSError, UnicodeDecodeError) as exc:
-            raise ValueError(f"Failed to read from sandbox: {plain}: {exc}") from exc
-        except Exception as exc:
-            # E2B SDK raises SandboxException subclasses (NotFoundException,
-            # TimeoutException, NotEnoughSpaceException, etc.) which don't
-            # inherit from standard exceptions.  Import lazily to avoid a
-            # hard dependency on e2b at module level.
-            try:
-                from e2b.exceptions import SandboxException  # noqa: PLC0415
-
-                if isinstance(exc, SandboxException):
-                    raise ValueError(
-                        f"Failed to read from sandbox: {plain}: {exc}"
-                    ) from exc
-            except ImportError:
-                pass
-            # Re-raise unexpected exceptions (TypeError, AttributeError, etc.)
-            # so they surface as real bugs rather than being silently masked.
-            raise
-        # NOTE: E2B sandbox API does not support pre-read size checks;
-        # the full file is loaded before the size guard below.
-        if len(data) > _MAX_BARE_REF_BYTES:
-            raise ValueError(
-                f"File too large ({len(data)} bytes, limit {_MAX_BARE_REF_BYTES})"
-            )
-        return data
-
-    raise ValueError(
-        f"Path is not allowed (not in workspace, sdk_cwd, or sandbox): {plain}"
-    )
-
-
-async def resolve_file_ref(
-    ref: FileRef,
-    user_id: str | None,
-    session: ChatSession,
-) -> str:
-    """Resolve a :class:`FileRef` to its text content."""
-    raw = await read_file_bytes(ref.uri, user_id, session)
-    return _apply_line_range(_to_str(raw), ref.start_line, ref.end_line)
-
-
-async def expand_file_refs_in_string(
-    text: str,
-    user_id: str | None,
-    session: ChatSession,
-    *,
-    raise_on_error: bool = False,
-) -> str:
-    """Expand all ``@@agptfile:...`` tokens in *text*, returning the substituted string.
-
-    Non-reference text is passed through unchanged.
-
-    If *raise_on_error* is ``False`` (default), expansion errors are surfaced
-    inline as ``[file-ref error: <message>]`` — useful for display/log contexts
-    where partial expansion is acceptable.
-
-    If *raise_on_error* is ``True``, any resolution failure raises
-    :class:`FileRefExpansionError` immediately so the caller can block the
-    operation and surface a clean error to the model.
-    """
-    if FILE_REF_PREFIX not in text:
-        return text
-
-    result: list[str] = []
-    last_end = 0
-    total_chars = 0
-    for m in _FILE_REF_RE.finditer(text):
-        result.append(text[last_end : m.start()])
-        start = int(m.group(2)) if m.group(2) else None
-        end = int(m.group(3)) if m.group(3) else None
-        if (start is not None and start < 1) or (end is not None and end < 1):
-            msg = f"line numbers must be >= 1: {m.group(0)}"
-            if raise_on_error:
-                raise FileRefExpansionError(msg)
-            result.append(f"[file-ref error: {msg}]")
-            last_end = m.end()
-            continue
-        if start is not None and end is not None and end < start:
-            msg = f"end line must be >= start line: {m.group(0)}"
-            if raise_on_error:
-                raise FileRefExpansionError(msg)
-            result.append(f"[file-ref error: {msg}]")
-            last_end = m.end()
-            continue
-        ref = FileRef(uri=m.group(1), start_line=start, end_line=end)
-        try:
-            content = await resolve_file_ref(ref, user_id, session)
-            if len(content) > _MAX_EXPAND_CHARS:
-                content = content[:_MAX_EXPAND_CHARS] + "\n... [truncated]"
-            remaining = _MAX_TOTAL_EXPAND_CHARS - total_chars
-            # remaining == 0 means the budget was exactly exhausted by the
-            # previous ref.  The elif below (len > remaining) won't catch
-            # this since 0 > 0 is false, so we need the <= 0 check.
-            if remaining <= 0:
-                content = "[file-ref budget exhausted: total expansion limit reached]"
-            elif len(content) > remaining:
-                content = content[:remaining] + "\n... [total budget exhausted]"
-            total_chars += len(content)
-            result.append(content)
-        except ValueError as exc:
-            logger.warning("file-ref expansion failed for %r: %s", m.group(0), exc)
-            if raise_on_error:
-                raise FileRefExpansionError(str(exc)) from exc
-            result.append(f"[file-ref error: {exc}]")
-        last_end = m.end()
-
-    result.append(text[last_end:])
-    return "".join(result)
-
-
-async def expand_file_refs_in_args(
-    args: dict[str, Any],
-    user_id: str | None,
-    session: ChatSession,
-    *,
-    input_schema: dict[str, Any] | None = None,
-) -> dict[str, Any]:
-    """Recursively expand ``@@agptfile:...`` references in tool call arguments.
-
-    String values are expanded in-place.  Nested dicts and lists are
-    traversed.  Non-string scalars are returned unchanged.
-
-    **Bare references** (the entire argument value is a single
-    ``@@agptfile:...`` token with no surrounding text) are resolved and then
-    parsed according to the file's extension or MIME type.  See
-    :mod:`backend.util.file_content_parser` for the full list of supported
-    formats (JSON, JSONL, CSV, TSV, YAML, TOML, Parquet, Excel).
-
-    When *input_schema* is provided and the target property has
-    ``"type": "string"``, structured parsing is skipped — the raw file content
-    is returned as a plain string so blocks receive the original text.
-
-    If the format is unrecognised or parsing fails, the content is returned as
-    a plain string (the fallback).
-
-    **Embedded references** (``@@agptfile:`` mixed with other text) always
-    produce a plain string — structured parsing only applies to bare refs.
-
-    Raises :class:`FileRefExpansionError` if any reference fails to resolve,
-    so the tool is *not* executed with an error string as its input.  The
-    caller (the MCP tool wrapper) should convert this into an MCP error
-    response that lets the model correct the reference before retrying.
-    """
-    if not args:
-        return args
-
-    properties = (input_schema or {}).get("properties", {})
-
-    async def _expand(
-        value: Any,
-        *,
-        prop_schema: dict[str, Any] | None = None,
-    ) -> Any:
-        """Recursively expand a single argument value.
-
-        Strings are checked for ``@@agptfile:`` references and expanded
-        (bare refs get structured parsing; embedded refs get inline
-        substitution).  Dicts and lists are traversed recursively,
-        threading the corresponding sub-schema from *prop_schema* so
-        that nested fields also receive correct type-aware expansion.
-        Non-string scalars pass through unchanged.
-        """
-        if isinstance(value, str):
-            ref = parse_file_ref(value)
-            if ref is not None:
-                # MediaFileType fields: return the raw URI immediately —
-                # no file reading, no format inference, no content parsing.
-                if _is_media_file_field(prop_schema):
-                    return ref.uri
-
-                fmt = infer_format_from_uri(ref.uri)
-                # Workspace URIs by ID (workspace://abc123) have no extension.
-                # When the MIME fragment is also missing, fall back to the
-                # workspace file manager's metadata for format detection.
-                if fmt is None and ref.uri.startswith("workspace://"):
-                    fmt = await _infer_format_from_workspace(ref.uri, user_id, session)
-                return await _expand_bare_ref(ref, fmt, user_id, session, prop_schema)
-
-            # Not a bare ref — do normal inline expansion.
-            return await expand_file_refs_in_string(
-                value, user_id, session, raise_on_error=True
-            )
-        if isinstance(value, dict):
-            # When the schema says this is an object but doesn't define
-            # inner properties, skip expansion — the caller (e.g.
-            # RunBlockTool) will expand with the actual nested schema.
-            if (
-                prop_schema is not None
-                and prop_schema.get("type") == "object"
-                and "properties" not in prop_schema
-            ):
-                return value
-            nested_props = (prop_schema or {}).get("properties", {})
-            return {
-                k: await _expand(v, prop_schema=nested_props.get(k))
-                for k, v in value.items()
-            }
-        if isinstance(value, list):
-            items_schema = (prop_schema or {}).get("items")
-            return [await _expand(item, prop_schema=items_schema) for item in value]
-        return value
-
-    return {k: await _expand(v, prop_schema=properties.get(k)) for k, v in args.items()}
-
-
-# ---------------------------------------------------------------------------
-# Private helpers  (used by the public functions above)
-# ---------------------------------------------------------------------------
-
-
-def _apply_line_range(text: str, start: int | None, end: int | None) -> str:
-    """Slice *text* to the requested 1-indexed line range (inclusive).
-
-    When the requested range extends beyond the file, a note is appended
-    so the LLM knows it received the entire remaining content.
-    """
-    if start is None and end is None:
-        return text
-    lines = text.splitlines(keepends=True)
-    total = len(lines)
-    s = (start - 1) if start is not None else 0
-    e = end if end is not None else total
-    selected = list(itertools.islice(lines, s, e))
-    result = "".join(selected)
-    if end is not None and end > total:
-        result += f"\n[Note: file has only {total} lines]\n"
-    return result
-
-
-def _to_str(content: str | bytes) -> str:
-    """Decode *content* to a string if it is bytes, otherwise return as-is."""
-    if isinstance(content, str):
-        return content
-    return content.decode("utf-8", errors="replace")
-
-
-def _check_content_size(content: str | bytes) -> None:
-    """Raise :class:`ValueError` if *content* exceeds the byte limit.
-
-    Raises ``ValueError`` (not ``FileRefExpansionError``) so that the caller
-    (``_expand_bare_ref``) can unify all resolution errors into a single
-    ``except ValueError`` → ``FileRefExpansionError`` handler, keeping the
-    error-flow consistent with ``read_file_bytes`` and ``resolve_file_ref``.
-
-    For ``bytes``, the length is the byte count directly.  For ``str``,
-    we encode to UTF-8 first because multi-byte characters (e.g. emoji)
-    mean the byte size can be up to 4x the character count.
-    """
-    if isinstance(content, bytes):
-        size = len(content)
-    else:
-        char_len = len(content)
-        # Fast lower bound: UTF-8 byte count >= char count.
-        # If char count already exceeds the limit, reject immediately
-        # without allocating an encoded copy.
-        if char_len > _MAX_BARE_REF_BYTES:
-            size = char_len  # real byte size is even larger
-        # Fast upper bound: each char is at most 4 UTF-8 bytes.
-        # If worst-case is still under the limit, skip encoding entirely.
-        elif char_len * 4 <= _MAX_BARE_REF_BYTES:
-            return
-        else:
-            # Edge case: char count is under limit but multibyte chars
-            # might push byte count over. Encode to get exact size.
-            size = len(content.encode("utf-8"))
-    if size > _MAX_BARE_REF_BYTES:
-        raise ValueError(
-            f"File too large for structured parsing "
-            f"({size} bytes, limit {_MAX_BARE_REF_BYTES})"
-        )
-
-
-async def _infer_format_from_workspace(
-    uri: str,
-    user_id: str | None,
-    session: ChatSession,
-) -> str | None:
-    """Look up workspace file metadata to infer the format.
-
-    Workspace URIs by ID (``workspace://abc123``) have no file extension.
-    When the MIME fragment is also absent, we query the workspace file
-    manager for the file's stored MIME type and original filename.
-    """
-    if not user_id:
-        return None
-    try:
-        ws = parse_workspace_uri(uri)
-        manager = await get_workspace_manager(user_id, session.session_id)
-        info = await (
-            manager.get_file_info(ws.file_ref)
-            if not ws.is_path
-            else manager.get_file_info_by_path(ws.file_ref)
-        )
-        if info is None:
-            return None
-        # Try MIME type first, then filename extension.
-        mime = (info.mime_type or "").split(";", 1)[0].strip().lower()
-        return MIME_TO_FORMAT.get(mime) or infer_format_from_uri(info.name)
-    except (
-        ValueError,
-        FileNotFoundError,
-        OSError,
-        PermissionError,
-        AttributeError,
-        TypeError,
-    ):
-        # Expected failures: bad URI, missing file, permission denied, or
-        # workspace manager returning unexpected types.  Propagate anything
-        # else (e.g. programming errors) so they don't get silently swallowed.
-        logger.debug("workspace metadata lookup failed for %s", uri, exc_info=True)
-        return None
-
-
-def _is_media_file_field(prop_schema: dict[str, Any] | None) -> bool:
-    """Return True if *prop_schema* describes a MediaFileType field (format: file)."""
-    if prop_schema is None:
-        return False
-    return (
-        prop_schema.get("type") == "string"
-        and prop_schema.get("format") == MediaFileType.string_format
-    )
-
-
-async def _expand_bare_ref(
-    ref: FileRef,
-    fmt: str | None,
-    user_id: str | None,
-    session: ChatSession,
-    prop_schema: dict[str, Any] | None,
-) -> Any:
-    """Resolve and parse a bare ``@@agptfile:`` reference.
-
-    This is the structured-parsing path: the file is read, optionally parsed
-    according to *fmt*, and adapted to the target *prop_schema*.
-
-    Raises :class:`FileRefExpansionError` on resolution or parsing failure.
-
-    Note: MediaFileType fields (format: "file") are handled earlier in
-    ``_expand`` to avoid unnecessary format inference and file I/O.
-    """
-    try:
-        if fmt is not None and fmt in BINARY_FORMATS:
-            # Binary formats need raw bytes, not UTF-8 text.
-            # Line ranges are meaningless for binary formats (parquet/xlsx),
-            # so ignore them and parse the full bytes.  Log a warning so the
-            # dropped range is at least visible in the logs.
-            if ref.start_line is not None or ref.end_line is not None:
-                logger.warning(
-                    "Line range [%s-%s] ignored for binary format %s (%s); "
-                    "binary formats are always parsed in full.",
-                    ref.start_line,
-                    ref.end_line,
-                    fmt,
-                    ref.uri,
-                )
-            content: str | bytes = await read_file_bytes(ref.uri, user_id, session)
-        else:
-            content = await resolve_file_ref(ref, user_id, session)
-    except ValueError as exc:
-        raise FileRefExpansionError(str(exc)) from exc
-
-    # For known formats this rejects files >10 MB before parsing.
-    # For unknown formats _MAX_EXPAND_CHARS (200K chars) below is stricter,
-    # but this check still guards the parsing path which has no char limit.
-    # _check_content_size raises ValueError, which we unify here just like
-    # resolution errors above.
-    try:
-        _check_content_size(content)
-    except ValueError as exc:
-        raise FileRefExpansionError(str(exc)) from exc
-
-    # When the schema declares this parameter as "string",
-    # return raw file content — don't parse into a structured
-    # type that would need json.dumps() serialisation.
-    expect_string = (prop_schema or {}).get("type") == "string"
-    if expect_string:
-        if isinstance(content, bytes):
-            raise FileRefExpansionError(
-                f"Cannot use {fmt} file as text input: "
-                f"binary formats (parquet, xlsx) must be passed "
-                f"to a block that accepts structured data (list/object), "
-                f"not a string-typed parameter."
-            )
-        return content
-
-    if fmt is not None:
-        # Use strict mode for binary formats so we surface the
-        # actual error (e.g. missing pyarrow/openpyxl, corrupt
-        # file) instead of silently returning garbled bytes.
-        strict = fmt in BINARY_FORMATS
-        try:
-            parsed = parse_file_content(content, fmt, strict=strict)
-        except PARSE_EXCEPTIONS as exc:
-            raise FileRefExpansionError(f"Failed to parse {fmt} file: {exc}") from exc
-        # Normalize bytes fallback to str so tools never
-        # receive raw bytes when parsing fails.
-        if isinstance(parsed, bytes):
-            parsed = _to_str(parsed)
-        return _adapt_to_schema(parsed, prop_schema)
-
-    # Unknown format — return as plain string, but apply
-    # the same per-ref character limit used by inline refs
-    # to prevent injecting unexpectedly large content.
-    text = _to_str(content)
-    if len(text) > _MAX_EXPAND_CHARS:
-        text = text[:_MAX_EXPAND_CHARS] + "\n... [truncated]"
-    return text
-
-
-def _adapt_to_schema(parsed: Any, prop_schema: dict[str, Any] | None) -> Any:
-    """Adapt a parsed file value to better fit the target schema type.
-
-    When the parser returns a natural type (e.g. dict from YAML, list from CSV)
-    that doesn't match the block's expected type, this function converts it to
-    a more useful representation instead of relying on pydantic's generic
-    coercion (which can produce awkward results like flattened dicts → lists).
-
-    Returns *parsed* unchanged when no adaptation is needed.
-    """
-    if prop_schema is None:
-        return parsed
-
-    target_type = prop_schema.get("type")
-
-    # Dict → array: delegate to helper.
-    if isinstance(parsed, dict) and target_type == "array":
-        return _adapt_dict_to_array(parsed, prop_schema)
-
-    # List → object: delegate to helper (raises for non-tabular lists).
-    if isinstance(parsed, list) and target_type == "object":
-        return _adapt_list_to_object(parsed)
-
-    # Tabular list → Any (no type): convert to list of dicts.
-    # Blocks like FindInDictionaryBlock have `input: Any` which produces
-    # a schema with no "type" key.  Tabular [[header],[rows]] is unusable
-    # for key lookup, but [{col: val}, ...] works with FindInDict's
-    # list-of-dicts branch (lines 195-199 in data_manipulation.py).
-    if isinstance(parsed, list) and target_type is None and _is_tabular(parsed):
-        return _tabular_to_list_of_dicts(parsed)
-
-    return parsed
-
-
-def _adapt_dict_to_array(parsed: dict, prop_schema: dict[str, Any]) -> Any:
-    """Adapt a parsed dict to an array-typed field.
-
-    Extracts list-valued entries when the target item type is ``array``,
-    passes through unchanged when item type is ``string`` (lets pydantic error),
-    or wraps in ``[parsed]`` as a fallback.
-    """
-    items_type = (prop_schema.get("items") or {}).get("type")
-    if items_type == "array":
-        # Target is List[List[Any]] — extract list-typed values from the
-        # dict as inner lists.  E.g. YAML {"fruits": [{...},...]} with
-        # ConcatenateLists (List[List[Any]]) → [[{...},...]].
-        list_values = [v for v in parsed.values() if isinstance(v, list)]
-        if list_values:
-            return list_values
-    if items_type == "string":
-        # Target is List[str] — wrapping a dict would give [dict]
-        # which can't coerce to strings.  Return unchanged and let
-        # pydantic surface a clear validation error.
-        return parsed
-    # Fallback: wrap in a single-element list so the block gets [dict]
-    # instead of pydantic flattening keys/values into a flat list.
-    return [parsed]
-
-
-def _adapt_list_to_object(parsed: list) -> Any:
-    """Adapt a parsed list to an object-typed field.
-
-    Converts tabular lists to column-dicts; raises for non-tabular lists.
-    """
-    if _is_tabular(parsed):
-        return _tabular_to_column_dict(parsed)
-    # Non-tabular list (e.g. a plain Python list from a YAML file) cannot
-    # be meaningfully coerced to an object.  Raise explicitly so callers
-    # get a clear error rather than pydantic silently wrapping the list.
-    raise FileRefExpansionError(
-        "Cannot adapt a non-tabular list to an object-typed field. "
-        "Expected a tabular structure ([[header], [row1], ...]) or a dict."
-    )
-
-
-def _is_tabular(parsed: Any) -> bool:
-    """Check if parsed data is in tabular format: [[header], [row1], ...].
-
-    Uses isinstance checks because this is a structural type guard on
-    opaque parser output (Any), not duck typing.  A Protocol wouldn't
-    help here — we need to verify exact list-of-lists shape.
-    """
-    if not isinstance(parsed, list) or len(parsed) < 2:
-        return False
-    header = parsed[0]
-    if not isinstance(header, list) or not header:
-        return False
-    if not all(isinstance(h, str) for h in header):
-        return False
-    return all(isinstance(row, list) for row in parsed[1:])
-
-
-def _tabular_to_list_of_dicts(parsed: list) -> list[dict[str, Any]]:
-    """Convert [[header], [row1], ...] → [{header[0]: row[0], ...}, ...].
-
-    Ragged rows (fewer columns than the header) get None for missing values.
-    Extra values beyond the header length are silently dropped.
-    """
-    header = parsed[0]
-    return [
-        dict(itertools.zip_longest(header, row[: len(header)], fillvalue=None))
-        for row in parsed[1:]
-    ]
-
-
-def _tabular_to_column_dict(parsed: list) -> dict[str, list]:
-    """Convert [[header], [row1], ...] → {"col1": [val1, ...], ...}.
-
-    Ragged rows (fewer columns than the header) get None for missing values,
-    ensuring all columns have equal length.
-    """
-    header = parsed[0]
-    return {
-        col: [row[i] if i < len(row) else None for row in parsed[1:]]
-        for i, col in enumerate(header)
-    }
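The removed `file_ref.py` above can be summarised with a small standalone sketch. It reuses the token pattern shown in the diff, but the function names `parse_token`, `slice_lines`, and `rows_to_dicts` are hypothetical stand-ins for `parse_file_ref`, `_apply_line_range`, and `_tabular_to_list_of_dicts`. It also assumes plain tuples instead of the `FileRef` dataclass and omits the "file has only N lines" note, so treat it as an illustration of the protocol rather than the module's actual code.

```python
import itertools
import re

FILE_REF_PREFIX = "@@agptfile:"
# Same pattern as _FILE_REF_RE in the diff: the URI must start with '/' or
# 'workspace://', optionally followed by a [start-end] line range.
FILE_REF_RE = re.compile(
    re.escape(FILE_REF_PREFIX) + r"((?:workspace://|/)[^\[\s]*)(?:\[(\d+)-(\d+)\])?"
)


def parse_token(text):
    """Return (uri, start, end) if *text* is a bare token, else None."""
    m = FILE_REF_RE.fullmatch(text.strip())
    if not m:
        return None
    start = int(m.group(2)) if m.group(2) else None
    end = int(m.group(3)) if m.group(3) else None
    # The range is all-or-nothing in the pattern, so checking start suffices.
    if start is not None and (start < 1 or end < 1 or end < start):
        return None
    return m.group(1), start, end


def slice_lines(text, start, end):
    """Keep the 1-indexed, inclusive line range [start, end] of *text*."""
    if start is None and end is None:
        return text
    lines = text.splitlines(keepends=True)
    first = (start - 1) if start is not None else 0
    last = end if end is not None else len(lines)
    return "".join(lines[first:last])


def rows_to_dicts(table):
    """Convert [[header], [row1], ...] into a list of per-row dicts.

    Short rows are padded with None; extra cells are dropped.
    """
    header = table[0]
    return [
        dict(itertools.zip_longest(header, row[: len(header)], fillvalue=None))
        for row in table[1:]
    ]


if __name__ == "__main__":
    print(parse_token("@@agptfile:workspace://abc123[10-50]"))
    print(parse_token("see @@agptfile:/tmp/x"))  # embedded, so not a bare token
    print(repr(slice_lines("a\nb\nc\nd\n", 2, 3)))
    print(rows_to_dicts([["name", "qty"], ["apple", 3], ["pear"]]))
```

Using `fullmatch` is what separates a bare reference (eligible for structured parsing) from an embedded one, which the module expands with `finditer` and always returns as a plain string.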
--- a/autogpt_platform/backend/backend/copilot/sdk/file_ref_integration_test.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/file_ref_integration_test.py
@@ -1,521 +0,0 @@
-"""Integration tests for @@agptfile: reference expansion in tool calls.
-
-These tests verify the end-to-end behaviour of the file reference protocol:
-- Parsing @@agptfile: tokens from tool arguments
-- Resolving local-filesystem paths (sdk_cwd / ephemeral)
-- Expanding references inside the tool-call pipeline (_execute_tool_sync)
-- The extended Read tool handler (workspace:// pass-through via session context)
-
-No real LLM or database is required; workspace reads are stubbed where needed.
-"""
-
-from __future__ import annotations
-
-import os
-import tempfile
-from unittest.mock import AsyncMock, MagicMock, patch
-
-import pytest
-
-from backend.copilot.sdk.file_ref import (
-    FileRef,
-    expand_file_refs_in_args,
-    expand_file_refs_in_string,
-    read_file_bytes,
-    resolve_file_ref,
-)
-from backend.copilot.sdk.tool_adapter import _read_file_handler
-
-# ---------------------------------------------------------------------------
-# Helpers
-# ---------------------------------------------------------------------------
-
-
-def _make_session(session_id: str = "integ-sess") -> MagicMock:
-    s = MagicMock()
-    s.session_id = session_id
-    return s
-
-
-# ---------------------------------------------------------------------------
-# Local-file resolution (sdk_cwd)
-# ---------------------------------------------------------------------------
-
-
-@pytest.mark.asyncio
-async def test_resolve_file_ref_local_path():
-    """resolve_file_ref reads a real local file when it's within sdk_cwd."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        # Write a test file inside sdk_cwd
-        test_file = os.path.join(sdk_cwd, "hello.txt")
-        with open(test_file, "w") as f:
-            f.write("line1\nline2\nline3\n")
-
-        session = _make_session()
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-
-            ref = FileRef(uri=test_file, start_line=None, end_line=None)
-            content = await resolve_file_ref(ref, user_id="u1", session=session)
-
-        assert content == "line1\nline2\nline3\n"
-
-
-@pytest.mark.asyncio
-async def test_resolve_file_ref_local_path_with_line_range():
-    """resolve_file_ref respects line ranges for local files."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        test_file = os.path.join(sdk_cwd, "multi.txt")
-        lines = [f"line{i}\n" for i in range(1, 11)]  # line1 … line10
-        with open(test_file, "w") as f:
-            f.writelines(lines)
-
-        session = _make_session()
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-
-            ref = FileRef(uri=test_file, start_line=3, end_line=5)
-            content = await resolve_file_ref(ref, user_id="u1", session=session)
-
-        assert content == "line3\nline4\nline5\n"
-
-
-@pytest.mark.asyncio
-async def test_resolve_file_ref_rejects_path_outside_sdk_cwd():
-    """resolve_file_ref raises ValueError for paths outside sdk_cwd."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var, patch(
-            "backend.copilot.context._current_sandbox"
-        ) as mock_sandbox_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-            mock_sandbox_var.get.return_value = None
-
-            ref = FileRef(uri="/etc/passwd", start_line=None, end_line=None)
-            with pytest.raises(ValueError, match="not allowed"):
-                await resolve_file_ref(ref, user_id="u1", session=_make_session())
-
-
-# ---------------------------------------------------------------------------
-# expand_file_refs_in_string — integration with real files
-# ---------------------------------------------------------------------------
-
-
-@pytest.mark.asyncio
-async def test_expand_string_with_real_file():
-    """expand_file_refs_in_string replaces @@agptfile: token with actual content."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        test_file = os.path.join(sdk_cwd, "data.txt")
-        with open(test_file, "w") as f:
-            f.write("hello world\n")
-
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-
-            result = await expand_file_refs_in_string(
-                f"Content: @@agptfile:{test_file}",
-                user_id="u1",
-                session=_make_session(),
-            )
-
-        assert result == "Content: hello world\n"
-
-
-@pytest.mark.asyncio
-async def test_expand_string_missing_file_is_surfaced_inline():
-    """Missing file ref yields [file-ref error: …] inline rather than raising."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        missing = os.path.join(sdk_cwd, "does_not_exist.txt")
-
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-
-            result = await expand_file_refs_in_string(
-                f"@@agptfile:{missing}",
-                user_id="u1",
-                session=_make_session(),
-            )
-
-        assert "[file-ref error:" in result
-        assert "not found" in result.lower() or "not allowed" in result.lower()
-
-
-# ---------------------------------------------------------------------------
-# expand_file_refs_in_args — dict traversal with real files
-# ---------------------------------------------------------------------------
-
-
-@pytest.mark.asyncio
-async def test_expand_args_replaces_file_ref_in_nested_dict():
-    """Nested @@agptfile: references in args are fully expanded."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        file_a = os.path.join(sdk_cwd, "a.txt")
-        file_b = os.path.join(sdk_cwd, "b.txt")
-        with open(file_a, "w") as f:
-            f.write("AAA")
-        with open(file_b, "w") as f:
-            f.write("BBB")
-
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-
-            result = await expand_file_refs_in_args(
-                {
-                    "outer": {
-                        "content_a": f"@@agptfile:{file_a}",
-                        "content_b": f"start @@agptfile:{file_b} end",
-                    },
-                    "count": 42,
-                },
-                user_id="u1",
-                session=_make_session(),
-            )
-
-        assert result["outer"]["content_a"] == "AAA"
-        assert result["outer"]["content_b"] == "start BBB end"
-        assert result["count"] == 42
-
-
-# ---------------------------------------------------------------------------
-# expand_file_refs_in_args — bare ref structured parsing
-# ---------------------------------------------------------------------------
-
-
-@pytest.mark.asyncio
-async def test_bare_ref_json_returns_parsed_dict():
-    """Bare ref to a .json file returns parsed dict, not raw string."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        json_file = os.path.join(sdk_cwd, "data.json")
-        with open(json_file, "w") as f:
-            f.write('{"key": "value", "count": 42}')
-
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-
-            result = await expand_file_refs_in_args(
-                {"data": f"@@agptfile:{json_file}"},
-                user_id="u1",
-                session=_make_session(),
-            )
-
-        assert result["data"] == {"key": "value", "count": 42}
-
-
-@pytest.mark.asyncio
-async def test_bare_ref_csv_returns_parsed_table():
-    """Bare ref to a .csv file returns list[list[str]] table."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        csv_file = os.path.join(sdk_cwd, "data.csv")
-        with open(csv_file, "w") as f:
-            f.write("Name,Score\nAlice,90\nBob,85")
-
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-
-            result = await expand_file_refs_in_args(
-                {"input": f"@@agptfile:{csv_file}"},
-                user_id="u1",
-                session=_make_session(),
-            )
-
-        assert result["input"] == [
-            ["Name", "Score"],
-            ["Alice", "90"],
-            ["Bob", "85"],
-        ]
-
-
-@pytest.mark.asyncio
-async def test_bare_ref_unknown_extension_returns_string():
-    """Bare ref to a file with unknown extension returns plain string."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        txt_file = os.path.join(sdk_cwd, "readme.txt")
-        with open(txt_file, "w") as f:
-            f.write("plain text content")
-
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-
-            result = await expand_file_refs_in_args(
-                {"data": f"@@agptfile:{txt_file}"},
-                user_id="u1",
-                session=_make_session(),
-            )
-
-        assert result["data"] == "plain text content"
-        assert isinstance(result["data"], str)
-
-
-@pytest.mark.asyncio
-async def test_bare_ref_invalid_json_falls_back_to_string():
-    """Bare ref to a .json file with invalid JSON falls back to string."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        json_file = os.path.join(sdk_cwd, "bad.json")
-        with open(json_file, "w") as f:
-            f.write("not valid json {{{")
-
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-
-            result = await expand_file_refs_in_args(
-                {"data": f"@@agptfile:{json_file}"},
-                user_id="u1",
-                session=_make_session(),
-            )
-
-        assert result["data"] == "not valid json {{{"
-        assert isinstance(result["data"], str)
-
-
-@pytest.mark.asyncio
-async def test_embedded_ref_always_returns_string_even_for_json():
-    """Embedded ref (text around it) returns plain string, not parsed JSON."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        json_file = os.path.join(sdk_cwd, "data.json")
-        with open(json_file, "w") as f:
-            f.write('{"key": "value"}')
-
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-
-            result = await expand_file_refs_in_args(
-                {"data": f"prefix @@agptfile:{json_file} suffix"},
-                user_id="u1",
-                session=_make_session(),
-            )
-
-        assert isinstance(result["data"], str)
-        assert result["data"].startswith("prefix ")
-        assert result["data"].endswith(" suffix")
-
-
-@pytest.mark.asyncio
-async def test_bare_ref_yaml_returns_parsed_dict():
-    """Bare ref to a .yaml file returns parsed dict."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        yaml_file = os.path.join(sdk_cwd, "config.yaml")
-        with open(yaml_file, "w") as f:
-            f.write("name: test\ncount: 42\n")
-
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-
-            result = await expand_file_refs_in_args(
-                {"config": f"@@agptfile:{yaml_file}"},
-                user_id="u1",
-                session=_make_session(),
-            )
-
-        assert result["config"] == {"name": "test", "count": 42}
-
-
-@pytest.mark.asyncio
-async def test_bare_ref_binary_with_line_range_ignores_range():
-    """Bare ref to a binary file (.parquet) with line range parses the full file.
-
-    Binary formats (parquet, xlsx) ignore line ranges — the full content is
-    parsed and the range is silently dropped with a log warning.
-    """
-    try:
-        import pandas as pd
-    except ImportError:
-        pytest.skip("pandas not installed")
-    try:
-        import pyarrow  # noqa: F401  # pyright: ignore[reportMissingImports]
-    except ImportError:
-        pytest.skip("pyarrow not installed")
-
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        parquet_file = os.path.join(sdk_cwd, "data.parquet")
-        import io as _io
-
-        df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
-        buf = _io.BytesIO()
-        df.to_parquet(buf, index=False)
-        with open(parquet_file, "wb") as f:
-            f.write(buf.getvalue())
-
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-
-            # Line range [1-2] should be silently ignored for binary formats.
-            result = await expand_file_refs_in_args(
-                {"data": f"@@agptfile:{parquet_file}[1-2]"},
-                user_id="u1",
-                session=_make_session(),
-            )
-
-        # Full file is returned despite the line range.
-        assert result["data"] == [["A", "B"], [1, 4], [2, 5], [3, 6]]
-
-
-@pytest.mark.asyncio
-async def test_bare_ref_toml_returns_parsed_dict():
-    """Bare ref to a .toml file returns parsed dict."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        toml_file = os.path.join(sdk_cwd, "config.toml")
-        with open(toml_file, "w") as f:
-            f.write('name = "test"\ncount = 42\n')
-
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
-            mock_cwd_var.get.return_value = sdk_cwd
-
-            result = await expand_file_refs_in_args(
-                {"config": f"@@agptfile:{toml_file}"},
-                user_id="u1",
-                session=_make_session(),
-            )
-
-        assert result["config"] == {"name": "test", "count": 42}
-
-
-# ---------------------------------------------------------------------------
-# _read_file_handler — extended to accept workspace:// and local paths
-# ---------------------------------------------------------------------------
-
-
-@pytest.mark.asyncio
-async def test_read_file_handler_local_file():
-    """_read_file_handler reads a local file when it's within sdk_cwd."""
-    with tempfile.TemporaryDirectory() as sdk_cwd:
-        test_file = os.path.join(sdk_cwd, "read_test.txt")
-        lines = [f"L{i}\n" for i in range(1, 6)]
-        with open(test_file, "w") as f:
-            f.writelines(lines)
-
-        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var, patch(
-            "backend.copilot.context._current_project_dir"
-        ) as mock_proj_var, patch(
-            "backend.copilot.sdk.tool_adapter.get_execution_context",
-            return_value=("user-1", _make_session()),
-        ):
-            mock_cwd_var.get.return_value = sdk_cwd
-            mock_proj_var.get.return_value = ""
-
-            result = await _read_file_handler(
-                {"file_path": test_file, "offset": 0, "limit": 5}
-            )
-
-        assert not result["isError"]
-        text = result["content"][0]["text"]
-        assert "L1" in text
-        assert "L5" in text
-
-
-@pytest.mark.asyncio
-async def test_read_file_handler_workspace_uri():
-    """_read_file_handler handles workspace:// URIs via the workspace manager."""
-    mock_session = _make_session()
-    mock_manager = AsyncMock()
-    mock_manager.read_file_by_id.return_value = b"workspace file content\nline two\n"
-
-    with patch(
-        "backend.copilot.sdk.tool_adapter.get_execution_context",
-        return_value=("user-1", mock_session),
-    ), patch(
-        "backend.copilot.sdk.file_ref.get_workspace_manager",
-        new=AsyncMock(return_value=mock_manager),
-    ):
-        result = await _read_file_handler(
-            {"file_path": "workspace://file-id-abc", "offset": 0, "limit": 10}
-        )
-
-    assert not result["isError"], result["content"][0]["text"]
-    text = result["content"][0]["text"]
-    assert "workspace file content" in text
-    assert "line two" in text
-
-
-@pytest.mark.asyncio
-async def test_read_file_handler_workspace_uri_no_session():
-    """_read_file_handler returns error when workspace:// is used without session."""
-    with patch(
-        "backend.copilot.sdk.tool_adapter.get_execution_context",
-        return_value=(None, None),
-    ):
-        result = await _read_file_handler({"file_path": "workspace://some-id"})
-
-    assert result["isError"]
-    assert "session" in result["content"][0]["text"].lower()
-
-
-@pytest.mark.asyncio
-async def test_read_file_handler_access_denied():
-    """_read_file_handler rejects paths outside allowed locations."""
-    with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd, patch(
-        "backend.copilot.context._current_sandbox"
-    ) as mock_sandbox, patch(
-        "backend.copilot.sdk.tool_adapter.get_execution_context",
-        return_value=("user-1", _make_session()),
-    ):
-        mock_cwd.get.return_value = "/tmp/safe-dir"
-        mock_sandbox.get.return_value = None
-
-        result = await _read_file_handler({"file_path": "/etc/passwd"})
-
-    assert result["isError"]
-    assert "not allowed" in result["content"][0]["text"].lower()
-
-
-# ---------------------------------------------------------------------------
-# read_file_bytes — workspace:///path (virtual path) and E2B sandbox branch
-# ---------------------------------------------------------------------------
-
-
-@pytest.mark.asyncio
-async def test_read_file_bytes_workspace_virtual_path():
-    """workspace:///path resolves via manager.read_file (is_path=True path)."""
-    session = _make_session()
-    mock_manager = AsyncMock()
-    mock_manager.read_file.return_value = b"virtual path content"
-
-    with patch(
-        "backend.copilot.sdk.file_ref.get_workspace_manager",
-        new=AsyncMock(return_value=mock_manager),
-    ):
-        result = await read_file_bytes("workspace:///reports/q1.md", "user-1", session)
-
-    assert result == b"virtual path content"
-    mock_manager.read_file.assert_awaited_once_with("/reports/q1.md")
-
-
-@pytest.mark.asyncio
-async def test_read_file_bytes_e2b_sandbox_branch():
-    """read_file_bytes reads from the E2B sandbox when a sandbox is active."""
-    session = _make_session()
-    mock_sandbox = AsyncMock()
-    mock_sandbox.files.read.return_value = bytearray(b"sandbox content")
-
-    with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd, patch(
-        "backend.copilot.context._current_sandbox"
-    ) as mock_sandbox_var, patch(
-        "backend.copilot.context._current_project_dir"
-    ) as mock_proj:
-        mock_cwd.get.return_value = ""
-        mock_sandbox_var.get.return_value = mock_sandbox
-        mock_proj.get.return_value = ""
-
-        result = await read_file_bytes("/home/user/script.sh", None, session)
-
-    assert result == b"sandbox content"
-    mock_sandbox.files.read.assert_awaited_once_with(
-        "/home/user/script.sh", format="bytes"
-    )
-
-
-@pytest.mark.asyncio
-async def test_read_file_bytes_e2b_path_escapes_sandbox_raises():
-    """read_file_bytes raises ValueError for paths that escape the sandbox root."""
-    session = _make_session()
-    mock_sandbox = AsyncMock()
-
-    with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd, patch(
-        "backend.copilot.context._current_sandbox"
-    ) as mock_sandbox_var, patch(
-        "backend.copilot.context._current_project_dir"
-    ) as mock_proj:
-        mock_cwd.get.return_value = ""
-        mock_sandbox_var.get.return_value = mock_sandbox
-        mock_proj.get.return_value = ""
-
-        with pytest.raises(ValueError, match="not allowed"):
-            await read_file_bytes("/etc/passwd", None, session)
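
The removed tests above pin down three behaviours of `@@agptfile:` expansion: a bare reference to a `.json` file yields parsed data (falling back to the raw string on invalid JSON), a reference embedded in other text always yields a string, and `[start-end]` selects a 1-indexed inclusive line range. The sketch below is a minimal illustration of those rules only. The regex, `slice_lines`, `expand_value`, and the injected `read_text` callable are hypothetical names, not the project's implementation, and the real code also handles CSV, YAML, TOML, binary formats, and access checks.

```python
import json
import re
from typing import Any, Callable, Optional

# Hypothetical token pattern: "@@agptfile:<uri>" with an optional "[start-end]".
REF_PATTERN = re.compile(
    r"@@agptfile:(?P<uri>[^\s\[\]]+)(?:\[(?P<start>\d+)-(?P<end>\d+)\])?"
)


def slice_lines(text: str, start: Optional[int], end: Optional[int]) -> str:
    """Return lines start..end (1-indexed, inclusive); no range means all text."""
    if start is None and end is None:
        return text
    lines = text.splitlines(keepends=True)
    first = (start or 1) - 1
    return "".join(lines[first:end])


def _read(match: "re.Match[str]", read_text: Callable[[str], str]) -> str:
    """Read the referenced file and apply the optional line range."""
    start, end = match.group("start"), match.group("end")
    text = read_text(match.group("uri"))
    return slice_lines(text, int(start) if start else None, int(end) if end else None)


def expand_value(value: str, read_text: Callable[[str], str]) -> Any:
    """Expand file references in one string argument."""
    bare = REF_PATTERN.fullmatch(value)
    if bare is None:
        # Embedded reference: substitute the content and stay a plain string.
        return REF_PATTERN.sub(lambda m: _read(m, read_text), value)
    text = _read(bare, read_text)
    if bare.group("uri").endswith(".json"):
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            return text  # invalid JSON falls back to the raw string
    return text
```

Passing the reader in as a callable keeps the sketch free of filesystem and sandbox concerns, which the real `resolve_file_ref` enforces separately.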
--- a/autogpt_platform/backend/backend/copilot/sdk/file_ref_test.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/file_ref_test.py
--- a/autogpt_platform/backend/backend/copilot/sdk/mcp_tool_guide.md
+++ b/autogpt_platform/backend/backend/copilot/sdk/mcp_tool_guide.md
@@ -1,59 +0,0 @@
-## MCP Tool Guide
-
-### Workflow
-
-`run_mcp_tool` follows a two-step pattern:
-
-1. **Discover** — call with only `server_url` to list available tools on the server.
-2. **Execute** — call again with `server_url`, `tool_name`, and `tool_arguments` to run a tool.
-
-### Known hosted MCP servers
-
-Use these URLs directly without asking the user:
-
-| Service | URL |
-|---|---|
-| Notion | `https://mcp.notion.com/mcp` |
-| Linear | `https://mcp.linear.app/mcp` |
-| Stripe | `https://mcp.stripe.com` |
-| Intercom | `https://mcp.intercom.com/mcp` |
-| Cloudflare | `https://mcp.cloudflare.com/mcp` |
-| Atlassian / Jira | `https://mcp.atlassian.com/mcp` |
-
-For other services, search the MCP registry API:
-```http
-GET https://registry.modelcontextprotocol.io/v0/servers?q=<search_term>
-```
-Each result includes a `remotes` array with the exact server URL to use.
-
-### Important: Check blocks first
-
-Before using `run_mcp_tool`, always check if the platform already has blocks for the service
-using `find_block`. The platform has hundreds of built-in blocks (Google Sheets, Google Docs,
-Google Calendar, Gmail, etc.) that work without MCP setup.
-
-Only use `run_mcp_tool` when:
-- The service is in the known hosted MCP servers list above, OR
-- You searched `find_block` first and found no matching blocks
-
-**Never guess or construct MCP server URLs.** Only use URLs from the known servers list above
-or from the `remotes[].url` field in MCP registry search results.
-
-### Authentication
-
-If the server requires credentials, a `SetupRequirementsResponse` is returned with an OAuth
-login prompt. Once the user completes the flow and confirms, retry the same call immediately.
-
-### Communication style
-
-Avoid technical jargon like "MCP server", "OAuth", or "credentials" when talking to the user.
-Use plain, friendly language instead:
-
-| Instead of… | Say… |
-|---|---|
-| "Let me connect to Sentry's MCP server and discover what tools are available." | "I can connect to Sentry and help identify important issues." |
-| "Let me connect to Sentry's MCP server now." | "Next, I'll connect to Sentry." |
-| "The MCP server at mcp.sentry.dev requires authentication. Please connect your credentials to continue." | "To continue, sign in to Sentry and approve access." |
-| "Sentry's MCP server needs OAuth authentication. You should see a prompt to connect your Sentry account…" | "You should see a prompt to sign in to Sentry. Once connected, I can help surface critical issues right away." |
-
-Use **"connect to [Service]"** or **"sign in to [Service]"** — never "MCP server", "OAuth", or "credentials".
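
The removed guide describes a two-step `run_mcp_tool` pattern (discover, then execute) and forbids guessing server URLs. A rough sketch of how a caller might assemble those two payloads is shown below. The parameter names `server_url`, `tool_name`, and `tool_arguments` and the URLs come from the guide; the helper names, the lower-cased lookup keys, and the `search_pages` tool name in the usage note are hypothetical.

```python
from typing import Any, Dict, Optional

# URLs copied from the guide's "Known hosted MCP servers" table.
KNOWN_MCP_SERVERS: Dict[str, str] = {
    "notion": "https://mcp.notion.com/mcp",
    "linear": "https://mcp.linear.app/mcp",
    "stripe": "https://mcp.stripe.com",
    "intercom": "https://mcp.intercom.com/mcp",
    "cloudflare": "https://mcp.cloudflare.com/mcp",
    "atlassian": "https://mcp.atlassian.com/mcp",
    "jira": "https://mcp.atlassian.com/mcp",
}


def resolve_server_url(service: str) -> Optional[str]:
    """Return a known URL, or None so the caller searches the registry instead."""
    return KNOWN_MCP_SERVERS.get(service.strip().lower())


def discover_call(server_url: str) -> Dict[str, Any]:
    """Step 1: only `server_url`, which lists the tools on the server."""
    return {"server_url": server_url}


def execute_call(
    server_url: str, tool_name: str, tool_arguments: Dict[str, Any]
) -> Dict[str, Any]:
    """Step 2: same server plus the chosen tool and its arguments."""
    return {
        "server_url": server_url,
        "tool_name": tool_name,
        "tool_arguments": tool_arguments,
    }
```

For example, `execute_call(resolve_server_url("Notion"), "search_pages", {"query": "roadmap"})` would build the second-step payload; a `None` from `resolve_server_url` signals that the URL must come from a registry search rather than being constructed by hand.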
--- a/autogpt_platform/backend/backend/copilot/sdk/response_adapter_test.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/response_adapter_test.py
@@ -536,12 +536,10 @@ async def test_wait_for_stash_signaled():
    result = await wait_for_stash(timeout=1.0)

    assert result is True
-    pto = _pto.get()
-    assert pto is not None
-    assert pto.get("WebSearch") == ["result data"]
+    assert _pto.get({}).get("WebSearch") == ["result data"]

    # Cleanup
-    _pto.set({})
+    _pto.set({})  # type: ignore[arg-type]
    _stash_event.set(None)


@@ -556,7 +554,7 @@ async def test_wait_for_stash_timeout():
    assert result is False

    # Cleanup
-    _pto.set({})
+    _pto.set({})  # type: ignore[arg-type]
    _stash_event.set(None)


@@ -575,12 +573,10 @@ async def test_wait_for_stash_already_stashed():
    assert result is True

    # But the stash itself is populated
-    pto = _pto.get()
-    assert pto is not None
-    assert pto.get("Read") == ["file contents"]
+    assert _pto.get({}).get("Read") == ["file contents"]

    # Cleanup
-    _pto.set({})
+    _pto.set({})  # type: ignore[arg-type]
    _stash_event.set(None)


--- a/autogpt_platform/backend/backend/copilot/sdk/security_hooks.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/security_hooks.py
@@ -10,13 +10,12 @@ import re
 from collections.abc import Callable
 from typing import Any, cast

-from backend.copilot.context import is_allowed_local_path
-
 from .tool_adapter import (
    BLOCKED_TOOLS,
    DANGEROUS_PATTERNS,
    MCP_TOOL_PREFIX,
    WORKSPACE_SCOPED_TOOLS,
+    is_allowed_local_path,
    stash_pending_tool_output,
 )

@@ -127,7 +126,7 @@ def create_security_hooks(
    user_id: str | None,
    sdk_cwd: str | None = None,
    max_subtasks: int = 3,
-    on_compact: Callable[[str], None] | None = None,
+    on_compact: Callable[[], None] | None = None,
 ) -> dict[str, Any]:
    """Create the security hooks configuration for Claude Agent SDK.

@@ -142,7 +141,6 @@ def create_security_hooks(
        sdk_cwd: SDK working directory for workspace-scoped tool validation
        max_subtasks: Maximum concurrent Task (sub-agent) spawns allowed per session
        on_compact: Callback invoked when SDK starts compacting context.
-            Receives the transcript_path from the hook input.

    Returns:
        Hooks configuration dict for ClaudeAgentOptions
@@ -302,21 +300,11 @@ def create_security_hooks(
            """
            _ = context, tool_use_id
            trigger = input_data.get("trigger", "auto")
-            # Sanitize untrusted input before logging to prevent log injection
-            transcript_path = (
-                str(input_data.get("transcript_path", ""))
-                .replace("\n", "")
-                .replace("\r", "")
-            )
            logger.info(
-                "[SDK] Context compaction triggered: %s, user=%s, "
-                "transcript_path=%s",
-                trigger,
-                user_id,
-                transcript_path,
+                f"[SDK] Context compaction triggered: {trigger}, user={user_id}"
            )
            if on_compact is not None:
-                on_compact(transcript_path)
+                on_compact()
            return cast(SyncHookJSONOutput, {})

        hooks: dict[str, Any] = {
--- a/autogpt_platform/backend/backend/copilot/sdk/security_hooks_test.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/security_hooks_test.py
@@ -9,9 +9,8 @@ import os

 import pytest

-from backend.copilot.context import _current_project_dir
-
 from .security_hooks import _validate_tool_access, _validate_user_isolation
+from .service import _is_tool_error_or_denial

 SDK_CWD = "/tmp/copilot-abc123"

@@ -121,6 +120,8 @@ def test_read_no_cwd_denies_absolute():


 def test_read_tool_results_allowed():
+    from .tool_adapter import _current_project_dir
+
    home = os.path.expanduser("~")
    path = f"{home}/.claude/projects/-tmp-copilot-abc123/tool-results/12345.txt"
    # is_allowed_local_path requires the session's encoded cwd to be set
@@ -132,14 +133,16 @@ def test_read_tool_results_allowed():
        _current_project_dir.reset(token)


-def test_read_claude_projects_settings_json_denied():
-    """SDK-internal artifacts like settings.json are NOT accessible — only tool-results/ is."""
+def test_read_claude_projects_session_dir_allowed():
+    """Files within the current session's project dir are allowed."""
+    from .tool_adapter import _current_project_dir
+
    home = os.path.expanduser("~")
    path = f"{home}/.claude/projects/-tmp-copilot-abc123/settings.json"
    token = _current_project_dir.set("-tmp-copilot-abc123")
    try:
        result = _validate_tool_access("Read", {"file_path": path}, sdk_cwd=SDK_CWD)
-        assert _is_denied(result)
+        assert not _is_denied(result)
    finally:
        _current_project_dir.reset(token)

@@ -354,3 +357,76 @@ async def test_task_slot_released_on_failure(_hooks):
        context={},
    )
    assert not _is_denied(result)
+
+
+# -- _is_tool_error_or_denial ------------------------------------------------
+
+
+class TestIsToolErrorOrDenial:
+    def test_none_content(self):
+        assert _is_tool_error_or_denial(None) is False
+
+    def test_empty_content(self):
+        assert _is_tool_error_or_denial("") is False
+
+    def test_benign_output(self):
+        assert _is_tool_error_or_denial("All good, no issues.") is False
+
+    def test_security_marker(self):
+        assert _is_tool_error_or_denial("[SECURITY] Tool access blocked") is True
+
+    def test_cannot_be_bypassed(self):
+        assert _is_tool_error_or_denial("This restriction cannot be bypassed.") is True
+
+    def test_not_allowed(self):
+        assert _is_tool_error_or_denial("Operation not allowed in sandbox") is True
+
+    def test_background_task_denial(self):
+        assert (
+            _is_tool_error_or_denial(
+                "Background task execution is not supported. "
+                "Run tasks in the foreground instead."
+            )
+            is True
+        )
+
+    def test_subtask_limit_denial(self):
+        assert (
+            _is_tool_error_or_denial(
+                "Maximum 2 concurrent sub-tasks. "
+                "Wait for running sub-tasks to finish, "
+                "or continue in the main conversation."
+            )
+            is True
+        )
+
+    def test_denied_marker(self):
+        assert (
+            _is_tool_error_or_denial("Access denied: insufficient privileges") is True
+        )
+
+    def test_blocked_marker(self):
+        assert _is_tool_error_or_denial("Request blocked by security policy") is True
+
+    def test_failed_marker(self):
+        assert _is_tool_error_or_denial("Failed to execute tool: timeout") is True
+
+    def test_mcp_iserror(self):
+        assert _is_tool_error_or_denial('{"isError": true, "content": []}') is True
+
+    def test_benign_error_in_value(self):
+        """Content like '0 errors found' should not trigger — 'error' was removed."""
+        assert _is_tool_error_or_denial("0 errors found") is False
+
+    def test_benign_permission_field(self):
+        """Schema descriptions mentioning 'permission' should not trigger."""
+        assert (
+            _is_tool_error_or_denial(
+                '{"fields": [{"name": "permission_level", "type": "int"}]}'
+            )
+            is False
+        )
+
+    def test_benign_not_found_in_listing(self):
+        """File listing containing 'not found' in filenames should not trigger."""
+        assert _is_tool_error_or_denial("readme.md\nfile-not-found-handler.py") is False
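
The body of `_is_tool_error_or_denial` is not part of this diff, so the following is only a sketch of a classifier that would satisfy the test cases added above. It assumes a case-insensitive substring match against a fixed marker list. Both the function name `looks_like_tool_error_or_denial` and the marker list are hypothetical, reverse-engineered from the tests rather than taken from `service.py`.

```python
from typing import Optional

# Hypothetical markers, chosen only so that the test inputs above classify
# as expected. Generic words such as "error", "permission", or "not found"
# are deliberately absent, matching the benign cases in the tests.
_DENIAL_MARKERS = (
    "[security]",
    "cannot be bypassed",
    "not allowed",
    "not supported",
    "concurrent sub-tasks",
    "denied",
    "blocked",
    "failed to",
    '"iserror": true',
)


def looks_like_tool_error_or_denial(content: Optional[str]) -> bool:
    """Return True when tool output contains a known denial or error marker."""
    if not content:
        return False  # None and "" are treated as benign
    lowered = content.lower()
    return any(marker in lowered for marker in _DENIAL_MARKERS)
```

Substring matching keeps the check cheap, at the cost of false positives whenever benign output happens to contain a marker, which is why the tests include inputs such as `0 errors found`.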
--- a/autogpt_platform/backend/backend/copilot/sdk/service.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/service.py
@@ -29,7 +29,6 @@ from langfuse import propagate_attributes
 from langsmith.integrations.claude_agent_sdk import configure_claude_agent_sdk
 from pydantic import BaseModel

-from backend.copilot.context import get_workspace_manager
 from backend.data.redis_client import get_redis_async
 from backend.executor.cluster_lock import AsyncClusterLock
 from backend.util.exceptions import NotFoundError
@@ -45,7 +44,7 @@ from ..model import (
    update_session_title,
    upsert_chat_session,
 )
-from ..prompting import get_sdk_supplement
+from ..prompt_constants import KEY_WORKFLOWS
 from ..response_model import (
    StreamBaseResponse,
    StreamError,
@@ -61,8 +60,10 @@ from ..service import (
    _generate_session_title,
    _is_langfuse_configured,
 )
-from ..tools.e2b_sandbox import get_or_create_sandbox, pause_sandbox_direct
+from ..tools import TOOL_REGISTRY
+from ..tools.e2b_sandbox import get_or_create_sandbox
 from ..tools.sandbox import WORKSPACE_PREFIX, make_session_path
+from ..tools.workspace_files import get_manager
 from ..tracking import track_user_message
 from .compaction import CompactionTracker, filter_compaction_messages
 from .response_adapter import SDKResponseAdapter
@@ -77,7 +78,6 @@ from .tool_adapter import (
 from .transcript import (
    cleanup_cli_project_dir,
    download_transcript,
-    read_compacted_entries,
    upload_transcript,
    validate_transcript,
    write_transcript_to_tempfile,
@@ -148,6 +148,169 @@ _SDK_CWD_PREFIX = WORKSPACE_PREFIX
 _HEARTBEAT_INTERVAL = 10.0  # seconds


+# Appended to the system prompt to inform the agent about available tools.
+# The SDK built-in Bash is NOT available — use mcp__copilot__bash_exec instead,
+# which has kernel-level network isolation (unshare --net).
+def _generate_tool_documentation() -> str:
+    """Auto-generate tool documentation from TOOL_REGISTRY.
+
+    This generates a complete list of available tools with their descriptions,
+    ensuring the documentation stays in sync with the actual tool implementations.
+    """
+    docs = "\n## AVAILABLE TOOLS\n\n"
+
+    # Sort tools alphabetically for consistent output
+    for name in sorted(TOOL_REGISTRY.keys()):
+        tool = TOOL_REGISTRY[name]
+        schema = tool.as_openai_tool()
+        desc = schema["function"].get("description", "No description available")
+        # Format as bullet list with tool name in code style
+        docs += f"- **`{name}`**: {desc}\n"
+
+    # Add workflow guidance for key tools
+    docs += KEY_WORKFLOWS
+
+    return docs
+
+
+_SHARED_TOOL_NOTES = """\
+
+### Web search and research
+- **`web_search(query)`** — Search the web for current information (uses Claude's
+  native web search). Use this when you need up-to-date information, facts,
+  statistics, or current events that are beyond your knowledge cutoff.
+- **`web_fetch(url)`** — Retrieve and analyze content from a specific URL.
+  Use this when you have a specific URL to read (documentation, articles, etc.).
+
+### Sharing files with the user
+After saving a file to the persistent workspace with `write_workspace_file`,
+share it with the user by embedding the `download_url` from the response in
+your message as a Markdown link or image:
+
+- **Any file** — shows as a clickable download link:
+  `[report.csv](workspace://file_id#text/csv)`
+- **Image** — renders inline in chat:
+  `![chart](workspace://file_id#image/png)`
+- **Video** — renders inline in chat with player controls:
+  `![recording](workspace://file_id#video/mp4)`
+
+The `download_url` field in the `write_workspace_file` response is already
+in the correct format — paste it directly after the `(` in the Markdown.
+
+### Long-running tools
+Long-running tools (create_agent, edit_agent, etc.) are handled
+asynchronously.  You will receive an immediate response; the actual result
+is delivered to the user via a background stream.
+
+### Large tool outputs
+When a tool output exceeds the display limit, it is automatically saved to
+the persistent workspace.  The truncated output includes a
+`<tool-output-truncated>` tag with the workspace path.  Use
+`read_workspace_file(path="...", offset=N, length=50000)` to retrieve
+additional sections.
+
+### Sub-agent tasks
+- When using the Task tool, NEVER set `run_in_background` to true.
+  All tasks must run in the foreground.
+"""
+
+
+_LOCAL_TOOL_SUPPLEMENT = (
+    """
+
+## Tool notes
+
+### Shell commands
+- The SDK built-in Bash tool is NOT available.  Use the `bash_exec` MCP tool
+  for shell commands — it runs in a network-isolated sandbox.
+
+### Working directory
+- Your working directory is: `{cwd}`
+- All SDK Read/Write/Edit/Glob/Grep tools AND `bash_exec` operate inside this
+  directory.  This is the ONLY writable path — do not attempt to read or write
+  anywhere else on the filesystem.
+- Use relative paths or absolute paths under `{cwd}` for all file operations.
+
+### Two storage systems — CRITICAL to understand
+
+1. **Ephemeral working directory** (`{cwd}`):
+   - Shared by SDK Read/Write/Edit/Glob/Grep tools AND `bash_exec`
+   - Files here are **lost between turns** — do NOT rely on them persisting
+   - Use for temporary work: running scripts, processing data, etc.
+
+2. **Persistent workspace** (cloud storage):
+   - Files here **survive across turns and sessions**
+   - Use `write_workspace_file` to save important files (code, outputs, configs)
+   - Use `read_workspace_file` to retrieve previously saved files
+   - Use `list_workspace_files` to see what files you've saved before
+   - Call `list_workspace_files(include_all_sessions=True)` to see files from
+     all sessions
+
+### Moving files between ephemeral and persistent storage
+- **Ephemeral → Persistent**: Use `write_workspace_file` with either:
+  - `content` param (plain text) — for text files
+  - `source_path` param — to copy any file directly from the ephemeral dir
+- **Persistent → Ephemeral**: Use `read_workspace_file` with `save_to_path`
+  param to download a workspace file to the ephemeral dir for processing
+
+### File persistence workflow
+When you create or modify important files (code, configs, outputs), you MUST:
+1. Save them using `write_workspace_file` so they persist
+2. At the start of a new turn, call `list_workspace_files` to see what files
+   are available from previous turns
+"""
+    + _SHARED_TOOL_NOTES
+)
+
+
+_E2B_TOOL_SUPPLEMENT = (
+    """
+
+## Tool notes
+
+### Shell commands
+- The SDK built-in Bash tool is NOT available.  Use the `bash_exec` MCP tool
+  for shell commands — it runs in a cloud sandbox with full internet access.
+
+### Working directory
+- Your working directory is: `/home/user` (cloud sandbox)
+- All file tools (`read_file`, `write_file`, `edit_file`, `glob`, `grep`)
+  AND `bash_exec` operate on the **same cloud sandbox filesystem**.
+- Files created by `bash_exec` are immediately visible to `read_file` and
+  vice-versa — they share one filesystem.
+- Use relative paths (resolved from `/home/user`) or absolute paths.
+
+### Two storage systems — CRITICAL to understand
+
+1. **Cloud sandbox** (`/home/user`):
+   - Shared by all file tools AND `bash_exec` — same filesystem
+   - Files **persist across turns** within the current session
+   - Full Linux environment with internet access
+   - Lost when the session expires (12 h inactivity)
+
+2. **Persistent workspace** (cloud storage):
+   - Files here **survive across sessions indefinitely**
+   - Use `write_workspace_file` to save important files permanently
+   - Use `read_workspace_file` to retrieve previously saved files
+   - Use `list_workspace_files` to see what files you've saved before
+   - Call `list_workspace_files(include_all_sessions=True)` to see files from
+     all sessions
+
+### Moving files between sandbox and persistent storage
+- **Sandbox → Persistent**: Use `write_workspace_file` with `source_path`
+  to copy from the sandbox to permanent storage
+- **Persistent → Sandbox**: Use `read_workspace_file` with `save_to_path`
+  to download into the sandbox for processing
+
+### File persistence workflow
+Important files that must survive beyond this session should be saved with
+`write_workspace_file`.  Sandbox files persist across turns but are lost
+when the session expires.
+"""
+    + _SHARED_TOOL_NOTES
+)
+
+
 STREAM_LOCK_PREFIX = "copilot:stream:lock:"


@@ -328,14 +491,13 @@ def _format_sdk_content_blocks(blocks: list) -> list[dict[str, Any]]:
                }
            )
        elif isinstance(block, ToolResultBlock):
-            tool_result_entry: dict[str, Any] = {
-                "type": "tool_result",
-                "tool_use_id": block.tool_use_id,
-                "content": block.content,
-            }
-            if block.is_error:
-                tool_result_entry["is_error"] = True
-            result.append(tool_result_entry)
+            result.append(
+                {
+                    "type": "tool_result",
+                    "tool_use_id": block.tool_use_id,
+                    "content": block.content,
+                }
+            )
        elif isinstance(block, ThinkingBlock):
            result.append(
                {
@@ -457,6 +619,31 @@ def _format_conversation_context(messages: list[ChatMessage]) -> str | None:
    return "<conversation_history>\n" + "\n".join(lines) + "\n</conversation_history>"


+def _is_tool_error_or_denial(content: str | None) -> bool:
+    """Check if a tool message content indicates an error or denial.
+
+    Currently unused — ``_format_conversation_context`` includes all tool
+    results.  Kept as a utility for future selective filtering.
+    """
+    if not content:
+        return False
+    lower = content.lower()
+    return any(
+        marker in lower
+        for marker in (
+            "[security]",
+            "cannot be bypassed",
+            "not allowed",
+            "not supported",  # background-task denial
+            "maximum",  # subtask-limit denial
+            "denied",
+            "blocked",
+            "failed to",  # internal tool execution failures
+            '"iserror": true',  # MCP protocol error flag
+        )
+    )
+
+
 async def _build_query_message(
    current_message: str,
    session: ChatSession,
@@ -565,7 +752,7 @@ async def _prepare_file_attachments(
        return empty

    try:
-        manager = await get_workspace_manager(user_id, session_id)
+        manager = await get_manager(user_id, session_id)
    except Exception:
        logger.warning(
            "Failed to create workspace manager for file attachments",
@@ -760,32 +947,29 @@ async def stream_chat_completion_sdk(

        async def _setup_e2b():
            """Set up E2B sandbox if configured, return sandbox or None."""
-            if not (e2b_api_key := config.active_e2b_api_key):
-                if config.use_e2b_sandbox:
-                    logger.warning(
-                        "[E2B] [%s] E2B sandbox enabled but no API key configured "
-                        "(CHAT_E2B_API_KEY / E2B_API_KEY) — falling back to bubblewrap",
-                        session_id[:12],
-                    )
-                return None
-            try:
-                sandbox = await get_or_create_sandbox(
-                    session_id,
-                    api_key=e2b_api_key,
-                    template=config.e2b_sandbox_template,
-                    timeout=config.e2b_sandbox_timeout,
-                    on_timeout=config.e2b_sandbox_on_timeout,
-                )
-            except Exception as e2b_err:
-                logger.error(
-                    "[E2B] [%s] Setup failed: %s",
+            if config.use_e2b_sandbox and not config.e2b_api_key:
+                logger.warning(
+                    "[E2B] [%s] E2B sandbox enabled but no API key configured "
+                    "(CHAT_E2B_API_KEY / E2B_API_KEY) — falling back to bubblewrap",
                    session_id[:12],
-                    e2b_err,
-                    exc_info=True,
                )
                return None
-
-            return sandbox
+            if config.use_e2b_sandbox and config.e2b_api_key:
+                try:
+                    return await get_or_create_sandbox(
+                        session_id,
+                        api_key=config.e2b_api_key,
+                        template=config.e2b_sandbox_template,
+                        timeout=config.e2b_sandbox_timeout,
+                    )
+                except Exception as e2b_err:
+                    logger.error(
+                        "[E2B] [%s] Setup failed: %s",
+                        session_id[:12],
+                        e2b_err,
+                        exc_info=True,
+                    )
+            return None

        async def _fetch_transcript():
            """Download transcript for --resume if applicable."""
@@ -812,10 +996,18 @@ async def stream_chat_completion_sdk(
        )

        use_e2b = e2b_sandbox is not None
-        # Append appropriate supplement (Claude gets tool schemas automatically)
-        system_prompt = base_system_prompt + get_sdk_supplement(
-            use_e2b=use_e2b, cwd=sdk_cwd
+        # Generate tool documentation and append appropriate supplement
+        tool_docs = _generate_tool_documentation()
+        system_prompt = (
+            base_system_prompt
+            + tool_docs
+            + (
+                _E2B_TOOL_SUPPLEMENT
+                if use_e2b
+                else _LOCAL_TOOL_SUPPLEMENT.format(cwd=sdk_cwd)
+            )
        )
+
        # Process transcript download result
        transcript_msg_count = 0
        if dl:
@@ -880,11 +1072,6 @@ async def stream_chat_completion_sdk(

        allowed = get_copilot_tool_names(use_e2b=use_e2b)
        disallowed = get_sdk_disallowed_tools(use_e2b=use_e2b)
-
-        def _on_stderr(line: str) -> None:
-            sid = session_id[:12] if session_id else "?"
-            logger.info("[SDK] [%s] CLI stderr: %s", sid, line.rstrip())
-
        sdk_options_kwargs: dict[str, Any] = {
            "system_prompt": system_prompt,
            "mcp_servers": {"copilot": mcp_server},
@@ -893,7 +1080,6 @@ async def stream_chat_completion_sdk(
            "hooks": security_hooks,
            "cwd": sdk_cwd,
            "max_buffer_size": config.claude_agent_max_buffer_size,
-            "stderr": _on_stderr,
        }
        if sdk_model:
            sdk_options_kwargs["model"] = sdk_model
@@ -984,18 +1170,19 @@ async def stream_chat_completion_sdk(
                    json.dumps(user_msg) + "\n"
                )
                # Capture user message in transcript (multimodal)
-                transcript_builder.append_user(content=content_blocks)
+                transcript_builder.add_user_message(content=content_blocks)
            else:
                await client.query(query_message, session_id=session_id)
                # Capture actual user message in transcript (not the engineered query)
                # query_message may include context wrappers, but transcript needs raw input
-                transcript_builder.append_user(content=current_message)
+                transcript_builder.add_user_message(content=current_message)

            assistant_response = ChatMessage(role="assistant", content="")
            accumulated_tool_calls: list[dict[str, Any]] = []
            has_appended_assistant = False
            has_tool_results = False
            ended_with_stream_error = False
+
            # Use an explicit async iterator with non-cancelling heartbeats.
            # CRITICAL: we must NOT cancel __anext__() mid-flight — doing so
            # (via asyncio.timeout or wait_for) corrupts the SDK's internal
@@ -1048,7 +1235,6 @@ async def stream_chat_completion_sdk(
                            exc_info=True,
                        )
                        ended_with_stream_error = True
-
                        yield StreamError(
                            errorText=f"SDK stream error: {stream_err}",
                            code="sdk_stream_error",
@@ -1067,17 +1253,13 @@ async def stream_chat_completion_sdk(
                        len(adapter.resolved_tool_calls),
                    )

-                    # Log AssistantMessage API errors (e.g. invalid_request)
-                    # so we can debug Anthropic API 400s surfaced by the CLI.
-                    sdk_error = getattr(sdk_msg, "error", None)
-                    if isinstance(sdk_msg, AssistantMessage) and sdk_error:
-                        logger.error(
-                            "[SDK] [%s] AssistantMessage has error=%s, "
-                            "content_blocks=%d, content_preview=%s",
-                            session_id[:12],
-                            sdk_error,
-                            len(sdk_msg.content),
-                            str(sdk_msg.content)[:500],
+                    # Capture SDK messages in transcript
+                    if isinstance(sdk_msg, AssistantMessage):
+                        content_blocks = _format_sdk_content_blocks(sdk_msg.content)
+                        model_name = getattr(sdk_msg, "model", "")
+                        transcript_builder.add_assistant_message(
+                            content_blocks=content_blocks,
+                            model=model_name,
                        )

                    # Race-condition fix: SDK hooks (PostToolUse) are
@@ -1133,26 +1315,9 @@ async def stream_chat_completion_sdk(
                                sdk_msg.result or "(no error message provided)",
                            )

-                    # Emit compaction end if SDK finished compacting.
-                    # When compaction ends, sync TranscriptBuilder with the
-                    # CLI's active context so they stay identical.
-                    compact_result = await compaction.emit_end_if_ready(session)
-                    for ev in compact_result.events:
+                    # Emit compaction end if SDK finished compacting
+                    for ev in await compaction.emit_end_if_ready(session):
                        yield ev
-                    # After replace_entries, skip append_assistant for this
-                    # sdk_msg — the CLI session file already contains it,
-                    # so appending again would create a duplicate.
-                    entries_replaced = False
-                    if compact_result.just_ended:
-                        compacted = await asyncio.to_thread(
-                            read_compacted_entries,
-                            compact_result.transcript_path,
-                        )
-                        if compacted is not None:
-                            transcript_builder.replace_entries(
-                                compacted, log_prefix=log_prefix
-                            )
-                            entries_replaced = True

                    for response in adapter.convert_message(sdk_msg):
                        if isinstance(response, StreamStart):
@@ -1227,40 +1392,33 @@ async def stream_chat_completion_sdk(
                                has_appended_assistant = True

                        elif isinstance(response, StreamToolOutputAvailable):
-                            content = (
+                            tool_result_content = (
                                response.output
                                if isinstance(response.output, str)
-                                else json.dumps(response.output, ensure_ascii=False)
+                                else str(response.output)
                            )
                            session.messages.append(
                                ChatMessage(
                                    role="tool",
-                                    content=content,
+                                    content=tool_result_content,
                                    tool_call_id=response.toolCallId,
                                )
                            )
-                            if not entries_replaced:
-                                transcript_builder.append_tool_result(
-                                    tool_use_id=response.toolCallId,
-                                    content=content,
-                                )
+                            # Capture tool result in transcript as user message with tool_result content
+                            transcript_builder.add_user_message(
+                                content=[
+                                    {
+                                        "type": "tool_result",
+                                        "tool_use_id": response.toolCallId,
+                                        "content": tool_result_content,
+                                    }
+                                ]
+                            )
                            has_tool_results = True

                        elif isinstance(response, StreamFinish):
                            stream_completed = True

-                    # Append assistant entry AFTER convert_message so that
-                    # any stashed tool results from the previous turn are
-                    # recorded first, preserving the required API order:
-                    # assistant(tool_use) → tool_result → assistant(text).
-                    # Skip if replace_entries just ran — the CLI session
-                    # file already contains this message.
-                    if isinstance(sdk_msg, AssistantMessage) and not entries_replaced:
-                        transcript_builder.append_assistant(
-                            content_blocks=_format_sdk_content_blocks(sdk_msg.content),
-                            model=sdk_msg.model,
-                        )
-
            except asyncio.CancelledError:
                # Task/generator was cancelled (e.g. client disconnect,
                # server shutdown).  Log and let the safety-net / finally
@@ -1309,15 +1467,6 @@ async def stream_chat_completion_sdk(
                            type(response).__name__,
                            getattr(response, "toolName", "N/A"),
                        )
-                    if isinstance(response, StreamToolOutputAvailable):
-                        transcript_builder.append_tool_result(
-                            tool_use_id=response.toolCallId,
-                            content=(
-                                response.output
-                                if isinstance(response.output, str)
-                                else json.dumps(response.output, ensure_ascii=False)
-                            ),
-                        )
                    yield response

            # If the stream ended without a ResultMessage, the SDK
@@ -1434,25 +1583,14 @@ async def stream_chat_completion_sdk(
                    exc_info=True,
                )

-        # --- Pause E2B sandbox to stop billing between turns ---
-        # Fire-and-forget: pausing is best-effort and must not block the
-        # response or the transcript upload.  The task is anchored to
-        # _background_tasks to prevent garbage collection.
-        # Use pause_sandbox_direct to skip the Redis lookup and reconnect
-        # round-trip — e2b_sandbox is the live object from this turn.
-        if e2b_sandbox is not None:
-            task = asyncio.create_task(pause_sandbox_direct(e2b_sandbox, session_id))
-            _background_tasks.add(task)
-            task.add_done_callback(_background_tasks.discard)
-
        # --- Upload transcript for next-turn --resume ---
-        # TranscriptBuilder is the single source of truth.  It mirrors the
-        # CLI's active context: on compaction, replace_entries() syncs it
-        # with the compacted session file.  No CLI file read needed here.
+        # This MUST run in finally so the transcript is uploaded even when
+        # the streaming loop raises an exception.
+        # The transcript represents the COMPLETE active context (atomic).
        if config.claude_agent_use_resume and user_id and session is not None:
            try:
+                # Build complete transcript from captured SDK messages
                transcript_content = transcript_builder.to_jsonl()
-                entry_count = transcript_builder.entry_count

                if not transcript_content:
                    logger.warning(
@@ -1462,15 +1600,18 @@ async def stream_chat_completion_sdk(
                    logger.warning(
                        "%s Transcript invalid, skipping upload (entries=%d)",
                        log_prefix,
-                        entry_count,
+                        transcript_builder.entry_count,
                    )
                else:
                    logger.info(
-                        "%s Uploading transcript (entries=%d, bytes=%d)",
+                        "%s Uploading complete transcript (entries=%d, bytes=%d)",
                        log_prefix,
-                        entry_count,
+                        transcript_builder.entry_count,
                        len(transcript_content),
                    )
+                    # Shield upload from cancellation - let it complete even if
+                    # the finally block is interrupted. No timeout to avoid race
+                    # conditions where backgrounded uploads overwrite newer transcripts.
                    await asyncio.shield(
                        upload_transcript(
                            user_id=user_id,
@@ -1504,7 +1645,7 @@ async def _update_title_async(
            message, user_id=user_id, session_id=session_id
        )
        if title and user_id:
-            await update_session_title(session_id, user_id, title, only_if_empty=True)
+            await update_session_title(session_id, title)
            logger.debug(f"[SDK] Generated title for {session_id}: {title}")
    except Exception as e:
        logger.warning(f"[SDK] Failed to update session title: {e}")
Author	SHA1	Message	Date
Zamil Majdy	3d880cd591	refactor(backend/copilot): move imports to module level - Move KEY_WORKFLOWS and TOOL_REGISTRY imports to top of file - Better code organization following Python conventions	2026-03-06 23:15:39 +07:00
Zamil Majdy	73f5ff9983	test(backend/copilot): add tests for auto-generated tool documentation - Test tool documentation structure (sections, format) - Test that all TOOL_REGISTRY tools are included - Test workflow sections are present - Test no duplicate tools - Verify markdown formatting compliance - All 6 tests passing	2026-03-06 23:15:39 +07:00
Zamil Majdy	6d9faf5f91	refactor(backend/copilot): auto-generate tool docs in supplement, simplify default prompt - Add _generate_tool_documentation() to auto-generate tool list from TOOL_REGISTRY - Extract KEY_WORKFLOWS constant to prompt_constants.py for maintainability - Append auto-generated tool docs + workflow guidance to system prompt supplement - Simplify DEFAULT_SYSTEM_PROMPT to minimal tone/style baseline (Langfuse handles details) - Add KEY WORKFLOWS section covering MCP integration, agent creation, folder management - Ensures tool documentation stays in sync with actual implementations - Fix Pyright error by safely accessing description field with .get()	2026-03-06 23:10:42 +07:00
Zamil Majdy	7774717104	docs(backend/copilot): document web_search and web_fetch in tool supplement Add clear documentation for web_search and web_fetch to the shared tool notes that get appended to all system prompts (Langfuse or default). This ensures the copilot knows to use web_search for general web queries instead of incorrectly using find_block to search for web search blocks. - web_search: For current information beyond knowledge cutoff - web_fetch: For retrieving content from specific URLs	2026-03-06 23:10:42 +07:00
Zamil Majdy	89ed628609	fix(backend/copilot): capture tool results in transcript Tool results (StreamToolOutputAvailable) were being added to session.messages but NOT to transcript_builder, causing the transcript to miss tool executions. This made the copilot claim '(no tool used)' when tools were actually called. Now tool results are captured as user messages with tool_result content blocks, matching the Claude API transcript format and ensuring --resume has complete conversation history including all tool interactions.	2026-03-06 23:10:42 +07:00
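
The last commit above (89ed628609) describes recording tool results as user messages carrying a `tool_result` content block. A minimal sketch of that shape follows. `TranscriptSketch` is a hypothetical stand-in for the real `transcript_builder`, whose implementation is not shown in this diff, and the JSONL entry layout (`type` / `message` / `role`) is an assumption made for illustration. The sketch serializes non-string outputs with `json.dumps`, which is a choice of this example; the diff itself uses `str(response.output)`.

```python
from __future__ import annotations

import json
from typing import Any


class TranscriptSketch:
    """Hypothetical stand-in for the real transcript builder (not in this diff)."""

    def __init__(self) -> None:
        self._entries: list[dict[str, Any]] = []

    def add_user_message(self, content: str | list[dict[str, Any]]) -> None:
        # User turns carry either plain text or a list of content blocks.
        self._entries.append(
            {"type": "user", "message": {"role": "user", "content": content}}
        )

    def add_assistant_message(
        self, content_blocks: list[dict[str, Any]], model: str = ""
    ) -> None:
        self._entries.append(
            {
                "type": "assistant",
                "message": {
                    "role": "assistant",
                    "model": model,
                    "content": content_blocks,
                },
            }
        )

    @property
    def entry_count(self) -> int:
        return len(self._entries)

    def to_jsonl(self) -> str:
        # One JSON object per line; empty string when nothing was captured.
        return "".join(json.dumps(entry) + "\n" for entry in self._entries)


def tool_result_block(tool_call_id: str, output: Any) -> dict[str, Any]:
    """Build a tool_result content block keyed to the originating tool call."""
    content = (
        output
        if isinstance(output, str)
        else json.dumps(output, ensure_ascii=False)  # assumption: JSON, not str()
    )
    return {"type": "tool_result", "tool_use_id": tool_call_id, "content": content}


builder = TranscriptSketch()
builder.add_user_message(content="What is 2 + 2?")
builder.add_assistant_message(
    content_blocks=[
        {"type": "tool_use", "id": "call_1", "name": "calc", "input": {"expr": "2 + 2"}}
    ],
    model="example-model",  # placeholder model name
)
# The tool result is recorded as a *user* message, right after the tool_use turn.
builder.add_user_message(content=[tool_result_block("call_1", {"result": 4})])
lines = builder.to_jsonl().splitlines()
```

Keeping the order assistant(`tool_use`) then user(`tool_result`) is what lets a resumed session see that a tool actually ran, which is the failure the commit message reports ("(no tool used)").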