docs: Add changelog section to platform docs (#12077 )

## Summary - Adds a new **Changelog** section to the platform docs sidebar navigation - Creates the first bi-weekly changelog entry covering January 29 – February 11, 2026 - Establishes the directory structure (`docs/platform/changelog/`) for future entries ### What's included in the changelog entry The entry covers voice input, persistent file workspace, Claude Opus 4.6 upgrade, marketplace agent customization, rebuilt chat streaming (Vercel AI SDK migration), reconnection during long tasks, agent creation UX improvements, homepage redesign, and agent library reuse — plus collapsible sections for smaller improvements and fixes. ### How to add future entries 1. Create a new markdown file in `docs/platform/changelog/` (e.g., `february-12-february-25-2026.md`) 2. Add a link to `docs/platform/SUMMARY.md` under the Changelog section ## Test plan - [ ] Verify the Changelog section appears in the GitBook sidebar after merge - [ ] Verify the changelog entry renders correctly (headings, table, collapsible sections) - [ ] Verify navigation between the changelog index and the entry works 🤖 Generated with [Claude Code](https://claude.com/claude-code)  <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Adds a new **Changelog** section to the platform documentation with the first bi-weekly entry (January 29 – February 11, 2026). The changelog documents major platform updates including voice input, persistent file workspace, Claude Opus 4.6 upgrade, marketplace agent customization, rebuilt chat streaming, reconnection during long tasks, agent creation UX improvements, homepage redesign, and agent library reuse. **Key changes:** - Created `docs/platform/changelog/` directory with a landing page (`README.md`) and the first entry - Added "Changelog" section to `SUMMARY.md` navigation with proper GitBook structure - Used collapsible `<details>` sections for minor improvements and fixes to keep the changelog scannable - Follows existing documentation patterns with clear headings, GitBook hint blocks, and markdown tables The structure allows for easy addition of future entries by creating new markdown files and linking them in `SUMMARY.md`. </details> <details><summary><h3>Confidence Score: 5/5</h3></summary> - This PR is safe to merge with minimal risk - it only adds documentation files. - The PR introduces only new documentation files with no code changes. The changelog structure follows GitBook conventions correctly, uses proper markdown formatting, and integrates cleanly into the existing documentation navigation. All files are well-formatted with clear headings, proper use of GitBook hint blocks, and collapsible sections for organization. - No files require special attention </details>   --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
docs(blocks): complete block documentation migration cleanup
2026-02-18 10:41:49 -05:00 · 2026-02-11 18:26:57 +00:00 · 2026-01-22 14:18:10 -06:00 · 2026-01-22 14:17:18 -06:00 · 2026-01-22 18:59:20 +00:00 · 2026-01-15 12:59:51 -07:00
740 changed files with 22058 additions and 64281 deletions
--- a/.branchlet.json
+++ b/.branchlet.json
@@ -29,7 +29,8 @@
  "postCreateCmd": [
    "cd autogpt_platform/autogpt_libs && poetry install",
    "cd autogpt_platform/backend && poetry install && poetry run prisma generate",
-    "cd autogpt_platform/frontend && pnpm install"
+    "cd autogpt_platform/frontend && pnpm install",
+    "cd docs && pip install -r requirements.txt"
  ],
  "terminalCommand": "code .",
  "deleteBranchWithWorktree": false
--- a/.dockerignore
+++ b/.dockerignore
@@ -5,13 +5,42 @@
 !docs/

 # Platform - Libs
-!autogpt_platform/autogpt_libs/
+!autogpt_platform/autogpt_libs/autogpt_libs/
+!autogpt_platform/autogpt_libs/pyproject.toml
+!autogpt_platform/autogpt_libs/poetry.lock
+!autogpt_platform/autogpt_libs/README.md

 # Platform - Backend
-!autogpt_platform/backend/
+!autogpt_platform/backend/backend/
+!autogpt_platform/backend/test/e2e_test_data.py
+!autogpt_platform/backend/migrations/
+!autogpt_platform/backend/schema.prisma
+!autogpt_platform/backend/pyproject.toml
+!autogpt_platform/backend/poetry.lock
+!autogpt_platform/backend/README.md
+!autogpt_platform/backend/.env
+!autogpt_platform/backend/gen_prisma_types_stub.py
+
+# Platform - Market
+!autogpt_platform/market/market/
+!autogpt_platform/market/scripts.py
+!autogpt_platform/market/schema.prisma
+!autogpt_platform/market/pyproject.toml
+!autogpt_platform/market/poetry.lock
+!autogpt_platform/market/README.md

 # Platform - Frontend
-!autogpt_platform/frontend/
+!autogpt_platform/frontend/src/
+!autogpt_platform/frontend/public/
+!autogpt_platform/frontend/scripts/
+!autogpt_platform/frontend/package.json
+!autogpt_platform/frontend/pnpm-lock.yaml
+!autogpt_platform/frontend/tsconfig.json
+!autogpt_platform/frontend/README.md
+## config
+!autogpt_platform/frontend/*.config.*
+!autogpt_platform/frontend/.env.*
+!autogpt_platform/frontend/.env

 # Classic - AutoGPT
 !classic/original_autogpt/autogpt/
@@ -35,38 +64,6 @@
 # Classic - Frontend
 !classic/frontend/build/web/

-# Explicitly re-ignore unwanted files from whitelisted directories
-# Note: These patterns MUST come after the whitelist rules to take effect
-
-# Hidden files and directories (but keep frontend .env files needed for build)
-**/.*
-!autogpt_platform/frontend/.env
-!autogpt_platform/frontend/.env.default
-!autogpt_platform/frontend/.env.production
-
-# Python artifacts
-**/__pycache__/
-**/*.pyc
-**/*.pyo
-**/.venv/
-**/.ruff_cache/
-**/.pytest_cache/
-**/.coverage
-**/htmlcov/
-
-# Node artifacts
-**/node_modules/
-**/.next/
-**/storybook-static/
-**/playwright-report/
-**/test-results/
-
-# Build artifacts
-**/dist/
-**/build/
-!autogpt_platform/frontend/src/**/build/
-**/target/
-
-# Logs and temp files
-**/*.log
-**/*.tmp
+# Explicitly re-ignore some folders
+.*
+**/__pycache__
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -160,7 +160,7 @@ pnpm storybook                      # Start component development server

 **Backend Entry Points:**

- `backend/backend/api/rest_api.py` - FastAPI application setup
+- `backend/backend/server/server.py` - FastAPI application setup
 - `backend/backend/data/` - Database models and user management
 - `backend/blocks/` - Agent execution blocks and logic

@@ -219,7 +219,7 @@ Agents are built using a visual block-based system where each block performs a s

 ### API Development

-1. Update routes in `/backend/backend/api/features/`
+1. Update routes in `/backend/backend/server/routers/`
 2. Add/update Pydantic models in same directory
 3. Write tests alongside route files
 4. For `data/*.py` changes, validate user ID checks
@@ -285,7 +285,7 @@ Agents are built using a visual block-based system where each block performs a s

 ### Security Guidelines

-**Cache Protection Middleware** (`/backend/backend/api/middleware/security.py`):
+**Cache Protection Middleware** (`/backend/backend/server/middleware/security.py`):

 - Default: Disables caching for ALL endpoints with `Cache-Control: no-store, no-cache, must-revalidate, private`
 - Uses allow list approach for cacheable paths (static assets, health checks, public pages)
--- a/.github/scripts/detect_overlaps.py
+++ b/.github/scripts/detect_overlaps.py
--- a/.github/workflows/classic-frontend-ci.yml
+++ b/.github/workflows/classic-frontend-ci.yml
@@ -49,7 +49,7 @@ jobs:

      - name: Create PR ${{ env.BUILD_BRANCH }} -> ${{ github.ref_name }}
        if: github.event_name == 'push'
-        uses: peter-evans/create-pull-request@v8
+        uses: peter-evans/create-pull-request@v7
        with:
          add-paths: classic/frontend/build/web
          base: ${{ github.ref_name }}
--- a/.github/workflows/claude-ci-failure-auto-fix.yml
+++ b/.github/workflows/claude-ci-failure-auto-fix.yml
@@ -22,7 +22,7 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.workflow_run.head_branch }}
          fetch-depth: 0
@@ -40,51 +40,9 @@ jobs:
          git checkout -b "$BRANCH_NAME"
          echo "branch_name=$BRANCH_NAME" >> $GITHUB_OUTPUT

-      # Backend Python/Poetry setup (so Claude can run linting/tests)
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: "3.11"
-
-      - name: Set up Python dependency cache
-        uses: actions/cache@v5
-        with:
-          path: ~/.cache/pypoetry
-          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
-
-      - name: Install Poetry
-        run: |
-          cd autogpt_platform/backend
-          HEAD_POETRY_VERSION=$(python3 ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
-          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$HEAD_POETRY_VERSION python3 -
-          echo "$HOME/.local/bin" >> $GITHUB_PATH
-
-      - name: Install Python dependencies
-        working-directory: autogpt_platform/backend
-        run: poetry install
-
-      - name: Generate Prisma Client
-        working-directory: autogpt_platform/backend
-        run: poetry run prisma generate && poetry run gen-prisma-stub
-
-      # Frontend Node.js/pnpm setup (so Claude can run linting/tests)
-      - name: Enable corepack
-        run: corepack enable
-
-      - name: Set up Node.js
-        uses: actions/setup-node@v6
-        with:
-          node-version: "22"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
-
-      - name: Install JavaScript dependencies
-        working-directory: autogpt_platform/frontend
-        run: pnpm install --frozen-lockfile
-
      - name: Get CI failure details
        id: failure_details
-        uses: actions/github-script@v8
+        uses: actions/github-script@v7
        with:
          script: |
            const run = await github.rest.actions.getWorkflowRun({
--- a/.github/workflows/claude-dependabot.yml
+++ b/.github/workflows/claude-dependabot.yml
@@ -30,7 +30,7 @@ jobs:
      actions: read # Required for CI access
    steps:
      - name: Checkout code
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          fetch-depth: 1

@@ -41,7 +41,7 @@ jobs:
          python-version: "3.11"  # Use standard version matching CI

      - name: Set up Python dependency cache
-        uses: actions/cache@v5
+        uses: actions/cache@v4
        with:
          path: ~/.cache/pypoetry
          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
@@ -77,15 +77,27 @@ jobs:
        run: poetry run prisma generate && poetry run gen-prisma-stub

      # Frontend Node.js/pnpm setup (mirrors platform-frontend-ci.yml)
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: "22"
+
      - name: Enable corepack
        run: corepack enable

-      - name: Set up Node.js
-        uses: actions/setup-node@v6
+      - name: Set pnpm store directory
+        run: |
+          pnpm config set store-dir ~/.pnpm-store
+          echo "PNPM_HOME=$HOME/.pnpm-store" >> $GITHUB_ENV
+
+      - name: Cache frontend dependencies
+        uses: actions/cache@v4
        with:
-          node-version: "22"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+          path: ~/.pnpm-store
+          key: ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
+            ${{ runner.os }}-pnpm-

      - name: Install JavaScript dependencies
        working-directory: autogpt_platform/frontend
@@ -112,7 +124,7 @@ jobs:
      # Phase 1: Cache and load Docker images for faster setup
      - name: Set up Docker image cache
        id: docker-cache
-        uses: actions/cache@v5
+        uses: actions/cache@v4
        with:
          path: ~/docker-cache
          # Use a versioned key for cache invalidation when image list changes
@@ -297,7 +309,6 @@ jobs:
        uses: anthropics/claude-code-action@v1
        with:
          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
-          allowed_bots: "dependabot[bot]"
          claude_args: |
            --allowedTools "Bash(npm:*),Bash(pnpm:*),Bash(poetry:*),Bash(git:*),Edit,Replace,NotebookEditCell,mcp__github_inline_comment__create_inline_comment,Bash(gh pr comment:*), Bash(gh pr diff:*), Bash(gh pr view:*)"
          prompt: |
--- a/.github/workflows/claude.yml
+++ b/.github/workflows/claude.yml
@@ -40,7 +40,7 @@ jobs:
      actions: read # Required for CI access
    steps:
      - name: Checkout code
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          fetch-depth: 1

@@ -57,7 +57,7 @@ jobs:
          python-version: "3.11"  # Use standard version matching CI

      - name: Set up Python dependency cache
-        uses: actions/cache@v5
+        uses: actions/cache@v4
        with:
          path: ~/.cache/pypoetry
          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
@@ -93,15 +93,27 @@ jobs:
        run: poetry run prisma generate && poetry run gen-prisma-stub

      # Frontend Node.js/pnpm setup (mirrors platform-frontend-ci.yml)
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: "22"
+
      - name: Enable corepack
        run: corepack enable

-      - name: Set up Node.js
-        uses: actions/setup-node@v6
+      - name: Set pnpm store directory
+        run: |
+          pnpm config set store-dir ~/.pnpm-store
+          echo "PNPM_HOME=$HOME/.pnpm-store" >> $GITHUB_ENV
+
+      - name: Cache frontend dependencies
+        uses: actions/cache@v4
        with:
-          node-version: "22"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+          path: ~/.pnpm-store
+          key: ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
+            ${{ runner.os }}-pnpm-

      - name: Install JavaScript dependencies
        working-directory: autogpt_platform/frontend
@@ -128,7 +140,7 @@ jobs:
      # Phase 1: Cache and load Docker images for faster setup
      - name: Set up Docker image cache
        id: docker-cache
-        uses: actions/cache@v5
+        uses: actions/cache@v4
        with:
          path: ~/docker-cache
          # Use a versioned key for cache invalidation when image list changes
--- a/.github/workflows/codeql.yml
+++ b/.github/workflows/codeql.yml
@@ -58,11 +58,11 @@ jobs:
        # your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages
    steps:
    - name: Checkout repository
-      uses: actions/checkout@v6
+      uses: actions/checkout@v4

    # Initializes the CodeQL tools for scanning.
    - name: Initialize CodeQL
-      uses: github/codeql-action/init@v4
+      uses: github/codeql-action/init@v3
      with:
        languages: ${{ matrix.language }}
        build-mode: ${{ matrix.build-mode }}
@@ -93,6 +93,6 @@ jobs:
        exit 1

    - name: Perform CodeQL Analysis
-      uses: github/codeql-action/analyze@v4
+      uses: github/codeql-action/analyze@v3
      with:
        category: "/language:${{matrix.language}}"
--- a/.github/workflows/copilot-setup-steps.yml
+++ b/.github/workflows/copilot-setup-steps.yml
@@ -27,7 +27,7 @@ jobs:
    # If you do not check out your code, Copilot will do this for you.
    steps:
      - name: Checkout code
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          fetch-depth: 0
          submodules: true
@@ -39,7 +39,7 @@ jobs:
          python-version: "3.11"  # Use standard version matching CI

      - name: Set up Python dependency cache
-        uses: actions/cache@v5
+        uses: actions/cache@v4
        with:
          path: ~/.cache/pypoetry
          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
@@ -76,7 +76,7 @@ jobs:

      # Frontend Node.js/pnpm setup (mirrors platform-frontend-ci.yml)
      - name: Set up Node.js
-        uses: actions/setup-node@v6
+        uses: actions/setup-node@v4
        with:
          node-version: "22"

@@ -89,7 +89,7 @@ jobs:
          echo "PNPM_HOME=$HOME/.pnpm-store" >> $GITHUB_ENV

      - name: Cache frontend dependencies
-        uses: actions/cache@v5
+        uses: actions/cache@v4
        with:
          path: ~/.pnpm-store
          key: ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}
@@ -132,7 +132,7 @@ jobs:
      # Phase 1: Cache and load Docker images for faster setup
      - name: Set up Docker image cache
        id: docker-cache
-        uses: actions/cache@v5
+        uses: actions/cache@v4
        with:
          path: ~/docker-cache
          # Use a versioned key for cache invalidation when image list changes
--- a/.github/workflows/docs-block-sync.yml
+++ b/.github/workflows/docs-block-sync.yml
@@ -23,7 +23,7 @@ jobs:

    steps:
      - name: Checkout code
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          fetch-depth: 1

@@ -33,7 +33,7 @@ jobs:
          python-version: "3.11"

      - name: Set up Python dependency cache
-        uses: actions/cache@v5
+        uses: actions/cache@v4
        with:
          path: ~/.cache/pypoetry
          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
--- a/.github/workflows/docs-claude-review.yml
+++ b/.github/workflows/docs-claude-review.yml
@@ -7,10 +7,6 @@ on:
      - "docs/integrations/**"
      - "autogpt_platform/backend/backend/blocks/**"

-concurrency:
-  group: claude-docs-review-${{ github.event.pull_request.number }}
-  cancel-in-progress: true
-
 jobs:
  claude-review:
    # Only run for PRs from members/collaborators
@@ -27,7 +23,7 @@ jobs:

    steps:
      - name: Checkout code
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          fetch-depth: 0

@@ -37,7 +33,7 @@ jobs:
          python-version: "3.11"

      - name: Set up Python dependency cache
-        uses: actions/cache@v5
+        uses: actions/cache@v4
        with:
          path: ~/.cache/pypoetry
          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
@@ -95,35 +91,5 @@ jobs:
            3. Read corresponding documentation files to verify accuracy
            4. Provide your feedback as a PR comment

-            ## IMPORTANT: Comment Marker
-            Start your PR comment with exactly this HTML comment marker on its own line:
-            <!-- CLAUDE_DOCS_REVIEW -->
-
-            This marker is used to identify and replace your comment on subsequent runs.
-
            Be constructive and specific. If everything looks good, say so!
            If there are issues, explain what's wrong and suggest how to fix it.
-
-      - name: Delete old Claude review comments
-        env:
-          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-        run: |
-          # Get all comment IDs with our marker, sorted by creation date (oldest first)
-          COMMENT_IDS=$(gh api \
-            repos/${{ github.repository }}/issues/${{ github.event.pull_request.number }}/comments \
-            --jq '[.[] | select(.body | contains("<!-- CLAUDE_DOCS_REVIEW -->"))] | sort_by(.created_at) | .[].id')
-
-          # Count comments
-          COMMENT_COUNT=$(echo "$COMMENT_IDS" | grep -c . || true)
-
-          if [ "$COMMENT_COUNT" -gt 1 ]; then
-            # Delete all but the last (newest) comment
-            echo "$COMMENT_IDS" | head -n -1 | while read -r COMMENT_ID; do
-              if [ -n "$COMMENT_ID" ]; then
-                echo "Deleting old review comment: $COMMENT_ID"
-                gh api -X DELETE repos/${{ github.repository }}/issues/comments/$COMMENT_ID
-              fi
-            done
-          else
-            echo "No old review comments to clean up"
-          fi
--- a/.github/workflows/docs-enhance.yml
+++ b/.github/workflows/docs-enhance.yml
@@ -28,7 +28,7 @@ jobs:

    steps:
      - name: Checkout code
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          fetch-depth: 1

@@ -38,7 +38,7 @@ jobs:
          python-version: "3.11"

      - name: Set up Python dependency cache
-        uses: actions/cache@v5
+        uses: actions/cache@v4
        with:
          path: ~/.cache/pypoetry
          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
--- a/.github/workflows/platform-autogpt-deploy-dev.yaml
+++ b/.github/workflows/platform-autogpt-deploy-dev.yaml
@@ -25,7 +25,7 @@ jobs:

    steps:
      - name: Checkout code
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.inputs.git_ref || github.ref_name }}

@@ -52,7 +52,7 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Trigger deploy workflow
-        uses: peter-evans/repository-dispatch@v4
+        uses: peter-evans/repository-dispatch@v3
        with:
          token: ${{ secrets.DEPLOY_TOKEN }}
          repository: Significant-Gravitas/AutoGPT_cloud_infrastructure
--- a/.github/workflows/platform-autogpt-deploy-prod.yml
+++ b/.github/workflows/platform-autogpt-deploy-prod.yml
@@ -17,7 +17,7 @@ jobs:

    steps:
      - name: Checkout code
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          ref: ${{ github.ref_name || 'master' }}

@@ -45,7 +45,7 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: Trigger deploy workflow
-        uses: peter-evans/repository-dispatch@v4
+        uses: peter-evans/repository-dispatch@v3
        with:
          token: ${{ secrets.DEPLOY_TOKEN }}
          repository: Significant-Gravitas/AutoGPT_cloud_infrastructure
--- a/.github/workflows/platform-backend-ci.yml
+++ b/.github/workflows/platform-backend-ci.yml
@@ -41,18 +41,13 @@ jobs:
        ports:
          - 6379:6379
      rabbitmq:
-        image: rabbitmq:4.1.4
+        image: rabbitmq:3.12-management
        ports:
          - 5672:5672
+          - 15672:15672
        env:
          RABBITMQ_DEFAULT_USER: ${{ env.RABBITMQ_DEFAULT_USER }}
          RABBITMQ_DEFAULT_PASS: ${{ env.RABBITMQ_DEFAULT_PASS }}
-        options: >-
-          --health-cmd "rabbitmq-diagnostics -q ping"
-          --health-interval 30s
-          --health-timeout 10s
-          --health-retries 5
-          --health-start-period 10s
      clamav:
        image: clamav/clamav-debian:latest
        ports:
@@ -73,7 +68,7 @@ jobs:

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          fetch-depth: 0
          submodules: true
@@ -93,7 +88,7 @@ jobs:
        run: echo "date=$(date +'%Y-%m-%d')" >> $GITHUB_OUTPUT

      - name: Set up Python dependency cache
-        uses: actions/cache@v5
+        uses: actions/cache@v4
        with:
          path: ~/.cache/pypoetry
          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
--- a/.github/workflows/platform-dev-deploy-event-dispatcher.yml
+++ b/.github/workflows/platform-dev-deploy-event-dispatcher.yml
@@ -17,7 +17,7 @@ jobs:
      - name: Check comment permissions and deployment status
        id: check_status
        if: github.event_name == 'issue_comment' && github.event.issue.pull_request
-        uses: actions/github-script@v8
+        uses: actions/github-script@v7
        with:
          script: |
            const commentBody = context.payload.comment.body.trim();
@@ -55,7 +55,7 @@ jobs:

      - name: Post permission denied comment
        if: steps.check_status.outputs.permission_denied == 'true'
-        uses: actions/github-script@v8
+        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.createComment({
@@ -68,7 +68,7 @@ jobs:
      - name: Get PR details for deployment
        id: pr_details
        if: steps.check_status.outputs.should_deploy == 'true' || steps.check_status.outputs.should_undeploy == 'true'
-        uses: actions/github-script@v8
+        uses: actions/github-script@v7
        with:
          script: |
            const pr = await github.rest.pulls.get({
@@ -82,7 +82,7 @@ jobs:
          
      - name: Dispatch Deploy Event
        if: steps.check_status.outputs.should_deploy == 'true'
-        uses: peter-evans/repository-dispatch@v4
+        uses: peter-evans/repository-dispatch@v3
        with:
          token: ${{ secrets.DISPATCH_TOKEN }}
          repository: Significant-Gravitas/AutoGPT_cloud_infrastructure
@@ -98,7 +98,7 @@ jobs:

      - name: Post deploy success comment
        if: steps.check_status.outputs.should_deploy == 'true'
-        uses: actions/github-script@v8
+        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.createComment({
@@ -110,7 +110,7 @@ jobs:

      - name: Dispatch Undeploy Event (from comment)
        if: steps.check_status.outputs.should_undeploy == 'true'
-        uses: peter-evans/repository-dispatch@v4
+        uses: peter-evans/repository-dispatch@v3
        with:
          token: ${{ secrets.DISPATCH_TOKEN }}
          repository: Significant-Gravitas/AutoGPT_cloud_infrastructure
@@ -126,7 +126,7 @@ jobs:

      - name: Post undeploy success comment
        if: steps.check_status.outputs.should_undeploy == 'true'
-        uses: actions/github-script@v8
+        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.createComment({
@@ -139,7 +139,7 @@ jobs:
      - name: Check deployment status on PR close
        id: check_pr_close
        if: github.event_name == 'pull_request' && github.event.action == 'closed'
-        uses: actions/github-script@v8
+        uses: actions/github-script@v7
        with:
          script: |
            const comments = await github.rest.issues.listComments({
@@ -168,7 +168,7 @@ jobs:
          github.event_name == 'pull_request' &&
          github.event.action == 'closed' &&
          steps.check_pr_close.outputs.should_undeploy == 'true'
-        uses: peter-evans/repository-dispatch@v4
+        uses: peter-evans/repository-dispatch@v3
        with:
          token: ${{ secrets.DISPATCH_TOKEN }}
          repository: Significant-Gravitas/AutoGPT_cloud_infrastructure
@@ -187,7 +187,7 @@ jobs:
          github.event_name == 'pull_request' &&
          github.event.action == 'closed' &&
          steps.check_pr_close.outputs.should_undeploy == 'true'
-        uses: actions/github-script@v8
+        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.createComment({
--- a/.github/workflows/platform-frontend-ci.yml
+++ b/.github/workflows/platform-frontend-ci.yml
@@ -6,16 +6,10 @@ on:
    paths:
      - ".github/workflows/platform-frontend-ci.yml"
      - "autogpt_platform/frontend/**"
-      - "autogpt_platform/backend/Dockerfile"
-      - "autogpt_platform/docker-compose.yml"
-      - "autogpt_platform/docker-compose.platform.yml"
  pull_request:
    paths:
      - ".github/workflows/platform-frontend-ci.yml"
      - "autogpt_platform/frontend/**"
-      - "autogpt_platform/backend/Dockerfile"
-      - "autogpt_platform/docker-compose.yml"
-      - "autogpt_platform/docker-compose.platform.yml"
  merge_group:
  workflow_dispatch:

@@ -32,31 +26,34 @@ jobs:
  setup:
    runs-on: ubuntu-latest
    outputs:
-      components-changed: ${{ steps.filter.outputs.components }}
+      cache-key: ${{ steps.cache-key.outputs.key }}

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4

-      - name: Check for component changes
-        uses: dorny/paths-filter@v3
-        id: filter
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
        with:
-          filters: |
-            components:
-              - 'autogpt_platform/frontend/src/components/**'
+          node-version: "22.18.0"

      - name: Enable corepack
        run: corepack enable

-      - name: Set up Node
-        uses: actions/setup-node@v6
-        with:
-          node-version: "22.18.0"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+      - name: Generate cache key
+        id: cache-key
+        run: echo "key=${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}" >> $GITHUB_OUTPUT

-      - name: Install dependencies to populate cache
+      - name: Cache dependencies
+        uses: actions/cache@v4
+        with:
+          path: ~/.pnpm-store
+          key: ${{ steps.cache-key.outputs.key }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
+            ${{ runner.os }}-pnpm-
+
+      - name: Install dependencies
        run: pnpm install --frozen-lockfile

  lint:
@@ -65,17 +62,24 @@ jobs:

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
+
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: "22.18.0"

      - name: Enable corepack
        run: corepack enable

-      - name: Set up Node
-        uses: actions/setup-node@v6
+      - name: Restore dependencies cache
+        uses: actions/cache@v4
        with:
-          node-version: "22.18.0"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+          path: ~/.pnpm-store
+          key: ${{ needs.setup.outputs.cache-key }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
+            ${{ runner.os }}-pnpm-

      - name: Install dependencies
        run: pnpm install --frozen-lockfile
@@ -86,27 +90,31 @@ jobs:
  chromatic:
    runs-on: ubuntu-latest
    needs: setup
-    # Disabled: to re-enable, remove 'false &&' from the condition below
-    if: >-
-      false
-      && (github.ref == 'refs/heads/dev' || github.base_ref == 'dev')
-      && needs.setup.outputs.components-changed == 'true'
+    # Only run on dev branch pushes or PRs targeting dev
+    if: github.ref == 'refs/heads/dev' || github.base_ref == 'dev'

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          fetch-depth: 0

+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: "22.18.0"
+
      - name: Enable corepack
        run: corepack enable

-      - name: Set up Node
-        uses: actions/setup-node@v6
+      - name: Restore dependencies cache
+        uses: actions/cache@v4
        with:
-          node-version: "22.18.0"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+          path: ~/.pnpm-store
+          key: ${{ needs.setup.outputs.cache-key }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
+            ${{ runner.os }}-pnpm-

      - name: Install dependencies
        run: pnpm install --frozen-lockfile
@@ -121,20 +129,30 @@ jobs:
          exitOnceUploaded: true

  e2e_test:
-    name: end-to-end tests
    runs-on: big-boi
+    needs: setup
+    strategy:
+      fail-fast: false

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          submodules: recursive

-      - name: Set up Platform - Copy default supabase .env
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: "22.18.0"
+
+      - name: Enable corepack
+        run: corepack enable
+
+      - name: Copy default supabase .env
        run: |
          cp ../.env.default ../.env

-      - name: Set up Platform - Copy backend .env and set OpenAI API key
+      - name: Copy backend .env and set OpenAI API key
        run: |
          cp ../backend/.env.default ../backend/.env
          echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
@@ -142,125 +160,77 @@ jobs:
          # Used by E2E test data script to generate embeddings for approved store agents
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

-      - name: Set up Platform - Set up Docker Buildx
+      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
+
+      - name: Cache Docker layers
+        uses: actions/cache@v4
        with:
-          driver: docker-container
-          driver-opts: network=host
+          path: /tmp/.buildx-cache
+          key: ${{ runner.os }}-buildx-frontend-test-${{ hashFiles('autogpt_platform/docker-compose.yml', 'autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/pyproject.toml', 'autogpt_platform/backend/poetry.lock') }}
+          restore-keys: |
+            ${{ runner.os }}-buildx-frontend-test-

-      - name: Set up Platform - Expose GHA cache to docker buildx CLI
-        uses: crazy-max/ghaction-github-runtime@v3
-
-      - name: Set up Platform - Build Docker images (with cache)
-        working-directory: autogpt_platform
+      - name: Run docker compose
        run: |
-          pip install pyyaml
-
-          # Resolve extends and generate a flat compose file that bake can understand
-          docker compose -f docker-compose.yml config > docker-compose.resolved.yml
-
-          # Add cache configuration to the resolved compose file
-          python ../.github/workflows/scripts/docker-ci-fix-compose-build-cache.py \
-            --source docker-compose.resolved.yml \
-            --cache-from "type=gha" \
-            --cache-to "type=gha,mode=max" \
-            --backend-hash "${{ hashFiles('autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/poetry.lock', 'autogpt_platform/backend/backend') }}" \
-            --frontend-hash "${{ hashFiles('autogpt_platform/frontend/Dockerfile', 'autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/src') }}" \
-            --git-ref "${{ github.ref }}"
-
-          # Build with bake using the resolved compose file (now includes cache config)
-          docker buildx bake --allow=fs.read=.. -f docker-compose.resolved.yml --load
+          NEXT_PUBLIC_PW_TEST=true docker compose -f ../docker-compose.yml up -d
        env:
-          NEXT_PUBLIC_PW_TEST: true
+          DOCKER_BUILDKIT: 1
+          BUILDX_CACHE_FROM: type=local,src=/tmp/.buildx-cache
+          BUILDX_CACHE_TO: type=local,dest=/tmp/.buildx-cache-new,mode=max

-      - name: Set up tests - Cache E2E test data
-        id: e2e-data-cache
-        uses: actions/cache@v5
-        with:
-          path: /tmp/e2e_test_data.sql
-          key: e2e-test-data-${{ hashFiles('autogpt_platform/backend/test/e2e_test_data.py', 'autogpt_platform/backend/migrations/**', '.github/workflows/platform-frontend-ci.yml') }}
-
-      - name: Set up Platform - Start Supabase DB + Auth
+      - name: Move cache
        run: |
-          docker compose -f ../docker-compose.resolved.yml up -d db auth --no-build
-          echo "Waiting for database to be ready..."
-          timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done'
-          echo "Waiting for auth service to be ready..."
-          timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -c "SELECT 1 FROM auth.users LIMIT 1" 2>/dev/null; do sleep 2; done' || echo "Auth schema check timeout, continuing..."
+          rm -rf /tmp/.buildx-cache
+          if [ -d "/tmp/.buildx-cache-new" ]; then
+            mv /tmp/.buildx-cache-new /tmp/.buildx-cache
+          fi

-      - name: Set up Platform - Run migrations
+      - name: Wait for services to be ready
        run: |
-          echo "Running migrations..."
-          docker compose -f ../docker-compose.resolved.yml run --rm migrate
-          echo "✅ Migrations completed"
-        env:
-          NEXT_PUBLIC_PW_TEST: true
-
-      - name: Set up tests - Load cached E2E test data
-        if: steps.e2e-data-cache.outputs.cache-hit == 'true'
-        run: |
-          echo "✅ Found cached E2E test data, restoring..."
-          {
-            echo "SET session_replication_role = 'replica';"
-            cat /tmp/e2e_test_data.sql
-            echo "SET session_replication_role = 'origin';"
-          } | docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -b
-          # Refresh materialized views after restore
-          docker compose -f ../docker-compose.resolved.yml exec -T db \
-            psql -U postgres -d postgres -b -c "SET search_path TO platform; SELECT refresh_store_materialized_views();" || true
-
-          echo "✅ E2E test data restored from cache"
-
-      - name: Set up Platform - Start (all other services)
-        run: |
-          docker compose -f ../docker-compose.resolved.yml up -d --no-build
          echo "Waiting for rest_server to be ready..."
          timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
-        env:
-          NEXT_PUBLIC_PW_TEST: true
+          echo "Waiting for database to be ready..."
+          timeout 60 sh -c 'until docker compose -f ../docker-compose.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done' || echo "Database ready check timeout, continuing..."

-      - name: Set up tests - Create E2E test data
-        if: steps.e2e-data-cache.outputs.cache-hit != 'true'
+      - name: Create E2E test data
        run: |
          echo "Creating E2E test data..."
-          docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.resolved.yml ps -q rest_server):/tmp/e2e_test_data.py
-          docker compose -f ../docker-compose.resolved.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
-            echo "❌ E2E test data creation failed!"
-            docker compose -f ../docker-compose.resolved.yml logs --tail=50 rest_server
-            exit 1
-          }
+          # First try to run the script from inside the container
+          if docker compose -f ../docker-compose.yml exec -T rest_server test -f /app/autogpt_platform/backend/test/e2e_test_data.py; then
+            echo "✅ Found e2e_test_data.py in container, running it..."
+            docker compose -f ../docker-compose.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python backend/test/e2e_test_data.py" || {
+              echo "❌ E2E test data creation failed!"
+              docker compose -f ../docker-compose.yml logs --tail=50 rest_server
+              exit 1
+            }
+          else
+            echo "⚠️ e2e_test_data.py not found in container, copying and running..."
+            # Copy the script into the container and run it
+            docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.yml ps -q rest_server):/tmp/e2e_test_data.py || {
+              echo "❌ Failed to copy script to container"
+              exit 1
+            }
+            docker compose -f ../docker-compose.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
+              echo "❌ E2E test data creation failed!"
+              docker compose -f ../docker-compose.yml logs --tail=50 rest_server
+              exit 1
+            }
+          fi

-          # Dump auth.users + platform schema for cache (two separate dumps)
-          echo "Dumping database for cache..."
-          {
-            docker compose -f ../docker-compose.resolved.yml exec -T db \
-              pg_dump -U postgres --data-only --column-inserts \
-              --table='auth.users' postgres
-            docker compose -f ../docker-compose.resolved.yml exec -T db \
-              pg_dump -U postgres --data-only --column-inserts \
-              --schema=platform \
-              --exclude-table='platform._prisma_migrations' \
-              --exclude-table='platform.apscheduler_jobs' \
-              --exclude-table='platform.apscheduler_jobs_batched_notifications' \
-              postgres
-          } > /tmp/e2e_test_data.sql
-
-          echo "✅ Database dump created for caching ($(wc -l < /tmp/e2e_test_data.sql) lines)"
-
-      - name: Set up tests - Enable corepack
-        run: corepack enable
-
-      - name: Set up tests - Set up Node
-        uses: actions/setup-node@v6
+      - name: Restore dependencies cache
+        uses: actions/cache@v4
        with:
-          node-version: "22.18.0"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+          path: ~/.pnpm-store
+          key: ${{ needs.setup.outputs.cache-key }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
+            ${{ runner.os }}-pnpm-

-      - name: Set up tests - Install dependencies
+      - name: Install dependencies
        run: pnpm install --frozen-lockfile

-      - name: Set up tests - Install browser 'chromium'
+      - name: Install Browser 'chromium'
        run: pnpm playwright install --with-deps chromium

      - name: Run Playwright tests
@@ -287,7 +257,7 @@ jobs:

      - name: Print Final Docker Compose logs
        if: always()
-        run: docker compose -f ../docker-compose.resolved.yml logs
+        run: docker compose -f ../docker-compose.yml logs

  integration_test:
    runs-on: ubuntu-latest
@@ -295,19 +265,26 @@ jobs:

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          submodules: recursive

+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: "22.18.0"
+
      - name: Enable corepack
        run: corepack enable

-      - name: Set up Node
-        uses: actions/setup-node@v6
+      - name: Restore dependencies cache
+        uses: actions/cache@v4
        with:
-          node-version: "22.18.0"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+          path: ~/.pnpm-store
+          key: ${{ needs.setup.outputs.cache-key }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
+            ${{ runner.os }}-pnpm-

      - name: Install dependencies
        run: pnpm install --frozen-lockfile
--- a/.github/workflows/platform-fullstack-ci.yml
+++ b/.github/workflows/platform-fullstack-ci.yml
@@ -29,10 +29,10 @@ jobs:

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4

      - name: Set up Node.js
-        uses: actions/setup-node@v6
+        uses: actions/setup-node@v4
        with:
          node-version: "22.18.0"

@@ -44,7 +44,7 @@ jobs:
        run: echo "key=${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}" >> $GITHUB_OUTPUT

      - name: Cache dependencies
-        uses: actions/cache@v5
+        uses: actions/cache@v4
        with:
          path: ~/.pnpm-store
          key: ${{ steps.cache-key.outputs.key }}
@@ -56,19 +56,19 @@ jobs:
        run: pnpm install --frozen-lockfile

  types:
-    runs-on: big-boi
+    runs-on: ubuntu-latest
    needs: setup
    strategy:
      fail-fast: false

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v6
+        uses: actions/checkout@v4
        with:
          submodules: recursive

      - name: Set up Node.js
-        uses: actions/setup-node@v6
+        uses: actions/setup-node@v4
        with:
          node-version: "22.18.0"

@@ -85,10 +85,10 @@ jobs:

      - name: Run docker compose
        run: |
-          docker compose -f ../docker-compose.yml --profile local up -d deps_backend
+          docker compose -f ../docker-compose.yml --profile local --profile deps_backend up -d

      - name: Restore dependencies cache
-        uses: actions/cache@v5
+        uses: actions/cache@v4
        with:
          path: ~/.pnpm-store
          key: ${{ needs.setup.outputs.cache-key }}
--- a/.github/workflows/pr-overlap-check.yml
+++ b/.github/workflows/pr-overlap-check.yml
@@ -1,39 +0,0 @@
-name: PR Overlap Detection
-
-on:
-  pull_request:
-    types: [opened, synchronize, reopened]
-    branches:
-      - dev
-      - master
-
-permissions:
-  contents: read
-  pull-requests: write
-
-jobs:
-  check-overlaps:
-    runs-on: ubuntu-latest
-    steps:
-      - name: Checkout repository
-        uses: actions/checkout@v4
-        with:
-          fetch-depth: 0  # Need full history for merge testing
-
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: '3.11'
-
-      - name: Configure git
-        run: |
-          git config user.email "github-actions[bot]@users.noreply.github.com"
-          git config user.name "github-actions[bot]"
-
-      - name: Run overlap detection
-        env:
-          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-        # Always succeed - this check informs contributors, it shouldn't block merging
-        continue-on-error: true
-        run: |
-          python .github/scripts/detect_overlaps.py ${{ github.event.pull_request.number }}
--- a/.github/workflows/repo-workflow-checker.yml
+++ b/.github/workflows/repo-workflow-checker.yml
@@ -11,7 +11,7 @@ jobs:
    steps:
      # - name: Wait some time for all actions to start
      #   run: sleep 30
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@v4
        # with:
          # fetch-depth: 0
      - name: Set up Python
--- a/.github/workflows/scripts/docker-ci-fix-compose-build-cache.py
+++ b/.github/workflows/scripts/docker-ci-fix-compose-build-cache.py
@@ -1,195 +0,0 @@
-#!/usr/bin/env python3
-"""
-Add cache configuration to a resolved docker-compose file for all services
-that have a build key, and ensure image names match what docker compose expects.
-"""
-
-import argparse
-
-import yaml
-
-
-DEFAULT_BRANCH = "dev"
-CACHE_BUILDS_FOR_COMPONENTS = ["backend", "frontend"]
-
-
-def main():
-    parser = argparse.ArgumentParser(
-        description="Add cache config to a resolved compose file"
-    )
-    parser.add_argument(
-        "--source",
-        required=True,
-        help="Source compose file to read (should be output of `docker compose config`)",
-    )
-    parser.add_argument(
-        "--cache-from",
-        default="type=gha",
-        help="Cache source configuration",
-    )
-    parser.add_argument(
-        "--cache-to",
-        default="type=gha,mode=max",
-        help="Cache destination configuration",
-    )
-    for component in CACHE_BUILDS_FOR_COMPONENTS:
-        parser.add_argument(
-            f"--{component}-hash",
-            default="",
-            help=f"Hash for {component} cache scope (e.g., from hashFiles())",
-        )
-    parser.add_argument(
-        "--git-ref",
-        default="",
-        help="Git ref for branch-based cache scope (e.g., refs/heads/master)",
-    )
-    args = parser.parse_args()
-
-    # Normalize git ref to a safe scope name (e.g., refs/heads/master -> master)
-    git_ref_scope = ""
-    if args.git_ref:
-        git_ref_scope = args.git_ref.replace("refs/heads/", "").replace("/", "-")
-
-    with open(args.source, "r") as f:
-        compose = yaml.safe_load(f)
-
-    # Get project name from compose file or default
-    project_name = compose.get("name", "autogpt_platform")
-
-    def get_image_name(dockerfile: str, target: str) -> str:
-        """Generate image name based on Dockerfile folder and build target."""
-        dockerfile_parts = dockerfile.replace("\\", "/").split("/")
-        if len(dockerfile_parts) >= 2:
-            folder_name = dockerfile_parts[-2]  # e.g., "backend" or "frontend"
-        else:
-            folder_name = "app"
-        return f"{project_name}-{folder_name}:{target}"
-
-    def get_build_key(dockerfile: str, target: str) -> str:
-        """Generate a unique key for a Dockerfile+target combination."""
-        return f"{dockerfile}:{target}"
-
-    def get_component(dockerfile: str) -> str | None:
-        """Get component name (frontend/backend) from dockerfile path."""
-        for component in CACHE_BUILDS_FOR_COMPONENTS:
-            if component in dockerfile:
-                return component
-        return None
-
-    # First pass: collect all services with build configs and identify duplicates
-    # Track which (dockerfile, target) combinations we've seen
-    build_key_to_first_service: dict[str, str] = {}
-    services_to_build: list[str] = []
-    services_to_dedupe: list[str] = []
-
-    for service_name, service_config in compose.get("services", {}).items():
-        if "build" not in service_config:
-            continue
-
-        build_config = service_config["build"]
-        dockerfile = build_config.get("dockerfile", "Dockerfile")
-        target = build_config.get("target", "default")
-        build_key = get_build_key(dockerfile, target)
-
-        if build_key not in build_key_to_first_service:
-            # First service with this build config - it will do the actual build
-            build_key_to_first_service[build_key] = service_name
-            services_to_build.append(service_name)
-        else:
-            # Duplicate - will just use the image from the first service
-            services_to_dedupe.append(service_name)
-
-    # Second pass: configure builds and deduplicate
-    modified_services = []
-    for service_name, service_config in compose.get("services", {}).items():
-        if "build" not in service_config:
-            continue
-
-        build_config = service_config["build"]
-        dockerfile = build_config.get("dockerfile", "Dockerfile")
-        target = build_config.get("target", "latest")
-        image_name = get_image_name(dockerfile, target)
-
-        # Set image name for all services (needed for both builders and deduped)
-        service_config["image"] = image_name
-
-        if service_name in services_to_dedupe:
-            # Remove build config - this service will use the pre-built image
-            del service_config["build"]
-            continue
-
-        # This service will do the actual build - add cache config
-        cache_from_list = []
-        cache_to_list = []
-
-        component = get_component(dockerfile)
-        if not component:
-            # Skip services that don't clearly match frontend/backend
-            continue
-
-        # Get the hash for this component
-        component_hash = getattr(args, f"{component}_hash")
-
-        # Scope format: platform-{component}-{target}-{hash|ref}
-        # Example: platform-backend-server-abc123
-
-        if "type=gha" in args.cache_from:
-            # 1. Primary: exact hash match (most specific)
-            if component_hash:
-                hash_scope = f"platform-{component}-{target}-{component_hash}"
-                cache_from_list.append(f"{args.cache_from},scope={hash_scope}")
-
-            # 2. Fallback: branch-based cache
-            if git_ref_scope:
-                ref_scope = f"platform-{component}-{target}-{git_ref_scope}"
-                cache_from_list.append(f"{args.cache_from},scope={ref_scope}")
-
-            # 3. Fallback: dev branch cache (for PRs/feature branches)
-            if git_ref_scope and git_ref_scope != DEFAULT_BRANCH:
-                master_scope = f"platform-{component}-{target}-{DEFAULT_BRANCH}"
-                cache_from_list.append(f"{args.cache_from},scope={master_scope}")
-
-        if "type=gha" in args.cache_to:
-            # Write to both hash-based and branch-based scopes
-            if component_hash:
-                hash_scope = f"platform-{component}-{target}-{component_hash}"
-                cache_to_list.append(f"{args.cache_to},scope={hash_scope}")
-
-            if git_ref_scope:
-                ref_scope = f"platform-{component}-{target}-{git_ref_scope}"
-                cache_to_list.append(f"{args.cache_to},scope={ref_scope}")
-
-        # Ensure we have at least one cache source/target
-        if not cache_from_list:
-            cache_from_list.append(args.cache_from)
-        if not cache_to_list:
-            cache_to_list.append(args.cache_to)
-
-        build_config["cache_from"] = cache_from_list
-        build_config["cache_to"] = cache_to_list
-        modified_services.append(service_name)
-
-    # Write back to the same file
-    with open(args.source, "w") as f:
-        yaml.dump(compose, f, default_flow_style=False, sort_keys=False)
-
-    print(f"Added cache config to {len(modified_services)} services in {args.source}:")
-    for svc in modified_services:
-        svc_config = compose["services"][svc]
-        build_cfg = svc_config.get("build", {})
-        cache_from_list = build_cfg.get("cache_from", ["none"])
-        cache_to_list = build_cfg.get("cache_to", ["none"])
-        print(f"  - {svc}")
-        print(f"      image: {svc_config.get('image', 'N/A')}")
-        print(f"      cache_from: {cache_from_list}")
-        print(f"      cache_to: {cache_to_list}")
-    if services_to_dedupe:
-        print(
-            f"Deduplicated {len(services_to_dedupe)} services (will use pre-built images):"
-        )
-        for svc in services_to_dedupe:
-            print(f"  - {svc} -> {compose['services'][svc].get('image', 'N/A')}")
-
-
-if __name__ == "__main__":
-    main()
--- a/.gitignore
+++ b/.gitignore
@@ -178,6 +178,4 @@ autogpt_platform/backend/settings.py
 *.ign.*
 .test-contents
 .claude/settings.local.json
-CLAUDE.local.md
 /autogpt_platform/backend/logs
-.next
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -16,6 +16,7 @@ See `docs/content/platform/getting-started.md` for setup instructions.
 - Format Python code with `poetry run format`.
 - Format frontend code using `pnpm format`.

+
 ## Frontend guidelines:

 See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
@@ -32,17 +33,14 @@ See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
 4. **Styling**: Tailwind CSS only, use design tokens, Phosphor Icons only
 5. **Testing**: Add Storybook stories for new components, Playwright for E2E
 6. **Code conventions**: Function declarations (not arrow functions) for components/handlers
-
 - Component props should be `interface Props { ... }` (not exported) unless the interface needs to be used outside the component
 - Separate render logic from business logic (component.tsx + useComponent.ts + helpers.ts)
 - Colocate state when possible and avoid creating large components, use sub-components ( local `/components` folder next to the parent component ) when sensible
 - Avoid large hooks, abstract logic into `helpers.ts` files when sensible
 - Use function declarations for components, arrow functions only for callbacks
 - No barrel files or `index.ts` re-exports
+- Do not use `useCallback` or `useMemo` unless strictly needed
 - Avoid comments at all times unless the code is very complex
- Do not use `useCallback` or `useMemo` unless asked to optimise a given function
- Do not type hook returns, let Typescript infer as much as possible
- Never type with `any`, if not types available use `unknown`

 ## Testing

@@ -51,8 +49,22 @@ See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:

 Always run the relevant linters and tests before committing.
 Use conventional commit messages for all commits (e.g. `feat(backend): add API`).
-Types: - feat - fix - refactor - ci - dx (developer experience)
-Scopes: - platform - platform/library - platform/marketplace - backend - backend/executor - frontend - frontend/library - frontend/marketplace - blocks
+  Types:
+    - feat
+    - fix
+    - refactor
+    - ci
+    - dx (developer experience)
+  Scopes:
+    - platform
+      - platform/library
+      - platform/marketplace
+      - backend
+        - backend/executor
+      - frontend
+        - frontend/library
+        - frontend/marketplace
+      - blocks

 ## Pull requests

--- a/README.md
+++ b/README.md
@@ -54,7 +54,7 @@ Before proceeding with the installation, ensure your system meets the following
 ### Updated Setup Instructions:
 We've moved to a fully maintained and regularly updated documentation site.

-👉 [Follow the official self-hosting guide here](https://agpt.co/docs/platform/getting-started/getting-started)
+👉 [Follow the official self-hosting guide here](https://docs.agpt.co/platform/getting-started/)


 This tutorial assumes you have Docker, VSCode, git and npm installed.
--- a/autogpt_platform/CLAUDE.md
+++ b/autogpt_platform/CLAUDE.md
@@ -6,30 +6,152 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

 AutoGPT Platform is a monorepo containing:

- **Backend** (`backend`): Python FastAPI server with async support
- **Frontend** (`frontend`): Next.js React application
- **Shared Libraries** (`autogpt_libs`): Common Python utilities
+- **Backend** (`/backend`): Python FastAPI server with async support
+- **Frontend** (`/frontend`): Next.js React application
+- **Shared Libraries** (`/autogpt_libs`): Common Python utilities

-## Component Documentation
+## Essential Commands

- **Backend**: See @backend/CLAUDE.md for backend-specific commands, architecture, and development tasks
- **Frontend**: See @frontend/CLAUDE.md for frontend-specific commands, architecture, and development patterns
+### Backend Development

-## Key Concepts
+```bash
+# Install dependencies
+cd backend && poetry install
+
+# Run database migrations
+poetry run prisma migrate dev
+
+# Start all services (database, redis, rabbitmq, clamav)
+docker compose up -d
+
+# Run the backend server
+poetry run serve
+
+# Run tests
+poetry run test
+
+# Run specific test
+poetry run pytest path/to/test_file.py::test_function_name
+
+# Run block tests (tests that validate all blocks work correctly)
+poetry run pytest backend/blocks/test/test_block.py -xvs
+
+# Run tests for a specific block (e.g., GetCurrentTimeBlock)
+poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[GetCurrentTimeBlock]' -xvs
+
+# Lint and format
+# prefer format if you want to just "fix" it and only get the errors that can't be autofixed
+poetry run format  # Black + isort
+poetry run lint    # ruff
+```
+
+More details can be found in TESTING.md
+
+#### Creating/Updating Snapshots
+
+When you first write a test or when the expected output changes:
+
+```bash
+poetry run pytest path/to/test.py --snapshot-update
+```
+
+⚠️ **Important**: Always review snapshot changes before committing! Use `git diff` to verify the changes are expected.
+
+### Frontend Development
+
+```bash
+# Install dependencies
+cd frontend && pnpm i
+
+# Generate API client from OpenAPI spec
+pnpm generate:api
+
+# Start development server
+pnpm dev
+
+# Run E2E tests
+pnpm test
+
+# Run Storybook for component development
+pnpm storybook
+
+# Build production
+pnpm build
+
+# Format and lint
+pnpm format
+
+# Type checking
+pnpm types
+```
+
+**📖 Complete Guide**: See `/frontend/CONTRIBUTING.md` and `/frontend/.cursorrules` for comprehensive frontend patterns.
+
+**Key Frontend Conventions:**
+
+- Separate render logic from data/behavior in components
+- Use generated API hooks from `@/app/api/__generated__/endpoints/`
+- Use function declarations (not arrow functions) for components/handlers
+- Use design system components from `src/components/` (atoms, molecules, organisms)
+- Only use Phosphor Icons
+- Never use `src/components/__legacy__/*` or deprecated `BackendAPI`
+
+## Architecture Overview
+
+### Backend Architecture
+
+- **API Layer**: FastAPI with REST and WebSocket endpoints
+- **Database**: PostgreSQL with Prisma ORM, includes pgvector for embeddings
+- **Queue System**: RabbitMQ for async task processing
+- **Execution Engine**: Separate executor service processes agent workflows
+- **Authentication**: JWT-based with Supabase integration
+- **Security**: Cache protection middleware prevents sensitive data caching in browsers/proxies
+
+### Frontend Architecture
+
+- **Framework**: Next.js 15 App Router (client-first approach)
+- **Data Fetching**: Type-safe generated API hooks via Orval + React Query
+- **State Management**: React Query for server state, co-located UI state in components/hooks
+- **Component Structure**: Separate render logic (`.tsx`) from business logic (`use*.ts` hooks)
+- **Workflow Builder**: Visual graph editor using @xyflow/react
+- **UI Components**: shadcn/ui (Radix UI primitives) with Tailwind CSS styling
+- **Icons**: Phosphor Icons only
+- **Feature Flags**: LaunchDarkly integration
+- **Error Handling**: ErrorCard for render errors, toast for mutations, Sentry for exceptions
+- **Testing**: Playwright for E2E, Storybook for component development
+
+### Key Concepts

 1. **Agent Graphs**: Workflow definitions stored as JSON, executed by the backend
-2. **Blocks**: Reusable components in `backend/backend/blocks/` that perform specific tasks
+2. **Blocks**: Reusable components in `/backend/blocks/` that perform specific tasks
 3. **Integrations**: OAuth and API connections stored per user
 4. **Store**: Marketplace for sharing agent templates
 5. **Virus Scanning**: ClamAV integration for file upload security

+### Testing Approach
+
+- Backend uses pytest with snapshot testing for API responses
+- Test files are colocated with source files (`*_test.py`)
+- Frontend uses Playwright for E2E tests
+- Component testing via Storybook
+
+### Database Schema
+
+Key models (defined in `/backend/schema.prisma`):
+
+- `User`: Authentication and profile data
+- `AgentGraph`: Workflow definitions with version control
+- `AgentGraphExecution`: Execution history and results
+- `AgentNode`: Individual nodes in a workflow
+- `StoreListing`: Marketplace listings for sharing agents
+
 ### Environment Configuration

 #### Configuration Files

- **Backend**: `backend/.env.default` (defaults) → `backend/.env` (user overrides)
- **Frontend**: `frontend/.env.default` (defaults) → `frontend/.env` (user overrides)
- **Platform**: `.env.default` (Supabase/shared defaults) → `.env` (user overrides)
+- **Backend**: `/backend/.env.default` (defaults) → `/backend/.env` (user overrides)
+- **Frontend**: `/frontend/.env.default` (defaults) → `/frontend/.env` (user overrides)
+- **Platform**: `/.env.default` (Supabase/shared defaults) → `/.env` (user overrides)

 #### Docker Environment Loading Order

@@ -45,17 +167,83 @@ AutoGPT Platform is a monorepo containing:
 - Backend/Frontend services use YAML anchors for consistent configuration
 - Supabase services (`db/docker/docker-compose.yml`) follow the same pattern

-### Branching Strategy
+### Common Development Tasks

- **`dev`** is the main development branch. All PRs should target `dev`.
- **`master`** is the production branch. Only used for production releases.
+**Adding a new block:**
+
+Follow the comprehensive [Block SDK Guide](../../../docs/content/platform/block-sdk-guide.md) which covers:
+
+- Provider configuration with `ProviderBuilder`
+- Block schema definition
+- Authentication (API keys, OAuth, webhooks)
+- Testing and validation
+- File organization
+
+Quick steps:
+
+1. Create new file in `/backend/backend/blocks/`
+2. Configure provider using `ProviderBuilder` in `_config.py`
+3. Inherit from `Block` base class
+4. Define input/output schemas using `BlockSchema`
+5. Implement async `run` method
+6. Generate unique block ID using `uuid.uuid4()`
+7. Test with `poetry run pytest backend/blocks/test/test_block.py`
+
+Note: when making many new blocks analyze the interfaces for each of these blocks and picture if they would go well together in a graph based editor or would they struggle to connect productively?
+ex: do the inputs and outputs tie well together?
+
+If you get any pushback or hit complex block conditions check the new_blocks guide in the docs.
+
+**Modifying the API:**
+
+1. Update route in `/backend/backend/server/routers/`
+2. Add/update Pydantic models in same directory
+3. Write tests alongside the route file
+4. Run `poetry run test` to verify
+
+### Frontend guidelines:
+
+See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
+
+1. **Pages**: Create in `src/app/(platform)/feature-name/page.tsx`
+   - Add `usePageName.ts` hook for logic
+   - Put sub-components in local `components/` folder
+2. **Components**: Structure as `ComponentName/ComponentName.tsx` + `useComponentName.ts` + `helpers.ts`
+   - Use design system components from `src/components/` (atoms, molecules, organisms)
+   - Never use `src/components/__legacy__/*`
+3. **Data fetching**: Use generated API hooks from `@/app/api/__generated__/endpoints/`
+   - Regenerate with `pnpm generate:api`
+   - Pattern: `use{Method}{Version}{OperationName}`
+4. **Styling**: Tailwind CSS only, use design tokens, Phosphor Icons only
+5. **Testing**: Add Storybook stories for new components, Playwright for E2E
+6. **Code conventions**: Function declarations (not arrow functions) for components/handlers
+- Component props should be `interface Props { ... }` (not exported) unless the interface needs to be used outside the component
+- Separate render logic from business logic (component.tsx + useComponent.ts + helpers.ts)
+- Colocate state when possible and avoid creating large components, use sub-components ( local `/components` folder next to the parent component ) when sensible
+- Avoid large hooks, abstract logic into `helpers.ts` files when sensible
+- Use function declarations for components, arrow functions only for callbacks
+- No barrel files or `index.ts` re-exports
+- Do not use `useCallback` or `useMemo` unless strictly needed
+- Avoid comments at all times unless the code is very complex
+
+### Security Implementation
+
+**Cache Protection Middleware:**
+
+- Located in `/backend/backend/server/middleware/security.py`
+- Default behavior: Disables caching for ALL endpoints with `Cache-Control: no-store, no-cache, must-revalidate, private`
+- Uses an allow list approach - only explicitly permitted paths can be cached
+- Cacheable paths include: static assets (`/static/*`, `/_next/static/*`), health checks, public store pages, documentation
+- Prevents sensitive data (auth tokens, API keys, user data) from being cached by browsers/proxies
+- To allow caching for a new endpoint, add it to `CACHEABLE_PATHS` in the middleware
+- Applied to both main API server and external API applications

 ### Creating Pull Requests

- Create the PR against the `dev` branch of the repository.
- Ensure the branch name is descriptive (e.g., `feature/add-new-block`)
- Use conventional commit messages (see below)
- Fill out the .github/PULL_REQUEST_TEMPLATE.md template as the PR description
+- Create the PR aginst the `dev` branch of the repository.
+- Ensure the branch name is descriptive (e.g., `feature/add-new-block`)/
+- Use conventional commit messages (see below)/
+- Fill out the .github/PULL_REQUEST_TEMPLATE.md template as the PR description/
 - Run the github pre-commit hooks to ensure code quality.

 ### Reviewing/Revising Pull Requests
--- a/autogpt_platform/autogpt_libs/poetry.lock
+++ b/autogpt_platform/autogpt_libs/poetry.lock
--- a/autogpt_platform/autogpt_libs/pyproject.toml
+++ b/autogpt_platform/autogpt_libs/pyproject.toml
@@ -9,25 +9,25 @@ packages = [{ include = "autogpt_libs" }]
 [tool.poetry.dependencies]
 python = ">=3.10,<4.0"
 colorama = "^0.4.6"
-cryptography = "^46.0"
+cryptography = "^45.0"
 expiringdict = "^1.2.2"
-fastapi = "^0.128.7"
-google-cloud-logging = "^3.13.0"
-launchdarkly-server-sdk = "^9.15.0"
-pydantic = "^2.12.5"
-pydantic-settings = "^2.12.0"
-pyjwt = { version = "^2.11.0", extras = ["crypto"] }
+fastapi = "^0.116.1"
+google-cloud-logging = "^3.12.1"
+launchdarkly-server-sdk = "^9.12.0"
+pydantic = "^2.11.7"
+pydantic-settings = "^2.10.1"
+pyjwt = { version = "^2.10.1", extras = ["crypto"] }
 redis = "^6.2.0"
-supabase = "^2.28.0"
-uvicorn = "^0.40.0"
+supabase = "^2.16.0"
+uvicorn = "^0.35.0"

 [tool.poetry.group.dev.dependencies]
-pyright = "^1.1.408"
+pyright = "^1.1.404"
 pytest = "^8.4.1"
-pytest-asyncio = "^1.3.0"
-pytest-mock = "^3.15.1"
-pytest-cov = "^7.0.0"
-ruff = "^0.15.0"
+pytest-asyncio = "^1.1.0"
+pytest-mock = "^3.14.1"
+pytest-cov = "^6.2.1"
+ruff = "^0.12.11"

 [build-system]
 requires = ["poetry-core"]
--- a/autogpt_platform/backend/.env.default
+++ b/autogpt_platform/backend/.env.default
@@ -104,12 +104,6 @@ TWITTER_CLIENT_SECRET=
 # Make a new workspace for your OAuth APP -- trust me
 # https://linear.app/settings/api/applications/new
 # Callback URL: http://localhost:3000/auth/integrations/oauth_callback
-LINEAR_API_KEY=
-# Linear project and team IDs for the feature request tracker.
-# Find these in your Linear workspace URL: linear.app/<workspace>/project/<project-id>
-# and in team settings. Used by the chat copilot to file and search feature requests.
-LINEAR_FEATURE_REQUEST_PROJECT_ID=
-LINEAR_FEATURE_REQUEST_TEAM_ID=
 LINEAR_CLIENT_ID=
 LINEAR_CLIENT_SECRET=

@@ -158,7 +152,6 @@ REPLICATE_API_KEY=
 REVID_API_KEY=
 SCREENSHOTONE_API_KEY=
 UNREAL_SPEECH_API_KEY=
-ELEVENLABS_API_KEY=

 # Data & Search Services
 E2B_API_KEY=
@@ -185,10 +178,5 @@ AYRSHARE_JWT_KEY=
 SMARTLEAD_API_KEY=
 ZEROBOUNCE_API_KEY=

-# PostHog Analytics
-# Get API key from https://posthog.com - Project Settings > Project API Key
-POSTHOG_API_KEY=
-POSTHOG_HOST=https://eu.i.posthog.com
-
 # Other Services
 AUTOMOD_API_KEY=
--- a/autogpt_platform/backend/.gitignore
+++ b/autogpt_platform/backend/.gitignore
@@ -19,6 +19,3 @@ load-tests/*.json
 load-tests/*.log
 load-tests/node_modules/*
 migrations/*/rollback*.sql
-
-# Workspace files
-workspaces/
--- a/autogpt_platform/backend/CLAUDE.md
+++ b/autogpt_platform/backend/CLAUDE.md
@@ -1,170 +0,0 @@
-# CLAUDE.md - Backend
-
-This file provides guidance to Claude Code when working with the backend.
-
-## Essential Commands
-
-To run something with Python package dependencies you MUST use `poetry run ...`.
-
-```bash
-# Install dependencies
-poetry install
-
-# Run database migrations
-poetry run prisma migrate dev
-
-# Start all services (database, redis, rabbitmq, clamav)
-docker compose up -d
-
-# Run the backend as a whole
-poetry run app
-
-# Run tests
-poetry run test
-
-# Run specific test
-poetry run pytest path/to/test_file.py::test_function_name
-
-# Run block tests (tests that validate all blocks work correctly)
-poetry run pytest backend/blocks/test/test_block.py -xvs
-
-# Run tests for a specific block (e.g., GetCurrentTimeBlock)
-poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[GetCurrentTimeBlock]' -xvs
-
-# Lint and format
-# prefer format if you want to just "fix" it and only get the errors that can't be autofixed
-poetry run format  # Black + isort
-poetry run lint    # ruff
-```
-
-More details can be found in @TESTING.md
-
-### Creating/Updating Snapshots
-
-When you first write a test or when the expected output changes:
-
-```bash
-poetry run pytest path/to/test.py --snapshot-update
-```
-
-⚠️ **Important**: Always review snapshot changes before committing! Use `git diff` to verify the changes are expected.
-
-## Architecture
-
- **API Layer**: FastAPI with REST and WebSocket endpoints
- **Database**: PostgreSQL with Prisma ORM, includes pgvector for embeddings
- **Queue System**: RabbitMQ for async task processing
- **Execution Engine**: Separate executor service processes agent workflows
- **Authentication**: JWT-based with Supabase integration
- **Security**: Cache protection middleware prevents sensitive data caching in browsers/proxies
-
-## Testing Approach
-
- Uses pytest with snapshot testing for API responses
- Test files are colocated with source files (`*_test.py`)
-
-## Database Schema
-
-Key models (defined in `schema.prisma`):
-
- `User`: Authentication and profile data
- `AgentGraph`: Workflow definitions with version control
- `AgentGraphExecution`: Execution history and results
- `AgentNode`: Individual nodes in a workflow
- `StoreListing`: Marketplace listings for sharing agents
-
-## Environment Configuration
-
- **Backend**: `.env.default` (defaults) → `.env` (user overrides)
-
-## Common Development Tasks
-
-### Adding a new block
-
-Follow the comprehensive [Block SDK Guide](@../../docs/content/platform/block-sdk-guide.md) which covers:
-
- Provider configuration with `ProviderBuilder`
- Block schema definition
- Authentication (API keys, OAuth, webhooks)
- Testing and validation
- File organization
-
-Quick steps:
-
-1. Create new file in `backend/blocks/`
-2. Configure provider using `ProviderBuilder` in `_config.py`
-3. Inherit from `Block` base class
-4. Define input/output schemas using `BlockSchema`
-5. Implement async `run` method
-6. Generate unique block ID using `uuid.uuid4()`
-7. Test with `poetry run pytest backend/blocks/test/test_block.py`
-
-Note: when making many new blocks analyze the interfaces for each of these blocks and picture if they would go well together in a graph-based editor or would they struggle to connect productively?
-ex: do the inputs and outputs tie well together?
-
-If you get any pushback or hit complex block conditions check the new_blocks guide in the docs.
-
-#### Handling files in blocks with `store_media_file()`
-
-When blocks need to work with files (images, videos, documents), use `store_media_file()` from `backend.util.file`. The `return_format` parameter determines what you get back:
-
-| Format | Use When | Returns |
-|--------|----------|---------|
-| `"for_local_processing"` | Processing with local tools (ffmpeg, MoviePy, PIL) | Local file path (e.g., `"image.png"`) |
-| `"for_external_api"` | Sending content to external APIs (Replicate, OpenAI) | Data URI (e.g., `"data:image/png;base64,..."`) |
-| `"for_block_output"` | Returning output from your block | Smart: `workspace://` in CoPilot, data URI in graphs |
-
-**Examples:**
-
-```python
-# INPUT: Need to process file locally with ffmpeg
-local_path = await store_media_file(
-    file=input_data.video,
-    execution_context=execution_context,
-    return_format="for_local_processing",
-)
-# local_path = "video.mp4" - use with Path/ffmpeg/etc
-
-# INPUT: Need to send to external API like Replicate
-image_b64 = await store_media_file(
-    file=input_data.image,
-    execution_context=execution_context,
-    return_format="for_external_api",
-)
-# image_b64 = "data:image/png;base64,iVBORw0..." - send to API
-
-# OUTPUT: Returning result from block
-result_url = await store_media_file(
-    file=generated_image_url,
-    execution_context=execution_context,
-    return_format="for_block_output",
-)
-yield "image_url", result_url
-# In CoPilot: result_url = "workspace://abc123"
-# In graphs:  result_url = "data:image/png;base64,..."
-```
-
-**Key points:**
-
- `for_block_output` is the ONLY format that auto-adapts to execution context
- Always use `for_block_output` for block outputs unless you have a specific reason not to
- Never hardcode workspace checks - let `for_block_output` handle it
-
-### Modifying the API
-
-1. Update route in `backend/api/features/`
-2. Add/update Pydantic models in same directory
-3. Write tests alongside the route file
-4. Run `poetry run test` to verify
-
-## Security Implementation
-
-### Cache Protection Middleware
-
- Located in `backend/api/middleware/security.py`
- Default behavior: Disables caching for ALL endpoints with `Cache-Control: no-store, no-cache, must-revalidate, private`
- Uses an allow list approach - only explicitly permitted paths can be cached
- Cacheable paths include: static assets (`static/*`, `_next/static/*`), health checks, public store pages, documentation
- Prevents sensitive data (auth tokens, API keys, user data) from being cached by browsers/proxies
- To allow caching for a new endpoint, add it to `CACHEABLE_PATHS` in the middleware
- Applied to both main API server and external API applications
--- a/autogpt_platform/backend/Dockerfile
+++ b/autogpt_platform/backend/Dockerfile
@@ -1,5 +1,3 @@
-# ============================ DEPENDENCY BUILDER ============================ #
-
 FROM debian:13-slim AS builder

 # Set environment variables
@@ -53,62 +51,25 @@ COPY autogpt_platform/backend/backend/data/partial_types.py ./backend/data/parti
 COPY autogpt_platform/backend/gen_prisma_types_stub.py ./
 RUN poetry run prisma generate && poetry run gen-prisma-stub

-# =============================== DB MIGRATOR =============================== #
-
-# Lightweight migrate stage - only needs Prisma CLI, not full Python environment
-FROM debian:13-slim AS migrate
-
-WORKDIR /app/autogpt_platform/backend
-
-ENV DEBIAN_FRONTEND=noninteractive
-
-# Install only what's needed for prisma migrate: Node.js and minimal Python for prisma-python
-RUN apt-get update && apt-get install -y --no-install-recommends \
-    python3.13 \
-    python3-pip \
-    ca-certificates \
-    && rm -rf /var/lib/apt/lists/*
-
-# Copy Node.js from builder (needed for Prisma CLI)
-COPY --from=builder /usr/bin/node /usr/bin/node
-COPY --from=builder /usr/lib/node_modules /usr/lib/node_modules
-COPY --from=builder /usr/bin/npm /usr/bin/npm
-
-# Copy Prisma binaries
-COPY --from=builder /root/.cache/prisma-python/binaries /root/.cache/prisma-python/binaries
-
-# Install prisma-client-py directly (much smaller than copying full venv)
-RUN pip3 install prisma>=0.15.0 --break-system-packages
-
-COPY autogpt_platform/backend/schema.prisma ./
-COPY autogpt_platform/backend/backend/data/partial_types.py ./backend/data/partial_types.py
-COPY autogpt_platform/backend/gen_prisma_types_stub.py ./
-COPY autogpt_platform/backend/migrations ./migrations
-
-# ============================== BACKEND SERVER ============================== #
-
-FROM debian:13-slim AS server
+FROM debian:13-slim AS server_dependencies

 WORKDIR /app

-ENV DEBIAN_FRONTEND=noninteractive
+ENV POETRY_HOME=/opt/poetry \
+    POETRY_NO_INTERACTION=1 \
+    POETRY_VIRTUALENVS_CREATE=true \
+    POETRY_VIRTUALENVS_IN_PROJECT=true \
+    DEBIAN_FRONTEND=noninteractive
+ENV PATH=/opt/poetry/bin:$PATH

-# Install Python, FFmpeg, ImageMagick, and CLI tools for agent use.
-# bubblewrap provides OS-level sandbox (whitelist-only FS + no network)
-# for the bash_exec MCP tool.
-# Using --no-install-recommends saves ~650MB by skipping unnecessary deps like llvm, mesa, etc.
-RUN apt-get update && apt-get install -y --no-install-recommends \
+# Install Python without upgrading system-managed packages
+RUN apt-get update && apt-get install -y \
    python3.13 \
    python3-pip \
-    ffmpeg \
-    imagemagick \
-    jq \
-    ripgrep \
-    tree \
-    bubblewrap \
    && rm -rf /var/lib/apt/lists/*

-# Copy poetry (build-time only, for `poetry install --only-root` to create entry points)
+# Copy only necessary files from builder
+COPY --from=builder /app /app
 COPY --from=builder /usr/local/lib/python3* /usr/local/lib/python3*
 COPY --from=builder /usr/local/bin/poetry /usr/local/bin/poetry
 # Copy Node.js installation for Prisma
@@ -118,25 +79,30 @@ COPY --from=builder /usr/bin/npm /usr/bin/npm
 COPY --from=builder /usr/bin/npx /usr/bin/npx
 COPY --from=builder /root/.cache/prisma-python/binaries /root/.cache/prisma-python/binaries

-WORKDIR /app/autogpt_platform/backend
-
-# Copy only the .venv from builder (not the entire /app directory)
-# The .venv includes the generated Prisma client
-COPY --from=builder /app/autogpt_platform/backend/.venv ./.venv
 ENV PATH="/app/autogpt_platform/backend/.venv/bin:$PATH"

-# Copy dependency files + autogpt_libs (path dependency)
-COPY autogpt_platform/autogpt_libs /app/autogpt_platform/autogpt_libs
-COPY autogpt_platform/backend/poetry.lock autogpt_platform/backend/pyproject.toml ./
+RUN mkdir -p /app/autogpt_platform/autogpt_libs
+RUN mkdir -p /app/autogpt_platform/backend

-# Copy backend code + docs (for Copilot docs search)
-COPY autogpt_platform/backend ./
+COPY autogpt_platform/autogpt_libs /app/autogpt_platform/autogpt_libs
+
+COPY autogpt_platform/backend/poetry.lock autogpt_platform/backend/pyproject.toml /app/autogpt_platform/backend/
+
+WORKDIR /app/autogpt_platform/backend
+
+FROM server_dependencies AS migrate
+
+# Migration stage only needs schema and migrations - much lighter than full backend
+COPY autogpt_platform/backend/schema.prisma /app/autogpt_platform/backend/
+COPY autogpt_platform/backend/backend/data/partial_types.py /app/autogpt_platform/backend/backend/data/partial_types.py
+COPY autogpt_platform/backend/migrations /app/autogpt_platform/backend/migrations
+
+FROM server_dependencies AS server
+
+COPY autogpt_platform/backend /app/autogpt_platform/backend
 COPY docs /app/docs
-# Install the project package to create entry point scripts in .venv/bin/
-# (e.g., rest, executor, ws, db, scheduler, notification - see [tool.poetry.scripts])
-RUN POETRY_VIRTUALENVS_CREATE=true POETRY_VIRTUALENVS_IN_PROJECT=true \
-    poetry install --no-ansi --only-root
+RUN poetry install --no-ansi --only-root

 ENV PORT=8000

-CMD ["rest"]
+CMD ["poetry", "run", "rest"]
--- a/autogpt_platform/backend/TESTING.md
+++ b/autogpt_platform/backend/TESTING.md
@@ -138,7 +138,7 @@ If the test doesn't need the `user_id` specifically, mocking is not necessary as

 #### Using Global Auth Fixtures

-Two global auth fixtures are provided by `backend/api/conftest.py`:
+Two global auth fixtures are provided by `backend/server/conftest.py`:

 - `mock_jwt_user` - Regular user with `test_user_id` ("test-user-id")
 - `mock_jwt_admin` - Admin user with `admin_user_id` ("admin-user-id")
--- a/autogpt_platform/backend/backend/api/conftest.py
+++ b/autogpt_platform/backend/backend/api/conftest.py
@@ -1,9 +1,4 @@
-"""Common test fixtures for server tests.
-
-Note: Common fixtures like test_user_id, admin_user_id, target_user_id,
-setup_test_user, and setup_admin_user are defined in the parent conftest.py
-(backend/conftest.py) and are available here automatically.
-"""
+"""Common test fixtures for server tests."""

 import pytest
 from pytest_snapshot.plugin import Snapshot
@@ -16,6 +11,54 @@ def configured_snapshot(snapshot: Snapshot) -> Snapshot:
    return snapshot


+@pytest.fixture
+def test_user_id() -> str:
+    """Test user ID fixture."""
+    return "3e53486c-cf57-477e-ba2a-cb02dc828e1a"
+
+
+@pytest.fixture
+def admin_user_id() -> str:
+    """Admin user ID fixture."""
+    return "4e53486c-cf57-477e-ba2a-cb02dc828e1b"
+
+
+@pytest.fixture
+def target_user_id() -> str:
+    """Target user ID fixture."""
+    return "5e53486c-cf57-477e-ba2a-cb02dc828e1c"
+
+
+@pytest.fixture
+async def setup_test_user(test_user_id):
+    """Create test user in database before tests."""
+    from backend.data.user import get_or_create_user
+
+    # Create the test user in the database using JWT token format
+    user_data = {
+        "sub": test_user_id,
+        "email": "test@example.com",
+        "user_metadata": {"name": "Test User"},
+    }
+    await get_or_create_user(user_data)
+    return test_user_id
+
+
+@pytest.fixture
+async def setup_admin_user(admin_user_id):
+    """Create admin user in database before tests."""
+    from backend.data.user import get_or_create_user
+
+    # Create the admin user in the database using JWT token format
+    user_data = {
+        "sub": admin_user_id,
+        "email": "test-admin@example.com",
+        "user_metadata": {"name": "Test Admin"},
+    }
+    await get_or_create_user(user_data)
+    return admin_user_id
+
+
@pytest.fixture
 def mock_jwt_user(test_user_id):
    """Provide mock JWT payload for regular user testing."""
--- a/autogpt_platform/backend/backend/api/external/v1/routes.py
+++ b/autogpt_platform/backend/backend/api/external/v1/routes.py
@@ -10,7 +10,7 @@ from typing_extensions import TypedDict

 import backend.api.features.store.cache as store_cache
 import backend.api.features.store.model as store_model
-import backend.blocks
+import backend.data.block
 from backend.api.external.middleware import require_permission
 from backend.data import execution as execution_db
 from backend.data import graph as graph_db
@@ -67,7 +67,7 @@ async def get_user_info(
    dependencies=[Security(require_permission(APIKeyPermission.READ_BLOCK))],
 )
 async def get_graph_blocks() -> Sequence[dict[Any, Any]]:
-    blocks = [block() for block in backend.blocks.get_blocks().values()]
+    blocks = [block() for block in backend.data.block.get_blocks().values()]
    return [b.to_dict() for b in blocks if not b.disabled]


@@ -83,11 +83,9 @@ async def execute_graph_block(
        require_permission(APIKeyPermission.EXECUTE_BLOCK)
    ),
 ) -> CompletedBlockOutput:
-    obj = backend.blocks.get_block(block_id)
+    obj = backend.data.block.get_block(block_id)
    if not obj:
        raise HTTPException(status_code=404, detail=f"Block #{block_id} not found.")
-    if obj.disabled:
-        raise HTTPException(status_code=403, detail=f"Block #{block_id} is disabled.")

    output = defaultdict(list)
    async for name, data in obj.execute(data):
--- a/autogpt_platform/backend/backend/api/external/v1/tools.py
+++ b/autogpt_platform/backend/backend/api/external/v1/tools.py
@@ -15,9 +15,9 @@ from prisma.enums import APIKeyPermission
 from pydantic import BaseModel, Field

 from backend.api.external.middleware import require_permission
-from backend.copilot.model import ChatSession
-from backend.copilot.tools import find_agent_tool, run_agent_tool
-from backend.copilot.tools.models import ToolResponseBase
+from backend.api.features.chat.model import ChatSession
+from backend.api.features.chat.tools import find_agent_tool, run_agent_tool
+from backend.api.features.chat.tools.models import ToolResponseBase
 from backend.data.auth.base import APIAuthorizationInfo

 logger = logging.getLogger(__name__)
--- a/autogpt_platform/backend/backend/api/features/builder/db.py
+++ b/autogpt_platform/backend/backend/api/features/builder/db.py
@@ -10,15 +10,10 @@ import backend.api.features.library.db as library_db
 import backend.api.features.library.model as library_model
 import backend.api.features.store.db as store_db
 import backend.api.features.store.model as store_model
+import backend.data.block
 from backend.blocks import load_all_blocks
-from backend.blocks._base import (
-    AnyBlockSchema,
-    BlockCategory,
-    BlockInfo,
-    BlockSchema,
-    BlockType,
-)
 from backend.blocks.llm import LlmModel
+from backend.data.block import AnyBlockSchema, BlockCategory, BlockInfo, BlockSchema
 from backend.data.db import query_raw_with_schema
 from backend.integrations.providers import ProviderName
 from backend.util.cache import cached
@@ -27,7 +22,7 @@ from backend.util.models import Pagination
 from .model import (
    BlockCategoryResponse,
    BlockResponse,
-    BlockTypeFilter,
+    BlockType,
    CountResponse,
    FilterType,
    Provider,
@@ -93,7 +88,7 @@ def get_block_categories(category_blocks: int = 3) -> list[BlockCategoryResponse
 def get_blocks(
    *,
    category: str | None = None,
-    type: BlockTypeFilter | None = None,
+    type: BlockType | None = None,
    provider: ProviderName | None = None,
    page: int = 1,
    page_size: int = 50,
@@ -674,9 +669,9 @@ async def get_suggested_blocks(count: int = 5) -> list[BlockInfo]:
    for block_type in load_all_blocks().values():
        block: AnyBlockSchema = block_type()
        if block.disabled or block.block_type in (
-            BlockType.INPUT,
-            BlockType.OUTPUT,
-            BlockType.AGENT,
+            backend.data.block.BlockType.INPUT,
+            backend.data.block.BlockType.OUTPUT,
+            backend.data.block.BlockType.AGENT,
        ):
            continue
        # Find the execution count for this block
--- a/autogpt_platform/backend/backend/api/features/builder/model.py
+++ b/autogpt_platform/backend/backend/api/features/builder/model.py
@@ -4,7 +4,7 @@ from pydantic import BaseModel

 import backend.api.features.library.model as library_model
 import backend.api.features.store.model as store_model
-from backend.blocks._base import BlockInfo
+from backend.data.block import BlockInfo
 from backend.integrations.providers import ProviderName
 from backend.util.models import Pagination

@@ -15,7 +15,7 @@ FilterType = Literal[
    "my_agents",
 ]

-BlockTypeFilter = Literal["all", "input", "action", "output"]
+BlockType = Literal["all", "input", "action", "output"]


 class SearchEntry(BaseModel):
--- a/autogpt_platform/backend/backend/api/features/builder/routes.py
+++ b/autogpt_platform/backend/backend/api/features/builder/routes.py
@@ -17,7 +17,7 @@ router = fastapi.APIRouter(
 )


-# Taken from backend/api/features/store/db.py
+# Taken from backend/server/v2/store/db.py
 def sanitize_query(query: str | None) -> str | None:
    if query is None:
        return query
@@ -88,7 +88,7 @@ async def get_block_categories(
 )
 async def get_blocks(
    category: Annotated[str | None, fastapi.Query()] = None,
-    type: Annotated[builder_model.BlockTypeFilter | None, fastapi.Query()] = None,
+    type: Annotated[builder_model.BlockType | None, fastapi.Query()] = None,
    provider: Annotated[ProviderName | None, fastapi.Query()] = None,
    page: Annotated[int, fastapi.Query()] = 1,
    page_size: Annotated[int, fastapi.Query()] = 50,
--- a/autogpt_platform/backend/backend/api/features/chat/config.py
+++ b/autogpt_platform/backend/backend/api/features/chat/config.py
@@ -0,0 +1,90 @@
+"""Configuration management for chat system."""
+
+import os
+
+from pydantic import Field, field_validator
+from pydantic_settings import BaseSettings
+
+
+class ChatConfig(BaseSettings):
+    """Configuration for the chat system."""
+
+    # OpenAI API Configuration
+    model: str = Field(
+        default="anthropic/claude-opus-4.5", description="Default model to use"
+    )
+    title_model: str = Field(
+        default="openai/gpt-4o-mini",
+        description="Model to use for generating session titles (should be fast/cheap)",
+    )
+    api_key: str | None = Field(default=None, description="OpenAI API key")
+    base_url: str | None = Field(
+        default="https://openrouter.ai/api/v1",
+        description="Base URL for API (e.g., for OpenRouter)",
+    )
+
+    # Session TTL Configuration - 12 hours
+    session_ttl: int = Field(default=43200, description="Session TTL in seconds")
+
+    # Streaming Configuration
+    max_context_messages: int = Field(
+        default=50, ge=1, le=200, description="Maximum context messages"
+    )
+
+    stream_timeout: int = Field(default=300, description="Stream timeout in seconds")
+    max_retries: int = Field(default=3, description="Maximum number of retries")
+    max_agent_runs: int = Field(default=3, description="Maximum number of agent runs")
+    max_agent_schedules: int = Field(
+        default=3, description="Maximum number of agent schedules"
+    )
+
+    # Langfuse Prompt Management Configuration
+    # Note: Langfuse credentials are in Settings().secrets (settings.py)
+    langfuse_prompt_name: str = Field(
+        default="CoPilot Prompt",
+        description="Name of the prompt in Langfuse to fetch",
+    )
+
+    @field_validator("api_key", mode="before")
+    @classmethod
+    def get_api_key(cls, v):
+        """Get API key from environment if not provided."""
+        if v is None:
+            # Try to get from environment variables
+            # First check for CHAT_API_KEY (Pydantic prefix)
+            v = os.getenv("CHAT_API_KEY")
+            if not v:
+                # Fall back to OPEN_ROUTER_API_KEY
+                v = os.getenv("OPEN_ROUTER_API_KEY")
+            if not v:
+                # Fall back to OPENAI_API_KEY
+                v = os.getenv("OPENAI_API_KEY")
+        return v
+
+    @field_validator("base_url", mode="before")
+    @classmethod
+    def get_base_url(cls, v):
+        """Get base URL from environment if not provided."""
+        if v is None:
+            # Check for OpenRouter or custom base URL
+            v = os.getenv("CHAT_BASE_URL")
+            if not v:
+                v = os.getenv("OPENROUTER_BASE_URL")
+            if not v:
+                v = os.getenv("OPENAI_BASE_URL")
+            if not v:
+                v = "https://openrouter.ai/api/v1"
+        return v
+
+    # Prompt paths for different contexts
+    PROMPT_PATHS: dict[str, str] = {
+        "default": "prompts/chat_system.md",
+        "onboarding": "prompts/onboarding_system.md",
+    }
+
+    class Config:
+        """Pydantic config."""
+
+        env_file = ".env"
+        env_file_encoding = "utf-8"
+        extra = "ignore"  # Ignore extra environment variables
--- a/autogpt_platform/backend/backend/api/features/chat/db.py
+++ b/autogpt_platform/backend/backend/api/features/chat/db.py
@@ -14,27 +14,29 @@ from prisma.types import (
    ChatSessionWhereInput,
 )

-from backend.data import db
+from backend.data.db import transaction
 from backend.util.json import SafeJson

-from .model import ChatMessage, ChatSession, ChatSessionInfo
-
 logger = logging.getLogger(__name__)


-async def get_chat_session(session_id: str) -> ChatSession | None:
+async def get_chat_session(session_id: str) -> PrismaChatSession | None:
    """Get a chat session by ID from the database."""
    session = await PrismaChatSession.prisma().find_unique(
        where={"id": session_id},
-        include={"Messages": {"order_by": {"sequence": "asc"}}},
+        include={"Messages": True},
    )
-    return ChatSession.from_db(session) if session else None
+    if session and session.Messages:
+        # Sort messages by sequence in Python - Prisma Python client doesn't support
+        # order_by in include clauses (unlike Prisma JS), so we sort after fetching
+        session.Messages.sort(key=lambda m: m.sequence)
+    return session


 async def create_chat_session(
    session_id: str,
    user_id: str,
-) -> ChatSessionInfo:
+) -> PrismaChatSession:
    """Create a new chat session in the database."""
    data = ChatSessionCreateInput(
        id=session_id,
@@ -43,8 +45,10 @@ async def create_chat_session(
        successfulAgentRuns=SafeJson({}),
        successfulAgentSchedules=SafeJson({}),
    )
-    prisma_session = await PrismaChatSession.prisma().create(data=data)
-    return ChatSessionInfo.from_db(prisma_session)
+    return await PrismaChatSession.prisma().create(
+        data=data,
+        include={"Messages": True},
+    )


 async def update_chat_session(
@@ -55,7 +59,7 @@ async def update_chat_session(
    total_prompt_tokens: int | None = None,
    total_completion_tokens: int | None = None,
    title: str | None = None,
-) -> ChatSession | None:
+) -> PrismaChatSession | None:
    """Update a chat session's metadata."""
    data: ChatSessionUpdateInput = {"updatedAt": datetime.now(UTC)}

@@ -75,9 +79,12 @@ async def update_chat_session(
    session = await PrismaChatSession.prisma().update(
        where={"id": session_id},
        data=data,
-        include={"Messages": {"order_by": {"sequence": "asc"}}},
+        include={"Messages": True},
    )
-    return ChatSession.from_db(session) if session else None
+    if session and session.Messages:
+        # Sort in Python - Prisma Python doesn't support order_by in include clauses
+        session.Messages.sort(key=lambda m: m.sequence)
+    return session


 async def add_chat_message(
@@ -90,7 +97,7 @@ async def add_chat_message(
    refusal: str | None = None,
    tool_calls: list[dict[str, Any]] | None = None,
    function_call: dict[str, Any] | None = None,
-) -> ChatMessage:
+) -> PrismaChatMessage:
    """Add a message to a chat session."""
    # Build input dict dynamically rather than using ChatMessageCreateInput directly
    # because Prisma's TypedDict validation rejects optional fields set to None.
@@ -125,14 +132,14 @@ async def add_chat_message(
        ),
        PrismaChatMessage.prisma().create(data=cast(ChatMessageCreateInput, data)),
    )
-    return ChatMessage.from_db(message)
+    return message


 async def add_chat_messages_batch(
    session_id: str,
    messages: list[dict[str, Any]],
    start_sequence: int,
-) -> list[ChatMessage]:
+) -> list[PrismaChatMessage]:
    """Add multiple messages to a chat session in a batch.

    Uses a transaction for atomicity - if any message creation fails,
@@ -143,7 +150,7 @@ async def add_chat_messages_batch(

    created_messages = []

-    async with db.transaction() as tx:
+    async with transaction() as tx:
        for i, msg in enumerate(messages):
            # Build input dict dynamically rather than using ChatMessageCreateInput
            # directly because Prisma's TypedDict validation rejects optional fields
@@ -183,22 +190,21 @@ async def add_chat_messages_batch(
            data={"updatedAt": datetime.now(UTC)},
        )

-    return [ChatMessage.from_db(m) for m in created_messages]
+    return created_messages


 async def get_user_chat_sessions(
    user_id: str,
    limit: int = 50,
    offset: int = 0,
-) -> list[ChatSessionInfo]:
+) -> list[PrismaChatSession]:
    """Get chat sessions for a user, ordered by most recent."""
-    prisma_sessions = await PrismaChatSession.prisma().find_many(
+    return await PrismaChatSession.prisma().find_many(
        where={"userId": user_id},
        order={"updatedAt": "desc"},
        take=limit,
        skip=offset,
    )
-    return [ChatSessionInfo.from_db(s) for s in prisma_sessions]


 async def get_user_session_count(user_id: str) -> int:
@@ -241,45 +247,3 @@ async def get_chat_session_message_count(session_id: str) -> int:
    """Get the number of messages in a chat session."""
    count = await PrismaChatMessage.prisma().count(where={"sessionId": session_id})
    return count
-
-
-async def update_tool_message_content(
-    session_id: str,
-    tool_call_id: str,
-    new_content: str,
-) -> bool:
-    """Update the content of a tool message in chat history.
-
-    Used by background tasks to update pending operation messages with final results.
-
-    Args:
-        session_id: The chat session ID.
-        tool_call_id: The tool call ID to find the message.
-        new_content: The new content to set.
-
-    Returns:
-        True if a message was updated, False otherwise.
-    """
-    try:
-        result = await PrismaChatMessage.prisma().update_many(
-            where={
-                "sessionId": session_id,
-                "toolCallId": tool_call_id,
-            },
-            data={
-                "content": new_content,
-            },
-        )
-        if result == 0:
-            logger.warning(
-                f"No message found to update for session {session_id}, "
-                f"tool_call_id {tool_call_id}"
-            )
-            return False
-        return True
-    except Exception as e:
-        logger.error(
-            f"Failed to update tool message for session {session_id}, "
-            f"tool_call_id {tool_call_id}: {e}"
-        )
-        return False
--- a/autogpt_platform/backend/backend/api/features/chat/model.py
+++ b/autogpt_platform/backend/backend/api/features/chat/model.py
@@ -2,7 +2,7 @@ import asyncio
 import logging
 import uuid
 from datetime import UTC, datetime
-from typing import Any, Self, cast
+from typing import Any
 from weakref import WeakValueDictionary

 from openai.types.chat import (
@@ -23,17 +23,26 @@ from prisma.models import ChatMessage as PrismaChatMessage
 from prisma.models import ChatSession as PrismaChatSession
 from pydantic import BaseModel

-from backend.data.db_accessors import chat_db
 from backend.data.redis_client import get_redis_async
 from backend.util import json
 from backend.util.exceptions import DatabaseError, RedisError

+from . import db as chat_db
 from .config import ChatConfig

 logger = logging.getLogger(__name__)
 config = ChatConfig()


+def _parse_json_field(value: str | dict | list | None, default: Any = None) -> Any:
+    """Parse a JSON field that may be stored as string or already parsed."""
+    if value is None:
+        return default
+    if isinstance(value, str):
+        return json.loads(value)
+    return value
+
+
 # Redis cache key prefix for chat sessions
 CHAT_SESSION_CACHE_PREFIX = "chat:session:"

@@ -43,7 +52,28 @@ def _get_session_cache_key(session_id: str) -> str:
    return f"{CHAT_SESSION_CACHE_PREFIX}{session_id}"


-# ===================== Chat data models ===================== #
+# Session-level locks to prevent race conditions during concurrent upserts.
+# Uses WeakValueDictionary to automatically garbage collect locks when no longer referenced,
+# preventing unbounded memory growth while maintaining lock semantics for active sessions.
+# Invalidation: Locks are auto-removed by GC when no coroutine holds a reference (after
+# async with lock: completes). Explicit cleanup also occurs in delete_chat_session().
+_session_locks: WeakValueDictionary[str, asyncio.Lock] = WeakValueDictionary()
+_session_locks_mutex = asyncio.Lock()
+
+
+async def _get_session_lock(session_id: str) -> asyncio.Lock:
+    """Get or create a lock for a specific session to prevent concurrent upserts.
+
+    Uses WeakValueDictionary for automatic cleanup: locks are garbage collected
+    when no coroutine holds a reference to them, preventing memory leaks from
+    unbounded growth of session locks.
+    """
+    async with _session_locks_mutex:
+        lock = _session_locks.get(session_id)
+        if lock is None:
+            lock = asyncio.Lock()
+            _session_locks[session_id] = lock
+        return lock


 class ChatMessage(BaseModel):
@@ -55,19 +85,6 @@ class ChatMessage(BaseModel):
    tool_calls: list[dict] | None = None
    function_call: dict | None = None

-    @staticmethod
-    def from_db(prisma_message: PrismaChatMessage) -> "ChatMessage":
-        """Convert a Prisma ChatMessage to a Pydantic ChatMessage."""
-        return ChatMessage(
-            role=prisma_message.role,
-            content=prisma_message.content,
-            name=prisma_message.name,
-            tool_call_id=prisma_message.toolCallId,
-            refusal=prisma_message.refusal,
-            tool_calls=_parse_json_field(prisma_message.toolCalls),
-            function_call=_parse_json_field(prisma_message.functionCall),
-        )
-

 class Usage(BaseModel):
    prompt_tokens: int
@@ -75,10 +92,11 @@ class Usage(BaseModel):
    total_tokens: int


-class ChatSessionInfo(BaseModel):
+class ChatSession(BaseModel):
    session_id: str
    user_id: str
    title: str | None = None
+    messages: list[ChatMessage]
    usage: list[Usage]
    credentials: dict[str, dict] = {}  # Map of provider -> credential metadata
    started_at: datetime
@@ -86,9 +104,40 @@ class ChatSessionInfo(BaseModel):
    successful_agent_runs: dict[str, int] = {}
    successful_agent_schedules: dict[str, int] = {}

-    @classmethod
-    def from_db(cls, prisma_session: PrismaChatSession) -> Self:
-        """Convert Prisma ChatSession to Pydantic ChatSession."""
+    @staticmethod
+    def new(user_id: str) -> "ChatSession":
+        return ChatSession(
+            session_id=str(uuid.uuid4()),
+            user_id=user_id,
+            title=None,
+            messages=[],
+            usage=[],
+            credentials={},
+            started_at=datetime.now(UTC),
+            updated_at=datetime.now(UTC),
+        )
+
+    @staticmethod
+    def from_db(
+        prisma_session: PrismaChatSession,
+        prisma_messages: list[PrismaChatMessage] | None = None,
+    ) -> "ChatSession":
+        """Convert Prisma models to Pydantic ChatSession."""
+        messages = []
+        if prisma_messages:
+            for msg in prisma_messages:
+                messages.append(
+                    ChatMessage(
+                        role=msg.role,
+                        content=msg.content,
+                        name=msg.name,
+                        tool_call_id=msg.toolCallId,
+                        refusal=msg.refusal,
+                        tool_calls=_parse_json_field(msg.toolCalls),
+                        function_call=_parse_json_field(msg.functionCall),
+                    )
+                )
+
        # Parse JSON fields from Prisma
        credentials = _parse_json_field(prisma_session.credentials, default={})
        successful_agent_runs = _parse_json_field(
@@ -110,10 +159,11 @@ class ChatSessionInfo(BaseModel):
                )
            )

-        return cls(
+        return ChatSession(
            session_id=prisma_session.id,
            user_id=prisma_session.userId,
            title=prisma_session.title,
+            messages=messages,
            usage=usage,
            credentials=credentials,
            started_at=prisma_session.createdAt,
@@ -122,56 +172,6 @@ class ChatSessionInfo(BaseModel):
            successful_agent_schedules=successful_agent_schedules,
        )

-
-class ChatSession(ChatSessionInfo):
-    messages: list[ChatMessage]
-
-    @classmethod
-    def new(cls, user_id: str) -> Self:
-        return cls(
-            session_id=str(uuid.uuid4()),
-            user_id=user_id,
-            title=None,
-            messages=[],
-            usage=[],
-            credentials={},
-            started_at=datetime.now(UTC),
-            updated_at=datetime.now(UTC),
-        )
-
-    @classmethod
-    def from_db(cls, prisma_session: PrismaChatSession) -> Self:
-        """Convert Prisma ChatSession to Pydantic ChatSession."""
-        if prisma_session.Messages is None:
-            raise ValueError(
-                f"Prisma session {prisma_session.id} is missing Messages relation"
-            )
-
-        return cls(
-            **ChatSessionInfo.from_db(prisma_session).model_dump(),
-            messages=[ChatMessage.from_db(m) for m in prisma_session.Messages],
-        )
-
-    def add_tool_call_to_current_turn(self, tool_call: dict) -> None:
-        """Attach a tool_call to the current turn's assistant message.
-
-        Searches backwards for the most recent assistant message (stopping at
-        any user message boundary). If found, appends the tool_call to it.
-        Otherwise creates a new assistant message with the tool_call.
-        """
-        for msg in reversed(self.messages):
-            if msg.role == "user":
-                break
-            if msg.role == "assistant":
-                if not msg.tool_calls:
-                    msg.tool_calls = []
-                msg.tool_calls.append(tool_call)
-                return
-
-        self.messages.append(
-            ChatMessage(role="assistant", content="", tool_calls=[tool_call])
-        )
-
    def to_openai_messages(self) -> list[ChatCompletionMessageParam]:
        messages = []
        for message in self.messages:
@@ -258,85 +258,115 @@ class ChatSession(ChatSessionInfo):
                        name=message.name or "",
                    )
                )
-        return self._merge_consecutive_assistant_messages(messages)
-
-    @staticmethod
-    def _merge_consecutive_assistant_messages(
-        messages: list[ChatCompletionMessageParam],
-    ) -> list[ChatCompletionMessageParam]:
-        """Merge consecutive assistant messages into single messages.
-
-        Long-running tool flows can create split assistant messages: one with
-        text content and another with tool_calls. Anthropic's API requires
-        tool_result blocks to reference a tool_use in the immediately preceding
-        assistant message, so these splits cause 400 errors via OpenRouter.
-        """
-        if len(messages) < 2:
-            return messages
-
-        result: list[ChatCompletionMessageParam] = [messages[0]]
-        for msg in messages[1:]:
-            prev = result[-1]
-            if prev.get("role") != "assistant" or msg.get("role") != "assistant":
-                result.append(msg)
-                continue
-
-            prev = cast(ChatCompletionAssistantMessageParam, prev)
-            curr = cast(ChatCompletionAssistantMessageParam, msg)
-
-            curr_content = curr.get("content") or ""
-            if curr_content:
-                prev_content = prev.get("content") or ""
-                prev["content"] = (
-                    f"{prev_content}\n{curr_content}" if prev_content else curr_content
-                )
-
-            curr_tool_calls = curr.get("tool_calls")
-            if curr_tool_calls:
-                prev_tool_calls = prev.get("tool_calls")
-                prev["tool_calls"] = (
-                    list(prev_tool_calls) + list(curr_tool_calls)
-                    if prev_tool_calls
-                    else list(curr_tool_calls)
-                )
-        return result
+        return messages


-def _parse_json_field(value: str | dict | list | None, default: Any = None) -> Any:
-    """Parse a JSON field that may be stored as string or already parsed."""
-    if value is None:
-        return default
-    if isinstance(value, str):
-        return json.loads(value)
-    return value
+async def _get_session_from_cache(session_id: str) -> ChatSession | None:
+    """Get a chat session from Redis cache."""
+    redis_key = _get_session_cache_key(session_id)
+    async_redis = await get_redis_async()
+    raw_session: bytes | None = await async_redis.get(redis_key)
+
+    if raw_session is None:
+        return None
+
+    try:
+        session = ChatSession.model_validate_json(raw_session)
+        logger.info(
+            f"Loading session {session_id} from cache: "
+            f"message_count={len(session.messages)}, "
+            f"roles={[m.role for m in session.messages]}"
+        )
+        return session
+    except Exception as e:
+        logger.error(f"Failed to deserialize session {session_id}: {e}", exc_info=True)
+        raise RedisError(f"Corrupted session data for {session_id}") from e


-# ================ Chat cache + DB operations ================ #
-
-# NOTE: Database calls are automatically routed through DatabaseManager if Prisma is not
-#       connected directly.
-
-
-async def cache_chat_session(session: ChatSession) -> None:
-    """Cache a chat session in Redis (without persisting to the database)."""
+async def _cache_session(session: ChatSession) -> None:
+    """Cache a chat session in Redis."""
    redis_key = _get_session_cache_key(session.session_id)
    async_redis = await get_redis_async()
    await async_redis.setex(redis_key, config.session_ttl, session.model_dump_json())


-async def invalidate_session_cache(session_id: str) -> None:
-    """Invalidate a chat session from Redis cache.
+async def cache_chat_session(session: ChatSession) -> None:
+    """Cache a chat session without persisting to the database."""
+    await _cache_session(session)

-    Used by background tasks to ensure fresh data is loaded on next access.
-    This is best-effort - Redis failures are logged but don't fail the operation.
-    """
-    try:
-        redis_key = _get_session_cache_key(session_id)
-        async_redis = await get_redis_async()
-        await async_redis.delete(redis_key)
-    except Exception as e:
-        # Best-effort: log but don't fail - cache will expire naturally
-        logger.warning(f"Failed to invalidate session cache for {session_id}: {e}")
+
+async def _get_session_from_db(session_id: str) -> ChatSession | None:
+    """Get a chat session from the database."""
+    prisma_session = await chat_db.get_chat_session(session_id)
+    if not prisma_session:
+        return None
+
+    messages = prisma_session.Messages
+    logger.info(
+        f"Loading session {session_id} from DB: "
+        f"has_messages={messages is not None}, "
+        f"message_count={len(messages) if messages else 0}, "
+        f"roles={[m.role for m in messages] if messages else []}"
+    )
+
+    return ChatSession.from_db(prisma_session, messages)
+
+
+async def _save_session_to_db(
+    session: ChatSession, existing_message_count: int
+) -> None:
+    """Save or update a chat session in the database."""
+    # Check if session exists in DB
+    existing = await chat_db.get_chat_session(session.session_id)
+
+    if not existing:
+        # Create new session
+        await chat_db.create_chat_session(
+            session_id=session.session_id,
+            user_id=session.user_id,
+        )
+        existing_message_count = 0
+
+    # Calculate total tokens from usage
+    total_prompt = sum(u.prompt_tokens for u in session.usage)
+    total_completion = sum(u.completion_tokens for u in session.usage)
+
+    # Update session metadata
+    await chat_db.update_chat_session(
+        session_id=session.session_id,
+        credentials=session.credentials,
+        successful_agent_runs=session.successful_agent_runs,
+        successful_agent_schedules=session.successful_agent_schedules,
+        total_prompt_tokens=total_prompt,
+        total_completion_tokens=total_completion,
+    )
+
+    # Add new messages (only those after existing count)
+    new_messages = session.messages[existing_message_count:]
+    if new_messages:
+        messages_data = []
+        for msg in new_messages:
+            messages_data.append(
+                {
+                    "role": msg.role,
+                    "content": msg.content,
+                    "name": msg.name,
+                    "tool_call_id": msg.tool_call_id,
+                    "refusal": msg.refusal,
+                    "tool_calls": msg.tool_calls,
+                    "function_call": msg.function_call,
+                }
+            )
+        logger.info(
+            f"Saving {len(new_messages)} new messages to DB for session {session.session_id}: "
+            f"roles={[m['role'] for m in messages_data]}, "
+            f"start_sequence={existing_message_count}"
+        )
+        await chat_db.add_chat_messages_batch(
+            session_id=session.session_id,
+            messages=messages_data,
+            start_sequence=existing_message_count,
+        )


 async def get_chat_session(
@@ -370,7 +400,7 @@ async def get_chat_session(
        logger.warning(f"Unexpected cache error for session {session_id}: {e}")

    # Fall back to database
-    logger.debug(f"Session {session_id} not in cache, checking database")
+    logger.info(f"Session {session_id} not in cache, checking database")
    session = await _get_session_from_db(session_id)

    if session is None:
@@ -386,7 +416,7 @@ async def get_chat_session(

    # Cache the session from DB
    try:
-        await cache_chat_session(session)
+        await _cache_session(session)
        logger.info(f"Cached session {session_id} from database")
    except Exception as e:
        logger.warning(f"Failed to cache session {session_id}: {e}")
@@ -394,45 +424,9 @@ async def get_chat_session(
    return session


-async def _get_session_from_cache(session_id: str) -> ChatSession | None:
-    """Get a chat session from Redis cache."""
-    redis_key = _get_session_cache_key(session_id)
-    async_redis = await get_redis_async()
-    raw_session: bytes | None = await async_redis.get(redis_key)
-
-    if raw_session is None:
-        return None
-
-    try:
-        session = ChatSession.model_validate_json(raw_session)
-        logger.info(
-            f"Loading session {session_id} from cache: "
-            f"message_count={len(session.messages)}, "
-            f"roles={[m.role for m in session.messages]}"
-        )
-        return session
-    except Exception as e:
-        logger.error(f"Failed to deserialize session {session_id}: {e}", exc_info=True)
-        raise RedisError(f"Corrupted session data for {session_id}") from e
-
-
-async def _get_session_from_db(session_id: str) -> ChatSession | None:
-    """Get a chat session from the database."""
-    session = await chat_db().get_chat_session(session_id)
-    if not session:
-        return None
-
-    logger.info(
-        f"Loaded session {session_id} from DB: "
-        f"has_messages={bool(session.messages)}, "
-        f"message_count={len(session.messages)}, "
-        f"roles={[m.role for m in session.messages]}"
-    )
-
-    return session
-
-
-async def upsert_chat_session(session: ChatSession) -> ChatSession:
+async def upsert_chat_session(
+    session: ChatSession,
+) -> ChatSession:
    """Update a chat session in both cache and database.

    Uses session-level locking to prevent race conditions when concurrent
@@ -450,7 +444,7 @@ async def upsert_chat_session(session: ChatSession) -> ChatSession:

    async with lock:
        # Get existing message count from DB for incremental saves
-        existing_message_count = await chat_db().get_chat_session_message_count(
+        existing_message_count = await chat_db.get_chat_session_message_count(
            session.session_id
        )

@@ -467,7 +461,7 @@ async def upsert_chat_session(session: ChatSession) -> ChatSession:

        # Save to cache (best-effort, even if DB failed)
        try:
-            await cache_chat_session(session)
+            await _cache_session(session)
        except Exception as e:
            # If DB succeeded but cache failed, raise cache error
            if db_error is None:
@@ -488,99 +482,6 @@ async def upsert_chat_session(session: ChatSession) -> ChatSession:
        return session


-async def _save_session_to_db(
-    session: ChatSession, existing_message_count: int
-) -> None:
-    """Save or update a chat session in the database."""
-    db = chat_db()
-
-    # Check if session exists in DB
-    existing = await db.get_chat_session(session.session_id)
-
-    if not existing:
-        # Create new session
-        await db.create_chat_session(
-            session_id=session.session_id,
-            user_id=session.user_id,
-        )
-        existing_message_count = 0
-
-    # Calculate total tokens from usage
-    total_prompt = sum(u.prompt_tokens for u in session.usage)
-    total_completion = sum(u.completion_tokens for u in session.usage)
-
-    # Update session metadata
-    await db.update_chat_session(
-        session_id=session.session_id,
-        credentials=session.credentials,
-        successful_agent_runs=session.successful_agent_runs,
-        successful_agent_schedules=session.successful_agent_schedules,
-        total_prompt_tokens=total_prompt,
-        total_completion_tokens=total_completion,
-    )
-
-    # Add new messages (only those after existing count)
-    new_messages = session.messages[existing_message_count:]
-    if new_messages:
-        messages_data = []
-        for msg in new_messages:
-            messages_data.append(
-                {
-                    "role": msg.role,
-                    "content": msg.content,
-                    "name": msg.name,
-                    "tool_call_id": msg.tool_call_id,
-                    "refusal": msg.refusal,
-                    "tool_calls": msg.tool_calls,
-                    "function_call": msg.function_call,
-                }
-            )
-        logger.info(
-            f"Saving {len(new_messages)} new messages to DB for session {session.session_id}: "
-            f"roles={[m['role'] for m in messages_data]}, "
-            f"start_sequence={existing_message_count}"
-        )
-        await db.add_chat_messages_batch(
-            session_id=session.session_id,
-            messages=messages_data,
-            start_sequence=existing_message_count,
-        )
-
-
-async def append_and_save_message(session_id: str, message: ChatMessage) -> ChatSession:
-    """Atomically append a message to a session and persist it.
-
-    Acquires the session lock, re-fetches the latest session state,
-    appends the message, and saves — preventing message loss when
-    concurrent requests modify the same session.
-    """
-    lock = await _get_session_lock(session_id)
-
-    async with lock:
-        session = await get_chat_session(session_id)
-        if session is None:
-            raise ValueError(f"Session {session_id} not found")
-
-        session.messages.append(message)
-        existing_message_count = await chat_db().get_chat_session_message_count(
-            session_id
-        )
-
-        try:
-            await _save_session_to_db(session, existing_message_count)
-        except Exception as e:
-            raise DatabaseError(
-                f"Failed to persist message to session {session_id}"
-            ) from e
-
-        try:
-            await cache_chat_session(session)
-        except Exception as e:
-            logger.warning(f"Cache write failed for session {session_id}: {e}")
-
-        return session
-
-
 async def create_chat_session(user_id: str) -> ChatSession:
    """Create a new chat session and persist it.

@@ -593,7 +494,7 @@ async def create_chat_session(user_id: str) -> ChatSession:

    # Create in database first - fail fast if this fails
    try:
-        await chat_db().create_chat_session(
+        await chat_db.create_chat_session(
            session_id=session.session_id,
            user_id=user_id,
        )
@@ -605,7 +506,7 @@ async def create_chat_session(user_id: str) -> ChatSession:

    # Cache the session (best-effort optimization, DB is source of truth)
    try:
-        await cache_chat_session(session)
+        await _cache_session(session)
    except Exception as e:
        logger.warning(f"Failed to cache new session {session.session_id}: {e}")

@@ -616,16 +517,20 @@ async def get_user_sessions(
    user_id: str,
    limit: int = 50,
    offset: int = 0,
-) -> tuple[list[ChatSessionInfo], int]:
+) -> tuple[list[ChatSession], int]:
    """Get chat sessions for a user from the database with total count.

    Returns:
        A tuple of (sessions, total_count) where total_count is the overall
        number of sessions for the user (not just the current page).
    """
-    db = chat_db()
-    sessions = await db.get_user_chat_sessions(user_id, limit, offset)
-    total_count = await db.get_user_session_count(user_id)
+    prisma_sessions = await chat_db.get_user_chat_sessions(user_id, limit, offset)
+    total_count = await chat_db.get_user_session_count(user_id)
+
+    sessions = []
+    for prisma_session in prisma_sessions:
+        # Convert without messages for listing (lighter weight)
+        sessions.append(ChatSession.from_db(prisma_session, None))

    return sessions, total_count

@@ -643,7 +548,7 @@ async def delete_chat_session(session_id: str, user_id: str | None = None) -> bo
    """
    # Delete from database first (with optional user_id validation)
    # This confirms ownership before invalidating cache
-    deleted = await chat_db().delete_chat_session(session_id, user_id)
+    deleted = await chat_db.delete_chat_session(session_id, user_id)

    if not deleted:
        return False
@@ -678,52 +583,20 @@ async def update_session_title(session_id: str, title: str) -> bool:
        True if updated successfully, False otherwise.
    """
    try:
-        result = await chat_db().update_chat_session(session_id=session_id, title=title)
+        result = await chat_db.update_chat_session(session_id=session_id, title=title)
        if result is None:
            logger.warning(f"Session {session_id} not found for title update")
            return False

-        # Update title in cache if it exists (instead of invalidating).
-        # This prevents race conditions where cache invalidation causes
-        # the frontend to see stale DB data while streaming is still in progress.
+        # Invalidate cache so next fetch gets updated title
        try:
-            cached = await _get_session_from_cache(session_id)
-            if cached:
-                cached.title = title
-                await cache_chat_session(cached)
+            redis_key = _get_session_cache_key(session_id)
+            async_redis = await get_redis_async()
+            await async_redis.delete(redis_key)
        except Exception as e:
-            # Not critical - title will be correct on next full cache refresh
-            logger.warning(
-                f"Failed to update title in cache for session {session_id}: {e}"
-            )
+            logger.warning(f"Failed to invalidate cache for session {session_id}: {e}")

        return True
    except Exception as e:
        logger.error(f"Failed to update title for session {session_id}: {e}")
        return False
-
-
-# ==================== Chat session locks ==================== #
-
-_session_locks: WeakValueDictionary[str, asyncio.Lock] = WeakValueDictionary()
-_session_locks_mutex = asyncio.Lock()
-
-
-async def _get_session_lock(session_id: str) -> asyncio.Lock:
-    """Get or create a lock for a specific session to prevent concurrent upserts.
-
-    This was originally added to solve the specific problem of race conditions between
-    the session title thread and the conversation thread, which always occurs on the
-    same instance as we prevent rapid request sends on the frontend.
-
-    Uses WeakValueDictionary for automatic cleanup: locks are garbage collected
-    when no coroutine holds a reference to them, preventing memory leaks from
-    unbounded growth of session locks. Explicit cleanup also occurs
-    in `delete_chat_session()`.
-    """
-    async with _session_locks_mutex:
-        lock = _session_locks.get(session_id)
-        if lock is None:
-            lock = asyncio.Lock()
-            _session_locks[session_id] = lock
-        return lock
--- a/autogpt_platform/backend/backend/api/features/chat/model_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/model_test.py
@@ -0,0 +1,119 @@
+import pytest
+
+from .model import (
+    ChatMessage,
+    ChatSession,
+    Usage,
+    get_chat_session,
+    upsert_chat_session,
+)
+
+messages = [
+    ChatMessage(content="Hello, how are you?", role="user"),
+    ChatMessage(
+        content="I'm fine, thank you!",
+        role="assistant",
+        tool_calls=[
+            {
+                "id": "t123",
+                "type": "function",
+                "function": {
+                    "name": "get_weather",
+                    "arguments": '{"city": "New York"}',
+                },
+            }
+        ],
+    ),
+    ChatMessage(
+        content="I'm using the tool to get the weather",
+        role="tool",
+        tool_call_id="t123",
+    ),
+]
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_chatsession_serialization_deserialization():
+    s = ChatSession.new(user_id="abc123")
+    s.messages = messages
+    s.usage = [Usage(prompt_tokens=100, completion_tokens=200, total_tokens=300)]
+    serialized = s.model_dump_json()
+    s2 = ChatSession.model_validate_json(serialized)
+    assert s2.model_dump() == s.model_dump()
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_chatsession_redis_storage(setup_test_user, test_user_id):
+
+    s = ChatSession.new(user_id=test_user_id)
+    s.messages = messages
+
+    s = await upsert_chat_session(s)
+
+    s2 = await get_chat_session(
+        session_id=s.session_id,
+        user_id=s.user_id,
+    )
+
+    assert s2 == s
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_chatsession_redis_storage_user_id_mismatch(
+    setup_test_user, test_user_id
+):
+
+    s = ChatSession.new(user_id=test_user_id)
+    s.messages = messages
+    s = await upsert_chat_session(s)
+
+    s2 = await get_chat_session(s.session_id, "different_user_id")
+
+    assert s2 is None
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_chatsession_db_storage(setup_test_user, test_user_id):
+    """Test that messages are correctly saved to and loaded from DB (not cache)."""
+    from backend.data.redis_client import get_redis_async
+
+    # Create session with messages including assistant message
+    s = ChatSession.new(user_id=test_user_id)
+    s.messages = messages  # Contains user, assistant, and tool messages
+    assert s.session_id is not None, "Session id is not set"
+    # Upsert to save to both cache and DB
+    s = await upsert_chat_session(s)
+
+    # Clear the Redis cache to force DB load
+    redis_key = f"chat:session:{s.session_id}"
+    async_redis = await get_redis_async()
+    await async_redis.delete(redis_key)
+
+    # Load from DB (cache was cleared)
+    s2 = await get_chat_session(
+        session_id=s.session_id,
+        user_id=s.user_id,
+    )
+
+    assert s2 is not None, "Session not found after loading from DB"
+    assert len(s2.messages) == len(
+        s.messages
+    ), f"Message count mismatch: expected {len(s.messages)}, got {len(s2.messages)}"
+
+    # Verify all roles are present
+    roles = [m.role for m in s2.messages]
+    assert "user" in roles, f"User message missing. Roles found: {roles}"
+    assert "assistant" in roles, f"Assistant message missing. Roles found: {roles}"
+    assert "tool" in roles, f"Tool message missing. Roles found: {roles}"
+
+    # Verify message content
+    for orig, loaded in zip(s.messages, s2.messages):
+        assert orig.role == loaded.role, f"Role mismatch: {orig.role} != {loaded.role}"
+        assert (
+            orig.content == loaded.content
+        ), f"Content mismatch for {orig.role}: {orig.content} != {loaded.content}"
+        if orig.tool_calls:
+            assert (
+                loaded.tool_calls is not None
+            ), f"Tool calls missing for {orig.role} message"
+            assert len(orig.tool_calls) == len(loaded.tool_calls)
--- a/autogpt_platform/backend/backend/api/features/chat/response_model.py
+++ b/autogpt_platform/backend/backend/api/features/chat/response_model.py
@@ -10,8 +10,6 @@ from typing import Any

 from pydantic import BaseModel, Field

-from backend.util.json import dumps as json_dumps
-

 class ResponseType(str, Enum):
    """Types of streaming responses following AI SDK protocol."""
@@ -20,10 +18,6 @@ class ResponseType(str, Enum):
    START = "start"
    FINISH = "finish"

-    # Step lifecycle (one LLM API call within a message)
-    START_STEP = "start-step"
-    FINISH_STEP = "finish-step"
-
    # Text streaming
    TEXT_START = "text-start"
    TEXT_DELTA = "text-delta"
@@ -37,7 +31,6 @@ class ResponseType(str, Enum):
    # Other
    ERROR = "error"
    USAGE = "usage"
-    HEARTBEAT = "heartbeat"


 class StreamBaseResponse(BaseModel):
@@ -58,20 +51,6 @@ class StreamStart(StreamBaseResponse):

    type: ResponseType = ResponseType.START
    messageId: str = Field(..., description="Unique message ID")
-    taskId: str | None = Field(
-        default=None,
-        description="Task ID for SSE reconnection. Clients can reconnect using GET /tasks/{taskId}/stream",
-    )
-
-    def to_sse(self) -> str:
-        """Convert to SSE format, excluding non-protocol fields like taskId."""
-        import json
-
-        data: dict[str, Any] = {
-            "type": self.type.value,
-            "messageId": self.messageId,
-        }
-        return f"data: {json.dumps(data)}\n\n"


 class StreamFinish(StreamBaseResponse):
@@ -80,26 +59,6 @@ class StreamFinish(StreamBaseResponse):
    type: ResponseType = ResponseType.FINISH


-class StreamStartStep(StreamBaseResponse):
-    """Start of a step (one LLM API call within a message).
-
-    The AI SDK uses this to add a step-start boundary to message.parts,
-    enabling visual separation between multiple LLM calls in a single message.
-    """
-
-    type: ResponseType = ResponseType.START_STEP
-
-
-class StreamFinishStep(StreamBaseResponse):
-    """End of a step (one LLM API call within a message).
-
-    The AI SDK uses this to reset activeTextParts and activeReasoningParts,
-    so the next LLM call in a tool-call continuation starts with clean state.
-    """
-
-    type: ResponseType = ResponseType.FINISH_STEP
-
-
 # ========== Text Streaming ==========


@@ -153,7 +112,7 @@ class StreamToolOutputAvailable(StreamBaseResponse):
    type: ResponseType = ResponseType.TOOL_OUTPUT_AVAILABLE
    toolCallId: str = Field(..., description="Tool call ID this responds to")
    output: str | dict[str, Any] = Field(..., description="Tool execution output")
-    # Keep these for internal backend use
+    # Additional fields for internal use (not part of AI SDK spec but useful)
    toolName: str | None = Field(
        default=None, description="Name of the tool that was executed"
    )
@@ -161,17 +120,6 @@ class StreamToolOutputAvailable(StreamBaseResponse):
        default=True, description="Whether the tool execution succeeded"
    )

-    def to_sse(self) -> str:
-        """Convert to SSE format, excluding non-spec fields."""
-        import json
-
-        data = {
-            "type": self.type.value,
-            "toolCallId": self.toolCallId,
-            "output": self.output,
-        }
-        return f"data: {json.dumps(data)}\n\n"
-

 # ========== Other ==========

@@ -194,32 +142,3 @@ class StreamError(StreamBaseResponse):
    details: dict[str, Any] | None = Field(
        default=None, description="Additional error details"
    )
-
-    def to_sse(self) -> str:
-        """Convert to SSE format, only emitting fields required by AI SDK protocol.
-
-        The AI SDK uses z.strictObject({type, errorText}) which rejects
-        any extra fields like `code` or `details`.
-        """
-        data = {
-            "type": self.type.value,
-            "errorText": self.errorText,
-        }
-        return f"data: {json_dumps(data)}\n\n"
-
-
-class StreamHeartbeat(StreamBaseResponse):
-    """Heartbeat to keep SSE connection alive during long-running operations.
-
-    Uses SSE comment format (: comment) which is ignored by clients but keeps
-    the connection alive through proxies and load balancers.
-    """
-
-    type: ResponseType = ResponseType.HEARTBEAT
-    toolCallId: str | None = Field(
-        default=None, description="Tool call ID if heartbeat is for a specific tool"
-    )
-
-    def to_sse(self) -> str:
-        """Convert to SSE comment format to keep connection alive."""
-        return ": heartbeat\n\n"
--- a/autogpt_platform/backend/backend/api/features/chat/routes.py
+++ b/autogpt_platform/backend/backend/api/features/chat/routes.py
@@ -1,60 +1,20 @@
 """Chat API routes for chat session management and streaming via SSE."""

-import asyncio
 import logging
-import uuid as uuid_module
 from collections.abc import AsyncGenerator
 from typing import Annotated

 from autogpt_libs import auth
-from fastapi import APIRouter, Depends, Header, HTTPException, Query, Response, Security
+from fastapi import APIRouter, Depends, Query, Security
 from fastapi.responses import StreamingResponse
 from pydantic import BaseModel

-from backend.copilot import service as chat_service
-from backend.copilot import stream_registry
-from backend.copilot.completion_handler import (
-    process_operation_failure,
-    process_operation_success,
-)
-from backend.copilot.config import ChatConfig
-from backend.copilot.executor.utils import enqueue_copilot_task
-from backend.copilot.model import (
-    ChatMessage,
-    ChatSession,
-    append_and_save_message,
-    create_chat_session,
-    delete_chat_session,
-    get_chat_session,
-    get_user_sessions,
-)
-from backend.copilot.response_model import StreamError, StreamFinish, StreamHeartbeat
-from backend.copilot.tools.models import (
-    AgentDetailsResponse,
-    AgentOutputResponse,
-    AgentPreviewResponse,
-    AgentSavedResponse,
-    AgentsFoundResponse,
-    BlockDetailsResponse,
-    BlockListResponse,
-    BlockOutputResponse,
-    ClarificationNeededResponse,
-    DocPageResponse,
-    DocSearchResultsResponse,
-    ErrorResponse,
-    ExecutionStartedResponse,
-    InputValidationErrorResponse,
-    NeedLoginResponse,
-    NoResultsResponse,
-    OperationInProgressResponse,
-    OperationPendingResponse,
-    OperationStartedResponse,
-    SetupRequirementsResponse,
-    UnderstandingUpdatedResponse,
-)
-from backend.copilot.tracking import track_user_message
 from backend.util.exceptions import NotFoundError

+from . import service as chat_service
+from .config import ChatConfig
+from .model import ChatSession, create_chat_session, get_chat_session, get_user_sessions
+
 config = ChatConfig()


@@ -95,15 +55,6 @@ class CreateSessionResponse(BaseModel):
    user_id: str | None


-class ActiveStreamInfo(BaseModel):
-    """Information about an active stream for reconnection."""
-
-    task_id: str
-    last_message_id: str  # Redis Stream message ID for resumption
-    operation_id: str  # Operation ID for completion tracking
-    tool_name: str  # Name of the tool being executed
-
-
 class SessionDetailResponse(BaseModel):
    """Response model providing complete details for a chat session, including messages."""

@@ -112,7 +63,6 @@ class SessionDetailResponse(BaseModel):
    updated_at: str
    user_id: str | None
    messages: list[dict]
-    active_stream: ActiveStreamInfo | None = None  # Present if stream is still active


 class SessionSummaryResponse(BaseModel):
@@ -131,14 +81,6 @@ class ListSessionsResponse(BaseModel):
    total: int


-class OperationCompleteRequest(BaseModel):
-    """Request model for external completion webhook."""
-
-    success: bool
-    result: dict | str | None = None
-    error: str | None = None
-
-
 # ========== Routes ==========


@@ -213,43 +155,6 @@ async def create_session(
    )


-@router.delete(
-    "/sessions/{session_id}",
-    dependencies=[Security(auth.requires_user)],
-    status_code=204,
-    responses={404: {"description": "Session not found or access denied"}},
-)
-async def delete_session(
-    session_id: str,
-    user_id: Annotated[str, Security(auth.get_user_id)],
-) -> Response:
-    """
-    Delete a chat session.
-
-    Permanently removes a chat session and all its messages.
-    Only the owner can delete their sessions.
-
-    Args:
-        session_id: The session ID to delete.
-        user_id: The authenticated user's ID.
-
-    Returns:
-        204 No Content on success.
-
-    Raises:
-        HTTPException: 404 if session not found or not owned by user.
-    """
-    deleted = await delete_chat_session(session_id, user_id)
-
-    if not deleted:
-        raise HTTPException(
-            status_code=404,
-            detail=f"Session {session_id} not found or access denied",
-        )
-
-    return Response(status_code=204)
-
-
@router.get(
    "/sessions/{session_id}",
 )
@@ -261,14 +166,13 @@ async def get_session(
    Retrieve the details of a specific chat session.

    Looks up a chat session by ID for the given user (if authenticated) and returns all session data including messages.
-    If there's an active stream for this session, returns the task_id for reconnection.

    Args:
        session_id: The unique identifier for the desired chat session.
        user_id: The optional authenticated user ID, or None for anonymous access.

    Returns:
-        SessionDetailResponse: Details for the requested session, including active_stream info if applicable.
+        SessionDetailResponse: Details for the requested session, or None if not found.

    """
    session = await get_chat_session(session_id, user_id)
@@ -276,32 +180,11 @@ async def get_session(
        raise NotFoundError(f"Session {session_id} not found.")

    messages = [message.model_dump() for message in session.messages]
-
-    # Check if there's an active stream for this session
-    active_stream_info = None
-    active_task, last_message_id = await stream_registry.get_active_task_for_session(
-        session_id, user_id
-    )
    logger.info(
-        f"[GET_SESSION] session={session_id}, active_task={active_task is not None}, "
-        f"msg_count={len(messages)}, last_role={messages[-1].get('role') if messages else 'none'}"
+        f"Returning session {session_id}: "
+        f"message_count={len(messages)}, "
+        f"roles={[m.get('role') for m in messages]}"
    )
-    if active_task:
-        # Filter out the in-progress assistant message from the session response.
-        # The client will receive the complete assistant response through the SSE
-        # stream replay instead, preventing duplicate content.
-        if messages and messages[-1].get("role") == "assistant":
-            messages = messages[:-1]
-
-        # Use "0-0" as last_message_id to replay the stream from the beginning.
-        # Since we filtered out the cached assistant message, the client needs
-        # the full stream to reconstruct the response.
-        active_stream_info = ActiveStreamInfo(
-            task_id=active_task.task_id,
-            last_message_id="0-0",
-            operation_id=active_task.operation_id,
-            tool_name=active_task.tool_name,
-        )

    return SessionDetailResponse(
        id=session.session_id,
@@ -309,7 +192,6 @@ async def get_session(
        updated_at=session.updated_at.isoformat(),
        user_id=session.user_id or None,
        messages=messages,
-        active_stream=active_stream_info,
    )


@@ -329,225 +211,49 @@ async def stream_chat_post(
      - Tool call UI elements (if invoked)
      - Tool execution results

-    The AI generation runs in a background task that continues even if the client disconnects.
-    All chunks are written to Redis for reconnection support. If the client disconnects,
-    they can reconnect using GET /tasks/{task_id}/stream to resume from where they left off.
-
    Args:
        session_id: The chat session identifier to associate with the streamed messages.
        request: Request body containing message, is_user_message, and optional context.
        user_id: Optional authenticated user ID.
    Returns:
-        StreamingResponse: SSE-formatted response chunks. First chunk is a "start" event
-        containing the task_id for reconnection.
+        StreamingResponse: SSE-formatted response chunks.

    """
-    import asyncio
-    import time
+    session = await _validate_and_get_session(session_id, user_id)

-    stream_start_time = time.perf_counter()
-    log_meta = {"component": "ChatStream", "session_id": session_id}
-    if user_id:
-        log_meta["user_id"] = user_id
-
-    logger.info(
-        f"[TIMING] stream_chat_post STARTED, session={session_id}, "
-        f"user={user_id}, message_len={len(request.message)}",
-        extra={"json_fields": log_meta},
-    )
-    await _validate_and_get_session(session_id, user_id)
-    logger.info(
-        f"[TIMING] session validated in {(time.perf_counter() - stream_start_time) * 1000:.1f}ms",
-        extra={
-            "json_fields": {
-                **log_meta,
-                "duration_ms": (time.perf_counter() - stream_start_time) * 1000,
-            }
-        },
-    )
-
-    # Atomically append user message to session BEFORE creating task to avoid
-    # race condition where GET_SESSION sees task as "running" but message isn't
-    # saved yet.  append_and_save_message re-fetches inside a lock to prevent
-    # message loss from concurrent requests.
-    if request.message:
-        message = ChatMessage(
-            role="user" if request.is_user_message else "assistant",
-            content=request.message,
-        )
-        if request.is_user_message:
-            track_user_message(
-                user_id=user_id,
-                session_id=session_id,
-                message_length=len(request.message),
-            )
-        logger.info(f"[STREAM] Saving user message to session {session_id}")
-        await append_and_save_message(session_id, message)
-        logger.info(f"[STREAM] User message saved for session {session_id}")
-
-    # Create a task in the stream registry for reconnection support
-    task_id = str(uuid_module.uuid4())
-    operation_id = str(uuid_module.uuid4())
-    log_meta["task_id"] = task_id
-
-    task_create_start = time.perf_counter()
-    await stream_registry.create_task(
-        task_id=task_id,
-        session_id=session_id,
-        user_id=user_id,
-        tool_call_id="chat_stream",  # Not a tool call, but needed for the model
-        tool_name="chat",
-        operation_id=operation_id,
-    )
-    logger.info(
-        f"[TIMING] create_task completed in {(time.perf_counter() - task_create_start) * 1000:.1f}ms",
-        extra={
-            "json_fields": {
-                **log_meta,
-                "duration_ms": (time.perf_counter() - task_create_start) * 1000,
-            }
-        },
-    )
-
-    await enqueue_copilot_task(
-        task_id=task_id,
-        session_id=session_id,
-        user_id=user_id,
-        operation_id=operation_id,
-        message=request.message,
-        is_user_message=request.is_user_message,
-        context=request.context,
-    )
-
-    setup_time = (time.perf_counter() - stream_start_time) * 1000
-    logger.info(
-        f"[TIMING] Task enqueued to RabbitMQ, setup={setup_time:.1f}ms",
-        extra={"json_fields": {**log_meta, "setup_time_ms": setup_time}},
-    )
-
-    # SSE endpoint that subscribes to the task's stream
    async def event_generator() -> AsyncGenerator[str, None]:
-        import time as time_module
-
-        event_gen_start = time_module.perf_counter()
+        chunk_count = 0
+        first_chunk_type: str | None = None
+        async for chunk in chat_service.stream_chat_completion(
+            session_id,
+            request.message,
+            is_user_message=request.is_user_message,
+            user_id=user_id,
+            session=session,  # Pass pre-fetched session to avoid double-fetch
+            context=request.context,
+        ):
+            if chunk_count < 3:
+                logger.info(
+                    "Chat stream chunk",
+                    extra={
+                        "session_id": session_id,
+                        "chunk_type": str(chunk.type),
+                    },
+                )
+            if not first_chunk_type:
+                first_chunk_type = str(chunk.type)
+            chunk_count += 1
+            yield chunk.to_sse()
        logger.info(
-            f"[TIMING] event_generator STARTED, task={task_id}, session={session_id}, "
-            f"user={user_id}",
-            extra={"json_fields": log_meta},
+            "Chat stream completed",
+            extra={
+                "session_id": session_id,
+                "chunk_count": chunk_count,
+                "first_chunk_type": first_chunk_type,
+            },
        )
-        subscriber_queue = None
-        first_chunk_yielded = False
-        chunks_yielded = 0
-        try:
-            # Subscribe to the task stream (this replays existing messages + live updates)
-            subscriber_queue = await stream_registry.subscribe_to_task(
-                task_id=task_id,
-                user_id=user_id,
-                last_message_id="0-0",  # Get all messages from the beginning
-            )
-
-            if subscriber_queue is None:
-                yield StreamFinish().to_sse()
-                yield "data: [DONE]\n\n"
-                return
-
-            # Read from the subscriber queue and yield to SSE
-            logger.info(
-                "[TIMING] Starting to read from subscriber_queue",
-                extra={"json_fields": log_meta},
-            )
-            while True:
-                try:
-                    chunk = await asyncio.wait_for(subscriber_queue.get(), timeout=30.0)
-                    chunks_yielded += 1
-
-                    if not first_chunk_yielded:
-                        first_chunk_yielded = True
-                        elapsed = time_module.perf_counter() - event_gen_start
-                        logger.info(
-                            f"[TIMING] FIRST CHUNK from queue at {elapsed:.2f}s, "
-                            f"type={type(chunk).__name__}",
-                            extra={
-                                "json_fields": {
-                                    **log_meta,
-                                    "chunk_type": type(chunk).__name__,
-                                    "elapsed_ms": elapsed * 1000,
-                                }
-                            },
-                        )
-
-                    yield chunk.to_sse()
-
-                    # Check for finish signal
-                    if isinstance(chunk, StreamFinish):
-                        total_time = time_module.perf_counter() - event_gen_start
-                        logger.info(
-                            f"[TIMING] StreamFinish received in {total_time:.2f}s; "
-                            f"n_chunks={chunks_yielded}",
-                            extra={
-                                "json_fields": {
-                                    **log_meta,
-                                    "chunks_yielded": chunks_yielded,
-                                    "total_time_ms": total_time * 1000,
-                                }
-                            },
-                        )
-                        break
-                except asyncio.TimeoutError:
-                    yield StreamHeartbeat().to_sse()
-
-        except GeneratorExit:
-            logger.info(
-                f"[TIMING] GeneratorExit (client disconnected), chunks={chunks_yielded}",
-                extra={
-                    "json_fields": {
-                        **log_meta,
-                        "chunks_yielded": chunks_yielded,
-                        "reason": "client_disconnect",
-                    }
-                },
-            )
-            pass  # Client disconnected - background task continues
-        except Exception as e:
-            elapsed = (time_module.perf_counter() - event_gen_start) * 1000
-            logger.error(
-                f"[TIMING] event_generator ERROR after {elapsed:.1f}ms: {e}",
-                extra={
-                    "json_fields": {**log_meta, "elapsed_ms": elapsed, "error": str(e)}
-                },
-            )
-            # Surface error to frontend so it doesn't appear stuck
-            yield StreamError(
-                errorText="An error occurred. Please try again.",
-                code="stream_error",
-            ).to_sse()
-            yield StreamFinish().to_sse()
-        finally:
-            # Unsubscribe when client disconnects or stream ends
-            if subscriber_queue is not None:
-                try:
-                    await stream_registry.unsubscribe_from_task(
-                        task_id, subscriber_queue
-                    )
-                except Exception as unsub_err:
-                    logger.error(
-                        f"Error unsubscribing from task {task_id}: {unsub_err}",
-                        exc_info=True,
-                    )
-            # AI SDK protocol termination - always yield even if unsubscribe fails
-            total_time = time_module.perf_counter() - event_gen_start
-            logger.info(
-                f"[TIMING] event_generator FINISHED in {total_time:.2f}s; "
-                f"task={task_id}, session={session_id}, n_chunks={chunks_yielded}",
-                extra={
-                    "json_fields": {
-                        **log_meta,
-                        "total_time_ms": total_time * 1000,
-                        "chunks_yielded": chunks_yielded,
-                    }
-                },
-            )
-            yield "data: [DONE]\n\n"
+        # AI SDK protocol termination
+        yield "data: [DONE]\n\n"

    return StreamingResponse(
        event_generator(),
@@ -564,90 +270,63 @@ async def stream_chat_post(
@router.get(
    "/sessions/{session_id}/stream",
 )
-async def resume_session_stream(
+async def stream_chat_get(
    session_id: str,
+    message: Annotated[str, Query(min_length=1, max_length=10000)],
    user_id: str | None = Depends(auth.get_user_id),
+    is_user_message: bool = Query(default=True),
 ):
    """
-    Resume an active stream for a session.
+    Stream chat responses for a session (GET - legacy endpoint).

-    Called by the AI SDK's ``useChat(resume: true)`` on page load.
-    Checks for an active (in-progress) task on the session and either replays
-    the full SSE stream or returns 204 No Content if nothing is running.
+    Streams the AI/completion responses in real time over Server-Sent Events (SSE), including:
+      - Text fragments as they are generated
+      - Tool call UI elements (if invoked)
+      - Tool execution results

    Args:
-        session_id: The chat session identifier.
+        session_id: The chat session identifier to associate with the streamed messages.
+        message: The user's new message to process.
        user_id: Optional authenticated user ID.
-
+        is_user_message: Whether the message is a user message.
    Returns:
-        StreamingResponse (SSE) when an active stream exists,
-        or 204 No Content when there is nothing to resume.
+        StreamingResponse: SSE-formatted response chunks.
+
    """
-    import asyncio
-
-    active_task, _last_id = await stream_registry.get_active_task_for_session(
-        session_id, user_id
-    )
-
-    if not active_task:
-        return Response(status_code=204)
-
-    subscriber_queue = await stream_registry.subscribe_to_task(
-        task_id=active_task.task_id,
-        user_id=user_id,
-        last_message_id="0-0",  # Full replay so useChat rebuilds the message
-    )
-
-    if subscriber_queue is None:
-        return Response(status_code=204)
+    session = await _validate_and_get_session(session_id, user_id)

    async def event_generator() -> AsyncGenerator[str, None]:
        chunk_count = 0
        first_chunk_type: str | None = None
-        try:
-            while True:
-                try:
-                    chunk = await asyncio.wait_for(subscriber_queue.get(), timeout=30.0)
-                    if chunk_count < 3:
-                        logger.info(
-                            "Resume stream chunk",
-                            extra={
-                                "session_id": session_id,
-                                "chunk_type": str(chunk.type),
-                            },
-                        )
-                    if not first_chunk_type:
-                        first_chunk_type = str(chunk.type)
-                    chunk_count += 1
-                    yield chunk.to_sse()
-
-                    if isinstance(chunk, StreamFinish):
-                        break
-                except asyncio.TimeoutError:
-                    yield StreamHeartbeat().to_sse()
-        except GeneratorExit:
-            pass
-        except Exception as e:
-            logger.error(f"Error in resume stream for session {session_id}: {e}")
-        finally:
-            try:
-                await stream_registry.unsubscribe_from_task(
-                    active_task.task_id, subscriber_queue
+        async for chunk in chat_service.stream_chat_completion(
+            session_id,
+            message,
+            is_user_message=is_user_message,
+            user_id=user_id,
+            session=session,  # Pass pre-fetched session to avoid double-fetch
+        ):
+            if chunk_count < 3:
+                logger.info(
+                    "Chat stream chunk",
+                    extra={
+                        "session_id": session_id,
+                        "chunk_type": str(chunk.type),
+                    },
                )
-            except Exception as unsub_err:
-                logger.error(
-                    f"Error unsubscribing from task {active_task.task_id}: {unsub_err}",
-                    exc_info=True,
-                )
-            logger.info(
-                "Resume stream completed",
-                extra={
-                    "session_id": session_id,
-                    "n_chunks": chunk_count,
-                    "first_chunk_type": first_chunk_type,
-                },
-            )
-            yield "data: [DONE]\n\n"
+            if not first_chunk_type:
+                first_chunk_type = str(chunk.type)
+            chunk_count += 1
+            yield chunk.to_sse()
+        logger.info(
+            "Chat stream completed",
+            extra={
+                "session_id": session_id,
+                "chunk_count": chunk_count,
+                "first_chunk_type": first_chunk_type,
+            },
+        )
+        # AI SDK protocol termination
+        yield "data: [DONE]\n\n"

    return StreamingResponse(
        event_generator(),
@@ -655,8 +334,8 @@ async def resume_session_stream(
        headers={
            "Cache-Control": "no-cache",
            "Connection": "keep-alive",
-            "X-Accel-Buffering": "no",
-            "x-vercel-ai-ui-message-stream": "v1",
+            "X-Accel-Buffering": "no",  # Disable nginx buffering
+            "x-vercel-ai-ui-message-stream": "v1",  # AI SDK protocol header
        },
    )

@@ -687,249 +366,6 @@ async def session_assign_user(
    return {"status": "ok"}


-# ========== Task Streaming (SSE Reconnection) ==========
-
-
-@router.get(
-    "/tasks/{task_id}/stream",
-)
-async def stream_task(
-    task_id: str,
-    user_id: str | None = Depends(auth.get_user_id),
-    last_message_id: str = Query(
-        default="0-0",
-        description="Last Redis Stream message ID received (e.g., '1706540123456-0'). Use '0-0' for full replay.",
-    ),
-):
-    """
-    Reconnect to a long-running task's SSE stream.
-
-    When a long-running operation (like agent generation) starts, the client
-    receives a task_id. If the connection drops, the client can reconnect
-    using this endpoint to resume receiving updates.
-
-    Args:
-        task_id: The task ID from the operation_started response.
-        user_id: Authenticated user ID for ownership validation.
-        last_message_id: Last Redis Stream message ID received ("0-0" for full replay).
-
-    Returns:
-        StreamingResponse: SSE-formatted response chunks starting after last_message_id.
-
-    Raises:
-        HTTPException: 404 if task not found, 410 if task expired, 403 if access denied.
-    """
-    # Check task existence and expiry before subscribing
-    task, error_code = await stream_registry.get_task_with_expiry_info(task_id)
-
-    if error_code == "TASK_EXPIRED":
-        raise HTTPException(
-            status_code=410,
-            detail={
-                "code": "TASK_EXPIRED",
-                "message": "This operation has expired. Please try again.",
-            },
-        )
-
-    if error_code == "TASK_NOT_FOUND":
-        raise HTTPException(
-            status_code=404,
-            detail={
-                "code": "TASK_NOT_FOUND",
-                "message": f"Task {task_id} not found.",
-            },
-        )
-
-    # Validate ownership if task has an owner
-    if task and task.user_id and user_id != task.user_id:
-        raise HTTPException(
-            status_code=403,
-            detail={
-                "code": "ACCESS_DENIED",
-                "message": "You do not have access to this task.",
-            },
-        )
-
-    # Get subscriber queue from stream registry
-    subscriber_queue = await stream_registry.subscribe_to_task(
-        task_id=task_id,
-        user_id=user_id,
-        last_message_id=last_message_id,
-    )
-
-    if subscriber_queue is None:
-        raise HTTPException(
-            status_code=404,
-            detail={
-                "code": "TASK_NOT_FOUND",
-                "message": f"Task {task_id} not found or access denied.",
-            },
-        )
-
-    async def event_generator() -> AsyncGenerator[str, None]:
-        heartbeat_interval = 15.0  # Send heartbeat every 15 seconds
-        try:
-            while True:
-                try:
-                    # Wait for next chunk with timeout for heartbeats
-                    chunk = await asyncio.wait_for(
-                        subscriber_queue.get(), timeout=heartbeat_interval
-                    )
-                    yield chunk.to_sse()
-
-                    # Check for finish signal
-                    if isinstance(chunk, StreamFinish):
-                        break
-                except asyncio.TimeoutError:
-                    # Send heartbeat to keep connection alive
-                    yield StreamHeartbeat().to_sse()
-        except Exception as e:
-            logger.error(f"Error in task stream {task_id}: {e}", exc_info=True)
-        finally:
-            # Unsubscribe when client disconnects or stream ends
-            try:
-                await stream_registry.unsubscribe_from_task(task_id, subscriber_queue)
-            except Exception as unsub_err:
-                logger.error(
-                    f"Error unsubscribing from task {task_id}: {unsub_err}",
-                    exc_info=True,
-                )
-            # AI SDK protocol termination - always yield even if unsubscribe fails
-            yield "data: [DONE]\n\n"
-
-    return StreamingResponse(
-        event_generator(),
-        media_type="text/event-stream",
-        headers={
-            "Cache-Control": "no-cache",
-            "Connection": "keep-alive",
-            "X-Accel-Buffering": "no",
-            "x-vercel-ai-ui-message-stream": "v1",
-        },
-    )
-
-
-@router.get(
-    "/tasks/{task_id}",
-)
-async def get_task_status(
-    task_id: str,
-    user_id: str | None = Depends(auth.get_user_id),
-) -> dict:
-    """
-    Get the status of a long-running task.
-
-    Args:
-        task_id: The task ID to check.
-        user_id: Authenticated user ID for ownership validation.
-
-    Returns:
-        dict: Task status including task_id, status, tool_name, and operation_id.
-
-    Raises:
-        NotFoundError: If task_id is not found or user doesn't have access.
-    """
-    task = await stream_registry.get_task(task_id)
-
-    if task is None:
-        raise NotFoundError(f"Task {task_id} not found.")
-
-    # Validate ownership - if task has an owner, requester must match
-    if task.user_id and user_id != task.user_id:
-        raise NotFoundError(f"Task {task_id} not found.")
-
-    return {
-        "task_id": task.task_id,
-        "session_id": task.session_id,
-        "status": task.status,
-        "tool_name": task.tool_name,
-        "operation_id": task.operation_id,
-        "created_at": task.created_at.isoformat(),
-    }
-
-
-# ========== External Completion Webhook ==========
-
-
-@router.post(
-    "/operations/{operation_id}/complete",
-    status_code=200,
-)
-async def complete_operation(
-    operation_id: str,
-    request: OperationCompleteRequest,
-    x_api_key: str | None = Header(default=None),
-) -> dict:
-    """
-    External completion webhook for long-running operations.
-
-    Called by Agent Generator (or other services) when an operation completes.
-    This triggers the stream registry to publish completion and continue LLM generation.
-
-    Args:
-        operation_id: The operation ID to complete.
-        request: Completion payload with success status and result/error.
-        x_api_key: Internal API key for authentication.
-
-    Returns:
-        dict: Status of the completion.
-
-    Raises:
-        HTTPException: If API key is invalid or operation not found.
-    """
-    # Validate internal API key - reject if not configured or invalid
-    if not config.internal_api_key:
-        logger.error(
-            "Operation complete webhook rejected: CHAT_INTERNAL_API_KEY not configured"
-        )
-        raise HTTPException(
-            status_code=503,
-            detail="Webhook not available: internal API key not configured",
-        )
-    if x_api_key != config.internal_api_key:
-        raise HTTPException(status_code=401, detail="Invalid API key")
-
-    # Find task by operation_id
-    task = await stream_registry.find_task_by_operation_id(operation_id)
-    if task is None:
-        raise HTTPException(
-            status_code=404,
-            detail=f"Operation {operation_id} not found",
-        )
-
-    logger.info(
-        f"Received completion webhook for operation {operation_id} "
-        f"(task_id={task.task_id}, success={request.success})"
-    )
-
-    if request.success:
-        await process_operation_success(task, request.result)
-    else:
-        await process_operation_failure(task, request.error)
-
-    return {"status": "ok", "task_id": task.task_id}
-
-
-# ========== Configuration ==========
-
-
-@router.get("/config/ttl", status_code=200)
-async def get_ttl_config() -> dict:
-    """
-    Get the stream TTL configuration.
-
-    Returns the Time-To-Live settings for chat streams, which determines
-    how long clients can reconnect to an active stream.
-
-    Returns:
-        dict: TTL configuration with seconds and milliseconds values.
-    """
-    return {
-        "stream_ttl_seconds": config.stream_ttl,
-        "stream_ttl_ms": config.stream_ttl * 1000,
-    }
-
-
 # ========== Health Check ==========


@@ -966,43 +402,3 @@ async def health_check() -> dict:
        "service": "chat",
        "version": "0.1.0",
    }
-
-
-# ========== Schema Export (for OpenAPI / Orval codegen) ==========
-
-ToolResponseUnion = (
-    AgentsFoundResponse
-    | NoResultsResponse
-    | AgentDetailsResponse
-    | SetupRequirementsResponse
-    | ExecutionStartedResponse
-    | NeedLoginResponse
-    | ErrorResponse
-    | InputValidationErrorResponse
-    | AgentOutputResponse
-    | UnderstandingUpdatedResponse
-    | AgentPreviewResponse
-    | AgentSavedResponse
-    | ClarificationNeededResponse
-    | BlockListResponse
-    | BlockDetailsResponse
-    | BlockOutputResponse
-    | DocSearchResultsResponse
-    | DocPageResponse
-    | OperationStartedResponse
-    | OperationPendingResponse
-    | OperationInProgressResponse
-)
-
-
-@router.get(
-    "/schema/tool-responses",
-    response_model=ToolResponseUnion,
-    include_in_schema=True,
-    summary="[Dummy] Tool response type export for codegen",
-    description="This endpoint is not meant to be called. It exists solely to "
-    "expose tool response models in the OpenAPI schema for frontend codegen.",
-)
-async def _tool_response_schema() -> ToolResponseUnion:  # type: ignore[return]
-    """Never called at runtime. Exists only so Orval generates TS types."""
-    raise HTTPException(status_code=501, detail="Schema-only endpoint")
--- a/autogpt_platform/backend/backend/api/features/chat/service.py
+++ b/autogpt_platform/backend/backend/api/features/chat/service.py
@@ -0,0 +1,910 @@
+import asyncio
+import logging
+import time
+from asyncio import CancelledError
+from collections.abc import AsyncGenerator
+from typing import Any
+
+import orjson
+from langfuse import get_client, propagate_attributes
+from langfuse.openai import openai  # type: ignore
+from openai import (
+    APIConnectionError,
+    APIError,
+    APIStatusError,
+    PermissionDeniedError,
+    RateLimitError,
+)
+from openai.types.chat import ChatCompletionChunk, ChatCompletionToolParam
+
+from backend.data.understanding import (
+    format_understanding_for_prompt,
+    get_business_understanding,
+)
+from backend.util.exceptions import NotFoundError
+from backend.util.settings import Settings
+
+from .config import ChatConfig
+from .model import (
+    ChatMessage,
+    ChatSession,
+    Usage,
+    cache_chat_session,
+    get_chat_session,
+    update_session_title,
+    upsert_chat_session,
+)
+from .response_model import (
+    StreamBaseResponse,
+    StreamError,
+    StreamFinish,
+    StreamStart,
+    StreamTextDelta,
+    StreamTextEnd,
+    StreamTextStart,
+    StreamToolInputAvailable,
+    StreamToolInputStart,
+    StreamToolOutputAvailable,
+    StreamUsage,
+)
+from .tools import execute_tool, tools
+
+logger = logging.getLogger(__name__)
+
+config = ChatConfig()
+settings = Settings()
+client = openai.AsyncOpenAI(api_key=config.api_key, base_url=config.base_url)
+
+
+langfuse = get_client()
+
+
+class LangfuseNotConfiguredError(Exception):
+    """Raised when Langfuse is required but not configured."""
+
+    pass
+
+
+def _is_langfuse_configured() -> bool:
+    """Check if Langfuse credentials are configured."""
+    return bool(
+        settings.secrets.langfuse_public_key and settings.secrets.langfuse_secret_key
+    )
+
+
+async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
+    """Build the full system prompt including business understanding if available.
+
+    Args:
+        user_id: The user ID for fetching business understanding
+                     If "default" and this is the user's first session, will use "onboarding" instead.
+
+    Returns:
+        Tuple of (compiled prompt string, Langfuse prompt object for tracing)
+    """
+
+    # cache_ttl_seconds=0 disables SDK caching to always get the latest prompt
+    prompt = langfuse.get_prompt(config.langfuse_prompt_name, cache_ttl_seconds=0)
+
+    # If user is authenticated, try to fetch their business understanding
+    understanding = None
+    if user_id:
+        try:
+            understanding = await get_business_understanding(user_id)
+        except Exception as e:
+            logger.warning(f"Failed to fetch business understanding: {e}")
+            understanding = None
+    if understanding:
+        context = format_understanding_for_prompt(understanding)
+    else:
+        context = "This is the first time you are meeting the user. Greet them and introduce them to the platform"
+
+    compiled = prompt.compile(users_information=context)
+    return compiled, understanding
+
+
+async def _generate_session_title(message: str) -> str | None:
+    """Generate a concise title for a chat session based on the first message.
+
+    Args:
+        message: The first user message in the session
+
+    Returns:
+        A short title (3-6 words) or None if generation fails
+    """
+    try:
+        response = await client.chat.completions.create(
+            model=config.title_model,
+            messages=[
+                {
+                    "role": "system",
+                    "content": (
+                        "Generate a very short title (3-6 words) for a chat conversation "
+                        "based on the user's first message. The title should capture the "
+                        "main topic or intent. Return ONLY the title, no quotes or punctuation."
+                    ),
+                },
+                {"role": "user", "content": message[:500]},  # Limit input length
+            ],
+            max_tokens=20,
+        )
+        title = response.choices[0].message.content
+        if title:
+            # Clean up the title
+            title = title.strip().strip("\"'")
+            # Limit length
+            if len(title) > 50:
+                title = title[:47] + "..."
+            return title
+        return None
+    except Exception as e:
+        logger.warning(f"Failed to generate session title: {e}")
+        return None
+
+
+async def assign_user_to_session(
+    session_id: str,
+    user_id: str,
+) -> ChatSession:
+    """
+    Assign a user to a chat session.
+    """
+    session = await get_chat_session(session_id, None)
+    if not session:
+        raise NotFoundError(f"Session {session_id} not found")
+    session.user_id = user_id
+    return await upsert_chat_session(session)
+
+
+async def stream_chat_completion(
+    session_id: str,
+    message: str | None = None,
+    tool_call_response: str | None = None,
+    is_user_message: bool = True,
+    user_id: str | None = None,
+    retry_count: int = 0,
+    session: ChatSession | None = None,
+    context: dict[str, str] | None = None,  # {url: str, content: str}
+) -> AsyncGenerator[StreamBaseResponse, None]:
+    """Main entry point for streaming chat completions with database handling.
+
+    This function handles all database operations and delegates streaming
+    to the internal _stream_chat_chunks function.
+
+    Args:
+        session_id: Chat session ID
+        user_message: User's input message
+        user_id: User ID for authentication (None for anonymous)
+        session: Optional pre-loaded session object (for recursive calls to avoid Redis refetch)
+
+    Yields:
+        StreamBaseResponse objects formatted as SSE
+
+    Raises:
+        NotFoundError: If session_id is invalid
+        ValueError: If max_context_messages is exceeded
+
+    """
+    logger.info(
+        f"Streaming chat completion for session {session_id} for message {message} and user id {user_id}. Message is user message: {is_user_message}"
+    )
+
+    # Check if Langfuse is configured - required for chat functionality
+    if not _is_langfuse_configured():
+        logger.error("Chat request failed: Langfuse is not configured")
+        yield StreamError(
+            errorText="Chat service is not available. Langfuse must be configured "
+            "with LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY environment variables."
+        )
+        yield StreamFinish()
+        return
+
+    # Only fetch from Redis if session not provided (initial call)
+    if session is None:
+        session = await get_chat_session(session_id, user_id)
+        logger.info(
+            f"Fetched session from Redis: {session.session_id if session else 'None'}, "
+            f"message_count={len(session.messages) if session else 0}"
+        )
+    else:
+        logger.info(
+            f"Using provided session object: {session.session_id}, "
+            f"message_count={len(session.messages)}"
+        )
+
+    if not session:
+        raise NotFoundError(
+            f"Session {session_id} not found. Please create a new session first."
+        )
+
+    if message:
+        # Build message content with context if provided
+        message_content = message
+        if context and context.get("url") and context.get("content"):
+            context_text = f"Page URL: {context['url']}\n\nPage Content:\n{context['content']}\n\n---\n\nUser Message: {message}"
+            message_content = context_text
+            logger.info(
+                f"Including page context: URL={context['url']}, content_length={len(context['content'])}"
+            )
+
+        session.messages.append(
+            ChatMessage(
+                role="user" if is_user_message else "assistant", content=message_content
+            )
+        )
+        logger.info(
+            f"Appended message (role={'user' if is_user_message else 'assistant'}), "
+            f"new message_count={len(session.messages)}"
+        )
+
+    logger.info(
+        f"Upserting session: {session.session_id} with user id {session.user_id}, "
+        f"message_count={len(session.messages)}"
+    )
+    session = await upsert_chat_session(session)
+    assert session, "Session not found"
+
+    # Generate title for new sessions on first user message (non-blocking)
+    # Check: is_user_message, no title yet, and this is the first user message
+    if is_user_message and message and not session.title:
+        user_messages = [m for m in session.messages if m.role == "user"]
+        if len(user_messages) == 1:
+            # First user message - generate title in background
+            import asyncio
+
+            # Capture only the values we need (not the session object) to avoid
+            # stale data issues when the main flow modifies the session
+            captured_session_id = session_id
+            captured_message = message
+
+            async def _update_title():
+                try:
+                    title = await _generate_session_title(captured_message)
+                    if title:
+                        # Use dedicated title update function that doesn't
+                        # touch messages, avoiding race conditions
+                        await update_session_title(captured_session_id, title)
+                        logger.info(
+                            f"Generated title for session {captured_session_id}: {title}"
+                        )
+                except Exception as e:
+                    logger.warning(f"Failed to update session title: {e}")
+
+            # Fire and forget - don't block the chat response
+            asyncio.create_task(_update_title())
+
+    # Build system prompt with business understanding
+    system_prompt, understanding = await _build_system_prompt(user_id)
+
+    # Create Langfuse trace for this LLM call (each call gets its own trace, grouped by session_id)
+    # Using v3 SDK: start_observation creates a root span, update_trace sets trace-level attributes
+    input = message
+    if not message and tool_call_response:
+        input = tool_call_response
+
+    langfuse = get_client()
+    with langfuse.start_as_current_observation(
+        as_type="span",
+        name="user-copilot-request",
+        input=input,
+    ) as span:
+        with propagate_attributes(
+            session_id=session_id,
+            user_id=user_id,
+            tags=["copilot"],
+            metadata={
+                "users_information": format_understanding_for_prompt(understanding)[
+                    :200
+                ]  # langfuse only accepts upto to 200 chars
+            },
+        ):
+
+            # Initialize variables that will be used in finally block (must be defined before try)
+            assistant_response = ChatMessage(
+                role="assistant",
+                content="",
+            )
+            accumulated_tool_calls: list[dict[str, Any]] = []
+            has_saved_assistant_message = False
+            has_appended_streaming_message = False
+            last_cache_time = 0.0
+            last_cache_content_len = 0
+
+            # Wrap main logic in try/finally to ensure Langfuse observations are always ended
+            has_yielded_end = False
+            has_yielded_error = False
+            has_done_tool_call = False
+            has_received_text = False
+            text_streaming_ended = False
+            tool_response_messages: list[ChatMessage] = []
+            should_retry = False
+
+            # Generate unique IDs for AI SDK protocol
+            import uuid as uuid_module
+
+            message_id = str(uuid_module.uuid4())
+            text_block_id = str(uuid_module.uuid4())
+
+            # Yield message start
+            yield StreamStart(messageId=message_id)
+
+            try:
+                async for chunk in _stream_chat_chunks(
+                    session=session,
+                    tools=tools,
+                    system_prompt=system_prompt,
+                    text_block_id=text_block_id,
+                ):
+
+                    if isinstance(chunk, StreamTextStart):
+                        # Emit text-start before first text delta
+                        if not has_received_text:
+                            yield chunk
+                    elif isinstance(chunk, StreamTextDelta):
+                        delta = chunk.delta or ""
+                        assert assistant_response.content is not None
+                        assistant_response.content += delta
+                        has_received_text = True
+                        if not has_appended_streaming_message:
+                            session.messages.append(assistant_response)
+                            has_appended_streaming_message = True
+                        current_time = time.monotonic()
+                        content_len = len(assistant_response.content)
+                        if (
+                            current_time - last_cache_time >= 1.0
+                            and content_len > last_cache_content_len
+                        ):
+                            try:
+                                await cache_chat_session(session)
+                            except Exception as e:
+                                logger.warning(
+                                    f"Failed to cache partial session {session.session_id}: {e}"
+                                )
+                            last_cache_time = current_time
+                            last_cache_content_len = content_len
+                        yield chunk
+                    elif isinstance(chunk, StreamTextEnd):
+                        # Emit text-end after text completes
+                        if has_received_text and not text_streaming_ended:
+                            text_streaming_ended = True
+                            if assistant_response.content:
+                                logger.warn(
+                                    f"StreamTextEnd: Attempting to set output {assistant_response.content}"
+                                )
+                                span.update_trace(output=assistant_response.content)
+                                span.update(output=assistant_response.content)
+                            yield chunk
+                    elif isinstance(chunk, StreamToolInputStart):
+                        # Emit text-end before first tool call, but only if we've received text
+                        if has_received_text and not text_streaming_ended:
+                            yield StreamTextEnd(id=text_block_id)
+                            text_streaming_ended = True
+                        yield chunk
+                    elif isinstance(chunk, StreamToolInputAvailable):
+                        # Accumulate tool calls in OpenAI format
+                        accumulated_tool_calls.append(
+                            {
+                                "id": chunk.toolCallId,
+                                "type": "function",
+                                "function": {
+                                    "name": chunk.toolName,
+                                    "arguments": orjson.dumps(chunk.input).decode(
+                                        "utf-8"
+                                    ),
+                                },
+                            }
+                        )
+                    elif isinstance(chunk, StreamToolOutputAvailable):
+                        result_content = (
+                            chunk.output
+                            if isinstance(chunk.output, str)
+                            else orjson.dumps(chunk.output).decode("utf-8")
+                        )
+                        tool_response_messages.append(
+                            ChatMessage(
+                                role="tool",
+                                content=result_content,
+                                tool_call_id=chunk.toolCallId,
+                            )
+                        )
+                        has_done_tool_call = True
+                        # Track if any tool execution failed
+                        if not chunk.success:
+                            logger.warning(
+                                f"Tool {chunk.toolName} (ID: {chunk.toolCallId}) execution failed"
+                            )
+                        yield chunk
+                    elif isinstance(chunk, StreamFinish):
+                        if not has_done_tool_call:
+                            # Emit text-end before finish if we received text but haven't closed it
+                            if has_received_text and not text_streaming_ended:
+                                yield StreamTextEnd(id=text_block_id)
+                                text_streaming_ended = True
+
+                            # Save assistant message before yielding finish to ensure it's persisted
+                            # even if client disconnects immediately after receiving StreamFinish
+                            if not has_saved_assistant_message:
+                                messages_to_save_early: list[ChatMessage] = []
+                                if accumulated_tool_calls:
+                                    assistant_response.tool_calls = (
+                                        accumulated_tool_calls
+                                    )
+                                if not has_appended_streaming_message and (
+                                    assistant_response.content
+                                    or assistant_response.tool_calls
+                                ):
+                                    messages_to_save_early.append(assistant_response)
+                                messages_to_save_early.extend(tool_response_messages)
+
+                                if messages_to_save_early:
+                                    session.messages.extend(messages_to_save_early)
+                                    logger.info(
+                                        f"Saving assistant message before StreamFinish: "
+                                        f"content_len={len(assistant_response.content or '')}, "
+                                        f"tool_calls={len(assistant_response.tool_calls or [])}, "
+                                        f"tool_responses={len(tool_response_messages)}"
+                                    )
+                                if (
+                                    messages_to_save_early
+                                    or has_appended_streaming_message
+                                ):
+                                    await upsert_chat_session(session)
+                                    has_saved_assistant_message = True
+
+                            has_yielded_end = True
+                            yield chunk
+                    elif isinstance(chunk, StreamError):
+                        has_yielded_error = True
+                        yield chunk
+                    elif isinstance(chunk, StreamUsage):
+                        session.usage.append(
+                            Usage(
+                                prompt_tokens=chunk.promptTokens,
+                                completion_tokens=chunk.completionTokens,
+                                total_tokens=chunk.totalTokens,
+                            )
+                        )
+                    else:
+                        logger.error(
+                            f"Unknown chunk type: {type(chunk)}", exc_info=True
+                        )
+                if assistant_response.content:
+                    langfuse.update_current_trace(output=assistant_response.content)
+                    langfuse.update_current_span(output=assistant_response.content)
+                elif tool_response_messages:
+                    langfuse.update_current_trace(output=str(tool_response_messages))
+                    langfuse.update_current_span(output=str(tool_response_messages))
+
+            except CancelledError:
+                if not has_saved_assistant_message:
+                    if accumulated_tool_calls:
+                        assistant_response.tool_calls = accumulated_tool_calls
+                    if assistant_response.content:
+                        assistant_response.content = (
+                            f"{assistant_response.content}\n\n[interrupted]"
+                        )
+                    else:
+                        assistant_response.content = "[interrupted]"
+                    if not has_appended_streaming_message:
+                        session.messages.append(assistant_response)
+                    if tool_response_messages:
+                        session.messages.extend(tool_response_messages)
+                    try:
+                        await upsert_chat_session(session)
+                    except Exception as e:
+                        logger.warning(
+                            f"Failed to save interrupted session {session.session_id}: {e}"
+                        )
+                raise
+            except Exception as e:
+                logger.error(f"Error during stream: {e!s}", exc_info=True)
+
+                # Check if this is a retryable error (JSON parsing, incomplete tool calls, etc.)
+                is_retryable = isinstance(
+                    e, (orjson.JSONDecodeError, KeyError, TypeError)
+                )
+
+                if is_retryable and retry_count < config.max_retries:
+                    logger.info(
+                        f"Retryable error encountered. Attempt {retry_count + 1}/{config.max_retries}"
+                    )
+                    should_retry = True
+                else:
+                    # Non-retryable error or max retries exceeded
+                    # Save any partial progress before reporting error
+                    messages_to_save: list[ChatMessage] = []
+
+                    # Add assistant message if it has content or tool calls
+                    if accumulated_tool_calls:
+                        assistant_response.tool_calls = accumulated_tool_calls
+                    if not has_appended_streaming_message and (
+                        assistant_response.content or assistant_response.tool_calls
+                    ):
+                        messages_to_save.append(assistant_response)
+
+                    # Add tool response messages after assistant message
+                    messages_to_save.extend(tool_response_messages)
+
+                    if not has_saved_assistant_message:
+                        if messages_to_save:
+                            session.messages.extend(messages_to_save)
+                        if messages_to_save or has_appended_streaming_message:
+                            await upsert_chat_session(session)
+
+                    if not has_yielded_error:
+                        error_message = str(e)
+                        if not is_retryable:
+                            error_message = f"Non-retryable error: {error_message}"
+                        elif retry_count >= config.max_retries:
+                            error_message = f"Max retries ({config.max_retries}) exceeded: {error_message}"
+
+                        error_response = StreamError(errorText=error_message)
+                        yield error_response
+                    if not has_yielded_end:
+                        yield StreamFinish()
+                    return
+
+            # Handle retry outside of exception handler to avoid nesting
+            if should_retry and retry_count < config.max_retries:
+                logger.info(
+                    f"Retrying stream_chat_completion for session {session_id}, attempt {retry_count + 1}"
+                )
+                async for chunk in stream_chat_completion(
+                    session_id=session.session_id,
+                    user_id=user_id,
+                    retry_count=retry_count + 1,
+                    session=session,
+                    context=context,
+                ):
+                    yield chunk
+                return  # Exit after retry to avoid double-saving in finally block
+
+            # Normal completion path - save session and handle tool call continuation
+            # Only save if we haven't already saved when StreamFinish was received
+            if not has_saved_assistant_message:
+                logger.info(
+                    f"Normal completion path: session={session.session_id}, "
+                    f"current message_count={len(session.messages)}"
+                )
+
+                # Build the messages list in the correct order
+                messages_to_save: list[ChatMessage] = []
+
+                # Add assistant message with tool_calls if any
+                if accumulated_tool_calls:
+                    assistant_response.tool_calls = accumulated_tool_calls
+                    logger.info(
+                        f"Added {len(accumulated_tool_calls)} tool calls to assistant message"
+                    )
+                if not has_appended_streaming_message and (
+                    assistant_response.content or assistant_response.tool_calls
+                ):
+                    messages_to_save.append(assistant_response)
+                    logger.info(
+                        f"Saving assistant message with content_len={len(assistant_response.content or '')}, tool_calls={len(assistant_response.tool_calls or [])}"
+                    )
+
+                # Add tool response messages after assistant message
+                messages_to_save.extend(tool_response_messages)
+                logger.info(
+                    f"Saving {len(tool_response_messages)} tool response messages, "
+                    f"total_to_save={len(messages_to_save)}"
+                )
+
+                if messages_to_save:
+                    session.messages.extend(messages_to_save)
+                    logger.info(
+                        f"Extended session messages, new message_count={len(session.messages)}"
+                    )
+                if messages_to_save or has_appended_streaming_message:
+                    await upsert_chat_session(session)
+            else:
+                logger.info(
+                    "Assistant message already saved when StreamFinish was received, "
+                    "skipping duplicate save"
+                )
+
+            # If we did a tool call, stream the chat completion again to get the next response
+            if has_done_tool_call:
+                logger.info(
+                    "Tool call executed, streaming chat completion again to get assistant response"
+                )
+                async for chunk in stream_chat_completion(
+                    session_id=session.session_id,
+                    user_id=user_id,
+                    session=session,  # Pass session object to avoid Redis refetch
+                    context=context,
+                    tool_call_response=str(tool_response_messages),
+                ):
+                    yield chunk
+
+
+# Retry configuration for OpenAI API calls
+MAX_RETRIES = 3
+BASE_DELAY_SECONDS = 1.0
+MAX_DELAY_SECONDS = 30.0
+
+
+def _is_retryable_error(error: Exception) -> bool:
+    """Determine if an error is retryable."""
+    if isinstance(error, RateLimitError):
+        return True
+    if isinstance(error, APIConnectionError):
+        return True
+    if isinstance(error, APIStatusError):
+        # APIStatusError has a response with status_code
+        # Retry on 5xx status codes (server errors)
+        if error.response.status_code >= 500:
+            return True
+    if isinstance(error, APIError):
+        # Retry on overloaded errors or 500 errors (may not have status code)
+        error_message = str(error).lower()
+        if "overloaded" in error_message or "internal server error" in error_message:
+            return True
+    return False
+
+
+def _is_region_blocked_error(error: Exception) -> bool:
+    if isinstance(error, PermissionDeniedError):
+        return "not available in your region" in str(error).lower()
+    return "not available in your region" in str(error).lower()
+
+
+async def _stream_chat_chunks(
+    session: ChatSession,
+    tools: list[ChatCompletionToolParam],
+    system_prompt: str | None = None,
+    text_block_id: str | None = None,
+) -> AsyncGenerator[StreamBaseResponse, None]:
+    """
+    Pure streaming function for OpenAI chat completions with tool calling.
+
+    This function is database-agnostic and focuses only on streaming logic.
+    Implements exponential backoff retry for transient API errors.
+
+    Args:
+        session: Chat session with conversation history
+        tools: Available tools for the model
+        system_prompt: System prompt to prepend to messages
+
+    Yields:
+        SSE formatted JSON response objects
+
+    """
+    model = config.model
+
+    logger.info("Starting pure chat stream")
+
+    # Build messages with system prompt prepended
+    messages = session.to_openai_messages()
+    if system_prompt:
+        from openai.types.chat import ChatCompletionSystemMessageParam
+
+        system_message = ChatCompletionSystemMessageParam(
+            role="system",
+            content=system_prompt,
+        )
+        messages = [system_message] + messages
+
+    # Loop to handle tool calls and continue conversation
+    while True:
+        retry_count = 0
+        last_error: Exception | None = None
+
+        while retry_count <= MAX_RETRIES:
+            try:
+                logger.info(
+                    f"Creating OpenAI chat completion stream..."
+                    f"{f' (retry {retry_count}/{MAX_RETRIES})' if retry_count > 0 else ''}"
+                )
+
+                # Create the stream with proper types
+                stream = await client.chat.completions.create(
+                    model=model,
+                    messages=messages,
+                    tools=tools,
+                    tool_choice="auto",
+                    stream=True,
+                    stream_options={"include_usage": True},
+                )
+
+                # Variables to accumulate tool calls
+                tool_calls: list[dict[str, Any]] = []
+                active_tool_call_idx: int | None = None
+                finish_reason: str | None = None
+                # Track which tool call indices have had their start event emitted
+                emitted_start_for_idx: set[int] = set()
+
+                # Track if we've started the text block
+                text_started = False
+
+                # Process the stream
+                chunk: ChatCompletionChunk
+                async for chunk in stream:
+                    if chunk.usage:
+                        yield StreamUsage(
+                            promptTokens=chunk.usage.prompt_tokens,
+                            completionTokens=chunk.usage.completion_tokens,
+                            totalTokens=chunk.usage.total_tokens,
+                        )
+
+                    if chunk.choices:
+                        choice = chunk.choices[0]
+                        delta = choice.delta
+
+                        # Capture finish reason
+                        if choice.finish_reason:
+                            finish_reason = choice.finish_reason
+                            logger.info(f"Finish reason: {finish_reason}")
+
+                        # Handle content streaming
+                        if delta.content:
+                            # Emit text-start on first text content
+                            if not text_started and text_block_id:
+                                yield StreamTextStart(id=text_block_id)
+                                text_started = True
+                            # Stream the text delta
+                            text_response = StreamTextDelta(
+                                id=text_block_id or "",
+                                delta=delta.content,
+                            )
+                            yield text_response
+
+                        # Handle tool calls
+                        if delta.tool_calls:
+                            for tc_chunk in delta.tool_calls:
+                                idx = tc_chunk.index
+
+                                # Update active tool call index if needed
+                                if (
+                                    active_tool_call_idx is None
+                                    or active_tool_call_idx != idx
+                                ):
+                                    active_tool_call_idx = idx
+
+                                # Ensure we have a tool call object at this index
+                                while len(tool_calls) <= idx:
+                                    tool_calls.append(
+                                        {
+                                            "id": "",
+                                            "type": "function",
+                                            "function": {
+                                                "name": "",
+                                                "arguments": "",
+                                            },
+                                        },
+                                    )
+
+                                # Accumulate the tool call data
+                                if tc_chunk.id:
+                                    tool_calls[idx]["id"] = tc_chunk.id
+                                if tc_chunk.function:
+                                    if tc_chunk.function.name:
+                                        tool_calls[idx]["function"][
+                                            "name"
+                                        ] = tc_chunk.function.name
+                                    if tc_chunk.function.arguments:
+                                        tool_calls[idx]["function"][
+                                            "arguments"
+                                        ] += tc_chunk.function.arguments
+
+                                # Emit StreamToolInputStart only after we have the tool call ID
+                                if (
+                                    idx not in emitted_start_for_idx
+                                    and tool_calls[idx]["id"]
+                                    and tool_calls[idx]["function"]["name"]
+                                ):
+                                    yield StreamToolInputStart(
+                                        toolCallId=tool_calls[idx]["id"],
+                                        toolName=tool_calls[idx]["function"]["name"],
+                                    )
+                                    emitted_start_for_idx.add(idx)
+                logger.info(f"Stream complete. Finish reason: {finish_reason}")
+
+                # Yield all accumulated tool calls after the stream is complete
+                # This ensures all tool call arguments have been fully received
+                for idx, tool_call in enumerate(tool_calls):
+                    try:
+                        async for tc in _yield_tool_call(tool_calls, idx, session):
+                            yield tc
+                    except (orjson.JSONDecodeError, KeyError, TypeError) as e:
+                        logger.error(
+                            f"Failed to parse tool call {idx}: {e}",
+                            exc_info=True,
+                            extra={"tool_call": tool_call},
+                        )
+                        yield StreamError(
+                            errorText=f"Invalid tool call arguments for tool {tool_call.get('function', {}).get('name', 'unknown')}: {e}",
+                        )
+                        # Re-raise to trigger retry logic in the parent function
+                        raise
+
+                yield StreamFinish()
+                return
+            except Exception as e:
+                last_error = e
+                if _is_retryable_error(e) and retry_count < MAX_RETRIES:
+                    retry_count += 1
+                    # Calculate delay with exponential backoff
+                    delay = min(
+                        BASE_DELAY_SECONDS * (2 ** (retry_count - 1)),
+                        MAX_DELAY_SECONDS,
+                    )
+                    logger.warning(
+                        f"Retryable error in stream: {e!s}. "
+                        f"Retrying in {delay:.1f}s (attempt {retry_count}/{MAX_RETRIES})"
+                    )
+                    await asyncio.sleep(delay)
+                    continue  # Retry the stream
+                else:
+                    # Non-retryable error or max retries exceeded
+                    logger.error(
+                        f"Error in stream (not retrying): {e!s}",
+                        exc_info=True,
+                    )
+                    error_code = None
+                    error_text = str(e)
+                    if _is_region_blocked_error(e):
+                        error_code = "MODEL_NOT_AVAILABLE_REGION"
+                        error_text = (
+                            "This model is not available in your region. "
+                            "Please connect via VPN and try again."
+                        )
+                    error_response = StreamError(
+                        errorText=error_text,
+                        code=error_code,
+                    )
+                    yield error_response
+                    yield StreamFinish()
+                    return
+
+        # If we exit the retry loop without returning, it means we exhausted retries
+        if last_error:
+            logger.error(
+                f"Max retries ({MAX_RETRIES}) exceeded. Last error: {last_error!s}",
+                exc_info=True,
+            )
+            yield StreamError(errorText=f"Max retries exceeded: {last_error!s}")
+            yield StreamFinish()
+            return
+
+
+async def _yield_tool_call(
+    tool_calls: list[dict[str, Any]],
+    yield_idx: int,
+    session: ChatSession,
+) -> AsyncGenerator[StreamBaseResponse, None]:
+    """
+    Yield a tool call and its execution result.
+
+    Raises:
+        orjson.JSONDecodeError: If tool call arguments cannot be parsed as JSON
+        KeyError: If expected tool call fields are missing
+        TypeError: If tool call structure is invalid
+    """
+    tool_name = tool_calls[yield_idx]["function"]["name"]
+    tool_call_id = tool_calls[yield_idx]["id"]
+    logger.info(f"Yielding tool call: {tool_calls[yield_idx]}")
+
+    # Parse tool call arguments - handle empty arguments gracefully
+    raw_arguments = tool_calls[yield_idx]["function"]["arguments"]
+    if raw_arguments:
+        arguments = orjson.loads(raw_arguments)
+    else:
+        arguments = {}
+
+    yield StreamToolInputAvailable(
+        toolCallId=tool_call_id,
+        toolName=tool_name,
+        input=arguments,
+    )
+
+    tool_execution_response: StreamToolOutputAvailable = await execute_tool(
+        tool_name=tool_name,
+        parameters=arguments,
+        tool_call_id=tool_call_id,
+        user_id=session.user_id,
+        session=session,
+    )
+
+    yield tool_execution_response
--- a/autogpt_platform/backend/backend/api/features/chat/service_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/service_test.py
@@ -0,0 +1,82 @@
+import logging
+from os import getenv
+
+import pytest
+
+from . import service as chat_service
+from .model import create_chat_session, get_chat_session, upsert_chat_session
+from .response_model import (
+    StreamError,
+    StreamFinish,
+    StreamTextDelta,
+    StreamToolOutputAvailable,
+)
+
+logger = logging.getLogger(__name__)
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_stream_chat_completion(setup_test_user, test_user_id):
+    """
+    Test the stream_chat_completion function.
+    """
+    api_key: str | None = getenv("OPEN_ROUTER_API_KEY")
+    if not api_key:
+        return pytest.skip("OPEN_ROUTER_API_KEY is not set, skipping test")
+
+    session = await create_chat_session(test_user_id)
+
+    has_errors = False
+    has_ended = False
+    assistant_message = ""
+    async for chunk in chat_service.stream_chat_completion(
+        session.session_id, "Hello, how are you?", user_id=session.user_id
+    ):
+        logger.info(chunk)
+        if isinstance(chunk, StreamError):
+            has_errors = True
+        if isinstance(chunk, StreamTextDelta):
+            assistant_message += chunk.delta
+        if isinstance(chunk, StreamFinish):
+            has_ended = True
+
+    assert has_ended, "Chat completion did not end"
+    assert not has_errors, "Error occurred while streaming chat completion"
+    assert assistant_message, "Assistant message is empty"
+
+
+@pytest.mark.asyncio(loop_scope="session")
+async def test_stream_chat_completion_with_tool_calls(setup_test_user, test_user_id):
+    """
+    Test the stream_chat_completion function.
+    """
+    api_key: str | None = getenv("OPEN_ROUTER_API_KEY")
+    if not api_key:
+        return pytest.skip("OPEN_ROUTER_API_KEY is not set, skipping test")
+
+    session = await create_chat_session(test_user_id)
+    session = await upsert_chat_session(session)
+
+    has_errors = False
+    has_ended = False
+    had_tool_calls = False
+    async for chunk in chat_service.stream_chat_completion(
+        session.session_id,
+        "Please find me an agent that can help me with my business. Use the query 'moneny printing agent'",
+        user_id=session.user_id,
+    ):
+        logger.info(chunk)
+        if isinstance(chunk, StreamError):
+            has_errors = True
+
+        if isinstance(chunk, StreamFinish):
+            has_ended = True
+        if isinstance(chunk, StreamToolOutputAvailable):
+            had_tool_calls = True
+
+    assert has_ended, "Chat completion did not end"
+    assert not has_errors, "Error occurred while streaming chat completion"
+    assert had_tool_calls, "Tool calls did not occur"
+    session = await get_chat_session(session.session_id)
+    assert session, "Session not found"
+    assert session.usage, "Usage is empty"
--- a/autogpt_platform/backend/backend/api/features/chat/tools/init.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/init.py
@@ -0,0 +1,59 @@
+from typing import TYPE_CHECKING, Any
+
+from openai.types.chat import ChatCompletionToolParam
+
+from backend.api.features.chat.model import ChatSession
+
+from .add_understanding import AddUnderstandingTool
+from .agent_output import AgentOutputTool
+from .base import BaseTool
+from .create_agent import CreateAgentTool
+from .edit_agent import EditAgentTool
+from .find_agent import FindAgentTool
+from .find_block import FindBlockTool
+from .find_library_agent import FindLibraryAgentTool
+from .get_doc_page import GetDocPageTool
+from .run_agent import RunAgentTool
+from .run_block import RunBlockTool
+from .search_docs import SearchDocsTool
+
+if TYPE_CHECKING:
+    from backend.api.features.chat.response_model import StreamToolOutputAvailable
+
+# Single source of truth for all tools
+TOOL_REGISTRY: dict[str, BaseTool] = {
+    "add_understanding": AddUnderstandingTool(),
+    "create_agent": CreateAgentTool(),
+    "edit_agent": EditAgentTool(),
+    "find_agent": FindAgentTool(),
+    "find_block": FindBlockTool(),
+    "find_library_agent": FindLibraryAgentTool(),
+    "run_agent": RunAgentTool(),
+    "run_block": RunBlockTool(),
+    "view_agent_output": AgentOutputTool(),
+    "search_docs": SearchDocsTool(),
+    "get_doc_page": GetDocPageTool(),
+}
+
+# Export individual tool instances for backwards compatibility
+find_agent_tool = TOOL_REGISTRY["find_agent"]
+run_agent_tool = TOOL_REGISTRY["run_agent"]
+
+# Generated from registry for OpenAI API
+tools: list[ChatCompletionToolParam] = [
+    tool.as_openai_tool() for tool in TOOL_REGISTRY.values()
+]
+
+
+async def execute_tool(
+    tool_name: str,
+    parameters: dict[str, Any],
+    user_id: str | None,
+    session: ChatSession,
+    tool_call_id: str,
+) -> "StreamToolOutputAvailable":
+    """Execute a tool by name."""
+    tool = TOOL_REGISTRY.get(tool_name)
+    if not tool:
+        raise ValueError(f"Tool {tool_name} not found")
+    return await tool.execute(user_id, session, tool_call_id, **parameters)
--- a/autogpt_platform/backend/backend/api/features/chat/tools/_test_data.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/_test_data.py
@@ -6,11 +6,11 @@ import pytest
 from prisma.types import ProfileCreateInput
 from pydantic import SecretStr

+from backend.api.features.chat.model import ChatSession
 from backend.api.features.store import db as store_db
 from backend.blocks.firecrawl.scrape import FirecrawlScrapeBlock
 from backend.blocks.io import AgentInputBlock, AgentOutputBlock
 from backend.blocks.llm import AITextGeneratorBlock
-from backend.copilot.model import ChatSession
 from backend.data.db import prisma
 from backend.data.graph import Graph, Link, Node, create_graph
 from backend.data.model import APIKeyCredentials
--- a/autogpt_platform/backend/backend/api/features/chat/tools/add_understanding.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/add_understanding.py
@@ -3,9 +3,13 @@
 import logging
 from typing import Any

-from backend.copilot.model import ChatSession
-from backend.data.db_accessors import understanding_db
-from backend.data.understanding import BusinessUnderstandingInput
+from langfuse import observe
+
+from backend.api.features.chat.model import ChatSession
+from backend.data.understanding import (
+    BusinessUnderstandingInput,
+    upsert_business_understanding,
+)

 from .base import BaseTool
 from .models import ErrorResponse, ToolResponseBase, UnderstandingUpdatedResponse
@@ -57,6 +61,7 @@ and automations for the user's specific needs."""
        """Requires authentication to store user-specific data."""
        return True

+    @observe(as_type="tool", name="add_understanding")
    async def _execute(
        self,
        user_id: str | None,
@@ -97,9 +102,7 @@ and automations for the user's specific needs."""
        ]

        # Upsert with merge
-        understanding = await understanding_db().upsert_business_understanding(
-            user_id, input_data
-        )
+        understanding = await upsert_business_understanding(user_id, input_data)

        # Build current understanding summary (filter out empty values)
        current_understanding = {
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/init.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/init.py
@@ -0,0 +1,29 @@
+"""Agent generator package - Creates agents from natural language."""
+
+from .core import (
+    apply_agent_patch,
+    decompose_goal,
+    generate_agent,
+    generate_agent_patch,
+    get_agent_as_json,
+    save_agent_to_library,
+)
+from .fixer import apply_all_fixes
+from .utils import get_blocks_info
+from .validator import validate_agent
+
+__all__ = [
+    # Core functions
+    "decompose_goal",
+    "generate_agent",
+    "generate_agent_patch",
+    "apply_agent_patch",
+    "save_agent_to_library",
+    "get_agent_as_json",
+    # Fixer
+    "apply_all_fixes",
+    # Validator
+    "validate_agent",
+    # Utils
+    "get_blocks_info",
+]
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/client.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/client.py
@@ -0,0 +1,25 @@
+"""OpenRouter client configuration for agent generation."""
+
+import os
+
+from openai import AsyncOpenAI
+
+# Configuration - use OPEN_ROUTER_API_KEY for consistency with chat/config.py
+OPENROUTER_API_KEY = os.getenv("OPEN_ROUTER_API_KEY")
+AGENT_GENERATOR_MODEL = os.getenv("AGENT_GENERATOR_MODEL", "anthropic/claude-opus-4.5")
+
+# OpenRouter client (OpenAI-compatible API)
+_client: AsyncOpenAI | None = None
+
+
+def get_client() -> AsyncOpenAI:
+    """Get or create the OpenRouter client."""
+    global _client
+    if _client is None:
+        if not OPENROUTER_API_KEY:
+            raise ValueError("OPENROUTER_API_KEY environment variable is required")
+        _client = AsyncOpenAI(
+            base_url="https://openrouter.ai/api/v1",
+            api_key=OPENROUTER_API_KEY,
+        )
+    return _client
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/core.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/core.py
@@ -0,0 +1,391 @@
+"""Core agent generation functions."""
+
+import copy
+import json
+import logging
+import uuid
+from typing import Any
+
+from backend.api.features.library import db as library_db
+from backend.data.graph import Graph, Link, Node, create_graph
+
+from .client import AGENT_GENERATOR_MODEL, get_client
+from .prompts import DECOMPOSITION_PROMPT, GENERATION_PROMPT, PATCH_PROMPT
+from .utils import get_block_summaries, parse_json_from_llm
+
+logger = logging.getLogger(__name__)
+
+
+async def decompose_goal(description: str, context: str = "") -> dict[str, Any] | None:
+    """Break down a goal into steps or return clarifying questions.
+
+    Args:
+        description: Natural language goal description
+        context: Additional context (e.g., answers to previous questions)
+
+    Returns:
+        Dict with either:
+        - {"type": "clarifying_questions", "questions": [...]}
+        - {"type": "instructions", "steps": [...]}
+        Or None on error
+    """
+    client = get_client()
+    prompt = DECOMPOSITION_PROMPT.format(block_summaries=get_block_summaries())
+
+    full_description = description
+    if context:
+        full_description = f"{description}\n\nAdditional context:\n{context}"
+
+    try:
+        response = await client.chat.completions.create(
+            model=AGENT_GENERATOR_MODEL,
+            messages=[
+                {"role": "system", "content": prompt},
+                {"role": "user", "content": full_description},
+            ],
+            temperature=0,
+        )
+
+        content = response.choices[0].message.content
+        if content is None:
+            logger.error("LLM returned empty content for decomposition")
+            return None
+
+        result = parse_json_from_llm(content)
+
+        if result is None:
+            logger.error(f"Failed to parse decomposition response: {content[:200]}")
+            return None
+
+        return result
+
+    except Exception as e:
+        logger.error(f"Error decomposing goal: {e}")
+        return None
+
+
+async def generate_agent(instructions: dict[str, Any]) -> dict[str, Any] | None:
+    """Generate agent JSON from instructions.
+
+    Args:
+        instructions: Structured instructions from decompose_goal
+
+    Returns:
+        Agent JSON dict or None on error
+    """
+    client = get_client()
+    prompt = GENERATION_PROMPT.format(block_summaries=get_block_summaries())
+
+    try:
+        response = await client.chat.completions.create(
+            model=AGENT_GENERATOR_MODEL,
+            messages=[
+                {"role": "system", "content": prompt},
+                {"role": "user", "content": json.dumps(instructions, indent=2)},
+            ],
+            temperature=0,
+        )
+
+        content = response.choices[0].message.content
+        if content is None:
+            logger.error("LLM returned empty content for agent generation")
+            return None
+
+        result = parse_json_from_llm(content)
+
+        if result is None:
+            logger.error(f"Failed to parse agent JSON: {content[:200]}")
+            return None
+
+        # Ensure required fields
+        if "id" not in result:
+            result["id"] = str(uuid.uuid4())
+        if "version" not in result:
+            result["version"] = 1
+        if "is_active" not in result:
+            result["is_active"] = True
+
+        return result
+
+    except Exception as e:
+        logger.error(f"Error generating agent: {e}")
+        return None
+
+
+def json_to_graph(agent_json: dict[str, Any]) -> Graph:
+    """Convert agent JSON dict to Graph model.
+
+    Args:
+        agent_json: Agent JSON with nodes and links
+
+    Returns:
+        Graph ready for saving
+    """
+    nodes = []
+    for n in agent_json.get("nodes", []):
+        node = Node(
+            id=n.get("id", str(uuid.uuid4())),
+            block_id=n["block_id"],
+            input_default=n.get("input_default", {}),
+            metadata=n.get("metadata", {}),
+        )
+        nodes.append(node)
+
+    links = []
+    for link_data in agent_json.get("links", []):
+        link = Link(
+            id=link_data.get("id", str(uuid.uuid4())),
+            source_id=link_data["source_id"],
+            sink_id=link_data["sink_id"],
+            source_name=link_data["source_name"],
+            sink_name=link_data["sink_name"],
+            is_static=link_data.get("is_static", False),
+        )
+        links.append(link)
+
+    return Graph(
+        id=agent_json.get("id", str(uuid.uuid4())),
+        version=agent_json.get("version", 1),
+        is_active=agent_json.get("is_active", True),
+        name=agent_json.get("name", "Generated Agent"),
+        description=agent_json.get("description", ""),
+        nodes=nodes,
+        links=links,
+    )
+
+
+def _reassign_node_ids(graph: Graph) -> None:
+    """Reassign all node and link IDs to new UUIDs.
+
+    This is needed when creating a new version to avoid unique constraint violations.
+    """
+    # Create mapping from old node IDs to new UUIDs
+    id_map = {node.id: str(uuid.uuid4()) for node in graph.nodes}
+
+    # Reassign node IDs
+    for node in graph.nodes:
+        node.id = id_map[node.id]
+
+    # Update link references to use new node IDs
+    for link in graph.links:
+        link.id = str(uuid.uuid4())  # Also give links new IDs
+        if link.source_id in id_map:
+            link.source_id = id_map[link.source_id]
+        if link.sink_id in id_map:
+            link.sink_id = id_map[link.sink_id]
+
+
+async def save_agent_to_library(
+    agent_json: dict[str, Any], user_id: str, is_update: bool = False
+) -> tuple[Graph, Any]:
+    """Save agent to database and user's library.
+
+    Args:
+        agent_json: Agent JSON dict
+        user_id: User ID
+        is_update: Whether this is an update to an existing agent
+
+    Returns:
+        Tuple of (created Graph, LibraryAgent)
+    """
+    from backend.data.graph import get_graph_all_versions
+
+    graph = json_to_graph(agent_json)
+
+    if is_update:
+        # For updates, keep the same graph ID but increment version
+        # and reassign node/link IDs to avoid conflicts
+        if graph.id:
+            existing_versions = await get_graph_all_versions(graph.id, user_id)
+            if existing_versions:
+                latest_version = max(v.version for v in existing_versions)
+                graph.version = latest_version + 1
+                # Reassign node IDs (but keep graph ID the same)
+                _reassign_node_ids(graph)
+                logger.info(f"Updating agent {graph.id} to version {graph.version}")
+    else:
+        # For new agents, always generate a fresh UUID to avoid collisions
+        graph.id = str(uuid.uuid4())
+        graph.version = 1
+        # Reassign all node IDs as well
+        _reassign_node_ids(graph)
+        logger.info(f"Creating new agent with ID {graph.id}")
+
+    # Save to database
+    created_graph = await create_graph(graph, user_id)
+
+    # Add to user's library (or update existing library agent)
+    library_agents = await library_db.create_library_agent(
+        graph=created_graph,
+        user_id=user_id,
+        sensitive_action_safe_mode=True,
+        create_library_agents_for_sub_graphs=False,
+    )
+
+    return created_graph, library_agents[0]
+
+
+async def get_agent_as_json(
+    graph_id: str, user_id: str | None
+) -> dict[str, Any] | None:
+    """Fetch an agent and convert to JSON format for editing.
+
+    Args:
+        graph_id: Graph ID or library agent ID
+        user_id: User ID
+
+    Returns:
+        Agent as JSON dict or None if not found
+    """
+    from backend.data.graph import get_graph
+
+    # Try to get the graph (version=None gets the active version)
+    graph = await get_graph(graph_id, version=None, user_id=user_id)
+    if not graph:
+        return None
+
+    # Convert to JSON format
+    nodes = []
+    for node in graph.nodes:
+        nodes.append(
+            {
+                "id": node.id,
+                "block_id": node.block_id,
+                "input_default": node.input_default,
+                "metadata": node.metadata,
+            }
+        )
+
+    links = []
+    for node in graph.nodes:
+        for link in node.output_links:
+            links.append(
+                {
+                    "id": link.id,
+                    "source_id": link.source_id,
+                    "sink_id": link.sink_id,
+                    "source_name": link.source_name,
+                    "sink_name": link.sink_name,
+                    "is_static": link.is_static,
+                }
+            )
+
+    return {
+        "id": graph.id,
+        "name": graph.name,
+        "description": graph.description,
+        "version": graph.version,
+        "is_active": graph.is_active,
+        "nodes": nodes,
+        "links": links,
+    }
+
+
+async def generate_agent_patch(
+    update_request: str, current_agent: dict[str, Any]
+) -> dict[str, Any] | None:
+    """Generate a patch to update an existing agent.
+
+    Args:
+        update_request: Natural language description of changes
+        current_agent: Current agent JSON
+
+    Returns:
+        Patch dict or clarifying questions, or None on error
+    """
+    client = get_client()
+    prompt = PATCH_PROMPT.format(
+        current_agent=json.dumps(current_agent, indent=2),
+        block_summaries=get_block_summaries(),
+    )
+
+    try:
+        response = await client.chat.completions.create(
+            model=AGENT_GENERATOR_MODEL,
+            messages=[
+                {"role": "system", "content": prompt},
+                {"role": "user", "content": update_request},
+            ],
+            temperature=0,
+        )
+
+        content = response.choices[0].message.content
+        if content is None:
+            logger.error("LLM returned empty content for patch generation")
+            return None
+
+        return parse_json_from_llm(content)
+
+    except Exception as e:
+        logger.error(f"Error generating patch: {e}")
+        return None
+
+
+def apply_agent_patch(
+    current_agent: dict[str, Any], patch: dict[str, Any]
+) -> dict[str, Any]:
+    """Apply a patch to an existing agent.
+
+    Args:
+        current_agent: Current agent JSON
+        patch: Patch dict with operations
+
+    Returns:
+        Updated agent JSON
+    """
+    agent = copy.deepcopy(current_agent)
+    patches = patch.get("patches", [])
+
+    for p in patches:
+        patch_type = p.get("type")
+
+        if patch_type == "modify":
+            node_id = p.get("node_id")
+            changes = p.get("changes", {})
+
+            for node in agent.get("nodes", []):
+                if node["id"] == node_id:
+                    _deep_update(node, changes)
+                    logger.debug(f"Modified node {node_id}")
+                    break
+
+        elif patch_type == "add":
+            new_nodes = p.get("new_nodes", [])
+            new_links = p.get("new_links", [])
+
+            agent["nodes"] = agent.get("nodes", []) + new_nodes
+            agent["links"] = agent.get("links", []) + new_links
+            logger.debug(f"Added {len(new_nodes)} nodes, {len(new_links)} links")
+
+        elif patch_type == "remove":
+            node_ids_to_remove = set(p.get("node_ids", []))
+            link_ids_to_remove = set(p.get("link_ids", []))
+
+            # Remove nodes
+            agent["nodes"] = [
+                n for n in agent.get("nodes", []) if n["id"] not in node_ids_to_remove
+            ]
+
+            # Remove links (both explicit and those referencing removed nodes)
+            agent["links"] = [
+                link
+                for link in agent.get("links", [])
+                if link["id"] not in link_ids_to_remove
+                and link["source_id"] not in node_ids_to_remove
+                and link["sink_id"] not in node_ids_to_remove
+            ]
+
+            logger.debug(
+                f"Removed {len(node_ids_to_remove)} nodes, {len(link_ids_to_remove)} links"
+            )
+
+    return agent
+
+
+def _deep_update(target: dict, source: dict) -> None:
+    """Recursively update a dict with another dict."""
+    for key, value in source.items():
+        if key in target and isinstance(target[key], dict) and isinstance(value, dict):
+            _deep_update(target[key], value)
+        else:
+            target[key] = value
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/fixer.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/fixer.py
@@ -0,0 +1,606 @@
+"""Agent fixer - Fixes common LLM generation errors."""
+
+import logging
+import re
+import uuid
+from typing import Any
+
+from .utils import (
+    ADDTODICTIONARY_BLOCK_ID,
+    ADDTOLIST_BLOCK_ID,
+    CODE_EXECUTION_BLOCK_ID,
+    CONDITION_BLOCK_ID,
+    CREATEDICT_BLOCK_ID,
+    CREATELIST_BLOCK_ID,
+    DATA_SAMPLING_BLOCK_ID,
+    DOUBLE_CURLY_BRACES_BLOCK_IDS,
+    GET_CURRENT_DATE_BLOCK_ID,
+    STORE_VALUE_BLOCK_ID,
+    UNIVERSAL_TYPE_CONVERTER_BLOCK_ID,
+    get_blocks_info,
+    is_valid_uuid,
+)
+
+logger = logging.getLogger(__name__)
+
+
+def fix_agent_ids(agent: dict[str, Any]) -> dict[str, Any]:
+    """Fix invalid UUIDs in agent and link IDs."""
+    # Fix agent ID
+    if not is_valid_uuid(agent.get("id", "")):
+        agent["id"] = str(uuid.uuid4())
+        logger.debug(f"Fixed agent ID: {agent['id']}")
+
+    # Fix node IDs
+    id_mapping = {}  # Old ID -> New ID
+    for node in agent.get("nodes", []):
+        if not is_valid_uuid(node.get("id", "")):
+            old_id = node.get("id", "")
+            new_id = str(uuid.uuid4())
+            id_mapping[old_id] = new_id
+            node["id"] = new_id
+            logger.debug(f"Fixed node ID: {old_id} -> {new_id}")
+
+    # Fix link IDs and update references
+    for link in agent.get("links", []):
+        if not is_valid_uuid(link.get("id", "")):
+            link["id"] = str(uuid.uuid4())
+            logger.debug(f"Fixed link ID: {link['id']}")
+
+        # Update source/sink IDs if they were remapped
+        if link.get("source_id") in id_mapping:
+            link["source_id"] = id_mapping[link["source_id"]]
+        if link.get("sink_id") in id_mapping:
+            link["sink_id"] = id_mapping[link["sink_id"]]
+
+    return agent
+
+
+def fix_double_curly_braces(agent: dict[str, Any]) -> dict[str, Any]:
+    """Fix single curly braces to double in template blocks."""
+    for node in agent.get("nodes", []):
+        if node.get("block_id") not in DOUBLE_CURLY_BRACES_BLOCK_IDS:
+            continue
+
+        input_data = node.get("input_default", {})
+        for key in ("prompt", "format"):
+            if key in input_data and isinstance(input_data[key], str):
+                original = input_data[key]
+                # Fix simple variable references: {var} -> {{var}}
+                fixed = re.sub(
+                    r"(?<!\{)\{([a-zA-Z_][a-zA-Z0-9_]*)\}(?!\})",
+                    r"{{\1}}",
+                    original,
+                )
+                if fixed != original:
+                    input_data[key] = fixed
+                    logger.debug(f"Fixed curly braces in {key}")
+
+    return agent
+
+
+def fix_storevalue_before_condition(agent: dict[str, Any]) -> dict[str, Any]:
+    """Add StoreValueBlock before ConditionBlock if needed for value2."""
+    nodes = agent.get("nodes", [])
+    links = agent.get("links", [])
+
+    # Find all ConditionBlock nodes
+    condition_node_ids = {
+        node["id"] for node in nodes if node.get("block_id") == CONDITION_BLOCK_ID
+    }
+
+    if not condition_node_ids:
+        return agent
+
+    new_nodes = []
+    new_links = []
+    processed_conditions = set()
+
+    for link in links:
+        sink_id = link.get("sink_id")
+        sink_name = link.get("sink_name")
+
+        # Check if this link goes to a ConditionBlock's value2
+        if sink_id in condition_node_ids and sink_name == "value2":
+            source_node = next(
+                (n for n in nodes if n["id"] == link.get("source_id")), None
+            )
+
+            # Skip if source is already a StoreValueBlock
+            if source_node and source_node.get("block_id") == STORE_VALUE_BLOCK_ID:
+                continue
+
+            # Skip if we already processed this condition
+            if sink_id in processed_conditions:
+                continue
+
+            processed_conditions.add(sink_id)
+
+            # Create StoreValueBlock
+            store_node_id = str(uuid.uuid4())
+            store_node = {
+                "id": store_node_id,
+                "block_id": STORE_VALUE_BLOCK_ID,
+                "input_default": {"data": None},
+                "metadata": {"position": {"x": 0, "y": -100}},
+            }
+            new_nodes.append(store_node)
+
+            # Create link: original source -> StoreValueBlock
+            new_links.append(
+                {
+                    "id": str(uuid.uuid4()),
+                    "source_id": link["source_id"],
+                    "source_name": link["source_name"],
+                    "sink_id": store_node_id,
+                    "sink_name": "input",
+                    "is_static": False,
+                }
+            )
+
+            # Update original link: StoreValueBlock -> ConditionBlock
+            link["source_id"] = store_node_id
+            link["source_name"] = "output"
+
+            logger.debug(f"Added StoreValueBlock before ConditionBlock {sink_id}")
+
+    if new_nodes:
+        agent["nodes"] = nodes + new_nodes
+
+    return agent
+
+
+def fix_addtolist_blocks(agent: dict[str, Any]) -> dict[str, Any]:
+    """Fix AddToList blocks by adding prerequisite empty AddToList block.
+
+    When an AddToList block is found:
+    1. Checks if there's a CreateListBlock before it
+    2. Removes CreateListBlock if linked directly to AddToList
+    3. Adds an empty AddToList block before the original
+    4. Ensures the original has a self-referencing link
+    """
+    nodes = agent.get("nodes", [])
+    links = agent.get("links", [])
+    new_nodes = []
+    original_addtolist_ids = set()
+    nodes_to_remove = set()
+    links_to_remove = []
+
+    # First pass: identify CreateListBlock nodes to remove
+    for link in links:
+        source_node = next(
+            (n for n in nodes if n.get("id") == link.get("source_id")), None
+        )
+        sink_node = next((n for n in nodes if n.get("id") == link.get("sink_id")), None)
+
+        if (
+            source_node
+            and sink_node
+            and source_node.get("block_id") == CREATELIST_BLOCK_ID
+            and sink_node.get("block_id") == ADDTOLIST_BLOCK_ID
+        ):
+            nodes_to_remove.add(source_node.get("id"))
+            links_to_remove.append(link)
+            logger.debug(f"Removing CreateListBlock {source_node.get('id')}")
+
+    # Second pass: process AddToList blocks
+    filtered_nodes = []
+    for node in nodes:
+        if node.get("id") in nodes_to_remove:
+            continue
+
+        if node.get("block_id") == ADDTOLIST_BLOCK_ID:
+            original_addtolist_ids.add(node.get("id"))
+            node_id = node.get("id")
+            pos = node.get("metadata", {}).get("position", {"x": 0, "y": 0})
+
+            # Check if already has prerequisite
+            has_prereq = any(
+                link.get("sink_id") == node_id
+                and link.get("sink_name") == "list"
+                and link.get("source_name") == "updated_list"
+                for link in links
+            )
+
+            if not has_prereq:
+                # Remove links to "list" input (except self-reference)
+                for link in links:
+                    if (
+                        link.get("sink_id") == node_id
+                        and link.get("sink_name") == "list"
+                        and link.get("source_id") != node_id
+                        and link not in links_to_remove
+                    ):
+                        links_to_remove.append(link)
+
+                # Create prerequisite AddToList block
+                prereq_id = str(uuid.uuid4())
+                prereq_node = {
+                    "id": prereq_id,
+                    "block_id": ADDTOLIST_BLOCK_ID,
+                    "input_default": {"list": [], "entry": None, "entries": []},
+                    "metadata": {
+                        "position": {"x": pos.get("x", 0) - 800, "y": pos.get("y", 0)}
+                    },
+                }
+                new_nodes.append(prereq_node)
+
+                # Link prerequisite to original
+                links.append(
+                    {
+                        "id": str(uuid.uuid4()),
+                        "source_id": prereq_id,
+                        "source_name": "updated_list",
+                        "sink_id": node_id,
+                        "sink_name": "list",
+                        "is_static": False,
+                    }
+                )
+                logger.debug(f"Added prerequisite AddToList block for {node_id}")
+
+        filtered_nodes.append(node)
+
+    # Remove marked links
+    filtered_links = [link for link in links if link not in links_to_remove]
+
+    # Add self-referencing links for original AddToList blocks
+    for node in filtered_nodes + new_nodes:
+        if (
+            node.get("block_id") == ADDTOLIST_BLOCK_ID
+            and node.get("id") in original_addtolist_ids
+        ):
+            node_id = node.get("id")
+            has_self_ref = any(
+                link["source_id"] == node_id
+                and link["sink_id"] == node_id
+                and link["source_name"] == "updated_list"
+                and link["sink_name"] == "list"
+                for link in filtered_links
+            )
+            if not has_self_ref:
+                filtered_links.append(
+                    {
+                        "id": str(uuid.uuid4()),
+                        "source_id": node_id,
+                        "source_name": "updated_list",
+                        "sink_id": node_id,
+                        "sink_name": "list",
+                        "is_static": False,
+                    }
+                )
+                logger.debug(f"Added self-reference for AddToList {node_id}")
+
+    agent["nodes"] = filtered_nodes + new_nodes
+    agent["links"] = filtered_links
+    return agent
+
+
+def fix_addtodictionary_blocks(agent: dict[str, Any]) -> dict[str, Any]:
+    """Fix AddToDictionary blocks by removing empty CreateDictionary nodes."""
+    nodes = agent.get("nodes", [])
+    links = agent.get("links", [])
+    nodes_to_remove = set()
+    links_to_remove = []
+
+    for link in links:
+        source_node = next(
+            (n for n in nodes if n.get("id") == link.get("source_id")), None
+        )
+        sink_node = next((n for n in nodes if n.get("id") == link.get("sink_id")), None)
+
+        if (
+            source_node
+            and sink_node
+            and source_node.get("block_id") == CREATEDICT_BLOCK_ID
+            and sink_node.get("block_id") == ADDTODICTIONARY_BLOCK_ID
+        ):
+            nodes_to_remove.add(source_node.get("id"))
+            links_to_remove.append(link)
+            logger.debug(f"Removing CreateDictionary {source_node.get('id')}")
+
+    agent["nodes"] = [n for n in nodes if n.get("id") not in nodes_to_remove]
+    agent["links"] = [link for link in links if link not in links_to_remove]
+    return agent
+
+
+def fix_code_execution_output(agent: dict[str, Any]) -> dict[str, Any]:
+    """Fix CodeExecutionBlock output: change 'response' to 'stdout_logs'."""
+    nodes = agent.get("nodes", [])
+    links = agent.get("links", [])
+
+    for link in links:
+        source_node = next(
+            (n for n in nodes if n.get("id") == link.get("source_id")), None
+        )
+        if (
+            source_node
+            and source_node.get("block_id") == CODE_EXECUTION_BLOCK_ID
+            and link.get("source_name") == "response"
+        ):
+            link["source_name"] = "stdout_logs"
+            logger.debug("Fixed CodeExecutionBlock output: response -> stdout_logs")
+
+    return agent
+
+
+def fix_data_sampling_sample_size(agent: dict[str, Any]) -> dict[str, Any]:
+    """Fix DataSamplingBlock by setting sample_size to 1 as default."""
+    nodes = agent.get("nodes", [])
+    links = agent.get("links", [])
+    links_to_remove = []
+
+    for node in nodes:
+        if node.get("block_id") == DATA_SAMPLING_BLOCK_ID:
+            node_id = node.get("id")
+            input_default = node.get("input_default", {})
+
+            # Remove links to sample_size
+            for link in links:
+                if (
+                    link.get("sink_id") == node_id
+                    and link.get("sink_name") == "sample_size"
+                ):
+                    links_to_remove.append(link)
+
+            # Set default
+            input_default["sample_size"] = 1
+            node["input_default"] = input_default
+            logger.debug(f"Fixed DataSamplingBlock {node_id} sample_size to 1")
+
+    if links_to_remove:
+        agent["links"] = [link for link in links if link not in links_to_remove]
+
+    return agent
+
+
+def fix_node_x_coordinates(agent: dict[str, Any]) -> dict[str, Any]:
+    """Fix node x-coordinates to ensure 800+ unit spacing between linked nodes."""
+    nodes = agent.get("nodes", [])
+    links = agent.get("links", [])
+    node_lookup = {n.get("id"): n for n in nodes}
+
+    for link in links:
+        source_id = link.get("source_id")
+        sink_id = link.get("sink_id")
+
+        source_node = node_lookup.get(source_id)
+        sink_node = node_lookup.get(sink_id)
+
+        if not source_node or not sink_node:
+            continue
+
+        source_pos = source_node.get("metadata", {}).get("position", {})
+        sink_pos = sink_node.get("metadata", {}).get("position", {})
+
+        source_x = source_pos.get("x", 0)
+        sink_x = sink_pos.get("x", 0)
+
+        if abs(sink_x - source_x) < 800:
+            new_x = source_x + 800
+            if "metadata" not in sink_node:
+                sink_node["metadata"] = {}
+            if "position" not in sink_node["metadata"]:
+                sink_node["metadata"]["position"] = {}
+            sink_node["metadata"]["position"]["x"] = new_x
+            logger.debug(f"Fixed node {sink_id} x: {sink_x} -> {new_x}")
+
+    return agent
+
+
+def fix_getcurrentdate_offset(agent: dict[str, Any]) -> dict[str, Any]:
+    """Fix GetCurrentDateBlock offset to ensure it's positive."""
+    for node in agent.get("nodes", []):
+        if node.get("block_id") == GET_CURRENT_DATE_BLOCK_ID:
+            input_default = node.get("input_default", {})
+            if "offset" in input_default:
+                offset = input_default["offset"]
+                if isinstance(offset, (int, float)) and offset < 0:
+                    input_default["offset"] = abs(offset)
+                    logger.debug(f"Fixed offset: {offset} -> {abs(offset)}")
+
+    return agent
+
+
+def fix_ai_model_parameter(
+    agent: dict[str, Any],
+    blocks_info: list[dict[str, Any]],
+    default_model: str = "gpt-4o",
+) -> dict[str, Any]:
+    """Add default model parameter to AI blocks if missing."""
+    block_map = {b.get("id"): b for b in blocks_info}
+
+    for node in agent.get("nodes", []):
+        block_id = node.get("block_id")
+        block = block_map.get(block_id)
+
+        if not block:
+            continue
+
+        # Check if block has AI category
+        categories = block.get("categories", [])
+        is_ai_block = any(
+            cat.get("category") == "AI" for cat in categories if isinstance(cat, dict)
+        )
+
+        if is_ai_block:
+            input_default = node.get("input_default", {})
+            if "model" not in input_default:
+                input_default["model"] = default_model
+                node["input_default"] = input_default
+                logger.debug(
+                    f"Added model '{default_model}' to AI block {node.get('id')}"
+                )
+
+    return agent
+
+
+def fix_link_static_properties(
+    agent: dict[str, Any], blocks_info: list[dict[str, Any]]
+) -> dict[str, Any]:
+    """Fix is_static property based on source block's staticOutput."""
+    block_map = {b.get("id"): b for b in blocks_info}
+    node_lookup = {n.get("id"): n for n in agent.get("nodes", [])}
+
+    for link in agent.get("links", []):
+        source_node = node_lookup.get(link.get("source_id"))
+        if not source_node:
+            continue
+
+        source_block = block_map.get(source_node.get("block_id"))
+        if not source_block:
+            continue
+
+        static_output = source_block.get("staticOutput", False)
+        if link.get("is_static") != static_output:
+            link["is_static"] = static_output
+            logger.debug(f"Fixed link {link.get('id')} is_static to {static_output}")
+
+    return agent
+
+
+def fix_data_type_mismatch(
+    agent: dict[str, Any], blocks_info: list[dict[str, Any]]
+) -> dict[str, Any]:
+    """Fix data type mismatches by inserting UniversalTypeConverterBlock."""
+    nodes = agent.get("nodes", [])
+    links = agent.get("links", [])
+    block_map = {b.get("id"): b for b in blocks_info}
+    node_lookup = {n.get("id"): n for n in nodes}
+
+    def get_property_type(schema: dict, name: str) -> str | None:
+        if "_#_" in name:
+            parent, child = name.split("_#_", 1)
+            parent_schema = schema.get(parent, {})
+            if "properties" in parent_schema:
+                return parent_schema["properties"].get(child, {}).get("type")
+            return None
+        return schema.get(name, {}).get("type")
+
+    def are_types_compatible(src: str, sink: str) -> bool:
+        if {src, sink} <= {"integer", "number"}:
+            return True
+        return src == sink
+
+    type_mapping = {
+        "string": "string",
+        "text": "string",
+        "integer": "number",
+        "number": "number",
+        "float": "number",
+        "boolean": "boolean",
+        "bool": "boolean",
+        "array": "list",
+        "list": "list",
+        "object": "dictionary",
+        "dict": "dictionary",
+        "dictionary": "dictionary",
+    }
+
+    new_links = []
+    nodes_to_add = []
+
+    for link in links:
+        source_node = node_lookup.get(link.get("source_id"))
+        sink_node = node_lookup.get(link.get("sink_id"))
+
+        if not source_node or not sink_node:
+            new_links.append(link)
+            continue
+
+        source_block = block_map.get(source_node.get("block_id"))
+        sink_block = block_map.get(sink_node.get("block_id"))
+
+        if not source_block or not sink_block:
+            new_links.append(link)
+            continue
+
+        source_outputs = source_block.get("outputSchema", {}).get("properties", {})
+        sink_inputs = sink_block.get("inputSchema", {}).get("properties", {})
+
+        source_type = get_property_type(source_outputs, link.get("source_name", ""))
+        sink_type = get_property_type(sink_inputs, link.get("sink_name", ""))
+
+        if (
+            source_type
+            and sink_type
+            and not are_types_compatible(source_type, sink_type)
+        ):
+            # Insert type converter
+            converter_id = str(uuid.uuid4())
+            target_type = type_mapping.get(sink_type, sink_type)
+
+            converter_node = {
+                "id": converter_id,
+                "block_id": UNIVERSAL_TYPE_CONVERTER_BLOCK_ID,
+                "input_default": {"type": target_type},
+                "metadata": {"position": {"x": 0, "y": 100}},
+            }
+            nodes_to_add.append(converter_node)
+
+            # source -> converter
+            new_links.append(
+                {
+                    "id": str(uuid.uuid4()),
+                    "source_id": link["source_id"],
+                    "source_name": link["source_name"],
+                    "sink_id": converter_id,
+                    "sink_name": "value",
+                    "is_static": False,
+                }
+            )
+
+            # converter -> sink
+            new_links.append(
+                {
+                    "id": str(uuid.uuid4()),
+                    "source_id": converter_id,
+                    "source_name": "value",
+                    "sink_id": link["sink_id"],
+                    "sink_name": link["sink_name"],
+                    "is_static": False,
+                }
+            )
+
+            logger.debug(f"Inserted type converter: {source_type} -> {target_type}")
+        else:
+            new_links.append(link)
+
+    if nodes_to_add:
+        agent["nodes"] = nodes + nodes_to_add
+        agent["links"] = new_links
+
+    return agent
+
+
+def apply_all_fixes(
+    agent: dict[str, Any], blocks_info: list[dict[str, Any]] | None = None
+) -> dict[str, Any]:
+    """Apply all fixes to an agent JSON.
+
+    Args:
+        agent: Agent JSON dict
+        blocks_info: Optional list of block info dicts for advanced fixes
+
+    Returns:
+        Fixed agent JSON
+    """
+    # Basic fixes (no block info needed)
+    agent = fix_agent_ids(agent)
+    agent = fix_double_curly_braces(agent)
+    agent = fix_storevalue_before_condition(agent)
+    agent = fix_addtolist_blocks(agent)
+    agent = fix_addtodictionary_blocks(agent)
+    agent = fix_code_execution_output(agent)
+    agent = fix_data_sampling_sample_size(agent)
+    agent = fix_node_x_coordinates(agent)
+    agent = fix_getcurrentdate_offset(agent)
+
+    # Advanced fixes (require block info)
+    if blocks_info is None:
+        blocks_info = get_blocks_info()
+
+    agent = fix_ai_model_parameter(agent, blocks_info)
+    agent = fix_link_static_properties(agent, blocks_info)
+    agent = fix_data_type_mismatch(agent, blocks_info)
+
+    return agent
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/prompts.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/prompts.py
@@ -0,0 +1,225 @@
+"""Prompt templates for agent generation."""
+
+DECOMPOSITION_PROMPT = """
+You are an expert AutoGPT Workflow Decomposer. Your task is to analyze a user's high-level goal and break it down into a clear, step-by-step plan using the available blocks.
+
+Each step should represent a distinct, automatable action suitable for execution by an AI automation system.
+
+---
+
+FIRST: Analyze the user's goal and determine:
+1) Design-time configuration (fixed settings that won't change per run)
+2) Runtime inputs (values the agent's end-user will provide each time it runs)
+
+For anything that can vary per run (email addresses, names, dates, search terms, etc.):
+- DO NOT ask for the actual value
+- Instead, define it as an Agent Input with a clear name, type, and description
+
+Only ask clarifying questions about design-time config that affects how you build the workflow:
+- Which external service to use (e.g., "Gmail vs Outlook", "Notion vs Google Docs")
+- Required formats or structures (e.g., "CSV, JSON, or PDF output?")
+- Business rules that must be hard-coded
+
+IMPORTANT CLARIFICATIONS POLICY:
+- Ask no more than five essential questions
+- Do not ask for concrete values that can be provided at runtime as Agent Inputs
+- Do not ask for API keys or credentials; the platform handles those directly
+- If there is enough information to infer reasonable defaults, prefer to propose defaults
+
+---
+
+GUIDELINES:
+1. List each step as a numbered item
+2. Describe the action clearly and specify inputs/outputs
+3. Ensure steps are in logical, sequential order
+4. Mention block names naturally (e.g., "Use GetWeatherByLocationBlock to...")
+5. Help the user reach their goal efficiently
+
+---
+
+RULES:
+1. OUTPUT FORMAT: Only output either clarifying questions OR step-by-step instructions, not both
+2. USE ONLY THE BLOCKS PROVIDED
+3. ALL required_input fields must be provided
+4. Data types of linked properties must match
+5. Write expert-level prompts for AI-related blocks
+
+---
+
+CRITICAL BLOCK RESTRICTIONS:
+1. AddToListBlock: Outputs updated list EVERY addition, not after all additions
+2. SendEmailBlock: Draft the email for user review; set SMTP config based on email type
+3. ConditionBlock: value2 is reference, value1 is contrast
+4. CodeExecutionBlock: DO NOT USE - use AI blocks instead
+5. ReadCsvBlock: Only use the 'rows' output, not 'row'
+
+---
+
+OUTPUT FORMAT:
+
+If more information is needed:
+```json
+{{
+  "type": "clarifying_questions",
+  "questions": [
+    {{
+      "question": "Which email provider should be used? (Gmail, Outlook, custom SMTP)",
+      "keyword": "email_provider",
+      "example": "Gmail"
+    }}
+  ]
+}}
+```
+
+If ready to proceed:
+```json
+{{
+  "type": "instructions",
+  "steps": [
+    {{
+      "step_number": 1,
+      "block_name": "AgentShortTextInputBlock",
+      "description": "Get the URL of the content to analyze.",
+      "inputs": [{{"name": "name", "value": "URL"}}],
+      "outputs": [{{"name": "result", "description": "The URL entered by user"}}]
+    }}
+  ]
+}}
+```
+
+---
+
+AVAILABLE BLOCKS:
+{block_summaries}
+"""
+
+GENERATION_PROMPT = """
+You are an expert AI workflow builder. Generate a valid agent JSON from the given instructions.
+
+---
+
+NODES:
+Each node must include:
+- `id`: Unique UUID v4 (e.g. `a8f5b1e2-c3d4-4e5f-8a9b-0c1d2e3f4a5b`)
+- `block_id`: The block identifier (must match an Allowed Block)
+- `input_default`: Dict of inputs (can be empty if no static inputs needed)
+- `metadata`: Must contain:
+  - `position`: {{"x": number, "y": number}} - adjacent nodes should differ by 800+ in X
+  - `customized_name`: Clear name describing this block's purpose in the workflow
+
+---
+
+LINKS:
+Each link connects a source node's output to a sink node's input:
+- `id`: MUST be UUID v4 (NOT "link-1", "link-2", etc.)
+- `source_id`: ID of the source node
+- `source_name`: Output field name from the source block
+- `sink_id`: ID of the sink node
+- `sink_name`: Input field name on the sink block
+- `is_static`: true only if source block has static_output: true
+
+CRITICAL: All IDs must be valid UUID v4 format!
+
+---
+
+AGENT (GRAPH):
+Wrap nodes and links in:
+- `id`: UUID of the agent
+- `name`: Short, generic name (avoid specific company names, URLs)
+- `description`: Short, generic description
+- `nodes`: List of all nodes
+- `links`: List of all links
+- `version`: 1
+- `is_active`: true
+
+---
+
+TIPS:
+- All required_input fields must be provided via input_default or a valid link
+- Ensure consistent source_id and sink_id references
+- Avoid dangling links
+- Input/output pins must match block schemas
+- Do not invent unknown block_ids
+
+---
+
+ALLOWED BLOCKS:
+{block_summaries}
+
+---
+
+Generate the complete agent JSON. Output ONLY valid JSON, no explanation.
+"""
+
+PATCH_PROMPT = """
+You are an expert at modifying AutoGPT agent workflows. Given the current agent and a modification request, generate a JSON patch to update the agent.
+
+CURRENT AGENT:
+{current_agent}
+
+AVAILABLE BLOCKS:
+{block_summaries}
+
+---
+
+PATCH FORMAT:
+Return a JSON object with the following structure:
+
+```json
+{{
+  "type": "patch",
+  "intent": "Brief description of what the patch does",
+  "patches": [
+    {{
+      "type": "modify",
+      "node_id": "uuid-of-node-to-modify",
+      "changes": {{
+        "input_default": {{"field": "new_value"}},
+        "metadata": {{"customized_name": "New Name"}}
+      }}
+    }},
+    {{
+      "type": "add",
+      "new_nodes": [
+        {{
+          "id": "new-uuid",
+          "block_id": "block-uuid",
+          "input_default": {{}},
+          "metadata": {{"position": {{"x": 0, "y": 0}}, "customized_name": "Name"}}
+        }}
+      ],
+      "new_links": [
+        {{
+          "id": "link-uuid",
+          "source_id": "source-node-id",
+          "source_name": "output_field",
+          "sink_id": "sink-node-id",
+          "sink_name": "input_field"
+        }}
+      ]
+    }},
+    {{
+      "type": "remove",
+      "node_ids": ["uuid-of-node-to-remove"],
+      "link_ids": ["uuid-of-link-to-remove"]
+    }}
+  ]
+}}
+```
+
+If you need more information, return:
+```json
+{{
+  "type": "clarifying_questions",
+  "questions": [
+    {{
+      "question": "What specific change do you want?",
+      "keyword": "change_type",
+      "example": "Add error handling"
+    }}
+  ]
+}}
+```
+
+Generate the minimal patch needed. Output ONLY valid JSON.
+"""
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/utils.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/utils.py
@@ -0,0 +1,213 @@
+"""Utilities for agent generation."""
+
+import json
+import re
+from typing import Any
+
+from backend.data.block import get_blocks
+
+# UUID validation regex
+UUID_REGEX = re.compile(
+    r"^[a-f0-9]{8}-[a-f0-9]{4}-4[a-f0-9]{3}-[89ab][a-f0-9]{3}-[a-f0-9]{12}$"
+)
+
+# Block IDs for various fixes
+STORE_VALUE_BLOCK_ID = "1ff065e9-88e8-4358-9d82-8dc91f622ba9"
+CONDITION_BLOCK_ID = "715696a0-e1da-45c8-b209-c2fa9c3b0be6"
+ADDTOLIST_BLOCK_ID = "aeb08fc1-2fc1-4141-bc8e-f758f183a822"
+ADDTODICTIONARY_BLOCK_ID = "31d1064e-7446-4693-a7d4-65e5ca1180d1"
+CREATELIST_BLOCK_ID = "a912d5c7-6e00-4542-b2a9-8034136930e4"
+CREATEDICT_BLOCK_ID = "b924ddf4-de4f-4b56-9a85-358930dcbc91"
+CODE_EXECUTION_BLOCK_ID = "0b02b072-abe7-11ef-8372-fb5d162dd712"
+DATA_SAMPLING_BLOCK_ID = "4a448883-71fa-49cf-91cf-70d793bd7d87"
+UNIVERSAL_TYPE_CONVERTER_BLOCK_ID = "95d1b990-ce13-4d88-9737-ba5c2070c97b"
+GET_CURRENT_DATE_BLOCK_ID = "b29c1b50-5d0e-4d9f-8f9d-1b0e6fcbf0b1"
+
+DOUBLE_CURLY_BRACES_BLOCK_IDS = [
+    "44f6c8ad-d75c-4ae1-8209-aad1c0326928",  # FillTextTemplateBlock
+    "6ab085e2-20b3-4055-bc3e-08036e01eca6",
+    "90f8c45e-e983-4644-aa0b-b4ebe2f531bc",
+    "363ae599-353e-4804-937e-b2ee3cef3da4",  # AgentOutputBlock
+    "3b191d9f-356f-482d-8238-ba04b6d18381",
+    "db7d8f02-2f44-4c55-ab7a-eae0941f0c30",
+    "3a7c4b8d-6e2f-4a5d-b9c1-f8d23c5a9b0e",
+    "ed1ae7a0-b770-4089-b520-1f0005fad19a",
+    "a892b8d9-3e4e-4e9c-9c1e-75f8efcf1bfa",
+    "b29c1b50-5d0e-4d9f-8f9d-1b0e6fcbf0b1",
+    "716a67b3-6760-42e7-86dc-18645c6e00fc",
+    "530cf046-2ce0-4854-ae2c-659db17c7a46",
+    "ed55ac19-356e-4243-a6cb-bc599e9b716f",
+    "1f292d4a-41a4-4977-9684-7c8d560b9f91",  # LLM blocks
+    "32a87eab-381e-4dd4-bdb8-4c47151be35a",
+]
+
+
+def is_valid_uuid(value: str) -> bool:
+    """Check if a string is a valid UUID v4."""
+    return isinstance(value, str) and UUID_REGEX.match(value) is not None
+
+
+def _compact_schema(schema: dict) -> dict[str, str]:
+    """Extract compact type info from a JSON schema properties dict.
+
+    Returns a dict of {field_name: type_string} for essential info only.
+    """
+    props = schema.get("properties", {})
+    result = {}
+
+    for name, prop in props.items():
+        # Skip internal/complex fields
+        if name.startswith("_"):
+            continue
+
+        # Get type string
+        type_str = prop.get("type", "any")
+
+        # Handle anyOf/oneOf (optional types)
+        if "anyOf" in prop:
+            types = [t.get("type", "?") for t in prop["anyOf"] if t.get("type")]
+            type_str = "|".join(types) if types else "any"
+        elif "allOf" in prop:
+            type_str = "object"
+
+        # Add array item type if present
+        if type_str == "array" and "items" in prop:
+            items = prop["items"]
+            if isinstance(items, dict):
+                item_type = items.get("type", "any")
+                type_str = f"array[{item_type}]"
+
+        result[name] = type_str
+
+    return result
+
+
+def get_block_summaries(include_schemas: bool = True) -> str:
+    """Generate compact block summaries for prompts.
+
+    Args:
+        include_schemas: Whether to include input/output type info
+
+    Returns:
+        Formatted string of block summaries (compact format)
+    """
+    blocks = get_blocks()
+    summaries = []
+
+    for block_id, block_cls in blocks.items():
+        block = block_cls()
+        name = block.name
+        desc = getattr(block, "description", "") or ""
+
+        # Truncate description
+        if len(desc) > 150:
+            desc = desc[:147] + "..."
+
+        if not include_schemas:
+            summaries.append(f"- {name} (id: {block_id}): {desc}")
+        else:
+            # Compact format with type info only
+            inputs = {}
+            outputs = {}
+            required = []
+
+            if hasattr(block, "input_schema"):
+                try:
+                    schema = block.input_schema.jsonschema()
+                    inputs = _compact_schema(schema)
+                    required = schema.get("required", [])
+                except Exception:
+                    pass
+
+            if hasattr(block, "output_schema"):
+                try:
+                    schema = block.output_schema.jsonschema()
+                    outputs = _compact_schema(schema)
+                except Exception:
+                    pass
+
+            # Build compact line format
+            # Format: NAME (id): desc | in: {field:type, ...} [required] | out: {field:type}
+            in_str = ", ".join(f"{k}:{v}" for k, v in inputs.items())
+            out_str = ", ".join(f"{k}:{v}" for k, v in outputs.items())
+            req_str = f" req=[{','.join(required)}]" if required else ""
+
+            static = " [static]" if getattr(block, "static_output", False) else ""
+
+            line = f"- {name} (id: {block_id}): {desc}"
+            if in_str:
+                line += f"\n  in: {{{in_str}}}{req_str}"
+            if out_str:
+                line += f"\n  out: {{{out_str}}}{static}"
+
+            summaries.append(line)
+
+    return "\n".join(summaries)
+
+
+def get_blocks_info() -> list[dict[str, Any]]:
+    """Get block information with schemas for validation and fixing."""
+    blocks = get_blocks()
+    blocks_info = []
+    for block_id, block_cls in blocks.items():
+        block = block_cls()
+        blocks_info.append(
+            {
+                "id": block_id,
+                "name": block.name,
+                "description": getattr(block, "description", ""),
+                "categories": getattr(block, "categories", []),
+                "staticOutput": getattr(block, "static_output", False),
+                "inputSchema": (
+                    block.input_schema.jsonschema()
+                    if hasattr(block, "input_schema")
+                    else {}
+                ),
+                "outputSchema": (
+                    block.output_schema.jsonschema()
+                    if hasattr(block, "output_schema")
+                    else {}
+                ),
+            }
+        )
+    return blocks_info
+
+
+def parse_json_from_llm(text: str) -> dict[str, Any] | None:
+    """Extract JSON from LLM response (handles markdown code blocks)."""
+    if not text:
+        return None
+
+    # Try fenced code block
+    match = re.search(r"```(?:json)?\s*([\s\S]*?)```", text, re.IGNORECASE)
+    if match:
+        try:
+            return json.loads(match.group(1).strip())
+        except json.JSONDecodeError:
+            pass
+
+    # Try raw text
+    try:
+        return json.loads(text.strip())
+    except json.JSONDecodeError:
+        pass
+
+    # Try finding {...} span
+    start = text.find("{")
+    end = text.rfind("}")
+    if start != -1 and end > start:
+        try:
+            return json.loads(text[start : end + 1])
+        except json.JSONDecodeError:
+            pass
+
+    # Try finding [...] span
+    start = text.find("[")
+    end = text.rfind("]")
+    if start != -1 and end > start:
+        try:
+            return json.loads(text[start : end + 1])
+        except json.JSONDecodeError:
+            pass
+
+    return None
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/validator.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_generator/validator.py
@@ -0,0 +1,279 @@
+"""Agent validator - Validates agent structure and connections."""
+
+import logging
+import re
+from typing import Any
+
+from .utils import get_blocks_info
+
+logger = logging.getLogger(__name__)
+
+
+class AgentValidator:
+    """Validator for AutoGPT agents with detailed error reporting."""
+
+    def __init__(self):
+        self.errors: list[str] = []
+
+    def add_error(self, error: str) -> None:
+        """Add an error message."""
+        self.errors.append(error)
+
+    def validate_block_existence(
+        self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
+    ) -> bool:
+        """Validate all block IDs exist in the blocks library."""
+        valid = True
+        valid_block_ids = {b.get("id") for b in blocks_info if b.get("id")}
+
+        for node in agent.get("nodes", []):
+            block_id = node.get("block_id")
+            node_id = node.get("id")
+
+            if not block_id:
+                self.add_error(f"Node '{node_id}' is missing 'block_id' field.")
+                valid = False
+                continue
+
+            if block_id not in valid_block_ids:
+                self.add_error(
+                    f"Node '{node_id}' references block_id '{block_id}' which does not exist."
+                )
+                valid = False
+
+        return valid
+
+    def validate_link_node_references(self, agent: dict[str, Any]) -> bool:
+        """Validate all node IDs referenced in links exist."""
+        valid = True
+        valid_node_ids = {n.get("id") for n in agent.get("nodes", []) if n.get("id")}
+
+        for link in agent.get("links", []):
+            link_id = link.get("id", "Unknown")
+            source_id = link.get("source_id")
+            sink_id = link.get("sink_id")
+
+            if not source_id:
+                self.add_error(f"Link '{link_id}' is missing 'source_id'.")
+                valid = False
+            elif source_id not in valid_node_ids:
+                self.add_error(
+                    f"Link '{link_id}' references non-existent source_id '{source_id}'."
+                )
+                valid = False
+
+            if not sink_id:
+                self.add_error(f"Link '{link_id}' is missing 'sink_id'.")
+                valid = False
+            elif sink_id not in valid_node_ids:
+                self.add_error(
+                    f"Link '{link_id}' references non-existent sink_id '{sink_id}'."
+                )
+                valid = False
+
+        return valid
+
+    def validate_required_inputs(
+        self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
+    ) -> bool:
+        """Validate required inputs are provided."""
+        valid = True
+        block_map = {b.get("id"): b for b in blocks_info}
+
+        for node in agent.get("nodes", []):
+            block_id = node.get("block_id")
+            block = block_map.get(block_id)
+
+            if not block:
+                continue
+
+            required_inputs = block.get("inputSchema", {}).get("required", [])
+            input_defaults = node.get("input_default", {})
+            node_id = node.get("id")
+
+            # Get linked inputs
+            linked_inputs = {
+                link["sink_name"]
+                for link in agent.get("links", [])
+                if link.get("sink_id") == node_id
+            }
+
+            for req_input in required_inputs:
+                if (
+                    req_input not in input_defaults
+                    and req_input not in linked_inputs
+                    and req_input != "credentials"
+                ):
+                    block_name = block.get("name", "Unknown Block")
+                    self.add_error(
+                        f"Node '{node_id}' ({block_name}) is missing required input '{req_input}'."
+                    )
+                    valid = False
+
+        return valid
+
+    def validate_data_type_compatibility(
+        self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
+    ) -> bool:
+        """Validate linked data types are compatible."""
+        valid = True
+        block_map = {b.get("id"): b for b in blocks_info}
+        node_lookup = {n.get("id"): n for n in agent.get("nodes", [])}
+
+        def get_type(schema: dict, name: str) -> str | None:
+            if "_#_" in name:
+                parent, child = name.split("_#_", 1)
+                parent_schema = schema.get(parent, {})
+                if "properties" in parent_schema:
+                    return parent_schema["properties"].get(child, {}).get("type")
+                return None
+            return schema.get(name, {}).get("type")
+
+        def are_compatible(src: str, sink: str) -> bool:
+            if {src, sink} <= {"integer", "number"}:
+                return True
+            return src == sink
+
+        for link in agent.get("links", []):
+            source_node = node_lookup.get(link.get("source_id"))
+            sink_node = node_lookup.get(link.get("sink_id"))
+
+            if not source_node or not sink_node:
+                continue
+
+            source_block = block_map.get(source_node.get("block_id"))
+            sink_block = block_map.get(sink_node.get("block_id"))
+
+            if not source_block or not sink_block:
+                continue
+
+            source_outputs = source_block.get("outputSchema", {}).get("properties", {})
+            sink_inputs = sink_block.get("inputSchema", {}).get("properties", {})
+
+            source_type = get_type(source_outputs, link.get("source_name", ""))
+            sink_type = get_type(sink_inputs, link.get("sink_name", ""))
+
+            if source_type and sink_type and not are_compatible(source_type, sink_type):
+                self.add_error(
+                    f"Type mismatch: {source_block.get('name')} output '{link['source_name']}' "
+                    f"({source_type}) -> {sink_block.get('name')} input '{link['sink_name']}' ({sink_type})."
+                )
+                valid = False
+
+        return valid
+
+    def validate_nested_sink_links(
+        self, agent: dict[str, Any], blocks_info: list[dict[str, Any]]
+    ) -> bool:
+        """Validate nested sink links (with _#_ notation)."""
+        valid = True
+        block_map = {b.get("id"): b for b in blocks_info}
+        node_lookup = {n.get("id"): n for n in agent.get("nodes", [])}
+
+        for link in agent.get("links", []):
+            sink_name = link.get("sink_name", "")
+
+            if "_#_" in sink_name:
+                parent, child = sink_name.split("_#_", 1)
+
+                sink_node = node_lookup.get(link.get("sink_id"))
+                if not sink_node:
+                    continue
+
+                block = block_map.get(sink_node.get("block_id"))
+                if not block:
+                    continue
+
+                input_props = block.get("inputSchema", {}).get("properties", {})
+                parent_schema = input_props.get(parent)
+
+                if not parent_schema:
+                    self.add_error(
+                        f"Invalid nested link '{sink_name}': parent '{parent}' not found."
+                    )
+                    valid = False
+                    continue
+
+                if not parent_schema.get("additionalProperties"):
+                    if not (
+                        isinstance(parent_schema, dict)
+                        and "properties" in parent_schema
+                        and child in parent_schema.get("properties", {})
+                    ):
+                        self.add_error(
+                            f"Invalid nested link '{sink_name}': child '{child}' not found in '{parent}'."
+                        )
+                        valid = False
+
+        return valid
+
+    def validate_prompt_spaces(self, agent: dict[str, Any]) -> bool:
+        """Validate prompts don't have spaces in template variables."""
+        valid = True
+
+        for node in agent.get("nodes", []):
+            input_default = node.get("input_default", {})
+            prompt = input_default.get("prompt", "")
+
+            if not isinstance(prompt, str):
+                continue
+
+            # Find {{...}} with spaces
+            matches = re.finditer(r"\{\{([^}]+)\}\}", prompt)
+            for match in matches:
+                content = match.group(1)
+                if " " in content:
+                    self.add_error(
+                        f"Node '{node.get('id')}' has spaces in template variable: "
+                        f"'{{{{{content}}}}}' should be '{{{{{content.replace(' ', '_')}}}}}'."
+                    )
+                    valid = False
+
+        return valid
+
+    def validate(
+        self, agent: dict[str, Any], blocks_info: list[dict[str, Any]] | None = None
+    ) -> tuple[bool, str | None]:
+        """Run all validations.
+
+        Returns:
+            Tuple of (is_valid, error_message)
+        """
+        self.errors = []
+
+        if blocks_info is None:
+            blocks_info = get_blocks_info()
+
+        checks = [
+            self.validate_block_existence(agent, blocks_info),
+            self.validate_link_node_references(agent),
+            self.validate_required_inputs(agent, blocks_info),
+            self.validate_data_type_compatibility(agent, blocks_info),
+            self.validate_nested_sink_links(agent, blocks_info),
+            self.validate_prompt_spaces(agent),
+        ]
+
+        all_passed = all(checks)
+
+        if all_passed:
+            logger.info("Agent validation successful")
+            return True, None
+
+        error_message = "Agent validation failed:\n"
+        for i, error in enumerate(self.errors, 1):
+            error_message += f"{i}. {error}\n"
+
+        logger.warning(f"Agent validation failed with {len(self.errors)} errors")
+        return False, error_message
+
+
+def validate_agent(
+    agent: dict[str, Any], blocks_info: list[dict[str, Any]] | None = None
+) -> tuple[bool, str | None]:
+    """Convenience function to validate an agent.
+
+    Returns:
+        Tuple of (is_valid, error_message)
+    """
+    validator = AgentValidator()
+    return validator.validate(agent, blocks_info)
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_output.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_output.py
@@ -5,11 +5,13 @@ import re
 from datetime import datetime, timedelta, timezone
 from typing import Any

+from langfuse import observe
 from pydantic import BaseModel, field_validator

+from backend.api.features.chat.model import ChatSession
+from backend.api.features.library import db as library_db
 from backend.api.features.library.model import LibraryAgent
-from backend.copilot.model import ChatSession
-from backend.data.db_accessors import execution_db, library_db
+from backend.data import execution as execution_db
 from backend.data.execution import ExecutionStatus, GraphExecution, GraphExecutionMeta

 from .base import BaseTool
@@ -164,12 +166,10 @@ class AgentOutputTool(BaseTool):
        Resolve agent from provided identifiers.
        Returns (library_agent, error_message).
        """
-        lib_db = library_db()
-
        # Priority 1: Exact library agent ID
        if library_agent_id:
            try:
-                agent = await lib_db.get_library_agent(library_agent_id, user_id)
+                agent = await library_db.get_library_agent(library_agent_id, user_id)
                return agent, None
            except Exception as e:
                logger.warning(f"Failed to get library agent by ID: {e}")
@@ -183,7 +183,7 @@ class AgentOutputTool(BaseTool):
                return None, f"Agent '{store_slug}' not found in marketplace"

            # Find in user's library by graph_id
-            agent = await lib_db.get_library_agent_by_graph_id(user_id, graph.id)
+            agent = await library_db.get_library_agent_by_graph_id(user_id, graph.id)
            if not agent:
                return (
                    None,
@@ -195,7 +195,7 @@ class AgentOutputTool(BaseTool):
        # Priority 3: Fuzzy name search in library
        if agent_name:
            try:
-                response = await lib_db.list_library_agents(
+                response = await library_db.list_library_agents(
                    user_id=user_id,
                    search_term=agent_name,
                    page_size=5,
@@ -229,11 +229,9 @@ class AgentOutputTool(BaseTool):
        Fetch execution(s) based on filters.
        Returns (single_execution, available_executions_meta, error_message).
        """
-        exec_db = execution_db()
-
        # If specific execution_id provided, fetch it directly
        if execution_id:
-            execution = await exec_db.get_graph_execution(
+            execution = await execution_db.get_graph_execution(
                user_id=user_id,
                execution_id=execution_id,
                include_node_executions=False,
@@ -243,7 +241,7 @@ class AgentOutputTool(BaseTool):
            return execution, [], None

        # Get completed executions with time filters
-        executions = await exec_db.get_graph_executions(
+        executions = await execution_db.get_graph_executions(
            graph_id=graph_id,
            user_id=user_id,
            statuses=[ExecutionStatus.COMPLETED],
@@ -257,7 +255,7 @@ class AgentOutputTool(BaseTool):

        # If only one execution, fetch full details
        if len(executions) == 1:
-            full_execution = await exec_db.get_graph_execution(
+            full_execution = await execution_db.get_graph_execution(
                user_id=user_id,
                execution_id=executions[0].id,
                include_node_executions=False,
@@ -265,7 +263,7 @@ class AgentOutputTool(BaseTool):
            return full_execution, [], None

        # Multiple executions - return latest with full details, plus list of available
-        full_execution = await exec_db.get_graph_execution(
+        full_execution = await execution_db.get_graph_execution(
            user_id=user_id,
            execution_id=executions[0].id,
            include_node_executions=False,
@@ -331,6 +329,7 @@ class AgentOutputTool(BaseTool):
            total_executions=len(available_executions) if available_executions else 1,
        )

+    @observe(as_type="tool", name="view_agent_output")
    async def _execute(
        self,
        user_id: str | None,
@@ -383,7 +382,7 @@ class AgentOutputTool(BaseTool):
            and not input_data.store_slug
        ):
            # Fetch execution directly to get graph_id
-            execution = await execution_db().get_graph_execution(
+            execution = await execution_db.get_graph_execution(
                user_id=user_id,
                execution_id=input_data.execution_id,
                include_node_executions=False,
@@ -395,7 +394,7 @@ class AgentOutputTool(BaseTool):
                )

            # Find library agent by graph_id
-            agent = await library_db().get_library_agent_by_graph_id(
+            agent = await library_db.get_library_agent_by_graph_id(
                user_id, execution.graph_id
            )
            if not agent:
--- a/autogpt_platform/backend/backend/api/features/chat/tools/agent_search.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/agent_search.py
@@ -0,0 +1,151 @@
+"""Shared agent search functionality for find_agent and find_library_agent tools."""
+
+import logging
+from typing import Literal
+
+from backend.api.features.library import db as library_db
+from backend.api.features.store import db as store_db
+from backend.util.exceptions import DatabaseError, NotFoundError
+
+from .models import (
+    AgentInfo,
+    AgentsFoundResponse,
+    ErrorResponse,
+    NoResultsResponse,
+    ToolResponseBase,
+)
+
+logger = logging.getLogger(__name__)
+
+SearchSource = Literal["marketplace", "library"]
+
+
+async def search_agents(
+    query: str,
+    source: SearchSource,
+    session_id: str | None,
+    user_id: str | None = None,
+) -> ToolResponseBase:
+    """
+    Search for agents in marketplace or user library.
+
+    Args:
+        query: Search query string
+        source: "marketplace" or "library"
+        session_id: Chat session ID
+        user_id: User ID (required for library search)
+
+    Returns:
+        AgentsFoundResponse, NoResultsResponse, or ErrorResponse
+    """
+    if not query:
+        return ErrorResponse(
+            message="Please provide a search query", session_id=session_id
+        )
+
+    if source == "library" and not user_id:
+        return ErrorResponse(
+            message="User authentication required to search library",
+            session_id=session_id,
+        )
+
+    agents: list[AgentInfo] = []
+    try:
+        if source == "marketplace":
+            logger.info(f"Searching marketplace for: {query}")
+            results = await store_db.get_store_agents(search_query=query, page_size=5)
+            for agent in results.agents:
+                agents.append(
+                    AgentInfo(
+                        id=f"{agent.creator}/{agent.slug}",
+                        name=agent.agent_name,
+                        description=agent.description or "",
+                        source="marketplace",
+                        in_library=False,
+                        creator=agent.creator,
+                        category="general",
+                        rating=agent.rating,
+                        runs=agent.runs,
+                        is_featured=False,
+                    )
+                )
+        else:  # library
+            logger.info(f"Searching user library for: {query}")
+            results = await library_db.list_library_agents(
+                user_id=user_id,  # type: ignore[arg-type]
+                search_term=query,
+                page_size=10,
+            )
+            for agent in results.agents:
+                agents.append(
+                    AgentInfo(
+                        id=agent.id,
+                        name=agent.name,
+                        description=agent.description or "",
+                        source="library",
+                        in_library=True,
+                        creator=agent.creator_name,
+                        status=agent.status.value,
+                        can_access_graph=agent.can_access_graph,
+                        has_external_trigger=agent.has_external_trigger,
+                        new_output=agent.new_output,
+                        graph_id=agent.graph_id,
+                    )
+                )
+        logger.info(f"Found {len(agents)} agents in {source}")
+    except NotFoundError:
+        pass
+    except DatabaseError as e:
+        logger.error(f"Error searching {source}: {e}", exc_info=True)
+        return ErrorResponse(
+            message=f"Failed to search {source}. Please try again.",
+            error=str(e),
+            session_id=session_id,
+        )
+
+    if not agents:
+        suggestions = (
+            [
+                "Try more general terms",
+                "Browse categories in the marketplace",
+                "Check spelling",
+            ]
+            if source == "marketplace"
+            else [
+                "Try different keywords",
+                "Use find_agent to search the marketplace",
+                "Check your library at /library",
+            ]
+        )
+        no_results_msg = (
+            f"No agents found matching '{query}'. Try different keywords or browse the marketplace."
+            if source == "marketplace"
+            else f"No agents matching '{query}' found in your library."
+        )
+        return NoResultsResponse(
+            message=no_results_msg, session_id=session_id, suggestions=suggestions
+        )
+
+    title = f"Found {len(agents)} agent{'s' if len(agents) != 1 else ''} "
+    title += (
+        f"for '{query}'"
+        if source == "marketplace"
+        else f"in your library for '{query}'"
+    )
+
+    message = (
+        "Now you have found some options for the user to choose from. "
+        "You can add a link to a recommended agent at: /marketplace/agent/agent_id "
+        "Please ask the user if they would like to use any of these agents."
+        if source == "marketplace"
+        else "Found agents in the user's library. You can provide a link to view an agent at: "
+        "/library/agents/{agent_id}. Use agent_output to get execution results, or run_agent to execute."
+    )
+
+    return AgentsFoundResponse(
+        message=message,
+        title=title,
+        agents=agents,
+        count=len(agents),
+        session_id=session_id,
+    )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/base.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/base.py
@@ -5,8 +5,8 @@ from typing import Any

 from openai.types.chat import ChatCompletionToolParam

-from backend.copilot.model import ChatSession
-from backend.copilot.response_model import StreamToolOutputAvailable
+from backend.api.features.chat.model import ChatSession
+from backend.api.features.chat.response_model import StreamToolOutputAvailable

 from .models import ErrorResponse, NeedLoginResponse, ToolResponseBase

@@ -36,16 +36,6 @@ class BaseTool:
        """Whether this tool requires authentication."""
        return False

-    @property
-    def is_long_running(self) -> bool:
-        """Whether this tool is long-running and should execute in background.
-
-        Long-running tools (like agent generation) are executed via background
-        tasks to survive SSE disconnections. The result is persisted to chat
-        history and visible when the user refreshes.
-        """
-        return False
-
    def as_openai_tool(self) -> ChatCompletionToolParam:
        """Convert to OpenAI tool format."""
        return ChatCompletionToolParam(
--- a/autogpt_platform/backend/backend/api/features/chat/tools/create_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/create_agent.py
@@ -3,22 +3,22 @@
 import logging
 from typing import Any

-from backend.copilot.model import ChatSession
+from langfuse import observe
+
+from backend.api.features.chat.model import ChatSession

 from .agent_generator import (
-    AgentGeneratorNotConfiguredError,
+    apply_all_fixes,
    decompose_goal,
-    enrich_library_agents_from_steps,
    generate_agent,
-    get_all_relevant_agents_for_generation,
-    get_user_message_for_error,
+    get_blocks_info,
    save_agent_to_library,
+    validate_agent,
 )
 from .base import BaseTool
 from .models import (
    AgentPreviewResponse,
    AgentSavedResponse,
-    AsyncProcessingResponse,
    ClarificationNeededResponse,
    ClarifyingQuestion,
    ErrorResponse,
@@ -27,6 +27,9 @@ from .models import (

 logger = logging.getLogger(__name__)

+# Maximum retries for agent generation with validation feedback
+MAX_GENERATION_RETRIES = 2
+

 class CreateAgentTool(BaseTool):
    """Tool for creating agents from natural language descriptions."""
@@ -46,10 +49,6 @@ class CreateAgentTool(BaseTool):
    def requires_auth(self) -> bool:
        return True

-    @property
-    def is_long_running(self) -> bool:
-        return True
-
    @property
    def parameters(self) -> dict[str, Any]:
        return {
@@ -81,6 +80,7 @@ class CreateAgentTool(BaseTool):
            "required": ["description"],
        }

+    @observe(as_type="tool", name="create_agent")
    async def _execute(
        self,
        user_id: str | None,
@@ -91,18 +91,15 @@ class CreateAgentTool(BaseTool):

        Flow:
        1. Decompose the description into steps (may return clarifying questions)
-        2. Generate agent JSON (external service handles fixing and validation)
-        3. Preview or save based on the save parameter
+        2. Generate agent JSON from the steps
+        3. Apply fixes to correct common LLM errors
+        4. Preview or save based on the save parameter
        """
        description = kwargs.get("description", "").strip()
        context = kwargs.get("context", "")
        save = kwargs.get("save", True)
        session_id = session.session_id if session else None

-        # Extract async processing params (passed by long-running tool handler)
-        operation_id = kwargs.get("_operation_id")
-        task_id = kwargs.get("_task_id")
-
        if not description:
            return ErrorResponse(
                message="Please provide a description of what the agent should do.",
@@ -110,61 +107,25 @@ class CreateAgentTool(BaseTool):
                session_id=session_id,
            )

-        library_agents = None
-        if user_id:
-            try:
-                library_agents = await get_all_relevant_agents_for_generation(
-                    user_id=user_id,
-                    search_query=description,
-                    include_marketplace=True,
-                )
-                logger.debug(
-                    f"Found {len(library_agents)} relevant agents for sub-agent composition"
-                )
-            except Exception as e:
-                logger.warning(f"Failed to fetch library agents: {e}")
-
+        # Step 1: Decompose goal into steps
        try:
-            decomposition_result = await decompose_goal(
-                description, context, library_agents
-            )
-        except AgentGeneratorNotConfiguredError:
+            decomposition_result = await decompose_goal(description, context)
+        except ValueError as e:
+            # Handle missing API key or configuration errors
            return ErrorResponse(
-                message=(
-                    "Agent generation is not available. "
-                    "The Agent Generator service is not configured."
-                ),
-                error="service_not_configured",
+                message=f"Agent generation is not configured: {str(e)}",
+                error="configuration_error",
                session_id=session_id,
            )

        if decomposition_result is None:
            return ErrorResponse(
-                message="Failed to analyze the goal. The agent generation service may be unavailable. Please try again.",
-                error="decomposition_failed",
-                details={"description": description[:100]},
-                session_id=session_id,
-            )
-
-        if decomposition_result.get("type") == "error":
-            error_msg = decomposition_result.get("error", "Unknown error")
-            error_type = decomposition_result.get("error_type", "unknown")
-            user_message = get_user_message_for_error(
-                error_type,
-                operation="analyze the goal",
-                llm_parse_message="The AI had trouble understanding this request. Please try rephrasing your goal.",
-            )
-            return ErrorResponse(
-                message=user_message,
-                error=f"decomposition_failed:{error_type}",
-                details={
-                    "description": description[:100],
-                    "service_error": error_msg,
-                    "error_type": error_type,
-                },
+                message="Failed to analyze the goal. Please try rephrasing.",
+                error="Decomposition failed",
                session_id=session_id,
            )

+        # Check if LLM returned clarifying questions
        if decomposition_result.get("type") == "clarifying_questions":
            questions = decomposition_result.get("questions", [])
            return ClarificationNeededResponse(
@@ -183,6 +144,7 @@ class CreateAgentTool(BaseTool):
                session_id=session_id,
            )

+        # Check for unachievable/vague goals
        if decomposition_result.get("type") == "unachievable_goal":
            suggested = decomposition_result.get("suggested_goal", "")
            reason = decomposition_result.get("reason", "")
@@ -209,88 +171,72 @@ class CreateAgentTool(BaseTool):
                session_id=session_id,
            )

-        if user_id and library_agents is not None:
-            try:
-                library_agents = await enrich_library_agents_from_steps(
-                    user_id=user_id,
-                    decomposition_result=decomposition_result,
-                    existing_agents=library_agents,
-                    include_marketplace=True,
+        # Step 2: Generate agent JSON with retry on validation failure
+        blocks_info = get_blocks_info()
+        agent_json = None
+        validation_errors = None
+
+        for attempt in range(MAX_GENERATION_RETRIES + 1):
+            # Generate agent (include validation errors from previous attempt)
+            if attempt == 0:
+                agent_json = await generate_agent(decomposition_result)
+            else:
+                # Retry with validation error feedback
+                logger.info(
+                    f"Retry {attempt}/{MAX_GENERATION_RETRIES} with validation feedback"
                )
-                logger.debug(
-                    f"After enrichment: {len(library_agents)} total agents for sub-agent composition"
+                retry_instructions = {
+                    **decomposition_result,
+                    "previous_errors": validation_errors,
+                    "retry_instructions": (
+                        "The previous generation had validation errors. "
+                        "Please fix these issues in the new generation:\n"
+                        f"{validation_errors}"
+                    ),
+                }
+                agent_json = await generate_agent(retry_instructions)
+
+            if agent_json is None:
+                if attempt == MAX_GENERATION_RETRIES:
+                    return ErrorResponse(
+                        message="Failed to generate the agent. Please try again.",
+                        error="Generation failed",
+                        session_id=session_id,
+                    )
+                continue
+
+            # Step 3: Apply fixes to correct common errors
+            agent_json = apply_all_fixes(agent_json, blocks_info)
+
+            # Step 4: Validate the agent
+            is_valid, validation_errors = validate_agent(agent_json, blocks_info)
+
+            if is_valid:
+                logger.info(f"Agent generated successfully on attempt {attempt + 1}")
+                break
+
+            logger.warning(
+                f"Validation failed on attempt {attempt + 1}: {validation_errors}"
+            )
+
+            if attempt == MAX_GENERATION_RETRIES:
+                # Return error with validation details
+                return ErrorResponse(
+                    message=(
+                        f"Generated agent has validation errors after {MAX_GENERATION_RETRIES + 1} attempts. "
+                        f"Please try rephrasing your request or simplify the workflow."
+                    ),
+                    error="validation_failed",
+                    details={"validation_errors": validation_errors},
+                    session_id=session_id,
                )
-            except Exception as e:
-                logger.warning(f"Failed to enrich library agents from steps: {e}")
-
-        try:
-            agent_json = await generate_agent(
-                decomposition_result,
-                library_agents,
-                operation_id=operation_id,
-                task_id=task_id,
-            )
-        except AgentGeneratorNotConfiguredError:
-            return ErrorResponse(
-                message=(
-                    "Agent generation is not available. "
-                    "The Agent Generator service is not configured."
-                ),
-                error="service_not_configured",
-                session_id=session_id,
-            )
-
-        if agent_json is None:
-            return ErrorResponse(
-                message="Failed to generate the agent. The agent generation service may be unavailable. Please try again.",
-                error="generation_failed",
-                details={"description": description[:100]},
-                session_id=session_id,
-            )
-
-        if isinstance(agent_json, dict) and agent_json.get("type") == "error":
-            error_msg = agent_json.get("error", "Unknown error")
-            error_type = agent_json.get("error_type", "unknown")
-            user_message = get_user_message_for_error(
-                error_type,
-                operation="generate the agent",
-                llm_parse_message="The AI had trouble generating the agent. Please try again or simplify your goal.",
-                validation_message=(
-                    "I wasn't able to create a valid agent for this request. "
-                    "The generated workflow had some structural issues. "
-                    "Please try simplifying your goal or breaking it into smaller steps."
-                ),
-                error_details=error_msg,
-            )
-            return ErrorResponse(
-                message=user_message,
-                error=f"generation_failed:{error_type}",
-                details={
-                    "description": description[:100],
-                    "service_error": error_msg,
-                    "error_type": error_type,
-                },
-                session_id=session_id,
-            )
-
-        # Check if Agent Generator accepted for async processing
-        if agent_json.get("status") == "accepted":
-            logger.info(
-                f"Agent generation delegated to async processing "
-                f"(operation_id={operation_id}, task_id={task_id})"
-            )
-            return AsyncProcessingResponse(
-                message="Agent generation started. You'll be notified when it's complete.",
-                operation_id=operation_id,
-                task_id=task_id,
-                session_id=session_id,
-            )

        agent_name = agent_json.get("name", "Generated Agent")
        agent_description = agent_json.get("description", "")
        node_count = len(agent_json.get("nodes", []))
        link_count = len(agent_json.get("links", []))

+        # Step 4: Preview or save
        if not save:
            return AgentPreviewResponse(
                message=(
@@ -305,6 +251,7 @@ class CreateAgentTool(BaseTool):
                session_id=session_id,
            )

+        # Save to library
        if not user_id:
            return ErrorResponse(
                message="You must be logged in to save agents.",
@@ -322,7 +269,7 @@ class CreateAgentTool(BaseTool):
                agent_id=created_graph.id,
                agent_name=created_graph.name,
                library_agent_id=library_agent.id,
-                library_agent_link=f"/library/agents/{library_agent.id}",
+                library_agent_link=f"/library/{library_agent.id}",
                agent_page_link=f"/build?flowID={created_graph.id}",
                session_id=session_id,
            )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/edit_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/edit_agent.py
@@ -3,21 +3,23 @@
 import logging
 from typing import Any

-from backend.copilot.model import ChatSession
+from langfuse import observe
+
+from backend.api.features.chat.model import ChatSession

 from .agent_generator import (
-    AgentGeneratorNotConfiguredError,
+    apply_agent_patch,
+    apply_all_fixes,
    generate_agent_patch,
    get_agent_as_json,
-    get_all_relevant_agents_for_generation,
-    get_user_message_for_error,
+    get_blocks_info,
    save_agent_to_library,
+    validate_agent,
 )
 from .base import BaseTool
 from .models import (
    AgentPreviewResponse,
    AgentSavedResponse,
-    AsyncProcessingResponse,
    ClarificationNeededResponse,
    ClarifyingQuestion,
    ErrorResponse,
@@ -26,6 +28,9 @@ from .models import (

 logger = logging.getLogger(__name__)

+# Maximum retries for patch generation with validation feedback
+MAX_GENERATION_RETRIES = 2
+

 class EditAgentTool(BaseTool):
    """Tool for editing existing agents using natural language."""
@@ -38,17 +43,13 @@ class EditAgentTool(BaseTool):
    def description(self) -> str:
        return (
            "Edit an existing agent from the user's library using natural language. "
-            "Generates updates to the agent while preserving unchanged parts."
+            "Generates a patch to update the agent while preserving unchanged parts."
        )

    @property
    def requires_auth(self) -> bool:
        return True

-    @property
-    def is_long_running(self) -> bool:
-        return True
-
    @property
    def parameters(self) -> dict[str, Any]:
        return {
@@ -86,6 +87,7 @@ class EditAgentTool(BaseTool):
            "required": ["agent_id", "changes"],
        }

+    @observe(as_type="tool", name="edit_agent")
    async def _execute(
        self,
        user_id: str | None,
@@ -96,8 +98,9 @@ class EditAgentTool(BaseTool):

        Flow:
        1. Fetch the current agent
-        2. Generate updated agent (external service handles fixing and validation)
-        3. Preview or save based on the save parameter
+        2. Generate a patch based on the requested changes
+        3. Apply the patch to create an updated agent
+        4. Preview or save based on the save parameter
        """
        agent_id = kwargs.get("agent_id", "").strip()
        changes = kwargs.get("changes", "").strip()
@@ -105,10 +108,6 @@ class EditAgentTool(BaseTool):
        save = kwargs.get("save", True)
        session_id = session.session_id if session else None

-        # Extract async processing params (passed by long-running tool handler)
-        operation_id = kwargs.get("_operation_id")
-        task_id = kwargs.get("_task_id")
-
        if not agent_id:
            return ErrorResponse(
                message="Please provide the agent ID to edit.",
@@ -123,6 +122,7 @@ class EditAgentTool(BaseTool):
                session_id=session_id,
            )

+        # Step 1: Fetch current agent
        current_agent = await get_agent_as_json(agent_id, user_id)

        if current_agent is None:
@@ -132,117 +132,126 @@ class EditAgentTool(BaseTool):
                session_id=session_id,
            )

-        library_agents = None
-        if user_id:
-            try:
-                graph_id = current_agent.get("id")
-                library_agents = await get_all_relevant_agents_for_generation(
-                    user_id=user_id,
-                    search_query=changes,
-                    exclude_graph_id=graph_id,
-                    include_marketplace=True,
-                )
-                logger.debug(
-                    f"Found {len(library_agents)} relevant agents for sub-agent composition"
-                )
-            except Exception as e:
-                logger.warning(f"Failed to fetch library agents: {e}")
-
+        # Build the update request with context
        update_request = changes
        if context:
            update_request = f"{changes}\n\nAdditional context:\n{context}"

-        try:
-            result = await generate_agent_patch(
-                update_request,
-                current_agent,
-                library_agents,
-                operation_id=operation_id,
-                task_id=task_id,
-            )
-        except AgentGeneratorNotConfiguredError:
-            return ErrorResponse(
-                message=(
-                    "Agent editing is not available. "
-                    "The Agent Generator service is not configured."
-                ),
-                error="service_not_configured",
-                session_id=session_id,
-            )
+        # Step 2: Generate patch with retry on validation failure
+        blocks_info = get_blocks_info()
+        updated_agent = None
+        validation_errors = None
+        intent = "Applied requested changes"

-        if result is None:
-            return ErrorResponse(
-                message="Failed to generate changes. The agent generation service may be unavailable or timed out. Please try again.",
-                error="update_generation_failed",
-                details={"agent_id": agent_id, "changes": changes[:100]},
-                session_id=session_id,
-            )
-
-        # Check if Agent Generator accepted for async processing
-        if result.get("status") == "accepted":
-            logger.info(
-                f"Agent edit delegated to async processing "
-                f"(operation_id={operation_id}, task_id={task_id})"
-            )
-            return AsyncProcessingResponse(
-                message="Agent edit started. You'll be notified when it's complete.",
-                operation_id=operation_id,
-                task_id=task_id,
-                session_id=session_id,
-            )
-
-        # Check if the result is an error from the external service
-        if isinstance(result, dict) and result.get("type") == "error":
-            error_msg = result.get("error", "Unknown error")
-            error_type = result.get("error_type", "unknown")
-            user_message = get_user_message_for_error(
-                error_type,
-                operation="generate the changes",
-                llm_parse_message="The AI had trouble generating the changes. Please try again or simplify your request.",
-                validation_message="The generated changes failed validation. Please try rephrasing your request.",
-                error_details=error_msg,
-            )
-            return ErrorResponse(
-                message=user_message,
-                error=f"update_generation_failed:{error_type}",
-                details={
-                    "agent_id": agent_id,
-                    "changes": changes[:100],
-                    "service_error": error_msg,
-                    "error_type": error_type,
-                },
-                session_id=session_id,
-            )
-
-        if result.get("type") == "clarifying_questions":
-            questions = result.get("questions", [])
-            return ClarificationNeededResponse(
-                message=(
-                    "I need some more information about the changes. "
-                    "Please answer the following questions:"
-                ),
-                questions=[
-                    ClarifyingQuestion(
-                        question=q.get("question", ""),
-                        keyword=q.get("keyword", ""),
-                        example=q.get("example"),
+        for attempt in range(MAX_GENERATION_RETRIES + 1):
+            # Generate patch (include validation errors from previous attempt)
+            try:
+                if attempt == 0:
+                    patch_result = await generate_agent_patch(
+                        update_request, current_agent
                    )
-                    for q in questions
-                ],
-                session_id=session_id,
+                else:
+                    # Retry with validation error feedback
+                    logger.info(
+                        f"Retry {attempt}/{MAX_GENERATION_RETRIES} with validation feedback"
+                    )
+                    retry_request = (
+                        f"{update_request}\n\n"
+                        f"IMPORTANT: The previous edit had validation errors. "
+                        f"Please fix these issues:\n{validation_errors}"
+                    )
+                    patch_result = await generate_agent_patch(
+                        retry_request, current_agent
+                    )
+            except ValueError as e:
+                # Handle missing API key or configuration errors
+                return ErrorResponse(
+                    message=f"Agent generation is not configured: {str(e)}",
+                    error="configuration_error",
+                    session_id=session_id,
+                )
+
+            if patch_result is None:
+                if attempt == MAX_GENERATION_RETRIES:
+                    return ErrorResponse(
+                        message="Failed to generate changes. Please try rephrasing.",
+                        error="Patch generation failed",
+                        session_id=session_id,
+                    )
+                continue
+
+            # Check if LLM returned clarifying questions
+            if patch_result.get("type") == "clarifying_questions":
+                questions = patch_result.get("questions", [])
+                return ClarificationNeededResponse(
+                    message=(
+                        "I need some more information about the changes. "
+                        "Please answer the following questions:"
+                    ),
+                    questions=[
+                        ClarifyingQuestion(
+                            question=q.get("question", ""),
+                            keyword=q.get("keyword", ""),
+                            example=q.get("example"),
+                        )
+                        for q in questions
+                    ],
+                    session_id=session_id,
+                )
+
+            # Step 3: Apply patch and fixes
+            try:
+                updated_agent = apply_agent_patch(current_agent, patch_result)
+                updated_agent = apply_all_fixes(updated_agent, blocks_info)
+            except Exception as e:
+                if attempt == MAX_GENERATION_RETRIES:
+                    return ErrorResponse(
+                        message=f"Failed to apply changes: {str(e)}",
+                        error="patch_apply_failed",
+                        details={"exception": str(e)},
+                        session_id=session_id,
+                    )
+                validation_errors = str(e)
+                continue
+
+            # Step 4: Validate the updated agent
+            is_valid, validation_errors = validate_agent(updated_agent, blocks_info)
+
+            if is_valid:
+                logger.info(f"Agent edited successfully on attempt {attempt + 1}")
+                intent = patch_result.get("intent", "Applied requested changes")
+                break
+
+            logger.warning(
+                f"Validation failed on attempt {attempt + 1}: {validation_errors}"
            )

-        updated_agent = result
+            if attempt == MAX_GENERATION_RETRIES:
+                # Return error with validation details
+                return ErrorResponse(
+                    message=(
+                        f"Updated agent has validation errors after "
+                        f"{MAX_GENERATION_RETRIES + 1} attempts. "
+                        f"Please try rephrasing your request or simplify the changes."
+                    ),
+                    error="validation_failed",
+                    details={"validation_errors": validation_errors},
+                    session_id=session_id,
+                )
+
+        # At this point, updated_agent is guaranteed to be set (we return on all failure paths)
+        assert updated_agent is not None

        agent_name = updated_agent.get("name", "Updated Agent")
        agent_description = updated_agent.get("description", "")
        node_count = len(updated_agent.get("nodes", []))
        link_count = len(updated_agent.get("links", []))

+        # Step 5: Preview or save
        if not save:
            return AgentPreviewResponse(
                message=(
-                    f"I've updated the agent. "
+                    f"I've updated the agent. Changes: {intent}. "
                    f"The agent now has {node_count} blocks. "
                    f"Review it and call edit_agent with save=true to save the changes."
                ),
@@ -254,6 +263,7 @@ class EditAgentTool(BaseTool):
                session_id=session_id,
            )

+        # Save to library (creates a new version)
        if not user_id:
            return ErrorResponse(
                message="You must be logged in to save agents.",
@@ -267,11 +277,14 @@ class EditAgentTool(BaseTool):
            )

            return AgentSavedResponse(
-                message=f"Updated agent '{created_graph.name}' has been saved to your library!",
+                message=(
+                    f"Updated agent '{created_graph.name}' has been saved to your library! "
+                    f"Changes: {intent}"
+                ),
                agent_id=created_graph.id,
                agent_name=created_graph.name,
                library_agent_id=library_agent.id,
-                library_agent_link=f"/library/agents/{library_agent.id}",
+                library_agent_link=f"/library/{library_agent.id}",
                agent_page_link=f"/build?flowID={created_graph.id}",
                session_id=session_id,
            )
--- a/autogpt_platform/backend/backend/api/features/chat/tools/find_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/find_agent.py
@@ -2,7 +2,9 @@

 from typing import Any

-from backend.copilot.model import ChatSession
+from langfuse import observe
+
+from backend.api.features.chat.model import ChatSession

 from .agent_search import search_agents
 from .base import BaseTool
@@ -35,6 +37,7 @@ class FindAgentTool(BaseTool):
            "required": ["query"],
        }

+    @observe(as_type="tool", name="find_agent")
    async def _execute(
        self, user_id: str | None, session: ChatSession, **kwargs
    ) -> ToolResponseBase:
--- a/autogpt_platform/backend/backend/api/features/chat/tools/find_block.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/find_block.py
@@ -1,45 +1,23 @@
 import logging
 from typing import Any

+from langfuse import observe
 from prisma.enums import ContentType

-from backend.blocks import get_block
-from backend.blocks._base import BlockType
-from backend.copilot.model import ChatSession
-from backend.data.db_accessors import search
-
-from .base import BaseTool, ToolResponseBase
-from .models import (
+from backend.api.features.chat.model import ChatSession
+from backend.api.features.chat.tools.base import BaseTool, ToolResponseBase
+from backend.api.features.chat.tools.models import (
    BlockInfoSummary,
+    BlockInputFieldInfo,
    BlockListResponse,
    ErrorResponse,
    NoResultsResponse,
 )
+from backend.api.features.store.hybrid_search import unified_hybrid_search
+from backend.data.block import get_block

 logger = logging.getLogger(__name__)

-_TARGET_RESULTS = 10
-# Over-fetch to compensate for post-hoc filtering of graph-only blocks.
-# 40 is 2x current removed; speed of query 10 vs 40 is minimial
-_OVERFETCH_PAGE_SIZE = 40
-
-# Block types that only work within graphs and cannot run standalone in CoPilot.
-COPILOT_EXCLUDED_BLOCK_TYPES = {
-    BlockType.INPUT,  # Graph interface definition - data enters via chat, not graph inputs
-    BlockType.OUTPUT,  # Graph interface definition - data exits via chat, not graph outputs
-    BlockType.WEBHOOK,  # Wait for external events - would hang forever in CoPilot
-    BlockType.WEBHOOK_MANUAL,  # Same as WEBHOOK
-    BlockType.NOTE,  # Visual annotation only - no runtime behavior
-    BlockType.HUMAN_IN_THE_LOOP,  # Pauses for human approval - CoPilot IS human-in-the-loop
-    BlockType.AGENT,  # AgentExecutorBlock requires execution_context - use run_agent tool
-}
-
-# Specific block IDs excluded from CoPilot (STANDARD type but still require graph context)
-COPILOT_EXCLUDED_BLOCK_IDS = {
-    # SmartDecisionMakerBlock - dynamically discovers downstream blocks via graph topology
-    "3b191d9f-356f-482d-8238-ba04b6d18381",
-}
-

 class FindBlockTool(BaseTool):
    """Tool for searching available blocks."""
@@ -55,8 +33,7 @@ class FindBlockTool(BaseTool):
            "Blocks are reusable components that perform specific tasks like "
            "sending emails, making API calls, processing text, etc. "
            "IMPORTANT: Use this tool FIRST to get the block's 'id' before calling run_block. "
-            "The response includes each block's id, name, and description. "
-            "Call run_block with the block's id **with no inputs** to see detailed inputs/outputs and execute it."
+            "The response includes each block's id, required_inputs, and input_schema."
        )

    @property
@@ -79,6 +56,7 @@ class FindBlockTool(BaseTool):
    def requires_auth(self) -> bool:
        return True

+    @observe(as_type="tool", name="find_block")
    async def _execute(
        self,
        user_id: str | None,
@@ -108,11 +86,11 @@ class FindBlockTool(BaseTool):

        try:
            # Search for blocks using hybrid search
-            results, total = await search().unified_hybrid_search(
+            results, total = await unified_hybrid_search(
                query=query,
                content_types=[ContentType.BLOCK],
                page=1,
-                page_size=_OVERFETCH_PAGE_SIZE,
+                page_size=10,
            )

            if not results:
@@ -125,44 +103,66 @@ class FindBlockTool(BaseTool):
                    session_id=session_id,
                )

-            # Enrich results with block information
+            # Enrich results with full block information
            blocks: list[BlockInfoSummary] = []
            for result in results:
                block_id = result["content_id"]
                block = get_block(block_id)

-                # Skip disabled blocks
-                if not block or block.disabled:
-                    continue
+                if block:
+                    # Get input/output schemas
+                    input_schema = {}
+                    output_schema = {}
+                    try:
+                        input_schema = block.input_schema.jsonschema()
+                    except Exception:
+                        pass
+                    try:
+                        output_schema = block.output_schema.jsonschema()
+                    except Exception:
+                        pass

-                # Skip blocks excluded from CoPilot (graph-only blocks)
-                if (
-                    block.block_type in COPILOT_EXCLUDED_BLOCK_TYPES
-                    or block.id in COPILOT_EXCLUDED_BLOCK_IDS
-                ):
-                    continue
+                    # Get categories from block instance
+                    categories = []
+                    if hasattr(block, "categories") and block.categories:
+                        categories = [cat.value for cat in block.categories]

-                blocks.append(
-                    BlockInfoSummary(
-                        id=block_id,
-                        name=block.name,
-                        description=block.description or "",
-                        categories=[c.value for c in block.categories],
+                    # Extract required inputs for easier use
+                    required_inputs: list[BlockInputFieldInfo] = []
+                    if input_schema:
+                        properties = input_schema.get("properties", {})
+                        required_fields = set(input_schema.get("required", []))
+                        # Get credential field names to exclude from required inputs
+                        credentials_fields = set(
+                            block.input_schema.get_credentials_fields().keys()
+                        )
+
+                        for field_name, field_schema in properties.items():
+                            # Skip credential fields - they're handled separately
+                            if field_name in credentials_fields:
+                                continue
+
+                            required_inputs.append(
+                                BlockInputFieldInfo(
+                                    name=field_name,
+                                    type=field_schema.get("type", "string"),
+                                    description=field_schema.get("description", ""),
+                                    required=field_name in required_fields,
+                                    default=field_schema.get("default"),
+                                )
+                            )
+
+                    blocks.append(
+                        BlockInfoSummary(
+                            id=block_id,
+                            name=block.name,
+                            description=block.description or "",
+                            categories=categories,
+                            input_schema=input_schema,
+                            output_schema=output_schema,
+                            required_inputs=required_inputs,
+                        )
                    )
-                )
-
-                if len(blocks) >= _TARGET_RESULTS:
-                    break
-
-            if blocks and len(blocks) < _TARGET_RESULTS:
-                logger.debug(
-                    "find_block returned %d/%d results for query '%s' "
-                    "(filtered %d excluded/disabled blocks)",
-                    len(blocks),
-                    _TARGET_RESULTS,
-                    query,
-                    len(results) - len(blocks),
-                )

            if not blocks:
                return NoResultsResponse(
@@ -176,7 +176,8 @@ class FindBlockTool(BaseTool):
            return BlockListResponse(
                message=(
                    f"Found {len(blocks)} block(s) matching '{query}'. "
-                    "To see a block's inputs/outputs and execute it, use run_block with the block's 'id' - providing no inputs."
+                    "To execute a block, use run_block with the block's 'id' field "
+                    "and provide 'input_data' matching the block's input_schema."
                ),
                blocks=blocks,
                count=len(blocks),
--- a/autogpt_platform/backend/backend/api/features/chat/tools/find_library_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/find_library_agent.py
@@ -2,7 +2,9 @@

 from typing import Any

-from backend.copilot.model import ChatSession
+from langfuse import observe
+
+from backend.api.features.chat.model import ChatSession

 from .agent_search import search_agents
 from .base import BaseTool
@@ -41,6 +43,7 @@ class FindLibraryAgentTool(BaseTool):
    def requires_auth(self) -> bool:
        return True

+    @observe(as_type="tool", name="find_library_agent")
    async def _execute(
        self, user_id: str | None, session: ChatSession, **kwargs
    ) -> ToolResponseBase:
--- a/autogpt_platform/backend/backend/api/features/chat/tools/get_doc_page.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/get_doc_page.py
@@ -4,10 +4,15 @@ import logging
 from pathlib import Path
 from typing import Any

-from backend.copilot.model import ChatSession
+from langfuse import observe

-from .base import BaseTool
-from .models import DocPageResponse, ErrorResponse, ToolResponseBase
+from backend.api.features.chat.model import ChatSession
+from backend.api.features.chat.tools.base import BaseTool
+from backend.api.features.chat.tools.models import (
+    DocPageResponse,
+    ErrorResponse,
+    ToolResponseBase,
+)

 logger = logging.getLogger(__name__)

@@ -68,6 +73,7 @@ class GetDocPageTool(BaseTool):
        url_path = path.rsplit(".", 1)[0] if "." in path else path
        return f"{DOCS_BASE_URL}/{url_path}"

+    @observe(as_type="tool", name="get_doc_page")
    async def _execute(
        self,
        user_id: str | None,
--- a/autogpt_platform/backend/backend/api/features/chat/tools/models.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/models.py
@@ -25,31 +25,9 @@ class ResponseType(str, Enum):
    AGENT_SAVED = "agent_saved"
    CLARIFICATION_NEEDED = "clarification_needed"
    BLOCK_LIST = "block_list"
-    BLOCK_DETAILS = "block_details"
    BLOCK_OUTPUT = "block_output"
    DOC_SEARCH_RESULTS = "doc_search_results"
    DOC_PAGE = "doc_page"
-    # Workspace response types
-    WORKSPACE_FILE_LIST = "workspace_file_list"
-    WORKSPACE_FILE_CONTENT = "workspace_file_content"
-    WORKSPACE_FILE_METADATA = "workspace_file_metadata"
-    WORKSPACE_FILE_WRITTEN = "workspace_file_written"
-    WORKSPACE_FILE_DELETED = "workspace_file_deleted"
-    # Long-running operation types
-    OPERATION_STARTED = "operation_started"
-    OPERATION_PENDING = "operation_pending"
-    OPERATION_IN_PROGRESS = "operation_in_progress"
-    # Input validation
-    INPUT_VALIDATION_ERROR = "input_validation_error"
-    # Web fetch
-    WEB_FETCH = "web_fetch"
-    # Code execution
-    BASH_EXEC = "bash_exec"
-    # Operation status check
-    OPERATION_STATUS = "operation_status"
-    # Feature request types
-    FEATURE_REQUEST_SEARCH = "feature_request_search"
-    FEATURE_REQUEST_CREATED = "feature_request_created"


 # Base response model
@@ -80,10 +58,6 @@ class AgentInfo(BaseModel):
    has_external_trigger: bool | None = None
    new_output: bool | None = None
    graph_id: str | None = None
-    inputs: dict[str, Any] | None = Field(
-        default=None,
-        description="Input schema for the agent, including field names, types, and defaults",
-    )


 class AgentsFoundResponse(ToolResponseBase):
@@ -210,20 +184,6 @@ class ErrorResponse(ToolResponseBase):
    details: dict[str, Any] | None = None


-class InputValidationErrorResponse(ToolResponseBase):
-    """Response when run_agent receives unknown input fields."""
-
-    type: ResponseType = ResponseType.INPUT_VALIDATION_ERROR
-    unrecognized_fields: list[str] = Field(
-        description="List of input field names that were not recognized"
-    )
-    inputs: dict[str, Any] = Field(
-        description="The agent's valid input schema for reference"
-    )
-    graph_id: str | None = None
-    graph_version: int | None = None
-
-
 # Agent output models
 class ExecutionOutputInfo(BaseModel):
    """Summary of a single execution's outputs."""
@@ -345,17 +305,11 @@ class BlockInfoSummary(BaseModel):
    name: str
    description: str
    categories: list[str]
-    input_schema: dict[str, Any] = Field(
-        default_factory=dict,
-        description="Full JSON schema for block inputs",
-    )
-    output_schema: dict[str, Any] = Field(
-        default_factory=dict,
-        description="Full JSON schema for block outputs",
-    )
+    input_schema: dict[str, Any]
+    output_schema: dict[str, Any]
    required_inputs: list[BlockInputFieldInfo] = Field(
        default_factory=list,
-        description="List of input fields for this block",
+        description="List of required input fields for this block",
    )


@@ -368,29 +322,10 @@ class BlockListResponse(ToolResponseBase):
    query: str
    usage_hint: str = Field(
        default="To execute a block, call run_block with block_id set to the block's "
-        "'id' field and input_data containing the fields listed in required_inputs."
+        "'id' field and input_data containing the required fields from input_schema."
    )


-class BlockDetails(BaseModel):
-    """Detailed block information."""
-
-    id: str
-    name: str
-    description: str
-    inputs: dict[str, Any] = {}
-    outputs: dict[str, Any] = {}
-    credentials: list[CredentialsMetaInput] = []
-
-
-class BlockDetailsResponse(ToolResponseBase):
-    """Response for block details (first run_block attempt)."""
-
-    type: ResponseType = ResponseType.BLOCK_DETAILS
-    block: BlockDetails
-    user_authenticated: bool = False
-
-
 class BlockOutputResponse(ToolResponseBase):
    """Response for run_block tool."""

@@ -399,111 +334,3 @@ class BlockOutputResponse(ToolResponseBase):
    block_name: str
    outputs: dict[str, list[Any]]
    success: bool = True
-
-
-# Long-running operation models
-class OperationStartedResponse(ToolResponseBase):
-    """Response when a long-running operation has been started in the background.
-
-    This is returned immediately to the client while the operation continues
-    to execute. The user can close the tab and check back later.
-
-    The task_id can be used to reconnect to the SSE stream via
-    GET /chat/tasks/{task_id}/stream?last_idx=0
-    """
-
-    type: ResponseType = ResponseType.OPERATION_STARTED
-    operation_id: str
-    tool_name: str
-    task_id: str | None = None  # For SSE reconnection
-
-
-class OperationPendingResponse(ToolResponseBase):
-    """Response stored in chat history while a long-running operation is executing.
-
-    This is persisted to the database so users see a pending state when they
-    refresh before the operation completes.
-    """
-
-    type: ResponseType = ResponseType.OPERATION_PENDING
-    operation_id: str
-    tool_name: str
-
-
-class OperationInProgressResponse(ToolResponseBase):
-    """Response when an operation is already in progress.
-
-    Returned for idempotency when the same tool_call_id is requested again
-    while the background task is still running.
-    """
-
-    type: ResponseType = ResponseType.OPERATION_IN_PROGRESS
-    tool_call_id: str
-
-
-class AsyncProcessingResponse(ToolResponseBase):
-    """Response when an operation has been delegated to async processing.
-
-    This is returned by tools when the external service accepts the request
-    for async processing (HTTP 202 Accepted). The Redis Streams completion
-    consumer will handle the result when the external service completes.
-
-    The status field is specifically "accepted" to allow the long-running tool
-    handler to detect this response and skip LLM continuation.
-    """
-
-    type: ResponseType = ResponseType.OPERATION_STARTED
-    status: str = "accepted"  # Must be "accepted" for detection
-    operation_id: str | None = None
-    task_id: str | None = None
-
-
-class WebFetchResponse(ToolResponseBase):
-    """Response for web_fetch tool."""
-
-    type: ResponseType = ResponseType.WEB_FETCH
-    url: str
-    status_code: int
-    content_type: str
-    content: str
-    truncated: bool = False
-
-
-class BashExecResponse(ToolResponseBase):
-    """Response for bash_exec tool."""
-
-    type: ResponseType = ResponseType.BASH_EXEC
-    stdout: str
-    stderr: str
-    exit_code: int
-    timed_out: bool = False
-
-
-# Feature request models
-class FeatureRequestInfo(BaseModel):
-    """Information about a feature request issue."""
-
-    id: str
-    identifier: str
-    title: str
-
-
-class FeatureRequestSearchResponse(ToolResponseBase):
-    """Response for search_feature_requests tool."""
-
-    type: ResponseType = ResponseType.FEATURE_REQUEST_SEARCH
-    results: list[FeatureRequestInfo]
-    count: int
-    query: str
-
-
-class FeatureRequestCreatedResponse(ToolResponseBase):
-    """Response for create_feature_request tool."""
-
-    type: ResponseType = ResponseType.FEATURE_REQUEST_CREATED
-    issue_id: str
-    issue_identifier: str
-    issue_title: str
-    issue_url: str
-    is_new_issue: bool  # False if added to existing
-    customer_name: str
--- a/autogpt_platform/backend/backend/api/features/chat/tools/run_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/run_agent.py
@@ -3,14 +3,15 @@
 import logging
 from typing import Any

+from langfuse import observe
 from pydantic import BaseModel, Field, field_validator

-from backend.copilot.config import ChatConfig
-from backend.copilot.model import ChatSession
-from backend.copilot.tracking import track_agent_run_success, track_agent_scheduled
-from backend.data.db_accessors import graph_db, library_db, user_db
+from backend.api.features.chat.config import ChatConfig
+from backend.api.features.chat.model import ChatSession
+from backend.api.features.library import db as library_db
 from backend.data.graph import GraphModel
 from backend.data.model import CredentialsMetaInput
+from backend.data.user import get_user_by_id
 from backend.executor import utils as execution_utils
 from backend.util.clients import get_scheduler_client
 from backend.util.exceptions import DatabaseError, NotFoundError
@@ -20,14 +21,12 @@ from backend.util.timezone_utils import (
 )

 from .base import BaseTool
-from .helpers import get_inputs_from_schema
 from .models import (
    AgentDetails,
    AgentDetailsResponse,
    ErrorResponse,
    ExecutionOptions,
    ExecutionStartedResponse,
-    InputValidationErrorResponse,
    SetupInfo,
    SetupRequirementsResponse,
    ToolResponseBase,
@@ -156,6 +155,7 @@ class RunAgentTool(BaseTool):
        """All operations require authentication."""
        return True

+    @observe(as_type="tool", name="run_agent")
    async def _execute(
        self,
        user_id: str | None,
@@ -196,7 +196,7 @@ class RunAgentTool(BaseTool):

            # Priority: library_agent_id if provided
            if has_library_id:
-                library_agent = await library_db().get_library_agent(
+                library_agent = await library_db.get_library_agent(
                    params.library_agent_id, user_id
                )
                if not library_agent:
@@ -205,7 +205,9 @@ class RunAgentTool(BaseTool):
                        session_id=session_id,
                    )
                # Get the graph from the library agent
-                graph = await graph_db().get_graph(
+                from backend.data.graph import get_graph
+
+                graph = await get_graph(
                    library_agent.graph_id,
                    library_agent.graph_version,
                    user_id=user_id,
@@ -256,7 +258,7 @@ class RunAgentTool(BaseTool):
                        ),
                        requirements={
                            "credentials": requirements_creds_list,
-                            "inputs": get_inputs_from_schema(graph.input_schema),
+                            "inputs": self._get_inputs_list(graph.input_schema),
                            "execution_modes": self._get_execution_modes(graph),
                        },
                    ),
@@ -269,22 +271,6 @@ class RunAgentTool(BaseTool):
            input_properties = graph.input_schema.get("properties", {})
            required_fields = set(graph.input_schema.get("required", []))
            provided_inputs = set(params.inputs.keys())
-            valid_fields = set(input_properties.keys())
-
-            # Check for unknown input fields
-            unrecognized_fields = provided_inputs - valid_fields
-            if unrecognized_fields:
-                return InputValidationErrorResponse(
-                    message=(
-                        f"Unknown input field(s) provided: {', '.join(sorted(unrecognized_fields))}. "
-                        f"Agent was not executed. Please use the correct field names from the schema."
-                    ),
-                    session_id=session_id,
-                    unrecognized_fields=sorted(unrecognized_fields),
-                    inputs=graph.input_schema,
-                    graph_id=graph.id,
-                    graph_version=graph.version,
-                )

            # If agent has inputs but none were provided AND use_defaults is not set,
            # always show what's available first so user can decide
@@ -364,6 +350,22 @@ class RunAgentTool(BaseTool):
                session_id=session_id,
            )

+    def _get_inputs_list(self, input_schema: dict[str, Any]) -> list[dict[str, Any]]:
+        """Extract inputs list from schema."""
+        inputs_list = []
+        if isinstance(input_schema, dict) and "properties" in input_schema:
+            for field_name, field_schema in input_schema["properties"].items():
+                inputs_list.append(
+                    {
+                        "name": field_name,
+                        "title": field_schema.get("title", field_name),
+                        "type": field_schema.get("type", "string"),
+                        "description": field_schema.get("description", ""),
+                        "required": field_name in input_schema.get("required", []),
+                    }
+                )
+        return inputs_list
+
    def _get_execution_modes(self, graph: GraphModel) -> list[str]:
        """Get available execution modes for the graph."""
        trigger_info = graph.trigger_setup_info
@@ -377,7 +379,7 @@ class RunAgentTool(BaseTool):
        suffix: str,
    ) -> str:
        """Build a message describing available inputs for an agent."""
-        inputs_list = get_inputs_from_schema(graph.input_schema)
+        inputs_list = self._get_inputs_list(graph.input_schema)
        required_names = [i["name"] for i in inputs_list if i["required"]]
        optional_names = [i["name"] for i in inputs_list if not i["required"]]

@@ -451,16 +453,6 @@ class RunAgentTool(BaseTool):
            session.successful_agent_runs.get(library_agent.graph_id, 0) + 1
        )

-        # Track in PostHog
-        track_agent_run_success(
-            user_id=user_id,
-            session_id=session_id,
-            graph_id=library_agent.graph_id,
-            graph_name=library_agent.name,
-            execution_id=execution.id,
-            library_agent_id=library_agent.id,
-        )
-
        library_agent_link = f"/library/agents/{library_agent.id}"
        return ExecutionStartedResponse(
            message=(
@@ -516,7 +508,7 @@ class RunAgentTool(BaseTool):
        library_agent = await get_or_create_library_agent(graph, user_id)

        # Get user timezone
-        user = await user_db().get_user_by_id(user_id)
+        user = await get_user_by_id(user_id)
        user_timezone = get_user_timezone_or_utc(user.timezone if user else timezone)

        # Create schedule
@@ -542,18 +534,6 @@ class RunAgentTool(BaseTool):
            session.successful_agent_schedules.get(library_agent.graph_id, 0) + 1
        )

-        # Track in PostHog
-        track_agent_scheduled(
-            user_id=user_id,
-            session_id=session_id,
-            graph_id=library_agent.graph_id,
-            graph_name=library_agent.name,
-            schedule_id=result.id,
-            schedule_name=schedule_name,
-            cron=cron,
-            library_agent_id=library_agent.id,
-        )
-
        library_agent_link = f"/library/agents/{library_agent.id}"
        return ExecutionStartedResponse(
            message=(
--- a/autogpt_platform/backend/backend/api/features/chat/tools/run_agent_test.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/run_agent_test.py
@@ -29,7 +29,7 @@ def mock_embedding_functions():
        yield


-@pytest.mark.asyncio(loop_scope="session")
+@pytest.mark.asyncio(scope="session")
 async def test_run_agent(setup_test_data):
    """Test that the run_agent tool successfully executes an approved agent"""
    # Use test data from fixture
@@ -70,7 +70,7 @@ async def test_run_agent(setup_test_data):
    assert result_data["graph_name"] == "Test Agent"


-@pytest.mark.asyncio(loop_scope="session")
+@pytest.mark.asyncio(scope="session")
 async def test_run_agent_missing_inputs(setup_test_data):
    """Test that the run_agent tool returns error when inputs are missing"""
    # Use test data from fixture
@@ -106,7 +106,7 @@ async def test_run_agent_missing_inputs(setup_test_data):
    assert "message" in result_data


-@pytest.mark.asyncio(loop_scope="session")
+@pytest.mark.asyncio(scope="session")
 async def test_run_agent_invalid_agent_id(setup_test_data):
    """Test that the run_agent tool returns error for invalid agent ID"""
    # Use test data from fixture
@@ -141,7 +141,7 @@ async def test_run_agent_invalid_agent_id(setup_test_data):
    )


-@pytest.mark.asyncio(loop_scope="session")
+@pytest.mark.asyncio(scope="session")
 async def test_run_agent_with_llm_credentials(setup_llm_test_data):
    """Test that run_agent works with an agent requiring LLM credentials"""
    # Use test data from fixture
@@ -185,7 +185,7 @@ async def test_run_agent_with_llm_credentials(setup_llm_test_data):
    assert result_data["graph_name"] == "LLM Test Agent"


-@pytest.mark.asyncio(loop_scope="session")
+@pytest.mark.asyncio(scope="session")
 async def test_run_agent_shows_available_inputs_when_none_provided(setup_test_data):
    """Test that run_agent returns available inputs when called without inputs or use_defaults."""
    user = setup_test_data["user"]
@@ -219,7 +219,7 @@ async def test_run_agent_shows_available_inputs_when_none_provided(setup_test_da
    assert "inputs" in result_data["message"].lower()


-@pytest.mark.asyncio(loop_scope="session")
+@pytest.mark.asyncio(scope="session")
 async def test_run_agent_with_use_defaults(setup_test_data):
    """Test that run_agent executes successfully with use_defaults=True."""
    user = setup_test_data["user"]
@@ -251,7 +251,7 @@ async def test_run_agent_with_use_defaults(setup_test_data):
    assert result_data["graph_id"] == graph.id


-@pytest.mark.asyncio(loop_scope="session")
+@pytest.mark.asyncio(scope="session")
 async def test_run_agent_missing_credentials(setup_firecrawl_test_data):
    """Test that run_agent returns setup_requirements when credentials are missing."""
    user = setup_firecrawl_test_data["user"]
@@ -285,7 +285,7 @@ async def test_run_agent_missing_credentials(setup_firecrawl_test_data):
    assert len(setup_info["user_readiness"]["missing_credentials"]) > 0


-@pytest.mark.asyncio(loop_scope="session")
+@pytest.mark.asyncio(scope="session")
 async def test_run_agent_invalid_slug_format(setup_test_data):
    """Test that run_agent returns error for invalid slug format (no slash)."""
    user = setup_test_data["user"]
@@ -313,7 +313,7 @@ async def test_run_agent_invalid_slug_format(setup_test_data):
    assert "username/agent-name" in result_data["message"]


-@pytest.mark.asyncio(loop_scope="session")
+@pytest.mark.asyncio(scope="session")
 async def test_run_agent_unauthenticated():
    """Test that run_agent returns need_login for unauthenticated users."""
    tool = RunAgentTool()
@@ -340,7 +340,7 @@ async def test_run_agent_unauthenticated():
    assert "sign in" in result_data["message"].lower()


-@pytest.mark.asyncio(loop_scope="session")
+@pytest.mark.asyncio(scope="session")
 async def test_run_agent_schedule_without_cron(setup_test_data):
    """Test that run_agent returns error when scheduling without cron expression."""
    user = setup_test_data["user"]
@@ -372,7 +372,7 @@ async def test_run_agent_schedule_without_cron(setup_test_data):
    assert "cron" in result_data["message"].lower()


-@pytest.mark.asyncio(loop_scope="session")
+@pytest.mark.asyncio(scope="session")
 async def test_run_agent_schedule_without_name(setup_test_data):
    """Test that run_agent returns error when scheduling without schedule_name."""
    user = setup_test_data["user"]
@@ -402,42 +402,3 @@ async def test_run_agent_schedule_without_name(setup_test_data):
    # Should return error about missing schedule_name
    assert result_data.get("type") == "error"
    assert "schedule_name" in result_data["message"].lower()
-
-
-@pytest.mark.asyncio(loop_scope="session")
-async def test_run_agent_rejects_unknown_input_fields(setup_test_data):
-    """Test that run_agent returns input_validation_error for unknown input fields."""
-    user = setup_test_data["user"]
-    store_submission = setup_test_data["store_submission"]
-
-    tool = RunAgentTool()
-    agent_marketplace_id = f"{user.email.split('@')[0]}/{store_submission.slug}"
-    session = make_session(user_id=user.id)
-
-    # Execute with unknown input field names
-    response = await tool.execute(
-        user_id=user.id,
-        session_id=str(uuid.uuid4()),
-        tool_call_id=str(uuid.uuid4()),
-        username_agent_slug=agent_marketplace_id,
-        inputs={
-            "unknown_field": "some value",
-            "another_unknown": "another value",
-        },
-        session=session,
-    )
-
-    assert response is not None
-    assert hasattr(response, "output")
-    assert isinstance(response.output, str)
-    result_data = orjson.loads(response.output)
-
-    # Should return input_validation_error type with unrecognized fields
-    assert result_data.get("type") == "input_validation_error"
-    assert "unrecognized_fields" in result_data
-    assert set(result_data["unrecognized_fields"]) == {
-        "another_unknown",
-        "unknown_field",
-    }
-    assert "inputs" in result_data  # Contains the valid schema
-    assert "Agent was not executed" in result_data["message"]
--- a/autogpt_platform/backend/backend/api/features/chat/tools/run_block.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/run_block.py
@@ -0,0 +1,305 @@
+"""Tool for executing blocks directly."""
+
+import logging
+from collections import defaultdict
+from typing import Any
+
+from langfuse import observe
+
+from backend.api.features.chat.model import ChatSession
+from backend.data.block import get_block
+from backend.data.execution import ExecutionContext
+from backend.data.model import CredentialsMetaInput
+from backend.integrations.creds_manager import IntegrationCredentialsManager
+from backend.util.exceptions import BlockError
+
+from .base import BaseTool
+from .models import (
+    BlockOutputResponse,
+    ErrorResponse,
+    SetupInfo,
+    SetupRequirementsResponse,
+    ToolResponseBase,
+    UserReadiness,
+)
+from .utils import build_missing_credentials_from_field_info
+
+logger = logging.getLogger(__name__)
+
+
+class RunBlockTool(BaseTool):
+    """Tool for executing a block and returning its outputs."""
+
+    @property
+    def name(self) -> str:
+        return "run_block"
+
+    @property
+    def description(self) -> str:
+        return (
+            "Execute a specific block with the provided input data. "
+            "IMPORTANT: You MUST call find_block first to get the block's 'id' - "
+            "do NOT guess or make up block IDs. "
+            "Use the 'id' from find_block results and provide input_data "
+            "matching the block's required_inputs."
+        )
+
+    @property
+    def parameters(self) -> dict[str, Any]:
+        return {
+            "type": "object",
+            "properties": {
+                "block_id": {
+                    "type": "string",
+                    "description": (
+                        "The block's 'id' field from find_block results. "
+                        "NEVER guess this - always get it from find_block first."
+                    ),
+                },
+                "input_data": {
+                    "type": "object",
+                    "description": (
+                        "Input values for the block. Use the 'required_inputs' field "
+                        "from find_block to see what fields are needed."
+                    ),
+                },
+            },
+            "required": ["block_id", "input_data"],
+        }
+
+    @property
+    def requires_auth(self) -> bool:
+        return True
+
+    async def _check_block_credentials(
+        self,
+        user_id: str,
+        block: Any,
+    ) -> tuple[dict[str, CredentialsMetaInput], list[CredentialsMetaInput]]:
+        """
+        Check if user has required credentials for a block.
+
+        Returns:
+            tuple[matched_credentials, missing_credentials]
+        """
+        matched_credentials: dict[str, CredentialsMetaInput] = {}
+        missing_credentials: list[CredentialsMetaInput] = []
+
+        # Get credential field info from block's input schema
+        credentials_fields_info = block.input_schema.get_credentials_fields_info()
+
+        if not credentials_fields_info:
+            return matched_credentials, missing_credentials
+
+        # Get user's available credentials
+        creds_manager = IntegrationCredentialsManager()
+        available_creds = await creds_manager.store.get_all_creds(user_id)
+
+        for field_name, field_info in credentials_fields_info.items():
+            # field_info.provider is a frozenset of acceptable providers
+            # field_info.supported_types is a frozenset of acceptable types
+            matching_cred = next(
+                (
+                    cred
+                    for cred in available_creds
+                    if cred.provider in field_info.provider
+                    and cred.type in field_info.supported_types
+                ),
+                None,
+            )
+
+            if matching_cred:
+                matched_credentials[field_name] = CredentialsMetaInput(
+                    id=matching_cred.id,
+                    provider=matching_cred.provider,  # type: ignore
+                    type=matching_cred.type,
+                    title=matching_cred.title,
+                )
+            else:
+                # Create a placeholder for the missing credential
+                provider = next(iter(field_info.provider), "unknown")
+                cred_type = next(iter(field_info.supported_types), "api_key")
+                missing_credentials.append(
+                    CredentialsMetaInput(
+                        id=field_name,
+                        provider=provider,  # type: ignore
+                        type=cred_type,  # type: ignore
+                        title=field_name.replace("_", " ").title(),
+                    )
+                )
+
+        return matched_credentials, missing_credentials
+
+    @observe(as_type="tool", name="run_block")
+    async def _execute(
+        self,
+        user_id: str | None,
+        session: ChatSession,
+        **kwargs,
+    ) -> ToolResponseBase:
+        """Execute a block with the given input data.
+
+        Args:
+            user_id: User ID (required)
+            session: Chat session
+            block_id: Block UUID to execute
+            input_data: Input values for the block
+
+        Returns:
+            BlockOutputResponse: Block execution outputs
+            SetupRequirementsResponse: Missing credentials
+            ErrorResponse: Error message
+        """
+        block_id = kwargs.get("block_id", "").strip()
+        input_data = kwargs.get("input_data", {})
+        session_id = session.session_id
+
+        if not block_id:
+            return ErrorResponse(
+                message="Please provide a block_id",
+                session_id=session_id,
+            )
+
+        if not isinstance(input_data, dict):
+            return ErrorResponse(
+                message="input_data must be an object",
+                session_id=session_id,
+            )
+
+        if not user_id:
+            return ErrorResponse(
+                message="Authentication required",
+                session_id=session_id,
+            )
+
+        # Get the block
+        block = get_block(block_id)
+        if not block:
+            return ErrorResponse(
+                message=f"Block '{block_id}' not found",
+                session_id=session_id,
+            )
+
+        logger.info(f"Executing block {block.name} ({block_id}) for user {user_id}")
+
+        # Check credentials
+        creds_manager = IntegrationCredentialsManager()
+        matched_credentials, missing_credentials = await self._check_block_credentials(
+            user_id, block
+        )
+
+        if missing_credentials:
+            # Return setup requirements response with missing credentials
+            credentials_fields_info = block.input_schema.get_credentials_fields_info()
+            missing_creds_dict = build_missing_credentials_from_field_info(
+                credentials_fields_info, set(matched_credentials.keys())
+            )
+            missing_creds_list = list(missing_creds_dict.values())
+
+            return SetupRequirementsResponse(
+                message=(
+                    f"Block '{block.name}' requires credentials that are not configured. "
+                    "Please set up the required credentials before running this block."
+                ),
+                session_id=session_id,
+                setup_info=SetupInfo(
+                    agent_id=block_id,
+                    agent_name=block.name,
+                    user_readiness=UserReadiness(
+                        has_all_credentials=False,
+                        missing_credentials=missing_creds_dict,
+                        ready_to_run=False,
+                    ),
+                    requirements={
+                        "credentials": missing_creds_list,
+                        "inputs": self._get_inputs_list(block),
+                        "execution_modes": ["immediate"],
+                    },
+                ),
+                graph_id=None,
+                graph_version=None,
+            )
+
+        try:
+            # Fetch actual credentials and prepare kwargs for block execution
+            # Create execution context with defaults (blocks may require it)
+            exec_kwargs: dict[str, Any] = {
+                "user_id": user_id,
+                "execution_context": ExecutionContext(),
+            }
+
+            for field_name, cred_meta in matched_credentials.items():
+                # Inject metadata into input_data (for validation)
+                if field_name not in input_data:
+                    input_data[field_name] = cred_meta.model_dump()
+
+                # Fetch actual credentials and pass as kwargs (for execution)
+                actual_credentials = await creds_manager.get(
+                    user_id, cred_meta.id, lock=False
+                )
+                if actual_credentials:
+                    exec_kwargs[field_name] = actual_credentials
+                else:
+                    return ErrorResponse(
+                        message=f"Failed to retrieve credentials for {field_name}",
+                        session_id=session_id,
+                    )
+
+            # Execute the block and collect outputs
+            outputs: dict[str, list[Any]] = defaultdict(list)
+            async for output_name, output_data in block.execute(
+                input_data,
+                **exec_kwargs,
+            ):
+                outputs[output_name].append(output_data)
+
+            return BlockOutputResponse(
+                message=f"Block '{block.name}' executed successfully",
+                block_id=block_id,
+                block_name=block.name,
+                outputs=dict(outputs),
+                success=True,
+                session_id=session_id,
+            )
+
+        except BlockError as e:
+            logger.warning(f"Block execution failed: {e}")
+            return ErrorResponse(
+                message=f"Block execution failed: {e}",
+                error=str(e),
+                session_id=session_id,
+            )
+        except Exception as e:
+            logger.error(f"Unexpected error executing block: {e}", exc_info=True)
+            return ErrorResponse(
+                message=f"Failed to execute block: {str(e)}",
+                error=str(e),
+                session_id=session_id,
+            )
+
+    def _get_inputs_list(self, block: Any) -> list[dict[str, Any]]:
+        """Extract non-credential inputs from block schema."""
+        inputs_list = []
+        schema = block.input_schema.jsonschema()
+        properties = schema.get("properties", {})
+        required_fields = set(schema.get("required", []))
+
+        # Get credential field names to exclude
+        credentials_fields = set(block.input_schema.get_credentials_fields().keys())
+
+        for field_name, field_schema in properties.items():
+            # Skip credential fields
+            if field_name in credentials_fields:
+                continue
+
+            inputs_list.append(
+                {
+                    "name": field_name,
+                    "title": field_schema.get("title", field_name),
+                    "type": field_schema.get("type", "string"),
+                    "description": field_schema.get("description", ""),
+                    "required": field_name in required_fields,
+                }
+            )
+
+        return inputs_list
--- a/autogpt_platform/backend/backend/api/features/chat/tools/search_docs.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/search_docs.py
@@ -3,19 +3,19 @@
 import logging
 from typing import Any

+from langfuse import observe
 from prisma.enums import ContentType

-from backend.copilot.model import ChatSession
-from backend.data.db_accessors import search
-
-from .base import BaseTool
-from .models import (
+from backend.api.features.chat.model import ChatSession
+from backend.api.features.chat.tools.base import BaseTool
+from backend.api.features.chat.tools.models import (
    DocSearchResult,
    DocSearchResultsResponse,
    ErrorResponse,
    NoResultsResponse,
    ToolResponseBase,
 )
+from backend.api.features.store.hybrid_search import unified_hybrid_search

 logger = logging.getLogger(__name__)

@@ -88,6 +88,7 @@ class SearchDocsTool(BaseTool):
        url_path = path.rsplit(".", 1)[0] if "." in path else path
        return f"{DOCS_BASE_URL}/{url_path}"

+    @observe(as_type="tool", name="search_docs")
    async def _execute(
        self,
        user_id: str | None,
@@ -118,7 +119,7 @@ class SearchDocsTool(BaseTool):

        try:
            # Search using hybrid search for DOCUMENTATION content type only
-            results, total = await search().unified_hybrid_search(
+            results, total = await unified_hybrid_search(
                query=query,
                content_types=[ContentType.DOCUMENTATION],
                page=1,
--- a/autogpt_platform/backend/backend/api/features/chat/tools/utils.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/utils.py
@@ -3,18 +3,13 @@
 import logging
 from typing import Any

+from backend.api.features.library import db as library_db
 from backend.api.features.library import model as library_model
-from backend.data.db_accessors import library_db, store_db
+from backend.api.features.store import db as store_db
+from backend.data import graph as graph_db
 from backend.data.graph import GraphModel
-from backend.data.model import (
-    Credentials,
-    CredentialsFieldInfo,
-    CredentialsMetaInput,
-    HostScopedCredentials,
-    OAuth2Credentials,
-)
+from backend.data.model import CredentialsFieldInfo, CredentialsMetaInput
 from backend.integrations.creds_manager import IntegrationCredentialsManager
-from backend.integrations.providers import ProviderName
 from backend.util.exceptions import NotFoundError

 logger = logging.getLogger(__name__)
@@ -38,15 +33,20 @@ async def fetch_graph_from_store_slug(
    Raises:
        DatabaseError: If there's a database error during lookup.
    """
-    sdb = store_db()
    try:
-        store_agent = await sdb.get_store_agent_details(username, agent_name)
+        store_agent = await store_db.get_store_agent_details(username, agent_name)
    except NotFoundError:
        return None, None

    # Get the graph from store listing version
-    graph = await sdb.get_available_graph(
-        store_agent.store_listing_version_id, hide_nodes=False
+    graph_meta = await store_db.get_available_graph(
+        store_agent.store_listing_version_id
+    )
+    graph = await graph_db.get_graph(
+        graph_id=graph_meta.id,
+        version=graph_meta.version,
+        user_id=None,  # Public access
+        include_subgraphs=True,
    )
    return graph, store_agent

@@ -123,7 +123,7 @@ def build_missing_credentials_from_graph(

    return {
        field_key: _serialize_missing_credential(field_key, field_info)
-        for field_key, (field_info, _, _) in aggregated_fields.items()
+        for field_key, (field_info, _node_fields) in aggregated_fields.items()
        if field_key not in matched_keys
    }

@@ -210,13 +210,13 @@ async def get_or_create_library_agent(
    Returns:
        LibraryAgent instance
    """
-    existing = await library_db().get_library_agent_by_graph_id(
+    existing = await library_db.get_library_agent_by_graph_id(
        graph_id=graph.id, user_id=user_id
    )
    if existing:
        return existing

-    library_agents = await library_db().create_library_agent(
+    library_agents = await library_db.create_library_agent(
        graph=graph,
        user_id=user_id,
        create_library_agents_for_sub_graphs=False,
@@ -225,99 +225,6 @@ async def get_or_create_library_agent(
    return library_agents[0]


-async def match_credentials_to_requirements(
-    user_id: str,
-    requirements: dict[str, CredentialsFieldInfo],
-) -> tuple[dict[str, CredentialsMetaInput], list[CredentialsMetaInput]]:
-    """
-    Match user's credentials against a dictionary of credential requirements.
-
-    This is the core matching logic shared by both graph and block credential matching.
-    """
-    matched: dict[str, CredentialsMetaInput] = {}
-    missing: list[CredentialsMetaInput] = []
-
-    if not requirements:
-        return matched, missing
-
-    available_creds = await get_user_credentials(user_id)
-
-    for field_name, field_info in requirements.items():
-        matching_cred = find_matching_credential(available_creds, field_info)
-
-        if matching_cred:
-            try:
-                matched[field_name] = create_credential_meta_from_match(matching_cred)
-            except Exception as e:
-                logger.error(
-                    f"Failed to create CredentialsMetaInput for field '{field_name}': "
-                    f"provider={matching_cred.provider}, type={matching_cred.type}, "
-                    f"credential_id={matching_cred.id}",
-                    exc_info=True,
-                )
-                provider = next(iter(field_info.provider), "unknown")
-                cred_type = next(iter(field_info.supported_types), "api_key")
-                missing.append(
-                    CredentialsMetaInput(
-                        id=field_name,
-                        provider=provider,  # type: ignore
-                        type=cred_type,  # type: ignore
-                        title=f"{field_name} (validation failed: {e})",
-                    )
-                )
-        else:
-            provider = next(iter(field_info.provider), "unknown")
-            cred_type = next(iter(field_info.supported_types), "api_key")
-            missing.append(
-                CredentialsMetaInput(
-                    id=field_name,
-                    provider=provider,  # type: ignore
-                    type=cred_type,  # type: ignore
-                    title=field_name.replace("_", " ").title(),
-                )
-            )
-
-    return matched, missing
-
-
-async def get_user_credentials(user_id: str) -> list[Credentials]:
-    """Get all available credentials for a user."""
-    creds_manager = IntegrationCredentialsManager()
-    return await creds_manager.store.get_all_creds(user_id)
-
-
-def find_matching_credential(
-    available_creds: list[Credentials],
-    field_info: CredentialsFieldInfo,
-) -> Credentials | None:
-    """Find a credential that matches the required provider, type, scopes, and host."""
-    for cred in available_creds:
-        if cred.provider not in field_info.provider:
-            continue
-        if cred.type not in field_info.supported_types:
-            continue
-        if cred.type == "oauth2" and not _credential_has_required_scopes(
-            cred, field_info
-        ):
-            continue
-        if cred.type == "host_scoped" and not _credential_is_for_host(cred, field_info):
-            continue
-        return cred
-    return None
-
-
-def create_credential_meta_from_match(
-    matching_cred: Credentials,
-) -> CredentialsMetaInput:
-    """Create a CredentialsMetaInput from a matched credential."""
-    return CredentialsMetaInput(
-        id=matching_cred.id,
-        provider=matching_cred.provider,  # type: ignore
-        type=matching_cred.type,
-        title=matching_cred.title,
-    )
-
-
 async def match_user_credentials_to_graph(
    user_id: str,
    graph: GraphModel,
@@ -357,28 +264,15 @@ async def match_user_credentials_to_graph(
    # provider is in the set of acceptable providers.
    for credential_field_name, (
        credential_requirements,
-        _,
-        _,
+        _node_fields,
    ) in aggregated_creds.items():
-        # Find first matching credential by provider, type, scopes, and host/URL
+        # Find first matching credential by provider and type
        matching_cred = next(
            (
                cred
                for cred in available_creds
                if cred.provider in credential_requirements.provider
                and cred.type in credential_requirements.supported_types
-                and (
-                    cred.type != "oauth2"
-                    or _credential_has_required_scopes(cred, credential_requirements)
-                )
-                and (
-                    cred.type != "host_scoped"
-                    or _credential_is_for_host(cred, credential_requirements)
-                )
-                and (
-                    cred.provider != ProviderName.MCP
-                    or _credential_is_for_mcp_server(cred, credential_requirements)
-                )
            ),
            None,
        )
@@ -402,17 +296,10 @@ async def match_user_credentials_to_graph(
                    f"{credential_field_name} (validation failed: {e})"
                )
        else:
-            # Build a helpful error message including scope requirements
-            error_parts = [
-                f"provider in {list(credential_requirements.provider)}",
-                f"type in {list(credential_requirements.supported_types)}",
-            ]
-            if credential_requirements.required_scopes:
-                error_parts.append(
-                    f"scopes including {list(credential_requirements.required_scopes)}"
-                )
            missing_creds.append(
-                f"{credential_field_name} (requires {', '.join(error_parts)})"
+                f"{credential_field_name} "
+                f"(requires provider in {list(credential_requirements.provider)}, "
+                f"type in {list(credential_requirements.supported_types)})"
            )

    logger.info(
@@ -422,49 +309,6 @@ async def match_user_credentials_to_graph(
    return graph_credentials_inputs, missing_creds


-def _credential_has_required_scopes(
-    credential: OAuth2Credentials,
-    requirements: CredentialsFieldInfo,
-) -> bool:
-    """Check if an OAuth2 credential has all the scopes required by the input."""
-    # If no scopes are required, any credential matches
-    if not requirements.required_scopes:
-        return True
-    return set(credential.scopes).issuperset(requirements.required_scopes)
-
-
-def _credential_is_for_host(
-    credential: HostScopedCredentials,
-    requirements: CredentialsFieldInfo,
-) -> bool:
-    """Check if a host-scoped credential matches the host required by the input."""
-    # We need to know the host to match host-scoped credentials to.
-    # Graph.aggregate_credentials_inputs() adds the node's set URL value (if any)
-    # to discriminator_values. No discriminator_values -> no host to match against.
-    if not requirements.discriminator_values:
-        return True
-
-    # Check that credential host matches required host.
-    # Host-scoped credential inputs are grouped by host, so any item from the set works.
-    return credential.matches_url(list(requirements.discriminator_values)[0])
-
-
-def _credential_is_for_mcp_server(
-    credential: Credentials,
-    requirements: CredentialsFieldInfo,
-) -> bool:
-    """Check if an MCP OAuth credential matches the required server URL."""
-    if not requirements.discriminator_values:
-        return True
-
-    server_url = (
-        credential.metadata.get("mcp_server_url")
-        if isinstance(credential, OAuth2Credentials)
-        else None
-    )
-    return server_url in requirements.discriminator_values if server_url else False
-
-
 async def check_user_has_required_credentials(
    user_id: str,
    required_credentials: list[CredentialsMetaInput],
--- a/autogpt_platform/backend/backend/api/features/executions/review/model.py
+++ b/autogpt_platform/backend/backend/api/features/executions/review/model.py
@@ -23,7 +23,6 @@ class PendingHumanReviewModel(BaseModel):
        id: Unique identifier for the review record
        user_id: ID of the user who must perform the review
        node_exec_id: ID of the node execution that created this review
-        node_id: ID of the node definition (for grouping reviews from same node)
        graph_exec_id: ID of the graph execution containing the node
        graph_id: ID of the graph template being executed
        graph_version: Version number of the graph template
@@ -38,10 +37,6 @@ class PendingHumanReviewModel(BaseModel):
    """

    node_exec_id: str = Field(description="Node execution ID (primary key)")
-    node_id: str = Field(
-        description="Node definition ID (for grouping)",
-        default="",  # Temporary default for test compatibility
-    )
    user_id: str = Field(description="User ID associated with the review")
    graph_exec_id: str = Field(description="Graph execution ID")
    graph_id: str = Field(description="Graph ID")
@@ -71,9 +66,7 @@ class PendingHumanReviewModel(BaseModel):
    )

    @classmethod
-    def from_db(
-        cls, review: "PendingHumanReview", node_id: str
-    ) -> "PendingHumanReviewModel":
+    def from_db(cls, review: "PendingHumanReview") -> "PendingHumanReviewModel":
        """
        Convert a database model to a response model.

@@ -81,14 +74,9 @@ class PendingHumanReviewModel(BaseModel):
        payload, instructions, and editable flag.

        Handles invalid data gracefully by using safe defaults.
-
-        Args:
-            review: Database review object
-            node_id: Node definition ID (fetched from NodeExecution)
        """
        return cls(
            node_exec_id=review.nodeExecId,
-            node_id=node_id,
            user_id=review.userId,
            graph_exec_id=review.graphExecId,
            graph_id=review.graphId,
@@ -119,13 +107,6 @@ class ReviewItem(BaseModel):
    reviewed_data: SafeJsonData | None = Field(
        None, description="Optional edited data (ignored if approved=False)"
    )
-    auto_approve_future: bool = Field(
-        default=False,
-        description=(
-            "If true and this review is approved, future executions of this same "
-            "block (node) will be automatically approved. This only affects approved reviews."
-        ),
-    )

    @field_validator("reviewed_data")
    @classmethod
@@ -193,9 +174,6 @@ class ReviewRequest(BaseModel):
    This request must include ALL pending reviews for a graph execution.
    Each review will be either approved (with optional data modifications)
    or rejected (data ignored). The execution will resume only after ALL reviews are processed.
-
-    Each review item can individually specify whether to auto-approve future executions
-    of the same block via the `auto_approve_future` field on ReviewItem.
    """

    reviews: List[ReviewItem] = Field(
--- a/autogpt_platform/backend/backend/api/features/executions/review/review_routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/executions/review/review_routes_test.py
--- a/autogpt_platform/backend/backend/api/features/executions/review/routes.py
+++ b/autogpt_platform/backend/backend/api/features/executions/review/routes.py
@@ -1,27 +1,17 @@
-import asyncio
 import logging
-from typing import Any, List
+from typing import List

 import autogpt_libs.auth as autogpt_auth_lib
 from fastapi import APIRouter, HTTPException, Query, Security, status
 from prisma.enums import ReviewStatus

-from backend.data.execution import (
-    ExecutionContext,
-    ExecutionStatus,
-    get_graph_execution_meta,
-)
-from backend.data.graph import get_graph_settings
+from backend.data.execution import get_graph_execution_meta
 from backend.data.human_review import (
-    create_auto_approval_record,
    get_pending_reviews_for_execution,
    get_pending_reviews_for_user,
-    get_reviews_by_node_exec_ids,
    has_pending_reviews_for_graph_exec,
    process_all_reviews_for_execution,
 )
-from backend.data.model import USER_TIMEZONE_NOT_SET
-from backend.data.user import get_user_by_id
 from backend.executor.utils import add_graph_execution

 from .model import PendingHumanReviewModel, ReviewRequest, ReviewResponse
@@ -137,70 +127,17 @@ async def process_review_action(
            detail="At least one review must be provided",
        )

-    # Batch fetch all requested reviews (regardless of status for idempotent handling)
-    reviews_map = await get_reviews_by_node_exec_ids(
-        list(all_request_node_ids), user_id
-    )
-
-    # Validate all reviews were found (must exist, any status is OK for now)
-    missing_ids = all_request_node_ids - set(reviews_map.keys())
-    if missing_ids:
-        raise HTTPException(
-            status_code=status.HTTP_404_NOT_FOUND,
-            detail=f"Review(s) not found: {', '.join(missing_ids)}",
-        )
-
-    # Validate all reviews belong to the same execution
-    graph_exec_ids = {review.graph_exec_id for review in reviews_map.values()}
-    if len(graph_exec_ids) > 1:
-        raise HTTPException(
-            status_code=status.HTTP_409_CONFLICT,
-            detail="All reviews in a single request must belong to the same execution.",
-        )
-
-    graph_exec_id = next(iter(graph_exec_ids))
-
-    # Validate execution status before processing reviews
-    graph_exec_meta = await get_graph_execution_meta(
-        user_id=user_id, execution_id=graph_exec_id
-    )
-
-    if not graph_exec_meta:
-        raise HTTPException(
-            status_code=status.HTTP_404_NOT_FOUND,
-            detail=f"Graph execution #{graph_exec_id} not found",
-        )
-
-    # Only allow processing reviews if execution is paused for review
-    # or incomplete (partial execution with some reviews already processed)
-    if graph_exec_meta.status not in (
-        ExecutionStatus.REVIEW,
-        ExecutionStatus.INCOMPLETE,
-    ):
-        raise HTTPException(
-            status_code=status.HTTP_409_CONFLICT,
-            detail=f"Cannot process reviews while execution status is {graph_exec_meta.status}. "
-            f"Reviews can only be processed when execution is paused (REVIEW status). "
-            f"Current status: {graph_exec_meta.status}",
-        )
-
-    # Build review decisions map and track which reviews requested auto-approval
-    # Auto-approved reviews use original data (no modifications allowed)
+    # Build review decisions map
    review_decisions = {}
-    auto_approve_requests = {}  # Map node_exec_id -> auto_approve_future flag
-
    for review in request.reviews:
        review_status = (
            ReviewStatus.APPROVED if review.approved else ReviewStatus.REJECTED
        )
-        # If this review requested auto-approval, don't allow data modifications
-        reviewed_data = None if review.auto_approve_future else review.reviewed_data
        review_decisions[review.node_exec_id] = (
            review_status,
-            reviewed_data,
+            review.reviewed_data,
            review.message,
        )
-        auto_approve_requests[review.node_exec_id] = review.auto_approve_future

    # Process all reviews
    updated_reviews = await process_all_reviews_for_execution(
@@ -208,87 +145,6 @@ async def process_review_action(
        review_decisions=review_decisions,
    )

-    # Create auto-approval records for approved reviews that requested it
-    # Deduplicate by node_id to avoid race conditions when multiple reviews
-    # for the same node are processed in parallel
-    async def create_auto_approval_for_node(
-        node_id: str, review_result
-    ) -> tuple[str, bool]:
-        """
-        Create auto-approval record for a node.
-        Returns (node_id, success) tuple for tracking failures.
-        """
-        try:
-            await create_auto_approval_record(
-                user_id=user_id,
-                graph_exec_id=review_result.graph_exec_id,
-                graph_id=review_result.graph_id,
-                graph_version=review_result.graph_version,
-                node_id=node_id,
-                payload=review_result.payload,
-            )
-            return (node_id, True)
-        except Exception as e:
-            logger.error(
-                f"Failed to create auto-approval record for node {node_id}",
-                exc_info=e,
-            )
-            return (node_id, False)
-
-    # Collect node_exec_ids that need auto-approval
-    node_exec_ids_needing_auto_approval = [
-        node_exec_id
-        for node_exec_id, review_result in updated_reviews.items()
-        if review_result.status == ReviewStatus.APPROVED
-        and auto_approve_requests.get(node_exec_id, False)
-    ]
-
-    # Batch-fetch node executions to get node_ids
-    nodes_needing_auto_approval: dict[str, Any] = {}
-    if node_exec_ids_needing_auto_approval:
-        from backend.data.execution import get_node_executions
-
-        node_execs = await get_node_executions(
-            graph_exec_id=graph_exec_id, include_exec_data=False
-        )
-        node_exec_map = {node_exec.node_exec_id: node_exec for node_exec in node_execs}
-
-        for node_exec_id in node_exec_ids_needing_auto_approval:
-            node_exec = node_exec_map.get(node_exec_id)
-            if node_exec:
-                review_result = updated_reviews[node_exec_id]
-                # Use the first approved review for this node (deduplicate by node_id)
-                if node_exec.node_id not in nodes_needing_auto_approval:
-                    nodes_needing_auto_approval[node_exec.node_id] = review_result
-            else:
-                logger.error(
-                    f"Failed to create auto-approval record for {node_exec_id}: "
-                    f"Node execution not found. This may indicate a race condition "
-                    f"or data inconsistency."
-                )
-
-    # Execute all auto-approval creations in parallel (deduplicated by node_id)
-    auto_approval_results = await asyncio.gather(
-        *[
-            create_auto_approval_for_node(node_id, review_result)
-            for node_id, review_result in nodes_needing_auto_approval.items()
-        ],
-        return_exceptions=True,
-    )
-
-    # Count auto-approval failures
-    auto_approval_failed_count = 0
-    for result in auto_approval_results:
-        if isinstance(result, Exception):
-            # Unexpected exception during auto-approval creation
-            auto_approval_failed_count += 1
-            logger.error(
-                f"Unexpected exception during auto-approval creation: {result}"
-            )
-        elif isinstance(result, tuple) and len(result) == 2 and not result[1]:
-            # Auto-approval creation failed (returned False)
-            auto_approval_failed_count += 1
-
    # Count results
    approved_count = sum(
        1
@@ -301,53 +157,30 @@ async def process_review_action(
        if review.status == ReviewStatus.REJECTED
    )

-    # Resume execution only if ALL pending reviews for this execution have been processed
+    # Resume execution if we processed some reviews
    if updated_reviews:
+        # Get graph execution ID from any processed review
+        first_review = next(iter(updated_reviews.values()))
+        graph_exec_id = first_review.graph_exec_id
+
+        # Check if any pending reviews remain for this execution
        still_has_pending = await has_pending_reviews_for_graph_exec(graph_exec_id)

        if not still_has_pending:
-            # Get the graph_id from any processed review
-            first_review = next(iter(updated_reviews.values()))
-
+            # Resume execution
            try:
-                # Fetch user and settings to build complete execution context
-                user = await get_user_by_id(user_id)
-                settings = await get_graph_settings(
-                    user_id=user_id, graph_id=first_review.graph_id
-                )
-
-                # Preserve user's timezone preference when resuming execution
-                user_timezone = (
-                    user.timezone if user.timezone != USER_TIMEZONE_NOT_SET else "UTC"
-                )
-
-                execution_context = ExecutionContext(
-                    human_in_the_loop_safe_mode=settings.human_in_the_loop_safe_mode,
-                    sensitive_action_safe_mode=settings.sensitive_action_safe_mode,
-                    user_timezone=user_timezone,
-                )
-
                await add_graph_execution(
                    graph_id=first_review.graph_id,
                    user_id=user_id,
                    graph_exec_id=graph_exec_id,
-                    execution_context=execution_context,
                )
                logger.info(f"Resumed execution {graph_exec_id}")
            except Exception as e:
                logger.error(f"Failed to resume execution {graph_exec_id}: {str(e)}")

-    # Build error message if auto-approvals failed
-    error_message = None
-    if auto_approval_failed_count > 0:
-        error_message = (
-            f"{auto_approval_failed_count} auto-approval setting(s) could not be saved. "
-            f"You may need to manually approve these reviews in future executions."
-        )
-
    return ReviewResponse(
        approved_count=approved_count,
        rejected_count=rejected_count,
-        failed_count=auto_approval_failed_count,
-        error=error_message,
+        failed_count=0,
+        error=None,
    )
--- a/autogpt_platform/backend/backend/api/features/integrations/router.py
+++ b/autogpt_platform/backend/backend/api/features/integrations/router.py
@@ -1,7 +1,7 @@
 import asyncio
 import logging
 from datetime import datetime, timedelta, timezone
-from typing import TYPE_CHECKING, Annotated, Any, List, Literal
+from typing import TYPE_CHECKING, Annotated, List, Literal

 from autogpt_libs.auth import get_user_id
 from fastapi import (
@@ -14,7 +14,7 @@ from fastapi import (
    Security,
    status,
 )
-from pydantic import BaseModel, Field, SecretStr, model_validator
+from pydantic import BaseModel, Field, SecretStr
 from starlette.status import HTTP_500_INTERNAL_SERVER_ERROR, HTTP_502_BAD_GATEWAY

 from backend.api.features.library.db import set_preset_webhook, update_preset
@@ -39,11 +39,7 @@ from backend.data.onboarding import OnboardingStep, complete_onboarding_step
 from backend.data.user import get_user_integrations
 from backend.executor.utils import add_graph_execution
 from backend.integrations.ayrshare import AyrshareClient, SocialPlatform
-from backend.integrations.credentials_store import provider_matches
-from backend.integrations.creds_manager import (
-    IntegrationCredentialsManager,
-    create_mcp_oauth_handler,
-)
+from backend.integrations.creds_manager import IntegrationCredentialsManager
 from backend.integrations.oauth import CREDENTIALS_BY_PROVIDER, HANDLERS_BY_NAME
 from backend.integrations.providers import ProviderName
 from backend.integrations.webhooks import get_webhook_manager
@@ -106,37 +102,9 @@ class CredentialsMetaResponse(BaseModel):
    scopes: list[str] | None
    username: str | None
    host: str | None = Field(
-        default=None,
-        description="Host pattern for host-scoped or MCP server URL for MCP credentials",
+        default=None, description="Host pattern for host-scoped credentials"
    )

-    @model_validator(mode="before")
-    @classmethod
-    def _normalize_provider(cls, data: Any) -> Any:
-        """Fix ``ProviderName.X`` format from Python 3.13 ``str(Enum)`` bug."""
-        if isinstance(data, dict):
-            prov = data.get("provider", "")
-            if isinstance(prov, str) and prov.startswith("ProviderName."):
-                member = prov.removeprefix("ProviderName.")
-                try:
-                    data = {**data, "provider": ProviderName[member].value}
-                except KeyError:
-                    pass
-        return data
-
-    @staticmethod
-    def get_host(cred: Credentials) -> str | None:
-        """Extract host from credential: HostScoped host or MCP server URL."""
-        if isinstance(cred, HostScopedCredentials):
-            return cred.host
-        if isinstance(cred, OAuth2Credentials) and cred.provider in (
-            ProviderName.MCP,
-            ProviderName.MCP.value,
-            "ProviderName.MCP",
-        ):
-            return (cred.metadata or {}).get("mcp_server_url")
-        return None
-

@router.post("/{provider}/callback", summary="Exchange OAuth code for tokens")
 async def callback(
@@ -211,7 +179,9 @@ async def callback(
        title=credentials.title,
        scopes=credentials.scopes,
        username=credentials.username,
-        host=(CredentialsMetaResponse.get_host(credentials)),
+        host=(
+            credentials.host if isinstance(credentials, HostScopedCredentials) else None
+        ),
    )


@@ -229,7 +199,7 @@ async def list_credentials(
            title=cred.title,
            scopes=cred.scopes if isinstance(cred, OAuth2Credentials) else None,
            username=cred.username if isinstance(cred, OAuth2Credentials) else None,
-            host=CredentialsMetaResponse.get_host(cred),
+            host=cred.host if isinstance(cred, HostScopedCredentials) else None,
        )
        for cred in credentials
    ]
@@ -252,7 +222,7 @@ async def list_credentials_by_provider(
            title=cred.title,
            scopes=cred.scopes if isinstance(cred, OAuth2Credentials) else None,
            username=cred.username if isinstance(cred, OAuth2Credentials) else None,
-            host=CredentialsMetaResponse.get_host(cred),
+            host=cred.host if isinstance(cred, HostScopedCredentials) else None,
        )
        for cred in credentials
    ]
@@ -352,11 +322,7 @@ async def delete_credentials(

    tokens_revoked = None
    if isinstance(creds, OAuth2Credentials):
-        if provider_matches(provider.value, ProviderName.MCP.value):
-            # MCP uses dynamic per-server OAuth — create handler from metadata
-            handler = create_mcp_oauth_handler(creds)
-        else:
-            handler = _get_provider_oauth_handler(request, provider)
+        handler = _get_provider_oauth_handler(request, provider)
        tokens_revoked = await handler.revoke_tokens(creds)

    return CredentialsDeletionResponse(revoked=tokens_revoked)
--- a/autogpt_platform/backend/backend/api/features/library/db.py
+++ b/autogpt_platform/backend/backend/api/features/library/db.py
@@ -12,18 +12,16 @@ import backend.api.features.store.image_gen as store_image_gen
 import backend.api.features.store.media as store_media
 import backend.data.graph as graph_db
 import backend.data.integrations as integrations_db
+from backend.data.block import BlockInput
 from backend.data.db import transaction
 from backend.data.execution import get_graph_execution
 from backend.data.graph import GraphSettings
 from backend.data.includes import AGENT_PRESET_INCLUDE, library_agent_include
-from backend.data.model import CredentialsMetaInput, GraphInput
+from backend.data.model import CredentialsMetaInput
 from backend.integrations.creds_manager import IntegrationCredentialsManager
-from backend.integrations.webhooks.graph_lifecycle_hooks import (
-    on_graph_activate,
-    on_graph_deactivate,
-)
+from backend.integrations.webhooks.graph_lifecycle_hooks import on_graph_activate
 from backend.util.clients import get_scheduler_client
-from backend.util.exceptions import DatabaseError, InvalidInputError, NotFoundError
+from backend.util.exceptions import DatabaseError, NotFoundError
 from backend.util.json import SafeJson
 from backend.util.models import Pagination
 from backend.util.settings import Config
@@ -41,7 +39,6 @@ async def list_library_agents(
    sort_by: library_model.LibraryAgentSort = library_model.LibraryAgentSort.UPDATED_AT,
    page: int = 1,
    page_size: int = 50,
-    include_executions: bool = False,
 ) -> library_model.LibraryAgentResponse:
    """
    Retrieves a paginated list of LibraryAgent records for a given user.
@@ -52,9 +49,6 @@ async def list_library_agents(
        sort_by: Sorting field (createdAt, updatedAt, isFavorite, isCreatedByUser).
        page: Current page (1-indexed).
        page_size: Number of items per page.
-        include_executions: Whether to include execution data for status calculation.
-            Defaults to False for performance (UI fetches status separately).
-            Set to True when accurate status/metrics are needed (e.g., agent generator).

    Returns:
        A LibraryAgentResponse containing the list of agents and pagination details.
@@ -70,11 +64,11 @@ async def list_library_agents(

    if page < 1 or page_size < 1:
        logger.warning(f"Invalid pagination: page={page}, page_size={page_size}")
-        raise InvalidInputError("Invalid pagination input")
+        raise DatabaseError("Invalid pagination input")

    if search_term and len(search_term.strip()) > 100:
        logger.warning(f"Search term too long: {repr(search_term)}")
-        raise InvalidInputError("Search term is too long")
+        raise DatabaseError("Search term is too long")

    where_clause: prisma.types.LibraryAgentWhereInput = {
        "userId": user_id,
@@ -82,6 +76,7 @@ async def list_library_agents(
        "isArchived": False,
    }

+    # Build search filter if applicable
    if search_term:
        where_clause["OR"] = [
            {
@@ -98,6 +93,7 @@ async def list_library_agents(
            },
        ]

+    # Determine sorting
    order_by: prisma.types.LibraryAgentOrderByInput | None = None

    if sort_by == library_model.LibraryAgentSort.CREATED_AT:
@@ -109,7 +105,7 @@ async def list_library_agents(
        library_agents = await prisma.models.LibraryAgent.prisma().find_many(
            where=where_clause,
            include=library_agent_include(
-                user_id, include_nodes=False, include_executions=include_executions
+                user_id, include_nodes=False, include_executions=False
            ),
            order=order_by,
            skip=(page - 1) * page_size,
@@ -179,7 +175,7 @@ async def list_favorite_library_agents(

    if page < 1 or page_size < 1:
        logger.warning(f"Invalid pagination: page={page}, page_size={page_size}")
-        raise InvalidInputError("Invalid pagination input")
+        raise DatabaseError("Invalid pagination input")

    where_clause: prisma.types.LibraryAgentWhereInput = {
        "userId": user_id,
@@ -373,7 +369,7 @@ async def get_library_agent_by_graph_id(


 async def add_generated_agent_image(
-    graph: graph_db.GraphBaseMeta,
+    graph: graph_db.BaseGraph,
    user_id: str,
    library_agent_id: str,
 ) -> Optional[prisma.models.LibraryAgent]:
@@ -539,92 +535,6 @@ async def update_agent_version_in_library(
    return library_model.LibraryAgent.from_db(lib)


-async def create_graph_in_library(
-    graph: graph_db.Graph,
-    user_id: str,
-) -> tuple[graph_db.GraphModel, library_model.LibraryAgent]:
-    """Create a new graph and add it to the user's library."""
-    graph.version = 1
-    graph_model = graph_db.make_graph_model(graph, user_id)
-    graph_model.reassign_ids(user_id=user_id, reassign_graph_id=True)
-
-    created_graph = await graph_db.create_graph(graph_model, user_id)
-
-    library_agents = await create_library_agent(
-        graph=created_graph,
-        user_id=user_id,
-        sensitive_action_safe_mode=True,
-        create_library_agents_for_sub_graphs=False,
-    )
-
-    if created_graph.is_active:
-        created_graph = await on_graph_activate(created_graph, user_id=user_id)
-
-    return created_graph, library_agents[0]
-
-
-async def update_graph_in_library(
-    graph: graph_db.Graph,
-    user_id: str,
-) -> tuple[graph_db.GraphModel, library_model.LibraryAgent]:
-    """Create a new version of an existing graph and update the library entry."""
-    existing_versions = await graph_db.get_graph_all_versions(graph.id, user_id)
-    current_active_version = (
-        next((v for v in existing_versions if v.is_active), None)
-        if existing_versions
-        else None
-    )
-    graph.version = (
-        max(v.version for v in existing_versions) + 1 if existing_versions else 1
-    )
-
-    graph_model = graph_db.make_graph_model(graph, user_id)
-    graph_model.reassign_ids(user_id=user_id, reassign_graph_id=False)
-
-    created_graph = await graph_db.create_graph(graph_model, user_id)
-
-    library_agent = await get_library_agent_by_graph_id(user_id, created_graph.id)
-    if not library_agent:
-        raise NotFoundError(f"Library agent not found for graph {created_graph.id}")
-
-    library_agent = await update_library_agent_version_and_settings(
-        user_id, created_graph
-    )
-
-    if created_graph.is_active:
-        created_graph = await on_graph_activate(created_graph, user_id=user_id)
-        await graph_db.set_graph_active_version(
-            graph_id=created_graph.id,
-            version=created_graph.version,
-            user_id=user_id,
-        )
-        if current_active_version:
-            await on_graph_deactivate(current_active_version, user_id=user_id)
-
-    return created_graph, library_agent
-
-
-async def update_library_agent_version_and_settings(
-    user_id: str, agent_graph: graph_db.GraphModel
-) -> library_model.LibraryAgent:
-    """Update library agent to point to new graph version and sync settings."""
-    library = await update_agent_version_in_library(
-        user_id, agent_graph.id, agent_graph.version
-    )
-    updated_settings = GraphSettings.from_graph(
-        graph=agent_graph,
-        hitl_safe_mode=library.settings.human_in_the_loop_safe_mode,
-        sensitive_action_safe_mode=library.settings.sensitive_action_safe_mode,
-    )
-    if updated_settings != library.settings:
-        library = await update_library_agent(
-            library_agent_id=library.id,
-            user_id=user_id,
-            settings=updated_settings,
-        )
-    return library
-
-
 async def update_library_agent(
    library_agent_id: str,
    user_id: str,
@@ -673,13 +583,7 @@ async def update_library_agent(
            )
        update_fields["isDeleted"] = is_deleted
    if settings is not None:
-        existing_agent = await get_library_agent(id=library_agent_id, user_id=user_id)
-        current_settings_dict = (
-            existing_agent.settings.model_dump() if existing_agent.settings else {}
-        )
-        new_settings = settings.model_dump(exclude_unset=True)
-        merged_settings = {**current_settings_dict, **new_settings}
-        update_fields["settings"] = SafeJson(merged_settings)
+        update_fields["settings"] = SafeJson(settings.model_dump())

    try:
        # If graph_version is provided, update to that specific version
@@ -1129,7 +1033,7 @@ async def create_preset_from_graph_execution(
 async def update_preset(
    user_id: str,
    preset_id: str,
-    inputs: Optional[GraphInput] = None,
+    inputs: Optional[BlockInput] = None,
    credentials: Optional[dict[str, CredentialsMetaInput]] = None,
    name: Optional[str] = None,
    description: Optional[str] = None,
--- a/autogpt_platform/backend/backend/api/features/library/model.py
+++ b/autogpt_platform/backend/backend/api/features/library/model.py
@@ -6,13 +6,9 @@ import prisma.enums
 import prisma.models
 import pydantic

+from backend.data.block import BlockInput
 from backend.data.graph import GraphModel, GraphSettings, GraphTriggerInfo
-from backend.data.model import (
-    CredentialsMetaInput,
-    GraphInput,
-    is_credentials_field_name,
-)
-from backend.util.json import loads as json_loads
+from backend.data.model import CredentialsMetaInput, is_credentials_field_name
 from backend.util.models import Pagination

 if TYPE_CHECKING:
@@ -20,10 +16,10 @@ if TYPE_CHECKING:


 class LibraryAgentStatus(str, Enum):
-    COMPLETED = "COMPLETED"
-    HEALTHY = "HEALTHY"
-    WAITING = "WAITING"
-    ERROR = "ERROR"
+    COMPLETED = "COMPLETED"  # All runs completed
+    HEALTHY = "HEALTHY"  # Agent is running (not all runs have completed)
+    WAITING = "WAITING"  # Agent is queued or waiting to start
+    ERROR = "ERROR"  # Agent is in an error state


 class MarketplaceListingCreator(pydantic.BaseModel):
@@ -43,30 +39,6 @@ class MarketplaceListing(pydantic.BaseModel):
    creator: MarketplaceListingCreator


-class RecentExecution(pydantic.BaseModel):
-    """Summary of a recent execution for quality assessment.
-
-    Used by the LLM to understand the agent's recent performance with specific examples
-    rather than just aggregate statistics.
-    """
-
-    status: str
-    correctness_score: float | None = None
-    activity_summary: str | None = None
-
-
-def _parse_settings(settings: dict | str | None) -> GraphSettings:
-    """Parse settings from database, handling both dict and string formats."""
-    if settings is None:
-        return GraphSettings()
-    try:
-        if isinstance(settings, str):
-            settings = json_loads(settings)
-        return GraphSettings.model_validate(settings)
-    except Exception:
-        return GraphSettings()
-
-
 class LibraryAgent(pydantic.BaseModel):
    """
    Represents an agent in the library, including metadata for display and
@@ -76,7 +48,7 @@ class LibraryAgent(pydantic.BaseModel):
    id: str
    graph_id: str
    graph_version: int
-    owner_user_id: str
+    owner_user_id: str  # ID of user who owns/created this agent graph

    image_url: str | None

@@ -92,7 +64,7 @@ class LibraryAgent(pydantic.BaseModel):
    description: str
    instructions: str | None = None

-    input_schema: dict[str, Any]
+    input_schema: dict[str, Any]  # Should be BlockIOObjectSubSchema in frontend
    output_schema: dict[str, Any]
    credentials_input_schema: dict[str, Any] | None = pydantic.Field(
        description="Input schema for credentials required by the agent",
@@ -109,19 +81,25 @@ class LibraryAgent(pydantic.BaseModel):
    )
    trigger_setup_info: Optional[GraphTriggerInfo] = None

+    # Indicates whether there's a new output (based on recent runs)
    new_output: bool
-    execution_count: int = 0
-    success_rate: float | None = None
-    avg_correctness_score: float | None = None
-    recent_executions: list[RecentExecution] = pydantic.Field(
-        default_factory=list,
-        description="List of recent executions with status, score, and summary",
-    )
+
+    # Whether the user can access the underlying graph
    can_access_graph: bool
+
+    # Indicates if this agent is the latest version
    is_latest_version: bool
+
+    # Whether the agent is marked as favorite by the user
    is_favorite: bool
+
+    # Recommended schedule cron (from marketplace agents)
    recommended_schedule_cron: str | None = None
+
+    # User-specific settings for this library agent
    settings: GraphSettings = pydantic.Field(default_factory=GraphSettings)
+
+    # Marketplace listing information if the agent has been published
    marketplace_listing: Optional["MarketplaceListing"] = None

    @staticmethod
@@ -145,6 +123,7 @@ class LibraryAgent(pydantic.BaseModel):
        agent_updated_at = agent.AgentGraph.updatedAt
        lib_agent_updated_at = agent.updatedAt

+        # Compute updated_at as the latest between library agent and graph
        updated_at = (
            max(agent_updated_at, lib_agent_updated_at)
            if agent_updated_at
@@ -157,6 +136,7 @@ class LibraryAgent(pydantic.BaseModel):
            creator_name = agent.Creator.name or "Unknown"
            creator_image_url = agent.Creator.avatarUrl or ""

+        # Logic to calculate status and new_output
        week_ago = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(
            days=7
        )
@@ -165,55 +145,13 @@ class LibraryAgent(pydantic.BaseModel):
        status = status_result.status
        new_output = status_result.new_output

-        execution_count = len(executions)
-        success_rate: float | None = None
-        avg_correctness_score: float | None = None
-        if execution_count > 0:
-            success_count = sum(
-                1
-                for e in executions
-                if e.executionStatus == prisma.enums.AgentExecutionStatus.COMPLETED
-            )
-            success_rate = (success_count / execution_count) * 100
-
-            correctness_scores = []
-            for e in executions:
-                if e.stats and isinstance(e.stats, dict):
-                    score = e.stats.get("correctness_score")
-                    if score is not None and isinstance(score, (int, float)):
-                        correctness_scores.append(float(score))
-            if correctness_scores:
-                avg_correctness_score = sum(correctness_scores) / len(
-                    correctness_scores
-                )
-
-        recent_executions: list[RecentExecution] = []
-        for e in executions:
-            exec_score: float | None = None
-            exec_summary: str | None = None
-            if e.stats and isinstance(e.stats, dict):
-                score = e.stats.get("correctness_score")
-                if score is not None and isinstance(score, (int, float)):
-                    exec_score = float(score)
-                summary = e.stats.get("activity_status")
-                if summary is not None and isinstance(summary, str):
-                    exec_summary = summary
-            exec_status = (
-                e.executionStatus.value
-                if hasattr(e.executionStatus, "value")
-                else str(e.executionStatus)
-            )
-            recent_executions.append(
-                RecentExecution(
-                    status=exec_status,
-                    correctness_score=exec_score,
-                    activity_summary=exec_summary,
-                )
-            )
-
+        # Check if user can access the graph
        can_access_graph = agent.AgentGraph.userId == agent.userId
+
+        # Hard-coded to True until a method to check is implemented
        is_latest_version = True

+        # Build marketplace_listing if available
        marketplace_listing_data = None
        if store_listing and store_listing.ActiveVersion and profile:
            creator_data = MarketplaceListingCreator(
@@ -252,15 +190,11 @@ class LibraryAgent(pydantic.BaseModel):
            has_sensitive_action=graph.has_sensitive_action,
            trigger_setup_info=graph.trigger_setup_info,
            new_output=new_output,
-            execution_count=execution_count,
-            success_rate=success_rate,
-            avg_correctness_score=avg_correctness_score,
-            recent_executions=recent_executions,
            can_access_graph=can_access_graph,
            is_latest_version=is_latest_version,
            is_favorite=agent.isFavorite,
            recommended_schedule_cron=agent.AgentGraph.recommendedScheduleCron,
-            settings=_parse_settings(agent.settings),
+            settings=GraphSettings.model_validate(agent.settings),
            marketplace_listing=marketplace_listing_data,
        )

@@ -286,15 +220,18 @@ def _calculate_agent_status(
    if not executions:
        return AgentStatusResult(status=LibraryAgentStatus.COMPLETED, new_output=False)

+    # Track how many times each execution status appears
    status_counts = {status: 0 for status in prisma.enums.AgentExecutionStatus}
    new_output = False

    for execution in executions:
+        # Check if there's a completed run more recent than `recent_threshold`
        if execution.createdAt >= recent_threshold:
            if execution.executionStatus == prisma.enums.AgentExecutionStatus.COMPLETED:
                new_output = True
        status_counts[execution.executionStatus] += 1

+    # Determine the final status based on counts
    if status_counts[prisma.enums.AgentExecutionStatus.FAILED] > 0:
        return AgentStatusResult(status=LibraryAgentStatus.ERROR, new_output=new_output)
    elif status_counts[prisma.enums.AgentExecutionStatus.QUEUED] > 0:
@@ -326,7 +263,7 @@ class LibraryAgentPresetCreatable(pydantic.BaseModel):
    graph_id: str
    graph_version: int

-    inputs: GraphInput
+    inputs: BlockInput
    credentials: dict[str, CredentialsMetaInput]

    name: str
@@ -355,7 +292,7 @@ class LibraryAgentPresetUpdatable(pydantic.BaseModel):
    Request model used when updating a preset for a library agent.
    """

-    inputs: Optional[GraphInput] = None
+    inputs: Optional[BlockInput] = None
    credentials: Optional[dict[str, CredentialsMetaInput]] = None

    name: Optional[str] = None
@@ -398,7 +335,7 @@ class LibraryAgentPreset(LibraryAgentPresetCreatable):
                "Webhook must be included in AgentPreset query when webhookId is set"
            )

-        input_data: GraphInput = {}
+        input_data: BlockInput = {}
        input_credentials: dict[str, CredentialsMetaInput] = {}

        for preset_input in preset.InputPresets:
--- a/autogpt_platform/backend/backend/api/features/library/routes/agents.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes/agents.py
@@ -1,3 +1,4 @@
+import logging
 from typing import Literal, Optional

 import autogpt_libs.auth as autogpt_auth_lib
@@ -5,11 +6,15 @@ from fastapi import APIRouter, Body, HTTPException, Query, Security, status
 from fastapi.responses import Response
 from prisma.enums import OnboardingStep

+import backend.api.features.store.exceptions as store_exceptions
 from backend.data.onboarding import complete_onboarding_step
+from backend.util.exceptions import DatabaseError, NotFoundError

 from .. import db as library_db
 from .. import model as library_model

+logger = logging.getLogger(__name__)
+
 router = APIRouter(
    prefix="/agents",
    tags=["library", "private"],
@@ -21,6 +26,10 @@ router = APIRouter(
    "",
    summary="List Library Agents",
    response_model=library_model.LibraryAgentResponse,
+    responses={
+        200: {"description": "List of library agents"},
+        500: {"description": "Server error", "content": {"application/json": {}}},
+    },
 )
 async def list_library_agents(
    user_id: str = Security(autogpt_auth_lib.get_user_id),
@@ -44,19 +53,43 @@ async def list_library_agents(
 ) -> library_model.LibraryAgentResponse:
    """
    Get all agents in the user's library (both created and saved).
+
+    Args:
+        user_id: ID of the authenticated user.
+        search_term: Optional search term to filter agents by name/description.
+        filter_by: List of filters to apply (favorites, created by user).
+        sort_by: List of sorting criteria (created date, updated date).
+        page: Page number to retrieve.
+        page_size: Number of agents per page.
+
+    Returns:
+        A LibraryAgentResponse containing agents and pagination metadata.
+
+    Raises:
+        HTTPException: If a server/database error occurs.
    """
-    return await library_db.list_library_agents(
-        user_id=user_id,
-        search_term=search_term,
-        sort_by=sort_by,
-        page=page,
-        page_size=page_size,
-    )
+    try:
+        return await library_db.list_library_agents(
+            user_id=user_id,
+            search_term=search_term,
+            sort_by=sort_by,
+            page=page,
+            page_size=page_size,
+        )
+    except Exception as e:
+        logger.error(f"Could not list library agents for user #{user_id}: {e}")
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail=str(e),
+        ) from e


@router.get(
    "/favorites",
    summary="List Favorite Library Agents",
+    responses={
+        500: {"description": "Server error", "content": {"application/json": {}}},
+    },
 )
 async def list_favorite_library_agents(
    user_id: str = Security(autogpt_auth_lib.get_user_id),
@@ -73,12 +106,30 @@ async def list_favorite_library_agents(
 ) -> library_model.LibraryAgentResponse:
    """
    Get all favorite agents in the user's library.
+
+    Args:
+        user_id: ID of the authenticated user.
+        page: Page number to retrieve.
+        page_size: Number of agents per page.
+
+    Returns:
+        A LibraryAgentResponse containing favorite agents and pagination metadata.
+
+    Raises:
+        HTTPException: If a server/database error occurs.
    """
-    return await library_db.list_favorite_library_agents(
-        user_id=user_id,
-        page=page,
-        page_size=page_size,
-    )
+    try:
+        return await library_db.list_favorite_library_agents(
+            user_id=user_id,
+            page=page,
+            page_size=page_size,
+        )
+    except Exception as e:
+        logger.error(f"Could not list favorite library agents for user #{user_id}: {e}")
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail=str(e),
+        ) from e


@router.get("/{library_agent_id}", summary="Get Library Agent")
@@ -111,6 +162,10 @@ async def get_library_agent_by_graph_id(
    summary="Get Agent By Store ID",
    tags=["store", "library"],
    response_model=library_model.LibraryAgent | None,
+    responses={
+        200: {"description": "Library agent found"},
+        404: {"description": "Agent not found"},
+    },
 )
 async def get_library_agent_by_store_listing_version_id(
    store_listing_version_id: str,
@@ -119,15 +174,32 @@ async def get_library_agent_by_store_listing_version_id(
    """
    Get Library Agent from Store Listing Version ID.
    """
-    return await library_db.get_library_agent_by_store_version_id(
-        store_listing_version_id, user_id
-    )
+    try:
+        return await library_db.get_library_agent_by_store_version_id(
+            store_listing_version_id, user_id
+        )
+    except NotFoundError as e:
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail=str(e),
+        )
+    except Exception as e:
+        logger.error(f"Could not fetch library agent from store version ID: {e}")
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail=str(e),
+        ) from e


@router.post(
    "",
    summary="Add Marketplace Agent",
    status_code=status.HTTP_201_CREATED,
+    responses={
+        201: {"description": "Agent added successfully"},
+        404: {"description": "Store listing version not found"},
+        500: {"description": "Server error"},
+    },
 )
 async def add_marketplace_agent_to_library(
    store_listing_version_id: str = Body(embed=True),
@@ -138,19 +210,59 @@ async def add_marketplace_agent_to_library(
 ) -> library_model.LibraryAgent:
    """
    Add an agent from the marketplace to the user's library.
+
+    Args:
+        store_listing_version_id: ID of the store listing version to add.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        library_model.LibraryAgent: Agent added to the library
+
+    Raises:
+        HTTPException(404): If the listing version is not found.
+        HTTPException(500): If a server/database error occurs.
    """
-    agent = await library_db.add_store_agent_to_library(
-        store_listing_version_id=store_listing_version_id,
-        user_id=user_id,
-    )
-    if source != "onboarding":
-        await complete_onboarding_step(user_id, OnboardingStep.MARKETPLACE_ADD_AGENT)
-    return agent
+    try:
+        agent = await library_db.add_store_agent_to_library(
+            store_listing_version_id=store_listing_version_id,
+            user_id=user_id,
+        )
+        if source != "onboarding":
+            await complete_onboarding_step(
+                user_id, OnboardingStep.MARKETPLACE_ADD_AGENT
+            )
+        return agent
+
+    except store_exceptions.AgentNotFoundError as e:
+        logger.warning(
+            f"Could not find store listing version {store_listing_version_id} "
+            "to add to library"
+        )
+        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e))
+    except DatabaseError as e:
+        logger.error(f"Database error while adding agent to library: {e}", e)
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail={"message": str(e), "hint": "Inspect DB logs for details."},
+        ) from e
+    except Exception as e:
+        logger.error(f"Unexpected error while adding agent to library: {e}")
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail={
+                "message": str(e),
+                "hint": "Check server logs for more information.",
+            },
+        ) from e


@router.patch(
    "/{library_agent_id}",
    summary="Update Library Agent",
+    responses={
+        200: {"description": "Agent updated successfully"},
+        500: {"description": "Server error"},
+    },
 )
 async def update_library_agent(
    library_agent_id: str,
@@ -159,21 +271,52 @@ async def update_library_agent(
 ) -> library_model.LibraryAgent:
    """
    Update the library agent with the given fields.
+
+    Args:
+        library_agent_id: ID of the library agent to update.
+        payload: Fields to update (auto_update_version, is_favorite, etc.).
+        user_id: ID of the authenticated user.
+
+    Raises:
+        HTTPException(500): If a server/database error occurs.
    """
-    return await library_db.update_library_agent(
-        library_agent_id=library_agent_id,
-        user_id=user_id,
-        auto_update_version=payload.auto_update_version,
-        graph_version=payload.graph_version,
-        is_favorite=payload.is_favorite,
-        is_archived=payload.is_archived,
-        settings=payload.settings,
-    )
+    try:
+        return await library_db.update_library_agent(
+            library_agent_id=library_agent_id,
+            user_id=user_id,
+            auto_update_version=payload.auto_update_version,
+            graph_version=payload.graph_version,
+            is_favorite=payload.is_favorite,
+            is_archived=payload.is_archived,
+            settings=payload.settings,
+        )
+    except NotFoundError as e:
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail=str(e),
+        ) from e
+    except DatabaseError as e:
+        logger.error(f"Database error while updating library agent: {e}")
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail={"message": str(e), "hint": "Verify DB connection."},
+        ) from e
+    except Exception as e:
+        logger.error(f"Unexpected error while updating library agent: {e}")
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail={"message": str(e), "hint": "Check server logs."},
+        ) from e


@router.delete(
    "/{library_agent_id}",
    summary="Delete Library Agent",
+    responses={
+        204: {"description": "Agent deleted successfully"},
+        404: {"description": "Agent not found"},
+        500: {"description": "Server error"},
+    },
 )
 async def delete_library_agent(
    library_agent_id: str,
@@ -181,11 +324,28 @@ async def delete_library_agent(
 ) -> Response:
    """
    Soft-delete the specified library agent.
+
+    Args:
+        library_agent_id: ID of the library agent to delete.
+        user_id: ID of the authenticated user.
+
+    Returns:
+        204 No Content if successful.
+
+    Raises:
+        HTTPException(404): If the agent does not exist.
+        HTTPException(500): If a server/database error occurs.
    """
-    await library_db.delete_library_agent(
-        library_agent_id=library_agent_id, user_id=user_id
-    )
-    return Response(status_code=status.HTTP_204_NO_CONTENT)
+    try:
+        await library_db.delete_library_agent(
+            library_agent_id=library_agent_id, user_id=user_id
+        )
+        return Response(status_code=status.HTTP_204_NO_CONTENT)
+    except NotFoundError as e:
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail=str(e),
+        ) from e


@router.post("/{library_agent_id}/fork", summary="Fork Library Agent")
--- a/autogpt_platform/backend/backend/api/features/library/routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes_test.py
@@ -118,6 +118,21 @@ async def test_get_library_agents_success(
    )


+def test_get_library_agents_error(mocker: pytest_mock.MockFixture, test_user_id: str):
+    mock_db_call = mocker.patch("backend.api.features.library.db.list_library_agents")
+    mock_db_call.side_effect = Exception("Test error")
+
+    response = client.get("/agents?search_term=test")
+    assert response.status_code == 500
+    mock_db_call.assert_called_once_with(
+        user_id=test_user_id,
+        search_term="test",
+        sort_by=library_model.LibraryAgentSort.UPDATED_AT,
+        page=1,
+        page_size=15,
+    )
+
+
@pytest.mark.asyncio
 async def test_get_favorite_library_agents_success(
    mocker: pytest_mock.MockFixture,
@@ -175,6 +190,23 @@ async def test_get_favorite_library_agents_success(
    )


+def test_get_favorite_library_agents_error(
+    mocker: pytest_mock.MockFixture, test_user_id: str
+):
+    mock_db_call = mocker.patch(
+        "backend.api.features.library.db.list_favorite_library_agents"
+    )
+    mock_db_call.side_effect = Exception("Test error")
+
+    response = client.get("/agents/favorites")
+    assert response.status_code == 500
+    mock_db_call.assert_called_once_with(
+        user_id=test_user_id,
+        page=1,
+        page_size=15,
+    )
+
+
 def test_add_agent_to_library_success(
    mocker: pytest_mock.MockFixture, test_user_id: str
 ):
@@ -226,3 +258,19 @@ def test_add_agent_to_library_success(
        store_listing_version_id="test-version-id", user_id=test_user_id
    )
    mock_complete_onboarding.assert_awaited_once()
+
+
+def test_add_agent_to_library_error(mocker: pytest_mock.MockFixture, test_user_id: str):
+    mock_db_call = mocker.patch(
+        "backend.api.features.library.db.add_store_agent_to_library"
+    )
+    mock_db_call.side_effect = Exception("Test error")
+
+    response = client.post(
+        "/agents", json={"store_listing_version_id": "test-version-id"}
+    )
+    assert response.status_code == 500
+    assert "detail" in response.json()  # Verify error response structure
+    mock_db_call.assert_called_once_with(
+        store_listing_version_id="test-version-id", user_id=test_user_id
+    )
--- a/autogpt_platform/backend/backend/api/features/mcp/init.py
+++ b/autogpt_platform/backend/backend/api/features/mcp/init.py
--- a/autogpt_platform/backend/backend/api/features/mcp/routes.py
+++ b/autogpt_platform/backend/backend/api/features/mcp/routes.py
@@ -1,404 +0,0 @@
-"""
-MCP (Model Context Protocol) API routes.
-
-Provides endpoints for MCP tool discovery and OAuth authentication so the
-frontend can list available tools on an MCP server before placing a block.
-"""
-
-import logging
-from typing import Annotated, Any
-from urllib.parse import urlparse
-
-import fastapi
-from autogpt_libs.auth import get_user_id
-from fastapi import Security
-from pydantic import BaseModel, Field
-
-from backend.api.features.integrations.router import CredentialsMetaResponse
-from backend.blocks.mcp.client import MCPClient, MCPClientError
-from backend.blocks.mcp.oauth import MCPOAuthHandler
-from backend.data.model import OAuth2Credentials
-from backend.integrations.creds_manager import IntegrationCredentialsManager
-from backend.integrations.providers import ProviderName
-from backend.util.request import HTTPClientError, Requests
-from backend.util.settings import Settings
-
-logger = logging.getLogger(__name__)
-
-settings = Settings()
-router = fastapi.APIRouter(tags=["mcp"])
-creds_manager = IntegrationCredentialsManager()
-
-
-# ====================== Tool Discovery ====================== #
-
-
-class DiscoverToolsRequest(BaseModel):
-    """Request to discover tools on an MCP server."""
-
-    server_url: str = Field(description="URL of the MCP server")
-    auth_token: str | None = Field(
-        default=None,
-        description="Optional Bearer token for authenticated MCP servers",
-    )
-
-
-class MCPToolResponse(BaseModel):
-    """A single MCP tool returned by discovery."""
-
-    name: str
-    description: str
-    input_schema: dict[str, Any]
-
-
-class DiscoverToolsResponse(BaseModel):
-    """Response containing the list of tools available on an MCP server."""
-
-    tools: list[MCPToolResponse]
-    server_name: str | None = None
-    protocol_version: str | None = None
-
-
-@router.post(
-    "/discover-tools",
-    summary="Discover available tools on an MCP server",
-    response_model=DiscoverToolsResponse,
-)
-async def discover_tools(
-    request: DiscoverToolsRequest,
-    user_id: Annotated[str, Security(get_user_id)],
-) -> DiscoverToolsResponse:
-    """
-    Connect to an MCP server and return its available tools.
-
-    If the user has a stored MCP credential for this server URL, it will be
-    used automatically — no need to pass an explicit auth token.
-    """
-    auth_token = request.auth_token
-
-    # Auto-use stored MCP credential when no explicit token is provided.
-    if not auth_token:
-        mcp_creds = await creds_manager.store.get_creds_by_provider(
-            user_id, ProviderName.MCP.value
-        )
-        # Find the freshest credential for this server URL
-        best_cred: OAuth2Credentials | None = None
-        for cred in mcp_creds:
-            if (
-                isinstance(cred, OAuth2Credentials)
-                and (cred.metadata or {}).get("mcp_server_url") == request.server_url
-            ):
-                if best_cred is None or (
-                    (cred.access_token_expires_at or 0)
-                    > (best_cred.access_token_expires_at or 0)
-                ):
-                    best_cred = cred
-        if best_cred:
-            # Refresh the token if expired before using it
-            best_cred = await creds_manager.refresh_if_needed(user_id, best_cred)
-            logger.info(
-                f"Using MCP credential {best_cred.id} for {request.server_url}, "
-                f"expires_at={best_cred.access_token_expires_at}"
-            )
-            auth_token = best_cred.access_token.get_secret_value()
-
-    client = MCPClient(request.server_url, auth_token=auth_token)
-
-    try:
-        init_result = await client.initialize()
-        tools = await client.list_tools()
-    except HTTPClientError as e:
-        if e.status_code in (401, 403):
-            raise fastapi.HTTPException(
-                status_code=401,
-                detail="This MCP server requires authentication. "
-                "Please provide a valid auth token.",
-            )
-        raise fastapi.HTTPException(status_code=502, detail=str(e))
-    except MCPClientError as e:
-        raise fastapi.HTTPException(status_code=502, detail=str(e))
-    except Exception as e:
-        raise fastapi.HTTPException(
-            status_code=502,
-            detail=f"Failed to connect to MCP server: {e}",
-        )
-
-    return DiscoverToolsResponse(
-        tools=[
-            MCPToolResponse(
-                name=t.name,
-                description=t.description,
-                input_schema=t.input_schema,
-            )
-            for t in tools
-        ],
-        server_name=(
-            init_result.get("serverInfo", {}).get("name")
-            or urlparse(request.server_url).hostname
-            or "MCP"
-        ),
-        protocol_version=init_result.get("protocolVersion"),
-    )
-
-
-# ======================== OAuth Flow ======================== #
-
-
-class MCPOAuthLoginRequest(BaseModel):
-    """Request to start an OAuth flow for an MCP server."""
-
-    server_url: str = Field(description="URL of the MCP server that requires OAuth")
-
-
-class MCPOAuthLoginResponse(BaseModel):
-    """Response with the OAuth login URL for the user to authenticate."""
-
-    login_url: str
-    state_token: str
-
-
-@router.post(
-    "/oauth/login",
-    summary="Initiate OAuth login for an MCP server",
-)
-async def mcp_oauth_login(
-    request: MCPOAuthLoginRequest,
-    user_id: Annotated[str, Security(get_user_id)],
-) -> MCPOAuthLoginResponse:
-    """
-    Discover OAuth metadata from the MCP server and return a login URL.
-
-    1. Discovers the protected-resource metadata (RFC 9728)
-    2. Fetches the authorization server metadata (RFC 8414)
-    3. Performs Dynamic Client Registration (RFC 7591) if available
-    4. Returns the authorization URL for the frontend to open in a popup
-    """
-    client = MCPClient(request.server_url)
-
-    # Step 1: Discover protected-resource metadata (RFC 9728)
-    protected_resource = await client.discover_auth()
-
-    metadata: dict[str, Any] | None = None
-
-    if protected_resource and protected_resource.get("authorization_servers"):
-        auth_server_url = protected_resource["authorization_servers"][0]
-        resource_url = protected_resource.get("resource", request.server_url)
-
-        # Step 2a: Discover auth-server metadata (RFC 8414)
-        metadata = await client.discover_auth_server_metadata(auth_server_url)
-    else:
-        # Fallback: Some MCP servers (e.g. Linear) are their own auth server
-        # and serve OAuth metadata directly without protected-resource metadata.
-        # Don't assume a resource_url — omitting it lets the auth server choose
-        # the correct audience for the token (RFC 8707 resource is optional).
-        resource_url = None
-        metadata = await client.discover_auth_server_metadata(request.server_url)
-
-    if (
-        not metadata
-        or "authorization_endpoint" not in metadata
-        or "token_endpoint" not in metadata
-    ):
-        raise fastapi.HTTPException(
-            status_code=400,
-            detail="This MCP server does not advertise OAuth support. "
-            "You may need to provide an auth token manually.",
-        )
-
-    authorize_url = metadata["authorization_endpoint"]
-    token_url = metadata["token_endpoint"]
-    registration_endpoint = metadata.get("registration_endpoint")
-    revoke_url = metadata.get("revocation_endpoint")
-
-    # Step 3: Dynamic Client Registration (RFC 7591) if available
-    frontend_base_url = settings.config.frontend_base_url
-    if not frontend_base_url:
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Frontend base URL is not configured.",
-        )
-    redirect_uri = f"{frontend_base_url}/auth/integrations/mcp_callback"
-
-    client_id = ""
-    client_secret = ""
-    if registration_endpoint:
-        reg_result = await _register_mcp_client(
-            registration_endpoint, redirect_uri, request.server_url
-        )
-        if reg_result:
-            client_id = reg_result.get("client_id", "")
-            client_secret = reg_result.get("client_secret", "")
-
-    if not client_id:
-        client_id = "autogpt-platform"
-
-    # Step 4: Store state token with OAuth metadata for the callback
-    scopes = (protected_resource or {}).get("scopes_supported") or metadata.get(
-        "scopes_supported", []
-    )
-    state_token, code_challenge = await creds_manager.store.store_state_token(
-        user_id,
-        ProviderName.MCP.value,
-        scopes,
-        state_metadata={
-            "authorize_url": authorize_url,
-            "token_url": token_url,
-            "revoke_url": revoke_url,
-            "resource_url": resource_url,
-            "server_url": request.server_url,
-            "client_id": client_id,
-            "client_secret": client_secret,
-        },
-    )
-
-    # Step 5: Build and return the login URL
-    handler = MCPOAuthHandler(
-        client_id=client_id,
-        client_secret=client_secret,
-        redirect_uri=redirect_uri,
-        authorize_url=authorize_url,
-        token_url=token_url,
-        resource_url=resource_url,
-    )
-    login_url = handler.get_login_url(
-        scopes, state_token, code_challenge=code_challenge
-    )
-
-    return MCPOAuthLoginResponse(login_url=login_url, state_token=state_token)
-
-
-class MCPOAuthCallbackRequest(BaseModel):
-    """Request to exchange an OAuth code for tokens."""
-
-    code: str = Field(description="Authorization code from OAuth callback")
-    state_token: str = Field(description="State token for CSRF verification")
-
-
-class MCPOAuthCallbackResponse(BaseModel):
-    """Response after successfully storing OAuth credentials."""
-
-    credential_id: str
-
-
-@router.post(
-    "/oauth/callback",
-    summary="Exchange OAuth code for MCP tokens",
-)
-async def mcp_oauth_callback(
-    request: MCPOAuthCallbackRequest,
-    user_id: Annotated[str, Security(get_user_id)],
-) -> CredentialsMetaResponse:
-    """
-    Exchange the authorization code for tokens and store the credential.
-
-    The frontend calls this after receiving the OAuth code from the popup.
-    On success, subsequent ``/discover-tools`` calls for the same server URL
-    will automatically use the stored credential.
-    """
-    valid_state = await creds_manager.store.verify_state_token(
-        user_id, request.state_token, ProviderName.MCP.value
-    )
-    if not valid_state:
-        raise fastapi.HTTPException(
-            status_code=400,
-            detail="Invalid or expired state token.",
-        )
-
-    meta = valid_state.state_metadata
-    frontend_base_url = settings.config.frontend_base_url
-    if not frontend_base_url:
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Frontend base URL is not configured.",
-        )
-    redirect_uri = f"{frontend_base_url}/auth/integrations/mcp_callback"
-
-    handler = MCPOAuthHandler(
-        client_id=meta["client_id"],
-        client_secret=meta.get("client_secret", ""),
-        redirect_uri=redirect_uri,
-        authorize_url=meta["authorize_url"],
-        token_url=meta["token_url"],
-        revoke_url=meta.get("revoke_url"),
-        resource_url=meta.get("resource_url"),
-    )
-
-    try:
-        credentials = await handler.exchange_code_for_tokens(
-            request.code, valid_state.scopes, valid_state.code_verifier
-        )
-    except Exception as e:
-        raise fastapi.HTTPException(
-            status_code=400,
-            detail=f"OAuth token exchange failed: {e}",
-        )
-
-    # Enrich credential metadata for future lookup and token refresh
-    if credentials.metadata is None:
-        credentials.metadata = {}
-    credentials.metadata["mcp_server_url"] = meta["server_url"]
-    credentials.metadata["mcp_client_id"] = meta["client_id"]
-    credentials.metadata["mcp_client_secret"] = meta.get("client_secret", "")
-    credentials.metadata["mcp_token_url"] = meta["token_url"]
-    credentials.metadata["mcp_resource_url"] = meta.get("resource_url", "")
-
-    hostname = urlparse(meta["server_url"]).hostname or meta["server_url"]
-    credentials.title = f"MCP: {hostname}"
-
-    # Remove old MCP credentials for the same server to prevent stale token buildup.
-    try:
-        old_creds = await creds_manager.store.get_creds_by_provider(
-            user_id, ProviderName.MCP.value
-        )
-        for old in old_creds:
-            if (
-                isinstance(old, OAuth2Credentials)
-                and (old.metadata or {}).get("mcp_server_url") == meta["server_url"]
-            ):
-                await creds_manager.store.delete_creds_by_id(user_id, old.id)
-                logger.info(
-                    f"Removed old MCP credential {old.id} for {meta['server_url']}"
-                )
-    except Exception:
-        logger.debug("Could not clean up old MCP credentials", exc_info=True)
-
-    await creds_manager.create(user_id, credentials)
-
-    return CredentialsMetaResponse(
-        id=credentials.id,
-        provider=credentials.provider,
-        type=credentials.type,
-        title=credentials.title,
-        scopes=credentials.scopes,
-        username=credentials.username,
-        host=credentials.metadata.get("mcp_server_url"),
-    )
-
-
-# ======================== Helpers ======================== #
-
-
-async def _register_mcp_client(
-    registration_endpoint: str,
-    redirect_uri: str,
-    server_url: str,
-) -> dict[str, Any] | None:
-    """Attempt Dynamic Client Registration (RFC 7591) with an MCP auth server."""
-    try:
-        response = await Requests(raise_for_status=True).post(
-            registration_endpoint,
-            json={
-                "client_name": "AutoGPT Platform",
-                "redirect_uris": [redirect_uri],
-                "grant_types": ["authorization_code"],
-                "response_types": ["code"],
-                "token_endpoint_auth_method": "client_secret_post",
-            },
-        )
-        data = response.json()
-        if isinstance(data, dict) and "client_id" in data:
-            return data
-        return None
-    except Exception as e:
-        logger.warning(f"Dynamic client registration failed for {server_url}: {e}")
-        return None
--- a/autogpt_platform/backend/backend/api/features/mcp/test_routes.py
+++ b/autogpt_platform/backend/backend/api/features/mcp/test_routes.py
@@ -1,436 +0,0 @@
-"""Tests for MCP API routes.
-
-Uses httpx.AsyncClient with ASGITransport instead of fastapi.testclient.TestClient
-to avoid creating blocking portals that can corrupt pytest-asyncio's session event loop.
-"""
-
-from unittest.mock import AsyncMock, patch
-
-import fastapi
-import httpx
-import pytest
-import pytest_asyncio
-from autogpt_libs.auth import get_user_id
-
-from backend.api.features.mcp.routes import router
-from backend.blocks.mcp.client import MCPClientError, MCPTool
-from backend.util.request import HTTPClientError
-
-app = fastapi.FastAPI()
-app.include_router(router)
-app.dependency_overrides[get_user_id] = lambda: "test-user-id"
-
-
-@pytest_asyncio.fixture(scope="module")
-async def client():
-    transport = httpx.ASGITransport(app=app)
-    async with httpx.AsyncClient(transport=transport, base_url="http://test") as c:
-        yield c
-
-
-class TestDiscoverTools:
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_discover_tools_success(self, client):
-        mock_tools = [
-            MCPTool(
-                name="get_weather",
-                description="Get weather for a city",
-                input_schema={
-                    "type": "object",
-                    "properties": {"city": {"type": "string"}},
-                    "required": ["city"],
-                },
-            ),
-            MCPTool(
-                name="add_numbers",
-                description="Add two numbers",
-                input_schema={
-                    "type": "object",
-                    "properties": {
-                        "a": {"type": "number"},
-                        "b": {"type": "number"},
-                    },
-                },
-            ),
-        ]
-
-        with (
-            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
-        ):
-            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
-            instance = MockClient.return_value
-            instance.initialize = AsyncMock(
-                return_value={
-                    "protocolVersion": "2025-03-26",
-                    "serverInfo": {"name": "test-server"},
-                }
-            )
-            instance.list_tools = AsyncMock(return_value=mock_tools)
-
-            response = await client.post(
-                "/discover-tools",
-                json={"server_url": "https://mcp.example.com/mcp"},
-            )
-
-        assert response.status_code == 200
-        data = response.json()
-        assert len(data["tools"]) == 2
-        assert data["tools"][0]["name"] == "get_weather"
-        assert data["tools"][1]["name"] == "add_numbers"
-        assert data["server_name"] == "test-server"
-        assert data["protocol_version"] == "2025-03-26"
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_discover_tools_with_auth_token(self, client):
-        with patch("backend.api.features.mcp.routes.MCPClient") as MockClient:
-            instance = MockClient.return_value
-            instance.initialize = AsyncMock(
-                return_value={"serverInfo": {}, "protocolVersion": "2025-03-26"}
-            )
-            instance.list_tools = AsyncMock(return_value=[])
-
-            response = await client.post(
-                "/discover-tools",
-                json={
-                    "server_url": "https://mcp.example.com/mcp",
-                    "auth_token": "my-secret-token",
-                },
-            )
-
-        assert response.status_code == 200
-        MockClient.assert_called_once_with(
-            "https://mcp.example.com/mcp",
-            auth_token="my-secret-token",
-        )
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_discover_tools_auto_uses_stored_credential(self, client):
-        """When no explicit token is given, stored MCP credentials are used."""
-        from pydantic import SecretStr
-
-        from backend.data.model import OAuth2Credentials
-
-        stored_cred = OAuth2Credentials(
-            provider="mcp",
-            title="MCP: example.com",
-            access_token=SecretStr("stored-token-123"),
-            refresh_token=None,
-            access_token_expires_at=None,
-            refresh_token_expires_at=None,
-            scopes=[],
-            metadata={"mcp_server_url": "https://mcp.example.com/mcp"},
-        )
-
-        with (
-            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
-        ):
-            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[stored_cred])
-            mock_cm.refresh_if_needed = AsyncMock(return_value=stored_cred)
-            instance = MockClient.return_value
-            instance.initialize = AsyncMock(
-                return_value={"serverInfo": {}, "protocolVersion": "2025-03-26"}
-            )
-            instance.list_tools = AsyncMock(return_value=[])
-
-            response = await client.post(
-                "/discover-tools",
-                json={"server_url": "https://mcp.example.com/mcp"},
-            )
-
-        assert response.status_code == 200
-        MockClient.assert_called_once_with(
-            "https://mcp.example.com/mcp",
-            auth_token="stored-token-123",
-        )
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_discover_tools_mcp_error(self, client):
-        with (
-            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
-        ):
-            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
-            instance = MockClient.return_value
-            instance.initialize = AsyncMock(
-                side_effect=MCPClientError("Connection refused")
-            )
-
-            response = await client.post(
-                "/discover-tools",
-                json={"server_url": "https://bad-server.example.com/mcp"},
-            )
-
-        assert response.status_code == 502
-        assert "Connection refused" in response.json()["detail"]
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_discover_tools_generic_error(self, client):
-        with (
-            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
-        ):
-            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
-            instance = MockClient.return_value
-            instance.initialize = AsyncMock(side_effect=Exception("Network timeout"))
-
-            response = await client.post(
-                "/discover-tools",
-                json={"server_url": "https://timeout.example.com/mcp"},
-            )
-
-        assert response.status_code == 502
-        assert "Failed to connect" in response.json()["detail"]
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_discover_tools_auth_required(self, client):
-        with (
-            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
-        ):
-            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
-            instance = MockClient.return_value
-            instance.initialize = AsyncMock(
-                side_effect=HTTPClientError("HTTP 401 Error: Unauthorized", 401)
-            )
-
-            response = await client.post(
-                "/discover-tools",
-                json={"server_url": "https://auth-server.example.com/mcp"},
-            )
-
-        assert response.status_code == 401
-        assert "requires authentication" in response.json()["detail"]
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_discover_tools_forbidden(self, client):
-        with (
-            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
-        ):
-            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
-            instance = MockClient.return_value
-            instance.initialize = AsyncMock(
-                side_effect=HTTPClientError("HTTP 403 Error: Forbidden", 403)
-            )
-
-            response = await client.post(
-                "/discover-tools",
-                json={"server_url": "https://auth-server.example.com/mcp"},
-            )
-
-        assert response.status_code == 401
-        assert "requires authentication" in response.json()["detail"]
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_discover_tools_missing_url(self, client):
-        response = await client.post("/discover-tools", json={})
-        assert response.status_code == 422
-
-
-class TestOAuthLogin:
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_oauth_login_success(self, client):
-        with (
-            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
-            patch("backend.api.features.mcp.routes.settings") as mock_settings,
-            patch(
-                "backend.api.features.mcp.routes._register_mcp_client"
-            ) as mock_register,
-        ):
-            instance = MockClient.return_value
-            instance.discover_auth = AsyncMock(
-                return_value={
-                    "authorization_servers": ["https://auth.sentry.io"],
-                    "resource": "https://mcp.sentry.dev/mcp",
-                    "scopes_supported": ["openid"],
-                }
-            )
-            instance.discover_auth_server_metadata = AsyncMock(
-                return_value={
-                    "authorization_endpoint": "https://auth.sentry.io/authorize",
-                    "token_endpoint": "https://auth.sentry.io/token",
-                    "registration_endpoint": "https://auth.sentry.io/register",
-                }
-            )
-            mock_register.return_value = {
-                "client_id": "registered-client-id",
-                "client_secret": "registered-secret",
-            }
-            mock_cm.store.store_state_token = AsyncMock(
-                return_value=("state-token-123", "code-challenge-abc")
-            )
-            mock_settings.config.frontend_base_url = "http://localhost:3000"
-
-            response = await client.post(
-                "/oauth/login",
-                json={"server_url": "https://mcp.sentry.dev/mcp"},
-            )
-
-        assert response.status_code == 200
-        data = response.json()
-        assert "login_url" in data
-        assert data["state_token"] == "state-token-123"
-        assert "auth.sentry.io/authorize" in data["login_url"]
-        assert "registered-client-id" in data["login_url"]
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_oauth_login_no_oauth_support(self, client):
-        with patch("backend.api.features.mcp.routes.MCPClient") as MockClient:
-            instance = MockClient.return_value
-            instance.discover_auth = AsyncMock(return_value=None)
-            instance.discover_auth_server_metadata = AsyncMock(return_value=None)
-
-            response = await client.post(
-                "/oauth/login",
-                json={"server_url": "https://simple-server.example.com/mcp"},
-            )
-
-        assert response.status_code == 400
-        assert "does not advertise OAuth" in response.json()["detail"]
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_oauth_login_fallback_to_public_client(self, client):
-        """When DCR is unavailable, falls back to default public client ID."""
-        with (
-            patch("backend.api.features.mcp.routes.MCPClient") as MockClient,
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
-            patch("backend.api.features.mcp.routes.settings") as mock_settings,
-        ):
-            instance = MockClient.return_value
-            instance.discover_auth = AsyncMock(
-                return_value={
-                    "authorization_servers": ["https://auth.example.com"],
-                    "resource": "https://mcp.example.com/mcp",
-                }
-            )
-            instance.discover_auth_server_metadata = AsyncMock(
-                return_value={
-                    "authorization_endpoint": "https://auth.example.com/authorize",
-                    "token_endpoint": "https://auth.example.com/token",
-                    # No registration_endpoint
-                }
-            )
-            mock_cm.store.store_state_token = AsyncMock(
-                return_value=("state-abc", "challenge-xyz")
-            )
-            mock_settings.config.frontend_base_url = "http://localhost:3000"
-
-            response = await client.post(
-                "/oauth/login",
-                json={"server_url": "https://mcp.example.com/mcp"},
-            )
-
-        assert response.status_code == 200
-        data = response.json()
-        assert "autogpt-platform" in data["login_url"]
-
-
-class TestOAuthCallback:
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_oauth_callback_success(self, client):
-        from pydantic import SecretStr
-
-        from backend.data.model import OAuth2Credentials
-
-        mock_creds = OAuth2Credentials(
-            provider="mcp",
-            title=None,
-            access_token=SecretStr("access-token-xyz"),
-            refresh_token=None,
-            access_token_expires_at=None,
-            refresh_token_expires_at=None,
-            scopes=[],
-            metadata={
-                "mcp_token_url": "https://auth.sentry.io/token",
-                "mcp_resource_url": "https://mcp.sentry.dev/mcp",
-            },
-        )
-
-        with (
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
-            patch("backend.api.features.mcp.routes.settings") as mock_settings,
-            patch("backend.api.features.mcp.routes.MCPOAuthHandler") as MockHandler,
-        ):
-            mock_settings.config.frontend_base_url = "http://localhost:3000"
-
-            # Mock state verification
-            mock_state = AsyncMock()
-            mock_state.state_metadata = {
-                "authorize_url": "https://auth.sentry.io/authorize",
-                "token_url": "https://auth.sentry.io/token",
-                "client_id": "test-client-id",
-                "client_secret": "test-secret",
-                "server_url": "https://mcp.sentry.dev/mcp",
-            }
-            mock_state.scopes = ["openid"]
-            mock_state.code_verifier = "verifier-123"
-            mock_cm.store.verify_state_token = AsyncMock(return_value=mock_state)
-            mock_cm.create = AsyncMock()
-
-            handler_instance = MockHandler.return_value
-            handler_instance.exchange_code_for_tokens = AsyncMock(
-                return_value=mock_creds
-            )
-
-            # Mock old credential cleanup
-            mock_cm.store.get_creds_by_provider = AsyncMock(return_value=[])
-
-            response = await client.post(
-                "/oauth/callback",
-                json={"code": "auth-code-abc", "state_token": "state-token-123"},
-            )
-
-        assert response.status_code == 200
-        data = response.json()
-        assert "id" in data
-        assert data["provider"] == "mcp"
-        assert data["type"] == "oauth2"
-        mock_cm.create.assert_called_once()
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_oauth_callback_invalid_state(self, client):
-        with patch("backend.api.features.mcp.routes.creds_manager") as mock_cm:
-            mock_cm.store.verify_state_token = AsyncMock(return_value=None)
-
-            response = await client.post(
-                "/oauth/callback",
-                json={"code": "auth-code", "state_token": "bad-state"},
-            )
-
-        assert response.status_code == 400
-        assert "Invalid or expired" in response.json()["detail"]
-
-    @pytest.mark.asyncio(loop_scope="session")
-    async def test_oauth_callback_token_exchange_fails(self, client):
-        with (
-            patch("backend.api.features.mcp.routes.creds_manager") as mock_cm,
-            patch("backend.api.features.mcp.routes.settings") as mock_settings,
-            patch("backend.api.features.mcp.routes.MCPOAuthHandler") as MockHandler,
-        ):
-            mock_settings.config.frontend_base_url = "http://localhost:3000"
-            mock_state = AsyncMock()
-            mock_state.state_metadata = {
-                "authorize_url": "https://auth.example.com/authorize",
-                "token_url": "https://auth.example.com/token",
-                "client_id": "cid",
-                "server_url": "https://mcp.example.com/mcp",
-            }
-            mock_state.scopes = []
-            mock_state.code_verifier = "v"
-            mock_cm.store.verify_state_token = AsyncMock(return_value=mock_state)
-
-            handler_instance = MockHandler.return_value
-            handler_instance.exchange_code_for_tokens = AsyncMock(
-                side_effect=RuntimeError("Token exchange failed")
-            )
-
-            response = await client.post(
-                "/oauth/callback",
-                json={"code": "bad-code", "state_token": "state"},
-            )
-
-        assert response.status_code == 400
-        assert "token exchange failed" in response.json()["detail"].lower()
--- a/autogpt_platform/backend/backend/api/features/oauth_test.py
+++ b/autogpt_platform/backend/backend/api/features/oauth_test.py
@@ -20,7 +20,6 @@ from typing import AsyncGenerator

 import httpx
 import pytest
-import pytest_asyncio
 from autogpt_libs.api_key.keysmith import APIKeySmith
 from prisma.enums import APIKeyPermission
 from prisma.models import OAuthAccessToken as PrismaOAuthAccessToken
@@ -39,13 +38,13 @@ keysmith = APIKeySmith()
 # ============================================================================


-@pytest.fixture(scope="session")
+@pytest.fixture
 def test_user_id() -> str:
    """Test user ID for OAuth tests."""
    return str(uuid.uuid4())


-@pytest_asyncio.fixture(scope="session", loop_scope="session")
+@pytest.fixture
 async def test_user(server, test_user_id: str):
    """Create a test user in the database."""
    await PrismaUser.prisma().create(
@@ -68,7 +67,7 @@ async def test_user(server, test_user_id: str):
    await PrismaUser.prisma().delete(where={"id": test_user_id})


-@pytest_asyncio.fixture
+@pytest.fixture
 async def test_oauth_app(test_user: str):
    """Create a test OAuth application in the database."""
    app_id = str(uuid.uuid4())
@@ -123,7 +122,7 @@ def pkce_credentials() -> tuple[str, str]:
    return generate_pkce()


-@pytest_asyncio.fixture
+@pytest.fixture
 async def client(server, test_user: str) -> AsyncGenerator[httpx.AsyncClient, None]:
    """
    Create an async HTTP client that talks directly to the FastAPI app.
@@ -288,7 +287,7 @@ async def test_authorize_invalid_client_returns_error(
    assert query_params["error"][0] == "invalid_client"


-@pytest_asyncio.fixture
+@pytest.fixture
 async def inactive_oauth_app(test_user: str):
    """Create an inactive test OAuth application in the database."""
    app_id = str(uuid.uuid4())
@@ -1005,7 +1004,7 @@ async def test_token_refresh_revoked(
    assert "revoked" in response.json()["detail"].lower()


-@pytest_asyncio.fixture
+@pytest.fixture
 async def other_oauth_app(test_user: str):
    """Create a second OAuth application for cross-app tests."""
    app_id = str(uuid.uuid4())
--- a/autogpt_platform/backend/backend/api/features/otto/service.py
+++ b/autogpt_platform/backend/backend/api/features/otto/service.py
@@ -5,8 +5,8 @@ from typing import Optional
 import aiohttp
 from fastapi import HTTPException

-from backend.blocks import get_block
 from backend.data import graph as graph_db
+from backend.data.block import get_block
 from backend.util.settings import Settings

 from .models import ApiResponse, ChatRequest, GraphData
--- a/autogpt_platform/backend/backend/api/features/store/content_handlers.py
+++ b/autogpt_platform/backend/backend/api/features/store/content_handlers.py
@@ -152,7 +152,7 @@ class BlockHandler(ContentHandler):

    async def get_missing_items(self, batch_size: int) -> list[ContentItem]:
        """Fetch blocks without embeddings."""
-        from backend.blocks import get_blocks
+        from backend.data.block import get_blocks

        # Get all available blocks
        all_blocks = get_blocks()
@@ -188,10 +188,6 @@ class BlockHandler(ContentHandler):
            try:
                block_instance = block_cls()

-                # Skip disabled blocks - they shouldn't be indexed
-                if block_instance.disabled:
-                    continue
-
                # Build searchable text from block metadata
                parts = []
                if hasattr(block_instance, "name") and block_instance.name:
@@ -249,22 +245,15 @@ class BlockHandler(ContentHandler):

    async def get_stats(self) -> dict[str, int]:
        """Get statistics about block embedding coverage."""
-        from backend.blocks import get_blocks
+        from backend.data.block import get_blocks

        all_blocks = get_blocks()
-
-        # Filter out disabled blocks - they're not indexed
-        enabled_block_ids = [
-            block_id
-            for block_id, block_cls in all_blocks.items()
-            if not block_cls().disabled
-        ]
-        total_blocks = len(enabled_block_ids)
+        total_blocks = len(all_blocks)

        if total_blocks == 0:
            return {"total": 0, "with_embeddings": 0, "without_embeddings": 0}

-        block_ids = enabled_block_ids
+        block_ids = list(all_blocks.keys())
        placeholders = ",".join([f"${i+1}" for i in range(len(block_ids))])

        embedded_result = await query_raw_with_schema(
--- a/autogpt_platform/backend/backend/api/features/store/content_handlers_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/content_handlers_test.py
@@ -81,7 +81,6 @@ async def test_block_handler_get_missing_items(mocker):
    mock_block_instance.name = "Calculator Block"
    mock_block_instance.description = "Performs calculations"
    mock_block_instance.categories = [MagicMock(value="MATH")]
-    mock_block_instance.disabled = False
    mock_block_instance.input_schema.model_json_schema.return_value = {
        "properties": {"expression": {"description": "Math expression to evaluate"}}
    }
@@ -93,7 +92,7 @@ async def test_block_handler_get_missing_items(mocker):
    mock_existing = []

    with patch(
-        "backend.blocks.get_blocks",
+        "backend.data.block.get_blocks",
        return_value=mock_blocks,
    ):
        with patch(
@@ -117,25 +116,18 @@ async def test_block_handler_get_stats(mocker):
    """Test BlockHandler returns correct stats."""
    handler = BlockHandler()

-    # Mock get_blocks - each block class returns an instance with disabled=False
-    def make_mock_block_class():
-        mock_class = MagicMock()
-        mock_instance = MagicMock()
-        mock_instance.disabled = False
-        mock_class.return_value = mock_instance
-        return mock_class
-
+    # Mock get_blocks
    mock_blocks = {
-        "block-1": make_mock_block_class(),
-        "block-2": make_mock_block_class(),
-        "block-3": make_mock_block_class(),
+        "block-1": MagicMock(),
+        "block-2": MagicMock(),
+        "block-3": MagicMock(),
    }

    # Mock embedded count query (2 blocks have embeddings)
    mock_embedded = [{"count": 2}]

    with patch(
-        "backend.blocks.get_blocks",
+        "backend.data.block.get_blocks",
        return_value=mock_blocks,
    ):
        with patch(
@@ -317,7 +309,6 @@ async def test_block_handler_handles_missing_attributes():
    mock_block_class = MagicMock()
    mock_block_instance = MagicMock()
    mock_block_instance.name = "Minimal Block"
-    mock_block_instance.disabled = False
    # No description, categories, or schema
    del mock_block_instance.description
    del mock_block_instance.categories
@@ -327,7 +318,7 @@ async def test_block_handler_handles_missing_attributes():
    mock_blocks = {"block-minimal": mock_block_class}

    with patch(
-        "backend.blocks.get_blocks",
+        "backend.data.block.get_blocks",
        return_value=mock_blocks,
    ):
        with patch(
@@ -351,7 +342,6 @@ async def test_block_handler_skips_failed_blocks():
    good_instance.name = "Good Block"
    good_instance.description = "Works fine"
    good_instance.categories = []
-    good_instance.disabled = False
    good_block.return_value = good_instance

    bad_block = MagicMock()
@@ -360,7 +350,7 @@ async def test_block_handler_skips_failed_blocks():
    mock_blocks = {"good-block": good_block, "bad-block": bad_block}

    with patch(
-        "backend.blocks.get_blocks",
+        "backend.data.block.get_blocks",
        return_value=mock_blocks,
    ):
        with patch(
--- a/autogpt_platform/backend/backend/api/features/store/db.py
+++ b/autogpt_platform/backend/backend/api/features/store/db.py
@@ -1,7 +1,7 @@
 import asyncio
 import logging
 from datetime import datetime, timezone
-from typing import Any, Literal, overload
+from typing import Any, Literal

 import fastapi
 import prisma.enums
@@ -11,8 +11,8 @@ import prisma.types

 from backend.data.db import transaction
 from backend.data.graph import (
+    GraphMeta,
    GraphModel,
-    GraphModelWithoutNodes,
    get_graph,
    get_graph_as_admin,
    get_sub_graphs,
@@ -112,7 +112,6 @@ async def get_store_agents(
                            description=agent["description"],
                            runs=agent["runs"],
                            rating=agent["rating"],
-                            agent_graph_id=agent.get("agentGraphId", ""),
                        )
                        store_agents.append(store_agent)
                    except Exception as e:
@@ -171,7 +170,6 @@ async def get_store_agents(
                        description=agent.description,
                        runs=agent.runs,
                        rating=agent.rating,
-                        agent_graph_id=agent.agentGraphId,
                    )
                    # Add to the list only if creation was successful
                    store_agents.append(store_agent)
@@ -334,22 +332,7 @@ async def get_store_agent_details(
        raise DatabaseError("Failed to fetch agent details") from e


-@overload
-async def get_available_graph(
-    store_listing_version_id: str, hide_nodes: Literal[False]
-) -> GraphModel: ...
-
-
-@overload
-async def get_available_graph(
-    store_listing_version_id: str, hide_nodes: Literal[True] = True
-) -> GraphModelWithoutNodes: ...
-
-
-async def get_available_graph(
-    store_listing_version_id: str,
-    hide_nodes: bool = True,
-) -> GraphModelWithoutNodes | GraphModel:
+async def get_available_graph(store_listing_version_id: str) -> GraphMeta:
    try:
        # Get avaialble, non-deleted store listing version
        store_listing_version = (
@@ -359,7 +342,7 @@ async def get_available_graph(
                    "isAvailable": True,
                    "isDeleted": False,
                },
-                include={"AgentGraph": {"include": AGENT_GRAPH_INCLUDE}},
+                include={"AgentGraph": {"include": {"Nodes": True}}},
            )
        )

@@ -369,9 +352,7 @@ async def get_available_graph(
                detail=f"Store listing version {store_listing_version_id} not found",
            )

-        return (GraphModelWithoutNodes if hide_nodes else GraphModel).from_db(
-            store_listing_version.AgentGraph
-        )
+        return GraphModel.from_db(store_listing_version.AgentGraph).meta()

    except Exception as e:
        logger.error(f"Error getting agent: {e}")
@@ -1571,7 +1552,7 @@ async def review_store_submission(

                # Generate embedding for approved listing (blocking - admin operation)
                # Inside transaction: if embedding fails, entire transaction rolls back
-                await ensure_embedding(
+                embedding_success = await ensure_embedding(
                    version_id=store_listing_version_id,
                    name=store_listing_version.name,
                    description=store_listing_version.description,
@@ -1579,6 +1560,12 @@ async def review_store_submission(
                    categories=store_listing_version.categories or [],
                    tx=tx,
                )
+                if not embedding_success:
+                    raise ValueError(
+                        f"Failed to generate embedding for listing {store_listing_version_id}. "
+                        "This is likely due to OpenAI API being unavailable. "
+                        "Please try again later or contact support if the issue persists."
+                    )

                await prisma.models.StoreListing.prisma(tx).update(
                    where={"id": store_listing_version.StoreListing.id},
--- a/autogpt_platform/backend/backend/api/features/store/embeddings.py
+++ b/autogpt_platform/backend/backend/api/features/store/embeddings.py
@@ -21,6 +21,7 @@ from backend.util.json import dumps

 logger = logging.getLogger(__name__)

+
 # OpenAI embedding model configuration
 EMBEDDING_MODEL = "text-embedding-3-small"
 # Embedding dimension for the model above
@@ -62,42 +63,49 @@ def build_searchable_text(
    return " ".join(parts)


-async def generate_embedding(text: str) -> list[float]:
+async def generate_embedding(text: str) -> list[float] | None:
    """
    Generate embedding for text using OpenAI API.

-    Raises exceptions on failure - caller should handle.
+    Returns None if embedding generation fails.
+    Fail-fast: no retries to maintain consistency with approval flow.
    """
-    client = get_openai_client()
-    if not client:
-        raise RuntimeError("openai_internal_api_key not set, cannot generate embedding")
+    try:
+        client = get_openai_client()
+        if not client:
+            logger.error("openai_internal_api_key not set, cannot generate embedding")
+            return None

-    # Truncate text to token limit using tiktoken
-    # Character-based truncation is insufficient because token ratios vary by content type
-    enc = encoding_for_model(EMBEDDING_MODEL)
-    tokens = enc.encode(text)
-    if len(tokens) > EMBEDDING_MAX_TOKENS:
-        tokens = tokens[:EMBEDDING_MAX_TOKENS]
-        truncated_text = enc.decode(tokens)
-        logger.info(
-            f"Truncated text from {len(enc.encode(text))} to {len(tokens)} tokens"
+        # Truncate text to token limit using tiktoken
+        # Character-based truncation is insufficient because token ratios vary by content type
+        enc = encoding_for_model(EMBEDDING_MODEL)
+        tokens = enc.encode(text)
+        if len(tokens) > EMBEDDING_MAX_TOKENS:
+            tokens = tokens[:EMBEDDING_MAX_TOKENS]
+            truncated_text = enc.decode(tokens)
+            logger.info(
+                f"Truncated text from {len(enc.encode(text))} to {len(tokens)} tokens"
+            )
+        else:
+            truncated_text = text
+
+        start_time = time.time()
+        response = await client.embeddings.create(
+            model=EMBEDDING_MODEL,
+            input=truncated_text,
        )
-    else:
-        truncated_text = text
+        latency_ms = (time.time() - start_time) * 1000

-    start_time = time.time()
-    response = await client.embeddings.create(
-        model=EMBEDDING_MODEL,
-        input=truncated_text,
-    )
-    latency_ms = (time.time() - start_time) * 1000
+        embedding = response.data[0].embedding
+        logger.info(
+            f"Generated embedding: {len(embedding)} dims, "
+            f"{len(tokens)} tokens, {latency_ms:.0f}ms"
+        )
+        return embedding

-    embedding = response.data[0].embedding
-    logger.info(
-        f"Generated embedding: {len(embedding)} dims, "
-        f"{len(tokens)} tokens, {latency_ms:.0f}ms"
-    )
-    return embedding
+    except Exception as e:
+        logger.error(f"Failed to generate embedding: {e}")
+        return None


 async def store_embedding(
@@ -136,45 +144,48 @@ async def store_content_embedding(

    New function for unified content embedding storage.
    Uses raw SQL since Prisma doesn't natively support pgvector.
-
-    Raises exceptions on failure - caller should handle.
    """
-    client = tx if tx else prisma.get_client()
+    try:
+        client = tx if tx else prisma.get_client()

-    # Convert embedding to PostgreSQL vector format
-    embedding_str = embedding_to_vector_string(embedding)
-    metadata_json = dumps(metadata or {})
+        # Convert embedding to PostgreSQL vector format
+        embedding_str = embedding_to_vector_string(embedding)
+        metadata_json = dumps(metadata or {})

-    # Upsert the embedding
-    # WHERE clause in DO UPDATE prevents PostgreSQL 15 bug with NULLS NOT DISTINCT
-    # Use unqualified ::vector - pgvector is in search_path on all environments
-    await execute_raw_with_schema(
-        """
-        INSERT INTO {schema_prefix}"UnifiedContentEmbedding" (
-            "id", "contentType", "contentId", "userId", "embedding", "searchableText", "metadata", "createdAt", "updatedAt"
+        # Upsert the embedding
+        # WHERE clause in DO UPDATE prevents PostgreSQL 15 bug with NULLS NOT DISTINCT
+        # Use unqualified ::vector - pgvector is in search_path on all environments
+        await execute_raw_with_schema(
+            """
+            INSERT INTO {schema_prefix}"UnifiedContentEmbedding" (
+                "id", "contentType", "contentId", "userId", "embedding", "searchableText", "metadata", "createdAt", "updatedAt"
+            )
+            VALUES (gen_random_uuid()::text, $1::{schema_prefix}"ContentType", $2, $3, $4::vector, $5, $6::jsonb, NOW(), NOW())
+            ON CONFLICT ("contentType", "contentId", "userId")
+            DO UPDATE SET
+                "embedding" = $4::vector,
+                "searchableText" = $5,
+                "metadata" = $6::jsonb,
+                "updatedAt" = NOW()
+            WHERE {schema_prefix}"UnifiedContentEmbedding"."contentType" = $1::{schema_prefix}"ContentType"
+                AND {schema_prefix}"UnifiedContentEmbedding"."contentId" = $2
+                AND ({schema_prefix}"UnifiedContentEmbedding"."userId" = $3 OR ($3 IS NULL AND {schema_prefix}"UnifiedContentEmbedding"."userId" IS NULL))
+            """,
+            content_type,
+            content_id,
+            user_id,
+            embedding_str,
+            searchable_text,
+            metadata_json,
+            client=client,
        )
-        VALUES (gen_random_uuid()::text, $1::{schema_prefix}"ContentType", $2, $3, $4::vector, $5, $6::jsonb, NOW(), NOW())
-        ON CONFLICT ("contentType", "contentId", "userId")
-        DO UPDATE SET
-            "embedding" = $4::vector,
-            "searchableText" = $5,
-            "metadata" = $6::jsonb,
-            "updatedAt" = NOW()
-        WHERE {schema_prefix}"UnifiedContentEmbedding"."contentType" = $1::{schema_prefix}"ContentType"
-            AND {schema_prefix}"UnifiedContentEmbedding"."contentId" = $2
-            AND ({schema_prefix}"UnifiedContentEmbedding"."userId" = $3 OR ($3 IS NULL AND {schema_prefix}"UnifiedContentEmbedding"."userId" IS NULL))
-        """,
-        content_type,
-        content_id,
-        user_id,
-        embedding_str,
-        searchable_text,
-        metadata_json,
-        client=client,
-    )

-    logger.info(f"Stored embedding for {content_type}:{content_id}")
-    return True
+        logger.info(f"Stored embedding for {content_type}:{content_id}")
+        return True
+
+    except Exception as e:
+        logger.error(f"Failed to store embedding for {content_type}:{content_id}: {e}")
+        return False


 async def get_embedding(version_id: str) -> dict[str, Any] | None:
@@ -206,31 +217,34 @@ async def get_content_embedding(

    New function for unified content embedding retrieval.
    Returns dict with contentType, contentId, embedding, timestamps or None if not found.
-
-    Raises exceptions on failure - caller should handle.
    """
-    result = await query_raw_with_schema(
-        """
-        SELECT
-            "contentType",
-            "contentId",
-            "userId",
-            "embedding"::text as "embedding",
-            "searchableText",
-            "metadata",
-            "createdAt",
-            "updatedAt"
-        FROM {schema_prefix}"UnifiedContentEmbedding"
-        WHERE "contentType" = $1::{schema_prefix}"ContentType" AND "contentId" = $2 AND ("userId" = $3 OR ($3 IS NULL AND "userId" IS NULL))
-        """,
-        content_type,
-        content_id,
-        user_id,
-    )
+    try:
+        result = await query_raw_with_schema(
+            """
+            SELECT
+                "contentType",
+                "contentId",
+                "userId",
+                "embedding"::text as "embedding",
+                "searchableText",
+                "metadata",
+                "createdAt",
+                "updatedAt"
+            FROM {schema_prefix}"UnifiedContentEmbedding"
+            WHERE "contentType" = $1::{schema_prefix}"ContentType" AND "contentId" = $2 AND ("userId" = $3 OR ($3 IS NULL AND "userId" IS NULL))
+            """,
+            content_type,
+            content_id,
+            user_id,
+        )

-    if result and len(result) > 0:
-        return result[0]
-    return None
+        if result and len(result) > 0:
+            return result[0]
+        return None
+
+    except Exception as e:
+        logger.error(f"Failed to get embedding for {content_type}:{content_id}: {e}")
+        return None


 async def ensure_embedding(
@@ -258,38 +272,46 @@ async def ensure_embedding(
        tx: Optional transaction client

    Returns:
-        True if embedding exists/was created
-
-    Raises exceptions on failure - caller should handle.
+        True if embedding exists/was created, False on failure
    """
-    # Check if embedding already exists
-    if not force:
-        existing = await get_embedding(version_id)
-        if existing and existing.get("embedding"):
-            logger.debug(f"Embedding for version {version_id} already exists")
-            return True
+    try:
+        # Check if embedding already exists
+        if not force:
+            existing = await get_embedding(version_id)
+            if existing and existing.get("embedding"):
+                logger.debug(f"Embedding for version {version_id} already exists")
+                return True

-    # Build searchable text for embedding
-    searchable_text = build_searchable_text(name, description, sub_heading, categories)
+        # Build searchable text for embedding
+        searchable_text = build_searchable_text(
+            name, description, sub_heading, categories
+        )

-    # Generate new embedding
-    embedding = await generate_embedding(searchable_text)
+        # Generate new embedding
+        embedding = await generate_embedding(searchable_text)
+        if embedding is None:
+            logger.warning(f"Could not generate embedding for version {version_id}")
+            return False

-    # Store the embedding with metadata using new function
-    metadata = {
-        "name": name,
-        "subHeading": sub_heading,
-        "categories": categories,
-    }
-    return await store_content_embedding(
-        content_type=ContentType.STORE_AGENT,
-        content_id=version_id,
-        embedding=embedding,
-        searchable_text=searchable_text,
-        metadata=metadata,
-        user_id=None,  # Store agents are public
-        tx=tx,
-    )
+        # Store the embedding with metadata using new function
+        metadata = {
+            "name": name,
+            "subHeading": sub_heading,
+            "categories": categories,
+        }
+        return await store_content_embedding(
+            content_type=ContentType.STORE_AGENT,
+            content_id=version_id,
+            embedding=embedding,
+            searchable_text=searchable_text,
+            metadata=metadata,
+            user_id=None,  # Store agents are public
+            tx=tx,
+        )
+
+    except Exception as e:
+        logger.error(f"Failed to ensure embedding for version {version_id}: {e}")
+        return False


 async def delete_embedding(version_id: str) -> bool:
@@ -454,7 +476,6 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
    total_processed = 0
    total_success = 0
    total_failed = 0
-    all_errors: dict[str, int] = {}  # Aggregate errors across all content types

    # Process content types in explicit order
    processing_order = [
@@ -500,13 +521,6 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
            success = sum(1 for result in results if result is True)
            failed = len(results) - success

-            # Aggregate errors across all content types
-            if failed > 0:
-                for result in results:
-                    if isinstance(result, Exception):
-                        error_key = f"{type(result).__name__}: {str(result)}"
-                        all_errors[error_key] = all_errors.get(error_key, 0) + 1
-
            results_by_type[content_type.value] = {
                "processed": len(missing_items),
                "success": success,
@@ -532,13 +546,6 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
                "error": str(e),
            }

-    # Log aggregated errors once at the end
-    if all_errors:
-        error_details = ", ".join(
-            f"{error} ({count}x)" for error, count in all_errors.items()
-        )
-        logger.error(f"Embedding backfill errors: {error_details}")
-
    return {
        "by_type": results_by_type,
        "totals": {
@@ -550,12 +557,11 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
    }


-async def embed_query(query: str) -> list[float]:
+async def embed_query(query: str) -> list[float] | None:
    """
    Generate embedding for a search query.

    Same as generate_embedding but with clearer intent.
-    Raises exceptions on failure - caller should handle.
    """
    return await generate_embedding(query)

@@ -588,30 +594,40 @@ async def ensure_content_embedding(
        tx: Optional transaction client

    Returns:
-        True if embedding exists/was created
-
-    Raises exceptions on failure - caller should handle.
+        True if embedding exists/was created, False on failure
    """
-    # Check if embedding already exists
-    if not force:
-        existing = await get_content_embedding(content_type, content_id, user_id)
-        if existing and existing.get("embedding"):
-            logger.debug(f"Embedding for {content_type}:{content_id} already exists")
-            return True
+    try:
+        # Check if embedding already exists
+        if not force:
+            existing = await get_content_embedding(content_type, content_id, user_id)
+            if existing and existing.get("embedding"):
+                logger.debug(
+                    f"Embedding for {content_type}:{content_id} already exists"
+                )
+                return True

-    # Generate new embedding
-    embedding = await generate_embedding(searchable_text)
+        # Generate new embedding
+        embedding = await generate_embedding(searchable_text)
+        if embedding is None:
+            logger.warning(
+                f"Could not generate embedding for {content_type}:{content_id}"
+            )
+            return False

-    # Store the embedding
-    return await store_content_embedding(
-        content_type=content_type,
-        content_id=content_id,
-        embedding=embedding,
-        searchable_text=searchable_text,
-        metadata=metadata or {},
-        user_id=user_id,
-        tx=tx,
-    )
+        # Store the embedding
+        return await store_content_embedding(
+            content_type=content_type,
+            content_id=content_id,
+            embedding=embedding,
+            searchable_text=searchable_text,
+            metadata=metadata or {},
+            user_id=user_id,
+            tx=tx,
+        )
+
+    except Exception as e:
+        logger.error(f"Failed to ensure embedding for {content_type}:{content_id}: {e}")
+        return False


 async def cleanup_orphaned_embeddings() -> dict[str, Any]:
@@ -662,7 +678,7 @@ async def cleanup_orphaned_embeddings() -> dict[str, Any]:
                )
                current_ids = {row["id"] for row in valid_agents}
            elif content_type == ContentType.BLOCK:
-                from backend.blocks import get_blocks
+                from backend.data.block import get_blocks

                current_ids = set(get_blocks().keys())
            elif content_type == ContentType.DOCUMENTATION:
@@ -838,8 +854,9 @@ async def semantic_search(
        limit = 100

    # Generate query embedding
-    try:
-        query_embedding = await embed_query(query)
+    query_embedding = await embed_query(query)
+
+    if query_embedding is not None:
        # Semantic search with embeddings
        embedding_str = embedding_to_vector_string(query_embedding)

@@ -890,21 +907,24 @@ async def semantic_search(
        """
        )

-        results = await query_raw_with_schema(sql, *params)
-        return [
-            {
-                "content_id": row["content_id"],
-                "content_type": row["content_type"],
-                "searchable_text": row["searchable_text"],
-                "metadata": row["metadata"],
-                "similarity": float(row["similarity"]),
-            }
-            for row in results
-        ]
-    except Exception as e:
-        logger.warning(f"Semantic search failed, falling back to lexical search: {e}")
+        try:
+            results = await query_raw_with_schema(sql, *params)
+            return [
+                {
+                    "content_id": row["content_id"],
+                    "content_type": row["content_type"],
+                    "searchable_text": row["searchable_text"],
+                    "metadata": row["metadata"],
+                    "similarity": float(row["similarity"]),
+                }
+                for row in results
+            ]
+        except Exception as e:
+            logger.error(f"Semantic search failed: {e}")
+            # Fall through to lexical search below

    # Fallback to lexical search if embeddings unavailable
+    logger.warning("Falling back to lexical search (embeddings unavailable)")

    params_lexical: list[Any] = [limit]
    user_filter = ""
--- a/autogpt_platform/backend/backend/api/features/store/embeddings_e2e_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/embeddings_e2e_test.py
@@ -454,9 +454,6 @@ async def test_unified_hybrid_search_pagination(
    cleanup_embeddings: list,
 ):
    """Test unified search pagination works correctly."""
-    # Use a unique search term to avoid matching other test data
-    unique_term = f"xyzpagtest{uuid.uuid4().hex[:8]}"
-
    # Create multiple items
    content_ids = []
    for i in range(5):
@@ -468,14 +465,14 @@ async def test_unified_hybrid_search_pagination(
            content_type=ContentType.BLOCK,
            content_id=content_id,
            embedding=mock_embedding,
-            searchable_text=f"{unique_term} item number {i}",
+            searchable_text=f"pagination test item number {i}",
            metadata={"index": i},
            user_id=None,
        )

    # Get first page
    page1_results, total1 = await unified_hybrid_search(
-        query=unique_term,
+        query="pagination test",
        content_types=[ContentType.BLOCK],
        page=1,
        page_size=2,
@@ -483,7 +480,7 @@ async def test_unified_hybrid_search_pagination(

    # Get second page
    page2_results, total2 = await unified_hybrid_search(
-        query=unique_term,
+        query="pagination test",
        content_types=[ContentType.BLOCK],
        page=2,
        page_size=2,
--- a/autogpt_platform/backend/backend/api/features/store/embeddings_schema_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/embeddings_schema_test.py
@@ -298,16 +298,17 @@ async def test_schema_handling_error_cases():
            mock_client.execute_raw.side_effect = Exception("Database error")
            mock_get_client.return_value = mock_client

-            # Should raise exception on error
-            with pytest.raises(Exception, match="Database error"):
-                await embeddings.store_content_embedding(
-                    content_type=ContentType.STORE_AGENT,
-                    content_id="test-id",
-                    embedding=[0.1] * EMBEDDING_DIM,
-                    searchable_text="test",
-                    metadata=None,
-                    user_id=None,
-                )
+            result = await embeddings.store_content_embedding(
+                content_type=ContentType.STORE_AGENT,
+                content_id="test-id",
+                embedding=[0.1] * EMBEDDING_DIM,
+                searchable_text="test",
+                metadata=None,
+                user_id=None,
+            )
+
+            # Should return False on error, not raise
+            assert result is False


 if __name__ == "__main__":
--- a/autogpt_platform/backend/backend/api/features/store/embeddings_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/embeddings_test.py
@@ -80,8 +80,9 @@ async def test_generate_embedding_no_api_key():
    ) as mock_get_client:
        mock_get_client.return_value = None

-        with pytest.raises(RuntimeError, match="openai_internal_api_key not set"):
-            await embeddings.generate_embedding("test text")
+        result = await embeddings.generate_embedding("test text")
+
+        assert result is None


@pytest.mark.asyncio(loop_scope="session")
@@ -96,8 +97,9 @@ async def test_generate_embedding_api_error():
    ) as mock_get_client:
        mock_get_client.return_value = mock_client

-        with pytest.raises(Exception, match="API Error"):
-            await embeddings.generate_embedding("test text")
+        result = await embeddings.generate_embedding("test text")
+
+        assert result is None


@pytest.mark.asyncio(loop_scope="session")
@@ -171,10 +173,11 @@ async def test_store_embedding_database_error(mocker):

    embedding = [0.1, 0.2, 0.3]

-    with pytest.raises(Exception, match="Database error"):
-        await embeddings.store_embedding(
-            version_id="test-version-id", embedding=embedding, tx=mock_client
-        )
+    result = await embeddings.store_embedding(
+        version_id="test-version-id", embedding=embedding, tx=mock_client
+    )
+
+    assert result is False


@pytest.mark.asyncio(loop_scope="session")
@@ -274,16 +277,17 @@ async def test_ensure_embedding_create_new(mock_get, mock_store, mock_generate):
 async def test_ensure_embedding_generation_fails(mock_get, mock_generate):
    """Test ensure_embedding when generation fails."""
    mock_get.return_value = None
-    mock_generate.side_effect = Exception("Generation failed")
+    mock_generate.return_value = None

-    with pytest.raises(Exception, match="Generation failed"):
-        await embeddings.ensure_embedding(
-            version_id="test-id",
-            name="Test",
-            description="Test description",
-            sub_heading="Test heading",
-            categories=["test"],
-        )
+    result = await embeddings.ensure_embedding(
+        version_id="test-id",
+        name="Test",
+        description="Test description",
+        sub_heading="Test heading",
+        categories=["test"],
+    )
+
+    assert result is False


@pytest.mark.asyncio(loop_scope="session")
--- a/autogpt_platform/backend/backend/api/features/store/hybrid_search.py
+++ b/autogpt_platform/backend/backend/api/features/store/hybrid_search.py
@@ -8,7 +8,6 @@ Includes BM25 reranking for improved lexical relevance.

 import logging
 import re
-import time
 from dataclasses import dataclass
 from typing import Any, Literal

@@ -187,12 +186,13 @@ async def unified_hybrid_search(

    offset = (page - 1) * page_size

-    # Generate query embedding with graceful degradation
-    try:
-        query_embedding = await embed_query(query)
-    except Exception as e:
+    # Generate query embedding
+    query_embedding = await embed_query(query)
+
+    # Graceful degradation if embedding unavailable
+    if query_embedding is None or not query_embedding:
        logger.warning(
-            f"Failed to generate query embedding - falling back to lexical-only search: {e}. "
+            "Failed to generate query embedding - falling back to lexical-only search. "
            "Check that openai_internal_api_key is configured and OpenAI API is accessible."
        )
        query_embedding = [0.0] * EMBEDDING_DIM
@@ -363,11 +363,7 @@ async def unified_hybrid_search(
        LIMIT {limit_param} OFFSET {offset_param}
    """

-    try:
-        results = await query_raw_with_schema(sql_query, *params)
-    except Exception as e:
-        await _log_vector_error_diagnostics(e)
-        raise
+    results = await query_raw_with_schema(sql_query, *params)

    total = results[0]["total_count"] if results else 0
    # Apply BM25 reranking
@@ -468,12 +464,13 @@ async def hybrid_search(

    offset = (page - 1) * page_size

-    # Generate query embedding with graceful degradation
-    try:
-        query_embedding = await embed_query(query)
-    except Exception as e:
+    # Generate query embedding
+    query_embedding = await embed_query(query)
+
+    # Graceful degradation
+    if query_embedding is None or not query_embedding:
        logger.warning(
-            f"Failed to generate query embedding - falling back to lexical-only search: {e}"
+            "Failed to generate query embedding - falling back to lexical-only search."
        )
        query_embedding = [0.0] * EMBEDDING_DIM
        total_non_semantic = (
@@ -605,7 +602,6 @@ async def hybrid_search(
                sa.featured,
                sa.is_available,
                sa.updated_at,
-                sa."agentGraphId",
                -- Searchable text for BM25 reranking
                COALESCE(sa.agent_name, '') || ' ' || COALESCE(sa.sub_heading, '') || ' ' || COALESCE(sa.description, '') as searchable_text,
                -- Semantic score
@@ -665,7 +661,6 @@ async def hybrid_search(
                featured,
                is_available,
                updated_at,
-                "agentGraphId",
                searchable_text,
                semantic_score,
                lexical_score,
@@ -691,11 +686,7 @@ async def hybrid_search(
        LIMIT {limit_param} OFFSET {offset_param}
    """

-    try:
-        results = await query_raw_with_schema(sql_query, *params)
-    except Exception as e:
-        await _log_vector_error_diagnostics(e)
-        raise
+    results = await query_raw_with_schema(sql_query, *params)

    total = results[0]["total_count"] if results else 0

@@ -727,87 +718,6 @@ async def hybrid_search_simple(
    return await hybrid_search(query=query, page=page, page_size=page_size)


-# ============================================================================
-# Diagnostics
-# ============================================================================
-
-# Rate limit: only log vector error diagnostics once per this interval
-_VECTOR_DIAG_INTERVAL_SECONDS = 60
-_last_vector_diag_time: float = 0
-
-
-async def _log_vector_error_diagnostics(error: Exception) -> None:
-    """Log diagnostic info when 'type vector does not exist' error occurs.
-
-    Note: Diagnostic queries use query_raw_with_schema which may run on a different
-    pooled connection than the one that failed. Session-level search_path can differ,
-    so these diagnostics show cluster-wide state, not necessarily the failed session.
-
-    Includes rate limiting to avoid log spam - only logs once per minute.
-    Caller should re-raise the error after calling this function.
-    """
-    global _last_vector_diag_time
-
-    # Check if this is the vector type error
-    error_str = str(error).lower()
-    if not (
-        "type" in error_str and "vector" in error_str and "does not exist" in error_str
-    ):
-        return
-
-    # Rate limit: only log once per interval
-    now = time.time()
-    if now - _last_vector_diag_time < _VECTOR_DIAG_INTERVAL_SECONDS:
-        return
-    _last_vector_diag_time = now
-
-    try:
-        diagnostics: dict[str, object] = {}
-
-        try:
-            search_path_result = await query_raw_with_schema("SHOW search_path")
-            diagnostics["search_path"] = search_path_result
-        except Exception as e:
-            diagnostics["search_path"] = f"Error: {e}"
-
-        try:
-            schema_result = await query_raw_with_schema("SELECT current_schema()")
-            diagnostics["current_schema"] = schema_result
-        except Exception as e:
-            diagnostics["current_schema"] = f"Error: {e}"
-
-        try:
-            user_result = await query_raw_with_schema(
-                "SELECT current_user, session_user, current_database()"
-            )
-            diagnostics["user_info"] = user_result
-        except Exception as e:
-            diagnostics["user_info"] = f"Error: {e}"
-
-        try:
-            # Check pgvector extension installation (cluster-wide, stable info)
-            ext_result = await query_raw_with_schema(
-                "SELECT extname, extversion, nspname as schema "
-                "FROM pg_extension e "
-                "JOIN pg_namespace n ON e.extnamespace = n.oid "
-                "WHERE extname = 'vector'"
-            )
-            diagnostics["pgvector_extension"] = ext_result
-        except Exception as e:
-            diagnostics["pgvector_extension"] = f"Error: {e}"
-
-        logger.error(
-            f"Vector type error diagnostics:\n"
-            f"  Error: {error}\n"
-            f"  search_path: {diagnostics.get('search_path')}\n"
-            f"  current_schema: {diagnostics.get('current_schema')}\n"
-            f"  user_info: {diagnostics.get('user_info')}\n"
-            f"  pgvector_extension: {diagnostics.get('pgvector_extension')}"
-        )
-    except Exception as diag_error:
-        logger.error(f"Failed to collect vector error diagnostics: {diag_error}")
-
-
 # Backward compatibility alias - HybridSearchWeights maps to StoreAgentSearchWeights
 # for existing code that expects the popularity parameter
 HybridSearchWeights = StoreAgentSearchWeights
--- a/autogpt_platform/backend/backend/api/features/store/hybrid_search_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/hybrid_search_test.py
@@ -172,8 +172,8 @@ async def test_hybrid_search_without_embeddings():
        with patch(
            "backend.api.features.store.hybrid_search.query_raw_with_schema"
        ) as mock_query:
-            # Simulate embedding failure by raising exception
-            mock_embed.side_effect = Exception("Embedding generation failed")
+            # Simulate embedding failure
+            mock_embed.return_value = None
            mock_query.return_value = mock_results

            # Should NOT raise - graceful degradation
@@ -613,9 +613,7 @@ async def test_unified_hybrid_search_graceful_degradation():
            "backend.api.features.store.hybrid_search.embed_query"
        ) as mock_embed:
            mock_query.return_value = mock_results
-            mock_embed.side_effect = Exception(
-                "Embedding generation failed"
-            )  # Embedding failure
+            mock_embed.return_value = None  # Embedding failure

            # Should NOT raise - graceful degradation
            results, total = await unified_hybrid_search(
--- a/autogpt_platform/backend/backend/api/features/store/image_gen.py
+++ b/autogpt_platform/backend/backend/api/features/store/image_gen.py
@@ -7,7 +7,16 @@ from replicate.client import Client as ReplicateClient
 from replicate.exceptions import ReplicateError
 from replicate.helpers import FileOutput

-from backend.data.graph import GraphBaseMeta
+from backend.blocks.ideogram import (
+    AspectRatio,
+    ColorPalettePreset,
+    IdeogramModelBlock,
+    IdeogramModelName,
+    MagicPromptOption,
+    StyleType,
+    UpscaleOption,
+)
+from backend.data.graph import BaseGraph
 from backend.data.model import CredentialsMetaInput, ProviderName
 from backend.integrations.credentials_store import ideogram_credentials
 from backend.util.request import Requests
@@ -25,14 +34,14 @@ class ImageStyle(str, Enum):
    DIGITAL_ART = "digital art"


-async def generate_agent_image(agent: GraphBaseMeta | AgentGraph) -> io.BytesIO:
+async def generate_agent_image(agent: BaseGraph | AgentGraph) -> io.BytesIO:
    if settings.config.use_agent_image_generation_v2:
        return await generate_agent_image_v2(graph=agent)
    else:
        return await generate_agent_image_v1(agent=agent)


-async def generate_agent_image_v2(graph: GraphBaseMeta | AgentGraph) -> io.BytesIO:
+async def generate_agent_image_v2(graph: BaseGraph | AgentGraph) -> io.BytesIO:
    """
    Generate an image for an agent using Ideogram model.
    Returns:
@@ -41,31 +50,18 @@ async def generate_agent_image_v2(graph: GraphBaseMeta | AgentGraph) -> io.Bytes
    if not ideogram_credentials.api_key:
        raise ValueError("Missing Ideogram API key")

-    from backend.blocks.ideogram import (
-        AspectRatio,
-        ColorPalettePreset,
-        IdeogramModelBlock,
-        IdeogramModelName,
-        MagicPromptOption,
-        StyleType,
-        UpscaleOption,
-    )
-
    name = graph.name
    description = f"{name} ({graph.description})" if graph.description else name

    prompt = (
-        "Create a visually striking retro-futuristic vector pop art illustration "
-        f'prominently featuring "{name}" in bold typography. The image clearly and '
-        f"literally depicts a {description}, along with recognizable objects directly "
-        f"associated with the primary function of a {name}. "
-        f"Ensure the imagery is concrete, intuitive, and immediately understandable, "
-        f"clearly conveying the purpose of a {name}. "
-        "Maintain vibrant, limited-palette colors, sharp vector lines, "
-        "geometric shapes, flat illustration techniques, and solid colors "
-        "without gradients or shading. Preserve a retro-futuristic aesthetic "
-        "influenced by mid-century futurism and 1960s psychedelia, "
-        "prioritizing clear visual storytelling and thematic clarity above all else."
+        f"Create a visually striking retro-futuristic vector pop art illustration prominently featuring "
+        f'"{name}" in bold typography. The image clearly and literally depicts a {description}, '
+        f"along with recognizable objects directly associated with the primary function of a {name}. "
+        f"Ensure the imagery is concrete, intuitive, and immediately understandable, clearly conveying the "
+        f"purpose of a {name}. Maintain vibrant, limited-palette colors, sharp vector lines, geometric "
+        f"shapes, flat illustration techniques, and solid colors without gradients or shading. Preserve a "
+        f"retro-futuristic aesthetic influenced by mid-century futurism and 1960s psychedelia, "
+        f"prioritizing clear visual storytelling and thematic clarity above all else."
    )

    custom_colors = [
@@ -103,12 +99,12 @@ async def generate_agent_image_v2(graph: GraphBaseMeta | AgentGraph) -> io.Bytes
    return io.BytesIO(response.content)


-async def generate_agent_image_v1(agent: GraphBaseMeta | AgentGraph) -> io.BytesIO:
+async def generate_agent_image_v1(agent: BaseGraph | AgentGraph) -> io.BytesIO:
    """
    Generate an image for an agent using Flux model via Replicate API.

    Args:
-        agent (GraphBaseMeta | AgentGraph): The agent to generate an image for
+        agent (Graph): The agent to generate an image for

    Returns:
        io.BytesIO: The generated image as bytes
@@ -118,13 +114,7 @@ async def generate_agent_image_v1(agent: GraphBaseMeta | AgentGraph) -> io.Bytes
            raise ValueError("Missing Replicate API key in settings")

        # Construct prompt from agent details
-        prompt = (
-            "Create a visually engaging app store thumbnail for the AI agent "
-            "that highlights what it does in a clear and captivating way:\n"
-            f"- **Name**: {agent.name}\n"
-            f"- **Description**: {agent.description}\n"
-            f"Focus on showcasing its core functionality with an appealing design."
-        )
+        prompt = f"Create a visually engaging app store thumbnail for the AI agent that highlights what it does in a clear and captivating way:\n- **Name**: {agent.name}\n- **Description**: {agent.description}\nFocus on showcasing its core functionality with an appealing design."

        # Set up Replicate client
        client = ReplicateClient(api_token=settings.secrets.replicate_api_key)
--- a/autogpt_platform/backend/backend/api/features/store/model.py
+++ b/autogpt_platform/backend/backend/api/features/store/model.py
@@ -38,7 +38,6 @@ class StoreAgent(pydantic.BaseModel):
    description: str
    runs: int
    rating: float
-    agent_graph_id: str


 class StoreAgentsResponse(pydantic.BaseModel):
--- a/autogpt_platform/backend/backend/api/features/store/model_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/model_test.py
@@ -26,13 +26,11 @@ def test_store_agent():
        description="Test description",
        runs=50,
        rating=4.5,
-        agent_graph_id="test-graph-id",
    )
    assert agent.slug == "test-agent"
    assert agent.agent_name == "Test Agent"
    assert agent.runs == 50
    assert agent.rating == 4.5
-    assert agent.agent_graph_id == "test-graph-id"


 def test_store_agents_response():
@@ -48,7 +46,6 @@ def test_store_agents_response():
                description="Test description",
                runs=50,
                rating=4.5,
-                agent_graph_id="test-graph-id",
            )
        ],
        pagination=store_model.Pagination(
--- a/autogpt_platform/backend/backend/api/features/store/routes.py
+++ b/autogpt_platform/backend/backend/api/features/store/routes.py
@@ -278,7 +278,7 @@ async def get_agent(
 )
 async def get_graph_meta_by_store_listing_version_id(
    store_listing_version_id: str,
-) -> backend.data.graph.GraphModelWithoutNodes:
+) -> backend.data.graph.GraphMeta:
    """
    Get Agent Graph from Store Listing Version ID.
    """
--- a/autogpt_platform/backend/backend/api/features/store/routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/store/routes_test.py
@@ -82,7 +82,6 @@ def test_get_agents_featured(
                description="Featured agent description",
                runs=100,
                rating=4.5,
-                agent_graph_id="test-graph-1",
            )
        ],
        pagination=store_model.Pagination(
@@ -128,7 +127,6 @@ def test_get_agents_by_creator(
                description="Creator agent description",
                runs=50,
                rating=4.0,
-                agent_graph_id="test-graph-2",
            )
        ],
        pagination=store_model.Pagination(
@@ -174,7 +172,6 @@ def test_get_agents_sorted(
                description="Top agent description",
                runs=1000,
                rating=5.0,
-                agent_graph_id="test-graph-3",
            )
        ],
        pagination=store_model.Pagination(
@@ -220,7 +217,6 @@ def test_get_agents_search(
                description="Specific search term description",
                runs=75,
                rating=4.2,
-                agent_graph_id="test-graph-search",
            )
        ],
        pagination=store_model.Pagination(
@@ -266,7 +262,6 @@ def test_get_agents_category(
                description="Category agent description",
                runs=60,
                rating=4.1,
-                agent_graph_id="test-graph-category",
            )
        ],
        pagination=store_model.Pagination(
@@ -311,7 +306,6 @@ def test_get_agents_pagination(
                description=f"Agent {i} description",
                runs=i * 10,
                rating=4.0,
-                agent_graph_id="test-graph-2",
            )
            for i in range(5)
        ],
--- a/autogpt_platform/backend/backend/api/features/store/test_cache_delete.py
+++ b/autogpt_platform/backend/backend/api/features/store/test_cache_delete.py
@@ -33,7 +33,6 @@ class TestCacheDeletion:
                    description="Test description",
                    runs=100,
                    rating=4.5,
-                    agent_graph_id="test-graph-id",
                )
            ],
            pagination=Pagination(
--- a/Show More
+++ b/Show More
Author	SHA1	Message	Date
Toran Bruce Richards	b84bc469c9	docs: Add changelog section to platform docs (#12077 ) ## Summary - Adds a new Changelog section to the platform docs sidebar navigation - Creates the first bi-weekly changelog entry covering January 29 – February 11, 2026 - Establishes the directory structure (`docs/platform/changelog/`) for future entries ### What's included in the changelog entry The entry covers voice input, persistent file workspace, Claude Opus 4.6 upgrade, marketplace agent customization, rebuilt chat streaming (Vercel AI SDK migration), reconnection during long tasks, agent creation UX improvements, homepage redesign, and agent library reuse — plus collapsible sections for smaller improvements and fixes. ### How to add future entries 1. Create a new markdown file in `docs/platform/changelog/` (e.g., `february-12-february-25-2026.md`) 2. Add a link to `docs/platform/SUMMARY.md` under the Changelog section ## Test plan - [ ] Verify the Changelog section appears in the GitBook sidebar after merge - [ ] Verify the changelog entry renders correctly (headings, table, collapsible sections) - [ ] Verify navigation between the changelog index and the entry works 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- greptile_comment --> <h2>Greptile Overview</h2> <details><summary><h3>Greptile Summary</h3></summary> Adds a new Changelog section to the platform documentation with the first bi-weekly entry (January 29 – February 11, 2026). The changelog documents major platform updates including voice input, persistent file workspace, Claude Opus 4.6 upgrade, marketplace agent customization, rebuilt chat streaming, reconnection during long tasks, agent creation UX improvements, homepage redesign, and agent library reuse. Key changes: - Created `docs/platform/changelog/` directory with a landing page (`README.md`) and the first entry - Added "Changelog" section to `SUMMARY.md` navigation with proper GitBook structure - Used collapsible `<details>` sections for minor improvements and fixes to keep the changelog scannable - Follows existing documentation patterns with clear headings, GitBook hint blocks, and markdown tables The structure allows for easy addition of future entries by creating new markdown files and linking them in `SUMMARY.md`. </details> <details><summary><h3>Confidence Score: 5/5</h3></summary> - This PR is safe to merge with minimal risk - it only adds documentation files. - The PR introduces only new documentation files with no code changes. The changelog structure follows GitBook conventions correctly, uses proper markdown formatting, and integrates cleanly into the existing documentation navigation. All files are well-formatted with clear headings, proper use of GitBook hint blocks, and collapsible sections for organization. - No files require special attention </details> <!-- greptile_other_comments_section --> <!-- /greptile_comment --> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 18:26:57 +00:00
Nicholas Tindle	fdb7ff8111	docs(blocks): complete block documentation migration cleanup Move remaining block docs to block-integrations/ subdirectory: - Delete old docs from docs/integrations/ root - Add new docs under docs/integrations/block-integrations/ - Add guides/ directory with LLM and voice provider docs - Update SUMMARY.md with correct navigation structure Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-22 14:18:10 -06:00
Nicholas Tindle	0e42efb7d5	docs(blocks): migrate block documentation to docs/integrations Restructure block documentation for GitBook integration: - Move all block docs from docs/platform/blocks to docs/integrations/block-integrations - Add generate_block_docs.py script to auto-generate documentation from block schemas - Generate README.md with overview table of all 463 blocks - Generate SUMMARY.md for GitBook navigation - Add guides/ directory for LLM and voice provider documentation - Fix bug in generate_block_docs.py where summary_path was incorrectly passed instead of summary_content Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-22 14:17:18 -06:00
bobby.gaffin	f2d82d8802	GITBOOK-73: No subject	2026-01-22 18:59:20 +00:00
Nicholas Tindle	446c71fec8	Merge branch 'dev' into gitbook	2026-01-15 12:59:51 -07:00
claude[bot]	ec4c2caa14	Merge remote-tracking branch 'origin/dev' into gitbook	2026-01-12 21:45:54 +00:00
Nicholas Tindle	516e8b4b25	fix: move files to the right places	2026-01-12 13:46:56 -06:00
Nicholas Tindle	e7e118b5a8	wip: fixes	2026-01-09 10:23:31 -07:00
Nicholas Tindle	92a7a7e6d6	wip: fixes	2026-01-09 10:21:06 -07:00
Nicholas Tindle	e16995347f	Refactor/gitbook platform structure (#11739 ) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-09 11:17:32 -06:00
Nicholas Tindle	234d3acb4c	refactor(docs): restructure platform docs for GitBook and remove MkDocs (#11738 ) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-09 11:09:17 -06:00