docs: Add workspace and media file architecture documentation (#11989 )

### Changes 🏗️ - Added comprehensive architecture documentation at `docs/platform/workspace-media-architecture.md` covering: - Database models (`UserWorkspace`, `UserWorkspaceFile`) - `WorkspaceManager` API with session scoping - `store_media_file()` media normalization pipeline (input types, return formats) - Virus scanning responsibility boundaries - Decision tree for choosing `WorkspaceManager` vs `store_media_file()` - Configuration reference including `clamav_max_concurrency` and `clamav_mark_failed_scans_as_clean` - Common patterns with error handling examples - Updated `autogpt_platform/backend/CLAUDE.md` with a "Workspace & Media Files" section referencing the new docs - Removed duplicate `scan_content_safe()` call from `WriteWorkspaceFileTool` — `WorkspaceManager.write_file()` already scans internally, so the tool was double-scanning every file - Replaced removed comment in `workspace.py` with explicit ownership comment clarifying that `WorkspaceManager` is the single scanning boundary ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified `scan_content_safe()` is called inside `WorkspaceManager.write_file()` (workspace.py:186) - [x] Verified `store_media_file()` scans all input branches including local paths (file.py:351) - [x] Verified documentation accuracy against current source code after merge with dev - [x] CI checks all passing  --- > [!NOTE] > **Low Risk** > Mostly adds documentation and internal developer guidance; the only code change is a comment clarifying `WorkspaceManager.write_file()` as the single virus-scanning boundary, with no behavior change. > > **Overview** > Adds a new `docs/platform/workspace-media-architecture.md` describing the Workspace storage layer vs the `store_media_file()` media pipeline, including session scoping and virus-scanning/persistence responsibility boundaries. > > Updates backend `CLAUDE.md` to point contributors to the new doc when working on CoPilot uploads/downloads or `WorkspaceManager`/`store_media_file()`, and clarifies in `WorkspaceManager.write_file()` (comment-only) that callers should not duplicate virus scanning. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 18fcfa03f8. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>  --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev
2026-03-17 03:00:27 -04:00 · 2026-03-17 06:12:26 +00:00 · 2026-03-17 13:13:42 +07:00 · 2026-03-17 05:24:55 +00:00 · 2026-03-16 23:08:18 +00:00 · 2026-03-16 22:31:21 +00:00
39 changed files with 4522 additions and 969 deletions
--- a/.github/workflows/platform-backend-ci.yml
+++ b/.github/workflows/platform-backend-ci.yml
@@ -5,12 +5,14 @@ on:
    branches: [master, dev, ci-test*]
    paths:
      - ".github/workflows/platform-backend-ci.yml"
+      - ".github/workflows/scripts/get_package_version_from_lockfile.py"
      - "autogpt_platform/backend/**"
      - "autogpt_platform/autogpt_libs/**"
  pull_request:
    branches: [master, dev, release-*]
    paths:
      - ".github/workflows/platform-backend-ci.yml"
+      - ".github/workflows/scripts/get_package_version_from_lockfile.py"
      - "autogpt_platform/backend/**"
      - "autogpt_platform/autogpt_libs/**"
  merge_group:
--- a/.github/workflows/platform-frontend-ci.yml
+++ b/.github/workflows/platform-frontend-ci.yml
@@ -120,175 +120,6 @@ jobs:
          token: ${{ secrets.GITHUB_TOKEN }}
          exitOnceUploaded: true

-  e2e_test:
-    name: end-to-end tests
-    runs-on: big-boi
-
-    steps:
-      - name: Checkout repository
-        uses: actions/checkout@v6
-        with:
-          submodules: recursive
-
-      - name: Set up Platform - Copy default supabase .env
-        run: |
-          cp ../.env.default ../.env
-
-      - name: Set up Platform - Copy backend .env and set OpenAI API key
-        run: |
-          cp ../backend/.env.default ../backend/.env
-          echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
-        env:
-          # Used by E2E test data script to generate embeddings for approved store agents
-          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
-
-      - name: Set up Platform - Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3
-        with:
-          driver: docker-container
-          driver-opts: network=host
-
-      - name: Set up Platform - Expose GHA cache to docker buildx CLI
-        uses: crazy-max/ghaction-github-runtime@v4
-
-      - name: Set up Platform - Build Docker images (with cache)
-        working-directory: autogpt_platform
-        run: |
-          pip install pyyaml
-
-          # Resolve extends and generate a flat compose file that bake can understand
-          docker compose -f docker-compose.yml config > docker-compose.resolved.yml
-
-          # Add cache configuration to the resolved compose file
-          python ../.github/workflows/scripts/docker-ci-fix-compose-build-cache.py \
-            --source docker-compose.resolved.yml \
-            --cache-from "type=gha" \
-            --cache-to "type=gha,mode=max" \
-            --backend-hash "${{ hashFiles('autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/poetry.lock', 'autogpt_platform/backend/backend') }}" \
-            --frontend-hash "${{ hashFiles('autogpt_platform/frontend/Dockerfile', 'autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/src') }}" \
-            --git-ref "${{ github.ref }}"
-
-          # Build with bake using the resolved compose file (now includes cache config)
-          docker buildx bake --allow=fs.read=.. -f docker-compose.resolved.yml --load
-        env:
-          NEXT_PUBLIC_PW_TEST: true
-
-      - name: Set up tests - Cache E2E test data
-        id: e2e-data-cache
-        uses: actions/cache@v5
-        with:
-          path: /tmp/e2e_test_data.sql
-          key: e2e-test-data-${{ hashFiles('autogpt_platform/backend/test/e2e_test_data.py', 'autogpt_platform/backend/migrations/**', '.github/workflows/platform-frontend-ci.yml') }}
-
-      - name: Set up Platform - Start Supabase DB + Auth
-        run: |
-          docker compose -f ../docker-compose.resolved.yml up -d db auth --no-build
-          echo "Waiting for database to be ready..."
-          timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done'
-          echo "Waiting for auth service to be ready..."
-          timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -c "SELECT 1 FROM auth.users LIMIT 1" 2>/dev/null; do sleep 2; done' || echo "Auth schema check timeout, continuing..."
-
-      - name: Set up Platform - Run migrations
-        run: |
-          echo "Running migrations..."
-          docker compose -f ../docker-compose.resolved.yml run --rm migrate
-          echo "✅ Migrations completed"
-        env:
-          NEXT_PUBLIC_PW_TEST: true
-
-      - name: Set up tests - Load cached E2E test data
-        if: steps.e2e-data-cache.outputs.cache-hit == 'true'
-        run: |
-          echo "✅ Found cached E2E test data, restoring..."
-          {
-            echo "SET session_replication_role = 'replica';"
-            cat /tmp/e2e_test_data.sql
-            echo "SET session_replication_role = 'origin';"
-          } | docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -b
-          # Refresh materialized views after restore
-          docker compose -f ../docker-compose.resolved.yml exec -T db \
-            psql -U postgres -d postgres -b -c "SET search_path TO platform; SELECT refresh_store_materialized_views();" || true
-
-          echo "✅ E2E test data restored from cache"
-
-      - name: Set up Platform - Start (all other services)
-        run: |
-          docker compose -f ../docker-compose.resolved.yml up -d --no-build
-          echo "Waiting for rest_server to be ready..."
-          timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
-        env:
-          NEXT_PUBLIC_PW_TEST: true
-
-      - name: Set up tests - Create E2E test data
-        if: steps.e2e-data-cache.outputs.cache-hit != 'true'
-        run: |
-          echo "Creating E2E test data..."
-          docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.resolved.yml ps -q rest_server):/tmp/e2e_test_data.py
-          docker compose -f ../docker-compose.resolved.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
-            echo "❌ E2E test data creation failed!"
-            docker compose -f ../docker-compose.resolved.yml logs --tail=50 rest_server
-            exit 1
-          }
-
-          # Dump auth.users + platform schema for cache (two separate dumps)
-          echo "Dumping database for cache..."
-          {
-            docker compose -f ../docker-compose.resolved.yml exec -T db \
-              pg_dump -U postgres --data-only --column-inserts \
-              --table='auth.users' postgres
-            docker compose -f ../docker-compose.resolved.yml exec -T db \
-              pg_dump -U postgres --data-only --column-inserts \
-              --schema=platform \
-              --exclude-table='platform._prisma_migrations' \
-              --exclude-table='platform.apscheduler_jobs' \
-              --exclude-table='platform.apscheduler_jobs_batched_notifications' \
-              postgres
-          } > /tmp/e2e_test_data.sql
-
-          echo "✅ Database dump created for caching ($(wc -l < /tmp/e2e_test_data.sql) lines)"
-
-      - name: Set up tests - Enable corepack
-        run: corepack enable
-
-      - name: Set up tests - Set up Node
-        uses: actions/setup-node@v6
-        with:
-          node-version: "22.18.0"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
-
-      - name: Set up tests - Install dependencies
-        run: pnpm install --frozen-lockfile
-
-      - name: Set up tests - Install browser 'chromium'
-        run: pnpm playwright install --with-deps chromium
-
-      - name: Run Playwright tests
-        run: pnpm test:no-build
-        continue-on-error: false
-
-      - name: Upload Playwright report
-        if: always()
-        uses: actions/upload-artifact@v4
-        with:
-          name: playwright-report
-          path: playwright-report
-          if-no-files-found: ignore
-          retention-days: 3
-
-      - name: Upload Playwright test results
-        if: always()
-        uses: actions/upload-artifact@v4
-        with:
-          name: playwright-test-results
-          path: test-results
-          if-no-files-found: ignore
-          retention-days: 3
-
-      - name: Print Final Docker Compose logs
-        if: always()
-        run: docker compose -f ../docker-compose.resolved.yml logs
-
  integration_test:
    runs-on: ubuntu-latest
    needs: setup
--- a/.github/workflows/platform-fullstack-ci.yml
+++ b/.github/workflows/platform-fullstack-ci.yml
@@ -1,14 +1,18 @@
-name: AutoGPT Platform - Frontend CI
+name: AutoGPT Platform - Full-stack CI

 on:
  push:
    branches: [master, dev]
    paths:
      - ".github/workflows/platform-fullstack-ci.yml"
+      - ".github/workflows/scripts/docker-ci-fix-compose-build-cache.py"
+      - ".github/workflows/scripts/get_package_version_from_lockfile.py"
      - "autogpt_platform/**"
  pull_request:
    paths:
      - ".github/workflows/platform-fullstack-ci.yml"
+      - ".github/workflows/scripts/docker-ci-fix-compose-build-cache.py"
+      - ".github/workflows/scripts/get_package_version_from_lockfile.py"
      - "autogpt_platform/**"
  merge_group:

@@ -24,42 +28,28 @@ defaults:
 jobs:
  setup:
    runs-on: ubuntu-latest
-    outputs:
-      cache-key: ${{ steps.cache-key.outputs.key }}

    steps:
      - name: Checkout repository
        uses: actions/checkout@v6

-      - name: Set up Node.js
-        uses: actions/setup-node@v6
-        with:
-          node-version: "22.18.0"
-
      - name: Enable corepack
        run: corepack enable

-      - name: Generate cache key
-        id: cache-key
-        run: echo "key=${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}" >> $GITHUB_OUTPUT
-
-      - name: Cache dependencies
-        uses: actions/cache@v5
+      - name: Set up Node
+        uses: actions/setup-node@v6
        with:
-          path: ~/.pnpm-store
-          key: ${{ steps.cache-key.outputs.key }}
-          restore-keys: |
-            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
-            ${{ runner.os }}-pnpm-
+          node-version: "22.18.0"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml

-      - name: Install dependencies
+      - name: Install dependencies to populate cache
        run: pnpm install --frozen-lockfile

-  types:
-    runs-on: big-boi
+  check-api-types:
+    name: check API types
+    runs-on: ubuntu-latest
    needs: setup
-    strategy:
-      fail-fast: false

    steps:
      - name: Checkout repository
@@ -67,70 +57,256 @@ jobs:
        with:
          submodules: recursive

-      - name: Set up Node.js
+      # ------------------------ Backend setup ------------------------
+
+      - name: Set up Backend - Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Set up Backend - Install Poetry
+        working-directory: autogpt_platform/backend
+        run: |
+          POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
+          echo "Installing Poetry version ${POETRY_VERSION}"
+          curl -sSL https://install.python-poetry.org | POETRY_VERSION=$POETRY_VERSION python3 -
+
+      - name: Set up Backend - Set up dependency cache
+        uses: actions/cache@v5
+        with:
+          path: ~/.cache/pypoetry
+          key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
+
+      - name: Set up Backend - Install dependencies
+        working-directory: autogpt_platform/backend
+        run: poetry install
+
+      - name: Set up Backend - Generate Prisma client
+        working-directory: autogpt_platform/backend
+        run: poetry run prisma generate && poetry run gen-prisma-stub
+
+      - name: Set up Frontend - Export OpenAPI schema from Backend
+        working-directory: autogpt_platform/backend
+        run: poetry run export-api-schema --output ../frontend/src/app/api/openapi.json
+
+      # ------------------------ Frontend setup ------------------------
+
+      - name: Set up Frontend - Enable corepack
+        run: corepack enable
+
+      - name: Set up Frontend - Set up Node
        uses: actions/setup-node@v6
        with:
          node-version: "22.18.0"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml

-      - name: Enable corepack
-        run: corepack enable
-
-      - name: Copy default supabase .env
-        run: |
-          cp ../.env.default ../.env
-
-      - name: Copy backend .env
-        run: |
-          cp ../backend/.env.default ../backend/.env
-
-      - name: Run docker compose
-        run: |
-          docker compose -f ../docker-compose.yml --profile local up -d deps_backend
-
-      - name: Restore dependencies cache
-        uses: actions/cache@v5
-        with:
-          path: ~/.pnpm-store
-          key: ${{ needs.setup.outputs.cache-key }}
-          restore-keys: |
-            ${{ runner.os }}-pnpm-
-
-      - name: Install dependencies
+      - name: Set up Frontend - Install dependencies
        run: pnpm install --frozen-lockfile

-      - name: Setup .env
-        run: cp .env.default .env
-
-      - name: Wait for services to be ready
-        run: |
-          echo "Waiting for rest_server to be ready..."
-          timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
-          echo "Waiting for database to be ready..."
-          timeout 60 sh -c 'until docker compose -f ../docker-compose.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done' || echo "Database ready check timeout, continuing..."
-
-      - name: Generate API queries
-        run: pnpm generate:api:force
+      - name: Set up Frontend - Format OpenAPI schema
+        id: format-schema
+        run: pnpm prettier --write ./src/app/api/openapi.json

      - name: Check for API schema changes
        run: |
          if ! git diff --exit-code src/app/api/openapi.json; then
            echo "❌ API schema changes detected in src/app/api/openapi.json"
            echo ""
-            echo "The openapi.json file has been modified after running 'pnpm generate:api-all'."
+            echo "The openapi.json file has been modified after exporting the API schema."
            echo "This usually means changes have been made in the BE endpoints without updating the Frontend."
            echo "The API schema is now out of sync with the Front-end queries."
            echo ""
            echo "To fix this:"
-            echo "1. Pull the backend 'docker compose pull && docker compose up -d --build --force-recreate'"
-            echo "2. Run 'pnpm generate:api' locally"
-            echo "3. Run 'pnpm types' locally"
-            echo "4. Fix any TypeScript errors that may have been introduced"
-            echo "5. Commit and push your changes"
+            echo "\nIn the backend directory:"
+            echo "1. Run 'poetry run export-api-schema --output ../frontend/src/app/api/openapi.json'"
+            echo "\nIn the frontend directory:"
+            echo "2. Run 'pnpm prettier --write src/app/api/openapi.json'"
+            echo "3. Run 'pnpm generate:api'"
+            echo "4. Run 'pnpm types'"
+            echo "5. Fix any TypeScript errors that may have been introduced"
+            echo "6. Commit and push your changes"
            echo ""
            exit 1
          else
            echo "✅ No API schema changes detected"
          fi

-      - name: Run Typescript checks
+      - name: Set up Frontend - Generate API client
+        id: generate-api-client
+        run: pnpm orval --config ./orval.config.ts
+        # Continue with type generation & check even if there are schema changes
+        if: success() || (steps.format-schema.outcome == 'success')
+
+      - name: Check for TypeScript errors
        run: pnpm types
+        if: success() || (steps.generate-api-client.outcome == 'success')
+
+  e2e_test:
+    name: end-to-end tests
+    runs-on: big-boi
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v6
+        with:
+          submodules: recursive
+
+      - name: Set up Platform - Copy default supabase .env
+        run: |
+          cp ../.env.default ../.env
+
+      - name: Set up Platform - Copy backend .env and set OpenAI API key
+        run: |
+          cp ../backend/.env.default ../backend/.env
+          echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
+        env:
+          # Used by E2E test data script to generate embeddings for approved store agents
+          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+
+      - name: Set up Platform - Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+        with:
+          driver: docker-container
+          driver-opts: network=host
+
+      - name: Set up Platform - Expose GHA cache to docker buildx CLI
+        uses: crazy-max/ghaction-github-runtime@v4
+
+      - name: Set up Platform - Build Docker images (with cache)
+        working-directory: autogpt_platform
+        run: |
+          pip install pyyaml
+
+          # Resolve extends and generate a flat compose file that bake can understand
+          docker compose -f docker-compose.yml config > docker-compose.resolved.yml
+
+          # Add cache configuration to the resolved compose file
+          python ../.github/workflows/scripts/docker-ci-fix-compose-build-cache.py \
+            --source docker-compose.resolved.yml \
+            --cache-from "type=gha" \
+            --cache-to "type=gha,mode=max" \
+            --backend-hash "${{ hashFiles('autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/poetry.lock', 'autogpt_platform/backend/backend/**') }}" \
+            --frontend-hash "${{ hashFiles('autogpt_platform/frontend/Dockerfile', 'autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/src/**') }}" \
+            --git-ref "${{ github.ref }}"
+
+          # Build with bake using the resolved compose file (now includes cache config)
+          docker buildx bake --allow=fs.read=.. -f docker-compose.resolved.yml --load
+        env:
+          NEXT_PUBLIC_PW_TEST: true
+
+      - name: Set up tests - Cache E2E test data
+        id: e2e-data-cache
+        uses: actions/cache@v5
+        with:
+          path: /tmp/e2e_test_data.sql
+          key: e2e-test-data-${{ hashFiles('autogpt_platform/backend/test/e2e_test_data.py', 'autogpt_platform/backend/migrations/**', '.github/workflows/platform-fullstack-ci.yml') }}
+
+      - name: Set up Platform - Start Supabase DB + Auth
+        run: |
+          docker compose -f ../docker-compose.resolved.yml up -d db auth --no-build
+          echo "Waiting for database to be ready..."
+          timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done'
+          echo "Waiting for auth service to be ready..."
+          timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -c "SELECT 1 FROM auth.users LIMIT 1" 2>/dev/null; do sleep 2; done' || echo "Auth schema check timeout, continuing..."
+
+      - name: Set up Platform - Run migrations
+        run: |
+          echo "Running migrations..."
+          docker compose -f ../docker-compose.resolved.yml run --rm migrate
+          echo "✅ Migrations completed"
+        env:
+          NEXT_PUBLIC_PW_TEST: true
+
+      - name: Set up tests - Load cached E2E test data
+        if: steps.e2e-data-cache.outputs.cache-hit == 'true'
+        run: |
+          echo "✅ Found cached E2E test data, restoring..."
+          {
+            echo "SET session_replication_role = 'replica';"
+            cat /tmp/e2e_test_data.sql
+            echo "SET session_replication_role = 'origin';"
+          } | docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -b
+          # Refresh materialized views after restore
+          docker compose -f ../docker-compose.resolved.yml exec -T db \
+            psql -U postgres -d postgres -b -c "SET search_path TO platform; SELECT refresh_store_materialized_views();" || true
+
+          echo "✅ E2E test data restored from cache"
+
+      - name: Set up Platform - Start (all other services)
+        run: |
+          docker compose -f ../docker-compose.resolved.yml up -d --no-build
+          echo "Waiting for rest_server to be ready..."
+          timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
+        env:
+          NEXT_PUBLIC_PW_TEST: true
+
+      - name: Set up tests - Create E2E test data
+        if: steps.e2e-data-cache.outputs.cache-hit != 'true'
+        run: |
+          echo "Creating E2E test data..."
+          docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.resolved.yml ps -q rest_server):/tmp/e2e_test_data.py
+          docker compose -f ../docker-compose.resolved.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
+            echo "❌ E2E test data creation failed!"
+            docker compose -f ../docker-compose.resolved.yml logs --tail=50 rest_server
+            exit 1
+          }
+
+          # Dump auth.users + platform schema for cache (two separate dumps)
+          echo "Dumping database for cache..."
+          {
+            docker compose -f ../docker-compose.resolved.yml exec -T db \
+              pg_dump -U postgres --data-only --column-inserts \
+              --table='auth.users' postgres
+            docker compose -f ../docker-compose.resolved.yml exec -T db \
+              pg_dump -U postgres --data-only --column-inserts \
+              --schema=platform \
+              --exclude-table='platform._prisma_migrations' \
+              --exclude-table='platform.apscheduler_jobs' \
+              --exclude-table='platform.apscheduler_jobs_batched_notifications' \
+              postgres
+          } > /tmp/e2e_test_data.sql
+
+          echo "✅ Database dump created for caching ($(wc -l < /tmp/e2e_test_data.sql) lines)"
+
+      - name: Set up tests - Enable corepack
+        run: corepack enable
+
+      - name: Set up tests - Set up Node
+        uses: actions/setup-node@v6
+        with:
+          node-version: "22.18.0"
+          cache: "pnpm"
+          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+
+      - name: Set up tests - Install dependencies
+        run: pnpm install --frozen-lockfile
+
+      - name: Set up tests - Install browser 'chromium'
+        run: pnpm playwright install --with-deps chromium
+
+      - name: Run Playwright tests
+        run: pnpm test:no-build
+        continue-on-error: false
+
+      - name: Upload Playwright report
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: playwright-report
+          path: playwright-report
+          if-no-files-found: ignore
+          retention-days: 3
+
+      - name: Upload Playwright test results
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: playwright-test-results
+          path: test-results
+          if-no-files-found: ignore
+          retention-days: 3
+
+      - name: Print Final Docker Compose logs
+        if: always()
+        run: docker compose -f ../docker-compose.resolved.yml logs
--- a/autogpt_platform/backend/CLAUDE.md
+++ b/autogpt_platform/backend/CLAUDE.md
@@ -178,6 +178,16 @@ yield "image_url", result_url
 3. Write tests alongside the route file
 4. Run `poetry run test` to verify

+## Workspace & Media Files
+
+**Read [Workspace & Media Architecture](../../docs/platform/workspace-media-architecture.md) when:**
+- Working on CoPilot file upload/download features
+- Building blocks that handle `MediaFileType` inputs/outputs
+- Modifying `WorkspaceManager` or `store_media_file()`
+- Debugging file persistence or virus scanning issues
+
+Covers: `WorkspaceManager` (persistent storage with session scoping), `store_media_file()` (media normalization pipeline), and responsibility boundaries for virus scanning and persistence.
+
 ## Security Implementation

 ### Cache Protection Middleware
--- a/autogpt_platform/backend/backend/blocks/github/commits.py
+++ b/autogpt_platform/backend/backend/blocks/github/commits.py
@@ -11,7 +11,10 @@ from backend.blocks._base import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import SchemaField
+from backend.util.file import parse_data_uri, resolve_media_content
+from backend.util.type import MediaFileType

 from ._api import get_api
 from ._auth import (
@@ -178,7 +181,8 @@ class FileOperation(StrEnum):

 class FileOperationInput(TypedDict):
    path: str
-    content: str
+    # MediaFileType is a str NewType — no runtime breakage for existing callers.
+    content: MediaFileType
    operation: FileOperation


@@ -275,11 +279,11 @@ class GithubMultiFileCommitBlock(Block):
        base_tree_sha = commit_data["tree"]["sha"]

        # 3. Build tree entries for each file operation (blobs created concurrently)
-        async def _create_blob(content: str) -> str:
+        async def _create_blob(content: str, encoding: str = "utf-8") -> str:
            blob_url = repo_url + "/git/blobs"
            blob_response = await api.post(
                blob_url,
-                json={"content": content, "encoding": "utf-8"},
+                json={"content": content, "encoding": encoding},
            )
            return blob_response.json()["sha"]

@@ -301,10 +305,19 @@ class GithubMultiFileCommitBlock(Block):
            else:
                upsert_files.append((path, file_op.get("content", "")))

-        # Create all blobs concurrently
+        # Create all blobs concurrently. Data URIs (from store_media_file)
+        # are sent as base64 blobs to preserve binary content.
        if upsert_files:
+
+            async def _make_blob(content: str) -> str:
+                parsed = parse_data_uri(content)
+                if parsed is not None:
+                    _, b64_payload = parsed
+                    return await _create_blob(b64_payload, encoding="base64")
+                return await _create_blob(content)
+
            blob_shas = await asyncio.gather(
-                *[_create_blob(content) for _, content in upsert_files]
+                *[_make_blob(content) for _, content in upsert_files]
            )
            for (path, _), blob_sha in zip(upsert_files, blob_shas):
                tree_entries.append(
@@ -358,15 +371,36 @@ class GithubMultiFileCommitBlock(Block):
        input_data: Input,
        *,
        credentials: GithubCredentials,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
        try:
+            # Resolve media references (workspace://, data:, URLs) to data
+            # URIs so _make_blob can send binary content correctly.
+            resolved_files: list[FileOperationInput] = []
+            for file_op in input_data.files:
+                content = file_op.get("content", "")
+                operation = FileOperation(file_op.get("operation", "upsert"))
+                if operation != FileOperation.DELETE:
+                    content = await resolve_media_content(
+                        MediaFileType(content),
+                        execution_context,
+                        return_format="for_external_api",
+                    )
+                resolved_files.append(
+                    FileOperationInput(
+                        path=file_op["path"],
+                        content=MediaFileType(content),
+                        operation=operation,
+                    )
+                )
+
            sha, url = await self.multi_file_commit(
                credentials,
                input_data.repo_url,
                input_data.branch,
                input_data.commit_message,
-                input_data.files,
+                resolved_files,
            )
            yield "sha", sha
            yield "url", url
--- a/autogpt_platform/backend/backend/blocks/github/test_github_blocks.py
+++ b/autogpt_platform/backend/backend/blocks/github/test_github_blocks.py
@@ -8,6 +8,7 @@ from backend.blocks.github.pull_requests import (
    GithubMergePullRequestBlock,
    prepare_pr_api_url,
 )
+from backend.data.execution import ExecutionContext
 from backend.util.exceptions import BlockExecutionError

 # ── prepare_pr_api_url tests ──
@@ -97,7 +98,11 @@ async def test_multi_file_commit_error_path():
        "credentials": TEST_CREDENTIALS_INPUT,
    }
    with pytest.raises(BlockExecutionError, match="ref update failed"):
-        async for _ in block.execute(input_data, credentials=TEST_CREDENTIALS):
+        async for _ in block.execute(
+            input_data,
+            credentials=TEST_CREDENTIALS,
+            execution_context=ExecutionContext(),
+        ):
            pass


--- a/autogpt_platform/backend/backend/copilot/baseline/service.py
+++ b/autogpt_platform/backend/backend/copilot/baseline/service.py
@@ -40,7 +40,7 @@ from backend.copilot.response_model import (
 from backend.copilot.service import (
    _build_system_prompt,
    _generate_session_title,
-    client,
+    _get_openai_client,
    config,
 )
 from backend.copilot.tools import execute_tool, get_available_tools
@@ -89,7 +89,7 @@ async def _compress_session_messages(
        result = await compress_context(
            messages=messages_dict,
            model=config.model,
-            client=client,
+            client=_get_openai_client(),
        )
    except Exception as e:
        logger.warning("[Baseline] Context compression with LLM failed: %s", e)
@@ -235,7 +235,7 @@ async def stream_chat_completion_baseline(
            )
            if tools:
                create_kwargs["tools"] = tools
-            response = await client.chat.completions.create(**create_kwargs)  # type: ignore[arg-type]  # dynamic kwargs
+            response = await _get_openai_client().chat.completions.create(**create_kwargs)  # type: ignore[arg-type]  # dynamic kwargs

            # Accumulate streamed response (text + tool calls)
            round_text = ""
--- a/autogpt_platform/backend/backend/copilot/context.py
+++ b/autogpt_platform/backend/backend/copilot/context.py
@@ -11,6 +11,8 @@ from contextvars import ContextVar
 from typing import TYPE_CHECKING

 from backend.copilot.model import ChatSession
+from backend.data.db_accessors import workspace_db
+from backend.util.workspace import WorkspaceManager

 if TYPE_CHECKING:
    from e2b import AsyncSandbox
@@ -82,6 +84,17 @@ def resolve_sandbox_path(path: str) -> str:
    return normalized


+async def get_workspace_manager(user_id: str, session_id: str) -> WorkspaceManager:
+    """Create a session-scoped :class:`WorkspaceManager`.
+
+    Placed here (rather than in ``tools/workspace_files``) so that modules
+    like ``sdk/file_ref`` can import it without triggering the heavy
+    ``tools/__init__`` import chain.
+    """
+    workspace = await workspace_db().get_or_create_workspace(user_id)
+    return WorkspaceManager(user_id, workspace.id, session_id)
+
+
 def is_allowed_local_path(path: str, sdk_cwd: str | None = None) -> bool:
    """Return True if *path* is within an allowed host-filesystem location.

--- a/autogpt_platform/backend/backend/copilot/prompting.py
+++ b/autogpt_platform/backend/backend/copilot/prompting.py
@@ -52,11 +52,43 @@ Examples:
 You can embed a reference inside any string argument, or use it as the entire
 value.  Multiple references in one argument are all expanded.

-**Type coercion**: The platform automatically coerces expanded string values
-to match the block's expected input types.  For example, if a block expects
-`list[list[str]]` and you pass a string containing a JSON array (e.g. from
-an @@agptfile: expansion), the string will be parsed into the correct type.
+**Structured data**: When the **entire** argument value is a single file
+reference (no surrounding text), the platform automatically parses the file
+content based on its extension or MIME type.  Supported formats: JSON, JSONL,
+CSV, TSV, YAML, TOML, Parquet, and Excel (.xlsx — first sheet only).
+For example, pass `@@agptfile:workspace://<id>` where the file is a `.csv` and
+the rows will be parsed into `list[list[str]]` automatically.  If the format is
+unrecognised or parsing fails, the content is returned as a plain string.
+Legacy `.xls` files are **not** supported — only the modern `.xlsx` format.

+**Type coercion**: The platform also coerces expanded values to match the
+block's expected input types.  For example, if a block expects `list[list[str]]`
+and the expanded value is a JSON string, it will be parsed into the correct type.
+
+### Media file inputs (format: "file")
+Some block inputs accept media files — their schema shows `"format": "file"`.
+These fields accept:
+- **`workspace://<file_id>`** or **`workspace://<file_id>#<mime>`** — preferred
+  for large files (images, videos, PDFs). The platform passes the reference
+  directly to the block without reading the content into memory.
+- **`data:<mime>;base64,<payload>`** — inline base64 data URI, suitable for
+  small files only.
+
+When a block input has `format: "file"`, **pass the `workspace://` URI
+directly as the value** (do NOT wrap it in `@@agptfile:`). This avoids large
+payloads in tool arguments and preserves binary content (images, videos)
+that would be corrupted by text encoding.
+
+Example — committing an image file to GitHub:
+```json
+{
+  "files": [{
+    "path": "docs/hero.png",
+    "content": "workspace://abc123#image/png",
+    "operation": "upsert"
+  }]
+}
+```

 ### Sub-agent tasks
 - When using the Task tool, NEVER set `run_in_background` to true.
--- a/autogpt_platform/backend/backend/copilot/sdk/init.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/init.py
@@ -3,12 +3,45 @@
 This module provides the integration layer between the Claude Agent SDK
 and the existing CoPilot tool system, enabling drop-in replacement of
 the current LLM orchestration with the battle-tested Claude Agent SDK.
+
+Submodule imports are deferred via PEP 562 ``__getattr__`` to break a
+circular import cycle::
+
+    sdk/__init__ → tool_adapter → copilot.tools (TOOL_REGISTRY)
+    copilot.tools → run_block → sdk.file_ref  (no cycle here, but…)
+    sdk/__init__ → service → copilot.prompting → copilot.tools  (cycle!)
+
+``tool_adapter`` uses ``TOOL_REGISTRY`` at **module level** to build the
+static ``COPILOT_TOOL_NAMES`` list, so the import cannot be deferred to
+function scope without a larger refactor (moving tool-name registration
+to a separate lightweight module).  The lazy-import pattern here is the
+least invasive way to break the cycle while keeping module-level constants
+intact.
 """

-from .service import stream_chat_completion_sdk
-from .tool_adapter import create_copilot_mcp_server
+from typing import Any

 __all__ = [
    "stream_chat_completion_sdk",
    "create_copilot_mcp_server",
 ]
+
+# Dispatch table for PEP 562 lazy imports.  Each entry is a (module, attr)
+# pair so new exports can be added without touching __getattr__ itself.
+_LAZY_IMPORTS: dict[str, tuple[str, str]] = {
+    "stream_chat_completion_sdk": (".service", "stream_chat_completion_sdk"),
+    "create_copilot_mcp_server": (".tool_adapter", "create_copilot_mcp_server"),
+}
+
+
+def __getattr__(name: str) -> Any:
+    entry = _LAZY_IMPORTS.get(name)
+    if entry is not None:
+        module_path, attr = entry
+        import importlib
+
+        module = importlib.import_module(module_path, package=__name__)
+        value = getattr(module, attr)
+        globals()[name] = value
+        return value
+    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
--- a/autogpt_platform/backend/backend/copilot/sdk/file_ref.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/file_ref.py
@@ -41,12 +41,20 @@ from typing import Any
 from backend.copilot.context import (
    get_current_sandbox,
    get_sdk_cwd,
+    get_workspace_manager,
    is_allowed_local_path,
    resolve_sandbox_path,
 )
 from backend.copilot.model import ChatSession
-from backend.copilot.tools.workspace_files import get_manager
 from backend.util.file import parse_workspace_uri
+from backend.util.file_content_parser import (
+    BINARY_FORMATS,
+    MIME_TO_FORMAT,
+    PARSE_EXCEPTIONS,
+    infer_format_from_uri,
+    parse_file_content,
+)
+from backend.util.type import MediaFileType


 class FileRefExpansionError(Exception):
@@ -74,6 +82,8 @@ _FILE_REF_RE = re.compile(
 _MAX_EXPAND_CHARS = 200_000
 # Maximum total characters across all @@agptfile: expansions in one string.
 _MAX_TOTAL_EXPAND_CHARS = 1_000_000
+# Maximum raw byte size for bare ref structured parsing (10 MB).
+_MAX_BARE_REF_BYTES = 10_000_000


@dataclass
@@ -83,6 +93,11 @@ class FileRef:
    end_line: int | None  # 1-indexed, inclusive


+# ---------------------------------------------------------------------------
+# Public API  (top-down: main functions first, helpers below)
+# ---------------------------------------------------------------------------
+
+
 def parse_file_ref(text: str) -> FileRef | None:
    """Return a :class:`FileRef` if *text* is a bare file reference token.

@@ -104,17 +119,6 @@ def parse_file_ref(text: str) -> FileRef | None:
    return FileRef(uri=m.group(1), start_line=start, end_line=end)


-def _apply_line_range(text: str, start: int | None, end: int | None) -> str:
-    """Slice *text* to the requested 1-indexed line range (inclusive)."""
-    if start is None and end is None:
-        return text
-    lines = text.splitlines(keepends=True)
-    s = (start - 1) if start is not None else 0
-    e = end if end is not None else len(lines)
-    selected = list(itertools.islice(lines, s, e))
-    return "".join(selected)
-
-
 async def read_file_bytes(
    uri: str,
    user_id: str | None,
@@ -130,27 +134,47 @@ async def read_file_bytes(
    if plain.startswith("workspace://"):
        if not user_id:
            raise ValueError("workspace:// file references require authentication")
-        manager = await get_manager(user_id, session.session_id)
+        manager = await get_workspace_manager(user_id, session.session_id)
        ws = parse_workspace_uri(plain)
        try:
-            return await (
+            data = await (
                manager.read_file(ws.file_ref)
                if ws.is_path
                else manager.read_file_by_id(ws.file_ref)
            )
        except FileNotFoundError:
            raise ValueError(f"File not found: {plain}")
-        except Exception as exc:
+        except (PermissionError, OSError) as exc:
            raise ValueError(f"Failed to read {plain}: {exc}") from exc
+        except (AttributeError, TypeError, RuntimeError) as exc:
+            # AttributeError/TypeError: workspace manager returned an
+            # unexpected type or interface; RuntimeError: async runtime issues.
+            logger.warning("Unexpected error reading %s: %s", plain, exc)
+            raise ValueError(f"Failed to read {plain}: {exc}") from exc
+        # NOTE: Workspace API does not support pre-read size checks;
+        # the full file is loaded before the size guard below.
+        if len(data) > _MAX_BARE_REF_BYTES:
+            raise ValueError(
+                f"File too large ({len(data)} bytes, limit {_MAX_BARE_REF_BYTES})"
+            )
+        return data

    if is_allowed_local_path(plain, get_sdk_cwd()):
        resolved = os.path.realpath(os.path.expanduser(plain))
        try:
+            # Read with a one-byte overshoot to detect files that exceed the limit
+            # without a separate os.path.getsize call (avoids TOCTOU race).
            with open(resolved, "rb") as fh:
-                return fh.read()
+                data = fh.read(_MAX_BARE_REF_BYTES + 1)
+            if len(data) > _MAX_BARE_REF_BYTES:
+                raise ValueError(
+                    f"File too large (>{_MAX_BARE_REF_BYTES} bytes, "
+                    f"limit {_MAX_BARE_REF_BYTES})"
+                )
+            return data
        except FileNotFoundError:
            raise ValueError(f"File not found: {plain}")
-        except Exception as exc:
+        except OSError as exc:
            raise ValueError(f"Failed to read {plain}: {exc}") from exc

    sandbox = get_current_sandbox()
@@ -162,9 +186,33 @@ async def read_file_bytes(
                f"Path is not allowed (not in workspace, sdk_cwd, or sandbox): {plain}"
            ) from exc
        try:
-            return bytes(await sandbox.files.read(remote, format="bytes"))
-        except Exception as exc:
+            data = bytes(await sandbox.files.read(remote, format="bytes"))
+        except (FileNotFoundError, OSError, UnicodeDecodeError) as exc:
            raise ValueError(f"Failed to read from sandbox: {plain}: {exc}") from exc
+        except Exception as exc:
+            # E2B SDK raises SandboxException subclasses (NotFoundException,
+            # TimeoutException, NotEnoughSpaceException, etc.) which don't
+            # inherit from standard exceptions.  Import lazily to avoid a
+            # hard dependency on e2b at module level.
+            try:
+                from e2b.exceptions import SandboxException  # noqa: PLC0415
+
+                if isinstance(exc, SandboxException):
+                    raise ValueError(
+                        f"Failed to read from sandbox: {plain}: {exc}"
+                    ) from exc
+            except ImportError:
+                pass
+            # Re-raise unexpected exceptions (TypeError, AttributeError, etc.)
+            # so they surface as real bugs rather than being silently masked.
+            raise
+        # NOTE: E2B sandbox API does not support pre-read size checks;
+        # the full file is loaded before the size guard below.
+        if len(data) > _MAX_BARE_REF_BYTES:
+            raise ValueError(
+                f"File too large ({len(data)} bytes, limit {_MAX_BARE_REF_BYTES})"
+            )
+        return data

    raise ValueError(
        f"Path is not allowed (not in workspace, sdk_cwd, or sandbox): {plain}"
@@ -178,15 +226,13 @@ async def resolve_file_ref(
 ) -> str:
    """Resolve a :class:`FileRef` to its text content."""
    raw = await read_file_bytes(ref.uri, user_id, session)
-    return _apply_line_range(
-        raw.decode("utf-8", errors="replace"), ref.start_line, ref.end_line
-    )
+    return _apply_line_range(_to_str(raw), ref.start_line, ref.end_line)


 async def expand_file_refs_in_string(
    text: str,
    user_id: str | None,
-    session: "ChatSession",
+    session: ChatSession,
    *,
    raise_on_error: bool = False,
 ) -> str:
@@ -232,6 +278,9 @@ async def expand_file_refs_in_string(
            if len(content) > _MAX_EXPAND_CHARS:
                content = content[:_MAX_EXPAND_CHARS] + "\n... [truncated]"
            remaining = _MAX_TOTAL_EXPAND_CHARS - total_chars
+            # remaining == 0 means the budget was exactly exhausted by the
+            # previous ref.  The elif below (len > remaining) won't catch
+            # this since 0 > 0 is false, so we need the <= 0 check.
            if remaining <= 0:
                content = "[file-ref budget exhausted: total expansion limit reached]"
            elif len(content) > remaining:
@@ -252,13 +301,31 @@ async def expand_file_refs_in_string(
 async def expand_file_refs_in_args(
    args: dict[str, Any],
    user_id: str | None,
-    session: "ChatSession",
+    session: ChatSession,
+    *,
+    input_schema: dict[str, Any] | None = None,
 ) -> dict[str, Any]:
    """Recursively expand ``@@agptfile:...`` references in tool call arguments.

    String values are expanded in-place.  Nested dicts and lists are
    traversed.  Non-string scalars are returned unchanged.

+    **Bare references** (the entire argument value is a single
+    ``@@agptfile:...`` token with no surrounding text) are resolved and then
+    parsed according to the file's extension or MIME type.  See
+    :mod:`backend.util.file_content_parser` for the full list of supported
+    formats (JSON, JSONL, CSV, TSV, YAML, TOML, Parquet, Excel).
+
+    When *input_schema* is provided and the target property has
+    ``"type": "string"``, structured parsing is skipped — the raw file content
+    is returned as a plain string so blocks receive the original text.
+
+    If the format is unrecognised or parsing fails, the content is returned as
+    a plain string (the fallback).
+
+    **Embedded references** (``@@agptfile:`` mixed with other text) always
+    produce a plain string — structured parsing only applies to bare refs.
+
    Raises :class:`FileRefExpansionError` if any reference fails to resolve,
    so the tool is *not* executed with an error string as its input.  The
    caller (the MCP tool wrapper) should convert this into an MCP error
@@ -267,15 +334,382 @@ async def expand_file_refs_in_args(
    if not args:
        return args

-    async def _expand(value: Any) -> Any:
+    properties = (input_schema or {}).get("properties", {})
+
+    async def _expand(
+        value: Any,
+        *,
+        prop_schema: dict[str, Any] | None = None,
+    ) -> Any:
+        """Recursively expand a single argument value.
+
+        Strings are checked for ``@@agptfile:`` references and expanded
+        (bare refs get structured parsing; embedded refs get inline
+        substitution).  Dicts and lists are traversed recursively,
+        threading the corresponding sub-schema from *prop_schema* so
+        that nested fields also receive correct type-aware expansion.
+        Non-string scalars pass through unchanged.
+        """
        if isinstance(value, str):
+            ref = parse_file_ref(value)
+            if ref is not None:
+                # MediaFileType fields: return the raw URI immediately —
+                # no file reading, no format inference, no content parsing.
+                if _is_media_file_field(prop_schema):
+                    return ref.uri
+
+                fmt = infer_format_from_uri(ref.uri)
+                # Workspace URIs by ID (workspace://abc123) have no extension.
+                # When the MIME fragment is also missing, fall back to the
+                # workspace file manager's metadata for format detection.
+                if fmt is None and ref.uri.startswith("workspace://"):
+                    fmt = await _infer_format_from_workspace(ref.uri, user_id, session)
+                return await _expand_bare_ref(ref, fmt, user_id, session, prop_schema)
+
+            # Not a bare ref — do normal inline expansion.
            return await expand_file_refs_in_string(
                value, user_id, session, raise_on_error=True
            )
        if isinstance(value, dict):
-            return {k: await _expand(v) for k, v in value.items()}
+            # When the schema says this is an object but doesn't define
+            # inner properties, skip expansion — the caller (e.g.
+            # RunBlockTool) will expand with the actual nested schema.
+            if (
+                prop_schema is not None
+                and prop_schema.get("type") == "object"
+                and "properties" not in prop_schema
+            ):
+                return value
+            nested_props = (prop_schema or {}).get("properties", {})
+            return {
+                k: await _expand(v, prop_schema=nested_props.get(k))
+                for k, v in value.items()
+            }
        if isinstance(value, list):
-            return [await _expand(item) for item in value]
+            items_schema = (prop_schema or {}).get("items")
+            return [await _expand(item, prop_schema=items_schema) for item in value]
        return value

-    return {k: await _expand(v) for k, v in args.items()}
+    return {k: await _expand(v, prop_schema=properties.get(k)) for k, v in args.items()}
+
+
+# ---------------------------------------------------------------------------
+# Private helpers  (used by the public functions above)
+# ---------------------------------------------------------------------------
+
+
+def _apply_line_range(text: str, start: int | None, end: int | None) -> str:
+    """Slice *text* to the requested 1-indexed line range (inclusive).
+
+    When the requested range extends beyond the file, a note is appended
+    so the LLM knows it received the entire remaining content.
+    """
+    if start is None and end is None:
+        return text
+    lines = text.splitlines(keepends=True)
+    total = len(lines)
+    s = (start - 1) if start is not None else 0
+    e = end if end is not None else total
+    selected = list(itertools.islice(lines, s, e))
+    result = "".join(selected)
+    if end is not None and end > total:
+        result += f"\n[Note: file has only {total} lines]\n"
+    return result
+
+
+def _to_str(content: str | bytes) -> str:
+    """Decode *content* to a string if it is bytes, otherwise return as-is."""
+    if isinstance(content, str):
+        return content
+    return content.decode("utf-8", errors="replace")
+
+
+def _check_content_size(content: str | bytes) -> None:
+    """Raise :class:`ValueError` if *content* exceeds the byte limit.
+
+    Raises ``ValueError`` (not ``FileRefExpansionError``) so that the caller
+    (``_expand_bare_ref``) can unify all resolution errors into a single
+    ``except ValueError`` → ``FileRefExpansionError`` handler, keeping the
+    error-flow consistent with ``read_file_bytes`` and ``resolve_file_ref``.
+
+    For ``bytes``, the length is the byte count directly.  For ``str``,
+    we encode to UTF-8 first because multi-byte characters (e.g. emoji)
+    mean the byte size can be up to 4x the character count.
+    """
+    if isinstance(content, bytes):
+        size = len(content)
+    else:
+        char_len = len(content)
+        # Fast lower bound: UTF-8 byte count >= char count.
+        # If char count already exceeds the limit, reject immediately
+        # without allocating an encoded copy.
+        if char_len > _MAX_BARE_REF_BYTES:
+            size = char_len  # real byte size is even larger
+        # Fast upper bound: each char is at most 4 UTF-8 bytes.
+        # If worst-case is still under the limit, skip encoding entirely.
+        elif char_len * 4 <= _MAX_BARE_REF_BYTES:
+            return
+        else:
+            # Edge case: char count is under limit but multibyte chars
+            # might push byte count over. Encode to get exact size.
+            size = len(content.encode("utf-8"))
+    if size > _MAX_BARE_REF_BYTES:
+        raise ValueError(
+            f"File too large for structured parsing "
+            f"({size} bytes, limit {_MAX_BARE_REF_BYTES})"
+        )
+
+
+async def _infer_format_from_workspace(
+    uri: str,
+    user_id: str | None,
+    session: ChatSession,
+) -> str | None:
+    """Look up workspace file metadata to infer the format.
+
+    Workspace URIs by ID (``workspace://abc123``) have no file extension.
+    When the MIME fragment is also absent, we query the workspace file
+    manager for the file's stored MIME type and original filename.
+    """
+    if not user_id:
+        return None
+    try:
+        ws = parse_workspace_uri(uri)
+        manager = await get_workspace_manager(user_id, session.session_id)
+        info = await (
+            manager.get_file_info(ws.file_ref)
+            if not ws.is_path
+            else manager.get_file_info_by_path(ws.file_ref)
+        )
+        if info is None:
+            return None
+        # Try MIME type first, then filename extension.
+        mime = (info.mime_type or "").split(";", 1)[0].strip().lower()
+        return MIME_TO_FORMAT.get(mime) or infer_format_from_uri(info.name)
+    except (
+        ValueError,
+        FileNotFoundError,
+        OSError,
+        PermissionError,
+        AttributeError,
+        TypeError,
+    ):
+        # Expected failures: bad URI, missing file, permission denied, or
+        # workspace manager returning unexpected types.  Propagate anything
+        # else (e.g. programming errors) so they don't get silently swallowed.
+        logger.debug("workspace metadata lookup failed for %s", uri, exc_info=True)
+        return None
+
+
+def _is_media_file_field(prop_schema: dict[str, Any] | None) -> bool:
+    """Return True if *prop_schema* describes a MediaFileType field (format: file)."""
+    if prop_schema is None:
+        return False
+    return (
+        prop_schema.get("type") == "string"
+        and prop_schema.get("format") == MediaFileType.string_format
+    )
+
+
+async def _expand_bare_ref(
+    ref: FileRef,
+    fmt: str | None,
+    user_id: str | None,
+    session: ChatSession,
+    prop_schema: dict[str, Any] | None,
+) -> Any:
+    """Resolve and parse a bare ``@@agptfile:`` reference.
+
+    This is the structured-parsing path: the file is read, optionally parsed
+    according to *fmt*, and adapted to the target *prop_schema*.
+
+    Raises :class:`FileRefExpansionError` on resolution or parsing failure.
+
+    Note: MediaFileType fields (format: "file") are handled earlier in
+    ``_expand`` to avoid unnecessary format inference and file I/O.
+    """
+    try:
+        if fmt is not None and fmt in BINARY_FORMATS:
+            # Binary formats need raw bytes, not UTF-8 text.
+            # Line ranges are meaningless for binary formats (parquet/xlsx)
+            # — ignore them and parse full bytes.  Warn so the caller/model
+            # knows the range was silently dropped.
+            if ref.start_line is not None or ref.end_line is not None:
+                logger.warning(
+                    "Line range [%s-%s] ignored for binary format %s (%s); "
+                    "binary formats are always parsed in full.",
+                    ref.start_line,
+                    ref.end_line,
+                    fmt,
+                    ref.uri,
+                )
+            content: str | bytes = await read_file_bytes(ref.uri, user_id, session)
+        else:
+            content = await resolve_file_ref(ref, user_id, session)
+    except ValueError as exc:
+        raise FileRefExpansionError(str(exc)) from exc
+
+    # For known formats this rejects files >10 MB before parsing.
+    # For unknown formats _MAX_EXPAND_CHARS (200K chars) below is stricter,
+    # but this check still guards the parsing path which has no char limit.
+    # _check_content_size raises ValueError, which we unify here just like
+    # resolution errors above.
+    try:
+        _check_content_size(content)
+    except ValueError as exc:
+        raise FileRefExpansionError(str(exc)) from exc
+
+    # When the schema declares this parameter as "string",
+    # return raw file content — don't parse into a structured
+    # type that would need json.dumps() serialisation.
+    expect_string = (prop_schema or {}).get("type") == "string"
+    if expect_string:
+        if isinstance(content, bytes):
+            raise FileRefExpansionError(
+                f"Cannot use {fmt} file as text input: "
+                f"binary formats (parquet, xlsx) must be passed "
+                f"to a block that accepts structured data (list/object), "
+                f"not a string-typed parameter."
+            )
+        return content
+
+    if fmt is not None:
+        # Use strict mode for binary formats so we surface the
+        # actual error (e.g. missing pyarrow/openpyxl, corrupt
+        # file) instead of silently returning garbled bytes.
+        strict = fmt in BINARY_FORMATS
+        try:
+            parsed = parse_file_content(content, fmt, strict=strict)
+        except PARSE_EXCEPTIONS as exc:
+            raise FileRefExpansionError(f"Failed to parse {fmt} file: {exc}") from exc
+        # Normalize bytes fallback to str so tools never
+        # receive raw bytes when parsing fails.
+        if isinstance(parsed, bytes):
+            parsed = _to_str(parsed)
+        return _adapt_to_schema(parsed, prop_schema)
+
+    # Unknown format — return as plain string, but apply
+    # the same per-ref character limit used by inline refs
+    # to prevent injecting unexpectedly large content.
+    text = _to_str(content)
+    if len(text) > _MAX_EXPAND_CHARS:
+        text = text[:_MAX_EXPAND_CHARS] + "\n... [truncated]"
+    return text
+
+
+def _adapt_to_schema(parsed: Any, prop_schema: dict[str, Any] | None) -> Any:
+    """Adapt a parsed file value to better fit the target schema type.
+
+    When the parser returns a natural type (e.g. dict from YAML, list from CSV)
+    that doesn't match the block's expected type, this function converts it to
+    a more useful representation instead of relying on pydantic's generic
+    coercion (which can produce awkward results like flattened dicts → lists).
+
+    Returns *parsed* unchanged when no adaptation is needed.
+    """
+    if prop_schema is None:
+        return parsed
+
+    target_type = prop_schema.get("type")
+
+    # Dict → array: delegate to helper.
+    if isinstance(parsed, dict) and target_type == "array":
+        return _adapt_dict_to_array(parsed, prop_schema)
+
+    # List → object: delegate to helper (raises for non-tabular lists).
+    if isinstance(parsed, list) and target_type == "object":
+        return _adapt_list_to_object(parsed)
+
+    # Tabular list → Any (no type): convert to list of dicts.
+    # Blocks like FindInDictionaryBlock have `input: Any` which produces
+    # a schema with no "type" key.  Tabular [[header],[rows]] is unusable
+    # for key lookup, but [{col: val}, ...] works with FindInDict's
+    # list-of-dicts branch (line 195-199 in data_manipulation.py).
+    if isinstance(parsed, list) and target_type is None and _is_tabular(parsed):
+        return _tabular_to_list_of_dicts(parsed)
+
+    return parsed
+
+
+def _adapt_dict_to_array(parsed: dict, prop_schema: dict[str, Any]) -> Any:
+    """Adapt a parsed dict to an array-typed field.
+
+    Extracts list-valued entries when the target item type is ``array``,
+    passes through unchanged when item type is ``string`` (lets pydantic error),
+    or wraps in ``[parsed]`` as a fallback.
+    """
+    items_type = (prop_schema.get("items") or {}).get("type")
+    if items_type == "array":
+        # Target is List[List[Any]] — extract list-typed values from the
+        # dict as inner lists.  E.g. YAML {"fruits": [{...},...]}} with
+        # ConcatenateLists (List[List[Any]]) → [[{...},...]].
+        list_values = [v for v in parsed.values() if isinstance(v, list)]
+        if list_values:
+            return list_values
+    if items_type == "string":
+        # Target is List[str] — wrapping a dict would give [dict]
+        # which can't coerce to strings.  Return unchanged and let
+        # pydantic surface a clear validation error.
+        return parsed
+    # Fallback: wrap in a single-element list so the block gets [dict]
+    # instead of pydantic flattening keys/values into a flat list.
+    return [parsed]
+
+
+def _adapt_list_to_object(parsed: list) -> Any:
+    """Adapt a parsed list to an object-typed field.
+
+    Converts tabular lists to column-dicts; raises for non-tabular lists.
+    """
+    if _is_tabular(parsed):
+        return _tabular_to_column_dict(parsed)
+    # Non-tabular list (e.g. a plain Python list from a YAML file) cannot
+    # be meaningfully coerced to an object.  Raise explicitly so callers
+    # get a clear error rather than pydantic silently wrapping the list.
+    raise FileRefExpansionError(
+        "Cannot adapt a non-tabular list to an object-typed field. "
+        "Expected a tabular structure ([[header], [row1], ...]) or a dict."
+    )
+
+
+def _is_tabular(parsed: Any) -> bool:
+    """Check if parsed data is in tabular format: [[header], [row1], ...].
+
+    Uses isinstance checks because this is a structural type guard on
+    opaque parser output (Any), not duck typing.  A Protocol wouldn't
+    help here — we need to verify exact list-of-lists shape.
+    """
+    if not isinstance(parsed, list) or len(parsed) < 2:
+        return False
+    header = parsed[0]
+    if not isinstance(header, list) or not header:
+        return False
+    if not all(isinstance(h, str) for h in header):
+        return False
+    return all(isinstance(row, list) for row in parsed[1:])
+
+
+def _tabular_to_list_of_dicts(parsed: list) -> list[dict[str, Any]]:
+    """Convert [[header], [row1], ...] → [{header[0]: row[0], ...}, ...].
+
+    Ragged rows (fewer columns than the header) get None for missing values.
+    Extra values beyond the header length are silently dropped.
+    """
+    header = parsed[0]
+    return [
+        dict(itertools.zip_longest(header, row[: len(header)], fillvalue=None))
+        for row in parsed[1:]
+    ]
+
+
+def _tabular_to_column_dict(parsed: list) -> dict[str, list]:
+    """Convert [[header], [row1], ...] → {"col1": [val1, ...], ...}.
+
+    Ragged rows (fewer columns than the header) get None for missing values,
+    ensuring all columns have equal length.
+    """
+    header = parsed[0]
+    return {
+        col: [row[i] if i < len(row) else None for row in parsed[1:]]
+        for i, col in enumerate(header)
+    }
--- a/autogpt_platform/backend/backend/copilot/sdk/file_ref_integration_test.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/file_ref_integration_test.py
@@ -175,6 +175,199 @@ async def test_expand_args_replaces_file_ref_in_nested_dict():
        assert result["count"] == 42


+# ---------------------------------------------------------------------------
+# expand_file_refs_in_args — bare ref structured parsing
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.asyncio
+async def test_bare_ref_json_returns_parsed_dict():
+    """Bare ref to a .json file returns parsed dict, not raw string."""
+    with tempfile.TemporaryDirectory() as sdk_cwd:
+        json_file = os.path.join(sdk_cwd, "data.json")
+        with open(json_file, "w") as f:
+            f.write('{"key": "value", "count": 42}')
+
+        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
+            mock_cwd_var.get.return_value = sdk_cwd
+
+            result = await expand_file_refs_in_args(
+                {"data": f"@@agptfile:{json_file}"},
+                user_id="u1",
+                session=_make_session(),
+            )
+
+        assert result["data"] == {"key": "value", "count": 42}
+
+
+@pytest.mark.asyncio
+async def test_bare_ref_csv_returns_parsed_table():
+    """Bare ref to a .csv file returns list[list[str]] table."""
+    with tempfile.TemporaryDirectory() as sdk_cwd:
+        csv_file = os.path.join(sdk_cwd, "data.csv")
+        with open(csv_file, "w") as f:
+            f.write("Name,Score\nAlice,90\nBob,85")
+
+        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
+            mock_cwd_var.get.return_value = sdk_cwd
+
+            result = await expand_file_refs_in_args(
+                {"input": f"@@agptfile:{csv_file}"},
+                user_id="u1",
+                session=_make_session(),
+            )
+
+        assert result["input"] == [
+            ["Name", "Score"],
+            ["Alice", "90"],
+            ["Bob", "85"],
+        ]
+
+
+@pytest.mark.asyncio
+async def test_bare_ref_unknown_extension_returns_string():
+    """Bare ref to a file with unknown extension returns plain string."""
+    with tempfile.TemporaryDirectory() as sdk_cwd:
+        txt_file = os.path.join(sdk_cwd, "readme.txt")
+        with open(txt_file, "w") as f:
+            f.write("plain text content")
+
+        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
+            mock_cwd_var.get.return_value = sdk_cwd
+
+            result = await expand_file_refs_in_args(
+                {"data": f"@@agptfile:{txt_file}"},
+                user_id="u1",
+                session=_make_session(),
+            )
+
+        assert result["data"] == "plain text content"
+        assert isinstance(result["data"], str)
+
+
+@pytest.mark.asyncio
+async def test_bare_ref_invalid_json_falls_back_to_string():
+    """Bare ref to a .json file with invalid JSON falls back to string."""
+    with tempfile.TemporaryDirectory() as sdk_cwd:
+        json_file = os.path.join(sdk_cwd, "bad.json")
+        with open(json_file, "w") as f:
+            f.write("not valid json {{{")
+
+        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
+            mock_cwd_var.get.return_value = sdk_cwd
+
+            result = await expand_file_refs_in_args(
+                {"data": f"@@agptfile:{json_file}"},
+                user_id="u1",
+                session=_make_session(),
+            )
+
+        assert result["data"] == "not valid json {{{"
+        assert isinstance(result["data"], str)
+
+
+@pytest.mark.asyncio
+async def test_embedded_ref_always_returns_string_even_for_json():
+    """Embedded ref (text around it) returns plain string, not parsed JSON."""
+    with tempfile.TemporaryDirectory() as sdk_cwd:
+        json_file = os.path.join(sdk_cwd, "data.json")
+        with open(json_file, "w") as f:
+            f.write('{"key": "value"}')
+
+        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
+            mock_cwd_var.get.return_value = sdk_cwd
+
+            result = await expand_file_refs_in_args(
+                {"data": f"prefix @@agptfile:{json_file} suffix"},
+                user_id="u1",
+                session=_make_session(),
+            )
+
+        assert isinstance(result["data"], str)
+        assert result["data"].startswith("prefix ")
+        assert result["data"].endswith(" suffix")
+
+
+@pytest.mark.asyncio
+async def test_bare_ref_yaml_returns_parsed_dict():
+    """Bare ref to a .yaml file returns parsed dict."""
+    with tempfile.TemporaryDirectory() as sdk_cwd:
+        yaml_file = os.path.join(sdk_cwd, "config.yaml")
+        with open(yaml_file, "w") as f:
+            f.write("name: test\ncount: 42\n")
+
+        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
+            mock_cwd_var.get.return_value = sdk_cwd
+
+            result = await expand_file_refs_in_args(
+                {"config": f"@@agptfile:{yaml_file}"},
+                user_id="u1",
+                session=_make_session(),
+            )
+
+        assert result["config"] == {"name": "test", "count": 42}
+
+
+@pytest.mark.asyncio
+async def test_bare_ref_binary_with_line_range_ignores_range():
+    """Bare ref to a binary file (.parquet) with line range parses the full file.
+
+    Binary formats (parquet, xlsx) ignore line ranges — the full content is
+    parsed and the range is silently dropped with a log warning.
+    """
+    try:
+        import pandas as pd
+    except ImportError:
+        pytest.skip("pandas not installed")
+    try:
+        import pyarrow  # noqa: F401  # pyright: ignore[reportMissingImports]
+    except ImportError:
+        pytest.skip("pyarrow not installed")
+
+    with tempfile.TemporaryDirectory() as sdk_cwd:
+        parquet_file = os.path.join(sdk_cwd, "data.parquet")
+        import io as _io
+
+        df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
+        buf = _io.BytesIO()
+        df.to_parquet(buf, index=False)
+        with open(parquet_file, "wb") as f:
+            f.write(buf.getvalue())
+
+        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
+            mock_cwd_var.get.return_value = sdk_cwd
+
+            # Line range [1-2] should be silently ignored for binary formats.
+            result = await expand_file_refs_in_args(
+                {"data": f"@@agptfile:{parquet_file}[1-2]"},
+                user_id="u1",
+                session=_make_session(),
+            )
+
+        # Full file is returned despite the line range.
+        assert result["data"] == [["A", "B"], [1, 4], [2, 5], [3, 6]]
+
+
+@pytest.mark.asyncio
+async def test_bare_ref_toml_returns_parsed_dict():
+    """Bare ref to a .toml file returns parsed dict."""
+    with tempfile.TemporaryDirectory() as sdk_cwd:
+        toml_file = os.path.join(sdk_cwd, "config.toml")
+        with open(toml_file, "w") as f:
+            f.write('name = "test"\ncount = 42\n')
+
+        with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
+            mock_cwd_var.get.return_value = sdk_cwd
+
+            result = await expand_file_refs_in_args(
+                {"config": f"@@agptfile:{toml_file}"},
+                user_id="u1",
+                session=_make_session(),
+            )
+
+        assert result["config"] == {"name": "test", "count": 42}
+
+
 # ---------------------------------------------------------------------------
 # _read_file_handler — extended to accept workspace:// and local paths
 # ---------------------------------------------------------------------------
@@ -219,7 +412,7 @@ async def test_read_file_handler_workspace_uri():
        "backend.copilot.sdk.tool_adapter.get_execution_context",
        return_value=("user-1", mock_session),
    ), patch(
-        "backend.copilot.sdk.file_ref.get_manager",
+        "backend.copilot.sdk.file_ref.get_workspace_manager",
        new=AsyncMock(return_value=mock_manager),
    ):
        result = await _read_file_handler(
@@ -276,7 +469,7 @@ async def test_read_file_bytes_workspace_virtual_path():
    mock_manager.read_file.return_value = b"virtual path content"

    with patch(
-        "backend.copilot.sdk.file_ref.get_manager",
+        "backend.copilot.sdk.file_ref.get_workspace_manager",
        new=AsyncMock(return_value=mock_manager),
    ):
        result = await read_file_bytes("workspace:///reports/q1.md", "user-1", session)
--- a/autogpt_platform/backend/backend/copilot/sdk/file_ref_test.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/file_ref_test.py
--- a/autogpt_platform/backend/backend/copilot/sdk/service.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/service.py
@@ -29,6 +29,7 @@ from langfuse import propagate_attributes
 from langsmith.integrations.claude_agent_sdk import configure_claude_agent_sdk
 from pydantic import BaseModel

+from backend.copilot.context import get_workspace_manager
 from backend.data.redis_client import get_redis_async
 from backend.executor.cluster_lock import AsyncClusterLock
 from backend.util.exceptions import NotFoundError
@@ -62,7 +63,6 @@ from ..service import (
 )
 from ..tools.e2b_sandbox import get_or_create_sandbox, pause_sandbox_direct
 from ..tools.sandbox import WORKSPACE_PREFIX, make_session_path
-from ..tools.workspace_files import get_manager
 from ..tracking import track_user_message
 from .compaction import CompactionTracker, filter_compaction_messages
 from .response_adapter import SDKResponseAdapter
@@ -565,7 +565,7 @@ async def _prepare_file_attachments(
        return empty

    try:
-        manager = await get_manager(user_id, session_id)
+        manager = await get_workspace_manager(user_id, session_id)
    except Exception:
        logger.warning(
            "Failed to create workspace manager for file attachments",
--- a/autogpt_platform/backend/backend/copilot/sdk/service_test.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/service_test.py
@@ -20,7 +20,7 @@ class _FakeFileInfo:
    size_bytes: int


-_PATCH_TARGET = "backend.copilot.sdk.service.get_manager"
+_PATCH_TARGET = "backend.copilot.sdk.service.get_workspace_manager"


 class TestPrepareFileAttachments:
--- a/autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
+++ b/autogpt_platform/backend/backend/copilot/sdk/tool_adapter.py
@@ -347,7 +347,7 @@ def create_copilot_mcp_server(*, use_e2b: bool = False):
    :func:`get_sdk_disallowed_tools`.
    """

-    def _truncating(fn, tool_name: str):
+    def _truncating(fn, tool_name: str, input_schema: dict[str, Any] | None = None):
        """Wrap a tool handler so its response is truncated to stay under the
        SDK's 10 MB JSON buffer, and stash the (truncated) output for the
        response adapter before the SDK can apply its own head-truncation.
@@ -361,7 +361,9 @@ def create_copilot_mcp_server(*, use_e2b: bool = False):
            user_id, session = get_execution_context()
            if session is not None:
                try:
-                    args = await expand_file_refs_in_args(args, user_id, session)
+                    args = await expand_file_refs_in_args(
+                        args, user_id, session, input_schema=input_schema
+                    )
                except FileRefExpansionError as exc:
                    return _mcp_error(
                        f"@@agptfile: reference could not be resolved: {exc}. "
@@ -389,11 +391,12 @@ def create_copilot_mcp_server(*, use_e2b: bool = False):

    for tool_name, base_tool in TOOL_REGISTRY.items():
        handler = create_tool_handler(base_tool)
+        schema = _build_input_schema(base_tool)
        decorated = tool(
            tool_name,
            base_tool.description,
-            _build_input_schema(base_tool),
-        )(_truncating(handler, tool_name))
+            schema,
+        )(_truncating(handler, tool_name, input_schema=schema))
        sdk_tools.append(decorated)

    # E2B file tools replace SDK built-in Read/Write/Edit/Glob/Grep.
--- a/autogpt_platform/backend/backend/copilot/service.py
+++ b/autogpt_platform/backend/backend/copilot/service.py
@@ -28,10 +28,24 @@ logger = logging.getLogger(__name__)

 config = ChatConfig()
 settings = Settings()
-client = LangfuseAsyncOpenAI(api_key=config.api_key, base_url=config.base_url)
+
+_client: LangfuseAsyncOpenAI | None = None
+_langfuse = None


-langfuse = get_client()
+def _get_openai_client() -> LangfuseAsyncOpenAI:
+    global _client
+    if _client is None:
+        _client = LangfuseAsyncOpenAI(api_key=config.api_key, base_url=config.base_url)
+    return _client
+
+
+def _get_langfuse():
+    global _langfuse
+    if _langfuse is None:
+        _langfuse = get_client()
+    return _langfuse
+

 # Default system prompt used when Langfuse is not configured
 # Provides minimal baseline tone and personality - all workflow, tools, and
@@ -84,7 +98,7 @@ async def _get_system_prompt_template(context: str) -> str:
                else "latest"
            )
            prompt = await asyncio.to_thread(
-                langfuse.get_prompt,
+                _get_langfuse().get_prompt,
                config.langfuse_prompt_name,
                label=label,
                cache_ttl_seconds=config.langfuse_prompt_cache_ttl,
@@ -158,7 +172,7 @@ async def _generate_session_title(
            "environment": settings.config.app_env.value,
        }

-        response = await client.chat.completions.create(
+        response = await _get_openai_client().chat.completions.create(
            model=config.title_model,
            messages=[
                {
--- a/autogpt_platform/backend/backend/copilot/tools/agent_browser.py
+++ b/autogpt_platform/backend/backend/copilot/tools/agent_browser.py
@@ -32,6 +32,7 @@ import shutil
 import tempfile
 from typing import Any

+from backend.copilot.context import get_workspace_manager
 from backend.copilot.model import ChatSession
 from backend.util.request import validate_url_host

@@ -43,7 +44,6 @@ from .models import (
    ErrorResponse,
    ToolResponseBase,
 )
-from .workspace_files import get_manager

 logger = logging.getLogger(__name__)

@@ -194,7 +194,7 @@ async def _save_browser_state(
            ),
        }

-        manager = await get_manager(user_id, session.session_id)
+        manager = await get_workspace_manager(user_id, session.session_id)
        await manager.write_file(
            content=json.dumps(state).encode("utf-8"),
            filename=_STATE_FILENAME,
@@ -218,7 +218,7 @@ async def _restore_browser_state(
    Returns True on success (or no state to restore), False on failure.
    """
    try:
-        manager = await get_manager(user_id, session.session_id)
+        manager = await get_workspace_manager(user_id, session.session_id)

        file_info = await manager.get_file_info_by_path(_STATE_FILENAME)
        if file_info is None:
@@ -360,7 +360,7 @@ async def close_browser_session(session_name: str, user_id: str | None = None) -
    # Delete persisted browser state (cookies, localStorage) from workspace.
    if user_id:
        try:
-            manager = await get_manager(user_id, session_name)
+            manager = await get_workspace_manager(user_id, session_name)
            file_info = await manager.get_file_info_by_path(_STATE_FILENAME)
            if file_info is not None:
                await manager.delete_file(file_info.id)
--- a/autogpt_platform/backend/backend/copilot/tools/agent_browser_test.py
+++ b/autogpt_platform/backend/backend/copilot/tools/agent_browser_test.py
@@ -897,7 +897,7 @@ class TestHasLocalSession:
 # _save_browser_state
 # ---------------------------------------------------------------------------

-_GET_MANAGER = "backend.copilot.tools.agent_browser.get_manager"
+_GET_MANAGER = "backend.copilot.tools.agent_browser.get_workspace_manager"


 def _make_mock_manager():
--- a/autogpt_platform/backend/backend/copilot/tools/run_block.py
+++ b/autogpt_platform/backend/backend/copilot/tools/run_block.py
@@ -12,6 +12,7 @@ from backend.copilot.constants import (
    COPILOT_SESSION_PREFIX,
 )
 from backend.copilot.model import ChatSession
+from backend.copilot.sdk.file_ref import FileRefExpansionError, expand_file_refs_in_args
 from backend.data.db_accessors import review_db
 from backend.data.execution import ExecutionContext

@@ -197,6 +198,29 @@ class RunBlockTool(BaseTool):
                session_id=session_id,
            )

+        # Expand @@agptfile: refs in input_data with the block's input
+        # schema.  The generic _truncating wrapper skips opaque object
+        # properties (input_data has no declared inner properties in the
+        # tool schema), so file ref tokens are still intact here.
+        # Using the block's schema lets us return raw text for string-typed
+        # fields and parsed structures for list/dict-typed fields.
+        if input_data:
+            try:
+                input_data = await expand_file_refs_in_args(
+                    input_data,
+                    user_id,
+                    session,
+                    input_schema=input_schema,
+                )
+            except FileRefExpansionError as exc:
+                return ErrorResponse(
+                    message=(
+                        f"Failed to resolve file reference: {exc}. "
+                        "Ensure the file exists before referencing it."
+                    ),
+                    session_id=session_id,
+                )
+
        if missing_credentials:
            # Return setup requirements response with missing credentials
            credentials_fields_info = block.input_schema.get_credentials_fields_info()
--- a/autogpt_platform/backend/backend/copilot/tools/workspace_files.py
+++ b/autogpt_platform/backend/backend/copilot/tools/workspace_files.py
@@ -10,11 +10,11 @@ from pydantic import BaseModel
 from backend.copilot.context import (
    E2B_WORKDIR,
    get_current_sandbox,
+    get_workspace_manager,
    resolve_sandbox_path,
 )
 from backend.copilot.model import ChatSession
 from backend.copilot.tools.sandbox import make_session_path
-from backend.data.db_accessors import workspace_db
 from backend.util.settings import Config
 from backend.util.virus_scanner import scan_content_safe
 from backend.util.workspace import WorkspaceManager
@@ -218,12 +218,6 @@ def _is_text_mime(mime_type: str) -> bool:
    return any(mime_type.startswith(t) for t in _TEXT_MIME_PREFIXES)


-async def get_manager(user_id: str, session_id: str) -> WorkspaceManager:
-    """Create a session-scoped WorkspaceManager."""
-    workspace = await workspace_db().get_or_create_workspace(user_id)
-    return WorkspaceManager(user_id, workspace.id, session_id)
-
-
 async def _resolve_file(
    manager: WorkspaceManager,
    file_id: str | None,
@@ -386,7 +380,7 @@ class ListWorkspaceFilesTool(BaseTool):
        include_all_sessions: bool = kwargs.get("include_all_sessions", False)

        try:
-            manager = await get_manager(user_id, session_id)
+            manager = await get_workspace_manager(user_id, session_id)
            files = await manager.list_files(
                path=path_prefix, limit=limit, include_all_sessions=include_all_sessions
            )
@@ -536,7 +530,7 @@ class ReadWorkspaceFileTool(BaseTool):
            )

        try:
-            manager = await get_manager(user_id, session_id)
+            manager = await get_workspace_manager(user_id, session_id)
            resolved = await _resolve_file(manager, file_id, path, session_id)
            if isinstance(resolved, ErrorResponse):
                return resolved
@@ -772,7 +766,7 @@ class WriteWorkspaceFileTool(BaseTool):

        try:
            await scan_content_safe(content, filename=filename)
-            manager = await get_manager(user_id, session_id)
+            manager = await get_workspace_manager(user_id, session_id)
            rec = await manager.write_file(
                content=content,
                filename=filename,
@@ -899,7 +893,7 @@ class DeleteWorkspaceFileTool(BaseTool):
            )

        try:
-            manager = await get_manager(user_id, session_id)
+            manager = await get_workspace_manager(user_id, session_id)
            resolved = await _resolve_file(manager, file_id, path, session_id)
            if isinstance(resolved, ErrorResponse):
                return resolved
--- a/autogpt_platform/backend/backend/util/file.py
+++ b/autogpt_platform/backend/backend/util/file.py
@@ -275,13 +275,12 @@ async def store_media_file(
    # Process file
    elif file.startswith("data:"):
        # Data URI
-        match = re.match(r"^data:([^;]+);base64,(.*)$", file, re.DOTALL)
-        if not match:
+        parsed_uri = parse_data_uri(file)
+        if parsed_uri is None:
            raise ValueError(
                "Invalid data URI format. Expected data:<mime>;base64,<data>"
            )
-        mime_type = match.group(1).strip().lower()
-        b64_content = match.group(2).strip()
+        mime_type, b64_content = parsed_uri

        # Generate filename and decode
        extension = _extension_from_mime(mime_type)
@@ -415,13 +414,70 @@ def get_dir_size(path: Path) -> int:
    return total


+async def resolve_media_content(
+    content: MediaFileType,
+    execution_context: "ExecutionContext",
+    *,
+    return_format: MediaReturnFormat,
+) -> MediaFileType:
+    """Resolve a ``MediaFileType`` value if it is a media reference, pass through otherwise.
+
+    Convenience wrapper around :func:`is_media_file_ref` + :func:`store_media_file`.
+    Plain text content (source code, filenames) is returned unchanged.  Media
+    references (``data:``, ``workspace://``, ``http(s)://``) are resolved via
+    :func:`store_media_file` using *return_format*.
+
+    Use this when a block field is typed as ``MediaFileType`` but may contain
+    either literal text or a media reference.
+    """
+    if not content or not is_media_file_ref(content):
+        return content
+    return await store_media_file(
+        content, execution_context, return_format=return_format
+    )
+
+
+def is_media_file_ref(value: str) -> bool:
+    """Return True if *value* looks like a ``MediaFileType`` reference.
+
+    Detects data URIs, workspace:// references, and HTTP(S) URLs — the
+    formats accepted by :func:`store_media_file`.  Plain text content
+    (e.g. source code, filenames) returns False.
+
+    Known limitation: HTTP(S) URL detection is heuristic.  Any string that
+    starts with ``http://`` or ``https://`` is treated as a media URL, even
+    if it appears as a URL inside source-code comments or documentation.
+    Blocks that produce source code or Markdown as output may therefore
+    trigger false positives.  Callers that need higher precision should
+    inspect the string further (e.g. verify the URL is reachable or has a
+    media-friendly extension).
+
+    Note: this does *not* match local file paths, which are ambiguous
+    (could be filenames or actual paths).  Blocks that need to resolve
+    local paths should check for them separately.
+    """
+    return value.startswith(("data:", "workspace://", "http://", "https://"))
+
+
+def parse_data_uri(value: str) -> tuple[str, str] | None:
+    """Parse a ``data:<mime>;base64,<payload>`` URI.
+
+    Returns ``(mime_type, base64_payload)`` if *value* is a valid data URI,
+    or ``None`` if it is not.
+    """
+    match = re.match(r"^data:([^;]+);base64,(.*)$", value, re.DOTALL)
+    if not match:
+        return None
+    return match.group(1).strip().lower(), match.group(2).strip()
+
+
 def get_mime_type(file: str) -> str:
    """
    Get the MIME type of a file, whether it's a data URI, URL, or local path.
    """
    if file.startswith("data:"):
-        match = re.match(r"^data:([^;]+);base64,", file)
-        return match.group(1) if match else "application/octet-stream"
+        parsed_uri = parse_data_uri(file)
+        return parsed_uri[0] if parsed_uri else "application/octet-stream"

    elif file.startswith(("http://", "https://")):
        parsed_url = urlparse(file)
--- a/autogpt_platform/backend/backend/util/file_content_parser.py
+++ b/autogpt_platform/backend/backend/util/file_content_parser.py
@@ -0,0 +1,375 @@
+"""Parse file content into structured Python objects based on file format.
+
+Used by the ``@@agptfile:`` expansion system to eagerly parse well-known file
+formats into native Python types *before* schema-driven coercion runs.  This
+lets blocks with ``Any``-typed inputs receive structured data rather than raw
+strings, while blocks expecting strings get the value coerced back via
+``convert()``.
+
+Supported formats:
+
+- **JSON** (``.json``) — arrays and objects are promoted; scalars stay as strings
+- **JSON Lines** (``.jsonl``, ``.ndjson``) — each non-empty line parsed as JSON;
+  when all lines are dicts with the same keys (tabular data), output is
+  ``list[list[Any]]`` with a header row, consistent with CSV/Parquet/Excel;
+  otherwise returns a plain ``list`` of parsed values
+- **CSV** (``.csv``) — ``csv.reader`` → ``list[list[str]]``
+- **TSV** (``.tsv``) — tab-delimited → ``list[list[str]]``
+- **YAML** (``.yaml``, ``.yml``) — parsed via PyYAML; containers only
+- **TOML** (``.toml``) — parsed via stdlib ``tomllib``
+- **Parquet** (``.parquet``) — via pandas/pyarrow → ``list[list[Any]]`` with header row
+- **Excel** (``.xlsx``) — via pandas/openpyxl → ``list[list[Any]]`` with header row
+  (legacy ``.xls`` is **not** supported — only the modern OOXML format)
+
+The **fallback contract** is enforced by :func:`parse_file_content`, not by
+individual parser functions.  If any parser raises, ``parse_file_content``
+catches the exception and returns the original content unchanged (string for
+text formats, bytes for binary formats).  Callers should never see an
+exception from the public API when ``strict=False``.
+"""
+
+import csv
+import io
+import json
+import logging
+import tomllib
+import zipfile
+from collections.abc import Callable
+
+# posixpath.splitext handles forward-slash URI paths correctly on all platforms,
+# unlike os.path.splitext which uses platform-native separators.
+from posixpath import splitext
+from typing import Any
+
+import yaml
+
+logger = logging.getLogger(__name__)
+
+# ---------------------------------------------------------------------------
+# Extension / MIME → format label mapping
+# ---------------------------------------------------------------------------
+
+_EXT_TO_FORMAT: dict[str, str] = {
+    ".json": "json",
+    ".jsonl": "jsonl",
+    ".ndjson": "jsonl",
+    ".csv": "csv",
+    ".tsv": "tsv",
+    ".yaml": "yaml",
+    ".yml": "yaml",
+    ".toml": "toml",
+    ".parquet": "parquet",
+    ".xlsx": "xlsx",
+}
+
+MIME_TO_FORMAT: dict[str, str] = {
+    "application/json": "json",
+    "application/x-ndjson": "jsonl",
+    "application/jsonl": "jsonl",
+    "text/csv": "csv",
+    "text/tab-separated-values": "tsv",
+    "application/x-yaml": "yaml",
+    "application/yaml": "yaml",
+    "text/yaml": "yaml",
+    "application/toml": "toml",
+    "application/vnd.apache.parquet": "parquet",
+    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": "xlsx",
+}
+
+# Formats that require raw bytes rather than decoded text.
+BINARY_FORMATS: frozenset[str] = frozenset({"parquet", "xlsx"})
+
+
+# ---------------------------------------------------------------------------
+# Public API  (top-down: main functions first, helpers below)
+# ---------------------------------------------------------------------------
+
+
+def infer_format_from_uri(uri: str) -> str | None:
+    """Return a format label based on URI extension or MIME fragment.
+
+    Returns ``None`` when the format cannot be determined — the caller should
+    fall back to returning the content as a plain string.
+    """
+    # 1. Check MIME fragment  (workspace://abc123#application/json)
+    if "#" in uri:
+        _, fragment = uri.rsplit("#", 1)
+        fmt = MIME_TO_FORMAT.get(fragment.lower())
+        if fmt:
+            return fmt
+
+    # 2. Check file extension from the path portion.
+    #    Strip the fragment first so ".json#mime" doesn't confuse splitext.
+    path = uri.split("#")[0].split("?")[0]
+    _, ext = splitext(path)
+    fmt = _EXT_TO_FORMAT.get(ext.lower())
+    if fmt is not None:
+        return fmt
+
+    # Legacy .xls is not supported — map it so callers can produce a
+    # user-friendly error instead of returning garbled binary.
+    if ext.lower() == ".xls":
+        return "xls"
+
+    return None
+
+
+def parse_file_content(content: str | bytes, fmt: str, *, strict: bool = False) -> Any:
+    """Parse *content* according to *fmt* and return a native Python value.
+
+    When *strict* is ``False`` (default), returns the original *content*
+    unchanged if *fmt* is not recognised or parsing fails for any reason.
+    This mode **never raises**.
+
+    When *strict* is ``True``, parsing errors are propagated to the caller.
+    Unrecognised formats or type mismatches (e.g. text for a binary format)
+    still return *content* unchanged without raising.
+    """
+    if fmt == "xls":
+        return (
+            "[Unsupported format] Legacy .xls files are not supported. "
+            "Please re-save the file as .xlsx (Excel 2007+) and upload again."
+        )
+
+    try:
+        if fmt in BINARY_FORMATS:
+            parser = _BINARY_PARSERS.get(fmt)
+            if parser is None:
+                return content
+            if isinstance(content, str):
+                # Caller gave us text for a binary format — can't parse.
+                return content
+            return parser(content)
+
+        parser = _TEXT_PARSERS.get(fmt)
+        if parser is None:
+            return content
+        if isinstance(content, bytes):
+            content = content.decode("utf-8", errors="replace")
+        return parser(content)
+
+    except PARSE_EXCEPTIONS:
+        if strict:
+            raise
+        logger.debug("Structured parsing failed for format=%s, falling back", fmt)
+        return content
+
+
+# ---------------------------------------------------------------------------
+# Exception loading helpers
+# ---------------------------------------------------------------------------
+
+
+def _load_openpyxl_exception() -> type[Exception]:
+    """Return openpyxl's InvalidFileException, raising ImportError if absent."""
+    from openpyxl.utils.exceptions import InvalidFileException  # noqa: PLC0415
+
+    return InvalidFileException
+
+
+def _load_arrow_exception() -> type[Exception]:
+    """Return pyarrow's ArrowException, raising ImportError if absent."""
+    from pyarrow import ArrowException  # noqa: PLC0415
+
+    return ArrowException
+
+
+def _optional_exc(loader: "Callable[[], type[Exception]]") -> "type[Exception] | None":
+    """Return the exception class from *loader*, or ``None`` if the dep is absent."""
+    try:
+        return loader()
+    except ImportError:
+        return None
+
+
+# Exception types that can be raised during file content parsing.
+# Shared between ``parse_file_content`` (which catches them in non-strict mode)
+# and ``file_ref._expand_bare_ref`` (which re-raises them as FileRefExpansionError).
+#
+# Optional-dependency exception types are loaded via a helper that raises
+# ``ImportError`` at *parse time* rather than silently becoming ``None`` here.
+# This ensures mypy sees clean types and missing deps surface as real errors.
+PARSE_EXCEPTIONS: tuple[type[BaseException], ...] = tuple(
+    exc
+    for exc in (
+        json.JSONDecodeError,
+        csv.Error,
+        yaml.YAMLError,
+        tomllib.TOMLDecodeError,
+        ValueError,
+        UnicodeDecodeError,
+        ImportError,
+        OSError,
+        KeyError,
+        TypeError,
+        zipfile.BadZipFile,
+        _optional_exc(_load_openpyxl_exception),
+        # ArrowException covers ArrowIOError and ArrowCapacityError which
+        # do not inherit from standard exceptions; ArrowInvalid/ArrowTypeError
+        # already map to ValueError/TypeError but this catches the rest.
+        _optional_exc(_load_arrow_exception),
+    )
+    if exc is not None
+)
+
+
+# ---------------------------------------------------------------------------
+# Text-based parsers  (content: str → Any)
+# ---------------------------------------------------------------------------
+
+
+def _parse_container(parser: Callable[[str], Any], content: str) -> list | dict | str:
+    """Parse *content* and return the result only if it is a container (list/dict).
+
+    Scalar values (strings, numbers, booleans, None) are discarded and the
+    original *content* string is returned instead.  This prevents e.g. a JSON
+    file containing just ``"42"`` from silently becoming an int.
+    """
+    parsed = parser(content)
+    if isinstance(parsed, (list, dict)):
+        return parsed
+    return content
+
+
+def _parse_json(content: str) -> list | dict | str:
+    return _parse_container(json.loads, content)
+
+
+def _parse_jsonl(content: str) -> Any:
+    lines = [json.loads(line) for line in content.splitlines() if line.strip()]
+    if not lines:
+        return content
+
+    # When every line is a dict with the same keys, convert to table format
+    # (header row + data rows) — consistent with CSV/TSV/Parquet/Excel output.
+    # Require ≥2 dicts so a single-line JSONL stays as [dict] (not a table).
+    if len(lines) >= 2 and all(isinstance(obj, dict) for obj in lines):
+        keys = list(lines[0].keys())
+        # Cache as tuple to avoid O(n×k) list allocations in the all() call.
+        keys_tuple = tuple(keys)
+        if keys and all(tuple(obj.keys()) == keys_tuple for obj in lines[1:]):
+            return [keys] + [[obj[k] for k in keys] for obj in lines]
+
+    return lines
+
+
+def _parse_csv(content: str) -> Any:
+    return _parse_delimited(content, delimiter=",")
+
+
+def _parse_tsv(content: str) -> Any:
+    return _parse_delimited(content, delimiter="\t")
+
+
+def _parse_delimited(content: str, *, delimiter: str) -> Any:
+    reader = csv.reader(io.StringIO(content), delimiter=delimiter)
+    # csv.reader never yields [] — blank lines yield [""]. Filter out
+    # rows where every cell is empty (i.e. truly blank lines).
+    rows = [row for row in reader if _row_has_content(row)]
+    if not rows:
+        return content
+    # If the declared delimiter produces only single-column rows, try
+    # sniffing the actual delimiter — catches misidentified files (e.g.
+    # a tab-delimited file with a .csv extension).
+    if len(rows[0]) == 1:
+        try:
+            dialect = csv.Sniffer().sniff(content[:8192])
+            if dialect.delimiter != delimiter:
+                reader = csv.reader(io.StringIO(content), dialect)
+                rows = [row for row in reader if _row_has_content(row)]
+        except csv.Error:
+            pass
+    if rows and len(rows[0]) >= 2:
+        return rows
+    return content
+
+
+def _row_has_content(row: list[str]) -> bool:
+    """Return True when *row* contains at least one non-empty cell.
+
+    ``csv.reader`` never yields ``[]`` — truly blank lines yield ``[""]``.
+    This predicate filters those out consistently across the initial read
+    and the sniffer-fallback re-read.
+    """
+    return any(cell for cell in row)
+
+
+def _parse_yaml(content: str) -> list | dict | str:
+    # NOTE: YAML anchor/alias expansion can amplify input beyond the 10MB cap.
+    # safe_load prevents code execution; for production hardening consider
+    # a YAML parser with expansion limits (e.g. ruamel.yaml with max_alias_count).
+    if "\n---" in content or content.startswith("---\n"):
+        # Multi-document YAML: only the first document is parsed; the rest
+        # are silently ignored by yaml.safe_load.  Warn so callers are aware.
+        logger.warning(
+            "Multi-document YAML detected (--- separator); "
+            "only the first document will be parsed."
+        )
+    return _parse_container(yaml.safe_load, content)
+
+
+def _parse_toml(content: str) -> Any:
+    parsed = tomllib.loads(content)
+    # tomllib.loads always returns a dict — return it even if empty.
+    return parsed
+
+
+_TEXT_PARSERS: dict[str, Callable[[str], Any]] = {
+    "json": _parse_json,
+    "jsonl": _parse_jsonl,
+    "csv": _parse_csv,
+    "tsv": _parse_tsv,
+    "yaml": _parse_yaml,
+    "toml": _parse_toml,
+}
+
+# ---------------------------------------------------------------------------
+# Binary-based parsers  (content: bytes → Any)
+# ---------------------------------------------------------------------------
+
+
+def _parse_parquet(content: bytes) -> list[list[Any]]:
+    import pandas as pd
+
+    df = pd.read_parquet(io.BytesIO(content))
+    return _df_to_rows(df)
+
+
+def _parse_xlsx(content: bytes) -> list[list[Any]]:
+    import pandas as pd
+
+    # Explicitly specify openpyxl engine; the default engine varies by pandas
+    # version and does not support legacy .xls (which is excluded by our format map).
+    df = pd.read_excel(io.BytesIO(content), engine="openpyxl")
+    return _df_to_rows(df)
+
+
+def _df_to_rows(df: Any) -> list[list[Any]]:
+    """Convert a DataFrame to ``list[list[Any]]`` with a header row.
+
+    NaN values are replaced with ``None`` so the result is JSON-serializable.
+    Uses explicit cell-level checking because ``df.where(df.notna(), None)``
+    silently converts ``None`` back to ``NaN`` in float64 columns.
+    """
+    header = df.columns.tolist()
+    rows = [
+        [None if _is_nan(cell) else cell for cell in row] for row in df.values.tolist()
+    ]
+    return [header] + rows
+
+
+def _is_nan(cell: Any) -> bool:
+    """Check if a cell value is NaN, handling non-scalar types (lists, dicts).
+
+    ``pd.isna()`` on a list/dict returns a boolean array which raises
+    ``ValueError`` in a boolean context.  Guard with a scalar check first.
+    """
+    import pandas as pd
+
+    return bool(pd.api.types.is_scalar(cell) and pd.isna(cell))
+
+
+_BINARY_PARSERS: dict[str, Callable[[bytes], Any]] = {
+    "parquet": _parse_parquet,
+    "xlsx": _parse_xlsx,
+}
--- a/autogpt_platform/backend/backend/util/file_content_parser_test.py
+++ b/autogpt_platform/backend/backend/util/file_content_parser_test.py
@@ -0,0 +1,624 @@
+"""Tests for file_content_parser — format inference and structured parsing."""
+
+import io
+import json
+
+import pytest
+
+from backend.util.file_content_parser import (
+    BINARY_FORMATS,
+    infer_format_from_uri,
+    parse_file_content,
+)
+
+# ---------------------------------------------------------------------------
+# infer_format_from_uri
+# ---------------------------------------------------------------------------
+
+
+class TestInferFormat:
+    # --- extension-based ---
+
+    def test_json_extension(self):
+        assert infer_format_from_uri("/home/user/data.json") == "json"
+
+    def test_jsonl_extension(self):
+        assert infer_format_from_uri("/tmp/events.jsonl") == "jsonl"
+
+    def test_ndjson_extension(self):
+        assert infer_format_from_uri("/tmp/events.ndjson") == "jsonl"
+
+    def test_csv_extension(self):
+        assert infer_format_from_uri("workspace:///reports/sales.csv") == "csv"
+
+    def test_tsv_extension(self):
+        assert infer_format_from_uri("/home/user/data.tsv") == "tsv"
+
+    def test_yaml_extension(self):
+        assert infer_format_from_uri("/home/user/config.yaml") == "yaml"
+
+    def test_yml_extension(self):
+        assert infer_format_from_uri("/home/user/config.yml") == "yaml"
+
+    def test_toml_extension(self):
+        assert infer_format_from_uri("/home/user/config.toml") == "toml"
+
+    def test_parquet_extension(self):
+        assert infer_format_from_uri("/data/table.parquet") == "parquet"
+
+    def test_xlsx_extension(self):
+        assert infer_format_from_uri("/data/spreadsheet.xlsx") == "xlsx"
+
+    def test_xls_extension_returns_xls_label(self):
+        # Legacy .xls is mapped so callers can produce a helpful error.
+        assert infer_format_from_uri("/data/old_spreadsheet.xls") == "xls"
+
+    def test_case_insensitive(self):
+        assert infer_format_from_uri("/data/FILE.JSON") == "json"
+        assert infer_format_from_uri("/data/FILE.CSV") == "csv"
+
+    def test_unicode_filename(self):
+        assert infer_format_from_uri("/home/user/\u30c7\u30fc\u30bf.json") == "json"
+        assert infer_format_from_uri("/home/user/\u00e9t\u00e9.csv") == "csv"
+
+    def test_unknown_extension(self):
+        assert infer_format_from_uri("/home/user/readme.txt") is None
+
+    def test_no_extension(self):
+        assert infer_format_from_uri("workspace://abc123") is None
+
+    # --- MIME-based ---
+
+    def test_mime_json(self):
+        assert infer_format_from_uri("workspace://abc123#application/json") == "json"
+
+    def test_mime_csv(self):
+        assert infer_format_from_uri("workspace://abc123#text/csv") == "csv"
+
+    def test_mime_tsv(self):
+        assert (
+            infer_format_from_uri("workspace://abc123#text/tab-separated-values")
+            == "tsv"
+        )
+
+    def test_mime_ndjson(self):
+        assert (
+            infer_format_from_uri("workspace://abc123#application/x-ndjson") == "jsonl"
+        )
+
+    def test_mime_yaml(self):
+        assert infer_format_from_uri("workspace://abc123#application/x-yaml") == "yaml"
+
+    def test_mime_xlsx(self):
+        uri = "workspace://abc123#application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
+        assert infer_format_from_uri(uri) == "xlsx"
+
+    def test_mime_parquet(self):
+        assert (
+            infer_format_from_uri("workspace://abc123#application/vnd.apache.parquet")
+            == "parquet"
+        )
+
+    def test_unknown_mime(self):
+        assert infer_format_from_uri("workspace://abc123#text/plain") is None
+
+    def test_unknown_mime_falls_through_to_extension(self):
+        # Unknown MIME (text/plain) should fall through to extension-based detection.
+        assert infer_format_from_uri("workspace:///data.csv#text/plain") == "csv"
+
+    # --- MIME takes precedence over extension ---
+
+    def test_mime_overrides_extension(self):
+        # .txt extension but JSON MIME → json
+        assert infer_format_from_uri("workspace:///file.txt#application/json") == "json"
+
+
+# ---------------------------------------------------------------------------
+# parse_file_content — JSON
+# ---------------------------------------------------------------------------
+
+
+class TestParseJson:
+    def test_array(self):
+        result = parse_file_content("[1, 2, 3]", "json")
+        assert result == [1, 2, 3]
+
+    def test_object(self):
+        result = parse_file_content('{"key": "value"}', "json")
+        assert result == {"key": "value"}
+
+    def test_nested(self):
+        content = json.dumps({"rows": [[1, 2], [3, 4]]})
+        result = parse_file_content(content, "json")
+        assert result == {"rows": [[1, 2], [3, 4]]}
+
+    def test_scalar_string_stays_as_string(self):
+        result = parse_file_content('"hello"', "json")
+        assert result == '"hello"'  # original content, not parsed
+
+    def test_scalar_number_stays_as_string(self):
+        result = parse_file_content("42", "json")
+        assert result == "42"
+
+    def test_scalar_boolean_stays_as_string(self):
+        result = parse_file_content("true", "json")
+        assert result == "true"
+
+    def test_null_stays_as_string(self):
+        result = parse_file_content("null", "json")
+        assert result == "null"
+
+    def test_invalid_json_fallback(self):
+        content = "not json at all"
+        result = parse_file_content(content, "json")
+        assert result == content
+
+    def test_empty_string_fallback(self):
+        result = parse_file_content("", "json")
+        assert result == ""
+
+    def test_bytes_input_decoded(self):
+        result = parse_file_content(b"[1, 2, 3]", "json")
+        assert result == [1, 2, 3]
+
+
+# ---------------------------------------------------------------------------
+# parse_file_content — JSONL
+# ---------------------------------------------------------------------------
+
+
+class TestParseJsonl:
+    def test_tabular_uniform_dicts_to_table_format(self):
+        """JSONL with uniform dict keys → table format (header + rows),
+        consistent with CSV/TSV/Parquet/Excel output."""
+        content = '{"name":"apple","color":"red"}\n{"name":"banana","color":"yellow"}\n{"name":"cherry","color":"red"}'
+        result = parse_file_content(content, "jsonl")
+        assert result == [
+            ["name", "color"],
+            ["apple", "red"],
+            ["banana", "yellow"],
+            ["cherry", "red"],
+        ]
+
+    def test_tabular_single_key_dicts(self):
+        """JSONL with single-key uniform dicts → table format."""
+        content = '{"a": 1}\n{"a": 2}\n{"a": 3}'
+        result = parse_file_content(content, "jsonl")
+        assert result == [["a"], [1], [2], [3]]
+
+    def test_tabular_blank_lines_skipped(self):
+        content = '{"a": 1}\n\n{"a": 2}\n'
+        result = parse_file_content(content, "jsonl")
+        assert result == [["a"], [1], [2]]
+
+    def test_heterogeneous_dicts_stay_as_list(self):
+        """JSONL with different keys across objects → list of dicts (no table)."""
+        content = '{"name":"apple"}\n{"color":"red"}\n{"size":3}'
+        result = parse_file_content(content, "jsonl")
+        assert result == [{"name": "apple"}, {"color": "red"}, {"size": 3}]
+
+    def test_partially_overlapping_keys_stay_as_list(self):
+        """JSONL dicts with partially overlapping keys → list of dicts."""
+        content = '{"name":"apple","color":"red"}\n{"name":"banana","size":"medium"}'
+        result = parse_file_content(content, "jsonl")
+        assert result == [
+            {"name": "apple", "color": "red"},
+            {"name": "banana", "size": "medium"},
+        ]
+
+    def test_mixed_types_stay_as_list(self):
+        """JSONL with non-dict lines → list of parsed values (no table)."""
+        content = '1\n"hello"\n[1,2]\n'
+        result = parse_file_content(content, "jsonl")
+        assert result == [1, "hello", [1, 2]]
+
+    def test_mixed_dicts_and_non_dicts_stay_as_list(self):
+        """JSONL mixing dicts and non-dicts → list of parsed values."""
+        content = '{"a": 1}\n42\n{"b": 2}'
+        result = parse_file_content(content, "jsonl")
+        assert result == [{"a": 1}, 42, {"b": 2}]
+
+    def test_tabular_preserves_key_order(self):
+        """Table header should follow the key order of the first object."""
+        content = '{"z": 1, "a": 2}\n{"z": 3, "a": 4}'
+        result = parse_file_content(content, "jsonl")
+        assert result[0] == ["z", "a"]  # order from first object
+        assert result[1] == [1, 2]
+        assert result[2] == [3, 4]
+
+    def test_single_dict_stays_as_list(self):
+        """Single-line JSONL with one dict → [dict], NOT a table.
+        Tabular detection requires ≥2 dicts to avoid vacuously true all()."""
+        content = '{"a": 1, "b": 2}'
+        result = parse_file_content(content, "jsonl")
+        assert result == [{"a": 1, "b": 2}]
+
+    def test_tabular_with_none_values(self):
+        """Uniform keys but some null values → table with None cells."""
+        content = '{"name":"apple","color":"red"}\n{"name":"banana","color":null}'
+        result = parse_file_content(content, "jsonl")
+        assert result == [
+            ["name", "color"],
+            ["apple", "red"],
+            ["banana", None],
+        ]
+
+    def test_empty_file_fallback(self):
+        result = parse_file_content("", "jsonl")
+        assert result == ""
+
+    def test_all_blank_lines_fallback(self):
+        result = parse_file_content("\n\n\n", "jsonl")
+        assert result == "\n\n\n"
+
+    def test_invalid_line_fallback(self):
+        content = '{"a": 1}\nnot json\n'
+        result = parse_file_content(content, "jsonl")
+        assert result == content  # fallback
+
+
+# ---------------------------------------------------------------------------
+# parse_file_content — CSV
+# ---------------------------------------------------------------------------
+
+
+class TestParseCsv:
+    def test_basic(self):
+        content = "Name,Score\nAlice,90\nBob,85"
+        result = parse_file_content(content, "csv")
+        assert result == [["Name", "Score"], ["Alice", "90"], ["Bob", "85"]]
+
+    def test_quoted_fields(self):
+        content = 'Name,Bio\nAlice,"Loves, commas"\nBob,Simple'
+        result = parse_file_content(content, "csv")
+        assert result[1] == ["Alice", "Loves, commas"]
+
+    def test_single_column_fallback(self):
+        # Only 1 column — not tabular enough.
+        content = "Name\nAlice\nBob"
+        result = parse_file_content(content, "csv")
+        assert result == content
+
+    def test_empty_rows_skipped(self):
+        content = "A,B\n\n1,2\n\n3,4"
+        result = parse_file_content(content, "csv")
+        assert result == [["A", "B"], ["1", "2"], ["3", "4"]]
+
+    def test_empty_file_fallback(self):
+        result = parse_file_content("", "csv")
+        assert result == ""
+
+    def test_utf8_bom(self):
+        """CSV with a UTF-8 BOM should parse correctly (BOM stripped by decode)."""
+        bom = "\ufeff"
+        content = bom + "Name,Score\nAlice,90\nBob,85"
+        result = parse_file_content(content, "csv")
+        # The BOM may be part of the first header cell; ensure rows are still parsed.
+        assert len(result) == 3
+        assert result[1] == ["Alice", "90"]
+        assert result[2] == ["Bob", "85"]
+
+
+# ---------------------------------------------------------------------------
+# parse_file_content — TSV
+# ---------------------------------------------------------------------------
+
+
+class TestParseTsv:
+    def test_basic(self):
+        content = "Name\tScore\nAlice\t90\nBob\t85"
+        result = parse_file_content(content, "tsv")
+        assert result == [["Name", "Score"], ["Alice", "90"], ["Bob", "85"]]
+
+    def test_single_column_fallback(self):
+        content = "Name\nAlice\nBob"
+        result = parse_file_content(content, "tsv")
+        assert result == content
+
+
+# ---------------------------------------------------------------------------
+# parse_file_content — YAML
+# ---------------------------------------------------------------------------
+
+
+class TestParseYaml:
+    def test_list(self):
+        content = "- apple\n- banana\n- cherry"
+        result = parse_file_content(content, "yaml")
+        assert result == ["apple", "banana", "cherry"]
+
+    def test_dict(self):
+        content = "name: Alice\nage: 30"
+        result = parse_file_content(content, "yaml")
+        assert result == {"name": "Alice", "age": 30}
+
+    def test_nested(self):
+        content = "users:\n  - name: Alice\n  - name: Bob"
+        result = parse_file_content(content, "yaml")
+        assert result == {"users": [{"name": "Alice"}, {"name": "Bob"}]}
+
+    def test_scalar_stays_as_string(self):
+        result = parse_file_content("hello world", "yaml")
+        assert result == "hello world"
+
+    def test_invalid_yaml_fallback(self):
+        content = ":\n  :\n    invalid: - -"
+        result = parse_file_content(content, "yaml")
+        # Malformed YAML should fall back to the original string, not raise.
+        assert result == content
+
+
+# ---------------------------------------------------------------------------
+# parse_file_content — TOML
+# ---------------------------------------------------------------------------
+
+
+class TestParseToml:
+    def test_basic(self):
+        content = '[server]\nhost = "localhost"\nport = 8080'
+        result = parse_file_content(content, "toml")
+        assert result == {"server": {"host": "localhost", "port": 8080}}
+
+    def test_flat(self):
+        content = 'name = "test"\ncount = 42'
+        result = parse_file_content(content, "toml")
+        assert result == {"name": "test", "count": 42}
+
+    def test_empty_string_returns_empty_dict(self):
+        result = parse_file_content("", "toml")
+        assert result == {}
+
+    def test_invalid_toml_fallback(self):
+        result = parse_file_content("not = [valid toml", "toml")
+        assert result == "not = [valid toml"
+
+
+# ---------------------------------------------------------------------------
+# parse_file_content — Parquet (binary)
+# ---------------------------------------------------------------------------
+
+
+try:
+    import pyarrow as _pa  # noqa: F401  # pyright: ignore[reportMissingImports]
+
+    _has_pyarrow = True
+except ImportError:
+    _has_pyarrow = False
+
+
+@pytest.mark.skipif(not _has_pyarrow, reason="pyarrow not installed")
+class TestParseParquet:
+    @pytest.fixture
+    def parquet_bytes(self) -> bytes:
+        import pandas as pd
+
+        df = pd.DataFrame({"Name": ["Alice", "Bob"], "Score": [90, 85]})
+        buf = io.BytesIO()
+        df.to_parquet(buf, index=False)
+        return buf.getvalue()
+
+    def test_basic(self, parquet_bytes: bytes):
+        result = parse_file_content(parquet_bytes, "parquet")
+        assert result == [["Name", "Score"], ["Alice", 90], ["Bob", 85]]
+
+    def test_string_input_fallback(self):
+        # Parquet is binary — string input can't be parsed.
+        result = parse_file_content("not parquet", "parquet")
+        assert result == "not parquet"
+
+    def test_invalid_bytes_fallback(self):
+        result = parse_file_content(b"not parquet bytes", "parquet")
+        assert result == b"not parquet bytes"
+
+    def test_empty_bytes_fallback(self):
+        """Empty binary input should return the empty bytes, not crash."""
+        result = parse_file_content(b"", "parquet")
+        assert result == b""
+
+    def test_nan_replaced_with_none(self):
+        """NaN values in Parquet must become None for JSON serializability."""
+        import math
+
+        import pandas as pd
+
+        df = pd.DataFrame({"A": [1.0, float("nan"), 3.0], "B": ["x", None, "z"]})
+        buf = io.BytesIO()
+        df.to_parquet(buf, index=False)
+        result = parse_file_content(buf.getvalue(), "parquet")
+        # Row with NaN in float col → None
+        assert result[2][0] is None  # float NaN → None
+        assert result[2][1] is None  # str None → None
+        # Ensure no NaN leaks
+        for row in result[1:]:
+            for cell in row:
+                if isinstance(cell, float):
+                    assert not math.isnan(cell), f"NaN leaked: {row}"
+
+
+# ---------------------------------------------------------------------------
+# parse_file_content — Excel (binary)
+# ---------------------------------------------------------------------------
+
+
+class TestParseExcel:
+    @pytest.fixture
+    def xlsx_bytes(self) -> bytes:
+        import pandas as pd
+
+        df = pd.DataFrame({"Name": ["Alice", "Bob"], "Score": [90, 85]})
+        buf = io.BytesIO()
+        df.to_excel(buf, index=False)  # type: ignore[arg-type]  # BytesIO is a valid target
+        return buf.getvalue()
+
+    def test_basic(self, xlsx_bytes: bytes):
+        result = parse_file_content(xlsx_bytes, "xlsx")
+        assert result == [["Name", "Score"], ["Alice", 90], ["Bob", 85]]
+
+    def test_string_input_fallback(self):
+        result = parse_file_content("not xlsx", "xlsx")
+        assert result == "not xlsx"
+
+    def test_invalid_bytes_fallback(self):
+        result = parse_file_content(b"not xlsx bytes", "xlsx")
+        assert result == b"not xlsx bytes"
+
+    def test_empty_bytes_fallback(self):
+        """Empty binary input should return the empty bytes, not crash."""
+        result = parse_file_content(b"", "xlsx")
+        assert result == b""
+
+    def test_nan_replaced_with_none(self):
+        """NaN values in float columns must become None for JSON serializability."""
+        import math
+
+        import pandas as pd
+
+        df = pd.DataFrame({"A": [1.0, float("nan"), 3.0], "B": ["x", "y", None]})
+        buf = io.BytesIO()
+        df.to_excel(buf, index=False)  # type: ignore[arg-type]
+        result = parse_file_content(buf.getvalue(), "xlsx")
+        # Row with NaN in float col → None, not float('nan')
+        assert result[2][0] is None  # float NaN → None
+        assert result[3][1] is None  # str None → None
+        # Ensure no NaN leaks
+        for row in result[1:]:  # skip header
+            for cell in row:
+                if isinstance(cell, float):
+                    assert not math.isnan(cell), f"NaN leaked: {row}"
+
+
+# ---------------------------------------------------------------------------
+# parse_file_content — unknown format / fallback
+# ---------------------------------------------------------------------------
+
+
+class TestFallback:
+    def test_unknown_format_returns_content(self):
+        result = parse_file_content("hello world", "xml")
+        assert result == "hello world"
+
+    def test_none_format_returns_content(self):
+        # Shouldn't normally be called with unrecognised format, but must not crash.
+        result = parse_file_content("hello", "unknown_format")
+        assert result == "hello"
+
+
+# ---------------------------------------------------------------------------
+# BINARY_FORMATS
+# ---------------------------------------------------------------------------
+
+
+class TestBinaryFormats:
+    def test_parquet_is_binary(self):
+        assert "parquet" in BINARY_FORMATS
+
+    def test_xlsx_is_binary(self):
+        assert "xlsx" in BINARY_FORMATS
+
+    def test_text_formats_not_binary(self):
+        for fmt in ("json", "jsonl", "csv", "tsv", "yaml", "toml"):
+            assert fmt not in BINARY_FORMATS
+
+
+# ---------------------------------------------------------------------------
+# MIME mapping
+# ---------------------------------------------------------------------------
+
+
+class TestMimeMapping:
+    def test_application_yaml(self):
+        assert infer_format_from_uri("workspace://abc123#application/yaml") == "yaml"
+
+
+# ---------------------------------------------------------------------------
+# CSV sniffer fallback
+# ---------------------------------------------------------------------------
+
+
+class TestCsvSnifferFallback:
+    def test_tab_delimited_with_csv_format(self):
+        """Tab-delimited content parsed as csv should use sniffer fallback."""
+        content = "Name\tScore\nAlice\t90\nBob\t85"
+        result = parse_file_content(content, "csv")
+        assert result == [["Name", "Score"], ["Alice", "90"], ["Bob", "85"]]
+
+    def test_sniffer_failure_returns_content(self):
+        """When sniffer fails, single-column falls back to raw content."""
+        content = "Name\nAlice\nBob"
+        result = parse_file_content(content, "csv")
+        assert result == content
+
+
+# ---------------------------------------------------------------------------
+# OpenpyxlInvalidFile fallback
+# ---------------------------------------------------------------------------
+
+
+class TestOpenpyxlFallback:
+    def test_invalid_xlsx_non_strict(self):
+        """Invalid xlsx bytes should fall back gracefully in non-strict mode."""
+        result = parse_file_content(b"not xlsx bytes", "xlsx")
+        assert result == b"not xlsx bytes"
+
+
+# ---------------------------------------------------------------------------
+# Header-only CSV
+# ---------------------------------------------------------------------------
+
+
+class TestHeaderOnlyCsv:
+    def test_header_only_csv_returns_header_row(self):
+        """CSV with only a header row (no data rows) should return [[header]]."""
+        content = "Name,Score"
+        result = parse_file_content(content, "csv")
+        assert result == [["Name", "Score"]]
+
+    def test_header_only_csv_with_trailing_newline(self):
+        content = "Name,Score\n"
+        result = parse_file_content(content, "csv")
+        assert result == [["Name", "Score"]]
+
+
+# ---------------------------------------------------------------------------
+# Binary format + line range (line range ignored for binary formats)
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.skipif(not _has_pyarrow, reason="pyarrow not installed")
+class TestBinaryFormatLineRange:
+    def test_parquet_ignores_line_range(self):
+        """Binary formats should parse the full file regardless of line range.
+
+        Line ranges are meaningless for binary formats (parquet/xlsx) — the
+        caller (file_ref._expand_bare_ref) passes raw bytes and the parser
+        should return the complete structured data.
+        """
+        import pandas as pd
+
+        df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
+        buf = io.BytesIO()
+        df.to_parquet(buf, index=False)
+        # parse_file_content itself doesn't take a line range — this tests
+        # that the full content is parsed even though the bytes could have
+        # been truncated upstream (it's not, by design).
+        result = parse_file_content(buf.getvalue(), "parquet")
+        assert result == [["A", "B"], [1, 4], [2, 5], [3, 6]]
+
+
+# ---------------------------------------------------------------------------
+# Legacy .xls UX
+# ---------------------------------------------------------------------------
+
+
+class TestXlsFallback:
+    def test_xls_returns_helpful_error_string(self):
+        """Uploading a .xls file should produce a helpful error, not garbled binary."""
+        result = parse_file_content(b"\xd0\xcf\x11\xe0garbled", "xls")
+        assert isinstance(result, str)
+        assert ".xlsx" in result
+        assert "not supported" in result.lower()
+
+    def test_xls_with_string_content(self):
+        result = parse_file_content("some text", "xls")
+        assert isinstance(result, str)
+        assert ".xlsx" in result
--- a/autogpt_platform/backend/backend/util/file_test.py
+++ b/autogpt_platform/backend/backend/util/file_test.py
@@ -8,7 +8,12 @@ from unittest.mock import AsyncMock, MagicMock, patch
 import pytest

 from backend.data.execution import ExecutionContext
-from backend.util.file import store_media_file
+from backend.util.file import (
+    is_media_file_ref,
+    parse_data_uri,
+    resolve_media_content,
+    store_media_file,
+)
 from backend.util.type import MediaFileType


@@ -344,3 +349,162 @@ class TestFileCloudIntegration:
                    execution_context=make_test_context(graph_exec_id=graph_exec_id),
                    return_format="for_local_processing",
                )
+
+
+# ---------------------------------------------------------------------------
+# is_media_file_ref
+# ---------------------------------------------------------------------------
+
+
+class TestIsMediaFileRef:
+    def test_data_uri(self):
+        assert is_media_file_ref("data:image/png;base64,iVBORw0KGg==") is True
+
+    def test_workspace_uri(self):
+        assert is_media_file_ref("workspace://abc123") is True
+
+    def test_workspace_uri_with_mime(self):
+        assert is_media_file_ref("workspace://abc123#image/png") is True
+
+    def test_http_url(self):
+        assert is_media_file_ref("http://example.com/image.png") is True
+
+    def test_https_url(self):
+        assert is_media_file_ref("https://example.com/image.png") is True
+
+    def test_plain_text(self):
+        assert is_media_file_ref("print('hello')") is False
+
+    def test_local_path(self):
+        assert is_media_file_ref("/tmp/file.txt") is False
+
+    def test_empty_string(self):
+        assert is_media_file_ref("") is False
+
+    def test_filename(self):
+        assert is_media_file_ref("image.png") is False
+
+
+# ---------------------------------------------------------------------------
+# parse_data_uri
+# ---------------------------------------------------------------------------
+
+
+class TestParseDataUri:
+    def test_valid_png(self):
+        result = parse_data_uri("data:image/png;base64,iVBORw0KGg==")
+        assert result is not None
+        mime, payload = result
+        assert mime == "image/png"
+        assert payload == "iVBORw0KGg=="
+
+    def test_valid_text(self):
+        result = parse_data_uri("data:text/plain;base64,SGVsbG8=")
+        assert result is not None
+        assert result[0] == "text/plain"
+        assert result[1] == "SGVsbG8="
+
+    def test_mime_case_normalized(self):
+        result = parse_data_uri("data:IMAGE/PNG;base64,abc")
+        assert result is not None
+        assert result[0] == "image/png"
+
+    def test_not_data_uri(self):
+        assert parse_data_uri("workspace://abc123") is None
+
+    def test_plain_text(self):
+        assert parse_data_uri("hello world") is None
+
+    def test_missing_base64(self):
+        assert parse_data_uri("data:image/png;utf-8,abc") is None
+
+    def test_empty_payload(self):
+        result = parse_data_uri("data:image/png;base64,")
+        assert result is not None
+        assert result[1] == ""
+
+
+# ---------------------------------------------------------------------------
+# resolve_media_content
+# ---------------------------------------------------------------------------
+
+
+class TestResolveMediaContent:
+    @pytest.mark.asyncio
+    async def test_plain_text_passthrough(self):
+        """Plain text content (not a media ref) passes through unchanged."""
+        ctx = make_test_context()
+        result = await resolve_media_content(
+            MediaFileType("print('hello')"),
+            ctx,
+            return_format="for_external_api",
+        )
+        assert result == "print('hello')"
+
+    @pytest.mark.asyncio
+    async def test_empty_string_passthrough(self):
+        """Empty string passes through unchanged."""
+        ctx = make_test_context()
+        result = await resolve_media_content(
+            MediaFileType(""),
+            ctx,
+            return_format="for_external_api",
+        )
+        assert result == ""
+
+    @pytest.mark.asyncio
+    async def test_media_ref_delegates_to_store(self):
+        """Media references are resolved via store_media_file."""
+        ctx = make_test_context()
+        with patch(
+            "backend.util.file.store_media_file",
+            new=AsyncMock(return_value=MediaFileType("data:image/png;base64,abc")),
+        ) as mock_store:
+            result = await resolve_media_content(
+                MediaFileType("workspace://img123"),
+                ctx,
+                return_format="for_external_api",
+            )
+        assert result == "data:image/png;base64,abc"
+        mock_store.assert_called_once_with(
+            MediaFileType("workspace://img123"),
+            ctx,
+            return_format="for_external_api",
+        )
+
+    @pytest.mark.asyncio
+    async def test_data_uri_delegates_to_store(self):
+        """Data URIs are also resolved via store_media_file."""
+        ctx = make_test_context()
+        data_uri = "data:image/png;base64,iVBORw0KGg=="
+        with patch(
+            "backend.util.file.store_media_file",
+            new=AsyncMock(return_value=MediaFileType(data_uri)),
+        ) as mock_store:
+            result = await resolve_media_content(
+                MediaFileType(data_uri),
+                ctx,
+                return_format="for_external_api",
+            )
+        assert result == data_uri
+        mock_store.assert_called_once()
+
+    @pytest.mark.asyncio
+    async def test_https_url_delegates_to_store(self):
+        """HTTPS URLs are resolved via store_media_file."""
+        ctx = make_test_context()
+        with patch(
+            "backend.util.file.store_media_file",
+            new=AsyncMock(return_value=MediaFileType("data:image/png;base64,abc")),
+        ) as mock_store:
+            result = await resolve_media_content(
+                MediaFileType("https://example.com/image.png"),
+                ctx,
+                return_format="for_local_processing",
+            )
+        assert result == "data:image/png;base64,abc"
+        mock_store.assert_called_once_with(
+            MediaFileType("https://example.com/image.png"),
+            ctx,
+            return_format="for_local_processing",
+        )
--- a/autogpt_platform/backend/backend/util/workspace.py
+++ b/autogpt_platform/backend/backend/util/workspace.py
@@ -183,7 +183,8 @@ class WorkspaceManager:
                f"{Config().max_file_size_mb}MB limit"
            )

-        # Virus scan content before persisting (defense in depth)
+        # Scan here — callers must NOT duplicate this scan.
+        # WorkspaceManager owns virus scanning for all persisted files.
        await scan_content_safe(content, filename=filename)

        # Determine path with session scoping
--- a/autogpt_platform/backend/poetry.lock
+++ b/autogpt_platform/backend/poetry.lock
@@ -1360,6 +1360,18 @@ files = [
 dnspython = ">=2.0.0"
 idna = ">=2.0.0"

+[[package]]
+name = "et-xmlfile"
+version = "2.0.0"
+description = "An implementation of lxml.xmlfile for the standard library"
+optional = false
+python-versions = ">=3.8"
+groups = ["main"]
+files = [
+    {file = "et_xmlfile-2.0.0-py3-none-any.whl", hash = "sha256:7a91720bc756843502c3b7504c77b8fe44217c85c537d85037f0f536151b2caa"},
+    {file = "et_xmlfile-2.0.0.tar.gz", hash = "sha256:dab3f4764309081ce75662649be815c4c9081e88f0837825f90fd28317d4da54"},
+]
+
 [[package]]
 name = "exa-py"
 version = "1.16.1"
@@ -4228,6 +4240,21 @@ datalib = ["numpy (>=1)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
 realtime = ["websockets (>=13,<16)"]
 voice-helpers = ["numpy (>=2.0.2)", "sounddevice (>=0.5.1)"]

+[[package]]
+name = "openpyxl"
+version = "3.1.5"
+description = "A Python library to read/write Excel 2010 xlsx/xlsm files"
+optional = false
+python-versions = ">=3.8"
+groups = ["main"]
+files = [
+    {file = "openpyxl-3.1.5-py2.py3-none-any.whl", hash = "sha256:5282c12b107bffeef825f4617dc029afaf41d0ea60823bbb665ef3079dc79de2"},
+    {file = "openpyxl-3.1.5.tar.gz", hash = "sha256:cf0e3cf56142039133628b5acffe8ef0c12bc902d2aadd3e0fe5878dc08d1050"},
+]
+
+[package.dependencies]
+et-xmlfile = "*"
+
 [[package]]
 name = "opentelemetry-api"
 version = "1.39.1"
@@ -5430,6 +5457,66 @@ files = [
    {file = "psycopg2_binary-2.9.11-cp39-cp39-win_amd64.whl", hash = "sha256:875039274f8a2361e5207857899706da840768e2a775bf8c65e82f60b197df02"},
 ]

+[[package]]
+name = "pyarrow"
+version = "23.0.1"
+description = "Python library for Apache Arrow"
+optional = false
+python-versions = ">=3.10"
+groups = ["main"]
+files = [
+    {file = "pyarrow-23.0.1-cp310-cp310-macosx_12_0_arm64.whl", hash = "sha256:3fab8f82571844eb3c460f90a75583801d14ca0cc32b1acc8c361650e006fd56"},
+    {file = "pyarrow-23.0.1-cp310-cp310-macosx_12_0_x86_64.whl", hash = "sha256:3f91c038b95f71ddfc865f11d5876c42f343b4495535bd262c7b321b0b94507c"},
+    {file = "pyarrow-23.0.1-cp310-cp310-manylinux_2_28_aarch64.whl", hash = "sha256:d0744403adabef53c985a7f8a082b502a368510c40d184df349a0a8754533258"},
+    {file = "pyarrow-23.0.1-cp310-cp310-manylinux_2_28_x86_64.whl", hash = "sha256:c33b5bf406284fd0bba436ed6f6c3ebe8e311722b441d89397c54f871c6863a2"},
+    {file = "pyarrow-23.0.1-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:ddf743e82f69dcd6dbbcb63628895d7161e04e56794ef80550ac6f3315eeb1d5"},
+    {file = "pyarrow-23.0.1-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:e052a211c5ac9848ae15d5ec875ed0943c0221e2fcfe69eee80b604b4e703222"},
+    {file = "pyarrow-23.0.1-cp310-cp310-win_amd64.whl", hash = "sha256:5abde149bb3ce524782d838eb67ac095cd3fd6090eba051130589793f1a7f76d"},
+    {file = "pyarrow-23.0.1-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:6f0147ee9e0386f519c952cc670eb4a8b05caa594eeffe01af0e25f699e4e9bb"},
+    {file = "pyarrow-23.0.1-cp311-cp311-macosx_12_0_x86_64.whl", hash = "sha256:0ae6e17c828455b6265d590100c295193f93cc5675eb0af59e49dbd00d2de350"},
+    {file = "pyarrow-23.0.1-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:fed7020203e9ef273360b9e45be52a2a47d3103caf156a30ace5247ffb51bdbd"},
+    {file = "pyarrow-23.0.1-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:26d50dee49d741ac0e82185033488d28d35be4d763ae6f321f97d1140eb7a0e9"},
+    {file = "pyarrow-23.0.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:3c30143b17161310f151f4a2bcfe41b5ff744238c1039338779424e38579d701"},
+    {file = "pyarrow-23.0.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:db2190fa79c80a23fdd29fef4b8992893f024ae7c17d2f5f4db7171fa30c2c78"},
+    {file = "pyarrow-23.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:f00f993a8179e0e1c9713bcc0baf6d6c01326a406a9c23495ec1ba9c9ebf2919"},
+    {file = "pyarrow-23.0.1-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:f4b0dbfa124c0bb161f8b5ebb40f1a680b70279aa0c9901d44a2b5a20806039f"},
+    {file = "pyarrow-23.0.1-cp312-cp312-macosx_12_0_x86_64.whl", hash = "sha256:7707d2b6673f7de054e2e83d59f9e805939038eebe1763fe811ee8fa5c0cd1a7"},
+    {file = "pyarrow-23.0.1-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:86ff03fb9f1a320266e0de855dee4b17da6794c595d207f89bba40d16b5c78b9"},
+    {file = "pyarrow-23.0.1-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:813d99f31275919c383aab17f0f455a04f5a429c261cc411b1e9a8f5e4aaaa05"},
+    {file = "pyarrow-23.0.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:bf5842f960cddd2ef757d486041d57c96483efc295a8c4a0e20e704cbbf39c67"},
+    {file = "pyarrow-23.0.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:564baf97c858ecc03ec01a41062e8f4698abc3e6e2acd79c01c2e97880a19730"},
+    {file = "pyarrow-23.0.1-cp312-cp312-win_amd64.whl", hash = "sha256:07deae7783782ac7250989a7b2ecde9b3c343a643f82e8a4df03d93b633006f0"},
+    {file = "pyarrow-23.0.1-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:6b8fda694640b00e8af3c824f99f789e836720aa8c9379fb435d4c4953a756b8"},
+    {file = "pyarrow-23.0.1-cp313-cp313-macosx_12_0_x86_64.whl", hash = "sha256:8ff51b1addc469b9444b7c6f3548e19dc931b172ab234e995a60aea9f6e6025f"},
+    {file = "pyarrow-23.0.1-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:71c5be5cbf1e1cb6169d2a0980850bccb558ddc9b747b6206435313c47c37677"},
+    {file = "pyarrow-23.0.1-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:9b6f4f17b43bc39d56fec96e53fe89d94bac3eb134137964371b45352d40d0c2"},
+    {file = "pyarrow-23.0.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:9fc13fc6c403d1337acab46a2c4346ca6c9dec5780c3c697cf8abfd5e19b6b37"},
+    {file = "pyarrow-23.0.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:5c16ed4f53247fa3ffb12a14d236de4213a4415d127fe9cebed33d51671113e2"},
+    {file = "pyarrow-23.0.1-cp313-cp313-win_amd64.whl", hash = "sha256:cecfb12ef629cf6be0b1887f9f86463b0dd3dc3195ae6224e74006be4736035a"},
+    {file = "pyarrow-23.0.1-cp313-cp313t-macosx_12_0_arm64.whl", hash = "sha256:29f7f7419a0e30264ea261fdc0e5fe63ce5a6095003db2945d7cd78df391a7e1"},
+    {file = "pyarrow-23.0.1-cp313-cp313t-macosx_12_0_x86_64.whl", hash = "sha256:33d648dc25b51fd8055c19e4261e813dfc4d2427f068bcecc8b53d01b81b0500"},
+    {file = "pyarrow-23.0.1-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:cd395abf8f91c673dd3589cadc8cc1ee4e8674fa61b2e923c8dd215d9c7d1f41"},
+    {file = "pyarrow-23.0.1-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:00be9576d970c31defb5c32eb72ef585bf600ef6d0a82d5eccaae96639cf9d07"},
+    {file = "pyarrow-23.0.1-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:c2139549494445609f35a5cda4eb94e2c9e4d704ce60a095b342f82460c73a83"},
+    {file = "pyarrow-23.0.1-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:7044b442f184d84e2351e5084600f0d7343d6117aabcbc1ac78eb1ae11eb4125"},
+    {file = "pyarrow-23.0.1-cp313-cp313t-win_amd64.whl", hash = "sha256:a35581e856a2fafa12f3f54fce4331862b1cfb0bef5758347a858a4aa9d6bae8"},
+    {file = "pyarrow-23.0.1-cp314-cp314-macosx_12_0_arm64.whl", hash = "sha256:5df1161da23636a70838099d4aaa65142777185cc0cdba4037a18cee7d8db9ca"},
+    {file = "pyarrow-23.0.1-cp314-cp314-macosx_12_0_x86_64.whl", hash = "sha256:fa8e51cb04b9f8c9c5ace6bab63af9a1f88d35c0d6cbf53e8c17c098552285e1"},
+    {file = "pyarrow-23.0.1-cp314-cp314-manylinux_2_28_aarch64.whl", hash = "sha256:0b95a3994f015be13c63148fef8832e8a23938128c185ee951c98908a696e0eb"},
+    {file = "pyarrow-23.0.1-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:4982d71350b1a6e5cfe1af742c53dfb759b11ce14141870d05d9e540d13bc5d1"},
+    {file = "pyarrow-23.0.1-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:c250248f1fe266db627921c89b47b7c06fee0489ad95b04d50353537d74d6886"},
+    {file = "pyarrow-23.0.1-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:5f4763b83c11c16e5f4c15601ba6dfa849e20723b46aa2617cb4bffe8768479f"},
+    {file = "pyarrow-23.0.1-cp314-cp314-win_amd64.whl", hash = "sha256:3a4c85ef66c134161987c17b147d6bffdca4566f9a4c1d81a0a01cdf08414ea5"},
+    {file = "pyarrow-23.0.1-cp314-cp314t-macosx_12_0_arm64.whl", hash = "sha256:17cd28e906c18af486a499422740298c52d7c6795344ea5002a7720b4eadf16d"},
+    {file = "pyarrow-23.0.1-cp314-cp314t-macosx_12_0_x86_64.whl", hash = "sha256:76e823d0e86b4fb5e1cf4a58d293036e678b5a4b03539be933d3b31f9406859f"},
+    {file = "pyarrow-23.0.1-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:a62e1899e3078bf65943078b3ad2a6ddcacf2373bc06379aac61b1e548a75814"},
+    {file = "pyarrow-23.0.1-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:df088e8f640c9fae3b1f495b3c64755c4e719091caf250f3a74d095ddf3c836d"},
+    {file = "pyarrow-23.0.1-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:46718a220d64677c93bc243af1d44b55998255427588e400677d7192671845c7"},
+    {file = "pyarrow-23.0.1-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:a09f3876e87f48bc2f13583ab551f0379e5dfb83210391e68ace404181a20690"},
+    {file = "pyarrow-23.0.1-cp314-cp314t-win_amd64.whl", hash = "sha256:527e8d899f14bd15b740cd5a54ad56b7f98044955373a17179d5956ddb93d9ce"},
+    {file = "pyarrow-23.0.1.tar.gz", hash = "sha256:b8c5873e33440b2bc2f4a79d2b47017a89c5a24116c055625e6f2ee50523f019"},
+]
+
 [[package]]
 name = "pyasn1"
 version = "0.6.2"
@@ -8882,4 +8969,4 @@ cffi = ["cffi (>=1.17,<2.0) ; platform_python_implementation != \"PyPy\" and pyt
 [metadata]
 lock-version = "2.1"
 python-versions = ">=3.10,<3.14"
-content-hash = "4e4365721cd3b68c58c237353b74adae1c64233fd4446904c335f23eb866fdca"
+content-hash = "86dab25684dd46e635a33bd33281a926e5626a874ecc048c34389fecf34a87d8"
--- a/autogpt_platform/backend/pyproject.toml
+++ b/autogpt_platform/backend/pyproject.toml
@@ -92,6 +92,8 @@ gravitas-md2gdocs = "^0.1.0"
 posthog = "^7.6.0"
 fpdf2 = "^2.8.6"
 langsmith = "^0.7.7"
+openpyxl = "^3.1.5"
+pyarrow = "^23.0.0"

 [tool.poetry.group.dev.dependencies]
 aiohappyeyeballs = "^2.6.1"
--- a/autogpt_platform/frontend/src/app/(platform)/build/tests/DraftRecoveryPopup.test.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/build/tests/DraftRecoveryPopup.test.tsx
@@ -1,172 +0,0 @@
-import React from "react";
-import { describe, it, expect, vi, beforeEach } from "vitest";
-import { render, screen, fireEvent } from "@testing-library/react";
-import { TooltipProvider } from "@radix-ui/react-tooltip";
-import { DraftRecoveryPopup } from "../components/DraftRecoveryDialog/DraftRecoveryPopup";
-
-const mockOnLoad = vi.fn();
-const mockOnDiscard = vi.fn();
-
-vi.mock("../components/DraftRecoveryDialog/useDraftRecoveryPopup", () => ({
-  useDraftRecoveryPopup: vi.fn(() => ({
-    isOpen: true,
-    popupRef: { current: null },
-    nodeCount: 3,
-    edgeCount: 2,
-    diff: {
-      nodes: { added: 1, removed: 0, modified: 2 },
-      edges: { added: 1, removed: 1, modified: 0 },
-    },
-    savedAt: Date.now(),
-    onLoad: mockOnLoad,
-    onDiscard: mockOnDiscard,
-  })),
-}));
-
-vi.mock("framer-motion", () => ({
-  AnimatePresence: ({ children }: { children: React.ReactNode }) => (
-    <>{children}</>
-  ),
-  motion: {
-    div: React.forwardRef(function MotionDiv(
-      props: Record<string, unknown>,
-      ref: React.Ref<HTMLDivElement>,
-    ) {
-      const {
-        children,
-        initial: _initial,
-        animate: _animate,
-        exit: _exit,
-        transition: _transition,
-        ...rest
-      } = props as {
-        children?: React.ReactNode;
-        initial?: unknown;
-        animate?: unknown;
-        exit?: unknown;
-        transition?: unknown;
-        [key: string]: unknown;
-      };
-      return (
-        <div ref={ref} {...rest}>
-          {children}
-        </div>
-      );
-    }),
-  },
-}));
-
-function renderWithProviders(ui: React.ReactElement) {
-  return render(<TooltipProvider>{ui}</TooltipProvider>);
-}
-
-beforeEach(() => {
-  vi.clearAllMocks();
-});
-
-describe("DraftRecoveryPopup", () => {
-  describe("when open with diff data", () => {
-    it("shows the unsaved changes message", () => {
-      renderWithProviders(<DraftRecoveryPopup isInitialLoadComplete={true} />);
-      expect(screen.getByText("Unsaved changes found")).toBeDefined();
-    });
-
-    it("displays diff summary", () => {
-      renderWithProviders(<DraftRecoveryPopup isInitialLoadComplete={true} />);
-      const text = document.body.textContent;
-      expect(text).toContain("+1/~2 blocks");
-      expect(text).toContain("+1/-1 connections");
-    });
-
-    it("renders restore and discard buttons", () => {
-      renderWithProviders(<DraftRecoveryPopup isInitialLoadComplete={true} />);
-      expect(screen.getAllByText("Restore changes").length).toBeGreaterThan(0);
-      expect(screen.getAllByText("Discard changes").length).toBeGreaterThan(0);
-    });
-
-    it("calls onLoad when restore is clicked", () => {
-      renderWithProviders(<DraftRecoveryPopup isInitialLoadComplete={true} />);
-      const buttons = screen.getAllByRole("button", {
-        name: /restore changes/i,
-      });
-      fireEvent.click(buttons[0]);
-      expect(mockOnLoad).toHaveBeenCalledOnce();
-    });
-
-    it("calls onDiscard when discard is clicked", () => {
-      renderWithProviders(<DraftRecoveryPopup isInitialLoadComplete={true} />);
-      const buttons = screen.getAllByRole("button", {
-        name: /discard changes/i,
-      });
-      fireEvent.click(buttons[0]);
-      expect(mockOnDiscard).toHaveBeenCalledOnce();
-    });
-  });
-
-  describe("when closed", () => {
-    it("renders nothing when isOpen is false", async () => {
-      const { useDraftRecoveryPopup } = await import(
-        "../components/DraftRecoveryDialog/useDraftRecoveryPopup"
-      );
-      vi.mocked(useDraftRecoveryPopup).mockReturnValue({
-        isOpen: false,
-        popupRef: { current: null },
-        nodeCount: 0,
-        edgeCount: 0,
-        diff: null,
-        savedAt: 0,
-        onLoad: vi.fn(),
-        onDiscard: vi.fn(),
-      });
-
-      const { container } = renderWithProviders(
-        <DraftRecoveryPopup isInitialLoadComplete={true} />,
-      );
-      expect(container.textContent).toBe("");
-    });
-  });
-
-  describe("when diff is null", () => {
-    it("falls back to node/edge count display", async () => {
-      const { useDraftRecoveryPopup } = await import(
-        "../components/DraftRecoveryDialog/useDraftRecoveryPopup"
-      );
-      vi.mocked(useDraftRecoveryPopup).mockReturnValue({
-        isOpen: true,
-        popupRef: { current: null },
-        nodeCount: 5,
-        edgeCount: 1,
-        diff: null,
-        savedAt: Date.now(),
-        onLoad: vi.fn(),
-        onDiscard: vi.fn(),
-      });
-
-      renderWithProviders(<DraftRecoveryPopup isInitialLoadComplete={true} />);
-      const text = document.body.textContent;
-      expect(text).toContain("5 blocks");
-      expect(text).toContain("1 connection");
-    });
-
-    it("uses singular for 1 block", async () => {
-      const { useDraftRecoveryPopup } = await import(
-        "../components/DraftRecoveryDialog/useDraftRecoveryPopup"
-      );
-      vi.mocked(useDraftRecoveryPopup).mockReturnValue({
-        isOpen: true,
-        popupRef: { current: null },
-        nodeCount: 1,
-        edgeCount: 0,
-        diff: null,
-        savedAt: Date.now(),
-        onLoad: vi.fn(),
-        onDiscard: vi.fn(),
-      });
-
-      renderWithProviders(<DraftRecoveryPopup isInitialLoadComplete={true} />);
-      const text = document.body.textContent;
-      expect(text).toContain("1 block,");
-      expect(text).toContain("0 connections");
-    });
-  });
-});
--- a/autogpt_platform/frontend/src/app/(platform)/build/tests/blockMenuStore.test.ts
+++ b/autogpt_platform/frontend/src/app/(platform)/build/tests/blockMenuStore.test.ts
@@ -1,246 +0,0 @@
-import { describe, it, expect, beforeEach } from "vitest";
-import { useBlockMenuStore } from "../stores/blockMenuStore";
-import { DefaultStateType } from "../components/NewControlPanel/NewBlockMenu/types";
-import { SearchEntryFilterAnyOfItem } from "@/app/api/__generated__/models/searchEntryFilterAnyOfItem";
-import { StoreAgent } from "@/app/api/__generated__/models/storeAgent";
-import { SearchResponseItemsItem } from "@/app/api/__generated__/models/searchResponseItemsItem";
-
-beforeEach(() => {
-  useBlockMenuStore.setState({
-    searchQuery: "",
-    searchId: undefined,
-    defaultState: DefaultStateType.SUGGESTION,
-    integration: undefined,
-    filters: [],
-    creators: [],
-    creators_list: [],
-    categoryCounts: {
-      blocks: 0,
-      integrations: 0,
-      marketplace_agents: 0,
-      my_agents: 0,
-    },
-  });
-});
-
-describe("blockMenuStore", () => {
-  describe("initial state", () => {
-    it("has empty search and suggestion default state", () => {
-      const state = useBlockMenuStore.getState();
-      expect(state.searchQuery).toBe("");
-      expect(state.searchId).toBeUndefined();
-      expect(state.defaultState).toBe("suggestion");
-      expect(state.integration).toBeUndefined();
-      expect(state.filters).toEqual([]);
-      expect(state.creators).toEqual([]);
-    });
-  });
-
-  describe("search state", () => {
-    it("sets search query", () => {
-      useBlockMenuStore.getState().setSearchQuery("weather");
-      expect(useBlockMenuStore.getState().searchQuery).toBe("weather");
-    });
-
-    it("sets search id", () => {
-      useBlockMenuStore.getState().setSearchId("abc-123");
-      expect(useBlockMenuStore.getState().searchId).toBe("abc-123");
-    });
-
-    it("clears search id", () => {
-      useBlockMenuStore.getState().setSearchId("abc-123");
-      useBlockMenuStore.getState().setSearchId(undefined);
-      expect(useBlockMenuStore.getState().searchId).toBeUndefined();
-    });
-  });
-
-  describe("default state", () => {
-    it("sets default state", () => {
-      useBlockMenuStore.getState().setDefaultState(DefaultStateType.ALL_BLOCKS);
-      expect(useBlockMenuStore.getState().defaultState).toBe(
-        DefaultStateType.ALL_BLOCKS,
-      );
-    });
-
-    it("changes between states", () => {
-      useBlockMenuStore
-        .getState()
-        .setDefaultState(DefaultStateType.INTEGRATIONS);
-      useBlockMenuStore.getState().setDefaultState(DefaultStateType.MY_AGENTS);
-      expect(useBlockMenuStore.getState().defaultState).toBe(
-        DefaultStateType.MY_AGENTS,
-      );
-    });
-  });
-
-  describe("integration", () => {
-    it("sets integration", () => {
-      useBlockMenuStore.getState().setIntegration("slack");
-      expect(useBlockMenuStore.getState().integration).toBe("slack");
-    });
-
-    it("clears integration", () => {
-      useBlockMenuStore.getState().setIntegration("slack");
-      useBlockMenuStore.getState().setIntegration(undefined);
-      expect(useBlockMenuStore.getState().integration).toBeUndefined();
-    });
-  });
-
-  describe("filters", () => {
-    it("adds a filter", () => {
-      useBlockMenuStore.getState().addFilter(SearchEntryFilterAnyOfItem.blocks);
-      expect(useBlockMenuStore.getState().filters).toEqual(["blocks"]);
-    });
-
-    it("adds multiple filters", () => {
-      useBlockMenuStore.getState().addFilter(SearchEntryFilterAnyOfItem.blocks);
-      useBlockMenuStore
-        .getState()
-        .addFilter(SearchEntryFilterAnyOfItem.integrations);
-      expect(useBlockMenuStore.getState().filters).toEqual([
-        "blocks",
-        "integrations",
-      ]);
-    });
-
-    it("removes a filter", () => {
-      useBlockMenuStore.getState().addFilter(SearchEntryFilterAnyOfItem.blocks);
-      useBlockMenuStore
-        .getState()
-        .addFilter(SearchEntryFilterAnyOfItem.integrations);
-      useBlockMenuStore
-        .getState()
-        .removeFilter(SearchEntryFilterAnyOfItem.blocks);
-      expect(useBlockMenuStore.getState().filters).toEqual(["integrations"]);
-    });
-
-    it("sets filters directly", () => {
-      useBlockMenuStore
-        .getState()
-        .setFilters([
-          SearchEntryFilterAnyOfItem.my_agents,
-          SearchEntryFilterAnyOfItem.marketplace_agents,
-        ]);
-      expect(useBlockMenuStore.getState().filters).toEqual([
-        "my_agents",
-        "marketplace_agents",
-      ]);
-    });
-
-    it("removing a non-existent filter is a no-op", () => {
-      useBlockMenuStore.getState().addFilter(SearchEntryFilterAnyOfItem.blocks);
-      useBlockMenuStore
-        .getState()
-        .removeFilter(SearchEntryFilterAnyOfItem.integrations);
-      expect(useBlockMenuStore.getState().filters).toEqual(["blocks"]);
-    });
-  });
-
-  describe("creators", () => {
-    it("adds a creator", () => {
-      useBlockMenuStore.getState().addCreator("alice");
-      expect(useBlockMenuStore.getState().creators).toEqual(["alice"]);
-    });
-
-    it("removes a creator", () => {
-      useBlockMenuStore.getState().addCreator("alice");
-      useBlockMenuStore.getState().addCreator("bob");
-      useBlockMenuStore.getState().removeCreator("alice");
-      expect(useBlockMenuStore.getState().creators).toEqual(["bob"]);
-    });
-
-    it("sets creators directly", () => {
-      useBlockMenuStore.getState().setCreators(["x", "y"]);
-      expect(useBlockMenuStore.getState().creators).toEqual(["x", "y"]);
-    });
-  });
-
-  describe("setCreatorsList", () => {
-    it("extracts creators from store_agent items", () => {
-      const items: SearchResponseItemsItem[] = [
-        {
-          slug: "agent-1",
-          agent_name: "Agent 1",
-          creator: "alice",
-        } as StoreAgent,
-        {
-          slug: "agent-2",
-          agent_name: "Agent 2",
-          creator: "bob",
-        } as StoreAgent,
-      ];
-
-      useBlockMenuStore.getState().setCreatorsList(items);
-      const list = useBlockMenuStore.getState().creators_list;
-      expect(list).toContain("alice");
-      expect(list).toContain("bob");
-    });
-
-    it("deduplicates creators across calls", () => {
-      const items1 = [
-        {
-          slug: "a1",
-          agent_name: "A1",
-          creator: "alice",
-        } as StoreAgent,
-      ] as SearchResponseItemsItem[];
-      const items2 = [
-        {
-          slug: "a2",
-          agent_name: "A2",
-          creator: "alice",
-        } as StoreAgent,
-      ] as SearchResponseItemsItem[];
-
-      useBlockMenuStore.getState().setCreatorsList(items1);
-      useBlockMenuStore.getState().setCreatorsList(items2);
-
-      const aliceCount = useBlockMenuStore
-        .getState()
-        .creators_list.filter((c) => c === "alice").length;
-      expect(aliceCount).toBe(1);
-    });
-  });
-
-  describe("categoryCounts", () => {
-    it("sets category counts", () => {
-      const counts = {
-        blocks: 10,
-        integrations: 5,
-        marketplace_agents: 3,
-        my_agents: 2,
-      };
-      useBlockMenuStore.getState().setCategoryCounts(counts);
-      expect(useBlockMenuStore.getState().categoryCounts).toEqual(counts);
-    });
-  });
-
-  describe("reset", () => {
-    it("resets search query, searchId, defaultState, and integration", () => {
-      useBlockMenuStore.getState().setSearchQuery("test");
-      useBlockMenuStore.getState().setSearchId("id-1");
-      useBlockMenuStore.getState().setDefaultState(DefaultStateType.ALL_BLOCKS);
-      useBlockMenuStore.getState().setIntegration("slack");
-      useBlockMenuStore.getState().addFilter(SearchEntryFilterAnyOfItem.blocks);
-      useBlockMenuStore.getState().addCreator("alice");
-
-      useBlockMenuStore.getState().reset();
-
-      const state = useBlockMenuStore.getState();
-      expect(state.searchQuery).toBe("");
-      expect(state.searchId).toBeUndefined();
-      expect(state.defaultState).toBe("suggestion");
-      expect(state.integration).toBeUndefined();
-    });
-
-    it("does not clear filters or creators", () => {
-      useBlockMenuStore.getState().addFilter(SearchEntryFilterAnyOfItem.blocks);
-      useBlockMenuStore.getState().addCreator("alice");
-
-      useBlockMenuStore.getState().reset();
-
-      expect(useBlockMenuStore.getState().filters).toEqual(["blocks"]);
-      expect(useBlockMenuStore.getState().creators).toEqual(["alice"]);
-    });
-  });
-});
--- a/autogpt_platform/frontend/src/app/(platform)/build/tests/controlPanelStore.test.ts
+++ b/autogpt_platform/frontend/src/app/(platform)/build/tests/controlPanelStore.test.ts
@@ -1,104 +0,0 @@
-import { describe, it, expect, beforeEach } from "vitest";
-import { useControlPanelStore } from "../stores/controlPanelStore";
-
-beforeEach(() => {
-  useControlPanelStore.getState().reset();
-});
-
-describe("controlPanelStore", () => {
-  describe("initial state", () => {
-    it("starts with all panels closed", () => {
-      const state = useControlPanelStore.getState();
-      expect(state.blockMenuOpen).toBe(false);
-      expect(state.saveControlOpen).toBe(false);
-      expect(state.forceOpenBlockMenu).toBe(false);
-      expect(state.forceOpenSave).toBe(false);
-    });
-  });
-
-  describe("setBlockMenuOpen", () => {
-    it("opens the block menu", () => {
-      useControlPanelStore.getState().setBlockMenuOpen(true);
-      expect(useControlPanelStore.getState().blockMenuOpen).toBe(true);
-    });
-
-    it("closes the block menu", () => {
-      useControlPanelStore.getState().setBlockMenuOpen(true);
-      useControlPanelStore.getState().setBlockMenuOpen(false);
-      expect(useControlPanelStore.getState().blockMenuOpen).toBe(false);
-    });
-  });
-
-  describe("setSaveControlOpen", () => {
-    it("opens the save control", () => {
-      useControlPanelStore.getState().setSaveControlOpen(true);
-      expect(useControlPanelStore.getState().saveControlOpen).toBe(true);
-    });
-
-    it("closes the save control", () => {
-      useControlPanelStore.getState().setSaveControlOpen(true);
-      useControlPanelStore.getState().setSaveControlOpen(false);
-      expect(useControlPanelStore.getState().saveControlOpen).toBe(false);
-    });
-  });
-
-  describe("setForceOpenBlockMenu", () => {
-    it("sets force open state", () => {
-      useControlPanelStore.getState().setForceOpenBlockMenu(true);
-      expect(useControlPanelStore.getState().forceOpenBlockMenu).toBe(true);
-    });
-
-    it("does not affect blockMenuOpen", () => {
-      useControlPanelStore.getState().setForceOpenBlockMenu(true);
-      expect(useControlPanelStore.getState().blockMenuOpen).toBe(false);
-    });
-  });
-
-  describe("setForceOpenSave", () => {
-    it("sets force open state", () => {
-      useControlPanelStore.getState().setForceOpenSave(true);
-      expect(useControlPanelStore.getState().forceOpenSave).toBe(true);
-    });
-
-    it("does not affect saveControlOpen", () => {
-      useControlPanelStore.getState().setForceOpenSave(true);
-      expect(useControlPanelStore.getState().saveControlOpen).toBe(false);
-    });
-  });
-
-  describe("independent panel state", () => {
-    it("opening block menu does not affect save control", () => {
-      useControlPanelStore.getState().setBlockMenuOpen(true);
-      expect(useControlPanelStore.getState().saveControlOpen).toBe(false);
-    });
-
-    it("opening save control does not affect block menu", () => {
-      useControlPanelStore.getState().setSaveControlOpen(true);
-      expect(useControlPanelStore.getState().blockMenuOpen).toBe(false);
-    });
-
-    it("both panels can be open simultaneously", () => {
-      useControlPanelStore.getState().setBlockMenuOpen(true);
-      useControlPanelStore.getState().setSaveControlOpen(true);
-      expect(useControlPanelStore.getState().blockMenuOpen).toBe(true);
-      expect(useControlPanelStore.getState().saveControlOpen).toBe(true);
-    });
-  });
-
-  describe("reset", () => {
-    it("resets all state to defaults", () => {
-      useControlPanelStore.getState().setBlockMenuOpen(true);
-      useControlPanelStore.getState().setSaveControlOpen(true);
-      useControlPanelStore.getState().setForceOpenBlockMenu(true);
-      useControlPanelStore.getState().setForceOpenSave(true);
-
-      useControlPanelStore.getState().reset();
-
-      const state = useControlPanelStore.getState();
-      expect(state.blockMenuOpen).toBe(false);
-      expect(state.saveControlOpen).toBe(false);
-      expect(state.forceOpenBlockMenu).toBe(false);
-      expect(state.forceOpenSave).toBe(false);
-    });
-  });
-});
--- a/autogpt_platform/frontend/src/app/(platform)/build/tests/tutorialStore.test.ts
+++ b/autogpt_platform/frontend/src/app/(platform)/build/tests/tutorialStore.test.ts
@@ -1,118 +0,0 @@
-import { describe, it, expect, beforeEach } from "vitest";
-import { useTutorialStore } from "../stores/tutorialStore";
-
-beforeEach(() => {
-  useTutorialStore.setState({
-    isTutorialRunning: false,
-    currentStep: 0,
-    forceOpenRunInputDialog: false,
-    tutorialInputValues: {},
-  });
-});
-
-describe("tutorialStore", () => {
-  describe("initial state", () => {
-    it("starts with tutorial not running at step 0", () => {
-      const state = useTutorialStore.getState();
-      expect(state.isTutorialRunning).toBe(false);
-      expect(state.currentStep).toBe(0);
-      expect(state.forceOpenRunInputDialog).toBe(false);
-      expect(state.tutorialInputValues).toEqual({});
-    });
-  });
-
-  describe("setIsTutorialRunning", () => {
-    it("starts the tutorial", () => {
-      useTutorialStore.getState().setIsTutorialRunning(true);
-      expect(useTutorialStore.getState().isTutorialRunning).toBe(true);
-    });
-
-    it("stops the tutorial", () => {
-      useTutorialStore.getState().setIsTutorialRunning(true);
-      useTutorialStore.getState().setIsTutorialRunning(false);
-      expect(useTutorialStore.getState().isTutorialRunning).toBe(false);
-    });
-  });
-
-  describe("setCurrentStep", () => {
-    it("advances to a step", () => {
-      useTutorialStore.getState().setCurrentStep(3);
-      expect(useTutorialStore.getState().currentStep).toBe(3);
-    });
-
-    it("can go back to a previous step", () => {
-      useTutorialStore.getState().setCurrentStep(5);
-      useTutorialStore.getState().setCurrentStep(2);
-      expect(useTutorialStore.getState().currentStep).toBe(2);
-    });
-
-    it("can reset to step 0", () => {
-      useTutorialStore.getState().setCurrentStep(4);
-      useTutorialStore.getState().setCurrentStep(0);
-      expect(useTutorialStore.getState().currentStep).toBe(0);
-    });
-  });
-
-  describe("setForceOpenRunInputDialog", () => {
-    it("forces the dialog open", () => {
-      useTutorialStore.getState().setForceOpenRunInputDialog(true);
-      expect(useTutorialStore.getState().forceOpenRunInputDialog).toBe(true);
-    });
-
-    it("closes the forced dialog", () => {
-      useTutorialStore.getState().setForceOpenRunInputDialog(true);
-      useTutorialStore.getState().setForceOpenRunInputDialog(false);
-      expect(useTutorialStore.getState().forceOpenRunInputDialog).toBe(false);
-    });
-  });
-
-  describe("setTutorialInputValues", () => {
-    it("sets input values", () => {
-      useTutorialStore
-        .getState()
-        .setTutorialInputValues({ topic: "AI agents" });
-      expect(useTutorialStore.getState().tutorialInputValues).toEqual({
-        topic: "AI agents",
-      });
-    });
-
-    it("replaces previous values entirely", () => {
-      useTutorialStore.getState().setTutorialInputValues({ a: "1" });
-      useTutorialStore.getState().setTutorialInputValues({ b: "2" });
-      expect(useTutorialStore.getState().tutorialInputValues).toEqual({
-        b: "2",
-      });
-    });
-
-    it("clears values with empty object", () => {
-      useTutorialStore.getState().setTutorialInputValues({ x: "y" });
-      useTutorialStore.getState().setTutorialInputValues({});
-      expect(useTutorialStore.getState().tutorialInputValues).toEqual({});
-    });
-  });
-
-  describe("step progression lifecycle", () => {
-    it("simulates a full tutorial run", () => {
-      useTutorialStore.getState().setIsTutorialRunning(true);
-      expect(useTutorialStore.getState().isTutorialRunning).toBe(true);
-
-      useTutorialStore.getState().setCurrentStep(1);
-      useTutorialStore.getState().setCurrentStep(2);
-      useTutorialStore.getState().setCurrentStep(3);
-
-      useTutorialStore.getState().setForceOpenRunInputDialog(true);
-      useTutorialStore
-        .getState()
-        .setTutorialInputValues({ prompt: "test prompt" });
-      useTutorialStore.getState().setForceOpenRunInputDialog(false);
-
-      useTutorialStore.getState().setCurrentStep(4);
-      useTutorialStore.getState().setIsTutorialRunning(false);
-      useTutorialStore.getState().setCurrentStep(0);
-
-      const state = useTutorialStore.getState();
-      expect(state.isTutorialRunning).toBe(false);
-      expect(state.currentStep).toBe(0);
-    });
-  });
-});
--- a/autogpt_platform/frontend/src/app/(platform)/copilot/components/ChatInput/useChatInput.ts
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/components/ChatInput/useChatInput.ts
@@ -1,3 +1,4 @@
+import { useCopilotUIStore } from "@/app/(platform)/copilot/store";
 import { ChangeEvent, FormEvent, useEffect, useState } from "react";

 interface Args {
@@ -16,6 +17,16 @@ export function useChatInput({
 }: Args) {
  const [value, setValue] = useState("");
  const [isSending, setIsSending] = useState(false);
+  const { initialPrompt, setInitialPrompt } = useCopilotUIStore();
+
+  useEffect(
+    function consumeInitialPrompt() {
+      if (!initialPrompt) return;
+      setValue((prev) => (prev.length === 0 ? initialPrompt : prev));
+      setInitialPrompt(null);
+    },
+    [initialPrompt, setInitialPrompt],
+  );

  useEffect(
    function focusOnMount() {
--- a/autogpt_platform/frontend/src/app/(platform)/copilot/store.ts
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/store.ts
@@ -7,6 +7,10 @@ export interface DeleteTarget {
 }

 interface CopilotUIState {
+  /** Prompt extracted from URL hash (e.g. /copilot#prompt=...) for input prefill. */
+  initialPrompt: string | null;
+  setInitialPrompt: (prompt: string | null) => void;
+
  sessionToDelete: DeleteTarget | null;
  setSessionToDelete: (target: DeleteTarget | null) => void;

@@ -31,6 +35,9 @@ interface CopilotUIState {
 }

 export const useCopilotUIStore = create<CopilotUIState>((set) => ({
+  initialPrompt: null,
+  setInitialPrompt: (prompt) => set({ initialPrompt: prompt }),
+
  sessionToDelete: null,
  setSessionToDelete: (target) => set({ sessionToDelete: target }),

--- a/autogpt_platform/frontend/src/app/(platform)/copilot/useCopilotPage.ts
+++ b/autogpt_platform/frontend/src/app/(platform)/copilot/useCopilotPage.ts
@@ -19,6 +19,42 @@ import { useCopilotStream } from "./useCopilotStream";
 const TITLE_POLL_INTERVAL_MS = 2_000;
 const TITLE_POLL_MAX_ATTEMPTS = 5;

+/**
+ * Extract a prompt from the URL hash fragment.
+ * Supports: /copilot#prompt=URL-encoded-text
+ * Optionally auto-submits if ?autosubmit=true is in the query string.
+ * Returns null if no prompt is present.
+ */
+function extractPromptFromUrl(): {
+  prompt: string;
+  autosubmit: boolean;
+} | null {
+  if (typeof window === "undefined") return null;
+
+  const hash = window.location.hash;
+  if (!hash) return null;
+
+  const hashParams = new URLSearchParams(hash.slice(1));
+  const prompt = hashParams.get("prompt");
+
+  if (!prompt || !prompt.trim()) return null;
+
+  const searchParams = new URLSearchParams(window.location.search);
+  const autosubmit = searchParams.get("autosubmit") === "true";
+
+  // Clean up hash + autosubmit param only (preserve other query params)
+  const cleanURL = new URL(window.location.href);
+  cleanURL.hash = "";
+  cleanURL.searchParams.delete("autosubmit");
+  window.history.replaceState(
+    null,
+    "",
+    `${cleanURL.pathname}${cleanURL.search}`,
+  );
+
+  return { prompt: prompt.trim(), autosubmit };
+}
+
 interface UploadedFile {
  file_id: string;
  name: string;
@@ -127,6 +163,28 @@ export function useCopilotPage() {
    }
  }, [sessionId, pendingMessage, sendMessage]);

+  // --- Extract prompt from URL hash on mount (e.g. /copilot#prompt=Hello) ---
+  const { setInitialPrompt } = useCopilotUIStore();
+  const hasProcessedUrlPrompt = useRef(false);
+  useEffect(() => {
+    if (hasProcessedUrlPrompt.current) return;
+
+    const urlPrompt = extractPromptFromUrl();
+    if (!urlPrompt) return;
+
+    hasProcessedUrlPrompt.current = true;
+
+    if (urlPrompt.autosubmit) {
+      setPendingMessage(urlPrompt.prompt);
+      void createSession().catch(() => {
+        setPendingMessage(null);
+        setInitialPrompt(urlPrompt.prompt);
+      });
+    } else {
+      setInitialPrompt(urlPrompt.prompt);
+    }
+  }, [createSession, setInitialPrompt]);
+
  async function uploadFiles(
    files: File[],
    sid: string,
--- a/autogpt_platform/frontend/src/app/(platform)/library/components/JumpBackIn/JumpBackIn.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/components/JumpBackIn/JumpBackIn.tsx
@@ -0,0 +1,40 @@
+"use client";
+
+import { ArrowRight, Lightning } from "@phosphor-icons/react";
+import NextLink from "next/link";
+
+import { Button } from "@/components/atoms/Button/Button";
+import { Text } from "@/components/atoms/Text/Text";
+import { useJumpBackIn } from "./useJumpBackIn";
+
+export function JumpBackIn() {
+  const { agent, isLoading } = useJumpBackIn();
+
+  if (isLoading || !agent) {
+    return null;
+  }
+
+  return (
+    <div className="flex items-center justify-between rounded-large border border-zinc-200 bg-gradient-to-r from-zinc-50 to-white px-5 py-4">
+      <div className="flex items-center gap-3">
+        <div className="flex h-9 w-9 items-center justify-center rounded-full bg-zinc-900">
+          <Lightning size={18} weight="fill" className="text-white" />
+        </div>
+        <div className="flex flex-col">
+          <Text variant="small" className="text-zinc-500">
+            Continue where you left off
+          </Text>
+          <Text variant="body-medium" className="text-zinc-900">
+            {agent.name}
+          </Text>
+        </div>
+      </div>
+      <NextLink href={`/library/agents/${agent.id}`}>
+        <Button variant="primary" size="small" className="gap-1.5">
+          Jump Back In
+          <ArrowRight size={16} />
+        </Button>
+      </NextLink>
+    </div>
+  );
+}
--- a/autogpt_platform/frontend/src/app/(platform)/library/components/JumpBackIn/useJumpBackIn.ts
+++ b/autogpt_platform/frontend/src/app/(platform)/library/components/JumpBackIn/useJumpBackIn.ts
@@ -0,0 +1,28 @@
+"use client";
+
+import { useGetV2ListLibraryAgents } from "@/app/api/__generated__/endpoints/library/library";
+import { okData } from "@/app/api/helpers";
+
+export function useJumpBackIn() {
+  const { data, isLoading } = useGetV2ListLibraryAgents(
+    {
+      page: 1,
+      page_size: 1,
+      sort_by: "updatedAt",
+    },
+    {
+      query: { select: okData },
+    },
+  );
+
+  // The API doesn't include execution data by default (include_executions is
+  // internal to the backend), so recent_executions is always empty here.
+  // We use the most recently updated agent as the "jump back in" candidate
+  // instead — updatedAt is the best available proxy for recent activity.
+  const agent = data?.agents[0] ?? null;
+
+  return {
+    agent,
+    isLoading,
+  };
+}
--- a/autogpt_platform/frontend/src/app/(platform)/library/page.tsx
+++ b/autogpt_platform/frontend/src/app/(platform)/library/page.tsx
@@ -2,6 +2,7 @@

 import { useEffect, useState, useCallback } from "react";
 import { HeartIcon, ListIcon } from "@phosphor-icons/react";
+import { JumpBackIn } from "./components/JumpBackIn/JumpBackIn";
 import { LibraryActionHeader } from "./components/LibraryActionHeader/LibraryActionHeader";
 import { LibraryAgentList } from "./components/LibraryAgentList/LibraryAgentList";
 import { Tab } from "./components/LibraryTabs/LibraryTabs";
@@ -38,6 +39,7 @@ export default function LibraryPage() {
      onAnimationComplete={handleFavoriteAnimationComplete}
    >
      <main className="pt-160 container min-h-screen space-y-4 pb-20 pt-16 sm:px-8 md:px-12">
+        <JumpBackIn />
        <LibraryActionHeader setSearchTerm={setSearchTerm} />
        <LibraryAgentList
          searchTerm={searchTerm}
--- a/docs/platform/workspace-media-architecture.md
+++ b/docs/platform/workspace-media-architecture.md
@@ -0,0 +1,343 @@
+# Workspace & Media File Architecture
+
+This document describes the architecture for handling user files in AutoGPT Platform, covering persistent user storage (Workspace) and ephemeral media processing pipelines.
+
+## Overview
+
+The platform has two distinct file-handling layers:
+
+| Layer | Purpose | Persistence | Scope |
+|-------|---------|-------------|-------|
+| **Workspace** | Long-term user file storage | Persistent (DB + GCS/local) | Per-user, session-scoped access |
+| **Media Pipeline** | Ephemeral file processing for blocks | Temporary (local disk) | Per-execution |
+
+## Database Models
+
+### UserWorkspace
+
+Represents a user's file storage space. Created on-demand (one per user).
+
+```prisma
+model UserWorkspace {
+  id        String   @id @default(uuid())
+  createdAt DateTime @default(now())
+  updatedAt DateTime @updatedAt
+  userId    String   @unique
+  Files     UserWorkspaceFile[]
+}
+```
+
+**Key points:**
+- One workspace per user (enforced by `@unique` on `userId`)
+- Created lazily via `get_or_create_workspace()` 
+- Uses upsert to handle race conditions
+
+### UserWorkspaceFile
+
+Represents a file stored in a user's workspace.
+
+```prisma
+model UserWorkspaceFile {
+  id          String    @id @default(uuid())
+  workspaceId String
+  name        String    // User-visible filename
+  path        String    // Virtual path (e.g., "/sessions/abc123/image.png")
+  storagePath String    // Actual storage path (gcs://... or local://...)
+  mimeType    String
+  sizeBytes   BigInt
+  checksum    String?   // SHA256 for integrity
+  isDeleted   Boolean   @default(false)
+  deletedAt   DateTime?
+  metadata    Json      @default("{}")
+
+  @@unique([workspaceId, path])  // Enforce unique paths within workspace
+}
+```
+
+**Key points:**
+- `path` is a virtual path for organizing files (not actual filesystem path)
+- `storagePath` contains the actual GCS or local storage location
+- Soft-delete pattern: `isDeleted` flag with `deletedAt` timestamp
+- Path is modified on delete to free up the virtual path for reuse
+
+---
+
+## WorkspaceManager
+
+**Location:** `backend/util/workspace.py`
+
+High-level API for workspace file operations. Combines storage backend operations with database record management.
+
+### Initialization
+
+```python
+from backend.util.workspace import WorkspaceManager
+
+# Basic usage
+manager = WorkspaceManager(user_id="user-123", workspace_id="ws-456")
+
+# With session scoping (CoPilot sessions)
+manager = WorkspaceManager(
+    user_id="user-123",
+    workspace_id="ws-456", 
+    session_id="session-789"
+)
+```
+
+### Session Scoping
+
+When `session_id` is provided, files are isolated to `/sessions/{session_id}/`:
+
+```python
+# With session_id="abc123":
+manager.write_file(content, "image.png")  
+# → stored at /sessions/abc123/image.png
+
+# Cross-session access is explicit:
+manager.read_file("/sessions/other-session/file.txt")  # Works
+```
+
+**Why session scoping?**
+- CoPilot conversations need file isolation
+- Prevents file collisions between concurrent sessions
+- Allows session cleanup without affecting other sessions
+
+### Core Methods
+
+| Method | Description |
+|--------|-------------|
+| `write_file(content, filename, path?, mime_type?, overwrite?)` | Write file to workspace |
+| `read_file(path)` | Read file by virtual path |
+| `read_file_by_id(file_id)` | Read file by ID |
+| `list_files(path?, limit?, offset?, include_all_sessions?)` | List files |
+| `delete_file(file_id)` | Soft-delete a file |
+| `get_download_url(file_id, expires_in?)` | Get signed download URL |
+| `get_file_info(file_id)` | Get file metadata |
+| `get_file_info_by_path(path)` | Get file metadata by path |
+| `get_file_count(path?, include_all_sessions?)` | Count files |
+
+### Storage Backends
+
+WorkspaceManager delegates to `WorkspaceStorageBackend`:
+
+| Backend | When Used | Storage Path Format |
+|---------|-----------|---------------------|
+| `GCSWorkspaceStorage` | `media_gcs_bucket_name` is configured | `gcs://bucket/workspaces/{ws_id}/{file_id}/{filename}` |
+| `LocalWorkspaceStorage` | No GCS bucket configured | `local://{ws_id}/{file_id}/{filename}` |
+
+---
+
+## store_media_file()
+
+**Location:** `backend/util/file.py`
+
+The media normalization pipeline. Handles various input types and normalizes them for processing or output.
+
+### Purpose
+
+Blocks receive files in many formats (URLs, data URIs, workspace references, local paths). `store_media_file()` normalizes these to a consistent format based on what the block needs.
+
+### Input Types Handled
+
+| Input Format | Example | How It's Processed |
+|--------------|---------|-------------------|
+| Data URI | `data:image/png;base64,iVBOR...` | Decoded, virus scanned, written locally |
+| HTTP(S) URL | `https://example.com/image.png` | Downloaded, virus scanned, written locally |
+| Workspace URI | `workspace://abc123` or `workspace:///path/to/file` | Read from workspace, virus scanned, written locally |
+| Cloud path | `gcs://bucket/path` | Downloaded, virus scanned, written locally |
+| Local path | `image.png` | Verified to exist in exec_file directory |
+
+### Return Formats
+
+The `return_format` parameter determines what you get back:
+
+```python
+from backend.util.file import store_media_file
+
+# For local processing (ffmpeg, MoviePy, PIL)
+local_path = await store_media_file(
+    file=input_file,
+    execution_context=ctx,
+    return_format="for_local_processing"
+)
+# Returns: "image.png" (relative path in exec_file dir)
+
+# For external APIs (Replicate, OpenAI, etc.)
+data_uri = await store_media_file(
+    file=input_file,
+    execution_context=ctx,
+    return_format="for_external_api"
+)
+# Returns: "data:image/png;base64,iVBOR..."
+
+# For block output (adapts to execution context)
+output = await store_media_file(
+    file=input_file,
+    execution_context=ctx,
+    return_format="for_block_output"
+)
+# In CoPilot: Returns "workspace://file-id#image/png"
+# In graphs:  Returns "data:image/png;base64,..."
+```
+
+### Execution Context
+
+`store_media_file()` requires an `ExecutionContext` with:
+- `graph_exec_id` - Required for temp file location
+- `user_id` - Required for workspace access
+- `workspace_id` - Optional; enables workspace features
+- `session_id` - Optional; for session scoping in CoPilot
+
+---
+
+## Responsibility Boundaries
+
+### Virus Scanning
+
+| Component | Scans? | Notes |
+|-----------|--------|-------|
+| `store_media_file()` | ✅ Yes | Scans **all** content before writing to local disk |
+| `WorkspaceManager.write_file()` | ✅ Yes | Scans content before persisting |
+
+**Scanning happens at:**
+1. `store_media_file()` — scans everything it downloads/decodes
+2. `WorkspaceManager.write_file()` — scans before persistence
+
+Tools like `WriteWorkspaceFileTool` don't need to scan because `WorkspaceManager.write_file()` handles it.
+
+### Persistence
+
+| Component | Persists To | Lifecycle |
+|-----------|-------------|-----------|
+| `store_media_file()` | Temp dir (`/tmp/exec_file/{exec_id}/`) | Cleaned after execution |
+| `WorkspaceManager` | GCS or local storage + DB | Persistent until deleted |
+
+**Automatic cleanup:** `clean_exec_files(graph_exec_id)` removes temp files after execution completes.
+
+---
+
+## Decision Tree: WorkspaceManager vs store_media_file
+
+```text
+┌─────────────────────────────────────────────────────┐
+│ What do you need to do with the file?               │
+└─────────────────────────────────────────────────────┘
+                         │
+           ┌─────────────┴─────────────┐
+           ▼                           ▼
+    Process in a block          Store for user access
+    (ffmpeg, PIL, etc.)         (CoPilot files, uploads)
+           │                           │
+           ▼                           ▼
+    store_media_file()           WorkspaceManager
+    with appropriate             
+    return_format                
+           │                           
+           │                           
+    ┌──────┴──────┐                    
+    ▼             ▼                    
+ "for_local_   "for_block_
+ processing"   output"
+    │             │
+    ▼             ▼
+ Get local    Auto-saves to
+ path for     workspace in
+ tools        CoPilot context
+
+Store for user access
+    │
+    ├── write_file() ─── Upload + persist (scans internally)
+    ├── read_file() / get_download_url() ─── Retrieve
+    └── list_files() / delete_file() ─── Manage
+```
+
+### Quick Reference
+
+| Scenario | Use |
+|----------|-----|
+| Block needs to process a file with ffmpeg | `store_media_file(..., return_format="for_local_processing")` |
+| Block needs to send file to external API | `store_media_file(..., return_format="for_external_api")` |
+| Block returning a generated file | `store_media_file(..., return_format="for_block_output")` |
+| API endpoint handling file upload | `WorkspaceManager.write_file()` (handles virus scanning internally) |
+| API endpoint serving file download | `WorkspaceManager.get_download_url()` |
+| Listing user's files | `WorkspaceManager.list_files()` |
+
+---
+
+## Key Files Reference
+
+| File | Purpose |
+|------|---------|
+| `backend/data/workspace.py` | Database CRUD operations for UserWorkspace and UserWorkspaceFile |
+| `backend/util/workspace.py` | `WorkspaceManager` class - high-level workspace API |
+| `backend/util/workspace_storage.py` | Storage backends (GCS, local) and `WorkspaceStorageBackend` interface |
+| `backend/util/file.py` | `store_media_file()` and media processing utilities |
+| `backend/util/virus_scanner.py` | `VirusScannerService` and `scan_content_safe()` |
+| `schema.prisma` | Database model definitions |
+
+---
+
+## Common Patterns
+
+### Block Processing a User's File
+
+```python
+async def run(self, input_data, *, execution_context, **kwargs):
+    # Normalize input to local path
+    local_path = await store_media_file(
+        file=input_data.video,
+        execution_context=execution_context,
+        return_format="for_local_processing",
+    )
+    
+    # Process with local tools
+    output_path = process_video(local_path)
+    
+    # Return (auto-saves to workspace in CoPilot)
+    result = await store_media_file(
+        file=output_path,
+        execution_context=execution_context,
+        return_format="for_block_output",
+    )
+    yield "output", result
+```
+
+### API Upload Endpoint
+
+```python
+from backend.util.virus_scanner import VirusDetectedError, VirusScanError
+
+async def upload_file(file: UploadFile, user_id: str, workspace_id: str):
+    content = await file.read()
+
+    # write_file handles virus scanning internally
+    manager = WorkspaceManager(user_id, workspace_id)
+    try:
+        workspace_file = await manager.write_file(
+            content=content,
+            filename=file.filename,
+        )
+    except VirusDetectedError:
+        raise HTTPException(status_code=400, detail="File rejected: virus detected")
+    except VirusScanError:
+        raise HTTPException(status_code=503, detail="Virus scanning unavailable")
+    except ValueError as e:
+        raise HTTPException(status_code=400, detail=str(e))
+
+    return {"file_id": workspace_file.id}
+```
+
+---
+
+## Configuration
+
+| Setting | Purpose | Default |
+|---------|---------|---------|
+| `media_gcs_bucket_name` | GCS bucket for workspace storage | None (uses local) |
+| `workspace_storage_dir` | Local storage directory | `{app_data}/workspaces` |
+| `max_file_size_mb` | Maximum file size in MB | 100 |
+| `clamav_service_enabled` | Enable virus scanning | true |
+| `clamav_service_host` | ClamAV daemon host | localhost |
+| `clamav_service_port` | ClamAV daemon port | 3310 |
+| `clamav_max_concurrency` | Max concurrent scans to ClamAV daemon | 5 |
+| `clamav_mark_failed_scans_as_clean` | If true, scan failures pass content through instead of rejecting (⚠️ security risk if ClamAV is unreachable) | false |
Author	SHA1	Message	Date
Nicholas Tindle	8892bcd230	docs: Add workspace and media file architecture documentation (#11989 ) ### Changes 🏗️ - Added comprehensive architecture documentation at `docs/platform/workspace-media-architecture.md` covering: - Database models (`UserWorkspace`, `UserWorkspaceFile`) - `WorkspaceManager` API with session scoping - `store_media_file()` media normalization pipeline (input types, return formats) - Virus scanning responsibility boundaries - Decision tree for choosing `WorkspaceManager` vs `store_media_file()` - Configuration reference including `clamav_max_concurrency` and `clamav_mark_failed_scans_as_clean` - Common patterns with error handling examples - Updated `autogpt_platform/backend/CLAUDE.md` with a "Workspace & Media Files" section referencing the new docs - Removed duplicate `scan_content_safe()` call from `WriteWorkspaceFileTool` — `WorkspaceManager.write_file()` already scans internally, so the tool was double-scanning every file - Replaced removed comment in `workspace.py` with explicit ownership comment clarifying that `WorkspaceManager` is the single scanning boundary ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Verified `scan_content_safe()` is called inside `WorkspaceManager.write_file()` (workspace.py:186) - [x] Verified `store_media_file()` scans all input branches including local paths (file.py:351) - [x] Verified documentation accuracy against current source code after merge with dev - [x] CI checks all passing <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Low Risk > Mostly adds documentation and internal developer guidance; the only code change is a comment clarifying `WorkspaceManager.write_file()` as the single virus-scanning boundary, with no behavior change. > > Overview > Adds a new `docs/platform/workspace-media-architecture.md` describing the Workspace storage layer vs the `store_media_file()` media pipeline, including session scoping and virus-scanning/persistence responsibility boundaries. > > Updates backend `CLAUDE.md` to point contributors to the new doc when working on CoPilot uploads/downloads or `WorkspaceManager`/`store_media_file()`, and clarifies in `WorkspaceManager.write_file()` (comment-only) that callers should not duplicate virus scanning. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `18fcfa03f8`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-17 06:12:26 +00:00
Zamil Majdy	48ff8300a4	Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev	2026-03-17 13:13:42 +07:00
Abhimanyu Yadav	c268fc6464	test(frontend/builder): add integration tests for builder stores, components, and hooks (part-1) (#12433 ) ### Changes - Add 329 integration tests across 11 test files for the builder (visual workflow editor) - Cover all Zustand stores (nodeStore, edgeStore, historyStore, graphStore, copyPasteStore, blockMenuStore, controlPanelStore) - Cover key components (CustomNode, NewBlockMenu, NewSaveControl, RunGraph) - Cover hooks (useFlow, useCopyPaste) ### Test files \| File \| Tests \| Coverage \| \|------\|-------\|----------\| \| `nodeStore.test.ts` \| 58 \| Node lifecycle, bulk ops, backend conversion, execution tracking, status, errors, resolution mode \| \| `edgeStore.test.ts` \| 37 \| Edge CRUD, duplicate rejection, bead visualization, backend link conversion, upsert \| \| `historyStore.test.ts` \| 22 \| Undo/redo, history limits (50), microtask batching, deduplication, canUndo/canRedo \| \| `graphStore.test.ts` \| 28 \| Execution status transitions, isGraphRunning, schema management, sub-graphs \| \| `copyPasteStore.test.ts` \| 8 \| Copy/paste with ID remapping, position offset, edge preservation \| \| `CustomNode.test.tsx` \| 25 \| Rendering by block type (NOTE, WEBHOOK, AGENT, OUTPUT, AYRSHARE), error states \| \| `NewBlockMenu.test.tsx` \| 29 \| Store state (search, filters, creators, categories), search/default view routing \| \| `NewSaveControl.test.tsx` \| 11 \| Save dialog rendering, form validation, version display, popover state \| \| `RunGraph.test.tsx` \| 11 \| Run/stop button states, loading, click handlers, RunInputDialog visibility \| \| `useFlow.test.ts` \| 4 \| Loading states, initial load completion \| \| `useCopyPaste.test.ts` \| 16 \| Clipboard copy/paste, UUID remapping, viewport centering, input field guard \|	2026-03-17 05:24:55 +00:00
Reinier van der Leer	aff3fb44af	ci(platform): Improve end-to-end CI & reduce its cost (#12437 ) Our CI costs are skyrocketing, most of it because of `platform-fullstack-ci.yml`. The `types` job currently uses in a `big-boi` runner (= expensive), but doesn't need to. Additionally, the "end-to-end tests" job is currently in `platform-frontend-ci.yml` instead of `platform-fullstack-ci.yml`, causing it not to run on backend changes (which it should). ### Changes 🏗️ - Simplify `check-api-types` job (renamed from `types`) and make it use regular `ubuntu-latest` runner - Export API schema from backend through CLI (instead of spinning it up in docker) - Fix dependency caching in `platform-fullstack-ci.yml` (based on recent improvements in `platform-frontend-ci.yml`) - Move `e2e_tests` job to `platform-fullstack-ci.yml` Out-of-scope but necessary: - Eliminate module-level init of OpenAI client in `backend.copilot.service` ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - CI	2026-03-16 23:08:18 +00:00
Zamil Majdy	9a41312769	feat(backend/copilot): parse @@agptfile bare refs by file extension (#12392 ) The `@@agptfile:` expansion system previously used content-sniffing (trying `json.loads` then `csv.Sniffer`) to decide whether to parse file content as structured data. This was fragile — a file containing just `"42"` would be parsed as an integer, and the heuristics could misfire on ambiguous content. This PR replaces content-sniffing with extension/MIME-based format detection. When the file has a well-known extension (`.json`, `.csv`, etc.) or MIME type fragment (`workspace://id#application/json`), the content is parsed accordingly. Unknown formats or parse failures always fall back to plain string — no surprises. > [!NOTE] > This PR builds on the `@@agptfile:` file reference protocol introduced in #12332 and the structured data auto-parsing added in #12390. > > What is `@@agptfile:`? > It is a special URI prefix (e.g. `@@agptfile:workspace:///report.csv`) that the CoPilot SDK expands inline before sending tool arguments to blocks. This lets the AI reference workspace files by name, and the SDK automatically reads and injects the file content. See #12332 for the full design. ### Changes 🏗️ New utility: `backend/util/file_content_parser.py` - `infer_format(uri)` — determines format from file extension or MIME fragment - `parse_file_content(content, fmt)` — parses content, never raises - Supported text formats: JSON, JSONL/NDJSON, CSV, TSV, YAML, TOML - Supported binary formats: Parquet (via pyarrow), Excel/XLSX (via openpyxl) - JSON scalars (strings, numbers, booleans, null) stay as strings — only containers (arrays, objects) are promoted - CSV/TSV require ≥1 row and ≥2 columns to qualify as tabular data - Added `openpyxl` dependency for Excel reading via pandas - Case-insensitive MIME fragment matching per RFC 2045 - Shared `PARSE_EXCEPTIONS` constant to avoid duplication between modules Updated `expand_file_refs_in_args` in `file_ref.py` - Bare refs now use `infer_format` + `parse_file_content` instead of the old `_try_parse_structured` content-sniffing function - Binary formats (parquet, xlsx) read raw bytes via `read_file_bytes` - Embedded refs (text around `@@agptfile:`) still produce plain strings - Size guards: Workspace and sandbox file reads now enforce a 10 MB limit (matching the existing local file limit) to prevent OOM on large files Updated `blocks/github/commits.py` - Consolidated `_create_blob` and `_create_binary_blob` into a single function with an `encoding` parameter Updated copilot system prompt - Documents the extension-based structured data parsing and supported formats 66 new tests in `file_content_parser_test.py` covering: - Format inference (extension, MIME, case-insensitive, precedence) - All 8 format parsers (happy path + edge cases + fallbacks) - Binary format handling (string input fallback, invalid bytes fallback) - Unknown format passthrough ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] All 66 file_content_parser_test.py tests pass - [x] All 31 file_ref_test.py tests pass - [x] All 13 file_ref_integration_test.py tests pass - [x] `poetry run format` passes clean (including pyright)	2026-03-16 22:31:21 +00:00
Ubbe	048fb06b0a	feat(frontend): add "Jump Back In" button to Library page (#12387 ) Adds a "Jump Back In" CTA at the top of the Library page to encourage users to quickly rerun their most recently successful agent. Closes SECRT-1536 ### Changes 🏗️ - New `JumpBackIn` component with `useJumpBackIn` hook at `library/components/JumpBackIn/` - Fetches first page of library agents sorted by `updatedAt` - Finds the first agent with a `COMPLETED` execution in `recent_executions` - Shows banner with agent name + "Jump Back In" button linking to `/library/agents/{id}` - Returns `null` (hidden) when loading or when no agent with a successful run exists ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] `pnpm format`, `pnpm lint`, `pnpm types` all pass - [x] Verified banner is hidden when no successful runs exist (edge case) - [x] Verified library page renders correctly with no visual regressions 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 21:35:03 +08:00
Otto	0b594a219c	feat(copilot): support prompt-in-URL for shareable prompt links (#12406 ) Requested by @torantula Add support for shareable AutoPilot URLs that contain a prompt in the URL hash fragment, inspired by [Lovable's implementation](https://docs.lovable.dev/integrations/build-with-url). URL format: - `/copilot#prompt=URL-encoded-text` — pre-fills the input for the user to review before sending - `/copilot?autosubmit=true#prompt=...` — auto-creates a session and sends the prompt immediately Example: ``` https://platform.agpt.co/copilot#prompt=Create%20a%20todo%20app https://platform.agpt.co/copilot?autosubmit=true#prompt=Create%20a%20todo%20app ``` Key design decisions: - Uses URL fragment (`#`) instead of query params — fragments never hit the server, so prompts stay client-side only (better for privacy, no backend URL length limits) - URL is cleaned via `history.replaceState` immediately after extraction to prevent re-triggering on navigation/reload - Leverages existing `pendingMessage` + `createSession()` flow for auto-submit — no new backend APIs needed - For populate-only mode, passes `initialPrompt` down through component tree to pre-fill the chat input Files changed: - `useCopilotPage.ts` — URL hash extraction logic + `initialPrompt` state - `CopilotPage.tsx` — passes `initialPrompt` to `ChatContainer` - `ChatContainer.tsx` — passes `initialPrompt` to `EmptySession` - `EmptySession.tsx` — passes `initialPrompt` to `ChatInput` - `ChatInput.tsx` / `useChatInput.ts` — accepts `initialValue` to pre-fill the textarea Fixes SECRT-2119 --- Co-authored-by: Toran Bruce Richards (@Torantulino) <toran@agpt.co>	2026-03-13 23:54:54 +07:00