style: black formatting

fix: skip binary file if stat fails to prevent OOM
If the stat command fails (file deleted, permission issue, etc.), we now skip the file instead of reading it with an unknown size. This prevents potential OOM crashes when a large file's size could not be verified.
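
A minimal sketch of the guard described above, for illustration only. The actual change runs `stat -c %s` inside the sandbox via `sandbox.commands.run`; this sketch assumes a local filesystem and swaps in `os.stat` instead. The helper name `read_binary_if_small` is hypothetical and does not appear in the diff; `MAX_BINARY_FILE_SIZE` mirrors the 50MB constant added in `claude_code.py`.

```python
import base64
import os
from typing import Optional

# Same 50MB limit as the constant introduced in the diff
MAX_BINARY_FILE_SIZE = 50 * 1024 * 1024


def read_binary_if_small(
    path: str, max_size: int = MAX_BINARY_FILE_SIZE
) -> Optional[str]:
    """Hypothetical helper: return base64 content, or None if the file is skipped."""
    try:
        size = os.stat(path).st_size
    except OSError:
        # stat failed (file deleted, permission issue, ...):
        # skip the file rather than read it with an unknown size
        return None
    if size > max_size:
        # Size is known but exceeds the limit: skip to avoid OOM
        return None
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")
```

The key design choice is that an unknown size is treated the same as an oversized file: both paths return early before any read happens.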
2026-02-12 15:55:03 -05:00 · 2026-02-12 12:46:20 +00:00 · 2026-02-12 12:32:13 +00:00 · 2026-02-12 12:25:29 +00:00 · 2026-02-12 12:02:45 +00:00 · 2026-02-12 11:58:35 +00:00
7 changed files with 303 additions and 344 deletions
--- a/.dockerignore
+++ b/.dockerignore
@@ -5,13 +5,42 @@
 !docs/

 # Platform - Libs
-!autogpt_platform/autogpt_libs/
+!autogpt_platform/autogpt_libs/autogpt_libs/
+!autogpt_platform/autogpt_libs/pyproject.toml
+!autogpt_platform/autogpt_libs/poetry.lock
+!autogpt_platform/autogpt_libs/README.md

 # Platform - Backend
-!autogpt_platform/backend/
+!autogpt_platform/backend/backend/
+!autogpt_platform/backend/test/e2e_test_data.py
+!autogpt_platform/backend/migrations/
+!autogpt_platform/backend/schema.prisma
+!autogpt_platform/backend/pyproject.toml
+!autogpt_platform/backend/poetry.lock
+!autogpt_platform/backend/README.md
+!autogpt_platform/backend/.env
+!autogpt_platform/backend/gen_prisma_types_stub.py
+
+# Platform - Market
+!autogpt_platform/market/market/
+!autogpt_platform/market/scripts.py
+!autogpt_platform/market/schema.prisma
+!autogpt_platform/market/pyproject.toml
+!autogpt_platform/market/poetry.lock
+!autogpt_platform/market/README.md

 # Platform - Frontend
-!autogpt_platform/frontend/
+!autogpt_platform/frontend/src/
+!autogpt_platform/frontend/public/
+!autogpt_platform/frontend/scripts/
+!autogpt_platform/frontend/package.json
+!autogpt_platform/frontend/pnpm-lock.yaml
+!autogpt_platform/frontend/tsconfig.json
+!autogpt_platform/frontend/README.md
+## config
+!autogpt_platform/frontend/*.config.*
+!autogpt_platform/frontend/.env.*
+!autogpt_platform/frontend/.env

 # Classic - AutoGPT
 !classic/original_autogpt/autogpt/
@@ -35,37 +64,6 @@
 # Classic - Frontend
 !classic/frontend/build/web/

-# Explicitly re-ignore unwanted files from whitelisted directories
-# Note: These patterns MUST come after the whitelist rules to take effect
-
-# Hidden files and directories (but keep frontend .env files needed for build)
-**/.*
-!autogpt_platform/frontend/.env
-!autogpt_platform/frontend/.env.default
-!autogpt_platform/frontend/.env.production
-
-# Python artifacts
-**/__pycache__/
-**/*.pyc
-**/*.pyo
-**/.venv/
-**/.ruff_cache/
-**/.pytest_cache/
-**/.coverage
-**/htmlcov/
-
-# Node artifacts
-**/node_modules/
-**/.next/
-**/storybook-static/
-**/playwright-report/
-**/test-results/
-
-# Build artifacts
-**/dist/
-**/build/
-**/target/
-
-# Logs and temp files
-**/*.log
-**/*.tmp
+# Explicitly re-ignore some folders
+.*
+**/__pycache__
--- a/.github/workflows/platform-frontend-ci.yml
+++ b/.github/workflows/platform-frontend-ci.yml
@@ -26,6 +26,7 @@ jobs:
  setup:
    runs-on: ubuntu-latest
    outputs:
+      cache-key: ${{ steps.cache-key.outputs.key }}
      components-changed: ${{ steps.filter.outputs.components }}

    steps:
@@ -40,17 +41,28 @@ jobs:
            components:
              - 'autogpt_platform/frontend/src/components/**'

-      - name: Enable corepack
-        run: corepack enable
-
-      - name: Set up Node
+      - name: Set up Node.js
        uses: actions/setup-node@v6
        with:
          node-version: "22.18.0"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml

-      - name: Install dependencies to populate cache
+      - name: Enable corepack
+        run: corepack enable
+
+      - name: Generate cache key
+        id: cache-key
+        run: echo "key=${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}" >> $GITHUB_OUTPUT
+
+      - name: Cache dependencies
+        uses: actions/cache@v5
+        with:
+          path: ~/.pnpm-store
+          key: ${{ steps.cache-key.outputs.key }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
+            ${{ runner.os }}-pnpm-
+
+      - name: Install dependencies
        run: pnpm install --frozen-lockfile

  lint:
@@ -61,15 +73,22 @@ jobs:
      - name: Checkout repository
        uses: actions/checkout@v6

-      - name: Enable corepack
-        run: corepack enable
-
-      - name: Set up Node
+      - name: Set up Node.js
        uses: actions/setup-node@v6
        with:
          node-version: "22.18.0"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+
+      - name: Enable corepack
+        run: corepack enable
+
+      - name: Restore dependencies cache
+        uses: actions/cache@v5
+        with:
+          path: ~/.pnpm-store
+          key: ${{ needs.setup.outputs.cache-key }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
+            ${{ runner.os }}-pnpm-

      - name: Install dependencies
        run: pnpm install --frozen-lockfile
@@ -92,15 +111,22 @@ jobs:
        with:
          fetch-depth: 0

-      - name: Enable corepack
-        run: corepack enable
-
-      - name: Set up Node
+      - name: Set up Node.js
        uses: actions/setup-node@v6
        with:
          node-version: "22.18.0"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+
+      - name: Enable corepack
+        run: corepack enable
+
+      - name: Restore dependencies cache
+        uses: actions/cache@v5
+        with:
+          path: ~/.pnpm-store
+          key: ${{ needs.setup.outputs.cache-key }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
+            ${{ runner.os }}-pnpm-

      - name: Install dependencies
        run: pnpm install --frozen-lockfile
@@ -115,8 +141,10 @@ jobs:
          exitOnceUploaded: true

  e2e_test:
-    name: end-to-end tests
    runs-on: big-boi
+    needs: setup
+    strategy:
+      fail-fast: false

    steps:
      - name: Checkout repository
@@ -124,11 +152,19 @@ jobs:
        with:
          submodules: recursive

-      - name: Set up Platform - Copy default supabase .env
+      - name: Set up Node.js
+        uses: actions/setup-node@v6
+        with:
+          node-version: "22.18.0"
+
+      - name: Enable corepack
+        run: corepack enable
+
+      - name: Copy default supabase .env
        run: |
          cp ../.env.default ../.env

-      - name: Set up Platform - Copy backend .env and set OpenAI API key
+      - name: Copy backend .env and set OpenAI API key
        run: |
          cp ../backend/.env.default ../backend/.env
          echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
@@ -136,87 +172,77 @@ jobs:
          # Used by E2E test data script to generate embeddings for approved store agents
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}

-      - name: Set up Platform - Set up Docker Buildx
+      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
+
+      - name: Cache Docker layers
+        uses: actions/cache@v5
        with:
-          driver: docker-container
-          driver-opts: network=host
+          path: /tmp/.buildx-cache
+          key: ${{ runner.os }}-buildx-frontend-test-${{ hashFiles('autogpt_platform/docker-compose.yml', 'autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/pyproject.toml', 'autogpt_platform/backend/poetry.lock') }}
+          restore-keys: |
+            ${{ runner.os }}-buildx-frontend-test-

-      - name: Set up Platform - Expose GHA cache to docker buildx CLI
-        uses: crazy-max/ghaction-github-runtime@v3
-
-      - name: Set up Platform - Build Docker images (with cache)
-        working-directory: autogpt_platform
+      - name: Run docker compose
        run: |
-          pip install pyyaml
-
-          # Resolve extends and generate a flat compose file that bake can understand
-          docker compose -f docker-compose.yml config > docker-compose.resolved.yml
-
-          # Add cache configuration to the resolved compose file
-          python ../.github/workflows/scripts/docker-ci-fix-compose-build-cache.py \
-            --source docker-compose.resolved.yml \
-            --cache-from "type=gha" \
-            --cache-to "type=gha,mode=max" \
-            --backend-scope "platform-backend-${{ hashFiles('autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/poetry.lock', 'autogpt_platform/backend/backend') }}" \
-            --frontend-scope "platform-frontend-${{ hashFiles('autogpt_platform/frontend/Dockerfile', 'autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/src') }}"
-
-          # Build with bake using the resolved compose file (now includes cache config)
-          docker buildx bake --allow=fs.read=.. -f docker-compose.resolved.yml --load
+          NEXT_PUBLIC_PW_TEST=true docker compose -f ../docker-compose.yml up -d
        env:
-          NEXT_PUBLIC_PW_TEST: true
+          DOCKER_BUILDKIT: 1
+          BUILDX_CACHE_FROM: type=local,src=/tmp/.buildx-cache
+          BUILDX_CACHE_TO: type=local,dest=/tmp/.buildx-cache-new,mode=max

-      - name: Set up Platform - Run (docker compose up)
-        run: docker compose -f ../docker-compose.resolved.yml up -d --no-build
-        env:
-          NEXT_PUBLIC_PW_TEST: true
+      - name: Move cache
+        run: |
+          rm -rf /tmp/.buildx-cache
+          if [ -d "/tmp/.buildx-cache-new" ]; then
+            mv /tmp/.buildx-cache-new /tmp/.buildx-cache
+          fi

-      - name: Set up Platform - Wait for services to be ready
+      - name: Wait for services to be ready
        run: |
          echo "Waiting for rest_server to be ready..."
          timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
          echo "Waiting for database to be ready..."
-          timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done' || echo "Database ready check timeout, continuing..."
+          timeout 60 sh -c 'until docker compose -f ../docker-compose.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done' || echo "Database ready check timeout, continuing..."

-      - name: Set up tests - Create E2E test data
+      - name: Create E2E test data
        run: |
          echo "Creating E2E test data..."
          # First try to run the script from inside the container
-          if docker compose -f ../docker-compose.resolved.yml exec -T rest_server test -f /app/autogpt_platform/backend/test/e2e_test_data.py; then
+          if docker compose -f ../docker-compose.yml exec -T rest_server test -f /app/autogpt_platform/backend/test/e2e_test_data.py; then
            echo "✅ Found e2e_test_data.py in container, running it..."
-            docker compose -f ../docker-compose.resolved.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python backend/test/e2e_test_data.py" || {
+            docker compose -f ../docker-compose.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python backend/test/e2e_test_data.py" || {
              echo "❌ E2E test data creation failed!"
-              docker compose -f ../docker-compose.resolved.yml logs --tail=50 rest_server
+              docker compose -f ../docker-compose.yml logs --tail=50 rest_server
              exit 1
            }
          else
            echo "⚠️ e2e_test_data.py not found in container, copying and running..."
            # Copy the script into the container and run it
-            docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.resolved.yml ps -q rest_server):/tmp/e2e_test_data.py || {
+            docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.yml ps -q rest_server):/tmp/e2e_test_data.py || {
              echo "❌ Failed to copy script to container"
              exit 1
            }
-            docker compose -f ../docker-compose.resolved.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
+            docker compose -f ../docker-compose.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
              echo "❌ E2E test data creation failed!"
-              docker compose -f ../docker-compose.resolved.yml logs --tail=50 rest_server
+              docker compose -f ../docker-compose.yml logs --tail=50 rest_server
              exit 1
            }
          fi

-      - name: Set up tests - Enable corepack
-        run: corepack enable
-
-      - name: Set up tests - Set up Node
-        uses: actions/setup-node@v6
+      - name: Restore dependencies cache
+        uses: actions/cache@v5
        with:
-          node-version: "22.18.0"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+          path: ~/.pnpm-store
+          key: ${{ needs.setup.outputs.cache-key }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
+            ${{ runner.os }}-pnpm-

-      - name: Set up tests - Install dependencies
+      - name: Install dependencies
        run: pnpm install --frozen-lockfile

-      - name: Set up tests - Install browser 'chromium'
+      - name: Install Browser 'chromium'
        run: pnpm playwright install --with-deps chromium

      - name: Run Playwright tests
@@ -243,7 +269,7 @@ jobs:

      - name: Print Final Docker Compose logs
        if: always()
-        run: docker compose -f ../docker-compose.resolved.yml logs
+        run: docker compose -f ../docker-compose.yml logs

  integration_test:
    runs-on: ubuntu-latest
@@ -255,15 +281,22 @@ jobs:
        with:
          submodules: recursive

-      - name: Enable corepack
-        run: corepack enable
-
-      - name: Set up Node
+      - name: Set up Node.js
        uses: actions/setup-node@v6
        with:
          node-version: "22.18.0"
-          cache: "pnpm"
-          cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
+
+      - name: Enable corepack
+        run: corepack enable
+
+      - name: Restore dependencies cache
+        uses: actions/cache@v5
+        with:
+          path: ~/.pnpm-store
+          key: ${{ needs.setup.outputs.cache-key }}
+          restore-keys: |
+            ${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
+            ${{ runner.os }}-pnpm-

      - name: Install dependencies
        run: pnpm install --frozen-lockfile
--- a/.github/workflows/scripts/docker-ci-fix-compose-build-cache.py
+++ b/.github/workflows/scripts/docker-ci-fix-compose-build-cache.py
@@ -1,154 +0,0 @@
-#!/usr/bin/env python3
-"""
-Add cache configuration to a resolved docker-compose file for all services
-that have a build key, and ensure image names match what docker compose expects.
-"""
-
-import argparse
-
-import yaml
-
-
-def main():
-    parser = argparse.ArgumentParser(
-        description="Add cache config to a resolved compose file"
-    )
-    parser.add_argument(
-        "--source",
-        required=True,
-        help="Source compose file to read (should be output of `docker compose config`)",
-    )
-    parser.add_argument(
-        "--cache-from",
-        default="type=gha",
-        help="Cache source configuration",
-    )
-    parser.add_argument(
-        "--cache-to",
-        default="type=gha,mode=max",
-        help="Cache destination configuration",
-    )
-    parser.add_argument(
-        "--backend-scope",
-        default="",
-        help="GHA cache scope for backend services (e.g., platform-backend-{hash})",
-    )
-    parser.add_argument(
-        "--frontend-scope",
-        default="",
-        help="GHA cache scope for frontend service (e.g., platform-frontend-{hash})",
-    )
-    args = parser.parse_args()
-
-    with open(args.source, "r") as f:
-        compose = yaml.safe_load(f)
-
-    # Get project name from compose file or default
-    project_name = compose.get("name", "autogpt_platform")
-
-    def get_image_name(dockerfile: str, target: str) -> str:
-        """Generate image name based on Dockerfile folder and build target."""
-        dockerfile_parts = dockerfile.replace("\\", "/").split("/")
-        if len(dockerfile_parts) >= 2:
-            folder_name = dockerfile_parts[-2]  # e.g., "backend" or "frontend"
-        else:
-            folder_name = "app"
-        return f"{project_name}-{folder_name}:{target}"
-
-    def get_build_key(dockerfile: str, target: str) -> str:
-        """Generate a unique key for a Dockerfile+target combination."""
-        return f"{dockerfile}:{target}"
-
-    # First pass: collect all services with build configs and identify duplicates
-    # Track which (dockerfile, target) combinations we've seen
-    build_key_to_first_service: dict[str, str] = {}
-    services_to_build: list[str] = []
-    services_to_dedupe: list[str] = []
-
-    for service_name, service_config in compose.get("services", {}).items():
-        if "build" not in service_config:
-            continue
-
-        build_config = service_config["build"]
-        dockerfile = build_config.get("dockerfile", "Dockerfile")
-        target = build_config.get("target", "default")
-        build_key = get_build_key(dockerfile, target)
-
-        if build_key not in build_key_to_first_service:
-            # First service with this build config - it will do the actual build
-            build_key_to_first_service[build_key] = service_name
-            services_to_build.append(service_name)
-        else:
-            # Duplicate - will just use the image from the first service
-            services_to_dedupe.append(service_name)
-
-    # Second pass: configure builds and deduplicate
-    modified_services = []
-    for service_name, service_config in compose.get("services", {}).items():
-        if "build" not in service_config:
-            continue
-
-        build_config = service_config["build"]
-        dockerfile = build_config.get("dockerfile", "Dockerfile")
-        target = build_config.get("target", "latest")
-        image_name = get_image_name(dockerfile, target)
-
-        # Set image name for all services (needed for both builders and deduped)
-        service_config["image"] = image_name
-
-        if service_name in services_to_dedupe:
-            # Remove build config - this service will use the pre-built image
-            del service_config["build"]
-            continue
-
-        # This service will do the actual build - add cache config
-        cache_from = args.cache_from
-        cache_to = args.cache_to
-
-        # Determine scope based on Dockerfile path and target
-        # Each unique (dockerfile, target) combination gets its own cache scope
-        if "type=gha" in args.cache_from or "type=gha" in args.cache_to:
-            if "frontend" in dockerfile:
-                base_scope = args.frontend_scope
-            elif "backend" in dockerfile:
-                base_scope = args.backend_scope
-            else:
-                # Skip services that don't clearly match frontend/backend
-                continue
-
-            if base_scope:
-                # Append target to scope to differentiate e.g. migrate vs server
-                scope = f"{base_scope}-{target}"
-                if "type=gha" in args.cache_from:
-                    cache_from = f"{args.cache_from},scope={scope}"
-                if "type=gha" in args.cache_to:
-                    cache_to = f"{args.cache_to},scope={scope}"
-
-        build_config["cache_from"] = [cache_from]
-        build_config["cache_to"] = [cache_to]
-        modified_services.append(service_name)
-
-    # Write back to the same file
-    with open(args.source, "w") as f:
-        yaml.dump(compose, f, default_flow_style=False, sort_keys=False)
-
-    print(f"Added cache config to {len(modified_services)} services in {args.source}:")
-    for svc in modified_services:
-        svc_config = compose["services"][svc]
-        build_cfg = svc_config.get("build", {})
-        cache_from_val = build_cfg.get("cache_from", ["none"])[0]
-        cache_to_val = build_cfg.get("cache_to", ["none"])[0]
-        print(f"  - {svc}")
-        print(f"      image: {svc_config.get('image', 'N/A')}")
-        print(f"      cache_from: {cache_from_val}")
-        print(f"      cache_to: {cache_to_val}")
-    if services_to_dedupe:
-        print(
-            f"Deduplicated {len(services_to_dedupe)} services (will use pre-built images):"
-        )
-        for svc in services_to_dedupe:
-            print(f"  - {svc} -> {compose['services'][svc].get('image', 'N/A')}")
-
-
-if __name__ == "__main__":
-    main()
--- a/autogpt_platform/backend/Dockerfile
+++ b/autogpt_platform/backend/Dockerfile
@@ -1,5 +1,3 @@
-# ============================ DEPENDENCY BUILDER ============================ #
-
 FROM debian:13-slim AS builder

 # Set environment variables
@@ -53,9 +51,7 @@ COPY autogpt_platform/backend/backend/data/partial_types.py ./backend/data/parti
 COPY autogpt_platform/backend/gen_prisma_types_stub.py ./
 RUN poetry run prisma generate && poetry run gen-prisma-stub

-# ============================== BACKEND SERVER ============================== #
-
-FROM debian:13-slim AS server
+FROM debian:13-slim AS server_dependencies

 WORKDIR /app

@@ -67,14 +63,15 @@ ENV POETRY_HOME=/opt/poetry \
 ENV PATH=/opt/poetry/bin:$PATH

 # Install Python, FFmpeg, and ImageMagick (required for video processing blocks)
-# Using --no-install-recommends saves ~650MB by skipping unnecessary deps like llvm, mesa, etc.
-RUN apt-get update && apt-get install -y --no-install-recommends \
+RUN apt-get update && apt-get install -y \
    python3.13 \
    python3-pip \
    ffmpeg \
    imagemagick \
    && rm -rf /var/lib/apt/lists/*

+# Copy only necessary files from builder
+COPY --from=builder /app /app
 COPY --from=builder /usr/local/lib/python3* /usr/local/lib/python3*
 COPY --from=builder /usr/local/bin/poetry /usr/local/bin/poetry
 # Copy Node.js installation for Prisma
@@ -84,54 +81,30 @@ COPY --from=builder /usr/bin/npm /usr/bin/npm
 COPY --from=builder /usr/bin/npx /usr/bin/npx
 COPY --from=builder /root/.cache/prisma-python/binaries /root/.cache/prisma-python/binaries

-WORKDIR /app/autogpt_platform/backend
-
-# Copy only the .venv from builder (not the entire /app directory)
-# The .venv includes the generated Prisma client
-COPY --from=builder /app/autogpt_platform/backend/.venv ./.venv
 ENV PATH="/app/autogpt_platform/backend/.venv/bin:$PATH"

-# Copy dependency files + autogpt_libs (path dependency)
-COPY autogpt_platform/autogpt_libs /app/autogpt_platform/autogpt_libs
-COPY autogpt_platform/backend/poetry.lock autogpt_platform/backend/pyproject.toml ./
+RUN mkdir -p /app/autogpt_platform/autogpt_libs
+RUN mkdir -p /app/autogpt_platform/backend

-# Copy backend code + docs (for Copilot docs search)
-COPY autogpt_platform/backend ./
+COPY autogpt_platform/autogpt_libs /app/autogpt_platform/autogpt_libs
+
+COPY autogpt_platform/backend/poetry.lock autogpt_platform/backend/pyproject.toml /app/autogpt_platform/backend/
+
+WORKDIR /app/autogpt_platform/backend
+
+FROM server_dependencies AS migrate
+
+# Migration stage only needs schema and migrations - much lighter than full backend
+COPY autogpt_platform/backend/schema.prisma /app/autogpt_platform/backend/
+COPY autogpt_platform/backend/backend/data/partial_types.py /app/autogpt_platform/backend/backend/data/partial_types.py
+COPY autogpt_platform/backend/migrations /app/autogpt_platform/backend/migrations
+
+FROM server_dependencies AS server
+
+COPY autogpt_platform/backend /app/autogpt_platform/backend
 COPY docs /app/docs
 RUN poetry install --no-ansi --only-root

 ENV PORT=8000

 CMD ["poetry", "run", "rest"]
-
-# =============================== DB MIGRATOR =============================== #
-
-# Lightweight migrate stage - only needs Prisma CLI, not full Python environment
-FROM debian:13-slim AS migrate
-
-WORKDIR /app/autogpt_platform/backend
-
-ENV DEBIAN_FRONTEND=noninteractive
-
-# Install only what's needed for prisma migrate: Node.js and minimal Python for prisma-python
-RUN apt-get update && apt-get install -y --no-install-recommends \
-    python3.13 \
-    python3-pip \
-    ca-certificates \
-    && rm -rf /var/lib/apt/lists/*
-
-# Copy Node.js from builder (needed for Prisma CLI)
-COPY --from=builder /usr/bin/node /usr/bin/node
-COPY --from=builder /usr/lib/node_modules /usr/lib/node_modules
-COPY --from=builder /usr/bin/npm /usr/bin/npm
-
-# Copy Prisma binaries
-COPY --from=builder /root/.cache/prisma-python/binaries /root/.cache/prisma-python/binaries
-
-# Install prisma-client-py directly (much smaller than copying full venv)
-RUN pip3 install prisma>=0.15.0 --break-system-packages
-
-COPY autogpt_platform/backend/schema.prisma ./
-COPY autogpt_platform/backend/backend/data/partial_types.py ./backend/data/partial_types.py
-COPY autogpt_platform/backend/gen_prisma_types_stub.py ./
-COPY autogpt_platform/backend/migrations ./migrations
--- a/autogpt_platform/backend/backend/blocks/claude_code.py
+++ b/autogpt_platform/backend/backend/blocks/claude_code.py
@@ -1,4 +1,6 @@
+import base64
 import json
+import logging
 import shlex
 import uuid
 from typing import Literal, Optional
@@ -21,6 +23,11 @@ from backend.data.model import (
 )
 from backend.integrations.providers import ProviderName

+logger = logging.getLogger(__name__)
+
+# Maximum size for binary files to extract (50MB)
+MAX_BINARY_FILE_SIZE = 50 * 1024 * 1024
+

 class ClaudeCodeExecutionError(Exception):
    """Exception raised when Claude Code execution fails.
@@ -180,7 +187,9 @@ class ClaudeCodeBlock(Block):
        path: str
        relative_path: str  # Path relative to working directory (for GitHub, etc.)
        name: str
-        content: str
+        content: str  # Text content for text files, empty string for binary files
+        is_binary: bool = False  # True if this is a binary file
+        content_base64: Optional[str] = None  # Base64-encoded content for binary files

    class Output(BlockSchemaOutput):
        response: str = SchemaField(
@@ -188,8 +197,11 @@ class ClaudeCodeBlock(Block):
        )
        files: list["ClaudeCodeBlock.FileOutput"] = SchemaField(
            description=(
-                "List of text files created/modified by Claude Code during this execution. "
-                "Each file has 'path', 'relative_path', 'name', and 'content' fields."
+                "List of files created/modified by Claude Code during this execution. "
+                "Each file has 'path', 'relative_path', 'name', 'content', 'is_binary', "
+                "and 'content_base64' fields. For text files, 'content' contains the text "
+                "and 'is_binary' is False. For binary files (PDFs, images, etc.), "
+                "'is_binary' is True and 'content_base64' contains the base64-encoded data."
            )
        )
        conversation_history: str = SchemaField(
@@ -252,6 +264,8 @@ class ClaudeCodeBlock(Block):
                            "relative_path": "index.html",
                            "name": "index.html",
                            "content": "<html>Hello World</html>",
+                            "is_binary": False,
+                            "content_base64": None,
                        }
                    ],
                ),
@@ -272,6 +286,8 @@ class ClaudeCodeBlock(Block):
                            relative_path="index.html",
                            name="index.html",
                            content="<html>Hello World</html>",
+                            is_binary=False,
+                            content_base64=None,
                        )
                    ],  # files
                    "User: Create a hello world HTML file\n"
@@ -531,7 +547,6 @@ class ClaudeCodeBlock(Block):
            ".env",
            ".gitignore",
            ".dockerfile",
-            "Dockerfile",
            ".vue",
            ".svelte",
            ".astro",
@@ -540,6 +555,44 @@ class ClaudeCodeBlock(Block):
            ".tex",
            ".csv",
            ".log",
+            ".svg",  # SVG is XML-based text
+        }
+
+        # Binary file extensions we can read and base64-encode
+        binary_extensions = {
+            # Images
+            ".png",
+            ".jpg",
+            ".jpeg",
+            ".gif",
+            ".webp",
+            ".ico",
+            ".bmp",
+            ".tiff",
+            ".tif",
+            # Documents
+            ".pdf",
+            # Archives (useful for downloads)
+            ".zip",
+            ".tar",
+            ".gz",
+            ".7z",
+            # Audio/Video (if small enough)
+            ".mp3",
+            ".wav",
+            ".mp4",
+            ".webm",
+            # Other binary formats
+            ".woff",
+            ".woff2",
+            ".ttf",
+            ".otf",
+            ".eot",
+            ".bin",
+            ".exe",
+            ".dll",
+            ".so",
+            ".dylib",
        }

        try:
@@ -564,10 +617,26 @@ class ClaudeCodeBlock(Block):
                    if not file_path:
                        continue

-                    # Check if it's a text file we can read
+                    # Check if it's a text file we can read (case-insensitive)
+                    file_path_lower = file_path.lower()
                    is_text = any(
-                        file_path.endswith(ext) for ext in text_extensions
-                    ) or file_path.endswith("Dockerfile")
+                        file_path_lower.endswith(ext) for ext in text_extensions
+                    ) or file_path_lower.endswith("dockerfile")
+
+                    # Check if it's a binary file we should extract
+                    is_binary = any(
+                        file_path_lower.endswith(ext) for ext in binary_extensions
+                    )
+
+                    # Helper to extract filename and relative path
+                    def get_file_info(path: str, work_dir: str) -> tuple[str, str]:
+                        name = path.split("/")[-1]
+                        rel_path = path
+                        if path.startswith(work_dir):
+                            rel_path = path[len(work_dir) :]
+                            if rel_path.startswith("/"):
+                                rel_path = rel_path[1:]
+                        return name, rel_path

                    if is_text:
                        try:
@@ -576,32 +645,72 @@ class ClaudeCodeBlock(Block):
                            if isinstance(content, bytes):
                                content = content.decode("utf-8", errors="replace")

-                            # Extract filename from path
-                            file_name = file_path.split("/")[-1]
-
-                            # Calculate relative path by stripping working directory
-                            relative_path = file_path
-                            if file_path.startswith(working_directory):
-                                relative_path = file_path[len(working_directory) :]
-                                # Remove leading slash if present
-                                if relative_path.startswith("/"):
-                                    relative_path = relative_path[1:]
-
+                            file_name, relative_path = get_file_info(
+                                file_path, working_directory
+                            )
                            files.append(
                                ClaudeCodeBlock.FileOutput(
                                    path=file_path,
                                    relative_path=relative_path,
                                    name=file_name,
                                    content=content,
+                                    is_binary=False,
+                                    content_base64=None,
                                )
                            )
-                        except Exception:
-                            # Skip files that can't be read
-                            pass
+                        except Exception as e:
+                            logger.warning(f"Failed to read text file {file_path}: {e}")
+                    elif is_binary:
+                        try:
+                            # Check file size before reading to avoid OOM
+                            stat_result = await sandbox.commands.run(
+                                f"stat -c %s {shlex.quote(file_path)} 2>/dev/null"
+                            )
+                            if stat_result.exit_code != 0 or not stat_result.stdout:
+                                logger.warning(
+                                    f"Skipping binary file {file_path}: "
+                                    f"could not determine file size"
+                                )
+                                continue
+                            file_size = int(stat_result.stdout.strip())
+                            if file_size > MAX_BINARY_FILE_SIZE:
+                                logger.warning(
+                                    f"Skipping binary file {file_path}: "
+                                    f"size {file_size} exceeds limit "
+                                    f"{MAX_BINARY_FILE_SIZE}"
+                                )
+                                continue

-        except Exception:
-            # If file extraction fails, return empty results
-            pass
+                            # Read binary file as bytes using format="bytes"
+                            content_bytes = await sandbox.files.read(
+                                file_path, format="bytes"
+                            )
+
+                            # Base64 encode the binary content
+                            content_b64 = base64.b64encode(content_bytes).decode(
+                                "ascii"
+                            )
+
+                            file_name, relative_path = get_file_info(
+                                file_path, working_directory
+                            )
+                            files.append(
+                                ClaudeCodeBlock.FileOutput(
+                                    path=file_path,
+                                    relative_path=relative_path,
+                                    name=file_name,
+                                    content="",  # Empty for binary files
+                                    is_binary=True,
+                                    content_base64=content_b64,
+                                )
+                            )
+                        except Exception as e:
+                            logger.warning(
+                                f"Failed to read binary file {file_path}: {e}"
+                            )
+
+        except Exception as e:
+            logger.warning(f"File extraction failed: {e}")

        return files

--- a/docs/integrations/block-integrations/claude_code.md
+++ b/docs/integrations/block-integrations/claude_code.md
@@ -16,7 +16,7 @@ When activated, the block:
   - Install dependencies (npm, pip, etc.)
   - Run terminal commands
   - Build and test applications
-5. Extracts all text files created/modified during execution
+5. Extracts all text and binary files created/modified during execution
 6. Returns the response and files, optionally keeping the sandbox alive for follow-up tasks

 The block supports conversation continuation through three mechanisms:
@@ -42,7 +42,7 @@ The block supports conversation continuation through three mechanisms:
 | Output | Description |
 |--------|-------------|
 | Response | The output/response from Claude Code execution |
-| Files | List of text files created/modified during execution. Each file includes path, relative_path, name, and content fields |
+| Files | List of files created/modified during execution. Each file includes path, relative_path, name, content, is_binary, and content_base64 fields. For text files, content contains the text and is_binary is False. For binary files (PDFs, images, etc.), is_binary is True and content_base64 contains the base64-encoded data |
 | Conversation History | Full conversation history including this turn. Use to restore context on a fresh sandbox |
 | Session ID | Session ID for this conversation. Pass back with sandbox_id to continue the conversation |
 | Sandbox ID | ID of the sandbox instance (null if disposed). Pass back with session_id to continue the conversation |
--- a/docs/integrations/block-integrations/llm.md
+++ b/docs/integrations/block-integrations/llm.md
@@ -535,7 +535,7 @@ When activated, the block:
 2. Installs the latest version of Claude Code in the sandbox
 3. Optionally runs setup commands to prepare the environment
 4. Executes your prompt using Claude Code, which can create/edit files, install dependencies, run terminal commands, and build applications
-5. Extracts all text files created/modified during execution
+5. Extracts all text and binary files created/modified during execution
 6. Returns the response and files, optionally keeping the sandbox alive for follow-up tasks

 The block supports conversation continuation through three mechanisms:
@@ -563,7 +563,7 @@ The block supports conversation continuation through three mechanisms:
 |--------|-------------|------|
 | error | Error message if execution failed | str |
 | response | The output/response from Claude Code execution | str |
-| files | List of text files created/modified by Claude Code during this execution. Each file has 'path', 'relative_path', 'name', and 'content' fields. | List[FileOutput] |
+| files | List of files created/modified by Claude Code during this execution. Each file has 'path', 'relative_path', 'name', 'content', 'is_binary', and 'content_base64' fields. For text files, 'content' contains the text and 'is_binary' is False. For binary files (PDFs, images, etc.), 'is_binary' is True and 'content_base64' contains the base64-encoded data. | List[FileOutput] |
 | conversation_history | Full conversation history including this turn. Pass this to conversation_history input to continue on a fresh sandbox if the previous sandbox timed out. | str |
 | session_id | Session ID for this conversation. Pass this back along with sandbox_id to continue the conversation. | str |
 | sandbox_id | ID of the sandbox instance. Pass this back along with session_id to continue the conversation. This is None if dispose_sandbox was True (sandbox was disposed). | str |
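
A downstream consumer of the `files` output has to branch on `is_binary` to recover the original bytes. The sketch below assumes each `FileOutput` arrives as a plain dict with the fields documented above; `file_output_to_bytes` is a hypothetical helper, not part of the block.

```python
import base64


def file_output_to_bytes(file: dict) -> bytes:
    """Return the raw bytes of one entry from the block's `files` output.

    Assumes the dict carries the documented fields: `content`,
    `is_binary`, and `content_base64`.
    """
    if file.get("is_binary"):
        # Binary files: `content` is empty, data lives in `content_base64`.
        return base64.b64decode(file["content_base64"])
    # Text files: `content` holds the decoded text.
    return file["content"].encode("utf-8")
```

For a text entry the result is a UTF-8 re-encoding of `content`; since the block decodes with `errors="replace"`, that may not be byte-identical to the original file if it contained invalid UTF-8.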
Author	SHA1	Message	Date
Bentlybro	0b2fb655bc	style: black formatting	2026-02-12 12:46:20 +00:00
Bentlybro	99f8bf5f0c	fix: skip binary file if stat fails to prevent OOM. If the stat command fails (file deleted, permissions issue, etc.), we now skip the file rather than proceeding to read it with an unknown size. This prevents potential OOM crashes from large files where size verification failed.	2026-02-12 12:32:13 +00:00
Bentlybro	3f76f1318b	docs: Fix llm.md to match exact schema description	2026-02-12 12:25:29 +00:00
Bentlybro	b011289dd2	fix: Address code review feedback. Add 50MB size guard for binary files to prevent OOM; extract helper function for path resolution (DRY); add logging for file extraction errors; remove dead 'Dockerfile' entry from text_extensions	2026-02-12 12:02:45 +00:00
Bentlybro	49c2f578b4	docs: Update llm.md for binary file support in Claude Code block	2026-02-12 11:58:35 +00:00
Bentlybro	7150b7768d	fix: Make Dockerfile check case-insensitive	2026-02-12 11:53:57 +00:00
Bentlybro	8c95b03636	fix: Update tests and address code review feedback. Update test fixtures with is_binary and content_base64 fields; move .svg to text_extensions (it's XML-based); make extension matching case-insensitive for both text and binary	2026-02-12 11:45:52 +00:00
Bentlybro	4a8368887f	fix: Use format='bytes' for reading binary files from E2B sandbox. Fixes the critical bug where binary files would fail to read because files.read() defaults to text mode (UTF-8 decoding). Now explicitly uses format='bytes', which returns a bytearray.	2026-02-12 11:29:43 +00:00
Bentlybro	d46e5e6b6a	docs: Update claude_code.md for binary file support	2026-02-12 11:26:58 +00:00
Bentlybro	4e632bbd60	fix(backend): Extract binary files from ClaudeCodeBlock sandbox. Add support for extracting binary files (PDFs, images, etc.) from the E2B sandbox in ClaudeCodeBlock. Changes: add binary_extensions set for common binary file types (.pdf, .png, .jpg, etc.); update FileOutput schema with is_binary and content_base64 fields; binary files are read as bytes and base64-encoded before returning; text files continue to work as before with is_binary=False. Closes SECRT-1897	2026-02-12 11:23:05 +00:00