Merge branch 'dev' into update-branchlet

Fix formatting in .branchlet.json
refactor(claude): Split autogpt_platform/CLAUDE.md into project-specific files (#11788)
2026-01-29 17:08:01 -05:00 · 2026-01-29 11:57:56 -06:00 · 2026-01-29 11:57:12 -06:00 · 2026-01-29 17:33:02 +00:00 · 2026-01-29 17:46:36 +07:00 · 2026-01-29 05:49:47 +00:00
181 changed files with 6835 additions and 11292 deletions
--- a/.branchlet.json
+++ b/.branchlet.json
@@ -29,8 +29,7 @@
  "postCreateCmd": [
    "cd autogpt_platform/autogpt_libs && poetry install",
    "cd autogpt_platform/backend && poetry install && poetry run prisma generate",
-    "cd autogpt_platform/frontend && pnpm install",
-    "cd docs && pip install -r requirements.txt"
+    "cd autogpt_platform/frontend && pnpm install"
  ],
  "terminalCommand": "code .",
  "deleteBranchWithWorktree": false
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -160,7 +160,7 @@ pnpm storybook                      # Start component development server

 **Backend Entry Points:**

- `backend/backend/server/server.py` - FastAPI application setup
+- `backend/backend/api/rest_api.py` - FastAPI application setup
 - `backend/backend/data/` - Database models and user management
 - `backend/blocks/` - Agent execution blocks and logic

@@ -219,7 +219,7 @@ Agents are built using a visual block-based system where each block performs a s

 ### API Development

-1. Update routes in `/backend/backend/server/routers/`
+1. Update routes in `/backend/backend/api/features/`
 2. Add/update Pydantic models in same directory
 3. Write tests alongside route files
 4. For `data/*.py` changes, validate user ID checks
@@ -285,7 +285,7 @@ Agents are built using a visual block-based system where each block performs a s

 ### Security Guidelines

-**Cache Protection Middleware** (`/backend/backend/server/middleware/security.py`):
+**Cache Protection Middleware** (`/backend/backend/api/middleware/security.py`):

 - Default: Disables caching for ALL endpoints with `Cache-Control: no-store, no-cache, must-revalidate, private`
 - Uses allow list approach for cacheable paths (static assets, health checks, public pages)
--- a/.gitignore
+++ b/.gitignore
@@ -178,4 +178,5 @@ autogpt_platform/backend/settings.py
 *.ign.*
 .test-contents
 .claude/settings.local.json
+CLAUDE.local.md
 /autogpt_platform/backend/logs
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -16,7 +16,6 @@ See `docs/content/platform/getting-started.md` for setup instructions.
 - Format Python code with `poetry run format`.
 - Format frontend code using `pnpm format`.

-
 ## Frontend guidelines:

 See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
@@ -33,14 +32,17 @@ See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
 4. **Styling**: Tailwind CSS only, use design tokens, Phosphor Icons only
 5. **Testing**: Add Storybook stories for new components, Playwright for E2E
 6. **Code conventions**: Function declarations (not arrow functions) for components/handlers
+
 - Component props should be `interface Props { ... }` (not exported) unless the interface needs to be used outside the component
 - Separate render logic from business logic (component.tsx + useComponent.ts + helpers.ts)
 - Colocate state when possible and avoid creating large components, use sub-components ( local `/components` folder next to the parent component ) when sensible
 - Avoid large hooks, abstract logic into `helpers.ts` files when sensible
 - Use function declarations for components, arrow functions only for callbacks
 - No barrel files or `index.ts` re-exports
- Do not use `useCallback` or `useMemo` unless strictly needed
 - Avoid comments at all times unless the code is very complex
+- Do not use `useCallback` or `useMemo` unless asked to optimise a given function
+- Do not type hook returns, let TypeScript infer as much as possible
+- Never type with `any`; if no types are available, use `unknown`

 ## Testing

@@ -49,22 +51,8 @@ See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:

 Always run the relevant linters and tests before committing.
 Use conventional commit messages for all commits (e.g. `feat(backend): add API`).
-  Types:
-    - feat
-    - fix
-    - refactor
-    - ci
-    - dx (developer experience)
-  Scopes:
-    - platform
-      - platform/library
-      - platform/marketplace
-      - backend
-        - backend/executor
-      - frontend
-        - frontend/library
-        - frontend/marketplace
-      - blocks
+Types: - feat - fix - refactor - ci - dx (developer experience)
+Scopes: - platform - platform/library - platform/marketplace - backend - backend/executor - frontend - frontend/library - frontend/marketplace - blocks

 ## Pull requests

--- a/autogpt_platform/CLAUDE.md
+++ b/autogpt_platform/CLAUDE.md
@@ -6,152 +6,30 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

 AutoGPT Platform is a monorepo containing:

- **Backend** (`/backend`): Python FastAPI server with async support
- **Frontend** (`/frontend`): Next.js React application
- **Shared Libraries** (`/autogpt_libs`): Common Python utilities
+- **Backend** (`backend`): Python FastAPI server with async support
+- **Frontend** (`frontend`): Next.js React application
+- **Shared Libraries** (`autogpt_libs`): Common Python utilities

-## Essential Commands
+## Component Documentation

-### Backend Development
+- **Backend**: See @backend/CLAUDE.md for backend-specific commands, architecture, and development tasks
+- **Frontend**: See @frontend/CLAUDE.md for frontend-specific commands, architecture, and development patterns

-```bash
-# Install dependencies
-cd backend && poetry install
-
-# Run database migrations
-poetry run prisma migrate dev
-
-# Start all services (database, redis, rabbitmq, clamav)
-docker compose up -d
-
-# Run the backend server
-poetry run serve
-
-# Run tests
-poetry run test
-
-# Run specific test
-poetry run pytest path/to/test_file.py::test_function_name
-
-# Run block tests (tests that validate all blocks work correctly)
-poetry run pytest backend/blocks/test/test_block.py -xvs
-
-# Run tests for a specific block (e.g., GetCurrentTimeBlock)
-poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[GetCurrentTimeBlock]' -xvs
-
-# Lint and format
-# prefer format if you want to just "fix" it and only get the errors that can't be autofixed
-poetry run format  # Black + isort
-poetry run lint    # ruff
-```
-
-More details can be found in TESTING.md
-
-#### Creating/Updating Snapshots
-
-When you first write a test or when the expected output changes:
-
-```bash
-poetry run pytest path/to/test.py --snapshot-update
-```
-
-⚠️ **Important**: Always review snapshot changes before committing! Use `git diff` to verify the changes are expected.
-
-### Frontend Development
-
-```bash
-# Install dependencies
-cd frontend && pnpm i
-
-# Generate API client from OpenAPI spec
-pnpm generate:api
-
-# Start development server
-pnpm dev
-
-# Run E2E tests
-pnpm test
-
-# Run Storybook for component development
-pnpm storybook
-
-# Build production
-pnpm build
-
-# Format and lint
-pnpm format
-
-# Type checking
-pnpm types
-```
-
-**📖 Complete Guide**: See `/frontend/CONTRIBUTING.md` and `/frontend/.cursorrules` for comprehensive frontend patterns.
-
-**Key Frontend Conventions:**
-
- Separate render logic from data/behavior in components
- Use generated API hooks from `@/app/api/__generated__/endpoints/`
- Use function declarations (not arrow functions) for components/handlers
- Use design system components from `src/components/` (atoms, molecules, organisms)
- Only use Phosphor Icons
- Never use `src/components/__legacy__/*` or deprecated `BackendAPI`
-
-## Architecture Overview
-
-### Backend Architecture
-
- **API Layer**: FastAPI with REST and WebSocket endpoints
- **Database**: PostgreSQL with Prisma ORM, includes pgvector for embeddings
- **Queue System**: RabbitMQ for async task processing
- **Execution Engine**: Separate executor service processes agent workflows
- **Authentication**: JWT-based with Supabase integration
- **Security**: Cache protection middleware prevents sensitive data caching in browsers/proxies
-
-### Frontend Architecture
-
- **Framework**: Next.js 15 App Router (client-first approach)
- **Data Fetching**: Type-safe generated API hooks via Orval + React Query
- **State Management**: React Query for server state, co-located UI state in components/hooks
- **Component Structure**: Separate render logic (`.tsx`) from business logic (`use*.ts` hooks)
- **Workflow Builder**: Visual graph editor using @xyflow/react
- **UI Components**: shadcn/ui (Radix UI primitives) with Tailwind CSS styling
- **Icons**: Phosphor Icons only
- **Feature Flags**: LaunchDarkly integration
- **Error Handling**: ErrorCard for render errors, toast for mutations, Sentry for exceptions
- **Testing**: Playwright for E2E, Storybook for component development
-
-### Key Concepts
+## Key Concepts

 1. **Agent Graphs**: Workflow definitions stored as JSON, executed by the backend
-2. **Blocks**: Reusable components in `/backend/blocks/` that perform specific tasks
+2. **Blocks**: Reusable components in `backend/backend/blocks/` that perform specific tasks
 3. **Integrations**: OAuth and API connections stored per user
 4. **Store**: Marketplace for sharing agent templates
 5. **Virus Scanning**: ClamAV integration for file upload security

-### Testing Approach
-
- Backend uses pytest with snapshot testing for API responses
- Test files are colocated with source files (`*_test.py`)
- Frontend uses Playwright for E2E tests
- Component testing via Storybook
-
-### Database Schema
-
-Key models (defined in `/backend/schema.prisma`):
-
- `User`: Authentication and profile data
- `AgentGraph`: Workflow definitions with version control
- `AgentGraphExecution`: Execution history and results
- `AgentNode`: Individual nodes in a workflow
- `StoreListing`: Marketplace listings for sharing agents
-
 ### Environment Configuration

 #### Configuration Files

- **Backend**: `/backend/.env.default` (defaults) → `/backend/.env` (user overrides)
- **Frontend**: `/frontend/.env.default` (defaults) → `/frontend/.env` (user overrides)
- **Platform**: `/.env.default` (Supabase/shared defaults) → `/.env` (user overrides)
+- **Backend**: `backend/.env.default` (defaults) → `backend/.env` (user overrides)
+- **Frontend**: `frontend/.env.default` (defaults) → `frontend/.env` (user overrides)
+- **Platform**: `.env.default` (Supabase/shared defaults) → `.env` (user overrides)

 #### Docker Environment Loading Order

@@ -167,83 +45,12 @@ Key models (defined in `/backend/schema.prisma`):
 - Backend/Frontend services use YAML anchors for consistent configuration
 - Supabase services (`db/docker/docker-compose.yml`) follow the same pattern

-### Common Development Tasks
-
-**Adding a new block:**
-
-Follow the comprehensive [Block SDK Guide](../../../docs/content/platform/block-sdk-guide.md) which covers:
-
- Provider configuration with `ProviderBuilder`
- Block schema definition
- Authentication (API keys, OAuth, webhooks)
- Testing and validation
- File organization
-
-Quick steps:
-
-1. Create new file in `/backend/backend/blocks/`
-2. Configure provider using `ProviderBuilder` in `_config.py`
-3. Inherit from `Block` base class
-4. Define input/output schemas using `BlockSchema`
-5. Implement async `run` method
-6. Generate unique block ID using `uuid.uuid4()`
-7. Test with `poetry run pytest backend/blocks/test/test_block.py`
-
-Note: when making many new blocks analyze the interfaces for each of these blocks and picture if they would go well together in a graph based editor or would they struggle to connect productively?
-ex: do the inputs and outputs tie well together?
-
-If you get any pushback or hit complex block conditions check the new_blocks guide in the docs.
-
-**Modifying the API:**
-
-1. Update route in `/backend/backend/server/routers/`
-2. Add/update Pydantic models in same directory
-3. Write tests alongside the route file
-4. Run `poetry run test` to verify
-
-### Frontend guidelines:
-
-See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
-
-1. **Pages**: Create in `src/app/(platform)/feature-name/page.tsx`
-   - Add `usePageName.ts` hook for logic
-   - Put sub-components in local `components/` folder
-2. **Components**: Structure as `ComponentName/ComponentName.tsx` + `useComponentName.ts` + `helpers.ts`
-   - Use design system components from `src/components/` (atoms, molecules, organisms)
-   - Never use `src/components/__legacy__/*`
-3. **Data fetching**: Use generated API hooks from `@/app/api/__generated__/endpoints/`
-   - Regenerate with `pnpm generate:api`
-   - Pattern: `use{Method}{Version}{OperationName}`
-4. **Styling**: Tailwind CSS only, use design tokens, Phosphor Icons only
-5. **Testing**: Add Storybook stories for new components, Playwright for E2E
-6. **Code conventions**: Function declarations (not arrow functions) for components/handlers
- Component props should be `interface Props { ... }` (not exported) unless the interface needs to be used outside the component
- Separate render logic from business logic (component.tsx + useComponent.ts + helpers.ts)
- Colocate state when possible and avoid creating large components, use sub-components ( local `/components` folder next to the parent component ) when sensible
- Avoid large hooks, abstract logic into `helpers.ts` files when sensible
- Use function declarations for components, arrow functions only for callbacks
- No barrel files or `index.ts` re-exports
- Do not use `useCallback` or `useMemo` unless strictly needed
- Avoid comments at all times unless the code is very complex
-
-### Security Implementation
-
-**Cache Protection Middleware:**
-
- Located in `/backend/backend/server/middleware/security.py`
- Default behavior: Disables caching for ALL endpoints with `Cache-Control: no-store, no-cache, must-revalidate, private`
- Uses an allow list approach - only explicitly permitted paths can be cached
- Cacheable paths include: static assets (`/static/*`, `/_next/static/*`), health checks, public store pages, documentation
- Prevents sensitive data (auth tokens, API keys, user data) from being cached by browsers/proxies
- To allow caching for a new endpoint, add it to `CACHEABLE_PATHS` in the middleware
- Applied to both main API server and external API applications
-
 ### Creating Pull Requests

- Create the PR aginst the `dev` branch of the repository.
- Ensure the branch name is descriptive (e.g., `feature/add-new-block`)/
- Use conventional commit messages (see below)/
- Fill out the .github/PULL_REQUEST_TEMPLATE.md template as the PR description/
+- Create the PR against the `dev` branch of the repository.
+- Ensure the branch name is descriptive (e.g., `feature/add-new-block`)
+- Use conventional commit messages (see below)
+- Fill out the .github/PULL_REQUEST_TEMPLATE.md template as the PR description
 - Run the github pre-commit hooks to ensure code quality.

 ### Reviewing/Revising Pull Requests
--- a/autogpt_platform/backend/CLAUDE.md
+++ b/autogpt_platform/backend/CLAUDE.md
@@ -0,0 +1,170 @@
+# CLAUDE.md - Backend
+
+This file provides guidance to Claude Code when working with the backend.
+
+## Essential Commands
+
+To run something with Python package dependencies you MUST use `poetry run ...`.
+
+```bash
+# Install dependencies
+poetry install
+
+# Run database migrations
+poetry run prisma migrate dev
+
+# Start all services (database, redis, rabbitmq, clamav)
+docker compose up -d
+
+# Run the backend as a whole
+poetry run app
+
+# Run tests
+poetry run test
+
+# Run specific test
+poetry run pytest path/to/test_file.py::test_function_name
+
+# Run block tests (tests that validate all blocks work correctly)
+poetry run pytest backend/blocks/test/test_block.py -xvs
+
+# Run tests for a specific block (e.g., GetCurrentTimeBlock)
+poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[GetCurrentTimeBlock]' -xvs
+
+# Lint and format
+# prefer format if you want to just "fix" it and only get the errors that can't be autofixed
+poetry run format  # Black + isort
+poetry run lint    # ruff
+```
+
+More details can be found in @TESTING.md
+
+### Creating/Updating Snapshots
+
+When you first write a test or when the expected output changes:
+
+```bash
+poetry run pytest path/to/test.py --snapshot-update
+```
+
+⚠️ **Important**: Always review snapshot changes before committing! Use `git diff` to verify the changes are expected.
+
+## Architecture
+
+- **API Layer**: FastAPI with REST and WebSocket endpoints
+- **Database**: PostgreSQL with Prisma ORM, includes pgvector for embeddings
+- **Queue System**: RabbitMQ for async task processing
+- **Execution Engine**: Separate executor service processes agent workflows
+- **Authentication**: JWT-based with Supabase integration
+- **Security**: Cache protection middleware prevents sensitive data caching in browsers/proxies
+
+## Testing Approach
+
+- Uses pytest with snapshot testing for API responses
+- Test files are colocated with source files (`*_test.py`)
+
+## Database Schema
+
+Key models (defined in `schema.prisma`):
+
+- `User`: Authentication and profile data
+- `AgentGraph`: Workflow definitions with version control
+- `AgentGraphExecution`: Execution history and results
+- `AgentNode`: Individual nodes in a workflow
+- `StoreListing`: Marketplace listings for sharing agents
+
+## Environment Configuration
+
+- **Backend**: `.env.default` (defaults) → `.env` (user overrides)
+
+## Common Development Tasks
+
+### Adding a new block
+
+Follow the comprehensive [Block SDK Guide](@../../docs/content/platform/block-sdk-guide.md) which covers:
+
+- Provider configuration with `ProviderBuilder`
+- Block schema definition
+- Authentication (API keys, OAuth, webhooks)
+- Testing and validation
+- File organization
+
+Quick steps:
+
+1. Create new file in `backend/blocks/`
+2. Configure provider using `ProviderBuilder` in `_config.py`
+3. Inherit from `Block` base class
+4. Define input/output schemas using `BlockSchema`
+5. Implement async `run` method
+6. Generate unique block ID using `uuid.uuid4()`
+7. Test with `poetry run pytest backend/blocks/test/test_block.py`
+
+Note: when making many new blocks, analyze the interfaces of each block and consider whether they would connect well in a graph-based editor or struggle to connect productively.
+For example: do the inputs and outputs tie well together?
+
+If you get any pushback or hit complex block conditions check the new_blocks guide in the docs.
+
+#### Handling files in blocks with `store_media_file()`
+
+When blocks need to work with files (images, videos, documents), use `store_media_file()` from `backend.util.file`. The `return_format` parameter determines what you get back:
+
+| Format | Use When | Returns |
+|--------|----------|---------|
+| `"for_local_processing"` | Processing with local tools (ffmpeg, MoviePy, PIL) | Local file path (e.g., `"image.png"`) |
+| `"for_external_api"` | Sending content to external APIs (Replicate, OpenAI) | Data URI (e.g., `"data:image/png;base64,..."`) |
+| `"for_block_output"` | Returning output from your block | Smart: `workspace://` in CoPilot, data URI in graphs |
+
+**Examples:**
+
+```python
+# INPUT: Need to process file locally with ffmpeg
+local_path = await store_media_file(
+    file=input_data.video,
+    execution_context=execution_context,
+    return_format="for_local_processing",
+)
+# local_path = "video.mp4" - use with Path/ffmpeg/etc
+
+# INPUT: Need to send to external API like Replicate
+image_b64 = await store_media_file(
+    file=input_data.image,
+    execution_context=execution_context,
+    return_format="for_external_api",
+)
+# image_b64 = "data:image/png;base64,iVBORw0..." - send to API
+
+# OUTPUT: Returning result from block
+result_url = await store_media_file(
+    file=generated_image_url,
+    execution_context=execution_context,
+    return_format="for_block_output",
+)
+yield "image_url", result_url
+# In CoPilot: result_url = "workspace://abc123"
+# In graphs:  result_url = "data:image/png;base64,..."
+```
+
+**Key points:**
+
+- `for_block_output` is the ONLY format that auto-adapts to execution context
+- Always use `for_block_output` for block outputs unless you have a specific reason not to
+- Never hardcode workspace checks - let `for_block_output` handle it
+
+### Modifying the API
+
+1. Update route in `backend/api/features/`
+2. Add/update Pydantic models in same directory
+3. Write tests alongside the route file
+4. Run `poetry run test` to verify
+
+## Security Implementation
+
+### Cache Protection Middleware
+
+- Located in `backend/api/middleware/security.py`
+- Default behavior: Disables caching for ALL endpoints with `Cache-Control: no-store, no-cache, must-revalidate, private`
+- Uses an allow list approach - only explicitly permitted paths can be cached
+- Cacheable paths include: static assets (`static/*`, `_next/static/*`), health checks, public store pages, documentation
+- Prevents sensitive data (auth tokens, API keys, user data) from being cached by browsers/proxies
+- To allow caching for a new endpoint, add it to `CACHEABLE_PATHS` in the middleware
+- Applied to both main API server and external API applications
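The allow-list behaviour described in the Cache Protection Middleware notes above can be sketched in a few lines. This is only an illustration under stated assumptions: the patterns in `CACHEABLE_PATHS` below and the helper name `cache_control_for` are hypothetical, and the real list and matching logic live in `backend/api/middleware/security.py`, which this diff does not show.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Hypothetical allow list for illustration; the real CACHEABLE_PATHS may differ.
CACHEABLE_PATHS = ("/static/*", "/_next/static/*", "/health")

# Header value quoted in the documentation above.
NO_CACHE = "no-store, no-cache, must-revalidate, private"


def cache_control_for(path: str) -> Optional[str]:
    """Return the Cache-Control value to force, or None if the path may be cached."""
    if any(fnmatchcase(path, pattern) for pattern in CACHEABLE_PATHS):
        return None  # explicitly allowed: leave caching headers untouched
    return NO_CACHE  # default: disable caching for everything else
```

The point of the design is that a newly added endpoint is uncacheable until someone opts it in, so forgetting to classify a route fails safe.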
--- a/autogpt_platform/backend/TESTING.md
+++ b/autogpt_platform/backend/TESTING.md
@@ -138,7 +138,7 @@ If the test doesn't need the `user_id` specifically, mocking is not necessary as

 #### Using Global Auth Fixtures

-Two global auth fixtures are provided by `backend/server/conftest.py`:
+Two global auth fixtures are provided by `backend/api/conftest.py`:

 - `mock_jwt_user` - Regular user with `test_user_id` ("test-user-id")
 - `mock_jwt_admin` - Admin user with `admin_user_id` ("admin-user-id")
--- a/autogpt_platform/backend/backend/api/conn_manager.py
+++ b/autogpt_platform/backend/backend/api/conn_manager.py
@@ -122,24 +122,6 @@ class ConnectionManager:

        return len(connections)

-    async def broadcast_to_all(self, *, method: WSMethod, data: dict) -> int:
-        """Broadcast a message to all active websocket connections."""
-        message = WSMessage(
-            method=method,
-            data=data,
-        ).model_dump_json()
-
-        connections = tuple(self.active_connections)
-        if not connections:
-            return 0
-
-        await asyncio.gather(
-            *(connection.send_text(message) for connection in connections),
-            return_exceptions=True,
-        )
-
-        return len(connections)
-
    async def _subscribe(self, channel_key: str, websocket: WebSocket) -> str:
        if channel_key not in self.subscriptions:
            self.subscriptions[channel_key] = set()
--- a/autogpt_platform/backend/backend/api/features/admin/execution_analytics_routes.py
+++ b/autogpt_platform/backend/backend/api/features/admin/execution_analytics_routes.py
@@ -176,64 +176,30 @@ async def get_execution_analytics_config(
        # Return with provider prefix for clarity
        return f"{provider_name}: {model_name}"

-    # Get all models from the registry (dynamic, not hardcoded enum)
-    from backend.data import llm_registry
-    from backend.server.v2.llm import db as llm_db
-
-    # Get the recommended model from the database (configurable via admin UI)
-    recommended_model_slug = await llm_db.get_recommended_model_slug()
-
-    # Build the available models list
-    first_enabled_slug = None
-    for registry_model in llm_registry.iter_dynamic_models():
-        # Only include enabled models in the list
-        if not registry_model.is_enabled:
-            continue
-
-        # Track first enabled model as fallback
-        if first_enabled_slug is None:
-            first_enabled_slug = registry_model.slug
-
-        model_enum = LlmModel(registry_model.slug)  # Create enum instance from slug
-        label = generate_model_label(model_enum)
+    # Include all LlmModel values (no more filtering by hardcoded list)
+    recommended_model = LlmModel.GPT4O_MINI.value
+    for model in LlmModel:
+        label = generate_model_label(model)
        # Add "(Recommended)" suffix to the recommended model
-        if registry_model.slug == recommended_model_slug:
+        if model.value == recommended_model:
            label += " (Recommended)"

        available_models.append(
            ModelInfo(
-                value=registry_model.slug,
+                value=model.value,
                label=label,
-                provider=registry_model.metadata.provider,
+                provider=model.provider,
            )
        )

    # Sort models by provider and name for better UX
    available_models.sort(key=lambda x: (x.provider, x.label))

-    # Handle case where no models are available
-    if not available_models:
-        logger.warning(
-            "No enabled LLM models found in registry. "
-            "Ensure models are configured and enabled in the LLM Registry."
-        )
-        # Provide a placeholder entry so admins see meaningful feedback
-        available_models.append(
-            ModelInfo(
-                value="",
-                label="No models available - configure in LLM Registry",
-                provider="none",
-            )
-        )
-
-    # Use the DB recommended model, or fallback to first enabled model
-    final_recommended = recommended_model_slug or first_enabled_slug or ""
-
    return ExecutionAnalyticsConfig(
        available_models=available_models,
        default_system_prompt=DEFAULT_SYSTEM_PROMPT,
        default_user_prompt=DEFAULT_USER_PROMPT,
-        recommended_model=final_recommended,
+        recommended_model=recommended_model,
    )
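
For readers following the hunk above, here is a minimal standalone sketch of the enum-driven model list it reverts to. The `LlmModel` members, the `provider` property, and the label logic are hypothetical stand-ins (the real enum and `generate_model_label` are defined elsewhere in the backend); only the loop, the "(Recommended)" suffix, and the sort key mirror the diff.

```python
from dataclasses import dataclass
from enum import Enum


class LlmModel(str, Enum):
    # Hypothetical members for illustration only.
    GPT4O_MINI = "gpt-4o-mini"
    GPT4O = "gpt-4o"
    CLAUDE_HAIKU = "claude-3-haiku"

    @property
    def provider(self) -> str:
        # Assumed mapping; the real enum carries its own provider metadata.
        return "anthropic" if self.value.startswith("claude") else "openai"


@dataclass
class ModelInfo:
    value: str
    label: str
    provider: str


def build_available_models(recommended: str) -> list:
    """List every enum member, mark the recommended one, sort by provider and label."""
    models = []
    for model in LlmModel:
        label = model.value  # placeholder for generate_model_label(model)
        if model.value == recommended:
            label += " (Recommended)"
        models.append(ModelInfo(value=model.value, label=label, provider=model.provider))
    models.sort(key=lambda m: (m.provider, m.label))
    return models
```

Because the list comes straight from the enum, it can never be empty, which is why the placeholder entry and fallback logic removed in this hunk are no longer needed.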


--- a/autogpt_platform/backend/backend/api/features/admin/llm_routes.py
+++ b/autogpt_platform/backend/backend/api/features/admin/llm_routes.py
@@ -1,595 +0,0 @@
-import logging
-
-import autogpt_libs.auth
-import fastapi
-
-from backend.data import llm_registry
-from backend.data.block_cost_config import refresh_llm_costs
-from backend.server.v2.llm import db as llm_db
-from backend.server.v2.llm import model as llm_model
-
-logger = logging.getLogger(__name__)
-
-router = fastapi.APIRouter(
-    tags=["llm", "admin"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_admin_user)],
-)
-
-
-async def _refresh_runtime_state() -> None:
-    """Refresh the LLM registry and clear all related caches to ensure real-time updates."""
-    logger.info("Refreshing LLM registry runtime state...")
-    try:
-        # Refresh registry from database
-        await llm_registry.refresh_llm_registry()
-        refresh_llm_costs()
-
-        # Clear block schema caches so they're regenerated with updated model options
-        from backend.data.block import BlockSchema
-
-        BlockSchema.clear_all_schema_caches()
-        logger.info("Cleared all block schema caches")
-
-        # Clear the /blocks endpoint cache so frontend gets updated schemas
-        try:
-            from backend.api.features.v1 import _get_cached_blocks
-
-            _get_cached_blocks.cache_clear()
-            logger.info("Cleared /blocks endpoint cache")
-        except Exception as e:
-            logger.warning("Failed to clear /blocks cache: %s", e)
-
-        # Clear the v2 builder caches (if they exist)
-        try:
-            from backend.api.features.builder import db as builder_db
-
-            if hasattr(builder_db, "_get_all_providers"):
-                builder_db._get_all_providers.cache_clear()
-                logger.info("Cleared v2 builder providers cache")
-            if hasattr(builder_db, "_build_cached_search_results"):
-                builder_db._build_cached_search_results.cache_clear()
-                logger.info("Cleared v2 builder search results cache")
-        except Exception as e:
-            logger.debug("Could not clear v2 builder cache: %s", e)
-
-        # Notify all executor services to refresh their registry cache
-        from backend.data.llm_registry import publish_registry_refresh_notification
-
-        await publish_registry_refresh_notification()
-        logger.info("Published registry refresh notification")
-    except Exception as exc:
-        logger.exception(
-            "LLM runtime state refresh failed; caches may be stale: %s", exc
-        )
-
-
-@router.get(
-    "/providers",
-    summary="List LLM providers",
-    response_model=llm_model.LlmProvidersResponse,
-)
-async def list_llm_providers(include_models: bool = True):
-    providers = await llm_db.list_providers(include_models=include_models)
-    return llm_model.LlmProvidersResponse(providers=providers)
-
-
-@router.post(
-    "/providers",
-    summary="Create LLM provider",
-    response_model=llm_model.LlmProvider,
-)
-async def create_llm_provider(request: llm_model.UpsertLlmProviderRequest):
-    provider = await llm_db.upsert_provider(request=request)
-    await _refresh_runtime_state()
-    return provider
-
-
-@router.patch(
-    "/providers/{provider_id}",
-    summary="Update LLM provider",
-    response_model=llm_model.LlmProvider,
-)
-async def update_llm_provider(
-    provider_id: str,
-    request: llm_model.UpsertLlmProviderRequest,
-):
-    provider = await llm_db.upsert_provider(request=request, provider_id=provider_id)
-    await _refresh_runtime_state()
-    return provider
-
-
-@router.delete(
-    "/providers/{provider_id}",
-    summary="Delete LLM provider",
-    response_model=dict,
-)
-async def delete_llm_provider(provider_id: str):
-    """
-    Delete an LLM provider.
-
-    A provider can only be deleted if it has no associated models.
-    Delete all models from the provider first before deleting the provider.
-    """
-    try:
-        await llm_db.delete_provider(provider_id)
-        await _refresh_runtime_state()
-        logger.info("Deleted LLM provider '%s'", provider_id)
-        return {"success": True, "message": "Provider deleted successfully"}
-    except ValueError as e:
-        logger.warning("Failed to delete provider '%s': %s", provider_id, e)
-        raise fastapi.HTTPException(status_code=400, detail=str(e))
-    except Exception as e:
-        logger.exception("Failed to delete provider '%s': %s", provider_id, e)
-        raise fastapi.HTTPException(status_code=500, detail=str(e))
-
-
-@router.get(
-    "/models",
-    summary="List LLM models",
-    response_model=llm_model.LlmModelsResponse,
-)
-async def list_llm_models(
-    provider_id: str | None = fastapi.Query(default=None),
-    page: int = fastapi.Query(default=1, ge=1, description="Page number (1-indexed)"),
-    page_size: int = fastapi.Query(
-        default=50, ge=1, le=100, description="Number of models per page"
-    ),
-):
-    return await llm_db.list_models(
-        provider_id=provider_id, page=page, page_size=page_size
-    )
-
-
-@router.post(
-    "/models",
-    summary="Create LLM model",
-    response_model=llm_model.LlmModel,
-)
-async def create_llm_model(request: llm_model.CreateLlmModelRequest):
-    model = await llm_db.create_model(request=request)
-    await _refresh_runtime_state()
-    return model
-
-
-@router.patch(
-    "/models/{model_id}",
-    summary="Update LLM model",
-    response_model=llm_model.LlmModel,
-)
-async def update_llm_model(
-    model_id: str,
-    request: llm_model.UpdateLlmModelRequest,
-):
-    model = await llm_db.update_model(model_id=model_id, request=request)
-    await _refresh_runtime_state()
-    return model
-
-
-@router.patch(
-    "/models/{model_id}/toggle",
-    summary="Toggle LLM model availability",
-    response_model=llm_model.ToggleLlmModelResponse,
-)
-async def toggle_llm_model(
-    model_id: str,
-    request: llm_model.ToggleLlmModelRequest,
-):
-    """
-    Toggle a model's enabled status, optionally migrating workflows when disabling.
-
-    If disabling a model and `migrate_to_slug` is provided, all workflows using
-    this model will be migrated to the specified replacement model before disabling.
-    A migration record is created which can be reverted later using the revert endpoint.
-
-    Optional fields:
-    - `migration_reason`: Reason for the migration (e.g., "Provider outage")
-    - `custom_credit_cost`: Custom pricing override for billing during migration
-    """
-    try:
-        result = await llm_db.toggle_model(
-            model_id=model_id,
-            is_enabled=request.is_enabled,
-            migrate_to_slug=request.migrate_to_slug,
-            migration_reason=request.migration_reason,
-            custom_credit_cost=request.custom_credit_cost,
-        )
-        await _refresh_runtime_state()
-        if result.nodes_migrated > 0:
-            logger.info(
-                "Toggled model '%s' to %s and migrated %d nodes to '%s' (migration_id=%s)",
-                result.model.slug,
-                "enabled" if request.is_enabled else "disabled",
-                result.nodes_migrated,
-                result.migrated_to_slug,
-                result.migration_id,
-            )
-        return result
-    except ValueError as exc:
-        logger.warning("Model toggle validation failed: %s", exc)
-        raise fastapi.HTTPException(status_code=400, detail=str(exc)) from exc
-    except Exception as exc:
-        logger.exception("Failed to toggle LLM model %s: %s", model_id, exc)
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Failed to toggle model availability",
-        ) from exc
-
-
-@router.get(
-    "/models/{model_id}/usage",
-    summary="Get model usage count",
-    response_model=llm_model.LlmModelUsageResponse,
-)
-async def get_llm_model_usage(model_id: str):
-    """Get the number of workflow nodes using this model."""
-    try:
-        return await llm_db.get_model_usage(model_id=model_id)
-    except ValueError as exc:
-        raise fastapi.HTTPException(status_code=404, detail=str(exc)) from exc
-    except Exception as exc:
-        logger.exception("Failed to get model usage %s: %s", model_id, exc)
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Failed to get model usage",
-        ) from exc
-
-
-@router.delete(
-    "/models/{model_id}",
-    summary="Delete LLM model and migrate workflows",
-    response_model=llm_model.DeleteLlmModelResponse,
-)
-async def delete_llm_model(
-    model_id: str,
-    replacement_model_slug: str | None = fastapi.Query(
-        default=None,
-        description="Slug of the model to migrate existing workflows to (required only if workflows use this model)",
-    ),
-):
-    """
-    Delete a model and optionally migrate workflows using it to a replacement model.
-
-    If no workflows are using this model, it can be deleted without providing a
-    replacement. If workflows exist, replacement_model_slug is required.
-
-    This endpoint:
-    1. Counts how many workflow nodes use the model being deleted
-    2. If nodes exist, validates the replacement model and migrates them
-    3. Deletes the model record
-    4. Refreshes all caches and notifies executors
-
-    Example: DELETE /admin/llm/models/{id}?replacement_model_slug=gpt-4o
-    Example (no usage): DELETE /admin/llm/models/{id}
-    """
-    try:
-        result = await llm_db.delete_model(
-            model_id=model_id, replacement_model_slug=replacement_model_slug
-        )
-        await _refresh_runtime_state()
-        logger.info(
-            "Deleted model '%s' and migrated %d nodes to '%s'",
-            result.deleted_model_slug,
-            result.nodes_migrated,
-            result.replacement_model_slug,
-        )
-        return result
-    except ValueError as exc:
-        # Validation errors (model not found, replacement invalid, etc.)
-        logger.warning("Model deletion validation failed: %s", exc)
-        raise fastapi.HTTPException(status_code=400, detail=str(exc)) from exc
-    except Exception as exc:
-        logger.exception("Failed to delete LLM model %s: %s", model_id, exc)
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Failed to delete model and migrate workflows",
-        ) from exc
-
-
-# ============================================================================
-# Migration Management Endpoints
-# ============================================================================
-
-
-@router.get(
-    "/migrations",
-    summary="List model migrations",
-    response_model=llm_model.LlmMigrationsResponse,
-)
-async def list_llm_migrations(
-    include_reverted: bool = fastapi.Query(
-        default=False, description="Include reverted migrations in the list"
-    ),
-):
-    """
-    List all model migrations.
-
-    Migrations are created when disabling a model with the migrate_to_slug option.
-    They can be reverted to restore the original model configuration.
-    """
-    try:
-        migrations = await llm_db.list_migrations(include_reverted=include_reverted)
-        return llm_model.LlmMigrationsResponse(migrations=migrations)
-    except Exception as exc:
-        logger.exception("Failed to list migrations: %s", exc)
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Failed to list migrations",
-        ) from exc
-
-
-@router.get(
-    "/migrations/{migration_id}",
-    summary="Get migration details",
-    response_model=llm_model.LlmModelMigration,
-)
-async def get_llm_migration(migration_id: str):
-    """Get details of a specific migration."""
-    try:
-        migration = await llm_db.get_migration(migration_id)
-        if not migration:
-            raise fastapi.HTTPException(
-                status_code=404, detail=f"Migration '{migration_id}' not found"
-            )
-        return migration
-    except fastapi.HTTPException:
-        raise
-    except Exception as exc:
-        logger.exception("Failed to get migration %s: %s", migration_id, exc)
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Failed to get migration",
-        ) from exc
-
-
-@router.post(
-    "/migrations/{migration_id}/revert",
-    summary="Revert a model migration",
-    response_model=llm_model.RevertMigrationResponse,
-)
-async def revert_llm_migration(
-    migration_id: str,
-    request: llm_model.RevertMigrationRequest | None = None,
-):
-    """
-    Revert a model migration, restoring affected workflows to their original model.
-
-    This only reverts the specific nodes that were part of the migration.
-    The source model must exist for the revert to succeed.
-
-    Options:
-    - `re_enable_source_model`: Whether to re-enable the source model if disabled (default: True)
-
-    Response includes:
-    - `nodes_reverted`: Number of nodes successfully reverted
-    - `nodes_already_changed`: Number of nodes that were modified since migration (not reverted)
-    - `source_model_re_enabled`: Whether the source model was re-enabled
-
-    Requirements:
-    - Migration must not already be reverted
-    - Source model must exist
-    """
-    try:
-        re_enable = request.re_enable_source_model if request else True
-        result = await llm_db.revert_migration(
-            migration_id,
-            re_enable_source_model=re_enable,
-        )
-        await _refresh_runtime_state()
-        logger.info(
-            "Reverted migration '%s': %d nodes restored from '%s' to '%s' "
-            "(%d already changed, source re-enabled=%s)",
-            migration_id,
-            result.nodes_reverted,
-            result.target_model_slug,
-            result.source_model_slug,
-            result.nodes_already_changed,
-            result.source_model_re_enabled,
-        )
-        return result
-    except ValueError as exc:
-        logger.warning("Migration revert validation failed: %s", exc)
-        raise fastapi.HTTPException(status_code=400, detail=str(exc)) from exc
-    except Exception as exc:
-        logger.exception("Failed to revert migration %s: %s", migration_id, exc)
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Failed to revert migration",
-        ) from exc
-
-
-# ============================================================================
-# Creator Management Endpoints
-# ============================================================================
-
-
-@router.get(
-    "/creators",
-    summary="List model creators",
-    response_model=llm_model.LlmCreatorsResponse,
-)
-async def list_llm_creators():
-    """
-    List all model creators.
-
-    Creators are organizations that create/train models (e.g., OpenAI, Meta, Anthropic).
-    This is distinct from providers who host/serve the models (e.g., OpenRouter).
-    """
-    try:
-        creators = await llm_db.list_creators()
-        return llm_model.LlmCreatorsResponse(creators=creators)
-    except Exception as exc:
-        logger.exception("Failed to list creators: %s", exc)
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Failed to list creators",
-        ) from exc
-
-
-@router.get(
-    "/creators/{creator_id}",
-    summary="Get creator details",
-    response_model=llm_model.LlmModelCreator,
-)
-async def get_llm_creator(creator_id: str):
-    """Get details of a specific model creator."""
-    try:
-        creator = await llm_db.get_creator(creator_id)
-        if not creator:
-            raise fastapi.HTTPException(
-                status_code=404, detail=f"Creator '{creator_id}' not found"
-            )
-        return creator
-    except fastapi.HTTPException:
-        raise
-    except Exception as exc:
-        logger.exception("Failed to get creator %s: %s", creator_id, exc)
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Failed to get creator",
-        ) from exc
-
-
-@router.post(
-    "/creators",
-    summary="Create model creator",
-    response_model=llm_model.LlmModelCreator,
-)
-async def create_llm_creator(request: llm_model.UpsertLlmCreatorRequest):
-    """
-    Create a new model creator.
-
-    A creator represents an organization that creates/trains AI models,
-    such as OpenAI, Anthropic, Meta, or Google.
-    """
-    try:
-        creator = await llm_db.upsert_creator(request=request)
-        await _refresh_runtime_state()
-        logger.info("Created model creator '%s' (%s)", creator.display_name, creator.id)
-        return creator
-    except Exception as exc:
-        logger.exception("Failed to create creator: %s", exc)
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Failed to create creator",
-        ) from exc
-
-
-@router.patch(
-    "/creators/{creator_id}",
-    summary="Update model creator",
-    response_model=llm_model.LlmModelCreator,
-)
-async def update_llm_creator(
-    creator_id: str,
-    request: llm_model.UpsertLlmCreatorRequest,
-):
-    """Update an existing model creator."""
-    try:
-        creator = await llm_db.upsert_creator(request=request, creator_id=creator_id)
-        await _refresh_runtime_state()
-        logger.info("Updated model creator '%s' (%s)", creator.display_name, creator_id)
-        return creator
-    except Exception as exc:
-        logger.exception("Failed to update creator %s: %s", creator_id, exc)
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Failed to update creator",
-        ) from exc
-
-
-@router.delete(
-    "/creators/{creator_id}",
-    summary="Delete model creator",
-    response_model=dict,
-)
-async def delete_llm_creator(creator_id: str):
-    """
-    Delete a model creator.
-
-    This will remove the creator association from all models that reference it
-    (sets creatorId to NULL), but will not delete the models themselves.
-    """
-    try:
-        await llm_db.delete_creator(creator_id)
-        await _refresh_runtime_state()
-        logger.info("Deleted model creator '%s'", creator_id)
-        return {"success": True, "message": f"Creator '{creator_id}' deleted"}
-    except ValueError as exc:
-        logger.warning("Creator deletion validation failed: %s", exc)
-        raise fastapi.HTTPException(status_code=404, detail=str(exc)) from exc
-    except Exception as exc:
-        logger.exception("Failed to delete creator %s: %s", creator_id, exc)
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Failed to delete creator",
-        ) from exc
-
-
-# ============================================================================
-# Recommended Model Endpoints
-# ============================================================================
-
-
-@router.get(
-    "/recommended-model",
-    summary="Get recommended model",
-    response_model=llm_model.RecommendedModelResponse,
-)
-async def get_recommended_model():
-    """
-    Get the currently recommended LLM model.
-
-    The recommended model is shown to users as the default/suggested option
-    in model selection dropdowns.
-    """
-    try:
-        model = await llm_db.get_recommended_model()
-        return llm_model.RecommendedModelResponse(
-            model=model,
-            slug=model.slug if model else None,
-        )
-    except Exception as exc:
-        logger.exception("Failed to get recommended model: %s", exc)
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Failed to get recommended model",
-        ) from exc
-
-
-@router.post(
-    "/recommended-model",
-    summary="Set recommended model",
-    response_model=llm_model.SetRecommendedModelResponse,
-)
-async def set_recommended_model(request: llm_model.SetRecommendedModelRequest):
-    """
-    Set a model as the recommended model.
-
-    This clears the recommended flag from any other model and sets it on
-    the specified model. The model must be enabled to be set as recommended.
-
-    The recommended model is displayed to users as the default/suggested
-    option in model selection dropdowns throughout the platform.
-    """
-    try:
-        model, previous_slug = await llm_db.set_recommended_model(request.model_id)
-        await _refresh_runtime_state()
-        logger.info(
-            "Set recommended model to '%s' (previous: %s)",
-            model.slug,
-            previous_slug or "none",
-        )
-        return llm_model.SetRecommendedModelResponse(
-            model=model,
-            previous_recommended_slug=previous_slug,
-            message=f"Model '{model.display_name}' is now the recommended model",
-        )
-    except ValueError as exc:
-        logger.warning("Set recommended model validation failed: %s", exc)
-        raise fastapi.HTTPException(status_code=400, detail=str(exc)) from exc
-    except Exception as exc:
-        logger.exception("Failed to set recommended model: %s", exc)
-        raise fastapi.HTTPException(
-            status_code=500,
-            detail="Failed to set recommended model",
-        ) from exc
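Review note: the removed handlers in `llm_routes.py` mostly share one error-handling shape. A `ValueError` from the data layer becomes a 4xx response carrying the exception text, an already-built HTTP error is re-raised untouched, and anything else becomes a 500 with a fixed, non-leaking detail string. The sketch below restates that mapping in isolation. `HTTPError` and `to_http_error` are hypothetical names used only for illustration (a dependency-free stand-in for `fastapi.HTTPException`); they are not part of the removed module.

```python
import logging

logger = logging.getLogger(__name__)


class HTTPError(Exception):
    """Hypothetical stand-in for fastapi.HTTPException (assumption, not the real class)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def to_http_error(
    exc: Exception, generic_detail: str, validation_status: int = 400
) -> HTTPError:
    """Map an exception to an HTTP error in the style of the removed handlers."""
    if isinstance(exc, HTTPError):
        # Already an HTTP error (e.g. an explicit 404): pass it through unchanged.
        return exc
    if isinstance(exc, ValueError):
        # Validation problems are the caller's fault; the message is safe to expose.
        logger.warning("Validation failed: %s", exc)
        return HTTPError(validation_status, str(exc))
    # Unexpected failures are logged in full but reported with a fixed message.
    logger.error("Unexpected failure: %s", exc)
    return HTTPError(500, generic_detail)
```

Most removed routes used 400 for validation errors, while `get_llm_model_usage` and `delete_llm_creator` used 404, which is why the status is a parameter here.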
--- a/autogpt_platform/backend/backend/api/features/admin/llm_routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/admin/llm_routes_test.py
@@ -1,491 +0,0 @@
-import json
-from unittest.mock import AsyncMock
-
-import fastapi
-import fastapi.testclient
-import pytest
-import pytest_mock
-from autogpt_libs.auth.jwt_utils import get_jwt_payload
-from pytest_snapshot.plugin import Snapshot
-
-import backend.api.features.admin.llm_routes as llm_routes
-from backend.server.v2.llm import model as llm_model
-from backend.util.models import Pagination
-
-app = fastapi.FastAPI()
-app.include_router(llm_routes.router, prefix="/admin/llm")
-
-client = fastapi.testclient.TestClient(app)
-
-
-@pytest.fixture(autouse=True)
-def setup_app_admin_auth(mock_jwt_admin):
-    """Setup admin auth overrides for all tests in this module"""
-    app.dependency_overrides[get_jwt_payload] = mock_jwt_admin["get_jwt_payload"]
-    yield
-    app.dependency_overrides.clear()
-
-
-def test_list_llm_providers_success(
-    mocker: pytest_mock.MockFixture,
-    configured_snapshot: Snapshot,
-) -> None:
-    """Test successful listing of LLM providers"""
-    # Mock the database function
-    mock_providers = [
-        {
-            "id": "provider-1",
-            "name": "openai",
-            "display_name": "OpenAI",
-            "description": "OpenAI LLM provider",
-            "supports_tools": True,
-            "supports_json_output": True,
-            "supports_reasoning": False,
-            "supports_parallel_tool": True,
-            "metadata": {},
-            "models": [],
-        },
-        {
-            "id": "provider-2",
-            "name": "anthropic",
-            "display_name": "Anthropic",
-            "description": "Anthropic LLM provider",
-            "supports_tools": True,
-            "supports_json_output": True,
-            "supports_reasoning": False,
-            "supports_parallel_tool": True,
-            "metadata": {},
-            "models": [],
-        },
-    ]
-
-    mocker.patch(
-        "backend.api.features.admin.llm_routes.llm_db.list_providers",
-        new=AsyncMock(return_value=mock_providers),
-    )
-
-    response = client.get("/admin/llm/providers")
-
-    assert response.status_code == 200
-    response_data = response.json()
-    assert len(response_data["providers"]) == 2
-    assert response_data["providers"][0]["name"] == "openai"
-
-    # Snapshot test the response (must be string)
-    configured_snapshot.assert_match(
-        json.dumps(response_data, indent=2, sort_keys=True),
-        "list_llm_providers_success.json",
-    )
-
-
-def test_list_llm_models_success(
-    mocker: pytest_mock.MockFixture,
-    configured_snapshot: Snapshot,
-) -> None:
-    """Test successful listing of LLM models with pagination"""
-    # Mock the database function - now returns LlmModelsResponse
-    mock_model = llm_model.LlmModel(
-        id="model-1",
-        slug="gpt-4o",
-        display_name="GPT-4o",
-        description="GPT-4 Optimized",
-        provider_id="provider-1",
-        context_window=128000,
-        max_output_tokens=16384,
-        is_enabled=True,
-        capabilities={},
-        metadata={},
-        costs=[
-            llm_model.LlmModelCost(
-                id="cost-1",
-                credit_cost=10,
-                credential_provider="openai",
-                metadata={},
-            )
-        ],
-    )
-
-    mock_response = llm_model.LlmModelsResponse(
-        models=[mock_model],
-        pagination=Pagination(
-            total_items=1,
-            total_pages=1,
-            current_page=1,
-            page_size=50,
-        ),
-    )
-
-    mocker.patch(
-        "backend.api.features.admin.llm_routes.llm_db.list_models",
-        new=AsyncMock(return_value=mock_response),
-    )
-
-    response = client.get("/admin/llm/models")
-
-    assert response.status_code == 200
-    response_data = response.json()
-    assert len(response_data["models"]) == 1
-    assert response_data["models"][0]["slug"] == "gpt-4o"
-    assert response_data["pagination"]["total_items"] == 1
-    assert response_data["pagination"]["page_size"] == 50
-
-    # Snapshot test the response (must be string)
-    configured_snapshot.assert_match(
-        json.dumps(response_data, indent=2, sort_keys=True),
-        "list_llm_models_success.json",
-    )
-
-
-def test_create_llm_provider_success(
-    mocker: pytest_mock.MockFixture,
-    configured_snapshot: Snapshot,
-) -> None:
-    """Test successful creation of LLM provider"""
-    mock_provider = {
-        "id": "new-provider-id",
-        "name": "groq",
-        "display_name": "Groq",
-        "description": "Groq LLM provider",
-        "supports_tools": True,
-        "supports_json_output": True,
-        "supports_reasoning": False,
-        "supports_parallel_tool": False,
-        "metadata": {},
-    }
-
-    mocker.patch(
-        "backend.api.features.admin.llm_routes.llm_db.upsert_provider",
-        new=AsyncMock(return_value=mock_provider),
-    )
-
-    mock_refresh = mocker.patch(
-        "backend.api.features.admin.llm_routes._refresh_runtime_state",
-        new=AsyncMock(),
-    )
-
-    request_data = {
-        "name": "groq",
-        "display_name": "Groq",
-        "description": "Groq LLM provider",
-        "supports_tools": True,
-        "supports_json_output": True,
-        "supports_reasoning": False,
-        "supports_parallel_tool": False,
-        "metadata": {},
-    }
-
-    response = client.post("/admin/llm/providers", json=request_data)
-
-    assert response.status_code == 200
-    response_data = response.json()
-    assert response_data["name"] == "groq"
-    assert response_data["display_name"] == "Groq"
-
-    # Verify refresh was called
-    mock_refresh.assert_called_once()
-
-    # Snapshot test the response (must be string)
-    configured_snapshot.assert_match(
-        json.dumps(response_data, indent=2, sort_keys=True),
-        "create_llm_provider_success.json",
-    )
-
-
-def test_create_llm_model_success(
-    mocker: pytest_mock.MockFixture,
-    configured_snapshot: Snapshot,
-) -> None:
-    """Test successful creation of LLM model"""
-    mock_model = {
-        "id": "new-model-id",
-        "slug": "gpt-4.1-mini",
-        "display_name": "GPT-4.1 Mini",
-        "description": "Latest GPT-4.1 Mini model",
-        "provider_id": "provider-1",
-        "context_window": 128000,
-        "max_output_tokens": 16384,
-        "is_enabled": True,
-        "capabilities": {},
-        "metadata": {},
-        "costs": [
-            {
-                "id": "cost-id",
-                "credit_cost": 5,
-                "credential_provider": "openai",
-                "metadata": {},
-            }
-        ],
-    }
-
-    mocker.patch(
-        "backend.api.features.admin.llm_routes.llm_db.create_model",
-        new=AsyncMock(return_value=mock_model),
-    )
-
-    mock_refresh = mocker.patch(
-        "backend.api.features.admin.llm_routes._refresh_runtime_state",
-        new=AsyncMock(),
-    )
-
-    request_data = {
-        "slug": "gpt-4.1-mini",
-        "display_name": "GPT-4.1 Mini",
-        "description": "Latest GPT-4.1 Mini model",
-        "provider_id": "provider-1",
-        "context_window": 128000,
-        "max_output_tokens": 16384,
-        "is_enabled": True,
-        "capabilities": {},
-        "metadata": {},
-        "costs": [
-            {
-                "credit_cost": 5,
-                "credential_provider": "openai",
-                "metadata": {},
-            }
-        ],
-    }
-
-    response = client.post("/admin/llm/models", json=request_data)
-
-    assert response.status_code == 200
-    response_data = response.json()
-    assert response_data["slug"] == "gpt-4.1-mini"
-    assert response_data["is_enabled"] is True
-
-    # Verify refresh was called
-    mock_refresh.assert_called_once()
-
-    # Snapshot test the response (must be string)
-    configured_snapshot.assert_match(
-        json.dumps(response_data, indent=2, sort_keys=True),
-        "create_llm_model_success.json",
-    )
-
-
-def test_update_llm_model_success(
-    mocker: pytest_mock.MockFixture,
-    configured_snapshot: Snapshot,
-) -> None:
-    """Test successful update of LLM model"""
-    mock_model = {
-        "id": "model-1",
-        "slug": "gpt-4o",
-        "display_name": "GPT-4o Updated",
-        "description": "Updated description",
-        "provider_id": "provider-1",
-        "context_window": 256000,
-        "max_output_tokens": 32768,
-        "is_enabled": True,
-        "capabilities": {},
-        "metadata": {},
-        "costs": [
-            {
-                "id": "cost-1",
-                "credit_cost": 15,
-                "credential_provider": "openai",
-                "metadata": {},
-            }
-        ],
-    }
-
-    mocker.patch(
-        "backend.api.features.admin.llm_routes.llm_db.update_model",
-        new=AsyncMock(return_value=mock_model),
-    )
-
-    mock_refresh = mocker.patch(
-        "backend.api.features.admin.llm_routes._refresh_runtime_state",
-        new=AsyncMock(),
-    )
-
-    request_data = {
-        "display_name": "GPT-4o Updated",
-        "description": "Updated description",
-        "context_window": 256000,
-        "max_output_tokens": 32768,
-    }
-
-    response = client.patch("/admin/llm/models/model-1", json=request_data)
-
-    assert response.status_code == 200
-    response_data = response.json()
-    assert response_data["display_name"] == "GPT-4o Updated"
-    assert response_data["context_window"] == 256000
-
-    # Verify refresh was called
-    mock_refresh.assert_called_once()
-
-    # Snapshot test the response (must be string)
-    configured_snapshot.assert_match(
-        json.dumps(response_data, indent=2, sort_keys=True),
-        "update_llm_model_success.json",
-    )
-
-
-def test_toggle_llm_model_success(
-    mocker: pytest_mock.MockFixture,
-    configured_snapshot: Snapshot,
-) -> None:
-    """Test successful toggling of LLM model enabled status"""
-    # Create a proper mock model object
-    mock_model = llm_model.LlmModel(
-        id="model-1",
-        slug="gpt-4o",
-        display_name="GPT-4o",
-        description="GPT-4 Optimized",
-        provider_id="provider-1",
-        context_window=128000,
-        max_output_tokens=16384,
-        is_enabled=False,
-        capabilities={},
-        metadata={},
-        costs=[],
-    )
-
-    # Create a proper ToggleLlmModelResponse
-    mock_response = llm_model.ToggleLlmModelResponse(
-        model=mock_model,
-        nodes_migrated=0,
-        migrated_to_slug=None,
-        migration_id=None,
-    )
-
-    mocker.patch(
-        "backend.api.features.admin.llm_routes.llm_db.toggle_model",
-        new=AsyncMock(return_value=mock_response),
-    )
-
-    mock_refresh = mocker.patch(
-        "backend.api.features.admin.llm_routes._refresh_runtime_state",
-        new=AsyncMock(),
-    )
-
-    request_data = {"is_enabled": False}
-
-    response = client.patch("/admin/llm/models/model-1/toggle", json=request_data)
-
-    assert response.status_code == 200
-    response_data = response.json()
-    assert response_data["model"]["is_enabled"] is False
-
-    # Verify refresh was called
-    mock_refresh.assert_called_once()
-
-    # Snapshot test the response (must be string)
-    configured_snapshot.assert_match(
-        json.dumps(response_data, indent=2, sort_keys=True),
-        "toggle_llm_model_success.json",
-    )
-
-
-def test_delete_llm_model_success(
-    mocker: pytest_mock.MockFixture,
-    configured_snapshot: Snapshot,
-) -> None:
-    """Test successful deletion of LLM model with migration"""
-    # Create a proper DeleteLlmModelResponse
-    mock_response = llm_model.DeleteLlmModelResponse(
-        deleted_model_slug="gpt-3.5-turbo",
-        deleted_model_display_name="GPT-3.5 Turbo",
-        replacement_model_slug="gpt-4o-mini",
-        nodes_migrated=42,
-        message="Successfully deleted model 'GPT-3.5 Turbo' (gpt-3.5-turbo) "
-        "and migrated 42 workflow node(s) to 'gpt-4o-mini'.",
-    )
-
-    mocker.patch(
-        "backend.api.features.admin.llm_routes.llm_db.delete_model",
-        new=AsyncMock(return_value=mock_response),
-    )
-
-    mock_refresh = mocker.patch(
-        "backend.api.features.admin.llm_routes._refresh_runtime_state",
-        new=AsyncMock(),
-    )
-
-    response = client.delete(
-        "/admin/llm/models/model-1?replacement_model_slug=gpt-4o-mini"
-    )
-
-    assert response.status_code == 200
-    response_data = response.json()
-    assert response_data["deleted_model_slug"] == "gpt-3.5-turbo"
-    assert response_data["nodes_migrated"] == 42
-    assert response_data["replacement_model_slug"] == "gpt-4o-mini"
-
-    # Verify refresh was called
-    mock_refresh.assert_called_once()
-
-    # Snapshot test the response (must be string)
-    configured_snapshot.assert_match(
-        json.dumps(response_data, indent=2, sort_keys=True),
-        "delete_llm_model_success.json",
-    )
-
-
-def test_delete_llm_model_validation_error(
-    mocker: pytest_mock.MockFixture,
-) -> None:
-    """Test deletion fails with proper error when validation fails"""
-    mocker.patch(
-        "backend.api.features.admin.llm_routes.llm_db.delete_model",
-        new=AsyncMock(side_effect=ValueError("Replacement model 'invalid' not found")),
-    )
-
-    response = client.delete("/admin/llm/models/model-1?replacement_model_slug=invalid")
-
-    assert response.status_code == 400
-    assert "Replacement model 'invalid' not found" in response.json()["detail"]
-
-
-def test_delete_llm_model_no_replacement_with_usage(
-    mocker: pytest_mock.MockFixture,
-) -> None:
-    """Test deletion fails when nodes exist but no replacement is provided"""
-    mocker.patch(
-        "backend.api.features.admin.llm_routes.llm_db.delete_model",
-        new=AsyncMock(
-            side_effect=ValueError(
-                "Cannot delete model 'test-model': 5 workflow node(s) are using it. "
-                "Please provide a replacement_model_slug to migrate them."
-            )
-        ),
-    )
-
-    response = client.delete("/admin/llm/models/model-1")
-
-    assert response.status_code == 400
-    assert "workflow node(s) are using it" in response.json()["detail"]
-
-
-def test_delete_llm_model_no_replacement_no_usage(
-    mocker: pytest_mock.MockFixture,
-) -> None:
-    """Test deletion succeeds when no nodes use the model and no replacement is provided"""
-    mock_response = llm_model.DeleteLlmModelResponse(
-        deleted_model_slug="unused-model",
-        deleted_model_display_name="Unused Model",
-        replacement_model_slug=None,
-        nodes_migrated=0,
-        message="Successfully deleted model 'Unused Model' (unused-model). No workflows were using this model.",
-    )
-
-    mocker.patch(
-        "backend.api.features.admin.llm_routes.llm_db.delete_model",
-        new=AsyncMock(return_value=mock_response),
-    )
-
-    mock_refresh = mocker.patch(
-        "backend.api.features.admin.llm_routes._refresh_runtime_state",
-        new=AsyncMock(),
-    )
-
-    response = client.delete("/admin/llm/models/model-1")
-
-    assert response.status_code == 200
-    response_data = response.json()
-    assert response_data["deleted_model_slug"] == "unused-model"
-    assert response_data["nodes_migrated"] == 0
-    assert response_data["replacement_model_slug"] is None
-    mock_refresh.assert_called_once()
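Review note: the removed tests all follow the same pattern. They replace the data-layer coroutine with an `AsyncMock`, call the route, and then assert both the response and whether the runtime refresh ran. A minimal, standard-library-only sketch of that pattern follows. `delete_model_handler`, `run_success_case`, and `run_validation_case` are hypothetical names, and the handler is a simplified stand-in for the removed route rather than its actual code.

```python
import asyncio
from unittest.mock import AsyncMock


async def delete_model_handler(db, refresh, model_id: str) -> dict:
    """Hypothetical handler: delete first, refresh runtime state only on success."""
    try:
        result = await db.delete_model(model_id=model_id)
    except ValueError as exc:
        # Validation failure: report it and skip the refresh.
        return {"status": 400, "detail": str(exc)}
    await refresh()
    return {"status": 200, "body": result}


def run_success_case() -> dict:
    db = AsyncMock()
    # return_value is what the awaited mock call resolves to.
    db.delete_model.return_value = {
        "deleted_model_slug": "unused-model",
        "nodes_migrated": 0,
    }
    refresh = AsyncMock()
    response = asyncio.run(delete_model_handler(db, refresh, "model-1"))
    refresh.assert_called_once()
    return response


def run_validation_case() -> dict:
    db = AsyncMock()
    # side_effect with an exception makes the awaited call raise it.
    db.delete_model.side_effect = ValueError("Replacement model 'invalid' not found")
    refresh = AsyncMock()
    response = asyncio.run(delete_model_handler(db, refresh, "model-1"))
    refresh.assert_not_called()
    return response
```

The removed tests patched module attributes with `mocker.patch(..., new=AsyncMock(...))` instead of passing the mocks in as arguments; the injection used here only keeps the sketch self-contained.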
--- a/autogpt_platform/backend/backend/api/features/builder/db.py
+++ b/autogpt_platform/backend/backend/api/features/builder/db.py
@@ -15,7 +15,6 @@ from backend.blocks import load_all_blocks
 from backend.blocks.llm import LlmModel
 from backend.data.block import AnyBlockSchema, BlockCategory, BlockInfo, BlockSchema
 from backend.data.db import query_raw_with_schema
-from backend.data.llm_registry import get_all_model_slugs_for_validation
 from backend.integrations.providers import ProviderName
 from backend.util.cache import cached
 from backend.util.models import Pagination
@@ -32,14 +31,7 @@ from .model import (
 )

 logger = logging.getLogger(__name__)
-
-
-def _get_llm_models() -> list[str]:
-    """Get LLM model names for search matching from the registry."""
-    return [
-        slug.lower().replace("-", " ") for slug in get_all_model_slugs_for_validation()
-    ]
-
+llm_models = [name.name.lower().replace("_", " ") for name in LlmModel]

 MAX_LIBRARY_AGENT_RESULTS = 100
 MAX_MARKETPLACE_AGENT_RESULTS = 100
@@ -504,8 +496,8 @@ async def _get_static_counts():
 def _matches_llm_model(schema_cls: type[BlockSchema], query: str) -> bool:
    for field in schema_cls.model_fields.values():
        if field.annotation == LlmModel:
-            # Check if query matches any value in llm_models from registry
-            if any(query in name for name in _get_llm_models()):
+            # Check if query matches any value in llm_models
+            if any(query in name for name in llm_models):
                return True
    return False
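The reverted search matching above can be sketched in isolation. This is a minimal illustration, assuming a hypothetical `DemoLlmModel` enum with made-up members (the real `LlmModel` lives in `backend.blocks.llm`). It shows that the list is built from enum member names, not values, so queries match the space-separated form of the name.

```python
from enum import Enum


class DemoLlmModel(str, Enum):
    # Hypothetical members for illustration only.
    GPT4O_MINI = "gpt-4o-mini"
    CLAUDE_3_HAIKU = "claude-3-haiku"


# Same normalisation as the diff: member names, lowercased, "_" replaced by " ".
llm_models = [member.name.lower().replace("_", " ") for member in DemoLlmModel]


def matches_any_model(query: str) -> bool:
    """Return True when the query is a substring of any normalised name."""
    return any(query in name for name in llm_models)
```

Because the comparison is a plain substring test against lowercased names, a query such as `claude 3` matches, while `gpt-4o` (hyphenated, as in the enum value) and `Claude` (capitalised) do not, unless the caller normalises the query first.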

--- a/autogpt_platform/backend/backend/api/features/builder/routes.py
+++ b/autogpt_platform/backend/backend/api/features/builder/routes.py
@@ -17,7 +17,7 @@ router = fastapi.APIRouter(
 )


-# Taken from backend/server/v2/store/db.py
+# Taken from backend/api/features/store/db.py
 def sanitize_query(query: str | None) -> str | None:
    if query is None:
        return query
--- a/autogpt_platform/backend/backend/api/features/chat/config.py
+++ b/autogpt_platform/backend/backend/api/features/chat/config.py
@@ -33,9 +33,15 @@ class ChatConfig(BaseSettings):

    stream_timeout: int = Field(default=300, description="Stream timeout in seconds")
    max_retries: int = Field(default=3, description="Maximum number of retries")
-    max_agent_runs: int = Field(default=3, description="Maximum number of agent runs")
+    max_agent_runs: int = Field(default=30, description="Maximum number of agent runs")
    max_agent_schedules: int = Field(
-        default=3, description="Maximum number of agent schedules"
+        default=30, description="Maximum number of agent schedules"
+    )
+
+    # Long-running operation configuration
+    long_running_operation_ttl: int = Field(
+        default=600,
+        description="TTL in seconds for long-running operation tracking in Redis (safety net if pod dies)",
    )

    # Langfuse Prompt Management Configuration
--- a/autogpt_platform/backend/backend/api/features/chat/db.py
+++ b/autogpt_platform/backend/backend/api/features/chat/db.py
@@ -247,3 +247,45 @@ async def get_chat_session_message_count(session_id: str) -> int:
    """Get the number of messages in a chat session."""
    count = await PrismaChatMessage.prisma().count(where={"sessionId": session_id})
    return count
+
+
+async def update_tool_message_content(
+    session_id: str,
+    tool_call_id: str,
+    new_content: str,
+) -> bool:
+    """Update the content of a tool message in chat history.
+
+    Used by background tasks to update pending operation messages with final results.
+
+    Args:
+        session_id: The chat session ID.
+        tool_call_id: The tool call ID to find the message.
+        new_content: The new content to set.
+
+    Returns:
+        True if a message was updated, False otherwise.
+    """
+    try:
+        result = await PrismaChatMessage.prisma().update_many(
+            where={
+                "sessionId": session_id,
+                "toolCallId": tool_call_id,
+            },
+            data={
+                "content": new_content,
+            },
+        )
+        if result == 0:
+            logger.warning(
+                f"No message found to update for session {session_id}, "
+                f"tool_call_id {tool_call_id}"
+            )
+            return False
+        return True
+    except Exception as e:
+        logger.error(
+            f"Failed to update tool message for session {session_id}, "
+            f"tool_call_id {tool_call_id}: {e}"
+        )
+        return False
--- a/autogpt_platform/backend/backend/api/features/chat/model.py
+++ b/autogpt_platform/backend/backend/api/features/chat/model.py
@@ -295,6 +295,21 @@ async def cache_chat_session(session: ChatSession) -> None:
    await _cache_session(session)


+async def invalidate_session_cache(session_id: str) -> None:
+    """Invalidate a chat session from Redis cache.
+
+    Used by background tasks to ensure fresh data is loaded on next access.
+    This is best-effort - Redis failures are logged but don't fail the operation.
+    """
+    try:
+        redis_key = _get_session_cache_key(session_id)
+        async_redis = await get_redis_async()
+        await async_redis.delete(redis_key)
+    except Exception as e:
+        # Best-effort: log but don't fail - cache will expire naturally
+        logger.warning(f"Failed to invalidate session cache for {session_id}: {e}")
+
+
 async def _get_session_from_db(session_id: str) -> ChatSession | None:
    """Get a chat session from the database."""
    prisma_session = await chat_db.get_chat_session(session_id)
--- a/autogpt_platform/backend/backend/api/features/chat/service.py
+++ b/autogpt_platform/backend/backend/api/features/chat/service.py
@@ -17,6 +17,7 @@ from openai import (
 )
 from openai.types.chat import ChatCompletionChunk, ChatCompletionToolParam

+from backend.data.redis_client import get_redis_async
 from backend.data.understanding import (
    format_understanding_for_prompt,
    get_business_understanding,
@@ -24,6 +25,7 @@ from backend.data.understanding import (
 from backend.util.exceptions import NotFoundError
 from backend.util.settings import Settings

+from . import db as chat_db
 from .config import ChatConfig
 from .model import (
    ChatMessage,
@@ -31,6 +33,7 @@ from .model import (
    Usage,
    cache_chat_session,
    get_chat_session,
+    invalidate_session_cache,
    update_session_title,
    upsert_chat_session,
 )
@@ -48,8 +51,13 @@ from .response_model import (
    StreamToolOutputAvailable,
    StreamUsage,
 )
-from .tools import execute_tool, tools
-from .tools.models import ErrorResponse
+from .tools import execute_tool, get_tool, tools
+from .tools.models import (
+    ErrorResponse,
+    OperationInProgressResponse,
+    OperationPendingResponse,
+    OperationStartedResponse,
+)
 from .tracking import track_user_message

 logger = logging.getLogger(__name__)
@@ -61,11 +69,126 @@ client = openai.AsyncOpenAI(api_key=config.api_key, base_url=config.base_url)

 langfuse = get_client()

+# Redis key prefix for tracking running long-running operations
+# Used for idempotency across Kubernetes pods - prevents duplicate executions on browser refresh
+RUNNING_OPERATION_PREFIX = "chat:running_operation:"

-class LangfuseNotConfiguredError(Exception):
-    """Raised when Langfuse is required but not configured."""
+# Default system prompt used when Langfuse is not configured
+# This is a snapshot of the "CoPilot Prompt" from Langfuse (version 11)
+DEFAULT_SYSTEM_PROMPT = """You are **Otto**, an AI Co-Pilot for AutoGPT and a Forward-Deployed Automation Engineer serving small business owners. Your mission is to help users automate business tasks with AI by delivering tangible value through working automations—not through documentation or lengthy explanations.

-    pass
+Here is everything you know about the current user from previous interactions:
+
+<users_information>
+{users_information}
+</users_information>
+
+## YOUR CORE MANDATE
+
+You are action-oriented. Your success is measured by:
+- **Value Delivery**: Does the user think "wow, that was amazing" or "what was the point"?
+- **Demonstrable Proof**: Show working automations, not descriptions of what's possible
+- **Time Saved**: Focus on tangible efficiency gains
+- **Quality Output**: Deliver results that meet or exceed expectations
+
+## YOUR WORKFLOW
+
+Adapt flexibly to the conversation context. Not every interaction requires all stages:
+
+1. **Explore & Understand**: Learn about the user's business, tasks, and goals. Use `add_understanding` to capture important context that will improve future conversations.
+
+2. **Assess Automation Potential**: Help the user understand whether and how AI can automate their task.
+
+3. **Prepare for AI**: Provide brief, actionable guidance on prerequisites (data, access, etc.).
+
+4. **Discover or Create Agents**:
+   - **Always check the user's library first** with `find_library_agent` (these may be customized to their needs)
+   - Search the marketplace with `find_agent` for pre-built automations
+   - Find reusable components with `find_block`
+   - Create custom solutions with `create_agent` if nothing suitable exists
+   - Modify existing library agents with `edit_agent`
+
+5. **Execute**: Run automations immediately, schedule them, or set up webhooks using `run_agent`. Test specific components with `run_block`.
+
+6. **Show Results**: Display outputs using `agent_output`.
+
+## AVAILABLE TOOLS
+
+**Understanding & Discovery:**
+- `add_understanding`: Create a memory about the user's business or use cases for future sessions
+- `search_docs`: Search platform documentation for specific technical information
+- `get_doc_page`: Retrieve full text of a specific documentation page
+
+**Agent Discovery:**
+- `find_library_agent`: Search the user's existing agents (CHECK HERE FIRST—these may be customized)
+- `find_agent`: Search the marketplace for pre-built automations
+- `find_block`: Find pre-written code units that perform specific tasks (agents are built from blocks)
+
+**Agent Creation & Editing:**
+- `create_agent`: Create a new automation agent
+- `edit_agent`: Modify an agent in the user's library
+
+**Execution & Output:**
+- `run_agent`: Run an agent now, schedule it, or set up a webhook trigger
+- `run_block`: Test or run a specific block independently
+- `agent_output`: View results from previous agent runs
+
+## BEHAVIORAL GUIDELINES
+
+**Be Concise:**
+- Target 2-5 short lines maximum
+- Make every word count—no repetition or filler
+- Use lightweight structure for scannability (bullets, numbered lists, short prompts)
+- Avoid jargon (blocks, slugs, cron) unless the user asks
+
+**Be Proactive:**
+- Suggest next steps before being asked
+- Anticipate needs based on conversation context and user information
+- Look for opportunities to expand scope when relevant
+- Reveal capabilities through action, not explanation
+
+**Use Tools Effectively:**
+- Select the right tool for each task
+- **Always check `find_library_agent` before searching the marketplace**
+- Use `add_understanding` to capture valuable business context
+- When tool calls fail, try alternative approaches
+
+## CRITICAL REMINDER
+
+You are NOT a chatbot. You are NOT documentation. You are a partner who helps busy business owners get value quickly by showing proof through working automations. Bias toward action over explanation."""
+
+# Module-level set to hold strong references to background tasks.
+# This prevents asyncio from garbage collecting tasks before they complete.
+# Tasks are automatically removed on completion via done_callback.
+_background_tasks: set[asyncio.Task] = set()
+
+
+async def _mark_operation_started(tool_call_id: str) -> bool:
+    """Mark a long-running operation as started (Redis-based).
+
+    Returns True if successfully marked (operation was not already running),
+    False if operation was already running (lost race condition).
+    Raises exception if Redis is unavailable (fail-closed).
+    """
+    redis = await get_redis_async()
+    key = f"{RUNNING_OPERATION_PREFIX}{tool_call_id}"
+    # SETNX with TTL - atomic "set if not exists"
+    result = await redis.set(key, "1", ex=config.long_running_operation_ttl, nx=True)
+    return result is not None
+
+
+async def _mark_operation_completed(tool_call_id: str) -> None:
+    """Mark a long-running operation as completed (remove Redis key).
+
+    This is best-effort - if Redis fails, the TTL will eventually clean up.
+    """
+    try:
+        redis = await get_redis_async()
+        key = f"{RUNNING_OPERATION_PREFIX}{tool_call_id}"
+        await redis.delete(key)
+    except Exception as e:
+        # Non-critical: TTL will clean up eventually
+        logger.warning(f"Failed to delete running operation key {tool_call_id}: {e}")


 def _is_langfuse_configured() -> bool:
@@ -75,6 +198,30 @@ def _is_langfuse_configured() -> bool:
    )


+async def _get_system_prompt_template(context: str) -> str:
+    """Get the system prompt, trying Langfuse first with fallback to default.
+
+    Args:
+        context: The user context/information to compile into the prompt.
+
+    Returns:
+        The compiled system prompt string.
+    """
+    if _is_langfuse_configured():
+        try:
+            # cache_ttl_seconds=0 disables SDK caching to always get the latest prompt
+            # Use asyncio.to_thread to avoid blocking the event loop
+            prompt = await asyncio.to_thread(
+                langfuse.get_prompt, config.langfuse_prompt_name, cache_ttl_seconds=0
+            )
+            return prompt.compile(users_information=context)
+        except Exception as e:
+            logger.warning(f"Failed to fetch prompt from Langfuse, using default: {e}")
+
+    # Fallback to default prompt
+    return DEFAULT_SYSTEM_PROMPT.format(users_information=context)
+
+
 async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
    """Build the full system prompt including business understanding if available.

@@ -83,12 +230,8 @@ async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
                     If "default" and this is the user's first session, will use "onboarding" instead.

    Returns:
-        Tuple of (compiled prompt string, Langfuse prompt object for tracing)
+        Tuple of (compiled prompt string, business understanding object)
    """
-
-    # cache_ttl_seconds=0 disables SDK caching to always get the latest prompt
-    prompt = langfuse.get_prompt(config.langfuse_prompt_name, cache_ttl_seconds=0)
-
    # If user is authenticated, try to fetch their business understanding
    understanding = None
    if user_id:
@@ -97,12 +240,13 @@ async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
        except Exception as e:
            logger.warning(f"Failed to fetch business understanding: {e}")
            understanding = None
+
    if understanding:
        context = format_understanding_for_prompt(understanding)
    else:
        context = "This is the first time you are meeting the user. Greet them and introduce them to the platform"

-    compiled = prompt.compile(users_information=context)
+    compiled = await _get_system_prompt_template(context)
    return compiled, understanding
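The "remote prompt first, bundled default on failure" flow can be sketched without Langfuse. Everything here is hypothetical scaffolding: `fetch_remote_template` stands in for the remote lookup and always fails, and `DEFAULT_TEMPLATE` is a shortened placeholder rather than the real prompt.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)

# Shortened placeholder; the real default prompt is much longer.
DEFAULT_TEMPLATE = (
    "You are a helper.\n"
    "<users_information>\n{users_information}\n</users_information>"
)


def fetch_remote_template() -> str:
    """Hypothetical remote lookup that simulates an outage."""
    raise ConnectionError("prompt service unreachable")


def get_system_prompt(
    context: str,
    remote_configured: bool,
    fetch: Callable[[], str] = fetch_remote_template,
) -> str:
    """Try the remote template first, then fall back to the bundled default."""
    if remote_configured:
        try:
            return fetch().format(users_information=context)
        except Exception as exc:
            logger.warning("Remote prompt unavailable, using default: %s", exc)
    return DEFAULT_TEMPLATE.format(users_information=context)
```

One consequence of using `str.format` for the fallback is that any literal `{` or `}` later added to the default template would need to be doubled, otherwise formatting raises an error.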


@@ -210,16 +354,6 @@ async def stream_chat_completion(
        f"Streaming chat completion for session {session_id} for message {message} and user id {user_id}. Message is user message: {is_user_message}"
    )

-    # Check if Langfuse is configured - required for chat functionality
-    if not _is_langfuse_configured():
-        logger.error("Chat request failed: Langfuse is not configured")
-        yield StreamError(
-            errorText="Chat service is not available. Langfuse must be configured "
-            "with LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY environment variables."
-        )
-        yield StreamFinish()
-        return
-
    # Only fetch from Redis if session not provided (initial call)
    if session is None:
        session = await get_chat_session(session_id, user_id)
@@ -315,6 +449,7 @@ async def stream_chat_completion(
    has_yielded_end = False
    has_yielded_error = False
    has_done_tool_call = False
+    has_long_running_tool_call = False  # Track if we had a long-running tool call
    has_received_text = False
    text_streaming_ended = False
    tool_response_messages: list[ChatMessage] = []
@@ -336,7 +471,6 @@ async def stream_chat_completion(
            system_prompt=system_prompt,
            text_block_id=text_block_id,
        ):
-
            if isinstance(chunk, StreamTextStart):
                # Emit text-start before first text delta
                if not has_received_text:
@@ -394,13 +528,34 @@ async def stream_chat_completion(
                    if isinstance(chunk.output, str)
                    else orjson.dumps(chunk.output).decode("utf-8")
                )
-                tool_response_messages.append(
-                    ChatMessage(
-                        role="tool",
-                        content=result_content,
-                        tool_call_id=chunk.toolCallId,
+                # Skip saving long-running operation responses - messages already saved in _yield_tool_call
+                # Use JSON parsing instead of substring matching to avoid false positives
+                is_long_running_response = False
+                try:
+                    parsed = orjson.loads(result_content)
+                    if isinstance(parsed, dict) and parsed.get("type") in (
+                        "operation_started",
+                        "operation_in_progress",
+                    ):
+                        is_long_running_response = True
+                except (orjson.JSONDecodeError, TypeError):
+                    pass  # Not JSON or not a dict - treat as regular response
+                if is_long_running_response:
+                    # Remove from accumulated_tool_calls since assistant message was already saved
+                    accumulated_tool_calls[:] = [
+                        tc
+                        for tc in accumulated_tool_calls
+                        if tc["id"] != chunk.toolCallId
+                    ]
+                    has_long_running_tool_call = True
+                else:
+                    tool_response_messages.append(
+                        ChatMessage(
+                            role="tool",
+                            content=result_content,
+                            tool_call_id=chunk.toolCallId,
+                        )
                    )
-                )
                has_done_tool_call = True
                # Track if any tool execution failed
                if not chunk.success:
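The hunk above classifies a tool result by parsing it rather than searching for a substring. A minimal sketch of that check follows, using the standard library `json` module in place of `orjson` (an assumption made to keep the example dependency-free) and a hypothetical helper name.

```python
import json

LONG_RUNNING_TYPES = ("operation_started", "operation_in_progress")


def is_long_running_response(result_content: str) -> bool:
    """True only for a JSON object whose "type" is a long-running marker."""
    try:
        parsed = json.loads(result_content)
    except (json.JSONDecodeError, TypeError):
        return False  # Not JSON at all: treat as a regular tool response.
    return isinstance(parsed, dict) and parsed.get("type") in LONG_RUNNING_TYPES
```

A regular tool output that merely mentions `operation_started` in its text, or a JSON array containing that string, is not misclassified, which is the false positive the in-diff comment refers to.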
@@ -576,7 +731,14 @@ async def stream_chat_completion(
            logger.info(
                f"Extended session messages, new message_count={len(session.messages)}"
            )
-        if messages_to_save or has_appended_streaming_message:
+        # Save if there are regular (non-long-running) tool responses or streaming message.
+        # Long-running tools save their own state, but we still need to save regular tools
+        # that may be in the same response.
+        has_regular_tool_responses = len(tool_response_messages) > 0
+        if has_regular_tool_responses or (
+            not has_long_running_tool_call
+            and (messages_to_save or has_appended_streaming_message)
+        ):
            await upsert_chat_session(session)
    else:
        logger.info(
@@ -585,7 +747,9 @@ async def stream_chat_completion(
        )

    # If we did a tool call, stream the chat completion again to get the next response
-    if has_done_tool_call:
+    # Skip only if ALL tools were long-running (they handle their own completion)
+    has_regular_tools = len(tool_response_messages) > 0
+    if has_done_tool_call and (has_regular_tools or not has_long_running_tool_call):
        logger.info(
            "Tool call executed, streaming chat completion again to get assistant response"
        )
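The two conditions introduced in the preceding hunks are easier to check as pure functions. The helper names and the boolean parameters below are hypothetical. The expressions mirror the ones in the diff.

```python
def should_save_session(
    num_regular_tool_responses: int,
    has_long_running_tool_call: bool,
    has_messages_to_save: bool,
    has_appended_streaming_message: bool,
) -> bool:
    """Save when regular tool responses exist, or when nothing was long-running."""
    return num_regular_tool_responses > 0 or (
        not has_long_running_tool_call
        and (has_messages_to_save or has_appended_streaming_message)
    )


def should_stream_again(
    has_done_tool_call: bool,
    num_regular_tool_responses: int,
    has_long_running_tool_call: bool,
) -> bool:
    """Re-stream unless every tool call in the response was long-running."""
    return has_done_tool_call and (
        num_regular_tool_responses > 0 or not has_long_running_tool_call
    )
```

Under these expressions, a response containing only a long-running call neither saves the session again nor triggers a follow-up completion, while a mixed response does both.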
@@ -725,6 +889,114 @@ async def _summarize_messages(
    return summary or "No summary available."


+def _ensure_tool_pairs_intact(
+    recent_messages: list[dict],
+    all_messages: list[dict],
+    start_index: int,
+) -> list[dict]:
+    """
+    Ensure tool_call/tool_response pairs stay together after slicing.
+
+    When slicing messages for context compaction, a naive slice can separate
+    an assistant message containing tool_calls from its corresponding tool
+    response messages. This causes API validation errors (e.g., Anthropic's
+    "unexpected tool_use_id found in tool_result blocks").
+
+    This function checks for orphan tool responses in the slice and extends
+    backwards to include their corresponding assistant messages.
+
+    Args:
+        recent_messages: The sliced messages to validate
+        all_messages: The complete message list (for looking up missing assistants)
+        start_index: The index in all_messages where recent_messages begins
+
+    Returns:
+        A potentially extended list of messages with tool pairs intact
+    """
+    if not recent_messages:
+        return recent_messages
+
+    # Collect all tool_call_ids from assistant messages in the slice
+    available_tool_call_ids: set[str] = set()
+    for msg in recent_messages:
+        if msg.get("role") == "assistant" and msg.get("tool_calls"):
+            for tc in msg["tool_calls"]:
+                tc_id = tc.get("id")
+                if tc_id:
+                    available_tool_call_ids.add(tc_id)
+
+    # Find orphan tool responses (tool messages whose tool_call_id is missing)
+    orphan_tool_call_ids: set[str] = set()
+    for msg in recent_messages:
+        if msg.get("role") == "tool":
+            tc_id = msg.get("tool_call_id")
+            if tc_id and tc_id not in available_tool_call_ids:
+                orphan_tool_call_ids.add(tc_id)
+
+    if not orphan_tool_call_ids:
+        # No orphans, slice is valid
+        return recent_messages
+
+    # Find the assistant messages that contain the orphan tool_call_ids
+    # Search backwards from start_index in all_messages
+    messages_to_prepend: list[dict] = []
+    for i in range(start_index - 1, -1, -1):
+        msg = all_messages[i]
+        if msg.get("role") == "assistant" and msg.get("tool_calls"):
+            msg_tool_ids = {tc.get("id") for tc in msg["tool_calls"] if tc.get("id")}
+            if msg_tool_ids & orphan_tool_call_ids:
+                # This assistant message has tool_calls we need
+                # Also collect its contiguous tool responses that follow it
+                assistant_and_responses: list[dict] = [msg]
+
+                # Scan forward from this assistant to collect tool responses
+                for j in range(i + 1, start_index):
+                    following_msg = all_messages[j]
+                    if following_msg.get("role") == "tool":
+                        tool_id = following_msg.get("tool_call_id")
+                        if tool_id and tool_id in msg_tool_ids:
+                            assistant_and_responses.append(following_msg)
+                    else:
+                        # Stop at first non-tool message
+                        break
+
+                # Prepend the assistant and its tool responses (maintain order)
+                messages_to_prepend = assistant_and_responses + messages_to_prepend
+                # Mark these as found
+                orphan_tool_call_ids -= msg_tool_ids
+                # Also add this assistant's tool_call_ids to available set
+                available_tool_call_ids |= msg_tool_ids
+
+        if not orphan_tool_call_ids:
+            # Found all missing assistants
+            break
+
+    if orphan_tool_call_ids:
+        # Some tool_call_ids couldn't be resolved - remove those tool responses
+        # This shouldn't happen in normal operation but handles edge cases
+        logger.warning(
+            f"Could not find assistant messages for tool_call_ids: {orphan_tool_call_ids}. "
+            "Removing orphan tool responses."
+        )
+        recent_messages = [
+            msg
+            for msg in recent_messages
+            if not (
+                msg.get("role") == "tool"
+                and msg.get("tool_call_id") in orphan_tool_call_ids
+            )
+        ]
+
+    if messages_to_prepend:
+        logger.info(
+            f"Extended recent messages by {len(messages_to_prepend)} to preserve "
+            f"tool_call/tool_response pairs"
+        )
+        return messages_to_prepend + recent_messages
+
+    return recent_messages
+
+
 async def _stream_chat_chunks(
    session: ChatSession,
    tools: list[ChatCompletionToolParam],
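The failure mode that `_ensure_tool_pairs_intact` guards against can be reproduced with a four-message history. The sketch below is a simplified illustration, not the function from the diff: `orphan_tool_call_ids` is a hypothetical helper that only detects orphans, and the message dicts are made up.

```python
def orphan_tool_call_ids(messages: list[dict]) -> set[str]:
    """Return tool_call_ids of tool messages with no matching assistant call."""
    available = {
        tc["id"]
        for msg in messages
        if msg.get("role") == "assistant"
        for tc in msg.get("tool_calls") or []
        if tc.get("id")
    }
    return {
        msg["tool_call_id"]
        for msg in messages
        if msg.get("role") == "tool"
        and msg.get("tool_call_id")
        and msg["tool_call_id"] not in available
    }


history = [
    {"role": "user", "content": "Find an agent"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "call_1"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "{}"},
    {"role": "assistant", "content": "Here is what I found."},
]

# A naive "keep the last two" slice starts at the tool message and drops
# the assistant message that issued call_1.
naive_slice = history[-2:]
orphans = orphan_tool_call_ids(naive_slice)

# Extending the slice backwards by one message restores the pair.
repaired_slice = history[-3:]
remaining = orphan_tool_call_ids(repaired_slice)
```

Here `orphans` contains `call_1` and `remaining` is empty. The function in the diff performs the backward extension automatically and drops tool responses whose assistant message cannot be found at all.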
@@ -816,7 +1088,15 @@ async def _stream_chat_chunks(
            # Always attempt mitigation when over limit, even with few messages
            if messages:
                # Split messages based on whether system prompt exists
-                recent_messages = messages[-KEEP_RECENT:]
+                # Calculate start index for the slice
+                slice_start = max(0, len(messages_dict) - KEEP_RECENT)
+                recent_messages = messages_dict[-KEEP_RECENT:]
+
+                # Ensure tool_call/tool_response pairs stay together
+                # This prevents API errors from orphan tool responses
+                recent_messages = _ensure_tool_pairs_intact(
+                    recent_messages, messages_dict, slice_start
+                )

                if has_system_prompt:
                    # Keep system prompt separate, summarize everything between system and recent
@@ -903,6 +1183,13 @@ async def _stream_chat_chunks(
                                    if len(recent_messages) >= keep_count
                                    else recent_messages
                                )
+                                # Ensure tool pairs stay intact in the reduced slice
+                                reduced_slice_start = max(
+                                    0, len(recent_messages) - keep_count
+                                )
+                                reduced_recent = _ensure_tool_pairs_intact(
+                                    reduced_recent, recent_messages, reduced_slice_start
+                                )
                                if has_system_prompt:
                                    messages = [
                                        system_msg,
@@ -961,7 +1248,10 @@ async def _stream_chat_chunks(

                    # Create a base list excluding system prompt to avoid duplication
                    # This is the pool of messages we'll slice from in the loop
-                    base_msgs = messages[1:] if has_system_prompt else messages
+                    # Use messages_dict for type consistency with _ensure_tool_pairs_intact
+                    base_msgs = (
+                        messages_dict[1:] if has_system_prompt else messages_dict
+                    )

                    # Try progressively smaller keep counts
                    new_token_count = token_count  # Initialize with current count
@@ -984,6 +1274,12 @@ async def _stream_chat_chunks(
                            # Slice from base_msgs to get recent messages (without system prompt)
                            recent_messages = base_msgs[-keep_count:]

+                            # Ensure tool pairs stay intact in the reduced slice
+                            reduced_slice_start = max(0, len(base_msgs) - keep_count)
+                            recent_messages = _ensure_tool_pairs_intact(
+                                recent_messages, base_msgs, reduced_slice_start
+                            )
+
                            if has_system_prompt:
                                messages = [system_msg] + recent_messages
                            else:
@@ -1260,17 +1556,19 @@ async def _yield_tool_call(
    """
    Yield a tool call and its execution result.

-    For long-running tools, yields heartbeat events every 15 seconds to keep
-    the SSE connection alive through proxies and load balancers.
+    For tools marked with `is_long_running=True` (like agent generation), spawns a
+    background task so the operation survives SSE disconnections. For other tools,
+    yields heartbeat events every 15 seconds to keep the SSE connection alive.

    Raises:
        orjson.JSONDecodeError: If tool call arguments cannot be parsed as JSON
        KeyError: If expected tool call fields are missing
        TypeError: If tool call structure is invalid
    """
+    import uuid as uuid_module
+
    tool_name = tool_calls[yield_idx]["function"]["name"]
    tool_call_id = tool_calls[yield_idx]["id"]
-    logger.info(f"Yielding tool call: {tool_calls[yield_idx]}")

    # Parse tool call arguments - handle empty arguments gracefully
    raw_arguments = tool_calls[yield_idx]["function"]["arguments"]
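The long-running branch depends on keeping a strong reference to each spawned task, since the event loop itself only holds weak references to tasks. A minimal sketch of that pattern follows. `spawn`, `slow_job`, and `demo` are hypothetical names, and the set mirrors the module-level `_background_tasks` from the diff.

```python
import asyncio

# Strong references keep tasks alive until they finish.
_background_tasks: set[asyncio.Task] = set()


async def slow_job(results: list[str]) -> None:
    """Stand-in for a long-running tool execution."""
    await asyncio.sleep(0.01)
    results.append("done")


def spawn(coro) -> asyncio.Task:
    """Create a task, track it, and untrack it automatically on completion."""
    task = asyncio.create_task(coro)
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return task


async def demo() -> tuple[int, int, list[str]]:
    results: list[str] = []
    task = spawn(slow_job(results))
    tracked_while_running = len(_background_tasks)
    await task
    await asyncio.sleep(0)  # Give pending done-callbacks a chance to run.
    return tracked_while_running, len(_background_tasks), results
```

In the diff the caller does not await the task. It returns an "operation started" response immediately and lets the task update the stored message later. The `await task` here exists only so the sketch can observe the cleanup.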
@@ -1285,7 +1583,151 @@ async def _yield_tool_call(
        input=arguments,
    )

-    # Run tool execution in background task with heartbeats to keep connection alive
+    # Check if this tool is long-running (survives SSE disconnection)
+    tool = get_tool(tool_name)
+    if tool and tool.is_long_running:
+        # Atomic check-and-set: returns False if operation already running (lost race)
+        if not await _mark_operation_started(tool_call_id):
+            logger.info(
+                f"Tool call {tool_call_id} already in progress, returning status"
+            )
+            # Build dynamic message based on tool name
+            if tool_name == "create_agent":
+                in_progress_msg = "Agent creation already in progress. Please wait..."
+            elif tool_name == "edit_agent":
+                in_progress_msg = "Agent edit already in progress. Please wait..."
+            else:
+                in_progress_msg = f"{tool_name} already in progress. Please wait..."
+
+            yield StreamToolOutputAvailable(
+                toolCallId=tool_call_id,
+                toolName=tool_name,
+                output=OperationInProgressResponse(
+                    message=in_progress_msg,
+                    tool_call_id=tool_call_id,
+                ).model_dump_json(),
+                success=True,
+            )
+            return
+
+        # Generate operation ID
+        operation_id = str(uuid_module.uuid4())
+
+        # Build a user-friendly message based on tool and arguments
+        if tool_name == "create_agent":
+            agent_desc = arguments.get("description", "")
+            # Truncate long descriptions for the message
+            desc_preview = (
+                (agent_desc[:100] + "...") if len(agent_desc) > 100 else agent_desc
+            )
+            pending_msg = (
+                f"Creating your agent: {desc_preview}"
+                if desc_preview
+                else "Creating agent... This may take a few minutes."
+            )
+            started_msg = (
+                "Agent creation started. You can close this tab - "
+                "check your library in a few minutes."
+            )
+        elif tool_name == "edit_agent":
+            changes = arguments.get("changes", "")
+            changes_preview = (changes[:100] + "...") if len(changes) > 100 else changes
+            pending_msg = (
+                f"Editing agent: {changes_preview}"
+                if changes_preview
+                else "Editing agent... This may take a few minutes."
+            )
+            started_msg = (
+                "Agent edit started. You can close this tab - "
+                "check your library in a few minutes."
+            )
+        else:
+            pending_msg = f"Running {tool_name}... This may take a few minutes."
+            started_msg = (
+                f"{tool_name} started. You can close this tab - "
+                "check back in a few minutes."
+            )
+
+        # Track appended messages for rollback on failure
+        assistant_message: ChatMessage | None = None
+        pending_message: ChatMessage | None = None
+
+        # Wrap session save and task creation in try-except to release lock on failure
+        try:
+            # Save assistant message with tool_call FIRST (required by LLM)
+            assistant_message = ChatMessage(
+                role="assistant",
+                content="",
+                tool_calls=[tool_calls[yield_idx]],
+            )
+            session.messages.append(assistant_message)
+
+            # Then save pending tool result
+            pending_message = ChatMessage(
+                role="tool",
+                content=OperationPendingResponse(
+                    message=pending_msg,
+                    operation_id=operation_id,
+                    tool_name=tool_name,
+                ).model_dump_json(),
+                tool_call_id=tool_call_id,
+            )
+            session.messages.append(pending_message)
+            await upsert_chat_session(session)
+            logger.info(
+                f"Saved pending operation {operation_id} for tool {tool_name} "
+                f"in session {session.session_id}"
+            )
+
+            # Store task reference in module-level set to prevent GC before completion
+            task = asyncio.create_task(
+                _execute_long_running_tool(
+                    tool_name=tool_name,
+                    parameters=arguments,
+                    tool_call_id=tool_call_id,
+                    operation_id=operation_id,
+                    session_id=session.session_id,
+                    user_id=session.user_id,
+                )
+            )
+            _background_tasks.add(task)
+            task.add_done_callback(_background_tasks.discard)
+        except Exception as e:
+            # Roll back appended messages to prevent data corruption on subsequent saves
+            if (
+                pending_message
+                and session.messages
+                and session.messages[-1] == pending_message
+            ):
+                session.messages.pop()
+            if (
+                assistant_message
+                and session.messages
+                and session.messages[-1] == assistant_message
+            ):
+                session.messages.pop()
+
+            # Release the Redis lock since the background task won't be spawned
+            await _mark_operation_completed(tool_call_id)
+            logger.error(
+                f"Failed to setup long-running tool {tool_name}: {e}", exc_info=True
+            )
+            raise
+
+        # Return immediately - don't wait for completion
+        yield StreamToolOutputAvailable(
+            toolCallId=tool_call_id,
+            toolName=tool_name,
+            output=OperationStartedResponse(
+                message=started_msg,
+                operation_id=operation_id,
+                tool_name=tool_name,
+            ).model_dump_json(),
+            success=True,
+        )
+        return
+
+    # Normal flow: Run tool execution in background task with heartbeats
    tool_task = asyncio.create_task(
        execute_tool(
            tool_name=tool_name,
@@ -1335,3 +1777,190 @@ async def _yield_tool_call(
        )

    yield tool_execution_response
+
+
+async def _execute_long_running_tool(
+    tool_name: str,
+    parameters: dict[str, Any],
+    tool_call_id: str,
+    operation_id: str,
+    session_id: str,
+    user_id: str | None,
+) -> None:
+    """Execute a long-running tool in background and update chat history with result.
+
+    This function runs independently of the SSE connection, so the operation
+    survives if the user closes their browser tab.
+    """
+    try:
+        # Load fresh session (not stale reference)
+        session = await get_chat_session(session_id, user_id)
+        if not session:
+            logger.error(f"Session {session_id} not found for background tool")
+            return
+
+        # Execute the actual tool
+        result = await execute_tool(
+            tool_name=tool_name,
+            parameters=parameters,
+            tool_call_id=tool_call_id,
+            user_id=user_id,
+            session=session,
+        )
+
+        # Update the pending message with result
+        await _update_pending_operation(
+            session_id=session_id,
+            tool_call_id=tool_call_id,
+            result=(
+                result.output
+                if isinstance(result.output, str)
+                else orjson.dumps(result.output).decode("utf-8")
+            ),
+        )
+
+        logger.info(f"Background tool {tool_name} completed for session {session_id}")
+
+        # Generate LLM continuation so user sees response when they poll/refresh
+        await _generate_llm_continuation(session_id=session_id, user_id=user_id)
+
+    except Exception as e:
+        logger.error(f"Background tool {tool_name} failed: {e}", exc_info=True)
+        error_response = ErrorResponse(
+            message=f"Tool {tool_name} failed: {str(e)}",
+        )
+        await _update_pending_operation(
+            session_id=session_id,
+            tool_call_id=tool_call_id,
+            result=error_response.model_dump_json(),
+        )
+    finally:
+        await _mark_operation_completed(tool_call_id)
+
+
+async def _update_pending_operation(
+    session_id: str,
+    tool_call_id: str,
+    result: str,
+) -> None:
+    """Update the pending tool message with final result.
+
+    This is called by background tasks when long-running operations complete.
+    """
+    # Update the message in database
+    updated = await chat_db.update_tool_message_content(
+        session_id=session_id,
+        tool_call_id=tool_call_id,
+        new_content=result,
+    )
+
+    if updated:
+        # Invalidate Redis cache so next load gets fresh data
+        # Wrap in try/except to prevent cache failures from triggering error handling
+        # that would overwrite our successful DB update
+        try:
+            await invalidate_session_cache(session_id)
+        except Exception as e:
+            # Non-critical: cache will eventually be refreshed on next load
+            logger.warning(f"Failed to invalidate cache for session {session_id}: {e}")
+        logger.info(
+            f"Updated pending operation for tool_call_id {tool_call_id} "
+            f"in session {session_id}"
+        )
+    else:
+        logger.warning(
+            f"Failed to update pending operation for tool_call_id {tool_call_id} "
+            f"in session {session_id}"
+        )
+
+
+async def _generate_llm_continuation(
+    session_id: str,
+    user_id: str | None,
+) -> None:
+    """Generate an LLM response after a long-running tool completes.
+
+    This is called by background tasks to continue the conversation
+    after a tool result is saved. The response is saved to the database
+    so users see it when they refresh or poll.
+    """
+    try:
+        # Load fresh session from DB (bypass cache to get the updated tool result)
+        await invalidate_session_cache(session_id)
+        session = await get_chat_session(session_id, user_id)
+        if not session:
+            logger.error(f"Session {session_id} not found for LLM continuation")
+            return
+
+        # Build system prompt
+        system_prompt, _ = await _build_system_prompt(user_id)
+
+        # Build messages in OpenAI format
+        messages = session.to_openai_messages()
+        if system_prompt:
+            from openai.types.chat import ChatCompletionSystemMessageParam
+
+            system_message = ChatCompletionSystemMessageParam(
+                role="system",
+                content=system_prompt,
+            )
+            messages = [system_message] + messages
+
+        # Build extra_body for tracing
+        extra_body: dict[str, Any] = {
+            "posthogProperties": {
+                "environment": settings.config.app_env.value,
+            },
+        }
+        if user_id:
+            extra_body["user"] = user_id[:128]
+            extra_body["posthogDistinctId"] = user_id
+        if session_id:
+            extra_body["session_id"] = session_id[:128]
+
+        # Make non-streaming LLM call (no tools - just text response)
+        from typing import cast
+
+        from openai.types.chat import ChatCompletionMessageParam
+
+        # No tools parameter = text-only response (no tool calls)
+        response = await client.chat.completions.create(
+            model=config.model,
+            messages=cast(list[ChatCompletionMessageParam], messages),
+            extra_body=extra_body,
+        )
+
+        if response.choices and response.choices[0].message.content:
+            assistant_content = response.choices[0].message.content
+
+            # Reload session from DB to avoid race condition with user messages
+            # that may have been sent while we were generating the LLM response
+            fresh_session = await get_chat_session(session_id, user_id)
+            if not fresh_session:
+                logger.error(
+                    f"Session {session_id} disappeared during LLM continuation"
+                )
+                return
+
+            # Save assistant message to database
+            assistant_message = ChatMessage(
+                role="assistant",
+                content=assistant_content,
+            )
+            fresh_session.messages.append(assistant_message)
+
+            # Save to database (not cache) to persist the response
+            await upsert_chat_session(fresh_session)
+
+            # Invalidate cache so next poll/refresh gets fresh data
+            await invalidate_session_cache(session_id)
+
+            logger.info(
+                f"Generated LLM continuation for session {session_id}, "
+                f"response length: {len(assistant_content)}"
+            )
+        else:
+            logger.warning(f"LLM continuation returned empty response for {session_id}")
+
+    except Exception as e:
+        logger.error(f"Failed to generate LLM continuation: {e}", exc_info=True)
--- a/autogpt_platform/backend/backend/api/features/chat/tools/IDEAS.md
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/IDEAS.md
@@ -0,0 +1,79 @@
+# CoPilot Tools - Future Ideas
+
+## Multimodal Image Support for CoPilot
+
+**Problem:** CoPilot uses a vision-capable model but can't "see" workspace images. When a block generates an image and returns `workspace://abc123`, CoPilot can't evaluate it (e.g., checking blog thumbnail quality).
+
+**Backend Solution:**
+When preparing messages for the LLM, detect `workspace://` image references and convert them to proper image content blocks:
+
+```python
+# Before sending to LLM, scan for workspace image references
+# and inject them as image content parts
+
+# Example message transformation:
+# FROM: {"role": "assistant", "content": "Generated image: workspace://abc123"}
+# TO:   {"role": "assistant", "content": [
+#         {"type": "text", "text": "Generated image: workspace://abc123"},
+#         {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
+#       ]}
+```
+
+**Where to implement:**
+- In the chat stream handler before calling the LLM
+- Or in a message preprocessing step
+- Need to fetch image from workspace, convert to base64, add as image content
+
+**Considerations:**
+- Only do this for image MIME types (image/png, image/jpeg, etc.)
+- May want a size limit (don't pass 10MB images)
+- Track which images were "shown" to the AI for frontend indicator
+- Cost implications - vision API calls are more expensive
+
+**Frontend Solution:**
+Show visual indicator on workspace files in chat:
+- If AI saw the image: normal display
+- If AI didn't see it: overlay icon saying "AI can't see this image"
+
+Requires response metadata indicating which `workspace://` refs were passed to the model.
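The message transformation described in this section could be sketched roughly as below. This is an illustration under stated assumptions, not the project's implementation: `inject_workspace_images`, the `ImageLoader` callback, the reference regex, and the 1MB cap are all hypothetical, and whether a given provider accepts image parts on a given message role is outside the sketch.

```python
import base64
import re
from typing import Any, Callable, Optional

# Assumed reference format: "workspace://" followed by an opaque file ID.
WORKSPACE_REF = re.compile(r"workspace://([A-Za-z0-9_-]+)")
MAX_IMAGE_BYTES = 1 * 1024 * 1024  # illustrative size cap, not a project setting

# Hypothetical loader: returns (mime_type, raw_bytes), or None if the file is missing.
ImageLoader = Callable[[str], Optional[tuple[str, bytes]]]


def inject_workspace_images(message: dict[str, Any], load: ImageLoader) -> dict[str, Any]:
    """Return a copy of `message` with referenced images added as content parts."""
    content = message.get("content")
    if not isinstance(content, str):
        return message  # already structured, or empty: leave untouched

    parts: list[dict[str, Any]] = [{"type": "text", "text": content}]
    for file_id in WORKSPACE_REF.findall(content):
        loaded = load(file_id)
        if loaded is None:
            continue
        mime_type, data = loaded
        # Only inline real images that are under the size cap.
        if not mime_type.startswith("image/") or len(data) > MAX_IMAGE_BYTES:
            continue
        encoded = base64.b64encode(data).decode("ascii")
        parts.append(
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            }
        )

    if len(parts) == 1:
        return message  # nothing was injected
    return {**message, "content": parts}
```

Returning the original message object when nothing is injected makes it easy for a caller to record which references were actually shown to the model.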
+
+---
+
+## Output Post-Processing Layer for run_block
+
+**Problem:** Many blocks produce large outputs that:
+- Consume massive context (a 100KB image becomes ~133KB once base64-encoded)
+- Can't fit in the conversation's context window
+- Break things and cause high LLM costs
+
+**Proposed Solution:** Instead of modifying individual blocks or `store_media_file()`, implement a centralized output processor in `run_block.py` that handles outputs before they're returned to CoPilot.
+
+**Benefits:**
+1. **Centralized** - one place to handle all output processing
+2. **Future-proof** - new blocks automatically get output processing
+3. **Keeps blocks pure** - they don't need to know about context constraints
+4. **Handles all large outputs** - not just images
+
+**Processing Rules:**
+- Detect base64 data URIs → save to workspace, return `workspace://` reference
+- Truncate very long strings (>N chars) with truncation note
+- Summarize large arrays/lists (e.g., "Array with 1000 items, first 5: [...]")
+- Handle nested large outputs in dicts recursively
+- Cap total output size
+
+**Implementation Location:** `run_block.py` after block execution, before returning `BlockOutputResponse`
+
+**Example:**
+```python
+def _process_outputs_for_context(
+    outputs: dict[str, list[Any]],
+    workspace_manager: WorkspaceManager,
+    max_string_length: int = 10000,
+    max_array_preview: int = 5,
+) -> dict[str, list[Any]]:
+    """Process block outputs to prevent context bloat."""
+    processed = {}
+    for name, values in outputs.items():
+        processed[name] = [_process_value(v, workspace_manager) for v in values]
+    return processed
+```
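The example above calls `_process_value` without defining it. One way the per-value rules from "Processing Rules" might look is sketched below. It is a hedged illustration: `process_value`, the `save_data_uri` callback (standing in for whatever the workspace manager would provide), the data-URI regex, and the summary format are assumptions. The total output size cap is not covered.

```python
import re
from typing import Any, Callable

# Matches the start of a base64 data URI such as "data:image/png;base64,...".
DATA_URI = re.compile(r"^data:[\w.+-]+/[\w.+-]+;base64,")


def process_value(
    value: Any,
    save_data_uri: Callable[[str], str],  # hypothetical: stores the data, returns a workspace:// ref
    max_string_length: int = 10000,
    max_array_preview: int = 5,
) -> Any:
    """Shrink a single output value so it does not bloat the LLM context."""
    if isinstance(value, str):
        if DATA_URI.match(value):
            return save_data_uri(value)
        if len(value) > max_string_length:
            omitted = len(value) - max_string_length
            return f"{value[:max_string_length]}... [truncated {omitted} chars]"
        return value

    if isinstance(value, dict):
        # Recurse so large values nested inside dicts are handled too.
        return {
            key: process_value(item, save_data_uri, max_string_length, max_array_preview)
            for key, item in value.items()
        }

    if isinstance(value, (list, tuple)):
        # Tuples come back as lists; only the first few items are kept.
        preview = [
            process_value(item, save_data_uri, max_string_length, max_array_preview)
            for item in value[:max_array_preview]
        ]
        if len(value) > max_array_preview:
            return {
                "summary": f"Array with {len(value)} items, first {max_array_preview} shown",
                "preview": preview,
            }
        return preview

    return value  # numbers, booleans, None, and other types pass through unchanged
```

Passing the storage step in as a callback keeps the function pure enough to unit-test without a workspace.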
--- a/autogpt_platform/backend/backend/api/features/chat/tools/__init__.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/__init__.py
@@ -18,6 +18,12 @@ from .get_doc_page import GetDocPageTool
 from .run_agent import RunAgentTool
 from .run_block import RunBlockTool
 from .search_docs import SearchDocsTool
+from .workspace_files import (
+    DeleteWorkspaceFileTool,
+    ListWorkspaceFilesTool,
+    ReadWorkspaceFileTool,
+    WriteWorkspaceFileTool,
+)

 if TYPE_CHECKING:
    from backend.api.features.chat.response_model import StreamToolOutputAvailable
@@ -37,6 +43,11 @@ TOOL_REGISTRY: dict[str, BaseTool] = {
    "view_agent_output": AgentOutputTool(),
    "search_docs": SearchDocsTool(),
    "get_doc_page": GetDocPageTool(),
+    # Workspace tools for CoPilot file operations
+    "list_workspace_files": ListWorkspaceFilesTool(),
+    "read_workspace_file": ReadWorkspaceFileTool(),
+    "write_workspace_file": WriteWorkspaceFileTool(),
+    "delete_workspace_file": DeleteWorkspaceFileTool(),
 }

 # Export individual tool instances for backwards compatibility
@@ -49,6 +60,11 @@ tools: list[ChatCompletionToolParam] = [
 ]


+def get_tool(tool_name: str) -> BaseTool | None:
+    """Get a tool instance by name."""
+    return TOOL_REGISTRY.get(tool_name)
+
+
 async def execute_tool(
    tool_name: str,
    parameters: dict[str, Any],
@@ -57,7 +73,7 @@ async def execute_tool(
    tool_call_id: str,
 ) -> "StreamToolOutputAvailable":
    """Execute a tool by name."""
-    tool = TOOL_REGISTRY.get(tool_name)
+    tool = get_tool(tool_name)
    if not tool:
        raise ValueError(f"Tool {tool_name} not found")

--- a/autogpt_platform/backend/backend/api/features/chat/tools/base.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/base.py
@@ -36,6 +36,16 @@ class BaseTool:
        """Whether this tool requires authentication."""
        return False

+    @property
+    def is_long_running(self) -> bool:
+        """Whether this tool is long-running and should execute in background.
+
+        Long-running tools (like agent generation) are executed via background
+        tasks to survive SSE disconnections. The result is persisted to chat
+        history and visible when the user refreshes.
+        """
+        return False
+
    def as_openai_tool(self) -> ChatCompletionToolParam:
        """Convert to OpenAI tool format."""
        return ChatCompletionToolParam(
--- a/autogpt_platform/backend/backend/api/features/chat/tools/create_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/create_agent.py
@@ -42,6 +42,10 @@ class CreateAgentTool(BaseTool):
    def requires_auth(self) -> bool:
        return True

+    @property
+    def is_long_running(self) -> bool:
+        return True
+
    @property
    def parameters(self) -> dict[str, Any]:
        return {
--- a/autogpt_platform/backend/backend/api/features/chat/tools/edit_agent.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/edit_agent.py
@@ -42,6 +42,10 @@ class EditAgentTool(BaseTool):
    def requires_auth(self) -> bool:
        return True

+    @property
+    def is_long_running(self) -> bool:
+        return True
+
    @property
    def parameters(self) -> dict[str, Any]:
        return {
--- a/autogpt_platform/backend/backend/api/features/chat/tools/models.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/models.py
@@ -28,6 +28,16 @@ class ResponseType(str, Enum):
    BLOCK_OUTPUT = "block_output"
    DOC_SEARCH_RESULTS = "doc_search_results"
    DOC_PAGE = "doc_page"
+    # Workspace response types
+    WORKSPACE_FILE_LIST = "workspace_file_list"
+    WORKSPACE_FILE_CONTENT = "workspace_file_content"
+    WORKSPACE_FILE_METADATA = "workspace_file_metadata"
+    WORKSPACE_FILE_WRITTEN = "workspace_file_written"
+    WORKSPACE_FILE_DELETED = "workspace_file_deleted"
+    # Long-running operation types
+    OPERATION_STARTED = "operation_started"
+    OPERATION_PENDING = "operation_pending"
+    OPERATION_IN_PROGRESS = "operation_in_progress"


 # Base response model
@@ -334,3 +344,39 @@ class BlockOutputResponse(ToolResponseBase):
    block_name: str
    outputs: dict[str, list[Any]]
    success: bool = True
+
+
+# Long-running operation models
+class OperationStartedResponse(ToolResponseBase):
+    """Response when a long-running operation has been started in the background.
+
+    This is returned immediately to the client while the operation continues
+    to execute. The user can close the tab and check back later.
+    """
+
+    type: ResponseType = ResponseType.OPERATION_STARTED
+    operation_id: str
+    tool_name: str
+
+
+class OperationPendingResponse(ToolResponseBase):
+    """Response stored in chat history while a long-running operation is executing.
+
+    This is persisted to the database so users see a pending state when they
+    refresh before the operation completes.
+    """
+
+    type: ResponseType = ResponseType.OPERATION_PENDING
+    operation_id: str
+    tool_name: str
+
+
+class OperationInProgressResponse(ToolResponseBase):
+    """Response when an operation is already in progress.
+
+    Returned for idempotency when the same tool_call_id is requested again
+    while the background task is still running.
+    """
+
+    type: ResponseType = ResponseType.OPERATION_IN_PROGRESS
+    tool_call_id: str
--- a/autogpt_platform/backend/backend/api/features/chat/tools/run_block.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/run_block.py
@@ -1,6 +1,7 @@
 """Tool for executing blocks directly."""

 import logging
+import uuid
 from collections import defaultdict
 from typing import Any

@@ -8,6 +9,7 @@ from backend.api.features.chat.model import ChatSession
 from backend.data.block import get_block
 from backend.data.execution import ExecutionContext
 from backend.data.model import CredentialsMetaInput
+from backend.data.workspace import get_or_create_workspace
 from backend.integrations.creds_manager import IntegrationCredentialsManager
 from backend.util.exceptions import BlockError

@@ -223,11 +225,48 @@ class RunBlockTool(BaseTool):
            )

        try:
-            # Fetch actual credentials and prepare kwargs for block execution
-            # Create execution context with defaults (blocks may require it)
+            # Get or create user's workspace for CoPilot file operations
+            workspace = await get_or_create_workspace(user_id)
+
+            # Generate synthetic IDs for CoPilot context
+            # Each chat session is treated as its own agent with one continuous run
+            # This means:
+            # - graph_id (agent) = session (memories scoped to session when limit_to_agent=True)
+            # - graph_exec_id (run) = session (memories scoped to session when limit_to_run=True)
+            # - node_exec_id = unique per block execution
+            synthetic_graph_id = f"copilot-session-{session.session_id}"
+            synthetic_graph_exec_id = f"copilot-session-{session.session_id}"
+            synthetic_node_id = f"copilot-node-{block_id}"
+            synthetic_node_exec_id = (
+                f"copilot-{session.session_id}-{uuid.uuid4().hex[:8]}"
+            )
+
+            # Create unified execution context with all required fields
+            execution_context = ExecutionContext(
+                # Execution identity
+                user_id=user_id,
+                graph_id=synthetic_graph_id,
+                graph_exec_id=synthetic_graph_exec_id,
+                graph_version=1,  # Versions are 1-indexed
+                node_id=synthetic_node_id,
+                node_exec_id=synthetic_node_exec_id,
+                # Workspace with session scoping
+                workspace_id=workspace.id,
+                session_id=session.session_id,
+            )
+
+            # Prepare kwargs for block execution
+            # Keep individual kwargs for backwards compatibility with existing blocks
            exec_kwargs: dict[str, Any] = {
                "user_id": user_id,
-                "execution_context": ExecutionContext(),
+                "execution_context": execution_context,
+                # Legacy: individual kwargs for blocks not yet using execution_context
+                "workspace_id": workspace.id,
+                "graph_exec_id": synthetic_graph_exec_id,
+                "node_exec_id": synthetic_node_exec_id,
+                "node_id": synthetic_node_id,
+                "graph_version": 1,  # Versions are 1-indexed
+                "graph_id": synthetic_graph_id,
            }

            for field_name, cred_meta in matched_credentials.items():
--- a/autogpt_platform/backend/backend/api/features/chat/tools/workspace_files.py
+++ b/autogpt_platform/backend/backend/api/features/chat/tools/workspace_files.py
@@ -0,0 +1,620 @@
+"""CoPilot tools for workspace file operations."""
+
+import base64
+import logging
+from typing import Any, Optional
+
+from pydantic import BaseModel
+
+from backend.api.features.chat.model import ChatSession
+from backend.data.workspace import get_or_create_workspace
+from backend.util.settings import Config
+from backend.util.virus_scanner import scan_content_safe
+from backend.util.workspace import WorkspaceManager
+
+from .base import BaseTool
+from .models import ErrorResponse, ResponseType, ToolResponseBase
+
+logger = logging.getLogger(__name__)
+
+
+class WorkspaceFileInfoData(BaseModel):
+    """Data model for workspace file information (not a response itself)."""
+
+    file_id: str
+    name: str
+    path: str
+    mime_type: str
+    size_bytes: int
+
+
+class WorkspaceFileListResponse(ToolResponseBase):
+    """Response containing list of workspace files."""
+
+    type: ResponseType = ResponseType.WORKSPACE_FILE_LIST
+    files: list[WorkspaceFileInfoData]
+    total_count: int
+
+
+class WorkspaceFileContentResponse(ToolResponseBase):
+    """Response containing workspace file content (legacy, for small text files)."""
+
+    type: ResponseType = ResponseType.WORKSPACE_FILE_CONTENT
+    file_id: str
+    name: str
+    path: str
+    mime_type: str
+    content_base64: str
+
+
+class WorkspaceFileMetadataResponse(ToolResponseBase):
+    """Response containing workspace file metadata and download URL (prevents context bloat)."""
+
+    type: ResponseType = ResponseType.WORKSPACE_FILE_METADATA
+    file_id: str
+    name: str
+    path: str
+    mime_type: str
+    size_bytes: int
+    download_url: str
+    preview: str | None = None  # First 500 bytes of text files, decoded as UTF-8
+
+
+class WorkspaceWriteResponse(ToolResponseBase):
+    """Response after writing a file to workspace."""
+
+    type: ResponseType = ResponseType.WORKSPACE_FILE_WRITTEN
+    file_id: str
+    name: str
+    path: str
+    size_bytes: int
+
+
+class WorkspaceDeleteResponse(ToolResponseBase):
+    """Response after deleting a file from workspace."""
+
+    type: ResponseType = ResponseType.WORKSPACE_FILE_DELETED
+    file_id: str
+    success: bool
+
+
+class ListWorkspaceFilesTool(BaseTool):
+    """Tool for listing files in user's workspace."""
+
+    @property
+    def name(self) -> str:
+        return "list_workspace_files"
+
+    @property
+    def description(self) -> str:
+        return (
+            "List files in the user's workspace. "
+            "Returns file names, paths, sizes, and metadata. "
+            "Optionally filter by path prefix."
+        )
+
+    @property
+    def parameters(self) -> dict[str, Any]:
+        return {
+            "type": "object",
+            "properties": {
+                "path_prefix": {
+                    "type": "string",
+                    "description": (
+                        "Optional path prefix to filter files "
+                        "(e.g., '/documents/' to list only files in documents folder). "
+                        "By default, only files from the current session are listed."
+                    ),
+                },
+                "limit": {
+                    "type": "integer",
+                    "description": "Maximum number of files to return (default 50, max 100)",
+                    "minimum": 1,
+                    "maximum": 100,
+                },
+                "include_all_sessions": {
+                    "type": "boolean",
+                    "description": (
+                        "If true, list files from all sessions. "
+                        "Default is false (only current session's files)."
+                    ),
+                },
+            },
+            "required": [],
+        }
+
+    @property
+    def requires_auth(self) -> bool:
+        return True
+
+    async def _execute(
+        self,
+        user_id: str | None,
+        session: ChatSession,
+        **kwargs,
+    ) -> ToolResponseBase:
+        session_id = session.session_id
+
+        if not user_id:
+            return ErrorResponse(
+                message="Authentication required",
+                session_id=session_id,
+            )
+
+        path_prefix: Optional[str] = kwargs.get("path_prefix")
+        # Clamp to 1..100; a missing, null, or zero limit falls back to the default of 50
+        limit = max(1, min(kwargs.get("limit") or 50, 100))
+        include_all_sessions: bool = kwargs.get("include_all_sessions", False)
+
+        try:
+            workspace = await get_or_create_workspace(user_id)
+            # Pass session_id for session-scoped file access
+            manager = WorkspaceManager(user_id, workspace.id, session_id)
+
+            files = await manager.list_files(
+                path=path_prefix,
+                limit=limit,
+                include_all_sessions=include_all_sessions,
+            )
+            total = await manager.get_file_count(
+                path=path_prefix,
+                include_all_sessions=include_all_sessions,
+            )
+
+            file_infos = [
+                WorkspaceFileInfoData(
+                    file_id=f.id,
+                    name=f.name,
+                    path=f.path,
+                    mime_type=f.mimeType,
+                    size_bytes=f.sizeBytes,
+                )
+                for f in files
+            ]
+
+            scope_msg = "all sessions" if include_all_sessions else "current session"
+            return WorkspaceFileListResponse(
+                files=file_infos,
+                total_count=total,
+                message=f"Found {len(files)} files in workspace ({scope_msg})",
+                session_id=session_id,
+            )
+
+        except Exception as e:
+            logger.error(f"Error listing workspace files: {e}", exc_info=True)
+            return ErrorResponse(
+                message=f"Failed to list workspace files: {str(e)}",
+                error=str(e),
+                session_id=session_id,
+            )
+
+
+class ReadWorkspaceFileTool(BaseTool):
+    """Tool for reading file content from workspace."""
+
+    # Size threshold for returning full content vs metadata+URL
+    # Files larger than this return metadata with download URL to prevent context bloat
+    MAX_INLINE_SIZE_BYTES = 32 * 1024  # 32KB
+    # Preview size for text files
+    PREVIEW_SIZE = 500
+
+    @property
+    def name(self) -> str:
+        return "read_workspace_file"
+
+    @property
+    def description(self) -> str:
+        return (
+            "Read a file from the user's workspace. "
+            "Specify either file_id or path to identify the file. "
+            "For small text files, returns content directly. "
+            "For large or binary files, returns metadata and a download URL. "
+            "Paths are scoped to the current session by default. "
+            "Use /sessions/<session_id>/... for cross-session access."
+        )
+
+    @property
+    def parameters(self) -> dict[str, Any]:
+        return {
+            "type": "object",
+            "properties": {
+                "file_id": {
+                    "type": "string",
+                    "description": "The file's unique ID (from list_workspace_files)",
+                },
+                "path": {
+                    "type": "string",
+                    "description": (
+                        "The virtual file path (e.g., '/documents/report.pdf'). "
+                        "Scoped to current session by default."
+                    ),
+                },
+                "force_download_url": {
+                    "type": "boolean",
+                    "description": (
+                        "If true, always return metadata+URL instead of inline content. "
+                        "Default is false (auto-selects based on file size/type)."
+                    ),
+                },
+            },
+            "required": [],  # At least one must be provided
+        }
+
+    @property
+    def requires_auth(self) -> bool:
+        return True
+
+    def _is_text_mime_type(self, mime_type: str) -> bool:
+        """Check if the MIME type is a text-based type."""
+        text_types = [
+            "text/",
+            "application/json",
+            "application/xml",
+            "application/javascript",
+            "application/x-python",
+            "application/x-sh",
+        ]
+        return any(mime_type.startswith(t) for t in text_types)
+
+    async def _execute(
+        self,
+        user_id: str | None,
+        session: ChatSession,
+        **kwargs,
+    ) -> ToolResponseBase:
+        session_id = session.session_id
+
+        if not user_id:
+            return ErrorResponse(
+                message="Authentication required",
+                session_id=session_id,
+            )
+
+        file_id: Optional[str] = kwargs.get("file_id")
+        path: Optional[str] = kwargs.get("path")
+        force_download_url: bool = kwargs.get("force_download_url", False)
+
+        if not file_id and not path:
+            return ErrorResponse(
+                message="Please provide either file_id or path",
+                session_id=session_id,
+            )
+
+        try:
+            workspace = await get_or_create_workspace(user_id)
+            # Pass session_id for session-scoped file access
+            manager = WorkspaceManager(user_id, workspace.id, session_id)
+
+            # Get file info
+            if file_id:
+                file_info = await manager.get_file_info(file_id)
+                if file_info is None:
+                    return ErrorResponse(
+                        message=f"File not found: {file_id}",
+                        session_id=session_id,
+                    )
+                target_file_id = file_id
+            else:
+                # path is guaranteed to be non-None here due to the check above
+                assert path is not None
+                file_info = await manager.get_file_info_by_path(path)
+                if file_info is None:
+                    return ErrorResponse(
+                        message=f"File not found at path: {path}",
+                        session_id=session_id,
+                    )
+                target_file_id = file_info.id
+
+            # Decide whether to return inline content or metadata+URL
+            is_small_file = file_info.sizeBytes <= self.MAX_INLINE_SIZE_BYTES
+            is_text_file = self._is_text_mime_type(file_info.mimeType)
+
+            # Return inline content for small text files (unless force_download_url)
+            if is_small_file and is_text_file and not force_download_url:
+                content = await manager.read_file_by_id(target_file_id)
+                content_b64 = base64.b64encode(content).decode("utf-8")
+
+                return WorkspaceFileContentResponse(
+                    file_id=file_info.id,
+                    name=file_info.name,
+                    path=file_info.path,
+                    mime_type=file_info.mimeType,
+                    content_base64=content_b64,
+                    message=f"Successfully read file: {file_info.name}",
+                    session_id=session_id,
+                )
+
+            # Return metadata + workspace:// reference for large or binary files
+            # This prevents context bloat (100KB file = ~133KB as base64)
+            # Use workspace:// format so frontend urlTransform can add proxy prefix
+            download_url = f"workspace://{target_file_id}"
+
+            # Generate preview for text files
+            preview: str | None = None
+            if is_text_file:
+                try:
+                    content = await manager.read_file_by_id(target_file_id)
+                    preview_text = content[: self.PREVIEW_SIZE].decode(
+                        "utf-8", errors="replace"
+                    )
+                    if len(content) > self.PREVIEW_SIZE:
+                        preview_text += "..."
+                    preview = preview_text
+                except Exception:
+                    pass  # Preview is optional
+
+            return WorkspaceFileMetadataResponse(
+                file_id=file_info.id,
+                name=file_info.name,
+                path=file_info.path,
+                mime_type=file_info.mimeType,
+                size_bytes=file_info.sizeBytes,
+                download_url=download_url,
+                preview=preview,
+                message=f"File: {file_info.name} ({file_info.sizeBytes} bytes). Use download_url to retrieve content.",
+                session_id=session_id,
+            )
+
+        except FileNotFoundError as e:
+            return ErrorResponse(
+                message=str(e),
+                session_id=session_id,
+            )
+        except Exception as e:
+            logger.error(f"Error reading workspace file: {e}", exc_info=True)
+            return ErrorResponse(
+                message=f"Failed to read workspace file: {str(e)}",
+                error=str(e),
+                session_id=session_id,
+            )
+
+
+class WriteWorkspaceFileTool(BaseTool):
+    """Tool for writing files to workspace."""
+
+    @property
+    def name(self) -> str:
+        return "write_workspace_file"
+
+    @property
+    def description(self) -> str:
+        return (
+            "Write or create a file in the user's workspace. "
+            "Provide the content as a base64-encoded string. "
+            f"Maximum file size is {Config().max_file_size_mb}MB. "
+            "Files are saved to the current session's folder by default. "
+            "Use /sessions/<session_id>/... for cross-session access."
+        )
+
+    @property
+    def parameters(self) -> dict[str, Any]:
+        return {
+            "type": "object",
+            "properties": {
+                "filename": {
+                    "type": "string",
+                    "description": "Name for the file (e.g., 'report.pdf')",
+                },
+                "content_base64": {
+                    "type": "string",
+                    "description": "Base64-encoded file content",
+                },
+                "path": {
+                    "type": "string",
+                    "description": (
+                        "Optional virtual path where to save the file "
+                        "(e.g., '/documents/report.pdf'). "
+                        "Defaults to '/{filename}'. Scoped to current session."
+                    ),
+                },
+                "mime_type": {
+                    "type": "string",
+                    "description": (
+                        "Optional MIME type of the file. "
+                        "Auto-detected from filename if not provided."
+                    ),
+                },
+                "overwrite": {
+                    "type": "boolean",
+                    "description": "Whether to overwrite if file exists at path (default: false)",
+                },
+            },
+            "required": ["filename", "content_base64"],
+        }
+
+    @property
+    def requires_auth(self) -> bool:
+        return True
+
+    async def _execute(
+        self,
+        user_id: str | None,
+        session: ChatSession,
+        **kwargs,
+    ) -> ToolResponseBase:
+        session_id = session.session_id
+
+        if not user_id:
+            return ErrorResponse(
+                message="Authentication required",
+                session_id=session_id,
+            )
+
+        filename: str = kwargs.get("filename", "")
+        content_b64: str = kwargs.get("content_base64", "")
+        path: Optional[str] = kwargs.get("path")
+        mime_type: Optional[str] = kwargs.get("mime_type")
+        overwrite: bool = kwargs.get("overwrite", False)
+
+        if not filename:
+            return ErrorResponse(
+                message="Please provide a filename",
+                session_id=session_id,
+            )
+
+        if not content_b64:
+            return ErrorResponse(
+                message="Please provide content_base64",
+                session_id=session_id,
+            )
+
+        # Decode content
+        try:
+            content = base64.b64decode(content_b64)
+        except Exception:
+            return ErrorResponse(
+                message="Invalid base64-encoded content",
+                session_id=session_id,
+            )
+
+        # Check size
+        max_file_size = Config().max_file_size_mb * 1024 * 1024
+        if len(content) > max_file_size:
+            return ErrorResponse(
+                message=f"File too large. Maximum size is {Config().max_file_size_mb}MB",
+                session_id=session_id,
+            )
+
+        try:
+            # Virus scan
+            await scan_content_safe(content, filename=filename)
+
+            workspace = await get_or_create_workspace(user_id)
+            # Pass session_id for session-scoped file access
+            manager = WorkspaceManager(user_id, workspace.id, session_id)
+
+            file_record = await manager.write_file(
+                content=content,
+                filename=filename,
+                path=path,
+                mime_type=mime_type,
+                overwrite=overwrite,
+            )
+
+            return WorkspaceWriteResponse(
+                file_id=file_record.id,
+                name=file_record.name,
+                path=file_record.path,
+                size_bytes=file_record.sizeBytes,
+                message=f"Successfully wrote file: {file_record.name}",
+                session_id=session_id,
+            )
+
+        except ValueError as e:
+            return ErrorResponse(
+                message=str(e),
+                session_id=session_id,
+            )
+        except Exception as e:
+            logger.error(f"Error writing workspace file: {e}", exc_info=True)
+            return ErrorResponse(
+                message=f"Failed to write workspace file: {str(e)}",
+                error=str(e),
+                session_id=session_id,
+            )
+
+
+class DeleteWorkspaceFileTool(BaseTool):
+    """Tool for deleting files from workspace."""
+
+    @property
+    def name(self) -> str:
+        return "delete_workspace_file"
+
+    @property
+    def description(self) -> str:
+        return (
+            "Delete a file from the user's workspace. "
+            "Specify either file_id or path to identify the file. "
+            "Paths are scoped to the current session by default. "
+            "Use /sessions/<session_id>/... for cross-session access."
+        )
+
+    @property
+    def parameters(self) -> dict[str, Any]:
+        return {
+            "type": "object",
+            "properties": {
+                "file_id": {
+                    "type": "string",
+                    "description": "The file's unique ID (from list_workspace_files)",
+                },
+                "path": {
+                    "type": "string",
+                    "description": (
+                        "The virtual file path (e.g., '/documents/report.pdf'). "
+                        "Scoped to current session by default."
+                    ),
+                },
+            },
+            "required": [],  # At least one must be provided
+        }
+
+    @property
+    def requires_auth(self) -> bool:
+        return True
+
+    async def _execute(
+        self,
+        user_id: str | None,
+        session: ChatSession,
+        **kwargs,
+    ) -> ToolResponseBase:
+        session_id = session.session_id
+
+        if not user_id:
+            return ErrorResponse(
+                message="Authentication required",
+                session_id=session_id,
+            )
+
+        file_id: Optional[str] = kwargs.get("file_id")
+        path: Optional[str] = kwargs.get("path")
+
+        if not file_id and not path:
+            return ErrorResponse(
+                message="Please provide either file_id or path",
+                session_id=session_id,
+            )
+
+        try:
+            workspace = await get_or_create_workspace(user_id)
+            # Pass session_id for session-scoped file access
+            manager = WorkspaceManager(user_id, workspace.id, session_id)
+
+            # Determine the file_id to delete
+            target_file_id: str
+            if file_id:
+                target_file_id = file_id
+            else:
+                # path is guaranteed to be non-None here due to the check above
+                assert path is not None
+                file_info = await manager.get_file_info_by_path(path)
+                if file_info is None:
+                    return ErrorResponse(
+                        message=f"File not found at path: {path}",
+                        session_id=session_id,
+                    )
+                target_file_id = file_info.id
+
+            success = await manager.delete_file(target_file_id)
+
+            if not success:
+                return ErrorResponse(
+                    message=f"File not found: {target_file_id}",
+                    session_id=session_id,
+                )
+
+            return WorkspaceDeleteResponse(
+                file_id=target_file_id,
+                success=True,
+                message="File deleted successfully",
+                session_id=session_id,
+            )
+
+        except Exception as e:
+            logger.error(f"Error deleting workspace file: {e}", exc_info=True)
+            return ErrorResponse(
+                message=f"Failed to delete workspace file: {str(e)}",
+                error=str(e),
+                session_id=session_id,
+            )
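The read tool above inlines base64 content only for small text files and otherwise returns a `workspace://` reference. The sketch below illustrates that decision and the base64 overhead cited in the comment (100KB becoming roughly 133KB). The name `choose_response_mode`, the 32 KB limit, and the MIME check are assumptions for illustration; the real `MAX_INLINE_SIZE_BYTES` value and the `_is_text_mime_type` logic are not shown in this diff.

```python
import base64

# Hypothetical limit for illustration only; the tool's actual
# MAX_INLINE_SIZE_BYTES is not visible in this diff.
MAX_INLINE_SIZE_BYTES = 32 * 1024


def base64_size(raw_size: int) -> int:
    """Length of the padded base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def choose_response_mode(
    size_bytes: int, mime_type: str, force_download_url: bool = False
) -> str:
    """Return "inline" for small text files, otherwise "reference"."""
    # Assumed simplification of the tool's text-type check
    is_text = mime_type.startswith("text/") or mime_type == "application/json"
    is_small = size_bytes <= MAX_INLINE_SIZE_BYTES
    if is_small and is_text and not force_download_url:
        return "inline"
    return "reference"
```

Base64 emits 4 output characters for every 3 input bytes, which is why large or binary files are better served by reference than inlined into the chat context.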
--- a/autogpt_platform/backend/backend/api/features/library/db.py
+++ b/autogpt_platform/backend/backend/api/features/library/db.py
@@ -21,7 +21,7 @@ from backend.data.model import CredentialsMetaInput
 from backend.integrations.creds_manager import IntegrationCredentialsManager
 from backend.integrations.webhooks.graph_lifecycle_hooks import on_graph_activate
 from backend.util.clients import get_scheduler_client
-from backend.util.exceptions import DatabaseError, NotFoundError
+from backend.util.exceptions import DatabaseError, InvalidInputError, NotFoundError
 from backend.util.json import SafeJson
 from backend.util.models import Pagination
 from backend.util.settings import Config
@@ -64,11 +64,11 @@ async def list_library_agents(

    if page < 1 or page_size < 1:
        logger.warning(f"Invalid pagination: page={page}, page_size={page_size}")
-        raise DatabaseError("Invalid pagination input")
+        raise InvalidInputError("Invalid pagination input")

    if search_term and len(search_term.strip()) > 100:
        logger.warning(f"Search term too long: {repr(search_term)}")
-        raise DatabaseError("Search term is too long")
+        raise InvalidInputError("Search term is too long")

    where_clause: prisma.types.LibraryAgentWhereInput = {
        "userId": user_id,
@@ -175,7 +175,7 @@ async def list_favorite_library_agents(

    if page < 1 or page_size < 1:
        logger.warning(f"Invalid pagination: page={page}, page_size={page_size}")
-        raise DatabaseError("Invalid pagination input")
+        raise InvalidInputError("Invalid pagination input")

    where_clause: prisma.types.LibraryAgentWhereInput = {
        "userId": user_id,
--- a/autogpt_platform/backend/backend/api/features/library/routes/agents.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes/agents.py
@@ -1,4 +1,3 @@
-import logging
 from typing import Literal, Optional

 import autogpt_libs.auth as autogpt_auth_lib
@@ -6,15 +5,11 @@ from fastapi import APIRouter, Body, HTTPException, Query, Security, status
 from fastapi.responses import Response
 from prisma.enums import OnboardingStep

-import backend.api.features.store.exceptions as store_exceptions
 from backend.data.onboarding import complete_onboarding_step
-from backend.util.exceptions import DatabaseError, NotFoundError

 from .. import db as library_db
 from .. import model as library_model

-logger = logging.getLogger(__name__)
-
 router = APIRouter(
    prefix="/agents",
    tags=["library", "private"],
@@ -26,10 +21,6 @@ router = APIRouter(
    "",
    summary="List Library Agents",
    response_model=library_model.LibraryAgentResponse,
-    responses={
-        200: {"description": "List of library agents"},
-        500: {"description": "Server error", "content": {"application/json": {}}},
-    },
 )
 async def list_library_agents(
    user_id: str = Security(autogpt_auth_lib.get_user_id),
@@ -53,43 +44,19 @@ async def list_library_agents(
 ) -> library_model.LibraryAgentResponse:
    """
    Get all agents in the user's library (both created and saved).
-
-    Args:
-        user_id: ID of the authenticated user.
-        search_term: Optional search term to filter agents by name/description.
-        filter_by: List of filters to apply (favorites, created by user).
-        sort_by: List of sorting criteria (created date, updated date).
-        page: Page number to retrieve.
-        page_size: Number of agents per page.
-
-    Returns:
-        A LibraryAgentResponse containing agents and pagination metadata.
-
-    Raises:
-        HTTPException: If a server/database error occurs.
    """
-    try:
-        return await library_db.list_library_agents(
-            user_id=user_id,
-            search_term=search_term,
-            sort_by=sort_by,
-            page=page,
-            page_size=page_size,
-        )
-    except Exception as e:
-        logger.error(f"Could not list library agents for user #{user_id}: {e}")
-        raise HTTPException(
-            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
-            detail=str(e),
-        ) from e
+    return await library_db.list_library_agents(
+        user_id=user_id,
+        search_term=search_term,
+        sort_by=sort_by,
+        page=page,
+        page_size=page_size,
+    )


@router.get(
    "/favorites",
    summary="List Favorite Library Agents",
-    responses={
-        500: {"description": "Server error", "content": {"application/json": {}}},
-    },
 )
 async def list_favorite_library_agents(
    user_id: str = Security(autogpt_auth_lib.get_user_id),
@@ -106,30 +73,12 @@ async def list_favorite_library_agents(
 ) -> library_model.LibraryAgentResponse:
    """
    Get all favorite agents in the user's library.
-
-    Args:
-        user_id: ID of the authenticated user.
-        page: Page number to retrieve.
-        page_size: Number of agents per page.
-
-    Returns:
-        A LibraryAgentResponse containing favorite agents and pagination metadata.
-
-    Raises:
-        HTTPException: If a server/database error occurs.
    """
-    try:
-        return await library_db.list_favorite_library_agents(
-            user_id=user_id,
-            page=page,
-            page_size=page_size,
-        )
-    except Exception as e:
-        logger.error(f"Could not list favorite library agents for user #{user_id}: {e}")
-        raise HTTPException(
-            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
-            detail=str(e),
-        ) from e
+    return await library_db.list_favorite_library_agents(
+        user_id=user_id,
+        page=page,
+        page_size=page_size,
+    )


@router.get("/{library_agent_id}", summary="Get Library Agent")
@@ -162,10 +111,6 @@ async def get_library_agent_by_graph_id(
    summary="Get Agent By Store ID",
    tags=["store", "library"],
    response_model=library_model.LibraryAgent | None,
-    responses={
-        200: {"description": "Library agent found"},
-        404: {"description": "Agent not found"},
-    },
 )
 async def get_library_agent_by_store_listing_version_id(
    store_listing_version_id: str,
@@ -174,32 +119,15 @@ async def get_library_agent_by_store_listing_version_id(
    """
    Get Library Agent from Store Listing Version ID.
    """
-    try:
-        return await library_db.get_library_agent_by_store_version_id(
-            store_listing_version_id, user_id
-        )
-    except NotFoundError as e:
-        raise HTTPException(
-            status_code=status.HTTP_404_NOT_FOUND,
-            detail=str(e),
-        )
-    except Exception as e:
-        logger.error(f"Could not fetch library agent from store version ID: {e}")
-        raise HTTPException(
-            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
-            detail=str(e),
-        ) from e
+    return await library_db.get_library_agent_by_store_version_id(
+        store_listing_version_id, user_id
+    )


@router.post(
    "",
    summary="Add Marketplace Agent",
    status_code=status.HTTP_201_CREATED,
-    responses={
-        201: {"description": "Agent added successfully"},
-        404: {"description": "Store listing version not found"},
-        500: {"description": "Server error"},
-    },
 )
 async def add_marketplace_agent_to_library(
    store_listing_version_id: str = Body(embed=True),
@@ -210,59 +138,19 @@ async def add_marketplace_agent_to_library(
 ) -> library_model.LibraryAgent:
    """
    Add an agent from the marketplace to the user's library.
-
-    Args:
-        store_listing_version_id: ID of the store listing version to add.
-        user_id: ID of the authenticated user.
-
-    Returns:
-        library_model.LibraryAgent: Agent added to the library
-
-    Raises:
-        HTTPException(404): If the listing version is not found.
-        HTTPException(500): If a server/database error occurs.
    """
-    try:
-        agent = await library_db.add_store_agent_to_library(
-            store_listing_version_id=store_listing_version_id,
-            user_id=user_id,
-        )
-        if source != "onboarding":
-            await complete_onboarding_step(
-                user_id, OnboardingStep.MARKETPLACE_ADD_AGENT
-            )
-        return agent
-
-    except store_exceptions.AgentNotFoundError as e:
-        logger.warning(
-            f"Could not find store listing version {store_listing_version_id} "
-            "to add to library"
-        )
-        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e))
-    except DatabaseError as e:
-        logger.error(f"Database error while adding agent to library: {e}", e)
-        raise HTTPException(
-            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
-            detail={"message": str(e), "hint": "Inspect DB logs for details."},
-        ) from e
-    except Exception as e:
-        logger.error(f"Unexpected error while adding agent to library: {e}")
-        raise HTTPException(
-            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
-            detail={
-                "message": str(e),
-                "hint": "Check server logs for more information.",
-            },
-        ) from e
+    agent = await library_db.add_store_agent_to_library(
+        store_listing_version_id=store_listing_version_id,
+        user_id=user_id,
+    )
+    if source != "onboarding":
+        await complete_onboarding_step(user_id, OnboardingStep.MARKETPLACE_ADD_AGENT)
+    return agent


@router.patch(
    "/{library_agent_id}",
    summary="Update Library Agent",
-    responses={
-        200: {"description": "Agent updated successfully"},
-        500: {"description": "Server error"},
-    },
 )
 async def update_library_agent(
    library_agent_id: str,
@@ -271,52 +159,21 @@ async def update_library_agent(
 ) -> library_model.LibraryAgent:
    """
    Update the library agent with the given fields.
-
-    Args:
-        library_agent_id: ID of the library agent to update.
-        payload: Fields to update (auto_update_version, is_favorite, etc.).
-        user_id: ID of the authenticated user.
-
-    Raises:
-        HTTPException(500): If a server/database error occurs.
    """
-    try:
-        return await library_db.update_library_agent(
-            library_agent_id=library_agent_id,
-            user_id=user_id,
-            auto_update_version=payload.auto_update_version,
-            graph_version=payload.graph_version,
-            is_favorite=payload.is_favorite,
-            is_archived=payload.is_archived,
-            settings=payload.settings,
-        )
-    except NotFoundError as e:
-        raise HTTPException(
-            status_code=status.HTTP_404_NOT_FOUND,
-            detail=str(e),
-        ) from e
-    except DatabaseError as e:
-        logger.error(f"Database error while updating library agent: {e}")
-        raise HTTPException(
-            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
-            detail={"message": str(e), "hint": "Verify DB connection."},
-        ) from e
-    except Exception as e:
-        logger.error(f"Unexpected error while updating library agent: {e}")
-        raise HTTPException(
-            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
-            detail={"message": str(e), "hint": "Check server logs."},
-        ) from e
+    return await library_db.update_library_agent(
+        library_agent_id=library_agent_id,
+        user_id=user_id,
+        auto_update_version=payload.auto_update_version,
+        graph_version=payload.graph_version,
+        is_favorite=payload.is_favorite,
+        is_archived=payload.is_archived,
+        settings=payload.settings,
+    )


@router.delete(
    "/{library_agent_id}",
    summary="Delete Library Agent",
-    responses={
-        204: {"description": "Agent deleted successfully"},
-        404: {"description": "Agent not found"},
-        500: {"description": "Server error"},
-    },
 )
 async def delete_library_agent(
    library_agent_id: str,
@@ -324,28 +181,11 @@ async def delete_library_agent(
 ) -> Response:
    """
    Soft-delete the specified library agent.
-
-    Args:
-        library_agent_id: ID of the library agent to delete.
-        user_id: ID of the authenticated user.
-
-    Returns:
-        204 No Content if successful.
-
-    Raises:
-        HTTPException(404): If the agent does not exist.
-        HTTPException(500): If a server/database error occurs.
    """
-    try:
-        await library_db.delete_library_agent(
-            library_agent_id=library_agent_id, user_id=user_id
-        )
-        return Response(status_code=status.HTTP_204_NO_CONTENT)
-    except NotFoundError as e:
-        raise HTTPException(
-            status_code=status.HTTP_404_NOT_FOUND,
-            detail=str(e),
-        ) from e
+    await library_db.delete_library_agent(
+        library_agent_id=library_agent_id, user_id=user_id
+    )
+    return Response(status_code=status.HTTP_204_NO_CONTENT)


@router.post("/{library_agent_id}/fork", summary="Fork Library Agent")
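The route handlers above no longer translate exceptions into `HTTPException` themselves, and `db.py` now raises `InvalidInputError` for bad pagination input. This presumably relies on application-level exception handling that maps exception types to status codes; that mapping is not part of this diff, so the sketch below is an assumption about how such a mapping could look. The exception classes are stand-ins for those in `backend.util.exceptions`, and `status_for` is a hypothetical helper.

```python
from http import HTTPStatus


# Stand-ins for backend.util.exceptions; the real base classes
# are not shown in this diff.
class DatabaseError(Exception):
    pass


class InvalidInputError(ValueError):
    pass


class NotFoundError(ValueError):
    pass


# Hypothetical mapping a central handler might apply; first match wins.
STATUS_BY_EXCEPTION = [
    (NotFoundError, HTTPStatus.NOT_FOUND),
    (InvalidInputError, HTTPStatus.BAD_REQUEST),
    (DatabaseError, HTTPStatus.INTERNAL_SERVER_ERROR),
]


def status_for(exc: Exception) -> HTTPStatus:
    """Pick the HTTP status for an exception, defaulting to 500."""
    for exc_type, status in STATUS_BY_EXCEPTION:
        if isinstance(exc, exc_type):
            return status
    return HTTPStatus.INTERNAL_SERVER_ERROR
```

Under a mapping like this, a caller sending `page=0` would get a client error rather than the 500 that a `DatabaseError` implies.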
--- a/autogpt_platform/backend/backend/api/features/library/routes_test.py
+++ b/autogpt_platform/backend/backend/api/features/library/routes_test.py
@@ -118,21 +118,6 @@ async def test_get_library_agents_success(
    )


-def test_get_library_agents_error(mocker: pytest_mock.MockFixture, test_user_id: str):
-    mock_db_call = mocker.patch("backend.api.features.library.db.list_library_agents")
-    mock_db_call.side_effect = Exception("Test error")
-
-    response = client.get("/agents?search_term=test")
-    assert response.status_code == 500
-    mock_db_call.assert_called_once_with(
-        user_id=test_user_id,
-        search_term="test",
-        sort_by=library_model.LibraryAgentSort.UPDATED_AT,
-        page=1,
-        page_size=15,
-    )
-
-
@pytest.mark.asyncio
 async def test_get_favorite_library_agents_success(
    mocker: pytest_mock.MockFixture,
@@ -190,23 +175,6 @@ async def test_get_favorite_library_agents_success(
    )


-def test_get_favorite_library_agents_error(
-    mocker: pytest_mock.MockFixture, test_user_id: str
-):
-    mock_db_call = mocker.patch(
-        "backend.api.features.library.db.list_favorite_library_agents"
-    )
-    mock_db_call.side_effect = Exception("Test error")
-
-    response = client.get("/agents/favorites")
-    assert response.status_code == 500
-    mock_db_call.assert_called_once_with(
-        user_id=test_user_id,
-        page=1,
-        page_size=15,
-    )
-
-
 def test_add_agent_to_library_success(
    mocker: pytest_mock.MockFixture, test_user_id: str
 ):
@@ -258,19 +226,3 @@ def test_add_agent_to_library_success(
        store_listing_version_id="test-version-id", user_id=test_user_id
    )
    mock_complete_onboarding.assert_awaited_once()
-
-
-def test_add_agent_to_library_error(mocker: pytest_mock.MockFixture, test_user_id: str):
-    mock_db_call = mocker.patch(
-        "backend.api.features.library.db.add_store_agent_to_library"
-    )
-    mock_db_call.side_effect = Exception("Test error")
-
-    response = client.post(
-        "/agents", json={"store_listing_version_id": "test-version-id"}
-    )
-    assert response.status_code == 500
-    assert "detail" in response.json()  # Verify error response structure
-    mock_db_call.assert_called_once_with(
-        store_listing_version_id="test-version-id", user_id=test_user_id
-    )
--- a/autogpt_platform/backend/backend/api/features/store/embeddings.py
+++ b/autogpt_platform/backend/backend/api/features/store/embeddings.py
@@ -454,6 +454,7 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
    total_processed = 0
    total_success = 0
    total_failed = 0
+    all_errors: dict[str, int] = {}  # Aggregate errors across all content types

    # Process content types in explicit order
    processing_order = [
@@ -499,23 +500,12 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
            success = sum(1 for result in results if result is True)
            failed = len(results) - success

-            # Aggregate unique errors to avoid Sentry spam
+            # Aggregate errors across all content types
            if failed > 0:
-                # Group errors by type and message
-                error_summary: dict[str, int] = {}
                for result in results:
                    if isinstance(result, Exception):
                        error_key = f"{type(result).__name__}: {str(result)}"
-                        error_summary[error_key] = error_summary.get(error_key, 0) + 1
-
-                # Log aggregated error summary
-                error_details = ", ".join(
-                    f"{error} ({count}x)" for error, count in error_summary.items()
-                )
-                logger.error(
-                    f"{content_type.value}: {failed}/{len(results)} embeddings failed. "
-                    f"Errors: {error_details}"
-                )
+                        all_errors[error_key] = all_errors.get(error_key, 0) + 1

            results_by_type[content_type.value] = {
                "processed": len(missing_items),
@@ -542,6 +532,13 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
                "error": str(e),
            }

+    # Log aggregated errors once at the end
+    if all_errors:
+        error_details = ", ".join(
+            f"{error} ({count}x)" for error, count in all_errors.items()
+        )
+        logger.error(f"Embedding backfill errors: {error_details}")
+
    return {
        "by_type": results_by_type,
        "totals": {
--- a/autogpt_platform/backend/backend/api/features/store/routes.py
+++ b/autogpt_platform/backend/backend/api/features/store/routes.py
@@ -393,7 +393,6 @@ async def get_creators(
@router.get(
    "/creator/{username}",
    summary="Get creator details",
-    operation_id="getV2GetCreatorDetails",
    tags=["store", "public"],
    response_model=store_model.CreatorDetails,
 )
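The change to `backfill_all_content_types` in `embeddings.py` replaces one error log per content type with a single aggregated log at the end of the run. A minimal sketch of that aggregation pattern, using `collections.Counter` instead of the manual `dict.get` counting in the diff; `summarize_errors` and its input shape are illustrative assumptions, not code from the repository.

```python
from collections import Counter
from typing import Optional


def summarize_errors(results_by_batch: list[list[object]]) -> Optional[str]:
    """Collapse exceptions from several batches into one summary string.

    Each batch holds results such as those returned by
    asyncio.gather(..., return_exceptions=True): True on success,
    or an Exception instance on failure.
    """
    all_errors: Counter[str] = Counter()
    for results in results_by_batch:
        for result in results:
            if isinstance(result, Exception):
                # Same key format as the diff: "<type name>: <message>"
                all_errors[f"{type(result).__name__}: {result}"] += 1
    if not all_errors:
        return None
    return ", ".join(f"{error} ({count}x)" for error, count in all_errors.items())
```

Emitting one summary per run keeps repeated failures from producing one error event per content type.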
--- a/autogpt_platform/backend/backend/api/features/v1.py
+++ b/autogpt_platform/backend/backend/api/features/v1.py
@@ -261,14 +261,36 @@ async def get_onboarding_agents(
    return await get_recommended_agents(user_id)


+class OnboardingStatusResponse(pydantic.BaseModel):
+    """Response for onboarding status check."""
+
+    is_onboarding_enabled: bool
+    is_chat_enabled: bool
+
+
@v1_router.get(
    "/onboarding/enabled",
    summary="Is onboarding enabled",
    tags=["onboarding", "public"],
-    dependencies=[Security(requires_user)],
+    response_model=OnboardingStatusResponse,
 )
-async def is_onboarding_enabled() -> bool:
-    return await onboarding_enabled()
+async def is_onboarding_enabled(
+    user_id: Annotated[str, Security(get_user_id)],
+) -> OnboardingStatusResponse:
+    # Check if chat is enabled for user
+    is_chat_enabled = await is_feature_enabled(Flag.CHAT, user_id, False)
+
+    # If chat is enabled, skip legacy onboarding
+    if is_chat_enabled:
+        return OnboardingStatusResponse(
+            is_onboarding_enabled=False,
+            is_chat_enabled=True,
+        )
+
+    return OnboardingStatusResponse(
+        is_onboarding_enabled=await onboarding_enabled(),
+        is_chat_enabled=False,
+    )


@v1_router.post(
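The `/onboarding/enabled` endpoint above now returns two flags, and legacy onboarding is reported as disabled whenever chat is enabled for the user. The rule can be sketched as a pure function; `OnboardingStatus` and `resolve_onboarding_status` are hypothetical names used only for this illustration. Unlike the endpoint, which only awaits `onboarding_enabled()` when chat is off, this sketch takes both values as plain booleans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OnboardingStatus:
    is_onboarding_enabled: bool
    is_chat_enabled: bool


def resolve_onboarding_status(
    chat_enabled: bool, legacy_onboarding_enabled: bool
) -> OnboardingStatus:
    """Chat takes precedence: when it is on, legacy onboarding is skipped."""
    if chat_enabled:
        return OnboardingStatus(is_onboarding_enabled=False, is_chat_enabled=True)
    return OnboardingStatus(
        is_onboarding_enabled=legacy_onboarding_enabled,
        is_chat_enabled=False,
    )
```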
--- a/autogpt_platform/backend/backend/api/features/workspace/__init__.py
+++ b/autogpt_platform/backend/backend/api/features/workspace/__init__.py
@@ -0,0 +1 @@
+# Workspace API feature module
--- a/autogpt_platform/backend/backend/api/features/workspace/routes.py
+++ b/autogpt_platform/backend/backend/api/features/workspace/routes.py
@@ -0,0 +1,122 @@
+"""
+Workspace API routes for managing user file storage.
+"""
+
+import logging
+import re
+from typing import Annotated
+from urllib.parse import quote
+
+import fastapi
+from autogpt_libs.auth.dependencies import get_user_id, requires_user
+from fastapi.responses import Response
+
+from backend.data.workspace import get_workspace, get_workspace_file
+from backend.util.workspace_storage import get_workspace_storage
+
+
+def _sanitize_filename_for_header(filename: str) -> str:
+    """
+    Sanitize filename for Content-Disposition header to prevent header injection.
+
+    Removes/replaces characters that could break the header or inject new headers.
+    Uses RFC5987 encoding for non-ASCII characters.
+    """
+    # Remove CR, LF, and null bytes (header injection prevention)
+    sanitized = re.sub(r"[\r\n\x00]", "", filename)
+    # ASCII names go in a quoted-string; others use the RFC5987 filename* form
+    try:
+        sanitized.encode("ascii")
+    except UnicodeEncodeError:
+        # Percent-encode the unescaped name; escaping quotes first would put
+        # a literal backslash into the encoded filename
+        encoded = quote(sanitized, safe="")
+        return f"attachment; filename*=UTF-8''{encoded}"
+    # Escape backslashes first, then quotes, so the quoted-string stays valid
+    escaped = sanitized.replace("\\", "\\\\").replace('"', '\\"')
+    return f'attachment; filename="{escaped}"'
+
+
+logger = logging.getLogger(__name__)
+
+router = fastapi.APIRouter(
+    dependencies=[fastapi.Security(requires_user)],
+)
+
+
+def _create_streaming_response(content: bytes, file) -> Response:
+    """Build a response from file content that is already loaded in memory."""
+    return Response(
+        content=content,
+        media_type=file.mimeType,
+        headers={
+            "Content-Disposition": _sanitize_filename_for_header(file.name),
+            "Content-Length": str(len(content)),
+        },
+    )
+
+
+async def _create_file_download_response(file) -> Response:
+    """
+    Create a download response for a workspace file.
+
+    Handles both local storage (direct streaming) and GCS (signed URL redirect
+    with fallback to streaming).
+    """
+    storage = await get_workspace_storage()
+
+    # For local storage, stream the file directly
+    if file.storagePath.startswith("local://"):
+        content = await storage.retrieve(file.storagePath)
+        return _create_streaming_response(content, file)
+
+    # For GCS, try to redirect to signed URL, fall back to streaming
+    try:
+        url = await storage.get_download_url(file.storagePath, expires_in=300)
+        # If we got back an API path (fallback), stream directly instead
+        if url.startswith("/api/"):
+            content = await storage.retrieve(file.storagePath)
+            return _create_streaming_response(content, file)
+        return fastapi.responses.RedirectResponse(url=url, status_code=302)
+    except Exception as e:
+        # Log the signed URL failure with context
+        logger.error(
+            f"Failed to get signed URL for file {file.id} "
+            f"(storagePath={file.storagePath}): {e}",
+            exc_info=True,
+        )
+        # Fall back to streaming directly from GCS
+        try:
+            content = await storage.retrieve(file.storagePath)
+            return _create_streaming_response(content, file)
+        except Exception as fallback_error:
+            logger.error(
+                f"Fallback streaming also failed for file {file.id} "
+                f"(storagePath={file.storagePath}): {fallback_error}",
+                exc_info=True,
+            )
+            raise
+
+
+@router.get(
+    "/files/{file_id}/download",
+    summary="Download file by ID",
+)
+async def download_file(
+    user_id: Annotated[str, fastapi.Security(get_user_id)],
+    file_id: str,
+) -> Response:
+    """
+    Download a file by its ID.
+
+    Returns the file content directly or redirects to a signed URL for GCS.
+    """
+    workspace = await get_workspace(user_id)
+    if workspace is None:
+        raise fastapi.HTTPException(status_code=404, detail="Workspace not found")
+
+    file = await get_workspace_file(file_id, workspace.id)
+    if file is None:
+        raise fastapi.HTTPException(status_code=404, detail="File not found")
+
+    return await _create_file_download_response(file)
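
The Content-Disposition handling in `routes.py` above can be tried in isolation. The following is a minimal sketch using only the standard library, not the PR's code: `content_disposition` is a hypothetical standalone name, and it differs from the diff in two deliberate ways, both assumptions made for illustration: it escapes backslashes as well as quotes, and it applies that escaping only on the quoted-string branch.

```python
import re
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Hypothetical standalone variant of the header sanitizer (illustration only)."""
    # Drop CR, LF and NUL so the value cannot end the header or start a new one.
    sanitized = re.sub(r"[\r\n\x00]", "", filename)
    try:
        sanitized.encode("ascii")
    except UnicodeEncodeError:
        # Non-ASCII names use the RFC 5987 extended parameter:
        # percent-encoded UTF-8, no surrounding quotes.
        return f"attachment; filename*=UTF-8''{quote(sanitized, safe='')}"
    # ASCII names go in a quoted-string; escape backslashes first, then quotes.
    escaped = sanitized.replace("\\", "\\\\").replace('"', '\\"')
    return f'attachment; filename="{escaped}"'


if __name__ == "__main__":
    for name in ("report.pdf", "a\r\nX-Evil: 1.txt", "résumé.pdf"):
        print(content_disposition(name))
```

Stripping CR/LF is what prevents header injection; the percent-encoding step is only about representing non-ASCII names safely.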
--- a/autogpt_platform/backend/backend/api/rest_api.py
+++ b/autogpt_platform/backend/backend/api/rest_api.py
@@ -18,7 +18,6 @@ from prisma.errors import PrismaError

 import backend.api.features.admin.credit_admin_routes
 import backend.api.features.admin.execution_analytics_routes
-import backend.api.features.admin.llm_routes
 import backend.api.features.admin.store_admin_routes
 import backend.api.features.builder
 import backend.api.features.builder.routes
@@ -33,16 +32,15 @@ import backend.api.features.postmark.postmark
 import backend.api.features.store.model
 import backend.api.features.store.routes
 import backend.api.features.v1
+import backend.api.features.workspace.routes as workspace_routes
 import backend.data.block
 import backend.data.db
 import backend.data.graph
 import backend.data.user
 import backend.integrations.webhooks.utils
-import backend.server.v2.llm.routes as public_llm_routes
 import backend.util.service
 import backend.util.settings
-from backend.data import llm_registry
-from backend.data.block_cost_config import refresh_llm_costs
+from backend.blocks.llm import DEFAULT_LLM_MODEL
 from backend.data.model import Credentials
 from backend.integrations.providers import ProviderName
 from backend.monitoring.instrumentation import instrument_fastapi
@@ -55,6 +53,7 @@ from backend.util.exceptions import (
 )
 from backend.util.feature_flag import initialize_launchdarkly, shutdown_launchdarkly
 from backend.util.service import UnhealthyServiceError
+from backend.util.workspace_storage import shutdown_workspace_storage

 from .external.fastapi_app import external_api
 from .features.analytics import router as analytics_router
@@ -112,27 +111,11 @@ async def lifespan_context(app: fastapi.FastAPI):

    AutoRegistry.patch_integrations()

-    # Refresh LLM registry before initializing blocks so blocks can use registry data
-    await llm_registry.refresh_llm_registry()
-    refresh_llm_costs()
-
-    # Clear block schema caches so they're regenerated with updated discriminator_mapping
-    from backend.data.block import BlockSchema
-
-    BlockSchema.clear_all_schema_caches()
-
    await backend.data.block.initialize_blocks()

    await backend.data.user.migrate_and_encrypt_user_integrations()
    await backend.data.graph.fix_llm_provider_credentials()
-    # migrate_llm_models uses registry default model
-    from backend.blocks.llm import LlmModel
-
-    default_model_slug = llm_registry.get_default_model_slug()
-    if default_model_slug:
-        await backend.data.graph.migrate_llm_models(LlmModel(default_model_slug))
-    else:
-        logger.warning("Skipping LLM model migration: no default model available")
+    await backend.data.graph.migrate_llm_models(DEFAULT_LLM_MODEL)
    await backend.integrations.webhooks.utils.migrate_legacy_triggered_graphs()

    with launch_darkly_context():
@@ -143,6 +126,11 @@ async def lifespan_context(app: fastapi.FastAPI):
    except Exception as e:
        logger.warning(f"Error shutting down cloud storage handler: {e}")

+    try:
+        await shutdown_workspace_storage()
+    except Exception as e:
+        logger.warning(f"Error shutting down workspace storage: {e}")
+
    await backend.data.db.disconnect()


@@ -317,16 +305,6 @@ app.include_router(
    tags=["v2", "executions", "review"],
    prefix="/api/review",
 )
-app.include_router(
-    backend.api.features.admin.llm_routes.router,
-    tags=["v2", "admin", "llm"],
-    prefix="/api/llm/admin",
-)
-app.include_router(
-    public_llm_routes.router,
-    tags=["v2", "llm"],
-    prefix="/api",
-)
 app.include_router(
    backend.api.features.library.routes.router, tags=["v2"], prefix="/api/library"
 )
@@ -344,6 +322,11 @@ app.include_router(
    tags=["v2", "chat"],
    prefix="/api/chat",
 )
+app.include_router(
+    workspace_routes.router,
+    tags=["workspace"],
+    prefix="/api/workspace",
+)
 app.include_router(
    backend.api.features.oauth.router,
    tags=["oauth"],
--- a/autogpt_platform/backend/backend/api/ws_api.py
+++ b/autogpt_platform/backend/backend/api/ws_api.py
@@ -77,39 +77,7 @@ async def event_broadcaster(manager: ConnectionManager):
                payload=notification.payload,
            )

-    async def registry_refresh_worker():
-        """Listen for LLM registry refresh notifications and broadcast to all clients."""
-        from backend.data.llm_registry import REGISTRY_REFRESH_CHANNEL
-        from backend.data.redis_client import connect_async
-
-        redis = await connect_async()
-        pubsub = redis.pubsub()
-        await pubsub.subscribe(REGISTRY_REFRESH_CHANNEL)
-        logger.info(
-            "Subscribed to LLM registry refresh notifications for WebSocket broadcast"
-        )
-
-        async for message in pubsub.listen():
-            if (
-                message["type"] == "message"
-                and message["channel"] == REGISTRY_REFRESH_CHANNEL
-            ):
-                logger.info(
-                    "Broadcasting LLM registry refresh to all WebSocket clients"
-                )
-                await manager.broadcast_to_all(
-                    method=WSMethod.NOTIFICATION,
-                    data={
-                        "type": "LLM_REGISTRY_REFRESH",
-                        "event": "registry_updated",
-                    },
-                )
-
-    await asyncio.gather(
-        execution_worker(),
-        notification_worker(),
-        registry_refresh_worker(),
-    )
+    await asyncio.gather(execution_worker(), notification_worker())


 async def authenticate_websocket(websocket: WebSocket) -> str:
--- a/autogpt_platform/backend/backend/blocks/ai_condition.py
+++ b/autogpt_platform/backend/backend/blocks/ai_condition.py
@@ -1,6 +1,7 @@
 from typing import Any

 from backend.blocks.llm import (
+    DEFAULT_LLM_MODEL,
    TEST_CREDENTIALS,
    TEST_CREDENTIALS_INPUT,
    AIBlockBase,
@@ -9,7 +10,6 @@ from backend.blocks.llm import (
    LlmModel,
    LLMResponse,
    llm_call,
-    llm_model_schema_extra,
 )
 from backend.data.block import (
    BlockCategory,
@@ -50,10 +50,9 @@ class AIConditionBlock(AIBlockBase):
        )
        model: LlmModel = SchemaField(
            title="LLM Model",
-            default_factory=LlmModel.default,
+            default=DEFAULT_LLM_MODEL,
            description="The language model to use for evaluating the condition.",
            advanced=False,
-            json_schema_extra=llm_model_schema_extra(),
        )
        credentials: AICredentials = AICredentialsField()

@@ -83,7 +82,7 @@ class AIConditionBlock(AIBlockBase):
                "condition": "the input is an email address",
                "yes_value": "Valid email",
                "no_value": "Not an email",
-                "model": "gpt-4o",  # Using string value - enum accepts any model slug dynamically
+                "model": DEFAULT_LLM_MODEL,
                "credentials": TEST_CREDENTIALS_INPUT,
            },
            test_credentials=TEST_CREDENTIALS,
--- a/autogpt_platform/backend/backend/blocks/ai_image_customizer.py
+++ b/autogpt_platform/backend/backend/blocks/ai_image_customizer.py
@@ -13,6 +13,7 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import (
    APIKeyCredentials,
    CredentialsField,
@@ -117,11 +118,13 @@ class AIImageCustomizerBlock(Block):
                "credentials": TEST_CREDENTIALS_INPUT,
            },
            test_output=[
-                ("image_url", "https://replicate.delivery/generated-image.jpg"),
+                # Output will be a workspace ref or data URI depending on context
+                ("image_url", lambda x: x.startswith(("workspace://", "data:"))),
            ],
            test_mock={
+                # Use data URI to avoid HTTP requests during tests
                "run_model": lambda *args, **kwargs: MediaFileType(
-                    "https://replicate.delivery/generated-image.jpg"
+                    "data:image/jpeg;base64,/9j/4AAQSkZJRgABAgAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCAABAAEDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD3+iiigD//2Q=="
                ),
            },
            test_credentials=TEST_CREDENTIALS,
@@ -132,8 +135,7 @@ class AIImageCustomizerBlock(Block):
        input_data: Input,
        *,
        credentials: APIKeyCredentials,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
        try:
@@ -141,10 +143,9 @@ class AIImageCustomizerBlock(Block):
            processed_images = await asyncio.gather(
                *(
                    store_media_file(
-                        graph_exec_id=graph_exec_id,
                        file=img,
-                        user_id=user_id,
-                        return_content=True,
+                        execution_context=execution_context,
+                        return_format="for_external_api",  # Get content for Replicate API
                    )
                    for img in input_data.images
                )
@@ -158,7 +159,14 @@ class AIImageCustomizerBlock(Block):
                aspect_ratio=input_data.aspect_ratio.value,
                output_format=input_data.output_format.value,
            )
-            yield "image_url", result
+
+            # Store the generated image to the user's workspace for persistence
+            stored_url = await store_media_file(
+                file=result,
+                execution_context=execution_context,
+                return_format="for_block_output",
+            )
+            yield "image_url", stored_url
        except Exception as e:
            yield "error", str(e)

--- a/autogpt_platform/backend/backend/blocks/ai_image_generator_block.py
+++ b/autogpt_platform/backend/backend/blocks/ai_image_generator_block.py
@@ -6,6 +6,7 @@ from replicate.client import Client as ReplicateClient
 from replicate.helpers import FileOutput

 from backend.data.block import Block, BlockCategory, BlockSchemaInput, BlockSchemaOutput
+from backend.data.execution import ExecutionContext
 from backend.data.model import (
    APIKeyCredentials,
    CredentialsField,
@@ -13,6 +14,8 @@ from backend.data.model import (
    SchemaField,
 )
 from backend.integrations.providers import ProviderName
+from backend.util.file import store_media_file
+from backend.util.type import MediaFileType


 class ImageSize(str, Enum):
@@ -165,11 +168,13 @@ class AIImageGeneratorBlock(Block):
            test_output=[
                (
                    "image_url",
-                    "https://replicate.delivery/generated-image.webp",
+                    # Test output is a data URI since we now store images
+                    lambda x: x.startswith("data:image/"),
                ),
            ],
            test_mock={
-                "_run_client": lambda *args, **kwargs: "https://replicate.delivery/generated-image.webp"
+                # Return a data URI directly so store_media_file doesn't need to download
+                "_run_client": lambda *args, **kwargs: "data:image/webp;base64,UklGRiQAAABXRUJQVlA4IBgAAAAwAQCdASoBAAEAAQAcJYgCdAEO"
            },
        )

@@ -318,11 +323,24 @@ class AIImageGeneratorBlock(Block):
        style_text = style_map.get(style, "")
        return f"{style_text} of" if style_text else ""

-    async def run(self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs):
+    async def run(
+        self,
+        input_data: Input,
+        *,
+        credentials: APIKeyCredentials,
+        execution_context: ExecutionContext,
+        **kwargs,
+    ):
        try:
            url = await self.generate_image(input_data, credentials)
            if url:
-                yield "image_url", url
+                # Store the generated image to the user's workspace/execution folder
+                stored_url = await store_media_file(
+                    file=MediaFileType(url),
+                    execution_context=execution_context,
+                    return_format="for_block_output",
+                )
+                yield "image_url", stored_url
            else:
                yield "error", "Image generation returned an empty result."
        except Exception as e:
--- a/autogpt_platform/backend/backend/blocks/ai_shortform_video_block.py
+++ b/autogpt_platform/backend/backend/blocks/ai_shortform_video_block.py
@@ -13,6 +13,7 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import (
    APIKeyCredentials,
    CredentialsField,
@@ -21,7 +22,9 @@ from backend.data.model import (
 )
 from backend.integrations.providers import ProviderName
 from backend.util.exceptions import BlockExecutionError
+from backend.util.file import store_media_file
 from backend.util.request import Requests
+from backend.util.type import MediaFileType

 TEST_CREDENTIALS = APIKeyCredentials(
    id="01234567-89ab-cdef-0123-456789abcdef",
@@ -271,7 +274,10 @@ class AIShortformVideoCreatorBlock(Block):
                "voice": Voice.LILY,
                "video_style": VisualMediaType.STOCK_VIDEOS,
            },
-            test_output=("video_url", "https://example.com/video.mp4"),
+            test_output=(
+                "video_url",
+                lambda x: x.startswith(("workspace://", "data:")),
+            ),
            test_mock={
                "create_webhook": lambda *args, **kwargs: (
                    "test_uuid",
@@ -280,15 +286,21 @@ class AIShortformVideoCreatorBlock(Block):
                "create_video": lambda *args, **kwargs: {"pid": "test_pid"},
                "check_video_status": lambda *args, **kwargs: {
                    "status": "ready",
-                    "videoUrl": "https://example.com/video.mp4",
+                    "videoUrl": "data:video/mp4;base64,AAAA",
                },
-                "wait_for_video": lambda *args, **kwargs: "https://example.com/video.mp4",
+                # Use data URI to avoid HTTP requests during tests
+                "wait_for_video": lambda *args, **kwargs: "data:video/mp4;base64,AAAA",
            },
            test_credentials=TEST_CREDENTIALS,
        )

    async def run(
-        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+        self,
+        input_data: Input,
+        *,
+        credentials: APIKeyCredentials,
+        execution_context: ExecutionContext,
+        **kwargs,
    ) -> BlockOutput:
        # Create a new Webhook.site URL
        webhook_token, webhook_url = await self.create_webhook()
@@ -340,7 +352,13 @@ class AIShortformVideoCreatorBlock(Block):
            )
            video_url = await self.wait_for_video(credentials.api_key, pid)
            logger.debug(f"Video ready: {video_url}")
-            yield "video_url", video_url
+            # Store the generated video to the user's workspace for persistence
+            stored_url = await store_media_file(
+                file=MediaFileType(video_url),
+                execution_context=execution_context,
+                return_format="for_block_output",
+            )
+            yield "video_url", stored_url


 class AIAdMakerVideoCreatorBlock(Block):
@@ -447,7 +465,10 @@ class AIAdMakerVideoCreatorBlock(Block):
                    "https://cdn.revid.ai/uploads/1747076315114-image.png",
                ],
            },
-            test_output=("video_url", "https://example.com/ad.mp4"),
+            test_output=(
+                "video_url",
+                lambda x: x.startswith(("workspace://", "data:")),
+            ),
            test_mock={
                "create_webhook": lambda *args, **kwargs: (
                    "test_uuid",
@@ -456,14 +477,21 @@ class AIAdMakerVideoCreatorBlock(Block):
                "create_video": lambda *args, **kwargs: {"pid": "test_pid"},
                "check_video_status": lambda *args, **kwargs: {
                    "status": "ready",
-                    "videoUrl": "https://example.com/ad.mp4",
+                    "videoUrl": "data:video/mp4;base64,AAAA",
                },
-                "wait_for_video": lambda *args, **kwargs: "https://example.com/ad.mp4",
+                "wait_for_video": lambda *args, **kwargs: "data:video/mp4;base64,AAAA",
            },
            test_credentials=TEST_CREDENTIALS,
        )

-    async def run(self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs):
+    async def run(
+        self,
+        input_data: Input,
+        *,
+        credentials: APIKeyCredentials,
+        execution_context: ExecutionContext,
+        **kwargs,
+    ):
        webhook_token, webhook_url = await self.create_webhook()

        payload = {
@@ -531,7 +559,13 @@ class AIAdMakerVideoCreatorBlock(Block):
            raise RuntimeError("Failed to create video: No project ID returned")

        video_url = await self.wait_for_video(credentials.api_key, pid)
-        yield "video_url", video_url
+        # Store the generated video to the user's workspace for persistence
+        stored_url = await store_media_file(
+            file=MediaFileType(video_url),
+            execution_context=execution_context,
+            return_format="for_block_output",
+        )
+        yield "video_url", stored_url


 class AIScreenshotToVideoAdBlock(Block):
@@ -626,7 +660,10 @@ class AIScreenshotToVideoAdBlock(Block):
                "script": "Amazing numbers!",
                "screenshot_url": "https://cdn.revid.ai/uploads/1747080376028-image.png",
            },
-            test_output=("video_url", "https://example.com/screenshot.mp4"),
+            test_output=(
+                "video_url",
+                lambda x: x.startswith(("workspace://", "data:")),
+            ),
            test_mock={
                "create_webhook": lambda *args, **kwargs: (
                    "test_uuid",
@@ -635,14 +672,21 @@ class AIScreenshotToVideoAdBlock(Block):
                "create_video": lambda *args, **kwargs: {"pid": "test_pid"},
                "check_video_status": lambda *args, **kwargs: {
                    "status": "ready",
-                    "videoUrl": "https://example.com/screenshot.mp4",
+                    "videoUrl": "data:video/mp4;base64,AAAA",
                },
-                "wait_for_video": lambda *args, **kwargs: "https://example.com/screenshot.mp4",
+                "wait_for_video": lambda *args, **kwargs: "data:video/mp4;base64,AAAA",
            },
            test_credentials=TEST_CREDENTIALS,
        )

-    async def run(self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs):
+    async def run(
+        self,
+        input_data: Input,
+        *,
+        credentials: APIKeyCredentials,
+        execution_context: ExecutionContext,
+        **kwargs,
+    ):
        webhook_token, webhook_url = await self.create_webhook()

        payload = {
@@ -710,4 +754,10 @@ class AIScreenshotToVideoAdBlock(Block):
            raise RuntimeError("Failed to create video: No project ID returned")

        video_url = await self.wait_for_video(credentials.api_key, pid)
-        yield "video_url", video_url
+        # Store the generated video to the user's workspace for persistence
+        stored_url = await store_media_file(
+            file=MediaFileType(video_url),
+            execution_context=execution_context,
+            return_format="for_block_output",
+        )
+        yield "video_url", stored_url
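
Several `test_output` entries in this diff replace a literal URL with a predicate such as `lambda x: x.startswith(("workspace://", "data:"))`. As a rough sketch of how a test harness could treat both forms, assuming a hypothetical `matches` helper (the platform's real comparison logic is not shown in this diff):

```python
from typing import Any


def matches(expected: Any, actual: Any) -> bool:
    """Hypothetical helper: an expected value is either a literal or a predicate."""
    if callable(expected):
        # Predicate form: the test passes if the callable accepts the output.
        return bool(expected(actual))
    # Literal form: plain equality.
    return expected == actual


def is_stored_ref(value: str) -> bool:
    # str.startswith accepts a tuple of prefixes and matches any of them.
    return value.startswith(("workspace://", "data:"))


if __name__ == "__main__":
    print(matches(is_stored_ref, "data:video/mp4;base64,AAAA"))
    print(matches(is_stored_ref, "https://example.com/video.mp4"))
```

A predicate keeps the test valid whether the block returns a workspace reference or a data URI, which depends on the execution context.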
--- a/autogpt_platform/backend/backend/blocks/bannerbear/text_overlay.py
+++ b/autogpt_platform/backend/backend/blocks/bannerbear/text_overlay.py
@@ -6,6 +6,7 @@ if TYPE_CHECKING:

 from pydantic import SecretStr

+from backend.data.execution import ExecutionContext
 from backend.sdk import (
    APIKeyCredentials,
    Block,
@@ -17,6 +18,8 @@ from backend.sdk import (
    Requests,
    SchemaField,
 )
+from backend.util.file import store_media_file
+from backend.util.type import MediaFileType

 from ._config import bannerbear

@@ -135,15 +138,17 @@ class BannerbearTextOverlayBlock(Block):
            },
            test_output=[
                ("success", True),
-                ("image_url", "https://cdn.bannerbear.com/test-image.jpg"),
+                # Output will be a workspace ref or data URI depending on context
+                ("image_url", lambda x: x.startswith(("workspace://", "data:"))),
                ("uid", "test-uid-123"),
                ("status", "completed"),
            ],
            test_mock={
+                # Use data URI to avoid HTTP requests during tests
                "_make_api_request": lambda *args, **kwargs: {
                    "uid": "test-uid-123",
                    "status": "completed",
-                    "image_url": "https://cdn.bannerbear.com/test-image.jpg",
+                    "image_url": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/wAALCAABAAEBAREA/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/9oACAEBAAA/APn+v//Z",
                }
            },
            test_credentials=TEST_CREDENTIALS,
@@ -177,7 +182,12 @@ class BannerbearTextOverlayBlock(Block):
            raise Exception(error_msg)

    async def run(
-        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+        self,
+        input_data: Input,
+        *,
+        credentials: APIKeyCredentials,
+        execution_context: ExecutionContext,
+        **kwargs,
    ) -> BlockOutput:
        # Build the modifications array
        modifications = []
@@ -234,6 +244,18 @@ class BannerbearTextOverlayBlock(Block):

        # Synchronous request - image should be ready
        yield "success", True
-        yield "image_url", data.get("image_url", "")
+
+        # Store the generated image to workspace for persistence
+        image_url = data.get("image_url", "")
+        if image_url:
+            stored_url = await store_media_file(
+                file=MediaFileType(image_url),
+                execution_context=execution_context,
+                return_format="for_block_output",
+            )
+            yield "image_url", stored_url
+        else:
+            yield "image_url", ""
+
        yield "uid", data.get("uid", "")
        yield "status", data.get("status", "completed")
--- a/autogpt_platform/backend/backend/blocks/basic.py
+++ b/autogpt_platform/backend/backend/blocks/basic.py
@@ -9,6 +9,7 @@ from backend.data.block import (
    BlockSchemaOutput,
    BlockType,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import SchemaField
 from backend.util.file import store_media_file
 from backend.util.type import MediaFileType, convert
@@ -17,10 +18,10 @@ from backend.util.type import MediaFileType, convert
 class FileStoreBlock(Block):
    class Input(BlockSchemaInput):
        file_in: MediaFileType = SchemaField(
-            description="The file to store in the temporary directory, it can be a URL, data URI, or local path."
+            description="The file to download and store. Can be a URL (https://...), data URI, or local path."
        )
        base_64: bool = SchemaField(
-            description="Whether produce an output in base64 format (not recommended, you can pass the string path just fine accross blocks).",
+            description="Whether to produce output in base64 format (not recommended; you can pass the file reference across blocks).",
            default=False,
            advanced=True,
            title="Produce Base64 Output",
@@ -28,13 +29,18 @@ class FileStoreBlock(Block):

    class Output(BlockSchemaOutput):
        file_out: MediaFileType = SchemaField(
-            description="The relative path to the stored file in the temporary directory."
+            description="Reference to the stored file. In CoPilot: workspace:// URI (visible in list_workspace_files). In graphs: data URI for passing to other blocks."
        )

    def __init__(self):
        super().__init__(
            id="cbb50872-625b-42f0-8203-a2ae78242d8a",
-            description="Stores the input file in the temporary directory.",
+            description=(
+                "Downloads and stores a file from a URL, data URI, or local path. "
+                "Use this to fetch images, documents, or other files for processing. "
+                "In CoPilot: saves to workspace (use list_workspace_files to see it). "
+                "In graphs: outputs a data URI to pass to other blocks."
+            ),
            categories={BlockCategory.BASIC, BlockCategory.MULTIMEDIA},
            input_schema=FileStoreBlock.Input,
            output_schema=FileStoreBlock.Output,
@@ -45,15 +51,18 @@ class FileStoreBlock(Block):
        self,
        input_data: Input,
        *,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
+        # Determine return format based on user preference
+        # for_external_api: always returns data URI (base64) - honors "Produce Base64 Output"
+        # for_block_output: smart format - workspace:// in CoPilot, data URI in graphs
+        return_format = "for_external_api" if input_data.base_64 else "for_block_output"
+
        yield "file_out", await store_media_file(
-            graph_exec_id=graph_exec_id,
            file=input_data.file_in,
-            user_id=user_id,
-            return_content=input_data.base_64,
+            execution_context=execution_context,
+            return_format=return_format,
        )


--- a/autogpt_platform/backend/backend/blocks/discord/bot_blocks.py
+++ b/autogpt_platform/backend/backend/blocks/discord/bot_blocks.py
@@ -15,6 +15,7 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import APIKeyCredentials, SchemaField
 from backend.util.file import store_media_file
 from backend.util.request import Requests
@@ -666,8 +667,7 @@ class SendDiscordFileBlock(Block):
        file: MediaFileType,
        filename: str,
        message_content: str,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
    ) -> dict:
        intents = discord.Intents.default()
        intents.guilds = True
@@ -731,10 +731,9 @@ class SendDiscordFileBlock(Block):
                    # Local file path - read from stored media file
                    # This would be a path from a previous block's output
                    stored_file = await store_media_file(
-                        graph_exec_id=graph_exec_id,
                        file=file,
-                        user_id=user_id,
-                        return_content=True,  # Get as data URI
+                        execution_context=execution_context,
+                        return_format="for_external_api",  # Get content to send to Discord
                    )
                    # Now process as data URI
                    header, encoded = stored_file.split(",", 1)
@@ -781,8 +780,7 @@ class SendDiscordFileBlock(Block):
        input_data: Input,
        *,
        credentials: APIKeyCredentials,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
        try:
@@ -793,8 +791,7 @@ class SendDiscordFileBlock(Block):
                file=input_data.file,
                filename=input_data.filename,
                message_content=input_data.message_content,
-                graph_exec_id=graph_exec_id,
-                user_id=user_id,
+                execution_context=execution_context,
            )

            yield "status", result.get("status", "Unknown error")
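
The Discord block above requests `return_format="for_external_api"` and then splits the result on the first comma, which implies the value is a base64 data URI. A minimal sketch of that decoding step, assuming a hypothetical `decode_data_uri` helper that handles only base64-encoded data URIs:

```python
import base64
from typing import Tuple


def decode_data_uri(uri: str) -> Tuple[str, bytes]:
    """Hypothetical helper: split a base64 data URI into (mime type, raw bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    # Everything before the first comma is metadata; the rest is the payload.
    header, encoded = uri.split(",", 1)
    parts = header[len("data:"):].split(";")
    if parts[-1] != "base64":
        raise ValueError("only base64 data URIs are handled in this sketch")
    mime = parts[0] or "text/plain"  # the data URI default when no type is given
    return mime, base64.b64decode(encoded)


if __name__ == "__main__":
    mime, payload = decode_data_uri("data:video/mp4;base64,AAAA")
    print(mime, len(payload))
```

The tiny `data:video/mp4;base64,AAAA` value used in the test mocks decodes to three zero bytes, enough to exercise the storage path without any HTTP request.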
--- a/autogpt_platform/backend/backend/blocks/fal/ai_video_generator.py
+++ b/autogpt_platform/backend/backend/blocks/fal/ai_video_generator.py
@@ -17,8 +17,11 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import SchemaField
+from backend.util.file import store_media_file
 from backend.util.request import ClientResponseError, Requests
+from backend.util.type import MediaFileType

 logger = logging.getLogger(__name__)

@@ -64,9 +67,13 @@ class AIVideoGeneratorBlock(Block):
                "credentials": TEST_CREDENTIALS_INPUT,
            },
            test_credentials=TEST_CREDENTIALS,
-            test_output=[("video_url", "https://fal.media/files/example/video.mp4")],
+            test_output=[
+                # Output will be a workspace ref or data URI depending on context
+                ("video_url", lambda x: x.startswith(("workspace://", "data:"))),
+            ],
            test_mock={
-                "generate_video": lambda *args, **kwargs: "https://fal.media/files/example/video.mp4"
+                # Use data URI to avoid HTTP requests during tests
+                "generate_video": lambda *args, **kwargs: "data:video/mp4;base64,AAAA"
            },
        )

@@ -208,11 +215,22 @@ class AIVideoGeneratorBlock(Block):
            raise RuntimeError(f"API request failed: {str(e)}")

    async def run(
-        self, input_data: Input, *, credentials: FalCredentials, **kwargs
+        self,
+        input_data: Input,
+        *,
+        credentials: FalCredentials,
+        execution_context: ExecutionContext,
+        **kwargs,
    ) -> BlockOutput:
        try:
            video_url = await self.generate_video(input_data, credentials)
-            yield "video_url", video_url
+            # Store the generated video to the user's workspace for persistence
+            stored_url = await store_media_file(
+                file=MediaFileType(video_url),
+                execution_context=execution_context,
+                return_format="for_block_output",
+            )
+            yield "video_url", stored_url
        except Exception as e:
            error_message = str(e)
            yield "error", error_message
--- a/autogpt_platform/backend/backend/blocks/flux_kontext.py
+++ b/autogpt_platform/backend/backend/blocks/flux_kontext.py
@@ -12,6 +12,7 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import (
    APIKeyCredentials,
    CredentialsField,
@@ -121,10 +122,12 @@ class AIImageEditorBlock(Block):
                "credentials": TEST_CREDENTIALS_INPUT,
            },
            test_output=[
-                ("output_image", "https://replicate.com/output/edited-image.png"),
+                # Output will be a workspace ref or data URI depending on context
+                ("output_image", lambda x: x.startswith(("workspace://", "data:"))),
            ],
            test_mock={
-                "run_model": lambda *args, **kwargs: "https://replicate.com/output/edited-image.png",
+                # Use data URI to avoid HTTP requests during tests
+                "run_model": lambda *args, **kwargs: "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==",
            },
            test_credentials=TEST_CREDENTIALS,
        )
@@ -134,8 +137,7 @@ class AIImageEditorBlock(Block):
        input_data: Input,
        *,
        credentials: APIKeyCredentials,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
        result = await self.run_model(
@@ -144,20 +146,25 @@ class AIImageEditorBlock(Block):
            prompt=input_data.prompt,
            input_image_b64=(
                await store_media_file(
-                    graph_exec_id=graph_exec_id,
                    file=input_data.input_image,
-                    user_id=user_id,
-                    return_content=True,
+                    execution_context=execution_context,
+                    return_format="for_external_api",  # Get content for Replicate API
                )
                if input_data.input_image
                else None
            ),
            aspect_ratio=input_data.aspect_ratio.value,
            seed=input_data.seed,
-            user_id=user_id,
-            graph_exec_id=graph_exec_id,
+            user_id=execution_context.user_id or "",
+            graph_exec_id=execution_context.graph_exec_id or "",
        )
-        yield "output_image", result
+        # Store the generated image to the user's workspace for persistence
+        stored_url = await store_media_file(
+            file=result,
+            execution_context=execution_context,
+            return_format="for_block_output",
+        )
+        yield "output_image", stored_url

    async def run_model(
        self,
--- a/autogpt_platform/backend/backend/blocks/google/gmail.py
+++ b/autogpt_platform/backend/backend/blocks/google/gmail.py
@@ -21,6 +21,7 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import SchemaField
 from backend.util.file import MediaFileType, get_exec_file_path, store_media_file
 from backend.util.settings import Settings
@@ -95,8 +96,7 @@ def _make_mime_text(

 async def create_mime_message(
    input_data,
-    graph_exec_id: str,
-    user_id: str,
+    execution_context: ExecutionContext,
 ) -> str:
    """Create a MIME message with attachments and return base64-encoded raw message."""

@@ -117,12 +117,12 @@ async def create_mime_message(
    if input_data.attachments:
        for attach in input_data.attachments:
            local_path = await store_media_file(
-                user_id=user_id,
-                graph_exec_id=graph_exec_id,
                file=attach,
-                return_content=False,
+                execution_context=execution_context,
+                return_format="for_local_processing",
            )
-            abs_path = get_exec_file_path(graph_exec_id, local_path)
+            assert execution_context.graph_exec_id  # Validated by store_media_file
+            abs_path = get_exec_file_path(execution_context.graph_exec_id, local_path)
            part = MIMEBase("application", "octet-stream")
            with open(abs_path, "rb") as f:
                part.set_payload(f.read())
@@ -582,27 +582,25 @@ class GmailSendBlock(GmailBase):
        input_data: Input,
        *,
        credentials: GoogleCredentials,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
        service = self._build_service(credentials, **kwargs)
        result = await self._send_email(
            service,
            input_data,
-            graph_exec_id,
-            user_id,
+            execution_context,
        )
        yield "result", result

    async def _send_email(
-        self, service, input_data: Input, graph_exec_id: str, user_id: str
+        self, service, input_data: Input, execution_context: ExecutionContext
    ) -> dict:
        if not input_data.to or not input_data.subject or not input_data.body:
            raise ValueError(
                "At least one recipient, subject, and body are required for sending an email"
            )
-        raw_message = await create_mime_message(input_data, graph_exec_id, user_id)
+        raw_message = await create_mime_message(input_data, execution_context)
        sent_message = await asyncio.to_thread(
            lambda: service.users()
            .messages()
@@ -692,30 +690,28 @@ class GmailCreateDraftBlock(GmailBase):
        input_data: Input,
        *,
        credentials: GoogleCredentials,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
        service = self._build_service(credentials, **kwargs)
        result = await self._create_draft(
            service,
            input_data,
-            graph_exec_id,
-            user_id,
+            execution_context,
        )
        yield "result", GmailDraftResult(
            id=result["id"], message_id=result["message"]["id"], status="draft_created"
        )

    async def _create_draft(
-        self, service, input_data: Input, graph_exec_id: str, user_id: str
+        self, service, input_data: Input, execution_context: ExecutionContext
    ) -> dict:
        if not input_data.to or not input_data.subject:
            raise ValueError(
                "At least one recipient and subject are required for creating a draft"
            )

-        raw_message = await create_mime_message(input_data, graph_exec_id, user_id)
+        raw_message = await create_mime_message(input_data, execution_context)
        draft = await asyncio.to_thread(
            lambda: service.users()
            .drafts()
@@ -1100,7 +1096,7 @@ class GmailGetThreadBlock(GmailBase):


 async def _build_reply_message(
-    service, input_data, graph_exec_id: str, user_id: str
+    service, input_data, execution_context: ExecutionContext
 ) -> tuple[str, str]:
    """
    Builds a reply MIME message for Gmail threads.
@@ -1190,12 +1186,12 @@ async def _build_reply_message(
    # Handle attachments
    for attach in input_data.attachments:
        local_path = await store_media_file(
-            user_id=user_id,
-            graph_exec_id=graph_exec_id,
            file=attach,
-            return_content=False,
+            execution_context=execution_context,
+            return_format="for_local_processing",
        )
-        abs_path = get_exec_file_path(graph_exec_id, local_path)
+        assert execution_context.graph_exec_id  # Validated by store_media_file
+        abs_path = get_exec_file_path(execution_context.graph_exec_id, local_path)
        part = MIMEBase("application", "octet-stream")
        with open(abs_path, "rb") as f:
            part.set_payload(f.read())
@@ -1311,16 +1307,14 @@ class GmailReplyBlock(GmailBase):
        input_data: Input,
        *,
        credentials: GoogleCredentials,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
        service = self._build_service(credentials, **kwargs)
        message = await self._reply(
            service,
            input_data,
-            graph_exec_id,
-            user_id,
+            execution_context,
        )
        yield "messageId", message["id"]
        yield "threadId", message.get("threadId", input_data.threadId)
@@ -1343,11 +1337,11 @@ class GmailReplyBlock(GmailBase):
        yield "email", email

    async def _reply(
-        self, service, input_data: Input, graph_exec_id: str, user_id: str
+        self, service, input_data: Input, execution_context: ExecutionContext
    ) -> dict:
        # Build the reply message using the shared helper
        raw, thread_id = await _build_reply_message(
-            service, input_data, graph_exec_id, user_id
+            service, input_data, execution_context
        )

        # Send the message
@@ -1441,16 +1435,14 @@ class GmailDraftReplyBlock(GmailBase):
        input_data: Input,
        *,
        credentials: GoogleCredentials,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
        service = self._build_service(credentials, **kwargs)
        draft = await self._create_draft_reply(
            service,
            input_data,
-            graph_exec_id,
-            user_id,
+            execution_context,
        )
        yield "draftId", draft["id"]
        yield "messageId", draft["message"]["id"]
@@ -1458,11 +1450,11 @@ class GmailDraftReplyBlock(GmailBase):
        yield "status", "draft_created"

    async def _create_draft_reply(
-        self, service, input_data: Input, graph_exec_id: str, user_id: str
+        self, service, input_data: Input, execution_context: ExecutionContext
    ) -> dict:
        # Build the reply message using the shared helper
        raw, thread_id = await _build_reply_message(
-            service, input_data, graph_exec_id, user_id
+            service, input_data, execution_context
        )

        # Create draft with proper thread association
@@ -1629,23 +1621,21 @@ class GmailForwardBlock(GmailBase):
        input_data: Input,
        *,
        credentials: GoogleCredentials,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
        service = self._build_service(credentials, **kwargs)
        result = await self._forward_message(
            service,
            input_data,
-            graph_exec_id,
-            user_id,
+            execution_context,
        )
        yield "messageId", result["id"]
        yield "threadId", result.get("threadId", "")
        yield "status", "forwarded"

    async def _forward_message(
-        self, service, input_data: Input, graph_exec_id: str, user_id: str
+        self, service, input_data: Input, execution_context: ExecutionContext
    ) -> dict:
        if not input_data.to:
            raise ValueError("At least one recipient is required for forwarding")
@@ -1727,12 +1717,12 @@ To: {original_to}
        # Add any additional attachments
        for attach in input_data.additionalAttachments:
            local_path = await store_media_file(
-                user_id=user_id,
-                graph_exec_id=graph_exec_id,
                file=attach,
-                return_content=False,
+                execution_context=execution_context,
+                return_format="for_local_processing",
            )
-            abs_path = get_exec_file_path(graph_exec_id, local_path)
+            assert execution_context.graph_exec_id  # Validated by store_media_file
+            abs_path = get_exec_file_path(execution_context.graph_exec_id, local_path)
            part = MIMEBase("application", "octet-stream")
            with open(abs_path, "rb") as f:
                part.set_payload(f.read())
--- a/autogpt_platform/backend/backend/blocks/http.py
+++ b/autogpt_platform/backend/backend/blocks/http.py
@@ -15,6 +15,7 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import (
    CredentialsField,
    CredentialsMetaInput,
@@ -116,10 +117,9 @@ class SendWebRequestBlock(Block):

    @staticmethod
    async def _prepare_files(
-        graph_exec_id: str,
+        execution_context: ExecutionContext,
        files_name: str,
        files: list[MediaFileType],
-        user_id: str,
    ) -> list[tuple[str, tuple[str, BytesIO, str]]]:
        """
        Prepare files for the request by storing them and reading their content.
@@ -127,11 +127,16 @@ class SendWebRequestBlock(Block):
        (files_name, (filename, BytesIO, mime_type))
        """
        files_payload: list[tuple[str, tuple[str, BytesIO, str]]] = []
+        graph_exec_id = execution_context.graph_exec_id
+        if graph_exec_id is None:
+            raise ValueError("graph_exec_id is required for file operations")

        for media in files:
            # Normalise to a list so we can repeat the same key
            rel_path = await store_media_file(
-                graph_exec_id, media, user_id, return_content=False
+                file=media,
+                execution_context=execution_context,
+                return_format="for_local_processing",
            )
            abs_path = get_exec_file_path(graph_exec_id, rel_path)
            async with aiofiles.open(abs_path, "rb") as f:
@@ -143,7 +148,7 @@ class SendWebRequestBlock(Block):
        return files_payload

    async def run(
-        self, input_data: Input, *, graph_exec_id: str, user_id: str, **kwargs
+        self, input_data: Input, *, execution_context: ExecutionContext, **kwargs
    ) -> BlockOutput:
        # ─── Parse/normalise body ────────────────────────────────────
        body = input_data.body
@@ -174,7 +179,7 @@ class SendWebRequestBlock(Block):
        files_payload: list[tuple[str, tuple[str, BytesIO, str]]] = []
        if use_files:
            files_payload = await self._prepare_files(
-                graph_exec_id, input_data.files_name, input_data.files, user_id
+                execution_context, input_data.files_name, input_data.files
            )

        # Enforce body format rules
@@ -238,9 +243,8 @@ class SendAuthenticatedWebRequestBlock(SendWebRequestBlock):
        self,
        input_data: Input,
        *,
-        graph_exec_id: str,
+        execution_context: ExecutionContext,
        credentials: HostScopedCredentials,
-        user_id: str,
        **kwargs,
    ) -> BlockOutput:
        # Create SendWebRequestBlock.Input from our input (removing credentials field)
@@ -271,6 +275,6 @@ class SendAuthenticatedWebRequestBlock(SendWebRequestBlock):

        # Use parent class run method
        async for output_name, output_data in super().run(
-            base_input, graph_exec_id=graph_exec_id, user_id=user_id, **kwargs
+            base_input, execution_context=execution_context, **kwargs
        ):
            yield output_name, output_data
--- a/autogpt_platform/backend/backend/blocks/io.py
+++ b/autogpt_platform/backend/backend/blocks/io.py
@@ -12,6 +12,7 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockType,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import SchemaField
 from backend.util.file import store_media_file
 from backend.util.mock import MockObject
@@ -462,18 +463,21 @@ class AgentFileInputBlock(AgentInputBlock):
        self,
        input_data: Input,
        *,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
        if not input_data.value:
            return

+        # Determine return format based on user preference
+        # for_external_api: always returns data URI (base64) - honors "Produce Base64 Output"
+        # for_block_output: smart format - workspace:// in CoPilot, data URI in graphs
+        return_format = "for_external_api" if input_data.base_64 else "for_block_output"
+
        yield "result", await store_media_file(
-            graph_exec_id=graph_exec_id,
            file=input_data.value,
-            user_id=user_id,
-            return_content=input_data.base_64,
+            execution_context=execution_context,
+            return_format=return_format,
        )


--- a/autogpt_platform/backend/backend/blocks/llm.py
+++ b/autogpt_platform/backend/backend/blocks/llm.py
@@ -4,19 +4,17 @@ import logging
 import re
 import secrets
 from abc import ABC
-from enum import Enum
+from enum import Enum, EnumMeta
 from json import JSONDecodeError
-from typing import Any, Iterable, List, Literal, Optional
+from typing import Any, Iterable, List, Literal, NamedTuple, Optional

 import anthropic
 import ollama
 import openai
 from anthropic.types import ToolParam
 from groq import AsyncGroq
-from pydantic import BaseModel, GetCoreSchemaHandler, SecretStr
-from pydantic_core import CoreSchema, core_schema
+from pydantic import BaseModel, SecretStr

-from backend.data import llm_registry
 from backend.data.block import (
    Block,
    BlockCategory,
@@ -24,7 +22,6 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
-from backend.data.llm_registry import ModelMetadata
 from backend.data.model import (
    APIKeyCredentials,
    CredentialsField,
@@ -69,123 +66,114 @@ TEST_CREDENTIALS_INPUT = {


 def AICredentialsField() -> AICredentials:
-    """
-    Returns a CredentialsField for LLM providers.
-    The discriminator_mapping will be refreshed when the schema is generated
-    if it's empty, ensuring the LLM registry is loaded.
-    """
-    # Get the mapping now - it may be empty initially, but will be refreshed
-    # when the schema is generated via CredentialsMetaInput._add_json_schema_extra
-    mapping = llm_registry.get_llm_discriminator_mapping()
-
    return CredentialsField(
        description="API key for the LLM provider.",
        discriminator="model",
-        discriminator_mapping=mapping,  # May be empty initially, refreshed later
+        discriminator_mapping={
+            model.value: model.metadata.provider for model in LlmModel
+        },
    )


-def llm_model_schema_extra() -> dict[str, Any]:
-    return {"options": llm_registry.get_llm_model_schema_options()}
+class ModelMetadata(NamedTuple):
+    provider: str
+    context_window: int
+    max_output_tokens: int | None
+    display_name: str
+    provider_name: str
+    creator_name: str
+    price_tier: Literal[1, 2, 3]


-class LlmModelMeta(type):
-    """
-    Metaclass for LlmModel that enables attribute-style access to dynamic models.
-
-    This allows code like `LlmModel.GPT4O` to work by converting the attribute
-    name to a slug format:
-    - GPT4O -> gpt-4o
-    - GPT4O_MINI -> gpt-4o-mini
-    - CLAUDE_3_5_SONNET -> claude-3-5-sonnet
-    """
-
-    def __getattr__(cls, name: str):
-        # Don't intercept private/dunder attributes
-        if name.startswith("_"):
-            raise AttributeError(f"type object 'LlmModel' has no attribute '{name}'")
-
-        # Convert attribute name to slug format:
-        # 1. Lowercase: GPT4O -> gpt4o
-        # 2. Underscores to hyphens: GPT4O_MINI -> gpt4o-mini
-        slug = name.lower().replace("_", "-")
-
-        # Check for exact match in registry first (e.g., "o1" stays "o1")
-        registry_slugs = llm_registry.get_dynamic_model_slugs()
-        if slug in registry_slugs:
-            return cls(slug)
-
-        # If no exact match, try inserting hyphen between letter and digit
-        # e.g., gpt4o -> gpt-4o
-        transformed_slug = re.sub(r"([a-z])(\d)", r"\1-\2", slug)
-        return cls(transformed_slug)
-
-    def __iter__(cls):
-        """Iterate over all models from the registry.
-
-        Yields LlmModel instances for each model in the dynamic registry.
-        Used by __get_pydantic_json_schema__ to build model metadata.
-        """
-        for model in llm_registry.iter_dynamic_models():
-            yield cls(model.slug)
+class LlmModelMeta(EnumMeta):
+    pass


-class LlmModel(str, metaclass=LlmModelMeta):
-    """
-    Dynamic LLM model type that accepts any model slug from the registry.
-
-    This is a string subclass (not an Enum) that allows any model slug value.
-    All models are managed via the LLM Registry in the database.
-
-    Usage:
-        model = LlmModel("gpt-4o")  # Direct construction
-        model = LlmModel.GPT4O      # Attribute access (converted to "gpt-4o")
-        model.value                  # Returns the slug string
-        model.provider               # Returns the provider from registry
-    """
-
-    def __new__(cls, value: str):
-        if isinstance(value, LlmModel):
-            return value
-        return str.__new__(cls, value)
-
-    @classmethod
-    def __get_pydantic_core_schema__(
-        cls, source_type: Any, handler: GetCoreSchemaHandler
-    ) -> CoreSchema:
-        """
-        Tell Pydantic how to validate LlmModel.
-
-        Accepts strings and converts them to LlmModel instances.
-        """
-        return core_schema.no_info_after_validator_function(
-            cls,  # The validator function (LlmModel constructor)
-            core_schema.str_schema(),  # Accept string input
-            serialization=core_schema.to_string_ser_schema(),  # Serialize as string
-        )
-
-    @property
-    def value(self) -> str:
-        """Return the model slug (for compatibility with enum-style access)."""
-        return str(self)
-
-    @classmethod
-    def default(cls) -> "LlmModel":
-        """
-        Get the default model from the registry.
-
-        Returns the recommended model if set, otherwise gpt-4o if available
-        and enabled, otherwise the first enabled model from the registry.
-        Falls back to "gpt-4o" if registry is empty (e.g., at module import time).
-        """
-        from backend.data.llm_registry import get_default_model_slug
-
-        slug = get_default_model_slug()
-        if slug is None:
-            # Registry is empty (e.g., at module import time before DB connection).
-            # Fall back to gpt-4o for backward compatibility.
-            slug = "gpt-4o"
-        return cls(slug)
+class LlmModel(str, Enum, metaclass=LlmModelMeta):
+    # OpenAI models
+    O3_MINI = "o3-mini"
+    O3 = "o3-2025-04-16"
+    O1 = "o1"
+    O1_MINI = "o1-mini"
+    # GPT-5 models
+    GPT5_2 = "gpt-5.2-2025-12-11"
+    GPT5_1 = "gpt-5.1-2025-11-13"
+    GPT5 = "gpt-5-2025-08-07"
+    GPT5_MINI = "gpt-5-mini-2025-08-07"
+    GPT5_NANO = "gpt-5-nano-2025-08-07"
+    GPT5_CHAT = "gpt-5-chat-latest"
+    GPT41 = "gpt-4.1-2025-04-14"
+    GPT41_MINI = "gpt-4.1-mini-2025-04-14"
+    GPT4O_MINI = "gpt-4o-mini"
+    GPT4O = "gpt-4o"
+    GPT4_TURBO = "gpt-4-turbo"
+    GPT3_5_TURBO = "gpt-3.5-turbo"
+    # Anthropic models
+    CLAUDE_4_1_OPUS = "claude-opus-4-1-20250805"
+    CLAUDE_4_OPUS = "claude-opus-4-20250514"
+    CLAUDE_4_SONNET = "claude-sonnet-4-20250514"
+    CLAUDE_4_5_OPUS = "claude-opus-4-5-20251101"
+    CLAUDE_4_5_SONNET = "claude-sonnet-4-5-20250929"
+    CLAUDE_4_5_HAIKU = "claude-haiku-4-5-20251001"
+    CLAUDE_3_7_SONNET = "claude-3-7-sonnet-20250219"
+    CLAUDE_3_HAIKU = "claude-3-haiku-20240307"
+    # AI/ML API models
+    AIML_API_QWEN2_5_72B = "Qwen/Qwen2.5-72B-Instruct-Turbo"
+    AIML_API_LLAMA3_1_70B = "nvidia/llama-3.1-nemotron-70b-instruct"
+    AIML_API_LLAMA3_3_70B = "meta-llama/Llama-3.3-70B-Instruct-Turbo"
+    AIML_API_META_LLAMA_3_1_70B = "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo"
+    AIML_API_LLAMA_3_2_3B = "meta-llama/Llama-3.2-3B-Instruct-Turbo"
+    # Groq models
+    LLAMA3_3_70B = "llama-3.3-70b-versatile"
+    LLAMA3_1_8B = "llama-3.1-8b-instant"
+    # Ollama models
+    OLLAMA_LLAMA3_3 = "llama3.3"
+    OLLAMA_LLAMA3_2 = "llama3.2"
+    OLLAMA_LLAMA3_8B = "llama3"
+    OLLAMA_LLAMA3_405B = "llama3.1:405b"
+    OLLAMA_DOLPHIN = "dolphin-mistral:latest"
+    # OpenRouter models
+    OPENAI_GPT_OSS_120B = "openai/gpt-oss-120b"
+    OPENAI_GPT_OSS_20B = "openai/gpt-oss-20b"
+    GEMINI_2_5_PRO = "google/gemini-2.5-pro-preview-03-25"
+    GEMINI_3_PRO_PREVIEW = "google/gemini-3-pro-preview"
+    GEMINI_2_5_FLASH = "google/gemini-2.5-flash"
+    GEMINI_2_0_FLASH = "google/gemini-2.0-flash-001"
+    GEMINI_2_5_FLASH_LITE_PREVIEW = "google/gemini-2.5-flash-lite-preview-06-17"
+    GEMINI_2_0_FLASH_LITE = "google/gemini-2.0-flash-lite-001"
+    MISTRAL_NEMO = "mistralai/mistral-nemo"
+    COHERE_COMMAND_R_08_2024 = "cohere/command-r-08-2024"
+    COHERE_COMMAND_R_PLUS_08_2024 = "cohere/command-r-plus-08-2024"
+    DEEPSEEK_CHAT = "deepseek/deepseek-chat"  # Actually: DeepSeek V3
+    DEEPSEEK_R1_0528 = "deepseek/deepseek-r1-0528"
+    PERPLEXITY_SONAR = "perplexity/sonar"
+    PERPLEXITY_SONAR_PRO = "perplexity/sonar-pro"
+    PERPLEXITY_SONAR_DEEP_RESEARCH = "perplexity/sonar-deep-research"
+    NOUSRESEARCH_HERMES_3_LLAMA_3_1_405B = "nousresearch/hermes-3-llama-3.1-405b"
+    NOUSRESEARCH_HERMES_3_LLAMA_3_1_70B = "nousresearch/hermes-3-llama-3.1-70b"
+    AMAZON_NOVA_LITE_V1 = "amazon/nova-lite-v1"
+    AMAZON_NOVA_MICRO_V1 = "amazon/nova-micro-v1"
+    AMAZON_NOVA_PRO_V1 = "amazon/nova-pro-v1"
+    MICROSOFT_WIZARDLM_2_8X22B = "microsoft/wizardlm-2-8x22b"
+    GRYPHE_MYTHOMAX_L2_13B = "gryphe/mythomax-l2-13b"
+    META_LLAMA_4_SCOUT = "meta-llama/llama-4-scout"
+    META_LLAMA_4_MAVERICK = "meta-llama/llama-4-maverick"
+    GROK_4 = "x-ai/grok-4"
+    GROK_4_FAST = "x-ai/grok-4-fast"
+    GROK_4_1_FAST = "x-ai/grok-4.1-fast"
+    GROK_CODE_FAST_1 = "x-ai/grok-code-fast-1"
+    KIMI_K2 = "moonshotai/kimi-k2"
+    QWEN3_235B_A22B_THINKING = "qwen/qwen3-235b-a22b-thinking-2507"
+    QWEN3_CODER = "qwen/qwen3-coder"
+    # Llama API models
+    LLAMA_API_LLAMA_4_SCOUT = "Llama-4-Scout-17B-16E-Instruct-FP8"
+    LLAMA_API_LLAMA4_MAVERICK = "Llama-4-Maverick-17B-128E-Instruct-FP8"
+    LLAMA_API_LLAMA3_3_8B = "Llama-3.3-8B-Instruct"
+    LLAMA_API_LLAMA3_3_70B = "Llama-3.3-70B-Instruct"
+    # v0 by Vercel models
+    V0_1_5_MD = "v0-1.5-md"
+    V0_1_5_LG = "v0-1.5-lg"
+    V0_1_0_MD = "v0-1.0-md"

    @classmethod
    def __get_pydantic_json_schema__(cls, schema, handler):
@@ -193,15 +181,7 @@ class LlmModel(str, metaclass=LlmModelMeta):
        llm_model_metadata = {}
        for model in cls:
            model_name = model.value
-            # Skip disabled models - only show enabled models in the picker
-            if not llm_registry.is_model_enabled(model_name):
-                continue
-            # Use registry directly with None check to gracefully handle
-            # missing metadata during startup/import before registry is populated
-            metadata = llm_registry.get_llm_model_metadata(model_name)
-            if metadata is None:
-                # Skip models without metadata (registry not yet populated)
-                continue
+            metadata = model.metadata
            llm_model_metadata[model_name] = {
                "creator": metadata.creator_name,
                "creator_name": metadata.creator_name,
@@ -217,12 +197,7 @@ class LlmModel(str, metaclass=LlmModelMeta):

    @property
    def metadata(self) -> ModelMetadata:
-        metadata = llm_registry.get_llm_model_metadata(self.value)
-        if metadata:
-            return metadata
-        raise ValueError(
-            f"Missing metadata for model: {self.value}. Model not found in LLM registry."
-        )
+        return MODEL_METADATA[self]

    @property
    def provider(self) -> str:
@@ -237,11 +212,300 @@ class LlmModel(str, metaclass=LlmModelMeta):
        return self.metadata.max_output_tokens


-# MODEL_METADATA removed - all models now come from the database via llm_registry
+MODEL_METADATA = {
+    # https://platform.openai.com/docs/models
+    LlmModel.O3: ModelMetadata("openai", 200000, 100000, "O3", "OpenAI", "OpenAI", 2),
+    LlmModel.O3_MINI: ModelMetadata(
+        "openai", 200000, 100000, "O3 Mini", "OpenAI", "OpenAI", 1
+    ),  # o3-mini-2025-01-31
+    LlmModel.O1: ModelMetadata(
+        "openai", 200000, 100000, "O1", "OpenAI", "OpenAI", 3
+    ),  # o1-2024-12-17
+    LlmModel.O1_MINI: ModelMetadata(
+        "openai", 128000, 65536, "O1 Mini", "OpenAI", "OpenAI", 2
+    ),  # o1-mini-2024-09-12
+    # GPT-5 models
+    LlmModel.GPT5_2: ModelMetadata(
+        "openai", 400000, 128000, "GPT-5.2", "OpenAI", "OpenAI", 3
+    ),
+    LlmModel.GPT5_1: ModelMetadata(
+        "openai", 400000, 128000, "GPT-5.1", "OpenAI", "OpenAI", 2
+    ),
+    LlmModel.GPT5: ModelMetadata(
+        "openai", 400000, 128000, "GPT-5", "OpenAI", "OpenAI", 1
+    ),
+    LlmModel.GPT5_MINI: ModelMetadata(
+        "openai", 400000, 128000, "GPT-5 Mini", "OpenAI", "OpenAI", 1
+    ),
+    LlmModel.GPT5_NANO: ModelMetadata(
+        "openai", 400000, 128000, "GPT-5 Nano", "OpenAI", "OpenAI", 1
+    ),
+    LlmModel.GPT5_CHAT: ModelMetadata(
+        "openai", 400000, 16384, "GPT-5 Chat Latest", "OpenAI", "OpenAI", 2
+    ),
+    LlmModel.GPT41: ModelMetadata(
+        "openai", 1047576, 32768, "GPT-4.1", "OpenAI", "OpenAI", 1
+    ),
+    LlmModel.GPT41_MINI: ModelMetadata(
+        "openai", 1047576, 32768, "GPT-4.1 Mini", "OpenAI", "OpenAI", 1
+    ),
+    LlmModel.GPT4O_MINI: ModelMetadata(
+        "openai", 128000, 16384, "GPT-4o Mini", "OpenAI", "OpenAI", 1
+    ),  # gpt-4o-mini-2024-07-18
+    LlmModel.GPT4O: ModelMetadata(
+        "openai", 128000, 16384, "GPT-4o", "OpenAI", "OpenAI", 2
+    ),  # gpt-4o-2024-08-06
+    LlmModel.GPT4_TURBO: ModelMetadata(
+        "openai", 128000, 4096, "GPT-4 Turbo", "OpenAI", "OpenAI", 3
+    ),  # gpt-4-turbo-2024-04-09
+    LlmModel.GPT3_5_TURBO: ModelMetadata(
+        "openai", 16385, 4096, "GPT-3.5 Turbo", "OpenAI", "OpenAI", 1
+    ),  # gpt-3.5-turbo-0125
+    # https://docs.anthropic.com/en/docs/about-claude/models
+    LlmModel.CLAUDE_4_1_OPUS: ModelMetadata(
+        "anthropic", 200000, 32000, "Claude Opus 4.1", "Anthropic", "Anthropic", 3
+    ),  # claude-opus-4-1-20250805
+    LlmModel.CLAUDE_4_OPUS: ModelMetadata(
+        "anthropic", 200000, 32000, "Claude Opus 4", "Anthropic", "Anthropic", 3
+    ),  # claude-4-opus-20250514
+    LlmModel.CLAUDE_4_SONNET: ModelMetadata(
+        "anthropic", 200000, 64000, "Claude Sonnet 4", "Anthropic", "Anthropic", 2
+    ),  # claude-4-sonnet-20250514
+    LlmModel.CLAUDE_4_5_OPUS: ModelMetadata(
+        "anthropic", 200000, 64000, "Claude Opus 4.5", "Anthropic", "Anthropic", 3
+    ),  # claude-opus-4-5-20251101
+    LlmModel.CLAUDE_4_5_SONNET: ModelMetadata(
+        "anthropic", 200000, 64000, "Claude Sonnet 4.5", "Anthropic", "Anthropic", 3
+    ),  # claude-sonnet-4-5-20250929
+    LlmModel.CLAUDE_4_5_HAIKU: ModelMetadata(
+        "anthropic", 200000, 64000, "Claude Haiku 4.5", "Anthropic", "Anthropic", 2
+    ),  # claude-haiku-4-5-20251001
+    LlmModel.CLAUDE_3_7_SONNET: ModelMetadata(
+        "anthropic", 200000, 64000, "Claude 3.7 Sonnet", "Anthropic", "Anthropic", 2
+    ),  # claude-3-7-sonnet-20250219
+    LlmModel.CLAUDE_3_HAIKU: ModelMetadata(
+        "anthropic", 200000, 4096, "Claude 3 Haiku", "Anthropic", "Anthropic", 1
+    ),  # claude-3-haiku-20240307
+    # https://docs.aimlapi.com/api-overview/model-database/text-models
+    LlmModel.AIML_API_QWEN2_5_72B: ModelMetadata(
+        "aiml_api", 32000, 8000, "Qwen 2.5 72B Instruct Turbo", "AI/ML", "Qwen", 1
+    ),
+    LlmModel.AIML_API_LLAMA3_1_70B: ModelMetadata(
+        "aiml_api",
+        128000,
+        40000,
+        "Llama 3.1 Nemotron 70B Instruct",
+        "AI/ML",
+        "Nvidia",
+        1,
+    ),
+    LlmModel.AIML_API_LLAMA3_3_70B: ModelMetadata(
+        "aiml_api", 128000, None, "Llama 3.3 70B Instruct Turbo", "AI/ML", "Meta", 1
+    ),
+    LlmModel.AIML_API_META_LLAMA_3_1_70B: ModelMetadata(
+        "aiml_api", 131000, 2000, "Llama 3.1 70B Instruct Turbo", "AI/ML", "Meta", 1
+    ),
+    LlmModel.AIML_API_LLAMA_3_2_3B: ModelMetadata(
+        "aiml_api", 128000, None, "Llama 3.2 3B Instruct Turbo", "AI/ML", "Meta", 1
+    ),
+    # https://console.groq.com/docs/models
+    LlmModel.LLAMA3_3_70B: ModelMetadata(
+        "groq", 128000, 32768, "Llama 3.3 70B Versatile", "Groq", "Meta", 1
+    ),
+    LlmModel.LLAMA3_1_8B: ModelMetadata(
+        "groq", 128000, 8192, "Llama 3.1 8B Instant", "Groq", "Meta", 1
+    ),
+    # https://ollama.com/library
+    LlmModel.OLLAMA_LLAMA3_3: ModelMetadata(
+        "ollama", 8192, None, "Llama 3.3", "Ollama", "Meta", 1
+    ),
+    LlmModel.OLLAMA_LLAMA3_2: ModelMetadata(
+        "ollama", 8192, None, "Llama 3.2", "Ollama", "Meta", 1
+    ),
+    LlmModel.OLLAMA_LLAMA3_8B: ModelMetadata(
+        "ollama", 8192, None, "Llama 3", "Ollama", "Meta", 1
+    ),
+    LlmModel.OLLAMA_LLAMA3_405B: ModelMetadata(
+        "ollama", 8192, None, "Llama 3.1 405B", "Ollama", "Meta", 1
+    ),
+    LlmModel.OLLAMA_DOLPHIN: ModelMetadata(
+        "ollama", 32768, None, "Dolphin Mistral Latest", "Ollama", "Mistral AI", 1
+    ),
+    # https://openrouter.ai/models
+    LlmModel.GEMINI_2_5_PRO: ModelMetadata(
+        "open_router",
+        1050000,
+        8192,
+        "Gemini 2.5 Pro Preview 03.25",
+        "OpenRouter",
+        "Google",
+        2,
+    ),
+    LlmModel.GEMINI_3_PRO_PREVIEW: ModelMetadata(
+        "open_router", 1048576, 65535, "Gemini 3 Pro Preview", "OpenRouter", "Google", 2
+    ),
+    LlmModel.GEMINI_2_5_FLASH: ModelMetadata(
+        "open_router", 1048576, 65535, "Gemini 2.5 Flash", "OpenRouter", "Google", 1
+    ),
+    LlmModel.GEMINI_2_0_FLASH: ModelMetadata(
+        "open_router", 1048576, 8192, "Gemini 2.0 Flash 001", "OpenRouter", "Google", 1
+    ),
+    LlmModel.GEMINI_2_5_FLASH_LITE_PREVIEW: ModelMetadata(
+        "open_router",
+        1048576,
+        65535,
+        "Gemini 2.5 Flash Lite Preview 06.17",
+        "OpenRouter",
+        "Google",
+        1,
+    ),
+    LlmModel.GEMINI_2_0_FLASH_LITE: ModelMetadata(
+        "open_router",
+        1048576,
+        8192,
+        "Gemini 2.0 Flash Lite 001",
+        "OpenRouter",
+        "Google",
+        1,
+    ),
+    LlmModel.MISTRAL_NEMO: ModelMetadata(
+        "open_router", 128000, 4096, "Mistral Nemo", "OpenRouter", "Mistral AI", 1
+    ),
+    LlmModel.COHERE_COMMAND_R_08_2024: ModelMetadata(
+        "open_router", 128000, 4096, "Command R 08.2024", "OpenRouter", "Cohere", 1
+    ),
+    LlmModel.COHERE_COMMAND_R_PLUS_08_2024: ModelMetadata(
+        "open_router", 128000, 4096, "Command R Plus 08.2024", "OpenRouter", "Cohere", 2
+    ),
+    LlmModel.DEEPSEEK_CHAT: ModelMetadata(
+        "open_router", 64000, 2048, "DeepSeek Chat", "OpenRouter", "DeepSeek", 1
+    ),
+    LlmModel.DEEPSEEK_R1_0528: ModelMetadata(
+        "open_router", 163840, 163840, "DeepSeek R1 0528", "OpenRouter", "DeepSeek", 1
+    ),
+    LlmModel.PERPLEXITY_SONAR: ModelMetadata(
+        "open_router", 127000, 8000, "Sonar", "OpenRouter", "Perplexity", 1
+    ),
+    LlmModel.PERPLEXITY_SONAR_PRO: ModelMetadata(
+        "open_router", 200000, 8000, "Sonar Pro", "OpenRouter", "Perplexity", 2
+    ),
+    LlmModel.PERPLEXITY_SONAR_DEEP_RESEARCH: ModelMetadata(
+        "open_router",
+        128000,
+        16000,
+        "Sonar Deep Research",
+        "OpenRouter",
+        "Perplexity",
+        3,
+    ),
+    LlmModel.NOUSRESEARCH_HERMES_3_LLAMA_3_1_405B: ModelMetadata(
+        "open_router",
+        131000,
+        4096,
+        "Hermes 3 Llama 3.1 405B",
+        "OpenRouter",
+        "Nous Research",
+        1,
+    ),
+    LlmModel.NOUSRESEARCH_HERMES_3_LLAMA_3_1_70B: ModelMetadata(
+        "open_router",
+        12288,
+        12288,
+        "Hermes 3 Llama 3.1 70B",
+        "OpenRouter",
+        "Nous Research",
+        1,
+    ),
+    LlmModel.OPENAI_GPT_OSS_120B: ModelMetadata(
+        "open_router", 131072, 131072, "GPT-OSS 120B", "OpenRouter", "OpenAI", 1
+    ),
+    LlmModel.OPENAI_GPT_OSS_20B: ModelMetadata(
+        "open_router", 131072, 32768, "GPT-OSS 20B", "OpenRouter", "OpenAI", 1
+    ),
+    LlmModel.AMAZON_NOVA_LITE_V1: ModelMetadata(
+        "open_router", 300000, 5120, "Nova Lite V1", "OpenRouter", "Amazon", 1
+    ),
+    LlmModel.AMAZON_NOVA_MICRO_V1: ModelMetadata(
+        "open_router", 128000, 5120, "Nova Micro V1", "OpenRouter", "Amazon", 1
+    ),
+    LlmModel.AMAZON_NOVA_PRO_V1: ModelMetadata(
+        "open_router", 300000, 5120, "Nova Pro V1", "OpenRouter", "Amazon", 1
+    ),
+    LlmModel.MICROSOFT_WIZARDLM_2_8X22B: ModelMetadata(
+        "open_router", 65536, 4096, "WizardLM 2 8x22B", "OpenRouter", "Microsoft", 1
+    ),
+    LlmModel.GRYPHE_MYTHOMAX_L2_13B: ModelMetadata(
+        "open_router", 4096, 4096, "MythoMax L2 13B", "OpenRouter", "Gryphe", 1
+    ),
+    LlmModel.META_LLAMA_4_SCOUT: ModelMetadata(
+        "open_router", 131072, 131072, "Llama 4 Scout", "OpenRouter", "Meta", 1
+    ),
+    LlmModel.META_LLAMA_4_MAVERICK: ModelMetadata(
+        "open_router", 1048576, 1000000, "Llama 4 Maverick", "OpenRouter", "Meta", 1
+    ),
+    LlmModel.GROK_4: ModelMetadata(
+        "open_router", 256000, 256000, "Grok 4", "OpenRouter", "xAI", 3
+    ),
+    LlmModel.GROK_4_FAST: ModelMetadata(
+        "open_router", 2000000, 30000, "Grok 4 Fast", "OpenRouter", "xAI", 1
+    ),
+    LlmModel.GROK_4_1_FAST: ModelMetadata(
+        "open_router", 2000000, 30000, "Grok 4.1 Fast", "OpenRouter", "xAI", 1
+    ),
+    LlmModel.GROK_CODE_FAST_1: ModelMetadata(
+        "open_router", 256000, 10000, "Grok Code Fast 1", "OpenRouter", "xAI", 1
+    ),
+    LlmModel.KIMI_K2: ModelMetadata(
+        "open_router", 131000, 131000, "Kimi K2", "OpenRouter", "Moonshot AI", 1
+    ),
+    LlmModel.QWEN3_235B_A22B_THINKING: ModelMetadata(
+        "open_router",
+        262144,
+        262144,
+        "Qwen 3 235B A22B Thinking 2507",
+        "OpenRouter",
+        "Qwen",
+        1,
+    ),
+    LlmModel.QWEN3_CODER: ModelMetadata(
+        "open_router", 262144, 262144, "Qwen 3 Coder", "OpenRouter", "Qwen", 3
+    ),
+    # Llama API models
+    LlmModel.LLAMA_API_LLAMA_4_SCOUT: ModelMetadata(
+        "llama_api",
+        128000,
+        4028,
+        "Llama 4 Scout 17B 16E Instruct FP8",
+        "Llama API",
+        "Meta",
+        1,
+    ),
+    LlmModel.LLAMA_API_LLAMA4_MAVERICK: ModelMetadata(
+        "llama_api",
+        128000,
+        4028,
+        "Llama 4 Maverick 17B 128E Instruct FP8",
+        "Llama API",
+        "Meta",
+        1,
+    ),
+    LlmModel.LLAMA_API_LLAMA3_3_8B: ModelMetadata(
+        "llama_api", 128000, 4028, "Llama 3.3 8B Instruct", "Llama API", "Meta", 1
+    ),
+    LlmModel.LLAMA_API_LLAMA3_3_70B: ModelMetadata(
+        "llama_api", 128000, 4028, "Llama 3.3 70B Instruct", "Llama API", "Meta", 1
+    ),
+    # v0 by Vercel models
+    LlmModel.V0_1_5_MD: ModelMetadata("v0", 128000, 64000, "v0 1.5 MD", "V0", "V0", 1),
+    LlmModel.V0_1_5_LG: ModelMetadata("v0", 512000, 64000, "v0 1.5 LG", "V0", "V0", 1),
+    LlmModel.V0_1_0_MD: ModelMetadata("v0", 128000, 64000, "v0 1.0 MD", "V0", "V0", 1),
+}

-# Default model constant for backward compatibility
-# Uses the dynamic registry to get the default model
-DEFAULT_LLM_MODEL = LlmModel.default()
+DEFAULT_LLM_MODEL = LlmModel.GPT5_2
+
+for model in LlmModel:
+    if model not in MODEL_METADATA:
+        raise ValueError(f"Missing MODEL_METADATA entry for model: {model}")


 class ToolCall(BaseModel):
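Review note: the loop after `DEFAULT_LLM_MODEL` makes the module fail at import time if any `LlmModel` member has no `MODEL_METADATA` entry. Below is a minimal sketch of that pattern. `DemoModel`, `DemoMetadata`, and `DEMO_METADATA` are hypothetical stand-ins for illustration, not names from the codebase, and the metadata tuple is trimmed to three fields.

```python
from enum import Enum
from typing import NamedTuple, Optional


class DemoModel(str, Enum):  # hypothetical stand-in for LlmModel
    SMALL = "demo-small"
    LARGE = "demo-large"


class DemoMetadata(NamedTuple):  # hypothetical, trimmed version of ModelMetadata
    provider: str
    context_window: int
    max_output_tokens: Optional[int]


DEMO_METADATA = {
    DemoModel.SMALL: DemoMetadata("demo", 8192, None),
    DemoModel.LARGE: DemoMetadata("demo", 128000, 16384),
}


def missing_metadata(enum_cls, table):
    """Return the enum members that have no entry in the table."""
    return [member for member in enum_cls if member not in table]


def validate_metadata(enum_cls, table):
    """Raise if any member lacks metadata, as the module-level loop in the diff does."""
    missing = missing_metadata(enum_cls, table)
    if missing:
        names = ", ".join(member.name for member in missing)
        raise ValueError(f"Missing metadata for: {names}")


# Passes silently because both members are covered.
validate_metadata(DemoModel, DEMO_METADATA)
```

Collecting all missing members before raising is a small variation on the diff, which raises on the first gap. It may be handy when several enum members are added at once.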
@@ -334,10 +598,7 @@ def get_parallel_tool_calls_param(
    llm_model: LlmModel, parallel_tool_calls: bool | None
 ):
    """Get the appropriate parallel_tool_calls parameter for OpenAI-compatible APIs."""
-    # Check for o-series models (o1, o1-mini, o3-mini, etc.) which don't support
-    # parallel tool calls. Use regex to avoid false positives like "openai/gpt-oss".
-    is_o_series = re.match(r"^o\d", llm_model) is not None
-    if is_o_series or parallel_tool_calls is None:
+    if (llm_model[:1] == "o" and llm_model[1:2].isdigit()) or parallel_tool_calls is None:
        return openai.NOT_GIVEN
    return parallel_tool_calls
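Review note: the comment removed in this hunk says o-series models (o1, o1-mini, o3-mini) do not support parallel tool calls, and that a bare prefix test can misfire on slugs like "openai/gpt-oss". The sketch below compares the two checks side by side. The slugs `openai/gpt-oss-120b` and `ollama-demo` are assumed examples, not values confirmed by the diff.

```python
import re

# Matches slugs such as "o1", "o1-mini", "o3-mini": an "o" followed by a digit.
O_SERIES_PATTERN = re.compile(r"^o\d")


def is_o_series_by_prefix(slug: str) -> bool:
    """Bare prefix test: true for any slug that begins with 'o'."""
    return slug.startswith("o")


def is_o_series_by_pattern(slug: str) -> bool:
    """Stricter test: requires a digit right after the leading 'o'."""
    return O_SERIES_PATTERN.match(slug) is not None


# The last two slugs are assumed examples for illustration.
for slug in ["o1", "o3-mini", "gpt-4o", "openai/gpt-oss-120b", "ollama-demo"]:
    print(
        f"{slug:22} prefix={is_o_series_by_prefix(slug)} "
        f"pattern={is_o_series_by_pattern(slug)}"
    )
```

Under these assumptions the two checks agree on `o1`, `o3-mini`, and `gpt-4o`, and disagree on any non-o-series slug that happens to start with the letter "o".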

@@ -373,98 +634,19 @@ async def llm_call(
            - prompt_tokens: The number of tokens used in the prompt.
            - completion_tokens: The number of tokens used in the completion.
    """
-    # Get model metadata and check if enabled - with fallback support
-    # The model we'll actually use (may differ if original is disabled)
-    model_to_use = llm_model.value
-
-    # Check if model is in registry and if it's enabled
-    from backend.data.llm_registry import (
-        get_fallback_model_for_disabled,
-        get_model_info,
-    )
-
-    model_info = get_model_info(llm_model.value)
-
-    if model_info and not model_info.is_enabled:
-        # Model is disabled - try to find a fallback from the same provider
-        fallback = get_fallback_model_for_disabled(llm_model.value)
-        if fallback:
-            logger.warning(
-                f"Model '{llm_model.value}' is disabled. Using fallback model '{fallback.slug}' from the same provider ({fallback.metadata.provider})."
-            )
-            model_to_use = fallback.slug
-            # Use fallback model's metadata
-            provider = fallback.metadata.provider
-            context_window = fallback.metadata.context_window
-            model_max_output = fallback.metadata.max_output_tokens or int(2**15)
-        else:
-            # No fallback available - raise error
-            raise ValueError(
-                f"LLM model '{llm_model.value}' is disabled and no fallback model "
-                f"from the same provider is available. Please enable the model or "
-                f"select a different model in the block configuration."
-            )
-    else:
-        # Model is enabled or not in registry (legacy/static model)
-        try:
-            provider = llm_model.metadata.provider
-            context_window = llm_model.context_window
-            model_max_output = llm_model.max_output_tokens or int(2**15)
-        except ValueError:
-            # Model not in cache - try refreshing the registry once if we have DB access
-            logger.warning(f"Model {llm_model.value} not found in registry cache")
-
-            # Try refreshing the registry if we have database access
-            from backend.data.db import is_connected
-
-            if is_connected():
-                try:
-                    logger.info(
-                        f"Refreshing LLM registry and retrying lookup for {llm_model.value}"
-                    )
-                    await llm_registry.refresh_llm_registry()
-                    # Try again after refresh
-                    try:
-                        provider = llm_model.metadata.provider
-                        context_window = llm_model.context_window
-                        model_max_output = llm_model.max_output_tokens or int(2**15)
-                        logger.info(
-                            f"Successfully loaded model {llm_model.value} metadata after registry refresh"
-                        )
-                    except ValueError:
-                        # Still not found after refresh
-                        raise ValueError(
-                            f"LLM model '{llm_model.value}' not found in registry after refresh. "
-                            "Please ensure the model is added and enabled in the LLM registry via the admin UI."
-                        )
-                except Exception as refresh_exc:
-                    logger.error(f"Failed to refresh LLM registry: {refresh_exc}")
-                    raise ValueError(
-                        f"LLM model '{llm_model.value}' not found in registry and failed to refresh. "
-                        "Please ensure the model is added to the LLM registry via the admin UI."
-                    ) from refresh_exc
-            else:
-                # No DB access (e.g., in executor without direct DB connection)
-                # The registry should have been loaded on startup
-                raise ValueError(
-                    f"LLM model '{llm_model.value}' not found in registry cache. "
-                    "The registry may need to be refreshed. Please contact support or try again later."
-                )
-
-    # Create effective model for model-specific parameter resolution (e.g., o-series check)
-    # This uses the resolved model_to_use which may differ from llm_model if fallback occurred
-    effective_model = LlmModel(model_to_use)
+    provider = llm_model.metadata.provider
+    context_window = llm_model.context_window

    if compress_prompt_to_fit:
        prompt = compress_prompt(
            messages=prompt,
-            target_tokens=context_window // 2,
+            target_tokens=llm_model.context_window // 2,
            lossy_ok=True,
        )

    # Calculate available tokens based on context window and input length
    estimated_input_tokens = estimate_token_count(prompt)
-    # model_max_output already set above
+    model_max_output = llm_model.max_output_tokens or int(2**15)
    user_max = max_tokens or model_max_output
    available_tokens = max(context_window - estimated_input_tokens, 0)
    max_tokens = max(min(available_tokens, model_max_output, user_max), 1)
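Review note: the last four lines of this hunk clamp `max_tokens` to whatever room is left in the context window. Below is a standalone sketch of the same arithmetic, so the edge cases are easy to check by hand. The function name `resolve_max_tokens` and the constant `DEFAULT_MAX_OUTPUT` are made up for this note; the formula itself follows the hunk.

```python
from typing import Optional

# Fallback the hunk uses when a model has no max_output_tokens (int(2**15)).
DEFAULT_MAX_OUTPUT = 2**15


def resolve_max_tokens(
    context_window: int,
    estimated_input_tokens: int,
    model_max_output: Optional[int] = None,
    user_max: Optional[int] = None,
) -> int:
    """Pick the completion budget: the smallest of the remaining context,
    the model's output cap, and the caller's cap, but never below 1."""
    model_cap = model_max_output or DEFAULT_MAX_OUTPUT
    user_cap = user_max or model_cap
    available = max(context_window - estimated_input_tokens, 0)
    return max(min(available, model_cap, user_cap), 1)


# A prompt that nearly fills a 128k window leaves 8000 tokens for output.
print(resolve_max_tokens(128000, 120000, model_max_output=16384))
# A prompt larger than the window still yields a budget of 1, not 0.
print(resolve_max_tokens(8192, 9000))
# A caller-supplied cap wins when it is the smallest bound.
print(resolve_max_tokens(200000, 1000, model_max_output=100000, user_max=4096))
```

Note that the floor of 1 means an oversized prompt still reaches the provider, where it is likely to be rejected; the clamp only keeps the parameter itself valid.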
@@ -475,14 +657,14 @@ async def llm_call(
        response_format = None

        parallel_tool_calls = get_parallel_tool_calls_param(
-            effective_model, parallel_tool_calls
+            llm_model, parallel_tool_calls
        )

        if force_json_output:
            response_format = {"type": "json_object"}

        response = await oai_client.chat.completions.create(
-            model=model_to_use,
+            model=llm_model.value,
            messages=prompt,  # type: ignore
            response_format=response_format,  # type: ignore
            max_completion_tokens=max_tokens,
@@ -529,7 +711,7 @@ async def llm_call(
        )
        try:
            resp = await client.messages.create(
-                model=model_to_use,
+                model=llm_model.value,
                system=sysprompt,
                messages=messages,
                max_tokens=max_tokens,
@@ -593,7 +775,7 @@ async def llm_call(
        client = AsyncGroq(api_key=credentials.api_key.get_secret_value())
        response_format = {"type": "json_object"} if force_json_output else None
        response = await client.chat.completions.create(
-            model=model_to_use,
+            model=llm_model.value,
            messages=prompt,  # type: ignore
            response_format=response_format,  # type: ignore
            max_tokens=max_tokens,
@@ -615,7 +797,7 @@ async def llm_call(
        sys_messages = [p["content"] for p in prompt if p["role"] == "system"]
        usr_messages = [p["content"] for p in prompt if p["role"] != "system"]
        response = await client.generate(
-            model=model_to_use,
+            model=llm_model.value,
            prompt=f"{sys_messages}\n\n{usr_messages}",
            stream=False,
            options={"num_ctx": max_tokens},
@@ -637,7 +819,7 @@ async def llm_call(
        )

        parallel_tool_calls_param = get_parallel_tool_calls_param(
-            effective_model, parallel_tool_calls
+            llm_model, parallel_tool_calls
        )

        response = await client.chat.completions.create(
@@ -645,7 +827,7 @@ async def llm_call(
                "HTTP-Referer": "https://agpt.co",
                "X-Title": "AutoGPT",
            },
-            model=model_to_use,
+            model=llm_model.value,
            messages=prompt,  # type: ignore
            max_tokens=max_tokens,
            tools=tools_param,  # type: ignore
@@ -679,7 +861,7 @@ async def llm_call(
        )

        parallel_tool_calls_param = get_parallel_tool_calls_param(
-            effective_model, parallel_tool_calls
+            llm_model, parallel_tool_calls
        )

        response = await client.chat.completions.create(
@@ -687,7 +869,7 @@ async def llm_call(
                "HTTP-Referer": "https://agpt.co",
                "X-Title": "AutoGPT",
            },
-            model=model_to_use,
+            model=llm_model.value,
            messages=prompt,  # type: ignore
            max_tokens=max_tokens,
            tools=tools_param,  # type: ignore
@@ -714,7 +896,7 @@ async def llm_call(
            reasoning=reasoning,
        )
    elif provider == "aiml_api":
-        client = openai.AsyncOpenAI(
+        client = openai.OpenAI(
            base_url="https://api.aimlapi.com/v2",
            api_key=credentials.api_key.get_secret_value(),
            default_headers={
@@ -724,8 +906,8 @@ async def llm_call(
            },
        )

-        completion = await client.chat.completions.create(
-            model=model_to_use,
+        completion = client.chat.completions.create(
+            model=llm_model.value,
            messages=prompt,  # type: ignore
            max_tokens=max_tokens,
        )
@@ -753,11 +935,11 @@ async def llm_call(
            response_format = {"type": "json_object"}

        parallel_tool_calls_param = get_parallel_tool_calls_param(
-            effective_model, parallel_tool_calls
+            llm_model, parallel_tool_calls
        )

        response = await client.chat.completions.create(
-            model=model_to_use,
+            model=llm_model.value,
            messages=prompt,  # type: ignore
            response_format=response_format,  # type: ignore
            max_tokens=max_tokens,
@@ -808,10 +990,9 @@ class AIStructuredResponseGeneratorBlock(AIBlockBase):
        )
        model: LlmModel = SchemaField(
            title="LLM Model",
-            default_factory=LlmModel.default,
+            default=DEFAULT_LLM_MODEL,
            description="The language model to use for answering the prompt.",
            advanced=False,
-            json_schema_extra=llm_model_schema_extra(),
        )
        force_json_output: bool = SchemaField(
            title="Restrict LLM to pure JSON output",
@@ -874,7 +1055,7 @@ class AIStructuredResponseGeneratorBlock(AIBlockBase):
            input_schema=AIStructuredResponseGeneratorBlock.Input,
            output_schema=AIStructuredResponseGeneratorBlock.Output,
            test_input={
-                "model": "gpt-4o",  # Using string value - enum accepts any model slug dynamically
+                "model": DEFAULT_LLM_MODEL,
                "credentials": TEST_CREDENTIALS_INPUT,
                "expected_format": {
                    "key1": "value1",
@@ -1240,10 +1421,9 @@ class AITextGeneratorBlock(AIBlockBase):
        )
        model: LlmModel = SchemaField(
            title="LLM Model",
-            default_factory=LlmModel.default,
+            default=DEFAULT_LLM_MODEL,
            description="The language model to use for answering the prompt.",
            advanced=False,
-            json_schema_extra=llm_model_schema_extra(),
        )
        credentials: AICredentials = AICredentialsField()
        sys_prompt: str = SchemaField(
@@ -1337,9 +1517,8 @@ class AITextSummarizerBlock(AIBlockBase):
        )
        model: LlmModel = SchemaField(
            title="LLM Model",
-            default_factory=LlmModel.default,
+            default=DEFAULT_LLM_MODEL,
            description="The language model to use for summarizing the text.",
-            json_schema_extra=llm_model_schema_extra(),
        )
        focus: str = SchemaField(
            title="Focus",
@@ -1555,9 +1734,8 @@ class AIConversationBlock(AIBlockBase):
        )
        model: LlmModel = SchemaField(
            title="LLM Model",
-            default_factory=LlmModel.default,
+            default=DEFAULT_LLM_MODEL,
            description="The language model to use for the conversation.",
-            json_schema_extra=llm_model_schema_extra(),
        )
        credentials: AICredentials = AICredentialsField()
        max_tokens: int | None = SchemaField(
@@ -1594,7 +1772,7 @@ class AIConversationBlock(AIBlockBase):
                    },
                    {"role": "user", "content": "Where was it played?"},
                ],
-                "model": "gpt-4o",  # Using string value - enum accepts any model slug dynamically
+                "model": DEFAULT_LLM_MODEL,
                "credentials": TEST_CREDENTIALS_INPUT,
            },
            test_credentials=TEST_CREDENTIALS,
@@ -1657,10 +1835,9 @@ class AIListGeneratorBlock(AIBlockBase):
        )
        model: LlmModel = SchemaField(
            title="LLM Model",
-            default_factory=LlmModel.default,
+            default=DEFAULT_LLM_MODEL,
            description="The language model to use for generating the list.",
            advanced=True,
-            json_schema_extra=llm_model_schema_extra(),
        )
        credentials: AICredentials = AICredentialsField()
        max_retries: int = SchemaField(
@@ -1715,7 +1892,7 @@ class AIListGeneratorBlock(AIBlockBase):
                    "drawing explorers to uncover its mysteries. Each planet showcases the limitless possibilities of "
                    "fictional worlds."
                ),
-                "model": "gpt-4o",  # Using string value - enum accepts any model slug dynamically
+                "model": DEFAULT_LLM_MODEL,
                "credentials": TEST_CREDENTIALS_INPUT,
                "max_retries": 3,
                "force_json_output": False,
--- a/autogpt_platform/backend/backend/blocks/media.py
+++ b/autogpt_platform/backend/backend/blocks/media.py
@@ -1,6 +1,6 @@
 import os
 import tempfile
-from typing import Literal, Optional
+from typing import Optional

 from moviepy.audio.io.AudioFileClip import AudioFileClip
 from moviepy.video.fx.Loop import Loop
@@ -13,6 +13,7 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import SchemaField
 from backend.util.file import MediaFileType, get_exec_file_path, store_media_file

@@ -46,18 +47,19 @@ class MediaDurationBlock(Block):
        self,
        input_data: Input,
        *,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
        # 1) Store the input media locally
        local_media_path = await store_media_file(
-            graph_exec_id=graph_exec_id,
            file=input_data.media_in,
-            user_id=user_id,
-            return_content=False,
+            execution_context=execution_context,
+            return_format="for_local_processing",
+        )
+        assert execution_context.graph_exec_id is not None
+        media_abspath = get_exec_file_path(
+            execution_context.graph_exec_id, local_media_path
        )
-        media_abspath = get_exec_file_path(graph_exec_id, local_media_path)

        # 2) Load the clip
        if input_data.is_video:
@@ -88,10 +90,6 @@ class LoopVideoBlock(Block):
            default=None,
            ge=1,
        )
-        output_return_type: Literal["file_path", "data_uri"] = SchemaField(
-            description="How to return the output video. Either a relative path or base64 data URI.",
-            default="file_path",
-        )

    class Output(BlockSchemaOutput):
        video_out: str = SchemaField(
@@ -111,17 +109,19 @@ class LoopVideoBlock(Block):
        self,
        input_data: Input,
        *,
-        node_exec_id: str,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
+        assert execution_context.graph_exec_id is not None
+        assert execution_context.node_exec_id is not None
+        graph_exec_id = execution_context.graph_exec_id
+        node_exec_id = execution_context.node_exec_id
+
        # 1) Store the input video locally
        local_video_path = await store_media_file(
-            graph_exec_id=graph_exec_id,
            file=input_data.video_in,
-            user_id=user_id,
-            return_content=False,
+            execution_context=execution_context,
+            return_format="for_local_processing",
        )
        input_abspath = get_exec_file_path(graph_exec_id, local_video_path)

@@ -149,12 +149,11 @@ class LoopVideoBlock(Block):
        looped_clip = looped_clip.with_audio(clip.audio)
        looped_clip.write_videofile(output_abspath, codec="libx264", audio_codec="aac")

-        # Return as data URI
+        # Return output - for_block_output returns workspace:// if available, else data URI
        video_out = await store_media_file(
-            graph_exec_id=graph_exec_id,
            file=output_filename,
-            user_id=user_id,
-            return_content=input_data.output_return_type == "data_uri",
+            execution_context=execution_context,
+            return_format="for_block_output",
        )

        yield "video_out", video_out
@@ -177,10 +176,6 @@ class AddAudioToVideoBlock(Block):
            description="Volume scale for the newly attached audio track (1.0 = original).",
            default=1.0,
        )
-        output_return_type: Literal["file_path", "data_uri"] = SchemaField(
-            description="Return the final output as a relative path or base64 data URI.",
-            default="file_path",
-        )

    class Output(BlockSchemaOutput):
        video_out: MediaFileType = SchemaField(
@@ -200,23 +195,24 @@ class AddAudioToVideoBlock(Block):
        self,
        input_data: Input,
        *,
-        node_exec_id: str,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
+        assert execution_context.graph_exec_id is not None
+        assert execution_context.node_exec_id is not None
+        graph_exec_id = execution_context.graph_exec_id
+        node_exec_id = execution_context.node_exec_id
+
        # 1) Store the inputs locally
        local_video_path = await store_media_file(
-            graph_exec_id=graph_exec_id,
            file=input_data.video_in,
-            user_id=user_id,
-            return_content=False,
+            execution_context=execution_context,
+            return_format="for_local_processing",
        )
        local_audio_path = await store_media_file(
-            graph_exec_id=graph_exec_id,
            file=input_data.audio_in,
-            user_id=user_id,
-            return_content=False,
+            execution_context=execution_context,
+            return_format="for_local_processing",
        )

        abs_temp_dir = os.path.join(tempfile.gettempdir(), "exec_file", graph_exec_id)
@@ -240,12 +236,11 @@ class AddAudioToVideoBlock(Block):
        output_abspath = os.path.join(abs_temp_dir, output_filename)
        final_clip.write_videofile(output_abspath, codec="libx264", audio_codec="aac")

-        # 5) Return either path or data URI
+        # 5) Return output - for_block_output returns workspace:// if available, else data URI
        video_out = await store_media_file(
-            graph_exec_id=graph_exec_id,
            file=output_filename,
-            user_id=user_id,
-            return_content=input_data.output_return_type == "data_uri",
+            execution_context=execution_context,
+            return_format="for_block_output",
        )

        yield "video_out", video_out
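Review note: the media blocks now take one `execution_context` argument in place of separate `graph_exec_id`, `node_exec_id`, and `user_id` keywords, and assert that the optional IDs are present before using them. The sketch below shows that narrowing pattern in isolation. `ExecutionContextStub` is a hypothetical stand-in with only the two fields this diff reads; the real `ExecutionContext` class is not shown here and may differ.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ExecutionContextStub:  # hypothetical stand-in for ExecutionContext
    graph_exec_id: Optional[str] = None
    node_exec_id: Optional[str] = None


def require_exec_ids(ctx: ExecutionContextStub) -> Tuple[str, str]:
    """Narrow both optional IDs to str, mirroring the asserts in the diff."""
    assert ctx.graph_exec_id is not None
    assert ctx.node_exec_id is not None
    return ctx.graph_exec_id, ctx.node_exec_id


def exec_temp_dir(ctx: ExecutionContextStub) -> str:
    """Build the per-execution temp directory the same way AddAudioToVideoBlock does."""
    graph_exec_id, _ = require_exec_ids(ctx)
    return os.path.join(tempfile.gettempdir(), "exec_file", graph_exec_id)


ctx = ExecutionContextStub(graph_exec_id="graph-123", node_exec_id="node-456")
print(require_exec_ids(ctx))
print(exec_temp_dir(ctx))
```

Doing the asserts once at the top of `run`, as `LoopVideoBlock` and `AddAudioToVideoBlock` do, lets the rest of the method use plain `str` locals without repeated `None` checks.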
--- a/autogpt_platform/backend/backend/blocks/screenshotone.py
+++ b/autogpt_platform/backend/backend/blocks/screenshotone.py
@@ -11,6 +11,7 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import (
    APIKeyCredentials,
    CredentialsField,
@@ -112,8 +113,7 @@ class ScreenshotWebPageBlock(Block):
    @staticmethod
    async def take_screenshot(
        credentials: APIKeyCredentials,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        url: str,
        viewport_width: int,
        viewport_height: int,
@@ -155,12 +155,11 @@ class ScreenshotWebPageBlock(Block):

        return {
            "image": await store_media_file(
-                graph_exec_id=graph_exec_id,
                file=MediaFileType(
                    f"data:image/{format.value};base64,{b64encode(content).decode('utf-8')}"
                ),
-                user_id=user_id,
-                return_content=True,
+                execution_context=execution_context,
+                return_format="for_block_output",
            )
        }

@@ -169,15 +168,13 @@ class ScreenshotWebPageBlock(Block):
        input_data: Input,
        *,
        credentials: APIKeyCredentials,
-        graph_exec_id: str,
-        user_id: str,
+        execution_context: ExecutionContext,
        **kwargs,
    ) -> BlockOutput:
        try:
            screenshot_data = await self.take_screenshot(
                credentials=credentials,
-                graph_exec_id=graph_exec_id,
-                user_id=user_id,
+                execution_context=execution_context,
                url=input_data.url,
                viewport_width=input_data.viewport_width,
                viewport_height=input_data.viewport_height,
--- a/autogpt_platform/backend/backend/blocks/smart_decision_maker.py
+++ b/autogpt_platform/backend/backend/blocks/smart_decision_maker.py
@@ -226,10 +226,9 @@ class SmartDecisionMakerBlock(Block):
        )
        model: llm.LlmModel = SchemaField(
            title="LLM Model",
-            default_factory=llm.LlmModel.default,
+            default=llm.DEFAULT_LLM_MODEL,
            description="The language model to use for answering the prompt.",
            advanced=False,
-            json_schema_extra=llm.llm_model_schema_extra(),
        )
        credentials: llm.AICredentials = llm.AICredentialsField()
        multiple_tool_calls: bool = SchemaField(
--- a/autogpt_platform/backend/backend/blocks/spreadsheet.py
+++ b/autogpt_platform/backend/backend/blocks/spreadsheet.py
@@ -7,6 +7,7 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import ContributorDetails, SchemaField
 from backend.util.file import get_exec_file_path, store_media_file
 from backend.util.type import MediaFileType
@@ -98,7 +99,7 @@ class ReadSpreadsheetBlock(Block):
        )

    async def run(
-        self, input_data: Input, *, graph_exec_id: str, user_id: str, **_kwargs
+        self, input_data: Input, *, execution_context: ExecutionContext, **_kwargs
    ) -> BlockOutput:
        import csv
        from io import StringIO
@@ -106,14 +107,16 @@ class ReadSpreadsheetBlock(Block):
        # Determine data source - prefer file_input if provided, otherwise use contents
        if input_data.file_input:
            stored_file_path = await store_media_file(
-                user_id=user_id,
-                graph_exec_id=graph_exec_id,
                file=input_data.file_input,
-                return_content=False,
+                execution_context=execution_context,
+                return_format="for_local_processing",
            )

            # Get full file path
-            file_path = get_exec_file_path(graph_exec_id, stored_file_path)
+            assert execution_context.graph_exec_id  # Validated by store_media_file
+            file_path = get_exec_file_path(
+                execution_context.graph_exec_id, stored_file_path
+            )
            if not Path(file_path).exists():
                raise ValueError(f"File does not exist: {file_path}")

--- a/autogpt_platform/backend/backend/blocks/stagehand/blocks.py
+++ b/autogpt_platform/backend/backend/blocks/stagehand/blocks.py
@@ -10,13 +10,13 @@ import stagehand.main
 from stagehand import Stagehand

 from backend.blocks.llm import (
+    MODEL_METADATA,
    AICredentials,
    AICredentialsField,
    LlmModel,
    ModelMetadata,
 )
 from backend.blocks.stagehand._config import stagehand as stagehand_provider
-from backend.data import llm_registry
 from backend.sdk import (
    APIKeyCredentials,
    Block,
@@ -91,7 +91,7 @@ class StagehandRecommendedLlmModel(str, Enum):
        Returns the provider name for the model in the required format for Stagehand:
        provider/model_name
        """
-        model_metadata = self.metadata
+        model_metadata = MODEL_METADATA[LlmModel(self.value)]
        model_name = self.value

        if len(model_name.split("/")) == 1 and not self.value.startswith(
@@ -107,23 +107,19 @@ class StagehandRecommendedLlmModel(str, Enum):

    @property
    def provider(self) -> str:
-        return self.metadata.provider
+        return MODEL_METADATA[LlmModel(self.value)].provider

    @property
    def metadata(self) -> ModelMetadata:
-        metadata = llm_registry.get_llm_model_metadata(self.value)
-        if metadata:
-            return metadata
-        # Fallback to LlmModel enum if registry lookup fails
-        return LlmModel(self.value).metadata
+        return MODEL_METADATA[LlmModel(self.value)]

    @property
    def context_window(self) -> int:
-        return self.metadata.context_window
+        return MODEL_METADATA[LlmModel(self.value)].context_window

    @property
    def max_output_tokens(self) -> int | None:
-        return self.metadata.max_output_tokens
+        return MODEL_METADATA[LlmModel(self.value)].max_output_tokens


 class StagehandObserveBlock(Block):
--- a/autogpt_platform/backend/backend/blocks/talking_head.py
+++ b/autogpt_platform/backend/backend/blocks/talking_head.py
@@ -10,6 +10,7 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import (
    APIKeyCredentials,
    CredentialsField,
@@ -17,7 +18,9 @@ from backend.data.model import (
    SchemaField,
 )
 from backend.integrations.providers import ProviderName
+from backend.util.file import store_media_file
 from backend.util.request import Requests
+from backend.util.type import MediaFileType

 TEST_CREDENTIALS = APIKeyCredentials(
    id="01234567-89ab-cdef-0123-456789abcdef",
@@ -102,7 +105,7 @@ class CreateTalkingAvatarVideoBlock(Block):
            test_output=[
                (
                    "video_url",
-                    "https://d-id.com/api/clips/abcd1234-5678-efgh-ijkl-mnopqrstuvwx/video",
+                    lambda x: x.startswith(("workspace://", "data:")),
                ),
            ],
            test_mock={
@@ -110,9 +113,10 @@ class CreateTalkingAvatarVideoBlock(Block):
                    "id": "abcd1234-5678-efgh-ijkl-mnopqrstuvwx",
                    "status": "created",
                },
+                # Use data URI to avoid HTTP requests during tests
                "get_clip_status": lambda *args, **kwargs: {
                    "status": "done",
-                    "result_url": "https://d-id.com/api/clips/abcd1234-5678-efgh-ijkl-mnopqrstuvwx/video",
+                    "result_url": "data:video/mp4;base64,AAAA",
                },
            },
            test_credentials=TEST_CREDENTIALS,
@@ -138,7 +142,12 @@ class CreateTalkingAvatarVideoBlock(Block):
        return response.json()

    async def run(
-        self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
+        self,
+        input_data: Input,
+        *,
+        credentials: APIKeyCredentials,
+        execution_context: ExecutionContext,
+        **kwargs,
    ) -> BlockOutput:
        # Create the clip
        payload = {
@@ -165,7 +174,14 @@ class CreateTalkingAvatarVideoBlock(Block):
        for _ in range(input_data.max_polling_attempts):
            status_response = await self.get_clip_status(credentials.api_key, clip_id)
            if status_response["status"] == "done":
-                yield "video_url", status_response["result_url"]
+                # Store the generated video to the user's workspace for persistence
+                video_url = status_response["result_url"]
+                stored_url = await store_media_file(
+                    file=MediaFileType(video_url),
+                    execution_context=execution_context,
+                    return_format="for_block_output",
+                )
+                yield "video_url", stored_url
                return
            elif status_response["status"] == "error":
                raise RuntimeError(
--- a/autogpt_platform/backend/backend/blocks/test/test_blocks_dos_vulnerability.py
+++ b/autogpt_platform/backend/backend/blocks/test/test_blocks_dos_vulnerability.py
@@ -12,6 +12,7 @@ from backend.blocks.iteration import StepThroughItemsBlock
 from backend.blocks.llm import AITextSummarizerBlock
 from backend.blocks.text import ExtractTextInformationBlock
 from backend.blocks.xml_parser import XMLParserBlock
+from backend.data.execution import ExecutionContext
 from backend.util.file import store_media_file
 from backend.util.type import MediaFileType

@@ -233,9 +234,12 @@ class TestStoreMediaFileSecurity:

        with pytest.raises(ValueError, match="File too large"):
            await store_media_file(
-                graph_exec_id="test",
                file=MediaFileType(large_data_uri),
-                user_id="test_user",
+                execution_context=ExecutionContext(
+                    user_id="test_user",
+                    graph_exec_id="test",
+                ),
+                return_format="for_local_processing",
            )

    @patch("backend.util.file.Path")
@@ -270,9 +274,12 @@ class TestStoreMediaFileSecurity:
        # Should raise an error when directory size exceeds limit
        with pytest.raises(ValueError, match="Disk usage limit exceeded"):
            await store_media_file(
-                graph_exec_id="test",
                file=MediaFileType(
                    "data:text/plain;base64,dGVzdA=="
                ),  # Small test file
-                user_id="test_user",
+                execution_context=ExecutionContext(
+                    user_id="test_user",
+                    graph_exec_id="test",
+                ),
+                return_format="for_local_processing",
            )
--- a/autogpt_platform/backend/backend/blocks/test/test_http.py
+++ b/autogpt_platform/backend/backend/blocks/test/test_http.py
@@ -11,10 +11,22 @@ from backend.blocks.http import (
    HttpMethod,
    SendAuthenticatedWebRequestBlock,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import HostScopedCredentials
 from backend.util.request import Response


+def make_test_context(
+    graph_exec_id: str = "test-exec-id",
+    user_id: str = "test-user-id",
+) -> ExecutionContext:
+    """Helper to create test ExecutionContext."""
+    return ExecutionContext(
+        user_id=user_id,
+        graph_exec_id=graph_exec_id,
+    )
+
+
 class TestHttpBlockWithHostScopedCredentials:
    """Test suite for HTTP block integration with HostScopedCredentials."""

@@ -105,8 +117,7 @@ class TestHttpBlockWithHostScopedCredentials:
        async for output_name, output_data in http_block.run(
            input_data,
            credentials=exact_match_credentials,
-            graph_exec_id="test-exec-id",
-            user_id="test-user-id",
+            execution_context=make_test_context(),
        ):
            result.append((output_name, output_data))

@@ -161,8 +172,7 @@ class TestHttpBlockWithHostScopedCredentials:
        async for output_name, output_data in http_block.run(
            input_data,
            credentials=wildcard_credentials,
-            graph_exec_id="test-exec-id",
-            user_id="test-user-id",
+            execution_context=make_test_context(),
        ):
            result.append((output_name, output_data))

@@ -208,8 +218,7 @@ class TestHttpBlockWithHostScopedCredentials:
        async for output_name, output_data in http_block.run(
            input_data,
            credentials=non_matching_credentials,
-            graph_exec_id="test-exec-id",
-            user_id="test-user-id",
+            execution_context=make_test_context(),
        ):
            result.append((output_name, output_data))

@@ -258,8 +267,7 @@ class TestHttpBlockWithHostScopedCredentials:
        async for output_name, output_data in http_block.run(
            input_data,
            credentials=exact_match_credentials,
-            graph_exec_id="test-exec-id",
-            user_id="test-user-id",
+            execution_context=make_test_context(),
        ):
            result.append((output_name, output_data))

@@ -318,8 +326,7 @@ class TestHttpBlockWithHostScopedCredentials:
        async for output_name, output_data in http_block.run(
            input_data,
            credentials=auto_discovered_creds,  # Execution manager found these
-            graph_exec_id="test-exec-id",
-            user_id="test-user-id",
+            execution_context=make_test_context(),
        ):
            result.append((output_name, output_data))

@@ -382,8 +389,7 @@ class TestHttpBlockWithHostScopedCredentials:
        async for output_name, output_data in http_block.run(
            input_data,
            credentials=multi_header_creds,
-            graph_exec_id="test-exec-id",
-            user_id="test-user-id",
+            execution_context=make_test_context(),
        ):
            result.append((output_name, output_data))

@@ -471,8 +477,7 @@ class TestHttpBlockWithHostScopedCredentials:
            async for output_name, output_data in http_block.run(
                input_data,
                credentials=test_creds,
-                graph_exec_id="test-exec-id",
-                user_id="test-user-id",
+                execution_context=make_test_context(),
            ):
                result.append((output_name, output_data))

--- a/autogpt_platform/backend/backend/blocks/text.py
+++ b/autogpt_platform/backend/backend/blocks/text.py
@@ -11,6 +11,7 @@ from backend.data.block import (
    BlockSchemaInput,
    BlockSchemaOutput,
 )
+from backend.data.execution import ExecutionContext
 from backend.data.model import SchemaField
 from backend.util import json, text
 from backend.util.file import get_exec_file_path, store_media_file
@@ -444,18 +445,21 @@ class FileReadBlock(Block):
        )

    async def run(
-        self, input_data: Input, *, graph_exec_id: str, user_id: str, **_kwargs
+        self, input_data: Input, *, execution_context: ExecutionContext, **_kwargs
    ) -> BlockOutput:
        # Store the media file properly (handles URLs, data URIs, etc.)
        stored_file_path = await store_media_file(
-            user_id=user_id,
-            graph_exec_id=graph_exec_id,
            file=input_data.file_input,
-            return_content=False,
+            execution_context=execution_context,
+            return_format="for_local_processing",
        )

-        # Get full file path
-        file_path = get_exec_file_path(graph_exec_id, stored_file_path)
+        # Get full file path (graph_exec_id validated by store_media_file above)
+        if not execution_context.graph_exec_id:
+            raise ValueError("execution_context.graph_exec_id is required")
+        file_path = get_exec_file_path(
+            execution_context.graph_exec_id, stored_file_path
+        )

        if not Path(file_path).exists():
            raise ValueError(f"File does not exist: {file_path}")
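Review note: with `graph_exec_id` now optional on `ExecutionContext`, `ReadSpreadsheetBlock.run` guards it with `assert`, while `FileReadBlock.run` raises `ValueError`. Python skips `assert` statements when run with `-O`, so the explicit raise is the sturdier of the two. A minimal sketch of a shared guard follows. It uses a stand-in dataclass rather than the real `ExecutionContext`, and `DemoExecutionContext` and `require_graph_exec_id` are hypothetical names that do not appear in this PR:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DemoExecutionContext:
    """Stand-in for ExecutionContext; only the two fields used here."""

    user_id: Optional[str] = None
    graph_exec_id: Optional[str] = None


def require_graph_exec_id(ctx: DemoExecutionContext) -> str:
    """Return graph_exec_id, or raise if it is missing or empty."""
    if not ctx.graph_exec_id:
        raise ValueError("execution_context.graph_exec_id is required")
    return ctx.graph_exec_id
```

A block could then call `get_exec_file_path(require_graph_exec_id(execution_context), stored_file_path)`, which also gives type checkers a plain `str`.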
--- a/autogpt_platform/backend/backend/data/block.py
+++ b/autogpt_platform/backend/backend/data/block.py
@@ -25,7 +25,6 @@ from prisma.models import AgentBlock
 from prisma.types import AgentBlockCreateInput
 from pydantic import BaseModel

-from backend.data.llm_registry import update_schema_with_llm_registry
 from backend.data.model import NodeExecutionStats
 from backend.integrations.providers import ProviderName
 from backend.util import json
@@ -144,59 +143,35 @@ class BlockInfo(BaseModel):


 class BlockSchema(BaseModel):
-    cached_jsonschema: ClassVar[dict[str, Any] | None] = None
-
-    @classmethod
-    def clear_schema_cache(cls) -> None:
-        """Clear the cached JSON schema for this class."""
-        # Use None instead of {} because {} is truthy and would prevent regeneration
-        cls.cached_jsonschema = None  # type: ignore
-
-    @staticmethod
-    def clear_all_schema_caches() -> None:
-        """Clear cached JSON schemas for all BlockSchema subclasses."""
-
-        def clear_recursive(cls: type) -> None:
-            """Recursively clear cache for class and all subclasses."""
-            if hasattr(cls, "clear_schema_cache"):
-                cls.clear_schema_cache()
-            for subclass in cls.__subclasses__():
-                clear_recursive(subclass)
-
-        clear_recursive(BlockSchema)
+    cached_jsonschema: ClassVar[dict[str, Any]]

    @classmethod
    def jsonschema(cls) -> dict[str, Any]:
-        # Generate schema if not cached
-        if not cls.cached_jsonschema:
-            model = jsonref.replace_refs(cls.model_json_schema(), merge_props=True)
+        if cls.cached_jsonschema:
+            return cls.cached_jsonschema

-            def ref_to_dict(obj):
-                if isinstance(obj, dict):
-                    # OpenAPI <3.1 does not support sibling fields that has a $ref key
-                    # So sometimes, the schema has an "allOf"/"anyOf"/"oneOf" with 1 item.
-                    keys = {"allOf", "anyOf", "oneOf"}
-                    one_key = next(
-                        (k for k in keys if k in obj and len(obj[k]) == 1), None
-                    )
-                    if one_key:
-                        obj.update(obj[one_key][0])
+        model = jsonref.replace_refs(cls.model_json_schema(), merge_props=True)

-                    return {
-                        key: ref_to_dict(value)
-                        for key, value in obj.items()
-                        if not key.startswith("$") and key != one_key
-                    }
-                elif isinstance(obj, list):
-                    return [ref_to_dict(item) for item in obj]
+        def ref_to_dict(obj):
+            if isinstance(obj, dict):
+                # OpenAPI <3.1 does not support sibling fields that has a $ref key
+                # So sometimes, the schema has an "allOf"/"anyOf"/"oneOf" with 1 item.
+                keys = {"allOf", "anyOf", "oneOf"}
+                one_key = next((k for k in keys if k in obj and len(obj[k]) == 1), None)
+                if one_key:
+                    obj.update(obj[one_key][0])

-                return obj
+                return {
+                    key: ref_to_dict(value)
+                    for key, value in obj.items()
+                    if not key.startswith("$") and key != one_key
+                }
+            elif isinstance(obj, list):
+                return [ref_to_dict(item) for item in obj]

-            cls.cached_jsonschema = cast(dict[str, Any], ref_to_dict(model))
+            return obj

-        # Always post-process to ensure LLM registry data is up-to-date
-        # This refreshes model options and discriminator mappings even if schema was cached
-        update_schema_with_llm_registry(cls.cached_jsonschema, cls)
+        cls.cached_jsonschema = cast(dict[str, Any], ref_to_dict(model))

        return cls.cached_jsonschema

@@ -259,7 +234,7 @@ class BlockSchema(BaseModel):
        super().__pydantic_init_subclass__(**kwargs)

        # Reset cached JSON schema to prevent inheriting it from parent class
-        cls.cached_jsonschema = None
+        cls.cached_jsonschema = {}

        credentials_fields = cls.get_credentials_fields()

@@ -898,28 +873,6 @@ def is_block_auth_configured(


 async def initialize_blocks() -> None:
-    # Refresh LLM registry before initializing blocks so blocks can use registry data
-    # This ensures the registry cache is populated even in executor context
-    try:
-        from backend.data import llm_registry
-        from backend.data.block_cost_config import refresh_llm_costs
-
-        # Only refresh if we have DB access (check if Prisma is connected)
-        from backend.data.db import is_connected
-
-        if is_connected():
-            await llm_registry.refresh_llm_registry()
-            refresh_llm_costs()
-            logger.info("LLM registry refreshed during block initialization")
-        else:
-            logger.warning(
-                "Prisma not connected, skipping LLM registry refresh during block initialization"
-            )
-    except Exception as exc:
-        logger.warning(
-            "Failed to refresh LLM registry during block initialization: %s", exc
-        )
-
    # First, sync all provider costs to blocks
    # Imported here to avoid circular import
    from backend.sdk.cost_integration import sync_all_provider_costs
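Review note on the `jsonschema()` hunk in this file: the cache sentinel changes from `None` to `{}`. Both are falsy, so `if cls.cached_jsonschema:` still regenerates the schema on first use. The flattening step itself can be sketched standalone. The version below copies each dict instead of updating it in place, which is a deliberate difference from the PR code, and `flatten_schema` is a name chosen only for this illustration:

```python
from typing import Any


def flatten_schema(obj: Any) -> Any:
    """Drop `$`-prefixed keys and inline single-item allOf/anyOf/oneOf wrappers."""
    if isinstance(obj, dict):
        keys = ("allOf", "anyOf", "oneOf")
        # Find a combinator key that wraps exactly one sub-schema, if any.
        one_key = next((k for k in keys if k in obj and len(obj[k]) == 1), None)
        merged = dict(obj)  # copy so the caller's schema is left untouched
        if one_key:
            merged.update(obj[one_key][0])
        return {
            key: flatten_schema(value)
            for key, value in merged.items()
            if not key.startswith("$") and key != one_key
        }
    if isinstance(obj, list):
        return [flatten_schema(item) for item in obj]
    return obj
```

For example, `{"title": "Model", "allOf": [{"type": "string"}]}` flattens to `{"title": "Model", "type": "string"}`, while a two-item `anyOf` is kept as a list.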
--- a/autogpt_platform/backend/backend/data/block_cost_config.py
+++ b/autogpt_platform/backend/backend/data/block_cost_config.py
@@ -1,4 +1,3 @@
-import logging
 from typing import Type

 from backend.blocks.ai_image_customizer import AIImageCustomizerBlock, GeminiImageModel
@@ -24,18 +23,19 @@ from backend.blocks.ideogram import IdeogramModelBlock
 from backend.blocks.jina.embeddings import JinaEmbeddingBlock
 from backend.blocks.jina.search import ExtractWebsiteContentBlock, SearchTheWebBlock
 from backend.blocks.llm import (
+    MODEL_METADATA,
    AIConversationBlock,
    AIListGeneratorBlock,
    AIStructuredResponseGeneratorBlock,
    AITextGeneratorBlock,
    AITextSummarizerBlock,
+    LlmModel,
 )
 from backend.blocks.replicate.flux_advanced import ReplicateFluxAdvancedModelBlock
 from backend.blocks.replicate.replicate_block import ReplicateModelBlock
 from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
 from backend.blocks.talking_head import CreateTalkingAvatarVideoBlock
 from backend.blocks.text_to_speech_block import UnrealTextToSpeechBlock
-from backend.data import llm_registry
 from backend.data.block import Block, BlockCost, BlockCostType
 from backend.integrations.credentials_store import (
    aiml_api_credentials,
@@ -55,63 +55,210 @@ from backend.integrations.credentials_store import (
    v0_credentials,
 )

-logger = logging.getLogger(__name__)
+# =============== Configure the cost for each LLM Model call =============== #

-PROVIDER_CREDENTIALS = {
-    "openai": openai_credentials,
-    "anthropic": anthropic_credentials,
-    "groq": groq_credentials,
-    "open_router": open_router_credentials,
-    "llama_api": llama_api_credentials,
-    "aiml_api": aiml_api_credentials,
-    "v0": v0_credentials,
+MODEL_COST: dict[LlmModel, int] = {
+    LlmModel.O3: 4,
+    LlmModel.O3_MINI: 2,
+    LlmModel.O1: 16,
+    LlmModel.O1_MINI: 4,
+    # GPT-5 models
+    LlmModel.GPT5_2: 6,
+    LlmModel.GPT5_1: 5,
+    LlmModel.GPT5: 2,
+    LlmModel.GPT5_MINI: 1,
+    LlmModel.GPT5_NANO: 1,
+    LlmModel.GPT5_CHAT: 5,
+    LlmModel.GPT41: 2,
+    LlmModel.GPT41_MINI: 1,
+    LlmModel.GPT4O_MINI: 1,
+    LlmModel.GPT4O: 3,
+    LlmModel.GPT4_TURBO: 10,
+    LlmModel.GPT3_5_TURBO: 1,
+    LlmModel.CLAUDE_4_1_OPUS: 21,
+    LlmModel.CLAUDE_4_OPUS: 21,
+    LlmModel.CLAUDE_4_SONNET: 5,
+    LlmModel.CLAUDE_4_5_HAIKU: 4,
+    LlmModel.CLAUDE_4_5_OPUS: 14,
+    LlmModel.CLAUDE_4_5_SONNET: 9,
+    LlmModel.CLAUDE_3_7_SONNET: 5,
+    LlmModel.CLAUDE_3_HAIKU: 1,
+    LlmModel.AIML_API_QWEN2_5_72B: 1,
+    LlmModel.AIML_API_LLAMA3_1_70B: 1,
+    LlmModel.AIML_API_LLAMA3_3_70B: 1,
+    LlmModel.AIML_API_META_LLAMA_3_1_70B: 1,
+    LlmModel.AIML_API_LLAMA_3_2_3B: 1,
+    LlmModel.LLAMA3_3_70B: 1,
+    LlmModel.LLAMA3_1_8B: 1,
+    LlmModel.OLLAMA_LLAMA3_3: 1,
+    LlmModel.OLLAMA_LLAMA3_2: 1,
+    LlmModel.OLLAMA_LLAMA3_8B: 1,
+    LlmModel.OLLAMA_LLAMA3_405B: 1,
+    LlmModel.OLLAMA_DOLPHIN: 1,
+    LlmModel.OPENAI_GPT_OSS_120B: 1,
+    LlmModel.OPENAI_GPT_OSS_20B: 1,
+    LlmModel.GEMINI_2_5_PRO: 4,
+    LlmModel.GEMINI_3_PRO_PREVIEW: 5,
+    LlmModel.GEMINI_2_5_FLASH: 1,
+    LlmModel.GEMINI_2_0_FLASH: 1,
+    LlmModel.GEMINI_2_5_FLASH_LITE_PREVIEW: 1,
+    LlmModel.GEMINI_2_0_FLASH_LITE: 1,
+    LlmModel.MISTRAL_NEMO: 1,
+    LlmModel.COHERE_COMMAND_R_08_2024: 1,
+    LlmModel.COHERE_COMMAND_R_PLUS_08_2024: 3,
+    LlmModel.DEEPSEEK_CHAT: 2,
+    LlmModel.DEEPSEEK_R1_0528: 1,
+    LlmModel.PERPLEXITY_SONAR: 1,
+    LlmModel.PERPLEXITY_SONAR_PRO: 5,
+    LlmModel.PERPLEXITY_SONAR_DEEP_RESEARCH: 10,
+    LlmModel.NOUSRESEARCH_HERMES_3_LLAMA_3_1_405B: 1,
+    LlmModel.NOUSRESEARCH_HERMES_3_LLAMA_3_1_70B: 1,
+    LlmModel.AMAZON_NOVA_LITE_V1: 1,
+    LlmModel.AMAZON_NOVA_MICRO_V1: 1,
+    LlmModel.AMAZON_NOVA_PRO_V1: 1,
+    LlmModel.MICROSOFT_WIZARDLM_2_8X22B: 1,
+    LlmModel.GRYPHE_MYTHOMAX_L2_13B: 1,
+    LlmModel.META_LLAMA_4_SCOUT: 1,
+    LlmModel.META_LLAMA_4_MAVERICK: 1,
+    LlmModel.LLAMA_API_LLAMA_4_SCOUT: 1,
+    LlmModel.LLAMA_API_LLAMA4_MAVERICK: 1,
+    LlmModel.LLAMA_API_LLAMA3_3_8B: 1,
+    LlmModel.LLAMA_API_LLAMA3_3_70B: 1,
+    LlmModel.GROK_4: 9,
+    LlmModel.GROK_4_FAST: 1,
+    LlmModel.GROK_4_1_FAST: 1,
+    LlmModel.GROK_CODE_FAST_1: 1,
+    LlmModel.KIMI_K2: 1,
+    LlmModel.QWEN3_235B_A22B_THINKING: 1,
+    LlmModel.QWEN3_CODER: 9,
+    # v0 by Vercel models
+    LlmModel.V0_1_5_MD: 1,
+    LlmModel.V0_1_5_LG: 2,
+    LlmModel.V0_1_0_MD: 1,
 }

-# =============== Configure the cost for each LLM Model call =============== #
-# All LLM costs now come from the database via llm_registry
-
-LLM_COST: list[BlockCost] = []
+for model in LlmModel:
+    if model not in MODEL_COST:
+        raise ValueError(f"Missing MODEL_COST for model: {model}")


-def _build_llm_costs_from_registry() -> list[BlockCost]:
-    """Build BlockCost list from all models in the LLM registry."""
-    costs: list[BlockCost] = []
-    for model in llm_registry.iter_dynamic_models():
-        for cost in model.costs:
-            credentials = PROVIDER_CREDENTIALS.get(cost.credential_provider)
-            if not credentials:
-                logger.warning(
-                    "Skipping cost entry for %s due to unknown credentials provider %s",
-                    model.slug,
-                    cost.credential_provider,
-                )
-                continue
-            cost_filter = {
-                "model": model.slug,
+LLM_COST = (
+    # Anthropic Models
+    [
+        BlockCost(
+            cost_type=BlockCostType.RUN,
+            cost_filter={
+                "model": model,
                "credentials": {
-                    "id": credentials.id,
-                    "provider": credentials.provider,
-                    "type": credentials.type,
+                    "id": anthropic_credentials.id,
+                    "provider": anthropic_credentials.provider,
+                    "type": anthropic_credentials.type,
                },
-            }
-            costs.append(
-                BlockCost(
-                    cost_type=BlockCostType.RUN,
-                    cost_filter=cost_filter,
-                    cost_amount=cost.credit_cost,
-                )
-            )
-    return costs
-
-
-def refresh_llm_costs() -> None:
-    """Refresh LLM costs from the registry. All costs now come from the database."""
-    LLM_COST.clear()
-    LLM_COST.extend(_build_llm_costs_from_registry())
-
-
-# Initial load will happen after registry is refreshed at startup
-# Don't call refresh_llm_costs() here - it will be called after registry refresh
+            },
+            cost_amount=cost,
+        )
+        for model, cost in MODEL_COST.items()
+        if MODEL_METADATA[model].provider == "anthropic"
+    ]
+    # OpenAI Models
+    + [
+        BlockCost(
+            cost_type=BlockCostType.RUN,
+            cost_filter={
+                "model": model,
+                "credentials": {
+                    "id": openai_credentials.id,
+                    "provider": openai_credentials.provider,
+                    "type": openai_credentials.type,
+                },
+            },
+            cost_amount=cost,
+        )
+        for model, cost in MODEL_COST.items()
+        if MODEL_METADATA[model].provider == "openai"
+    ]
+    # Groq Models
+    + [
+        BlockCost(
+            cost_type=BlockCostType.RUN,
+            cost_filter={
+                "model": model,
+                "credentials": {"id": groq_credentials.id},
+            },
+            cost_amount=cost,
+        )
+        for model, cost in MODEL_COST.items()
+        if MODEL_METADATA[model].provider == "groq"
+    ]
+    # Open Router Models
+    + [
+        BlockCost(
+            cost_type=BlockCostType.RUN,
+            cost_filter={
+                "model": model,
+                "credentials": {
+                    "id": open_router_credentials.id,
+                    "provider": open_router_credentials.provider,
+                    "type": open_router_credentials.type,
+                },
+            },
+            cost_amount=cost,
+        )
+        for model, cost in MODEL_COST.items()
+        if MODEL_METADATA[model].provider == "open_router"
+    ]
+    # Llama API Models
+    + [
+        BlockCost(
+            cost_type=BlockCostType.RUN,
+            cost_filter={
+                "model": model,
+                "credentials": {
+                    "id": llama_api_credentials.id,
+                    "provider": llama_api_credentials.provider,
+                    "type": llama_api_credentials.type,
+                },
+            },
+            cost_amount=cost,
+        )
+        for model, cost in MODEL_COST.items()
+        if MODEL_METADATA[model].provider == "llama_api"
+    ]
+    # v0 by Vercel Models
+    + [
+        BlockCost(
+            cost_type=BlockCostType.RUN,
+            cost_filter={
+                "model": model,
+                "credentials": {
+                    "id": v0_credentials.id,
+                    "provider": v0_credentials.provider,
+                    "type": v0_credentials.type,
+                },
+            },
+            cost_amount=cost,
+        )
+        for model, cost in MODEL_COST.items()
+        if MODEL_METADATA[model].provider == "v0"
+    ]
+    # AI/ML Api Models
+    + [
+        BlockCost(
+            cost_type=BlockCostType.RUN,
+            cost_filter={
+                "model": model,
+                "credentials": {
+                    "id": aiml_api_credentials.id,
+                    "provider": aiml_api_credentials.provider,
+                    "type": aiml_api_credentials.type,
+                },
+            },
+            cost_amount=cost,
+        )
+        for model, cost in MODEL_COST.items()
+        if MODEL_METADATA[model].provider == "aiml_api"
+    ]
+)

 # =============== This is the exhaustive list of cost for each Block =============== #
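Review note: the seven list comprehensions that make up `LLM_COST` differ only in the credentials object and the provider string (the Groq entry additionally passes only `id`). A table-driven sketch of the same filtering, together with the import-time completeness check, is shown below. Everything in it is a hypothetical stand-in: `DemoModel`, `DEMO_PROVIDER`, and `DEMO_COST` replace `LlmModel`, `MODEL_METADATA`, and `MODEL_COST`, and plain dicts replace `BlockCost`. It is not the PR's code.

```python
from enum import Enum


class DemoModel(str, Enum):
    SMALL = "demo-small"
    LARGE = "demo-large"


# Hypothetical stand-ins for MODEL_METADATA[...].provider and MODEL_COST.
DEMO_PROVIDER = {DemoModel.SMALL: "openai", DemoModel.LARGE: "anthropic"}
DEMO_COST = {DemoModel.SMALL: 1, DemoModel.LARGE: 5}


def check_complete(cost_table: dict, models) -> None:
    """Fail fast if any model has no cost entry."""
    for model in models:
        if model not in cost_table:
            raise ValueError(f"Missing cost for model: {model}")


def build_costs(credential_ids: dict) -> list:
    """One cost entry per model whose provider has a known credential id."""
    return [
        {
            "model": model,
            "credentials": {"id": credential_ids[DEMO_PROVIDER[model]]},
            "cost_amount": cost,
        }
        for model, cost in DEMO_COST.items()
        if DEMO_PROVIDER[model] in credential_ids
    ]


# Mirrors the module-level loop in the PR: runs once at import time.
check_complete(DEMO_COST, DemoModel)
```

Calling `build_costs({"anthropic": "cred-a"})` yields a single entry for `DemoModel.LARGE`. Models whose provider has no credential id are skipped silently, which matches the behaviour of the per-provider filters above.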

--- a/autogpt_platform/backend/backend/data/execution.py
+++ b/autogpt_platform/backend/backend/data/execution.py
@@ -83,12 +83,29 @@ class ExecutionContext(BaseModel):

    model_config = {"extra": "ignore"}

+    # Execution identity
+    user_id: Optional[str] = None
+    graph_id: Optional[str] = None
+    graph_exec_id: Optional[str] = None
+    graph_version: Optional[int] = None
+    node_id: Optional[str] = None
+    node_exec_id: Optional[str] = None
+
+    # Safety settings
    human_in_the_loop_safe_mode: bool = True
    sensitive_action_safe_mode: bool = False
+
+    # User settings
    user_timezone: str = "UTC"
+
+    # Execution hierarchy
    root_execution_id: Optional[str] = None
    parent_execution_id: Optional[str] = None

+    # Workspace
+    workspace_id: Optional[str] = None
+    session_id: Optional[str] = None
+

 # -------------------------- Models -------------------------- #

--- a/autogpt_platform/backend/backend/data/graph.py
+++ b/autogpt_platform/backend/backend/data/graph.py
@@ -1511,10 +1511,8 @@ async def migrate_llm_models(migrate_to: LlmModel):
            if field.annotation == LlmModel:
                llm_model_fields[block.id] = field_name

-    # Get all model slugs from the registry (dynamic, not hardcoded enum)
-    from backend.data import llm_registry
-
-    enum_values = list(llm_registry.get_all_model_slugs_for_validation())
+    # Convert enum values to a list of strings for the SQL query
+    enum_values = [v.value for v in LlmModel]
    escaped_enum_values = repr(tuple(enum_values))  # hack but works

    # Update each block
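Review note: `repr(tuple(enum_values))` yields something usable as a SQL `IN (...)` list only because Python's tuple repr happens to resemble one. Two edge cases are worth knowing: a one-element tuple gets a trailing comma, and a value containing a single quote is rendered with double quotes. Whether either can occur depends on the `LlmModel` values, which this hunk does not show. The strings below are illustrative only, and `sql_in_list` is a hypothetical helper name:

```python
def sql_in_list(values: list) -> str:
    """Render values the way the PR does: via the tuple repr."""
    return repr(tuple(values))


print(sql_in_list(["gpt-4o", "o1"]))  # ('gpt-4o', 'o1')
print(sql_in_list(["gpt-4o"]))        # ('gpt-4o',)  <- trailing comma
print(sql_in_list(["it's"]))          # ("it's",)    <- double-quoted
```

With a large enum of plain slugs the first form is the only one that arises, which is presumably why the "hack but works" comment holds.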
--- a/autogpt_platform/backend/backend/data/llm_registry/__init__.py
+++ b/autogpt_platform/backend/backend/data/llm_registry/__init__.py
@@ -1,72 +0,0 @@
-"""
-LLM Registry module for managing LLM models, providers, and costs dynamically.
-
-This module provides a database-driven registry system for LLM models,
-replacing hardcoded model configurations with a flexible admin-managed system.
-"""
-
-from backend.data.llm_registry.model import ModelMetadata
-
-# Re-export for backwards compatibility
-from backend.data.llm_registry.notifications import (
-    REGISTRY_REFRESH_CHANNEL,
-    publish_registry_refresh_notification,
-    subscribe_to_registry_refresh,
-)
-from backend.data.llm_registry.registry import (
-    RegistryModel,
-    RegistryModelCost,
-    RegistryModelCreator,
-    get_all_model_slugs_for_validation,
-    get_default_model_slug,
-    get_dynamic_model_slugs,
-    get_fallback_model_for_disabled,
-    get_llm_discriminator_mapping,
-    get_llm_model_cost,
-    get_llm_model_metadata,
-    get_llm_model_schema_options,
-    get_model_info,
-    is_model_enabled,
-    iter_dynamic_models,
-    refresh_llm_registry,
-    register_static_costs,
-    register_static_metadata,
-)
-from backend.data.llm_registry.schema_utils import (
-    is_llm_model_field,
-    refresh_llm_discriminator_mapping,
-    refresh_llm_model_options,
-    update_schema_with_llm_registry,
-)
-
-__all__ = [
-    # Types
-    "ModelMetadata",
-    "RegistryModel",
-    "RegistryModelCost",
-    "RegistryModelCreator",
-    # Registry functions
-    "get_all_model_slugs_for_validation",
-    "get_default_model_slug",
-    "get_dynamic_model_slugs",
-    "get_fallback_model_for_disabled",
-    "get_llm_discriminator_mapping",
-    "get_llm_model_cost",
-    "get_llm_model_metadata",
-    "get_llm_model_schema_options",
-    "get_model_info",
-    "is_model_enabled",
-    "iter_dynamic_models",
-    "refresh_llm_registry",
-    "register_static_costs",
-    "register_static_metadata",
-    # Notifications
-    "REGISTRY_REFRESH_CHANNEL",
-    "publish_registry_refresh_notification",
-    "subscribe_to_registry_refresh",
-    # Schema utilities
-    "is_llm_model_field",
-    "refresh_llm_discriminator_mapping",
-    "refresh_llm_model_options",
-    "update_schema_with_llm_registry",
-]
--- a/autogpt_platform/backend/backend/data/llm_registry/model.py
+++ b/autogpt_platform/backend/backend/data/llm_registry/model.py
@@ -1,25 +0,0 @@
-"""Type definitions for LLM model metadata."""
-
-from typing import Literal, NamedTuple
-
-
-class ModelMetadata(NamedTuple):
-    """Metadata for an LLM model.
-
-    Attributes:
-        provider: The provider identifier (e.g., "openai", "anthropic")
-        context_window: Maximum context window size in tokens
-        max_output_tokens: Maximum output tokens (None if unlimited)
-        display_name: Human-readable name for the model
-        provider_name: Human-readable provider name (e.g., "OpenAI", "Anthropic")
-        creator_name: Name of the organization that created the model
-        price_tier: Relative cost tier (1=cheapest, 2=medium, 3=expensive)
-    """
-
-    provider: str
-    context_window: int
-    max_output_tokens: int | None
-    display_name: str
-    provider_name: str
-    creator_name: str
-    price_tier: Literal[1, 2, 3]
--- a/autogpt_platform/backend/backend/data/llm_registry/notifications.py
+++ b/autogpt_platform/backend/backend/data/llm_registry/notifications.py
@@ -1,89 +0,0 @@
-"""
-Redis pub/sub notifications for LLM registry updates.
-
-When models are added/updated/removed via the admin UI, this module
-publishes notifications to Redis that all executor services subscribe to,
-ensuring they refresh their registry cache in real-time.
-"""
-
-import asyncio
-import logging
-from typing import Any
-
-from backend.data.redis_client import connect_async
-
-logger = logging.getLogger(__name__)
-
-# Redis channel name for LLM registry refresh notifications
-REGISTRY_REFRESH_CHANNEL = "llm_registry:refresh"
-
-
-async def publish_registry_refresh_notification() -> None:
-    """
-    Publish a notification to Redis that the LLM registry has been updated.
-    All executor services subscribed to this channel will refresh their registry.
-    """
-    try:
-        redis = await connect_async()
-        await redis.publish(REGISTRY_REFRESH_CHANNEL, "refresh")
-        logger.info("Published LLM registry refresh notification to Redis")
-    except Exception as exc:
-        logger.warning(
-            "Failed to publish LLM registry refresh notification: %s",
-            exc,
-            exc_info=True,
-        )
-
-
-async def subscribe_to_registry_refresh(
-    on_refresh: Any,  # Async callable that takes no args
-) -> None:
-    """
-    Subscribe to Redis notifications for LLM registry updates.
-    This runs in a loop and processes messages as they arrive.
-
-    Args:
-        on_refresh: Async callable to execute when a refresh notification is received
-    """
-    try:
-        redis = await connect_async()
-        pubsub = redis.pubsub()
-        await pubsub.subscribe(REGISTRY_REFRESH_CHANNEL)
-        logger.info(
-            "Subscribed to LLM registry refresh notifications on channel: %s",
-            REGISTRY_REFRESH_CHANNEL,
-        )
-
-        # Process messages in a loop
-        while True:
-            try:
-                message = await pubsub.get_message(
-                    ignore_subscribe_messages=True, timeout=1.0
-                )
-                if (
-                    message
-                    and message["type"] == "message"
-                    and message["channel"] == REGISTRY_REFRESH_CHANNEL
-                ):
-                    logger.info("Received LLM registry refresh notification")
-                    try:
-                        await on_refresh()
-                    except Exception as exc:
-                        logger.error(
-                            "Error refreshing LLM registry from notification: %s",
-                            exc,
-                            exc_info=True,
-                        )
-            except Exception as exc:
-                logger.warning(
-                    "Error processing registry refresh message: %s", exc, exc_info=True
-                )
-                # Continue listening even if one message fails
-                await asyncio.sleep(1)
-    except Exception as exc:
-        logger.error(
-            "Failed to subscribe to LLM registry refresh notifications: %s",
-            exc,
-            exc_info=True,
-        )
-        raise
--- a/autogpt_platform/backend/backend/data/llm_registry/registry.py
+++ b/autogpt_platform/backend/backend/data/llm_registry/registry.py
@@ -1,388 +0,0 @@
-"""Core LLM registry implementation for managing models dynamically."""
-
-from __future__ import annotations
-
-import asyncio
-import logging
-from dataclasses import dataclass, field
-from typing import Any, Iterable
-
-import prisma.models
-
-from backend.data.llm_registry.model import ModelMetadata
-
-logger = logging.getLogger(__name__)
-
-
-def _json_to_dict(value: Any) -> dict[str, Any]:
-    """Convert Prisma Json type to dict, with fallback to empty dict."""
-    if value is None:
-        return {}
-    if isinstance(value, dict):
-        return value
-    # Prisma Json type should always be a dict at runtime
-    return dict(value) if value else {}
-
-
-@dataclass(frozen=True)
-class RegistryModelCost:
-    """Cost configuration for an LLM model."""
-
-    credit_cost: int
-    credential_provider: str
-    credential_id: str | None
-    credential_type: str | None
-    currency: str | None
-    metadata: dict[str, Any]
-
-
-@dataclass(frozen=True)
-class RegistryModelCreator:
-    """Creator information for an LLM model."""
-
-    id: str
-    name: str
-    display_name: str
-    description: str | None
-    website_url: str | None
-    logo_url: str | None
-
-
-@dataclass(frozen=True)
-class RegistryModel:
-    """Represents a model in the LLM registry."""
-
-    slug: str
-    display_name: str
-    description: str | None
-    metadata: ModelMetadata
-    capabilities: dict[str, Any]
-    extra_metadata: dict[str, Any]
-    provider_display_name: str
-    is_enabled: bool
-    is_recommended: bool = False
-    costs: tuple[RegistryModelCost, ...] = field(default_factory=tuple)
-    creator: RegistryModelCreator | None = None
-
-
-_static_metadata: dict[str, ModelMetadata] = {}
-_static_costs: dict[str, int] = {}
-_dynamic_models: dict[str, RegistryModel] = {}
-_schema_options: list[dict[str, str]] = []
-_discriminator_mapping: dict[str, str] = {}
-_lock = asyncio.Lock()
-
-
-def register_static_metadata(metadata: dict[Any, ModelMetadata]) -> None:
-    """Register static metadata for legacy models (deprecated)."""
-    _static_metadata.update({str(key): value for key, value in metadata.items()})
-    _refresh_cached_schema()
-
-
-def register_static_costs(costs: dict[Any, int]) -> None:
-    """Register static costs for legacy models (deprecated)."""
-    _static_costs.update({str(key): value for key, value in costs.items()})
-
-
-def _build_schema_options() -> list[dict[str, str]]:
-    """Build schema options for model selection dropdown. Only includes enabled models."""
-    options: list[dict[str, str]] = []
-    # Only include enabled models in the dropdown options
-    for model in sorted(_dynamic_models.values(), key=lambda m: m.display_name.lower()):
-        if model.is_enabled:
-            options.append(
-                {
-                    "label": model.display_name,
-                    "value": model.slug,
-                    "group": model.metadata.provider,
-                    "description": model.description or "",
-                }
-            )
-
-    for slug, metadata in _static_metadata.items():
-        if slug in _dynamic_models:
-            continue
-        options.append(
-            {
-                "label": slug,
-                "value": slug,
-                "group": metadata.provider,
-                "description": "",
-            }
-        )
-    return options
-
-
-async def refresh_llm_registry() -> None:
-    """Refresh the LLM registry from the database. Loads all models (enabled and disabled)."""
-    async with _lock:
-        try:
-            records = await prisma.models.LlmModel.prisma().find_many(
-                include={
-                    "Provider": True,
-                    "Costs": True,
-                    "Creator": True,
-                }
-            )
-            logger.debug("Found %d LLM model records in database", len(records))
-        except Exception as exc:
-            logger.error(
-                "Failed to refresh LLM registry from DB: %s", exc, exc_info=True
-            )
-            return
-
-        dynamic: dict[str, RegistryModel] = {}
-        for record in records:
-            provider_name = (
-                record.Provider.name if record.Provider else record.providerId
-            )
-            provider_display_name = (
-                record.Provider.displayName if record.Provider else record.providerId
-            )
-            # Creator name: prefer Creator.name, fallback to provider display name
-            creator_name = (
-                record.Creator.name if record.Creator else provider_display_name
-            )
-            # Price tier: default to 1 (cheapest) if not set
-            price_tier = getattr(record, "priceTier", 1) or 1
-            # Clamp to valid range 1-3
-            price_tier = max(1, min(3, price_tier))
-
-            metadata = ModelMetadata(
-                provider=provider_name,
-                context_window=record.contextWindow,
-                max_output_tokens=record.maxOutputTokens,
-                display_name=record.displayName,
-                provider_name=provider_display_name,
-                creator_name=creator_name,
-                price_tier=price_tier,  # type: ignore[arg-type]
-            )
-            costs = tuple(
-                RegistryModelCost(
-                    credit_cost=cost.creditCost,
-                    credential_provider=cost.credentialProvider,
-                    credential_id=cost.credentialId,
-                    credential_type=cost.credentialType,
-                    currency=cost.currency,
-                    metadata=_json_to_dict(cost.metadata),
-                )
-                for cost in (record.Costs or [])
-            )
-
-            # Map creator if present
-            creator = None
-            if record.Creator:
-                creator = RegistryModelCreator(
-                    id=record.Creator.id,
-                    name=record.Creator.name,
-                    display_name=record.Creator.displayName,
-                    description=record.Creator.description,
-                    website_url=record.Creator.websiteUrl,
-                    logo_url=record.Creator.logoUrl,
-                )
-
-            dynamic[record.slug] = RegistryModel(
-                slug=record.slug,
-                display_name=record.displayName,
-                description=record.description,
-                metadata=metadata,
-                capabilities=_json_to_dict(record.capabilities),
-                extra_metadata=_json_to_dict(record.metadata),
-                provider_display_name=(
-                    record.Provider.displayName
-                    if record.Provider
-                    else record.providerId
-                ),
-                is_enabled=record.isEnabled,
-                is_recommended=record.isRecommended,
-                costs=costs,
-                creator=creator,
-            )
-
-        # Atomic swap - build new structures then replace references
-        # This ensures readers never see partially updated state
-        global _dynamic_models
-        _dynamic_models = dynamic
-        _refresh_cached_schema()
-        logger.info(
-            "LLM registry refreshed with %s dynamic models (enabled: %s, disabled: %s)",
-            len(dynamic),
-            sum(1 for m in dynamic.values() if m.is_enabled),
-            sum(1 for m in dynamic.values() if not m.is_enabled),
-        )
-
-
-def _refresh_cached_schema() -> None:
-    """Refresh cached schema options and discriminator mapping."""
-    global _schema_options, _discriminator_mapping
-
-    # Build new structures
-    new_options = _build_schema_options()
-    new_mapping = {
-        slug: entry.metadata.provider for slug, entry in _dynamic_models.items()
-    }
-    for slug, metadata in _static_metadata.items():
-        new_mapping.setdefault(slug, metadata.provider)
-
-    # Atomic swap - replace references to ensure readers see consistent state
-    _schema_options = new_options
-    _discriminator_mapping = new_mapping
-
-
-def get_llm_model_metadata(slug: str) -> ModelMetadata | None:
-    """Get model metadata by slug. Checks dynamic models first, then static metadata."""
-    if slug in _dynamic_models:
-        return _dynamic_models[slug].metadata
-    return _static_metadata.get(slug)
-
-
-def get_llm_model_cost(slug: str) -> tuple[RegistryModelCost, ...]:
-    """Get model cost configuration by slug."""
-    if slug in _dynamic_models:
-        return _dynamic_models[slug].costs
-    cost_value = _static_costs.get(slug)
-    if cost_value is None:
-        return tuple()
-    return (
-        RegistryModelCost(
-            credit_cost=cost_value,
-            credential_provider="static",
-            credential_id=None,
-            credential_type=None,
-            currency=None,
-            metadata={},
-        ),
-    )
-
-
-def get_llm_model_schema_options() -> list[dict[str, str]]:
-    """
-    Get schema options for LLM model selection dropdown.
-
-    Returns a copy of cached schema options that are refreshed when the registry is
-    updated via refresh_llm_registry() (called on startup and via Redis pub/sub).
-    """
-    # Return a copy to prevent external mutation
-    return list(_schema_options)
-
-
-def get_llm_discriminator_mapping() -> dict[str, str]:
-    """
-    Get discriminator mapping for LLM models.
-
-    Returns a copy of cached discriminator mapping that is refreshed when the registry
-    is updated via refresh_llm_registry() (called on startup and via Redis pub/sub).
-    """
-    # Return a copy to prevent external mutation
-    return dict(_discriminator_mapping)
-
-
-def get_dynamic_model_slugs() -> set[str]:
-    """Get all dynamic model slugs from the registry."""
-    return set(_dynamic_models.keys())
-
-
-def get_all_model_slugs_for_validation() -> set[str]:
-    """
-    Get ALL model slugs (both enabled and disabled) for validation purposes.
-
-    This is used for JSON schema enum validation - we need to accept any known
-    model value (even disabled ones) so that existing graphs don't fail validation.
-    The actual fallback/enforcement happens at runtime in llm_call().
-    """
-    all_slugs = set(_dynamic_models.keys())
-    all_slugs.update(_static_metadata.keys())
-    return all_slugs
-
-
-def iter_dynamic_models() -> Iterable[RegistryModel]:
-    """Iterate over all dynamic models in the registry."""
-    return tuple(_dynamic_models.values())
-
-
-def get_fallback_model_for_disabled(disabled_model_slug: str) -> RegistryModel | None:
-    """
-    Find a fallback model when the requested model is disabled.
-
-    Looks for an enabled model from the same provider. Prefers models with
-    similar names or capabilities if possible.
-
-    Args:
-        disabled_model_slug: The slug of the disabled model
-
-    Returns:
-        An enabled RegistryModel from the same provider, or None if no fallback found
-    """
-    disabled_model = _dynamic_models.get(disabled_model_slug)
-    if not disabled_model:
-        return None
-
-    provider = disabled_model.metadata.provider
-
-    # Find all enabled models from the same provider
-    candidates = [
-        model
-        for model in _dynamic_models.values()
-        if model.is_enabled and model.metadata.provider == provider
-    ]
-
-    if not candidates:
-        return None
-
-    # Sort by: prefer models with similar context window, then by name
-    candidates.sort(
-        key=lambda m: (
-            abs(m.metadata.context_window - disabled_model.metadata.context_window),
-            m.display_name.lower(),
-        )
-    )
-
-    return candidates[0]
-
-
-def is_model_enabled(model_slug: str) -> bool:
-    """Check if a model is enabled in the registry."""
-    model = _dynamic_models.get(model_slug)
-    if not model:
-        # Model not in registry - assume it's a static/legacy model and allow it
-        return True
-    return model.is_enabled
-
-
-def get_model_info(model_slug: str) -> RegistryModel | None:
-    """Get model info from the registry."""
-    return _dynamic_models.get(model_slug)
-
-
-def get_default_model_slug() -> str | None:
-    """
-    Get the default model slug to use for block defaults.
-
-    Returns the recommended model if set (configured via admin UI),
-    otherwise returns the first enabled model alphabetically.
-    Returns None if no models are available or enabled.
-    """
-    # Return the recommended model if one is set and enabled
-    for model in _dynamic_models.values():
-        if model.is_recommended and model.is_enabled:
-            return model.slug
-
-    # No recommended model set - find first enabled model alphabetically
-    for model in sorted(_dynamic_models.values(), key=lambda m: m.display_name.lower()):
-        if model.is_enabled:
-            logger.warning(
-                "No recommended model set, using '%s' as default",
-                model.slug,
-            )
-            return model.slug
-
-    # No enabled models available
-    if _dynamic_models:
-        logger.error(
-            "No enabled models found in registry (%d models registered but all disabled)",
-            len(_dynamic_models),
-        )
-    else:
-        logger.error("No models registered in LLM registry")
-
-    return None
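
Note on the removed `get_fallback_model_for_disabled`: its selection rule (same provider, enabled, closest context window, then name) can be sketched in isolation as follows. `DemoRegistryModel` and `pick_fallback` are hypothetical, trimmed-down stand-ins; this sketch breaks ties by slug, whereas the removed code used the display name.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DemoRegistryModel:
    # Hypothetical, reduced version of RegistryModel for illustration.
    slug: str
    provider: str
    context_window: int
    is_enabled: bool


def pick_fallback(
    models: dict[str, DemoRegistryModel], disabled_slug: str
) -> DemoRegistryModel | None:
    """Return the enabled same-provider model with the closest context window."""
    disabled = models.get(disabled_slug)
    if disabled is None:
        return None  # unknown slug: nothing to fall back from

    candidates = [
        m
        for m in models.values()
        if m.is_enabled and m.provider == disabled.provider
    ]
    if not candidates:
        return None

    # Smallest context-window distance wins; slug breaks ties deterministically.
    return min(
        candidates,
        key=lambda m: (abs(m.context_window - disabled.context_window), m.slug.lower()),
    )


models = {
    "old": DemoRegistryModel("old", "openai", 128_000, False),
    "small": DemoRegistryModel("small", "openai", 16_000, True),
    "large": DemoRegistryModel("large", "openai", 200_000, True),
    "other": DemoRegistryModel("other", "anthropic", 128_000, True),
}
fallback = pick_fallback(models, "old")
missing = pick_fallback(models, "not-registered")
```

Here `large` is chosen over `small` because its context window is nearer to the disabled model's, and `other` is ignored because it belongs to a different provider.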
--- a/autogpt_platform/backend/backend/data/llm_registry/schema_utils.py
+++ b/autogpt_platform/backend/backend/data/llm_registry/schema_utils.py
@@ -1,130 +0,0 @@
-"""
-Helper utilities for LLM registry integration with block schemas.
-
-This module handles the dynamic injection of discriminator mappings
-and model options from the LLM registry into block schemas.
-"""
-
-import logging
-from typing import Any
-
-from backend.data.llm_registry.registry import (
-    get_all_model_slugs_for_validation,
-    get_default_model_slug,
-    get_llm_discriminator_mapping,
-    get_llm_model_schema_options,
-)
-
-logger = logging.getLogger(__name__)
-
-
-def is_llm_model_field(field_name: str, field_info: Any) -> bool:
-    """
-    Check if a field is an LLM model selection field.
-
-    Returns True if the field has 'options' in json_schema_extra
-    (set by llm_model_schema_extra() in blocks/llm.py).
-    """
-    if not hasattr(field_info, "json_schema_extra"):
-        return False
-
-    extra = field_info.json_schema_extra
-    if isinstance(extra, dict):
-        return "options" in extra
-
-    return False
-
-
-def refresh_llm_model_options(field_schema: dict[str, Any]) -> None:
-    """
-    Refresh LLM model options from the registry.
-
-    Updates 'options' (for frontend dropdown) to show only enabled models,
-    but keeps the 'enum' (for validation) inclusive of ALL known models.
-
-    This is important because:
-    - Options: What users see in the dropdown (enabled models only)
-    - Enum: What values pass validation (all known models, including disabled)
-
-    Existing graphs may have disabled models selected - they should pass validation
-    and the fallback logic in llm_call() will handle using an alternative model.
-    """
-    fresh_options = get_llm_model_schema_options()
-    if not fresh_options:
-        return
-
-    # Update options array (UI dropdown) - only enabled models
-    if "options" in field_schema:
-        field_schema["options"] = fresh_options
-
-    all_known_slugs = get_all_model_slugs_for_validation()
-    if all_known_slugs and "enum" in field_schema:
-        existing_enum = set(field_schema.get("enum", []))
-        combined_enum = existing_enum | all_known_slugs
-        field_schema["enum"] = sorted(combined_enum)
-
-    # Set the default value from the registry (gpt-4o if available, else first enabled)
-    # This ensures new blocks have a sensible default pre-selected
-    default_slug = get_default_model_slug()
-    if default_slug:
-        field_schema["default"] = default_slug
-
-
-def refresh_llm_discriminator_mapping(field_schema: dict[str, Any]) -> None:
-    """
-    Refresh discriminator_mapping for fields that use model-based discrimination.
-
-    The discriminator is already set when AICredentialsField() creates the field.
-    We only need to refresh the mapping when models are added/removed.
-    """
-    if field_schema.get("discriminator") != "model":
-        return
-
-    # Always refresh the mapping to get latest models
-    fresh_mapping = get_llm_discriminator_mapping()
-    if fresh_mapping is not None:
-        field_schema["discriminator_mapping"] = fresh_mapping
-
-
-def update_schema_with_llm_registry(
-    schema: dict[str, Any], model_class: type | None = None
-) -> None:
-    """
-    Update a JSON schema with current LLM registry data.
-
-    Refreshes:
-    1. Model options for LLM model selection fields (dropdown choices)
-    2. Discriminator mappings for credentials fields (model → provider)
-
-    Args:
-        schema: The JSON schema to update (mutated in-place)
-        model_class: The Pydantic model class (optional, for field introspection)
-    """
-    properties = schema.get("properties", {})
-
-    for field_name, field_schema in properties.items():
-        if not isinstance(field_schema, dict):
-            continue
-
-        # Refresh model options for LLM model fields
-        if model_class and hasattr(model_class, "model_fields"):
-            field_info = model_class.model_fields.get(field_name)
-            if field_info and is_llm_model_field(field_name, field_info):
-                try:
-                    refresh_llm_model_options(field_schema)
-                except Exception as exc:
-                    logger.warning(
-                        "Failed to refresh LLM options for field %s: %s",
-                        field_name,
-                        exc,
-                    )
-
-        # Refresh discriminator mapping for fields that use model discrimination
-        try:
-            refresh_llm_discriminator_mapping(field_schema)
-        except Exception as exc:
-            logger.warning(
-                "Failed to refresh discriminator mapping for field %s: %s",
-                field_name,
-                exc,
-            )
--- a/autogpt_platform/backend/backend/data/model.py
+++ b/autogpt_platform/backend/backend/data/model.py
@@ -40,7 +40,6 @@ from pydantic_core import (
 )
 from typing_extensions import TypedDict

-from backend.data.llm_registry import update_schema_with_llm_registry
 from backend.integrations.providers import ProviderName
 from backend.util.json import loads as json_loads
 from backend.util.settings import Secrets
@@ -545,9 +544,7 @@ class CredentialsMetaInput(BaseModel, Generic[CP, CT]):
            else:
                schema["credentials_provider"] = allowed_providers
            schema["credentials_types"] = model_class.allowed_cred_types()
-
-        # Ensure LLM discriminators are populated (delegates to shared helper)
-        update_schema_with_llm_registry(schema, model_class)
+        # Do not return anything, just mutate schema in place

    model_config = ConfigDict(
        json_schema_extra=_add_json_schema_extra,  # type: ignore
@@ -696,20 +693,16 @@ def CredentialsField(
    This is enforced by the `BlockSchema` base class.
    """

-    # Build field_schema_extra - always include discriminator and mapping if discriminator is set
-    field_schema_extra: dict[str, Any] = {}
-
-    # Always include discriminator if provided
-    if discriminator is not None:
-        field_schema_extra["discriminator"] = discriminator
-        # Always include discriminator_mapping when discriminator is set (even if empty initially)
-        field_schema_extra["discriminator_mapping"] = discriminator_mapping or {}
-
-    # Include other optional fields (only if not None)
-    if required_scopes:
-        field_schema_extra["credentials_scopes"] = list(required_scopes)
-    if discriminator_values:
-        field_schema_extra["discriminator_values"] = discriminator_values
+    field_schema_extra = {
+        k: v
+        for k, v in {
+            "credentials_scopes": list(required_scopes) or None,
+            "discriminator": discriminator,
+            "discriminator_mapping": discriminator_mapping,
+            "discriminator_values": discriminator_values,
+        }.items()
+        if v is not None
+    }

    # Merge any json_schema_extra passed in kwargs
    if "json_schema_extra" in kwargs:
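
Note on the `CredentialsField` hunk above: the two ways of building `field_schema_extra` are not equivalent for every input. The sketch below extracts both into standalone functions so the difference can be checked. `build_extra_old` and `build_extra_new` are hypothetical names, and `required_scopes` is assumed to be a set, as `list(required_scopes)` in the hunk suggests.

```python
from typing import Any, Optional


def build_extra_old(
    required_scopes: set[str],
    discriminator: Optional[str],
    discriminator_mapping: Optional[dict[str, Any]],
    discriminator_values: Optional[set[Any]],
) -> dict[str, Any]:
    """Removed variant: the mapping is always present when a discriminator is set."""
    extra: dict[str, Any] = {}
    if discriminator is not None:
        extra["discriminator"] = discriminator
        extra["discriminator_mapping"] = discriminator_mapping or {}
    if required_scopes:
        extra["credentials_scopes"] = list(required_scopes)
    if discriminator_values:
        extra["discriminator_values"] = discriminator_values
    return extra


def build_extra_new(
    required_scopes: set[str],
    discriminator: Optional[str],
    discriminator_mapping: Optional[dict[str, Any]],
    discriminator_values: Optional[set[Any]],
) -> dict[str, Any]:
    """Added variant: keep every key whose value is not None."""
    return {
        k: v
        for k, v in {
            "credentials_scopes": list(required_scopes) or None,
            "discriminator": discriminator,
            "discriminator_mapping": discriminator_mapping,
            "discriminator_values": discriminator_values,
        }.items()
        if v is not None
    }


old = build_extra_old(set(), "model", None, None)
new = build_extra_new(set(), "model", None, None)
```

With a discriminator but no mapping, the removed variant emits an empty `discriminator_mapping` while the added variant omits the key. Consumers that read the key unconditionally may want to account for that.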
--- a/autogpt_platform/backend/backend/data/onboarding.py
+++ b/autogpt_platform/backend/backend/data/onboarding.py
@@ -41,6 +41,7 @@ FrontendOnboardingStep = Literal[
    OnboardingStep.AGENT_NEW_RUN,
    OnboardingStep.AGENT_INPUT,
    OnboardingStep.CONGRATS,
+    OnboardingStep.VISIT_COPILOT,
    OnboardingStep.MARKETPLACE_VISIT,
    OnboardingStep.BUILDER_OPEN,
 ]
@@ -122,6 +123,9 @@ async def update_user_onboarding(user_id: str, data: UserOnboardingUpdate):
 async def _reward_user(user_id: str, onboarding: UserOnboarding, step: OnboardingStep):
    reward = 0
    match step:
+        # Welcome bonus for visiting copilot ($5 = 500 credits)
+        case OnboardingStep.VISIT_COPILOT:
+            reward = 500
        # Reward user when they clicked New Run during onboarding
        # This is because they need credits before scheduling a run (next step)
        # This is seen as a reward for the GET_RESULTS step in the wallet
--- a/autogpt_platform/backend/backend/data/workspace.py
+++ b/autogpt_platform/backend/backend/data/workspace.py
@@ -0,0 +1,276 @@
+"""
+Database CRUD operations for User Workspace.
+
+This module provides functions for managing user workspaces and workspace files.
+"""
+
+import logging
+from datetime import datetime, timezone
+from typing import Optional
+
+from prisma.models import UserWorkspace, UserWorkspaceFile
+from prisma.types import UserWorkspaceFileWhereInput
+
+from backend.util.json import SafeJson
+
+logger = logging.getLogger(__name__)
+
+
+async def get_or_create_workspace(user_id: str) -> UserWorkspace:
+    """
+    Get user's workspace, creating one if it doesn't exist.
+
+    Uses upsert to handle race conditions when multiple concurrent requests
+    attempt to create a workspace for the same user.
+
+    Args:
+        user_id: The user's ID
+
+    Returns:
+        UserWorkspace instance
+    """
+    workspace = await UserWorkspace.prisma().upsert(
+        where={"userId": user_id},
+        data={
+            "create": {"userId": user_id},
+            "update": {},  # No updates needed if exists
+        },
+    )
+
+    return workspace
+
+
+async def get_workspace(user_id: str) -> Optional[UserWorkspace]:
+    """
+    Get user's workspace if it exists.
+
+    Args:
+        user_id: The user's ID
+
+    Returns:
+        UserWorkspace instance or None
+    """
+    return await UserWorkspace.prisma().find_unique(where={"userId": user_id})
+
+
+async def create_workspace_file(
+    workspace_id: str,
+    file_id: str,
+    name: str,
+    path: str,
+    storage_path: str,
+    mime_type: str,
+    size_bytes: int,
+    checksum: Optional[str] = None,
+    metadata: Optional[dict] = None,
+) -> UserWorkspaceFile:
+    """
+    Create a new workspace file record.
+
+    Args:
+        workspace_id: The workspace ID
+        file_id: The file ID (same as used in storage path for consistency)
+        name: User-visible filename
+        path: Virtual path (e.g., "/documents/report.pdf")
+        storage_path: Actual storage path (GCS or local)
+        mime_type: MIME type of the file
+        size_bytes: File size in bytes
+        checksum: Optional SHA256 checksum
+        metadata: Optional additional metadata
+
+    Returns:
+        Created UserWorkspaceFile instance
+    """
+    # Normalize path to start with /
+    if not path.startswith("/"):
+        path = f"/{path}"
+
+    file = await UserWorkspaceFile.prisma().create(
+        data={
+            "id": file_id,
+            "workspaceId": workspace_id,
+            "name": name,
+            "path": path,
+            "storagePath": storage_path,
+            "mimeType": mime_type,
+            "sizeBytes": size_bytes,
+            "checksum": checksum,
+            "metadata": SafeJson(metadata or {}),
+        }
+    )
+
+    logger.info(
+        f"Created workspace file {file.id} at path {path} "
+        f"in workspace {workspace_id}"
+    )
+    return file
+
+
+async def get_workspace_file(
+    file_id: str,
+    workspace_id: Optional[str] = None,
+) -> Optional[UserWorkspaceFile]:
+    """
+    Get a workspace file by ID.
+
+    Args:
+        file_id: The file ID
+        workspace_id: Optional workspace ID for validation
+
+    Returns:
+        UserWorkspaceFile instance or None
+    """
+    where_clause: dict = {"id": file_id, "isDeleted": False}
+    if workspace_id:
+        where_clause["workspaceId"] = workspace_id
+
+    return await UserWorkspaceFile.prisma().find_first(where=where_clause)
+
+
+async def get_workspace_file_by_path(
+    workspace_id: str,
+    path: str,
+) -> Optional[UserWorkspaceFile]:
+    """
+    Get a workspace file by its virtual path.
+
+    Args:
+        workspace_id: The workspace ID
+        path: Virtual path
+
+    Returns:
+        UserWorkspaceFile instance or None
+    """
+    # Normalize path
+    if not path.startswith("/"):
+        path = f"/{path}"
+
+    return await UserWorkspaceFile.prisma().find_first(
+        where={
+            "workspaceId": workspace_id,
+            "path": path,
+            "isDeleted": False,
+        }
+    )
+
+
+async def list_workspace_files(
+    workspace_id: str,
+    path_prefix: Optional[str] = None,
+    include_deleted: bool = False,
+    limit: Optional[int] = None,
+    offset: int = 0,
+) -> list[UserWorkspaceFile]:
+    """
+    List files in a workspace.
+
+    Args:
+        workspace_id: The workspace ID
+        path_prefix: Optional path prefix to filter (e.g., "/documents/")
+        include_deleted: Whether to include soft-deleted files
+        limit: Maximum number of files to return
+        offset: Number of files to skip
+
+    Returns:
+        List of UserWorkspaceFile instances
+    """
+    where_clause: UserWorkspaceFileWhereInput = {"workspaceId": workspace_id}
+
+    if not include_deleted:
+        where_clause["isDeleted"] = False
+
+    if path_prefix:
+        # Normalize prefix
+        if not path_prefix.startswith("/"):
+            path_prefix = f"/{path_prefix}"
+        where_clause["path"] = {"startswith": path_prefix}
+
+    return await UserWorkspaceFile.prisma().find_many(
+        where=where_clause,
+        order={"createdAt": "desc"},
+        take=limit,
+        skip=offset,
+    )
+
+
+async def count_workspace_files(
+    workspace_id: str,
+    path_prefix: Optional[str] = None,
+    include_deleted: bool = False,
+) -> int:
+    """
+    Count files in a workspace.
+
+    Args:
+        workspace_id: The workspace ID
+        path_prefix: Optional path prefix to filter (e.g., "/sessions/abc123/")
+        include_deleted: Whether to include soft-deleted files
+
+    Returns:
+        Number of files
+    """
+    where_clause: dict = {"workspaceId": workspace_id}
+    if not include_deleted:
+        where_clause["isDeleted"] = False
+
+    if path_prefix:
+        # Normalize prefix
+        if not path_prefix.startswith("/"):
+            path_prefix = f"/{path_prefix}"
+        where_clause["path"] = {"startswith": path_prefix}
+
+    return await UserWorkspaceFile.prisma().count(where=where_clause)
+
+
+async def soft_delete_workspace_file(
+    file_id: str,
+    workspace_id: Optional[str] = None,
+) -> Optional[UserWorkspaceFile]:
+    """
+    Soft-delete a workspace file.
+
+    The path is modified to include a deletion timestamp to free up the original
+    path for new files while preserving the record for potential recovery.
+
+    Args:
+        file_id: The file ID
+        workspace_id: Optional workspace ID for validation
+
+    Returns:
+        Updated UserWorkspaceFile instance or None if not found
+    """
+    # First verify the file exists (and belongs to the workspace, if one is given)
+    file = await get_workspace_file(file_id, workspace_id)
+    if file is None:
+        return None
+
+    deleted_at = datetime.now(timezone.utc)
+    # Modify path to free up the unique constraint for new files at original path
+    # Format: {original_path}__deleted__{unix_timestamp_in_seconds}
+    deleted_path = f"{file.path}__deleted__{int(deleted_at.timestamp())}"
+
+    updated = await UserWorkspaceFile.prisma().update(
+        where={"id": file_id},
+        data={
+            "isDeleted": True,
+            "deletedAt": deleted_at,
+            "path": deleted_path,
+        },
+    )
+
+    logger.info(f"Soft-deleted workspace file {file_id}")
+    return updated
+
+
+async def get_workspace_total_size(workspace_id: str) -> int:
+    """
+    Get the total size of all non-deleted files in a workspace.
+
+    Args:
+        workspace_id: The workspace ID
+
+    Returns:
+        Total size in bytes
+    """
+    files = await list_workspace_files(workspace_id)
+    return sum(file.sizeBytes for file in files)
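Note on the workspace file helpers above: the path handling can be sketched in isolation as two pure functions. This is a minimal sketch and not part of the change. The names `normalize_path` and `deleted_path_for` are hypothetical; the logic mirrors the leading-slash normalization in `get_workspace_file_by_path` and the `{original_path}__deleted__{timestamp}` rename in `soft_delete_workspace_file`.

```python
from datetime import datetime, timezone


def normalize_path(path: str) -> str:
    """Ensure a virtual path starts with a leading slash (hypothetical helper)."""
    return path if path.startswith("/") else f"/{path}"


def deleted_path_for(path: str, deleted_at: datetime) -> str:
    """Build the renamed path that frees the original path on soft delete.

    Hypothetical helper; the suffix is the deletion time as whole Unix seconds.
    """
    return f"{path}__deleted__{int(deleted_at.timestamp())}"


# Example values are illustrative only.
when = datetime(2026, 1, 29, 12, 0, 0, tzinfo=timezone.utc)
original = normalize_path("documents/report.txt")
renamed = deleted_path_for(original, when)
print(original)
print(renamed)
```

Because the suffix has one-second resolution, two soft deletes of the same path within the same second would produce the same renamed path. Whether that matters depends on the unique constraint, which is not shown in this diff.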
--- a/autogpt_platform/backend/backend/executor/llm_registry_init.py
+++ b/autogpt_platform/backend/backend/executor/llm_registry_init.py
@@ -1,66 +0,0 @@
-"""
-Helper functions for LLM registry initialization in executor context.
-
-These functions handle refreshing the LLM registry when the executor starts
-and subscribing to real-time updates via Redis pub/sub.
-"""
-
-import logging
-
-from backend.data import db, llm_registry
-from backend.data.block import BlockSchema, initialize_blocks
-from backend.data.block_cost_config import refresh_llm_costs
-from backend.data.llm_registry import subscribe_to_registry_refresh
-
-logger = logging.getLogger(__name__)
-
-
-async def initialize_registry_for_executor() -> None:
-    """
-    Initialize blocks and refresh LLM registry in the executor context.
-
-    This must run in the executor's event loop to have access to the database.
-    """
-    try:
-        # Connect to database if not already connected
-        if not db.is_connected():
-            await db.connect()
-            logger.info("[GraphExecutor] Connected to database for registry refresh")
-
-        # Initialize blocks (internally refreshes LLM registry and costs)
-        await initialize_blocks()
-        logger.info("[GraphExecutor] Blocks initialized")
-    except Exception as exc:
-        logger.warning(
-            "[GraphExecutor] Failed to refresh LLM registry on startup: %s",
-            exc,
-            exc_info=True,
-        )
-
-
-async def refresh_registry_on_notification() -> None:
-    """Refresh LLM registry when notified via Redis pub/sub."""
-    try:
-        # Ensure DB is connected
-        if not db.is_connected():
-            await db.connect()
-
-        # Refresh registry and costs
-        await llm_registry.refresh_llm_registry()
-        refresh_llm_costs()
-
-        # Clear block schema caches so they regenerate with new model options
-        BlockSchema.clear_all_schema_caches()
-
-        logger.info("[GraphExecutor] LLM registry refreshed from notification")
-    except Exception as exc:
-        logger.error(
-            "[GraphExecutor] Failed to refresh LLM registry from notification: %s",
-            exc,
-            exc_info=True,
-        )
-
-
-async def subscribe_to_registry_updates() -> None:
-    """Subscribe to Redis pub/sub for LLM registry refresh notifications."""
-    await subscribe_to_registry_refresh(refresh_registry_on_notification)
--- a/autogpt_platform/backend/backend/executor/manager.py
+++ b/autogpt_platform/backend/backend/executor/manager.py
@@ -236,7 +236,14 @@ async def execute_node(
    input_size = len(input_data_str)
    log_metadata.debug("Executed node with input", input=input_data_str)

+    # Create node-specific execution context to avoid race conditions
+    # (multiple nodes can execute concurrently and would otherwise mutate shared state)
+    execution_context = execution_context.model_copy(
+        update={"node_id": node_id, "node_exec_id": node_exec_id}
+    )
+
    # Inject extra execution arguments for the blocks via kwargs
+    # Keep individual kwargs for backwards compatibility with existing blocks
    extra_exec_kwargs: dict = {
        "graph_id": graph_id,
        "graph_version": graph_version,
@@ -702,20 +709,6 @@ class ExecutionProcessor:
        )
        self.node_execution_thread.start()
        self.node_evaluation_thread.start()
-
-        # Initialize LLM registry and subscribe to updates
-        from backend.executor.llm_registry_init import (
-            initialize_registry_for_executor,
-            subscribe_to_registry_updates,
-        )
-
-        asyncio.run_coroutine_threadsafe(
-            initialize_registry_for_executor(), self.node_execution_loop
-        )
-        asyncio.run_coroutine_threadsafe(
-            subscribe_to_registry_updates(), self.node_execution_loop
-        )
-
        logger.info(f"[GraphExecutor] {self.tid} started")

    @error_logged(swallow=False)
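Note on the `execute_node` change above: the point of the per-node copy is that concurrently executing nodes never write to the same context object. The sketch below illustrates this with the standard library only. `dataclasses.replace` stands in for Pydantic's `model_copy(update=...)`, and the `Context` class and `context_for_node` function are hypothetical, not the real `ExecutionContext`.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Context:
    """Hypothetical stand-in for the shared execution context."""

    graph_exec_id: str
    node_id: Optional[str] = None
    node_exec_id: Optional[str] = None


def context_for_node(shared: Context, node_id: str, node_exec_id: str) -> Context:
    """Return a new context for one node; the shared instance is left untouched."""
    return replace(shared, node_id=node_id, node_exec_id=node_exec_id)


shared = Context(graph_exec_id="exec-1")
ctx_a = context_for_node(shared, "node-a", "node-exec-1")
ctx_b = context_for_node(shared, "node-b", "node-exec-2")
```

Each node receives its own object, so setting `node_id` for one node cannot leak into another node running at the same time.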
--- a/autogpt_platform/backend/backend/executor/utils.py
+++ b/autogpt_platform/backend/backend/executor/utils.py
@@ -892,11 +892,19 @@ async def add_graph_execution(
        settings = await gdb.get_graph_settings(user_id=user_id, graph_id=graph_id)

        execution_context = ExecutionContext(
+            # Execution identity
+            user_id=user_id,
+            graph_id=graph_id,
+            graph_exec_id=graph_exec.id,
+            graph_version=graph_exec.graph_version,
+            # Safety settings
            human_in_the_loop_safe_mode=settings.human_in_the_loop_safe_mode,
            sensitive_action_safe_mode=settings.sensitive_action_safe_mode,
+            # User settings
            user_timezone=(
                user.timezone if user.timezone != USER_TIMEZONE_NOT_SET else "UTC"
            ),
+            # Execution hierarchy
            root_execution_id=graph_exec.id,
        )

--- a/autogpt_platform/backend/backend/executor/utils_test.py
+++ b/autogpt_platform/backend/backend/executor/utils_test.py
@@ -348,6 +348,7 @@ async def test_add_graph_execution_is_repeatable(mocker: MockerFixture):
    mock_graph_exec.id = "execution-id-123"
    mock_graph_exec.node_executions = []  # Add this to avoid AttributeError
    mock_graph_exec.status = ExecutionStatus.QUEUED  # Required for race condition check
+    mock_graph_exec.graph_version = graph_version
    mock_graph_exec.to_graph_execution_entry.return_value = mocker.MagicMock()

    # Mock the queue and event bus
@@ -434,6 +435,9 @@ async def test_add_graph_execution_is_repeatable(mocker: MockerFixture):
    # Create a second mock execution for the sanity check
    mock_graph_exec_2 = mocker.MagicMock(spec=GraphExecutionWithNodes)
    mock_graph_exec_2.id = "execution-id-456"
+    mock_graph_exec_2.node_executions = []
+    mock_graph_exec_2.status = ExecutionStatus.QUEUED
+    mock_graph_exec_2.graph_version = graph_version
    mock_graph_exec_2.to_graph_execution_entry.return_value = mocker.MagicMock()

    # Reset mocks and set up for second call
@@ -614,6 +618,7 @@ async def test_add_graph_execution_with_nodes_to_skip(mocker: MockerFixture):
    mock_graph_exec.id = "execution-id-123"
    mock_graph_exec.node_executions = []
    mock_graph_exec.status = ExecutionStatus.QUEUED  # Required for race condition check
+    mock_graph_exec.graph_version = graph_version

    # Track what's passed to to_graph_execution_entry
    captured_kwargs = {}
--- a/autogpt_platform/backend/backend/server/v2/llm/db.py
+++ b/autogpt_platform/backend/backend/server/v2/llm/db.py
@@ -1,935 +0,0 @@
-from __future__ import annotations
-
-from typing import Any, Iterable, Sequence, cast
-
-import prisma
-import prisma.models
-
-from backend.data.db import transaction
-from backend.server.v2.llm import model as llm_model
-from backend.util.models import Pagination
-
-
-def _json_dict(value: Any | None) -> dict[str, Any]:
-    if not value:
-        return {}
-    if isinstance(value, dict):
-        return value
-    return {}
-
-
-def _map_cost(record: prisma.models.LlmModelCost) -> llm_model.LlmModelCost:
-    return llm_model.LlmModelCost(
-        id=record.id,
-        unit=record.unit,
-        credit_cost=record.creditCost,
-        credential_provider=record.credentialProvider,
-        credential_id=record.credentialId,
-        credential_type=record.credentialType,
-        currency=record.currency,
-        metadata=_json_dict(record.metadata),
-    )
-
-
-def _map_creator(
-    record: prisma.models.LlmModelCreator,
-) -> llm_model.LlmModelCreator:
-    return llm_model.LlmModelCreator(
-        id=record.id,
-        name=record.name,
-        display_name=record.displayName,
-        description=record.description,
-        website_url=record.websiteUrl,
-        logo_url=record.logoUrl,
-        metadata=_json_dict(record.metadata),
-    )
-
-
-def _map_model(record: prisma.models.LlmModel) -> llm_model.LlmModel:
-    costs = []
-    if record.Costs:
-        costs = [_map_cost(cost) for cost in record.Costs]
-
-    creator = None
-    if hasattr(record, "Creator") and record.Creator:
-        creator = _map_creator(record.Creator)
-
-    return llm_model.LlmModel(
-        id=record.id,
-        slug=record.slug,
-        display_name=record.displayName,
-        description=record.description,
-        provider_id=record.providerId,
-        creator_id=record.creatorId,
-        creator=creator,
-        context_window=record.contextWindow,
-        max_output_tokens=record.maxOutputTokens,
-        is_enabled=record.isEnabled,
-        is_recommended=record.isRecommended,
-        capabilities=_json_dict(record.capabilities),
-        metadata=_json_dict(record.metadata),
-        costs=costs,
-    )
-
-
-def _map_provider(record: prisma.models.LlmProvider) -> llm_model.LlmProvider:
-    models: list[llm_model.LlmModel] = []
-    if record.Models:
-        models = [_map_model(model) for model in record.Models]
-
-    return llm_model.LlmProvider(
-        id=record.id,
-        name=record.name,
-        display_name=record.displayName,
-        description=record.description,
-        default_credential_provider=record.defaultCredentialProvider,
-        default_credential_id=record.defaultCredentialId,
-        default_credential_type=record.defaultCredentialType,
-        supports_tools=record.supportsTools,
-        supports_json_output=record.supportsJsonOutput,
-        supports_reasoning=record.supportsReasoning,
-        supports_parallel_tool=record.supportsParallelTool,
-        metadata=_json_dict(record.metadata),
-        models=models,
-    )
-
-
-async def list_providers(
-    include_models: bool = True, enabled_only: bool = False
-) -> list[llm_model.LlmProvider]:
-    """
-    List all LLM providers.
-
-    Args:
-        include_models: Whether to include models for each provider
-        enabled_only: If True, only include enabled models (for public routes)
-    """
-    include: Any = None
-    if include_models:
-        model_where = {"isEnabled": True} if enabled_only else None
-        include = {
-            "Models": {
-                "include": {"Costs": True, "Creator": True},
-                "where": model_where,
-            }
-        }
-    records = await prisma.models.LlmProvider.prisma().find_many(include=include)
-    return [_map_provider(record) for record in records]
-
-
-async def upsert_provider(
-    request: llm_model.UpsertLlmProviderRequest,
-    provider_id: str | None = None,
-) -> llm_model.LlmProvider:
-    data: Any = {
-        "name": request.name,
-        "displayName": request.display_name,
-        "description": request.description,
-        "defaultCredentialProvider": request.default_credential_provider,
-        "defaultCredentialId": request.default_credential_id,
-        "defaultCredentialType": request.default_credential_type,
-        "supportsTools": request.supports_tools,
-        "supportsJsonOutput": request.supports_json_output,
-        "supportsReasoning": request.supports_reasoning,
-        "supportsParallelTool": request.supports_parallel_tool,
-        "metadata": prisma.Json(request.metadata or {}),
-    }
-    include: Any = {"Models": {"include": {"Costs": True, "Creator": True}}}
-    if provider_id:
-        record = await prisma.models.LlmProvider.prisma().update(
-            where={"id": provider_id},
-            data=data,
-            include=include,
-        )
-    else:
-        record = await prisma.models.LlmProvider.prisma().create(
-            data=data,
-            include=include,
-        )
-    if record is None:
-        raise ValueError("Failed to create/update provider")
-    return _map_provider(record)
-
-
-async def delete_provider(provider_id: str) -> bool:
-    """
-    Delete an LLM provider.
-
-    A provider can only be deleted if it has no associated models.
-    Due to onDelete: Restrict on LlmModel.Provider, the database will
-    block deletion if models exist.
-
-    Args:
-        provider_id: UUID of the provider to delete
-
-    Returns:
-        True if deleted successfully
-
-    Raises:
-        ValueError: If provider not found or has associated models
-    """
-    # Check if provider exists
-    provider = await prisma.models.LlmProvider.prisma().find_unique(
-        where={"id": provider_id},
-        include={"Models": True},
-    )
-    if not provider:
-        raise ValueError(f"Provider with id '{provider_id}' not found")
-
-    # Check if provider has any models
-    model_count = len(provider.Models) if provider.Models else 0
-    if model_count > 0:
-        raise ValueError(
-            f"Cannot delete provider '{provider.displayName}' because it has "
-            f"{model_count} model(s). Delete all models first."
-        )
-
-    # Safe to delete
-    await prisma.models.LlmProvider.prisma().delete(where={"id": provider_id})
-    return True
-
-
-async def list_models(
-    provider_id: str | None = None,
-    enabled_only: bool = False,
-    page: int = 1,
-    page_size: int = 50,
-) -> llm_model.LlmModelsResponse:
-    """
-    List LLM models with pagination.
-
-    Args:
-        provider_id: Optional filter by provider ID
-        enabled_only: If True, only return enabled models (for public routes)
-        page: Page number (1-indexed)
-        page_size: Number of models per page
-    """
-    where: Any = {}
-    if provider_id:
-        where["providerId"] = provider_id
-    if enabled_only:
-        where["isEnabled"] = True
-
-    # Get total count for pagination
-    total_items = await prisma.models.LlmModel.prisma().count(
-        where=where if where else None
-    )
-
-    # Calculate pagination
-    skip = (page - 1) * page_size
-    total_pages = (total_items + page_size - 1) // page_size if total_items > 0 else 0
-
-    records = await prisma.models.LlmModel.prisma().find_many(
-        where=where if where else None,
-        include={"Costs": True, "Creator": True},
-        skip=skip,
-        take=page_size,
-    )
-    models = [_map_model(record) for record in records]
-
-    return llm_model.LlmModelsResponse(
-        models=models,
-        pagination=Pagination(
-            total_items=total_items,
-            total_pages=total_pages,
-            current_page=page,
-            page_size=page_size,
-        ),
-    )
-
-
-def _cost_create_payload(
-    costs: Sequence[llm_model.LlmModelCostInput],
-) -> dict[str, Iterable[dict[str, Any]]]:
-
-    create_items = []
-    for cost in costs:
-        item: dict[str, Any] = {
-            "unit": cost.unit,
-            "creditCost": cost.credit_cost,
-            "credentialProvider": cost.credential_provider,
-        }
-        # Only include optional fields if they have values
-        if cost.credential_id:
-            item["credentialId"] = cost.credential_id
-        if cost.credential_type:
-            item["credentialType"] = cost.credential_type
-        if cost.currency:
-            item["currency"] = cost.currency
-        # Handle metadata - use Prisma Json type
-        if cost.metadata is not None and cost.metadata != {}:
-            item["metadata"] = prisma.Json(cost.metadata)
-        create_items.append(item)
-    return {"create": create_items}
-
-
-async def create_model(
-    request: llm_model.CreateLlmModelRequest,
-) -> llm_model.LlmModel:
-    data: Any = {
-        "slug": request.slug,
-        "displayName": request.display_name,
-        "description": request.description,
-        "Provider": {"connect": {"id": request.provider_id}},
-        "contextWindow": request.context_window,
-        "maxOutputTokens": request.max_output_tokens,
-        "isEnabled": request.is_enabled,
-        "capabilities": prisma.Json(request.capabilities or {}),
-        "metadata": prisma.Json(request.metadata or {}),
-        "Costs": _cost_create_payload(request.costs),
-    }
-    if request.creator_id:
-        data["Creator"] = {"connect": {"id": request.creator_id}}
-
-    record = await prisma.models.LlmModel.prisma().create(
-        data=data,
-        include={"Costs": True, "Creator": True, "Provider": True},
-    )
-    return _map_model(record)
-
-
-async def update_model(
-    model_id: str,
-    request: llm_model.UpdateLlmModelRequest,
-) -> llm_model.LlmModel:
-    # Build scalar field updates (non-relation fields)
-    scalar_data: Any = {}
-    if request.display_name is not None:
-        scalar_data["displayName"] = request.display_name
-    if request.description is not None:
-        scalar_data["description"] = request.description
-    if request.context_window is not None:
-        scalar_data["contextWindow"] = request.context_window
-    if request.max_output_tokens is not None:
-        scalar_data["maxOutputTokens"] = request.max_output_tokens
-    if request.is_enabled is not None:
-        scalar_data["isEnabled"] = request.is_enabled
-    if request.capabilities is not None:
-        scalar_data["capabilities"] = request.capabilities
-    if request.metadata is not None:
-        scalar_data["metadata"] = request.metadata
-    # Foreign keys can be updated directly as scalar fields
-    if request.provider_id is not None:
-        scalar_data["providerId"] = request.provider_id
-    if request.creator_id is not None:
-        # Empty string means remove the creator
-        scalar_data["creatorId"] = request.creator_id if request.creator_id else None
-
-    # If we have costs to update, we need to handle them separately
-    # because nested writes have different constraints
-    if request.costs is not None:
-        # Wrap cost replacement in a transaction for atomicity
-        async with transaction() as tx:
-            # First update scalar fields
-            if scalar_data:
-                await tx.llmmodel.update(
-                    where={"id": model_id},
-                    data=scalar_data,
-                )
-            # Then handle costs: delete existing and create new
-            await tx.llmmodelcost.delete_many(where={"llmModelId": model_id})
-            if request.costs:
-                cost_payload = _cost_create_payload(request.costs)
-                for cost_item in cost_payload["create"]:
-                    cost_item["llmModelId"] = model_id
-                    await tx.llmmodelcost.create(data=cast(Any, cost_item))
-        # Fetch the updated record (outside transaction)
-        record = await prisma.models.LlmModel.prisma().find_unique(
-            where={"id": model_id},
-            include={"Costs": True, "Creator": True},
-        )
-    else:
-        # No costs update - simple update
-        record = await prisma.models.LlmModel.prisma().update(
-            where={"id": model_id},
-            data=scalar_data,
-            include={"Costs": True, "Creator": True},
-        )
-
-    if not record:
-        raise ValueError(f"Model with id '{model_id}' not found")
-    return _map_model(record)
-
-
-async def toggle_model(
-    model_id: str,
-    is_enabled: bool,
-    migrate_to_slug: str | None = None,
-    migration_reason: str | None = None,
-    custom_credit_cost: int | None = None,
-) -> llm_model.ToggleLlmModelResponse:
-    """
-    Toggle a model's enabled status, optionally migrating workflows when disabling.
-
-    Args:
-        model_id: UUID of the model to toggle
-        is_enabled: New enabled status
-        migrate_to_slug: If disabling and this is provided, migrate all workflows
-                         using this model to the specified replacement model
-        migration_reason: Optional reason for the migration (e.g., "Provider outage")
-        custom_credit_cost: Optional custom pricing override for migrated workflows.
-                           When set, the billing system should use this cost instead
-                           of the target model's cost for affected nodes.
-
-    Returns:
-        ToggleLlmModelResponse with the updated model and optional migration stats
-    """
-    import json
-
-    # Get the model being toggled
-    model = await prisma.models.LlmModel.prisma().find_unique(
-        where={"id": model_id}, include={"Costs": True}
-    )
-    if not model:
-        raise ValueError(f"Model with id '{model_id}' not found")
-
-    nodes_migrated = 0
-    migration_id: str | None = None
-
-    # If disabling with migration, perform migration first
-    if not is_enabled and migrate_to_slug:
-        # Validate replacement model exists and is enabled
-        replacement = await prisma.models.LlmModel.prisma().find_unique(
-            where={"slug": migrate_to_slug}
-        )
-        if not replacement:
-            raise ValueError(f"Replacement model '{migrate_to_slug}' not found")
-        if not replacement.isEnabled:
-            raise ValueError(
-                f"Replacement model '{migrate_to_slug}' is disabled. "
-                f"Please enable it before using it as a replacement."
-            )
-
-        # Perform all operations atomically within a single transaction
-        # This ensures no nodes are missed between query and update
-        async with transaction() as tx:
-            # Get the IDs of nodes that will be migrated (inside transaction for consistency)
-            node_ids_result = await tx.query_raw(
-                """
-                SELECT id
-                FROM "AgentNode"
-                WHERE "constantInput"::jsonb->>'model' = $1
-                FOR UPDATE
-                """,
-                model.slug,
-            )
-            migrated_node_ids = (
-                [row["id"] for row in node_ids_result] if node_ids_result else []
-            )
-            nodes_migrated = len(migrated_node_ids)
-
-            if nodes_migrated > 0:
-                # Update by IDs to ensure we only update the exact nodes we queried
-                # Use JSON array and jsonb_array_elements_text for safe parameterization
-                node_ids_json = json.dumps(migrated_node_ids)
-                await tx.execute_raw(
-                    """
-                    UPDATE "AgentNode"
-                    SET "constantInput" = JSONB_SET(
-                        "constantInput"::jsonb,
-                        '{model}',
-                        to_jsonb($1::text)
-                    )
-                    WHERE id::text IN (
-                        SELECT jsonb_array_elements_text($2::jsonb)
-                    )
-                    """,
-                    migrate_to_slug,
-                    node_ids_json,
-                )
-
-            record = await tx.llmmodel.update(
-                where={"id": model_id},
-                data={"isEnabled": is_enabled},
-                include={"Costs": True},
-            )
-
-            # Create migration record for revert capability
-            if nodes_migrated > 0:
-                migration_data: Any = {
-                    "sourceModelSlug": model.slug,
-                    "targetModelSlug": migrate_to_slug,
-                    "reason": migration_reason,
-                    "migratedNodeIds": json.dumps(migrated_node_ids),
-                    "nodeCount": nodes_migrated,
-                    "customCreditCost": custom_credit_cost,
-                }
-                migration_record = await tx.llmmodelmigration.create(
-                    data=migration_data
-                )
-                migration_id = migration_record.id
-    else:
-        # Simple toggle without migration
-        record = await prisma.models.LlmModel.prisma().update(
-            where={"id": model_id},
-            data={"isEnabled": is_enabled},
-            include={"Costs": True},
-        )
-
-    if record is None:
-        raise ValueError(f"Model with id '{model_id}' not found")
-    return llm_model.ToggleLlmModelResponse(
-        model=_map_model(record),
-        nodes_migrated=nodes_migrated,
-        migrated_to_slug=migrate_to_slug if nodes_migrated > 0 else None,
-        migration_id=migration_id,
-    )
-
-
-async def get_model_usage(model_id: str) -> llm_model.LlmModelUsageResponse:
-    """Get usage count for a model."""
-    import prisma as prisma_module
-
-    model = await prisma.models.LlmModel.prisma().find_unique(where={"id": model_id})
-    if not model:
-        raise ValueError(f"Model with id '{model_id}' not found")
-
-    count_result = await prisma_module.get_client().query_raw(
-        """
-        SELECT COUNT(*) as count
-        FROM "AgentNode"
-        WHERE "constantInput"::jsonb->>'model' = $1
-        """,
-        model.slug,
-    )
-    node_count = int(count_result[0]["count"]) if count_result else 0
-
-    return llm_model.LlmModelUsageResponse(model_slug=model.slug, node_count=node_count)
-
-
-async def delete_model(
-    model_id: str, replacement_model_slug: str | None = None
-) -> llm_model.DeleteLlmModelResponse:
-    """
-    Delete a model and optionally migrate all AgentNodes using it to a replacement model.
-
-    This performs an atomic operation within a database transaction:
-    1. Validates the model exists
-    2. Counts affected nodes
-    3. If nodes exist, validates replacement model and migrates them
-    4. Deletes the LlmModel record (CASCADE deletes costs)
-
-    Args:
-        model_id: UUID of the model to delete
-        replacement_model_slug: Slug of the model to migrate to (required only if nodes use this model)
-
-    Returns:
-        DeleteLlmModelResponse with migration stats
-
-    Raises:
-        ValueError: If model not found, nodes exist but no replacement provided,
-                    replacement not found, or replacement is disabled
-    """
-    # 1. Get the model being deleted (validation - outside transaction)
-    model = await prisma.models.LlmModel.prisma().find_unique(
-        where={"id": model_id}, include={"Costs": True}
-    )
-    if not model:
-        raise ValueError(f"Model with id '{model_id}' not found")
-
-    deleted_slug = model.slug
-    deleted_display_name = model.displayName
-
-    # 2. Count affected nodes first to determine if replacement is needed
-    import prisma as prisma_module
-
-    count_result = await prisma_module.get_client().query_raw(
-        """
-        SELECT COUNT(*) as count
-        FROM "AgentNode"
-        WHERE "constantInput"::jsonb->>'model' = $1
-        """,
-        deleted_slug,
-    )
-    nodes_to_migrate = int(count_result[0]["count"]) if count_result else 0
-
-    # 3. Validate replacement model only if there are nodes to migrate
-    if nodes_to_migrate > 0:
-        if not replacement_model_slug:
-            raise ValueError(
-                f"Cannot delete model '{deleted_slug}': {nodes_to_migrate} workflow node(s) "
-                f"are using it. Please provide a replacement_model_slug to migrate them."
-            )
-        replacement = await prisma.models.LlmModel.prisma().find_unique(
-            where={"slug": replacement_model_slug}
-        )
-        if not replacement:
-            raise ValueError(f"Replacement model '{replacement_model_slug}' not found")
-        if not replacement.isEnabled:
-            raise ValueError(
-                f"Replacement model '{replacement_model_slug}' is disabled. "
-                f"Please enable it before using it as a replacement."
-            )
-
-    # 4. Perform migration (if needed) and deletion atomically within a transaction
-    async with transaction() as tx:
-        # Migrate all AgentNode.constantInput->model to replacement
-        if nodes_to_migrate > 0 and replacement_model_slug:
-            await tx.execute_raw(
-                """
-                UPDATE "AgentNode"
-                SET "constantInput" = JSONB_SET(
-                    "constantInput"::jsonb,
-                    '{model}',
-                    to_jsonb($1::text)
-                )
-                WHERE "constantInput"::jsonb->>'model' = $2
-                """,
-                replacement_model_slug,
-                deleted_slug,
-            )
-
-        # Delete the model (CASCADE will delete costs automatically)
-        await tx.llmmodel.delete(where={"id": model_id})
-
-    # Build appropriate message based on whether migration happened
-    if nodes_to_migrate > 0:
-        message = (
-            f"Successfully deleted model '{deleted_display_name}' ({deleted_slug}) "
-            f"and migrated {nodes_to_migrate} workflow node(s) to '{replacement_model_slug}'."
-        )
-    else:
-        message = (
-            f"Successfully deleted model '{deleted_display_name}' ({deleted_slug}). "
-            f"No workflows were using this model."
-        )
-
-    return llm_model.DeleteLlmModelResponse(
-        deleted_model_slug=deleted_slug,
-        deleted_model_display_name=deleted_display_name,
-        replacement_model_slug=replacement_model_slug,
-        nodes_migrated=nodes_to_migrate,
-        message=message,
-    )
-
-
-def _map_migration(
-    record: prisma.models.LlmModelMigration,
-) -> llm_model.LlmModelMigration:
-    return llm_model.LlmModelMigration(
-        id=record.id,
-        source_model_slug=record.sourceModelSlug,
-        target_model_slug=record.targetModelSlug,
-        reason=record.reason,
-        node_count=record.nodeCount,
-        custom_credit_cost=record.customCreditCost,
-        is_reverted=record.isReverted,
-        created_at=record.createdAt.isoformat(),
-        reverted_at=record.revertedAt.isoformat() if record.revertedAt else None,
-    )
-
-
-async def list_migrations(
-    include_reverted: bool = False,
-) -> list[llm_model.LlmModelMigration]:
-    """
-    List model migrations, optionally including reverted ones.
-
-    Args:
-        include_reverted: If True, include reverted migrations. Default is False.
-
-    Returns:
-        List of LlmModelMigration records
-    """
-    where: Any = None if include_reverted else {"isReverted": False}
-    records = await prisma.models.LlmModelMigration.prisma().find_many(
-        where=where,
-        order={"createdAt": "desc"},
-    )
-    return [_map_migration(record) for record in records]
-
-
-async def get_migration(migration_id: str) -> llm_model.LlmModelMigration | None:
-    """Get a specific migration by ID."""
-    record = await prisma.models.LlmModelMigration.prisma().find_unique(
-        where={"id": migration_id}
-    )
-    return _map_migration(record) if record else None
-
-
-async def revert_migration(
-    migration_id: str,
-    re_enable_source_model: bool = True,
-) -> llm_model.RevertMigrationResponse:
-    """
-    Revert a model migration, restoring affected nodes to their original model.
-
-    This only reverts the specific nodes that were migrated, not all nodes
-    currently using the target model.
-
-    Args:
-        migration_id: UUID of the migration to revert
-        re_enable_source_model: Whether to re-enable the source model if it's disabled
-
-    Returns:
-        RevertMigrationResponse with revert stats
-
-    Raises:
-        ValueError: If migration not found, already reverted, or source model not available
-    """
-    import json
-    from datetime import datetime, timezone
-
-    # Get the migration record
-    migration = await prisma.models.LlmModelMigration.prisma().find_unique(
-        where={"id": migration_id}
-    )
-    if not migration:
-        raise ValueError(f"Migration with id '{migration_id}' not found")
-
-    if migration.isReverted:
-        raise ValueError(
-            f"Migration '{migration_id}' has already been reverted "
-            f"on {migration.revertedAt.isoformat() if migration.revertedAt else 'unknown date'}"
-        )
-
-    # Check if source model exists
-    source_model = await prisma.models.LlmModel.prisma().find_unique(
-        where={"slug": migration.sourceModelSlug}
-    )
-    if not source_model:
-        raise ValueError(
-            f"Source model '{migration.sourceModelSlug}' no longer exists. "
-            f"Cannot revert migration."
-        )
-
-    # Get the migrated node IDs (Prisma auto-parses JSONB to list)
-    migrated_node_ids: list[str] = (
-        migration.migratedNodeIds
-        if isinstance(migration.migratedNodeIds, list)
-        else json.loads(migration.migratedNodeIds)  # type: ignore
-    )
-    if not migrated_node_ids:
-        raise ValueError("No nodes to revert in this migration")
-
-    # Track if we need to re-enable the source model
-    source_model_was_disabled = not source_model.isEnabled
-    should_re_enable = source_model_was_disabled and re_enable_source_model
-    source_model_re_enabled = False
-
-    # Perform revert atomically
-    async with transaction() as tx:
-        # Re-enable the source model if requested and it was disabled
-        if should_re_enable:
-            await tx.llmmodel.update(
-                where={"id": source_model.id},
-                data={"isEnabled": True},
-            )
-            source_model_re_enabled = True
-
-        # Update only the specific nodes that were migrated
-        # We need to check that they still have the target model (haven't been changed since)
-        # Use a single batch update for efficiency
-        # Use JSON array and jsonb_array_elements_text for safe parameterization
-        node_ids_json = json.dumps(migrated_node_ids)
-        result = await tx.execute_raw(
-            """
-            UPDATE "AgentNode"
-            SET "constantInput" = JSONB_SET(
-                "constantInput"::jsonb,
-                '{model}',
-                to_jsonb($1::text)
-            )
-            WHERE id::text IN (
-                SELECT jsonb_array_elements_text($2::jsonb)
-            )
-            AND "constantInput"::jsonb->>'model' = $3
-            """,
-            migration.sourceModelSlug,
-            node_ids_json,
-            migration.targetModelSlug,
-        )
-        nodes_reverted = result if result else 0
-
-        # Mark migration as reverted
-        await tx.llmmodelmigration.update(
-            where={"id": migration_id},
-            data={
-                "isReverted": True,
-                "revertedAt": datetime.now(timezone.utc),
-            },
-        )
-
-    # Calculate nodes that were already changed since migration
-    nodes_already_changed = len(migrated_node_ids) - nodes_reverted
-
-    # Build appropriate message
-    message_parts = [
-        f"Successfully reverted migration: {nodes_reverted} node(s) restored "
-        f"from '{migration.targetModelSlug}' to '{migration.sourceModelSlug}'."
-    ]
-    if nodes_already_changed > 0:
-        message_parts.append(
-            f" {nodes_already_changed} node(s) were already changed and not reverted."
-        )
-    if source_model_re_enabled:
-        message_parts.append(
-            f" Model '{migration.sourceModelSlug}' has been re-enabled."
-        )
-
-    return llm_model.RevertMigrationResponse(
-        migration_id=migration_id,
-        source_model_slug=migration.sourceModelSlug,
-        target_model_slug=migration.targetModelSlug,
-        nodes_reverted=nodes_reverted,
-        nodes_already_changed=nodes_already_changed,
-        source_model_re_enabled=source_model_re_enabled,
-        message="".join(message_parts),
-    )
-
-
-# ============================================================================
-# Creator CRUD operations
-# ============================================================================
-
-
-async def list_creators() -> list[llm_model.LlmModelCreator]:
-    """List all LLM model creators."""
-    records = await prisma.models.LlmModelCreator.prisma().find_many(
-        order={"displayName": "asc"}
-    )
-    return [_map_creator(record) for record in records]
-
-
-async def get_creator(creator_id: str) -> llm_model.LlmModelCreator | None:
-    """Get a specific creator by ID."""
-    record = await prisma.models.LlmModelCreator.prisma().find_unique(
-        where={"id": creator_id}
-    )
-    return _map_creator(record) if record else None
-
-
-async def upsert_creator(
-    request: llm_model.UpsertLlmCreatorRequest,
-    creator_id: str | None = None,
-) -> llm_model.LlmModelCreator:
-    """Create or update a model creator."""
-    data: Any = {
-        "name": request.name,
-        "displayName": request.display_name,
-        "description": request.description,
-        "websiteUrl": request.website_url,
-        "logoUrl": request.logo_url,
-        "metadata": prisma.Json(request.metadata or {}),
-    }
-    if creator_id:
-        record = await prisma.models.LlmModelCreator.prisma().update(
-            where={"id": creator_id},
-            data=data,
-        )
-    else:
-        record = await prisma.models.LlmModelCreator.prisma().create(data=data)
-    if record is None:
-        raise ValueError("Failed to create/update creator")
-    return _map_creator(record)
-
-
-async def delete_creator(creator_id: str) -> bool:
-    """
-    Delete a model creator.
-
-    This will set creatorId to NULL on all associated models (due to onDelete: SetNull).
-
-    Args:
-        creator_id: UUID of the creator to delete
-
-    Returns:
-        True if deleted successfully
-
-    Raises:
-        ValueError: If creator not found
-    """
-    creator = await prisma.models.LlmModelCreator.prisma().find_unique(
-        where={"id": creator_id}
-    )
-    if not creator:
-        raise ValueError(f"Creator with id '{creator_id}' not found")
-
-    await prisma.models.LlmModelCreator.prisma().delete(where={"id": creator_id})
-    return True
-
-
-async def get_recommended_model() -> llm_model.LlmModel | None:
-    """
-    Get the currently recommended LLM model.
-
-    Returns:
-        The recommended model, or None if no model is marked as recommended.
-    """
-    record = await prisma.models.LlmModel.prisma().find_first(
-        where={"isRecommended": True, "isEnabled": True},
-        include={"Costs": True, "Creator": True},
-    )
-    return _map_model(record) if record else None
-
-
-async def set_recommended_model(
-    model_id: str,
-) -> tuple[llm_model.LlmModel, str | None]:
-    """
-    Set a model as the recommended model.
-
-    This will clear the isRecommended flag from any other model and set it
-    on the specified model. The model must be enabled.
-
-    Args:
-        model_id: UUID of the model to set as recommended
-
-    Returns:
-        Tuple of (the updated model, previous recommended model slug or None)
-
-    Raises:
-        ValueError: If model not found or not enabled
-    """
-    # First, verify the model exists and is enabled
-    target_model = await prisma.models.LlmModel.prisma().find_unique(
-        where={"id": model_id}
-    )
-    if not target_model:
-        raise ValueError(f"Model with id '{model_id}' not found")
-    if not target_model.isEnabled:
-        raise ValueError(
-            f"Cannot set disabled model '{target_model.slug}' as recommended"
-        )
-
-    # Get the current recommended model (if any)
-    current_recommended = await prisma.models.LlmModel.prisma().find_first(
-        where={"isRecommended": True}
-    )
-    previous_slug = current_recommended.slug if current_recommended else None
-
-    # Use a transaction to ensure atomicity
-    async with transaction() as tx:
-        # Clear isRecommended from all models
-        await tx.llmmodel.update_many(
-            where={"isRecommended": True},
-            data={"isRecommended": False},
-        )
-        # Set the new recommended model
-        await tx.llmmodel.update(
-            where={"id": model_id},
-            data={"isRecommended": True},
-        )
-
-    # Fetch and return the updated model
-    updated_record = await prisma.models.LlmModel.prisma().find_unique(
-        where={"id": model_id},
-        include={"Costs": True, "Creator": True},
-    )
-    if not updated_record:
-        raise ValueError("Failed to fetch updated model")
-
-    return _map_model(updated_record), previous_slug
-
-
-async def get_recommended_model_slug() -> str | None:
-    """
-    Get the slug of the currently recommended LLM model.
-
-    Returns:
-        The slug of the recommended model, or None if no model is marked as recommended.
-    """
-    record = await prisma.models.LlmModel.prisma().find_first(
-        where={"isRecommended": True, "isEnabled": True},
-    )
-    return record.slug if record else None
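Review note: the removed `revert_migration` restores only the nodes recorded in the migration that still carry the target model, and reports the rest as already changed. A minimal in-memory sketch of that selection rule follows. It assumes nodes are plain dicts; the `revert_nodes` helper and the dict layout are hypothetical and stand in for the raw SQL update, they are not part of the codebase.

```python
def revert_nodes(nodes, migrated_ids, source_slug, target_slug):
    """Restore source_slug on migrated nodes that still use target_slug.

    Returns (nodes_reverted, nodes_already_changed), mirroring the two
    counters reported by the removed revert_migration().
    """
    migrated = set(migrated_ids)
    reverted = 0
    for node in nodes:
        still_on_target = node["constantInput"].get("model") == target_slug
        # Skip nodes outside the migration and nodes edited since the migration.
        if node["id"] in migrated and still_on_target:
            node["constantInput"]["model"] = source_slug
            reverted += 1
    # Same arithmetic as the original: recorded IDs minus rows actually updated.
    return reverted, len(migrated_ids) - reverted
```

A node that is on the target model but was never part of the migration is left alone, which is the behavior the docstring of `revert_migration` describes.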
--- a/autogpt_platform/backend/backend/server/v2/llm/model.py
+++ b/autogpt_platform/backend/backend/server/v2/llm/model.py
@@ -1,235 +0,0 @@
-from __future__ import annotations
-
-import re
-from datetime import datetime
-from typing import Any, Optional
-
-import prisma.enums
-import pydantic
-
-from backend.util.models import Pagination
-
-# Pattern for valid model slugs: alphanumeric start, then alphanumeric, dots, underscores, slashes, hyphens
-SLUG_PATTERN = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9._/-]*$")
-
-
-class LlmModelCost(pydantic.BaseModel):
-    id: str
-    unit: prisma.enums.LlmCostUnit = prisma.enums.LlmCostUnit.RUN
-    credit_cost: int
-    credential_provider: str
-    credential_id: Optional[str] = None
-    credential_type: Optional[str] = None
-    currency: Optional[str] = None
-    metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
-
-
-class LlmModelCreator(pydantic.BaseModel):
-    """Represents the organization that created/trained the model (e.g., OpenAI, Meta)."""
-
-    id: str
-    name: str
-    display_name: str
-    description: Optional[str] = None
-    website_url: Optional[str] = None
-    logo_url: Optional[str] = None
-    metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
-
-
-class LlmModel(pydantic.BaseModel):
-    id: str
-    slug: str
-    display_name: str
-    description: Optional[str] = None
-    provider_id: str
-    creator_id: Optional[str] = None
-    creator: Optional[LlmModelCreator] = None
-    context_window: int
-    max_output_tokens: Optional[int] = None
-    is_enabled: bool = True
-    is_recommended: bool = False
-    capabilities: dict[str, Any] = pydantic.Field(default_factory=dict)
-    metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
-    costs: list[LlmModelCost] = pydantic.Field(default_factory=list)
-
-
-class LlmProvider(pydantic.BaseModel):
-    id: str
-    name: str
-    display_name: str
-    description: Optional[str] = None
-    default_credential_provider: Optional[str] = None
-    default_credential_id: Optional[str] = None
-    default_credential_type: Optional[str] = None
-    supports_tools: bool = True
-    supports_json_output: bool = True
-    supports_reasoning: bool = False
-    supports_parallel_tool: bool = False
-    metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
-    models: list[LlmModel] = pydantic.Field(default_factory=list)
-
-
-class LlmProvidersResponse(pydantic.BaseModel):
-    providers: list[LlmProvider]
-
-
-class LlmModelsResponse(pydantic.BaseModel):
-    models: list[LlmModel]
-    pagination: Optional[Pagination] = None
-
-
-class LlmCreatorsResponse(pydantic.BaseModel):
-    creators: list[LlmModelCreator]
-
-
-class UpsertLlmProviderRequest(pydantic.BaseModel):
-    name: str
-    display_name: str
-    description: Optional[str] = None
-    default_credential_provider: Optional[str] = None
-    default_credential_id: Optional[str] = None
-    default_credential_type: Optional[str] = "api_key"
-    supports_tools: bool = True
-    supports_json_output: bool = True
-    supports_reasoning: bool = False
-    supports_parallel_tool: bool = False
-    metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
-
-
-class UpsertLlmCreatorRequest(pydantic.BaseModel):
-    name: str
-    display_name: str
-    description: Optional[str] = None
-    website_url: Optional[str] = None
-    logo_url: Optional[str] = None
-    metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
-
-
-class LlmModelCostInput(pydantic.BaseModel):
-    unit: prisma.enums.LlmCostUnit = prisma.enums.LlmCostUnit.RUN
-    credit_cost: int
-    credential_provider: str
-    credential_id: Optional[str] = None
-    credential_type: Optional[str] = "api_key"
-    currency: Optional[str] = None
-    metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
-
-
-class CreateLlmModelRequest(pydantic.BaseModel):
-    slug: str
-    display_name: str
-    description: Optional[str] = None
-    provider_id: str
-    creator_id: Optional[str] = None
-    context_window: int
-    max_output_tokens: Optional[int] = None
-    is_enabled: bool = True
-    capabilities: dict[str, Any] = pydantic.Field(default_factory=dict)
-    metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
-    costs: list[LlmModelCostInput]
-
-    @pydantic.field_validator("slug")
-    @classmethod
-    def validate_slug(cls, v: str) -> str:
-        if not v or len(v) > 100:
-            raise ValueError("Slug must be 1-100 characters")
-        if not SLUG_PATTERN.match(v):
-            raise ValueError(
-                "Slug must start with alphanumeric and contain only "
-                "alphanumeric characters, dots, underscores, slashes, or hyphens"
-            )
-        return v
-
-
-class UpdateLlmModelRequest(pydantic.BaseModel):
-    display_name: Optional[str] = None
-    description: Optional[str] = None
-    context_window: Optional[int] = None
-    max_output_tokens: Optional[int] = None
-    is_enabled: Optional[bool] = None
-    capabilities: Optional[dict[str, Any]] = None
-    metadata: Optional[dict[str, Any]] = None
-    provider_id: Optional[str] = None
-    creator_id: Optional[str] = None
-    costs: Optional[list[LlmModelCostInput]] = None
-
-
-class ToggleLlmModelRequest(pydantic.BaseModel):
-    is_enabled: bool
-    migrate_to_slug: Optional[str] = None
-    migration_reason: Optional[str] = None  # e.g., "Provider outage"
-    # Custom pricing override for migrated workflows. When set, billing should use
-    # this cost instead of the target model's cost for affected nodes.
-    # See LlmModelMigration in schema.prisma for full documentation.
-    custom_credit_cost: Optional[int] = None
-
-
-class ToggleLlmModelResponse(pydantic.BaseModel):
-    model: LlmModel
-    nodes_migrated: int = 0
-    migrated_to_slug: Optional[str] = None
-    migration_id: Optional[str] = None  # ID of the migration record for revert
-
-
-class DeleteLlmModelResponse(pydantic.BaseModel):
-    deleted_model_slug: str
-    deleted_model_display_name: str
-    replacement_model_slug: Optional[str] = None
-    nodes_migrated: int
-    message: str
-
-
-class LlmModelUsageResponse(pydantic.BaseModel):
-    model_slug: str
-    node_count: int
-
-
-# Migration tracking models
-class LlmModelMigration(pydantic.BaseModel):
-    id: str
-    source_model_slug: str
-    target_model_slug: str
-    reason: Optional[str] = None
-    node_count: int
-    # Custom pricing override - billing should use this instead of target model's cost
-    custom_credit_cost: Optional[int] = None
-    is_reverted: bool = False
-    created_at: datetime
-    reverted_at: Optional[datetime] = None
-
-
-class LlmMigrationsResponse(pydantic.BaseModel):
-    migrations: list[LlmModelMigration]
-
-
-class RevertMigrationRequest(pydantic.BaseModel):
-    re_enable_source_model: bool = (
-        True  # Whether to re-enable the source model if disabled
-    )
-
-
-class RevertMigrationResponse(pydantic.BaseModel):
-    migration_id: str
-    source_model_slug: str
-    target_model_slug: str
-    nodes_reverted: int
-    nodes_already_changed: int = (
-        0  # Nodes that were modified since migration (not reverted)
-    )
-    source_model_re_enabled: bool = False  # Whether the source model was re-enabled
-    message: str
-
-
-class SetRecommendedModelRequest(pydantic.BaseModel):
-    model_id: str
-
-
-class SetRecommendedModelResponse(pydantic.BaseModel):
-    model: LlmModel
-    previous_recommended_slug: Optional[str] = None
-    message: str
-
-
-class RecommendedModelResponse(pydantic.BaseModel):
-    model: Optional[LlmModel] = None
-    slug: Optional[str] = None
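Review note: the slug rule enforced by the removed `CreateLlmModelRequest.validate_slug` can be exercised without pydantic. The sketch below copies `SLUG_PATTERN` and the two checks from the deleted file; the standalone `check_slug` function is a hypothetical wrapper added only for illustration.

```python
import re

# Copied from the removed model.py: alphanumeric first character, then
# alphanumerics, dots, underscores, slashes, or hyphens.
SLUG_PATTERN = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9._/-]*$")


def check_slug(value: str) -> str:
    """Return the slug unchanged if valid, otherwise raise ValueError."""
    if not value or len(value) > 100:
        raise ValueError("Slug must be 1-100 characters")
    if not SLUG_PATTERN.match(value):
        raise ValueError(
            "Slug must start with alphanumeric and contain only "
            "alphanumeric characters, dots, underscores, slashes, or hyphens"
        )
    return value
```

Slugs such as `openai/gpt-4o` pass, while a leading hyphen, an embedded space, or more than 100 characters is rejected.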
--- a/autogpt_platform/backend/backend/server/v2/llm/routes.py
+++ b/autogpt_platform/backend/backend/server/v2/llm/routes.py
@@ -1,29 +0,0 @@
-import autogpt_libs.auth
-import fastapi
-
-from backend.server.v2.llm import db as llm_db
-from backend.server.v2.llm import model as llm_model
-
-router = fastapi.APIRouter(
-    prefix="/llm",
-    tags=["llm"],
-    dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
-)
-
-
-@router.get("/models", response_model=llm_model.LlmModelsResponse)
-async def list_models(
-    page: int = fastapi.Query(default=1, ge=1, description="Page number (1-indexed)"),
-    page_size: int = fastapi.Query(
-        default=50, ge=1, le=100, description="Number of models per page"
-    ),
-):
-    """List all enabled LLM models available to users."""
-    return await llm_db.list_models(enabled_only=True, page=page, page_size=page_size)
-
-
-@router.get("/providers", response_model=llm_model.LlmProvidersResponse)
-async def list_providers():
-    """List all LLM providers with their enabled models."""
-    providers = await llm_db.list_providers(include_models=True, enabled_only=True)
-    return llm_model.LlmProvidersResponse(providers=providers)
--- a/autogpt_platform/backend/backend/util/cloud_storage.py
+++ b/autogpt_platform/backend/backend/util/cloud_storage.py
@@ -13,6 +13,7 @@ import aiohttp
 from gcloud.aio import storage as async_gcs_storage
 from google.cloud import storage as gcs_storage

+from backend.util.gcs_utils import download_with_fresh_session, generate_signed_url
 from backend.util.settings import Config

 logger = logging.getLogger(__name__)
@@ -251,7 +252,7 @@ class CloudStorageHandler:
            f"in_task: {current_task is not None}"
        )

-        # Parse bucket and blob name from path
+        # Parse bucket and blob name from path (path already has gcs:// prefix removed)
        parts = path.split("/", 1)
        if len(parts) != 2:
            raise ValueError(f"Invalid GCS path: {path}")
@@ -261,50 +262,19 @@ class CloudStorageHandler:
        # Authorization check
        self._validate_file_access(blob_name, user_id, graph_exec_id)

-        # Use a fresh client for each download to avoid session issues
-        # This is less efficient but more reliable with the executor's event loop
-        logger.info("[CloudStorage] Creating fresh GCS client for download")
-
-        # Create a new session specifically for this download
-        session = aiohttp.ClientSession(
-            connector=aiohttp.TCPConnector(limit=10, force_close=True)
+        logger.info(
+            f"[CloudStorage] About to download from GCS - bucket: {bucket_name}, blob: {blob_name}"
        )

-        async_client = None
        try:
-            # Create a new GCS client with the fresh session
-            async_client = async_gcs_storage.Storage(session=session)
-
-            logger.info(
-                f"[CloudStorage] About to download from GCS - bucket: {bucket_name}, blob: {blob_name}"
-            )
-
-            # Download content using the fresh client
-            content = await async_client.download(bucket_name, blob_name)
+            content = await download_with_fresh_session(bucket_name, blob_name)
            logger.info(
                f"[CloudStorage] GCS download successful - size: {len(content)} bytes"
            )
-
-            # Clean up
-            await async_client.close()
-            await session.close()
-
            return content
-
+        except FileNotFoundError:
+            raise
        except Exception as e:
-            # Always try to clean up
-            if async_client is not None:
-                try:
-                    await async_client.close()
-                except Exception as cleanup_error:
-                    logger.warning(
-                        f"[CloudStorage] Error closing GCS client: {cleanup_error}"
-                    )
-            try:
-                await session.close()
-            except Exception as cleanup_error:
-                logger.warning(f"[CloudStorage] Error closing session: {cleanup_error}")
-
            # Log the specific error for debugging
            logger.error(
                f"[CloudStorage] GCS download failed - error: {str(e)}, "
@@ -319,10 +289,6 @@ class CloudStorageHandler:
                    f"current_task: {current_task}, "
                    f"bucket: {bucket_name}, blob: redacted for privacy"
                )
-
-            # Convert gcloud-aio exceptions to standard ones
-            if "404" in str(e) or "Not Found" in str(e):
-                raise FileNotFoundError(f"File not found: gcs://{path}")
            raise

    def _validate_file_access(
@@ -445,8 +411,7 @@ class CloudStorageHandler:
        graph_exec_id: str | None = None,
    ) -> str:
        """Generate signed URL for GCS with authorization."""
-
-        # Parse bucket and blob name from path
+        # Parse bucket and blob name from path (path already has gcs:// prefix removed)
        parts = path.split("/", 1)
        if len(parts) != 2:
            raise ValueError(f"Invalid GCS path: {path}")
@@ -456,21 +421,11 @@ class CloudStorageHandler:
        # Authorization check
        self._validate_file_access(blob_name, user_id, graph_exec_id)

-        # Use sync client for signed URLs since gcloud-aio doesn't support them
        sync_client = self._get_sync_gcs_client()
-        bucket = sync_client.bucket(bucket_name)
-        blob = bucket.blob(blob_name)
-
-        # Generate signed URL asynchronously using sync client
-        url = await asyncio.to_thread(
-            blob.generate_signed_url,
-            version="v4",
-            expiration=datetime.now(timezone.utc) + timedelta(hours=expiration_hours),
-            method="GET",
+        return await generate_signed_url(
+            sync_client, bucket_name, blob_name, expiration_hours * 3600
        )

-        return url
-
    async def delete_expired_files(self, provider: str = "gcs") -> int:
        """
        Delete files that have passed their expiration time.
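Review note: the new `except FileNotFoundError: raise` clause sits ahead of the generic handler so that a missing object skips the error logging. A minimal sketch of that ordering follows, assuming the download helper raises `FileNotFoundError` itself (the diff suggests this, since the 404 conversion was removed from the handler). The `fetch` wrapper and its `download` parameter are hypothetical stand-ins, not the real `download_with_fresh_session` signature.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def fetch(download, bucket_name: str, blob_name: str) -> bytes:
    """Call an async download callable, logging only unexpected failures."""
    try:
        return await download(bucket_name, blob_name)
    except FileNotFoundError:
        # Expected outcome for a missing object: propagate without error logs.
        raise
    except Exception as e:
        # Anything else is unexpected: log it, then re-raise unchanged.
        logger.error("download failed - error: %s", e)
        raise
```

Because `FileNotFoundError` is a subclass of `Exception`, the specific clause has to come first, otherwise the generic handler would catch it.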
--- a/autogpt_platform/backend/backend/util/exceptions.py
+++ b/autogpt_platform/backend/backend/util/exceptions.py
@@ -135,6 +135,12 @@ class GraphValidationError(ValueError):
        )


+class InvalidInputError(ValueError):
+    """Raised when user input validation fails (e.g., search term too long)"""
+
+    pass
+
+
 class DatabaseError(Exception):
    """Raised when there is an error interacting with the database"""

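Review note: because `InvalidInputError` subclasses `ValueError`, existing `except ValueError` handlers keep catching it, while new callers can target it specifically. A small sketch follows; the `validate_search_term` helper and the 100-character limit are hypothetical and chosen only to match the "search term too long" example in the docstring.

```python
class InvalidInputError(ValueError):
    """Raised when user input validation fails (e.g., search term too long)"""


MAX_SEARCH_TERM_LENGTH = 100  # hypothetical limit, for illustration only


def validate_search_term(term: str) -> str:
    """Return the term unchanged, or raise InvalidInputError if it is too long."""
    if len(term) > MAX_SEARCH_TERM_LENGTH:
        raise InvalidInputError(
            f"Search term too long: {len(term)} > {MAX_SEARCH_TERM_LENGTH} characters"
        )
    return term
```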
--- a/autogpt_platform/backend/backend/util/file.py
+++ b/autogpt_platform/backend/backend/util/file.py
@@ -5,13 +5,26 @@ import shutil
 import tempfile
 import uuid
 from pathlib import Path
+from typing import TYPE_CHECKING, Literal
 from urllib.parse import urlparse

 from backend.util.cloud_storage import get_cloud_storage_handler
 from backend.util.request import Requests
+from backend.util.settings import Config
 from backend.util.type import MediaFileType
 from backend.util.virus_scanner import scan_content_safe

+if TYPE_CHECKING:
+    from backend.data.execution import ExecutionContext
+
+# Return format options for store_media_file
+# - "for_local_processing": Returns local file path - use with ffmpeg, MoviePy, PIL, etc.
+# - "for_external_api": Returns data URI (base64) - use when sending content to external APIs
+# - "for_block_output": Returns best format for output - workspace:// in CoPilot, data URI in graphs
+MediaReturnFormat = Literal[
+    "for_local_processing", "for_external_api", "for_block_output"
+]
+
 TEMP_DIR = Path(tempfile.gettempdir()).resolve()

 # Maximum filename length (conservative limit for most filesystems)
@@ -67,42 +80,56 @@ def clean_exec_files(graph_exec_id: str, file: str = "") -> None:


 async def store_media_file(
-    graph_exec_id: str,
    file: MediaFileType,
-    user_id: str,
-    return_content: bool = False,
+    execution_context: "ExecutionContext",
+    *,
+    return_format: MediaReturnFormat,
 ) -> MediaFileType:
    """
-    Safely handle 'file' (a data URI, a URL, or a local path relative to {temp}/exec_file/{exec_id}),
-    placing or verifying it under:
+    Safely handle 'file' (a data URI, a URL, a workspace:// reference, or a local path
+    relative to {temp}/exec_file/{exec_id}), placing or verifying it under:
        {tempdir}/exec_file/{exec_id}/...

-    If 'return_content=True', return a data URI (data:<mime>;base64,<content>).
-    Otherwise, returns the file media path relative to the exec_id folder.
+    For each MediaFileType input:
+    - Data URI: decode and store locally
+    - URL: download and store locally
+    - workspace:// reference: read from workspace, store locally
+    - Local path: verify it exists in exec_file directory

-    For each MediaFileType type:
-    - Data URI:
-      -> decode and store in a new random file in that folder
-    - URL:
-      -> download and store in that folder
-    - Local path:
-      -> interpret as relative to that folder; verify it exists
-         (no copying, as it's presumably already there).
-         We realpath-check so no symlink or '..' can escape the folder.
+    Return format options:
+    - "for_local_processing": Returns local file path - use with ffmpeg, MoviePy, PIL, etc.
+    - "for_external_api": Returns data URI (base64) - use when sending to external APIs
+    - "for_block_output": Returns best format for output - workspace:// in CoPilot, data URI in graphs

-
-    :param graph_exec_id:  The unique ID of the graph execution.
-    :param file:           Data URI, URL, or local (relative) path.
-    :param return_content: If True, return a data URI of the file content.
-                           If False, return the *relative* path inside the exec_id folder.
-    :return:               The requested result: data URI or relative path of the media.
+    :param file:               Data URI, URL, workspace://, or local (relative) path.
+    :param execution_context:  ExecutionContext with user_id, graph_exec_id, workspace_id.
+    :param return_format:      What to return: "for_local_processing", "for_external_api", or "for_block_output".
+    :return:                   The requested result based on return_format.
    """
+    # Extract values from execution_context
+    graph_exec_id = execution_context.graph_exec_id
+    user_id = execution_context.user_id
+
+    if not graph_exec_id:
+        raise ValueError("execution_context.graph_exec_id is required")
+    if not user_id:
+        raise ValueError("execution_context.user_id is required")
+
+    # Create workspace_manager if we have workspace_id (with session scoping)
+    # Import here to avoid circular import (file.py → workspace.py → data → blocks → file.py)
+    from backend.util.workspace import WorkspaceManager
+
+    workspace_manager: WorkspaceManager | None = None
+    if execution_context.workspace_id:
+        workspace_manager = WorkspaceManager(
+            user_id, execution_context.workspace_id, execution_context.session_id
+        )
    # Build base path
    base_path = Path(get_exec_file_path(graph_exec_id, ""))
    base_path.mkdir(parents=True, exist_ok=True)

    # Security fix: Add disk space limits to prevent DoS
-    MAX_FILE_SIZE = 100 * 1024 * 1024  # 100MB per file
+    MAX_FILE_SIZE_BYTES = Config().max_file_size_mb * 1024 * 1024
    MAX_TOTAL_DISK_USAGE = 1024 * 1024 * 1024  # 1GB total per execution directory

    # Check total disk usage in base_path
@@ -142,9 +169,57 @@ async def store_media_file(
        """
        return str(absolute_path.relative_to(base))

-    # Check if this is a cloud storage path
+    # Get cloud storage handler for checking cloud paths
    cloud_storage = await get_cloud_storage_handler()
-    if cloud_storage.is_cloud_path(file):
+
+    # Track if the input came from workspace (don't re-save it)
+    is_from_workspace = file.startswith("workspace://")
+
+    # Check if this is a workspace file reference
+    if is_from_workspace:
+        if workspace_manager is None:
+            raise ValueError(
+                "Workspace file reference requires workspace context. "
+                "This file type is only available in CoPilot sessions."
+            )
+
+        # Parse workspace reference
+        # workspace://abc123 - by file ID
+        # workspace:///path/to/file.txt - by virtual path
+        file_ref = file[12:]  # Remove "workspace://"
+
+        if file_ref.startswith("/"):
+            # Path reference
+            workspace_content = await workspace_manager.read_file(file_ref)
+            file_info = await workspace_manager.get_file_info_by_path(file_ref)
+            filename = sanitize_filename(
+                file_info.name if file_info else f"{uuid.uuid4()}.bin"
+            )
+        else:
+            # ID reference
+            workspace_content = await workspace_manager.read_file_by_id(file_ref)
+            file_info = await workspace_manager.get_file_info(file_ref)
+            filename = sanitize_filename(
+                file_info.name if file_info else f"{uuid.uuid4()}.bin"
+            )
+
+        try:
+            target_path = _ensure_inside_base(base_path / filename, base_path)
+        except OSError as e:
+            raise ValueError(f"Invalid file path '{filename}': {e}") from e
+
+        # Check file size limit
+        if len(workspace_content) > MAX_FILE_SIZE_BYTES:
+            raise ValueError(
+                f"File too large: {len(workspace_content)} bytes > {MAX_FILE_SIZE_BYTES} bytes"
+            )
+
+        # Virus scan the workspace content before writing locally
+        await scan_content_safe(workspace_content, filename=filename)
+        target_path.write_bytes(workspace_content)
+
+    # Check if this is a cloud storage path
+    elif cloud_storage.is_cloud_path(file):
        # Download from cloud storage and store locally
        cloud_content = await cloud_storage.retrieve_file(
            file, user_id=user_id, graph_exec_id=graph_exec_id
@@ -159,9 +234,9 @@ async def store_media_file(
            raise ValueError(f"Invalid file path '{filename}': {e}") from e

        # Check file size limit
-        if len(cloud_content) > MAX_FILE_SIZE:
+        if len(cloud_content) > MAX_FILE_SIZE_BYTES:
            raise ValueError(
-                f"File too large: {len(cloud_content)} bytes > {MAX_FILE_SIZE} bytes"
+                f"File too large: {len(cloud_content)} bytes > {MAX_FILE_SIZE_BYTES} bytes"
            )

        # Virus scan the cloud content before writing locally
@@ -189,9 +264,9 @@ async def store_media_file(
        content = base64.b64decode(b64_content)

        # Check file size limit
-        if len(content) > MAX_FILE_SIZE:
+        if len(content) > MAX_FILE_SIZE_BYTES:
            raise ValueError(
-                f"File too large: {len(content)} bytes > {MAX_FILE_SIZE} bytes"
+                f"File too large: {len(content)} bytes > {MAX_FILE_SIZE_BYTES} bytes"
            )

        # Virus scan the base64 content before writing
@@ -199,23 +274,31 @@ async def store_media_file(
        target_path.write_bytes(content)

    elif file.startswith(("http://", "https://")):
-        # URL
+        # URL - download first to get Content-Type header
+        resp = await Requests().get(file)
+
+        # Check file size limit
+        if len(resp.content) > MAX_FILE_SIZE_BYTES:
+            raise ValueError(
+                f"File too large: {len(resp.content)} bytes > {MAX_FILE_SIZE_BYTES} bytes"
+            )
+
+        # Extract filename from URL path
        parsed_url = urlparse(file)
        filename = sanitize_filename(Path(parsed_url.path).name or f"{uuid.uuid4()}")
+
+        # If filename lacks extension, add one from Content-Type header
+        if "." not in filename:
+            content_type = resp.headers.get("Content-Type", "").split(";")[0].strip()
+            if content_type:
+                ext = _extension_from_mime(content_type)
+                filename = f"{filename}{ext}"
+
        try:
            target_path = _ensure_inside_base(base_path / filename, base_path)
        except OSError as e:
            raise ValueError(f"Invalid file path '{filename}': {e}") from e

-        # Download and save
-        resp = await Requests().get(file)
-
-        # Check file size limit
-        if len(resp.content) > MAX_FILE_SIZE:
-            raise ValueError(
-                f"File too large: {len(resp.content)} bytes > {MAX_FILE_SIZE} bytes"
-            )
-
        # Virus scan the downloaded content before writing
        await scan_content_safe(resp.content, filename=filename)
        target_path.write_bytes(resp.content)
@@ -230,12 +313,44 @@ async def store_media_file(
        if not target_path.is_file():
            raise ValueError(f"Local file does not exist: {target_path}")

-    # Return result
-    if return_content:
-        return MediaFileType(_file_to_data_uri(target_path))
-    else:
+    # Return based on requested format
+    if return_format == "for_local_processing":
+        # Use when processing files locally with tools like ffmpeg, MoviePy, PIL
+        # Returns: relative path in exec_file directory (e.g., "image.png")
        return MediaFileType(_strip_base_prefix(target_path, base_path))

+    elif return_format == "for_external_api":
+        # Use when sending content to external APIs that need base64
+        # Returns: data URI (e.g., "data:image/png;base64,iVBORw0...")
+        return MediaFileType(_file_to_data_uri(target_path))
+
+    elif return_format == "for_block_output":
+        # Use when returning output from a block to user/next block
+        # Returns: workspace:// ref (CoPilot) or data URI (graph execution)
+        if workspace_manager is None:
+            # No workspace available (graph execution without CoPilot)
+            # Fallback to data URI so the content can still be used/displayed
+            return MediaFileType(_file_to_data_uri(target_path))
+
+        # Don't re-save if input was already from workspace
+        if is_from_workspace:
+            # Return original workspace reference
+            return MediaFileType(file)
+
+        # Save new content to workspace
+        content = target_path.read_bytes()
+        filename = target_path.name
+
+        file_record = await workspace_manager.write_file(
+            content=content,
+            filename=filename,
+            overwrite=True,
+        )
+        return MediaFileType(f"workspace://{file_record.id}")
+
+    else:
+        raise ValueError(f"Invalid return_format: {return_format}")
+
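The workspace reference handling above distinguishes two forms: `workspace://<id>` (lookup by file ID) and `workspace:///<path>` (lookup by virtual path). A minimal standalone sketch of that split follows. `WorkspaceRef` and `parse_workspace_ref` are hypothetical names used only for illustration; they are not part of this change.

```python
from typing import Literal, NamedTuple

WORKSPACE_PREFIX = "workspace://"


class WorkspaceRef(NamedTuple):
    """Hypothetical container for a parsed workspace reference."""

    kind: Literal["id", "path"]
    value: str


def parse_workspace_ref(file: str) -> WorkspaceRef:
    """Split a workspace:// reference into its kind and value."""
    if not file.startswith(WORKSPACE_PREFIX):
        raise ValueError(f"Not a workspace reference: {file}")
    ref = file[len(WORKSPACE_PREFIX):]
    # A leading slash marks a virtual path; anything else is treated as a file ID.
    if ref.startswith("/"):
        return WorkspaceRef("path", ref)
    return WorkspaceRef("id", ref)


print(parse_workspace_ref("workspace://abc123"))
print(parse_workspace_ref("workspace:///path/to/file.txt"))
```

Deriving the slice offset from the prefix constant avoids hard-coding the prefix length in the slice.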

 def get_dir_size(path: Path) -> int:
    """Get total size of directory."""
--- a/autogpt_platform/backend/backend/util/file_test.py
+++ b/autogpt_platform/backend/backend/util/file_test.py
@@ -7,10 +7,22 @@ from unittest.mock import AsyncMock, MagicMock, patch

 import pytest

+from backend.data.execution import ExecutionContext
 from backend.util.file import store_media_file
 from backend.util.type import MediaFileType


+def make_test_context(
+    graph_exec_id: str = "test-exec-123",
+    user_id: str = "test-user-123",
+) -> ExecutionContext:
+    """Helper to create test ExecutionContext."""
+    return ExecutionContext(
+        user_id=user_id,
+        graph_exec_id=graph_exec_id,
+    )
+
+
 class TestFileCloudIntegration:
    """Test cases for cloud storage integration in file utilities."""

@@ -70,10 +82,9 @@ class TestFileCloudIntegration:
            mock_path_class.side_effect = path_constructor

            result = await store_media_file(
-                graph_exec_id,
-                MediaFileType(cloud_path),
-                "test-user-123",
-                return_content=False,
+                file=MediaFileType(cloud_path),
+                execution_context=make_test_context(graph_exec_id=graph_exec_id),
+                return_format="for_local_processing",
            )

            # Verify cloud storage operations
@@ -144,10 +155,9 @@ class TestFileCloudIntegration:
            mock_path_obj.name = "image.png"
            with patch("backend.util.file.Path", return_value=mock_path_obj):
                result = await store_media_file(
-                    graph_exec_id,
-                    MediaFileType(cloud_path),
-                    "test-user-123",
-                    return_content=True,
+                    file=MediaFileType(cloud_path),
+                    execution_context=make_test_context(graph_exec_id=graph_exec_id),
+                    return_format="for_external_api",
                )

            # Verify result is a data URI
@@ -198,10 +208,9 @@ class TestFileCloudIntegration:
            mock_resolved_path.relative_to.return_value = Path("test-uuid-789.txt")

            await store_media_file(
-                graph_exec_id,
-                MediaFileType(data_uri),
-                "test-user-123",
-                return_content=False,
+                file=MediaFileType(data_uri),
+                execution_context=make_test_context(graph_exec_id=graph_exec_id),
+                return_format="for_local_processing",
            )

            # Verify cloud handler was checked but not used for retrieval
@@ -234,5 +243,7 @@ class TestFileCloudIntegration:
                FileNotFoundError, match="File not found in cloud storage"
            ):
                await store_media_file(
-                    graph_exec_id, MediaFileType(cloud_path), "test-user-123"
+                    file=MediaFileType(cloud_path),
+                    execution_context=make_test_context(graph_exec_id=graph_exec_id),
+                    return_format="for_local_processing",
                )
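The test updates above replace the old `return_content` boolean with the `return_format` string: `return_content=False` becomes `"for_local_processing"` and `return_content=True` becomes `"for_external_api"`. For callers still being migrated, that mapping could be sketched as below. `return_format_from_legacy` is a hypothetical helper, not something this change adds.

```python
from typing import Literal

ReturnFormat = Literal[
    "for_local_processing", "for_external_api", "for_block_output"
]


def return_format_from_legacy(return_content: bool) -> ReturnFormat:
    """Map the removed `return_content` flag onto the new `return_format` values."""
    # True used to mean "return a data URI"; False meant "return a relative path".
    return "for_external_api" if return_content else "for_local_processing"


print(return_format_from_legacy(False))
print(return_format_from_legacy(True))
```

The third value, `"for_block_output"`, has no legacy equivalent, so it never comes out of this mapping.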
--- a/autogpt_platform/backend/backend/util/gcs_utils.py
+++ b/autogpt_platform/backend/backend/util/gcs_utils.py
@@ -0,0 +1,108 @@
+"""
+Shared GCS utilities for workspace and cloud storage backends.
+
+This module provides common functionality for working with Google Cloud Storage,
+including path parsing, client management, and signed URL generation.
+"""
+
+import asyncio
+import logging
+from datetime import datetime, timedelta, timezone
+
+import aiohttp
+from gcloud.aio import storage as async_gcs_storage
+from google.cloud import storage as gcs_storage
+
+logger = logging.getLogger(__name__)
+
+
+def parse_gcs_path(path: str) -> tuple[str, str]:
+    """
+    Parse a GCS path in the format 'gcs://bucket/blob' to (bucket, blob).
+
+    Args:
+        path: GCS path string (e.g., "gcs://my-bucket/path/to/file")
+
+    Returns:
+        Tuple of (bucket_name, blob_name)
+
+    Raises:
+        ValueError: If the path format is invalid
+    """
+    if not path.startswith("gcs://"):
+        raise ValueError(f"Invalid GCS path: {path}")
+
+    path_without_prefix = path[6:]  # Remove "gcs://"
+    parts = path_without_prefix.split("/", 1)
+    if len(parts) != 2:
+        raise ValueError(f"Invalid GCS path format: {path}")
+
+    return parts[0], parts[1]
+
+
+async def download_with_fresh_session(bucket: str, blob: str) -> bytes:
+    """
+    Download file content using a fresh session.
+
+    This approach avoids event loop issues that can occur when reusing
+    sessions across different async contexts (e.g., in executors).
+
+    Args:
+        bucket: GCS bucket name
+        blob: Blob path within the bucket
+
+    Returns:
+        File content as bytes
+
+    Raises:
+        FileNotFoundError: If the file doesn't exist
+    """
+    session = aiohttp.ClientSession(
+        connector=aiohttp.TCPConnector(limit=10, force_close=True)
+    )
+    client: async_gcs_storage.Storage | None = None
+    try:
+        client = async_gcs_storage.Storage(session=session)
+        content = await client.download(bucket, blob)
+        return content
+    except Exception as e:
+        if "404" in str(e) or "Not Found" in str(e):
+            raise FileNotFoundError(f"File not found: gcs://{bucket}/{blob}") from e
+        raise
+    finally:
+        if client:
+            try:
+                await client.close()
+            except Exception:
+                pass  # Best-effort cleanup
+        await session.close()
+
+
+async def generate_signed_url(
+    sync_client: gcs_storage.Client,
+    bucket_name: str,
+    blob_name: str,
+    expires_in: int,
+) -> str:
+    """
+    Generate a signed URL for temporary access to a GCS file.
+
+    Uses asyncio.to_thread() to run the sync signing call in a worker thread so it does not block the event loop.
+
+    Args:
+        sync_client: Sync GCS client with service account credentials
+        bucket_name: GCS bucket name
+        blob_name: Blob path within the bucket
+        expires_in: URL expiration time in seconds
+
+    Returns:
+        Signed URL string
+    """
+    bucket = sync_client.bucket(bucket_name)
+    blob = bucket.blob(blob_name)
+    return await asyncio.to_thread(
+        blob.generate_signed_url,
+        version="v4",
+        expiration=datetime.now(timezone.utc) + timedelta(seconds=expires_in),
+        method="GET",
+    )
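To make the behavior of `parse_gcs_path` concrete, the sketch below copies the function from the diff above and runs it on a few inputs. The bucket and blob names are made up for illustration. Note one case that follows from the code as written: a path with a trailing slash and no blob name is accepted and yields an empty blob string.

```python
def parse_gcs_path(path: str) -> tuple[str, str]:
    """Parse 'gcs://bucket/blob' into (bucket, blob); copied from the diff above."""
    if not path.startswith("gcs://"):
        raise ValueError(f"Invalid GCS path: {path}")

    path_without_prefix = path[6:]  # Remove "gcs://"
    parts = path_without_prefix.split("/", 1)
    if len(parts) != 2:
        raise ValueError(f"Invalid GCS path format: {path}")

    return parts[0], parts[1]


# Only the first slash separates bucket from blob; later slashes stay in the blob name.
print(parse_gcs_path("gcs://my-bucket/path/to/file"))

# A trailing slash passes validation with an empty blob name.
print(parse_gcs_path("gcs://my-bucket/"))

# A bucket with no slash at all is rejected.
try:
    parse_gcs_path("gcs://my-bucket")
except ValueError as e:
    print(f"rejected: {e}")
```

Callers that need a non-empty blob name may want to check for that separately.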
--- a/autogpt_platform/backend/backend/util/settings.py
+++ b/autogpt_platform/backend/backend/util/settings.py
@@ -263,6 +263,12 @@ class Config(UpdateTrackingModel["Config"], BaseSettings):
        description="The name of the Google Cloud Storage bucket for media files",
    )

+    workspace_storage_dir: str = Field(
+        default="",
+        description="Local directory for workspace file storage when GCS is not configured. "
+        "If empty, defaults to {app_data}/workspaces. Used for self-hosted deployments.",
+    )
+
    reddit_user_agent: str = Field(
        default="web:AutoGPT:v0.6.0 (by /u/autogpt)",
        description="The user agent for the Reddit API",
@@ -359,8 +365,8 @@ class Config(UpdateTrackingModel["Config"], BaseSettings):
        description="The port for the Agent Generator service",
    )
    agentgenerator_timeout: int = Field(
-        default=120,
-        description="The timeout in seconds for Agent Generator service requests",
+        default=600,
+        description="The timeout in seconds for Agent Generator service requests (includes retries for rate limits)",
    )

    enable_example_blocks: bool = Field(
@@ -389,6 +395,13 @@ class Config(UpdateTrackingModel["Config"], BaseSettings):
        description="Maximum file size in MB for file uploads (1-1024 MB)",
    )

+    max_file_size_mb: int = Field(
+        default=100,
+        ge=1,
+        le=1024,
+        description="Maximum file size in MB for workspace files and stored media files (1-1024 MB)",
+    )
+
    # AutoMod configuration
    automod_enabled: bool = Field(
        default=False,
--- a/autogpt_platform/backend/backend/util/test.py
+++ b/autogpt_platform/backend/backend/util/test.py
@@ -140,14 +140,29 @@ async def execute_block_test(block: Block):
            setattr(block, mock_name, mock_obj)

    # Populate credentials argument(s)
+    # Generate IDs for execution context
+    graph_id = str(uuid.uuid4())
+    node_id = str(uuid.uuid4())
+    graph_exec_id = str(uuid.uuid4())
+    node_exec_id = str(uuid.uuid4())
+    user_id = str(uuid.uuid4())
+    graph_version = 1  # Default version for tests
+
    extra_exec_kwargs: dict = {
-        "graph_id": str(uuid.uuid4()),
-        "node_id": str(uuid.uuid4()),
-        "graph_exec_id": str(uuid.uuid4()),
-        "node_exec_id": str(uuid.uuid4()),
-        "user_id": str(uuid.uuid4()),
-        "graph_version": 1,  # Default version for tests
-        "execution_context": ExecutionContext(),
+        "graph_id": graph_id,
+        "node_id": node_id,
+        "graph_exec_id": graph_exec_id,
+        "node_exec_id": node_exec_id,
+        "user_id": user_id,
+        "graph_version": graph_version,
+        "execution_context": ExecutionContext(
+            user_id=user_id,
+            graph_id=graph_id,
+            graph_exec_id=graph_exec_id,
+            graph_version=graph_version,
+            node_id=node_id,
+            node_exec_id=node_exec_id,
+        ),
    }
    input_model = cast(type[BlockSchema], block.input_schema)

--- a/autogpt_platform/backend/backend/util/workspace.py
+++ b/autogpt_platform/backend/backend/util/workspace.py
@@ -0,0 +1,419 @@
+"""
+WorkspaceManager for managing user workspace file operations.
+
+This module provides a high-level interface for workspace file operations,
+combining the storage backend and database layer.
+"""
+
+import logging
+import mimetypes
+import uuid
+from typing import Optional
+
+from prisma.errors import UniqueViolationError
+from prisma.models import UserWorkspaceFile
+
+from backend.data.workspace import (
+    count_workspace_files,
+    create_workspace_file,
+    get_workspace_file,
+    get_workspace_file_by_path,
+    list_workspace_files,
+    soft_delete_workspace_file,
+)
+from backend.util.settings import Config
+from backend.util.workspace_storage import compute_file_checksum, get_workspace_storage
+
+logger = logging.getLogger(__name__)
+
+
+class WorkspaceManager:
+    """
+    Manages workspace file operations.
+
+    Combines storage backend operations with database record management.
+    Supports session-scoped file segmentation where files are stored in
+    session-specific virtual paths: /sessions/{session_id}/{filename}
+    """
+
+    def __init__(
+        self, user_id: str, workspace_id: str, session_id: Optional[str] = None
+    ):
+        """
+        Initialize WorkspaceManager.
+
+        Args:
+            user_id: The user's ID
+            workspace_id: The workspace ID
+            session_id: Optional session ID for session-scoped file access
+        """
+        self.user_id = user_id
+        self.workspace_id = workspace_id
+        self.session_id = session_id
+        # Session path prefix for file isolation
+        self.session_path = f"/sessions/{session_id}" if session_id else ""
+
+    def _resolve_path(self, path: str) -> str:
+        """
+        Resolve a path, defaulting to session folder if session_id is set.
+
+        Cross-session access is allowed by explicitly using /sessions/other-session-id/...
+
+        Args:
+            path: Virtual path (e.g., "/file.txt" or "/sessions/abc123/file.txt")
+
+        Returns:
+            Resolved path with session prefix if applicable
+        """
+        # If path explicitly references a session folder, use it as-is
+        if path.startswith("/sessions/"):
+            return path
+
+        # If we have a session context, prepend session path
+        if self.session_path:
+            # Normalize the path
+            if not path.startswith("/"):
+                path = f"/{path}"
+            return f"{self.session_path}{path}"
+
+        # No session context, use path as-is
+        return path if path.startswith("/") else f"/{path}"
+
+    def _get_effective_path(
+        self, path: Optional[str], include_all_sessions: bool
+    ) -> Optional[str]:
+        """
+        Get effective path for list/count operations based on session context.
+
+        Args:
+            path: Optional path prefix to filter
+            include_all_sessions: If True, don't apply session scoping
+
+        Returns:
+            Effective path prefix for database query
+        """
+        if include_all_sessions:
+            # Normalize path to ensure leading slash (stored paths are normalized)
+            if path is not None and not path.startswith("/"):
+                return f"/{path}"
+            return path
+        elif path is not None:
+            # Resolve the provided path with session scoping
+            return self._resolve_path(path)
+        elif self.session_path:
+            # Default to session folder with trailing slash to prevent prefix collisions
+            # e.g., "/sessions/abc" should not match "/sessions/abc123"
+            return self.session_path.rstrip("/") + "/"
+        else:
+            # No session context and no path filter: return None (no prefix scoping)
+            return path
+
+    async def read_file(self, path: str) -> bytes:
+        """
+        Read file from workspace by virtual path.
+
+        When session_id is set, paths are resolved relative to the session folder
+        unless they explicitly reference /sessions/...
+
+        Args:
+            path: Virtual path (e.g., "/documents/report.pdf")
+
+        Returns:
+            File content as bytes
+
+        Raises:
+            FileNotFoundError: If file doesn't exist
+        """
+        resolved_path = self._resolve_path(path)
+        file = await get_workspace_file_by_path(self.workspace_id, resolved_path)
+        if file is None:
+            raise FileNotFoundError(f"File not found at path: {resolved_path}")
+
+        storage = await get_workspace_storage()
+        return await storage.retrieve(file.storagePath)
+
+    async def read_file_by_id(self, file_id: str) -> bytes:
+        """
+        Read file from workspace by file ID.
+
+        Args:
+            file_id: The file's ID
+
+        Returns:
+            File content as bytes
+
+        Raises:
+            FileNotFoundError: If file doesn't exist
+        """
+        file = await get_workspace_file(file_id, self.workspace_id)
+        if file is None:
+            raise FileNotFoundError(f"File not found: {file_id}")
+
+        storage = await get_workspace_storage()
+        return await storage.retrieve(file.storagePath)
+
+    async def write_file(
+        self,
+        content: bytes,
+        filename: str,
+        path: Optional[str] = None,
+        mime_type: Optional[str] = None,
+        overwrite: bool = False,
+    ) -> UserWorkspaceFile:
+        """
+        Write file to workspace.
+
+        When session_id is set, files are written to /sessions/{session_id}/...
+        by default. Use explicit /sessions/... paths for cross-session access.
+
+        Args:
+            content: File content as bytes
+            filename: Filename for the file
+            path: Virtual path (defaults to "/{filename}", session-scoped if session_id set)
+            mime_type: MIME type (auto-detected if not provided)
+            overwrite: Whether to overwrite existing file at path
+
+        Returns:
+            Created UserWorkspaceFile instance
+
+        Raises:
+            ValueError: If file exceeds size limit or path already exists
+        """
+        # Enforce file size limit
+        max_file_size = Config().max_file_size_mb * 1024 * 1024
+        if len(content) > max_file_size:
+            raise ValueError(
+                f"File too large: {len(content)} bytes exceeds "
+                f"{Config().max_file_size_mb}MB limit"
+            )
+
+        # Determine path with session scoping
+        if path is None:
+            path = f"/{filename}"
+        elif not path.startswith("/"):
+            path = f"/{path}"
+
+        # Resolve path with session prefix
+        path = self._resolve_path(path)
+
+        # Check if file exists at path (only error for non-overwrite case)
+        # For overwrite=True, we let the write proceed and handle via UniqueViolationError
+        # This ensures the new file is written to storage BEFORE the old one is deleted,
+        # preventing data loss if the new write fails
+        if not overwrite:
+            existing = await get_workspace_file_by_path(self.workspace_id, path)
+            if existing is not None:
+                raise ValueError(f"File already exists at path: {path}")
+
+        # Auto-detect MIME type if not provided
+        if mime_type is None:
+            mime_type, _ = mimetypes.guess_type(filename)
+            mime_type = mime_type or "application/octet-stream"
+
+        # Compute checksum
+        checksum = compute_file_checksum(content)
+
+        # Generate unique file ID for storage
+        file_id = str(uuid.uuid4())
+
+        # Store file in storage backend
+        storage = await get_workspace_storage()
+        storage_path = await storage.store(
+            workspace_id=self.workspace_id,
+            file_id=file_id,
+            filename=filename,
+            content=content,
+        )
+
+        # Create database record - handle race condition where another request
+        # created a file at the same path between our check and create
+        try:
+            file = await create_workspace_file(
+                workspace_id=self.workspace_id,
+                file_id=file_id,
+                name=filename,
+                path=path,
+                storage_path=storage_path,
+                mime_type=mime_type,
+                size_bytes=len(content),
+                checksum=checksum,
+            )
+        except UniqueViolationError:
+            # Path already taken: expected when overwrite=True (no pre-check ran), otherwise a race
+            if overwrite:
+                # Re-fetch and delete the conflicting file, then retry
+                existing = await get_workspace_file_by_path(self.workspace_id, path)
+                if existing:
+                    await self.delete_file(existing.id)
+                # Retry the create - if this also fails, clean up storage file
+                try:
+                    file = await create_workspace_file(
+                        workspace_id=self.workspace_id,
+                        file_id=file_id,
+                        name=filename,
+                        path=path,
+                        storage_path=storage_path,
+                        mime_type=mime_type,
+                        size_bytes=len(content),
+                        checksum=checksum,
+                    )
+                except Exception:
+                    # Clean up orphaned storage file on retry failure
+                    try:
+                        await storage.delete(storage_path)
+                    except Exception as e:
+                        logger.warning(f"Failed to clean up orphaned storage file: {e}")
+                    raise
+            else:
+                # Clean up the orphaned storage file before raising
+                try:
+                    await storage.delete(storage_path)
+                except Exception as e:
+                    logger.warning(f"Failed to clean up orphaned storage file: {e}")
+                raise ValueError(f"File already exists at path: {path}")
+        except Exception:
+            # Any other database error (connection, validation, etc.) - clean up storage
+            try:
+                await storage.delete(storage_path)
+            except Exception as e:
+                logger.warning(f"Failed to clean up orphaned storage file: {e}")
+            raise
+
+        logger.info(
+            f"Wrote file {file.id} ({filename}) to workspace {self.workspace_id} "
+            f"at path {path}, size={len(content)} bytes"
+        )
+
+        return file
+
+    async def list_files(
+        self,
+        path: Optional[str] = None,
+        limit: Optional[int] = None,
+        offset: int = 0,
+        include_all_sessions: bool = False,
+    ) -> list[UserWorkspaceFile]:
+        """
+        List files in workspace.
+
+        When session_id is set and include_all_sessions is False (default),
+        only files in the current session's folder are listed.
+
+        Args:
+            path: Optional path prefix to filter (e.g., "/documents/")
+            limit: Maximum number of files to return
+            offset: Number of files to skip
+            include_all_sessions: If True, list files from all sessions.
+                                  If False (default), only list current session's files.
+
+        Returns:
+            List of UserWorkspaceFile instances
+        """
+        effective_path = self._get_effective_path(path, include_all_sessions)
+
+        return await list_workspace_files(
+            workspace_id=self.workspace_id,
+            path_prefix=effective_path,
+            limit=limit,
+            offset=offset,
+        )
+
+    async def delete_file(self, file_id: str) -> bool:
+        """
+        Delete a file: remove it from storage, then soft-delete the database record.
+
+        Args:
+            file_id: The file's ID
+
+        Returns:
+            True if deleted, False if not found
+        """
+        file = await get_workspace_file(file_id, self.workspace_id)
+        if file is None:
+            return False
+
+        # Delete from storage
+        storage = await get_workspace_storage()
+        try:
+            await storage.delete(file.storagePath)
+        except Exception as e:
+            logger.warning(f"Failed to delete file from storage: {e}")
+            # Continue with database soft-delete even if storage delete fails
+
+        # Soft-delete database record
+        result = await soft_delete_workspace_file(file_id, self.workspace_id)
+        return result is not None
+
+    async def get_download_url(self, file_id: str, expires_in: int = 3600) -> str:
+        """
+        Get download URL for a file.
+
+        Args:
+            file_id: The file's ID
+            expires_in: URL expiration in seconds (default 1 hour)
+
+        Returns:
+            Download URL (signed URL for GCS, API endpoint for local)
+
+        Raises:
+            FileNotFoundError: If file doesn't exist
+        """
+        file = await get_workspace_file(file_id, self.workspace_id)
+        if file is None:
+            raise FileNotFoundError(f"File not found: {file_id}")
+
+        storage = await get_workspace_storage()
+        return await storage.get_download_url(file.storagePath, expires_in)
+
+    async def get_file_info(self, file_id: str) -> Optional[UserWorkspaceFile]:
+        """
+        Get file metadata.
+
+        Args:
+            file_id: The file's ID
+
+        Returns:
+            UserWorkspaceFile instance or None
+        """
+        return await get_workspace_file(file_id, self.workspace_id)
+
+    async def get_file_info_by_path(self, path: str) -> Optional[UserWorkspaceFile]:
+        """
+        Get file metadata by path.
+
+        When session_id is set, paths are resolved relative to the session folder
+        unless they explicitly reference /sessions/...
+
+        Args:
+            path: Virtual path
+
+        Returns:
+            UserWorkspaceFile instance or None
+        """
+        resolved_path = self._resolve_path(path)
+        return await get_workspace_file_by_path(self.workspace_id, resolved_path)
+
+    async def get_file_count(
+        self,
+        path: Optional[str] = None,
+        include_all_sessions: bool = False,
+    ) -> int:
+        """
+        Get number of files in workspace.
+
+        When session_id is set and include_all_sessions is False (default),
+        only counts files in the current session's folder.
+
+        Args:
+            path: Optional path prefix to filter (e.g., "/documents/")
+            include_all_sessions: If True, count all files in workspace.
+                                  If False (default), only count current session's files.
+
+        Returns:
+            Number of files
+        """
+        effective_path = self._get_effective_path(path, include_all_sessions)
+
+        return await count_workspace_files(
+            self.workspace_id, path_prefix=effective_path
+        )
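The session scoping rules in `WorkspaceManager._resolve_path` and `_get_effective_path` can be tried without a database. The sketch below re-implements them as free functions. `resolve_path` and `default_list_prefix` are hypothetical names, and the session IDs are made up. It also shows why the default list prefix ends with a slash.

```python
from typing import Optional


def resolve_path(path: str, session_id: Optional[str]) -> str:
    """Mirror of the path resolution rules described in the diff above."""
    # Explicit /sessions/... paths are used as-is (cross-session access).
    if path.startswith("/sessions/"):
        return path
    # Normalize to a leading slash, then prepend the session folder if there is one.
    if not path.startswith("/"):
        path = f"/{path}"
    return f"/sessions/{session_id}{path}" if session_id else path


def default_list_prefix(session_id: Optional[str]) -> Optional[str]:
    """Prefix used for list/count when no path filter is given."""
    # The trailing slash keeps session "abc" from matching session "abc123".
    return f"/sessions/{session_id}/" if session_id else None


print(resolve_path("report.pdf", "abc"))
print(resolve_path("/sessions/other/report.pdf", "abc"))
print(resolve_path("report.pdf", None))

other_session_file = "/sessions/abc123/report.pdf"
print(other_session_file.startswith("/sessions/abc"))
print(other_session_file.startswith(default_list_prefix("abc")))
```

Without the trailing slash, a plain prefix match on `/sessions/abc` would also return files from session `abc123`.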
--- a/autogpt_platform/backend/backend/util/workspace_storage.py
+++ b/autogpt_platform/backend/backend/util/workspace_storage.py
@@ -0,0 +1,398 @@
+"""
+Workspace storage backend abstraction for supporting both cloud and local deployments.
+
+This module provides a unified interface for storing workspace files, with implementations
+for Google Cloud Storage (cloud deployments) and local filesystem (self-hosted deployments).
+"""
+
+import asyncio
+import hashlib
+import logging
+from abc import ABC, abstractmethod
+from datetime import datetime, timezone
+from pathlib import Path
+from typing import Optional
+
+import aiofiles
+import aiohttp
+from gcloud.aio import storage as async_gcs_storage
+from google.cloud import storage as gcs_storage
+
+from backend.util.data import get_data_path
+from backend.util.gcs_utils import (
+    download_with_fresh_session,
+    generate_signed_url,
+    parse_gcs_path,
+)
+from backend.util.settings import Config
+
+logger = logging.getLogger(__name__)
+
+
+class WorkspaceStorageBackend(ABC):
+    """Abstract interface for workspace file storage."""
+
+    @abstractmethod
+    async def store(
+        self,
+        workspace_id: str,
+        file_id: str,
+        filename: str,
+        content: bytes,
+    ) -> str:
+        """
+        Store file content, return storage path.
+
+        Args:
+            workspace_id: The workspace ID
+            file_id: Unique file ID for storage
+            filename: Original filename
+            content: File content as bytes
+
+        Returns:
+            Storage path string (cloud path or local path)
+        """
+        pass
+
+    @abstractmethod
+    async def retrieve(self, storage_path: str) -> bytes:
+        """
+        Retrieve file content from storage.
+
+        Args:
+            storage_path: The storage path returned from store()
+
+        Returns:
+            File content as bytes
+        """
+        pass
+
+    @abstractmethod
+    async def delete(self, storage_path: str) -> None:
+        """
+        Delete file from storage.
+
+        Args:
+            storage_path: The storage path to delete
+        """
+        pass
+
+    @abstractmethod
+    async def get_download_url(self, storage_path: str, expires_in: int = 3600) -> str:
+        """
+        Get URL for downloading the file.
+
+        Args:
+            storage_path: The storage path
+            expires_in: URL expiration time in seconds (default 1 hour)
+
+        Returns:
+            Download URL (signed URL for GCS, direct API path for local)
+        """
+        pass
+
+
+class GCSWorkspaceStorage(WorkspaceStorageBackend):
+    """Google Cloud Storage implementation for workspace storage."""
+
+    def __init__(self, bucket_name: str):
+        self.bucket_name = bucket_name
+        self._async_client: Optional[async_gcs_storage.Storage] = None
+        self._sync_client: Optional[gcs_storage.Client] = None
+        self._session: Optional[aiohttp.ClientSession] = None
+
+    async def _get_async_client(self) -> async_gcs_storage.Storage:
+        """Get or create async GCS client."""
+        if self._async_client is None:
+            self._session = aiohttp.ClientSession(
+                connector=aiohttp.TCPConnector(limit=100, force_close=False)
+            )
+            self._async_client = async_gcs_storage.Storage(session=self._session)
+        return self._async_client
+
+    def _get_sync_client(self) -> gcs_storage.Client:
+        """Get or create sync GCS client (for signed URLs)."""
+        if self._sync_client is None:
+            self._sync_client = gcs_storage.Client()
+        return self._sync_client
+
+    async def close(self) -> None:
+        """Close all client connections."""
+        if self._async_client is not None:
+            try:
+                await self._async_client.close()
+            except Exception as e:
+                logger.warning(f"Error closing GCS client: {e}")
+            self._async_client = None
+
+        if self._session is not None:
+            try:
+                await self._session.close()
+            except Exception as e:
+                logger.warning(f"Error closing session: {e}")
+            self._session = None
+
+    def _build_blob_name(self, workspace_id: str, file_id: str, filename: str) -> str:
+        """Build the blob path for workspace files."""
+        return f"workspaces/{workspace_id}/{file_id}/{filename}"
+
+    async def store(
+        self,
+        workspace_id: str,
+        file_id: str,
+        filename: str,
+        content: bytes,
+    ) -> str:
+        """Store file in GCS."""
+        client = await self._get_async_client()
+        blob_name = self._build_blob_name(workspace_id, file_id, filename)
+
+        # Upload with metadata
+        upload_time = datetime.now(timezone.utc)
+        await client.upload(
+            self.bucket_name,
+            blob_name,
+            content,
+            metadata={
+                "uploaded_at": upload_time.isoformat(),
+                "workspace_id": workspace_id,
+                "file_id": file_id,
+            },
+        )
+
+        return f"gcs://{self.bucket_name}/{blob_name}"
+
+    async def retrieve(self, storage_path: str) -> bytes:
+        """Retrieve file from GCS."""
+        bucket_name, blob_name = parse_gcs_path(storage_path)
+        return await download_with_fresh_session(bucket_name, blob_name)
+
+    async def delete(self, storage_path: str) -> None:
+        """Delete file from GCS."""
+        bucket_name, blob_name = parse_gcs_path(storage_path)
+        client = await self._get_async_client()
+
+        try:
+            await client.delete(bucket_name, blob_name)
+        except Exception as e:
+            if "404" not in str(e) and "Not Found" not in str(e):
+                raise
+            # File already deleted, that's fine
+
+    async def get_download_url(self, storage_path: str, expires_in: int = 3600) -> str:
+        """
+        Generate download URL for GCS file.
+
+        Attempts to generate a signed URL if running with service account credentials.
+        Falls back to an API proxy endpoint if signed URL generation fails
+        (e.g., when running locally with user OAuth credentials).
+        """
+        bucket_name, blob_name = parse_gcs_path(storage_path)
+
+        # Extract file_id from blob_name for fallback: workspaces/{workspace_id}/{file_id}/{filename}
+        blob_parts = blob_name.split("/")
+        file_id = blob_parts[2] if len(blob_parts) >= 3 else None
+
+        # Try to generate signed URL (requires service account credentials)
+        try:
+            sync_client = self._get_sync_client()
+            return await generate_signed_url(
+                sync_client, bucket_name, blob_name, expires_in
+            )
+        except AttributeError as e:
+            # Signed URL generation requires service account with private key.
+            # When running with user OAuth credentials, fall back to API proxy.
+            if "private key" in str(e) and file_id:
+                logger.debug(
+                    "Cannot generate signed URL (no service account credentials), "
+                    "falling back to API proxy endpoint"
+                )
+                return f"/api/workspace/files/{file_id}/download"
+            raise
+
+
+class LocalWorkspaceStorage(WorkspaceStorageBackend):
+    """Local filesystem implementation for workspace storage (self-hosted deployments)."""
+
+    def __init__(self, base_dir: Optional[str] = None):
+        """
+        Initialize local storage backend.
+
+        Args:
+            base_dir: Base directory for workspace storage.
+                     If None, defaults to {app_data}/workspaces
+        """
+        if base_dir:
+            self.base_dir = Path(base_dir)
+        else:
+            self.base_dir = Path(get_data_path()) / "workspaces"
+
+        # Ensure base directory exists
+        self.base_dir.mkdir(parents=True, exist_ok=True)
+
+    def _build_file_path(self, workspace_id: str, file_id: str, filename: str) -> Path:
+        """Build the local file path with path traversal protection."""
+        # Import here to avoid circular import
+        # (file.py imports workspace.py which imports workspace_storage.py)
+        from backend.util.file import sanitize_filename
+
+        # Sanitize filename to prevent path traversal (removes / and \ among others)
+        safe_filename = sanitize_filename(filename)
+        file_path = (self.base_dir / workspace_id / file_id / safe_filename).resolve()
+
+        # Verify the resolved path is still under base_dir
+        if not file_path.is_relative_to(self.base_dir.resolve()):
+            raise ValueError("Invalid filename: path traversal detected")
+
+        return file_path
+
+    def _parse_storage_path(self, storage_path: str) -> Path:
+        """Parse local storage path to filesystem path."""
+        if storage_path.startswith("local://"):
+            relative_path = storage_path[8:]  # Remove "local://"
+        else:
+            relative_path = storage_path
+
+        full_path = (self.base_dir / relative_path).resolve()
+
+        # Security check: ensure path is under base_dir
+        # Use is_relative_to() for a component-wise containment check
+        # (a sibling such as "<base_dir>_other" is not treated as a child)
+        if not full_path.is_relative_to(self.base_dir.resolve()):
+            raise ValueError("Invalid storage path: path traversal detected")
+
+        return full_path
+
+    async def store(
+        self,
+        workspace_id: str,
+        file_id: str,
+        filename: str,
+        content: bytes,
+    ) -> str:
+        """Store file locally."""
+        file_path = self._build_file_path(workspace_id, file_id, filename)
+
+        # Create parent directories
+        file_path.parent.mkdir(parents=True, exist_ok=True)
+
+        # Write file asynchronously
+        async with aiofiles.open(file_path, "wb") as f:
+            await f.write(content)
+
+        # Return relative path as storage path (file_path is resolved, so compare to resolved base)
+        relative_path = file_path.relative_to(self.base_dir.resolve())
+        return f"local://{relative_path.as_posix()}"
+
+    async def retrieve(self, storage_path: str) -> bytes:
+        """Retrieve file from local storage."""
+        file_path = self._parse_storage_path(storage_path)
+
+        if not file_path.exists():
+            raise FileNotFoundError(f"File not found: {storage_path}")
+
+        async with aiofiles.open(file_path, "rb") as f:
+            return await f.read()
+
+    async def delete(self, storage_path: str) -> None:
+        """Delete file from local storage."""
+        file_path = self._parse_storage_path(storage_path)
+
+        if file_path.exists():
+            # Remove file
+            file_path.unlink()
+
+            # Clean up empty parent directories, stopping at the resolved base_dir
+            parent = file_path.parent
+            while parent != self.base_dir.resolve():
+                try:
+                    if parent.exists() and not any(parent.iterdir()):
+                        parent.rmdir()
+                    else:
+                        break
+                except OSError:
+                    break
+                parent = parent.parent
+
+    async def get_download_url(self, storage_path: str, expires_in: int = 3600) -> str:
+        """
+        Get download URL for local file.
+
+        For local storage, this returns an API endpoint path.
+        The actual serving is handled by the API layer.
+        """
+        # Parse the storage path to get the components
+        if storage_path.startswith("local://"):
+            relative_path = storage_path[8:]
+        else:
+            relative_path = storage_path
+
+        # Return the API endpoint for downloading
+        # The file_id is extracted from the path: {workspace_id}/{file_id}/{filename}
+        parts = relative_path.split("/")
+        if len(parts) >= 2:
+            file_id = parts[1]  # Second component is file_id
+            return f"/api/workspace/files/{file_id}/download"
+        else:
+            raise ValueError(f"Invalid storage path format: {storage_path}")
+
+
+# Global storage backend instance
+_workspace_storage: Optional[WorkspaceStorageBackend] = None
+_storage_lock = asyncio.Lock()
+
+
+async def get_workspace_storage() -> WorkspaceStorageBackend:
+    """
+    Get the workspace storage backend instance.
+
+    Uses GCS if media_gcs_bucket_name is configured, otherwise uses local storage.
+    """
+    global _workspace_storage
+
+    if _workspace_storage is None:
+        async with _storage_lock:
+            if _workspace_storage is None:
+                config = Config()
+
+                if config.media_gcs_bucket_name:
+                    logger.info(
+                        f"Using GCS workspace storage: {config.media_gcs_bucket_name}"
+                    )
+                    _workspace_storage = GCSWorkspaceStorage(
+                        config.media_gcs_bucket_name
+                    )
+                else:
+                    storage_dir = (
+                        config.workspace_storage_dir
+                        if config.workspace_storage_dir
+                        else None
+                    )
+                    logger.info(
+                        f"Using local workspace storage: {storage_dir or 'default'}"
+                    )
+                    _workspace_storage = LocalWorkspaceStorage(storage_dir)
+
+    return _workspace_storage
+
+
+async def shutdown_workspace_storage() -> None:
+    """
+    Properly shutdown the global workspace storage backend.
+
+    Closes aiohttp sessions and other resources for GCS backend.
+    Should be called during application shutdown.
+    """
+    global _workspace_storage
+
+    if _workspace_storage is not None:
+        async with _storage_lock:
+            if _workspace_storage is not None:
+                if isinstance(_workspace_storage, GCSWorkspaceStorage):
+                    await _workspace_storage.close()
+                _workspace_storage = None
+
+
+def compute_file_checksum(content: bytes) -> str:
+    """Compute SHA256 checksum of file content."""
+    return hashlib.sha256(content).hexdigest()
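
Review note: the path-traversal guard in `LocalWorkspaceStorage._parse_storage_path` can be exercised in isolation. The following is a minimal sketch, not the module's code: `resolve_under` and `is_rejected` are hypothetical helpers introduced only for illustration, and the sketch assumes Python 3.9+ (for `Path.is_relative_to` and `str.removeprefix`). It resolves both the base directory and the candidate path before comparing them, which is the same idea the diff relies on.

```python
import tempfile
from pathlib import Path


def resolve_under(base_dir: Path, storage_path: str) -> Path:
    """Hypothetical helper: map a 'local://' storage path to a file under base_dir."""
    relative = storage_path.removeprefix("local://")
    base = base_dir.resolve()
    full = (base / relative).resolve()
    # Component-wise containment check: rejects '..' escapes and absolute paths.
    if not full.is_relative_to(base):
        raise ValueError("Invalid storage path: path traversal detected")
    return full


def is_rejected(base_dir: Path, storage_path: str) -> bool:
    """Return True when resolve_under refuses the given storage path."""
    try:
        resolve_under(base_dir, storage_path)
    except ValueError:
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "workspaces"
    base.mkdir()

    # A well-formed path stays inside the base directory.
    ok = resolve_under(base, "local://ws1/file1/report.txt")
    ok_inside = ok.is_relative_to(base.resolve())

    # Escapes via '..', absolute paths, and look-alike siblings are refused.
    escape_rejected = is_rejected(base, "local://../../etc/passwd")
    absolute_rejected = is_rejected(base, "/etc/passwd")
    sibling_rejected = is_rejected(base, "../workspaces_evil/x")
```

The sibling case is the reason to prefer `is_relative_to` over a string prefix test: `.../workspaces_evil` starts with the same characters as `.../workspaces` but is not inside it.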
--- a/autogpt_platform/backend/migrations/20251126113000_add_llm_registry/migration.sql
+++ b/autogpt_platform/backend/migrations/20251126113000_add_llm_registry/migration.sql
@@ -1,81 +0,0 @@
-- CreateEnum
-CREATE TYPE "LlmCostUnit" AS ENUM ('RUN', 'TOKENS');
-
-- CreateTable
-CREATE TABLE "LlmProvider" (
-    "id" TEXT NOT NULL,
-    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
-    "updatedAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
-    "name" TEXT NOT NULL,
-    "displayName" TEXT NOT NULL,
-    "description" TEXT,
-    "defaultCredentialProvider" TEXT,
-    "defaultCredentialId" TEXT,
-    "defaultCredentialType" TEXT,
-    "supportsTools" BOOLEAN NOT NULL DEFAULT TRUE,
-    "supportsJsonOutput" BOOLEAN NOT NULL DEFAULT TRUE,
-    "supportsReasoning" BOOLEAN NOT NULL DEFAULT FALSE,
-    "supportsParallelTool" BOOLEAN NOT NULL DEFAULT FALSE,
-    "metadata" JSONB NOT NULL DEFAULT '{}'::jsonb,
-
-    CONSTRAINT "LlmProvider_pkey" PRIMARY KEY ("id"),
-    CONSTRAINT "LlmProvider_name_key" UNIQUE ("name")
-);
-
-- CreateTable
-CREATE TABLE "LlmModel" (
-    "id" TEXT NOT NULL,
-    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
-    "updatedAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
-    "slug" TEXT NOT NULL,
-    "displayName" TEXT NOT NULL,
-    "description" TEXT,
-    "providerId" TEXT NOT NULL,
-    "contextWindow" INTEGER NOT NULL,
-    "maxOutputTokens" INTEGER,
-    "isEnabled" BOOLEAN NOT NULL DEFAULT TRUE,
-    "capabilities" JSONB NOT NULL DEFAULT '{}'::jsonb,
-    "metadata" JSONB NOT NULL DEFAULT '{}'::jsonb,
-
-    CONSTRAINT "LlmModel_pkey" PRIMARY KEY ("id"),
-    CONSTRAINT "LlmModel_slug_key" UNIQUE ("slug")
-);
-
-- CreateTable
-CREATE TABLE "LlmModelCost" (
-    "id" TEXT NOT NULL,
-    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
-    "updatedAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
-    "unit" "LlmCostUnit" NOT NULL DEFAULT 'RUN',
-    "creditCost" INTEGER NOT NULL,
-    "credentialProvider" TEXT NOT NULL,
-    "credentialId" TEXT,
-    "credentialType" TEXT,
-    "currency" TEXT,
-    "metadata" JSONB NOT NULL DEFAULT '{}'::jsonb,
-    "llmModelId" TEXT NOT NULL,
-
-    CONSTRAINT "LlmModelCost_pkey" PRIMARY KEY ("id")
-);
-
-- CreateIndex
-CREATE INDEX "LlmModel_providerId_isEnabled_idx" ON "LlmModel"("providerId", "isEnabled");
-
-- CreateIndex
-CREATE INDEX "LlmModel_slug_idx" ON "LlmModel"("slug");
-
-- CreateIndex
-CREATE INDEX "LlmModelCost_llmModelId_idx" ON "LlmModelCost"("llmModelId");
-
-- CreateIndex
-CREATE INDEX "LlmModelCost_credentialProvider_idx" ON "LlmModelCost"("credentialProvider");
-
-- CreateIndex
-CREATE UNIQUE INDEX "LlmModelCost_llmModelId_credentialProvider_unit_key" ON "LlmModelCost"("llmModelId", "credentialProvider", "unit");
-
-- AddForeignKey
-ALTER TABLE "LlmModel" ADD CONSTRAINT "LlmModel_providerId_fkey" FOREIGN KEY ("providerId") REFERENCES "LlmProvider"("id") ON DELETE RESTRICT ON UPDATE CASCADE;
-
-- AddForeignKey
-ALTER TABLE "LlmModelCost" ADD CONSTRAINT "LlmModelCost_llmModelId_fkey" FOREIGN KEY ("llmModelId") REFERENCES "LlmModel"("id") ON DELETE CASCADE ON UPDATE CASCADE;
-
--- a/autogpt_platform/backend/migrations/20251126120000_seed_llm_registry/migration.sql
+++ b/autogpt_platform/backend/migrations/20251126120000_seed_llm_registry/migration.sql
@@ -1,226 +0,0 @@
-- Seed LLM Registry from existing hard-coded data
-- This migration populates the LlmProvider, LlmModel, and LlmModelCost tables
-- with data from the existing MODEL_METADATA and MODEL_COST dictionaries
-
-- Insert Providers
-INSERT INTO "LlmProvider" ("id", "name", "displayName", "description", "defaultCredentialProvider", "defaultCredentialType", "supportsTools", "supportsJsonOutput", "supportsReasoning", "supportsParallelTool", "metadata")
-VALUES
-    (gen_random_uuid(), 'openai', 'OpenAI', 'OpenAI language models', 'openai', 'api_key', true, true, true, true, '{}'::jsonb),
-    (gen_random_uuid(), 'anthropic', 'Anthropic', 'Anthropic Claude models', 'anthropic', 'api_key', true, true, true, false, '{}'::jsonb),
-    (gen_random_uuid(), 'groq', 'Groq', 'Groq inference API', 'groq', 'api_key', false, true, false, false, '{}'::jsonb),
-    (gen_random_uuid(), 'open_router', 'OpenRouter', 'OpenRouter unified API', 'open_router', 'api_key', true, true, false, false, '{}'::jsonb),
-    (gen_random_uuid(), 'aiml_api', 'AI/ML API', 'AI/ML API models', 'aiml_api', 'api_key', false, true, false, false, '{}'::jsonb),
-    (gen_random_uuid(), 'ollama', 'Ollama', 'Ollama local models', 'ollama', 'api_key', false, true, false, false, '{}'::jsonb),
-    (gen_random_uuid(), 'llama_api', 'Llama API', 'Llama API models', 'llama_api', 'api_key', false, true, false, false, '{}'::jsonb),
-    (gen_random_uuid(), 'v0', 'v0', 'v0 by Vercel models', 'v0', 'api_key', true, true, false, false, '{}'::jsonb)
-ON CONFLICT ("name") DO NOTHING;
-
-- Insert Models (using CTEs to reference provider IDs)
-WITH provider_ids AS (
-    SELECT "id", "name" FROM "LlmProvider"
-)
-INSERT INTO "LlmModel" ("id", "slug", "displayName", "description", "providerId", "contextWindow", "maxOutputTokens", "isEnabled", "capabilities", "metadata")
-SELECT
-    gen_random_uuid(),
-    model_slug,
-    model_display_name,
-    NULL,
-    p."id",
-    context_window,
-    max_output_tokens,
-    true,
-    '{}'::jsonb,
-    '{}'::jsonb
-FROM (VALUES
-    -- OpenAI models
-    ('o3', 'O3', 'openai', 200000, 100000),
-    ('o3-mini', 'O3 Mini', 'openai', 200000, 100000),
-    ('o1', 'O1', 'openai', 200000, 100000),
-    ('o1-mini', 'O1 Mini', 'openai', 128000, 65536),
-    ('gpt-5-2025-08-07', 'GPT 5', 'openai', 400000, 128000),
-    ('gpt-5.1-2025-11-13', 'GPT 5.1', 'openai', 400000, 128000),
-    ('gpt-5-mini-2025-08-07', 'GPT 5 Mini', 'openai', 400000, 128000),
-    ('gpt-5-nano-2025-08-07', 'GPT 5 Nano', 'openai', 400000, 128000),
-    ('gpt-5-chat-latest', 'GPT 5 Chat', 'openai', 400000, 16384),
-    ('gpt-4.1-2025-04-14', 'GPT 4.1', 'openai', 1000000, 32768),
-    ('gpt-4.1-mini-2025-04-14', 'GPT 4.1 Mini', 'openai', 1047576, 32768),
-    ('gpt-4o-mini', 'GPT 4o Mini', 'openai', 128000, 16384),
-    ('gpt-4o', 'GPT 4o', 'openai', 128000, 16384),
-    ('gpt-4-turbo', 'GPT 4 Turbo', 'openai', 128000, 4096),
-    ('gpt-3.5-turbo', 'GPT 3.5 Turbo', 'openai', 16385, 4096),
-    -- Anthropic models
-    ('claude-opus-4-1-20250805', 'Claude 4.1 Opus', 'anthropic', 200000, 32000),
-    ('claude-opus-4-20250514', 'Claude 4 Opus', 'anthropic', 200000, 32000),
-    ('claude-sonnet-4-20250514', 'Claude 4 Sonnet', 'anthropic', 200000, 64000),
-    ('claude-opus-4-5-20251101', 'Claude 4.5 Opus', 'anthropic', 200000, 64000),
-    ('claude-sonnet-4-5-20250929', 'Claude 4.5 Sonnet', 'anthropic', 200000, 64000),
-    ('claude-haiku-4-5-20251001', 'Claude 4.5 Haiku', 'anthropic', 200000, 64000),
-    ('claude-3-7-sonnet-20250219', 'Claude 3.7 Sonnet', 'anthropic', 200000, 64000),
-    ('claude-3-haiku-20240307', 'Claude 3 Haiku', 'anthropic', 200000, 4096),
-    -- AI/ML API models
-    ('Qwen/Qwen2.5-72B-Instruct-Turbo', 'Qwen 2.5 72B', 'aiml_api', 32000, 8000),
-    ('nvidia/llama-3.1-nemotron-70b-instruct', 'Llama 3.1 Nemotron 70B', 'aiml_api', 128000, 40000),
-    ('meta-llama/Llama-3.3-70B-Instruct-Turbo', 'Llama 3.3 70B', 'aiml_api', 128000, NULL),
-    ('meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo', 'Meta Llama 3.1 70B', 'aiml_api', 131000, 2000),
-    ('meta-llama/Llama-3.2-3B-Instruct-Turbo', 'Llama 3.2 3B', 'aiml_api', 128000, NULL),
-    -- Groq models
-    ('llama-3.3-70b-versatile', 'Llama 3.3 70B', 'groq', 128000, 32768),
-    ('llama-3.1-8b-instant', 'Llama 3.1 8B', 'groq', 128000, 8192),
-    -- Ollama models
-    ('llama3.3', 'Llama 3.3', 'ollama', 8192, NULL),
-    ('llama3.2', 'Llama 3.2', 'ollama', 8192, NULL),
-    ('llama3', 'Llama 3', 'ollama', 8192, NULL),
-    ('llama3.1:405b', 'Llama 3.1 405B', 'ollama', 8192, NULL),
-    ('dolphin-mistral:latest', 'Dolphin Mistral', 'ollama', 32768, NULL),
-    -- OpenRouter models
-    ('google/gemini-2.5-pro-preview-03-25', 'Gemini 2.5 Pro', 'open_router', 1050000, 8192),
-    ('google/gemini-3-pro-preview', 'Gemini 3 Pro Preview', 'open_router', 1048576, 65535),
-    ('google/gemini-2.5-flash', 'Gemini 2.5 Flash', 'open_router', 1048576, 65535),
-    ('google/gemini-2.0-flash-001', 'Gemini 2.0 Flash', 'open_router', 1048576, 8192),
-    ('google/gemini-2.5-flash-lite-preview-06-17', 'Gemini 2.5 Flash Lite Preview', 'open_router', 1048576, 65535),
-    ('google/gemini-2.0-flash-lite-001', 'Gemini 2.0 Flash Lite', 'open_router', 1048576, 8192),
-    ('mistralai/mistral-nemo', 'Mistral Nemo', 'open_router', 128000, 4096),
-    ('cohere/command-r-08-2024', 'Command R', 'open_router', 128000, 4096),
-    ('cohere/command-r-plus-08-2024', 'Command R Plus', 'open_router', 128000, 4096),
-    ('deepseek/deepseek-chat', 'DeepSeek Chat', 'open_router', 64000, 2048),
-    ('deepseek/deepseek-r1-0528', 'DeepSeek R1', 'open_router', 163840, 163840),
-    ('perplexity/sonar', 'Perplexity Sonar', 'open_router', 127000, 8000),
-    ('perplexity/sonar-pro', 'Perplexity Sonar Pro', 'open_router', 200000, 8000),
-    ('perplexity/sonar-deep-research', 'Perplexity Sonar Deep Research', 'open_router', 128000, 16000),
-    ('nousresearch/hermes-3-llama-3.1-405b', 'Hermes 3 Llama 3.1 405B', 'open_router', 131000, 4096),
-    ('nousresearch/hermes-3-llama-3.1-70b', 'Hermes 3 Llama 3.1 70B', 'open_router', 12288, 12288),
-    ('openai/gpt-oss-120b', 'GPT OSS 120B', 'open_router', 131072, 131072),
-    ('openai/gpt-oss-20b', 'GPT OSS 20B', 'open_router', 131072, 32768),
-    ('amazon/nova-lite-v1', 'Amazon Nova Lite', 'open_router', 300000, 5120),
-    ('amazon/nova-micro-v1', 'Amazon Nova Micro', 'open_router', 128000, 5120),
-    ('amazon/nova-pro-v1', 'Amazon Nova Pro', 'open_router', 300000, 5120),
-    ('microsoft/wizardlm-2-8x22b', 'WizardLM 2 8x22B', 'open_router', 65536, 4096),
-    ('gryphe/mythomax-l2-13b', 'MythoMax L2 13B', 'open_router', 4096, 4096),
-    ('meta-llama/llama-4-scout', 'Llama 4 Scout', 'open_router', 131072, 131072),
-    ('meta-llama/llama-4-maverick', 'Llama 4 Maverick', 'open_router', 1048576, 1000000),
-    ('x-ai/grok-4', 'Grok 4', 'open_router', 256000, 256000),
-    ('x-ai/grok-4-fast', 'Grok 4 Fast', 'open_router', 2000000, 30000),
-    ('x-ai/grok-4.1-fast', 'Grok 4.1 Fast', 'open_router', 2000000, 30000),
-    ('x-ai/grok-code-fast-1', 'Grok Code Fast 1', 'open_router', 256000, 10000),
-    ('moonshotai/kimi-k2', 'Kimi K2', 'open_router', 131000, 131000),
-    ('qwen/qwen3-235b-a22b-thinking-2507', 'Qwen 3 235B Thinking', 'open_router', 262144, 262144),
-    ('qwen/qwen3-coder', 'Qwen 3 Coder', 'open_router', 262144, 262144),
-    -- Llama API models
-    ('Llama-4-Scout-17B-16E-Instruct-FP8', 'Llama 4 Scout', 'llama_api', 128000, 4028),
-    ('Llama-4-Maverick-17B-128E-Instruct-FP8', 'Llama 4 Maverick', 'llama_api', 128000, 4028),
-    ('Llama-3.3-8B-Instruct', 'Llama 3.3 8B', 'llama_api', 128000, 4028),
-    ('Llama-3.3-70B-Instruct', 'Llama 3.3 70B', 'llama_api', 128000, 4028),
-    -- v0 models
-    ('v0-1.5-md', 'v0 1.5 MD', 'v0', 128000, 64000),
-    ('v0-1.5-lg', 'v0 1.5 LG', 'v0', 512000, 64000),
-    ('v0-1.0-md', 'v0 1.0 MD', 'v0', 128000, 64000)
-) AS models(model_slug, model_display_name, provider_name, context_window, max_output_tokens)
-JOIN provider_ids p ON p."name" = models.provider_name
-ON CONFLICT ("slug") DO NOTHING;
-
-- Insert Costs (using CTEs to reference model IDs)
-WITH model_ids AS (
-    SELECT "id", "slug", "providerId" FROM "LlmModel"
-),
-provider_ids AS (
-    SELECT "id", "name" FROM "LlmProvider"
-)
-INSERT INTO "LlmModelCost" ("id", "unit", "creditCost", "credentialProvider", "credentialId", "credentialType", "currency", "metadata", "llmModelId")
-SELECT
-    gen_random_uuid(),
-    'RUN'::"LlmCostUnit",
-    cost,
-    p."name",
-    NULL,
-    'api_key',
-    NULL,
-    '{}'::jsonb,
-    m."id"
-FROM (VALUES
-    -- OpenAI costs
-    ('o3', 4),
-    ('o3-mini', 2),
-    ('o1', 16),
-    ('o1-mini', 4),
-    ('gpt-5-2025-08-07', 2),
-    ('gpt-5.1-2025-11-13', 5),
-    ('gpt-5-mini-2025-08-07', 1),
-    ('gpt-5-nano-2025-08-07', 1),
-    ('gpt-5-chat-latest', 5),
-    ('gpt-4.1-2025-04-14', 2),
-    ('gpt-4.1-mini-2025-04-14', 1),
-    ('gpt-4o-mini', 1),
-    ('gpt-4o', 3),
-    ('gpt-4-turbo', 10),
-    ('gpt-3.5-turbo', 1),
-    -- Anthropic costs
-    ('claude-opus-4-1-20250805', 21),
-    ('claude-opus-4-20250514', 21),
-    ('claude-sonnet-4-20250514', 5),
-    ('claude-haiku-4-5-20251001', 4),
-    ('claude-opus-4-5-20251101', 14),
-    ('claude-sonnet-4-5-20250929', 9),
-    ('claude-3-7-sonnet-20250219', 5),
-    ('claude-3-haiku-20240307', 1),
-    -- AI/ML API costs
-    ('Qwen/Qwen2.5-72B-Instruct-Turbo', 1),
-    ('nvidia/llama-3.1-nemotron-70b-instruct', 1),
-    ('meta-llama/Llama-3.3-70B-Instruct-Turbo', 1),
-    ('meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo', 1),
-    ('meta-llama/Llama-3.2-3B-Instruct-Turbo', 1),
-    -- Groq costs
-    ('llama-3.3-70b-versatile', 1),
-    ('llama-3.1-8b-instant', 1),
-    -- Ollama costs
-    ('llama3.3', 1),
-    ('llama3.2', 1),
-    ('llama3', 1),
-    ('llama3.1:405b', 1),
-    ('dolphin-mistral:latest', 1),
-    -- OpenRouter costs
-    ('google/gemini-2.5-pro-preview-03-25', 4),
-    ('google/gemini-3-pro-preview', 5),
-    ('mistralai/mistral-nemo', 1),
-    ('cohere/command-r-08-2024', 1),
-    ('cohere/command-r-plus-08-2024', 3),
-    ('deepseek/deepseek-chat', 2),
-    ('perplexity/sonar', 1),
-    ('perplexity/sonar-pro', 5),
-    ('perplexity/sonar-deep-research', 10),
-    ('nousresearch/hermes-3-llama-3.1-405b', 1),
-    ('nousresearch/hermes-3-llama-3.1-70b', 1),
-    ('amazon/nova-lite-v1', 1),
-    ('amazon/nova-micro-v1', 1),
-    ('amazon/nova-pro-v1', 1),
-    ('microsoft/wizardlm-2-8x22b', 1),
-    ('gryphe/mythomax-l2-13b', 1),
-    ('meta-llama/llama-4-scout', 1),
-    ('meta-llama/llama-4-maverick', 1),
-    ('x-ai/grok-4', 9),
-    ('x-ai/grok-4-fast', 1),
-    ('x-ai/grok-4.1-fast', 1),
-    ('x-ai/grok-code-fast-1', 1),
-    ('moonshotai/kimi-k2', 1),
-    ('qwen/qwen3-235b-a22b-thinking-2507', 1),
-    ('qwen/qwen3-coder', 9),
-    ('google/gemini-2.5-flash', 1),
-    ('google/gemini-2.0-flash-001', 1),
-    ('google/gemini-2.5-flash-lite-preview-06-17', 1),
-    ('google/gemini-2.0-flash-lite-001', 1),
-    ('deepseek/deepseek-r1-0528', 1),
-    ('openai/gpt-oss-120b', 1),
-    ('openai/gpt-oss-20b', 1),
-    -- Llama API costs
-    ('Llama-4-Scout-17B-16E-Instruct-FP8', 1),
-    ('Llama-4-Maverick-17B-128E-Instruct-FP8', 1),
-    ('Llama-3.3-8B-Instruct', 1),
-    ('Llama-3.3-70B-Instruct', 1),
-    -- v0 costs
-    ('v0-1.5-md', 1),
-    ('v0-1.5-lg', 2),
-    ('v0-1.0-md', 1)
-) AS costs(model_slug, cost)
-JOIN model_ids m ON m."slug" = costs.model_slug
-JOIN provider_ids p ON p."id" = m."providerId"
-ON CONFLICT ("llmModelId", "credentialProvider", "unit") DO NOTHING;
-
--- a/autogpt_platform/backend/migrations/20251218100000_add_llm_model_migration/migration.sql
+++ b/autogpt_platform/backend/migrations/20251218100000_add_llm_model_migration/migration.sql
@@ -1,25 +0,0 @@
-- CreateTable
-CREATE TABLE "LlmModelMigration" (
-    "id" TEXT NOT NULL,
-    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
-    "updatedAt" TIMESTAMP(3) NOT NULL,
-    "sourceModelSlug" TEXT NOT NULL,
-    "targetModelSlug" TEXT NOT NULL,
-    "reason" TEXT,
-    "migratedNodeIds" JSONB NOT NULL DEFAULT '[]',
-    "nodeCount" INTEGER NOT NULL,
-    "customCreditCost" INTEGER,
-    "isReverted" BOOLEAN NOT NULL DEFAULT false,
-    "revertedAt" TIMESTAMP(3),
-
-    CONSTRAINT "LlmModelMigration_pkey" PRIMARY KEY ("id")
-);
-
-- CreateIndex
-CREATE INDEX "LlmModelMigration_sourceModelSlug_idx" ON "LlmModelMigration"("sourceModelSlug");
-
-- CreateIndex
-CREATE INDEX "LlmModelMigration_targetModelSlug_idx" ON "LlmModelMigration"("targetModelSlug");
-
-- CreateIndex
-CREATE INDEX "LlmModelMigration_isReverted_idx" ON "LlmModelMigration"("isReverted");
--- a/autogpt_platform/backend/migrations/20251224100000_add_llm_model_creator/migration.sql
+++ b/autogpt_platform/backend/migrations/20251224100000_add_llm_model_creator/migration.sql
@@ -1,127 +0,0 @@
-- Add LlmModelCreator table
-- Creator represents who made/trained the model (e.g., OpenAI, Meta)
-- This is distinct from Provider who hosts/serves the model (e.g., OpenRouter)
-
-- Create the LlmModelCreator table
-CREATE TABLE "LlmModelCreator" (
-    "id" TEXT NOT NULL,
-    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
-    "updatedAt" TIMESTAMP(3) NOT NULL,
-    "name" TEXT NOT NULL,
-    "displayName" TEXT NOT NULL,
-    "description" TEXT,
-    "websiteUrl" TEXT,
-    "logoUrl" TEXT,
-    "metadata" JSONB NOT NULL DEFAULT '{}',
-
-    CONSTRAINT "LlmModelCreator_pkey" PRIMARY KEY ("id")
-);
-
-- Create unique index on name
-CREATE UNIQUE INDEX "LlmModelCreator_name_key" ON "LlmModelCreator"("name");
-
-- Add creatorId column to LlmModel
-ALTER TABLE "LlmModel" ADD COLUMN "creatorId" TEXT;
-
-- Add foreign key constraint
-ALTER TABLE "LlmModel" ADD CONSTRAINT "LlmModel_creatorId_fkey"
-    FOREIGN KEY ("creatorId") REFERENCES "LlmModelCreator"("id") ON DELETE SET NULL ON UPDATE CASCADE;
-
-- Create index on creatorId
-CREATE INDEX "LlmModel_creatorId_idx" ON "LlmModel"("creatorId");
-
-- Seed creators based on known model creators
-INSERT INTO "LlmModelCreator" ("id", "updatedAt", "name", "displayName", "description", "websiteUrl", "metadata")
-VALUES
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'openai', 'OpenAI', 'Creator of GPT models', 'https://openai.com', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'anthropic', 'Anthropic', 'Creator of Claude models', 'https://anthropic.com', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'meta', 'Meta', 'Creator of Llama models', 'https://ai.meta.com', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'google', 'Google', 'Creator of Gemini models', 'https://deepmind.google', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'mistral', 'Mistral AI', 'Creator of Mistral models', 'https://mistral.ai', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'cohere', 'Cohere', 'Creator of Command models', 'https://cohere.com', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'deepseek', 'DeepSeek', 'Creator of DeepSeek models', 'https://deepseek.com', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'perplexity', 'Perplexity AI', 'Creator of Sonar models', 'https://perplexity.ai', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'qwen', 'Qwen (Alibaba)', 'Creator of Qwen models', 'https://qwenlm.github.io', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'xai', 'xAI', 'Creator of Grok models', 'https://x.ai', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'amazon', 'Amazon', 'Creator of Nova models', 'https://aws.amazon.com/bedrock', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'microsoft', 'Microsoft', 'Creator of WizardLM models', 'https://microsoft.com', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'moonshot', 'Moonshot AI', 'Creator of Kimi models', 'https://moonshot.cn', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'nvidia', 'NVIDIA', 'Creator of Nemotron models', 'https://nvidia.com', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'nous_research', 'Nous Research', 'Creator of Hermes models', 'https://nousresearch.com', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'vercel', 'Vercel', 'Creator of v0 models', 'https://vercel.com', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'cognitive_computations', 'Cognitive Computations', 'Creator of Dolphin models', 'https://erichartford.com', '{}'),
-    (gen_random_uuid(), CURRENT_TIMESTAMP, 'gryphe', 'Gryphe', 'Creator of MythoMax models', 'https://huggingface.co/Gryphe', '{}')
-ON CONFLICT ("name") DO NOTHING;
-
-- Update existing models with their creators
-- OpenAI models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'openai')
-WHERE "slug" LIKE 'gpt-%' OR "slug" LIKE 'o1%' OR "slug" LIKE 'o3%' OR "slug" LIKE 'openai/%';
-
-- Anthropic models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'anthropic')
-WHERE "slug" LIKE 'claude-%';
-
-- Meta/Llama models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'meta')
-WHERE "slug" LIKE 'llama%' OR "slug" LIKE 'Llama%' OR "slug" LIKE 'meta-llama/%' OR "slug" LIKE '%/llama-%';
-
-- Google models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'google')
-WHERE "slug" LIKE 'google/%' OR "slug" LIKE 'gemini%';
-
-- Mistral models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'mistral')
-WHERE "slug" LIKE 'mistral%' OR "slug" LIKE 'mistralai/%';
-
-- Cohere models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'cohere')
-WHERE "slug" LIKE 'cohere/%' OR "slug" LIKE 'command-%';
-
-- DeepSeek models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'deepseek')
-WHERE "slug" LIKE 'deepseek/%' OR "slug" LIKE 'deepseek-%';
-
-- Perplexity models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'perplexity')
-WHERE "slug" LIKE 'perplexity/%' OR "slug" LIKE 'sonar%';
-
-- Qwen models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'qwen')
-WHERE "slug" LIKE 'Qwen/%' OR "slug" LIKE 'qwen/%';
-
-- xAI/Grok models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'xai')
-WHERE "slug" LIKE 'x-ai/%' OR "slug" LIKE 'grok%';
-
-- Amazon models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'amazon')
-WHERE "slug" LIKE 'amazon/%' OR "slug" LIKE 'nova-%';
-
-- Microsoft models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'microsoft')
-WHERE "slug" LIKE 'microsoft/%' OR "slug" LIKE 'wizardlm%';
-
-- Moonshot models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'moonshot')
-WHERE "slug" LIKE 'moonshotai/%' OR "slug" LIKE 'kimi%';
-
-- NVIDIA models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'nvidia')
-WHERE "slug" LIKE 'nvidia/%' OR "slug" LIKE '%nemotron%';
-
-- Nous Research models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'nous_research')
-WHERE "slug" LIKE 'nousresearch/%' OR "slug" LIKE 'hermes%';
-
-- Vercel/v0 models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'vercel')
-WHERE "slug" LIKE 'v0-%';
-
-- Dolphin models (Cognitive Computations / Eric Hartford)
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'cognitive_computations')
-WHERE "slug" LIKE 'dolphin-%';
-
-- Gryphe models
-UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'gryphe')
-WHERE "slug" LIKE 'gryphe/%' OR "slug" LIKE 'mythomax%';
--- a/autogpt_platform/backend/migrations/20260105120000_add_agent_node_model_index/migration.sql
+++ b/autogpt_platform/backend/migrations/20260105120000_add_agent_node_model_index/migration.sql
@@ -1,4 +0,0 @@
-- CreateIndex
-- Index for efficient LLM model lookups on AgentNode.constantInput->>'model'
-- This improves performance of model migration queries in the LLM registry
-CREATE INDEX "AgentNode_constantInput_model_idx" ON "AgentNode" ((("constantInput"->>'model')));
--- a/autogpt_platform/backend/migrations/20260106150000_add_gpt52_model/migration.sql
+++ b/autogpt_platform/backend/migrations/20260106150000_add_gpt52_model/migration.sql
@@ -1,52 +0,0 @@
-- Add GPT-5.2 model and update O3 slug
-- This migration adds the new GPT-5.2 model added in dev branch
-
-- Update O3 slug to match dev branch format
-UPDATE "LlmModel"
-SET "slug" = 'o3-2025-04-16'
-WHERE "slug" = 'o3';
-
-- Update cost reference for O3 if needed
-- (costs are linked by model ID, so no update needed)
-
-- Add GPT-5.2 model
-WITH provider_id AS (
-    SELECT "id" FROM "LlmProvider" WHERE "name" = 'openai'
-)
-INSERT INTO "LlmModel" ("id", "slug", "displayName", "description", "providerId", "contextWindow", "maxOutputTokens", "isEnabled", "capabilities", "metadata")
-SELECT
-    gen_random_uuid(),
-    'gpt-5.2-2025-12-11',
-    'GPT 5.2',
-    'OpenAI GPT-5.2 model',
-    p."id",
-    400000,
-    128000,
-    true,
-    '{}'::jsonb,
-    '{}'::jsonb
-FROM provider_id p
-ON CONFLICT ("slug") DO NOTHING;
-
-- Add cost for GPT-5.2
-WITH model_id AS (
-    SELECT m."id", p."name" as provider_name
-    FROM "LlmModel" m
-    JOIN "LlmProvider" p ON p."id" = m."providerId"
-    WHERE m."slug" = 'gpt-5.2-2025-12-11'
-)
-INSERT INTO "LlmModelCost" ("id", "unit", "creditCost", "credentialProvider", "credentialId", "credentialType", "currency", "metadata", "llmModelId")
-SELECT
-    gen_random_uuid(),
-    'RUN'::"LlmCostUnit",
-    3,  -- Same cost tier as GPT-5.1
-    m.provider_name,
-    NULL,
-    'api_key',
-    NULL,
-    '{}'::jsonb,
-    m."id"
-FROM model_id m
-WHERE NOT EXISTS (
-    SELECT 1 FROM "LlmModelCost" c WHERE c."llmModelId" = m."id"
-);
--- a/autogpt_platform/backend/migrations/20260107100000_add_llm_recommended_model/migration.sql
+++ b/autogpt_platform/backend/migrations/20260107100000_add_llm_recommended_model/migration.sql
@@ -1,11 +0,0 @@
-- Add isRecommended field to LlmModel table
-- This allows admins to mark a model as the recommended default
-
-ALTER TABLE "LlmModel" ADD COLUMN "isRecommended" BOOLEAN NOT NULL DEFAULT false;
-
-- Set gpt-4o-mini as the default recommended model (if it exists)
-UPDATE "LlmModel" SET "isRecommended" = true WHERE "slug" = 'gpt-4o-mini' AND "isEnabled" = true;
-
-- Create unique partial index to enforce only one recommended model at the database level
-- This prevents multiple rows from having isRecommended = true
-CREATE UNIQUE INDEX "LlmModel_single_recommended_idx" ON "LlmModel" ("isRecommended") WHERE "isRecommended" = true;
--- a/autogpt_platform/backend/migrations/20260122000000_add_llm_price_tier/migration.sql
+++ b/autogpt_platform/backend/migrations/20260122000000_add_llm_price_tier/migration.sql
@@ -1,61 +0,0 @@
-- Add new columns to LlmModel table for extended model metadata
-- These columns support the LLM Picker UI enhancements
-
-- Add priceTier column: 1=cheapest, 2=medium, 3=expensive
-ALTER TABLE "LlmModel" ADD COLUMN IF NOT EXISTS "priceTier" INTEGER NOT NULL DEFAULT 1;
-
-- Add creatorId column for model creator relationship (if not exists)
-ALTER TABLE "LlmModel" ADD COLUMN IF NOT EXISTS "creatorId" TEXT;
-
-- Add isRecommended column (if not exists)
-ALTER TABLE "LlmModel" ADD COLUMN IF NOT EXISTS "isRecommended" BOOLEAN NOT NULL DEFAULT FALSE;
-
-- Add index on creatorId if not exists
-CREATE INDEX IF NOT EXISTS "LlmModel_creatorId_idx" ON "LlmModel"("creatorId");
-
-- Add foreign key for creatorId if not exists
-DO $$
-BEGIN
-    IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'LlmModel_creatorId_fkey') THEN
-        -- Only add FK if LlmModelCreator table exists
-        IF EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'LlmModelCreator') THEN
-            ALTER TABLE "LlmModel" ADD CONSTRAINT "LlmModel_creatorId_fkey"
-            FOREIGN KEY ("creatorId") REFERENCES "LlmModelCreator"("id") ON DELETE SET NULL ON UPDATE CASCADE;
-        END IF;
-    END IF;
-END $$;
-
-- Update priceTier values for existing models based on original MODEL_METADATA
-- Tier 1 = cheapest, Tier 2 = medium, Tier 3 = expensive
-
-- OpenAI models
-UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" = 'o3';
-UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'o3-mini';
-UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" = 'o1';
-UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" = 'o1-mini';
-UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" = 'gpt-5.2';
-UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" = 'gpt-5.1';
-UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'gpt-5';
-UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'gpt-5-mini';
-UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'gpt-5-nano';
-UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" = 'gpt-5-chat-latest';
-UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" LIKE 'gpt-4.1%';
-UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'gpt-4o-mini';
-UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" = 'gpt-4o';
-UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" = 'gpt-4-turbo';
-UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'gpt-3.5-turbo';
-
-- Anthropic models
-UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" LIKE 'claude-opus%';
-UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" LIKE 'claude-sonnet%';
-UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" LIKE 'claude%-4-5-sonnet%';
-UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" LIKE 'claude%-haiku%';
-UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'claude-3-haiku-20240307';
-
-- OpenRouter models - Pro/expensive tiers
-UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" LIKE 'google/gemini%-pro%';
-UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" LIKE '%command-r-plus%';
-UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" LIKE '%sonar-pro%';
-UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" LIKE '%sonar-deep-research%';
-UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" = 'x-ai/grok-4';
-UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" LIKE '%qwen3-coder%';
--- a/autogpt_platform/backend/migrations/20260127211502_add_visit_copilot_onboarding_step/migration.sql
+++ b/autogpt_platform/backend/migrations/20260127211502_add_visit_copilot_onboarding_step/migration.sql
@@ -0,0 +1,2 @@
+-- AlterEnum
+ALTER TYPE "OnboardingStep" ADD VALUE 'VISIT_COPILOT';
--- a/autogpt_platform/backend/migrations/20260127230419_add_user_workspace/migration.sql
+++ b/autogpt_platform/backend/migrations/20260127230419_add_user_workspace/migration.sql
@@ -0,0 +1,52 @@
+-- CreateEnum
+CREATE TYPE "WorkspaceFileSource" AS ENUM ('UPLOAD', 'EXECUTION', 'COPILOT', 'IMPORT');
+
+-- CreateTable
+CREATE TABLE "UserWorkspace" (
+    "id" TEXT NOT NULL,
+    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
+    "updatedAt" TIMESTAMP(3) NOT NULL,
+    "userId" TEXT NOT NULL,
+
+    CONSTRAINT "UserWorkspace_pkey" PRIMARY KEY ("id")
+);
+
+-- CreateTable
+CREATE TABLE "UserWorkspaceFile" (
+    "id" TEXT NOT NULL,
+    "createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
+    "updatedAt" TIMESTAMP(3) NOT NULL,
+    "workspaceId" TEXT NOT NULL,
+    "name" TEXT NOT NULL,
+    "path" TEXT NOT NULL,
+    "storagePath" TEXT NOT NULL,
+    "mimeType" TEXT NOT NULL,
+    "sizeBytes" BIGINT NOT NULL,
+    "checksum" TEXT,
+    "isDeleted" BOOLEAN NOT NULL DEFAULT false,
+    "deletedAt" TIMESTAMP(3),
+    "source" "WorkspaceFileSource" NOT NULL DEFAULT 'UPLOAD',
+    "sourceExecId" TEXT,
+    "sourceSessionId" TEXT,
+    "metadata" JSONB NOT NULL DEFAULT '{}',
+
+    CONSTRAINT "UserWorkspaceFile_pkey" PRIMARY KEY ("id")
+);
+
+-- CreateIndex
+CREATE UNIQUE INDEX "UserWorkspace_userId_key" ON "UserWorkspace"("userId");
+
+-- CreateIndex
+CREATE INDEX "UserWorkspace_userId_idx" ON "UserWorkspace"("userId");
+
+-- CreateIndex
+CREATE INDEX "UserWorkspaceFile_workspaceId_isDeleted_idx" ON "UserWorkspaceFile"("workspaceId", "isDeleted");
+
+-- CreateIndex
+CREATE UNIQUE INDEX "UserWorkspaceFile_workspaceId_path_key" ON "UserWorkspaceFile"("workspaceId", "path");
+
+-- AddForeignKey
+ALTER TABLE "UserWorkspace" ADD CONSTRAINT "UserWorkspace_userId_fkey" FOREIGN KEY ("userId") REFERENCES "User"("id") ON DELETE CASCADE ON UPDATE CASCADE;
+
+-- AddForeignKey
+ALTER TABLE "UserWorkspaceFile" ADD CONSTRAINT "UserWorkspaceFile_workspaceId_fkey" FOREIGN KEY ("workspaceId") REFERENCES "UserWorkspace"("id") ON DELETE CASCADE ON UPDATE CASCADE;
--- a/autogpt_platform/backend/migrations/20260129011611_remove_workspace_file_source/migration.sql
+++ b/autogpt_platform/backend/migrations/20260129011611_remove_workspace_file_source/migration.sql
@@ -0,0 +1,16 @@
+/*
+  Warnings:
+
+  - You are about to drop the column `source` on the `UserWorkspaceFile` table. All the data in the column will be lost.
+  - You are about to drop the column `sourceExecId` on the `UserWorkspaceFile` table. All the data in the column will be lost.
+  - You are about to drop the column `sourceSessionId` on the `UserWorkspaceFile` table. All the data in the column will be lost.
+
+*/
+
+-- AlterTable
+ALTER TABLE "UserWorkspaceFile" DROP COLUMN "source",
+DROP COLUMN "sourceExecId",
+DROP COLUMN "sourceSessionId";
+
+-- DropEnum
+DROP TYPE "WorkspaceFileSource";
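
Review note: after the two workspace migrations above, `UserWorkspaceFile` keeps its `path` column (unique per `workspaceId`) but no longer records `source`, `sourceExecId`, or `sourceSessionId`. The commit message for #11867 says files are scoped to `/sessions/{session_id}/`, so the session now appears to be carried in the virtual path only. The sketch below is a minimal illustration of that idea, not code from this change: the type mirrors the SQL columns, and `sessionScopedPath` is a hypothetical helper name that performs no validation beyond trimming leading slashes.

```typescript
// Illustrative mirror of the "UserWorkspaceFile" columns that remain after
// the 20260129011611_remove_workspace_file_source migration.
type UserWorkspaceFileRow = {
  id: string;
  createdAt: Date;
  updatedAt: Date;
  workspaceId: string;
  name: string; // user-visible filename
  path: string; // virtual path, unique together with workspaceId
  storagePath: string; // actual GCS or local storage path
  mimeType: string;
  sizeBytes: bigint;
  checksum: string | null;
  isDeleted: boolean;
  deletedAt: Date | null;
  metadata: Record<string, unknown>;
};

// Hypothetical helper (assumption): builds a virtual path under the
// session prefix described in the commit message.
function sessionScopedPath(sessionId: string, filename: string): string {
  const cleanName = filename.replace(/^\/+/, ""); // drop leading slashes only
  return `/sessions/${sessionId}/${cleanName}`;
}
```

Because `(workspaceId, path)` is unique, two files written to the same session with the same name would collide on this key, so a caller would need to overwrite or rename.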
--- a/autogpt_platform/backend/schema.prisma
+++ b/autogpt_platform/backend/schema.prisma
@@ -63,6 +63,7 @@ model User {
  IntegrationWebhooks   IntegrationWebhook[]
  NotificationBatches   UserNotificationBatch[]
  PendingHumanReviews   PendingHumanReview[]
+  Workspace             UserWorkspace?

  // OAuth Provider relations
  OAuthApplications       OAuthApplication[]
@@ -81,6 +82,7 @@ enum OnboardingStep {
  AGENT_INPUT
  CONGRATS
  // First Wins
+  VISIT_COPILOT
  GET_RESULTS
  MARKETPLACE_VISIT
  MARKETPLACE_ADD_AGENT
@@ -136,6 +138,53 @@ model CoPilotUnderstanding {
  @@index([userId])
 }

+////////////////////////////////////////////////////////////
+////////////////////////////////////////////////////////////
+////////////////   USER WORKSPACE TABLES   /////////////////
+////////////////////////////////////////////////////////////
+////////////////////////////////////////////////////////////
+
+// User's persistent file storage workspace
+model UserWorkspace {
+  id        String   @id @default(uuid())
+  createdAt DateTime @default(now())
+  updatedAt DateTime @updatedAt
+
+  userId String @unique
+  User   User   @relation(fields: [userId], references: [id], onDelete: Cascade)
+
+  Files UserWorkspaceFile[]
+
+  @@index([userId])
+}
+
+// Individual files in a user's workspace
+model UserWorkspaceFile {
+  id        String   @id @default(uuid())
+  createdAt DateTime @default(now())
+  updatedAt DateTime @updatedAt
+
+  workspaceId String
+  Workspace   UserWorkspace @relation(fields: [workspaceId], references: [id], onDelete: Cascade)
+
+  // File metadata
+  name        String // User-visible filename
+  path        String // Virtual path (e.g., "/documents/report.pdf")
+  storagePath String // Actual GCS or local storage path
+  mimeType    String
+  sizeBytes   BigInt
+  checksum    String? // SHA256 for integrity
+
+  // File state
+  isDeleted Boolean   @default(false)
+  deletedAt DateTime?
+
+  metadata Json @default("{}")
+
+  @@unique([workspaceId, path])
+  @@index([workspaceId, isDeleted])
+}
+
 model BuilderSearchHistory {
  id        String   @id @default(uuid())
  createdAt DateTime @default(now())
@@ -1094,153 +1143,6 @@ enum APIKeyStatus {

 ////////////////////////////////////////////////////////////
 ////////////////////////////////////////////////////////////
-/////////////   LLM REGISTRY AND BILLING DATA   /////////////
-////////////////////////////////////////////////////////////
-////////////////////////////////////////////////////////////
-
-// LlmCostUnit: Defines how LLM MODEL costs are calculated (per run or per token).
-// This is distinct from BlockCostType (in backend/data/block.py) which defines
-// how BLOCK EXECUTION costs are calculated (per run, per byte, or per second).
-// LlmCostUnit is for pricing individual LLM model API calls in the registry,
-// while BlockCostType is for billing platform block executions.
-enum LlmCostUnit {
-  RUN
-  TOKENS
-}
-
-model LlmModelCreator {
-  id        String   @id @default(uuid())
-  createdAt DateTime @default(now())
-  updatedAt DateTime @updatedAt
-
-  name        String  @unique // e.g., "openai", "anthropic", "meta"
-  displayName String // e.g., "OpenAI", "Anthropic", "Meta"
-  description String?
-  websiteUrl  String? // Link to creator's website
-  logoUrl     String? // URL to creator's logo
-
-  metadata Json @default("{}")
-
-  Models LlmModel[]
-}
-
-model LlmProvider {
-  id        String   @id @default(uuid())
-  createdAt DateTime @default(now())
-  updatedAt DateTime @updatedAt
-
-  name        String @unique
-  displayName String
-  description String?
-
-  defaultCredentialProvider String?
-  defaultCredentialId       String?
-  defaultCredentialType     String?
-
-  supportsTools        Boolean @default(true)
-  supportsJsonOutput   Boolean @default(true)
-  supportsReasoning    Boolean @default(false)
-  supportsParallelTool Boolean @default(false)
-
-  metadata Json @default("{}")
-
-  Models LlmModel[]
-}
-
-model LlmModel {
-  id        String   @id @default(uuid())
-  createdAt DateTime @default(now())
-  updatedAt DateTime @updatedAt
-
-  slug        String @unique
-  displayName String
-  description String?
-
-  providerId String
-  Provider   LlmProvider @relation(fields: [providerId], references: [id], onDelete: Restrict)
-
-  // Creator is the organization that created/trained the model (e.g., OpenAI, Meta)
-  // This is distinct from the provider who hosts/serves the model (e.g., OpenRouter)
-  creatorId String?
-  Creator   LlmModelCreator? @relation(fields: [creatorId], references: [id], onDelete: SetNull)
-
-  contextWindow   Int
-  maxOutputTokens Int?
-  priceTier       Int     @default(1) // 1=cheapest, 2=medium, 3=expensive
-  isEnabled       Boolean @default(true)
-  isRecommended   Boolean @default(false)
-
-  capabilities Json @default("{}")
-  metadata     Json @default("{}")
-
-  Costs LlmModelCost[]
-
-  @@index([providerId, isEnabled])
-  @@index([creatorId])
-  @@index([slug])
-}
-
-model LlmModelCost {
-  id        String     @id @default(uuid())
-  createdAt DateTime   @default(now())
-  updatedAt DateTime   @updatedAt
-  unit      LlmCostUnit @default(RUN)
-
-  creditCost Int
-
-  credentialProvider String
-  credentialId       String?
-  credentialType     String?
-  currency           String?
-
-  metadata Json @default("{}")
-
-  llmModelId String
-  Model      LlmModel @relation(fields: [llmModelId], references: [id], onDelete: Cascade)
-
-  @@unique([llmModelId, credentialProvider, unit])
-  @@index([llmModelId])
-  @@index([credentialProvider])
-}
-
-// Tracks model migrations for revert capability
-// When a model is disabled with migration, we record which nodes were affected
-// so they can be reverted when the original model is back online
-model LlmModelMigration {
-  id        String   @id @default(uuid())
-  createdAt DateTime @default(now())
-  updatedAt DateTime @updatedAt
-
-  sourceModelSlug String // The original model that was disabled
-  targetModelSlug String // The model workflows were migrated to
-  reason          String? // Why the migration happened (e.g., "Provider outage")
-
-  // Track affected nodes as JSON array of node IDs
-  // Format: ["node-uuid-1", "node-uuid-2", ...]
-  migratedNodeIds Json @default("[]")
-  nodeCount       Int // Number of nodes migrated
-
-  // Custom pricing override for migrated workflows during the migration period.
-  // Use case: When migrating users from an expensive model (e.g., GPT-4) to a cheaper
-  // one (e.g., GPT-3.5), you may want to temporarily maintain the original pricing
-  // to avoid billing surprises, or offer a discount during the transition.
-  //
-  // IMPORTANT: This field is intended for integration with the billing system.
-  // When billing calculates costs for nodes affected by this migration, it should
-  // check if customCreditCost is set and use it instead of the target model's cost.
-  // If null, the target model's normal cost applies.
-  //
-  // TODO: Integrate with billing system to apply this override during cost calculation.
-  customCreditCost Int?
-
-  // Revert tracking
-  isReverted Boolean   @default(false)
-  revertedAt DateTime?
-
-  @@index([sourceModelSlug])
-  @@index([targetModelSlug])
-  @@index([isReverted])
-}
 //////////////   OAUTH PROVIDER TABLES    //////////////////
 ////////////////////////////////////////////////////////////
 ////////////////////////////////////////////////////////////
--- a/autogpt_platform/frontend/.env.default
+++ b/autogpt_platform/frontend/.env.default
@@ -34,3 +34,6 @@ NEXT_PUBLIC_PREVIEW_STEALING_DEV=
 # PostHog Analytics
 NEXT_PUBLIC_POSTHOG_KEY=
 NEXT_PUBLIC_POSTHOG_HOST=https://eu.i.posthog.com
+
+# OpenAI (for voice transcription)
+OPENAI_API_KEY=
--- a/autogpt_platform/frontend/CLAUDE.md
+++ b/autogpt_platform/frontend/CLAUDE.md
@@ -0,0 +1,76 @@
+# CLAUDE.md - Frontend
+
+This file provides guidance to Claude Code when working with the frontend.
+
+## Essential Commands
+
+```bash
+# Install dependencies
+pnpm i
+
+# Generate API client from OpenAPI spec
+pnpm generate:api
+
+# Start development server
+pnpm dev
+
+# Run E2E tests
+pnpm test
+
+# Run Storybook for component development
+pnpm storybook
+
+# Build production
+pnpm build
+
+# Format and lint
+pnpm format
+
+# Type checking
+pnpm types
+```
+
+### Code Style
+
+- Fully capitalize acronyms in symbols, e.g. `graphID`, `useBackendAPI`
+- Use function declarations (not arrow functions) for components/handlers
+
+## Architecture
+
+- **Framework**: Next.js 15 App Router (client-first approach)
+- **Data Fetching**: Type-safe generated API hooks via Orval + React Query
+- **State Management**: React Query for server state, co-located UI state in components/hooks
+- **Component Structure**: Separate render logic (`.tsx`) from business logic (`use*.ts` hooks)
+- **Workflow Builder**: Visual graph editor using @xyflow/react
+- **UI Components**: shadcn/ui (Radix UI primitives) with Tailwind CSS styling
+- **Icons**: Phosphor Icons only
+- **Feature Flags**: LaunchDarkly integration
+- **Error Handling**: ErrorCard for render errors, toast for mutations, Sentry for exceptions
+- **Testing**: Playwright for E2E, Storybook for component development
+
+## Environment Configuration
+
+`.env.default` (defaults) → `.env` (user overrides)
+
+## Feature Development
+
+See @CONTRIBUTING.md for complete patterns. Quick reference:
+
+1. **Pages**: Create in `src/app/(platform)/feature-name/page.tsx`
+   - Extract component logic into custom hooks grouped by concern, not by component. Each hook should represent a cohesive domain of functionality (e.g., useSearch, useFilters, usePagination) rather than bundling all state into one useComponentState hook.
+     - Put each hook in its own `.ts` file
+   - Put sub-components in local `components/` folder
+   - Component props should be `type Props = { ... }` (not exported) unless it needs to be used outside the component
+2. **Components**: Structure as `ComponentName/ComponentName.tsx` + `useComponentName.ts` + `helpers.ts`
+   - Use design system components from `src/components/` (atoms, molecules, organisms)
+   - Never use `src/components/__legacy__/*`
+3. **Data fetching**: Use generated API hooks from `@/app/api/__generated__/endpoints/`
+   - Regenerate with `pnpm generate:api`
+   - Pattern: `use{Method}{Version}{OperationName}`
+4. **Styling**: Tailwind CSS only, use design tokens, Phosphor Icons only
+5. **Testing**: Add Storybook stories for new components, Playwright for E2E
+6. **Code conventions**:
+   - Use function declarations (not arrow functions) for components/handlers
+   - Do not use `useCallback` or `useMemo` unless asked to optimise a given function
+   - Do not type hook returns; let TypeScript infer as much as possible
+   - Never type with `any` unless a variable/attribute can ACTUALLY be of any type
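
Review note: the `use{Method}{Version}{OperationName}` pattern in the new CLAUDE.md can be made concrete with a small sketch. It assumes the version segment is rendered as `V` plus a number, which is inferred from the `getV1OnboardingState` import elsewhere in this diff. `generatedHookName` is a hypothetical helper for illustration and is not part of Orval or of this repository.

```typescript
type HttpMethod = "get" | "post" | "put" | "patch" | "delete";

// Uppercase the first character and leave the rest untouched.
function capitalize(value: string): string {
  return value.charAt(0).toUpperCase() + value.slice(1);
}

// Hypothetical helper (assumption): composes a hook name following
// use{Method}{Version}{OperationName}.
function generatedHookName(
  method: HttpMethod,
  version: number,
  operationName: string,
): string {
  return `use${capitalize(method)}V${version}${capitalize(operationName)}`;
}

console.log(generatedHookName("get", 1, "onboardingState"));
// prints "useGetV1OnboardingState"
```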
--- a/autogpt_platform/frontend/src/app/(no-navbar)/onboarding/page.tsx
+++ b/autogpt_platform/frontend/src/app/(no-navbar)/onboarding/page.tsx
@@ -2,8 +2,9 @@
 import { LoadingSpinner } from "@/components/atoms/LoadingSpinner/LoadingSpinner";
 import { useRouter } from "next/navigation";
 import { useEffect } from "react";
-import { resolveResponse, shouldShowOnboarding } from "@/app/api/helpers";
+import { resolveResponse, getOnboardingStatus } from "@/app/api/helpers";
 import { getV1OnboardingState } from "@/app/api/__generated__/endpoints/onboarding/onboarding";
+import { getHomepageRoute } from "@/lib/constants";

 export default function OnboardingPage() {
  const router = useRouter();
@@ -11,10 +12,13 @@ export default function OnboardingPage() {
  useEffect(() => {
    async function redirectToStep() {
      try {
-        // Check if onboarding is enabled
-        const isEnabled = await shouldShowOnboarding();
-        if (!isEnabled) {
-          router.replace("/");
+        // Check if onboarding is enabled (also gets chat flag for redirect)
+        const { shouldShowOnboarding, isChatEnabled } =
+          await getOnboardingStatus();
+        const homepageRoute = getHomepageRoute(isChatEnabled);
+
+        if (!shouldShowOnboarding) {
+          router.replace(homepageRoute);
          return;
        }

@@ -22,7 +26,7 @@ export default function OnboardingPage() {

        // Handle completed onboarding
        if (onboarding.completedSteps.includes("GET_RESULTS")) {
-          router.replace("/");
+          router.replace(homepageRoute);
          return;
        }

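
Review note: the redirect logic in this hunk can be read as a small pure function. The sketch below is hedged: `getHomepageRouteSketch` is a stand-in for `getHomepageRoute` from `@/lib/constants`, whose body is not shown in this diff, and the `/copilot` and `/library` targets are assumptions taken from the commit messages for #11862 and #11873.

```typescript
type OnboardingStatus = {
  shouldShowOnboarding: boolean;
  isChatEnabled: boolean;
};

// Stand-in for getHomepageRoute (assumption: chat-enabled users land on
// /copilot, everyone else on /library).
function getHomepageRouteSketch(isChatEnabled: boolean): string {
  return isChatEnabled ? "/copilot" : "/library";
}

// Mirrors the branch order in redirectToStep(): returns the route to
// replace with, or null when the caller should continue to the
// step-specific onboarding redirect.
function resolveRedirect(
  status: OnboardingStatus,
  completedSteps: string[],
): string | null {
  const homepageRoute = getHomepageRouteSketch(status.isChatEnabled);
  if (!status.shouldShowOnboarding) return homepageRoute;
  if (completedSteps.includes("GET_RESULTS")) return homepageRoute;
  return null;
}
```

Computing `homepageRoute` once, before either early return, is what lets both the "onboarding disabled" and "onboarding completed" branches share the same destination instead of the old hard-coded `"/"`.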
Author	SHA1	Message	Date
Nicholas Tindle	5c008780e6	Merge branch 'dev' into update-branchlet	2026-01-29 11:57:56 -06:00
Nicholas Tindle	1407efcbc0	Fix formatting in .branchlet.json	2026-01-29 11:57:12 -06:00
Reinier van der Leer	4cd5da678d	refactor(claude): Split `autogpt_platform/CLAUDE.md` into project-specific files (#11788 ) Split `autogpt_platform/CLAUDE.md` into project-specific files, to make the scope of the instructions clearer. Also, some minor improvements: - Change references to other Markdown files to @file/path.md syntax that Claude recognizes - Update ambiguous/incorrect/outdated instructions - Remove trailing slashes - Fix broken file path references in other docs (including comments)	2026-01-29 17:33:02 +00:00
Ubbe	b94c83aacc	feat(frontend): Copilot speech to text via Whisper model (#11871 ) ## Changes 🏗️ https://github.com/user-attachments/assets/d9c12ac0-625c-4b38-8834-e494b5eda9c0 Add a "speech to text" feature in the chat input box of Copilot, similar to what you have in ChatGPT. ## Checklist 📋 ### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Run locally and try the speech to text feature as part of the chat input box ### For configuration changes: We need to add `OPENAI_API_KEY=` to Vercel (used in the frontend) in both Dev and Prod. - [x] `.env.default` is updated or already compatible with my changes --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-29 17:46:36 +07:00
Nicholas Tindle	7668c17d9c	feat(platform): add User Workspace for persistent CoPilot file storage (#11867 ) Implements persistent User Workspace storage for CoPilot, enabling blocks to save and retrieve files across sessions. Files are stored in session-scoped virtual paths (`/sessions/{session_id}/`). Fixes SECRT-1833 ### Changes 🏗️ Database & Storage: - Add `UserWorkspace` and `UserWorkspaceFile` Prisma models - Implement `WorkspaceStorageBackend` abstraction (GCS for cloud, local filesystem for self-hosted) - Add `workspace_id` and `session_id` fields to `ExecutionContext` Backend API: - Add REST endpoints: `GET/POST /api/workspace/files`, `GET/DELETE /api/workspace/files/{id}`, `GET /api/workspace/files/{id}/download` - Add CoPilot tools: `list_workspace_files`, `read_workspace_file`, `write_workspace_file` - Integrate workspace storage into `store_media_file()` - returns `workspace://file-id` references Block Updates: - Refactor all file-handling blocks to use unified `ExecutionContext` parameter - Update media-generating blocks to persist outputs to workspace (AIImageGenerator, AIImageCustomizer, FluxKontext, TalkingHead, FAL video, Bannerbear, etc.) 
Frontend: - Render `workspace://` image references in chat via proxy endpoint - Add "AI cannot see this image" overlay indicator CoPilot Context Mapping: - Session = Agent (graph_id) = Run (graph_exec_id) - Files scoped to `/sessions/{session_id}/` ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [ ] I have tested my changes according to the test plan: - [ ] Create CoPilot session, generate image with AIImageGeneratorBlock - [ ] Verify image returns `workspace://file-id` (not base64) - [ ] Verify image renders in chat with visibility indicator - [ ] Verify workspace files persist across sessions - [ ] Test list/read/write workspace files via CoPilot tools - [ ] Test local storage backend for self-hosted deployments #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes) 🤖 Generated with [Claude Code](https://claude.ai/code) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Medium Risk > Introduces a new persistent file-storage surface area (DB tables, storage backends, download API, and chat tools) and rewires `store_media_file()`/block execution context across many blocks, so regressions could impact file handling, access control, or storage costs. > > Overview > Adds a persistent per-user Workspace (new `UserWorkspace`/`UserWorkspaceFile` models plus `WorkspaceManager` + `WorkspaceStorageBackend` with GCS/local implementations) and wires it into the API via a new `/api/workspace/files/{file_id}/download` route (including header-sanitized `Content-Disposition`) and shutdown lifecycle hooks. 
> > Extends `ExecutionContext` to carry execution identity + `workspace_id`/`session_id`, updates executor tooling to clone node-specific contexts, and updates `run_block` (CoPilot) to create a session-scoped workspace and synthetic graph/run/node IDs. > > Refactors `store_media_file()` to require `execution_context` + `return_format` and to support `workspace://` references; migrates many media/file-handling blocks and related tests to the new API and to persist generated media as `workspace://...` (or fall back to data URIs outside CoPilot), and adds CoPilot chat tools for listing/reading/writing/deleting workspace files with safeguards against context bloat. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `6abc70f793`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: Reinier van der Leer <pwuts@agpt.co>	2026-01-29 05:49:47 +00:00
Nicholas Tindle	e0dfae5732	fix(platform): evaluate chat flag after auth for correct redirect (#11873 ) Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-28 14:58:02 -06:00
Zamil Majdy	7df867d645	Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev	2026-01-28 12:29:41 -06:00
Zamil Majdy	d855f79874	fix(platform): reduce Sentry alert spam for expected errors (#11872 ) ## Summary - Add `InvalidInputError` for validation errors (search term too long, invalid pagination) - returns 400 instead of 500 - Remove redundant try/catch blocks in library routes - global exception handlers already handle `ValueError`→400 and `NotFoundError`→404 - Aggregate embedding backfill errors and log once at the end instead of per content type to prevent Sentry issue spam ## Test plan - [x] Verify validation errors (search term >100 chars) return 400 Bad Request - [x] Verify NotFoundError still returns 404 - [x] Verify embedding errors are logged once at the end with aggregated counts Fixes AUTOGPT-SERVER-7K5, BUILDER-6NC --------- Co-authored-by: Swifty <craigswift13@gmail.com>	2026-01-29 01:28:27 +07:00
Swifty	dac99694fe	Merge branch 'release/v0.6.44'	2026-01-28 12:19:13 +01:00
Nicholas Tindle	0953983944	feat(platform): disable onboarding redirects and add $5 signup bonus (#11862 ) Disable automatic onboarding redirects on signup/login while keeping the checklist/wallet functional. Users now receive $5 (500 credits) on their first visit to /copilot. ### Changes 🏗️ - Frontend: `shouldShowOnboarding()` now returns `false`, disabling auto-redirects to `/onboarding` - Backend: Added `VISIT_COPILOT` onboarding step with 500 credit ($5) reward - Frontend: Copilot page automatically completes `VISIT_COPILOT` step on mount - Database: Migration to add `VISIT_COPILOT` to `OnboardingStep` enum NOTE: `/onboarding/1-welcome` now redirects to `/library`, since `shouldShowOnboarding()` always returns `false`. Users land directly on `/copilot` after signup/login and receive $5 invisibly (not shown in checklist UI). ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] New user signup (email/password) → lands on `/copilot`, wallet shows 500 credits - [x] Verified credits are only granted once (idempotent via onboarding reward mechanism) - [x] Existing user login (already granted flag set) → lands on `/copilot`, no duplicate credits - [x] Checklist/wallet remains functional #### For configuration changes: - [x] `.env.default` is updated or already compatible with my changes - [x] `docker-compose.yml` is updated or already compatible with my changes - [x] I have included a list of my configuration changes in the PR description (under Changes) No configuration changes required. --- OPEN-2967 🤖 Generated with [Claude Code](https://claude.ai/code) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Introduces a new onboarding step and adjusts onboarding flow. 
> > - Adds `VISIT_COPILOT` onboarding step (+500 credits) with DB enum migration and API/type updates > - Copilot page auto-completes `VISIT_COPILOT` on mount to grant the welcome bonus > - Changes `/onboarding/enabled` to require user context and return `false` when `CHAT` feature is enabled (skips legacy onboarding) > - Wallet now refreshes credits on any onboarding `step_completed` notification; confetti limited to visible tasks > - Test flows updated to accept redirects to `copilot`/`library` and verify authenticated state > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `ec5a5a4dfd`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com> Co-authored-by: Nicholas Tindle <ntindle@users.noreply.github.com>	2026-01-28 07:22:46 +00:00
Zamil Majdy	0058cd3ba6	fix(frontend): auto-poll for long-running tool completion (#11866 ) ## Summary Fixes the issue where the "Creating Agent" spinner doesn't auto-update when agent generation completes - user had to refresh the browser. Changes: - Frontend polling: Add `onOperationStarted` callback to trigger polling when `operation_started` is received via SSE - Polling backoff: 2s, 4s, 6s, 8s... up to 30s max - Message deduplication: Use content-based keys (role + content) instead of timestamps to prevent duplicate messages - Message ordering: Preserve server message order instead of timestamp-based sorting - Debug cleanup: Remove verbose console.log/console.info statements ## Test plan - [ ] Start agent generation in copilot - [ ] Verify "Creating Agent" spinner appears - [ ] Wait for completion (2-5 min) WITHOUT refreshing - [ ] Verify agent carousel appears automatically when done - [ ] Verify no duplicate messages in chat - [ ] Verify message order is correct (user → assistant → tool_call → tool_response)	2026-01-28 10:03:21 +07:00
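The polling schedule and content-based deduplication described in 0058cd3ba6 can be illustrated with a short sketch. The actual change lives in the TypeScript frontend; the Python below only models the logic, and `polling_delays` and `dedupe_messages` are hypothetical names. The 2-second step and 30-second cap are taken from the summary.

```python
from typing import Any, Iterator


def polling_delays(step_s: float = 2.0, max_s: float = 30.0) -> Iterator[float]:
    """Yield linearly growing delays: 2s, 4s, 6s, 8s, ... capped at max_s."""
    attempt = 1
    while True:
        yield min(step_s * attempt, max_s)
        attempt += 1


def dedupe_messages(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop repeated messages using a (role, content) key, keeping server order."""
    seen: set[tuple[Any, Any]] = set()
    unique: list[dict[str, Any]] = []
    for message in messages:
        key = (message.get("role"), message.get("content"))
        if key in seen:
            continue
        seen.add(key)
        unique.append(message)
    return unique
```

Keying on role and content rather than on timestamps means the same message fetched twice (once over SSE, once by polling) collapses to one entry even when the two copies carry different timestamps.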
Nicholas Tindle	ea035224bc	feat(copilot): Increase max_agent_runs and max_agent_schedules (#11865 ) Config change to increase the maximum number of times an agent can run in a chat and the maximum number of schedules Copilot can create in one chat. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Increases per-chat operational limits for Copilot. > > - Bumps `max_agent_runs` default from `3` to `30` in `ChatConfig` > - Bumps `max_agent_schedules` default from `3` to `30` in `ChatConfig` > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `93cbae6d27`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-01-28 01:08:02 +00:00
Nicholas Tindle	62813a1ea6	Delete backend/blocks/video/__init__.py (#11864 ) oops file ### Changes 🏗️ Removes a file that should not have been committed. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Removes erroneous `backend/blocks/video/__init__.py`, eliminating an unintended `video` package. > > - Deletes a placeholder comment-only file > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `3b84576c33`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY -->	2026-01-28 00:58:49 +00:00
Bently	67405f7eb9	fix(copilot): ensure tool_call/tool_response pairs stay intact during context compaction (#11863 ) ## Summary Fixes context compaction breaking tool_call/tool_response pairs, causing API validation errors. ## Problem When context compaction slices messages with `messages[-KEEP_RECENT:]`, a naive slice can separate an assistant message containing `tool_calls` from its corresponding tool response messages. This causes API validation errors like: ``` messages.0.content.1: unexpected 'tool_use_id' found in 'tool_result' blocks: orphan_12345. Each 'tool_result' block must have a corresponding 'tool_use' block in the previous message. ``` ## Solution Added `_ensure_tool_pairs_intact()` helper function that: 1. Detects orphan tool responses in a slice (tool messages whose `tool_call_id` has no matching assistant message) 2. Extends the slice backwards to include the missing assistant messages 3. Falls back to removing orphan tool responses if the assistant cannot be found (edge case) Applied this safeguard to: - The initial `KEEP_RECENT` slice (line ~990) - The progressive fallback slices when still over token limit (line ~1079) ## Testing - Syntax validated with `python -m py_compile` - Logic reviewed for correctness ## Linear Fixes SECRT-1839 --- Debugged by Toran & Orion in #agpt Discord	2026-01-28 00:21:54 +00:00
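The three steps listed for `_ensure_tool_pairs_intact()` in 67405f7eb9 can be sketched roughly as below. This is an assumption-laden illustration rather than the helper from the PR: the signature, the OpenAI-style message shape (`role`, `tool_calls[].id`, `tool_call_id`), and the public name `ensure_tool_pairs_intact` are all hypothetical.

```python
from typing import Any

Message = dict[str, Any]


def _call_ids(message: Message) -> set[str]:
    """IDs of the tool calls issued by an assistant message (empty otherwise)."""
    if message.get("role") != "assistant":
        return set()
    return {call["id"] for call in message.get("tool_calls") or []}


def ensure_tool_pairs_intact(all_messages: list[Message], start_index: int) -> list[Message]:
    """Return all_messages[start_index:] without orphan tool responses."""
    recent = all_messages[start_index:]
    available = set().union(*(_call_ids(m) for m in recent))

    # Step 1: find tool responses whose assistant message fell outside the slice.
    orphans = {
        m.get("tool_call_id")
        for m in recent
        if m.get("role") == "tool" and m.get("tool_call_id") not in available
    }

    # Step 2: walk backwards and extend the slice to cover the missing assistants.
    new_start = start_index
    for i in range(start_index - 1, -1, -1):
        if not orphans:
            break
        ids = _call_ids(all_messages[i])
        if ids & orphans:
            orphans -= ids
            new_start = i

    # Step 3: drop any tool response that still has no matching assistant.
    extended = all_messages[new_start:]
    available = set().union(*(_call_ids(m) for m in extended))
    return [
        m
        for m in extended
        if m.get("role") != "tool" or m.get("tool_call_id") in available
    ]
```

A caller would replace a plain `messages[-KEEP_RECENT:]` with `ensure_tool_pairs_intact(messages, max(0, len(messages) - KEEP_RECENT))`, accepting that the result may be slightly longer than `KEEP_RECENT`.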
Zamil Majdy	171ff6e776	feat(backend): persist long-running tool results to survive SSE disconnects (#11856 ) ## Summary Agent generation (`create_agent`, `edit_agent`) can take 1-5 minutes. Previously, if the user closed their browser tab during this time: 1. The SSE connection would die 2. The tool execution would be cancelled via `CancelledError` 3. The result would be lost - even if the agent-generator service completed successfully This PR ensures long-running tool operations survive SSE disconnections. ### Changes 🏗️ Backend: - base.py: Added `is_long_running` property to `BaseTool` for tools to opt-in to background execution - create_agent.py / edit_agent.py: Set `is_long_running = True` - models.py: Added `OperationStartedResponse`, `OperationPendingResponse`, `OperationInProgressResponse` types - service.py: Modified `_yield_tool_call()` to: - Check if tool is `is_long_running` - Save "pending" message to chat history immediately - Spawn background task that runs independently of SSE - Return `operation_started` immediately (don't wait) - Update chat history with result when background task completes - Track running operations for idempotency (prevents duplicate ops on refresh) - db.py: Added `update_tool_message_content()` to update pending messages - model.py: Added `invalidate_session_cache()` to clear Redis after background completion Frontend: - useChatMessage.ts: Added operation message types - helpers.ts: Handle `operation_started`, `operation_pending`, `operation_in_progress` response types - PendingOperationWidget: New component to display operation status with spinner - ChatMessage.tsx: Render `PendingOperationWidget` for operation messages ### How It Works ``` User Request → Save "pending" message → Spawn background task → Return immediately ↓ Task runs independently of SSE ↓ On completion: Update message in chat history ↓ User refreshes → Loads history → Sees result ``` ### User Experience 1. User requests agent creation 2. 
Sees "Agent creation started. You can close this tab - check your library in a few minutes." 3. Can close browser tab safely 4. When they return, chat shows the completed result (or error) ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] pyright passes (0 errors) - [x] TypeScript checks pass - [x] Formatters applied ### Test Plan 1. Start agent creation in copilot 2. Close browser tab immediately after seeing "operation_started" 3. Wait 2-3 minutes 4. Reopen chat 5. Verify: Chat history shows completion message and agent appears in library --------- Co-authored-by: Ubbe <hi@ubbe.dev>	2026-01-28 05:09:34 +07:00
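The flow described in 171ff6e776 (save a pending message, spawn a background task, return immediately, update history on completion) can be sketched with plain `asyncio`. The in-memory dictionaries stand in for the chat history and operation tracking, and `start_long_running_tool` is a hypothetical name; the real `_yield_tool_call()` is assumed to persist to the database and invalidate the Redis session cache instead.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical in-memory stand-ins for persisted chat history and operation tracking.
chat_history: dict[str, str] = {}
running_operations: dict[str, asyncio.Task] = {}


async def start_long_running_tool(
    operation_id: str,
    tool: Callable[[], Awaitable[str]],
) -> dict[str, str]:
    """Start the tool in the background and return without waiting for it."""
    if operation_id in running_operations:
        # Idempotency: a refresh must not spawn a duplicate operation.
        return {"type": "operation_in_progress", "operation_id": operation_id}

    # Save a placeholder first so a reloaded chat shows the pending state.
    chat_history[operation_id] = "pending"

    async def run_and_persist() -> None:
        try:
            chat_history[operation_id] = await tool()
        except Exception as exc:
            chat_history[operation_id] = f"error: {exc}"
        finally:
            running_operations.pop(operation_id, None)

    # The task is owned by the event loop, not by the SSE request handler,
    # so cancelling the request does not cancel the tool.
    running_operations[operation_id] = asyncio.create_task(run_and_persist())
    return {"type": "operation_started", "operation_id": operation_id}
```

Keeping a reference to the task in `running_operations` also prevents it from being garbage-collected before it finishes, which `asyncio.create_task` does not guarantee on its own.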
Lluis Agusti	349b1f9c79	hotfix(frontend): copilot session handling refinements...	2026-01-28 02:53:45 +07:00
Lluis Agusti	277b0537e9	hotfix(frontend): copilot simplification...	2026-01-28 02:10:18 +07:00
Ubbe	071b3bb5cd	fix(frontend): more copilot refinements (#11858 ) ## Changes 🏗️ On the Copilot page: - prevent unnecessary sidebar repaints - show a disclaimer when switching chats on the sidebar to terminate a current stream - handle loading better - save streams better when disconnecting ### Checklist 📋 #### For code changes: - [x] I have clearly listed my changes in the PR description - [x] I have made a test plan - [x] I have tested my changes according to the test plan: - [x] Run the app locally and test the above	2026-01-28 00:49:28 +07:00