Compare commits

..

60 Commits

Author SHA1 Message Date
Nicholas Tindle
f2f779e54f Merge branch 'dev' into add-llm-manager-ui 2026-01-27 10:39:47 -06:00
Bentlybro
dda9a9b010 Update llm.py 2026-01-23 15:07:55 +00:00
Bentlybro
c1d3604682 Improve LlmModelMeta slug generation logic
Slug generation now checks for exact matches in the registry before applying the letter-digit hyphen transformation. This ensures that model names like 'o1' are preserved as-is if present in the registry, improving compatibility with dynamic model slugs.
2026-01-23 14:59:49 +00:00
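
A minimal sketch of the slug rule this commit describes, using a hypothetical `to_model_slug` helper and registry set (not the actual implementation): an exact registry match such as `o1` is kept as-is, otherwise a hyphen is inserted at letter-digit boundaries.

```python
import re

def to_model_slug(name: str, registry_slugs: set[str]) -> str:
    # Exact registry match wins, preserving dynamic slugs like "o1".
    if name in registry_slugs:
        return name
    # Otherwise apply the letter-digit hyphen transformation, e.g. "gpt4o" -> "gpt-4o".
    return re.sub(r"(?<=[A-Za-z])(?=\d)", "-", name).lower()

assert to_model_slug("o1", {"o1", "gpt-4o"}) == "o1"
assert to_model_slug("gpt4o", {"o1", "gpt-4o"}) == "gpt-4o"
```
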
Bentlybro
dfbfbdf696 Add pagination and lazy loading to models table
Implemented client-side pagination for the LLM models table in the admin UI, including a 'Load More' button and loading state. The backend now only returns enabled models for selection. This improves performance and usability when managing large numbers of models.
2026-01-23 12:12:32 +00:00
Bentlybro
994ebc2cf8 Merge branch 'dev' into add-llm-manager-ui 2026-01-22 14:38:24 +00:00
Bentlybro
2245d115d3 Refactor form field extraction and validation utilities
Introduced utility functions for extracting and validating required fields from FormData, reducing code duplication and improving error handling across LLM provider, model, and creator actions. Updated all relevant actions to use these new utilities for consistent validation.
2026-01-22 14:07:59 +00:00
Bentlybro
5238b1b71c Add input validation to LLM provider/model actions
Improves robustness by validating and sanitizing form data in deleteLlmProviderAction and createLlmModelAction. Ensures required fields are present and context window and credit cost are valid numbers before proceeding.
2026-01-22 13:51:54 +00:00
Bentlybro
4fb86b2738 Update actions.ts 2026-01-22 13:44:46 +00:00
Bentlybro
e10128e9f0 Improve LLM provider form data handling
Parse 'default_credential_id' and 'default_credential_type' from form data instead of using static values. Update boolean field parsing to use getAll and check for 'on' to better support multiple checkbox inputs.
2026-01-22 13:41:37 +00:00
Bentlybro
b205d5863e format 2026-01-22 13:13:46 +00:00
Bentlybro
6da2dee62f Add edit and delete functionality for LLM providers
Introduces backend API and frontend UI for editing and deleting LLM providers. Providers can only be deleted if they have no associated models. Includes new modals for editing and deleting providers, updates provider list to show model count and actions, and adds corresponding actions and API integration.
2026-01-22 13:08:29 +00:00
Bentlybro
324ebc1e06 Fix LLM model creation, DB JSON handling, and migration logic
Corrects handling of JSON fields in the backend by wrapping metadata and capabilities in prisma.Json, and updates model/creator relationship to use Prisma connect syntax. Updates LlmModelMigration timestamps to use datetime objects. Adjusts SQL migrations to avoid duplicate table/constraint creation and adds conditional foreign key logic. Fixes frontend LLM model form to properly handle is_enabled checkbox state.
2026-01-22 12:37:31 +00:00
Bentlybro
ce2ebee838 Refactor LlmModel priceTier and add creator support
Removes the priceTier field from the LlmModel seed migration and moves price tier assignments to a dedicated migration. Adds new columns to LlmModel for creatorId and isRecommended, creates the LlmModelCreator table, and updates priceTier values for existing models to support enhanced LLM Picker UI functionality.
2026-01-22 12:04:13 +00:00
Bentlybro
0597573b6c Merge branch 'dev' into add-llm-manager-ui 2026-01-22 11:52:43 +00:00
Bentlybro
9496b33a1c Add price tier to LLM model metadata and registry
Introduces a 'priceTier' attribute (1=cheapest, 2=medium, 3=expensive) to LlmModel in the database schema, model metadata, and registry logic. Updates migrations and seed data to support price tier for LLM models, enabling cost-based filtering and selection in the LLM Picker UI.
2026-01-22 11:52:37 +00:00
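
An illustrative sketch of how the price tier described here (1 = cheapest, 2 = medium, 3 = expensive) could drive cost-based filtering in a picker; the model shape and function name are assumptions.

```python
from dataclasses import dataclass

@dataclass
class PickerModel:
    slug: str
    price_tier: int  # 1 = cheapest, 2 = medium, 3 = expensive
    is_enabled: bool

def models_within_budget(models: list[PickerModel], max_tier: int) -> list[PickerModel]:
    """Return enabled models at or below the requested price tier, cheapest first."""
    return sorted(
        (m for m in models if m.is_enabled and m.price_tier <= max_tier),
        key=lambda m: m.price_tier,
    )
```
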
Bentlybro
8e3aabd558 Use effective model for parallel tool calls param
Replaces usage of llm_model with effective_model when resolving parallel tool calls parameters. This ensures model-specific parameter resolution uses the actual model in use, including after any fallback.
2026-01-22 11:08:09 +00:00
Bentlybro
fbef81c0c9 Improve LLM model iteration and metadata handling
Added __iter__ to LlmModelMeta for dynamic model iteration and updated metadata retrieval to handle missing registry entries gracefully. Fixed BlockSchema cached_jsonschema initialization and improved discriminator mapping refresh logic. Updated NodeInputs to display beautified string if label is missing.
2026-01-22 10:00:06 +00:00
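
A rough sketch of the `__iter__` idea mentioned above: a metaclass iterator lets `for model in LlmModel` walk whatever slugs the registry currently holds instead of a fixed enum. Registry access and class internals are assumptions, not the real code.

```python
_REGISTRY_SLUGS = ["gpt-4o", "claude-sonnet"]  # stand-in for the dynamic registry

class LlmModelMeta(type):
    def __iter__(cls):
        # Iterating the class yields one instance per registry slug.
        for slug in _REGISTRY_SLUGS:
            yield cls(slug)

class LlmModel(str, metaclass=LlmModelMeta):
    def __new__(cls, slug: str):
        return str.__new__(cls, slug)

print(list(LlmModel))  # -> ['gpt-4o', 'claude-sonnet']
```
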
Bentlybro
226d2ef4a0 Merge branch 'dev' into add-llm-manager-ui 2026-01-21 23:46:07 +00:00
Bentlybro
42f8a26ee1 Allow LLM model deletion without replacement if unused
Updated backend logic and API schema to permit deleting an LLM model without specifying a replacement if no workflow nodes are using it. Adjusted tests to cover both cases (with and without usage), made replacement_model_slug optional in the response model, and updated OpenAPI spec accordingly.
2026-01-21 23:26:52 +00:00
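
A sketch of the conditional this commit describes, with hypothetical helper callables passed in: a replacement slug is only demanded when workflow nodes still use the model.

```python
async def delete_model_checked(model_id: str, replacement_slug: str | None,
                               count_usage, migrate_nodes, delete_model) -> None:
    in_use = await count_usage(model_id)
    if in_use == 0:
        await delete_model(model_id)  # unused: delete without a replacement
        return
    if replacement_slug is None:
        raise ValueError(
            f"{in_use} nodes use this model; replacement_model_slug is required"
        )
    await migrate_nodes(model_id, replacement_slug)
    await delete_model(model_id)
```
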
Bentlybro
8d021fe76c Allow LLM model deletion without mandatory migration
Backend and frontend logic updated to allow deletion of LLM models without requiring a replacement if no workflows use the model. The API, UI, and OpenAPI spec now conditionally require a replacement model only when migration is necessary, improving admin workflow and error handling.
2026-01-21 22:23:26 +00:00
Bentlybro
cb10907bf6 Add pagination to LLM model listing endpoints
Introduces pagination support to the LLM model listing APIs in both admin and public routes. Updates the response model to include pagination metadata, modifies database queries to support paging, and adjusts related tests. Also renames model_types.py to model.py for consistency.
2026-01-21 21:00:18 +00:00
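
A minimal sketch of paginated listing as described, page and page size in, items plus pagination metadata out; field names here are assumptions rather than the real response model.

```python
from math import ceil
from pydantic import BaseModel

class Pagination(BaseModel):
    page: int
    page_size: int
    total_items: int
    total_pages: int

def paginate(items: list, page: int = 1, page_size: int = 50) -> dict:
    start = (page - 1) * page_size
    return {
        "models": items[start:start + page_size],
        "pagination": Pagination(
            page=page,
            page_size=page_size,
            total_items=len(items),
            total_pages=max(1, ceil(len(items) / page_size)),
        ),
    }
```
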
Bentlybro
54084fe597 Refactor LLM admin route tests for improved mocking and snapshots
Updated tests to use actual model and response classes from llm_model instead of dicts, ensuring more accurate type usage. Snapshot assertions now serialize responses to JSON strings for compatibility. Cleaned up test_delete_llm_model_missing_replacement to remove unnecessary mocking.
2026-01-19 14:28:33 +00:00
Bentlybro
8f5d851908 Set router prefix in llm_routes_test.py
Added the '/admin/llm' prefix to the included router in the test setup to match the expected route structure.
2026-01-19 14:16:08 +00:00
Bentlybro
358a21c6fc prettier 2026-01-19 14:15:04 +00:00
Bentlybro
336fc43b24 Add unique constraint to LlmModelCost on model, provider, unit
Introduces a unique index on the combination of llmModelId, credentialProvider, and unit in the LlmModelCost table to prevent duplicate cost entries. Updates the seed migration to handle conflicts on this unique key by doing nothing on conflict.
2026-01-19 13:39:20 +00:00
Bentlybro
cfb1613877 Update hidden credential_type input logic in EditModelModal
The hidden input for credential_type now prioritizes cost.credential_type, then provider.default_credential_type, and defaults to 'api_key' if neither is set. This ensures the correct credential type is submitted based on available data.
2026-01-16 14:29:46 +00:00
Bentlybro
386eea741c Rename cost_unit field to unit in LLM model forms
Updated form field and related code references from 'cost_unit' to 'unit' in both create and update LLM model actions, as well as in the EditModelModal component. This change ensures consistency in naming and aligns with expected backend parameters.
2026-01-16 14:19:04 +00:00
Bentlybro
e5c6809d9c Improve LLM model cost unit handling and cache refresh
Adds explicit handling of the cost unit in LLM model creation and update actions, ensuring the unit is always set (defaulting to 'RUN'). Updates the EditModelModal to include a hidden cost_unit input. Refactors backend LLM runtime state refresh logic to improve error handling and logging for cache clearing operations.
2026-01-16 13:58:19 +00:00
Bentlybro
963b8090cc Fix admin LLM API routes and improve model migration
Removes redundant route prefix in backend admin LLM API, updates OpenAPI paths to match, and improves parameterization for batch node updates in model migration and revert logic. Also adds stricter validation for replacement model slug in frontend actions and sets button type in EditModelModal.
2026-01-16 12:51:06 +00:00
Bentlybro
eab93aba2b Add options field to BlockIOStringSubSchema type
Introduces an optional 'options' array to BlockIOStringSubSchema, allowing specification of selectable string values with labels and optional descriptions.
2026-01-16 10:13:33 +00:00
Bentlybro
47a70cdbd0 Merge branch 'dev' into add-llm-manager-ui 2026-01-16 09:39:36 +00:00
Bentlybro
69c9136060 Improve LLM registry consistency and frontend UX
Backend: Refactored LLM registry state updates to use atomic swaps for consistency, made Redis notification publishing async, and improved schema/discriminator mapping access to prevent external mutation. Added stricter slug validation for model creation. Frontend: Enhanced Edit and Delete Model modals to refresh data after actions and show error states, and wrapped the LLM Registry Dashboard in an error boundary for better error handling.
2026-01-12 12:52:40 +00:00
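
A sketch of the "atomic swap" pattern mentioned in this commit: build the new registry state off to the side, then replace the reference in a single assignment so readers never observe a half-updated mapping; the read-only proxy also guards against external mutation. Names are illustrative.

```python
from types import MappingProxyType

_registry_state: MappingProxyType = MappingProxyType({})

def swap_registry_state(new_models: dict[str, dict]) -> None:
    global _registry_state
    # Reference assignment is atomic in CPython; readers keep the old snapshot
    # until the swap completes, and MappingProxyType blocks in-place mutation.
    _registry_state = MappingProxyType(dict(new_models))

def get_registry_state() -> MappingProxyType:
    return _registry_state
```
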
Bentlybro
6ed8bb4f14 Clarify custom pricing override for LLM migrations
Improved documentation and comments for the custom_credit_cost field in backend, frontend, and schema files to clarify its use as a billing override during LLM model migrations. Also removed unused LLM registry types and API methods from frontend code, and renamed useLlmRegistryPage.ts to getLlmRegistryPage.ts for consistency.
2026-01-12 11:40:49 +00:00
Bentlybro
6cf28e58d3 Improve LLM model default selection and admin actions
Backend logic for selecting the default LLM model now prioritizes the recommended model, with improved fallbacks and error handling if no models are enabled. The migration enforces a single recommended model at the database level. Frontend admin actions for LLM models and providers now correctly interpret form values for boolean fields and fix the return type for the delete action.
2026-01-09 15:18:54 +00:00
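
A hedged sketch of the default-selection order described here, with an assumed model shape: the recommended model first, then the first enabled model, with an explicit error when nothing is enabled.

```python
def get_default_model_slug(models: list[dict]) -> str:
    enabled = [m for m in models if m.get("is_enabled")]
    if not enabled:
        raise RuntimeError("No enabled LLM models in the registry")
    recommended = next((m for m in enabled if m.get("is_recommended")), None)
    return (recommended or enabled[0])["slug"]
```
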
Bentlybro
632ef24408 Add recommended LLM model feature to admin UI and API
Introduces the ability for admins to mark a model as the recommended default via a new boolean field `isRecommended` on LlmModel. Adds backend endpoints and logic to set, get, and persist the recommended model, including a migration and schema update. Updates the frontend admin UI to allow selecting and displaying the recommended model, and reflects the recommended status in model tables and dropdowns.
2026-01-07 19:43:16 +00:00
Bentlybro
6dc767aafa Improve admin LLM registry UX and error handling
Adds user feedback and error handling to LLM registry modals (add/edit creator, model, provider) in the admin UI, including loading states and error messages. Ensures atomic updates for model costs in the backend using transactions. Improves display of creator website URLs and handles the case where no LLM models are available in analytics config. Updates icon usage and removes unnecessary 'use server' directive.
2026-01-07 14:17:37 +00:00
Bentlybro
23e37fd163 Replace delete button with DeleteCreatorModal
Refactored the creator deletion flow in CreatorsTable to use a new DeleteCreatorModal component, providing a confirmation dialog and improved error handling. The previous DeleteCreatorButton was removed and replaced for better user experience and safety.
2026-01-06 14:22:21 +00:00
Bentlybro
63869fe710 format 2026-01-06 13:40:16 +00:00
Bentlybro
90ae75d475 Delete settings.local.json 2026-01-06 13:07:46 +00:00
Bentlybro
9b6dc3be12 prettier 2026-01-06 13:01:51 +00:00
Bentlybro
9b8b6252c5 Refactor LLM registry admin backend and frontend
Refactored backend imports and test mocks to use new admin LLM routes location. Cleaned up and reordered imports for clarity and consistency. Improved code formatting and readability across backend and frontend files. Renamed useLlmRegistryPage to getLlmRegistryPageData for clarity and updated all usages. No functional changes to business logic.
2026-01-06 12:57:33 +00:00
Bentlybro
0d321323f5 Add GPT-5.2 model and admin LLM endpoints
Introduces a migration to add the GPT-5.2 model and updates the O3 model slug in the database. Refactors backend LLM model registry usage for search and migration logic. Expands the OpenAPI spec with new admin endpoints for managing LLM models, providers, creators, and migrations.
2026-01-06 12:46:20 +00:00
Bentlybro
3ee3ea8f02 Merge branch 'dev' into add-llm-manager-ui 2026-01-06 10:28:43 +00:00
Bentlybro
7a842d35ae Refactor LLM admin to use generated API and types
Replaces usage of the custom BackendApi client and legacy types in admin LLM actions and components with generated OpenAPI endpoints and types. Updates API calls, error handling, and type imports throughout the admin LLM dashboard. Also corrects operationId fields in backend routes and OpenAPI spec for consistency.
2026-01-06 09:43:15 +00:00
Bentlybro
07e8568f57 Refactor LLM admin UI for improved consistency and API support
Refactored admin LLM actions and components to improve code organization, update color schemes to use design tokens, and enhance UI consistency. Updated API types and endpoints to support model creators and migrations, and switched tables to use shared Table components. Added and documented new API endpoints for model migrations, creators, and usage in openapi.json.
2026-01-05 17:10:04 +00:00
Bentlybro
13a0caa5d8 Improve model modal UX and credential provider selection
Add auto-selection of creator based on provider in AddModelModal for better usability. Update EditModelModal to use a select dropdown for credential provider, add helper text, and set credential_type as a hidden default input.
2026-01-05 16:01:36 +00:00
Bentlybro
664523a721 Refactor LLM model cost and update logic, remove 'Enabled' checkbox
Improves backend handling of LLM model cost updates by separating scalar and relation field updates, ensuring costs are deleted and recreated as needed. Optional cost fields are now only included if present, and metadata is handled as a Prisma Json type. On the frontend, removes the 'Enabled' checkbox from the EditModelModal component.
2026-01-05 15:56:45 +00:00
Bentlybro
33b103d09b Improve LLM model migration and add AgentNode index
Refactored model migration and revert logic for atomicity and consistency, including transactional node selection and updates. Enhanced revert API to support optional re-enabling of source models and reporting of nodes not reverted. Added a database index on AgentNode.constantInput->>'model' to optimize migration queries and performance.
2026-01-05 15:22:33 +00:00
Bentlybro
2e3fc99caa Add LLM model creator support to registry and admin UI
Introduces the LlmModelCreator entity to distinguish model creators (e.g., OpenAI, Meta) from providers, with full CRUD API endpoints, database migration, and Prisma schema updates. Backend and frontend are updated to support associating models with creators, including admin UI for managing creators and selecting them when creating or editing models. Existing models are backfilled with known creators via migration.
2026-01-05 10:17:00 +00:00
Bently
52c7b223df Add migration management for LLM models
Introduced a new LlmModelMigration model to track migrations when disabling LLM models, allowing for revert capability. Updated the toggle model API to create migration records with optional reason and custom pricing. Added endpoints for listing and reverting migrations, along with corresponding frontend actions and UI components to manage migrations effectively. Enhanced the admin dashboard to display active migrations, improving overall usability and tracking of model changes.
2025-12-19 00:06:03 +00:00
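
A rough shape of the migration record this commit describes, with assumed fields: enough to know which nodes moved where, plus the optional reason and pricing override, so the change can be reverted later. Not the actual Prisma model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class LlmModelMigrationRecord:
    source_model_slug: str
    target_model_slug: str
    node_ids: list[str]
    reason: str | None = None
    custom_credit_cost: int | None = None
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    reverted_at: datetime | None = None  # set when the migration is reverted
```
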
Bently
24d86fde30 Enhance LLM model toggle functionality with migration support
Updated the toggle LLM model API to include an optional migration feature, allowing workflows to be migrated to a specified replacement model when disabling a model. Refactored related request and response models to accommodate this change. Improved error handling and logging for better debugging. Updated frontend actions and components to support the new migration parameter.
2025-12-18 23:32:41 +00:00
Bentlybro
df7be39724 Refactor add model/provider forms to modal dialogs
Replaces AddModelForm and AddProviderForm components with AddModelModal and AddProviderModal, converting the add model/provider flows to use modal dialogs instead of inline forms. Updates LlmRegistryDashboard to use the new modal components and removes dropdown/form selection logic for a cleaner UI.
2025-12-13 19:39:30 +00:00
Bentlybro
8c7b1af409 Refactor LLM registry to modular structure and improve admin UI
Moved LLM registry backend code into a dedicated llm_registry module with submodules for model types, notifications, schema utilities, and registry logic. Updated all backend imports to use the new structure. On the frontend, redesigned the admin LLM registry page with a dashboard layout, modularized data fetching, and improved forms for adding/editing providers and models. Updated UI components for better usability and maintainability.
2025-12-12 11:32:28 +00:00
Bentlybro
b6e2f05b63 Refactor LlmModel to support dynamic registry slugs
Replaces hardcoded LlmModel enum values with a dynamic approach that accepts any model slug from the registry. Updates block defaults to use a default_factory method that pulls the preferred model from the registry. Refactors model validation, migration, and admin analytics routes to use registry-based model lists, ensuring only enabled models are selectable and recommended. Adds get_default_model_slug to llm_registry for consistent default selection.
2025-12-09 15:49:44 +00:00
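
An illustrative sketch of the default_factory approach mentioned here: a block input whose default is resolved from the registry when the schema is built, rather than being a hardcoded enum member. The registry helper below is a stand-in.

```python
from pydantic import BaseModel, Field

def _default_model_slug() -> str:
    # Stand-in for a registry lookup such as llm_registry.get_default_model_slug().
    return "gpt-4o"

class LlmBlockInput(BaseModel):
    model: str = Field(default_factory=_default_model_slug)
    prompt: str = ""
```
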
Bentlybro
7435739053 Add fallback logic for disabled LLM models
Introduces fallback selection for disabled LLM models in llm_call, preferring enabled models from the same provider. Updates registry utilities to support fallback lookup, model info retrieval, and validation of all known model slugs. Schema utilities now keep all known models in validation enums while showing only enabled models in UI options.
2025-12-08 11:29:31 +00:00
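
A sketch of the fallback rule described in this commit, with an assumed data shape: if the requested model is disabled, prefer an enabled model from the same provider, otherwise any enabled model.

```python
def resolve_effective_model(requested_slug: str, models: dict[str, dict]) -> str:
    info = models.get(requested_slug)
    if info and info["is_enabled"]:
        return requested_slug
    same_provider = [
        slug for slug, m in models.items()
        if m["is_enabled"] and info and m["provider"] == info["provider"]
    ]
    if same_provider:
        return same_provider[0]
    enabled = [slug for slug, m in models.items() if m["is_enabled"]]
    if not enabled:
        raise RuntimeError("No enabled LLM models available for fallback")
    return enabled[0]
```
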
Bentlybro
a97fdba554 Restrict LLM model and provider listings to enabled items
Updated public LLM model and provider listing endpoints to only return enabled models and providers. Refactored database access functions to support filtering by enabled status, and improved transaction safety for model deletion. Adjusted tests and internal documentation to reflect these changes.
2025-12-04 15:56:25 +00:00
Bentlybro
ec705bbbcf format 2025-12-02 14:49:03 +00:00
Bentlybro
7fe6b576ae Add LLM model deletion and migration feature
Introduces backend and frontend support for deleting LLM models with automatic workflow migration to a replacement model. Adds API endpoints, database logic, response models, frontend modal, and actions for safe deletion, including usage count display and error handling. Updates table components to use new modal and refactors table imports.
2025-12-02 14:41:13 +00:00
Bentlybro
dfc42003a1 Refactor LLM registry integration and schema updates
Moved LLM registry schema update logic to a shared utility (llm_schema_utils.py) and refactored block and credentials schema post-processing to use this helper. Extracted executor registry initialization and notification handling into llm_registry_init.py for better separation of concerns. Updated manager.py to use new initialization and subscription functions, improving maintainability and clarity of LLM registry refresh logic.
2025-12-01 17:55:43 +00:00
Bentlybro
6bbeb22943 Refactor LLM model registry to use database
Migrates LLM model metadata and cost configuration from static code to a dynamic database-driven registry. Adds new backend modules for LLM registry and model types, updates block and cost configuration logic to fetch model info and costs from the database, and ensures block schemas and UI options reflect enabled/disabled models. This enables dynamic management of LLM models and costs via the admin UI and database migrations.
2025-12-01 14:37:46 +00:00
186 changed files with 11333 additions and 7119 deletions



@@ -29,7 +29,8 @@
"postCreateCmd": [
"cd autogpt_platform/autogpt_libs && poetry install",
"cd autogpt_platform/backend && poetry install && poetry run prisma generate",
"cd autogpt_platform/frontend && pnpm install"
"cd autogpt_platform/frontend && pnpm install",
"cd docs && pip install -r requirements.txt"
],
"terminalCommand": "code .",
"deleteBranchWithWorktree": false


@@ -160,7 +160,7 @@ pnpm storybook # Start component development server
**Backend Entry Points:**
- `backend/backend/api/rest_api.py` - FastAPI application setup
- `backend/backend/server/server.py` - FastAPI application setup
- `backend/backend/data/` - Database models and user management
- `backend/blocks/` - Agent execution blocks and logic
@@ -219,7 +219,7 @@ Agents are built using a visual block-based system where each block performs a s
### API Development
1. Update routes in `/backend/backend/api/features/`
1. Update routes in `/backend/backend/server/routers/`
2. Add/update Pydantic models in same directory
3. Write tests alongside route files
4. For `data/*.py` changes, validate user ID checks
@@ -285,7 +285,7 @@ Agents are built using a visual block-based system where each block performs a s
### Security Guidelines
**Cache Protection Middleware** (`/backend/backend/api/middleware/security.py`):
**Cache Protection Middleware** (`/backend/backend/server/middleware/security.py`):
- Default: Disables caching for ALL endpoints with `Cache-Control: no-store, no-cache, must-revalidate, private`
- Uses allow list approach for cacheable paths (static assets, health checks, public pages)

.gitignore vendored

@@ -178,5 +178,4 @@ autogpt_platform/backend/settings.py
*.ign.*
.test-contents
.claude/settings.local.json
CLAUDE.local.md
/autogpt_platform/backend/logs


@@ -16,6 +16,7 @@ See `docs/content/platform/getting-started.md` for setup instructions.
- Format Python code with `poetry run format`.
- Format frontend code using `pnpm format`.
## Frontend guidelines:
See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
@@ -32,17 +33,14 @@ See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
4. **Styling**: Tailwind CSS only, use design tokens, Phosphor Icons only
5. **Testing**: Add Storybook stories for new components, Playwright for E2E
6. **Code conventions**: Function declarations (not arrow functions) for components/handlers
- Component props should be `interface Props { ... }` (not exported) unless the interface needs to be used outside the component
- Separate render logic from business logic (component.tsx + useComponent.ts + helpers.ts)
- Colocate state when possible and avoid creating large components, use sub-components ( local `/components` folder next to the parent component ) when sensible
- Avoid large hooks, abstract logic into `helpers.ts` files when sensible
- Use function declarations for components, arrow functions only for callbacks
- No barrel files or `index.ts` re-exports
- Do not use `useCallback` or `useMemo` unless strictly needed
- Avoid comments at all times unless the code is very complex
- Do not use `useCallback` or `useMemo` unless asked to optimise a given function
- Do not type hook returns, let Typescript infer as much as possible
- Never type with `any`, if not types available use `unknown`
## Testing
@@ -51,8 +49,22 @@ See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
Always run the relevant linters and tests before committing.
Use conventional commit messages for all commits (e.g. `feat(backend): add API`).
Types: - feat - fix - refactor - ci - dx (developer experience)
Scopes: - platform - platform/library - platform/marketplace - backend - backend/executor - frontend - frontend/library - frontend/marketplace - blocks
Types:
- feat
- fix
- refactor
- ci
- dx (developer experience)
Scopes:
- platform
- platform/library
- platform/marketplace
- backend
- backend/executor
- frontend
- frontend/library
- frontend/marketplace
- blocks
## Pull requests


@@ -6,30 +6,152 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
AutoGPT Platform is a monorepo containing:
- **Backend** (`backend`): Python FastAPI server with async support
- **Frontend** (`frontend`): Next.js React application
- **Shared Libraries** (`autogpt_libs`): Common Python utilities
- **Backend** (`/backend`): Python FastAPI server with async support
- **Frontend** (`/frontend`): Next.js React application
- **Shared Libraries** (`/autogpt_libs`): Common Python utilities
## Component Documentation
## Essential Commands
- **Backend**: See @backend/CLAUDE.md for backend-specific commands, architecture, and development tasks
- **Frontend**: See @frontend/CLAUDE.md for frontend-specific commands, architecture, and development patterns
### Backend Development
## Key Concepts
```bash
# Install dependencies
cd backend && poetry install
# Run database migrations
poetry run prisma migrate dev
# Start all services (database, redis, rabbitmq, clamav)
docker compose up -d
# Run the backend server
poetry run serve
# Run tests
poetry run test
# Run specific test
poetry run pytest path/to/test_file.py::test_function_name
# Run block tests (tests that validate all blocks work correctly)
poetry run pytest backend/blocks/test/test_block.py -xvs
# Run tests for a specific block (e.g., GetCurrentTimeBlock)
poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[GetCurrentTimeBlock]' -xvs
# Lint and format
# prefer format if you want to just "fix" it and only get the errors that can't be autofixed
poetry run format # Black + isort
poetry run lint # ruff
```
More details can be found in TESTING.md
#### Creating/Updating Snapshots
When you first write a test or when the expected output changes:
```bash
poetry run pytest path/to/test.py --snapshot-update
```
⚠️ **Important**: Always review snapshot changes before committing! Use `git diff` to verify the changes are expected.
### Frontend Development
```bash
# Install dependencies
cd frontend && pnpm i
# Generate API client from OpenAPI spec
pnpm generate:api
# Start development server
pnpm dev
# Run E2E tests
pnpm test
# Run Storybook for component development
pnpm storybook
# Build production
pnpm build
# Format and lint
pnpm format
# Type checking
pnpm types
```
**📖 Complete Guide**: See `/frontend/CONTRIBUTING.md` and `/frontend/.cursorrules` for comprehensive frontend patterns.
**Key Frontend Conventions:**
- Separate render logic from data/behavior in components
- Use generated API hooks from `@/app/api/__generated__/endpoints/`
- Use function declarations (not arrow functions) for components/handlers
- Use design system components from `src/components/` (atoms, molecules, organisms)
- Only use Phosphor Icons
- Never use `src/components/__legacy__/*` or deprecated `BackendAPI`
## Architecture Overview
### Backend Architecture
- **API Layer**: FastAPI with REST and WebSocket endpoints
- **Database**: PostgreSQL with Prisma ORM, includes pgvector for embeddings
- **Queue System**: RabbitMQ for async task processing
- **Execution Engine**: Separate executor service processes agent workflows
- **Authentication**: JWT-based with Supabase integration
- **Security**: Cache protection middleware prevents sensitive data caching in browsers/proxies
### Frontend Architecture
- **Framework**: Next.js 15 App Router (client-first approach)
- **Data Fetching**: Type-safe generated API hooks via Orval + React Query
- **State Management**: React Query for server state, co-located UI state in components/hooks
- **Component Structure**: Separate render logic (`.tsx`) from business logic (`use*.ts` hooks)
- **Workflow Builder**: Visual graph editor using @xyflow/react
- **UI Components**: shadcn/ui (Radix UI primitives) with Tailwind CSS styling
- **Icons**: Phosphor Icons only
- **Feature Flags**: LaunchDarkly integration
- **Error Handling**: ErrorCard for render errors, toast for mutations, Sentry for exceptions
- **Testing**: Playwright for E2E, Storybook for component development
### Key Concepts
1. **Agent Graphs**: Workflow definitions stored as JSON, executed by the backend
2. **Blocks**: Reusable components in `backend/backend/blocks/` that perform specific tasks
2. **Blocks**: Reusable components in `/backend/blocks/` that perform specific tasks
3. **Integrations**: OAuth and API connections stored per user
4. **Store**: Marketplace for sharing agent templates
5. **Virus Scanning**: ClamAV integration for file upload security
### Testing Approach
- Backend uses pytest with snapshot testing for API responses
- Test files are colocated with source files (`*_test.py`)
- Frontend uses Playwright for E2E tests
- Component testing via Storybook
### Database Schema
Key models (defined in `/backend/schema.prisma`):
- `User`: Authentication and profile data
- `AgentGraph`: Workflow definitions with version control
- `AgentGraphExecution`: Execution history and results
- `AgentNode`: Individual nodes in a workflow
- `StoreListing`: Marketplace listings for sharing agents
### Environment Configuration
#### Configuration Files
- **Backend**: `backend/.env.default` (defaults) → `backend/.env` (user overrides)
- **Frontend**: `frontend/.env.default` (defaults) → `frontend/.env` (user overrides)
- **Platform**: `.env.default` (Supabase/shared defaults) → `.env` (user overrides)
- **Backend**: `/backend/.env.default` (defaults) → `/backend/.env` (user overrides)
- **Frontend**: `/frontend/.env.default` (defaults) → `/frontend/.env` (user overrides)
- **Platform**: `/.env.default` (Supabase/shared defaults) → `/.env` (user overrides)
#### Docker Environment Loading Order
@@ -45,12 +167,83 @@ AutoGPT Platform is a monorepo containing:
- Backend/Frontend services use YAML anchors for consistent configuration
- Supabase services (`db/docker/docker-compose.yml`) follow the same pattern
### Common Development Tasks
**Adding a new block:**
Follow the comprehensive [Block SDK Guide](../../../docs/content/platform/block-sdk-guide.md) which covers:
- Provider configuration with `ProviderBuilder`
- Block schema definition
- Authentication (API keys, OAuth, webhooks)
- Testing and validation
- File organization
Quick steps:
1. Create new file in `/backend/backend/blocks/`
2. Configure provider using `ProviderBuilder` in `_config.py`
3. Inherit from `Block` base class
4. Define input/output schemas using `BlockSchema`
5. Implement async `run` method
6. Generate unique block ID using `uuid.uuid4()`
7. Test with `poetry run pytest backend/blocks/test/test_block.py`
Note: when making many new blocks analyze the interfaces for each of these blocks and picture if they would go well together in a graph based editor or would they struggle to connect productively?
ex: do the inputs and outputs tie well together?
If you get any pushback or hit complex block conditions check the new_blocks guide in the docs.
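A minimal illustrative sketch of the quick steps above, using stand-in base classes with hypothetical names; the real Block/BlockSchema interfaces and constructor arguments may differ, so treat this as a shape rather than a drop-in example.
```python
import uuid

class BlockSchema:            # stand-in for the real BlockSchema base
    pass

class Block:                  # stand-in for the real Block base class
    def __init__(self, id: str, input_schema, output_schema):
        self.id, self.input_schema, self.output_schema = id, input_schema, output_schema

class EchoBlock(Block):
    class Input(BlockSchema):
        text: str = ""

    class Output(BlockSchema):
        result: str = ""

    def __init__(self):
        super().__init__(
            id=str(uuid.uuid4()),          # step 6: unique block ID
            input_schema=EchoBlock.Input,
            output_schema=EchoBlock.Output,
        )

    async def run(self, input_data: "EchoBlock.Input", **kwargs):
        yield "result", input_data.text   # step 5: async run yielding outputs
```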
**Modifying the API:**
1. Update route in `/backend/backend/server/routers/`
2. Add/update Pydantic models in same directory
3. Write tests alongside the route file
4. Run `poetry run test` to verify
### Frontend guidelines:
See `/frontend/CONTRIBUTING.md` for complete patterns. Quick reference:
1. **Pages**: Create in `src/app/(platform)/feature-name/page.tsx`
- Add `usePageName.ts` hook for logic
- Put sub-components in local `components/` folder
2. **Components**: Structure as `ComponentName/ComponentName.tsx` + `useComponentName.ts` + `helpers.ts`
- Use design system components from `src/components/` (atoms, molecules, organisms)
- Never use `src/components/__legacy__/*`
3. **Data fetching**: Use generated API hooks from `@/app/api/__generated__/endpoints/`
- Regenerate with `pnpm generate:api`
- Pattern: `use{Method}{Version}{OperationName}`
4. **Styling**: Tailwind CSS only, use design tokens, Phosphor Icons only
5. **Testing**: Add Storybook stories for new components, Playwright for E2E
6. **Code conventions**: Function declarations (not arrow functions) for components/handlers
- Component props should be `interface Props { ... }` (not exported) unless the interface needs to be used outside the component
- Separate render logic from business logic (component.tsx + useComponent.ts + helpers.ts)
- Colocate state when possible and avoid creating large components, use sub-components ( local `/components` folder next to the parent component ) when sensible
- Avoid large hooks, abstract logic into `helpers.ts` files when sensible
- Use function declarations for components, arrow functions only for callbacks
- No barrel files or `index.ts` re-exports
- Do not use `useCallback` or `useMemo` unless strictly needed
- Avoid comments at all times unless the code is very complex
### Security Implementation
**Cache Protection Middleware:**
- Located in `/backend/backend/server/middleware/security.py`
- Default behavior: Disables caching for ALL endpoints with `Cache-Control: no-store, no-cache, must-revalidate, private`
- Uses an allow list approach - only explicitly permitted paths can be cached
- Cacheable paths include: static assets (`/static/*`, `/_next/static/*`), health checks, public store pages, documentation
- Prevents sensitive data (auth tokens, API keys, user data) from being cached by browsers/proxies
- To allow caching for a new endpoint, add it to `CACHEABLE_PATHS` in the middleware
- Applied to both main API server and external API applications
### Creating Pull Requests
- Create the PR against the `dev` branch of the repository.
- Ensure the branch name is descriptive (e.g., `feature/add-new-block`)
- Use conventional commit messages (see below)
- Fill out the .github/PULL_REQUEST_TEMPLATE.md template as the PR description
- Create the PR aginst the `dev` branch of the repository.
- Ensure the branch name is descriptive (e.g., `feature/add-new-block`)/
- Use conventional commit messages (see below)/
- Fill out the .github/PULL_REQUEST_TEMPLATE.md template as the PR description/
- Run the github pre-commit hooks to ensure code quality.
### Reviewing/Revising Pull Requests


@@ -1,170 +0,0 @@
# CLAUDE.md - Backend
This file provides guidance to Claude Code when working with the backend.
## Essential Commands
To run something with Python package dependencies you MUST use `poetry run ...`.
```bash
# Install dependencies
poetry install
# Run database migrations
poetry run prisma migrate dev
# Start all services (database, redis, rabbitmq, clamav)
docker compose up -d
# Run the backend as a whole
poetry run app
# Run tests
poetry run test
# Run specific test
poetry run pytest path/to/test_file.py::test_function_name
# Run block tests (tests that validate all blocks work correctly)
poetry run pytest backend/blocks/test/test_block.py -xvs
# Run tests for a specific block (e.g., GetCurrentTimeBlock)
poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[GetCurrentTimeBlock]' -xvs
# Lint and format
# prefer format if you want to just "fix" it and only get the errors that can't be autofixed
poetry run format # Black + isort
poetry run lint # ruff
```
More details can be found in @TESTING.md
### Creating/Updating Snapshots
When you first write a test or when the expected output changes:
```bash
poetry run pytest path/to/test.py --snapshot-update
```
⚠️ **Important**: Always review snapshot changes before committing! Use `git diff` to verify the changes are expected.
## Architecture
- **API Layer**: FastAPI with REST and WebSocket endpoints
- **Database**: PostgreSQL with Prisma ORM, includes pgvector for embeddings
- **Queue System**: RabbitMQ for async task processing
- **Execution Engine**: Separate executor service processes agent workflows
- **Authentication**: JWT-based with Supabase integration
- **Security**: Cache protection middleware prevents sensitive data caching in browsers/proxies
## Testing Approach
- Uses pytest with snapshot testing for API responses
- Test files are colocated with source files (`*_test.py`)
## Database Schema
Key models (defined in `schema.prisma`):
- `User`: Authentication and profile data
- `AgentGraph`: Workflow definitions with version control
- `AgentGraphExecution`: Execution history and results
- `AgentNode`: Individual nodes in a workflow
- `StoreListing`: Marketplace listings for sharing agents
## Environment Configuration
- **Backend**: `.env.default` (defaults) → `.env` (user overrides)
## Common Development Tasks
### Adding a new block
Follow the comprehensive [Block SDK Guide](@../../docs/content/platform/block-sdk-guide.md) which covers:
- Provider configuration with `ProviderBuilder`
- Block schema definition
- Authentication (API keys, OAuth, webhooks)
- Testing and validation
- File organization
Quick steps:
1. Create new file in `backend/blocks/`
2. Configure provider using `ProviderBuilder` in `_config.py`
3. Inherit from `Block` base class
4. Define input/output schemas using `BlockSchema`
5. Implement async `run` method
6. Generate unique block ID using `uuid.uuid4()`
7. Test with `poetry run pytest backend/blocks/test/test_block.py`
Note: when making many new blocks analyze the interfaces for each of these blocks and picture if they would go well together in a graph-based editor or would they struggle to connect productively?
ex: do the inputs and outputs tie well together?
If you get any pushback or hit complex block conditions check the new_blocks guide in the docs.
#### Handling files in blocks with `store_media_file()`
When blocks need to work with files (images, videos, documents), use `store_media_file()` from `backend.util.file`. The `return_format` parameter determines what you get back:
| Format | Use When | Returns |
|--------|----------|---------|
| `"for_local_processing"` | Processing with local tools (ffmpeg, MoviePy, PIL) | Local file path (e.g., `"image.png"`) |
| `"for_external_api"` | Sending content to external APIs (Replicate, OpenAI) | Data URI (e.g., `"data:image/png;base64,..."`) |
| `"for_block_output"` | Returning output from your block | Smart: `workspace://` in CoPilot, data URI in graphs |
**Examples:**
```python
# INPUT: Need to process file locally with ffmpeg
local_path = await store_media_file(
file=input_data.video,
execution_context=execution_context,
return_format="for_local_processing",
)
# local_path = "video.mp4" - use with Path/ffmpeg/etc
# INPUT: Need to send to external API like Replicate
image_b64 = await store_media_file(
file=input_data.image,
execution_context=execution_context,
return_format="for_external_api",
)
# image_b64 = "data:image/png;base64,iVBORw0..." - send to API
# OUTPUT: Returning result from block
result_url = await store_media_file(
file=generated_image_url,
execution_context=execution_context,
return_format="for_block_output",
)
yield "image_url", result_url
# In CoPilot: result_url = "workspace://abc123"
# In graphs: result_url = "data:image/png;base64,..."
```
**Key points:**
- `for_block_output` is the ONLY format that auto-adapts to execution context
- Always use `for_block_output` for block outputs unless you have a specific reason not to
- Never hardcode workspace checks - let `for_block_output` handle it
### Modifying the API
1. Update route in `backend/api/features/`
2. Add/update Pydantic models in same directory
3. Write tests alongside the route file
4. Run `poetry run test` to verify
## Security Implementation
### Cache Protection Middleware
- Located in `backend/api/middleware/security.py`
- Default behavior: Disables caching for ALL endpoints with `Cache-Control: no-store, no-cache, must-revalidate, private`
- Uses an allow list approach - only explicitly permitted paths can be cached
- Cacheable paths include: static assets (`static/*`, `_next/static/*`), health checks, public store pages, documentation
- Prevents sensitive data (auth tokens, API keys, user data) from being cached by browsers/proxies
- To allow caching for a new endpoint, add it to `CACHEABLE_PATHS` in the middleware
- Applied to both main API server and external API applications


@@ -138,7 +138,7 @@ If the test doesn't need the `user_id` specifically, mocking is not necessary as
#### Using Global Auth Fixtures
Two global auth fixtures are provided by `backend/api/conftest.py`:
Two global auth fixtures are provided by `backend/server/conftest.py`:
- `mock_jwt_user` - Regular user with `test_user_id` ("test-user-id")
- `mock_jwt_admin` - Admin user with `admin_user_id` ("admin-user-id")


@@ -122,6 +122,24 @@ class ConnectionManager:
return len(connections)
async def broadcast_to_all(self, *, method: WSMethod, data: dict) -> int:
"""Broadcast a message to all active websocket connections."""
message = WSMessage(
method=method,
data=data,
).model_dump_json()
connections = tuple(self.active_connections)
if not connections:
return 0
await asyncio.gather(
*(connection.send_text(message) for connection in connections),
return_exceptions=True,
)
return len(connections)
async def _subscribe(self, channel_key: str, websocket: WebSocket) -> str:
if channel_key not in self.subscriptions:
self.subscriptions[channel_key] = set()


@@ -176,30 +176,64 @@ async def get_execution_analytics_config(
# Return with provider prefix for clarity
return f"{provider_name}: {model_name}"
# Include all LlmModel values (no more filtering by hardcoded list)
recommended_model = LlmModel.GPT4O_MINI.value
for model in LlmModel:
label = generate_model_label(model)
# Get all models from the registry (dynamic, not hardcoded enum)
from backend.data import llm_registry
from backend.server.v2.llm import db as llm_db
# Get the recommended model from the database (configurable via admin UI)
recommended_model_slug = await llm_db.get_recommended_model_slug()
# Build the available models list
first_enabled_slug = None
for registry_model in llm_registry.iter_dynamic_models():
# Only include enabled models in the list
if not registry_model.is_enabled:
continue
# Track first enabled model as fallback
if first_enabled_slug is None:
first_enabled_slug = registry_model.slug
model_enum = LlmModel(registry_model.slug) # Create enum instance from slug
label = generate_model_label(model_enum)
# Add "(Recommended)" suffix to the recommended model
if model.value == recommended_model:
if registry_model.slug == recommended_model_slug:
label += " (Recommended)"
available_models.append(
ModelInfo(
value=model.value,
value=registry_model.slug,
label=label,
provider=model.provider,
provider=registry_model.metadata.provider,
)
)
# Sort models by provider and name for better UX
available_models.sort(key=lambda x: (x.provider, x.label))
# Handle case where no models are available
if not available_models:
logger.warning(
"No enabled LLM models found in registry. "
"Ensure models are configured and enabled in the LLM Registry."
)
# Provide a placeholder entry so admins see meaningful feedback
available_models.append(
ModelInfo(
value="",
label="No models available - configure in LLM Registry",
provider="none",
)
)
# Use the DB recommended model, or fallback to first enabled model
final_recommended = recommended_model_slug or first_enabled_slug or ""
return ExecutionAnalyticsConfig(
available_models=available_models,
default_system_prompt=DEFAULT_SYSTEM_PROMPT,
default_user_prompt=DEFAULT_USER_PROMPT,
recommended_model=recommended_model,
recommended_model=final_recommended,
)


@@ -0,0 +1,595 @@
import logging
import autogpt_libs.auth
import fastapi
from backend.data import llm_registry
from backend.data.block_cost_config import refresh_llm_costs
from backend.server.v2.llm import db as llm_db
from backend.server.v2.llm import model as llm_model
logger = logging.getLogger(__name__)
router = fastapi.APIRouter(
tags=["llm", "admin"],
dependencies=[fastapi.Security(autogpt_libs.auth.requires_admin_user)],
)
async def _refresh_runtime_state() -> None:
"""Refresh the LLM registry and clear all related caches to ensure real-time updates."""
logger.info("Refreshing LLM registry runtime state...")
try:
# Refresh registry from database
await llm_registry.refresh_llm_registry()
refresh_llm_costs()
# Clear block schema caches so they're regenerated with updated model options
from backend.data.block import BlockSchema
BlockSchema.clear_all_schema_caches()
logger.info("Cleared all block schema caches")
# Clear the /blocks endpoint cache so frontend gets updated schemas
try:
from backend.api.features.v1 import _get_cached_blocks
_get_cached_blocks.cache_clear()
logger.info("Cleared /blocks endpoint cache")
except Exception as e:
logger.warning("Failed to clear /blocks cache: %s", e)
# Clear the v2 builder caches (if they exist)
try:
from backend.api.features.builder import db as builder_db
if hasattr(builder_db, "_get_all_providers"):
builder_db._get_all_providers.cache_clear()
logger.info("Cleared v2 builder providers cache")
if hasattr(builder_db, "_build_cached_search_results"):
builder_db._build_cached_search_results.cache_clear()
logger.info("Cleared v2 builder search results cache")
except Exception as e:
logger.debug("Could not clear v2 builder cache: %s", e)
# Notify all executor services to refresh their registry cache
from backend.data.llm_registry import publish_registry_refresh_notification
await publish_registry_refresh_notification()
logger.info("Published registry refresh notification")
except Exception as exc:
logger.exception(
"LLM runtime state refresh failed; caches may be stale: %s", exc
)
@router.get(
"/providers",
summary="List LLM providers",
response_model=llm_model.LlmProvidersResponse,
)
async def list_llm_providers(include_models: bool = True):
providers = await llm_db.list_providers(include_models=include_models)
return llm_model.LlmProvidersResponse(providers=providers)
@router.post(
"/providers",
summary="Create LLM provider",
response_model=llm_model.LlmProvider,
)
async def create_llm_provider(request: llm_model.UpsertLlmProviderRequest):
provider = await llm_db.upsert_provider(request=request)
await _refresh_runtime_state()
return provider
@router.patch(
"/providers/{provider_id}",
summary="Update LLM provider",
response_model=llm_model.LlmProvider,
)
async def update_llm_provider(
provider_id: str,
request: llm_model.UpsertLlmProviderRequest,
):
provider = await llm_db.upsert_provider(request=request, provider_id=provider_id)
await _refresh_runtime_state()
return provider
@router.delete(
"/providers/{provider_id}",
summary="Delete LLM provider",
response_model=dict,
)
async def delete_llm_provider(provider_id: str):
"""
Delete an LLM provider.
A provider can only be deleted if it has no associated models.
Delete all models from the provider first before deleting the provider.
"""
try:
await llm_db.delete_provider(provider_id)
await _refresh_runtime_state()
logger.info("Deleted LLM provider '%s'", provider_id)
return {"success": True, "message": "Provider deleted successfully"}
except ValueError as e:
logger.warning("Failed to delete provider '%s': %s", provider_id, e)
raise fastapi.HTTPException(status_code=400, detail=str(e))
except Exception as e:
logger.exception("Failed to delete provider '%s': %s", provider_id, e)
raise fastapi.HTTPException(status_code=500, detail=str(e))
@router.get(
"/models",
summary="List LLM models",
response_model=llm_model.LlmModelsResponse,
)
async def list_llm_models(
provider_id: str | None = fastapi.Query(default=None),
page: int = fastapi.Query(default=1, ge=1, description="Page number (1-indexed)"),
page_size: int = fastapi.Query(
default=50, ge=1, le=100, description="Number of models per page"
),
):
return await llm_db.list_models(
provider_id=provider_id, page=page, page_size=page_size
)
@router.post(
"/models",
summary="Create LLM model",
response_model=llm_model.LlmModel,
)
async def create_llm_model(request: llm_model.CreateLlmModelRequest):
model = await llm_db.create_model(request=request)
await _refresh_runtime_state()
return model
@router.patch(
"/models/{model_id}",
summary="Update LLM model",
response_model=llm_model.LlmModel,
)
async def update_llm_model(
model_id: str,
request: llm_model.UpdateLlmModelRequest,
):
model = await llm_db.update_model(model_id=model_id, request=request)
await _refresh_runtime_state()
return model
@router.patch(
"/models/{model_id}/toggle",
summary="Toggle LLM model availability",
response_model=llm_model.ToggleLlmModelResponse,
)
async def toggle_llm_model(
model_id: str,
request: llm_model.ToggleLlmModelRequest,
):
"""
Toggle a model's enabled status, optionally migrating workflows when disabling.
If disabling a model and `migrate_to_slug` is provided, all workflows using
this model will be migrated to the specified replacement model before disabling.
A migration record is created which can be reverted later using the revert endpoint.
Optional fields:
- `migration_reason`: Reason for the migration (e.g., "Provider outage")
- `custom_credit_cost`: Custom pricing override for billing during migration
"""
try:
result = await llm_db.toggle_model(
model_id=model_id,
is_enabled=request.is_enabled,
migrate_to_slug=request.migrate_to_slug,
migration_reason=request.migration_reason,
custom_credit_cost=request.custom_credit_cost,
)
await _refresh_runtime_state()
if result.nodes_migrated > 0:
logger.info(
"Toggled model '%s' to %s and migrated %d nodes to '%s' (migration_id=%s)",
result.model.slug,
"enabled" if request.is_enabled else "disabled",
result.nodes_migrated,
result.migrated_to_slug,
result.migration_id,
)
return result
except ValueError as exc:
logger.warning("Model toggle validation failed: %s", exc)
raise fastapi.HTTPException(status_code=400, detail=str(exc)) from exc
except Exception as exc:
logger.exception("Failed to toggle LLM model %s: %s", model_id, exc)
raise fastapi.HTTPException(
status_code=500,
detail="Failed to toggle model availability",
) from exc
@router.get(
"/models/{model_id}/usage",
summary="Get model usage count",
response_model=llm_model.LlmModelUsageResponse,
)
async def get_llm_model_usage(model_id: str):
"""Get the number of workflow nodes using this model."""
try:
return await llm_db.get_model_usage(model_id=model_id)
except ValueError as exc:
raise fastapi.HTTPException(status_code=404, detail=str(exc)) from exc
except Exception as exc:
logger.exception("Failed to get model usage %s: %s", model_id, exc)
raise fastapi.HTTPException(
status_code=500,
detail="Failed to get model usage",
) from exc
@router.delete(
"/models/{model_id}",
summary="Delete LLM model and migrate workflows",
response_model=llm_model.DeleteLlmModelResponse,
)
async def delete_llm_model(
model_id: str,
replacement_model_slug: str | None = fastapi.Query(
default=None,
description="Slug of the model to migrate existing workflows to (required only if workflows use this model)",
),
):
"""
Delete a model and optionally migrate workflows using it to a replacement model.
If no workflows are using this model, it can be deleted without providing a
replacement. If workflows exist, replacement_model_slug is required.
This endpoint:
1. Counts how many workflow nodes use the model being deleted
2. If nodes exist, validates the replacement model and migrates them
3. Deletes the model record
4. Refreshes all caches and notifies executors
Example: DELETE /admin/llm/models/{id}?replacement_model_slug=gpt-4o
Example (no usage): DELETE /admin/llm/models/{id}
"""
try:
result = await llm_db.delete_model(
model_id=model_id, replacement_model_slug=replacement_model_slug
)
await _refresh_runtime_state()
logger.info(
"Deleted model '%s' and migrated %d nodes to '%s'",
result.deleted_model_slug,
result.nodes_migrated,
result.replacement_model_slug,
)
return result
except ValueError as exc:
# Validation errors (model not found, replacement invalid, etc.)
logger.warning("Model deletion validation failed: %s", exc)
raise fastapi.HTTPException(status_code=400, detail=str(exc)) from exc
except Exception as exc:
logger.exception("Failed to delete LLM model %s: %s", model_id, exc)
raise fastapi.HTTPException(
status_code=500,
detail="Failed to delete model and migrate workflows",
) from exc
# ============================================================================
# Migration Management Endpoints
# ============================================================================
@router.get(
"/migrations",
summary="List model migrations",
response_model=llm_model.LlmMigrationsResponse,
)
async def list_llm_migrations(
include_reverted: bool = fastapi.Query(
default=False, description="Include reverted migrations in the list"
),
):
"""
List all model migrations.
Migrations are created when disabling a model with the migrate_to_slug option.
They can be reverted to restore the original model configuration.
"""
try:
migrations = await llm_db.list_migrations(include_reverted=include_reverted)
return llm_model.LlmMigrationsResponse(migrations=migrations)
except Exception as exc:
logger.exception("Failed to list migrations: %s", exc)
raise fastapi.HTTPException(
status_code=500,
detail="Failed to list migrations",
) from exc
@router.get(
"/migrations/{migration_id}",
summary="Get migration details",
response_model=llm_model.LlmModelMigration,
)
async def get_llm_migration(migration_id: str):
"""Get details of a specific migration."""
try:
migration = await llm_db.get_migration(migration_id)
if not migration:
raise fastapi.HTTPException(
status_code=404, detail=f"Migration '{migration_id}' not found"
)
return migration
except fastapi.HTTPException:
raise
except Exception as exc:
logger.exception("Failed to get migration %s: %s", migration_id, exc)
raise fastapi.HTTPException(
status_code=500,
detail="Failed to get migration",
) from exc
@router.post(
"/migrations/{migration_id}/revert",
summary="Revert a model migration",
response_model=llm_model.RevertMigrationResponse,
)
async def revert_llm_migration(
migration_id: str,
request: llm_model.RevertMigrationRequest | None = None,
):
"""
Revert a model migration, restoring affected workflows to their original model.
This only reverts the specific nodes that were part of the migration.
The source model must exist for the revert to succeed.
Options:
- `re_enable_source_model`: Whether to re-enable the source model if disabled (default: True)
Response includes:
- `nodes_reverted`: Number of nodes successfully reverted
- `nodes_already_changed`: Number of nodes that were modified since migration (not reverted)
- `source_model_re_enabled`: Whether the source model was re-enabled
Requirements:
- Migration must not already be reverted
- Source model must exist
"""
try:
re_enable = request.re_enable_source_model if request else True
result = await llm_db.revert_migration(
migration_id,
re_enable_source_model=re_enable,
)
await _refresh_runtime_state()
logger.info(
"Reverted migration '%s': %d nodes restored from '%s' to '%s' "
"(%d already changed, source re-enabled=%s)",
migration_id,
result.nodes_reverted,
result.target_model_slug,
result.source_model_slug,
result.nodes_already_changed,
result.source_model_re_enabled,
)
return result
except ValueError as exc:
logger.warning("Migration revert validation failed: %s", exc)
raise fastapi.HTTPException(status_code=400, detail=str(exc)) from exc
except Exception as exc:
logger.exception("Failed to revert migration %s: %s", migration_id, exc)
raise fastapi.HTTPException(
status_code=500,
detail="Failed to revert migration",
) from exc
# ============================================================================
# Creator Management Endpoints
# ============================================================================
@router.get(
"/creators",
summary="List model creators",
response_model=llm_model.LlmCreatorsResponse,
)
async def list_llm_creators():
"""
List all model creators.
Creators are organizations that create/train models (e.g., OpenAI, Meta, Anthropic).
This is distinct from providers who host/serve the models (e.g., OpenRouter).
"""
try:
creators = await llm_db.list_creators()
return llm_model.LlmCreatorsResponse(creators=creators)
except Exception as exc:
logger.exception("Failed to list creators: %s", exc)
raise fastapi.HTTPException(
status_code=500,
detail="Failed to list creators",
) from exc
@router.get(
"/creators/{creator_id}",
summary="Get creator details",
response_model=llm_model.LlmModelCreator,
)
async def get_llm_creator(creator_id: str):
"""Get details of a specific model creator."""
try:
creator = await llm_db.get_creator(creator_id)
if not creator:
raise fastapi.HTTPException(
status_code=404, detail=f"Creator '{creator_id}' not found"
)
return creator
except fastapi.HTTPException:
raise
except Exception as exc:
logger.exception("Failed to get creator %s: %s", creator_id, exc)
raise fastapi.HTTPException(
status_code=500,
detail="Failed to get creator",
) from exc
@router.post(
"/creators",
summary="Create model creator",
response_model=llm_model.LlmModelCreator,
)
async def create_llm_creator(request: llm_model.UpsertLlmCreatorRequest):
"""
Create a new model creator.
A creator represents an organization that creates/trains AI models,
such as OpenAI, Anthropic, Meta, or Google.
"""
try:
creator = await llm_db.upsert_creator(request=request)
await _refresh_runtime_state()
logger.info("Created model creator '%s' (%s)", creator.display_name, creator.id)
return creator
except Exception as exc:
logger.exception("Failed to create creator: %s", exc)
raise fastapi.HTTPException(
status_code=500,
detail="Failed to create creator",
) from exc
@router.patch(
"/creators/{creator_id}",
summary="Update model creator",
response_model=llm_model.LlmModelCreator,
)
async def update_llm_creator(
creator_id: str,
request: llm_model.UpsertLlmCreatorRequest,
):
"""Update an existing model creator."""
try:
creator = await llm_db.upsert_creator(request=request, creator_id=creator_id)
await _refresh_runtime_state()
logger.info("Updated model creator '%s' (%s)", creator.display_name, creator_id)
return creator
except Exception as exc:
logger.exception("Failed to update creator %s: %s", creator_id, exc)
raise fastapi.HTTPException(
status_code=500,
detail="Failed to update creator",
) from exc
@router.delete(
"/creators/{creator_id}",
summary="Delete model creator",
response_model=dict,
)
async def delete_llm_creator(creator_id: str):
"""
Delete a model creator.
This will remove the creator association from all models that reference it
(sets creatorId to NULL), but will not delete the models themselves.
"""
try:
await llm_db.delete_creator(creator_id)
await _refresh_runtime_state()
logger.info("Deleted model creator '%s'", creator_id)
return {"success": True, "message": f"Creator '{creator_id}' deleted"}
except ValueError as exc:
logger.warning("Creator deletion validation failed: %s", exc)
raise fastapi.HTTPException(status_code=404, detail=str(exc)) from exc
except Exception as exc:
logger.exception("Failed to delete creator %s: %s", creator_id, exc)
raise fastapi.HTTPException(
status_code=500,
detail="Failed to delete creator",
) from exc
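# Illustrative sketch (same "/admin/llm" prefix assumption as above):
#
#     response = client.delete("/admin/llm/creators/creator-1")
#     # 200 -> {"success": True, "message": "Creator 'creator-1' deleted"}; models that
#     #        referenced the creator are kept, with their creatorId cleared to NULL.
#     # 404 -> the creator id does not exist.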
# ============================================================================
# Recommended Model Endpoints
# ============================================================================
@router.get(
"/recommended-model",
summary="Get recommended model",
response_model=llm_model.RecommendedModelResponse,
)
async def get_recommended_model():
"""
Get the currently recommended LLM model.
The recommended model is shown to users as the default/suggested option
in model selection dropdowns.
"""
try:
model = await llm_db.get_recommended_model()
return llm_model.RecommendedModelResponse(
model=model,
slug=model.slug if model else None,
)
except Exception as exc:
logger.exception("Failed to get recommended model: %s", exc)
raise fastapi.HTTPException(
status_code=500,
detail="Failed to get recommended model",
) from exc
@router.post(
"/recommended-model",
summary="Set recommended model",
response_model=llm_model.SetRecommendedModelResponse,
)
async def set_recommended_model(request: llm_model.SetRecommendedModelRequest):
"""
Set a model as the recommended model.
This clears the recommended flag from any other model and sets it on
the specified model. The model must be enabled to be set as recommended.
The recommended model is displayed to users as the default/suggested
option in model selection dropdowns throughout the platform.
"""
try:
model, previous_slug = await llm_db.set_recommended_model(request.model_id)
await _refresh_runtime_state()
logger.info(
"Set recommended model to '%s' (previous: %s)",
model.slug,
previous_slug or "none",
)
return llm_model.SetRecommendedModelResponse(
model=model,
previous_recommended_slug=previous_slug,
message=f"Model '{model.display_name}' is now the recommended model",
)
except ValueError as exc:
logger.warning("Set recommended model validation failed: %s", exc)
raise fastapi.HTTPException(status_code=400, detail=str(exc)) from exc
except Exception as exc:
logger.exception("Failed to set recommended model: %s", exc)
raise fastapi.HTTPException(
status_code=500,
detail="Failed to set recommended model",
) from exc
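# Illustrative sketch for the recommended-model endpoints (assumption: "/admin/llm" prefix):
#
#     resp = client.get("/admin/llm/recommended-model")
#     # -> {"model": ..., "slug": ...}; both are null when no model is recommended.
#
#     resp = client.post("/admin/llm/recommended-model", json={"model_id": "model-1"})
#     # 200 -> SetRecommendedModelResponse with the new model and previous_recommended_slug
#     # 400 -> validation error, e.g. the model is disabled or unknown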

View File

@@ -0,0 +1,491 @@
import json
from unittest.mock import AsyncMock
import fastapi
import fastapi.testclient
import pytest
import pytest_mock
from autogpt_libs.auth.jwt_utils import get_jwt_payload
from pytest_snapshot.plugin import Snapshot
import backend.api.features.admin.llm_routes as llm_routes
from backend.server.v2.llm import model as llm_model
from backend.util.models import Pagination
app = fastapi.FastAPI()
app.include_router(llm_routes.router, prefix="/admin/llm")
client = fastapi.testclient.TestClient(app)
@pytest.fixture(autouse=True)
def setup_app_admin_auth(mock_jwt_admin):
"""Setup admin auth overrides for all tests in this module"""
app.dependency_overrides[get_jwt_payload] = mock_jwt_admin["get_jwt_payload"]
yield
app.dependency_overrides.clear()
def test_list_llm_providers_success(
mocker: pytest_mock.MockFixture,
configured_snapshot: Snapshot,
) -> None:
"""Test successful listing of LLM providers"""
# Mock the database function
mock_providers = [
{
"id": "provider-1",
"name": "openai",
"display_name": "OpenAI",
"description": "OpenAI LLM provider",
"supports_tools": True,
"supports_json_output": True,
"supports_reasoning": False,
"supports_parallel_tool": True,
"metadata": {},
"models": [],
},
{
"id": "provider-2",
"name": "anthropic",
"display_name": "Anthropic",
"description": "Anthropic LLM provider",
"supports_tools": True,
"supports_json_output": True,
"supports_reasoning": False,
"supports_parallel_tool": True,
"metadata": {},
"models": [],
},
]
mocker.patch(
"backend.api.features.admin.llm_routes.llm_db.list_providers",
new=AsyncMock(return_value=mock_providers),
)
response = client.get("/admin/llm/providers")
assert response.status_code == 200
response_data = response.json()
assert len(response_data["providers"]) == 2
assert response_data["providers"][0]["name"] == "openai"
# Snapshot test the response (must be string)
configured_snapshot.assert_match(
json.dumps(response_data, indent=2, sort_keys=True),
"list_llm_providers_success.json",
)
def test_list_llm_models_success(
mocker: pytest_mock.MockFixture,
configured_snapshot: Snapshot,
) -> None:
"""Test successful listing of LLM models with pagination"""
# Mock the database function - now returns LlmModelsResponse
mock_model = llm_model.LlmModel(
id="model-1",
slug="gpt-4o",
display_name="GPT-4o",
description="GPT-4 Optimized",
provider_id="provider-1",
context_window=128000,
max_output_tokens=16384,
is_enabled=True,
capabilities={},
metadata={},
costs=[
llm_model.LlmModelCost(
id="cost-1",
credit_cost=10,
credential_provider="openai",
metadata={},
)
],
)
mock_response = llm_model.LlmModelsResponse(
models=[mock_model],
pagination=Pagination(
total_items=1,
total_pages=1,
current_page=1,
page_size=50,
),
)
mocker.patch(
"backend.api.features.admin.llm_routes.llm_db.list_models",
new=AsyncMock(return_value=mock_response),
)
response = client.get("/admin/llm/models")
assert response.status_code == 200
response_data = response.json()
assert len(response_data["models"]) == 1
assert response_data["models"][0]["slug"] == "gpt-4o"
assert response_data["pagination"]["total_items"] == 1
assert response_data["pagination"]["page_size"] == 50
# Snapshot test the response (must be string)
configured_snapshot.assert_match(
json.dumps(response_data, indent=2, sort_keys=True),
"list_llm_models_success.json",
)
def test_create_llm_provider_success(
mocker: pytest_mock.MockFixture,
configured_snapshot: Snapshot,
) -> None:
"""Test successful creation of LLM provider"""
mock_provider = {
"id": "new-provider-id",
"name": "groq",
"display_name": "Groq",
"description": "Groq LLM provider",
"supports_tools": True,
"supports_json_output": True,
"supports_reasoning": False,
"supports_parallel_tool": False,
"metadata": {},
}
mocker.patch(
"backend.api.features.admin.llm_routes.llm_db.upsert_provider",
new=AsyncMock(return_value=mock_provider),
)
mock_refresh = mocker.patch(
"backend.api.features.admin.llm_routes._refresh_runtime_state",
new=AsyncMock(),
)
request_data = {
"name": "groq",
"display_name": "Groq",
"description": "Groq LLM provider",
"supports_tools": True,
"supports_json_output": True,
"supports_reasoning": False,
"supports_parallel_tool": False,
"metadata": {},
}
response = client.post("/admin/llm/providers", json=request_data)
assert response.status_code == 200
response_data = response.json()
assert response_data["name"] == "groq"
assert response_data["display_name"] == "Groq"
# Verify refresh was called
mock_refresh.assert_called_once()
# Snapshot test the response (must be string)
configured_snapshot.assert_match(
json.dumps(response_data, indent=2, sort_keys=True),
"create_llm_provider_success.json",
)
def test_create_llm_model_success(
mocker: pytest_mock.MockFixture,
configured_snapshot: Snapshot,
) -> None:
"""Test successful creation of LLM model"""
mock_model = {
"id": "new-model-id",
"slug": "gpt-4.1-mini",
"display_name": "GPT-4.1 Mini",
"description": "Latest GPT-4.1 Mini model",
"provider_id": "provider-1",
"context_window": 128000,
"max_output_tokens": 16384,
"is_enabled": True,
"capabilities": {},
"metadata": {},
"costs": [
{
"id": "cost-id",
"credit_cost": 5,
"credential_provider": "openai",
"metadata": {},
}
],
}
mocker.patch(
"backend.api.features.admin.llm_routes.llm_db.create_model",
new=AsyncMock(return_value=mock_model),
)
mock_refresh = mocker.patch(
"backend.api.features.admin.llm_routes._refresh_runtime_state",
new=AsyncMock(),
)
request_data = {
"slug": "gpt-4.1-mini",
"display_name": "GPT-4.1 Mini",
"description": "Latest GPT-4.1 Mini model",
"provider_id": "provider-1",
"context_window": 128000,
"max_output_tokens": 16384,
"is_enabled": True,
"capabilities": {},
"metadata": {},
"costs": [
{
"credit_cost": 5,
"credential_provider": "openai",
"metadata": {},
}
],
}
response = client.post("/admin/llm/models", json=request_data)
assert response.status_code == 200
response_data = response.json()
assert response_data["slug"] == "gpt-4.1-mini"
assert response_data["is_enabled"] is True
# Verify refresh was called
mock_refresh.assert_called_once()
# Snapshot test the response (must be string)
configured_snapshot.assert_match(
json.dumps(response_data, indent=2, sort_keys=True),
"create_llm_model_success.json",
)
def test_update_llm_model_success(
mocker: pytest_mock.MockFixture,
configured_snapshot: Snapshot,
) -> None:
"""Test successful update of LLM model"""
mock_model = {
"id": "model-1",
"slug": "gpt-4o",
"display_name": "GPT-4o Updated",
"description": "Updated description",
"provider_id": "provider-1",
"context_window": 256000,
"max_output_tokens": 32768,
"is_enabled": True,
"capabilities": {},
"metadata": {},
"costs": [
{
"id": "cost-1",
"credit_cost": 15,
"credential_provider": "openai",
"metadata": {},
}
],
}
mocker.patch(
"backend.api.features.admin.llm_routes.llm_db.update_model",
new=AsyncMock(return_value=mock_model),
)
mock_refresh = mocker.patch(
"backend.api.features.admin.llm_routes._refresh_runtime_state",
new=AsyncMock(),
)
request_data = {
"display_name": "GPT-4o Updated",
"description": "Updated description",
"context_window": 256000,
"max_output_tokens": 32768,
}
response = client.patch("/admin/llm/models/model-1", json=request_data)
assert response.status_code == 200
response_data = response.json()
assert response_data["display_name"] == "GPT-4o Updated"
assert response_data["context_window"] == 256000
# Verify refresh was called
mock_refresh.assert_called_once()
# Snapshot test the response (must be string)
configured_snapshot.assert_match(
json.dumps(response_data, indent=2, sort_keys=True),
"update_llm_model_success.json",
)
def test_toggle_llm_model_success(
mocker: pytest_mock.MockFixture,
configured_snapshot: Snapshot,
) -> None:
"""Test successful toggling of LLM model enabled status"""
# Create a proper mock model object
mock_model = llm_model.LlmModel(
id="model-1",
slug="gpt-4o",
display_name="GPT-4o",
description="GPT-4 Optimized",
provider_id="provider-1",
context_window=128000,
max_output_tokens=16384,
is_enabled=False,
capabilities={},
metadata={},
costs=[],
)
# Create a proper ToggleLlmModelResponse
mock_response = llm_model.ToggleLlmModelResponse(
model=mock_model,
nodes_migrated=0,
migrated_to_slug=None,
migration_id=None,
)
mocker.patch(
"backend.api.features.admin.llm_routes.llm_db.toggle_model",
new=AsyncMock(return_value=mock_response),
)
mock_refresh = mocker.patch(
"backend.api.features.admin.llm_routes._refresh_runtime_state",
new=AsyncMock(),
)
request_data = {"is_enabled": False}
response = client.patch("/admin/llm/models/model-1/toggle", json=request_data)
assert response.status_code == 200
response_data = response.json()
assert response_data["model"]["is_enabled"] is False
# Verify refresh was called
mock_refresh.assert_called_once()
# Snapshot test the response (must be string)
configured_snapshot.assert_match(
json.dumps(response_data, indent=2, sort_keys=True),
"toggle_llm_model_success.json",
)
def test_delete_llm_model_success(
mocker: pytest_mock.MockFixture,
configured_snapshot: Snapshot,
) -> None:
"""Test successful deletion of LLM model with migration"""
# Create a proper DeleteLlmModelResponse
mock_response = llm_model.DeleteLlmModelResponse(
deleted_model_slug="gpt-3.5-turbo",
deleted_model_display_name="GPT-3.5 Turbo",
replacement_model_slug="gpt-4o-mini",
nodes_migrated=42,
message="Successfully deleted model 'GPT-3.5 Turbo' (gpt-3.5-turbo) "
"and migrated 42 workflow node(s) to 'gpt-4o-mini'.",
)
mocker.patch(
"backend.api.features.admin.llm_routes.llm_db.delete_model",
new=AsyncMock(return_value=mock_response),
)
mock_refresh = mocker.patch(
"backend.api.features.admin.llm_routes._refresh_runtime_state",
new=AsyncMock(),
)
response = client.delete(
"/admin/llm/models/model-1?replacement_model_slug=gpt-4o-mini"
)
assert response.status_code == 200
response_data = response.json()
assert response_data["deleted_model_slug"] == "gpt-3.5-turbo"
assert response_data["nodes_migrated"] == 42
assert response_data["replacement_model_slug"] == "gpt-4o-mini"
# Verify refresh was called
mock_refresh.assert_called_once()
# Snapshot test the response (must be string)
configured_snapshot.assert_match(
json.dumps(response_data, indent=2, sort_keys=True),
"delete_llm_model_success.json",
)
def test_delete_llm_model_validation_error(
mocker: pytest_mock.MockFixture,
) -> None:
"""Test deletion fails with proper error when validation fails"""
mocker.patch(
"backend.api.features.admin.llm_routes.llm_db.delete_model",
new=AsyncMock(side_effect=ValueError("Replacement model 'invalid' not found")),
)
response = client.delete("/admin/llm/models/model-1?replacement_model_slug=invalid")
assert response.status_code == 400
assert "Replacement model 'invalid' not found" in response.json()["detail"]
def test_delete_llm_model_no_replacement_with_usage(
mocker: pytest_mock.MockFixture,
) -> None:
"""Test deletion fails when nodes exist but no replacement is provided"""
mocker.patch(
"backend.api.features.admin.llm_routes.llm_db.delete_model",
new=AsyncMock(
side_effect=ValueError(
"Cannot delete model 'test-model': 5 workflow node(s) are using it. "
"Please provide a replacement_model_slug to migrate them."
)
),
)
response = client.delete("/admin/llm/models/model-1")
assert response.status_code == 400
assert "workflow node(s) are using it" in response.json()["detail"]
def test_delete_llm_model_no_replacement_no_usage(
mocker: pytest_mock.MockFixture,
) -> None:
"""Test deletion succeeds when no nodes use the model and no replacement is provided"""
mock_response = llm_model.DeleteLlmModelResponse(
deleted_model_slug="unused-model",
deleted_model_display_name="Unused Model",
replacement_model_slug=None,
nodes_migrated=0,
message="Successfully deleted model 'Unused Model' (unused-model). No workflows were using this model.",
)
mocker.patch(
"backend.api.features.admin.llm_routes.llm_db.delete_model",
new=AsyncMock(return_value=mock_response),
)
mock_refresh = mocker.patch(
"backend.api.features.admin.llm_routes._refresh_runtime_state",
new=AsyncMock(),
)
response = client.delete("/admin/llm/models/model-1")
assert response.status_code == 200
response_data = response.json()
assert response_data["deleted_model_slug"] == "unused-model"
assert response_data["nodes_migrated"] == 0
assert response_data["replacement_model_slug"] is None
mock_refresh.assert_called_once()

View File

@@ -15,6 +15,7 @@ from backend.blocks import load_all_blocks
from backend.blocks.llm import LlmModel
from backend.data.block import AnyBlockSchema, BlockCategory, BlockInfo, BlockSchema
from backend.data.db import query_raw_with_schema
from backend.data.llm_registry import get_all_model_slugs_for_validation
from backend.integrations.providers import ProviderName
from backend.util.cache import cached
from backend.util.models import Pagination
@@ -31,7 +32,14 @@ from .model import (
)
logger = logging.getLogger(__name__)
llm_models = [name.name.lower().replace("_", " ") for name in LlmModel]
def _get_llm_models() -> list[str]:
"""Get LLM model names for search matching from the registry."""
return [
slug.lower().replace("-", " ") for slug in get_all_model_slugs_for_validation()
]
MAX_LIBRARY_AGENT_RESULTS = 100
MAX_MARKETPLACE_AGENT_RESULTS = 100
@@ -496,8 +504,8 @@ async def _get_static_counts():
def _matches_llm_model(schema_cls: type[BlockSchema], query: str) -> bool:
for field in schema_cls.model_fields.values():
if field.annotation == LlmModel:
# Check if query matches any value in llm_models
if any(query in name for name in llm_models):
# Check if query matches any value in llm_models from registry
if any(query in name for name in _get_llm_models()):
return True
return False

View File

@@ -17,7 +17,7 @@ router = fastapi.APIRouter(
)
# Taken from backend/api/features/store/db.py
# Taken from backend/server/v2/store/db.py
def sanitize_query(query: str | None) -> str | None:
if query is None:
return query

View File

@@ -33,15 +33,9 @@ class ChatConfig(BaseSettings):
stream_timeout: int = Field(default=300, description="Stream timeout in seconds")
max_retries: int = Field(default=3, description="Maximum number of retries")
max_agent_runs: int = Field(default=30, description="Maximum number of agent runs")
max_agent_runs: int = Field(default=3, description="Maximum number of agent runs")
max_agent_schedules: int = Field(
default=30, description="Maximum number of agent schedules"
)
# Long-running operation configuration
long_running_operation_ttl: int = Field(
default=600,
description="TTL in seconds for long-running operation tracking in Redis (safety net if pod dies)",
default=3, description="Maximum number of agent schedules"
)
# Langfuse Prompt Management Configuration

View File

@@ -247,45 +247,3 @@ async def get_chat_session_message_count(session_id: str) -> int:
"""Get the number of messages in a chat session."""
count = await PrismaChatMessage.prisma().count(where={"sessionId": session_id})
return count
async def update_tool_message_content(
session_id: str,
tool_call_id: str,
new_content: str,
) -> bool:
"""Update the content of a tool message in chat history.
Used by background tasks to update pending operation messages with final results.
Args:
session_id: The chat session ID.
tool_call_id: The tool call ID to find the message.
new_content: The new content to set.
Returns:
True if a message was updated, False otherwise.
"""
try:
result = await PrismaChatMessage.prisma().update_many(
where={
"sessionId": session_id,
"toolCallId": tool_call_id,
},
data={
"content": new_content,
},
)
if result == 0:
logger.warning(
f"No message found to update for session {session_id}, "
f"tool_call_id {tool_call_id}"
)
return False
return True
except Exception as e:
logger.error(
f"Failed to update tool message for session {session_id}, "
f"tool_call_id {tool_call_id}: {e}"
)
return False

View File

@@ -295,21 +295,6 @@ async def cache_chat_session(session: ChatSession) -> None:
await _cache_session(session)
async def invalidate_session_cache(session_id: str) -> None:
"""Invalidate a chat session from Redis cache.
Used by background tasks to ensure fresh data is loaded on next access.
This is best-effort - Redis failures are logged but don't fail the operation.
"""
try:
redis_key = _get_session_cache_key(session_id)
async_redis = await get_redis_async()
await async_redis.delete(redis_key)
except Exception as e:
# Best-effort: log but don't fail - cache will expire naturally
logger.warning(f"Failed to invalidate session cache for {session_id}: {e}")
async def _get_session_from_db(session_id: str) -> ChatSession | None:
"""Get a chat session from the database."""
prisma_session = await chat_db.get_chat_session(session_id)

View File

@@ -17,7 +17,6 @@ from openai import (
)
from openai.types.chat import ChatCompletionChunk, ChatCompletionToolParam
from backend.data.redis_client import get_redis_async
from backend.data.understanding import (
format_understanding_for_prompt,
get_business_understanding,
@@ -25,7 +24,6 @@ from backend.data.understanding import (
from backend.util.exceptions import NotFoundError
from backend.util.settings import Settings
from . import db as chat_db
from .config import ChatConfig
from .model import (
ChatMessage,
@@ -33,7 +31,6 @@ from .model import (
Usage,
cache_chat_session,
get_chat_session,
invalidate_session_cache,
update_session_title,
upsert_chat_session,
)
@@ -51,13 +48,8 @@ from .response_model import (
StreamToolOutputAvailable,
StreamUsage,
)
from .tools import execute_tool, get_tool, tools
from .tools.models import (
ErrorResponse,
OperationInProgressResponse,
OperationPendingResponse,
OperationStartedResponse,
)
from .tools import execute_tool, tools
from .tools.models import ErrorResponse
from .tracking import track_user_message
logger = logging.getLogger(__name__)
@@ -69,126 +61,11 @@ client = openai.AsyncOpenAI(api_key=config.api_key, base_url=config.base_url)
langfuse = get_client()
# Redis key prefix for tracking running long-running operations
# Used for idempotency across Kubernetes pods - prevents duplicate executions on browser refresh
RUNNING_OPERATION_PREFIX = "chat:running_operation:"
# Default system prompt used when Langfuse is not configured
# This is a snapshot of the "CoPilot Prompt" from Langfuse (version 11)
DEFAULT_SYSTEM_PROMPT = """You are **Otto**, an AI Co-Pilot for AutoGPT and a Forward-Deployed Automation Engineer serving small business owners. Your mission is to help users automate business tasks with AI by delivering tangible value through working automations—not through documentation or lengthy explanations.
class LangfuseNotConfiguredError(Exception):
"""Raised when Langfuse is required but not configured."""
Here is everything you know about the current user from previous interactions:
<users_information>
{users_information}
</users_information>
## YOUR CORE MANDATE
You are action-oriented. Your success is measured by:
- **Value Delivery**: Does the user think "wow, that was amazing" or "what was the point"?
- **Demonstrable Proof**: Show working automations, not descriptions of what's possible
- **Time Saved**: Focus on tangible efficiency gains
- **Quality Output**: Deliver results that meet or exceed expectations
## YOUR WORKFLOW
Adapt flexibly to the conversation context. Not every interaction requires all stages:
1. **Explore & Understand**: Learn about the user's business, tasks, and goals. Use `add_understanding` to capture important context that will improve future conversations.
2. **Assess Automation Potential**: Help the user understand whether and how AI can automate their task.
3. **Prepare for AI**: Provide brief, actionable guidance on prerequisites (data, access, etc.).
4. **Discover or Create Agents**:
- **Always check the user's library first** with `find_library_agent` (these may be customized to their needs)
- Search the marketplace with `find_agent` for pre-built automations
- Find reusable components with `find_block`
- Create custom solutions with `create_agent` if nothing suitable exists
- Modify existing library agents with `edit_agent`
5. **Execute**: Run automations immediately, schedule them, or set up webhooks using `run_agent`. Test specific components with `run_block`.
6. **Show Results**: Display outputs using `agent_output`.
## AVAILABLE TOOLS
**Understanding & Discovery:**
- `add_understanding`: Create a memory about the user's business or use cases for future sessions
- `search_docs`: Search platform documentation for specific technical information
- `get_doc_page`: Retrieve full text of a specific documentation page
**Agent Discovery:**
- `find_library_agent`: Search the user's existing agents (CHECK HERE FIRST—these may be customized)
- `find_agent`: Search the marketplace for pre-built automations
- `find_block`: Find pre-written code units that perform specific tasks (agents are built from blocks)
**Agent Creation & Editing:**
- `create_agent`: Create a new automation agent
- `edit_agent`: Modify an agent in the user's library
**Execution & Output:**
- `run_agent`: Run an agent now, schedule it, or set up a webhook trigger
- `run_block`: Test or run a specific block independently
- `agent_output`: View results from previous agent runs
## BEHAVIORAL GUIDELINES
**Be Concise:**
- Target 2-5 short lines maximum
- Make every word count—no repetition or filler
- Use lightweight structure for scannability (bullets, numbered lists, short prompts)
- Avoid jargon (blocks, slugs, cron) unless the user asks
**Be Proactive:**
- Suggest next steps before being asked
- Anticipate needs based on conversation context and user information
- Look for opportunities to expand scope when relevant
- Reveal capabilities through action, not explanation
**Use Tools Effectively:**
- Select the right tool for each task
- **Always check `find_library_agent` before searching the marketplace**
- Use `add_understanding` to capture valuable business context
- When tool calls fail, try alternative approaches
## CRITICAL REMINDER
You are NOT a chatbot. You are NOT documentation. You are a partner who helps busy business owners get value quickly by showing proof through working automations. Bias toward action over explanation."""
# Module-level set to hold strong references to background tasks.
# This prevents asyncio from garbage collecting tasks before they complete.
# Tasks are automatically removed on completion via done_callback.
_background_tasks: set[asyncio.Task] = set()
async def _mark_operation_started(tool_call_id: str) -> bool:
"""Mark a long-running operation as started (Redis-based).
Returns True if successfully marked (operation was not already running),
False if operation was already running (lost race condition).
Raises exception if Redis is unavailable (fail-closed).
"""
redis = await get_redis_async()
key = f"{RUNNING_OPERATION_PREFIX}{tool_call_id}"
# SETNX with TTL - atomic "set if not exists"
result = await redis.set(key, "1", ex=config.long_running_operation_ttl, nx=True)
return result is not None
async def _mark_operation_completed(tool_call_id: str) -> None:
"""Mark a long-running operation as completed (remove Redis key).
This is best-effort - if Redis fails, the TTL will eventually clean up.
"""
try:
redis = await get_redis_async()
key = f"{RUNNING_OPERATION_PREFIX}{tool_call_id}"
await redis.delete(key)
except Exception as e:
# Non-critical: TTL will clean up eventually
logger.warning(f"Failed to delete running operation key {tool_call_id}: {e}")
pass
def _is_langfuse_configured() -> bool:
@@ -198,30 +75,6 @@ def _is_langfuse_configured() -> bool:
)
async def _get_system_prompt_template(context: str) -> str:
"""Get the system prompt, trying Langfuse first with fallback to default.
Args:
context: The user context/information to compile into the prompt.
Returns:
The compiled system prompt string.
"""
if _is_langfuse_configured():
try:
# cache_ttl_seconds=0 disables SDK caching to always get the latest prompt
# Use asyncio.to_thread to avoid blocking the event loop
prompt = await asyncio.to_thread(
langfuse.get_prompt, config.langfuse_prompt_name, cache_ttl_seconds=0
)
return prompt.compile(users_information=context)
except Exception as e:
logger.warning(f"Failed to fetch prompt from Langfuse, using default: {e}")
# Fallback to default prompt
return DEFAULT_SYSTEM_PROMPT.format(users_information=context)
async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
"""Build the full system prompt including business understanding if available.
@@ -230,8 +83,12 @@ async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
If "default" and this is the user's first session, will use "onboarding" instead.
Returns:
Tuple of (compiled prompt string, business understanding object)
Tuple of (compiled prompt string, Langfuse prompt object for tracing)
"""
# cache_ttl_seconds=0 disables SDK caching to always get the latest prompt
prompt = langfuse.get_prompt(config.langfuse_prompt_name, cache_ttl_seconds=0)
# If user is authenticated, try to fetch their business understanding
understanding = None
if user_id:
@@ -240,13 +97,12 @@ async def _build_system_prompt(user_id: str | None) -> tuple[str, Any]:
except Exception as e:
logger.warning(f"Failed to fetch business understanding: {e}")
understanding = None
if understanding:
context = format_understanding_for_prompt(understanding)
else:
context = "This is the first time you are meeting the user. Greet them and introduce them to the platform"
compiled = await _get_system_prompt_template(context)
compiled = prompt.compile(users_information=context)
return compiled, understanding
@@ -354,6 +210,16 @@ async def stream_chat_completion(
f"Streaming chat completion for session {session_id} for message {message} and user id {user_id}. Message is user message: {is_user_message}"
)
# Check if Langfuse is configured - required for chat functionality
if not _is_langfuse_configured():
logger.error("Chat request failed: Langfuse is not configured")
yield StreamError(
errorText="Chat service is not available. Langfuse must be configured "
"with LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY environment variables."
)
yield StreamFinish()
return
# Only fetch from Redis if session not provided (initial call)
if session is None:
session = await get_chat_session(session_id, user_id)
@@ -449,7 +315,6 @@ async def stream_chat_completion(
has_yielded_end = False
has_yielded_error = False
has_done_tool_call = False
has_long_running_tool_call = False # Track if we had a long-running tool call
has_received_text = False
text_streaming_ended = False
tool_response_messages: list[ChatMessage] = []
@@ -471,6 +336,7 @@ async def stream_chat_completion(
system_prompt=system_prompt,
text_block_id=text_block_id,
):
if isinstance(chunk, StreamTextStart):
# Emit text-start before first text delta
if not has_received_text:
@@ -528,34 +394,13 @@ async def stream_chat_completion(
if isinstance(chunk.output, str)
else orjson.dumps(chunk.output).decode("utf-8")
)
# Skip saving long-running operation responses - messages already saved in _yield_tool_call
# Use JSON parsing instead of substring matching to avoid false positives
is_long_running_response = False
try:
parsed = orjson.loads(result_content)
if isinstance(parsed, dict) and parsed.get("type") in (
"operation_started",
"operation_in_progress",
):
is_long_running_response = True
except (orjson.JSONDecodeError, TypeError):
pass # Not JSON or not a dict - treat as regular response
if is_long_running_response:
# Remove from accumulated_tool_calls since assistant message was already saved
accumulated_tool_calls[:] = [
tc
for tc in accumulated_tool_calls
if tc["id"] != chunk.toolCallId
]
has_long_running_tool_call = True
else:
tool_response_messages.append(
ChatMessage(
role="tool",
content=result_content,
tool_call_id=chunk.toolCallId,
)
tool_response_messages.append(
ChatMessage(
role="tool",
content=result_content,
tool_call_id=chunk.toolCallId,
)
)
has_done_tool_call = True
# Track if any tool execution failed
if not chunk.success:
@@ -731,14 +576,7 @@ async def stream_chat_completion(
logger.info(
f"Extended session messages, new message_count={len(session.messages)}"
)
# Save if there are regular (non-long-running) tool responses or streaming message.
# Long-running tools save their own state, but we still need to save regular tools
# that may be in the same response.
has_regular_tool_responses = len(tool_response_messages) > 0
if has_regular_tool_responses or (
not has_long_running_tool_call
and (messages_to_save or has_appended_streaming_message)
):
if messages_to_save or has_appended_streaming_message:
await upsert_chat_session(session)
else:
logger.info(
@@ -747,9 +585,7 @@ async def stream_chat_completion(
)
# If we did a tool call, stream the chat completion again to get the next response
# Skip only if ALL tools were long-running (they handle their own completion)
has_regular_tools = len(tool_response_messages) > 0
if has_done_tool_call and (has_regular_tools or not has_long_running_tool_call):
if has_done_tool_call:
logger.info(
"Tool call executed, streaming chat completion again to get assistant response"
)
@@ -889,114 +725,6 @@ async def _summarize_messages(
return summary or "No summary available."
def _ensure_tool_pairs_intact(
recent_messages: list[dict],
all_messages: list[dict],
start_index: int,
) -> list[dict]:
"""
Ensure tool_call/tool_response pairs stay together after slicing.
When slicing messages for context compaction, a naive slice can separate
an assistant message containing tool_calls from its corresponding tool
response messages. This causes API validation errors (e.g., Anthropic's
"unexpected tool_use_id found in tool_result blocks").
This function checks for orphan tool responses in the slice and extends
backwards to include their corresponding assistant messages.
Args:
recent_messages: The sliced messages to validate
all_messages: The complete message list (for looking up missing assistants)
start_index: The index in all_messages where recent_messages begins
Returns:
A potentially extended list of messages with tool pairs intact
"""
if not recent_messages:
return recent_messages
# Collect all tool_call_ids from assistant messages in the slice
available_tool_call_ids: set[str] = set()
for msg in recent_messages:
if msg.get("role") == "assistant" and msg.get("tool_calls"):
for tc in msg["tool_calls"]:
tc_id = tc.get("id")
if tc_id:
available_tool_call_ids.add(tc_id)
# Find orphan tool responses (tool messages whose tool_call_id is missing)
orphan_tool_call_ids: set[str] = set()
for msg in recent_messages:
if msg.get("role") == "tool":
tc_id = msg.get("tool_call_id")
if tc_id and tc_id not in available_tool_call_ids:
orphan_tool_call_ids.add(tc_id)
if not orphan_tool_call_ids:
# No orphans, slice is valid
return recent_messages
# Find the assistant messages that contain the orphan tool_call_ids
# Search backwards from start_index in all_messages
messages_to_prepend: list[dict] = []
for i in range(start_index - 1, -1, -1):
msg = all_messages[i]
if msg.get("role") == "assistant" and msg.get("tool_calls"):
msg_tool_ids = {tc.get("id") for tc in msg["tool_calls"] if tc.get("id")}
if msg_tool_ids & orphan_tool_call_ids:
# This assistant message has tool_calls we need
# Also collect its contiguous tool responses that follow it
assistant_and_responses: list[dict] = [msg]
# Scan forward from this assistant to collect tool responses
for j in range(i + 1, start_index):
following_msg = all_messages[j]
if following_msg.get("role") == "tool":
tool_id = following_msg.get("tool_call_id")
if tool_id and tool_id in msg_tool_ids:
assistant_and_responses.append(following_msg)
else:
# Stop at first non-tool message
break
# Prepend the assistant and its tool responses (maintain order)
messages_to_prepend = assistant_and_responses + messages_to_prepend
# Mark these as found
orphan_tool_call_ids -= msg_tool_ids
# Also add this assistant's tool_call_ids to available set
available_tool_call_ids |= msg_tool_ids
if not orphan_tool_call_ids:
# Found all missing assistants
break
if orphan_tool_call_ids:
# Some tool_call_ids couldn't be resolved - remove those tool responses
# This shouldn't happen in normal operation but handles edge cases
logger.warning(
f"Could not find assistant messages for tool_call_ids: {orphan_tool_call_ids}. "
"Removing orphan tool responses."
)
recent_messages = [
msg
for msg in recent_messages
if not (
msg.get("role") == "tool"
and msg.get("tool_call_id") in orphan_tool_call_ids
)
]
if messages_to_prepend:
logger.info(
f"Extended recent messages by {len(messages_to_prepend)} to preserve "
f"tool_call/tool_response pairs"
)
return messages_to_prepend + recent_messages
return recent_messages
async def _stream_chat_chunks(
session: ChatSession,
tools: list[ChatCompletionToolParam],
@@ -1088,15 +816,7 @@ async def _stream_chat_chunks(
# Always attempt mitigation when over limit, even with few messages
if messages:
# Split messages based on whether system prompt exists
# Calculate start index for the slice
slice_start = max(0, len(messages_dict) - KEEP_RECENT)
recent_messages = messages_dict[-KEEP_RECENT:]
# Ensure tool_call/tool_response pairs stay together
# This prevents API errors from orphan tool responses
recent_messages = _ensure_tool_pairs_intact(
recent_messages, messages_dict, slice_start
)
recent_messages = messages[-KEEP_RECENT:]
if has_system_prompt:
# Keep system prompt separate, summarize everything between system and recent
@@ -1183,13 +903,6 @@ async def _stream_chat_chunks(
if len(recent_messages) >= keep_count
else recent_messages
)
# Ensure tool pairs stay intact in the reduced slice
reduced_slice_start = max(
0, len(recent_messages) - keep_count
)
reduced_recent = _ensure_tool_pairs_intact(
reduced_recent, recent_messages, reduced_slice_start
)
if has_system_prompt:
messages = [
system_msg,
@@ -1248,10 +961,7 @@ async def _stream_chat_chunks(
# Create a base list excluding system prompt to avoid duplication
# This is the pool of messages we'll slice from in the loop
# Use messages_dict for type consistency with _ensure_tool_pairs_intact
base_msgs = (
messages_dict[1:] if has_system_prompt else messages_dict
)
base_msgs = messages[1:] if has_system_prompt else messages
# Try progressively smaller keep counts
new_token_count = token_count # Initialize with current count
@@ -1274,12 +984,6 @@ async def _stream_chat_chunks(
# Slice from base_msgs to get recent messages (without system prompt)
recent_messages = base_msgs[-keep_count:]
# Ensure tool pairs stay intact in the reduced slice
reduced_slice_start = max(0, len(base_msgs) - keep_count)
recent_messages = _ensure_tool_pairs_intact(
recent_messages, base_msgs, reduced_slice_start
)
if has_system_prompt:
messages = [system_msg] + recent_messages
else:
@@ -1556,19 +1260,17 @@ async def _yield_tool_call(
"""
Yield a tool call and its execution result.
For tools marked with `is_long_running=True` (like agent generation), spawns a
background task so the operation survives SSE disconnections. For other tools,
yields heartbeat events every 15 seconds to keep the SSE connection alive.
For long-running tools, yields heartbeat events every 15 seconds to keep
the SSE connection alive through proxies and load balancers.
Raises:
orjson.JSONDecodeError: If tool call arguments cannot be parsed as JSON
KeyError: If expected tool call fields are missing
TypeError: If tool call structure is invalid
"""
import uuid as uuid_module
tool_name = tool_calls[yield_idx]["function"]["name"]
tool_call_id = tool_calls[yield_idx]["id"]
logger.info(f"Yielding tool call: {tool_calls[yield_idx]}")
# Parse tool call arguments - handle empty arguments gracefully
raw_arguments = tool_calls[yield_idx]["function"]["arguments"]
@@ -1583,151 +1285,7 @@ async def _yield_tool_call(
input=arguments,
)
# Check if this tool is long-running (survives SSE disconnection)
tool = get_tool(tool_name)
if tool and tool.is_long_running:
# Atomic check-and-set: returns False if operation already running (lost race)
if not await _mark_operation_started(tool_call_id):
logger.info(
f"Tool call {tool_call_id} already in progress, returning status"
)
# Build dynamic message based on tool name
if tool_name == "create_agent":
in_progress_msg = "Agent creation already in progress. Please wait..."
elif tool_name == "edit_agent":
in_progress_msg = "Agent edit already in progress. Please wait..."
else:
in_progress_msg = f"{tool_name} already in progress. Please wait..."
yield StreamToolOutputAvailable(
toolCallId=tool_call_id,
toolName=tool_name,
output=OperationInProgressResponse(
message=in_progress_msg,
tool_call_id=tool_call_id,
).model_dump_json(),
success=True,
)
return
# Generate operation ID
operation_id = str(uuid_module.uuid4())
# Build a user-friendly message based on tool and arguments
if tool_name == "create_agent":
agent_desc = arguments.get("description", "")
# Truncate long descriptions for the message
desc_preview = (
(agent_desc[:100] + "...") if len(agent_desc) > 100 else agent_desc
)
pending_msg = (
f"Creating your agent: {desc_preview}"
if desc_preview
else "Creating agent... This may take a few minutes."
)
started_msg = (
"Agent creation started. You can close this tab - "
"check your library in a few minutes."
)
elif tool_name == "edit_agent":
changes = arguments.get("changes", "")
changes_preview = (changes[:100] + "...") if len(changes) > 100 else changes
pending_msg = (
f"Editing agent: {changes_preview}"
if changes_preview
else "Editing agent... This may take a few minutes."
)
started_msg = (
"Agent edit started. You can close this tab - "
"check your library in a few minutes."
)
else:
pending_msg = f"Running {tool_name}... This may take a few minutes."
started_msg = (
f"{tool_name} started. You can close this tab - "
"check back in a few minutes."
)
# Track appended messages for rollback on failure
assistant_message: ChatMessage | None = None
pending_message: ChatMessage | None = None
# Wrap session save and task creation in try-except to release lock on failure
try:
# Save assistant message with tool_call FIRST (required by LLM)
assistant_message = ChatMessage(
role="assistant",
content="",
tool_calls=[tool_calls[yield_idx]],
)
session.messages.append(assistant_message)
# Then save pending tool result
pending_message = ChatMessage(
role="tool",
content=OperationPendingResponse(
message=pending_msg,
operation_id=operation_id,
tool_name=tool_name,
).model_dump_json(),
tool_call_id=tool_call_id,
)
session.messages.append(pending_message)
await upsert_chat_session(session)
logger.info(
f"Saved pending operation {operation_id} for tool {tool_name} "
f"in session {session.session_id}"
)
# Store task reference in module-level set to prevent GC before completion
task = asyncio.create_task(
_execute_long_running_tool(
tool_name=tool_name,
parameters=arguments,
tool_call_id=tool_call_id,
operation_id=operation_id,
session_id=session.session_id,
user_id=session.user_id,
)
)
_background_tasks.add(task)
task.add_done_callback(_background_tasks.discard)
except Exception as e:
# Roll back appended messages to prevent data corruption on subsequent saves
if (
pending_message
and session.messages
and session.messages[-1] == pending_message
):
session.messages.pop()
if (
assistant_message
and session.messages
and session.messages[-1] == assistant_message
):
session.messages.pop()
# Release the Redis lock since the background task won't be spawned
await _mark_operation_completed(tool_call_id)
logger.error(
f"Failed to setup long-running tool {tool_name}: {e}", exc_info=True
)
raise
# Return immediately - don't wait for completion
yield StreamToolOutputAvailable(
toolCallId=tool_call_id,
toolName=tool_name,
output=OperationStartedResponse(
message=started_msg,
operation_id=operation_id,
tool_name=tool_name,
).model_dump_json(),
success=True,
)
return
# Normal flow: Run tool execution in background task with heartbeats
# Run tool execution in background task with heartbeats to keep connection alive
tool_task = asyncio.create_task(
execute_tool(
tool_name=tool_name,
@@ -1777,190 +1335,3 @@ async def _yield_tool_call(
)
yield tool_execution_response
async def _execute_long_running_tool(
tool_name: str,
parameters: dict[str, Any],
tool_call_id: str,
operation_id: str,
session_id: str,
user_id: str | None,
) -> None:
"""Execute a long-running tool in background and update chat history with result.
This function runs independently of the SSE connection, so the operation
survives if the user closes their browser tab.
"""
try:
# Load fresh session (not stale reference)
session = await get_chat_session(session_id, user_id)
if not session:
logger.error(f"Session {session_id} not found for background tool")
return
# Execute the actual tool
result = await execute_tool(
tool_name=tool_name,
parameters=parameters,
tool_call_id=tool_call_id,
user_id=user_id,
session=session,
)
# Update the pending message with result
await _update_pending_operation(
session_id=session_id,
tool_call_id=tool_call_id,
result=(
result.output
if isinstance(result.output, str)
else orjson.dumps(result.output).decode("utf-8")
),
)
logger.info(f"Background tool {tool_name} completed for session {session_id}")
# Generate LLM continuation so user sees response when they poll/refresh
await _generate_llm_continuation(session_id=session_id, user_id=user_id)
except Exception as e:
logger.error(f"Background tool {tool_name} failed: {e}", exc_info=True)
error_response = ErrorResponse(
message=f"Tool {tool_name} failed: {str(e)}",
)
await _update_pending_operation(
session_id=session_id,
tool_call_id=tool_call_id,
result=error_response.model_dump_json(),
)
finally:
await _mark_operation_completed(tool_call_id)
async def _update_pending_operation(
session_id: str,
tool_call_id: str,
result: str,
) -> None:
"""Update the pending tool message with final result.
This is called by background tasks when long-running operations complete.
"""
# Update the message in database
updated = await chat_db.update_tool_message_content(
session_id=session_id,
tool_call_id=tool_call_id,
new_content=result,
)
if updated:
# Invalidate Redis cache so next load gets fresh data
# Wrap in try/except to prevent cache failures from triggering error handling
# that would overwrite our successful DB update
try:
await invalidate_session_cache(session_id)
except Exception as e:
# Non-critical: cache will eventually be refreshed on next load
logger.warning(f"Failed to invalidate cache for session {session_id}: {e}")
logger.info(
f"Updated pending operation for tool_call_id {tool_call_id} "
f"in session {session_id}"
)
else:
logger.warning(
f"Failed to update pending operation for tool_call_id {tool_call_id} "
f"in session {session_id}"
)
async def _generate_llm_continuation(
session_id: str,
user_id: str | None,
) -> None:
"""Generate an LLM response after a long-running tool completes.
This is called by background tasks to continue the conversation
after a tool result is saved. The response is saved to the database
so users see it when they refresh or poll.
"""
try:
# Load fresh session from DB (bypass cache to get the updated tool result)
await invalidate_session_cache(session_id)
session = await get_chat_session(session_id, user_id)
if not session:
logger.error(f"Session {session_id} not found for LLM continuation")
return
# Build system prompt
system_prompt, _ = await _build_system_prompt(user_id)
# Build messages in OpenAI format
messages = session.to_openai_messages()
if system_prompt:
from openai.types.chat import ChatCompletionSystemMessageParam
system_message = ChatCompletionSystemMessageParam(
role="system",
content=system_prompt,
)
messages = [system_message] + messages
# Build extra_body for tracing
extra_body: dict[str, Any] = {
"posthogProperties": {
"environment": settings.config.app_env.value,
},
}
if user_id:
extra_body["user"] = user_id[:128]
extra_body["posthogDistinctId"] = user_id
if session_id:
extra_body["session_id"] = session_id[:128]
# Make non-streaming LLM call (no tools - just text response)
from typing import cast
from openai.types.chat import ChatCompletionMessageParam
# No tools parameter = text-only response (no tool calls)
response = await client.chat.completions.create(
model=config.model,
messages=cast(list[ChatCompletionMessageParam], messages),
extra_body=extra_body,
)
if response.choices and response.choices[0].message.content:
assistant_content = response.choices[0].message.content
# Reload session from DB to avoid race condition with user messages
# that may have been sent while we were generating the LLM response
fresh_session = await get_chat_session(session_id, user_id)
if not fresh_session:
logger.error(
f"Session {session_id} disappeared during LLM continuation"
)
return
# Save assistant message to database
assistant_message = ChatMessage(
role="assistant",
content=assistant_content,
)
fresh_session.messages.append(assistant_message)
# Save to database (not cache) to persist the response
await upsert_chat_session(fresh_session)
# Invalidate cache so next poll/refresh gets fresh data
await invalidate_session_cache(session_id)
logger.info(
f"Generated LLM continuation for session {session_id}, "
f"response length: {len(assistant_content)}"
)
else:
logger.warning(f"LLM continuation returned empty response for {session_id}")
except Exception as e:
logger.error(f"Failed to generate LLM continuation: {e}", exc_info=True)

View File

@@ -1,79 +0,0 @@
# CoPilot Tools - Future Ideas
## Multimodal Image Support for CoPilot
**Problem:** CoPilot uses a vision-capable model but can't "see" workspace images. When a block generates an image and returns `workspace://abc123`, CoPilot can't evaluate it (e.g., checking blog thumbnail quality).
**Backend Solution:**
When preparing messages for the LLM, detect `workspace://` image references and convert them to proper image content blocks:
```python
# Before sending to LLM, scan for workspace image references
# and inject them as image content parts
# Example message transformation:
# FROM: {"role": "assistant", "content": "Generated image: workspace://abc123"}
# TO: {"role": "assistant", "content": [
# {"type": "text", "text": "Generated image: workspace://abc123"},
# {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}}
# ]}
```
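One possible shape for that preprocessing step, as a sketch only; `workspace_manager.read_file()` returning `(mime_type, raw_bytes)` and the 5MB size cap are assumptions for illustration, not an existing API:
```python
import base64
import re
from typing import Any

WORKSPACE_IMAGE_REF = re.compile(r"workspace://[A-Za-z0-9_-]+")
MAX_IMAGE_BYTES = 5 * 1024 * 1024  # skip anything larger than ~5MB


def inject_workspace_images(message: dict[str, Any], workspace_manager: Any) -> dict[str, Any]:
    """Expand workspace image references in a text message into image content parts."""
    content = message.get("content")
    if not isinstance(content, str):
        return message  # already multimodal (or empty); leave untouched
    refs = WORKSPACE_IMAGE_REF.findall(content)
    if not refs:
        return message
    parts: list[dict[str, Any]] = [{"type": "text", "text": content}]
    for ref in refs:
        # Hypothetical workspace API: returns (mime_type, raw_bytes) for a stored file
        mime_type, data = workspace_manager.read_file(ref)
        if not mime_type.startswith("image/") or len(data) > MAX_IMAGE_BYTES:
            continue
        encoded = base64.b64encode(data).decode("ascii")
        parts.append(
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            }
        )
    return {**message, "content": parts}
```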
**Where to implement:**
- In the chat stream handler before calling the LLM
- Or in a message preprocessing step
- Need to fetch image from workspace, convert to base64, add as image content
**Considerations:**
- Only do this for image MIME types (image/png, image/jpeg, etc.)
- May want a size limit (don't pass 10MB images)
- Track which images were "shown" to the AI for frontend indicator
- Cost implications - vision API calls are more expensive
**Frontend Solution:**
Show visual indicator on workspace files in chat:
- If AI saw the image: normal display
- If AI didn't see it: overlay icon saying "AI can't see this image"
Requires response metadata indicating which `workspace://` refs were passed to the model.
---
## Output Post-Processing Layer for run_block
**Problem:** Many blocks produce large outputs that:
- Consume massive context (a 100KB image is ~133KB once base64-encoded, i.e. tens of thousands of tokens)
- Can't fit in conversation
- Break things and cause high LLM costs
**Proposed Solution:** Instead of modifying individual blocks or `store_media_file()`, implement a centralized output processor in `run_block.py` that handles outputs before they're returned to CoPilot.
**Benefits:**
1. **Centralized** - one place to handle all output processing
2. **Future-proof** - new blocks automatically get output processing
3. **Keeps blocks pure** - they don't need to know about context constraints
4. **Handles all large outputs** - not just images
**Processing Rules:**
- Detect base64 data URIs → save to workspace, return `workspace://` reference
- Truncate very long strings (>N chars) with truncation note
- Summarize large arrays/lists (e.g., "Array with 1000 items, first 5: [...]")
- Handle nested large outputs in dicts recursively
- Cap total output size
**Implementation Location:** `run_block.py` after block execution, before returning `BlockOutputResponse`
**Example:**
```python
def _process_outputs_for_context(
outputs: dict[str, list[Any]],
workspace_manager: WorkspaceManager,
max_string_length: int = 10000,
max_array_preview: int = 5,
) -> dict[str, list[Any]]:
"""Process block outputs to prevent context bloat."""
processed = {}
for name, values in outputs.items():
processed[name] = [_process_value(v, workspace_manager) for v in values]
return processed
```

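The `_process_value` helper used above is left undefined here; a minimal sketch of one way it could behave, where `workspace_manager.save_bytes()` is a hypothetical API for offloading large payloads:
```python
def _process_value(
    value: Any,
    workspace_manager: WorkspaceManager,
    max_string_length: int = 10000,
    max_array_preview: int = 5,
) -> Any:
    """Shrink a single block output value so it fits comfortably in chat context."""
    if isinstance(value, str):
        if value.startswith("data:") and ";base64," in value:
            # Offload base64 data URIs and keep only a workspace reference
            ref = workspace_manager.save_bytes(value)  # hypothetical API
            return f"workspace://{ref}"
        if len(value) > max_string_length:
            truncated = len(value) - max_string_length
            return value[:max_string_length] + f" ...[truncated {truncated} chars]"
        return value
    if isinstance(value, list) and len(value) > max_array_preview:
        preview = [_process_value(v, workspace_manager) for v in value[:max_array_preview]]
        return {"total_items": len(value), "first_items": preview}
    if isinstance(value, dict):
        return {k: _process_value(v, workspace_manager) for k, v in value.items()}
    return value
```
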

View File

@@ -18,12 +18,6 @@ from .get_doc_page import GetDocPageTool
from .run_agent import RunAgentTool
from .run_block import RunBlockTool
from .search_docs import SearchDocsTool
from .workspace_files import (
DeleteWorkspaceFileTool,
ListWorkspaceFilesTool,
ReadWorkspaceFileTool,
WriteWorkspaceFileTool,
)
if TYPE_CHECKING:
from backend.api.features.chat.response_model import StreamToolOutputAvailable
@@ -43,11 +37,6 @@ TOOL_REGISTRY: dict[str, BaseTool] = {
"view_agent_output": AgentOutputTool(),
"search_docs": SearchDocsTool(),
"get_doc_page": GetDocPageTool(),
# Workspace tools for CoPilot file operations
"list_workspace_files": ListWorkspaceFilesTool(),
"read_workspace_file": ReadWorkspaceFileTool(),
"write_workspace_file": WriteWorkspaceFileTool(),
"delete_workspace_file": DeleteWorkspaceFileTool(),
}
# Export individual tool instances for backwards compatibility
@@ -60,11 +49,6 @@ tools: list[ChatCompletionToolParam] = [
]
def get_tool(tool_name: str) -> BaseTool | None:
"""Get a tool instance by name."""
return TOOL_REGISTRY.get(tool_name)
async def execute_tool(
tool_name: str,
parameters: dict[str, Any],
@@ -73,7 +57,7 @@ async def execute_tool(
tool_call_id: str,
) -> "StreamToolOutputAvailable":
"""Execute a tool by name."""
tool = get_tool(tool_name)
tool = TOOL_REGISTRY.get(tool_name)
if not tool:
raise ValueError(f"Tool {tool_name} not found")

View File

@@ -9,7 +9,6 @@ from .core import (
json_to_graph,
save_agent_to_library,
)
from .errors import get_user_message_for_error
from .service import health_check as check_external_service_health
from .service import is_external_service_configured
@@ -26,6 +25,4 @@ __all__ = [
# Service
"is_external_service_configured",
"check_external_service_health",
# Error handling
"get_user_message_for_error",
]

View File

@@ -64,7 +64,7 @@ async def generate_agent(instructions: dict[str, Any]) -> dict[str, Any] | None:
instructions: Structured instructions from decompose_goal
Returns:
Agent JSON dict, error dict {"type": "error", ...}, or None on error
Agent JSON dict or None on error
Raises:
AgentGeneratorNotConfiguredError: If the external service is not configured.
@@ -73,10 +73,7 @@ async def generate_agent(instructions: dict[str, Any]) -> dict[str, Any] | None:
logger.info("Calling external Agent Generator service for generate_agent")
result = await generate_agent_external(instructions)
if result:
# Check if it's an error response - pass through as-is
if isinstance(result, dict) and result.get("type") == "error":
return result
# Ensure required fields for successful agent generation
# Ensure required fields
if "id" not in result:
result["id"] = str(uuid.uuid4())
if "version" not in result:
@@ -270,8 +267,7 @@ async def generate_agent_patch(
current_agent: Current agent JSON
Returns:
Updated agent JSON, clarifying questions dict {"type": "clarifying_questions", ...},
error dict {"type": "error", ...}, or None on unexpected error
Updated agent JSON, clarifying questions dict, or None on error
Raises:
AgentGeneratorNotConfiguredError: If the external service is not configured.

View File

@@ -1,43 +0,0 @@
"""Error handling utilities for agent generator."""
def get_user_message_for_error(
error_type: str,
operation: str = "process the request",
llm_parse_message: str | None = None,
validation_message: str | None = None,
) -> str:
"""Get a user-friendly error message based on error type.
This function maps internal error types to user-friendly messages,
providing a consistent experience across different agent operations.
Args:
error_type: The error type from the external service
(e.g., "llm_parse_error", "timeout", "rate_limit")
operation: Description of what operation failed, used in the default
message (e.g., "analyze the goal", "generate the agent")
llm_parse_message: Custom message for llm_parse_error type
validation_message: Custom message for validation_error type
Returns:
User-friendly error message suitable for display to the user
"""
if error_type == "llm_parse_error":
return (
llm_parse_message
or "The AI had trouble processing this request. Please try again."
)
elif error_type == "validation_error":
return (
validation_message
or "The request failed validation. Please try rephrasing."
)
elif error_type == "patch_error":
return "Failed to apply the changes. Please try a different approach."
elif error_type in ("timeout", "llm_timeout"):
return "The request took too long. Please try again."
elif error_type in ("rate_limit", "llm_rate_limit"):
return "The service is currently busy. Please try again in a moment."
else:
return f"Failed to {operation}. Please try again."

View File

@@ -14,70 +14,6 @@ from backend.util.settings import Settings
logger = logging.getLogger(__name__)
def _create_error_response(
error_message: str,
error_type: str = "unknown",
details: dict[str, Any] | None = None,
) -> dict[str, Any]:
"""Create a standardized error response dict.
Args:
error_message: Human-readable error message
error_type: Machine-readable error type
details: Optional additional error details
Returns:
Error dict with type="error" and error details
"""
response: dict[str, Any] = {
"type": "error",
"error": error_message,
"error_type": error_type,
}
if details:
response["details"] = details
return response
def _classify_http_error(e: httpx.HTTPStatusError) -> tuple[str, str]:
"""Classify an HTTP error into error_type and message.
Args:
e: The HTTP status error
Returns:
Tuple of (error_type, error_message)
"""
status = e.response.status_code
if status == 429:
return "rate_limit", f"Agent Generator rate limited: {e}"
elif status == 503:
return "service_unavailable", f"Agent Generator unavailable: {e}"
elif status == 504 or status == 408:
return "timeout", f"Agent Generator timed out: {e}"
else:
return "http_error", f"HTTP error calling Agent Generator: {e}"
def _classify_request_error(e: httpx.RequestError) -> tuple[str, str]:
"""Classify a request error into error_type and message.
Args:
e: The request error
Returns:
Tuple of (error_type, error_message)
"""
error_str = str(e).lower()
if "timeout" in error_str or "timed out" in error_str:
return "timeout", f"Agent Generator request timed out: {e}"
elif "connect" in error_str:
return "connection_error", f"Could not connect to Agent Generator: {e}"
else:
return "request_error", f"Request error calling Agent Generator: {e}"
_client: httpx.AsyncClient | None = None
_settings: Settings | None = None
@@ -131,8 +67,7 @@ async def decompose_goal_external(
- {"type": "instructions", "steps": [...]}
- {"type": "unachievable_goal", ...}
- {"type": "vague_goal", ...}
- {"type": "error", "error": "...", "error_type": "..."} on error
Or None on unexpected error
Or None on error
"""
client = _get_client()
@@ -148,13 +83,8 @@ async def decompose_goal_external(
data = response.json()
if not data.get("success"):
error_msg = data.get("error", "Unknown error from Agent Generator")
error_type = data.get("error_type", "unknown")
logger.error(
f"Agent Generator decomposition failed: {error_msg} "
f"(type: {error_type})"
)
return _create_error_response(error_msg, error_type)
logger.error(f"External service returned error: {data.get('error')}")
return None
# Map the response to the expected format
response_type = data.get("type")
@@ -176,37 +106,25 @@ async def decompose_goal_external(
"type": "vague_goal",
"suggested_goal": data.get("suggested_goal"),
}
elif response_type == "error":
# Pass through error from the service
return _create_error_response(
data.get("error", "Unknown error"),
data.get("error_type", "unknown"),
)
else:
logger.error(
f"Unknown response type from external service: {response_type}"
)
return _create_error_response(
f"Unknown response type from Agent Generator: {response_type}",
"invalid_response",
)
return None
except httpx.HTTPStatusError as e:
error_type, error_msg = _classify_http_error(e)
logger.error(error_msg)
return _create_error_response(error_msg, error_type)
logger.error(f"HTTP error calling external agent generator: {e}")
return None
except httpx.RequestError as e:
error_type, error_msg = _classify_request_error(e)
logger.error(error_msg)
return _create_error_response(error_msg, error_type)
logger.error(f"Request error calling external agent generator: {e}")
return None
except Exception as e:
error_msg = f"Unexpected error calling Agent Generator: {e}"
logger.error(error_msg)
return _create_error_response(error_msg, "unexpected_error")
logger.error(f"Unexpected error calling external agent generator: {e}")
return None
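A hedged consumer sketch for the return shapes documented above ("instructions", "unachievable_goal", "vague_goal", or None); the single-string goal parameter is an assumption, since the function signature is not shown in this hunk:
async def describe_decomposition(goal: str) -> str:
    result = await decompose_goal_external(goal)  # parameter shape assumed
    if result is None:
        return "Decomposition failed; see service logs."
    if result["type"] == "instructions":
        return f"{len(result.get('steps', []))} step(s) planned."
    if result["type"] == "unachievable_goal":
        return "The goal was judged unachievable."
    if result["type"] == "vague_goal":
        return f"Suggested goal: {result.get('suggested_goal')}"
    return f"Unhandled response type: {result['type']}"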
async def generate_agent_external(
instructions: dict[str, Any],
instructions: dict[str, Any]
) -> dict[str, Any] | None:
"""Call the external service to generate an agent from instructions.
@@ -214,7 +132,7 @@ async def generate_agent_external(
instructions: Structured instructions from decompose_goal
Returns:
Agent JSON dict on success, or error dict {"type": "error", ...} on error
Agent JSON dict or None on error
"""
client = _get_client()
@@ -226,28 +144,20 @@ async def generate_agent_external(
data = response.json()
if not data.get("success"):
error_msg = data.get("error", "Unknown error from Agent Generator")
error_type = data.get("error_type", "unknown")
logger.error(
f"Agent Generator generation failed: {error_msg} "
f"(type: {error_type})"
)
return _create_error_response(error_msg, error_type)
logger.error(f"External service returned error: {data.get('error')}")
return None
return data.get("agent_json")
except httpx.HTTPStatusError as e:
error_type, error_msg = _classify_http_error(e)
logger.error(error_msg)
return _create_error_response(error_msg, error_type)
logger.error(f"HTTP error calling external agent generator: {e}")
return None
except httpx.RequestError as e:
error_type, error_msg = _classify_request_error(e)
logger.error(error_msg)
return _create_error_response(error_msg, error_type)
logger.error(f"Request error calling external agent generator: {e}")
return None
except Exception as e:
error_msg = f"Unexpected error calling Agent Generator: {e}"
logger.error(error_msg)
return _create_error_response(error_msg, "unexpected_error")
logger.error(f"Unexpected error calling external agent generator: {e}")
return None
async def generate_agent_patch_external(
@@ -260,7 +170,7 @@ async def generate_agent_patch_external(
current_agent: Current agent JSON
Returns:
Updated agent JSON, clarifying questions dict, or error dict on error
Updated agent JSON, clarifying questions dict, or None on error
"""
client = _get_client()
@@ -276,13 +186,8 @@ async def generate_agent_patch_external(
data = response.json()
if not data.get("success"):
error_msg = data.get("error", "Unknown error from Agent Generator")
error_type = data.get("error_type", "unknown")
logger.error(
f"Agent Generator patch generation failed: {error_msg} "
f"(type: {error_type})"
)
return _create_error_response(error_msg, error_type)
logger.error(f"External service returned error: {data.get('error')}")
return None
# Check if it's clarifying questions
if data.get("type") == "clarifying_questions":
@@ -291,28 +196,18 @@ async def generate_agent_patch_external(
"questions": data.get("questions", []),
}
# Check if it's an error passed through
if data.get("type") == "error":
return _create_error_response(
data.get("error", "Unknown error"),
data.get("error_type", "unknown"),
)
# Otherwise return the updated agent JSON
return data.get("agent_json")
except httpx.HTTPStatusError as e:
error_type, error_msg = _classify_http_error(e)
logger.error(error_msg)
return _create_error_response(error_msg, error_type)
logger.error(f"HTTP error calling external agent generator: {e}")
return None
except httpx.RequestError as e:
error_type, error_msg = _classify_request_error(e)
logger.error(error_msg)
return _create_error_response(error_msg, error_type)
logger.error(f"Request error calling external agent generator: {e}")
return None
except Exception as e:
error_msg = f"Unexpected error calling Agent Generator: {e}"
logger.error(error_msg)
return _create_error_response(error_msg, "unexpected_error")
logger.error(f"Unexpected error calling external agent generator: {e}")
return None
async def get_blocks_external() -> list[dict[str, Any]] | None:

View File

@@ -36,16 +36,6 @@ class BaseTool:
"""Whether this tool requires authentication."""
return False
@property
def is_long_running(self) -> bool:
"""Whether this tool is long-running and should execute in background.
Long-running tools (like agent generation) are executed via background
tasks to survive SSE disconnections. The result is persisted to chat
history and visible when the user refreshes.
"""
return False
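A hedged illustration of how a caller could branch on the flag being removed here (this is not the project's actual dispatch code, only a sketch of the contract described in the docstring above):
def should_run_in_background(tool: "BaseTool") -> bool:
    # Long-running tools (e.g. agent generation) survive SSE disconnects by
    # executing as background tasks; results are persisted to chat history.
    return getattr(tool, "is_long_running", False)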
def as_openai_tool(self) -> ChatCompletionToolParam:
"""Convert to OpenAI tool format."""
return ChatCompletionToolParam(

View File

@@ -9,7 +9,6 @@ from .agent_generator import (
AgentGeneratorNotConfiguredError,
decompose_goal,
generate_agent,
get_user_message_for_error,
save_agent_to_library,
)
from .base import BaseTool
@@ -43,10 +42,6 @@ class CreateAgentTool(BaseTool):
def requires_auth(self) -> bool:
return True
@property
def is_long_running(self) -> bool:
return True
@property
def parameters(self) -> dict[str, Any]:
return {
@@ -118,29 +113,11 @@ class CreateAgentTool(BaseTool):
if decomposition_result is None:
return ErrorResponse(
message="Failed to analyze the goal. The agent generation service may be unavailable. Please try again.",
message="Failed to analyze the goal. The agent generation service may be unavailable or timed out. Please try again.",
error="decomposition_failed",
details={"description": description[:100]},
session_id=session_id,
)
# Check if the result is an error from the external service
if decomposition_result.get("type") == "error":
error_msg = decomposition_result.get("error", "Unknown error")
error_type = decomposition_result.get("error_type", "unknown")
user_message = get_user_message_for_error(
error_type,
operation="analyze the goal",
llm_parse_message="The AI had trouble understanding this request. Please try rephrasing your goal.",
)
return ErrorResponse(
message=user_message,
error=f"decomposition_failed:{error_type}",
details={
"description": description[:100],
"service_error": error_msg,
"error_type": error_type,
},
"description": description[:100]
}, # Include context for debugging
session_id=session_id,
)
@@ -205,30 +182,11 @@ class CreateAgentTool(BaseTool):
if agent_json is None:
return ErrorResponse(
message="Failed to generate the agent. The agent generation service may be unavailable. Please try again.",
message="Failed to generate the agent. The agent generation service may be unavailable or timed out. Please try again.",
error="generation_failed",
details={"description": description[:100]},
session_id=session_id,
)
# Check if the result is an error from the external service
if isinstance(agent_json, dict) and agent_json.get("type") == "error":
error_msg = agent_json.get("error", "Unknown error")
error_type = agent_json.get("error_type", "unknown")
user_message = get_user_message_for_error(
error_type,
operation="generate the agent",
llm_parse_message="The AI had trouble generating the agent. Please try again or simplify your goal.",
validation_message="The generated agent failed validation. Please try rephrasing your goal.",
)
return ErrorResponse(
message=user_message,
error=f"generation_failed:{error_type}",
details={
"description": description[:100],
"service_error": error_msg,
"error_type": error_type,
},
"description": description[:100]
}, # Include context for debugging
session_id=session_id,
)

View File

@@ -9,7 +9,6 @@ from .agent_generator import (
AgentGeneratorNotConfiguredError,
generate_agent_patch,
get_agent_as_json,
get_user_message_for_error,
save_agent_to_library,
)
from .base import BaseTool
@@ -43,10 +42,6 @@ class EditAgentTool(BaseTool):
def requires_auth(self) -> bool:
return True
@property
def is_long_running(self) -> bool:
return True
@property
def parameters(self) -> dict[str, Any]:
return {
@@ -153,28 +148,6 @@ class EditAgentTool(BaseTool):
session_id=session_id,
)
# Check if the result is an error from the external service
if isinstance(result, dict) and result.get("type") == "error":
error_msg = result.get("error", "Unknown error")
error_type = result.get("error_type", "unknown")
user_message = get_user_message_for_error(
error_type,
operation="generate the changes",
llm_parse_message="The AI had trouble generating the changes. Please try again or simplify your request.",
validation_message="The generated changes failed validation. Please try rephrasing your request.",
)
return ErrorResponse(
message=user_message,
error=f"update_generation_failed:{error_type}",
details={
"agent_id": agent_id,
"changes": changes[:100],
"service_error": error_msg,
"error_type": error_type,
},
session_id=session_id,
)
# Check if LLM returned clarifying questions
if result.get("type") == "clarifying_questions":
questions = result.get("questions", [])

View File

@@ -28,16 +28,6 @@ class ResponseType(str, Enum):
BLOCK_OUTPUT = "block_output"
DOC_SEARCH_RESULTS = "doc_search_results"
DOC_PAGE = "doc_page"
# Workspace response types
WORKSPACE_FILE_LIST = "workspace_file_list"
WORKSPACE_FILE_CONTENT = "workspace_file_content"
WORKSPACE_FILE_METADATA = "workspace_file_metadata"
WORKSPACE_FILE_WRITTEN = "workspace_file_written"
WORKSPACE_FILE_DELETED = "workspace_file_deleted"
# Long-running operation types
OPERATION_STARTED = "operation_started"
OPERATION_PENDING = "operation_pending"
OPERATION_IN_PROGRESS = "operation_in_progress"
# Base response model
@@ -344,39 +334,3 @@ class BlockOutputResponse(ToolResponseBase):
block_name: str
outputs: dict[str, list[Any]]
success: bool = True
# Long-running operation models
class OperationStartedResponse(ToolResponseBase):
"""Response when a long-running operation has been started in the background.
This is returned immediately to the client while the operation continues
to execute. The user can close the tab and check back later.
"""
type: ResponseType = ResponseType.OPERATION_STARTED
operation_id: str
tool_name: str
class OperationPendingResponse(ToolResponseBase):
"""Response stored in chat history while a long-running operation is executing.
This is persisted to the database so users see a pending state when they
refresh before the operation completes.
"""
type: ResponseType = ResponseType.OPERATION_PENDING
operation_id: str
tool_name: str
class OperationInProgressResponse(ToolResponseBase):
"""Response when an operation is already in progress.
Returned for idempotency when the same tool_call_id is requested again
while the background task is still running.
"""
type: ResponseType = ResponseType.OPERATION_IN_PROGRESS
tool_call_id: str
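A hedged construction example for the operation models being removed here; the operation_id/tool_name values are placeholders, and message/session_id come from ToolResponseBase as used elsewhere in this diff:
started = OperationStartedResponse(
    operation_id="op-123",        # placeholder id
    tool_name="create_agent",     # placeholder tool name
    message="Agent generation started in the background.",
    session_id="session-abc",     # placeholder session
)
assert started.type == ResponseType.OPERATION_STARTED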

View File

@@ -1,7 +1,6 @@
"""Tool for executing blocks directly."""
import logging
import uuid
from collections import defaultdict
from typing import Any
@@ -9,7 +8,6 @@ from backend.api.features.chat.model import ChatSession
from backend.data.block import get_block
from backend.data.execution import ExecutionContext
from backend.data.model import CredentialsMetaInput
from backend.data.workspace import get_or_create_workspace
from backend.integrations.creds_manager import IntegrationCredentialsManager
from backend.util.exceptions import BlockError
@@ -225,48 +223,11 @@ class RunBlockTool(BaseTool):
)
try:
# Get or create user's workspace for CoPilot file operations
workspace = await get_or_create_workspace(user_id)
# Generate synthetic IDs for CoPilot context
# Each chat session is treated as its own agent with one continuous run
# This means:
# - graph_id (agent) = session (memories scoped to session when limit_to_agent=True)
# - graph_exec_id (run) = session (memories scoped to session when limit_to_run=True)
# - node_exec_id = unique per block execution
synthetic_graph_id = f"copilot-session-{session.session_id}"
synthetic_graph_exec_id = f"copilot-session-{session.session_id}"
synthetic_node_id = f"copilot-node-{block_id}"
synthetic_node_exec_id = (
f"copilot-{session.session_id}-{uuid.uuid4().hex[:8]}"
)
# Create unified execution context with all required fields
execution_context = ExecutionContext(
# Execution identity
user_id=user_id,
graph_id=synthetic_graph_id,
graph_exec_id=synthetic_graph_exec_id,
graph_version=1, # Versions are 1-indexed
node_id=synthetic_node_id,
node_exec_id=synthetic_node_exec_id,
# Workspace with session scoping
workspace_id=workspace.id,
session_id=session.session_id,
)
# Prepare kwargs for block execution
# Keep individual kwargs for backwards compatibility with existing blocks
# Fetch actual credentials and prepare kwargs for block execution
# Create execution context with defaults (blocks may require it)
exec_kwargs: dict[str, Any] = {
"user_id": user_id,
"execution_context": execution_context,
# Legacy: individual kwargs for blocks not yet using execution_context
"workspace_id": workspace.id,
"graph_exec_id": synthetic_graph_exec_id,
"node_exec_id": synthetic_node_exec_id,
"node_id": synthetic_node_id,
"graph_version": 1, # Versions are 1-indexed
"graph_id": synthetic_graph_id,
"execution_context": ExecutionContext(),
}
for field_name, cred_meta in matched_credentials.items():

View File

@@ -1,620 +0,0 @@
"""CoPilot tools for workspace file operations."""
import base64
import logging
from typing import Any, Optional
from pydantic import BaseModel
from backend.api.features.chat.model import ChatSession
from backend.data.workspace import get_or_create_workspace
from backend.util.settings import Config
from backend.util.virus_scanner import scan_content_safe
from backend.util.workspace import WorkspaceManager
from .base import BaseTool
from .models import ErrorResponse, ResponseType, ToolResponseBase
logger = logging.getLogger(__name__)
class WorkspaceFileInfoData(BaseModel):
"""Data model for workspace file information (not a response itself)."""
file_id: str
name: str
path: str
mime_type: str
size_bytes: int
class WorkspaceFileListResponse(ToolResponseBase):
"""Response containing list of workspace files."""
type: ResponseType = ResponseType.WORKSPACE_FILE_LIST
files: list[WorkspaceFileInfoData]
total_count: int
class WorkspaceFileContentResponse(ToolResponseBase):
"""Response containing workspace file content (legacy, for small text files)."""
type: ResponseType = ResponseType.WORKSPACE_FILE_CONTENT
file_id: str
name: str
path: str
mime_type: str
content_base64: str
class WorkspaceFileMetadataResponse(ToolResponseBase):
"""Response containing workspace file metadata and download URL (prevents context bloat)."""
type: ResponseType = ResponseType.WORKSPACE_FILE_METADATA
file_id: str
name: str
path: str
mime_type: str
size_bytes: int
download_url: str
preview: str | None = None # First 500 chars for text files
class WorkspaceWriteResponse(ToolResponseBase):
"""Response after writing a file to workspace."""
type: ResponseType = ResponseType.WORKSPACE_FILE_WRITTEN
file_id: str
name: str
path: str
size_bytes: int
class WorkspaceDeleteResponse(ToolResponseBase):
"""Response after deleting a file from workspace."""
type: ResponseType = ResponseType.WORKSPACE_FILE_DELETED
file_id: str
success: bool
class ListWorkspaceFilesTool(BaseTool):
"""Tool for listing files in user's workspace."""
@property
def name(self) -> str:
return "list_workspace_files"
@property
def description(self) -> str:
return (
"List files in the user's workspace. "
"Returns file names, paths, sizes, and metadata. "
"Optionally filter by path prefix."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"path_prefix": {
"type": "string",
"description": (
"Optional path prefix to filter files "
"(e.g., '/documents/' to list only files in documents folder). "
"By default, only files from the current session are listed."
),
},
"limit": {
"type": "integer",
"description": "Maximum number of files to return (default 50, max 100)",
"minimum": 1,
"maximum": 100,
},
"include_all_sessions": {
"type": "boolean",
"description": (
"If true, list files from all sessions. "
"Default is false (only current session's files)."
),
},
},
"required": [],
}
@property
def requires_auth(self) -> bool:
return True
async def _execute(
self,
user_id: str | None,
session: ChatSession,
**kwargs,
) -> ToolResponseBase:
session_id = session.session_id
if not user_id:
return ErrorResponse(
message="Authentication required",
session_id=session_id,
)
path_prefix: Optional[str] = kwargs.get("path_prefix")
limit = min(kwargs.get("limit", 50), 100)
include_all_sessions: bool = kwargs.get("include_all_sessions", False)
try:
workspace = await get_or_create_workspace(user_id)
# Pass session_id for session-scoped file access
manager = WorkspaceManager(user_id, workspace.id, session_id)
files = await manager.list_files(
path=path_prefix,
limit=limit,
include_all_sessions=include_all_sessions,
)
total = await manager.get_file_count(
path=path_prefix,
include_all_sessions=include_all_sessions,
)
file_infos = [
WorkspaceFileInfoData(
file_id=f.id,
name=f.name,
path=f.path,
mime_type=f.mimeType,
size_bytes=f.sizeBytes,
)
for f in files
]
scope_msg = "all sessions" if include_all_sessions else "current session"
return WorkspaceFileListResponse(
files=file_infos,
total_count=total,
message=f"Found {len(files)} files in workspace ({scope_msg})",
session_id=session_id,
)
except Exception as e:
logger.error(f"Error listing workspace files: {e}", exc_info=True)
return ErrorResponse(
message=f"Failed to list workspace files: {str(e)}",
error=str(e),
session_id=session_id,
)
class ReadWorkspaceFileTool(BaseTool):
"""Tool for reading file content from workspace."""
# Size threshold for returning full content vs metadata+URL
# Files larger than this return metadata with download URL to prevent context bloat
MAX_INLINE_SIZE_BYTES = 32 * 1024 # 32KB
# Preview size for text files
PREVIEW_SIZE = 500
@property
def name(self) -> str:
return "read_workspace_file"
@property
def description(self) -> str:
return (
"Read a file from the user's workspace. "
"Specify either file_id or path to identify the file. "
"For small text files, returns content directly. "
"For large or binary files, returns metadata and a download URL. "
"Paths are scoped to the current session by default. "
"Use /sessions/<session_id>/... for cross-session access."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"file_id": {
"type": "string",
"description": "The file's unique ID (from list_workspace_files)",
},
"path": {
"type": "string",
"description": (
"The virtual file path (e.g., '/documents/report.pdf'). "
"Scoped to current session by default."
),
},
"force_download_url": {
"type": "boolean",
"description": (
"If true, always return metadata+URL instead of inline content. "
"Default is false (auto-selects based on file size/type)."
),
},
},
"required": [], # At least one must be provided
}
@property
def requires_auth(self) -> bool:
return True
def _is_text_mime_type(self, mime_type: str) -> bool:
"""Check if the MIME type is a text-based type."""
text_types = [
"text/",
"application/json",
"application/xml",
"application/javascript",
"application/x-python",
"application/x-sh",
]
return any(mime_type.startswith(t) for t in text_types)
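A hedged spot check of the MIME helper above:
_tool = ReadWorkspaceFileTool()
assert _tool._is_text_mime_type("text/markdown") is True
assert _tool._is_text_mime_type("application/json") is True
assert _tool._is_text_mime_type("image/png") is False  # binary -> metadata + download URL path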
async def _execute(
self,
user_id: str | None,
session: ChatSession,
**kwargs,
) -> ToolResponseBase:
session_id = session.session_id
if not user_id:
return ErrorResponse(
message="Authentication required",
session_id=session_id,
)
file_id: Optional[str] = kwargs.get("file_id")
path: Optional[str] = kwargs.get("path")
force_download_url: bool = kwargs.get("force_download_url", False)
if not file_id and not path:
return ErrorResponse(
message="Please provide either file_id or path",
session_id=session_id,
)
try:
workspace = await get_or_create_workspace(user_id)
# Pass session_id for session-scoped file access
manager = WorkspaceManager(user_id, workspace.id, session_id)
# Get file info
if file_id:
file_info = await manager.get_file_info(file_id)
if file_info is None:
return ErrorResponse(
message=f"File not found: {file_id}",
session_id=session_id,
)
target_file_id = file_id
else:
# path is guaranteed to be non-None here due to the check above
assert path is not None
file_info = await manager.get_file_info_by_path(path)
if file_info is None:
return ErrorResponse(
message=f"File not found at path: {path}",
session_id=session_id,
)
target_file_id = file_info.id
# Decide whether to return inline content or metadata+URL
is_small_file = file_info.sizeBytes <= self.MAX_INLINE_SIZE_BYTES
is_text_file = self._is_text_mime_type(file_info.mimeType)
# Return inline content for small text files (unless force_download_url)
if is_small_file and is_text_file and not force_download_url:
content = await manager.read_file_by_id(target_file_id)
content_b64 = base64.b64encode(content).decode("utf-8")
return WorkspaceFileContentResponse(
file_id=file_info.id,
name=file_info.name,
path=file_info.path,
mime_type=file_info.mimeType,
content_base64=content_b64,
message=f"Successfully read file: {file_info.name}",
session_id=session_id,
)
# Return metadata + workspace:// reference for large or binary files
# This prevents context bloat (100KB file = ~133KB as base64)
# Use workspace:// format so frontend urlTransform can add proxy prefix
download_url = f"workspace://{target_file_id}"
# Generate preview for text files
preview: str | None = None
if is_text_file:
try:
content = await manager.read_file_by_id(target_file_id)
preview_text = content[: self.PREVIEW_SIZE].decode(
"utf-8", errors="replace"
)
if len(content) > self.PREVIEW_SIZE:
preview_text += "..."
preview = preview_text
except Exception:
pass # Preview is optional
return WorkspaceFileMetadataResponse(
file_id=file_info.id,
name=file_info.name,
path=file_info.path,
mime_type=file_info.mimeType,
size_bytes=file_info.sizeBytes,
download_url=download_url,
preview=preview,
message=f"File: {file_info.name} ({file_info.sizeBytes} bytes). Use download_url to retrieve content.",
session_id=session_id,
)
except FileNotFoundError as e:
return ErrorResponse(
message=str(e),
session_id=session_id,
)
except Exception as e:
logger.error(f"Error reading workspace file: {e}", exc_info=True)
return ErrorResponse(
message=f"Failed to read workspace file: {str(e)}",
error=str(e),
session_id=session_id,
)
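A hedged standalone restatement of the inline-vs-download decision used above (thresholds copied from the class constants; the text-type check is simplified):
def returns_inline(size_bytes: int, mime_type: str, force_download_url: bool = False) -> bool:
    is_small = size_bytes <= 32 * 1024                              # MAX_INLINE_SIZE_BYTES
    is_text = mime_type.startswith(("text/", "application/json"))   # simplified check
    return is_small and is_text and not force_download_url

assert returns_inline(1_024, "text/plain") is True        # small text -> inline base64 content
assert returns_inline(1_024, "image/png") is False        # binary -> metadata + workspace:// URL
assert returns_inline(200_000, "text/plain") is False     # too large -> metadata + workspace:// URL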
class WriteWorkspaceFileTool(BaseTool):
"""Tool for writing files to workspace."""
@property
def name(self) -> str:
return "write_workspace_file"
@property
def description(self) -> str:
return (
"Write or create a file in the user's workspace. "
"Provide the content as a base64-encoded string. "
f"Maximum file size is {Config().max_file_size_mb}MB. "
"Files are saved to the current session's folder by default. "
"Use /sessions/<session_id>/... for cross-session access."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"filename": {
"type": "string",
"description": "Name for the file (e.g., 'report.pdf')",
},
"content_base64": {
"type": "string",
"description": "Base64-encoded file content",
},
"path": {
"type": "string",
"description": (
"Optional virtual path where to save the file "
"(e.g., '/documents/report.pdf'). "
"Defaults to '/{filename}'. Scoped to current session."
),
},
"mime_type": {
"type": "string",
"description": (
"Optional MIME type of the file. "
"Auto-detected from filename if not provided."
),
},
"overwrite": {
"type": "boolean",
"description": "Whether to overwrite if file exists at path (default: false)",
},
},
"required": ["filename", "content_base64"],
}
@property
def requires_auth(self) -> bool:
return True
async def _execute(
self,
user_id: str | None,
session: ChatSession,
**kwargs,
) -> ToolResponseBase:
session_id = session.session_id
if not user_id:
return ErrorResponse(
message="Authentication required",
session_id=session_id,
)
filename: str = kwargs.get("filename", "")
content_b64: str = kwargs.get("content_base64", "")
path: Optional[str] = kwargs.get("path")
mime_type: Optional[str] = kwargs.get("mime_type")
overwrite: bool = kwargs.get("overwrite", False)
if not filename:
return ErrorResponse(
message="Please provide a filename",
session_id=session_id,
)
if not content_b64:
return ErrorResponse(
message="Please provide content_base64",
session_id=session_id,
)
# Decode content
try:
content = base64.b64decode(content_b64)
except Exception:
return ErrorResponse(
message="Invalid base64-encoded content",
session_id=session_id,
)
# Check size
max_file_size = Config().max_file_size_mb * 1024 * 1024
if len(content) > max_file_size:
return ErrorResponse(
message=f"File too large. Maximum size is {Config().max_file_size_mb}MB",
session_id=session_id,
)
try:
# Virus scan
await scan_content_safe(content, filename=filename)
workspace = await get_or_create_workspace(user_id)
# Pass session_id for session-scoped file access
manager = WorkspaceManager(user_id, workspace.id, session_id)
file_record = await manager.write_file(
content=content,
filename=filename,
path=path,
mime_type=mime_type,
overwrite=overwrite,
)
return WorkspaceWriteResponse(
file_id=file_record.id,
name=file_record.name,
path=file_record.path,
size_bytes=file_record.sizeBytes,
message=f"Successfully wrote file: {file_record.name}",
session_id=session_id,
)
except ValueError as e:
return ErrorResponse(
message=str(e),
session_id=session_id,
)
except Exception as e:
logger.error(f"Error writing workspace file: {e}", exc_info=True)
return ErrorResponse(
message=f"Failed to write workspace file: {str(e)}",
error=str(e),
session_id=session_id,
)
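A hedged example of preparing arguments for the write tool above: content is base64-encoded text and must stay under Config().max_file_size_mb:
import base64
import json

write_args = {
    "filename": "notes.txt",
    "content_base64": base64.b64encode(b"hello workspace").decode("utf-8"),
    "path": "/documents/notes.txt",   # optional; defaults to '/{filename}'
    "overwrite": False,
}
print(json.dumps(write_args))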
class DeleteWorkspaceFileTool(BaseTool):
"""Tool for deleting files from workspace."""
@property
def name(self) -> str:
return "delete_workspace_file"
@property
def description(self) -> str:
return (
"Delete a file from the user's workspace. "
"Specify either file_id or path to identify the file. "
"Paths are scoped to the current session by default. "
"Use /sessions/<session_id>/... for cross-session access."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"file_id": {
"type": "string",
"description": "The file's unique ID (from list_workspace_files)",
},
"path": {
"type": "string",
"description": (
"The virtual file path (e.g., '/documents/report.pdf'). "
"Scoped to current session by default."
),
},
},
"required": [], # At least one must be provided
}
@property
def requires_auth(self) -> bool:
return True
async def _execute(
self,
user_id: str | None,
session: ChatSession,
**kwargs,
) -> ToolResponseBase:
session_id = session.session_id
if not user_id:
return ErrorResponse(
message="Authentication required",
session_id=session_id,
)
file_id: Optional[str] = kwargs.get("file_id")
path: Optional[str] = kwargs.get("path")
if not file_id and not path:
return ErrorResponse(
message="Please provide either file_id or path",
session_id=session_id,
)
try:
workspace = await get_or_create_workspace(user_id)
# Pass session_id for session-scoped file access
manager = WorkspaceManager(user_id, workspace.id, session_id)
# Determine the file_id to delete
target_file_id: str
if file_id:
target_file_id = file_id
else:
# path is guaranteed to be non-None here due to the check above
assert path is not None
file_info = await manager.get_file_info_by_path(path)
if file_info is None:
return ErrorResponse(
message=f"File not found at path: {path}",
session_id=session_id,
)
target_file_id = file_info.id
success = await manager.delete_file(target_file_id)
if not success:
return ErrorResponse(
message=f"File not found: {target_file_id}",
session_id=session_id,
)
return WorkspaceDeleteResponse(
file_id=target_file_id,
success=True,
message="File deleted successfully",
session_id=session_id,
)
except Exception as e:
logger.error(f"Error deleting workspace file: {e}", exc_info=True)
return ErrorResponse(
message=f"Failed to delete workspace file: {str(e)}",
error=str(e),
session_id=session_id,
)

View File

@@ -21,7 +21,7 @@ from backend.data.model import CredentialsMetaInput
from backend.integrations.creds_manager import IntegrationCredentialsManager
from backend.integrations.webhooks.graph_lifecycle_hooks import on_graph_activate
from backend.util.clients import get_scheduler_client
from backend.util.exceptions import DatabaseError, InvalidInputError, NotFoundError
from backend.util.exceptions import DatabaseError, NotFoundError
from backend.util.json import SafeJson
from backend.util.models import Pagination
from backend.util.settings import Config
@@ -64,11 +64,11 @@ async def list_library_agents(
if page < 1 or page_size < 1:
logger.warning(f"Invalid pagination: page={page}, page_size={page_size}")
raise InvalidInputError("Invalid pagination input")
raise DatabaseError("Invalid pagination input")
if search_term and len(search_term.strip()) > 100:
logger.warning(f"Search term too long: {repr(search_term)}")
raise InvalidInputError("Search term is too long")
raise DatabaseError("Search term is too long")
where_clause: prisma.types.LibraryAgentWhereInput = {
"userId": user_id,
@@ -175,7 +175,7 @@ async def list_favorite_library_agents(
if page < 1 or page_size < 1:
logger.warning(f"Invalid pagination: page={page}, page_size={page_size}")
raise InvalidInputError("Invalid pagination input")
raise DatabaseError("Invalid pagination input")
where_clause: prisma.types.LibraryAgentWhereInput = {
"userId": user_id,

View File

@@ -1,3 +1,4 @@
import logging
from typing import Literal, Optional
import autogpt_libs.auth as autogpt_auth_lib
@@ -5,11 +6,15 @@ from fastapi import APIRouter, Body, HTTPException, Query, Security, status
from fastapi.responses import Response
from prisma.enums import OnboardingStep
import backend.api.features.store.exceptions as store_exceptions
from backend.data.onboarding import complete_onboarding_step
from backend.util.exceptions import DatabaseError, NotFoundError
from .. import db as library_db
from .. import model as library_model
logger = logging.getLogger(__name__)
router = APIRouter(
prefix="/agents",
tags=["library", "private"],
@@ -21,6 +26,10 @@ router = APIRouter(
"",
summary="List Library Agents",
response_model=library_model.LibraryAgentResponse,
responses={
200: {"description": "List of library agents"},
500: {"description": "Server error", "content": {"application/json": {}}},
},
)
async def list_library_agents(
user_id: str = Security(autogpt_auth_lib.get_user_id),
@@ -44,19 +53,43 @@ async def list_library_agents(
) -> library_model.LibraryAgentResponse:
"""
Get all agents in the user's library (both created and saved).
Args:
user_id: ID of the authenticated user.
search_term: Optional search term to filter agents by name/description.
filter_by: List of filters to apply (favorites, created by user).
sort_by: List of sorting criteria (created date, updated date).
page: Page number to retrieve.
page_size: Number of agents per page.
Returns:
A LibraryAgentResponse containing agents and pagination metadata.
Raises:
HTTPException: If a server/database error occurs.
"""
return await library_db.list_library_agents(
user_id=user_id,
search_term=search_term,
sort_by=sort_by,
page=page,
page_size=page_size,
)
try:
return await library_db.list_library_agents(
user_id=user_id,
search_term=search_term,
sort_by=sort_by,
page=page,
page_size=page_size,
)
except Exception as e:
logger.error(f"Could not list library agents for user #{user_id}: {e}")
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail=str(e),
) from e
@router.get(
"/favorites",
summary="List Favorite Library Agents",
responses={
500: {"description": "Server error", "content": {"application/json": {}}},
},
)
async def list_favorite_library_agents(
user_id: str = Security(autogpt_auth_lib.get_user_id),
@@ -73,12 +106,30 @@ async def list_favorite_library_agents(
) -> library_model.LibraryAgentResponse:
"""
Get all favorite agents in the user's library.
Args:
user_id: ID of the authenticated user.
page: Page number to retrieve.
page_size: Number of agents per page.
Returns:
A LibraryAgentResponse containing favorite agents and pagination metadata.
Raises:
HTTPException: If a server/database error occurs.
"""
return await library_db.list_favorite_library_agents(
user_id=user_id,
page=page,
page_size=page_size,
)
try:
return await library_db.list_favorite_library_agents(
user_id=user_id,
page=page,
page_size=page_size,
)
except Exception as e:
logger.error(f"Could not list favorite library agents for user #{user_id}: {e}")
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail=str(e),
) from e
@router.get("/{library_agent_id}", summary="Get Library Agent")
@@ -111,6 +162,10 @@ async def get_library_agent_by_graph_id(
summary="Get Agent By Store ID",
tags=["store", "library"],
response_model=library_model.LibraryAgent | None,
responses={
200: {"description": "Library agent found"},
404: {"description": "Agent not found"},
},
)
async def get_library_agent_by_store_listing_version_id(
store_listing_version_id: str,
@@ -119,15 +174,32 @@ async def get_library_agent_by_store_listing_version_id(
"""
Get Library Agent from Store Listing Version ID.
"""
return await library_db.get_library_agent_by_store_version_id(
store_listing_version_id, user_id
)
try:
return await library_db.get_library_agent_by_store_version_id(
store_listing_version_id, user_id
)
except NotFoundError as e:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail=str(e),
)
except Exception as e:
logger.error(f"Could not fetch library agent from store version ID: {e}")
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail=str(e),
) from e
@router.post(
"",
summary="Add Marketplace Agent",
status_code=status.HTTP_201_CREATED,
responses={
201: {"description": "Agent added successfully"},
404: {"description": "Store listing version not found"},
500: {"description": "Server error"},
},
)
async def add_marketplace_agent_to_library(
store_listing_version_id: str = Body(embed=True),
@@ -138,19 +210,59 @@ async def add_marketplace_agent_to_library(
) -> library_model.LibraryAgent:
"""
Add an agent from the marketplace to the user's library.
Args:
store_listing_version_id: ID of the store listing version to add.
user_id: ID of the authenticated user.
Returns:
library_model.LibraryAgent: Agent added to the library
Raises:
HTTPException(404): If the listing version is not found.
HTTPException(500): If a server/database error occurs.
"""
agent = await library_db.add_store_agent_to_library(
store_listing_version_id=store_listing_version_id,
user_id=user_id,
)
if source != "onboarding":
await complete_onboarding_step(user_id, OnboardingStep.MARKETPLACE_ADD_AGENT)
return agent
try:
agent = await library_db.add_store_agent_to_library(
store_listing_version_id=store_listing_version_id,
user_id=user_id,
)
if source != "onboarding":
await complete_onboarding_step(
user_id, OnboardingStep.MARKETPLACE_ADD_AGENT
)
return agent
except store_exceptions.AgentNotFoundError as e:
logger.warning(
f"Could not find store listing version {store_listing_version_id} "
"to add to library"
)
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e))
except DatabaseError as e:
logger.error(f"Database error while adding agent to library: {e}", e)
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail={"message": str(e), "hint": "Inspect DB logs for details."},
) from e
except Exception as e:
logger.error(f"Unexpected error while adding agent to library: {e}")
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail={
"message": str(e),
"hint": "Check server logs for more information.",
},
) from e
@router.patch(
"/{library_agent_id}",
summary="Update Library Agent",
responses={
200: {"description": "Agent updated successfully"},
500: {"description": "Server error"},
},
)
async def update_library_agent(
library_agent_id: str,
@@ -159,21 +271,52 @@ async def update_library_agent(
) -> library_model.LibraryAgent:
"""
Update the library agent with the given fields.
Args:
library_agent_id: ID of the library agent to update.
payload: Fields to update (auto_update_version, is_favorite, etc.).
user_id: ID of the authenticated user.
Raises:
HTTPException(500): If a server/database error occurs.
"""
return await library_db.update_library_agent(
library_agent_id=library_agent_id,
user_id=user_id,
auto_update_version=payload.auto_update_version,
graph_version=payload.graph_version,
is_favorite=payload.is_favorite,
is_archived=payload.is_archived,
settings=payload.settings,
)
try:
return await library_db.update_library_agent(
library_agent_id=library_agent_id,
user_id=user_id,
auto_update_version=payload.auto_update_version,
graph_version=payload.graph_version,
is_favorite=payload.is_favorite,
is_archived=payload.is_archived,
settings=payload.settings,
)
except NotFoundError as e:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail=str(e),
) from e
except DatabaseError as e:
logger.error(f"Database error while updating library agent: {e}")
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail={"message": str(e), "hint": "Verify DB connection."},
) from e
except Exception as e:
logger.error(f"Unexpected error while updating library agent: {e}")
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail={"message": str(e), "hint": "Check server logs."},
) from e
@router.delete(
"/{library_agent_id}",
summary="Delete Library Agent",
responses={
204: {"description": "Agent deleted successfully"},
404: {"description": "Agent not found"},
500: {"description": "Server error"},
},
)
async def delete_library_agent(
library_agent_id: str,
@@ -181,11 +324,28 @@ async def delete_library_agent(
) -> Response:
"""
Soft-delete the specified library agent.
Args:
library_agent_id: ID of the library agent to delete.
user_id: ID of the authenticated user.
Returns:
204 No Content if successful.
Raises:
HTTPException(404): If the agent does not exist.
HTTPException(500): If a server/database error occurs.
"""
await library_db.delete_library_agent(
library_agent_id=library_agent_id, user_id=user_id
)
return Response(status_code=status.HTTP_204_NO_CONTENT)
try:
await library_db.delete_library_agent(
library_agent_id=library_agent_id, user_id=user_id
)
return Response(status_code=status.HTTP_204_NO_CONTENT)
except NotFoundError as e:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail=str(e),
) from e
@router.post("/{library_agent_id}/fork", summary="Fork Library Agent")

View File

@@ -118,6 +118,21 @@ async def test_get_library_agents_success(
)
def test_get_library_agents_error(mocker: pytest_mock.MockFixture, test_user_id: str):
mock_db_call = mocker.patch("backend.api.features.library.db.list_library_agents")
mock_db_call.side_effect = Exception("Test error")
response = client.get("/agents?search_term=test")
assert response.status_code == 500
mock_db_call.assert_called_once_with(
user_id=test_user_id,
search_term="test",
sort_by=library_model.LibraryAgentSort.UPDATED_AT,
page=1,
page_size=15,
)
@pytest.mark.asyncio
async def test_get_favorite_library_agents_success(
mocker: pytest_mock.MockFixture,
@@ -175,6 +190,23 @@ async def test_get_favorite_library_agents_success(
)
def test_get_favorite_library_agents_error(
mocker: pytest_mock.MockFixture, test_user_id: str
):
mock_db_call = mocker.patch(
"backend.api.features.library.db.list_favorite_library_agents"
)
mock_db_call.side_effect = Exception("Test error")
response = client.get("/agents/favorites")
assert response.status_code == 500
mock_db_call.assert_called_once_with(
user_id=test_user_id,
page=1,
page_size=15,
)
def test_add_agent_to_library_success(
mocker: pytest_mock.MockFixture, test_user_id: str
):
@@ -226,3 +258,19 @@ def test_add_agent_to_library_success(
store_listing_version_id="test-version-id", user_id=test_user_id
)
mock_complete_onboarding.assert_awaited_once()
def test_add_agent_to_library_error(mocker: pytest_mock.MockFixture, test_user_id: str):
mock_db_call = mocker.patch(
"backend.api.features.library.db.add_store_agent_to_library"
)
mock_db_call.side_effect = Exception("Test error")
response = client.post(
"/agents", json={"store_listing_version_id": "test-version-id"}
)
assert response.status_code == 500
assert "detail" in response.json() # Verify error response structure
mock_db_call.assert_called_once_with(
store_listing_version_id="test-version-id", user_id=test_user_id
)

View File

@@ -454,7 +454,6 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
total_processed = 0
total_success = 0
total_failed = 0
all_errors: dict[str, int] = {} # Aggregate errors across all content types
# Process content types in explicit order
processing_order = [
@@ -500,12 +499,23 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
success = sum(1 for result in results if result is True)
failed = len(results) - success
# Aggregate errors across all content types
# Aggregate unique errors to avoid Sentry spam
if failed > 0:
# Group errors by type and message
error_summary: dict[str, int] = {}
for result in results:
if isinstance(result, Exception):
error_key = f"{type(result).__name__}: {str(result)}"
all_errors[error_key] = all_errors.get(error_key, 0) + 1
error_summary[error_key] = error_summary.get(error_key, 0) + 1
# Log aggregated error summary
error_details = ", ".join(
f"{error} ({count}x)" for error, count in error_summary.items()
)
logger.error(
f"{content_type.value}: {failed}/{len(results)} embeddings failed. "
f"Errors: {error_details}"
)
results_by_type[content_type.value] = {
"processed": len(missing_items),
@@ -532,13 +542,6 @@ async def backfill_all_content_types(batch_size: int = 10) -> dict[str, Any]:
"error": str(e),
}
# Log aggregated errors once at the end
if all_errors:
error_details = ", ".join(
f"{error} ({count}x)" for error, count in all_errors.items()
)
logger.error(f"Embedding backfill errors: {error_details}")
return {
"by_type": results_by_type,
"totals": {

View File

@@ -393,6 +393,7 @@ async def get_creators(
@router.get(
"/creator/{username}",
summary="Get creator details",
operation_id="getV2GetCreatorDetails",
tags=["store", "public"],
response_model=store_model.CreatorDetails,
)

View File

@@ -261,36 +261,14 @@ async def get_onboarding_agents(
return await get_recommended_agents(user_id)
class OnboardingStatusResponse(pydantic.BaseModel):
"""Response for onboarding status check."""
is_onboarding_enabled: bool
is_chat_enabled: bool
@v1_router.get(
"/onboarding/enabled",
summary="Is onboarding enabled",
tags=["onboarding", "public"],
response_model=OnboardingStatusResponse,
dependencies=[Security(requires_user)],
)
async def is_onboarding_enabled(
user_id: Annotated[str, Security(get_user_id)],
) -> OnboardingStatusResponse:
# Check if chat is enabled for user
is_chat_enabled = await is_feature_enabled(Flag.CHAT, user_id, False)
# If chat is enabled, skip legacy onboarding
if is_chat_enabled:
return OnboardingStatusResponse(
is_onboarding_enabled=False,
is_chat_enabled=True,
)
return OnboardingStatusResponse(
is_onboarding_enabled=await onboarding_enabled(),
is_chat_enabled=False,
)
async def is_onboarding_enabled() -> bool:
return await onboarding_enabled()
@v1_router.post(

View File

@@ -1 +0,0 @@
# Workspace API feature module

View File

@@ -1,122 +0,0 @@
"""
Workspace API routes for managing user file storage.
"""
import logging
import re
from typing import Annotated
from urllib.parse import quote
import fastapi
from autogpt_libs.auth.dependencies import get_user_id, requires_user
from fastapi.responses import Response
from backend.data.workspace import get_workspace, get_workspace_file
from backend.util.workspace_storage import get_workspace_storage
def _sanitize_filename_for_header(filename: str) -> str:
"""
Sanitize filename for Content-Disposition header to prevent header injection.
Removes/replaces characters that could break the header or inject new headers.
Uses RFC5987 encoding for non-ASCII characters.
"""
# Remove CR, LF, and null bytes (header injection prevention)
sanitized = re.sub(r"[\r\n\x00]", "", filename)
# Escape quotes
sanitized = sanitized.replace('"', '\\"')
# For non-ASCII, use RFC5987 filename* parameter
# Check if filename has non-ASCII characters
try:
sanitized.encode("ascii")
return f'attachment; filename="{sanitized}"'
except UnicodeEncodeError:
# Use RFC5987 encoding for UTF-8 filenames
encoded = quote(sanitized, safe="")
return f"attachment; filename*=UTF-8''{encoded}"
logger = logging.getLogger(__name__)
router = fastapi.APIRouter(
dependencies=[fastapi.Security(requires_user)],
)
def _create_streaming_response(content: bytes, file) -> Response:
"""Create a streaming response for file content."""
return Response(
content=content,
media_type=file.mimeType,
headers={
"Content-Disposition": _sanitize_filename_for_header(file.name),
"Content-Length": str(len(content)),
},
)
async def _create_file_download_response(file) -> Response:
"""
Create a download response for a workspace file.
Handles both local storage (direct streaming) and GCS (signed URL redirect
with fallback to streaming).
"""
storage = await get_workspace_storage()
# For local storage, stream the file directly
if file.storagePath.startswith("local://"):
content = await storage.retrieve(file.storagePath)
return _create_streaming_response(content, file)
# For GCS, try to redirect to signed URL, fall back to streaming
try:
url = await storage.get_download_url(file.storagePath, expires_in=300)
# If we got back an API path (fallback), stream directly instead
if url.startswith("/api/"):
content = await storage.retrieve(file.storagePath)
return _create_streaming_response(content, file)
return fastapi.responses.RedirectResponse(url=url, status_code=302)
except Exception as e:
# Log the signed URL failure with context
logger.error(
f"Failed to get signed URL for file {file.id} "
f"(storagePath={file.storagePath}): {e}",
exc_info=True,
)
# Fall back to streaming directly from GCS
try:
content = await storage.retrieve(file.storagePath)
return _create_streaming_response(content, file)
except Exception as fallback_error:
logger.error(
f"Fallback streaming also failed for file {file.id} "
f"(storagePath={file.storagePath}): {fallback_error}",
exc_info=True,
)
raise
@router.get(
"/files/{file_id}/download",
summary="Download file by ID",
)
async def download_file(
user_id: Annotated[str, fastapi.Security(get_user_id)],
file_id: str,
) -> Response:
"""
Download a file by its ID.
Returns the file content directly or redirects to a signed URL for GCS.
"""
workspace = await get_workspace(user_id)
if workspace is None:
raise fastapi.HTTPException(status_code=404, detail="Workspace not found")
file = await get_workspace_file(file_id, workspace.id)
if file is None:
raise fastapi.HTTPException(status_code=404, detail="File not found")
return await _create_file_download_response(file)
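A hedged client-side sketch for the download route above (the base URL, path prefix, and auth header are assumptions, not taken from this diff):
import httpx

def download_workspace_file(base_url: str, token: str, file_id: str) -> bytes:
    # follow_redirects handles the 302 to a GCS signed URL described above.
    resp = httpx.get(
        f"{base_url}/files/{file_id}/download",
        headers={"Authorization": f"Bearer {token}"},  # header name is an assumption
        follow_redirects=True,
    )
    resp.raise_for_status()
    return resp.content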

View File

@@ -18,6 +18,7 @@ from prisma.errors import PrismaError
import backend.api.features.admin.credit_admin_routes
import backend.api.features.admin.execution_analytics_routes
import backend.api.features.admin.llm_routes
import backend.api.features.admin.store_admin_routes
import backend.api.features.builder
import backend.api.features.builder.routes
@@ -32,15 +33,16 @@ import backend.api.features.postmark.postmark
import backend.api.features.store.model
import backend.api.features.store.routes
import backend.api.features.v1
import backend.api.features.workspace.routes as workspace_routes
import backend.data.block
import backend.data.db
import backend.data.graph
import backend.data.user
import backend.integrations.webhooks.utils
import backend.server.v2.llm.routes as public_llm_routes
import backend.util.service
import backend.util.settings
from backend.blocks.llm import DEFAULT_LLM_MODEL
from backend.data import llm_registry
from backend.data.block_cost_config import refresh_llm_costs
from backend.data.model import Credentials
from backend.integrations.providers import ProviderName
from backend.monitoring.instrumentation import instrument_fastapi
@@ -53,7 +55,6 @@ from backend.util.exceptions import (
)
from backend.util.feature_flag import initialize_launchdarkly, shutdown_launchdarkly
from backend.util.service import UnhealthyServiceError
from backend.util.workspace_storage import shutdown_workspace_storage
from .external.fastapi_app import external_api
from .features.analytics import router as analytics_router
@@ -111,11 +112,27 @@ async def lifespan_context(app: fastapi.FastAPI):
AutoRegistry.patch_integrations()
# Refresh LLM registry before initializing blocks so blocks can use registry data
await llm_registry.refresh_llm_registry()
refresh_llm_costs()
# Clear block schema caches so they're regenerated with updated discriminator_mapping
from backend.data.block import BlockSchema
BlockSchema.clear_all_schema_caches()
await backend.data.block.initialize_blocks()
await backend.data.user.migrate_and_encrypt_user_integrations()
await backend.data.graph.fix_llm_provider_credentials()
await backend.data.graph.migrate_llm_models(DEFAULT_LLM_MODEL)
# migrate_llm_models uses registry default model
from backend.blocks.llm import LlmModel
default_model_slug = llm_registry.get_default_model_slug()
if default_model_slug:
await backend.data.graph.migrate_llm_models(LlmModel(default_model_slug))
else:
logger.warning("Skipping LLM model migration: no default model available")
await backend.integrations.webhooks.utils.migrate_legacy_triggered_graphs()
with launch_darkly_context():
@@ -126,11 +143,6 @@ async def lifespan_context(app: fastapi.FastAPI):
except Exception as e:
logger.warning(f"Error shutting down cloud storage handler: {e}")
try:
await shutdown_workspace_storage()
except Exception as e:
logger.warning(f"Error shutting down workspace storage: {e}")
await backend.data.db.disconnect()
@@ -305,6 +317,16 @@ app.include_router(
tags=["v2", "executions", "review"],
prefix="/api/review",
)
app.include_router(
backend.api.features.admin.llm_routes.router,
tags=["v2", "admin", "llm"],
prefix="/api/llm/admin",
)
app.include_router(
public_llm_routes.router,
tags=["v2", "llm"],
prefix="/api",
)
app.include_router(
backend.api.features.library.routes.router, tags=["v2"], prefix="/api/library"
)
@@ -322,11 +344,6 @@ app.include_router(
tags=["v2", "chat"],
prefix="/api/chat",
)
app.include_router(
workspace_routes.router,
tags=["workspace"],
prefix="/api/workspace",
)
app.include_router(
backend.api.features.oauth.router,
tags=["oauth"],

View File

@@ -77,7 +77,39 @@ async def event_broadcaster(manager: ConnectionManager):
payload=notification.payload,
)
await asyncio.gather(execution_worker(), notification_worker())
async def registry_refresh_worker():
"""Listen for LLM registry refresh notifications and broadcast to all clients."""
from backend.data.llm_registry import REGISTRY_REFRESH_CHANNEL
from backend.data.redis_client import connect_async
redis = await connect_async()
pubsub = redis.pubsub()
await pubsub.subscribe(REGISTRY_REFRESH_CHANNEL)
logger.info(
"Subscribed to LLM registry refresh notifications for WebSocket broadcast"
)
async for message in pubsub.listen():
if (
message["type"] == "message"
and message["channel"] == REGISTRY_REFRESH_CHANNEL
):
logger.info(
"Broadcasting LLM registry refresh to all WebSocket clients"
)
await manager.broadcast_to_all(
method=WSMethod.NOTIFICATION,
data={
"type": "LLM_REGISTRY_REFRESH",
"event": "registry_updated",
},
)
await asyncio.gather(
execution_worker(),
notification_worker(),
registry_refresh_worker(),
)
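A hedged counterpart sketch: publishing the refresh notification that the worker above listens for (the channel constant and redis helper are the ones imported above; the payload string is an assumption, since the worker only checks the channel name):
from backend.data.llm_registry import REGISTRY_REFRESH_CHANNEL
from backend.data.redis_client import connect_async

async def notify_registry_refresh() -> None:
    # Any subscriber of REGISTRY_REFRESH_CHANNEL (such as registry_refresh_worker
    # above) will rebroadcast the update to connected WebSocket clients.
    redis = await connect_async()
    await redis.publish(REGISTRY_REFRESH_CHANNEL, "registry_updated")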
async def authenticate_websocket(websocket: WebSocket) -> str:

View File

@@ -1,7 +1,6 @@
from typing import Any
from backend.blocks.llm import (
DEFAULT_LLM_MODEL,
TEST_CREDENTIALS,
TEST_CREDENTIALS_INPUT,
AIBlockBase,
@@ -10,6 +9,7 @@ from backend.blocks.llm import (
LlmModel,
LLMResponse,
llm_call,
llm_model_schema_extra,
)
from backend.data.block import (
BlockCategory,
@@ -50,9 +50,10 @@ class AIConditionBlock(AIBlockBase):
)
model: LlmModel = SchemaField(
title="LLM Model",
default=DEFAULT_LLM_MODEL,
default_factory=LlmModel.default,
description="The language model to use for evaluating the condition.",
advanced=False,
json_schema_extra=llm_model_schema_extra(),
)
credentials: AICredentials = AICredentialsField()
@@ -82,7 +83,7 @@ class AIConditionBlock(AIBlockBase):
"condition": "the input is an email address",
"yes_value": "Valid email",
"no_value": "Not an email",
"model": DEFAULT_LLM_MODEL,
"model": "gpt-4o", # Using string value - enum accepts any model slug dynamically
"credentials": TEST_CREDENTIALS_INPUT,
},
test_credentials=TEST_CREDENTIALS,

View File

@@ -13,7 +13,6 @@ from backend.data.block import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.execution import ExecutionContext
from backend.data.model import (
APIKeyCredentials,
CredentialsField,
@@ -118,13 +117,11 @@ class AIImageCustomizerBlock(Block):
"credentials": TEST_CREDENTIALS_INPUT,
},
test_output=[
# Output will be a workspace ref or data URI depending on context
("image_url", lambda x: x.startswith(("workspace://", "data:"))),
("image_url", "https://replicate.delivery/generated-image.jpg"),
],
test_mock={
# Use data URI to avoid HTTP requests during tests
"run_model": lambda *args, **kwargs: MediaFileType(
"data:image/jpeg;base64,/9j/4AAQSkZJRgABAgAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCAABAAEDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD3+iiigD//2Q=="
"https://replicate.delivery/generated-image.jpg"
),
},
test_credentials=TEST_CREDENTIALS,
@@ -135,7 +132,8 @@ class AIImageCustomizerBlock(Block):
input_data: Input,
*,
credentials: APIKeyCredentials,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
try:
@@ -143,9 +141,10 @@ class AIImageCustomizerBlock(Block):
processed_images = await asyncio.gather(
*(
store_media_file(
graph_exec_id=graph_exec_id,
file=img,
execution_context=execution_context,
return_format="for_external_api", # Get content for Replicate API
user_id=user_id,
return_content=True,
)
for img in input_data.images
)
@@ -159,14 +158,7 @@ class AIImageCustomizerBlock(Block):
aspect_ratio=input_data.aspect_ratio.value,
output_format=input_data.output_format.value,
)
# Store the generated image to the user's workspace for persistence
stored_url = await store_media_file(
file=result,
execution_context=execution_context,
return_format="for_block_output",
)
yield "image_url", stored_url
yield "image_url", result
except Exception as e:
yield "error", str(e)

View File

@@ -6,7 +6,6 @@ from replicate.client import Client as ReplicateClient
from replicate.helpers import FileOutput
from backend.data.block import Block, BlockCategory, BlockSchemaInput, BlockSchemaOutput
from backend.data.execution import ExecutionContext
from backend.data.model import (
APIKeyCredentials,
CredentialsField,
@@ -14,8 +13,6 @@ from backend.data.model import (
SchemaField,
)
from backend.integrations.providers import ProviderName
from backend.util.file import store_media_file
from backend.util.type import MediaFileType
class ImageSize(str, Enum):
@@ -168,13 +165,11 @@ class AIImageGeneratorBlock(Block):
test_output=[
(
"image_url",
# Test output is a data URI since we now store images
lambda x: x.startswith("data:image/"),
"https://replicate.delivery/generated-image.webp",
),
],
test_mock={
# Return a data URI directly so store_media_file doesn't need to download
"_run_client": lambda *args, **kwargs: "data:image/webp;base64,UklGRiQAAABXRUJQVlA4IBgAAAAwAQCdASoBAAEAAQAcJYgCdAEO"
"_run_client": lambda *args, **kwargs: "https://replicate.delivery/generated-image.webp"
},
)
@@ -323,24 +318,11 @@ class AIImageGeneratorBlock(Block):
style_text = style_map.get(style, "")
return f"{style_text} of" if style_text else ""
async def run(
self,
input_data: Input,
*,
credentials: APIKeyCredentials,
execution_context: ExecutionContext,
**kwargs,
):
async def run(self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs):
try:
url = await self.generate_image(input_data, credentials)
if url:
# Store the generated image to the user's workspace/execution folder
stored_url = await store_media_file(
file=MediaFileType(url),
execution_context=execution_context,
return_format="for_block_output",
)
yield "image_url", stored_url
yield "image_url", url
else:
yield "error", "Image generation returned an empty result."
except Exception as e:

View File

@@ -13,7 +13,6 @@ from backend.data.block import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.execution import ExecutionContext
from backend.data.model import (
APIKeyCredentials,
CredentialsField,
@@ -22,9 +21,7 @@ from backend.data.model import (
)
from backend.integrations.providers import ProviderName
from backend.util.exceptions import BlockExecutionError
from backend.util.file import store_media_file
from backend.util.request import Requests
from backend.util.type import MediaFileType
TEST_CREDENTIALS = APIKeyCredentials(
id="01234567-89ab-cdef-0123-456789abcdef",
@@ -274,10 +271,7 @@ class AIShortformVideoCreatorBlock(Block):
"voice": Voice.LILY,
"video_style": VisualMediaType.STOCK_VIDEOS,
},
test_output=(
"video_url",
lambda x: x.startswith(("workspace://", "data:")),
),
test_output=("video_url", "https://example.com/video.mp4"),
test_mock={
"create_webhook": lambda *args, **kwargs: (
"test_uuid",
@@ -286,21 +280,15 @@ class AIShortformVideoCreatorBlock(Block):
"create_video": lambda *args, **kwargs: {"pid": "test_pid"},
"check_video_status": lambda *args, **kwargs: {
"status": "ready",
"videoUrl": "data:video/mp4;base64,AAAA",
"videoUrl": "https://example.com/video.mp4",
},
# Use data URI to avoid HTTP requests during tests
"wait_for_video": lambda *args, **kwargs: "data:video/mp4;base64,AAAA",
"wait_for_video": lambda *args, **kwargs: "https://example.com/video.mp4",
},
test_credentials=TEST_CREDENTIALS,
)
async def run(
self,
input_data: Input,
*,
credentials: APIKeyCredentials,
execution_context: ExecutionContext,
**kwargs,
self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
) -> BlockOutput:
# Create a new Webhook.site URL
webhook_token, webhook_url = await self.create_webhook()
@@ -352,13 +340,7 @@ class AIShortformVideoCreatorBlock(Block):
)
video_url = await self.wait_for_video(credentials.api_key, pid)
logger.debug(f"Video ready: {video_url}")
# Store the generated video to the user's workspace for persistence
stored_url = await store_media_file(
file=MediaFileType(video_url),
execution_context=execution_context,
return_format="for_block_output",
)
yield "video_url", stored_url
yield "video_url", video_url
class AIAdMakerVideoCreatorBlock(Block):
@@ -465,10 +447,7 @@ class AIAdMakerVideoCreatorBlock(Block):
"https://cdn.revid.ai/uploads/1747076315114-image.png",
],
},
test_output=(
"video_url",
lambda x: x.startswith(("workspace://", "data:")),
),
test_output=("video_url", "https://example.com/ad.mp4"),
test_mock={
"create_webhook": lambda *args, **kwargs: (
"test_uuid",
@@ -477,21 +456,14 @@ class AIAdMakerVideoCreatorBlock(Block):
"create_video": lambda *args, **kwargs: {"pid": "test_pid"},
"check_video_status": lambda *args, **kwargs: {
"status": "ready",
"videoUrl": "data:video/mp4;base64,AAAA",
"videoUrl": "https://example.com/ad.mp4",
},
"wait_for_video": lambda *args, **kwargs: "data:video/mp4;base64,AAAA",
"wait_for_video": lambda *args, **kwargs: "https://example.com/ad.mp4",
},
test_credentials=TEST_CREDENTIALS,
)
async def run(
self,
input_data: Input,
*,
credentials: APIKeyCredentials,
execution_context: ExecutionContext,
**kwargs,
):
async def run(self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs):
webhook_token, webhook_url = await self.create_webhook()
payload = {
@@ -559,13 +531,7 @@ class AIAdMakerVideoCreatorBlock(Block):
raise RuntimeError("Failed to create video: No project ID returned")
video_url = await self.wait_for_video(credentials.api_key, pid)
# Store the generated video to the user's workspace for persistence
stored_url = await store_media_file(
file=MediaFileType(video_url),
execution_context=execution_context,
return_format="for_block_output",
)
yield "video_url", stored_url
yield "video_url", video_url
class AIScreenshotToVideoAdBlock(Block):
@@ -660,10 +626,7 @@ class AIScreenshotToVideoAdBlock(Block):
"script": "Amazing numbers!",
"screenshot_url": "https://cdn.revid.ai/uploads/1747080376028-image.png",
},
test_output=(
"video_url",
lambda x: x.startswith(("workspace://", "data:")),
),
test_output=("video_url", "https://example.com/screenshot.mp4"),
test_mock={
"create_webhook": lambda *args, **kwargs: (
"test_uuid",
@@ -672,21 +635,14 @@ class AIScreenshotToVideoAdBlock(Block):
"create_video": lambda *args, **kwargs: {"pid": "test_pid"},
"check_video_status": lambda *args, **kwargs: {
"status": "ready",
"videoUrl": "data:video/mp4;base64,AAAA",
"videoUrl": "https://example.com/screenshot.mp4",
},
"wait_for_video": lambda *args, **kwargs: "data:video/mp4;base64,AAAA",
"wait_for_video": lambda *args, **kwargs: "https://example.com/screenshot.mp4",
},
test_credentials=TEST_CREDENTIALS,
)
async def run(
self,
input_data: Input,
*,
credentials: APIKeyCredentials,
execution_context: ExecutionContext,
**kwargs,
):
async def run(self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs):
webhook_token, webhook_url = await self.create_webhook()
payload = {
@@ -754,10 +710,4 @@ class AIScreenshotToVideoAdBlock(Block):
raise RuntimeError("Failed to create video: No project ID returned")
video_url = await self.wait_for_video(credentials.api_key, pid)
# Store the generated video to the user's workspace for persistence
stored_url = await store_media_file(
file=MediaFileType(video_url),
execution_context=execution_context,
return_format="for_block_output",
)
yield "video_url", stored_url
yield "video_url", video_url

View File

@@ -6,7 +6,6 @@ if TYPE_CHECKING:
from pydantic import SecretStr
from backend.data.execution import ExecutionContext
from backend.sdk import (
APIKeyCredentials,
Block,
@@ -18,8 +17,6 @@ from backend.sdk import (
Requests,
SchemaField,
)
from backend.util.file import store_media_file
from backend.util.type import MediaFileType
from ._config import bannerbear
@@ -138,17 +135,15 @@ class BannerbearTextOverlayBlock(Block):
},
test_output=[
("success", True),
# Output will be a workspace ref or data URI depending on context
("image_url", lambda x: x.startswith(("workspace://", "data:"))),
("image_url", "https://cdn.bannerbear.com/test-image.jpg"),
("uid", "test-uid-123"),
("status", "completed"),
],
test_mock={
# Use data URI to avoid HTTP requests during tests
"_make_api_request": lambda *args, **kwargs: {
"uid": "test-uid-123",
"status": "completed",
"image_url": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/wAALCAABAAEBAREA/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/9oACAEBAAA/APn+v//Z",
"image_url": "https://cdn.bannerbear.com/test-image.jpg",
}
},
test_credentials=TEST_CREDENTIALS,
@@ -182,12 +177,7 @@ class BannerbearTextOverlayBlock(Block):
raise Exception(error_msg)
async def run(
self,
input_data: Input,
*,
credentials: APIKeyCredentials,
execution_context: ExecutionContext,
**kwargs,
self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
) -> BlockOutput:
# Build the modifications array
modifications = []
@@ -244,18 +234,6 @@ class BannerbearTextOverlayBlock(Block):
# Synchronous request - image should be ready
yield "success", True
# Store the generated image to workspace for persistence
image_url = data.get("image_url", "")
if image_url:
stored_url = await store_media_file(
file=MediaFileType(image_url),
execution_context=execution_context,
return_format="for_block_output",
)
yield "image_url", stored_url
else:
yield "image_url", ""
yield "image_url", data.get("image_url", "")
yield "uid", data.get("uid", "")
yield "status", data.get("status", "completed")

View File

@@ -9,7 +9,6 @@ from backend.data.block import (
BlockSchemaOutput,
BlockType,
)
from backend.data.execution import ExecutionContext
from backend.data.model import SchemaField
from backend.util.file import store_media_file
from backend.util.type import MediaFileType, convert
@@ -18,10 +17,10 @@ from backend.util.type import MediaFileType, convert
class FileStoreBlock(Block):
class Input(BlockSchemaInput):
file_in: MediaFileType = SchemaField(
description="The file to download and store. Can be a URL (https://...), data URI, or local path."
description="The file to store in the temporary directory, it can be a URL, data URI, or local path."
)
base_64: bool = SchemaField(
description="Whether to produce output in base64 format (not recommended, you can pass the file reference across blocks).",
description="Whether produce an output in base64 format (not recommended, you can pass the string path just fine accross blocks).",
default=False,
advanced=True,
title="Produce Base64 Output",
@@ -29,18 +28,13 @@ class FileStoreBlock(Block):
class Output(BlockSchemaOutput):
file_out: MediaFileType = SchemaField(
description="Reference to the stored file. In CoPilot: workspace:// URI (visible in list_workspace_files). In graphs: data URI for passing to other blocks."
description="The relative path to the stored file in the temporary directory."
)
def __init__(self):
super().__init__(
id="cbb50872-625b-42f0-8203-a2ae78242d8a",
description=(
"Downloads and stores a file from a URL, data URI, or local path. "
"Use this to fetch images, documents, or other files for processing. "
"In CoPilot: saves to workspace (use list_workspace_files to see it). "
"In graphs: outputs a data URI to pass to other blocks."
),
description="Stores the input file in the temporary directory.",
categories={BlockCategory.BASIC, BlockCategory.MULTIMEDIA},
input_schema=FileStoreBlock.Input,
output_schema=FileStoreBlock.Output,
@@ -51,18 +45,15 @@ class FileStoreBlock(Block):
self,
input_data: Input,
*,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
# Determine return format based on user preference
# for_external_api: always returns data URI (base64) - honors "Produce Base64 Output"
# for_block_output: smart format - workspace:// in CoPilot, data URI in graphs
return_format = "for_external_api" if input_data.base_64 else "for_block_output"
yield "file_out", await store_media_file(
graph_exec_id=graph_exec_id,
file=input_data.file_in,
execution_context=execution_context,
return_format=return_format,
user_id=user_id,
return_content=input_data.base_64,
)

View File

@@ -15,7 +15,6 @@ from backend.data.block import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.execution import ExecutionContext
from backend.data.model import APIKeyCredentials, SchemaField
from backend.util.file import store_media_file
from backend.util.request import Requests
@@ -667,7 +666,8 @@ class SendDiscordFileBlock(Block):
file: MediaFileType,
filename: str,
message_content: str,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
) -> dict:
intents = discord.Intents.default()
intents.guilds = True
@@ -731,9 +731,10 @@ class SendDiscordFileBlock(Block):
# Local file path - read from stored media file
# This would be a path from a previous block's output
stored_file = await store_media_file(
graph_exec_id=graph_exec_id,
file=file,
execution_context=execution_context,
return_format="for_external_api", # Get content to send to Discord
user_id=user_id,
return_content=True, # Get as data URI
)
# Now process as data URI
header, encoded = stored_file.split(",", 1)
@@ -780,7 +781,8 @@ class SendDiscordFileBlock(Block):
input_data: Input,
*,
credentials: APIKeyCredentials,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
try:
@@ -791,7 +793,8 @@ class SendDiscordFileBlock(Block):
file=input_data.file,
filename=input_data.filename,
message_content=input_data.message_content,
execution_context=execution_context,
graph_exec_id=graph_exec_id,
user_id=user_id,
)
yield "status", result.get("status", "Unknown error")

View File

@@ -17,11 +17,8 @@ from backend.data.block import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.execution import ExecutionContext
from backend.data.model import SchemaField
from backend.util.file import store_media_file
from backend.util.request import ClientResponseError, Requests
from backend.util.type import MediaFileType
logger = logging.getLogger(__name__)
@@ -67,13 +64,9 @@ class AIVideoGeneratorBlock(Block):
"credentials": TEST_CREDENTIALS_INPUT,
},
test_credentials=TEST_CREDENTIALS,
test_output=[
# Output will be a workspace ref or data URI depending on context
("video_url", lambda x: x.startswith(("workspace://", "data:"))),
],
test_output=[("video_url", "https://fal.media/files/example/video.mp4")],
test_mock={
# Use data URI to avoid HTTP requests during tests
"generate_video": lambda *args, **kwargs: "data:video/mp4;base64,AAAA"
"generate_video": lambda *args, **kwargs: "https://fal.media/files/example/video.mp4"
},
)
@@ -215,22 +208,11 @@ class AIVideoGeneratorBlock(Block):
raise RuntimeError(f"API request failed: {str(e)}")
async def run(
self,
input_data: Input,
*,
credentials: FalCredentials,
execution_context: ExecutionContext,
**kwargs,
self, input_data: Input, *, credentials: FalCredentials, **kwargs
) -> BlockOutput:
try:
video_url = await self.generate_video(input_data, credentials)
# Store the generated video to the user's workspace for persistence
stored_url = await store_media_file(
file=MediaFileType(video_url),
execution_context=execution_context,
return_format="for_block_output",
)
yield "video_url", stored_url
yield "video_url", video_url
except Exception as e:
error_message = str(e)
yield "error", error_message

View File

@@ -12,7 +12,6 @@ from backend.data.block import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.execution import ExecutionContext
from backend.data.model import (
APIKeyCredentials,
CredentialsField,
@@ -122,12 +121,10 @@ class AIImageEditorBlock(Block):
"credentials": TEST_CREDENTIALS_INPUT,
},
test_output=[
# Output will be a workspace ref or data URI depending on context
("output_image", lambda x: x.startswith(("workspace://", "data:"))),
("output_image", "https://replicate.com/output/edited-image.png"),
],
test_mock={
# Use data URI to avoid HTTP requests during tests
"run_model": lambda *args, **kwargs: "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==",
"run_model": lambda *args, **kwargs: "https://replicate.com/output/edited-image.png",
},
test_credentials=TEST_CREDENTIALS,
)
@@ -137,7 +134,8 @@ class AIImageEditorBlock(Block):
input_data: Input,
*,
credentials: APIKeyCredentials,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
result = await self.run_model(
@@ -146,25 +144,20 @@ class AIImageEditorBlock(Block):
prompt=input_data.prompt,
input_image_b64=(
await store_media_file(
graph_exec_id=graph_exec_id,
file=input_data.input_image,
execution_context=execution_context,
return_format="for_external_api", # Get content for Replicate API
user_id=user_id,
return_content=True,
)
if input_data.input_image
else None
),
aspect_ratio=input_data.aspect_ratio.value,
seed=input_data.seed,
user_id=execution_context.user_id or "",
graph_exec_id=execution_context.graph_exec_id or "",
user_id=user_id,
graph_exec_id=graph_exec_id,
)
# Store the generated image to the user's workspace for persistence
stored_url = await store_media_file(
file=result,
execution_context=execution_context,
return_format="for_block_output",
)
yield "output_image", stored_url
yield "output_image", result
async def run_model(
self,

View File

@@ -21,7 +21,6 @@ from backend.data.block import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.execution import ExecutionContext
from backend.data.model import SchemaField
from backend.util.file import MediaFileType, get_exec_file_path, store_media_file
from backend.util.settings import Settings
@@ -96,7 +95,8 @@ def _make_mime_text(
async def create_mime_message(
input_data,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
) -> str:
"""Create a MIME message with attachments and return base64-encoded raw message."""
@@ -117,12 +117,12 @@ async def create_mime_message(
if input_data.attachments:
for attach in input_data.attachments:
local_path = await store_media_file(
user_id=user_id,
graph_exec_id=graph_exec_id,
file=attach,
execution_context=execution_context,
return_format="for_local_processing",
return_content=False,
)
assert execution_context.graph_exec_id # Validated by store_media_file
abs_path = get_exec_file_path(execution_context.graph_exec_id, local_path)
abs_path = get_exec_file_path(graph_exec_id, local_path)
part = MIMEBase("application", "octet-stream")
with open(abs_path, "rb") as f:
part.set_payload(f.read())
@@ -582,25 +582,27 @@ class GmailSendBlock(GmailBase):
input_data: Input,
*,
credentials: GoogleCredentials,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
service = self._build_service(credentials, **kwargs)
result = await self._send_email(
service,
input_data,
execution_context,
graph_exec_id,
user_id,
)
yield "result", result
async def _send_email(
self, service, input_data: Input, execution_context: ExecutionContext
self, service, input_data: Input, graph_exec_id: str, user_id: str
) -> dict:
if not input_data.to or not input_data.subject or not input_data.body:
raise ValueError(
"At least one recipient, subject, and body are required for sending an email"
)
raw_message = await create_mime_message(input_data, execution_context)
raw_message = await create_mime_message(input_data, graph_exec_id, user_id)
sent_message = await asyncio.to_thread(
lambda: service.users()
.messages()
@@ -690,28 +692,30 @@ class GmailCreateDraftBlock(GmailBase):
input_data: Input,
*,
credentials: GoogleCredentials,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
service = self._build_service(credentials, **kwargs)
result = await self._create_draft(
service,
input_data,
execution_context,
graph_exec_id,
user_id,
)
yield "result", GmailDraftResult(
id=result["id"], message_id=result["message"]["id"], status="draft_created"
)
async def _create_draft(
self, service, input_data: Input, execution_context: ExecutionContext
self, service, input_data: Input, graph_exec_id: str, user_id: str
) -> dict:
if not input_data.to or not input_data.subject:
raise ValueError(
"At least one recipient and subject are required for creating a draft"
)
raw_message = await create_mime_message(input_data, execution_context)
raw_message = await create_mime_message(input_data, graph_exec_id, user_id)
draft = await asyncio.to_thread(
lambda: service.users()
.drafts()
@@ -1096,7 +1100,7 @@ class GmailGetThreadBlock(GmailBase):
async def _build_reply_message(
service, input_data, execution_context: ExecutionContext
service, input_data, graph_exec_id: str, user_id: str
) -> tuple[str, str]:
"""
Builds a reply MIME message for Gmail threads.
@@ -1186,12 +1190,12 @@ async def _build_reply_message(
# Handle attachments
for attach in input_data.attachments:
local_path = await store_media_file(
user_id=user_id,
graph_exec_id=graph_exec_id,
file=attach,
execution_context=execution_context,
return_format="for_local_processing",
return_content=False,
)
assert execution_context.graph_exec_id # Validated by store_media_file
abs_path = get_exec_file_path(execution_context.graph_exec_id, local_path)
abs_path = get_exec_file_path(graph_exec_id, local_path)
part = MIMEBase("application", "octet-stream")
with open(abs_path, "rb") as f:
part.set_payload(f.read())
@@ -1307,14 +1311,16 @@ class GmailReplyBlock(GmailBase):
input_data: Input,
*,
credentials: GoogleCredentials,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
service = self._build_service(credentials, **kwargs)
message = await self._reply(
service,
input_data,
execution_context,
graph_exec_id,
user_id,
)
yield "messageId", message["id"]
yield "threadId", message.get("threadId", input_data.threadId)
@@ -1337,11 +1343,11 @@ class GmailReplyBlock(GmailBase):
yield "email", email
async def _reply(
self, service, input_data: Input, execution_context: ExecutionContext
self, service, input_data: Input, graph_exec_id: str, user_id: str
) -> dict:
# Build the reply message using the shared helper
raw, thread_id = await _build_reply_message(
service, input_data, execution_context
service, input_data, graph_exec_id, user_id
)
# Send the message
@@ -1435,14 +1441,16 @@ class GmailDraftReplyBlock(GmailBase):
input_data: Input,
*,
credentials: GoogleCredentials,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
service = self._build_service(credentials, **kwargs)
draft = await self._create_draft_reply(
service,
input_data,
execution_context,
graph_exec_id,
user_id,
)
yield "draftId", draft["id"]
yield "messageId", draft["message"]["id"]
@@ -1450,11 +1458,11 @@ class GmailDraftReplyBlock(GmailBase):
yield "status", "draft_created"
async def _create_draft_reply(
self, service, input_data: Input, execution_context: ExecutionContext
self, service, input_data: Input, graph_exec_id: str, user_id: str
) -> dict:
# Build the reply message using the shared helper
raw, thread_id = await _build_reply_message(
service, input_data, execution_context
service, input_data, graph_exec_id, user_id
)
# Create draft with proper thread association
@@ -1621,21 +1629,23 @@ class GmailForwardBlock(GmailBase):
input_data: Input,
*,
credentials: GoogleCredentials,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
service = self._build_service(credentials, **kwargs)
result = await self._forward_message(
service,
input_data,
execution_context,
graph_exec_id,
user_id,
)
yield "messageId", result["id"]
yield "threadId", result.get("threadId", "")
yield "status", "forwarded"
async def _forward_message(
self, service, input_data: Input, execution_context: ExecutionContext
self, service, input_data: Input, graph_exec_id: str, user_id: str
) -> dict:
if not input_data.to:
raise ValueError("At least one recipient is required for forwarding")
@@ -1717,12 +1727,12 @@ To: {original_to}
# Add any additional attachments
for attach in input_data.additionalAttachments:
local_path = await store_media_file(
user_id=user_id,
graph_exec_id=graph_exec_id,
file=attach,
execution_context=execution_context,
return_format="for_local_processing",
return_content=False,
)
assert execution_context.graph_exec_id # Validated by store_media_file
abs_path = get_exec_file_path(execution_context.graph_exec_id, local_path)
abs_path = get_exec_file_path(graph_exec_id, local_path)
part = MIMEBase("application", "octet-stream")
with open(abs_path, "rb") as f:
part.set_payload(f.read())

View File

@@ -15,7 +15,6 @@ from backend.data.block import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.execution import ExecutionContext
from backend.data.model import (
CredentialsField,
CredentialsMetaInput,
@@ -117,9 +116,10 @@ class SendWebRequestBlock(Block):
@staticmethod
async def _prepare_files(
execution_context: ExecutionContext,
graph_exec_id: str,
files_name: str,
files: list[MediaFileType],
user_id: str,
) -> list[tuple[str, tuple[str, BytesIO, str]]]:
"""
Prepare files for the request by storing them and reading their content.
@@ -127,16 +127,11 @@ class SendWebRequestBlock(Block):
(files_name, (filename, BytesIO, mime_type))
"""
files_payload: list[tuple[str, tuple[str, BytesIO, str]]] = []
graph_exec_id = execution_context.graph_exec_id
if graph_exec_id is None:
raise ValueError("graph_exec_id is required for file operations")
for media in files:
# Normalise to a list so we can repeat the same key
rel_path = await store_media_file(
file=media,
execution_context=execution_context,
return_format="for_local_processing",
graph_exec_id, media, user_id, return_content=False
)
abs_path = get_exec_file_path(graph_exec_id, rel_path)
async with aiofiles.open(abs_path, "rb") as f:
@@ -148,7 +143,7 @@ class SendWebRequestBlock(Block):
return files_payload
async def run(
self, input_data: Input, *, execution_context: ExecutionContext, **kwargs
self, input_data: Input, *, graph_exec_id: str, user_id: str, **kwargs
) -> BlockOutput:
# ─── Parse/normalise body ────────────────────────────────────
body = input_data.body
@@ -179,7 +174,7 @@ class SendWebRequestBlock(Block):
files_payload: list[tuple[str, tuple[str, BytesIO, str]]] = []
if use_files:
files_payload = await self._prepare_files(
execution_context, input_data.files_name, input_data.files
graph_exec_id, input_data.files_name, input_data.files, user_id
)
# Enforce body format rules
@@ -243,8 +238,9 @@ class SendAuthenticatedWebRequestBlock(SendWebRequestBlock):
self,
input_data: Input,
*,
execution_context: ExecutionContext,
graph_exec_id: str,
credentials: HostScopedCredentials,
user_id: str,
**kwargs,
) -> BlockOutput:
# Create SendWebRequestBlock.Input from our input (removing credentials field)
@@ -275,6 +271,6 @@ class SendAuthenticatedWebRequestBlock(SendWebRequestBlock):
# Use parent class run method
async for output_name, output_data in super().run(
base_input, execution_context=execution_context, **kwargs
base_input, graph_exec_id=graph_exec_id, user_id=user_id, **kwargs
):
yield output_name, output_data

View File

@@ -12,7 +12,6 @@ from backend.data.block import (
BlockSchemaInput,
BlockType,
)
from backend.data.execution import ExecutionContext
from backend.data.model import SchemaField
from backend.util.file import store_media_file
from backend.util.mock import MockObject
@@ -463,21 +462,18 @@ class AgentFileInputBlock(AgentInputBlock):
self,
input_data: Input,
*,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
if not input_data.value:
return
# Determine return format based on user preference
# for_external_api: always returns data URI (base64) - honors "Produce Base64 Output"
# for_block_output: smart format - workspace:// in CoPilot, data URI in graphs
return_format = "for_external_api" if input_data.base_64 else "for_block_output"
yield "result", await store_media_file(
graph_exec_id=graph_exec_id,
file=input_data.value,
execution_context=execution_context,
return_format=return_format,
user_id=user_id,
return_content=input_data.base_64,
)

View File

@@ -4,17 +4,19 @@ import logging
import re
import secrets
from abc import ABC
from enum import Enum, EnumMeta
from enum import Enum
from json import JSONDecodeError
from typing import Any, Iterable, List, Literal, NamedTuple, Optional
from typing import Any, Iterable, List, Literal, Optional
import anthropic
import ollama
import openai
from anthropic.types import ToolParam
from groq import AsyncGroq
from pydantic import BaseModel, SecretStr
from pydantic import BaseModel, GetCoreSchemaHandler, SecretStr
from pydantic_core import CoreSchema, core_schema
from backend.data import llm_registry
from backend.data.block import (
Block,
BlockCategory,
@@ -22,6 +24,7 @@ from backend.data.block import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.llm_registry import ModelMetadata
from backend.data.model import (
APIKeyCredentials,
CredentialsField,
@@ -66,114 +69,123 @@ TEST_CREDENTIALS_INPUT = {
def AICredentialsField() -> AICredentials:
"""
Returns a CredentialsField for LLM providers.
The discriminator_mapping will be refreshed when the schema is generated
if it's empty, ensuring the LLM registry is loaded.
"""
# Get the mapping now - it may be empty initially, but will be refreshed
# when the schema is generated via CredentialsMetaInput._add_json_schema_extra
mapping = llm_registry.get_llm_discriminator_mapping()
return CredentialsField(
description="API key for the LLM provider.",
discriminator="model",
discriminator_mapping={
model.value: model.metadata.provider for model in LlmModel
},
discriminator_mapping=mapping, # May be empty initially, refreshed later
)
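For orientation, a sketch of what the mapping looks like once the registry is loaded (values are illustrative):
mapping = llm_registry.get_llm_discriminator_mapping()
# e.g. {"gpt-4o": "openai", "claude-haiku-4-5-20251001": "anthropic", ...}
# At import time this may be {}, in which case CredentialsMetaInput refreshes it
# when the JSON schema is generated, as described in the docstring above.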
class ModelMetadata(NamedTuple):
provider: str
context_window: int
max_output_tokens: int | None
display_name: str
provider_name: str
creator_name: str
price_tier: Literal[1, 2, 3]
def llm_model_schema_extra() -> dict[str, Any]:
return {"options": llm_registry.get_llm_model_schema_options()}
class LlmModelMeta(EnumMeta):
pass
class LlmModelMeta(type):
"""
Metaclass for LlmModel that enables attribute-style access to dynamic models.
This allows code like `LlmModel.GPT4O` to work by converting the attribute
name to a slug format:
- GPT4O -> gpt-4o
- GPT4O_MINI -> gpt-4o-mini
- CLAUDE_3_5_SONNET -> claude-3-5-sonnet
"""
def __getattr__(cls, name: str):
# Don't intercept private/dunder attributes
if name.startswith("_"):
raise AttributeError(f"type object 'LlmModel' has no attribute '{name}'")
# Convert attribute name to slug format:
# 1. Lowercase: GPT4O -> gpt4o
# 2. Underscores to hyphens: GPT4O_MINI -> gpt4o-mini
slug = name.lower().replace("_", "-")
# Check for exact match in registry first (e.g., "o1" stays "o1")
registry_slugs = llm_registry.get_dynamic_model_slugs()
if slug in registry_slugs:
return cls(slug)
# If no exact match, try inserting hyphen between letter and digit
# e.g., gpt4o -> gpt-4o
transformed_slug = re.sub(r"([a-z])(\d)", r"\1-\2", slug)
return cls(transformed_slug)
def __iter__(cls):
"""Iterate over all models from the registry.
Yields LlmModel instances for each model in the dynamic registry.
Used by __get_pydantic_json_schema__ to build model metadata.
"""
for model in llm_registry.iter_dynamic_models():
yield cls(model.slug)
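A quick illustration of the conversion rules above (a sketch, assuming the listed slugs exist in the registry):
LlmModel.GPT4O_MINI        # lowercase + underscores-to-hyphens + letter/digit hyphen -> "gpt-4o-mini"
LlmModel.CLAUDE_3_5_SONNET # -> "claude-3-5-sonnet"
LlmModel.O1                # "o1" is an exact registry match, so it is preserved as-is
LlmModel.GPT4O             # no exact match for "gpt4o", so it becomes "gpt-4o"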
class LlmModel(str, Enum, metaclass=LlmModelMeta):
# OpenAI models
O3_MINI = "o3-mini"
O3 = "o3-2025-04-16"
O1 = "o1"
O1_MINI = "o1-mini"
# GPT-5 models
GPT5_2 = "gpt-5.2-2025-12-11"
GPT5_1 = "gpt-5.1-2025-11-13"
GPT5 = "gpt-5-2025-08-07"
GPT5_MINI = "gpt-5-mini-2025-08-07"
GPT5_NANO = "gpt-5-nano-2025-08-07"
GPT5_CHAT = "gpt-5-chat-latest"
GPT41 = "gpt-4.1-2025-04-14"
GPT41_MINI = "gpt-4.1-mini-2025-04-14"
GPT4O_MINI = "gpt-4o-mini"
GPT4O = "gpt-4o"
GPT4_TURBO = "gpt-4-turbo"
GPT3_5_TURBO = "gpt-3.5-turbo"
# Anthropic models
CLAUDE_4_1_OPUS = "claude-opus-4-1-20250805"
CLAUDE_4_OPUS = "claude-opus-4-20250514"
CLAUDE_4_SONNET = "claude-sonnet-4-20250514"
CLAUDE_4_5_OPUS = "claude-opus-4-5-20251101"
CLAUDE_4_5_SONNET = "claude-sonnet-4-5-20250929"
CLAUDE_4_5_HAIKU = "claude-haiku-4-5-20251001"
CLAUDE_3_7_SONNET = "claude-3-7-sonnet-20250219"
CLAUDE_3_HAIKU = "claude-3-haiku-20240307"
# AI/ML API models
AIML_API_QWEN2_5_72B = "Qwen/Qwen2.5-72B-Instruct-Turbo"
AIML_API_LLAMA3_1_70B = "nvidia/llama-3.1-nemotron-70b-instruct"
AIML_API_LLAMA3_3_70B = "meta-llama/Llama-3.3-70B-Instruct-Turbo"
AIML_API_META_LLAMA_3_1_70B = "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo"
AIML_API_LLAMA_3_2_3B = "meta-llama/Llama-3.2-3B-Instruct-Turbo"
# Groq models
LLAMA3_3_70B = "llama-3.3-70b-versatile"
LLAMA3_1_8B = "llama-3.1-8b-instant"
# Ollama models
OLLAMA_LLAMA3_3 = "llama3.3"
OLLAMA_LLAMA3_2 = "llama3.2"
OLLAMA_LLAMA3_8B = "llama3"
OLLAMA_LLAMA3_405B = "llama3.1:405b"
OLLAMA_DOLPHIN = "dolphin-mistral:latest"
# OpenRouter models
OPENAI_GPT_OSS_120B = "openai/gpt-oss-120b"
OPENAI_GPT_OSS_20B = "openai/gpt-oss-20b"
GEMINI_2_5_PRO = "google/gemini-2.5-pro-preview-03-25"
GEMINI_3_PRO_PREVIEW = "google/gemini-3-pro-preview"
GEMINI_2_5_FLASH = "google/gemini-2.5-flash"
GEMINI_2_0_FLASH = "google/gemini-2.0-flash-001"
GEMINI_2_5_FLASH_LITE_PREVIEW = "google/gemini-2.5-flash-lite-preview-06-17"
GEMINI_2_0_FLASH_LITE = "google/gemini-2.0-flash-lite-001"
MISTRAL_NEMO = "mistralai/mistral-nemo"
COHERE_COMMAND_R_08_2024 = "cohere/command-r-08-2024"
COHERE_COMMAND_R_PLUS_08_2024 = "cohere/command-r-plus-08-2024"
DEEPSEEK_CHAT = "deepseek/deepseek-chat" # Actually: DeepSeek V3
DEEPSEEK_R1_0528 = "deepseek/deepseek-r1-0528"
PERPLEXITY_SONAR = "perplexity/sonar"
PERPLEXITY_SONAR_PRO = "perplexity/sonar-pro"
PERPLEXITY_SONAR_DEEP_RESEARCH = "perplexity/sonar-deep-research"
NOUSRESEARCH_HERMES_3_LLAMA_3_1_405B = "nousresearch/hermes-3-llama-3.1-405b"
NOUSRESEARCH_HERMES_3_LLAMA_3_1_70B = "nousresearch/hermes-3-llama-3.1-70b"
AMAZON_NOVA_LITE_V1 = "amazon/nova-lite-v1"
AMAZON_NOVA_MICRO_V1 = "amazon/nova-micro-v1"
AMAZON_NOVA_PRO_V1 = "amazon/nova-pro-v1"
MICROSOFT_WIZARDLM_2_8X22B = "microsoft/wizardlm-2-8x22b"
GRYPHE_MYTHOMAX_L2_13B = "gryphe/mythomax-l2-13b"
META_LLAMA_4_SCOUT = "meta-llama/llama-4-scout"
META_LLAMA_4_MAVERICK = "meta-llama/llama-4-maverick"
GROK_4 = "x-ai/grok-4"
GROK_4_FAST = "x-ai/grok-4-fast"
GROK_4_1_FAST = "x-ai/grok-4.1-fast"
GROK_CODE_FAST_1 = "x-ai/grok-code-fast-1"
KIMI_K2 = "moonshotai/kimi-k2"
QWEN3_235B_A22B_THINKING = "qwen/qwen3-235b-a22b-thinking-2507"
QWEN3_CODER = "qwen/qwen3-coder"
# Llama API models
LLAMA_API_LLAMA_4_SCOUT = "Llama-4-Scout-17B-16E-Instruct-FP8"
LLAMA_API_LLAMA4_MAVERICK = "Llama-4-Maverick-17B-128E-Instruct-FP8"
LLAMA_API_LLAMA3_3_8B = "Llama-3.3-8B-Instruct"
LLAMA_API_LLAMA3_3_70B = "Llama-3.3-70B-Instruct"
# v0 by Vercel models
V0_1_5_MD = "v0-1.5-md"
V0_1_5_LG = "v0-1.5-lg"
V0_1_0_MD = "v0-1.0-md"
class LlmModel(str, metaclass=LlmModelMeta):
"""
Dynamic LLM model type that accepts any model slug from the registry.
This is a string subclass (not an Enum) that allows any model slug value.
All models are managed via the LLM Registry in the database.
Usage:
model = LlmModel("gpt-4o") # Direct construction
model = LlmModel.GPT4O # Attribute access (converted to "gpt-4o")
model.value # Returns the slug string
model.provider # Returns the provider from registry
"""
def __new__(cls, value: str):
if isinstance(value, LlmModel):
return value
return str.__new__(cls, value)
@classmethod
def __get_pydantic_core_schema__(
cls, source_type: Any, handler: GetCoreSchemaHandler
) -> CoreSchema:
"""
Tell Pydantic how to validate LlmModel.
Accepts strings and converts them to LlmModel instances.
"""
return core_schema.no_info_after_validator_function(
cls, # The validator function (LlmModel constructor)
core_schema.str_schema(), # Accept string input
serialization=core_schema.to_string_ser_schema(), # Serialize as string
)
@property
def value(self) -> str:
"""Return the model slug (for compatibility with enum-style access)."""
return str(self)
@classmethod
def default(cls) -> "LlmModel":
"""
Get the default model from the registry.
Returns the recommended model if set, otherwise gpt-4o if available
and enabled, otherwise the first enabled model from the registry.
Falls back to "gpt-4o" if registry is empty (e.g., at module import time).
"""
from backend.data.llm_registry import get_default_model_slug
slug = get_default_model_slug()
if slug is None:
# Registry is empty (e.g., at module import time before DB connection).
# Fall back to gpt-4o for backward compatibility.
slug = "gpt-4o"
return cls(slug)
@classmethod
def __get_pydantic_json_schema__(cls, schema, handler):
@@ -181,7 +193,15 @@ class LlmModel(str, Enum, metaclass=LlmModelMeta):
llm_model_metadata = {}
for model in cls:
model_name = model.value
metadata = model.metadata
# Skip disabled models - only show enabled models in the picker
if not llm_registry.is_model_enabled(model_name):
continue
# Use registry directly with None check to gracefully handle
# missing metadata during startup/import before registry is populated
metadata = llm_registry.get_llm_model_metadata(model_name)
if metadata is None:
# Skip models without metadata (registry not yet populated)
continue
llm_model_metadata[model_name] = {
"creator": metadata.creator_name,
"creator_name": metadata.creator_name,
@@ -197,7 +217,12 @@ class LlmModel(str, Enum, metaclass=LlmModelMeta):
@property
def metadata(self) -> ModelMetadata:
return MODEL_METADATA[self]
metadata = llm_registry.get_llm_model_metadata(self.value)
if metadata:
return metadata
raise ValueError(
f"Missing metadata for model: {self.value}. Model not found in LLM registry."
)
@property
def provider(self) -> str:
@@ -212,300 +237,11 @@ class LlmModel(str, Enum, metaclass=LlmModelMeta):
return self.metadata.max_output_tokens
MODEL_METADATA = {
# https://platform.openai.com/docs/models
LlmModel.O3: ModelMetadata("openai", 200000, 100000, "O3", "OpenAI", "OpenAI", 2),
LlmModel.O3_MINI: ModelMetadata(
"openai", 200000, 100000, "O3 Mini", "OpenAI", "OpenAI", 1
), # o3-mini-2025-01-31
LlmModel.O1: ModelMetadata(
"openai", 200000, 100000, "O1", "OpenAI", "OpenAI", 3
), # o1-2024-12-17
LlmModel.O1_MINI: ModelMetadata(
"openai", 128000, 65536, "O1 Mini", "OpenAI", "OpenAI", 2
), # o1-mini-2024-09-12
# GPT-5 models
LlmModel.GPT5_2: ModelMetadata(
"openai", 400000, 128000, "GPT-5.2", "OpenAI", "OpenAI", 3
),
LlmModel.GPT5_1: ModelMetadata(
"openai", 400000, 128000, "GPT-5.1", "OpenAI", "OpenAI", 2
),
LlmModel.GPT5: ModelMetadata(
"openai", 400000, 128000, "GPT-5", "OpenAI", "OpenAI", 1
),
LlmModel.GPT5_MINI: ModelMetadata(
"openai", 400000, 128000, "GPT-5 Mini", "OpenAI", "OpenAI", 1
),
LlmModel.GPT5_NANO: ModelMetadata(
"openai", 400000, 128000, "GPT-5 Nano", "OpenAI", "OpenAI", 1
),
LlmModel.GPT5_CHAT: ModelMetadata(
"openai", 400000, 16384, "GPT-5 Chat Latest", "OpenAI", "OpenAI", 2
),
LlmModel.GPT41: ModelMetadata(
"openai", 1047576, 32768, "GPT-4.1", "OpenAI", "OpenAI", 1
),
LlmModel.GPT41_MINI: ModelMetadata(
"openai", 1047576, 32768, "GPT-4.1 Mini", "OpenAI", "OpenAI", 1
),
LlmModel.GPT4O_MINI: ModelMetadata(
"openai", 128000, 16384, "GPT-4o Mini", "OpenAI", "OpenAI", 1
), # gpt-4o-mini-2024-07-18
LlmModel.GPT4O: ModelMetadata(
"openai", 128000, 16384, "GPT-4o", "OpenAI", "OpenAI", 2
), # gpt-4o-2024-08-06
LlmModel.GPT4_TURBO: ModelMetadata(
"openai", 128000, 4096, "GPT-4 Turbo", "OpenAI", "OpenAI", 3
), # gpt-4-turbo-2024-04-09
LlmModel.GPT3_5_TURBO: ModelMetadata(
"openai", 16385, 4096, "GPT-3.5 Turbo", "OpenAI", "OpenAI", 1
), # gpt-3.5-turbo-0125
# https://docs.anthropic.com/en/docs/about-claude/models
LlmModel.CLAUDE_4_1_OPUS: ModelMetadata(
"anthropic", 200000, 32000, "Claude Opus 4.1", "Anthropic", "Anthropic", 3
), # claude-opus-4-1-20250805
LlmModel.CLAUDE_4_OPUS: ModelMetadata(
"anthropic", 200000, 32000, "Claude Opus 4", "Anthropic", "Anthropic", 3
), # claude-4-opus-20250514
LlmModel.CLAUDE_4_SONNET: ModelMetadata(
"anthropic", 200000, 64000, "Claude Sonnet 4", "Anthropic", "Anthropic", 2
), # claude-4-sonnet-20250514
LlmModel.CLAUDE_4_5_OPUS: ModelMetadata(
"anthropic", 200000, 64000, "Claude Opus 4.5", "Anthropic", "Anthropic", 3
), # claude-opus-4-5-20251101
LlmModel.CLAUDE_4_5_SONNET: ModelMetadata(
"anthropic", 200000, 64000, "Claude Sonnet 4.5", "Anthropic", "Anthropic", 3
), # claude-sonnet-4-5-20250929
LlmModel.CLAUDE_4_5_HAIKU: ModelMetadata(
"anthropic", 200000, 64000, "Claude Haiku 4.5", "Anthropic", "Anthropic", 2
), # claude-haiku-4-5-20251001
LlmModel.CLAUDE_3_7_SONNET: ModelMetadata(
"anthropic", 200000, 64000, "Claude 3.7 Sonnet", "Anthropic", "Anthropic", 2
), # claude-3-7-sonnet-20250219
LlmModel.CLAUDE_3_HAIKU: ModelMetadata(
"anthropic", 200000, 4096, "Claude 3 Haiku", "Anthropic", "Anthropic", 1
), # claude-3-haiku-20240307
# https://docs.aimlapi.com/api-overview/model-database/text-models
LlmModel.AIML_API_QWEN2_5_72B: ModelMetadata(
"aiml_api", 32000, 8000, "Qwen 2.5 72B Instruct Turbo", "AI/ML", "Qwen", 1
),
LlmModel.AIML_API_LLAMA3_1_70B: ModelMetadata(
"aiml_api",
128000,
40000,
"Llama 3.1 Nemotron 70B Instruct",
"AI/ML",
"Nvidia",
1,
),
LlmModel.AIML_API_LLAMA3_3_70B: ModelMetadata(
"aiml_api", 128000, None, "Llama 3.3 70B Instruct Turbo", "AI/ML", "Meta", 1
),
LlmModel.AIML_API_META_LLAMA_3_1_70B: ModelMetadata(
"aiml_api", 131000, 2000, "Llama 3.1 70B Instruct Turbo", "AI/ML", "Meta", 1
),
LlmModel.AIML_API_LLAMA_3_2_3B: ModelMetadata(
"aiml_api", 128000, None, "Llama 3.2 3B Instruct Turbo", "AI/ML", "Meta", 1
),
# https://console.groq.com/docs/models
LlmModel.LLAMA3_3_70B: ModelMetadata(
"groq", 128000, 32768, "Llama 3.3 70B Versatile", "Groq", "Meta", 1
),
LlmModel.LLAMA3_1_8B: ModelMetadata(
"groq", 128000, 8192, "Llama 3.1 8B Instant", "Groq", "Meta", 1
),
# https://ollama.com/library
LlmModel.OLLAMA_LLAMA3_3: ModelMetadata(
"ollama", 8192, None, "Llama 3.3", "Ollama", "Meta", 1
),
LlmModel.OLLAMA_LLAMA3_2: ModelMetadata(
"ollama", 8192, None, "Llama 3.2", "Ollama", "Meta", 1
),
LlmModel.OLLAMA_LLAMA3_8B: ModelMetadata(
"ollama", 8192, None, "Llama 3", "Ollama", "Meta", 1
),
LlmModel.OLLAMA_LLAMA3_405B: ModelMetadata(
"ollama", 8192, None, "Llama 3.1 405B", "Ollama", "Meta", 1
),
LlmModel.OLLAMA_DOLPHIN: ModelMetadata(
"ollama", 32768, None, "Dolphin Mistral Latest", "Ollama", "Mistral AI", 1
),
# https://openrouter.ai/models
LlmModel.GEMINI_2_5_PRO: ModelMetadata(
"open_router",
1050000,
8192,
"Gemini 2.5 Pro Preview 03.25",
"OpenRouter",
"Google",
2,
),
LlmModel.GEMINI_3_PRO_PREVIEW: ModelMetadata(
"open_router", 1048576, 65535, "Gemini 3 Pro Preview", "OpenRouter", "Google", 2
),
LlmModel.GEMINI_2_5_FLASH: ModelMetadata(
"open_router", 1048576, 65535, "Gemini 2.5 Flash", "OpenRouter", "Google", 1
),
LlmModel.GEMINI_2_0_FLASH: ModelMetadata(
"open_router", 1048576, 8192, "Gemini 2.0 Flash 001", "OpenRouter", "Google", 1
),
LlmModel.GEMINI_2_5_FLASH_LITE_PREVIEW: ModelMetadata(
"open_router",
1048576,
65535,
"Gemini 2.5 Flash Lite Preview 06.17",
"OpenRouter",
"Google",
1,
),
LlmModel.GEMINI_2_0_FLASH_LITE: ModelMetadata(
"open_router",
1048576,
8192,
"Gemini 2.0 Flash Lite 001",
"OpenRouter",
"Google",
1,
),
LlmModel.MISTRAL_NEMO: ModelMetadata(
"open_router", 128000, 4096, "Mistral Nemo", "OpenRouter", "Mistral AI", 1
),
LlmModel.COHERE_COMMAND_R_08_2024: ModelMetadata(
"open_router", 128000, 4096, "Command R 08.2024", "OpenRouter", "Cohere", 1
),
LlmModel.COHERE_COMMAND_R_PLUS_08_2024: ModelMetadata(
"open_router", 128000, 4096, "Command R Plus 08.2024", "OpenRouter", "Cohere", 2
),
LlmModel.DEEPSEEK_CHAT: ModelMetadata(
"open_router", 64000, 2048, "DeepSeek Chat", "OpenRouter", "DeepSeek", 1
),
LlmModel.DEEPSEEK_R1_0528: ModelMetadata(
"open_router", 163840, 163840, "DeepSeek R1 0528", "OpenRouter", "DeepSeek", 1
),
LlmModel.PERPLEXITY_SONAR: ModelMetadata(
"open_router", 127000, 8000, "Sonar", "OpenRouter", "Perplexity", 1
),
LlmModel.PERPLEXITY_SONAR_PRO: ModelMetadata(
"open_router", 200000, 8000, "Sonar Pro", "OpenRouter", "Perplexity", 2
),
LlmModel.PERPLEXITY_SONAR_DEEP_RESEARCH: ModelMetadata(
"open_router",
128000,
16000,
"Sonar Deep Research",
"OpenRouter",
"Perplexity",
3,
),
LlmModel.NOUSRESEARCH_HERMES_3_LLAMA_3_1_405B: ModelMetadata(
"open_router",
131000,
4096,
"Hermes 3 Llama 3.1 405B",
"OpenRouter",
"Nous Research",
1,
),
LlmModel.NOUSRESEARCH_HERMES_3_LLAMA_3_1_70B: ModelMetadata(
"open_router",
12288,
12288,
"Hermes 3 Llama 3.1 70B",
"OpenRouter",
"Nous Research",
1,
),
LlmModel.OPENAI_GPT_OSS_120B: ModelMetadata(
"open_router", 131072, 131072, "GPT-OSS 120B", "OpenRouter", "OpenAI", 1
),
LlmModel.OPENAI_GPT_OSS_20B: ModelMetadata(
"open_router", 131072, 32768, "GPT-OSS 20B", "OpenRouter", "OpenAI", 1
),
LlmModel.AMAZON_NOVA_LITE_V1: ModelMetadata(
"open_router", 300000, 5120, "Nova Lite V1", "OpenRouter", "Amazon", 1
),
LlmModel.AMAZON_NOVA_MICRO_V1: ModelMetadata(
"open_router", 128000, 5120, "Nova Micro V1", "OpenRouter", "Amazon", 1
),
LlmModel.AMAZON_NOVA_PRO_V1: ModelMetadata(
"open_router", 300000, 5120, "Nova Pro V1", "OpenRouter", "Amazon", 1
),
LlmModel.MICROSOFT_WIZARDLM_2_8X22B: ModelMetadata(
"open_router", 65536, 4096, "WizardLM 2 8x22B", "OpenRouter", "Microsoft", 1
),
LlmModel.GRYPHE_MYTHOMAX_L2_13B: ModelMetadata(
"open_router", 4096, 4096, "MythoMax L2 13B", "OpenRouter", "Gryphe", 1
),
LlmModel.META_LLAMA_4_SCOUT: ModelMetadata(
"open_router", 131072, 131072, "Llama 4 Scout", "OpenRouter", "Meta", 1
),
LlmModel.META_LLAMA_4_MAVERICK: ModelMetadata(
"open_router", 1048576, 1000000, "Llama 4 Maverick", "OpenRouter", "Meta", 1
),
LlmModel.GROK_4: ModelMetadata(
"open_router", 256000, 256000, "Grok 4", "OpenRouter", "xAI", 3
),
LlmModel.GROK_4_FAST: ModelMetadata(
"open_router", 2000000, 30000, "Grok 4 Fast", "OpenRouter", "xAI", 1
),
LlmModel.GROK_4_1_FAST: ModelMetadata(
"open_router", 2000000, 30000, "Grok 4.1 Fast", "OpenRouter", "xAI", 1
),
LlmModel.GROK_CODE_FAST_1: ModelMetadata(
"open_router", 256000, 10000, "Grok Code Fast 1", "OpenRouter", "xAI", 1
),
LlmModel.KIMI_K2: ModelMetadata(
"open_router", 131000, 131000, "Kimi K2", "OpenRouter", "Moonshot AI", 1
),
LlmModel.QWEN3_235B_A22B_THINKING: ModelMetadata(
"open_router",
262144,
262144,
"Qwen 3 235B A22B Thinking 2507",
"OpenRouter",
"Qwen",
1,
),
LlmModel.QWEN3_CODER: ModelMetadata(
"open_router", 262144, 262144, "Qwen 3 Coder", "OpenRouter", "Qwen", 3
),
# Llama API models
LlmModel.LLAMA_API_LLAMA_4_SCOUT: ModelMetadata(
"llama_api",
128000,
4028,
"Llama 4 Scout 17B 16E Instruct FP8",
"Llama API",
"Meta",
1,
),
LlmModel.LLAMA_API_LLAMA4_MAVERICK: ModelMetadata(
"llama_api",
128000,
4028,
"Llama 4 Maverick 17B 128E Instruct FP8",
"Llama API",
"Meta",
1,
),
LlmModel.LLAMA_API_LLAMA3_3_8B: ModelMetadata(
"llama_api", 128000, 4028, "Llama 3.3 8B Instruct", "Llama API", "Meta", 1
),
LlmModel.LLAMA_API_LLAMA3_3_70B: ModelMetadata(
"llama_api", 128000, 4028, "Llama 3.3 70B Instruct", "Llama API", "Meta", 1
),
# v0 by Vercel models
LlmModel.V0_1_5_MD: ModelMetadata("v0", 128000, 64000, "v0 1.5 MD", "V0", "V0", 1),
LlmModel.V0_1_5_LG: ModelMetadata("v0", 512000, 64000, "v0 1.5 LG", "V0", "V0", 1),
LlmModel.V0_1_0_MD: ModelMetadata("v0", 128000, 64000, "v0 1.0 MD", "V0", "V0", 1),
}
# MODEL_METADATA removed - all models now come from the database via llm_registry
DEFAULT_LLM_MODEL = LlmModel.GPT5_2
for model in LlmModel:
if model not in MODEL_METADATA:
raise ValueError(f"Missing MODEL_METADATA metadata for model: {model}")
# Default model constant for backward compatibility
# Uses the dynamic registry to get the default model
DEFAULT_LLM_MODEL = LlmModel.default()
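A short usage sketch of the dynamic model type (illustrative; assumes the registry has been loaded and contains a "gpt-4o" entry):
model = LlmModel("gpt-4o")        # direct construction from a slug
model.value                       # "gpt-4o" (str subclass; enum-style .value is kept)
model.metadata.provider           # looked up from the registry at access time
LlmModel.default()                # recommended/enabled model, falling back to "gpt-4o"
# DEFAULT_LLM_MODEL above is simply LlmModel.default() evaluated at import time.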
class ToolCall(BaseModel):
@@ -598,7 +334,10 @@ def get_parallel_tool_calls_param(
llm_model: LlmModel, parallel_tool_calls: bool | None
):
"""Get the appropriate parallel_tool_calls parameter for OpenAI-compatible APIs."""
if llm_model.startswith("o") or parallel_tool_calls is None:
# Check for o-series models (o1, o1-mini, o3-mini, etc.) which don't support
# parallel tool calls. Use regex to avoid false positives like "openai/gpt-oss".
is_o_series = re.match(r"^o\d", llm_model) is not None
if is_o_series or parallel_tool_calls is None:
return openai.NOT_GIVEN
return parallel_tool_calls
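A few illustrative cases for the regex guard above (sketch; the model slugs are examples only):
import re

def is_o_series(slug: str) -> bool:
    return re.match(r"^o\d", slug) is not None

is_o_series("o1")                   # True  - o-series, parallel tool calls unsupported
is_o_series("o3-mini")              # True
is_o_series("openai/gpt-oss-120b")  # False - the false positive a startswith("o") check would hit
is_o_series("gpt-4o")               # False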
@@ -634,19 +373,98 @@ async def llm_call(
- prompt_tokens: The number of tokens used in the prompt.
- completion_tokens: The number of tokens used in the completion.
"""
provider = llm_model.metadata.provider
context_window = llm_model.context_window
# Get model metadata and check if enabled - with fallback support
# The model we'll actually use (may differ if original is disabled)
model_to_use = llm_model.value
# Check if model is in registry and if it's enabled
from backend.data.llm_registry import (
get_fallback_model_for_disabled,
get_model_info,
)
model_info = get_model_info(llm_model.value)
if model_info and not model_info.is_enabled:
# Model is disabled - try to find a fallback from the same provider
fallback = get_fallback_model_for_disabled(llm_model.value)
if fallback:
logger.warning(
f"Model '{llm_model.value}' is disabled. Using fallback model '{fallback.slug}' from the same provider ({fallback.metadata.provider})."
)
model_to_use = fallback.slug
# Use fallback model's metadata
provider = fallback.metadata.provider
context_window = fallback.metadata.context_window
model_max_output = fallback.metadata.max_output_tokens or int(2**15)
else:
# No fallback available - raise error
raise ValueError(
f"LLM model '{llm_model.value}' is disabled and no fallback model "
f"from the same provider is available. Please enable the model or "
f"select a different model in the block configuration."
)
else:
# Model is enabled or not in registry (legacy/static model)
try:
provider = llm_model.metadata.provider
context_window = llm_model.context_window
model_max_output = llm_model.max_output_tokens or int(2**15)
except ValueError:
# Model not in cache - try refreshing the registry once if we have DB access
logger.warning(f"Model {llm_model.value} not found in registry cache")
# Try refreshing the registry if we have database access
from backend.data.db import is_connected
if is_connected():
try:
logger.info(
f"Refreshing LLM registry and retrying lookup for {llm_model.value}"
)
await llm_registry.refresh_llm_registry()
# Try again after refresh
try:
provider = llm_model.metadata.provider
context_window = llm_model.context_window
model_max_output = llm_model.max_output_tokens or int(2**15)
logger.info(
f"Successfully loaded model {llm_model.value} metadata after registry refresh"
)
except ValueError:
# Still not found after refresh
raise ValueError(
f"LLM model '{llm_model.value}' not found in registry after refresh. "
"Please ensure the model is added and enabled in the LLM registry via the admin UI."
)
except Exception as refresh_exc:
logger.error(f"Failed to refresh LLM registry: {refresh_exc}")
raise ValueError(
f"LLM model '{llm_model.value}' not found in registry and failed to refresh. "
"Please ensure the model is added to the LLM registry via the admin UI."
) from refresh_exc
else:
# No DB access (e.g., in executor without direct DB connection)
# The registry should have been loaded on startup
raise ValueError(
f"LLM model '{llm_model.value}' not found in registry cache. "
"The registry may need to be refreshed. Please contact support or try again later."
)
# Create effective model for model-specific parameter resolution (e.g., o-series check)
# This uses the resolved model_to_use which may differ from llm_model if fallback occurred
effective_model = LlmModel(model_to_use)
if compress_prompt_to_fit:
prompt = compress_prompt(
messages=prompt,
target_tokens=llm_model.context_window // 2,
target_tokens=context_window // 2,
lossy_ok=True,
)
# Calculate available tokens based on context window and input length
estimated_input_tokens = estimate_token_count(prompt)
model_max_output = llm_model.max_output_tokens or int(2**15)
# model_max_output already set above
user_max = max_tokens or model_max_output
available_tokens = max(context_window - estimated_input_tokens, 0)
max_tokens = max(min(available_tokens, model_max_output, user_max), 1)
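A worked example of the token budgeting above (all numbers are illustrative):
# Suppose context_window = 128000, estimated_input_tokens = 20000,
# model_max_output = 16384, and the caller passed max_tokens=None.
user_max = 16384                                              # max_tokens or model_max_output
available_tokens = max(128000 - 20000, 0)                     # 108000
max_tokens = max(min(available_tokens, 16384, user_max), 1)   # 16384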
@@ -657,14 +475,14 @@ async def llm_call(
response_format = None
parallel_tool_calls = get_parallel_tool_calls_param(
llm_model, parallel_tool_calls
effective_model, parallel_tool_calls
)
if force_json_output:
response_format = {"type": "json_object"}
response = await oai_client.chat.completions.create(
model=llm_model.value,
model=model_to_use,
messages=prompt, # type: ignore
response_format=response_format, # type: ignore
max_completion_tokens=max_tokens,
@@ -711,7 +529,7 @@ async def llm_call(
)
try:
resp = await client.messages.create(
model=llm_model.value,
model=model_to_use,
system=sysprompt,
messages=messages,
max_tokens=max_tokens,
@@ -775,7 +593,7 @@ async def llm_call(
client = AsyncGroq(api_key=credentials.api_key.get_secret_value())
response_format = {"type": "json_object"} if force_json_output else None
response = await client.chat.completions.create(
model=llm_model.value,
model=model_to_use,
messages=prompt, # type: ignore
response_format=response_format, # type: ignore
max_tokens=max_tokens,
@@ -797,7 +615,7 @@ async def llm_call(
sys_messages = [p["content"] for p in prompt if p["role"] == "system"]
usr_messages = [p["content"] for p in prompt if p["role"] != "system"]
response = await client.generate(
model=llm_model.value,
model=model_to_use,
prompt=f"{sys_messages}\n\n{usr_messages}",
stream=False,
options={"num_ctx": max_tokens},
@@ -819,7 +637,7 @@ async def llm_call(
)
parallel_tool_calls_param = get_parallel_tool_calls_param(
llm_model, parallel_tool_calls
effective_model, parallel_tool_calls
)
response = await client.chat.completions.create(
@@ -827,7 +645,7 @@ async def llm_call(
"HTTP-Referer": "https://agpt.co",
"X-Title": "AutoGPT",
},
model=llm_model.value,
model=model_to_use,
messages=prompt, # type: ignore
max_tokens=max_tokens,
tools=tools_param, # type: ignore
@@ -861,7 +679,7 @@ async def llm_call(
)
parallel_tool_calls_param = get_parallel_tool_calls_param(
llm_model, parallel_tool_calls
effective_model, parallel_tool_calls
)
response = await client.chat.completions.create(
@@ -869,7 +687,7 @@ async def llm_call(
"HTTP-Referer": "https://agpt.co",
"X-Title": "AutoGPT",
},
model=llm_model.value,
model=model_to_use,
messages=prompt, # type: ignore
max_tokens=max_tokens,
tools=tools_param, # type: ignore
@@ -896,7 +714,7 @@ async def llm_call(
reasoning=reasoning,
)
elif provider == "aiml_api":
client = openai.OpenAI(
client = openai.AsyncOpenAI(
base_url="https://api.aimlapi.com/v2",
api_key=credentials.api_key.get_secret_value(),
default_headers={
@@ -906,8 +724,8 @@ async def llm_call(
},
)
completion = client.chat.completions.create(
model=llm_model.value,
completion = await client.chat.completions.create(
model=model_to_use,
messages=prompt, # type: ignore
max_tokens=max_tokens,
)
@@ -935,11 +753,11 @@ async def llm_call(
response_format = {"type": "json_object"}
parallel_tool_calls_param = get_parallel_tool_calls_param(
llm_model, parallel_tool_calls
effective_model, parallel_tool_calls
)
response = await client.chat.completions.create(
model=llm_model.value,
model=model_to_use,
messages=prompt, # type: ignore
response_format=response_format, # type: ignore
max_tokens=max_tokens,
@@ -990,9 +808,10 @@ class AIStructuredResponseGeneratorBlock(AIBlockBase):
)
model: LlmModel = SchemaField(
title="LLM Model",
default=DEFAULT_LLM_MODEL,
default_factory=LlmModel.default,
description="The language model to use for answering the prompt.",
advanced=False,
json_schema_extra=llm_model_schema_extra(),
)
force_json_output: bool = SchemaField(
title="Restrict LLM to pure JSON output",
@@ -1055,7 +874,7 @@ class AIStructuredResponseGeneratorBlock(AIBlockBase):
input_schema=AIStructuredResponseGeneratorBlock.Input,
output_schema=AIStructuredResponseGeneratorBlock.Output,
test_input={
"model": DEFAULT_LLM_MODEL,
"model": "gpt-4o", # Using string value - enum accepts any model slug dynamically
"credentials": TEST_CREDENTIALS_INPUT,
"expected_format": {
"key1": "value1",
@@ -1421,9 +1240,10 @@ class AITextGeneratorBlock(AIBlockBase):
)
model: LlmModel = SchemaField(
title="LLM Model",
default=DEFAULT_LLM_MODEL,
default_factory=LlmModel.default,
description="The language model to use for answering the prompt.",
advanced=False,
json_schema_extra=llm_model_schema_extra(),
)
credentials: AICredentials = AICredentialsField()
sys_prompt: str = SchemaField(
@@ -1517,8 +1337,9 @@ class AITextSummarizerBlock(AIBlockBase):
)
model: LlmModel = SchemaField(
title="LLM Model",
default=DEFAULT_LLM_MODEL,
default_factory=LlmModel.default,
description="The language model to use for summarizing the text.",
json_schema_extra=llm_model_schema_extra(),
)
focus: str = SchemaField(
title="Focus",
@@ -1734,8 +1555,9 @@ class AIConversationBlock(AIBlockBase):
)
model: LlmModel = SchemaField(
title="LLM Model",
default=DEFAULT_LLM_MODEL,
default_factory=LlmModel.default,
description="The language model to use for the conversation.",
json_schema_extra=llm_model_schema_extra(),
)
credentials: AICredentials = AICredentialsField()
max_tokens: int | None = SchemaField(
@@ -1772,7 +1594,7 @@ class AIConversationBlock(AIBlockBase):
},
{"role": "user", "content": "Where was it played?"},
],
"model": DEFAULT_LLM_MODEL,
"model": "gpt-4o", # Using string value - enum accepts any model slug dynamically
"credentials": TEST_CREDENTIALS_INPUT,
},
test_credentials=TEST_CREDENTIALS,
@@ -1835,9 +1657,10 @@ class AIListGeneratorBlock(AIBlockBase):
)
model: LlmModel = SchemaField(
title="LLM Model",
default=DEFAULT_LLM_MODEL,
default_factory=LlmModel.default,
description="The language model to use for generating the list.",
advanced=True,
json_schema_extra=llm_model_schema_extra(),
)
credentials: AICredentials = AICredentialsField()
max_retries: int = SchemaField(
@@ -1892,7 +1715,7 @@ class AIListGeneratorBlock(AIBlockBase):
"drawing explorers to uncover its mysteries. Each planet showcases the limitless possibilities of "
"fictional worlds."
),
"model": DEFAULT_LLM_MODEL,
"model": "gpt-4o", # Using string value - enum accepts any model slug dynamically
"credentials": TEST_CREDENTIALS_INPUT,
"max_retries": 3,
"force_json_output": False,

View File

@@ -1,6 +1,6 @@
import os
import tempfile
from typing import Optional
from typing import Literal, Optional
from moviepy.audio.io.AudioFileClip import AudioFileClip
from moviepy.video.fx.Loop import Loop
@@ -13,7 +13,6 @@ from backend.data.block import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.execution import ExecutionContext
from backend.data.model import SchemaField
from backend.util.file import MediaFileType, get_exec_file_path, store_media_file
@@ -47,19 +46,18 @@ class MediaDurationBlock(Block):
self,
input_data: Input,
*,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
# 1) Store the input media locally
local_media_path = await store_media_file(
graph_exec_id=graph_exec_id,
file=input_data.media_in,
execution_context=execution_context,
return_format="for_local_processing",
)
assert execution_context.graph_exec_id is not None
media_abspath = get_exec_file_path(
execution_context.graph_exec_id, local_media_path
user_id=user_id,
return_content=False,
)
media_abspath = get_exec_file_path(graph_exec_id, local_media_path)
# 2) Load the clip
if input_data.is_video:
@@ -90,6 +88,10 @@ class LoopVideoBlock(Block):
default=None,
ge=1,
)
output_return_type: Literal["file_path", "data_uri"] = SchemaField(
description="How to return the output video. Either a relative path or base64 data URI.",
default="file_path",
)
class Output(BlockSchemaOutput):
video_out: str = SchemaField(
@@ -109,19 +111,17 @@ class LoopVideoBlock(Block):
self,
input_data: Input,
*,
execution_context: ExecutionContext,
node_exec_id: str,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
assert execution_context.graph_exec_id is not None
assert execution_context.node_exec_id is not None
graph_exec_id = execution_context.graph_exec_id
node_exec_id = execution_context.node_exec_id
# 1) Store the input video locally
local_video_path = await store_media_file(
graph_exec_id=graph_exec_id,
file=input_data.video_in,
execution_context=execution_context,
return_format="for_local_processing",
user_id=user_id,
return_content=False,
)
input_abspath = get_exec_file_path(graph_exec_id, local_video_path)
@@ -149,11 +149,12 @@ class LoopVideoBlock(Block):
looped_clip = looped_clip.with_audio(clip.audio)
looped_clip.write_videofile(output_abspath, codec="libx264", audio_codec="aac")
# Return output - for_block_output returns workspace:// if available, else data URI
# Return either path or data URI, depending on output_return_type
video_out = await store_media_file(
graph_exec_id=graph_exec_id,
file=output_filename,
execution_context=execution_context,
return_format="for_block_output",
user_id=user_id,
return_content=input_data.output_return_type == "data_uri",
)
yield "video_out", video_out
@@ -176,6 +177,10 @@ class AddAudioToVideoBlock(Block):
description="Volume scale for the newly attached audio track (1.0 = original).",
default=1.0,
)
output_return_type: Literal["file_path", "data_uri"] = SchemaField(
description="Return the final output as a relative path or base64 data URI.",
default="file_path",
)
class Output(BlockSchemaOutput):
video_out: MediaFileType = SchemaField(
@@ -195,24 +200,23 @@ class AddAudioToVideoBlock(Block):
self,
input_data: Input,
*,
execution_context: ExecutionContext,
node_exec_id: str,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
assert execution_context.graph_exec_id is not None
assert execution_context.node_exec_id is not None
graph_exec_id = execution_context.graph_exec_id
node_exec_id = execution_context.node_exec_id
# 1) Store the inputs locally
local_video_path = await store_media_file(
graph_exec_id=graph_exec_id,
file=input_data.video_in,
execution_context=execution_context,
return_format="for_local_processing",
user_id=user_id,
return_content=False,
)
local_audio_path = await store_media_file(
graph_exec_id=graph_exec_id,
file=input_data.audio_in,
execution_context=execution_context,
return_format="for_local_processing",
user_id=user_id,
return_content=False,
)
abs_temp_dir = os.path.join(tempfile.gettempdir(), "exec_file", graph_exec_id)
@@ -236,11 +240,12 @@ class AddAudioToVideoBlock(Block):
output_abspath = os.path.join(abs_temp_dir, output_filename)
final_clip.write_videofile(output_abspath, codec="libx264", audio_codec="aac")
# 5) Return output - for_block_output returns workspace:// if available, else data URI
# 5) Return either path or data URI
video_out = await store_media_file(
graph_exec_id=graph_exec_id,
file=output_filename,
execution_context=execution_context,
return_format="for_block_output",
user_id=user_id,
return_content=input_data.output_return_type == "data_uri",
)
yield "video_out", video_out

View File

@@ -11,7 +11,6 @@ from backend.data.block import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.execution import ExecutionContext
from backend.data.model import (
APIKeyCredentials,
CredentialsField,
@@ -113,7 +112,8 @@ class ScreenshotWebPageBlock(Block):
@staticmethod
async def take_screenshot(
credentials: APIKeyCredentials,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
url: str,
viewport_width: int,
viewport_height: int,
@@ -155,11 +155,12 @@ class ScreenshotWebPageBlock(Block):
return {
"image": await store_media_file(
graph_exec_id=graph_exec_id,
file=MediaFileType(
f"data:image/{format.value};base64,{b64encode(content).decode('utf-8')}"
),
execution_context=execution_context,
return_format="for_block_output",
user_id=user_id,
return_content=True,
)
}
@@ -168,13 +169,15 @@ class ScreenshotWebPageBlock(Block):
input_data: Input,
*,
credentials: APIKeyCredentials,
execution_context: ExecutionContext,
graph_exec_id: str,
user_id: str,
**kwargs,
) -> BlockOutput:
try:
screenshot_data = await self.take_screenshot(
credentials=credentials,
execution_context=execution_context,
graph_exec_id=graph_exec_id,
user_id=user_id,
url=input_data.url,
viewport_width=input_data.viewport_width,
viewport_height=input_data.viewport_height,

View File

@@ -226,9 +226,10 @@ class SmartDecisionMakerBlock(Block):
)
model: llm.LlmModel = SchemaField(
title="LLM Model",
default=llm.DEFAULT_LLM_MODEL,
default_factory=llm.LlmModel.default,
description="The language model to use for answering the prompt.",
advanced=False,
json_schema_extra=llm.llm_model_schema_extra(),
)
credentials: llm.AICredentials = llm.AICredentialsField()
multiple_tool_calls: bool = SchemaField(

View File

@@ -7,7 +7,6 @@ from backend.data.block import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.execution import ExecutionContext
from backend.data.model import ContributorDetails, SchemaField
from backend.util.file import get_exec_file_path, store_media_file
from backend.util.type import MediaFileType
@@ -99,7 +98,7 @@ class ReadSpreadsheetBlock(Block):
)
async def run(
self, input_data: Input, *, execution_context: ExecutionContext, **_kwargs
self, input_data: Input, *, graph_exec_id: str, user_id: str, **_kwargs
) -> BlockOutput:
import csv
from io import StringIO
@@ -107,16 +106,14 @@ class ReadSpreadsheetBlock(Block):
# Determine data source - prefer file_input if provided, otherwise use contents
if input_data.file_input:
stored_file_path = await store_media_file(
user_id=user_id,
graph_exec_id=graph_exec_id,
file=input_data.file_input,
execution_context=execution_context,
return_format="for_local_processing",
return_content=False,
)
# Get full file path
assert execution_context.graph_exec_id # Validated by store_media_file
file_path = get_exec_file_path(
execution_context.graph_exec_id, stored_file_path
)
file_path = get_exec_file_path(graph_exec_id, stored_file_path)
if not Path(file_path).exists():
raise ValueError(f"File does not exist: {file_path}")

View File

@@ -10,13 +10,13 @@ import stagehand.main
from stagehand import Stagehand
from backend.blocks.llm import (
MODEL_METADATA,
AICredentials,
AICredentialsField,
LlmModel,
ModelMetadata,
)
from backend.blocks.stagehand._config import stagehand as stagehand_provider
from backend.data import llm_registry
from backend.sdk import (
APIKeyCredentials,
Block,
@@ -91,7 +91,7 @@ class StagehandRecommendedLlmModel(str, Enum):
Returns the provider name for the model in the required format for Stagehand:
provider/model_name
"""
model_metadata = MODEL_METADATA[LlmModel(self.value)]
model_metadata = self.metadata
model_name = self.value
if len(model_name.split("/")) == 1 and not self.value.startswith(
@@ -107,19 +107,23 @@ class StagehandRecommendedLlmModel(str, Enum):
@property
def provider(self) -> str:
return MODEL_METADATA[LlmModel(self.value)].provider
return self.metadata.provider
@property
def metadata(self) -> ModelMetadata:
return MODEL_METADATA[LlmModel(self.value)]
metadata = llm_registry.get_llm_model_metadata(self.value)
if metadata:
return metadata
# Fallback to LlmModel enum if registry lookup fails
return LlmModel(self.value).metadata
@property
def context_window(self) -> int:
return MODEL_METADATA[LlmModel(self.value)].context_window
return self.metadata.context_window
@property
def max_output_tokens(self) -> int | None:
return MODEL_METADATA[LlmModel(self.value)].max_output_tokens
return self.metadata.max_output_tokens
class StagehandObserveBlock(Block):
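The Stagehand change above keeps the provider/model_name format requirement but sources metadata from the registry first. A reduced illustration of the name formatting (the model slugs are hypothetical, and the real method applies an extra prefix check that is cut off in this diff):
# "claude-sonnet-4" + provider "anthropic" -> "anthropic/claude-sonnet-4"
# "openai/gpt-4o"   (already prefixed)     -> "openai/gpt-4o"
def to_stagehand_model_name(slug: str, provider: str) -> str:
    return slug if "/" in slug else f"{provider}/{slug}"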

View File

@@ -10,7 +10,6 @@ from backend.data.block import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.execution import ExecutionContext
from backend.data.model import (
APIKeyCredentials,
CredentialsField,
@@ -18,9 +17,7 @@ from backend.data.model import (
SchemaField,
)
from backend.integrations.providers import ProviderName
from backend.util.file import store_media_file
from backend.util.request import Requests
from backend.util.type import MediaFileType
TEST_CREDENTIALS = APIKeyCredentials(
id="01234567-89ab-cdef-0123-456789abcdef",
@@ -105,7 +102,7 @@ class CreateTalkingAvatarVideoBlock(Block):
test_output=[
(
"video_url",
lambda x: x.startswith(("workspace://", "data:")),
"https://d-id.com/api/clips/abcd1234-5678-efgh-ijkl-mnopqrstuvwx/video",
),
],
test_mock={
@@ -113,10 +110,9 @@ class CreateTalkingAvatarVideoBlock(Block):
"id": "abcd1234-5678-efgh-ijkl-mnopqrstuvwx",
"status": "created",
},
# Use data URI to avoid HTTP requests during tests
"get_clip_status": lambda *args, **kwargs: {
"status": "done",
"result_url": "data:video/mp4;base64,AAAA",
"result_url": "https://d-id.com/api/clips/abcd1234-5678-efgh-ijkl-mnopqrstuvwx/video",
},
},
test_credentials=TEST_CREDENTIALS,
@@ -142,12 +138,7 @@ class CreateTalkingAvatarVideoBlock(Block):
return response.json()
async def run(
self,
input_data: Input,
*,
credentials: APIKeyCredentials,
execution_context: ExecutionContext,
**kwargs,
self, input_data: Input, *, credentials: APIKeyCredentials, **kwargs
) -> BlockOutput:
# Create the clip
payload = {
@@ -174,14 +165,7 @@ class CreateTalkingAvatarVideoBlock(Block):
for _ in range(input_data.max_polling_attempts):
status_response = await self.get_clip_status(credentials.api_key, clip_id)
if status_response["status"] == "done":
# Store the generated video to the user's workspace for persistence
video_url = status_response["result_url"]
stored_url = await store_media_file(
file=MediaFileType(video_url),
execution_context=execution_context,
return_format="for_block_output",
)
yield "video_url", stored_url
yield "video_url", status_response["result_url"]
return
elif status_response["status"] == "error":
raise RuntimeError(

View File

@@ -12,7 +12,6 @@ from backend.blocks.iteration import StepThroughItemsBlock
from backend.blocks.llm import AITextSummarizerBlock
from backend.blocks.text import ExtractTextInformationBlock
from backend.blocks.xml_parser import XMLParserBlock
from backend.data.execution import ExecutionContext
from backend.util.file import store_media_file
from backend.util.type import MediaFileType
@@ -234,12 +233,9 @@ class TestStoreMediaFileSecurity:
with pytest.raises(ValueError, match="File too large"):
await store_media_file(
graph_exec_id="test",
file=MediaFileType(large_data_uri),
execution_context=ExecutionContext(
user_id="test_user",
graph_exec_id="test",
),
return_format="for_local_processing",
user_id="test_user",
)
@patch("backend.util.file.Path")
@@ -274,12 +270,9 @@ class TestStoreMediaFileSecurity:
# Should raise an error when directory size exceeds limit
with pytest.raises(ValueError, match="Disk usage limit exceeded"):
await store_media_file(
graph_exec_id="test",
file=MediaFileType(
"data:text/plain;base64,dGVzdA=="
), # Small test file
execution_context=ExecutionContext(
user_id="test_user",
graph_exec_id="test",
),
return_format="for_local_processing",
user_id="test_user",
)

View File

@@ -11,22 +11,10 @@ from backend.blocks.http import (
HttpMethod,
SendAuthenticatedWebRequestBlock,
)
from backend.data.execution import ExecutionContext
from backend.data.model import HostScopedCredentials
from backend.util.request import Response
def make_test_context(
graph_exec_id: str = "test-exec-id",
user_id: str = "test-user-id",
) -> ExecutionContext:
"""Helper to create test ExecutionContext."""
return ExecutionContext(
user_id=user_id,
graph_exec_id=graph_exec_id,
)
class TestHttpBlockWithHostScopedCredentials:
"""Test suite for HTTP block integration with HostScopedCredentials."""
@@ -117,7 +105,8 @@ class TestHttpBlockWithHostScopedCredentials:
async for output_name, output_data in http_block.run(
input_data,
credentials=exact_match_credentials,
execution_context=make_test_context(),
graph_exec_id="test-exec-id",
user_id="test-user-id",
):
result.append((output_name, output_data))
@@ -172,7 +161,8 @@ class TestHttpBlockWithHostScopedCredentials:
async for output_name, output_data in http_block.run(
input_data,
credentials=wildcard_credentials,
execution_context=make_test_context(),
graph_exec_id="test-exec-id",
user_id="test-user-id",
):
result.append((output_name, output_data))
@@ -218,7 +208,8 @@ class TestHttpBlockWithHostScopedCredentials:
async for output_name, output_data in http_block.run(
input_data,
credentials=non_matching_credentials,
execution_context=make_test_context(),
graph_exec_id="test-exec-id",
user_id="test-user-id",
):
result.append((output_name, output_data))
@@ -267,7 +258,8 @@ class TestHttpBlockWithHostScopedCredentials:
async for output_name, output_data in http_block.run(
input_data,
credentials=exact_match_credentials,
execution_context=make_test_context(),
graph_exec_id="test-exec-id",
user_id="test-user-id",
):
result.append((output_name, output_data))
@@ -326,7 +318,8 @@ class TestHttpBlockWithHostScopedCredentials:
async for output_name, output_data in http_block.run(
input_data,
credentials=auto_discovered_creds, # Execution manager found these
execution_context=make_test_context(),
graph_exec_id="test-exec-id",
user_id="test-user-id",
):
result.append((output_name, output_data))
@@ -389,7 +382,8 @@ class TestHttpBlockWithHostScopedCredentials:
async for output_name, output_data in http_block.run(
input_data,
credentials=multi_header_creds,
execution_context=make_test_context(),
graph_exec_id="test-exec-id",
user_id="test-user-id",
):
result.append((output_name, output_data))
@@ -477,7 +471,8 @@ class TestHttpBlockWithHostScopedCredentials:
async for output_name, output_data in http_block.run(
input_data,
credentials=test_creds,
execution_context=make_test_context(),
graph_exec_id="test-exec-id",
user_id="test-user-id",
):
result.append((output_name, output_data))

View File

@@ -11,7 +11,6 @@ from backend.data.block import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.execution import ExecutionContext
from backend.data.model import SchemaField
from backend.util import json, text
from backend.util.file import get_exec_file_path, store_media_file
@@ -445,21 +444,18 @@ class FileReadBlock(Block):
)
async def run(
self, input_data: Input, *, execution_context: ExecutionContext, **_kwargs
self, input_data: Input, *, graph_exec_id: str, user_id: str, **_kwargs
) -> BlockOutput:
# Store the media file properly (handles URLs, data URIs, etc.)
stored_file_path = await store_media_file(
user_id=user_id,
graph_exec_id=graph_exec_id,
file=input_data.file_input,
execution_context=execution_context,
return_format="for_local_processing",
return_content=False,
)
# Get full file path (graph_exec_id validated by store_media_file above)
if not execution_context.graph_exec_id:
raise ValueError("execution_context.graph_exec_id is required")
file_path = get_exec_file_path(
execution_context.graph_exec_id, stored_file_path
)
# Get full file path
file_path = get_exec_file_path(graph_exec_id, stored_file_path)
if not Path(file_path).exists():
raise ValueError(f"File does not exist: {file_path}")

View File

@@ -25,6 +25,7 @@ from prisma.models import AgentBlock
from prisma.types import AgentBlockCreateInput
from pydantic import BaseModel
from backend.data.llm_registry import update_schema_with_llm_registry
from backend.data.model import NodeExecutionStats
from backend.integrations.providers import ProviderName
from backend.util import json
@@ -143,35 +144,59 @@ class BlockInfo(BaseModel):
class BlockSchema(BaseModel):
cached_jsonschema: ClassVar[dict[str, Any]]
cached_jsonschema: ClassVar[dict[str, Any] | None] = None
@classmethod
def clear_schema_cache(cls) -> None:
"""Clear the cached JSON schema for this class."""
# Reset to None (falsy) so the next jsonschema() call regenerates the schema
cls.cached_jsonschema = None # type: ignore
@staticmethod
def clear_all_schema_caches() -> None:
"""Clear cached JSON schemas for all BlockSchema subclasses."""
def clear_recursive(cls: type) -> None:
"""Recursively clear cache for class and all subclasses."""
if hasattr(cls, "clear_schema_cache"):
cls.clear_schema_cache()
for subclass in cls.__subclasses__():
clear_recursive(subclass)
clear_recursive(BlockSchema)
@classmethod
def jsonschema(cls) -> dict[str, Any]:
if cls.cached_jsonschema:
return cls.cached_jsonschema
# Generate schema if not cached
if not cls.cached_jsonschema:
model = jsonref.replace_refs(cls.model_json_schema(), merge_props=True)
model = jsonref.replace_refs(cls.model_json_schema(), merge_props=True)
def ref_to_dict(obj):
if isinstance(obj, dict):
# OpenAPI <3.1 does not support sibling fields that have a $ref key
# So sometimes, the schema has an "allOf"/"anyOf"/"oneOf" with 1 item.
keys = {"allOf", "anyOf", "oneOf"}
one_key = next(
(k for k in keys if k in obj and len(obj[k]) == 1), None
)
if one_key:
obj.update(obj[one_key][0])
def ref_to_dict(obj):
if isinstance(obj, dict):
# OpenAPI <3.1 does not support sibling fields that have a $ref key
# So sometimes, the schema has an "allOf"/"anyOf"/"oneOf" with 1 item.
keys = {"allOf", "anyOf", "oneOf"}
one_key = next((k for k in keys if k in obj and len(obj[k]) == 1), None)
if one_key:
obj.update(obj[one_key][0])
return {
key: ref_to_dict(value)
for key, value in obj.items()
if not key.startswith("$") and key != one_key
}
elif isinstance(obj, list):
return [ref_to_dict(item) for item in obj]
return {
key: ref_to_dict(value)
for key, value in obj.items()
if not key.startswith("$") and key != one_key
}
elif isinstance(obj, list):
return [ref_to_dict(item) for item in obj]
return obj
return obj
cls.cached_jsonschema = cast(dict[str, Any], ref_to_dict(model))
cls.cached_jsonschema = cast(dict[str, Any], ref_to_dict(model))
# Always post-process to ensure LLM registry data is up-to-date
# This refreshes model options and discriminator mappings even if schema was cached
update_schema_with_llm_registry(cls.cached_jsonschema, cls)
return cls.cached_jsonschema
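The cache hooks introduced above are meant to be combined with a registry reload; one way that could look (the wiring itself is an assumption, but both calls exist in this diff):
from backend.data import llm_registry
from backend.data.block import BlockSchema

async def on_registry_change() -> None:
    # Reload model records, then drop every cached block schema so the next
    # jsonschema() call rebuilds model options and discriminator mappings.
    await llm_registry.refresh_llm_registry()
    BlockSchema.clear_all_schema_caches()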
@@ -234,7 +259,7 @@ class BlockSchema(BaseModel):
super().__pydantic_init_subclass__(**kwargs)
# Reset cached JSON schema to prevent inheriting it from parent class
cls.cached_jsonschema = {}
cls.cached_jsonschema = None
credentials_fields = cls.get_credentials_fields()
@@ -873,6 +898,28 @@ def is_block_auth_configured(
async def initialize_blocks() -> None:
# Refresh LLM registry before initializing blocks so blocks can use registry data
# This ensures the registry cache is populated even in executor context
try:
from backend.data import llm_registry
from backend.data.block_cost_config import refresh_llm_costs
# Only refresh if we have DB access (check if Prisma is connected)
from backend.data.db import is_connected
if is_connected():
await llm_registry.refresh_llm_registry()
refresh_llm_costs()
logger.info("LLM registry refreshed during block initialization")
else:
logger.warning(
"Prisma not connected, skipping LLM registry refresh during block initialization"
)
except Exception as exc:
logger.warning(
"Failed to refresh LLM registry during block initialization: %s", exc
)
# First, sync all provider costs to blocks
# Imported here to avoid circular import
from backend.sdk.cost_integration import sync_all_provider_costs

View File

@@ -1,3 +1,4 @@
import logging
from typing import Type
from backend.blocks.ai_image_customizer import AIImageCustomizerBlock, GeminiImageModel
@@ -23,19 +24,18 @@ from backend.blocks.ideogram import IdeogramModelBlock
from backend.blocks.jina.embeddings import JinaEmbeddingBlock
from backend.blocks.jina.search import ExtractWebsiteContentBlock, SearchTheWebBlock
from backend.blocks.llm import (
MODEL_METADATA,
AIConversationBlock,
AIListGeneratorBlock,
AIStructuredResponseGeneratorBlock,
AITextGeneratorBlock,
AITextSummarizerBlock,
LlmModel,
)
from backend.blocks.replicate.flux_advanced import ReplicateFluxAdvancedModelBlock
from backend.blocks.replicate.replicate_block import ReplicateModelBlock
from backend.blocks.smart_decision_maker import SmartDecisionMakerBlock
from backend.blocks.talking_head import CreateTalkingAvatarVideoBlock
from backend.blocks.text_to_speech_block import UnrealTextToSpeechBlock
from backend.data import llm_registry
from backend.data.block import Block, BlockCost, BlockCostType
from backend.integrations.credentials_store import (
aiml_api_credentials,
@@ -55,210 +55,63 @@ from backend.integrations.credentials_store import (
v0_credentials,
)
# =============== Configure the cost for each LLM Model call =============== #
logger = logging.getLogger(__name__)
MODEL_COST: dict[LlmModel, int] = {
LlmModel.O3: 4,
LlmModel.O3_MINI: 2,
LlmModel.O1: 16,
LlmModel.O1_MINI: 4,
# GPT-5 models
LlmModel.GPT5_2: 6,
LlmModel.GPT5_1: 5,
LlmModel.GPT5: 2,
LlmModel.GPT5_MINI: 1,
LlmModel.GPT5_NANO: 1,
LlmModel.GPT5_CHAT: 5,
LlmModel.GPT41: 2,
LlmModel.GPT41_MINI: 1,
LlmModel.GPT4O_MINI: 1,
LlmModel.GPT4O: 3,
LlmModel.GPT4_TURBO: 10,
LlmModel.GPT3_5_TURBO: 1,
LlmModel.CLAUDE_4_1_OPUS: 21,
LlmModel.CLAUDE_4_OPUS: 21,
LlmModel.CLAUDE_4_SONNET: 5,
LlmModel.CLAUDE_4_5_HAIKU: 4,
LlmModel.CLAUDE_4_5_OPUS: 14,
LlmModel.CLAUDE_4_5_SONNET: 9,
LlmModel.CLAUDE_3_7_SONNET: 5,
LlmModel.CLAUDE_3_HAIKU: 1,
LlmModel.AIML_API_QWEN2_5_72B: 1,
LlmModel.AIML_API_LLAMA3_1_70B: 1,
LlmModel.AIML_API_LLAMA3_3_70B: 1,
LlmModel.AIML_API_META_LLAMA_3_1_70B: 1,
LlmModel.AIML_API_LLAMA_3_2_3B: 1,
LlmModel.LLAMA3_3_70B: 1,
LlmModel.LLAMA3_1_8B: 1,
LlmModel.OLLAMA_LLAMA3_3: 1,
LlmModel.OLLAMA_LLAMA3_2: 1,
LlmModel.OLLAMA_LLAMA3_8B: 1,
LlmModel.OLLAMA_LLAMA3_405B: 1,
LlmModel.OLLAMA_DOLPHIN: 1,
LlmModel.OPENAI_GPT_OSS_120B: 1,
LlmModel.OPENAI_GPT_OSS_20B: 1,
LlmModel.GEMINI_2_5_PRO: 4,
LlmModel.GEMINI_3_PRO_PREVIEW: 5,
LlmModel.GEMINI_2_5_FLASH: 1,
LlmModel.GEMINI_2_0_FLASH: 1,
LlmModel.GEMINI_2_5_FLASH_LITE_PREVIEW: 1,
LlmModel.GEMINI_2_0_FLASH_LITE: 1,
LlmModel.MISTRAL_NEMO: 1,
LlmModel.COHERE_COMMAND_R_08_2024: 1,
LlmModel.COHERE_COMMAND_R_PLUS_08_2024: 3,
LlmModel.DEEPSEEK_CHAT: 2,
LlmModel.DEEPSEEK_R1_0528: 1,
LlmModel.PERPLEXITY_SONAR: 1,
LlmModel.PERPLEXITY_SONAR_PRO: 5,
LlmModel.PERPLEXITY_SONAR_DEEP_RESEARCH: 10,
LlmModel.NOUSRESEARCH_HERMES_3_LLAMA_3_1_405B: 1,
LlmModel.NOUSRESEARCH_HERMES_3_LLAMA_3_1_70B: 1,
LlmModel.AMAZON_NOVA_LITE_V1: 1,
LlmModel.AMAZON_NOVA_MICRO_V1: 1,
LlmModel.AMAZON_NOVA_PRO_V1: 1,
LlmModel.MICROSOFT_WIZARDLM_2_8X22B: 1,
LlmModel.GRYPHE_MYTHOMAX_L2_13B: 1,
LlmModel.META_LLAMA_4_SCOUT: 1,
LlmModel.META_LLAMA_4_MAVERICK: 1,
LlmModel.LLAMA_API_LLAMA_4_SCOUT: 1,
LlmModel.LLAMA_API_LLAMA4_MAVERICK: 1,
LlmModel.LLAMA_API_LLAMA3_3_8B: 1,
LlmModel.LLAMA_API_LLAMA3_3_70B: 1,
LlmModel.GROK_4: 9,
LlmModel.GROK_4_FAST: 1,
LlmModel.GROK_4_1_FAST: 1,
LlmModel.GROK_CODE_FAST_1: 1,
LlmModel.KIMI_K2: 1,
LlmModel.QWEN3_235B_A22B_THINKING: 1,
LlmModel.QWEN3_CODER: 9,
# v0 by Vercel models
LlmModel.V0_1_5_MD: 1,
LlmModel.V0_1_5_LG: 2,
LlmModel.V0_1_0_MD: 1,
PROVIDER_CREDENTIALS = {
"openai": openai_credentials,
"anthropic": anthropic_credentials,
"groq": groq_credentials,
"open_router": open_router_credentials,
"llama_api": llama_api_credentials,
"aiml_api": aiml_api_credentials,
"v0": v0_credentials,
}
for model in LlmModel:
if model not in MODEL_COST:
raise ValueError(f"Missing MODEL_COST for model: {model}")
# =============== Configure the cost for each LLM Model call =============== #
# All LLM costs now come from the database via llm_registry
LLM_COST: list[BlockCost] = []
LLM_COST = (
# Anthropic Models
[
BlockCost(
cost_type=BlockCostType.RUN,
cost_filter={
"model": model,
def _build_llm_costs_from_registry() -> list[BlockCost]:
"""Build BlockCost list from all models in the LLM registry."""
costs: list[BlockCost] = []
for model in llm_registry.iter_dynamic_models():
for cost in model.costs:
credentials = PROVIDER_CREDENTIALS.get(cost.credential_provider)
if not credentials:
logger.warning(
"Skipping cost entry for %s due to unknown credentials provider %s",
model.slug,
cost.credential_provider,
)
continue
cost_filter = {
"model": model.slug,
"credentials": {
"id": anthropic_credentials.id,
"provider": anthropic_credentials.provider,
"type": anthropic_credentials.type,
"id": credentials.id,
"provider": credentials.provider,
"type": credentials.type,
},
},
cost_amount=cost,
)
for model, cost in MODEL_COST.items()
if MODEL_METADATA[model].provider == "anthropic"
]
# OpenAI Models
+ [
BlockCost(
cost_type=BlockCostType.RUN,
cost_filter={
"model": model,
"credentials": {
"id": openai_credentials.id,
"provider": openai_credentials.provider,
"type": openai_credentials.type,
},
},
cost_amount=cost,
)
for model, cost in MODEL_COST.items()
if MODEL_METADATA[model].provider == "openai"
]
# Groq Models
+ [
BlockCost(
cost_type=BlockCostType.RUN,
cost_filter={
"model": model,
"credentials": {"id": groq_credentials.id},
},
cost_amount=cost,
)
for model, cost in MODEL_COST.items()
if MODEL_METADATA[model].provider == "groq"
]
# Open Router Models
+ [
BlockCost(
cost_type=BlockCostType.RUN,
cost_filter={
"model": model,
"credentials": {
"id": open_router_credentials.id,
"provider": open_router_credentials.provider,
"type": open_router_credentials.type,
},
},
cost_amount=cost,
)
for model, cost in MODEL_COST.items()
if MODEL_METADATA[model].provider == "open_router"
]
# Llama API Models
+ [
BlockCost(
cost_type=BlockCostType.RUN,
cost_filter={
"model": model,
"credentials": {
"id": llama_api_credentials.id,
"provider": llama_api_credentials.provider,
"type": llama_api_credentials.type,
},
},
cost_amount=cost,
)
for model, cost in MODEL_COST.items()
if MODEL_METADATA[model].provider == "llama_api"
]
# v0 by Vercel Models
+ [
BlockCost(
cost_type=BlockCostType.RUN,
cost_filter={
"model": model,
"credentials": {
"id": v0_credentials.id,
"provider": v0_credentials.provider,
"type": v0_credentials.type,
},
},
cost_amount=cost,
)
for model, cost in MODEL_COST.items()
if MODEL_METADATA[model].provider == "v0"
]
# AI/ML Api Models
+ [
BlockCost(
cost_type=BlockCostType.RUN,
cost_filter={
"model": model,
"credentials": {
"id": aiml_api_credentials.id,
"provider": aiml_api_credentials.provider,
"type": aiml_api_credentials.type,
},
},
cost_amount=cost,
)
for model, cost in MODEL_COST.items()
if MODEL_METADATA[model].provider == "aiml_api"
]
)
}
costs.append(
BlockCost(
cost_type=BlockCostType.RUN,
cost_filter=cost_filter,
cost_amount=cost.credit_cost,
)
)
return costs
def refresh_llm_costs() -> None:
"""Refresh LLM costs from the registry. All costs now come from the database."""
LLM_COST.clear()
LLM_COST.extend(_build_llm_costs_from_registry())
# Initial load will happen after registry is refreshed at startup
# Don't call refresh_llm_costs() here - it will be called after registry refresh
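A short sketch of the intended lifecycle, assuming startup code along these lines (the exact call site is outside this diff):
from backend.data import llm_registry
from backend.data.block_cost_config import LLM_COST, refresh_llm_costs

async def load_llm_costs() -> None:
    await llm_registry.refresh_llm_registry()  # costs are derived from registry records
    refresh_llm_costs()
    # LLM_COST is cleared and extended in place, so modules that already imported
    # the list observe the refreshed entries without re-importing.
    print(f"{len(LLM_COST)} LLM cost entries loaded")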
# =============== This is the exhaustive list of cost for each Block =============== #

View File

@@ -83,29 +83,12 @@ class ExecutionContext(BaseModel):
model_config = {"extra": "ignore"}
# Execution identity
user_id: Optional[str] = None
graph_id: Optional[str] = None
graph_exec_id: Optional[str] = None
graph_version: Optional[int] = None
node_id: Optional[str] = None
node_exec_id: Optional[str] = None
# Safety settings
human_in_the_loop_safe_mode: bool = True
sensitive_action_safe_mode: bool = False
# User settings
user_timezone: str = "UTC"
# Execution hierarchy
root_execution_id: Optional[str] = None
parent_execution_id: Optional[str] = None
# Workspace
workspace_id: Optional[str] = None
session_id: Optional[str] = None
# -------------------------- Models -------------------------- #

View File

@@ -1511,8 +1511,10 @@ async def migrate_llm_models(migrate_to: LlmModel):
if field.annotation == LlmModel:
llm_model_fields[block.id] = field_name
# Convert enum values to a list of strings for the SQL query
enum_values = [v.value for v in LlmModel]
# Get all model slugs from the registry (dynamic, not hardcoded enum)
from backend.data import llm_registry
enum_values = list(llm_registry.get_all_model_slugs_for_validation())
escaped_enum_values = repr(tuple(enum_values)) # hack but works
# Update each block

View File

@@ -0,0 +1,72 @@
"""
LLM Registry module for managing LLM models, providers, and costs dynamically.
This module provides a database-driven registry system for LLM models,
replacing hardcoded model configurations with a flexible admin-managed system.
"""
from backend.data.llm_registry.model import ModelMetadata
# Re-export for backwards compatibility
from backend.data.llm_registry.notifications import (
REGISTRY_REFRESH_CHANNEL,
publish_registry_refresh_notification,
subscribe_to_registry_refresh,
)
from backend.data.llm_registry.registry import (
RegistryModel,
RegistryModelCost,
RegistryModelCreator,
get_all_model_slugs_for_validation,
get_default_model_slug,
get_dynamic_model_slugs,
get_fallback_model_for_disabled,
get_llm_discriminator_mapping,
get_llm_model_cost,
get_llm_model_metadata,
get_llm_model_schema_options,
get_model_info,
is_model_enabled,
iter_dynamic_models,
refresh_llm_registry,
register_static_costs,
register_static_metadata,
)
from backend.data.llm_registry.schema_utils import (
is_llm_model_field,
refresh_llm_discriminator_mapping,
refresh_llm_model_options,
update_schema_with_llm_registry,
)
__all__ = [
# Types
"ModelMetadata",
"RegistryModel",
"RegistryModelCost",
"RegistryModelCreator",
# Registry functions
"get_all_model_slugs_for_validation",
"get_default_model_slug",
"get_dynamic_model_slugs",
"get_fallback_model_for_disabled",
"get_llm_discriminator_mapping",
"get_llm_model_cost",
"get_llm_model_metadata",
"get_llm_model_schema_options",
"get_model_info",
"is_model_enabled",
"iter_dynamic_models",
"refresh_llm_registry",
"register_static_costs",
"register_static_metadata",
# Notifications
"REGISTRY_REFRESH_CHANNEL",
"publish_registry_refresh_notification",
"subscribe_to_registry_refresh",
# Schema utilities
"is_llm_model_field",
"refresh_llm_discriminator_mapping",
"refresh_llm_model_options",
"update_schema_with_llm_registry",
]
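A brief usage sketch of the public surface re-exported above; the lookups are from this module, while the specific slug and output format are illustrative:
from backend.data import llm_registry

async def describe(slug: str) -> None:
    await llm_registry.refresh_llm_registry()         # populate from the database
    meta = llm_registry.get_llm_model_metadata(slug)  # ModelMetadata or None
    if meta is None:
        print(f"{slug}: unknown model")
    elif not llm_registry.is_model_enabled(slug):
        fallback = llm_registry.get_fallback_model_for_disabled(slug)
        print(f"{slug}: disabled, fallback {fallback.slug if fallback else 'none'}")
    else:
        print(f"{slug}: provider={meta.provider}, context={meta.context_window}")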

View File

@@ -0,0 +1,25 @@
"""Type definitions for LLM model metadata."""
from typing import Literal, NamedTuple
class ModelMetadata(NamedTuple):
"""Metadata for an LLM model.
Attributes:
provider: The provider identifier (e.g., "openai", "anthropic")
context_window: Maximum context window size in tokens
max_output_tokens: Maximum output tokens (None if unlimited)
display_name: Human-readable name for the model
provider_name: Human-readable provider name (e.g., "OpenAI", "Anthropic")
creator_name: Name of the organization that created the model
price_tier: Relative cost tier (1=cheapest, 2=medium, 3=expensive)
"""
provider: str
context_window: int
max_output_tokens: int | None
display_name: str
provider_name: str
creator_name: str
price_tier: Literal[1, 2, 3]

View File

@@ -0,0 +1,89 @@
"""
Redis pub/sub notifications for LLM registry updates.
When models are added/updated/removed via the admin UI, this module
publishes notifications to Redis that all executor services subscribe to,
ensuring they refresh their registry cache in real-time.
"""
import asyncio
import logging
from typing import Any
from backend.data.redis_client import connect_async
logger = logging.getLogger(__name__)
# Redis channel name for LLM registry refresh notifications
REGISTRY_REFRESH_CHANNEL = "llm_registry:refresh"
async def publish_registry_refresh_notification() -> None:
"""
Publish a notification to Redis that the LLM registry has been updated.
All executor services subscribed to this channel will refresh their registry.
"""
try:
redis = await connect_async()
await redis.publish(REGISTRY_REFRESH_CHANNEL, "refresh")
logger.info("Published LLM registry refresh notification to Redis")
except Exception as exc:
logger.warning(
"Failed to publish LLM registry refresh notification: %s",
exc,
exc_info=True,
)
async def subscribe_to_registry_refresh(
on_refresh: Any, # Async callable that takes no args
) -> None:
"""
Subscribe to Redis notifications for LLM registry updates.
This runs in a loop and processes messages as they arrive.
Args:
on_refresh: Async callable to execute when a refresh notification is received
"""
try:
redis = await connect_async()
pubsub = redis.pubsub()
await pubsub.subscribe(REGISTRY_REFRESH_CHANNEL)
logger.info(
"Subscribed to LLM registry refresh notifications on channel: %s",
REGISTRY_REFRESH_CHANNEL,
)
# Process messages in a loop
while True:
try:
message = await pubsub.get_message(
ignore_subscribe_messages=True, timeout=1.0
)
if (
message
and message["type"] == "message"
and message["channel"] == REGISTRY_REFRESH_CHANNEL
):
logger.info("Received LLM registry refresh notification")
try:
await on_refresh()
except Exception as exc:
logger.error(
"Error refreshing LLM registry from notification: %s",
exc,
exc_info=True,
)
except Exception as exc:
logger.warning(
"Error processing registry refresh message: %s", exc, exc_info=True
)
# Continue listening even if one message fails
await asyncio.sleep(1)
except Exception as exc:
logger.error(
"Failed to subscribe to LLM registry refresh notifications: %s",
exc,
exc_info=True,
)
raise
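How an executor wires this in is not shown here; a hedged sketch, assuming a long-lived background task is acceptable in that service:
import asyncio

from backend.data import llm_registry
from backend.data.llm_registry import subscribe_to_registry_refresh

async def _on_refresh() -> None:
    # Re-read models whenever the admin UI changes the registry.
    await llm_registry.refresh_llm_registry()

def start_registry_listener() -> "asyncio.Task[None]":
    # subscribe_to_registry_refresh() loops forever, so run it as a background task
    # from within an already-running event loop.
    return asyncio.create_task(subscribe_to_registry_refresh(_on_refresh))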

View File

@@ -0,0 +1,388 @@
"""Core LLM registry implementation for managing models dynamically."""
from __future__ import annotations
import asyncio
import logging
from dataclasses import dataclass, field
from typing import Any, Iterable
import prisma.models
from backend.data.llm_registry.model import ModelMetadata
logger = logging.getLogger(__name__)
def _json_to_dict(value: Any) -> dict[str, Any]:
"""Convert Prisma Json type to dict, with fallback to empty dict."""
if value is None:
return {}
if isinstance(value, dict):
return value
# Prisma Json type should always be a dict at runtime
return dict(value) if value else {}
@dataclass(frozen=True)
class RegistryModelCost:
"""Cost configuration for an LLM model."""
credit_cost: int
credential_provider: str
credential_id: str | None
credential_type: str | None
currency: str | None
metadata: dict[str, Any]
@dataclass(frozen=True)
class RegistryModelCreator:
"""Creator information for an LLM model."""
id: str
name: str
display_name: str
description: str | None
website_url: str | None
logo_url: str | None
@dataclass(frozen=True)
class RegistryModel:
"""Represents a model in the LLM registry."""
slug: str
display_name: str
description: str | None
metadata: ModelMetadata
capabilities: dict[str, Any]
extra_metadata: dict[str, Any]
provider_display_name: str
is_enabled: bool
is_recommended: bool = False
costs: tuple[RegistryModelCost, ...] = field(default_factory=tuple)
creator: RegistryModelCreator | None = None
_static_metadata: dict[str, ModelMetadata] = {}
_static_costs: dict[str, int] = {}
_dynamic_models: dict[str, RegistryModel] = {}
_schema_options: list[dict[str, str]] = []
_discriminator_mapping: dict[str, str] = {}
_lock = asyncio.Lock()
def register_static_metadata(metadata: dict[Any, ModelMetadata]) -> None:
"""Register static metadata for legacy models (deprecated)."""
_static_metadata.update({str(key): value for key, value in metadata.items()})
_refresh_cached_schema()
def register_static_costs(costs: dict[Any, int]) -> None:
"""Register static costs for legacy models (deprecated)."""
_static_costs.update({str(key): value for key, value in costs.items()})
def _build_schema_options() -> list[dict[str, str]]:
"""Build schema options for model selection dropdown. Only includes enabled models."""
options: list[dict[str, str]] = []
# Only include enabled models in the dropdown options
for model in sorted(_dynamic_models.values(), key=lambda m: m.display_name.lower()):
if model.is_enabled:
options.append(
{
"label": model.display_name,
"value": model.slug,
"group": model.metadata.provider,
"description": model.description or "",
}
)
for slug, metadata in _static_metadata.items():
if slug in _dynamic_models:
continue
options.append(
{
"label": slug,
"value": slug,
"group": metadata.provider,
"description": "",
}
)
return options
async def refresh_llm_registry() -> None:
"""Refresh the LLM registry from the database. Loads all models (enabled and disabled)."""
async with _lock:
try:
records = await prisma.models.LlmModel.prisma().find_many(
include={
"Provider": True,
"Costs": True,
"Creator": True,
}
)
logger.debug("Found %d LLM model records in database", len(records))
except Exception as exc:
logger.error(
"Failed to refresh LLM registry from DB: %s", exc, exc_info=True
)
return
dynamic: dict[str, RegistryModel] = {}
for record in records:
provider_name = (
record.Provider.name if record.Provider else record.providerId
)
provider_display_name = (
record.Provider.displayName if record.Provider else record.providerId
)
# Creator name: prefer Creator.name, fallback to provider display name
creator_name = (
record.Creator.name if record.Creator else provider_display_name
)
# Price tier: default to 1 (cheapest) if not set
price_tier = getattr(record, "priceTier", 1) or 1
# Clamp to valid range 1-3
price_tier = max(1, min(3, price_tier))
metadata = ModelMetadata(
provider=provider_name,
context_window=record.contextWindow,
max_output_tokens=record.maxOutputTokens,
display_name=record.displayName,
provider_name=provider_display_name,
creator_name=creator_name,
price_tier=price_tier, # type: ignore[arg-type]
)
costs = tuple(
RegistryModelCost(
credit_cost=cost.creditCost,
credential_provider=cost.credentialProvider,
credential_id=cost.credentialId,
credential_type=cost.credentialType,
currency=cost.currency,
metadata=_json_to_dict(cost.metadata),
)
for cost in (record.Costs or [])
)
# Map creator if present
creator = None
if record.Creator:
creator = RegistryModelCreator(
id=record.Creator.id,
name=record.Creator.name,
display_name=record.Creator.displayName,
description=record.Creator.description,
website_url=record.Creator.websiteUrl,
logo_url=record.Creator.logoUrl,
)
dynamic[record.slug] = RegistryModel(
slug=record.slug,
display_name=record.displayName,
description=record.description,
metadata=metadata,
capabilities=_json_to_dict(record.capabilities),
extra_metadata=_json_to_dict(record.metadata),
provider_display_name=(
record.Provider.displayName
if record.Provider
else record.providerId
),
is_enabled=record.isEnabled,
is_recommended=record.isRecommended,
costs=costs,
creator=creator,
)
# Atomic swap - build new structures then replace references
# This ensures readers never see partially updated state
global _dynamic_models
_dynamic_models = dynamic
_refresh_cached_schema()
logger.info(
"LLM registry refreshed with %s dynamic models (enabled: %s, disabled: %s)",
len(dynamic),
sum(1 for m in dynamic.values() if m.is_enabled),
sum(1 for m in dynamic.values() if not m.is_enabled),
)
def _refresh_cached_schema() -> None:
"""Refresh cached schema options and discriminator mapping."""
global _schema_options, _discriminator_mapping
# Build new structures
new_options = _build_schema_options()
new_mapping = {
slug: entry.metadata.provider for slug, entry in _dynamic_models.items()
}
for slug, metadata in _static_metadata.items():
new_mapping.setdefault(slug, metadata.provider)
# Atomic swap - replace references to ensure readers see consistent state
_schema_options = new_options
_discriminator_mapping = new_mapping
def get_llm_model_metadata(slug: str) -> ModelMetadata | None:
"""Get model metadata by slug. Checks dynamic models first, then static metadata."""
if slug in _dynamic_models:
return _dynamic_models[slug].metadata
return _static_metadata.get(slug)
def get_llm_model_cost(slug: str) -> tuple[RegistryModelCost, ...]:
"""Get model cost configuration by slug."""
if slug in _dynamic_models:
return _dynamic_models[slug].costs
cost_value = _static_costs.get(slug)
if cost_value is None:
return tuple()
return (
RegistryModelCost(
credit_cost=cost_value,
credential_provider="static",
credential_id=None,
credential_type=None,
currency=None,
metadata={},
),
)
def get_llm_model_schema_options() -> list[dict[str, str]]:
"""
Get schema options for LLM model selection dropdown.
Returns a copy of cached schema options that are refreshed when the registry is
updated via refresh_llm_registry() (called on startup and via Redis pub/sub).
"""
# Return a copy to prevent external mutation
return list(_schema_options)
def get_llm_discriminator_mapping() -> dict[str, str]:
"""
Get discriminator mapping for LLM models.
Returns a copy of cached discriminator mapping that is refreshed when the registry
is updated via refresh_llm_registry() (called on startup and via Redis pub/sub).
"""
# Return a copy to prevent external mutation
return dict(_discriminator_mapping)
def get_dynamic_model_slugs() -> set[str]:
"""Get all dynamic model slugs from the registry."""
return set(_dynamic_models.keys())
def get_all_model_slugs_for_validation() -> set[str]:
"""
Get ALL model slugs (both enabled and disabled) for validation purposes.
This is used for JSON schema enum validation - we need to accept any known
model value (even disabled ones) so that existing graphs don't fail validation.
The actual fallback/enforcement happens at runtime in llm_call().
"""
all_slugs = set(_dynamic_models.keys())
all_slugs.update(_static_metadata.keys())
return all_slugs
def iter_dynamic_models() -> Iterable[RegistryModel]:
"""Iterate over all dynamic models in the registry."""
return tuple(_dynamic_models.values())
def get_fallback_model_for_disabled(disabled_model_slug: str) -> RegistryModel | None:
"""
Find a fallback model when the requested model is disabled.
Looks for an enabled model from the same provider, preferring the one whose
context window is closest to the disabled model's.
Args:
disabled_model_slug: The slug of the disabled model
Returns:
An enabled RegistryModel from the same provider, or None if no fallback found
"""
disabled_model = _dynamic_models.get(disabled_model_slug)
if not disabled_model:
return None
provider = disabled_model.metadata.provider
# Find all enabled models from the same provider
candidates = [
model
for model in _dynamic_models.values()
if model.is_enabled and model.metadata.provider == provider
]
if not candidates:
return None
# Sort by: prefer models with similar context window, then by name
candidates.sort(
key=lambda m: (
abs(m.metadata.context_window - disabled_model.metadata.context_window),
m.display_name.lower(),
)
)
return candidates[0]
def is_model_enabled(model_slug: str) -> bool:
"""Check if a model is enabled in the registry."""
model = _dynamic_models.get(model_slug)
if not model:
# Model not in registry - assume it's a static/legacy model and allow it
return True
return model.is_enabled
def get_model_info(model_slug: str) -> RegistryModel | None:
"""Get model info from the registry."""
return _dynamic_models.get(model_slug)
def get_default_model_slug() -> str | None:
"""
Get the default model slug to use for block defaults.
Returns the recommended model if set (configured via admin UI),
otherwise returns the first enabled model alphabetically.
Returns None if no models are available or enabled.
"""
# Return the recommended model if one is set and enabled
for model in _dynamic_models.values():
if model.is_recommended and model.is_enabled:
return model.slug
# No recommended model set - find first enabled model alphabetically
for model in sorted(_dynamic_models.values(), key=lambda m: m.display_name.lower()):
if model.is_enabled:
logger.warning(
"No recommended model set, using '%s' as default",
model.slug,
)
return model.slug
# No enabled models available
if _dynamic_models:
logger.error(
"No enabled models found in registry (%d models registered but all disabled)",
len(_dynamic_models),
)
else:
logger.error("No models registered in LLM registry")
return None
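For illustration, the fallback ordering reduced to its sort key, with hypothetical candidates; the closest context window wins first, and display name breaks ties:
disabled_ctx = 128_000
candidates = [
    {"display_name": "GPT-4o Mini", "context_window": 128_000},
    {"display_name": "GPT-4o", "context_window": 200_000},
]
candidates.sort(
    key=lambda m: (abs(m["context_window"] - disabled_ctx), m["display_name"].lower())
)
print(candidates[0]["display_name"])  # "GPT-4o Mini" - closest context window wins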

View File

@@ -0,0 +1,130 @@
"""
Helper utilities for LLM registry integration with block schemas.
This module handles the dynamic injection of discriminator mappings
and model options from the LLM registry into block schemas.
"""
import logging
from typing import Any
from backend.data.llm_registry.registry import (
get_all_model_slugs_for_validation,
get_default_model_slug,
get_llm_discriminator_mapping,
get_llm_model_schema_options,
)
logger = logging.getLogger(__name__)
def is_llm_model_field(field_name: str, field_info: Any) -> bool:
"""
Check if a field is an LLM model selection field.
Returns True if the field has 'options' in json_schema_extra
(set by llm_model_schema_extra() in blocks/llm.py).
"""
if not hasattr(field_info, "json_schema_extra"):
return False
extra = field_info.json_schema_extra
if isinstance(extra, dict):
return "options" in extra
return False
def refresh_llm_model_options(field_schema: dict[str, Any]) -> None:
"""
Refresh LLM model options from the registry.
Updates 'options' (for frontend dropdown) to show only enabled models,
but keeps the 'enum' (for validation) inclusive of ALL known models.
This is important because:
- Options: What users see in the dropdown (enabled models only)
- Enum: What values pass validation (all known models, including disabled)
Existing graphs may have disabled models selected - they should pass validation
and the fallback logic in llm_call() will handle using an alternative model.
"""
fresh_options = get_llm_model_schema_options()
if not fresh_options:
return
# Update options array (UI dropdown) - only enabled models
if "options" in field_schema:
field_schema["options"] = fresh_options
all_known_slugs = get_all_model_slugs_for_validation()
if all_known_slugs and "enum" in field_schema:
existing_enum = set(field_schema.get("enum", []))
combined_enum = existing_enum | all_known_slugs
field_schema["enum"] = sorted(combined_enum)
# Set the default value from the registry (recommended model if set, else first enabled)
# This ensures new blocks have a sensible default pre-selected
default_slug = get_default_model_slug()
if default_slug:
field_schema["default"] = default_slug
def refresh_llm_discriminator_mapping(field_schema: dict[str, Any]) -> None:
"""
Refresh discriminator_mapping for fields that use model-based discrimination.
The discriminator is already set when AICredentialsField() creates the field.
We only need to refresh the mapping when models are added/removed.
"""
if field_schema.get("discriminator") != "model":
return
# Always refresh the mapping to get latest models
fresh_mapping = get_llm_discriminator_mapping()
if fresh_mapping is not None:
field_schema["discriminator_mapping"] = fresh_mapping
def update_schema_with_llm_registry(
schema: dict[str, Any], model_class: type | None = None
) -> None:
"""
Update a JSON schema with current LLM registry data.
Refreshes:
1. Model options for LLM model selection fields (dropdown choices)
2. Discriminator mappings for credentials fields (model → provider)
Args:
schema: The JSON schema to update (mutated in-place)
model_class: The Pydantic model class (optional, for field introspection)
"""
properties = schema.get("properties", {})
for field_name, field_schema in properties.items():
if not isinstance(field_schema, dict):
continue
# Refresh model options for LLM model fields
if model_class and hasattr(model_class, "model_fields"):
field_info = model_class.model_fields.get(field_name)
if field_info and is_llm_model_field(field_name, field_info):
try:
refresh_llm_model_options(field_schema)
except Exception as exc:
logger.warning(
"Failed to refresh LLM options for field %s: %s",
field_name,
exc,
)
# Refresh discriminator mapping for fields that use model discrimination
try:
refresh_llm_discriminator_mapping(field_schema)
except Exception as exc:
logger.warning(
"Failed to refresh discriminator mapping for field %s: %s",
field_name,
exc,
)
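A minimal sketch of what the post-processing does to a field schema; the input dict is contrived, since real schemas come from BlockSchema.jsonschema():
from backend.data.llm_registry import update_schema_with_llm_registry

schema = {
    "properties": {
        "credentials": {"discriminator": "model", "discriminator_mapping": {}},
    }
}
# With no model_class the options refresh is skipped; only the mapping is rebuilt.
update_schema_with_llm_registry(schema)
print(schema["properties"]["credentials"]["discriminator_mapping"])  # slug -> provider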

View File

@@ -40,6 +40,7 @@ from pydantic_core import (
)
from typing_extensions import TypedDict
from backend.data.llm_registry import update_schema_with_llm_registry
from backend.integrations.providers import ProviderName
from backend.util.json import loads as json_loads
from backend.util.settings import Secrets
@@ -544,7 +545,9 @@ class CredentialsMetaInput(BaseModel, Generic[CP, CT]):
else:
schema["credentials_provider"] = allowed_providers
schema["credentials_types"] = model_class.allowed_cred_types()
# Do not return anything, just mutate schema in place
# Ensure LLM discriminators are populated (delegates to shared helper)
update_schema_with_llm_registry(schema, model_class)
model_config = ConfigDict(
json_schema_extra=_add_json_schema_extra, # type: ignore
@@ -693,16 +696,20 @@ def CredentialsField(
This is enforced by the `BlockSchema` base class.
"""
field_schema_extra = {
k: v
for k, v in {
"credentials_scopes": list(required_scopes) or None,
"discriminator": discriminator,
"discriminator_mapping": discriminator_mapping,
"discriminator_values": discriminator_values,
}.items()
if v is not None
}
# Build field_schema_extra - always include discriminator and mapping if discriminator is set
field_schema_extra: dict[str, Any] = {}
# Always include discriminator if provided
if discriminator is not None:
field_schema_extra["discriminator"] = discriminator
# Always include discriminator_mapping when discriminator is set (even if empty initially)
field_schema_extra["discriminator_mapping"] = discriminator_mapping or {}
# Include other optional fields (only if not None)
if required_scopes:
field_schema_extra["credentials_scopes"] = list(required_scopes)
if discriminator_values:
field_schema_extra["discriminator_values"] = discriminator_values
# Merge any json_schema_extra passed in kwargs
if "json_schema_extra" in kwargs:

View File

@@ -41,7 +41,6 @@ FrontendOnboardingStep = Literal[
OnboardingStep.AGENT_NEW_RUN,
OnboardingStep.AGENT_INPUT,
OnboardingStep.CONGRATS,
OnboardingStep.VISIT_COPILOT,
OnboardingStep.MARKETPLACE_VISIT,
OnboardingStep.BUILDER_OPEN,
]
@@ -123,9 +122,6 @@ async def update_user_onboarding(user_id: str, data: UserOnboardingUpdate):
async def _reward_user(user_id: str, onboarding: UserOnboarding, step: OnboardingStep):
reward = 0
match step:
# Welcome bonus for visiting copilot ($5 = 500 credits)
case OnboardingStep.VISIT_COPILOT:
reward = 500
# Reward user when they clicked New Run during onboarding
# This is because they need credits before scheduling a run (next step)
# This is seen as a reward for the GET_RESULTS step in the wallet

View File

@@ -1,276 +0,0 @@
"""
Database CRUD operations for User Workspace.
This module provides functions for managing user workspaces and workspace files.
"""
import logging
from datetime import datetime, timezone
from typing import Optional
from prisma.models import UserWorkspace, UserWorkspaceFile
from prisma.types import UserWorkspaceFileWhereInput
from backend.util.json import SafeJson
logger = logging.getLogger(__name__)
async def get_or_create_workspace(user_id: str) -> UserWorkspace:
"""
Get user's workspace, creating one if it doesn't exist.
Uses upsert to handle race conditions when multiple concurrent requests
attempt to create a workspace for the same user.
Args:
user_id: The user's ID
Returns:
UserWorkspace instance
"""
workspace = await UserWorkspace.prisma().upsert(
where={"userId": user_id},
data={
"create": {"userId": user_id},
"update": {}, # No updates needed if exists
},
)
return workspace
async def get_workspace(user_id: str) -> Optional[UserWorkspace]:
"""
Get user's workspace if it exists.
Args:
user_id: The user's ID
Returns:
UserWorkspace instance or None
"""
return await UserWorkspace.prisma().find_unique(where={"userId": user_id})
async def create_workspace_file(
workspace_id: str,
file_id: str,
name: str,
path: str,
storage_path: str,
mime_type: str,
size_bytes: int,
checksum: Optional[str] = None,
metadata: Optional[dict] = None,
) -> UserWorkspaceFile:
"""
Create a new workspace file record.
Args:
workspace_id: The workspace ID
file_id: The file ID (same as used in storage path for consistency)
name: User-visible filename
path: Virtual path (e.g., "/documents/report.pdf")
storage_path: Actual storage path (GCS or local)
mime_type: MIME type of the file
size_bytes: File size in bytes
checksum: Optional SHA256 checksum
metadata: Optional additional metadata
Returns:
Created UserWorkspaceFile instance
"""
# Normalize path to start with /
if not path.startswith("/"):
path = f"/{path}"
file = await UserWorkspaceFile.prisma().create(
data={
"id": file_id,
"workspaceId": workspace_id,
"name": name,
"path": path,
"storagePath": storage_path,
"mimeType": mime_type,
"sizeBytes": size_bytes,
"checksum": checksum,
"metadata": SafeJson(metadata or {}),
}
)
logger.info(
f"Created workspace file {file.id} at path {path} "
f"in workspace {workspace_id}"
)
return file
async def get_workspace_file(
file_id: str,
workspace_id: Optional[str] = None,
) -> Optional[UserWorkspaceFile]:
"""
Get a workspace file by ID.
Args:
file_id: The file ID
workspace_id: Optional workspace ID for validation
Returns:
UserWorkspaceFile instance or None
"""
where_clause: dict = {"id": file_id, "isDeleted": False}
if workspace_id:
where_clause["workspaceId"] = workspace_id
return await UserWorkspaceFile.prisma().find_first(where=where_clause)
async def get_workspace_file_by_path(
workspace_id: str,
path: str,
) -> Optional[UserWorkspaceFile]:
"""
Get a workspace file by its virtual path.
Args:
workspace_id: The workspace ID
path: Virtual path
Returns:
UserWorkspaceFile instance or None
"""
# Normalize path
if not path.startswith("/"):
path = f"/{path}"
return await UserWorkspaceFile.prisma().find_first(
where={
"workspaceId": workspace_id,
"path": path,
"isDeleted": False,
}
)
async def list_workspace_files(
workspace_id: str,
path_prefix: Optional[str] = None,
include_deleted: bool = False,
limit: Optional[int] = None,
offset: int = 0,
) -> list[UserWorkspaceFile]:
"""
List files in a workspace.
Args:
workspace_id: The workspace ID
path_prefix: Optional path prefix to filter (e.g., "/documents/")
include_deleted: Whether to include soft-deleted files
limit: Maximum number of files to return
offset: Number of files to skip
Returns:
List of UserWorkspaceFile instances
"""
where_clause: UserWorkspaceFileWhereInput = {"workspaceId": workspace_id}
if not include_deleted:
where_clause["isDeleted"] = False
if path_prefix:
# Normalize prefix
if not path_prefix.startswith("/"):
path_prefix = f"/{path_prefix}"
where_clause["path"] = {"startswith": path_prefix}
return await UserWorkspaceFile.prisma().find_many(
where=where_clause,
order={"createdAt": "desc"},
take=limit,
skip=offset,
)
async def count_workspace_files(
workspace_id: str,
path_prefix: Optional[str] = None,
include_deleted: bool = False,
) -> int:
"""
Count files in a workspace.
Args:
workspace_id: The workspace ID
path_prefix: Optional path prefix to filter (e.g., "/sessions/abc123/")
include_deleted: Whether to include soft-deleted files
Returns:
Number of files
"""
where_clause: dict = {"workspaceId": workspace_id}
if not include_deleted:
where_clause["isDeleted"] = False
if path_prefix:
# Normalize prefix
if not path_prefix.startswith("/"):
path_prefix = f"/{path_prefix}"
where_clause["path"] = {"startswith": path_prefix}
return await UserWorkspaceFile.prisma().count(where=where_clause)
async def soft_delete_workspace_file(
file_id: str,
workspace_id: Optional[str] = None,
) -> Optional[UserWorkspaceFile]:
"""
Soft-delete a workspace file.
The path is modified to include a deletion timestamp to free up the original
path for new files while preserving the record for potential recovery.
Args:
file_id: The file ID
workspace_id: Optional workspace ID for validation
Returns:
Updated UserWorkspaceFile instance or None if not found
"""
# First verify the file exists and belongs to workspace
file = await get_workspace_file(file_id, workspace_id)
if file is None:
return None
deleted_at = datetime.now(timezone.utc)
# Modify path to free up the unique constraint for new files at original path
# Format: {original_path}__deleted__{timestamp}
deleted_path = f"{file.path}__deleted__{int(deleted_at.timestamp())}"
updated = await UserWorkspaceFile.prisma().update(
where={"id": file_id},
data={
"isDeleted": True,
"deletedAt": deleted_at,
"path": deleted_path,
},
)
logger.info(f"Soft-deleted workspace file {file_id}")
return updated
async def get_workspace_total_size(workspace_id: str) -> int:
"""
Get the total size of all files in a workspace.
Args:
workspace_id: The workspace ID
Returns:
Total size in bytes
"""
files = await list_workspace_files(workspace_id)
return sum(file.sizeBytes for file in files)
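
For context on the module removed above, here is a minimal sketch of how its helpers were typically combined: upsert the workspace, register a file, list by prefix, and soft-delete. The import path, IDs, paths, and sizes are illustrative assumptions, and a connected Prisma client is assumed.

import uuid

from backend.data import workspace as workspace_db  # assumed import path

async def demo(user_id: str) -> None:
    # Upsert the user's workspace (safe under concurrent requests).
    workspace = await workspace_db.get_or_create_workspace(user_id)
    # Register a file record; the path is normalized to "/documents/report.pdf".
    file = await workspace_db.create_workspace_file(
        workspace_id=workspace.id,
        file_id=str(uuid.uuid4()),
        name="report.pdf",
        path="documents/report.pdf",
        storage_path=f"workspaces/{workspace.id}/report.pdf",
        mime_type="application/pdf",
        size_bytes=1024,
    )
    # List files under a virtual prefix and compute total usage.
    docs = await workspace_db.list_workspace_files(workspace.id, path_prefix="/documents/")
    total = await workspace_db.get_workspace_total_size(workspace.id)
    print(f"{len(docs)} file(s), {total} bytes")
    # Soft-delete frees the original path while keeping the record for recovery.
    await workspace_db.soft_delete_workspace_file(file.id, workspace_id=workspace.id)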

View File

@@ -0,0 +1,66 @@
"""
Helper functions for LLM registry initialization in executor context.
These functions handle refreshing the LLM registry when the executor starts
and subscribing to real-time updates via Redis pub/sub.
"""
import logging
from backend.data import db, llm_registry
from backend.data.block import BlockSchema, initialize_blocks
from backend.data.block_cost_config import refresh_llm_costs
from backend.data.llm_registry import subscribe_to_registry_refresh
logger = logging.getLogger(__name__)
async def initialize_registry_for_executor() -> None:
"""
Initialize blocks and refresh LLM registry in the executor context.
This must run in the executor's event loop to have access to the database.
"""
try:
# Connect to database if not already connected
if not db.is_connected():
await db.connect()
logger.info("[GraphExecutor] Connected to database for registry refresh")
# Initialize blocks (internally refreshes LLM registry and costs)
await initialize_blocks()
logger.info("[GraphExecutor] Blocks initialized")
except Exception as exc:
logger.warning(
"[GraphExecutor] Failed to refresh LLM registry on startup: %s",
exc,
exc_info=True,
)
async def refresh_registry_on_notification() -> None:
"""Refresh LLM registry when notified via Redis pub/sub."""
try:
# Ensure DB is connected
if not db.is_connected():
await db.connect()
# Refresh registry and costs
await llm_registry.refresh_llm_registry()
refresh_llm_costs()
# Clear block schema caches so they regenerate with new model options
BlockSchema.clear_all_schema_caches()
logger.info("[GraphExecutor] LLM registry refreshed from notification")
except Exception as exc:
logger.error(
"[GraphExecutor] Failed to refresh LLM registry from notification: %s",
exc,
exc_info=True,
)
async def subscribe_to_registry_updates() -> None:
"""Subscribe to Redis pub/sub for LLM registry refresh notifications."""
await subscribe_to_registry_refresh(refresh_registry_on_notification)
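
A minimal sketch of driving these helpers from a standalone event loop. The real hookup in this diff schedules them on the executor's node-execution loop via asyncio.run_coroutine_threadsafe (see the ExecutionProcessor hunk below); a reachable database and Redis are assumed.

import asyncio

from backend.executor.llm_registry_init import (
    initialize_registry_for_executor,
    subscribe_to_registry_updates,
)

async def _startup() -> None:
    # One-time refresh of blocks and the LLM registry, then listen for
    # registry-refresh notifications published over Redis pub/sub.
    await initialize_registry_for_executor()
    await subscribe_to_registry_updates()

if __name__ == "__main__":
    asyncio.run(_startup())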

View File

@@ -236,14 +236,7 @@ async def execute_node(
input_size = len(input_data_str)
log_metadata.debug("Executed node with input", input=input_data_str)
# Create node-specific execution context to avoid race conditions
# (multiple nodes can execute concurrently and would otherwise mutate shared state)
execution_context = execution_context.model_copy(
update={"node_id": node_id, "node_exec_id": node_exec_id}
)
# Inject extra execution arguments for the blocks via kwargs
# Keep individual kwargs for backwards compatibility with existing blocks
extra_exec_kwargs: dict = {
"graph_id": graph_id,
"graph_version": graph_version,
@@ -709,6 +702,20 @@ class ExecutionProcessor:
)
self.node_execution_thread.start()
self.node_evaluation_thread.start()
# Initialize LLM registry and subscribe to updates
from backend.executor.llm_registry_init import (
initialize_registry_for_executor,
subscribe_to_registry_updates,
)
asyncio.run_coroutine_threadsafe(
initialize_registry_for_executor(), self.node_execution_loop
)
asyncio.run_coroutine_threadsafe(
subscribe_to_registry_updates(), self.node_execution_loop
)
logger.info(f"[GraphExecutor] {self.tid} started")
@error_logged(swallow=False)

View File

@@ -892,19 +892,11 @@ async def add_graph_execution(
settings = await gdb.get_graph_settings(user_id=user_id, graph_id=graph_id)
execution_context = ExecutionContext(
# Execution identity
user_id=user_id,
graph_id=graph_id,
graph_exec_id=graph_exec.id,
graph_version=graph_exec.graph_version,
# Safety settings
human_in_the_loop_safe_mode=settings.human_in_the_loop_safe_mode,
sensitive_action_safe_mode=settings.sensitive_action_safe_mode,
# User settings
user_timezone=(
user.timezone if user.timezone != USER_TIMEZONE_NOT_SET else "UTC"
),
# Execution hierarchy
root_execution_id=graph_exec.id,
)

View File

@@ -348,7 +348,6 @@ async def test_add_graph_execution_is_repeatable(mocker: MockerFixture):
mock_graph_exec.id = "execution-id-123"
mock_graph_exec.node_executions = [] # Add this to avoid AttributeError
mock_graph_exec.status = ExecutionStatus.QUEUED # Required for race condition check
mock_graph_exec.graph_version = graph_version
mock_graph_exec.to_graph_execution_entry.return_value = mocker.MagicMock()
# Mock the queue and event bus
@@ -435,9 +434,6 @@ async def test_add_graph_execution_is_repeatable(mocker: MockerFixture):
# Create a second mock execution for the sanity check
mock_graph_exec_2 = mocker.MagicMock(spec=GraphExecutionWithNodes)
mock_graph_exec_2.id = "execution-id-456"
mock_graph_exec_2.node_executions = []
mock_graph_exec_2.status = ExecutionStatus.QUEUED
mock_graph_exec_2.graph_version = graph_version
mock_graph_exec_2.to_graph_execution_entry.return_value = mocker.MagicMock()
# Reset mocks and set up for second call
@@ -618,7 +614,6 @@ async def test_add_graph_execution_with_nodes_to_skip(mocker: MockerFixture):
mock_graph_exec.id = "execution-id-123"
mock_graph_exec.node_executions = []
mock_graph_exec.status = ExecutionStatus.QUEUED # Required for race condition check
mock_graph_exec.graph_version = graph_version
# Track what's passed to to_graph_execution_entry
captured_kwargs = {}

View File

@@ -0,0 +1,935 @@
from __future__ import annotations
from typing import Any, Iterable, Sequence, cast
import prisma
import prisma.models
from backend.data.db import transaction
from backend.server.v2.llm import model as llm_model
from backend.util.models import Pagination
def _json_dict(value: Any | None) -> dict[str, Any]:
if not value:
return {}
if isinstance(value, dict):
return value
return {}
def _map_cost(record: prisma.models.LlmModelCost) -> llm_model.LlmModelCost:
return llm_model.LlmModelCost(
id=record.id,
unit=record.unit,
credit_cost=record.creditCost,
credential_provider=record.credentialProvider,
credential_id=record.credentialId,
credential_type=record.credentialType,
currency=record.currency,
metadata=_json_dict(record.metadata),
)
def _map_creator(
record: prisma.models.LlmModelCreator,
) -> llm_model.LlmModelCreator:
return llm_model.LlmModelCreator(
id=record.id,
name=record.name,
display_name=record.displayName,
description=record.description,
website_url=record.websiteUrl,
logo_url=record.logoUrl,
metadata=_json_dict(record.metadata),
)
def _map_model(record: prisma.models.LlmModel) -> llm_model.LlmModel:
costs = []
if record.Costs:
costs = [_map_cost(cost) for cost in record.Costs]
creator = None
if hasattr(record, "Creator") and record.Creator:
creator = _map_creator(record.Creator)
return llm_model.LlmModel(
id=record.id,
slug=record.slug,
display_name=record.displayName,
description=record.description,
provider_id=record.providerId,
creator_id=record.creatorId,
creator=creator,
context_window=record.contextWindow,
max_output_tokens=record.maxOutputTokens,
is_enabled=record.isEnabled,
is_recommended=record.isRecommended,
capabilities=_json_dict(record.capabilities),
metadata=_json_dict(record.metadata),
costs=costs,
)
def _map_provider(record: prisma.models.LlmProvider) -> llm_model.LlmProvider:
models: list[llm_model.LlmModel] = []
if record.Models:
models = [_map_model(model) for model in record.Models]
return llm_model.LlmProvider(
id=record.id,
name=record.name,
display_name=record.displayName,
description=record.description,
default_credential_provider=record.defaultCredentialProvider,
default_credential_id=record.defaultCredentialId,
default_credential_type=record.defaultCredentialType,
supports_tools=record.supportsTools,
supports_json_output=record.supportsJsonOutput,
supports_reasoning=record.supportsReasoning,
supports_parallel_tool=record.supportsParallelTool,
metadata=_json_dict(record.metadata),
models=models,
)
async def list_providers(
include_models: bool = True, enabled_only: bool = False
) -> list[llm_model.LlmProvider]:
"""
List all LLM providers.
Args:
include_models: Whether to include models for each provider
enabled_only: If True, only include enabled models (for public routes)
"""
include: Any = None
if include_models:
model_where = {"isEnabled": True} if enabled_only else None
include = {
"Models": {
"include": {"Costs": True, "Creator": True},
"where": model_where,
}
}
records = await prisma.models.LlmProvider.prisma().find_many(include=include)
return [_map_provider(record) for record in records]
async def upsert_provider(
request: llm_model.UpsertLlmProviderRequest,
provider_id: str | None = None,
) -> llm_model.LlmProvider:
data: Any = {
"name": request.name,
"displayName": request.display_name,
"description": request.description,
"defaultCredentialProvider": request.default_credential_provider,
"defaultCredentialId": request.default_credential_id,
"defaultCredentialType": request.default_credential_type,
"supportsTools": request.supports_tools,
"supportsJsonOutput": request.supports_json_output,
"supportsReasoning": request.supports_reasoning,
"supportsParallelTool": request.supports_parallel_tool,
"metadata": prisma.Json(request.metadata or {}),
}
include: Any = {"Models": {"include": {"Costs": True, "Creator": True}}}
if provider_id:
record = await prisma.models.LlmProvider.prisma().update(
where={"id": provider_id},
data=data,
include=include,
)
else:
record = await prisma.models.LlmProvider.prisma().create(
data=data,
include=include,
)
if record is None:
raise ValueError("Failed to create/update provider")
return _map_provider(record)
async def delete_provider(provider_id: str) -> bool:
"""
Delete an LLM provider.
A provider can only be deleted if it has no associated models.
Due to onDelete: Restrict on LlmModel.Provider, the database will
block deletion if models exist.
Args:
provider_id: UUID of the provider to delete
Returns:
True if deleted successfully
Raises:
ValueError: If provider not found or has associated models
"""
# Check if provider exists
provider = await prisma.models.LlmProvider.prisma().find_unique(
where={"id": provider_id},
include={"Models": True},
)
if not provider:
raise ValueError(f"Provider with id '{provider_id}' not found")
# Check if provider has any models
model_count = len(provider.Models) if provider.Models else 0
if model_count > 0:
raise ValueError(
f"Cannot delete provider '{provider.displayName}' because it has "
f"{model_count} model(s). Delete all models first."
)
# Safe to delete
await prisma.models.LlmProvider.prisma().delete(where={"id": provider_id})
return True
async def list_models(
provider_id: str | None = None,
enabled_only: bool = False,
page: int = 1,
page_size: int = 50,
) -> llm_model.LlmModelsResponse:
"""
List LLM models with pagination.
Args:
provider_id: Optional filter by provider ID
enabled_only: If True, only return enabled models (for public routes)
page: Page number (1-indexed)
page_size: Number of models per page
"""
where: Any = {}
if provider_id:
where["providerId"] = provider_id
if enabled_only:
where["isEnabled"] = True
# Get total count for pagination
total_items = await prisma.models.LlmModel.prisma().count(
where=where if where else None
)
# Calculate pagination
skip = (page - 1) * page_size
total_pages = (total_items + page_size - 1) // page_size if total_items > 0 else 0
records = await prisma.models.LlmModel.prisma().find_many(
where=where if where else None,
include={"Costs": True, "Creator": True},
skip=skip,
take=page_size,
)
models = [_map_model(record) for record in records]
return llm_model.LlmModelsResponse(
models=models,
pagination=Pagination(
total_items=total_items,
total_pages=total_pages,
current_page=page,
page_size=page_size,
),
)
def _cost_create_payload(
costs: Sequence[llm_model.LlmModelCostInput],
) -> dict[str, Iterable[dict[str, Any]]]:
create_items = []
for cost in costs:
item: dict[str, Any] = {
"unit": cost.unit,
"creditCost": cost.credit_cost,
"credentialProvider": cost.credential_provider,
}
# Only include optional fields if they have values
if cost.credential_id:
item["credentialId"] = cost.credential_id
if cost.credential_type:
item["credentialType"] = cost.credential_type
if cost.currency:
item["currency"] = cost.currency
# Handle metadata - use Prisma Json type
if cost.metadata is not None and cost.metadata != {}:
item["metadata"] = prisma.Json(cost.metadata)
create_items.append(item)
return {"create": create_items}
async def create_model(
request: llm_model.CreateLlmModelRequest,
) -> llm_model.LlmModel:
data: Any = {
"slug": request.slug,
"displayName": request.display_name,
"description": request.description,
"Provider": {"connect": {"id": request.provider_id}},
"contextWindow": request.context_window,
"maxOutputTokens": request.max_output_tokens,
"isEnabled": request.is_enabled,
"capabilities": prisma.Json(request.capabilities or {}),
"metadata": prisma.Json(request.metadata or {}),
"Costs": _cost_create_payload(request.costs),
}
if request.creator_id:
data["Creator"] = {"connect": {"id": request.creator_id}}
record = await prisma.models.LlmModel.prisma().create(
data=data,
include={"Costs": True, "Creator": True, "Provider": True},
)
return _map_model(record)
async def update_model(
model_id: str,
request: llm_model.UpdateLlmModelRequest,
) -> llm_model.LlmModel:
# Build scalar field updates (non-relation fields)
scalar_data: Any = {}
if request.display_name is not None:
scalar_data["displayName"] = request.display_name
if request.description is not None:
scalar_data["description"] = request.description
if request.context_window is not None:
scalar_data["contextWindow"] = request.context_window
if request.max_output_tokens is not None:
scalar_data["maxOutputTokens"] = request.max_output_tokens
if request.is_enabled is not None:
scalar_data["isEnabled"] = request.is_enabled
if request.capabilities is not None:
scalar_data["capabilities"] = request.capabilities
if request.metadata is not None:
scalar_data["metadata"] = request.metadata
# Foreign keys can be updated directly as scalar fields
if request.provider_id is not None:
scalar_data["providerId"] = request.provider_id
if request.creator_id is not None:
# Empty string means remove the creator
scalar_data["creatorId"] = request.creator_id if request.creator_id else None
# If we have costs to update, we need to handle them separately
# because nested writes have different constraints
if request.costs is not None:
# Wrap cost replacement in a transaction for atomicity
async with transaction() as tx:
# First update scalar fields
if scalar_data:
await tx.llmmodel.update(
where={"id": model_id},
data=scalar_data,
)
# Then handle costs: delete existing and create new
await tx.llmmodelcost.delete_many(where={"llmModelId": model_id})
if request.costs:
cost_payload = _cost_create_payload(request.costs)
for cost_item in cost_payload["create"]:
cost_item["llmModelId"] = model_id
await tx.llmmodelcost.create(data=cast(Any, cost_item))
# Fetch the updated record (outside transaction)
record = await prisma.models.LlmModel.prisma().find_unique(
where={"id": model_id},
include={"Costs": True, "Creator": True},
)
else:
# No costs update - simple update
record = await prisma.models.LlmModel.prisma().update(
where={"id": model_id},
data=scalar_data,
include={"Costs": True, "Creator": True},
)
if not record:
raise ValueError(f"Model with id '{model_id}' not found")
return _map_model(record)
async def toggle_model(
model_id: str,
is_enabled: bool,
migrate_to_slug: str | None = None,
migration_reason: str | None = None,
custom_credit_cost: int | None = None,
) -> llm_model.ToggleLlmModelResponse:
"""
Toggle a model's enabled status, optionally migrating workflows when disabling.
Args:
model_id: UUID of the model to toggle
is_enabled: New enabled status
migrate_to_slug: If disabling and this is provided, migrate all workflows
using this model to the specified replacement model
migration_reason: Optional reason for the migration (e.g., "Provider outage")
custom_credit_cost: Optional custom pricing override for migrated workflows.
When set, the billing system should use this cost instead
of the target model's cost for affected nodes.
Returns:
ToggleLlmModelResponse with the updated model and optional migration stats
"""
import json
# Get the model being toggled
model = await prisma.models.LlmModel.prisma().find_unique(
where={"id": model_id}, include={"Costs": True}
)
if not model:
raise ValueError(f"Model with id '{model_id}' not found")
nodes_migrated = 0
migration_id: str | None = None
# If disabling with migration, perform migration first
if not is_enabled and migrate_to_slug:
# Validate replacement model exists and is enabled
replacement = await prisma.models.LlmModel.prisma().find_unique(
where={"slug": migrate_to_slug}
)
if not replacement:
raise ValueError(f"Replacement model '{migrate_to_slug}' not found")
if not replacement.isEnabled:
raise ValueError(
f"Replacement model '{migrate_to_slug}' is disabled. "
f"Please enable it before using it as a replacement."
)
# Perform all operations atomically within a single transaction
# This ensures no nodes are missed between query and update
async with transaction() as tx:
# Get the IDs of nodes that will be migrated (inside transaction for consistency)
node_ids_result = await tx.query_raw(
"""
SELECT id
FROM "AgentNode"
WHERE "constantInput"::jsonb->>'model' = $1
FOR UPDATE
""",
model.slug,
)
migrated_node_ids = (
[row["id"] for row in node_ids_result] if node_ids_result else []
)
nodes_migrated = len(migrated_node_ids)
if nodes_migrated > 0:
# Update by IDs to ensure we only update the exact nodes we queried
# Use JSON array and jsonb_array_elements_text for safe parameterization
node_ids_json = json.dumps(migrated_node_ids)
await tx.execute_raw(
"""
UPDATE "AgentNode"
SET "constantInput" = JSONB_SET(
"constantInput"::jsonb,
'{model}',
to_jsonb($1::text)
)
WHERE id::text IN (
SELECT jsonb_array_elements_text($2::jsonb)
)
""",
migrate_to_slug,
node_ids_json,
)
record = await tx.llmmodel.update(
where={"id": model_id},
data={"isEnabled": is_enabled},
include={"Costs": True},
)
# Create migration record for revert capability
if nodes_migrated > 0:
migration_data: Any = {
"sourceModelSlug": model.slug,
"targetModelSlug": migrate_to_slug,
"reason": migration_reason,
"migratedNodeIds": json.dumps(migrated_node_ids),
"nodeCount": nodes_migrated,
"customCreditCost": custom_credit_cost,
}
migration_record = await tx.llmmodelmigration.create(
data=migration_data
)
migration_id = migration_record.id
else:
# Simple toggle without migration
record = await prisma.models.LlmModel.prisma().update(
where={"id": model_id},
data={"isEnabled": is_enabled},
include={"Costs": True},
)
if record is None:
raise ValueError(f"Model with id '{model_id}' not found")
return llm_model.ToggleLlmModelResponse(
model=_map_model(record),
nodes_migrated=nodes_migrated,
migrated_to_slug=migrate_to_slug if nodes_migrated > 0 else None,
migration_id=migration_id,
)
async def get_model_usage(model_id: str) -> llm_model.LlmModelUsageResponse:
"""Get usage count for a model."""
import prisma as prisma_module
model = await prisma.models.LlmModel.prisma().find_unique(where={"id": model_id})
if not model:
raise ValueError(f"Model with id '{model_id}' not found")
count_result = await prisma_module.get_client().query_raw(
"""
SELECT COUNT(*) as count
FROM "AgentNode"
WHERE "constantInput"::jsonb->>'model' = $1
""",
model.slug,
)
node_count = int(count_result[0]["count"]) if count_result else 0
return llm_model.LlmModelUsageResponse(model_slug=model.slug, node_count=node_count)
async def delete_model(
model_id: str, replacement_model_slug: str | None = None
) -> llm_model.DeleteLlmModelResponse:
"""
Delete a model and optionally migrate all AgentNodes using it to a replacement model.
This performs an atomic operation within a database transaction:
1. Validates the model exists
2. Counts affected nodes
3. If nodes exist, validates replacement model and migrates them
4. Deletes the LlmModel record (CASCADE deletes costs)
Args:
model_id: UUID of the model to delete
replacement_model_slug: Slug of the model to migrate to (required only if nodes use this model)
Returns:
DeleteLlmModelResponse with migration stats
Raises:
ValueError: If model not found, nodes exist but no replacement provided,
replacement not found, or replacement is disabled
"""
# 1. Get the model being deleted (validation - outside transaction)
model = await prisma.models.LlmModel.prisma().find_unique(
where={"id": model_id}, include={"Costs": True}
)
if not model:
raise ValueError(f"Model with id '{model_id}' not found")
deleted_slug = model.slug
deleted_display_name = model.displayName
# 2. Count affected nodes first to determine if replacement is needed
import prisma as prisma_module
count_result = await prisma_module.get_client().query_raw(
"""
SELECT COUNT(*) as count
FROM "AgentNode"
WHERE "constantInput"::jsonb->>'model' = $1
""",
deleted_slug,
)
nodes_to_migrate = int(count_result[0]["count"]) if count_result else 0
# 3. Validate replacement model only if there are nodes to migrate
if nodes_to_migrate > 0:
if not replacement_model_slug:
raise ValueError(
f"Cannot delete model '{deleted_slug}': {nodes_to_migrate} workflow node(s) "
f"are using it. Please provide a replacement_model_slug to migrate them."
)
replacement = await prisma.models.LlmModel.prisma().find_unique(
where={"slug": replacement_model_slug}
)
if not replacement:
raise ValueError(f"Replacement model '{replacement_model_slug}' not found")
if not replacement.isEnabled:
raise ValueError(
f"Replacement model '{replacement_model_slug}' is disabled. "
f"Please enable it before using it as a replacement."
)
# 4. Perform migration (if needed) and deletion atomically within a transaction
async with transaction() as tx:
# Migrate all AgentNode.constantInput->model to replacement
if nodes_to_migrate > 0 and replacement_model_slug:
await tx.execute_raw(
"""
UPDATE "AgentNode"
SET "constantInput" = JSONB_SET(
"constantInput"::jsonb,
'{model}',
to_jsonb($1::text)
)
WHERE "constantInput"::jsonb->>'model' = $2
""",
replacement_model_slug,
deleted_slug,
)
# Delete the model (CASCADE will delete costs automatically)
await tx.llmmodel.delete(where={"id": model_id})
# Build appropriate message based on whether migration happened
if nodes_to_migrate > 0:
message = (
f"Successfully deleted model '{deleted_display_name}' ({deleted_slug}) "
f"and migrated {nodes_to_migrate} workflow node(s) to '{replacement_model_slug}'."
)
else:
message = (
f"Successfully deleted model '{deleted_display_name}' ({deleted_slug}). "
f"No workflows were using this model."
)
return llm_model.DeleteLlmModelResponse(
deleted_model_slug=deleted_slug,
deleted_model_display_name=deleted_display_name,
replacement_model_slug=replacement_model_slug,
nodes_migrated=nodes_to_migrate,
message=message,
)
def _map_migration(
record: prisma.models.LlmModelMigration,
) -> llm_model.LlmModelMigration:
return llm_model.LlmModelMigration(
id=record.id,
source_model_slug=record.sourceModelSlug,
target_model_slug=record.targetModelSlug,
reason=record.reason,
node_count=record.nodeCount,
custom_credit_cost=record.customCreditCost,
is_reverted=record.isReverted,
created_at=record.createdAt,
reverted_at=record.revertedAt,
)
async def list_migrations(
include_reverted: bool = False,
) -> list[llm_model.LlmModelMigration]:
"""
List model migrations, optionally including reverted ones.
Args:
include_reverted: If True, include reverted migrations. Default is False.
Returns:
List of LlmModelMigration records
"""
where: Any = None if include_reverted else {"isReverted": False}
records = await prisma.models.LlmModelMigration.prisma().find_many(
where=where,
order={"createdAt": "desc"},
)
return [_map_migration(record) for record in records]
async def get_migration(migration_id: str) -> llm_model.LlmModelMigration | None:
"""Get a specific migration by ID."""
record = await prisma.models.LlmModelMigration.prisma().find_unique(
where={"id": migration_id}
)
return _map_migration(record) if record else None
async def revert_migration(
migration_id: str,
re_enable_source_model: bool = True,
) -> llm_model.RevertMigrationResponse:
"""
Revert a model migration, restoring affected nodes to their original model.
This only reverts the specific nodes that were migrated, not all nodes
currently using the target model.
Args:
migration_id: UUID of the migration to revert
re_enable_source_model: Whether to re-enable the source model if it's disabled
Returns:
RevertMigrationResponse with revert stats
Raises:
ValueError: If migration not found, already reverted, or source model not available
"""
import json
from datetime import datetime, timezone
# Get the migration record
migration = await prisma.models.LlmModelMigration.prisma().find_unique(
where={"id": migration_id}
)
if not migration:
raise ValueError(f"Migration with id '{migration_id}' not found")
if migration.isReverted:
raise ValueError(
f"Migration '{migration_id}' has already been reverted "
f"on {migration.revertedAt.isoformat() if migration.revertedAt else 'unknown date'}"
)
# Check if source model exists
source_model = await prisma.models.LlmModel.prisma().find_unique(
where={"slug": migration.sourceModelSlug}
)
if not source_model:
raise ValueError(
f"Source model '{migration.sourceModelSlug}' no longer exists. "
f"Cannot revert migration."
)
# Get the migrated node IDs (Prisma auto-parses JSONB to list)
migrated_node_ids: list[str] = (
migration.migratedNodeIds
if isinstance(migration.migratedNodeIds, list)
else json.loads(migration.migratedNodeIds) # type: ignore
)
if not migrated_node_ids:
raise ValueError("No nodes to revert in this migration")
# Track if we need to re-enable the source model
source_model_was_disabled = not source_model.isEnabled
should_re_enable = source_model_was_disabled and re_enable_source_model
source_model_re_enabled = False
# Perform revert atomically
async with transaction() as tx:
# Re-enable the source model if requested and it was disabled
if should_re_enable:
await tx.llmmodel.update(
where={"id": source_model.id},
data={"isEnabled": True},
)
source_model_re_enabled = True
# Update only the specific nodes that were migrated
# We need to check that they still have the target model (haven't been changed since)
# Use a single batch update for efficiency
# Use JSON array and jsonb_array_elements_text for safe parameterization
node_ids_json = json.dumps(migrated_node_ids)
result = await tx.execute_raw(
"""
UPDATE "AgentNode"
SET "constantInput" = JSONB_SET(
"constantInput"::jsonb,
'{model}',
to_jsonb($1::text)
)
WHERE id::text IN (
SELECT jsonb_array_elements_text($2::jsonb)
)
AND "constantInput"::jsonb->>'model' = $3
""",
migration.sourceModelSlug,
node_ids_json,
migration.targetModelSlug,
)
nodes_reverted = result if result else 0
# Mark migration as reverted
await tx.llmmodelmigration.update(
where={"id": migration_id},
data={
"isReverted": True,
"revertedAt": datetime.now(timezone.utc),
},
)
# Calculate nodes that were already changed since migration
nodes_already_changed = len(migrated_node_ids) - nodes_reverted
# Build appropriate message
message_parts = [
f"Successfully reverted migration: {nodes_reverted} node(s) restored "
f"from '{migration.targetModelSlug}' to '{migration.sourceModelSlug}'."
]
if nodes_already_changed > 0:
message_parts.append(
f" {nodes_already_changed} node(s) were already changed and not reverted."
)
if source_model_re_enabled:
message_parts.append(
f" Model '{migration.sourceModelSlug}' has been re-enabled."
)
return llm_model.RevertMigrationResponse(
migration_id=migration_id,
source_model_slug=migration.sourceModelSlug,
target_model_slug=migration.targetModelSlug,
nodes_reverted=nodes_reverted,
nodes_already_changed=nodes_already_changed,
source_model_re_enabled=source_model_re_enabled,
message="".join(message_parts),
)
# ============================================================================
# Creator CRUD operations
# ============================================================================
async def list_creators() -> list[llm_model.LlmModelCreator]:
"""List all LLM model creators."""
records = await prisma.models.LlmModelCreator.prisma().find_many(
order={"displayName": "asc"}
)
return [_map_creator(record) for record in records]
async def get_creator(creator_id: str) -> llm_model.LlmModelCreator | None:
"""Get a specific creator by ID."""
record = await prisma.models.LlmModelCreator.prisma().find_unique(
where={"id": creator_id}
)
return _map_creator(record) if record else None
async def upsert_creator(
request: llm_model.UpsertLlmCreatorRequest,
creator_id: str | None = None,
) -> llm_model.LlmModelCreator:
"""Create or update a model creator."""
data: Any = {
"name": request.name,
"displayName": request.display_name,
"description": request.description,
"websiteUrl": request.website_url,
"logoUrl": request.logo_url,
"metadata": prisma.Json(request.metadata or {}),
}
if creator_id:
record = await prisma.models.LlmModelCreator.prisma().update(
where={"id": creator_id},
data=data,
)
else:
record = await prisma.models.LlmModelCreator.prisma().create(data=data)
if record is None:
raise ValueError("Failed to create/update creator")
return _map_creator(record)
async def delete_creator(creator_id: str) -> bool:
"""
Delete a model creator.
This will set creatorId to NULL on all associated models (due to onDelete: SetNull).
Args:
creator_id: UUID of the creator to delete
Returns:
True if deleted successfully
Raises:
ValueError: If creator not found
"""
creator = await prisma.models.LlmModelCreator.prisma().find_unique(
where={"id": creator_id}
)
if not creator:
raise ValueError(f"Creator with id '{creator_id}' not found")
await prisma.models.LlmModelCreator.prisma().delete(where={"id": creator_id})
return True
async def get_recommended_model() -> llm_model.LlmModel | None:
"""
Get the currently recommended LLM model.
Returns:
The recommended model, or None if no model is marked as recommended.
"""
record = await prisma.models.LlmModel.prisma().find_first(
where={"isRecommended": True, "isEnabled": True},
include={"Costs": True, "Creator": True},
)
return _map_model(record) if record else None
async def set_recommended_model(
model_id: str,
) -> tuple[llm_model.LlmModel, str | None]:
"""
Set a model as the recommended model.
This will clear the isRecommended flag from any other model and set it
on the specified model. The model must be enabled.
Args:
model_id: UUID of the model to set as recommended
Returns:
Tuple of (the updated model, previous recommended model slug or None)
Raises:
ValueError: If model not found or not enabled
"""
# First, verify the model exists and is enabled
target_model = await prisma.models.LlmModel.prisma().find_unique(
where={"id": model_id}
)
if not target_model:
raise ValueError(f"Model with id '{model_id}' not found")
if not target_model.isEnabled:
raise ValueError(
f"Cannot set disabled model '{target_model.slug}' as recommended"
)
# Get the current recommended model (if any)
current_recommended = await prisma.models.LlmModel.prisma().find_first(
where={"isRecommended": True}
)
previous_slug = current_recommended.slug if current_recommended else None
# Use a transaction to ensure atomicity
async with transaction() as tx:
# Clear isRecommended from all models
await tx.llmmodel.update_many(
where={"isRecommended": True},
data={"isRecommended": False},
)
# Set the new recommended model
await tx.llmmodel.update(
where={"id": model_id},
data={"isRecommended": True},
)
# Fetch and return the updated model
updated_record = await prisma.models.LlmModel.prisma().find_unique(
where={"id": model_id},
include={"Costs": True, "Creator": True},
)
if not updated_record:
raise ValueError("Failed to fetch updated model")
return _map_model(updated_record), previous_slug
async def get_recommended_model_slug() -> str | None:
"""
Get the slug of the currently recommended LLM model.
Returns:
The slug of the recommended model, or None if no model is marked as recommended.
"""
record = await prisma.models.LlmModel.prisma().find_first(
where={"isRecommended": True, "isEnabled": True},
)
return record.slug if record else None
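
A minimal sketch of the disable-with-migration flow built on this module, followed by a targeted revert. Call it from an async context with a connected Prisma client; the model ID and replacement slug are illustrative.

from backend.server.v2.llm import db as llm_db

async def rotate_model(old_model_id: str) -> None:
    # Disable the old model and migrate every AgentNode that references it.
    toggled = await llm_db.toggle_model(
        model_id=old_model_id,
        is_enabled=False,
        migrate_to_slug="gpt-4o-mini",  # illustrative replacement slug
        migration_reason="Provider outage",
    )
    print(f"Migrated {toggled.nodes_migrated} node(s); migration={toggled.migration_id}")

    # The migration record allows reverting exactly those nodes later.
    if toggled.migration_id:
        reverted = await llm_db.revert_migration(
            toggled.migration_id, re_enable_source_model=True
        )
        print(reverted.message)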

View File

@@ -0,0 +1,235 @@
from __future__ import annotations
import re
from datetime import datetime
from typing import Any, Optional
import prisma.enums
import pydantic
from backend.util.models import Pagination
# Pattern for valid model slugs: alphanumeric start, then alphanumeric, dots, underscores, slashes, hyphens
SLUG_PATTERN = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9._/-]*$")
class LlmModelCost(pydantic.BaseModel):
id: str
unit: prisma.enums.LlmCostUnit = prisma.enums.LlmCostUnit.RUN
credit_cost: int
credential_provider: str
credential_id: Optional[str] = None
credential_type: Optional[str] = None
currency: Optional[str] = None
metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
class LlmModelCreator(pydantic.BaseModel):
"""Represents the organization that created/trained the model (e.g., OpenAI, Meta)."""
id: str
name: str
display_name: str
description: Optional[str] = None
website_url: Optional[str] = None
logo_url: Optional[str] = None
metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
class LlmModel(pydantic.BaseModel):
id: str
slug: str
display_name: str
description: Optional[str] = None
provider_id: str
creator_id: Optional[str] = None
creator: Optional[LlmModelCreator] = None
context_window: int
max_output_tokens: Optional[int] = None
is_enabled: bool = True
is_recommended: bool = False
capabilities: dict[str, Any] = pydantic.Field(default_factory=dict)
metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
costs: list[LlmModelCost] = pydantic.Field(default_factory=list)
class LlmProvider(pydantic.BaseModel):
id: str
name: str
display_name: str
description: Optional[str] = None
default_credential_provider: Optional[str] = None
default_credential_id: Optional[str] = None
default_credential_type: Optional[str] = None
supports_tools: bool = True
supports_json_output: bool = True
supports_reasoning: bool = False
supports_parallel_tool: bool = False
metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
models: list[LlmModel] = pydantic.Field(default_factory=list)
class LlmProvidersResponse(pydantic.BaseModel):
providers: list[LlmProvider]
class LlmModelsResponse(pydantic.BaseModel):
models: list[LlmModel]
pagination: Optional[Pagination] = None
class LlmCreatorsResponse(pydantic.BaseModel):
creators: list[LlmModelCreator]
class UpsertLlmProviderRequest(pydantic.BaseModel):
name: str
display_name: str
description: Optional[str] = None
default_credential_provider: Optional[str] = None
default_credential_id: Optional[str] = None
default_credential_type: Optional[str] = "api_key"
supports_tools: bool = True
supports_json_output: bool = True
supports_reasoning: bool = False
supports_parallel_tool: bool = False
metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
class UpsertLlmCreatorRequest(pydantic.BaseModel):
name: str
display_name: str
description: Optional[str] = None
website_url: Optional[str] = None
logo_url: Optional[str] = None
metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
class LlmModelCostInput(pydantic.BaseModel):
unit: prisma.enums.LlmCostUnit = prisma.enums.LlmCostUnit.RUN
credit_cost: int
credential_provider: str
credential_id: Optional[str] = None
credential_type: Optional[str] = "api_key"
currency: Optional[str] = None
metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
class CreateLlmModelRequest(pydantic.BaseModel):
slug: str
display_name: str
description: Optional[str] = None
provider_id: str
creator_id: Optional[str] = None
context_window: int
max_output_tokens: Optional[int] = None
is_enabled: bool = True
capabilities: dict[str, Any] = pydantic.Field(default_factory=dict)
metadata: dict[str, Any] = pydantic.Field(default_factory=dict)
costs: list[LlmModelCostInput]
@pydantic.field_validator("slug")
@classmethod
def validate_slug(cls, v: str) -> str:
if not v or len(v) > 100:
raise ValueError("Slug must be 1-100 characters")
if not SLUG_PATTERN.match(v):
raise ValueError(
"Slug must start with alphanumeric and contain only "
"alphanumeric characters, dots, underscores, slashes, or hyphens"
)
return v
class UpdateLlmModelRequest(pydantic.BaseModel):
display_name: Optional[str] = None
description: Optional[str] = None
context_window: Optional[int] = None
max_output_tokens: Optional[int] = None
is_enabled: Optional[bool] = None
capabilities: Optional[dict[str, Any]] = None
metadata: Optional[dict[str, Any]] = None
provider_id: Optional[str] = None
creator_id: Optional[str] = None
costs: Optional[list[LlmModelCostInput]] = None
class ToggleLlmModelRequest(pydantic.BaseModel):
is_enabled: bool
migrate_to_slug: Optional[str] = None
migration_reason: Optional[str] = None # e.g., "Provider outage"
# Custom pricing override for migrated workflows. When set, billing should use
# this cost instead of the target model's cost for affected nodes.
# See LlmModelMigration in schema.prisma for full documentation.
custom_credit_cost: Optional[int] = None
class ToggleLlmModelResponse(pydantic.BaseModel):
model: LlmModel
nodes_migrated: int = 0
migrated_to_slug: Optional[str] = None
migration_id: Optional[str] = None # ID of the migration record for revert
class DeleteLlmModelResponse(pydantic.BaseModel):
deleted_model_slug: str
deleted_model_display_name: str
replacement_model_slug: Optional[str] = None
nodes_migrated: int
message: str
class LlmModelUsageResponse(pydantic.BaseModel):
model_slug: str
node_count: int
# Migration tracking models
class LlmModelMigration(pydantic.BaseModel):
id: str
source_model_slug: str
target_model_slug: str
reason: Optional[str] = None
node_count: int
# Custom pricing override - billing should use this instead of target model's cost
custom_credit_cost: Optional[int] = None
is_reverted: bool = False
created_at: datetime
reverted_at: Optional[datetime] = None
class LlmMigrationsResponse(pydantic.BaseModel):
migrations: list[LlmModelMigration]
class RevertMigrationRequest(pydantic.BaseModel):
re_enable_source_model: bool = (
True # Whether to re-enable the source model if disabled
)
class RevertMigrationResponse(pydantic.BaseModel):
migration_id: str
source_model_slug: str
target_model_slug: str
nodes_reverted: int
nodes_already_changed: int = (
0 # Nodes that were modified since migration (not reverted)
)
source_model_re_enabled: bool = False # Whether the source model was re-enabled
message: str
class SetRecommendedModelRequest(pydantic.BaseModel):
model_id: str
class SetRecommendedModelResponse(pydantic.BaseModel):
model: LlmModel
previous_recommended_slug: Optional[str] = None
message: str
class RecommendedModelResponse(pydantic.BaseModel):
model: Optional[LlmModel] = None
slug: Optional[str] = None
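
A minimal sketch of building a create-model request against these schemas; the values are illustrative and the provider UUID is a placeholder.

import prisma.enums

from backend.server.v2.llm import model as llm_model

request = llm_model.CreateLlmModelRequest(
    slug="gpt-4o-mini",  # must match SLUG_PATTERN; "my model!" would raise a ValidationError
    display_name="GPT-4o mini",
    provider_id="00000000-0000-0000-0000-000000000000",  # placeholder provider UUID
    context_window=128_000,
    max_output_tokens=16_384,
    costs=[
        llm_model.LlmModelCostInput(
            unit=prisma.enums.LlmCostUnit.RUN,
            credit_cost=1,
            credential_provider="openai",
        )
    ],
)
print(request.slug, request.costs[0].credit_cost)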

View File

@@ -0,0 +1,29 @@
import autogpt_libs.auth
import fastapi
from backend.server.v2.llm import db as llm_db
from backend.server.v2.llm import model as llm_model
router = fastapi.APIRouter(
prefix="/llm",
tags=["llm"],
dependencies=[fastapi.Security(autogpt_libs.auth.requires_user)],
)
@router.get("/models", response_model=llm_model.LlmModelsResponse)
async def list_models(
page: int = fastapi.Query(default=1, ge=1, description="Page number (1-indexed)"),
page_size: int = fastapi.Query(
default=50, ge=1, le=100, description="Number of models per page"
),
):
"""List all enabled LLM models available to users."""
return await llm_db.list_models(enabled_only=True, page=page, page_size=page_size)
@router.get("/providers", response_model=llm_model.LlmProvidersResponse)
async def list_providers():
"""List all LLM providers with their enabled models."""
providers = await llm_db.list_providers(include_models=True, enabled_only=True)
return llm_model.LlmProvidersResponse(providers=providers)
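
A minimal sketch of exercising these routes with FastAPI's TestClient. The router module path is assumed, the auth dependency is overridden for the sketch, the real API may mount the router under a different prefix, and list_models still needs a reachable database.

import fastapi
from fastapi.testclient import TestClient

import autogpt_libs.auth
from backend.server.v2.llm.routes import router as llm_router  # assumed module path

app = fastapi.FastAPI()
app.include_router(llm_router)
# Bypass Security(requires_user) for this sketch only.
app.dependency_overrides[autogpt_libs.auth.requires_user] = lambda: "test-user"

client = TestClient(app)
resp = client.get("/llm/models", params={"page": 1, "page_size": 10})
print(resp.status_code, resp.json().get("pagination"))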

View File

@@ -13,7 +13,6 @@ import aiohttp
from gcloud.aio import storage as async_gcs_storage
from google.cloud import storage as gcs_storage
from backend.util.gcs_utils import download_with_fresh_session, generate_signed_url
from backend.util.settings import Config
logger = logging.getLogger(__name__)
@@ -252,7 +251,7 @@ class CloudStorageHandler:
f"in_task: {current_task is not None}"
)
# Parse bucket and blob name from path (path already has gcs:// prefix removed)
# Parse bucket and blob name from path
parts = path.split("/", 1)
if len(parts) != 2:
raise ValueError(f"Invalid GCS path: {path}")
@@ -262,19 +261,50 @@ class CloudStorageHandler:
# Authorization check
self._validate_file_access(blob_name, user_id, graph_exec_id)
logger.info(
f"[CloudStorage] About to download from GCS - bucket: {bucket_name}, blob: {blob_name}"
# Use a fresh client for each download to avoid session issues
# This is less efficient but more reliable with the executor's event loop
logger.info("[CloudStorage] Creating fresh GCS client for download")
# Create a new session specifically for this download
session = aiohttp.ClientSession(
connector=aiohttp.TCPConnector(limit=10, force_close=True)
)
async_client = None
try:
content = await download_with_fresh_session(bucket_name, blob_name)
# Create a new GCS client with the fresh session
async_client = async_gcs_storage.Storage(session=session)
logger.info(
f"[CloudStorage] About to download from GCS - bucket: {bucket_name}, blob: {blob_name}"
)
# Download content using the fresh client
content = await async_client.download(bucket_name, blob_name)
logger.info(
f"[CloudStorage] GCS download successful - size: {len(content)} bytes"
)
# Clean up
await async_client.close()
await session.close()
return content
except FileNotFoundError:
raise
except Exception as e:
# Always try to clean up
if async_client is not None:
try:
await async_client.close()
except Exception as cleanup_error:
logger.warning(
f"[CloudStorage] Error closing GCS client: {cleanup_error}"
)
try:
await session.close()
except Exception as cleanup_error:
logger.warning(f"[CloudStorage] Error closing session: {cleanup_error}")
# Log the specific error for debugging
logger.error(
f"[CloudStorage] GCS download failed - error: {str(e)}, "
@@ -289,6 +319,10 @@ class CloudStorageHandler:
f"current_task: {current_task}, "
f"bucket: {bucket_name}, blob: redacted for privacy"
)
# Convert gcloud-aio exceptions to standard ones
if "404" in str(e) or "Not Found" in str(e):
raise FileNotFoundError(f"File not found: gcs://{path}")
raise
def _validate_file_access(
@@ -411,7 +445,8 @@ class CloudStorageHandler:
graph_exec_id: str | None = None,
) -> str:
"""Generate signed URL for GCS with authorization."""
# Parse bucket and blob name from path (path already has gcs:// prefix removed)
# Parse bucket and blob name from path
parts = path.split("/", 1)
if len(parts) != 2:
raise ValueError(f"Invalid GCS path: {path}")
@@ -421,11 +456,21 @@ class CloudStorageHandler:
# Authorization check
self._validate_file_access(blob_name, user_id, graph_exec_id)
# Use sync client for signed URLs since gcloud-aio doesn't support them
sync_client = self._get_sync_gcs_client()
return await generate_signed_url(
sync_client, bucket_name, blob_name, expiration_hours * 3600
bucket = sync_client.bucket(bucket_name)
blob = bucket.blob(blob_name)
# Generate signed URL asynchronously using sync client
url = await asyncio.to_thread(
blob.generate_signed_url,
version="v4",
expiration=datetime.now(timezone.utc) + timedelta(hours=expiration_hours),
method="GET",
)
return url
async def delete_expired_files(self, provider: str = "gcs") -> int:
"""
Delete files that have passed their expiration time.
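
A minimal standalone sketch of the fresh-session download pattern introduced above, mirroring the gcloud-aio calls in the diff; the bucket and blob names are illustrative and ambient GCS credentials are assumed.

import aiohttp
from gcloud.aio import storage as async_gcs_storage

async def download_blob(bucket_name: str, blob_name: str) -> bytes:
    # A short-lived session per download avoids reusing a session that was
    # created on a different event loop inside the executor.
    session = aiohttp.ClientSession(
        connector=aiohttp.TCPConnector(limit=10, force_close=True)
    )
    client = async_gcs_storage.Storage(session=session)
    try:
        return await client.download(bucket_name, blob_name)
    finally:
        await client.close()
        await session.close()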

View File

@@ -135,12 +135,6 @@ class GraphValidationError(ValueError):
)
class InvalidInputError(ValueError):
"""Raised when user input validation fails (e.g., search term too long)"""
pass
class DatabaseError(Exception):
"""Raised when there is an error interacting with the database"""

View File

@@ -5,26 +5,13 @@ import shutil
import tempfile
import uuid
from pathlib import Path
from typing import TYPE_CHECKING, Literal
from urllib.parse import urlparse
from backend.util.cloud_storage import get_cloud_storage_handler
from backend.util.request import Requests
from backend.util.settings import Config
from backend.util.type import MediaFileType
from backend.util.virus_scanner import scan_content_safe
if TYPE_CHECKING:
from backend.data.execution import ExecutionContext
# Return format options for store_media_file
# - "for_local_processing": Returns local file path - use with ffmpeg, MoviePy, PIL, etc.
# - "for_external_api": Returns data URI (base64) - use when sending content to external APIs
# - "for_block_output": Returns best format for output - workspace:// in CoPilot, data URI in graphs
MediaReturnFormat = Literal[
"for_local_processing", "for_external_api", "for_block_output"
]
TEMP_DIR = Path(tempfile.gettempdir()).resolve()
# Maximum filename length (conservative limit for most filesystems)
@@ -80,56 +67,42 @@ def clean_exec_files(graph_exec_id: str, file: str = "") -> None:
async def store_media_file(
graph_exec_id: str,
file: MediaFileType,
execution_context: "ExecutionContext",
*,
return_format: MediaReturnFormat,
user_id: str,
return_content: bool = False,
) -> MediaFileType:
"""
Safely handle 'file' (a data URI, a URL, a workspace:// reference, or a local path
relative to {temp}/exec_file/{exec_id}), placing or verifying it under:
Safely handle 'file' (a data URI, a URL, or a local path relative to {temp}/exec_file/{exec_id}),
placing or verifying it under:
{tempdir}/exec_file/{exec_id}/...
For each MediaFileType input:
- Data URI: decode and store locally
- URL: download and store locally
- workspace:// reference: read from workspace, store locally
- Local path: verify it exists in exec_file directory
If 'return_content=True', return a data URI (data:<mime>;base64,<content>).
Otherwise, returns the file media path relative to the exec_id folder.
Return format options:
- "for_local_processing": Returns local file path - use with ffmpeg, MoviePy, PIL, etc.
- "for_external_api": Returns data URI (base64) - use when sending to external APIs
- "for_block_output": Returns best format for output - workspace:// in CoPilot, data URI in graphs
For each MediaFileType type:
- Data URI:
-> decode and store in a new random file in that folder
- URL:
-> download and store in that folder
- Local path:
-> interpret as relative to that folder; verify it exists
(no copying, as it's presumably already there).
We realpath-check so no symlink or '..' can escape the folder.
:param file: Data URI, URL, workspace://, or local (relative) path.
:param execution_context: ExecutionContext with user_id, graph_exec_id, workspace_id.
:param return_format: What to return: "for_local_processing", "for_external_api", or "for_block_output".
:return: The requested result based on return_format.
:param graph_exec_id: The unique ID of the graph execution.
:param file: Data URI, URL, or local (relative) path.
:param return_content: If True, return a data URI of the file content.
If False, return the *relative* path inside the exec_id folder.
:return: The requested result: data URI or relative path of the media.
"""
# Extract values from execution_context
graph_exec_id = execution_context.graph_exec_id
user_id = execution_context.user_id
if not graph_exec_id:
raise ValueError("execution_context.graph_exec_id is required")
if not user_id:
raise ValueError("execution_context.user_id is required")
# Create workspace_manager if we have workspace_id (with session scoping)
# Import here to avoid circular import (file.py → workspace.py → data → blocks → file.py)
from backend.util.workspace import WorkspaceManager
workspace_manager: WorkspaceManager | None = None
if execution_context.workspace_id:
workspace_manager = WorkspaceManager(
user_id, execution_context.workspace_id, execution_context.session_id
)
# Build base path
base_path = Path(get_exec_file_path(graph_exec_id, ""))
base_path.mkdir(parents=True, exist_ok=True)
# Security fix: Add disk space limits to prevent DoS
MAX_FILE_SIZE_BYTES = Config().max_file_size_mb * 1024 * 1024
MAX_FILE_SIZE = 100 * 1024 * 1024 # 100MB per file
MAX_TOTAL_DISK_USAGE = 1024 * 1024 * 1024 # 1GB total per execution directory
# Check total disk usage in base_path
@@ -169,57 +142,9 @@ async def store_media_file(
"""
return str(absolute_path.relative_to(base))
# Get cloud storage handler for checking cloud paths
cloud_storage = await get_cloud_storage_handler()
# Track if the input came from workspace (don't re-save it)
is_from_workspace = file.startswith("workspace://")
# Check if this is a workspace file reference
if is_from_workspace:
if workspace_manager is None:
raise ValueError(
"Workspace file reference requires workspace context. "
"This file type is only available in CoPilot sessions."
)
# Parse workspace reference
# workspace://abc123 - by file ID
# workspace:///path/to/file.txt - by virtual path
file_ref = file[12:] # Remove "workspace://"
if file_ref.startswith("/"):
# Path reference
workspace_content = await workspace_manager.read_file(file_ref)
file_info = await workspace_manager.get_file_info_by_path(file_ref)
filename = sanitize_filename(
file_info.name if file_info else f"{uuid.uuid4()}.bin"
)
else:
# ID reference
workspace_content = await workspace_manager.read_file_by_id(file_ref)
file_info = await workspace_manager.get_file_info(file_ref)
filename = sanitize_filename(
file_info.name if file_info else f"{uuid.uuid4()}.bin"
)
try:
target_path = _ensure_inside_base(base_path / filename, base_path)
except OSError as e:
raise ValueError(f"Invalid file path '{filename}': {e}") from e
# Check file size limit
if len(workspace_content) > MAX_FILE_SIZE_BYTES:
raise ValueError(
f"File too large: {len(workspace_content)} bytes > {MAX_FILE_SIZE_BYTES} bytes"
)
# Virus scan the workspace content before writing locally
await scan_content_safe(workspace_content, filename=filename)
target_path.write_bytes(workspace_content)
# Check if this is a cloud storage path
elif cloud_storage.is_cloud_path(file):
cloud_storage = await get_cloud_storage_handler()
if cloud_storage.is_cloud_path(file):
# Download from cloud storage and store locally
cloud_content = await cloud_storage.retrieve_file(
file, user_id=user_id, graph_exec_id=graph_exec_id
@@ -234,9 +159,9 @@ async def store_media_file(
raise ValueError(f"Invalid file path '{filename}': {e}") from e
# Check file size limit
if len(cloud_content) > MAX_FILE_SIZE_BYTES:
if len(cloud_content) > MAX_FILE_SIZE:
raise ValueError(
f"File too large: {len(cloud_content)} bytes > {MAX_FILE_SIZE_BYTES} bytes"
f"File too large: {len(cloud_content)} bytes > {MAX_FILE_SIZE} bytes"
)
# Virus scan the cloud content before writing locally
@@ -264,9 +189,9 @@ async def store_media_file(
content = base64.b64decode(b64_content)
# Check file size limit
if len(content) > MAX_FILE_SIZE_BYTES:
if len(content) > MAX_FILE_SIZE:
raise ValueError(
f"File too large: {len(content)} bytes > {MAX_FILE_SIZE_BYTES} bytes"
f"File too large: {len(content)} bytes > {MAX_FILE_SIZE} bytes"
)
# Virus scan the base64 content before writing
@@ -274,31 +199,23 @@ async def store_media_file(
target_path.write_bytes(content)
elif file.startswith(("http://", "https://")):
# URL - download first to get Content-Type header
resp = await Requests().get(file)
# Check file size limit
if len(resp.content) > MAX_FILE_SIZE_BYTES:
raise ValueError(
f"File too large: {len(resp.content)} bytes > {MAX_FILE_SIZE_BYTES} bytes"
)
# Extract filename from URL path
# URL
parsed_url = urlparse(file)
filename = sanitize_filename(Path(parsed_url.path).name or f"{uuid.uuid4()}")
# If filename lacks extension, add one from Content-Type header
if "." not in filename:
content_type = resp.headers.get("Content-Type", "").split(";")[0].strip()
if content_type:
ext = _extension_from_mime(content_type)
filename = f"{filename}{ext}"
try:
target_path = _ensure_inside_base(base_path / filename, base_path)
except OSError as e:
raise ValueError(f"Invalid file path '{filename}': {e}") from e
# Download and save
resp = await Requests().get(file)
# Check file size limit
if len(resp.content) > MAX_FILE_SIZE:
raise ValueError(
f"File too large: {len(resp.content)} bytes > {MAX_FILE_SIZE} bytes"
)
# Virus scan the downloaded content before writing
await scan_content_safe(resp.content, filename=filename)
target_path.write_bytes(resp.content)
@@ -313,43 +230,11 @@ async def store_media_file(
if not target_path.is_file():
raise ValueError(f"Local file does not exist: {target_path}")
# Return based on requested format
if return_format == "for_local_processing":
# Use when processing files locally with tools like ffmpeg, MoviePy, PIL
# Returns: relative path in exec_file directory (e.g., "image.png")
return MediaFileType(_strip_base_prefix(target_path, base_path))
elif return_format == "for_external_api":
# Use when sending content to external APIs that need base64
# Returns: data URI (e.g., "data:image/png;base64,iVBORw0...")
# Return result
if return_content:
return MediaFileType(_file_to_data_uri(target_path))
elif return_format == "for_block_output":
# Use when returning output from a block to user/next block
# Returns: workspace:// ref (CoPilot) or data URI (graph execution)
if workspace_manager is None:
# No workspace available (graph execution without CoPilot)
# Fallback to data URI so the content can still be used/displayed
return MediaFileType(_file_to_data_uri(target_path))
# Don't re-save if input was already from workspace
if is_from_workspace:
# Return original workspace reference
return MediaFileType(file)
# Save new content to workspace
content = target_path.read_bytes()
filename = target_path.name
file_record = await workspace_manager.write_file(
content=content,
filename=filename,
overwrite=True,
)
return MediaFileType(f"workspace://{file_record.id}")
else:
raise ValueError(f"Invalid return_format: {return_format}")
return MediaFileType(_strip_base_prefix(target_path, base_path))
def get_dir_size(path: Path) -> int:

View File

@@ -7,22 +7,10 @@ from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from backend.data.execution import ExecutionContext
from backend.util.file import store_media_file
from backend.util.type import MediaFileType
def make_test_context(
graph_exec_id: str = "test-exec-123",
user_id: str = "test-user-123",
) -> ExecutionContext:
"""Helper to create test ExecutionContext."""
return ExecutionContext(
user_id=user_id,
graph_exec_id=graph_exec_id,
)
class TestFileCloudIntegration:
"""Test cases for cloud storage integration in file utilities."""
@@ -82,9 +70,10 @@ class TestFileCloudIntegration:
mock_path_class.side_effect = path_constructor
result = await store_media_file(
file=MediaFileType(cloud_path),
execution_context=make_test_context(graph_exec_id=graph_exec_id),
return_format="for_local_processing",
graph_exec_id,
MediaFileType(cloud_path),
"test-user-123",
return_content=False,
)
# Verify cloud storage operations
@@ -155,9 +144,10 @@ class TestFileCloudIntegration:
mock_path_obj.name = "image.png"
with patch("backend.util.file.Path", return_value=mock_path_obj):
result = await store_media_file(
file=MediaFileType(cloud_path),
execution_context=make_test_context(graph_exec_id=graph_exec_id),
return_format="for_external_api",
graph_exec_id,
MediaFileType(cloud_path),
"test-user-123",
return_content=True,
)
# Verify result is a data URI
@@ -208,9 +198,10 @@ class TestFileCloudIntegration:
mock_resolved_path.relative_to.return_value = Path("test-uuid-789.txt")
await store_media_file(
file=MediaFileType(data_uri),
execution_context=make_test_context(graph_exec_id=graph_exec_id),
return_format="for_local_processing",
graph_exec_id,
MediaFileType(data_uri),
"test-user-123",
return_content=False,
)
# Verify cloud handler was checked but not used for retrieval
@@ -243,7 +234,5 @@ class TestFileCloudIntegration:
FileNotFoundError, match="File not found in cloud storage"
):
await store_media_file(
file=MediaFileType(cloud_path),
execution_context=make_test_context(graph_exec_id=graph_exec_id),
return_format="for_local_processing",
graph_exec_id, MediaFileType(cloud_path), "test-user-123"
)
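A minimal usage sketch of store_media_file with the keyword signature exercised in these tests; the ExecutionContext fields and the URL are illustrative, and the other return formats ("for_external_api", "for_block_output") seen in the file.py hunk above follow the same calling pattern.

from backend.data.execution import ExecutionContext
from backend.util.file import store_media_file
from backend.util.type import MediaFileType


async def save_for_processing(url: str) -> MediaFileType:
    # Build the minimal execution context the tests use (values are placeholders).
    ctx = ExecutionContext(user_id="test-user-123", graph_exec_id="test-exec-123")
    # Downloads/decodes the input, enforces the size limit, virus-scans it,
    # and returns a path relative to the execution's file directory.
    return await store_media_file(
        file=MediaFileType(url),
        execution_context=ctx,
        return_format="for_local_processing",
    )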

View File

@@ -1,108 +0,0 @@
"""
Shared GCS utilities for workspace and cloud storage backends.
This module provides common functionality for working with Google Cloud Storage,
including path parsing, client management, and signed URL generation.
"""
import asyncio
import logging
from datetime import datetime, timedelta, timezone
import aiohttp
from gcloud.aio import storage as async_gcs_storage
from google.cloud import storage as gcs_storage
logger = logging.getLogger(__name__)
def parse_gcs_path(path: str) -> tuple[str, str]:
"""
Parse a GCS path in the format 'gcs://bucket/blob' to (bucket, blob).
Args:
path: GCS path string (e.g., "gcs://my-bucket/path/to/file")
Returns:
Tuple of (bucket_name, blob_name)
Raises:
ValueError: If the path format is invalid
"""
if not path.startswith("gcs://"):
raise ValueError(f"Invalid GCS path: {path}")
path_without_prefix = path[6:] # Remove "gcs://"
parts = path_without_prefix.split("/", 1)
if len(parts) != 2:
raise ValueError(f"Invalid GCS path format: {path}")
return parts[0], parts[1]
async def download_with_fresh_session(bucket: str, blob: str) -> bytes:
"""
Download file content using a fresh session.
This approach avoids event loop issues that can occur when reusing
sessions across different async contexts (e.g., in executors).
Args:
bucket: GCS bucket name
blob: Blob path within the bucket
Returns:
File content as bytes
Raises:
FileNotFoundError: If the file doesn't exist
"""
session = aiohttp.ClientSession(
connector=aiohttp.TCPConnector(limit=10, force_close=True)
)
client: async_gcs_storage.Storage | None = None
try:
client = async_gcs_storage.Storage(session=session)
content = await client.download(bucket, blob)
return content
except Exception as e:
if "404" in str(e) or "Not Found" in str(e):
raise FileNotFoundError(f"File not found: gcs://{bucket}/{blob}")
raise
finally:
if client:
try:
await client.close()
except Exception:
pass # Best-effort cleanup
await session.close()
async def generate_signed_url(
sync_client: gcs_storage.Client,
bucket_name: str,
blob_name: str,
expires_in: int,
) -> str:
"""
Generate a signed URL for temporary access to a GCS file.
Uses asyncio.to_thread() to run the sync operation without blocking.
Args:
sync_client: Sync GCS client with service account credentials
bucket_name: GCS bucket name
blob_name: Blob path within the bucket
expires_in: URL expiration time in seconds
Returns:
Signed URL string
"""
bucket = sync_client.bucket(bucket_name)
blob = bucket.blob(blob_name)
return await asyncio.to_thread(
blob.generate_signed_url,
version="v4",
expiration=datetime.now(timezone.utc) + timedelta(seconds=expires_in),
method="GET",
)
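A short usage sketch for the GCS helpers above; the bucket and blob names are placeholders.

import asyncio

from backend.util.gcs_utils import download_with_fresh_session, parse_gcs_path


async def fetch(path: str) -> bytes:
    # "gcs://my-bucket/reports/summary.pdf" -> ("my-bucket", "reports/summary.pdf")
    bucket, blob = parse_gcs_path(path)
    # Uses a fresh aiohttp session per call to avoid cross-event-loop reuse issues.
    return await download_with_fresh_session(bucket, blob)


if __name__ == "__main__":
    print(len(asyncio.run(fetch("gcs://my-bucket/reports/summary.pdf"))))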

View File

@@ -263,12 +263,6 @@ class Config(UpdateTrackingModel["Config"], BaseSettings):
description="The name of the Google Cloud Storage bucket for media files",
)
workspace_storage_dir: str = Field(
default="",
description="Local directory for workspace file storage when GCS is not configured. "
"If empty, defaults to {app_data}/workspaces. Used for self-hosted deployments.",
)
reddit_user_agent: str = Field(
default="web:AutoGPT:v0.6.0 (by /u/autogpt)",
description="The user agent for the Reddit API",
@@ -365,8 +359,8 @@ class Config(UpdateTrackingModel["Config"], BaseSettings):
description="The port for the Agent Generator service",
)
agentgenerator_timeout: int = Field(
default=600,
description="The timeout in seconds for Agent Generator service requests (includes retries for rate limits)",
default=120,
description="The timeout in seconds for Agent Generator service requests",
)
enable_example_blocks: bool = Field(
@@ -395,13 +389,6 @@ class Config(UpdateTrackingModel["Config"], BaseSettings):
description="Maximum file size in MB for file uploads (1-1024 MB)",
)
max_file_size_mb: int = Field(
default=100,
ge=1,
le=1024,
description="Maximum file size in MB for workspace files (1-1024 MB)",
)
# AutoMod configuration
automod_enabled: bool = Field(
default=False,

View File

@@ -140,29 +140,14 @@ async def execute_block_test(block: Block):
setattr(block, mock_name, mock_obj)
# Populate credentials argument(s)
# Generate IDs for execution context
graph_id = str(uuid.uuid4())
node_id = str(uuid.uuid4())
graph_exec_id = str(uuid.uuid4())
node_exec_id = str(uuid.uuid4())
user_id = str(uuid.uuid4())
graph_version = 1 # Default version for tests
extra_exec_kwargs: dict = {
"graph_id": graph_id,
"node_id": node_id,
"graph_exec_id": graph_exec_id,
"node_exec_id": node_exec_id,
"user_id": user_id,
"graph_version": graph_version,
"execution_context": ExecutionContext(
user_id=user_id,
graph_id=graph_id,
graph_exec_id=graph_exec_id,
graph_version=graph_version,
node_id=node_id,
node_exec_id=node_exec_id,
),
"graph_id": str(uuid.uuid4()),
"node_id": str(uuid.uuid4()),
"graph_exec_id": str(uuid.uuid4()),
"node_exec_id": str(uuid.uuid4()),
"user_id": str(uuid.uuid4()),
"graph_version": 1, # Default version for tests
"execution_context": ExecutionContext(),
}
input_model = cast(type[BlockSchema], block.input_schema)

View File

@@ -1,419 +0,0 @@
"""
WorkspaceManager for managing user workspace file operations.
This module provides a high-level interface for workspace file operations,
combining the storage backend and database layer.
"""
import logging
import mimetypes
import uuid
from typing import Optional
from prisma.errors import UniqueViolationError
from prisma.models import UserWorkspaceFile
from backend.data.workspace import (
count_workspace_files,
create_workspace_file,
get_workspace_file,
get_workspace_file_by_path,
list_workspace_files,
soft_delete_workspace_file,
)
from backend.util.settings import Config
from backend.util.workspace_storage import compute_file_checksum, get_workspace_storage
logger = logging.getLogger(__name__)
class WorkspaceManager:
"""
Manages workspace file operations.
Combines storage backend operations with database record management.
Supports session-scoped file segmentation where files are stored in
session-specific virtual paths: /sessions/{session_id}/{filename}
"""
def __init__(
self, user_id: str, workspace_id: str, session_id: Optional[str] = None
):
"""
Initialize WorkspaceManager.
Args:
user_id: The user's ID
workspace_id: The workspace ID
session_id: Optional session ID for session-scoped file access
"""
self.user_id = user_id
self.workspace_id = workspace_id
self.session_id = session_id
# Session path prefix for file isolation
self.session_path = f"/sessions/{session_id}" if session_id else ""
def _resolve_path(self, path: str) -> str:
"""
Resolve a path, defaulting to session folder if session_id is set.
Cross-session access is allowed by explicitly using /sessions/other-session-id/...
Args:
path: Virtual path (e.g., "/file.txt" or "/sessions/abc123/file.txt")
Returns:
Resolved path with session prefix if applicable
"""
# If path explicitly references a session folder, use it as-is
if path.startswith("/sessions/"):
return path
# If we have a session context, prepend session path
if self.session_path:
# Normalize the path
if not path.startswith("/"):
path = f"/{path}"
return f"{self.session_path}{path}"
# No session context, use path as-is
return path if path.startswith("/") else f"/{path}"
def _get_effective_path(
self, path: Optional[str], include_all_sessions: bool
) -> Optional[str]:
"""
Get effective path for list/count operations based on session context.
Args:
path: Optional path prefix to filter
include_all_sessions: If True, don't apply session scoping
Returns:
Effective path prefix for database query
"""
if include_all_sessions:
# Normalize path to ensure leading slash (stored paths are normalized)
if path is not None and not path.startswith("/"):
return f"/{path}"
return path
elif path is not None:
# Resolve the provided path with session scoping
return self._resolve_path(path)
elif self.session_path:
# Default to session folder with trailing slash to prevent prefix collisions
# e.g., "/sessions/abc" should not match "/sessions/abc123"
return self.session_path.rstrip("/") + "/"
else:
# No session context, use path as-is
return path
async def read_file(self, path: str) -> bytes:
"""
Read file from workspace by virtual path.
When session_id is set, paths are resolved relative to the session folder
unless they explicitly reference /sessions/...
Args:
path: Virtual path (e.g., "/documents/report.pdf")
Returns:
File content as bytes
Raises:
FileNotFoundError: If file doesn't exist
"""
resolved_path = self._resolve_path(path)
file = await get_workspace_file_by_path(self.workspace_id, resolved_path)
if file is None:
raise FileNotFoundError(f"File not found at path: {resolved_path}")
storage = await get_workspace_storage()
return await storage.retrieve(file.storagePath)
async def read_file_by_id(self, file_id: str) -> bytes:
"""
Read file from workspace by file ID.
Args:
file_id: The file's ID
Returns:
File content as bytes
Raises:
FileNotFoundError: If file doesn't exist
"""
file = await get_workspace_file(file_id, self.workspace_id)
if file is None:
raise FileNotFoundError(f"File not found: {file_id}")
storage = await get_workspace_storage()
return await storage.retrieve(file.storagePath)
async def write_file(
self,
content: bytes,
filename: str,
path: Optional[str] = None,
mime_type: Optional[str] = None,
overwrite: bool = False,
) -> UserWorkspaceFile:
"""
Write file to workspace.
When session_id is set, files are written to /sessions/{session_id}/...
by default. Use explicit /sessions/... paths for cross-session access.
Args:
content: File content as bytes
filename: Filename for the file
path: Virtual path (defaults to "/{filename}", session-scoped if session_id set)
mime_type: MIME type (auto-detected if not provided)
overwrite: Whether to overwrite existing file at path
Returns:
Created UserWorkspaceFile instance
Raises:
ValueError: If file exceeds size limit or path already exists
"""
# Enforce file size limit
max_file_size = Config().max_file_size_mb * 1024 * 1024
if len(content) > max_file_size:
raise ValueError(
f"File too large: {len(content)} bytes exceeds "
f"{Config().max_file_size_mb}MB limit"
)
# Determine path with session scoping
if path is None:
path = f"/{filename}"
elif not path.startswith("/"):
path = f"/{path}"
# Resolve path with session prefix
path = self._resolve_path(path)
# Check if file exists at path (only error for non-overwrite case)
# For overwrite=True, we let the write proceed and handle via UniqueViolationError
# This ensures the new file is written to storage BEFORE the old one is deleted,
# preventing data loss if the new write fails
if not overwrite:
existing = await get_workspace_file_by_path(self.workspace_id, path)
if existing is not None:
raise ValueError(f"File already exists at path: {path}")
# Auto-detect MIME type if not provided
if mime_type is None:
mime_type, _ = mimetypes.guess_type(filename)
mime_type = mime_type or "application/octet-stream"
# Compute checksum
checksum = compute_file_checksum(content)
# Generate unique file ID for storage
file_id = str(uuid.uuid4())
# Store file in storage backend
storage = await get_workspace_storage()
storage_path = await storage.store(
workspace_id=self.workspace_id,
file_id=file_id,
filename=filename,
content=content,
)
# Create database record - handle race condition where another request
# created a file at the same path between our check and create
try:
file = await create_workspace_file(
workspace_id=self.workspace_id,
file_id=file_id,
name=filename,
path=path,
storage_path=storage_path,
mime_type=mime_type,
size_bytes=len(content),
checksum=checksum,
)
except UniqueViolationError:
# Race condition: another request created a file at this path
if overwrite:
# Re-fetch and delete the conflicting file, then retry
existing = await get_workspace_file_by_path(self.workspace_id, path)
if existing:
await self.delete_file(existing.id)
# Retry the create - if this also fails, clean up storage file
try:
file = await create_workspace_file(
workspace_id=self.workspace_id,
file_id=file_id,
name=filename,
path=path,
storage_path=storage_path,
mime_type=mime_type,
size_bytes=len(content),
checksum=checksum,
)
except Exception:
# Clean up orphaned storage file on retry failure
try:
await storage.delete(storage_path)
except Exception as e:
logger.warning(f"Failed to clean up orphaned storage file: {e}")
raise
else:
# Clean up the orphaned storage file before raising
try:
await storage.delete(storage_path)
except Exception as e:
logger.warning(f"Failed to clean up orphaned storage file: {e}")
raise ValueError(f"File already exists at path: {path}")
except Exception:
# Any other database error (connection, validation, etc.) - clean up storage
try:
await storage.delete(storage_path)
except Exception as e:
logger.warning(f"Failed to clean up orphaned storage file: {e}")
raise
logger.info(
f"Wrote file {file.id} ({filename}) to workspace {self.workspace_id} "
f"at path {path}, size={len(content)} bytes"
)
return file
async def list_files(
self,
path: Optional[str] = None,
limit: Optional[int] = None,
offset: int = 0,
include_all_sessions: bool = False,
) -> list[UserWorkspaceFile]:
"""
List files in workspace.
When session_id is set and include_all_sessions is False (default),
only files in the current session's folder are listed.
Args:
path: Optional path prefix to filter (e.g., "/documents/")
limit: Maximum number of files to return
offset: Number of files to skip
include_all_sessions: If True, list files from all sessions.
If False (default), only list current session's files.
Returns:
List of UserWorkspaceFile instances
"""
effective_path = self._get_effective_path(path, include_all_sessions)
return await list_workspace_files(
workspace_id=self.workspace_id,
path_prefix=effective_path,
limit=limit,
offset=offset,
)
async def delete_file(self, file_id: str) -> bool:
"""
Delete a file (soft-delete).
Args:
file_id: The file's ID
Returns:
True if deleted, False if not found
"""
file = await get_workspace_file(file_id, self.workspace_id)
if file is None:
return False
# Delete from storage
storage = await get_workspace_storage()
try:
await storage.delete(file.storagePath)
except Exception as e:
logger.warning(f"Failed to delete file from storage: {e}")
# Continue with database soft-delete even if storage delete fails
# Soft-delete database record
result = await soft_delete_workspace_file(file_id, self.workspace_id)
return result is not None
async def get_download_url(self, file_id: str, expires_in: int = 3600) -> str:
"""
Get download URL for a file.
Args:
file_id: The file's ID
expires_in: URL expiration in seconds (default 1 hour)
Returns:
Download URL (signed URL for GCS, API endpoint for local)
Raises:
FileNotFoundError: If file doesn't exist
"""
file = await get_workspace_file(file_id, self.workspace_id)
if file is None:
raise FileNotFoundError(f"File not found: {file_id}")
storage = await get_workspace_storage()
return await storage.get_download_url(file.storagePath, expires_in)
async def get_file_info(self, file_id: str) -> Optional[UserWorkspaceFile]:
"""
Get file metadata.
Args:
file_id: The file's ID
Returns:
UserWorkspaceFile instance or None
"""
return await get_workspace_file(file_id, self.workspace_id)
async def get_file_info_by_path(self, path: str) -> Optional[UserWorkspaceFile]:
"""
Get file metadata by path.
When session_id is set, paths are resolved relative to the session folder
unless they explicitly reference /sessions/...
Args:
path: Virtual path
Returns:
UserWorkspaceFile instance or None
"""
resolved_path = self._resolve_path(path)
return await get_workspace_file_by_path(self.workspace_id, resolved_path)
async def get_file_count(
self,
path: Optional[str] = None,
include_all_sessions: bool = False,
) -> int:
"""
Get number of files in workspace.
When session_id is set and include_all_sessions is False (default),
only counts files in the current session's folder.
Args:
path: Optional path prefix to filter (e.g., "/documents/")
include_all_sessions: If True, count all files in workspace.
If False (default), only count current session's files.
Returns:
Number of files
"""
effective_path = self._get_effective_path(path, include_all_sessions)
return await count_workspace_files(
self.workspace_id, path_prefix=effective_path
)
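A sketch of typical WorkspaceManager usage based on the methods above; the module path backend.util.workspace and the IDs passed in are assumptions for illustration only.

from backend.util.workspace import WorkspaceManager  # assumed module path


async def demo(user_id: str, workspace_id: str, session_id: str) -> None:
    # Session-scoped manager: bare paths resolve under /sessions/{session_id}/...
    wm = WorkspaceManager(user_id=user_id, workspace_id=workspace_id, session_id=session_id)

    # Write (or overwrite) a file; the returned record is the UserWorkspaceFile row.
    record = await wm.write_file(b"hello", filename="notes.txt", overwrite=True)

    # Read it back by session-relative path or by file ID.
    assert await wm.read_file("/notes.txt") == b"hello"
    assert await wm.read_file_by_id(record.id) == b"hello"

    # List only this session's files and get a download URL (signed URL or API proxy).
    files = await wm.list_files()
    url = await wm.get_download_url(record.id, expires_in=600)
    print(len(files), url)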

View File

@@ -1,398 +0,0 @@
"""
Workspace storage backend abstraction for supporting both cloud and local deployments.
This module provides a unified interface for storing workspace files, with implementations
for Google Cloud Storage (cloud deployments) and local filesystem (self-hosted deployments).
"""
import asyncio
import hashlib
import logging
from abc import ABC, abstractmethod
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional
import aiofiles
import aiohttp
from gcloud.aio import storage as async_gcs_storage
from google.cloud import storage as gcs_storage
from backend.util.data import get_data_path
from backend.util.gcs_utils import (
download_with_fresh_session,
generate_signed_url,
parse_gcs_path,
)
from backend.util.settings import Config
logger = logging.getLogger(__name__)
class WorkspaceStorageBackend(ABC):
"""Abstract interface for workspace file storage."""
@abstractmethod
async def store(
self,
workspace_id: str,
file_id: str,
filename: str,
content: bytes,
) -> str:
"""
Store file content, return storage path.
Args:
workspace_id: The workspace ID
file_id: Unique file ID for storage
filename: Original filename
content: File content as bytes
Returns:
Storage path string (cloud path or local path)
"""
pass
@abstractmethod
async def retrieve(self, storage_path: str) -> bytes:
"""
Retrieve file content from storage.
Args:
storage_path: The storage path returned from store()
Returns:
File content as bytes
"""
pass
@abstractmethod
async def delete(self, storage_path: str) -> None:
"""
Delete file from storage.
Args:
storage_path: The storage path to delete
"""
pass
@abstractmethod
async def get_download_url(self, storage_path: str, expires_in: int = 3600) -> str:
"""
Get URL for downloading the file.
Args:
storage_path: The storage path
expires_in: URL expiration time in seconds (default 1 hour)
Returns:
Download URL (signed URL for GCS, direct API path for local)
"""
pass
class GCSWorkspaceStorage(WorkspaceStorageBackend):
"""Google Cloud Storage implementation for workspace storage."""
def __init__(self, bucket_name: str):
self.bucket_name = bucket_name
self._async_client: Optional[async_gcs_storage.Storage] = None
self._sync_client: Optional[gcs_storage.Client] = None
self._session: Optional[aiohttp.ClientSession] = None
async def _get_async_client(self) -> async_gcs_storage.Storage:
"""Get or create async GCS client."""
if self._async_client is None:
self._session = aiohttp.ClientSession(
connector=aiohttp.TCPConnector(limit=100, force_close=False)
)
self._async_client = async_gcs_storage.Storage(session=self._session)
return self._async_client
def _get_sync_client(self) -> gcs_storage.Client:
"""Get or create sync GCS client (for signed URLs)."""
if self._sync_client is None:
self._sync_client = gcs_storage.Client()
return self._sync_client
async def close(self) -> None:
"""Close all client connections."""
if self._async_client is not None:
try:
await self._async_client.close()
except Exception as e:
logger.warning(f"Error closing GCS client: {e}")
self._async_client = None
if self._session is not None:
try:
await self._session.close()
except Exception as e:
logger.warning(f"Error closing session: {e}")
self._session = None
def _build_blob_name(self, workspace_id: str, file_id: str, filename: str) -> str:
"""Build the blob path for workspace files."""
return f"workspaces/{workspace_id}/{file_id}/{filename}"
async def store(
self,
workspace_id: str,
file_id: str,
filename: str,
content: bytes,
) -> str:
"""Store file in GCS."""
client = await self._get_async_client()
blob_name = self._build_blob_name(workspace_id, file_id, filename)
# Upload with metadata
upload_time = datetime.now(timezone.utc)
await client.upload(
self.bucket_name,
blob_name,
content,
metadata={
"uploaded_at": upload_time.isoformat(),
"workspace_id": workspace_id,
"file_id": file_id,
},
)
return f"gcs://{self.bucket_name}/{blob_name}"
async def retrieve(self, storage_path: str) -> bytes:
"""Retrieve file from GCS."""
bucket_name, blob_name = parse_gcs_path(storage_path)
return await download_with_fresh_session(bucket_name, blob_name)
async def delete(self, storage_path: str) -> None:
"""Delete file from GCS."""
bucket_name, blob_name = parse_gcs_path(storage_path)
client = await self._get_async_client()
try:
await client.delete(bucket_name, blob_name)
except Exception as e:
if "404" not in str(e) and "Not Found" not in str(e):
raise
# File already deleted, that's fine
async def get_download_url(self, storage_path: str, expires_in: int = 3600) -> str:
"""
Generate download URL for GCS file.
Attempts to generate a signed URL if running with service account credentials.
Falls back to an API proxy endpoint if signed URL generation fails
(e.g., when running locally with user OAuth credentials).
"""
bucket_name, blob_name = parse_gcs_path(storage_path)
# Extract file_id from blob_name for fallback: workspaces/{workspace_id}/{file_id}/{filename}
blob_parts = blob_name.split("/")
file_id = blob_parts[2] if len(blob_parts) >= 3 else None
# Try to generate signed URL (requires service account credentials)
try:
sync_client = self._get_sync_client()
return await generate_signed_url(
sync_client, bucket_name, blob_name, expires_in
)
except AttributeError as e:
# Signed URL generation requires service account with private key.
# When running with user OAuth credentials, fall back to API proxy.
if "private key" in str(e) and file_id:
logger.debug(
"Cannot generate signed URL (no service account credentials), "
"falling back to API proxy endpoint"
)
return f"/api/workspace/files/{file_id}/download"
raise
class LocalWorkspaceStorage(WorkspaceStorageBackend):
"""Local filesystem implementation for workspace storage (self-hosted deployments)."""
def __init__(self, base_dir: Optional[str] = None):
"""
Initialize local storage backend.
Args:
base_dir: Base directory for workspace storage.
If None, defaults to {app_data}/workspaces
"""
if base_dir:
self.base_dir = Path(base_dir)
else:
self.base_dir = Path(get_data_path()) / "workspaces"
# Ensure base directory exists
self.base_dir.mkdir(parents=True, exist_ok=True)
def _build_file_path(self, workspace_id: str, file_id: str, filename: str) -> Path:
"""Build the local file path with path traversal protection."""
# Import here to avoid circular import
# (file.py imports workspace.py which imports workspace_storage.py)
from backend.util.file import sanitize_filename
# Sanitize filename to prevent path traversal (removes / and \ among others)
safe_filename = sanitize_filename(filename)
file_path = (self.base_dir / workspace_id / file_id / safe_filename).resolve()
# Verify the resolved path is still under base_dir
if not file_path.is_relative_to(self.base_dir.resolve()):
raise ValueError("Invalid filename: path traversal detected")
return file_path
def _parse_storage_path(self, storage_path: str) -> Path:
"""Parse local storage path to filesystem path."""
if storage_path.startswith("local://"):
relative_path = storage_path[8:] # Remove "local://"
else:
relative_path = storage_path
full_path = (self.base_dir / relative_path).resolve()
# Security check: ensure path is under base_dir
# Use is_relative_to() for robust path containment check
# (handles case-insensitive filesystems and edge cases)
if not full_path.is_relative_to(self.base_dir.resolve()):
raise ValueError("Invalid storage path: path traversal detected")
return full_path
async def store(
self,
workspace_id: str,
file_id: str,
filename: str,
content: bytes,
) -> str:
"""Store file locally."""
file_path = self._build_file_path(workspace_id, file_id, filename)
# Create parent directories
file_path.parent.mkdir(parents=True, exist_ok=True)
# Write file asynchronously
async with aiofiles.open(file_path, "wb") as f:
await f.write(content)
# Return relative path as storage path
relative_path = file_path.relative_to(self.base_dir)
return f"local://{relative_path}"
async def retrieve(self, storage_path: str) -> bytes:
"""Retrieve file from local storage."""
file_path = self._parse_storage_path(storage_path)
if not file_path.exists():
raise FileNotFoundError(f"File not found: {storage_path}")
async with aiofiles.open(file_path, "rb") as f:
return await f.read()
async def delete(self, storage_path: str) -> None:
"""Delete file from local storage."""
file_path = self._parse_storage_path(storage_path)
if file_path.exists():
# Remove file
file_path.unlink()
# Clean up empty parent directories
parent = file_path.parent
while parent != self.base_dir:
try:
if parent.exists() and not any(parent.iterdir()):
parent.rmdir()
else:
break
except OSError:
break
parent = parent.parent
async def get_download_url(self, storage_path: str, expires_in: int = 3600) -> str:
"""
Get download URL for local file.
For local storage, this returns an API endpoint path.
The actual serving is handled by the API layer.
"""
# Parse the storage path to get the components
if storage_path.startswith("local://"):
relative_path = storage_path[8:]
else:
relative_path = storage_path
# Return the API endpoint for downloading
# The file_id is extracted from the path: {workspace_id}/{file_id}/{filename}
parts = relative_path.split("/")
if len(parts) >= 2:
file_id = parts[1] # Second component is file_id
return f"/api/workspace/files/{file_id}/download"
else:
raise ValueError(f"Invalid storage path format: {storage_path}")
# Global storage backend instance
_workspace_storage: Optional[WorkspaceStorageBackend] = None
_storage_lock = asyncio.Lock()
async def get_workspace_storage() -> WorkspaceStorageBackend:
"""
Get the workspace storage backend instance.
Uses GCS if media_gcs_bucket_name is configured, otherwise uses local storage.
"""
global _workspace_storage
if _workspace_storage is None:
async with _storage_lock:
if _workspace_storage is None:
config = Config()
if config.media_gcs_bucket_name:
logger.info(
f"Using GCS workspace storage: {config.media_gcs_bucket_name}"
)
_workspace_storage = GCSWorkspaceStorage(
config.media_gcs_bucket_name
)
else:
storage_dir = (
config.workspace_storage_dir
if config.workspace_storage_dir
else None
)
logger.info(
f"Using local workspace storage: {storage_dir or 'default'}"
)
_workspace_storage = LocalWorkspaceStorage(storage_dir)
return _workspace_storage
async def shutdown_workspace_storage() -> None:
"""
Properly shutdown the global workspace storage backend.
Closes aiohttp sessions and other resources for GCS backend.
Should be called during application shutdown.
"""
global _workspace_storage
if _workspace_storage is not None:
async with _storage_lock:
if _workspace_storage is not None:
if isinstance(_workspace_storage, GCSWorkspaceStorage):
await _workspace_storage.close()
_workspace_storage = None
def compute_file_checksum(content: bytes) -> str:
"""Compute SHA256 checksum of file content."""
return hashlib.sha256(content).hexdigest()
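A sketch of exercising the storage backend directly through get_workspace_storage; whether GCS or the local filesystem is used depends on media_gcs_bucket_name, and the IDs here are placeholders.

from backend.util.workspace_storage import (
    compute_file_checksum,
    get_workspace_storage,
    shutdown_workspace_storage,
)


async def roundtrip(workspace_id: str, file_id: str) -> None:
    storage = await get_workspace_storage()  # GCS if configured, otherwise local filesystem
    content = b"example bytes"

    storage_path = await storage.store(
        workspace_id=workspace_id,
        file_id=file_id,
        filename="example.bin",
        content=content,
    )
    assert await storage.retrieve(storage_path) == content
    print(compute_file_checksum(content), await storage.get_download_url(storage_path))

    await storage.delete(storage_path)
    await shutdown_workspace_storage()  # closes the GCS client session if one was created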

View File

@@ -0,0 +1,81 @@
-- CreateEnum
CREATE TYPE "LlmCostUnit" AS ENUM ('RUN', 'TOKENS');
-- CreateTable
CREATE TABLE "LlmProvider" (
"id" TEXT NOT NULL,
"createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
"updatedAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
"name" TEXT NOT NULL,
"displayName" TEXT NOT NULL,
"description" TEXT,
"defaultCredentialProvider" TEXT,
"defaultCredentialId" TEXT,
"defaultCredentialType" TEXT,
"supportsTools" BOOLEAN NOT NULL DEFAULT TRUE,
"supportsJsonOutput" BOOLEAN NOT NULL DEFAULT TRUE,
"supportsReasoning" BOOLEAN NOT NULL DEFAULT FALSE,
"supportsParallelTool" BOOLEAN NOT NULL DEFAULT FALSE,
"metadata" JSONB NOT NULL DEFAULT '{}'::jsonb,
CONSTRAINT "LlmProvider_pkey" PRIMARY KEY ("id"),
CONSTRAINT "LlmProvider_name_key" UNIQUE ("name")
);
-- CreateTable
CREATE TABLE "LlmModel" (
"id" TEXT NOT NULL,
"createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
"updatedAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
"slug" TEXT NOT NULL,
"displayName" TEXT NOT NULL,
"description" TEXT,
"providerId" TEXT NOT NULL,
"contextWindow" INTEGER NOT NULL,
"maxOutputTokens" INTEGER,
"isEnabled" BOOLEAN NOT NULL DEFAULT TRUE,
"capabilities" JSONB NOT NULL DEFAULT '{}'::jsonb,
"metadata" JSONB NOT NULL DEFAULT '{}'::jsonb,
CONSTRAINT "LlmModel_pkey" PRIMARY KEY ("id"),
CONSTRAINT "LlmModel_slug_key" UNIQUE ("slug")
);
-- CreateTable
CREATE TABLE "LlmModelCost" (
"id" TEXT NOT NULL,
"createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
"updatedAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
"unit" "LlmCostUnit" NOT NULL DEFAULT 'RUN',
"creditCost" INTEGER NOT NULL,
"credentialProvider" TEXT NOT NULL,
"credentialId" TEXT,
"credentialType" TEXT,
"currency" TEXT,
"metadata" JSONB NOT NULL DEFAULT '{}'::jsonb,
"llmModelId" TEXT NOT NULL,
CONSTRAINT "LlmModelCost_pkey" PRIMARY KEY ("id")
);
-- CreateIndex
CREATE INDEX "LlmModel_providerId_isEnabled_idx" ON "LlmModel"("providerId", "isEnabled");
-- CreateIndex
CREATE INDEX "LlmModel_slug_idx" ON "LlmModel"("slug");
-- CreateIndex
CREATE INDEX "LlmModelCost_llmModelId_idx" ON "LlmModelCost"("llmModelId");
-- CreateIndex
CREATE INDEX "LlmModelCost_credentialProvider_idx" ON "LlmModelCost"("credentialProvider");
-- CreateIndex
CREATE UNIQUE INDEX "LlmModelCost_llmModelId_credentialProvider_unit_key" ON "LlmModelCost"("llmModelId", "credentialProvider", "unit");
-- AddForeignKey
ALTER TABLE "LlmModel" ADD CONSTRAINT "LlmModel_providerId_fkey" FOREIGN KEY ("providerId") REFERENCES "LlmProvider"("id") ON DELETE RESTRICT ON UPDATE CASCADE;
-- AddForeignKey
ALTER TABLE "LlmModelCost" ADD CONSTRAINT "LlmModelCost_llmModelId_fkey" FOREIGN KEY ("llmModelId") REFERENCES "LlmModel"("id") ON DELETE CASCADE ON UPDATE CASCADE;

View File

@@ -0,0 +1,226 @@
-- Seed LLM Registry from existing hard-coded data
-- This migration populates the LlmProvider, LlmModel, and LlmModelCost tables
-- with data from the existing MODEL_METADATA and MODEL_COST dictionaries
-- Insert Providers
INSERT INTO "LlmProvider" ("id", "name", "displayName", "description", "defaultCredentialProvider", "defaultCredentialType", "supportsTools", "supportsJsonOutput", "supportsReasoning", "supportsParallelTool", "metadata")
VALUES
(gen_random_uuid(), 'openai', 'OpenAI', 'OpenAI language models', 'openai', 'api_key', true, true, true, true, '{}'::jsonb),
(gen_random_uuid(), 'anthropic', 'Anthropic', 'Anthropic Claude models', 'anthropic', 'api_key', true, true, true, false, '{}'::jsonb),
(gen_random_uuid(), 'groq', 'Groq', 'Groq inference API', 'groq', 'api_key', false, true, false, false, '{}'::jsonb),
(gen_random_uuid(), 'open_router', 'OpenRouter', 'OpenRouter unified API', 'open_router', 'api_key', true, true, false, false, '{}'::jsonb),
(gen_random_uuid(), 'aiml_api', 'AI/ML API', 'AI/ML API models', 'aiml_api', 'api_key', false, true, false, false, '{}'::jsonb),
(gen_random_uuid(), 'ollama', 'Ollama', 'Ollama local models', 'ollama', 'api_key', false, true, false, false, '{}'::jsonb),
(gen_random_uuid(), 'llama_api', 'Llama API', 'Llama API models', 'llama_api', 'api_key', false, true, false, false, '{}'::jsonb),
(gen_random_uuid(), 'v0', 'v0', 'v0 by Vercel models', 'v0', 'api_key', true, true, false, false, '{}'::jsonb)
ON CONFLICT ("name") DO NOTHING;
-- Insert Models (using CTEs to reference provider IDs)
WITH provider_ids AS (
SELECT "id", "name" FROM "LlmProvider"
)
INSERT INTO "LlmModel" ("id", "slug", "displayName", "description", "providerId", "contextWindow", "maxOutputTokens", "isEnabled", "capabilities", "metadata")
SELECT
gen_random_uuid(),
model_slug,
model_display_name,
NULL,
p."id",
context_window,
max_output_tokens,
true,
'{}'::jsonb,
'{}'::jsonb
FROM (VALUES
-- OpenAI models
('o3', 'O3', 'openai', 200000, 100000),
('o3-mini', 'O3 Mini', 'openai', 200000, 100000),
('o1', 'O1', 'openai', 200000, 100000),
('o1-mini', 'O1 Mini', 'openai', 128000, 65536),
('gpt-5-2025-08-07', 'GPT 5', 'openai', 400000, 128000),
('gpt-5.1-2025-11-13', 'GPT 5.1', 'openai', 400000, 128000),
('gpt-5-mini-2025-08-07', 'GPT 5 Mini', 'openai', 400000, 128000),
('gpt-5-nano-2025-08-07', 'GPT 5 Nano', 'openai', 400000, 128000),
('gpt-5-chat-latest', 'GPT 5 Chat', 'openai', 400000, 16384),
('gpt-4.1-2025-04-14', 'GPT 4.1', 'openai', 1000000, 32768),
('gpt-4.1-mini-2025-04-14', 'GPT 4.1 Mini', 'openai', 1047576, 32768),
('gpt-4o-mini', 'GPT 4o Mini', 'openai', 128000, 16384),
('gpt-4o', 'GPT 4o', 'openai', 128000, 16384),
('gpt-4-turbo', 'GPT 4 Turbo', 'openai', 128000, 4096),
('gpt-3.5-turbo', 'GPT 3.5 Turbo', 'openai', 16385, 4096),
-- Anthropic models
('claude-opus-4-1-20250805', 'Claude 4.1 Opus', 'anthropic', 200000, 32000),
('claude-opus-4-20250514', 'Claude 4 Opus', 'anthropic', 200000, 32000),
('claude-sonnet-4-20250514', 'Claude 4 Sonnet', 'anthropic', 200000, 64000),
('claude-opus-4-5-20251101', 'Claude 4.5 Opus', 'anthropic', 200000, 64000),
('claude-sonnet-4-5-20250929', 'Claude 4.5 Sonnet', 'anthropic', 200000, 64000),
('claude-haiku-4-5-20251001', 'Claude 4.5 Haiku', 'anthropic', 200000, 64000),
('claude-3-7-sonnet-20250219', 'Claude 3.7 Sonnet', 'anthropic', 200000, 64000),
('claude-3-haiku-20240307', 'Claude 3 Haiku', 'anthropic', 200000, 4096),
-- AI/ML API models
('Qwen/Qwen2.5-72B-Instruct-Turbo', 'Qwen 2.5 72B', 'aiml_api', 32000, 8000),
('nvidia/llama-3.1-nemotron-70b-instruct', 'Llama 3.1 Nemotron 70B', 'aiml_api', 128000, 40000),
('meta-llama/Llama-3.3-70B-Instruct-Turbo', 'Llama 3.3 70B', 'aiml_api', 128000, NULL),
('meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo', 'Meta Llama 3.1 70B', 'aiml_api', 131000, 2000),
('meta-llama/Llama-3.2-3B-Instruct-Turbo', 'Llama 3.2 3B', 'aiml_api', 128000, NULL),
-- Groq models
('llama-3.3-70b-versatile', 'Llama 3.3 70B', 'groq', 128000, 32768),
('llama-3.1-8b-instant', 'Llama 3.1 8B', 'groq', 128000, 8192),
-- Ollama models
('llama3.3', 'Llama 3.3', 'ollama', 8192, NULL),
('llama3.2', 'Llama 3.2', 'ollama', 8192, NULL),
('llama3', 'Llama 3', 'ollama', 8192, NULL),
('llama3.1:405b', 'Llama 3.1 405B', 'ollama', 8192, NULL),
('dolphin-mistral:latest', 'Dolphin Mistral', 'ollama', 32768, NULL),
-- OpenRouter models
('google/gemini-2.5-pro-preview-03-25', 'Gemini 2.5 Pro', 'open_router', 1050000, 8192),
('google/gemini-3-pro-preview', 'Gemini 3 Pro Preview', 'open_router', 1048576, 65535),
('google/gemini-2.5-flash', 'Gemini 2.5 Flash', 'open_router', 1048576, 65535),
('google/gemini-2.0-flash-001', 'Gemini 2.0 Flash', 'open_router', 1048576, 8192),
('google/gemini-2.5-flash-lite-preview-06-17', 'Gemini 2.5 Flash Lite Preview', 'open_router', 1048576, 65535),
('google/gemini-2.0-flash-lite-001', 'Gemini 2.0 Flash Lite', 'open_router', 1048576, 8192),
('mistralai/mistral-nemo', 'Mistral Nemo', 'open_router', 128000, 4096),
('cohere/command-r-08-2024', 'Command R', 'open_router', 128000, 4096),
('cohere/command-r-plus-08-2024', 'Command R Plus', 'open_router', 128000, 4096),
('deepseek/deepseek-chat', 'DeepSeek Chat', 'open_router', 64000, 2048),
('deepseek/deepseek-r1-0528', 'DeepSeek R1', 'open_router', 163840, 163840),
('perplexity/sonar', 'Perplexity Sonar', 'open_router', 127000, 8000),
('perplexity/sonar-pro', 'Perplexity Sonar Pro', 'open_router', 200000, 8000),
('perplexity/sonar-deep-research', 'Perplexity Sonar Deep Research', 'open_router', 128000, 16000),
('nousresearch/hermes-3-llama-3.1-405b', 'Hermes 3 Llama 3.1 405B', 'open_router', 131000, 4096),
('nousresearch/hermes-3-llama-3.1-70b', 'Hermes 3 Llama 3.1 70B', 'open_router', 12288, 12288),
('openai/gpt-oss-120b', 'GPT OSS 120B', 'open_router', 131072, 131072),
('openai/gpt-oss-20b', 'GPT OSS 20B', 'open_router', 131072, 32768),
('amazon/nova-lite-v1', 'Amazon Nova Lite', 'open_router', 300000, 5120),
('amazon/nova-micro-v1', 'Amazon Nova Micro', 'open_router', 128000, 5120),
('amazon/nova-pro-v1', 'Amazon Nova Pro', 'open_router', 300000, 5120),
('microsoft/wizardlm-2-8x22b', 'WizardLM 2 8x22B', 'open_router', 65536, 4096),
('gryphe/mythomax-l2-13b', 'MythoMax L2 13B', 'open_router', 4096, 4096),
('meta-llama/llama-4-scout', 'Llama 4 Scout', 'open_router', 131072, 131072),
('meta-llama/llama-4-maverick', 'Llama 4 Maverick', 'open_router', 1048576, 1000000),
('x-ai/grok-4', 'Grok 4', 'open_router', 256000, 256000),
('x-ai/grok-4-fast', 'Grok 4 Fast', 'open_router', 2000000, 30000),
('x-ai/grok-4.1-fast', 'Grok 4.1 Fast', 'open_router', 2000000, 30000),
('x-ai/grok-code-fast-1', 'Grok Code Fast 1', 'open_router', 256000, 10000),
('moonshotai/kimi-k2', 'Kimi K2', 'open_router', 131000, 131000),
('qwen/qwen3-235b-a22b-thinking-2507', 'Qwen 3 235B Thinking', 'open_router', 262144, 262144),
('qwen/qwen3-coder', 'Qwen 3 Coder', 'open_router', 262144, 262144),
-- Llama API models
('Llama-4-Scout-17B-16E-Instruct-FP8', 'Llama 4 Scout', 'llama_api', 128000, 4028),
('Llama-4-Maverick-17B-128E-Instruct-FP8', 'Llama 4 Maverick', 'llama_api', 128000, 4028),
('Llama-3.3-8B-Instruct', 'Llama 3.3 8B', 'llama_api', 128000, 4028),
('Llama-3.3-70B-Instruct', 'Llama 3.3 70B', 'llama_api', 128000, 4028),
-- v0 models
('v0-1.5-md', 'v0 1.5 MD', 'v0', 128000, 64000),
('v0-1.5-lg', 'v0 1.5 LG', 'v0', 512000, 64000),
('v0-1.0-md', 'v0 1.0 MD', 'v0', 128000, 64000)
) AS models(model_slug, model_display_name, provider_name, context_window, max_output_tokens)
JOIN provider_ids p ON p."name" = models.provider_name
ON CONFLICT ("slug") DO NOTHING;
-- Insert Costs (using CTEs to reference model IDs)
WITH model_ids AS (
SELECT "id", "slug", "providerId" FROM "LlmModel"
),
provider_ids AS (
SELECT "id", "name" FROM "LlmProvider"
)
INSERT INTO "LlmModelCost" ("id", "unit", "creditCost", "credentialProvider", "credentialId", "credentialType", "currency", "metadata", "llmModelId")
SELECT
gen_random_uuid(),
'RUN'::"LlmCostUnit",
cost,
p."name",
NULL,
'api_key',
NULL,
'{}'::jsonb,
m."id"
FROM (VALUES
-- OpenAI costs
('o3', 4),
('o3-mini', 2),
('o1', 16),
('o1-mini', 4),
('gpt-5-2025-08-07', 2),
('gpt-5.1-2025-11-13', 5),
('gpt-5-mini-2025-08-07', 1),
('gpt-5-nano-2025-08-07', 1),
('gpt-5-chat-latest', 5),
('gpt-4.1-2025-04-14', 2),
('gpt-4.1-mini-2025-04-14', 1),
('gpt-4o-mini', 1),
('gpt-4o', 3),
('gpt-4-turbo', 10),
('gpt-3.5-turbo', 1),
-- Anthropic costs
('claude-opus-4-1-20250805', 21),
('claude-opus-4-20250514', 21),
('claude-sonnet-4-20250514', 5),
('claude-haiku-4-5-20251001', 4),
('claude-opus-4-5-20251101', 14),
('claude-sonnet-4-5-20250929', 9),
('claude-3-7-sonnet-20250219', 5),
('claude-3-haiku-20240307', 1),
-- AI/ML API costs
('Qwen/Qwen2.5-72B-Instruct-Turbo', 1),
('nvidia/llama-3.1-nemotron-70b-instruct', 1),
('meta-llama/Llama-3.3-70B-Instruct-Turbo', 1),
('meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo', 1),
('meta-llama/Llama-3.2-3B-Instruct-Turbo', 1),
-- Groq costs
('llama-3.3-70b-versatile', 1),
('llama-3.1-8b-instant', 1),
-- Ollama costs
('llama3.3', 1),
('llama3.2', 1),
('llama3', 1),
('llama3.1:405b', 1),
('dolphin-mistral:latest', 1),
-- OpenRouter costs
('google/gemini-2.5-pro-preview-03-25', 4),
('google/gemini-3-pro-preview', 5),
('mistralai/mistral-nemo', 1),
('cohere/command-r-08-2024', 1),
('cohere/command-r-plus-08-2024', 3),
('deepseek/deepseek-chat', 2),
('perplexity/sonar', 1),
('perplexity/sonar-pro', 5),
('perplexity/sonar-deep-research', 10),
('nousresearch/hermes-3-llama-3.1-405b', 1),
('nousresearch/hermes-3-llama-3.1-70b', 1),
('amazon/nova-lite-v1', 1),
('amazon/nova-micro-v1', 1),
('amazon/nova-pro-v1', 1),
('microsoft/wizardlm-2-8x22b', 1),
('gryphe/mythomax-l2-13b', 1),
('meta-llama/llama-4-scout', 1),
('meta-llama/llama-4-maverick', 1),
('x-ai/grok-4', 9),
('x-ai/grok-4-fast', 1),
('x-ai/grok-4.1-fast', 1),
('x-ai/grok-code-fast-1', 1),
('moonshotai/kimi-k2', 1),
('qwen/qwen3-235b-a22b-thinking-2507', 1),
('qwen/qwen3-coder', 9),
('google/gemini-2.5-flash', 1),
('google/gemini-2.0-flash-001', 1),
('google/gemini-2.5-flash-lite-preview-06-17', 1),
('google/gemini-2.0-flash-lite-001', 1),
('deepseek/deepseek-r1-0528', 1),
('openai/gpt-oss-120b', 1),
('openai/gpt-oss-20b', 1),
-- Llama API costs
('Llama-4-Scout-17B-16E-Instruct-FP8', 1),
('Llama-4-Maverick-17B-128E-Instruct-FP8', 1),
('Llama-3.3-8B-Instruct', 1),
('Llama-3.3-70B-Instruct', 1),
-- v0 costs
('v0-1.5-md', 1),
('v0-1.5-lg', 2),
('v0-1.0-md', 1)
) AS costs(model_slug, cost)
JOIN model_ids m ON m."slug" = costs.model_slug
JOIN provider_ids p ON p."id" = m."providerId"
ON CONFLICT ("llmModelId", "credentialProvider", "unit") DO NOTHING;

View File

@@ -0,0 +1,25 @@
-- CreateTable
CREATE TABLE "LlmModelMigration" (
"id" TEXT NOT NULL,
"createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
"updatedAt" TIMESTAMP(3) NOT NULL,
"sourceModelSlug" TEXT NOT NULL,
"targetModelSlug" TEXT NOT NULL,
"reason" TEXT,
"migratedNodeIds" JSONB NOT NULL DEFAULT '[]',
"nodeCount" INTEGER NOT NULL,
"customCreditCost" INTEGER,
"isReverted" BOOLEAN NOT NULL DEFAULT false,
"revertedAt" TIMESTAMP(3),
CONSTRAINT "LlmModelMigration_pkey" PRIMARY KEY ("id")
);
-- CreateIndex
CREATE INDEX "LlmModelMigration_sourceModelSlug_idx" ON "LlmModelMigration"("sourceModelSlug");
-- CreateIndex
CREATE INDEX "LlmModelMigration_targetModelSlug_idx" ON "LlmModelMigration"("targetModelSlug");
-- CreateIndex
CREATE INDEX "LlmModelMigration_isReverted_idx" ON "LlmModelMigration"("isReverted");

View File

@@ -0,0 +1,127 @@
-- Add LlmModelCreator table
-- Creator represents who made/trained the model (e.g., OpenAI, Meta)
-- This is distinct from Provider who hosts/serves the model (e.g., OpenRouter)
-- Create the LlmModelCreator table
CREATE TABLE "LlmModelCreator" (
"id" TEXT NOT NULL,
"createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
"updatedAt" TIMESTAMP(3) NOT NULL,
"name" TEXT NOT NULL,
"displayName" TEXT NOT NULL,
"description" TEXT,
"websiteUrl" TEXT,
"logoUrl" TEXT,
"metadata" JSONB NOT NULL DEFAULT '{}',
CONSTRAINT "LlmModelCreator_pkey" PRIMARY KEY ("id")
);
-- Create unique index on name
CREATE UNIQUE INDEX "LlmModelCreator_name_key" ON "LlmModelCreator"("name");
-- Add creatorId column to LlmModel
ALTER TABLE "LlmModel" ADD COLUMN "creatorId" TEXT;
-- Add foreign key constraint
ALTER TABLE "LlmModel" ADD CONSTRAINT "LlmModel_creatorId_fkey"
FOREIGN KEY ("creatorId") REFERENCES "LlmModelCreator"("id") ON DELETE SET NULL ON UPDATE CASCADE;
-- Create index on creatorId
CREATE INDEX "LlmModel_creatorId_idx" ON "LlmModel"("creatorId");
-- Seed creators based on known model creators
INSERT INTO "LlmModelCreator" ("id", "updatedAt", "name", "displayName", "description", "websiteUrl", "metadata")
VALUES
(gen_random_uuid(), CURRENT_TIMESTAMP, 'openai', 'OpenAI', 'Creator of GPT models', 'https://openai.com', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'anthropic', 'Anthropic', 'Creator of Claude models', 'https://anthropic.com', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'meta', 'Meta', 'Creator of Llama models', 'https://ai.meta.com', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'google', 'Google', 'Creator of Gemini models', 'https://deepmind.google', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'mistral', 'Mistral AI', 'Creator of Mistral models', 'https://mistral.ai', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'cohere', 'Cohere', 'Creator of Command models', 'https://cohere.com', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'deepseek', 'DeepSeek', 'Creator of DeepSeek models', 'https://deepseek.com', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'perplexity', 'Perplexity AI', 'Creator of Sonar models', 'https://perplexity.ai', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'qwen', 'Qwen (Alibaba)', 'Creator of Qwen models', 'https://qwenlm.github.io', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'xai', 'xAI', 'Creator of Grok models', 'https://x.ai', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'amazon', 'Amazon', 'Creator of Nova models', 'https://aws.amazon.com/bedrock', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'microsoft', 'Microsoft', 'Creator of WizardLM models', 'https://microsoft.com', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'moonshot', 'Moonshot AI', 'Creator of Kimi models', 'https://moonshot.cn', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'nvidia', 'NVIDIA', 'Creator of Nemotron models', 'https://nvidia.com', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'nous_research', 'Nous Research', 'Creator of Hermes models', 'https://nousresearch.com', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'vercel', 'Vercel', 'Creator of v0 models', 'https://vercel.com', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'cognitive_computations', 'Cognitive Computations', 'Creator of Dolphin models', 'https://erichartford.com', '{}'),
(gen_random_uuid(), CURRENT_TIMESTAMP, 'gryphe', 'Gryphe', 'Creator of MythoMax models', 'https://huggingface.co/Gryphe', '{}')
ON CONFLICT ("name") DO NOTHING;
-- Update existing models with their creators
-- OpenAI models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'openai')
WHERE "slug" LIKE 'gpt-%' OR "slug" LIKE 'o1%' OR "slug" LIKE 'o3%' OR "slug" LIKE 'openai/%';
-- Anthropic models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'anthropic')
WHERE "slug" LIKE 'claude-%';
-- Meta/Llama models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'meta')
WHERE "slug" LIKE 'llama%' OR "slug" LIKE 'Llama%' OR "slug" LIKE 'meta-llama/%' OR "slug" LIKE '%/llama-%';
-- Google models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'google')
WHERE "slug" LIKE 'google/%' OR "slug" LIKE 'gemini%';
-- Mistral models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'mistral')
WHERE "slug" LIKE 'mistral%' OR "slug" LIKE 'mistralai/%';
-- Cohere models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'cohere')
WHERE "slug" LIKE 'cohere/%' OR "slug" LIKE 'command-%';
-- DeepSeek models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'deepseek')
WHERE "slug" LIKE 'deepseek/%' OR "slug" LIKE 'deepseek-%';
-- Perplexity models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'perplexity')
WHERE "slug" LIKE 'perplexity/%' OR "slug" LIKE 'sonar%';
-- Qwen models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'qwen')
WHERE "slug" LIKE 'Qwen/%' OR "slug" LIKE 'qwen/%';
-- xAI/Grok models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'xai')
WHERE "slug" LIKE 'x-ai/%' OR "slug" LIKE 'grok%';
-- Amazon models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'amazon')
WHERE "slug" LIKE 'amazon/%' OR "slug" LIKE 'nova-%';
-- Microsoft models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'microsoft')
WHERE "slug" LIKE 'microsoft/%' OR "slug" LIKE 'wizardlm%';
-- Moonshot models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'moonshot')
WHERE "slug" LIKE 'moonshotai/%' OR "slug" LIKE 'kimi%';
-- NVIDIA models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'nvidia')
WHERE "slug" LIKE 'nvidia/%' OR "slug" LIKE '%nemotron%';
-- Nous Research models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'nous_research')
WHERE "slug" LIKE 'nousresearch/%' OR "slug" LIKE 'hermes%';
-- Vercel/v0 models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'vercel')
WHERE "slug" LIKE 'v0-%';
-- Dolphin models (Cognitive Computations / Eric Hartford)
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'cognitive_computations')
WHERE "slug" LIKE 'dolphin-%';
-- Gryphe models
UPDATE "LlmModel" SET "creatorId" = (SELECT "id" FROM "LlmModelCreator" WHERE "name" = 'gryphe')
WHERE "slug" LIKE 'gryphe/%' OR "slug" LIKE 'mythomax%';

View File

@@ -0,0 +1,4 @@
-- CreateIndex
-- Index for efficient LLM model lookups on AgentNode.constantInput->>'model'
-- This improves performance of model migration queries in the LLM registry
CREATE INDEX "AgentNode_constantInput_model_idx" ON "AgentNode" ((("constantInput"->>'model')));

View File

@@ -0,0 +1,52 @@
-- Add GPT-5.2 model and update O3 slug
-- This migration adds the new GPT-5.2 model added in dev branch
-- Update O3 slug to match dev branch format
UPDATE "LlmModel"
SET "slug" = 'o3-2025-04-16'
WHERE "slug" = 'o3';
-- Update cost reference for O3 if needed
-- (costs are linked by model ID, so no update needed)
-- Add GPT-5.2 model
WITH provider_id AS (
SELECT "id" FROM "LlmProvider" WHERE "name" = 'openai'
)
INSERT INTO "LlmModel" ("id", "slug", "displayName", "description", "providerId", "contextWindow", "maxOutputTokens", "isEnabled", "capabilities", "metadata")
SELECT
gen_random_uuid(),
'gpt-5.2-2025-12-11',
'GPT 5.2',
'OpenAI GPT-5.2 model',
p."id",
400000,
128000,
true,
'{}'::jsonb,
'{}'::jsonb
FROM provider_id p
ON CONFLICT ("slug") DO NOTHING;
-- Add cost for GPT-5.2
WITH model_id AS (
SELECT m."id", p."name" as provider_name
FROM "LlmModel" m
JOIN "LlmProvider" p ON p."id" = m."providerId"
WHERE m."slug" = 'gpt-5.2-2025-12-11'
)
INSERT INTO "LlmModelCost" ("id", "unit", "creditCost", "credentialProvider", "credentialId", "credentialType", "currency", "metadata", "llmModelId")
SELECT
gen_random_uuid(),
'RUN'::"LlmCostUnit",
3, -- Same cost tier as GPT-5.1
m.provider_name,
NULL,
'api_key',
NULL,
'{}'::jsonb,
m."id"
FROM model_id m
WHERE NOT EXISTS (
SELECT 1 FROM "LlmModelCost" c WHERE c."llmModelId" = m."id"
);

View File

@@ -0,0 +1,11 @@
-- Add isRecommended field to LlmModel table
-- This allows admins to mark a model as the recommended default
ALTER TABLE "LlmModel" ADD COLUMN "isRecommended" BOOLEAN NOT NULL DEFAULT false;
-- Set gpt-4o-mini as the default recommended model (if it exists)
UPDATE "LlmModel" SET "isRecommended" = true WHERE "slug" = 'gpt-4o-mini' AND "isEnabled" = true;
-- Create unique partial index to enforce only one recommended model at the database level
-- This prevents multiple rows from having isRecommended = true
CREATE UNIQUE INDEX "LlmModel_single_recommended_idx" ON "LlmModel" ("isRecommended") WHERE "isRecommended" = true;

View File

@@ -0,0 +1,61 @@
-- Add new columns to LlmModel table for extended model metadata
-- These columns support the LLM Picker UI enhancements
-- Add priceTier column: 1=cheapest, 2=medium, 3=expensive
ALTER TABLE "LlmModel" ADD COLUMN IF NOT EXISTS "priceTier" INTEGER NOT NULL DEFAULT 1;
-- Add creatorId column for model creator relationship (if not exists)
ALTER TABLE "LlmModel" ADD COLUMN IF NOT EXISTS "creatorId" TEXT;
-- Add isRecommended column (if not exists)
ALTER TABLE "LlmModel" ADD COLUMN IF NOT EXISTS "isRecommended" BOOLEAN NOT NULL DEFAULT FALSE;
-- Add index on creatorId if not exists
CREATE INDEX IF NOT EXISTS "LlmModel_creatorId_idx" ON "LlmModel"("creatorId");
-- Add foreign key for creatorId if not exists
DO $$
BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'LlmModel_creatorId_fkey') THEN
-- Only add FK if LlmModelCreator table exists
IF EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'LlmModelCreator') THEN
ALTER TABLE "LlmModel" ADD CONSTRAINT "LlmModel_creatorId_fkey"
FOREIGN KEY ("creatorId") REFERENCES "LlmModelCreator"("id") ON DELETE SET NULL ON UPDATE CASCADE;
END IF;
END IF;
END $$;
-- Update priceTier values for existing models based on original MODEL_METADATA
-- Tier 1 = cheapest, Tier 2 = medium, Tier 3 = expensive
-- OpenAI models
UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" = 'o3';
UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'o3-mini';
UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" = 'o1';
UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" = 'o1-mini';
UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" = 'gpt-5.2';
UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" = 'gpt-5.1';
UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'gpt-5';
UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'gpt-5-mini';
UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'gpt-5-nano';
UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" = 'gpt-5-chat-latest';
UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" LIKE 'gpt-4.1%';
UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'gpt-4o-mini';
UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" = 'gpt-4o';
UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" = 'gpt-4-turbo';
UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'gpt-3.5-turbo';
-- Anthropic models
UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" LIKE 'claude-opus%';
UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" LIKE 'claude-sonnet%';
UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" LIKE 'claude%-4-5-sonnet%';
UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" LIKE 'claude%-haiku%';
UPDATE "LlmModel" SET "priceTier" = 1 WHERE "slug" = 'claude-3-haiku-20240307';
-- OpenRouter models - Pro/expensive tiers
UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" LIKE 'google/gemini%-pro%';
UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" LIKE '%command-r-plus%';
UPDATE "LlmModel" SET "priceTier" = 2 WHERE "slug" LIKE '%sonar-pro%';
UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" LIKE '%sonar-deep-research%';
UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" = 'x-ai/grok-4';
UPDATE "LlmModel" SET "priceTier" = 3 WHERE "slug" LIKE '%qwen3-coder%';

View File

@@ -1,2 +0,0 @@
-- AlterEnum
ALTER TYPE "OnboardingStep" ADD VALUE 'VISIT_COPILOT';

View File

@@ -1,52 +0,0 @@
-- CreateEnum
CREATE TYPE "WorkspaceFileSource" AS ENUM ('UPLOAD', 'EXECUTION', 'COPILOT', 'IMPORT');
-- CreateTable
CREATE TABLE "UserWorkspace" (
"id" TEXT NOT NULL,
"createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
"updatedAt" TIMESTAMP(3) NOT NULL,
"userId" TEXT NOT NULL,
CONSTRAINT "UserWorkspace_pkey" PRIMARY KEY ("id")
);
-- CreateTable
CREATE TABLE "UserWorkspaceFile" (
"id" TEXT NOT NULL,
"createdAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
"updatedAt" TIMESTAMP(3) NOT NULL,
"workspaceId" TEXT NOT NULL,
"name" TEXT NOT NULL,
"path" TEXT NOT NULL,
"storagePath" TEXT NOT NULL,
"mimeType" TEXT NOT NULL,
"sizeBytes" BIGINT NOT NULL,
"checksum" TEXT,
"isDeleted" BOOLEAN NOT NULL DEFAULT false,
"deletedAt" TIMESTAMP(3),
"source" "WorkspaceFileSource" NOT NULL DEFAULT 'UPLOAD',
"sourceExecId" TEXT,
"sourceSessionId" TEXT,
"metadata" JSONB NOT NULL DEFAULT '{}',
CONSTRAINT "UserWorkspaceFile_pkey" PRIMARY KEY ("id")
);
-- CreateIndex
CREATE UNIQUE INDEX "UserWorkspace_userId_key" ON "UserWorkspace"("userId");
-- CreateIndex
CREATE INDEX "UserWorkspace_userId_idx" ON "UserWorkspace"("userId");
-- CreateIndex
CREATE INDEX "UserWorkspaceFile_workspaceId_isDeleted_idx" ON "UserWorkspaceFile"("workspaceId", "isDeleted");
-- CreateIndex
CREATE UNIQUE INDEX "UserWorkspaceFile_workspaceId_path_key" ON "UserWorkspaceFile"("workspaceId", "path");
-- AddForeignKey
ALTER TABLE "UserWorkspace" ADD CONSTRAINT "UserWorkspace_userId_fkey" FOREIGN KEY ("userId") REFERENCES "User"("id") ON DELETE CASCADE ON UPDATE CASCADE;
-- AddForeignKey
ALTER TABLE "UserWorkspaceFile" ADD CONSTRAINT "UserWorkspaceFile_workspaceId_fkey" FOREIGN KEY ("workspaceId") REFERENCES "UserWorkspace"("id") ON DELETE CASCADE ON UPDATE CASCADE;

View File

@@ -1,16 +0,0 @@
/*
Warnings:
- You are about to drop the column `source` on the `UserWorkspaceFile` table. All the data in the column will be lost.
- You are about to drop the column `sourceExecId` on the `UserWorkspaceFile` table. All the data in the column will be lost.
- You are about to drop the column `sourceSessionId` on the `UserWorkspaceFile` table. All the data in the column will be lost.
*/
-- AlterTable
ALTER TABLE "UserWorkspaceFile" DROP COLUMN "source",
DROP COLUMN "sourceExecId",
DROP COLUMN "sourceSessionId";
-- DropEnum
DROP TYPE "WorkspaceFileSource";

Some files were not shown because too many files have changed in this diff.