mirror of https://github.com/Significant-Gravitas/AutoGPT.git synced 2026-01-06 22:03:59 -05:00

Nicholas Tindle 37b3e4e82e feat(blocks)!: Update Exa search block to match latest API specification (#11185)

BREAKING CHANGE: Removed deprecated use_auto_prompt field from Input
schema. Existing workflows using this field will need to be updated to
use the type field set to "auto" instead.

## Summary of Changes 📝

This PR comprehensively updates all Exa search blocks to match the
latest Exa API specification and adds significant new functionality
through the Websets API integration.

### Core API Updates 🔄

- **Migration to Exa SDK**: Replaced manual API calls with the official
`exa_py` AsyncExa SDK across all blocks for better reliability and
maintainability
- **Removed deprecated fields**: Eliminated
`use_auto_prompt`/`useAutoprompt` field (breaking change)
- **Fixed incomplete field definitions**: Corrected `user_location`
field definition
- **Added new input fields**: Added `moderation` and `context` fields
for enhanced content filtering

### Enhanced Content Settings 🛠️

- **Text field improvements**: Support both boolean and advanced object
configurations
- **New content options**: 
  - Added `livecrawl` settings (never, fallback, always, preferred)
  - Added `subpages` support for deeper content retrieval
  - Added `extras` settings for links and images
  - Added `context` settings for additional contextual information
- **Updated settings**: Enhanced `highlight` and `summary`
configurations with new query and schema options

### Comprehensive Cost Tracking 💰

- Added detailed cost tracking models:
  - `CostDollars` for monetary costs
  - `CostCredits` for API credit tracking
  - `CostDuration` for time-based costs
- New output fields: `request_id`, `resolved_search_type`,
`cost_dollars`
- Improved response handling to conditionally yield fields based on
availability
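
As a rough illustration of yielding outputs only when they are available, the sketch below uses local stand-in classes. `SearchResponse`, the `total` field on `CostDollars`, and `yield_available` are assumptions made for the example and are not taken from `helpers.py` or the Exa SDK.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Optional, Tuple


@dataclass
class CostDollars:
    """Monetary cost of one request (the `total` field is hypothetical)."""
    total: float


@dataclass
class SearchResponse:
    """Stand-in for an SDK response in which every field may be absent."""
    request_id: Optional[str] = None
    resolved_search_type: Optional[str] = None
    cost_dollars: Optional[CostDollars] = None


def yield_available(response: SearchResponse) -> Iterator[Tuple[str, Any]]:
    """Yield (output_name, value) pairs only for fields the response provides."""
    if response.request_id is not None:
        yield "request_id", response.request_id
    if response.resolved_search_type is not None:
        yield "resolved_search_type", response.resolved_search_type
    if response.cost_dollars is not None:
        yield "cost_dollars", response.cost_dollars


# A response that carries only a request ID produces a single output.
outputs = dict(yield_available(SearchResponse(request_id="req-1")))
```

Skipping absent fields keeps downstream blocks from receiving `None` on outputs the API did not return.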

### New Websets API Integration 🚀

Added new specialized blocks for Exa's Websets API, organized across the following files:
- **`websets.py`**: Core webset management (create, get, list, delete)
- **`websets_search.py`**: Search operations within websets
- **`websets_items.py`**: Individual item management (add, get, update,
delete)
- **`websets_enrichment.py`**: Data enrichment operations
- **`websets_import_export.py`**: Bulk import/export functionality
- **`websets_monitor.py`**: Monitor and track webset changes
- **`websets_polling.py`**: Poll for updates and changes

### New Special-Purpose Blocks 🎯

- **`code_context.py`**: Code search capabilities for finding relevant
code snippets from open source repositories, documentation, and Stack
Overflow
- **`research.py`**: Asynchronous research capabilities that explore the
web, gather sources, synthesize findings, and return structured results
with citations

### Code Organization Improvements 📁

- **Removed legacy code**: Deleted `model.py` file containing deprecated
API models
- **Centralized helpers**: Consolidated shared models and utilities in
`helpers.py`
- **Improved modularity**: Each webset operation is now in its own
dedicated file

### Other Changes 🔧

- Updated `.gitignore` for better development workflow
- Updated `CLAUDE.md` with project-specific instructions
- Updated documentation in `docs/content/platform/new_blocks.md` with
error handling, data models, and file input guidelines
- Improved webhook block implementations with SDK integration

### Files Changed 📂

- **Modified (11 files)**:
  - `.gitignore`
  - `autogpt_platform/CLAUDE.md`
  - `autogpt_platform/backend/backend/blocks/exa/answers.py`
  - `autogpt_platform/backend/backend/blocks/exa/contents.py`
  - `autogpt_platform/backend/backend/blocks/exa/helpers.py`
  - `autogpt_platform/backend/backend/blocks/exa/search.py`
  - `autogpt_platform/backend/backend/blocks/exa/similar.py`
  - `autogpt_platform/backend/backend/blocks/exa/webhook_blocks.py`
  - `autogpt_platform/backend/backend/blocks/exa/websets.py`
  - `docs/content/platform/new_blocks.md`

- **Added (8 files)**:
  - `autogpt_platform/backend/backend/blocks/exa/code_context.py`
  - `autogpt_platform/backend/backend/blocks/exa/research.py`
  - `autogpt_platform/backend/backend/blocks/exa/websets_enrichment.py`
  - `autogpt_platform/backend/backend/blocks/exa/websets_import_export.py`
  - `autogpt_platform/backend/backend/blocks/exa/websets_items.py`
  - `autogpt_platform/backend/backend/blocks/exa/websets_monitor.py`
  - `autogpt_platform/backend/backend/blocks/exa/websets_polling.py`
  - `autogpt_platform/backend/backend/blocks/exa/websets_search.py`

- **Deleted (1 file)**:
  - `autogpt_platform/backend/backend/blocks/exa/model.py`

### Migration Guide 🚦

For users with existing workflows using the deprecated `use_auto_prompt`
field:
1. Remove the `use_auto_prompt` field from your input configuration
2. Set the `type` field to `ExaSearchTypes.AUTO` (or "auto" in JSON) to
achieve the same behavior
3. Review any custom content settings as the structure has been enhanced
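
Steps 1 and 2 could be applied to a stored input configuration along these lines. This is a minimal sketch: `migrate_exa_search_input` is a hypothetical helper, and it assumes the configuration is a plain JSON-style dict and that a false or missing flag needs no replacement.

```python
from typing import Any, Dict


def migrate_exa_search_input(old: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `old` without `use_auto_prompt`.

    If the removed flag was enabled, `type` is set to "auto" (step 2).
    Otherwise any existing `type` value is left untouched (an assumption).
    """
    new = dict(old)  # copy so the stored configuration is not mutated
    used_auto_prompt = new.pop("use_auto_prompt", None)
    if used_auto_prompt:
        new["type"] = "auto"
    return new


migrated = migrate_exa_search_input({"query": "exa sdk", "use_auto_prompt": True})
```

Step 3 still needs a manual review, since the content settings changed shape rather than name.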

### Testing Recommendations ✅

- Test existing workflows to ensure they handle the breaking change
- Verify cost tracking fields are properly returned
- Test new content settings options (livecrawl, subpages, extras,
context)
- Validate websets functionality if using the new Websets API blocks

🤖 Generated with [Claude Code](https://claude.com/claude-code)

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] made + ran a test agent for the blocks and flows between them:
    [Exa Tests_v44.json](https://github.com/user-attachments/files/23226143/Exa.Tests_v44.json)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Migrates Exa blocks to AsyncExa SDK, adds comprehensive
Websets/research/code-context blocks, updates existing
search/content/answers/similar, deletes legacy models, adjusts
tests/docs; breaking: remove `use_auto_prompt` in favor of
`type="auto"`.
> 
> - **Backend — Exa integration (SDK migration & BREAKING)**:
> - Replace manual HTTP calls with `exa_py.AsyncExa` across `search`,
`similar`, `contents`, `answers`, and webhooks; richer outputs
(citations, context, costs, resolved search type).
>   - BREAKING: remove `Input.use_auto_prompt`; use `type = "auto"`.
> - Centralize models/utilities in `exa/helpers.py` (content settings,
cost models, result mappers).
> - **New Blocks**:
> - **Websets**: management (`websets.py`), searches, items,
enrichments, imports/exports, monitors, polling (new files under
`exa/websets_*`).
> - **Research**: async research task create/get/wait/list
(`exa/research.py`).
> - **Code Context**: code snippet/context retrieval
(`exa/code_context.py`).
> - **Removals**:
>   - Delete deprecated `exa/model.py`.
> - **Docs & DX**:
> - Update `docs/new_blocks.md` (error handling, models, file input) and
`CLAUDE.md`; ignore backend logs in `.gitignore`.
> - **Frontend Tests**:
> - Split/extend “e” block tests and improve block add robustness in
Playwright (`build.spec.ts`, `build.page.ts`).
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
6e5e572322. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added multiple Exa research and webset management blocks for task
creation, monitoring, and completion tracking.
* Introduced new search capabilities including code context retrieval,
content search, and enhanced filtering options.
* Added webset enrichment, import/export, and item management
functionality.
  * Expanded search with location-based and category filters.

* **Documentation**
* Updated guidance on error handling, data models, and file input
handling.

* **Refactor**
* Modernized backend API integration with improved response structure
and error reporting.
  * Simplified configuration options for search operations.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude <noreply@anthropic.com>

2025-11-05 19:52:48 +00:00

9.8 KiB

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Repository Overview

AutoGPT Platform is a monorepo containing:

Backend (/backend): Python FastAPI server with async support
Frontend (/frontend): Next.js React application
Shared Libraries (/autogpt_libs): Common Python utilities

Essential Commands

Backend Development

# Install dependencies
cd backend && poetry install

# Run database migrations
poetry run prisma migrate dev

# Start all services (database, redis, rabbitmq, clamav)
docker compose up -d

# Run the backend server
poetry run serve

# Run tests
poetry run test

# Run specific test
poetry run pytest path/to/test_file.py::test_function_name

# Run block tests (tests that validate all blocks work correctly)
poetry run pytest backend/blocks/test/test_block.py -xvs

# Run tests for a specific block (e.g., GetCurrentTimeBlock)
poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[GetCurrentTimeBlock]' -xvs

# Lint and format
# prefer format if you want to just "fix" it and only get the errors that can't be autofixed
poetry run format  # Black + isort
poetry run lint    # ruff

More details can be found in TESTING.md

Creating/Updating Snapshots

When you first write a test or when the expected output changes:

poetry run pytest path/to/test.py --snapshot-update

⚠️ Important: Always review snapshot changes before committing! Use git diff to verify the changes are expected.

Frontend Development

# Install dependencies
cd frontend && pnpm i

# Generate API client from OpenAPI spec
pnpm generate:api

# Start development server
pnpm dev

# Run E2E tests
pnpm test

# Run Storybook for component development
pnpm storybook

# Build production
pnpm build

# Format and lint
pnpm format

# Type checking
pnpm types

📖 Complete Guide: See /frontend/CONTRIBUTING.md and /frontend/.cursorrules for comprehensive frontend patterns.

Key Frontend Conventions:

Separate render logic from data/behavior in components
Use generated API hooks from @/app/api/__generated__/endpoints/
Use function declarations (not arrow functions) for components/handlers
Use design system components from src/components/ (atoms, molecules, organisms)
Only use Phosphor Icons
Never use src/components/__legacy__/* or deprecated BackendAPI

Architecture Overview

Backend Architecture

API Layer: FastAPI with REST and WebSocket endpoints
Database: PostgreSQL with Prisma ORM, includes pgvector for embeddings
Queue System: RabbitMQ for async task processing
Execution Engine: Separate executor service processes agent workflows
Authentication: JWT-based with Supabase integration
Security: Cache protection middleware prevents sensitive data caching in browsers/proxies

Frontend Architecture

Framework: Next.js 15 App Router (client-first approach)
Data Fetching: Type-safe generated API hooks via Orval + React Query
State Management: React Query for server state, co-located UI state in components/hooks
Component Structure: Separate render logic (.tsx) from business logic (use*.ts hooks)
Workflow Builder: Visual graph editor using @xyflow/react
UI Components: shadcn/ui (Radix UI primitives) with Tailwind CSS styling
Icons: Phosphor Icons only
Feature Flags: LaunchDarkly integration
Error Handling: ErrorCard for render errors, toast for mutations, Sentry for exceptions
Testing: Playwright for E2E, Storybook for component development

Key Concepts

Agent Graphs: Workflow definitions stored as JSON, executed by the backend
Blocks: Reusable components in /backend/blocks/ that perform specific tasks
Integrations: OAuth and API connections stored per user
Store: Marketplace for sharing agent templates
Virus Scanning: ClamAV integration for file upload security

Testing Approach

Backend uses pytest with snapshot testing for API responses
Test files are colocated with source files (*_test.py)
Frontend uses Playwright for E2E tests
Component testing via Storybook

Database Schema

Key models (defined in /backend/schema.prisma):

User: Authentication and profile data
AgentGraph: Workflow definitions with version control
AgentGraphExecution: Execution history and results
AgentNode: Individual nodes in a workflow
StoreListing: Marketplace listings for sharing agents

Environment Configuration

Configuration Files

Backend: /backend/.env.default (defaults) → /backend/.env (user overrides)
Frontend: /frontend/.env.default (defaults) → /frontend/.env (user overrides)
Platform: /.env.default (Supabase/shared defaults) → /.env (user overrides)

Docker Environment Loading Order

.env.default files provide base configuration (tracked in git)
.env files provide user-specific overrides (gitignored)
Docker Compose environment: sections provide service-specific overrides
Shell environment variables have highest precedence
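
The precedence described by this loading order can be modeled as a layered lookup. The sketch below only illustrates the ordering with plain dicts. `resolve_env` and the sample values are hypothetical, and this is not how Docker Compose itself is implemented.

```python
from collections import ChainMap
from typing import Dict, Mapping


def resolve_env(
    env_default: Mapping[str, str],
    env_user: Mapping[str, str],
    compose_environment: Mapping[str, str],
    shell_env: Mapping[str, str],
) -> Dict[str, str]:
    """Merge the four sources so that later entries in the list above win."""
    # ChainMap searches its maps left to right, so the highest-precedence
    # source (the shell) goes first and .env.default goes last.
    return dict(ChainMap(shell_env, compose_environment, env_user, env_default))


resolved = resolve_env(
    env_default={"LOG_LEVEL": "info", "PORT": "8000"},
    env_user={"LOG_LEVEL": "debug"},
    compose_environment={"PORT": "8006"},
    shell_env={},
)
```

Here `LOG_LEVEL` comes from the user's `.env` override and `PORT` from the Compose `environment:` section.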

Key Points

All services use hardcoded defaults in docker-compose files (no ${VARIABLE} substitutions)
The env_file directive loads variables INTO containers at runtime
Backend/Frontend services use YAML anchors for consistent configuration
Supabase services (db/docker/docker-compose.yml) follow the same pattern

Common Development Tasks

Adding a new block:

Follow the comprehensive Block SDK Guide, which covers:

Provider configuration with ProviderBuilder
Block schema definition
Authentication (API keys, OAuth, webhooks)
Testing and validation
File organization

Quick steps:

Create new file in /backend/backend/blocks/
Configure provider using ProviderBuilder in _config.py
Inherit from Block base class
Define input/output schemas using BlockSchema
Implement async run method
Generate unique block ID using uuid.uuid4()
Test with poetry run pytest backend/blocks/test/test_block.py
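
The overall shape of these steps might look like the sketch below. It is deliberately self-contained: the `Block` class here is a local stand-in and not the platform's real base class, `WordCountBlock` and `WordCountInput` are hypothetical, and provider configuration is omitted. Consult the Block SDK Guide for the actual interfaces.

```python
import asyncio
import uuid
from dataclasses import dataclass
from typing import Any, AsyncIterator, List, Tuple


class Block:
    """Local stand-in for the platform's Block base class (an assumption)."""

    def __init__(self, block_id: str) -> None:
        uuid.UUID(block_id)  # raises ValueError if the ID is not a valid UUID
        self.id = block_id


@dataclass
class WordCountInput:
    """Hypothetical input schema: the text to analyze."""
    text: str


class WordCountBlock(Block):
    """Hypothetical block that counts the words in its input."""

    def __init__(self) -> None:
        # Generate the ID once with uuid.uuid4() and paste the literal here
        # so it stays stable across restarts. This value is a placeholder.
        super().__init__(block_id="00000000-0000-4000-8000-000000000000")

    async def run(self, input_data: WordCountInput) -> AsyncIterator[Tuple[str, Any]]:
        # Yield named outputs so other blocks can connect to them in a graph.
        yield "word_count", len(input_data.text.split())


async def collect(block: WordCountBlock, input_data: WordCountInput) -> List[Tuple[str, Any]]:
    """Drain the block's async generator into a list of (name, value) pairs."""
    return [pair async for pair in block.run(input_data)]


outputs = asyncio.run(collect(WordCountBlock(), WordCountInput(text="one two three")))
```

Naming each output explicitly is what makes the inputs and outputs of different blocks easy to wire together.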

Note: When making many new blocks, analyze the interfaces of each block and consider whether they would connect well in a graph-based editor or struggle to connect productively. For example, do the inputs and outputs tie together well?

If you get any pushback or hit complex block conditions, check the new_blocks guide in the docs.

Modifying the API:

Update route in /backend/backend/server/routers/
Add/update Pydantic models in same directory
Write tests alongside the route file
Run poetry run test to verify

Frontend feature development:

See /frontend/CONTRIBUTING.md for complete patterns. Quick reference:

Pages: Create in src/app/(platform)/feature-name/page.tsx
- Add usePageName.ts hook for logic
- Put sub-components in local components/ folder
Components: Structure as ComponentName/ComponentName.tsx + useComponentName.ts + helpers.ts
- Use design system components from src/components/ (atoms, molecules, organisms)
- Never use src/components/__legacy__/*
Data fetching: Use generated API hooks from @/app/api/__generated__/endpoints/
- Regenerate with pnpm generate:api
- Pattern: use{Method}{Version}{OperationName}
Styling: Tailwind CSS only, use design tokens, Phosphor Icons only
Testing: Add Storybook stories for new components, Playwright for E2E
Code conventions: Function declarations (not arrow functions) for components/handlers

Security Implementation

Cache Protection Middleware:

Located in /backend/backend/server/middleware/security.py
Default behavior: Disables caching for ALL endpoints with Cache-Control: no-store, no-cache, must-revalidate, private
Uses an allow list approach - only explicitly permitted paths can be cached
Cacheable paths include: static assets (/static/*, /_next/static/*), health checks, public store pages, documentation
Prevents sensitive data (auth tokens, API keys, user data) from being cached by browsers/proxies
To allow caching for a new endpoint, add it to CACHEABLE_PATHS in the middleware
Applied to both main API server and external API applications
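
The allow-list behavior described above can be sketched as a pure function. The entries in `CACHEABLE_PATHS` below are hypothetical examples and `cache_headers_for` is an illustrative name. The real list and logic live in the middleware file.

```python
from fnmatch import fnmatchcase
from typing import Dict

# Hypothetical allow list; the real CACHEABLE_PATHS is defined in the middleware.
CACHEABLE_PATHS = ("/static/*", "/_next/static/*", "/health")

NO_CACHE = "no-store, no-cache, must-revalidate, private"


def cache_headers_for(path: str) -> Dict[str, str]:
    """Return the headers to add for `path` under an allow-list policy."""
    if any(fnmatchcase(path, pattern) for pattern in CACHEABLE_PATHS):
        return {}  # explicitly permitted: leave caching headers untouched
    # Default for every other path: forbid caching entirely.
    return {"Cache-Control": NO_CACHE}


api_headers = cache_headers_for("/api/v1/user")
static_headers = cache_headers_for("/static/app.css")
```

Defaulting to no caching means a newly added endpoint is protected even if nobody remembers to classify it.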

Creating Pull Requests

Create the PR against the dev branch of the repository.
Ensure the branch name is descriptive (e.g., feature/add-new-block).
Use conventional commit messages (see below).
Fill out the .github/PULL_REQUEST_TEMPLATE.md template as the PR description.
Run the GitHub pre-commit hooks to ensure code quality.

Reviewing/Revising Pull Requests

When the user runs /pr-comments or tries to fetch them, also run gh api /repos/Significant-Gravitas/AutoGPT/pulls/[issuenum]/reviews to get the reviews
Use gh api /repos/Significant-Gravitas/AutoGPT/pulls/[issuenum]/reviews/[review_id]/comments to get the review contents
Use gh api /repos/Significant-Gravitas/AutoGPT/issues/[issuenum]/comments to get the PR-specific comments

Conventional Commits

Use this format for commit messages and Pull Request titles:

Conventional Commit Types:

feat: Introduces a new feature to the codebase
fix: Patches a bug in the codebase
refactor: Code change that neither fixes a bug nor adds a feature; also applies to removing features
ci: Changes to CI configuration
docs: Documentation-only changes
dx: Improvements to the developer experience

Recommended Base Scopes:

platform: Changes affecting both frontend and backend
frontend
backend
infra
blocks: Modifications/additions of individual blocks

Subscope Examples:

backend/executor
backend/db
frontend/builder (includes changes to the block UI component)
infra/prod

Use these scopes and subscopes for clarity and consistency in commit messages.
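
A title check based on the types and scopes listed above could look like the following sketch. It assumes the usual Conventional Commits shape `type(scope)!: summary`, and it treats the recommended base scopes as required, which is stricter than the text demands. `is_valid_title` is a hypothetical helper.

```python
import re

TYPES = ("feat", "fix", "refactor", "ci", "docs", "dx")
BASE_SCOPES = ("platform", "frontend", "backend", "infra", "blocks")

# type, optional (scope or scope/subscope), optional "!" for breaking changes,
# then ": " and a non-empty summary.
TITLE_RE = re.compile(
    r"^(?P<type>[a-z]+)"
    r"(?:\((?P<scope>[a-z]+(?:/[a-z]+)?)\))?"
    r"(?P<breaking>!)?"
    r": (?P<summary>\S.*)$"
)


def is_valid_title(title: str) -> bool:
    """Return True if `title` uses a listed type and, if scoped, a listed base scope."""
    match = TITLE_RE.match(title)
    if match is None or match.group("type") not in TYPES:
        return False
    scope = match.group("scope")
    # Only the base scope (the part before "/") is checked against the list.
    return scope is None or scope.split("/")[0] in BASE_SCOPES


ok = is_valid_title("fix(backend/executor): handle empty graph")
```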
