Commit Graph

19 Commits

Author SHA1 Message Date
Zamil Majdy
8b83bb8647 feat(backend): unified hybrid search with embedding backfill for all content types (#11767)
## Summary

This PR extends the embedding system to support **blocks** and
**documentation** content types in addition to store agents, and
introduces **unified hybrid search** across all content types using a
single `UnifiedContentEmbedding` table.

### Key Changes

1. **Unified Hybrid Search Architecture**
   - Added `search` tsvector column to `UnifiedContentEmbedding` table
   - New `unified_hybrid_search()` function searches across all content
     types (agents, blocks, docs)
   - Updated `hybrid_search()` for store agents to use
     `UnifiedContentEmbedding.search`
   - Removed deprecated `search` column from `StoreListingVersion` table

2. **Pluggable Content Handler Architecture**
   - Created abstract `ContentHandler` base class for extensibility
   - Implemented handlers: `StoreAgentHandler`, `BlockHandler`,
     `DocumentationHandler`
   - Registry pattern for easy addition of new content types

3. **Block Embeddings**
   - Discovers all blocks using `get_blocks()`
   - Extracts searchable text from: name, description, categories,
     input/output schemas

4. **Documentation Embeddings**
   - Scans `/docs/` directory for `.md` and `.mdx` files
   - Extracts title from first `#` heading or uses filename as fallback

5. **Hybrid Search Graceful Degradation**
   - Falls back to lexical-only search if query embedding generation fails
   - Redistributes semantic weight proportionally to other components
   - Logs warning instead of throwing error

6. **Database Migrations**
   - `20260115200000_add_unified_search_tsvector`: Adds `search` column to
     `UnifiedContentEmbedding` with auto-update trigger
   - `20260115210000_remove_storelistingversion_search`: Removes deprecated
     `search` column and updates `StoreAgent` view

7. **Orphan Cleanup**
   - `cleanup_orphaned_embeddings()` removes embeddings for deleted content
   - Always runs after backfill, even at 100% coverage
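
The proportional weight redistribution described under "Hybrid Search Graceful Degradation" can be sketched as below. This is an illustration only, not the code in `hybrid_search.py`: the function name `redistribute_weights`, the component names, and the weight values are all assumptions, since the PR description does not state them.

```python
def redistribute_weights(weights: dict[str, float], dropped: str) -> dict[str, float]:
    """Drop one scoring component and rescale the others.

    The remaining components keep their relative ratios, and the total
    weight stays the same as before the drop.
    """
    remaining = {name: w for name, w in weights.items() if name != dropped}
    remaining_total = sum(remaining.values())
    if remaining_total <= 0:
        raise ValueError("no positive weight left to redistribute to")
    scale = sum(weights.values()) / remaining_total
    return {name: w * scale for name, w in remaining.items()}


# Hypothetical weights; the real values are not given in the PR description.
weights = {"semantic": 0.5, "lexical": 0.3, "popularity": 0.2}

# Simulates the fallback path taken when query embedding generation fails.
lexical_only = redistribute_weights(weights, dropped="semantic")
```

With these example numbers, `lexical` and `popularity` keep their 3:2 ratio and together absorb the weight that `semantic` used to carry.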

### Review Comments Addressed

- SQL parameter index bug when `user_id` provided (embeddings.py)
- Early return skipping cleanup at 100% coverage (scheduler.py)
- Inconsistent return structure across code paths (scheduler.py)
- SQL UNION syntax error - added parentheses for ORDER BY/LIMIT
  (hybrid_search.py)
- Version numeric ordering in aggregations (migration)
- Embedding dimension uses `EMBEDDING_DIM` constant

### Files Changed

- `backend/api/features/store/content_handlers.py` (NEW): Handler
architecture
- `backend/api/features/store/embeddings.py`: Refactored to use handlers
- `backend/api/features/store/hybrid_search.py`: Unified search +
graceful degradation
- `backend/executor/scheduler.py`: Process all content types, consistent
returns
- `migrations/20260115200000_add_unified_search_tsvector/`: Add tsvector
to unified table
- `migrations/20260115210000_remove_storelistingversion_search/`: Remove
old search column
- `schema.prisma`: Updated UnifiedContentEmbedding and
StoreListingVersion models
- `*_test.py`: Added tests for unified_hybrid_search
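
A rough sketch of the handler architecture in `content_handlers.py`, assuming a simple dictionary registry. Only the class names `ContentHandler`, `BlockHandler`, and `DocumentationHandler` come from the PR description. The method name `searchable_text`, the `dict` item shape, the `extract_title` helper, and the `HANDLERS` registry are hypothetical and may differ from the real implementation.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class ContentHandler(ABC):
    """Base class: one subclass per searchable content type."""

    content_type: str

    @abstractmethod
    def searchable_text(self, item: dict) -> str:
        """Return the text that would be embedded and indexed for `item`."""


class BlockHandler(ContentHandler):
    content_type = "block"

    def searchable_text(self, item: dict) -> str:
        # Hypothetical item shape: name, description, list of categories.
        parts = [item.get("name", ""), item.get("description", "")]
        parts.extend(item.get("categories", []))
        return " ".join(p for p in parts if p)


class DocumentationHandler(ContentHandler):
    content_type = "documentation"

    def searchable_text(self, item: dict) -> str:
        title = extract_title(item["markdown"], Path(item["path"]))
        return f"{title}\n{item['markdown']}"


def extract_title(markdown: str, path: Path) -> str:
    """Use the first `# ` heading as the title, else the file name stem."""
    for line in markdown.splitlines():
        if line.startswith("# "):
            return line[2:].strip()
    return path.stem


# Registry: supporting a new content type means adding one handler here.
HANDLERS: dict[str, ContentHandler] = {
    h.content_type: h for h in (BlockHandler(), DocumentationHandler())
}
```

A caller would look up `HANDLERS[content_type]` and never branch on the content type itself, which is what keeps new types cheap to add.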

## Test Plan

1. All tests passing on Python 3.11, 3.12, 3.13
2. Type checks passing
3. CodeRabbit and Sentry reviews addressed
4. Deploy to staging and verify:
   - Backfill job processes all content types
   - Search results include blocks and docs
   - Search works without OpenAI API (graceful degradation)

🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Swifty <craigswift13@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-16 09:47:19 +01:00
Swifty
843c487500 feat(backend): add prisma types stub generator for pyright compatibility (#11736)
Prisma's generated `types.py` file is 57,000+ lines with complex
recursive TypedDict definitions that exhaust Pyright's type inference
budget. This causes random type errors and makes the type checker
unreliable.

### Changes 🏗️

- Add `gen_prisma_types_stub.py` script that generates a lightweight
`.pyi` stub file
- The stub preserves safe types (Literal, TypeVar) while collapsing
complex TypedDicts to `dict[str, Any]`
- Integrate stub generation into all workflows that run `prisma
generate`:
  - `platform-backend-ci.yml`
  - `claude.yml`
  - `claude-dependabot.yml`
  - `copilot-setup-steps.yml`
  - `docker-compose.platform.yml`
  - `Dockerfile`
  - `Makefile` (migrate & reset-db targets)
  - `linter.py` (lint & format commands)
- Add `gen-prisma-stub` poetry script entry
- Fix two pre-existing type errors that were previously masked:
  - `store/db.py`: Replace private type
    `_StoreListingVersion_version_OrderByInput` with dict literal
  - `airtable/_webhook.py`: Add cast for `Serializable` type
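
The collapsing step can be illustrated with a small sketch built on the standard `ast` module. This is not the actual `gen_prisma_types_stub.py`: the function names, the example input, and the `UserWhereInput` type are invented for illustration, and the real script may well handle more cases.

```python
import ast


def _is_typed_dict(base: ast.expr) -> bool:
    # Matches both `TypedDict` and qualified forms such as `typing.TypedDict`.
    if isinstance(base, ast.Name):
        return base.id == "TypedDict"
    if isinstance(base, ast.Attribute):
        return base.attr == "TypedDict"
    return False


def collapse_typed_dicts(source: str) -> str:
    """Return stub text where every top-level TypedDict class becomes
    `Name = dict[str, Any]`; all other top-level statements are kept."""
    lines = ["from typing import Any"]
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef) and any(map(_is_typed_dict, node.bases)):
            lines.append(f"{node.name} = dict[str, Any]")
        else:
            lines.append(ast.unparse(node))  # requires Python 3.9+
    return "\n".join(lines) + "\n"


# Hypothetical input resembling a small slice of a generated types module.
example = '''
from typing import Literal, TypedDict

SortOrder = Literal["asc", "desc"]

class UserWhereInput(TypedDict, total=False):
    id: str
    AND: "list[UserWhereInput]"
'''

stub = collapse_typed_dicts(example)
```

Here the recursive `UserWhereInput` definition is reduced to a plain alias, while the `Literal` alias passes through unchanged, so the type checker has far less to infer.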

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Run `poetry run format` - passes with 0 errors (down from 57+)
  - [x] Run `poetry run lint` - passes with 0 errors
  - [x] Run `poetry run gen-prisma-stub` - generates stub successfully
  - [x] Verify stub file is created at correct location with proper
    content

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
  * Added a lightweight Prisma type-stub generator and integrated it into
    build, lint, CI/CD, and container workflows.
  * Build, migration, formatting, and lint steps now generate these stubs
    to improve type-checking performance and reduce overhead during builds
    and deployments.
  * Exposed a project command to run stub generation manually.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-09 16:31:10 +01:00
Ubbe
ebfbf31c73 ci(frontend): query generation on dev and ci check (#10417)
## Changes 🏗️

- Run the API query generation as part of the `dev` command
  - update the `README` to reflect this
- Add CI job to generate queries and type-check to make sure we are not
out of sync
  - the job runs on both front-end and back-end changes
- Generate the files via script to load the BE URL dynamically from the
env
- Remove generated files from Git 
- rename the `type-check` command to `types`

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] CI passes
  - [x] `README` updates make sense 

#### For configuration changes:

None

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2025-08-19 11:21:36 +00:00
Zamil Majdy
4bfeddc03d feat(platform/docker): add frontend service to docker-compose with env config improvements (#10615)
## Summary
This PR adds the frontend service to the Docker Compose configuration,
enabling `docker compose up` to run the complete stack, including the
frontend. It also implements comprehensive environment variable
improvements, unified .env file support, and fixes Docker networking
issues.

## Key Changes

### 🐳 Docker Compose Improvements
- **Added frontend service** to `docker-compose.yml` and
`docker-compose.platform.yml`
- **Production build**: Uses `pnpm build + serve` instead of dev server
for better stability and lower memory usage
- **Service dependencies**: Frontend now waits for backend services
(`rest_server`, `websocket_server`) to be ready
- **YAML anchors**: Implemented DRY configuration to avoid duplicating
environment values

### 📁 Unified .env File Support
- **Frontend .env loading**: Automatically loads `.env` file during
Docker build and runtime
- **Backend .env loading**: Optional `.env` file support with fallback
to sensible defaults in `settings.py`
- **Single source of truth**: All `NEXT_PUBLIC_*` and API keys can be
defined in respective `.env` files
- **Docker integration**: Updated `.dockerignore` to include `.env`
files in build context
- **Git tracking**: Frontend and backend `.env` files are now trackable
(removed from gitignore)

### 🔧 Environment Variable Architecture
- **Dual environment strategy**: 
  - Server-side code uses Docker service names
    (`http://rest_server:8006/api`)
  - Client-side code uses localhost URLs (`http://localhost:8006/api`)
- **Comprehensive config**: Added build args and runtime environment
variables
- **Network compatibility**: Fixes connection issues between frontend
and backend containers
- **Shared backend variables**: Common environment variables (service
hosts, auth settings) centralized using YAML anchors

### 🛠️ Code Improvements
- **Centralized env-config helper** (`/frontend/src/lib/env-config.ts`)
with server-side priority
- **Updated all frontend code** to use shared environment helpers
instead of direct `process.env` access
- **Consistent API**: All environment variable access now goes through
helper functions
- **Settings.py improvements**: Better defaults for CORS origins and
optional .env file loading

### 🔗 Files Changed
- `docker-compose.yml` & `docker-compose.platform.yml` - Added frontend
service and shared backend env vars
- `frontend/Dockerfile` - Simplified build process to use .env files
directly
- `backend/settings.py` - Optional .env loading and better defaults
- `frontend/src/lib/env-config.ts` - New centralized environment
configuration
- `.dockerignore` - Allow .env files in build context
- `.gitignore` - Updated to allow frontend/backend .env files
- Multiple frontend files - Updated to use env helpers
- Updates to both auto installer scripts to work with the latest setup!

## Benefits
- **Single command deployment**: `docker compose up` now runs
  everything
- **Better reliability**: Production build reduces memory usage and
  crashes
- **Network compatibility**: Proper container-to-container
  communication
- **Maintainable config**: Centralized environment variable management
  with .env files
- **Development friendly**: Works in both Docker and local development
- **API key management**: Easy configuration through .env files for
  all services
- **No more manual env vars**: Frontend and backend automatically load
  their respective .env files

## Testing
- Verified Docker service communication works correctly
- Frontend responds and serves content properly
- Environment variables are correctly resolved in both server and
  client contexts
- No connection errors after implementing service dependencies
- .env file loading works correctly in both build and runtime phases
- Backend services work with and without .env files present

### Checklist 📋

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)

🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Lluis Agusti <hi@llu.lu>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
Co-authored-by: Claude <claude@users.noreply.github.com>
Co-authored-by: Bentlybro <Github@bentlybro.com>
2025-08-14 03:28:18 +00:00
Ubbe
e3590e1eb0 chore(frontend): ci caching + e2e test data script (#10446)
## Changes 🏗️

- Make docker + deps cache actually work on the FE CI
- Run the E2E test data script before Playwright

## Checklist 📋

### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] CI is faster in repeated runs ( _uses cache_ )
  - [x] Test data script runs successfully 

### For configuration changes:
None
2025-07-24 19:17:14 +00:00
Ubbe
73a3d980ca chore(frontend): move from yarn1 to pnpm (#10072)
## 🧢 Overview
This PR migrates the AutoGPT Platform frontend from [yarn
1](https://classic.yarnpkg.com/lang/en/) to [pnpm](https://pnpm.io/)
using **corepack** for automatic package manager management.

**yarn1** is no longer maintained and is showing its age. By moving to
**pnpm** we get:
- Significantly faster install times
- 💾 Better disk space efficiency
- 🛠️ Better community support and maintenance
- 💆🏽‍♂️ An easy config swap

## 🏗️ Changes

### Package Management Migration

- updated [corepack](https://github.com/nodejs/corepack) to use
[pnpm](https://pnpm.io/)
- Deleted `yarn.lock` and generated new `pnpm-lock.yaml`
- Updated `.gitignore`

### Documentation Updates

- `frontend/README.md`: 
  - added comprehensive tech stack overview with links
  - updated all commands to use pnpm
  - added corepack setup instructions
  - included a migration disclaimer for yarn users
- `backend/README.md`: 
  - Updated installation instructions to use pnpm with corepack
- `AGENTS.md`: 
  - Updated testing commands from yarn to pnpm

### CI/CD & Infrastructure

- **GitHub Workflows** : 
  - updated all jobs to use pnpm with corepack enable
  - cleaned FE Playwright test workflow to avoid Sentry noise
- **Dockerfile**:
  - updated to use pnpm with corepack, changed lock file reference, and
    updated cache mount path

###  📋 Checklist

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  
  **Test Plan:**
  > assuming you are in the `frontend` folder
  - [x] Clean installation works: `rm -rf node_modules && corepack enable
    && pnpm install`
  - [x] Development server starts correctly: `pnpm dev`
  - [x] Build process works: `pnpm build`
  - [x] Linting and formatting work: `pnpm lint` and `pnpm format`
  - [x] Type checking works: `pnpm type-check`
  - [x] Tests run successfully: `pnpm test`
  - [x] Storybook starts correctly: `pnpm storybook`
  - [x] Docker build succeeds with new pnpm configuration
  - [x] GitHub Actions workflow passes with pnpm commands

#### For configuration changes:
- [x] `.env.example` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
2025-06-04 17:07:29 +04:00
Reinier van der Leer
d638c1f484 Fix Poetry v2.0.0 compatibility (#9197)
Make all changes necessary to make everything work with Poetry v2.0.0.

- Resolves #9196

## Changes
- Removed `--no-update` flag from `poetry lock` command in codebase
- Removed extra path arguments from `poetry -C [path] run [command]`
occurrences
- Regenerated all lock files in hierarchical order
- Added workaround for Poetry bug where `packages.[i].format` is now
suddenly required

Additionally:
- Fixed up .dockerignore
  - Fixes .venv being erroneously copied over from local
  - Fixes build context bloat (300MB -> 2.5MB)
- Fixed warnings about entrypoint script not being installed in docker
builds

### Relevant (breaking) changes in v2.0.0
- `--no-update` flag no longer exists for `poetry lock` as it has become
default behavior
- The `-C` option now actually changes the directory, so any path
arguments in `poetry run` commands can/must be removed
- Poetry v2.0.0 uses the new v2.1 lock file spec, so all lock files have
to be regenerated to avoid false-positive lock file updates and checks
on future PRs
- **BUG:** when specifying `poetry.tool.packages`, `format` is required
now
  - python-poetry/poetry#9961

Full Poetry v2.0.0 release notes and change log:
https://python-poetry.org/blog/announcing-poetry-2.0.0
2025-01-06 23:34:49 +01:00
Aarushi
fc51176a56 fix(.dockerignore) Put dockerignore back (#8136)
* put dockerignore back

* add classic prefix
2024-09-23 09:26:30 +00:00
Swifty
ef7cfbb860 refactor: AutoGPT Platform Stealth Launch Repo Re-Org (#8113)
Restructuring the Repo to make it clear the difference between classic autogpt and the autogpt platform:
* Move the "classic" projects `autogpt`, `forge`, `frontend`, and `benchmark` into a `classic` folder
  * Also rename `autogpt` to `original_autogpt` for absolute clarity
* Rename `rnd/` to `autogpt_platform/`
  * `rnd/autogpt_builder` -> `autogpt_platform/frontend`
  * `rnd/autogpt_server` -> `autogpt_platform/backend`
* Adjust any paths accordingly
2024-09-20 16:50:43 +02:00
Aarushi
ab60a57379 tweak(rnd): Ignore .env in market (#8035)
ignore .env
2024-09-11 11:01:34 +01:00
Aarushi
0b919522ae feat(rnd): Split Execution Manager (#8008)
* split execution manager and removed ns and use direct uri with k8s and docker specific dns

* formatting

* split execution manager

* refactor(builder): Fix linting warning and errors (#8021)

* Fix lint errors

* Fix dependency loop

* address feedback

* docker compose

* remove ns entirely

* remove yarn lock changes

* update readme

* remove ref

* dockerfile and log

* update log

* debug

* rename to executor

* remove execution from rest

* exec.py

* linting

* update tests to use config

* fix test

---------

Co-authored-by: Krzysztof Czerwinski <34861343+kcze@users.noreply.github.com>
2024-09-10 10:05:31 +01:00
Zamil Majdy
b4b5a09b6b fix(rnd): Dockerfile Avoid full rebuild on each file change (#7971)
Co-authored-by: Aarushi <50577581+aarushik93@users.noreply.github.com>
2024-09-04 17:30:13 +00:00
Aarushi
5000aa7ee0 tweak(rnd,docker) Remove SQLite (#7966)
* move migrations, update networking and dockignore

* update docs

* remove sqlite from ci

* remove schema linting checks

* fix formatting

* remove schema linting

* add test script

* formatting and linting

* stop pg not down

* separate test db

* diff port

* remove duplicate
2024-09-04 10:18:57 +01:00
Aarushi
699087e289 feat(rnd) Add dockerfiles (#7523)
* replace SQLite with Postgres

* dockerfiles and optional docker compose set up

* Update rnd/autogpt_builder/Dockerfile

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>

* address feedback

* Update .dockerignore

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>

* Remove example files folder

* remove backend and frontend from docker compose

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2024-07-24 10:01:22 +01:00
Reinier van der Leer
5292736779 fix(agent): Unbreak docker builds after repo restructure (#7164)
- Move `autogpt/Dockerfile` to `Dockerfile.autogpt`
  - Write new selective `.dockerignore` (in repo root) to keep build context clean
  - Amend `autogpt/docker-compose.yml` and all `autogpt-docker-*.yml` workflows accordingly

- Include `forge/` in docker build context so it can be used as a path dependency

- Include `frontend/` in docker builds
2024-05-22 18:11:16 +02:00
Merwane Hamadi
8489052358 Move Auto-GPT to autogpts/autogpt
2023-09-05 09:40:24 -07:00
Reinier van der Leer
ba030eac1d Reduce docker build bloat
2023-08-31 02:50:54 +02:00
merwanehamadi
0c8f2cfd1c Fix autogpt docker image not working because missing prompt_settings (#4680)
Co-authored-by: Richard Beales <rich@richbeales.net>
2023-06-13 20:18:39 +01:00
Reinier van der Leer
9c60eecce6 Improve docker setup & config (#1843)
* Improve docker setup & config

* fix(browsing): Selenium needs access to home directory

* fix(docker): allow overriding memory backend settings

* simplify Dockerfile and docker-compose config

* add .dockerignore

* adjust Docker CI with release build type arg

* replace Chrome by Chromium in devcontainer

* update docs

* update bulletin

* use preinstalled chromedriver in web_selenium.py

* update installation.md

* fix code blocks for mkdocs

* fix links to docs
2023-04-24 14:27:53 +01:00