mirror of
https://github.com/Significant-Gravitas/AutoGPT.git
synced 2026-01-09 15:17:59 -05:00
# Query Optimization for AgentNodeExecution Tables
## Overview
This PR describes the database index optimizations applied to improve
the performance of slow queries in the AutoGPT platform backend.
## Problem Analysis
The following queries were identified as consuming significant database
resources:
### 1. Complex Filtering Query (19.3% of total time)
```sql
SELECT ... FROM "AgentNodeExecution"
WHERE "agentNodeId" = $1
AND "agentGraphExecutionId" = $2
AND "executionStatus" = $3
AND "id" NOT IN (
SELECT "referencedByInputExecId"
FROM "AgentNodeExecutionInputOutput"
WHERE "name" = $4 AND "referencedByInputExecId" IS NOT NULL
)
ORDER BY "addedTime" ASC
```
### 2. Multi-table JOIN Query (8.9% of total time)
```sql
SELECT ... FROM "AgentNodeExecution"
LEFT JOIN "AgentNode" ON ...
LEFT JOIN "AgentBlock" ON ...
WHERE "AgentBlock"."id" IN (...)
AND "executionStatus" != $11
AND "agentGraphExecutionId" IN (...)
ORDER BY "queuedTime" DESC, "addedTime" DESC
```
### 3. Bulk Graph Execution Queries (multiple variations, ~10% combined)
Multiple queries filtering by `agentGraphExecutionId` with various
ordering requirements.
## Optimization Strategy
### 1. Composite Indexes for AgentNodeExecution
Set the following composite indexes to the `AgentNodeExecution` model:
```prisma
@@index([agentGraphExecutionId, agentNodeId, executionStatus])
@@index([addedTime, queuedTime])
```
#### Benefits:
- **Index 1**: Covers the exact WHERE clause of the complex filtering
query, allowing index-only scans
- **Index 2**: Optimizes queries filtering by graph execution and status
- **Index 3**: Supports efficient sorting when filtering by graph
execution
- **Index 4**: Optimizes ORDER BY operations on time fields
### 2. Composite Index for AgentNodeExecutionInputOutput
Added the following composite index:
```prisma
// Input and Output pin names are unique for each AgentNodeExecution.
@@unique([referencedByInputExecId, referencedByOutputExecId, name])
@@index([referencedByOutputExecId])
// Composite index for `upsert_execution_input`.
@@index([name, time])
```
#### Benefits:
- Dramatically improves the NOT IN subquery performance in Query 1
- Allows the database to use an index scan instead of a full table scan
- Reduces the subquery execution time from O(n) to O(log n)
## Expected Performance Improvements
1. **Query 1 (19.3% of total time)**:
- Expected improvement: 80-90% reduction in execution time
- The composite index on `[agentNodeId, agentGraphExecutionId,
executionStatus]` will allow index-only scans
- The subquery will benefit from the new index on
`AgentNodeExecutionInputOutput`
2. **Query 2 (8.9% of total time)**:
- Expected improvement: 50-70% reduction in execution time
- The `[agentGraphExecutionId, executionStatus]` index will reduce the
initial filtering cost
3. **Bulk Queries (10% combined)**:
- Expected improvement: 60-80% reduction in execution time
- Composite indexes including time fields will optimize sorting
operations
## Migration Considerations
1. **Index Creation Time**: Creating these indexes on existing large
tables may take time
2. **Storage Impact**: Each index requires additional storage space
3. **Write Performance**: Slight decrease in INSERT/UPDATE performance
due to index maintenance
## Additional Optimizations
### NotificationEvent Table Index
Added index for notification batch queries:
```prisma
@@index([userNotificationBatchId])
```
This optimizes the query:
```sql
SELECT ... FROM "NotificationEvent"
WHERE "userNotificationBatchId" IN (...)
```
#### Benefits:
- Eliminates full table scans when filtering by batch ID
- Improves query performance from O(n) to O(log n)
- Particularly beneficial for users with many notification events
## Future Optimizations
Consider these additional optimizations if needed:
1. Partitioning `AgentNodeExecution` table by `createdAt` or
`agentGraphExecutionId`
2. Implementing materialized views for frequently accessed aggregate
data
3. Adding covering indexes for specific query patterns
4. Implementing query result caching at the application level
---------
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
279 lines
10 KiB
YAML
279 lines
10 KiB
YAML
repos:
|
|
- repo: https://github.com/pre-commit/pre-commit-hooks
|
|
rev: v4.4.0
|
|
hooks:
|
|
- id: check-added-large-files
|
|
args: ["--maxkb=500"]
|
|
- id: fix-byte-order-marker
|
|
- id: check-case-conflict
|
|
- id: check-merge-conflict
|
|
- id: check-symlinks
|
|
- id: debug-statements
|
|
|
|
- repo: https://github.com/Yelp/detect-secrets
|
|
rev: v1.5.0
|
|
hooks:
|
|
- id: detect-secrets
|
|
name: Detect secrets
|
|
description: Detects high entropy strings that are likely to be passwords.
|
|
files: ^autogpt_platform/
|
|
stages: [pre-push]
|
|
|
|
- repo: local
|
|
# For proper type checking, all dependencies need to be up-to-date.
|
|
# It's also a good idea to check that poetry.lock is consistent with pyproject.toml.
|
|
hooks:
|
|
- id: poetry-install
|
|
name: Check & Install dependencies - AutoGPT Platform - Backend
|
|
alias: poetry-install-platform-backend
|
|
entry: poetry -C autogpt_platform/backend install
|
|
# include autogpt_libs source (since it's a path dependency)
|
|
files: ^autogpt_platform/(backend|autogpt_libs)/poetry\.lock$
|
|
types: [file]
|
|
language: system
|
|
pass_filenames: false
|
|
|
|
- id: poetry-install
|
|
name: Check & Install dependencies - AutoGPT Platform - Libs
|
|
alias: poetry-install-platform-libs
|
|
entry: poetry -C autogpt_platform/autogpt_libs install
|
|
files: ^autogpt_platform/autogpt_libs/poetry\.lock$
|
|
types: [file]
|
|
language: system
|
|
pass_filenames: false
|
|
|
|
- id: poetry-install
|
|
name: Check & Install dependencies - Classic - AutoGPT
|
|
alias: poetry-install-classic-autogpt
|
|
entry: poetry -C classic/original_autogpt install
|
|
# include forge source (since it's a path dependency)
|
|
files: ^classic/(original_autogpt|forge)/poetry\.lock$
|
|
types: [file]
|
|
language: system
|
|
pass_filenames: false
|
|
|
|
- id: poetry-install
|
|
name: Check & Install dependencies - Classic - Forge
|
|
alias: poetry-install-classic-forge
|
|
entry: poetry -C classic/forge install
|
|
files: ^classic/forge/poetry\.lock$
|
|
types: [file]
|
|
language: system
|
|
pass_filenames: false
|
|
|
|
- id: poetry-install
|
|
name: Check & Install dependencies - Classic - Benchmark
|
|
alias: poetry-install-classic-benchmark
|
|
entry: poetry -C classic/benchmark install
|
|
files: ^classic/benchmark/poetry\.lock$
|
|
types: [file]
|
|
language: system
|
|
pass_filenames: false
|
|
|
|
- repo: local
|
|
# For proper type checking, Prisma client must be up-to-date.
|
|
hooks:
|
|
- id: prisma-generate
|
|
name: Prisma Generate - AutoGPT Platform - Backend
|
|
alias: prisma-generate-platform-backend
|
|
entry: bash -c 'cd autogpt_platform/backend && poetry run prisma generate'
|
|
# include everything that triggers poetry install + the prisma schema
|
|
files: ^autogpt_platform/((backend|autogpt_libs)/poetry\.lock|backend/schema.prisma)$
|
|
types: [file]
|
|
language: system
|
|
pass_filenames: false
|
|
|
|
- repo: https://github.com/astral-sh/ruff-pre-commit
|
|
rev: v0.7.2
|
|
hooks:
|
|
- id: ruff
|
|
name: Lint (Ruff) - AutoGPT Platform - Backend
|
|
alias: ruff-lint-platform-backend
|
|
files: ^autogpt_platform/backend/
|
|
args: [--fix]
|
|
|
|
- id: ruff
|
|
name: Lint (Ruff) - AutoGPT Platform - Libs
|
|
alias: ruff-lint-platform-libs
|
|
files: ^autogpt_platform/autogpt_libs/
|
|
args: [--fix]
|
|
|
|
- id: ruff-format
|
|
name: Format (Ruff) - AutoGPT Platform - Libs
|
|
alias: ruff-lint-platform-libs
|
|
files: ^autogpt_platform/autogpt_libs/
|
|
|
|
- repo: local
|
|
# isort needs the context of which packages are installed to function, so we
|
|
# can't use a vendored isort pre-commit hook (which runs in its own isolated venv).
|
|
hooks:
|
|
- id: isort
|
|
name: Lint (isort) - AutoGPT Platform - Backend
|
|
alias: isort-platform-backend
|
|
entry: poetry -P autogpt_platform/backend run isort -p backend
|
|
files: ^autogpt_platform/backend/
|
|
types: [file, python]
|
|
language: system
|
|
|
|
- id: isort
|
|
name: Lint (isort) - Classic - AutoGPT
|
|
alias: isort-classic-autogpt
|
|
entry: poetry -P classic/original_autogpt run isort -p autogpt
|
|
files: ^classic/original_autogpt/
|
|
types: [file, python]
|
|
language: system
|
|
|
|
- id: isort
|
|
name: Lint (isort) - Classic - Forge
|
|
alias: isort-classic-forge
|
|
entry: poetry -P classic/forge run isort -p forge
|
|
files: ^classic/forge/
|
|
types: [file, python]
|
|
language: system
|
|
|
|
- id: isort
|
|
name: Lint (isort) - Classic - Benchmark
|
|
alias: isort-classic-benchmark
|
|
entry: poetry -P classic/benchmark run isort -p agbenchmark
|
|
files: ^classic/benchmark/
|
|
types: [file, python]
|
|
language: system
|
|
|
|
- repo: https://github.com/psf/black
|
|
rev: 24.10.0
|
|
# Black has sensible defaults, doesn't need package context, and ignores
|
|
# everything in .gitignore, so it works fine without any config or arguments.
|
|
hooks:
|
|
- id: black
|
|
name: Format (Black)
|
|
|
|
- repo: https://github.com/PyCQA/flake8
|
|
rev: 7.0.0
|
|
# To have flake8 load the config of the individual subprojects, we have to call
|
|
# them separately.
|
|
hooks:
|
|
- id: flake8
|
|
name: Lint (Flake8) - Classic - AutoGPT
|
|
alias: flake8-classic-autogpt
|
|
files: ^classic/original_autogpt/(autogpt|scripts|tests)/
|
|
args: [--config=classic/original_autogpt/.flake8]
|
|
|
|
- id: flake8
|
|
name: Lint (Flake8) - Classic - Forge
|
|
alias: flake8-classic-forge
|
|
files: ^classic/forge/(forge|tests)/
|
|
args: [--config=classic/forge/.flake8]
|
|
|
|
- id: flake8
|
|
name: Lint (Flake8) - Classic - Benchmark
|
|
alias: flake8-classic-benchmark
|
|
files: ^classic/benchmark/(agbenchmark|tests)/((?!reports).)*[/.]
|
|
args: [--config=classic/benchmark/.flake8]
|
|
|
|
- repo: local
|
|
hooks:
|
|
- id: prettier
|
|
name: Format (Prettier) - AutoGPT Platform - Frontend
|
|
alias: format-platform-frontend
|
|
entry: bash -c 'cd autogpt_platform/frontend && npx prettier --write $(echo "$@" | sed "s|autogpt_platform/frontend/||g")' --
|
|
files: ^autogpt_platform/frontend/
|
|
types: [file]
|
|
language: system
|
|
|
|
- repo: local
|
|
# To have watertight type checking, we check *all* the files in an affected
|
|
# project. To trigger on poetry.lock we also reset the file `types` filter.
|
|
hooks:
|
|
- id: pyright
|
|
name: Typecheck - AutoGPT Platform - Backend
|
|
alias: pyright-platform-backend
|
|
entry: poetry -C autogpt_platform/backend run pyright
|
|
# include forge source (since it's a path dependency) but exclude *_test.py files:
|
|
files: ^autogpt_platform/(backend/((backend|test)/|(\w+\.py|poetry\.lock)$)|autogpt_libs/(autogpt_libs/.*(?<!_test)\.py|poetry\.lock)$)
|
|
types: [file]
|
|
language: system
|
|
pass_filenames: false
|
|
|
|
- id: pyright
|
|
name: Typecheck - AutoGPT Platform - Libs
|
|
alias: pyright-platform-libs
|
|
entry: poetry -C autogpt_platform/autogpt_libs run pyright
|
|
files: ^autogpt_platform/autogpt_libs/(autogpt_libs/|poetry\.lock$)
|
|
types: [file]
|
|
language: system
|
|
pass_filenames: false
|
|
|
|
- id: pyright
|
|
name: Typecheck - Classic - AutoGPT
|
|
alias: pyright-classic-autogpt
|
|
entry: poetry -C classic/original_autogpt run pyright
|
|
# include forge source (since it's a path dependency) but exclude *_test.py files:
|
|
files: ^(classic/original_autogpt/((autogpt|scripts|tests)/|poetry\.lock$)|classic/forge/(forge/.*(?<!_test)\.py|poetry\.lock)$)
|
|
types: [file]
|
|
language: system
|
|
pass_filenames: false
|
|
|
|
- id: pyright
|
|
name: Typecheck - Classic - Forge
|
|
alias: pyright-classic-forge
|
|
entry: poetry -C classic/forge run pyright
|
|
files: ^classic/forge/(forge/|poetry\.lock$)
|
|
types: [file]
|
|
language: system
|
|
pass_filenames: false
|
|
|
|
- id: pyright
|
|
name: Typecheck - Classic - Benchmark
|
|
alias: pyright-classic-benchmark
|
|
entry: poetry -C classic/benchmark run pyright
|
|
files: ^classic/benchmark/(agbenchmark/|tests/|poetry\.lock$)
|
|
types: [file]
|
|
language: system
|
|
pass_filenames: false
|
|
|
|
- repo: local
|
|
hooks:
|
|
- id: tsc
|
|
name: Typecheck - AutoGPT Platform - Frontend
|
|
entry: bash -c 'cd autogpt_platform/frontend && npm run type-check'
|
|
files: ^autogpt_platform/frontend/
|
|
types: [file]
|
|
language: system
|
|
pass_filenames: false
|
|
|
|
# - repo: local
|
|
# hooks:
|
|
# - id: pytest
|
|
# name: Run tests - AutoGPT Platform - Backend
|
|
# alias: pytest-platform-backend
|
|
# entry: bash -c 'cd autogpt_platform/backend && poetry run pytest'
|
|
# # include autogpt_libs source (since it's a path dependency) but exclude *_test.py files:
|
|
# files: ^autogpt_platform/(backend/((backend|test)/|poetry\.lock$)|autogpt_libs/(autogpt_libs/.*(?<!_test)\.py|poetry\.lock)$)
|
|
# language: system
|
|
# pass_filenames: false
|
|
|
|
# - id: pytest
|
|
# name: Run tests - Classic - AutoGPT (excl. slow tests)
|
|
# alias: pytest-classic-autogpt
|
|
# entry: bash -c 'cd classic/original_autogpt && poetry run pytest --cov=autogpt -m "not slow" tests/unit tests/integration'
|
|
# # include forge source (since it's a path dependency) but exclude *_test.py files:
|
|
# files: ^(classic/original_autogpt/((autogpt|tests)/|poetry\.lock$)|classic/forge/(forge/.*(?<!_test)\.py|poetry\.lock)$)
|
|
# language: system
|
|
# pass_filenames: false
|
|
|
|
# - id: pytest
|
|
# name: Run tests - Classic - Forge (excl. slow tests)
|
|
# alias: pytest-classic-forge
|
|
# entry: bash -c 'cd classic/forge && poetry run pytest --cov=forge -m "not slow"'
|
|
# files: ^classic/forge/(forge/|tests/|poetry\.lock$)
|
|
# language: system
|
|
# pass_filenames: false
|
|
|
|
# - id: pytest
|
|
# name: Run tests - Classic - Benchmark
|
|
# alias: pytest-classic-benchmark
|
|
# entry: bash -c 'cd classic/benchmark && poetry run pytest --cov=benchmark'
|
|
# files: ^classic/benchmark/(agbenchmark/|tests/|poetry\.lock$)
|
|
# language: system
|
|
# pass_filenames: false
|