Compare commits

..

60 Commits

Author SHA1 Message Date
Nicholas Tindle
326554d89a style(classic): update black to 24.10.0 and reformat
Update black version to match pre-commit hook (24.10.0) and reformat
all files with the new version.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 10:51:54 -06:00
Nicholas Tindle
5e22a1888a chore: add classic benchmark reports and workspaces to gitignore
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 10:42:55 -06:00
Nicholas Tindle
a4d7b0142f fix(classic): resolve all pyright type errors
- Add missing strategies (lats, multi_agent_debate) to PromptStrategyName
- Fix method override signatures for reasoning_effort parameter
- Fix Pydantic Field() overload issues with helper function
- Fix BeautifulSoup Tag type narrowing in web_fetch.py
- Fix Optional member access in playwright_browser.py and rewoo.py
- Convert hasattr patterns to getattr for proper type narrowing
- Add proper type casts for Literal types
- Fix file storage path type conversions
- Exclude legacy challenges/ from pyright checking

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 10:41:53 -06:00
Nicholas Tindle
7d6375f59c style(classic): fix flake8 line length issue
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 01:25:00 -06:00
Nicholas Tindle
aeec0ce509 chore: add test.db to gitignore 2026-01-20 01:24:22 -06:00
Nicholas Tindle
b32bfcaac5 chore: remove test.db from tracking 2026-01-20 01:24:00 -06:00
Nicholas Tindle
5373a6eb6e style(classic): fix code formatting with black
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 01:23:51 -06:00
Nicholas Tindle
98cde46ccb style(classic): fix import sorting with isort
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 01:23:33 -06:00
Nicholas Tindle
bd10da10d9 ci: update pre-commit hooks for consolidated classic Poetry project
- Consolidate classic poetry-install hooks into single hook using classic/
- Update isort hook to work with consolidated project structure
- Simplify flake8 hooks to use single classic/.flake8 config
- Consolidate pyright hooks into single hook for classic/
- Add direct_benchmark to hook coverage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 01:21:50 -06:00
Nicholas Tindle
60fdee1345 fix(classic): resolve linting and formatting issues for CI compliance
- Update .flake8 config to exclude workspace directories and ignore E203
- Fix import sorting (isort) across multiple files
- Fix code formatting (black) across multiple files
- Remove unused imports and fix line length issues (flake8)
- Fix f-strings without placeholders and unused variables

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 01:16:38 -06:00
Nicholas Tindle
6f2783468c feat(classic): add sub-agent architecture and LATS/multi-agent debate strategies
Add comprehensive sub-agent spawning infrastructure that enables prompt
strategies to coordinate multiple agents for advanced reasoning patterns.

New files:
- forge/agent/execution_context.py: ExecutionContext, ResourceBudget,
  SubAgentHandle, and AgentFactory protocol for sub-agent lifecycle
- agent_factory/default_factory.py: DefaultAgentFactory implementation
- prompt_strategies/lats.py: Language Agent Tree Search using MCTS
  with sub-agents for action expansion and evaluation
- prompt_strategies/multi_agent_debate.py: Multi-agent debate with
  proposal, critique, and consensus phases

Key changes:
- BaseMultiStepPromptStrategy gains spawn_sub_agent(), run_sub_agent(),
  spawn_and_run(), and run_parallel() methods
- Agent class accepts optional ExecutionContext and injects it into strategies
- Sub-agents enabled by default (enable_sub_agents=True)
- Resource limits: max_depth=5, max_sub_agents=25, max_cycles=25

All 7 strategies now available in benchmark:
one_shot, rewoo, plan_execute, reflexion, tree_of_thoughts, lats, multi_agent_debate

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 01:01:28 -06:00
Nicholas Tindle
c1031b286d ci(classic): update CI workflows for consolidated Poetry project
Update all classic CI workflows to use the single consolidated
pyproject.toml at classic/ instead of individual project directories.

Changes:
- classic-autogpt-ci.yml: Run from classic/, update cache key and test paths
- classic-forge-ci.yml: Run from classic/, update cache key and test paths
- classic-benchmark-ci.yml: Run from classic/, use direct-benchmark command
- classic-python-checks.yml: Simplify to single job (no matrix needed)
- classic-autogpts-ci.yml: Update to use direct-benchmark for smoke tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 00:53:50 -06:00
Nicholas Tindle
b849eafb7f feat(direct_benchmark): enable shell command execution with safety denylist
Enable agents to execute shell commands during benchmarks by setting
execute_local_commands=True and using denylist mode to block dangerous
commands (rm, sudo, chmod, kill, etc.) while allowing safe operations.

Also adds ExecutePython challenge to test code execution capability.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 00:52:06 -06:00
Nicholas Tindle
572c3f5e0d refactor(classic): consolidate Poetry projects into single pyproject.toml
Merge forge/, original_autogpt/, and direct_benchmark/ into a single Poetry
project to eliminate cross-project path dependency issues.

Changes:
- Create classic/pyproject.toml with merged dependencies from all three projects
- Remove individual pyproject.toml and poetry.lock files from subdirectories
- Update all CLAUDE.md files to reflect commands run from classic/ root
- Update all README.md files with new installation and usage instructions

All packages are now included via the packages directive:
- forge/forge (core agent framework)
- original_autogpt/autogpt (AutoGPT agent)
- direct_benchmark/direct_benchmark (benchmark harness)

CLI entry points preserved: autogpt, serve, direct-benchmark

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 00:49:56 -06:00
Nicholas Tindle
89003a585d feat(direct_benchmark): show "would have passed" for timed-out challenges
When a challenge times out but the agent's solution would have passed
evaluation, this is now clearly indicated:

- Completion blocks show "TIMEOUT (would have passed)" in yellow
- Recent completions panel shows hourglass icon + "would pass" suffix
- Summary table has new "Would Pass" column
- Final summary shows "+N would pass" count
- Success rate includes "would pass" challenges

The evaluator still runs on timed-out challenges to calculate the score,
but success remains False. This gives visibility into near-misses that
just needed more time.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 00:30:00 -06:00
Nicholas Tindle
0e65785228 fix(direct_benchmark): don't mark timed-out challenges as passed
Previously, the evaluator would run on all results including timed-out
challenges. If the agent happened to write a working solution before
timing out, evaluation would pass and override success=True, resulting
in contradictory output showing both PASS and "timed out".

Now we skip evaluation for timed-out challenges - they cannot pass.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 00:25:41 -06:00
Nicholas Tindle
f07dff1cdd fix(direct_benchmark): add pytest dependency for challenge evaluation
The TicTacToe and other challenges use pytest-based test files for
evaluation. Without pytest installed in the benchmark virtualenv,
these evaluations were silently failing.

Root cause: test.py imports pytest but the package wasn't a dependency,
causing ModuleNotFoundError during evaluation subprocess.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 00:21:12 -06:00
Nicholas Tindle
00e02a4696 feat(direct_benchmark): add run ID to completion blocks
Include config:challenge:attempt and timestamp in completion block
header for easier debugging and log correlation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 00:14:23 -06:00
Nicholas Tindle
634bff8277 refactor(forge): replace Selenium with Playwright for web browsing
- Remove selenium.py and test_selenium.py
- Add playwright_browser.py with WebPlaywrightComponent
- Update web component exports to use Playwright
- Update dependencies in pyproject.toml/poetry.lock
- Minor agent and reflexion strategy improvements
- Update CLAUDE.md documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 23:57:17 -06:00
Nicholas Tindle
d591f36c7b fix(direct_benchmark): track cost from LLM provider
Previously cost was hardcoded to 0.0. Now extracts cumulative cost
from MultiProvider.get_incurred_cost() after each step execution.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 23:37:12 -06:00
Nicholas Tindle
a347bed0b1 feat(direct_benchmark): add incremental resume and selective reset
Benchmarks now automatically save progress and resume from where they
left off. State is persisted to .benchmark_state.json in reports dir.

Features:
- Auto-resume: runs skip already-completed challenges
- --fresh: clear all state and start over
- --retry-failures: re-run only failed challenges
- --reset-strategy/model/challenge: selective resets
- `state show/clear/reset` subcommands for state management
- Config mismatch detection with auto-reset

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 23:32:27 -06:00
Nicholas Tindle
4eeb6ee2b0 feat(direct_benchmark): add CI mode for non-interactive environments
Add --ci flag that disables Rich Live display while preserving
completion blocks. Auto-detects CI environment via CI env var or
non-TTY stdout. Prints progress every 10 completions for visibility.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 23:21:10 -06:00
Nicholas Tindle
7db962b9f9 feat(direct_benchmark): dynamic column layout up to 10 wide
- Calculate max columns based on terminal width (up to 10)
- Reduced panel width from 35 to 30 chars to fit more
- Wider terminals can now show more parallel runs side-by-side

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 23:15:16 -06:00
Nicholas Tindle
9108b21541 fix(direct_benchmark): parallel execution and always show completion blocks
Fixes:
- Use run_key (config:challenge) instead of just config_name for tracking
  active runs - allows multiple challenges from same config to run in parallel
- Add asyncio.sleep(0) yields to let multiple tasks acquire semaphore
  and start before any proceed with work
- Always print completion blocks (not just failures) for visibility

This should properly show 8/8 active runs when running with --parallel 8.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 23:13:56 -06:00
Nicholas Tindle
ffe9325296 feat(direct_benchmark): multi-panel UI with copy-paste completion blocks
UI improvements:
- Multi-column layout: each active config gets its own panel showing
  challenge name and step history (last 6 steps with status)
- Copy-paste completion blocks: when a challenge finishes (especially
  failures), prints a detailed block with all steps for easy debugging
- Configurable logging: suppresses noisy LLM provider warnings unless
  --debug flag is set
- Pass debug flag through harness to UI

Example active runs panel:
┌─ one_shot/claude ─┬─ rewoo/claude ────┐
│ ReadFile          │ WriteFile         │
│   ✓ #1 read_file  │   ✓ #1 think      │
│   ✓ #2 write_file │   ✓ #2 plan       │
│   ● step 3: ...   │   ● step 3: ...   │
└───────────────────┴───────────────────┘

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 23:10:34 -06:00
Nicholas Tindle
0a616d9267 feat(direct_benchmark): add step-level logging with colored prefixes
- Add step callback to AgentRunner for real-time step logging
- BenchmarkUI now shows:
  - Active runs with current step info
  - Recent steps panel with colored config prefixes
  - Proper Live display refresh (implements __rich_console__)
- Each config gets a distinct color for easy identification
- Verbose mode prints step logs immediately with config prefix
- Fix Live display not updating (pass UI object, not rendered content)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 23:02:20 -06:00
Nicholas Tindle
ab95077e5b refactor(forge): remove VCR cassettes, use real API calls with skip for forks
- Remove vcrpy and pytest-recording dependencies
- Remove tests/vcr/ directory and vcr_cassettes submodule
- Remove .gitmodules (only had cassette submodule)
- Simplify CI workflow - no more cassette checkout/push/PAT_REVIEW
- Tests requiring API keys now skip if not set (fork PRs)
- Update CLAUDE.md files to remove cassette references
- Fix broken agbenchmark path in pyproject.toml

Security improvement: removes need for PAT with cross-repo write access.
Fork PRs will have API-dependent tests skipped (GitHub protects secrets).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 22:51:57 -06:00
Nicholas Tindle
e477150979 Merge branch 'dev' into make-old-work 2026-01-19 22:30:46 -06:00
Nicholas Tindle
804430e243 refactor(classic): migrate from agbenchmark to direct_benchmark harness
- Remove old benchmark/ folder with agbenchmark framework
- Move challenges to direct_benchmark/challenges/
- Move analysis tools (analyze_reports.py, analyze_failures.py) to direct_benchmark/
- Move challenges_already_beaten.json to direct_benchmark/
- Update CI workflow to use direct_benchmark
- Update CLAUDE.md files with new benchmarking instructions
- Add benchmarking section to original_autogpt/CLAUDE.md

The direct_benchmark harness directly instantiates agents without HTTP
server overhead, enabling parallel execution with asyncio semaphore.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 22:29:51 -06:00
Nicholas Tindle
acb320d32d feat(classic): add noninteractive mode env var and benchmark config logging
- Add NONINTERACTIVE_MODE env var support to AppConfig for disabling
  user interaction during automated runs
- Benchmark harness now sets NONINTERACTIVE_MODE=True when starting agents
- Add agent configuration logging at server startup (model, strategy, etc.)
- Harness logs env vars being passed to agent for verification
- Add --agent-output flag to show full agent server output for debugging

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 19:40:24 -06:00
Nicholas Tindle
32f68d5999 feat(classic): add failure analysis tool and improve benchmark output
Benchmark improvements:
- Add analyze_failures.py for pattern detection and failure analysis
- Add informative step output: tool name, args, result status, cost
- Add --all and --matrix flags for comprehensive model/strategy testing
- Add --analyze-only and --no-analyze flags for flexible analysis control
- Auto-run failure analysis after benchmarks with markdown export
- Fix directory creation bug in ReportManager (add parents=True)

Prompt strategy enhancements:
- Implement full plan_execute, reflexion, rewoo, tree_of_thoughts strategies
- Add PROMPT_STRATEGY env var support for strategy selection
- Add extended thinking support for Anthropic models
- Add reasoning effort support for OpenAI o-series models

LLM provider improvements:
- Add thinking_budget_tokens config for Anthropic extended thinking
- Add reasoning_effort config for OpenAI reasoning models
- Improve error feedback for LLM self-correction

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 18:58:41 -06:00
Nicholas Tindle
49f56b4e8d feat(classic): enhance strategy benchmark harness with model comparison and bug fixes
- Add model comparison support to test harness (claude, openai, gpt5, opus presets)
- Add --models, --smart-llm, --fast-llm, --list-models CLI args
- Add real-time logging with timestamps and progress indicators
- Fix success parsing bug: read results[0].success instead of non-existent metrics.success
- Fix agbenchmark TestResult validation: use exception typename when value is empty
- Fix WebArena challenge validation: use strings instead of integers in instantiation_dict
- Fix Agent type annotations: create AnyActionProposal union for all prompt strategies
- Add pytest integration tests for the strategy benchmark harness

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 18:07:14 -06:00
Nicholas Tindle
bead811e73 docs(classic): add workspace, settings, and permissions documentation
Document the layered configuration system including:
- Workspace structure (.autogpt/ directory layout)
- Settings location (environment variables, workspace YAML, agent YAML)
- Permission system (check order, pattern syntax, approval scopes)
- Default security behavior

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 12:17:10 -06:00
Nicholas Tindle
013f728ebf feat(forge): improve tool call error feedback for LLM self-correction
When tool calls fail validation, the error messages now include:
- What arguments were actually provided
- The expected parameter schema with types and required/optional indicators

This helps LLMs understand and fix their mistakes when retrying,
rather than just being told a parameter is missing.

Example improved error:
  Invalid function call for write_file: 'contents' is a required property
  You provided: {"filename": 'story.txt'}
  Expected parameters: {"filename": string (required), "contents": string (required)}

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 11:49:17 -06:00
Nicholas Tindle
cda9572acd feat(forge): add lightweight web fetch component
Add WebFetchComponent for fast HTTP-based page fetching without browser
overhead. Uses trafilatura for intelligent content extraction.

Commands:
- fetch_webpage: Extract main content as text/markdown/xml
  - Removes navigation, ads, boilerplate automatically
  - Extracts page metadata (title, description, author, date)
  - Extracts and lists page links
  - Much faster than Selenium-based read_webpage

- fetch_raw_html: Get raw HTML for structure inspection
  - Optional truncation for large pages

Features:
- Trafilatura-powered content extraction (best-in-class accuracy)
- Automatic link extraction with relative URL resolution
- Page metadata extraction (OG tags, meta tags)
- Configurable timeout, max content length, max links
- Proper error handling for timeouts and HTTP errors
- 19 comprehensive tests

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 01:04:22 -06:00
Nicholas Tindle
e0784f8f6b refactor(forge): simplify deeply nested error handling in Anthropic provider
- Extract _get_tool_error_message helper method
- Replace 20+ levels of nesting with simple for loop
- Improve readability of tool_result construction
- Update benchmark poetry.lock

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 00:15:33 -06:00
Nicholas Tindle
3040f39136 feat(forge): modernize web search with tiered provider system
Replace basic DuckDuckGo-only search with a modern tiered system:

1. Tavily (primary) - AI-optimized results with content extraction
   - AI-generated answer summaries
   - Relevance scoring
   - Full page content extraction via search_and_extract command

2. Serper (secondary) - Fast, cheap Google SERP results
   - $0.30-1.00 per 1K queries
   - Real Google results without scraping

3. DDGS multi-engine (fallback) - Free, no API key required
   - Automatic fallback chain: DuckDuckGo → Bing → Brave → Google → etc.
   - 8 search backends supported

Key changes:
- Upgrade duckduckgo-search to ddgs v9.10 (renamed successor package)
- Add Tavily and Serper API integrations
- Implement automatic provider selection and fallback chain
- Add search_and_extract command for research with content extraction
- Add TAVILY_API_KEY and SERPER_API_KEY to env templates
- Update benchmark httpx constraint for ddgs compatibility
- 23 comprehensive tests for all providers and fallback scenarios

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 00:06:42 -06:00
Nicholas Tindle
515504c604 fix(classic): resolve pyright type errors in original_autogpt
- Change Agent class to use ActionProposal instead of OneShotAgentActionProposal
  to support multiple prompt strategy types
- Widen display_thoughts parameter type from AssistantThoughts to ModelWithSummary
- Fix speak attribute access in agent_protocol_server with hasattr check
- Add type: ignore comments for intentional thoughts field overrides in strategies
- Remove unused OneShotAgentActionProposal import

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 23:53:23 -06:00
Nicholas Tindle
18edeaeaf4 fix(classic): fix linting and formatting errors across codebase
- Fix 32+ flake8 E501 (line too long) errors by shortening descriptions
- Remove unused import in todo.py
- Fix test_todo.py argument order (config= keyword)
- Add type annotations to fix pyright errors where straightforward
- Add noqa comments for flake8 false positives in __init__.py
- Remove unused nonlocal declarations in main.py
- Run black and isort to fix formatting
- Update CLAUDE.md with improved linting commands

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 23:37:28 -06:00
Nicholas Tindle
44182aff9c feat(classic): add strategy benchmark test harness for CI
- Add test_prompt_strategies.py harness to compare prompt strategies
- Add pytest wrapper (test_strategy_benchmark.py) for CI integration
- Fix serve command (remove invalid --port flag, use AP_SERVER_PORT env)
- Fix test category (interface -> general)
- Add aiohttp-retry dependency for agbenchmark
- Add pytest markers: slow, integration, requires_agent

Usage:
  poetry run python agbenchmark_config/test_prompt_strategies.py --quick
  poetry run pytest tests/integration/test_strategy_benchmark.py -v

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 23:36:19 -06:00
Nicholas Tindle
864c5a7846 fix(classic): approve+feedback now executes command then sends feedback
Previously, when a user selected "Once" or "Always" with feedback (via Tab),
the command was NOT executed because UserFeedbackProvided was raised before
checking the approval scope. This fix changes the architecture from
exception-based to return-value-based.

Changes:
- Add PermissionCheckResult class with allowed, scope, and feedback fields
- Change check_command() to return PermissionCheckResult instead of bool
- Update prompt_fn signature to return (ApprovalScope, feedback) tuple
- Add pending_user_feedback mechanism to EpisodicActionHistory
- Update execute() to handle feedback after successful command execution
- Feedback message explicitly states "Command executed successfully"
- Add on_auto_approve callback for displaying auto-approved commands
- Add comprehensive tests for approval/denial with feedback scenarios

Behavior:
- Once + feedback → Execute command, then send feedback to agent
- Always + feedback → Execute command, save permission, send feedback
- Deny + feedback → Don't execute, send feedback to agent

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 22:32:43 -06:00
Nicholas Tindle
699fffb1a8 feat(classic): add Rich interactive selector for command approval
Adds a custom Rich-based interactive selector for the command approval
workflow. Features include:
- Arrow key navigation for selecting approval options
- Tab to add context to any selection (e.g., "Once + also check file x")
- Dedicated inline feedback option with shadow placeholder text
- Quick select with number keys 1-5
- Works within existing asyncio event loop (no prompt_toolkit dependency)

Also adds UIProvider abstraction pattern for future UI implementations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 21:49:43 -06:00
Nicholas Tindle
f0641c2d26 fix(classic): auto-advance plan steps in Plan-Execute strategy
The strategy was stuck in a loop because it tracked plan steps but never
advanced them - the record_step_success() method existed but was never
called by the agent's execution loop.

Fix by using a _pending_step_advance flag to track when an action has
been proposed. On the next parse_response_content() call, advance the
previous step before processing the new response. This keeps step
tracking self-contained in the strategy without requiring agent changes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 21:14:16 -06:00
Nicholas Tindle
94b6f74c95 feat(classic): add multiple prompt strategies for agent reasoning
Implement four new prompt strategies based on research papers:

- ReWOO: Reasoning Without Observation (5x token efficiency)
- Plan-and-Execute: Separate planning from execution phases
- Reflexion: Verbal reinforcement learning with episodic memory
- Tree of Thoughts: Deliberate problem solving with tree search

Each strategy extends a new BaseMultiStepPromptStrategy base class
with shared utilities. Strategies are selectable via PROMPT_STRATEGY
environment variable or config.prompt_strategy setting.

Fix JSONSchema generation issue where Optional/Union types created
anyOf schemas without direct type field - resolved by storing
plan/phase state in strategy instances rather than ActionProposal.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 20:33:10 -06:00
Nicholas Tindle
46aabab3ea feat(classic): upgrade to Python 3.12+ with CI testing on 3.12, 3.13, 3.14
- Update Python version constraint from ^3.10 to ^3.12 in all pyproject.toml
- Update classifiers to reflect Python 3.12, 3.13, 3.14 support
- Update dependencies for Python 3.13+ compatibility:
  - chromadb: ^0.4.10 -> ^1.4.0
  - numpy: >=1.26.0,<2.0.0 -> >=2.0.0
  - watchdog: 4.0.0 -> ^6.0.0
  - spacy: ^3.0.0 -> ^3.8.0 (numpy 2.x compatibility)
  - en-core-web-sm model: 3.7.1 -> 3.8.0
  - httpx (benchmark): ^0.24.0 -> ^0.27.0
- Update tool configuration:
  - Black target-version: py310 -> py312
  - Pyright pythonVersion: 3.10 -> 3.12
- Update Dockerfiles to use Python 3.12
- Update CI workflows to test on Python 3.12, 3.13, and 3.14
- Regenerate all poetry.lock files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 20:25:11 -06:00
Nicholas Tindle
0a65df5102 fix(classic): always use native tool calling, fix N/A command loop
- Remove openai_functions config option - native tool calling is now always enabled
- Remove use_functions_api from BaseAgentConfiguration and prompt strategy
- Add use_prefill config to disable prefill for Anthropic (prefill + tools incompatible)
- Update anthropic dependency to ^0.45.0 for tools API support
- Simplify prompt strategy to always expect tool_calls from LLM response

This fixes the N/A command loop bug where models would output "N/A" as a
command name when function calling was disabled. With native tool calling
always enabled, models are forced to pick from valid tools only.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 19:54:40 -06:00
Nicholas Tindle
6fbd208fe3 chore: ignore .claude/settings.local.json in all directories
Update gitignore to use glob pattern for settings.local.json files
in any .claude directory. Also untrack the existing file.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 18:54:42 -06:00
Nicholas Tindle
8fc174ca87 refactor(classic): simplify log format by removing timestamps
Remove asctime from log formats since terminal output already has
timestamps from the logging infrastructure. Makes logs cleaner
and easier to read.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 18:52:47 -06:00
Nicholas Tindle
cacc89790f feat(classic): improve AutoGPT configuration and setup
Environment loading:
- Search for .env in multiple locations (cwd, ~/.autogpt, ~/.config/autogpt)
- Allows running autogpt from any directory
- Document search order in .env.template

Setup simplification:
- Remove interactive AI settings revision (was broken/unused)
- Simplify to just printing current settings
- Clean up unused imports

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 18:52:38 -06:00
Nicholas Tindle
b9113bee02 feat(classic): enhance existing components with new capabilities
CodeExecutorComponent:
- Add timeout and env_vars parameters to execution commands
- Add execute_shell_popen for streaming output
- Improve error handling with CodeTimeoutError

FileManagerComponent:
- Add file_info, file_search, file_copy, file_move commands
- Add directory_create, directory_list_tree commands
- Better path validation and error messages

GitOperationsComponent:
- Add git_log, git_show, git_branch commands
- Add git_stash, git_stash_pop, git_stash_list commands
- Add git_cherry_pick, git_revert, git_reset commands
- Add git_remote, git_fetch, git_pull, git_push commands

UserInteractionComponent:
- Add ask_multiple_choice for structured options
- Add notify_user for non-blocking notifications
- Add confirm_action for yes/no confirmations

WebSearchComponent:
- Minor error handling improvements

WebSeleniumComponent:
- Add get_page_content, execute_javascript commands
- Add take_element_screenshot command
- Add wait_for_element, scroll_page commands
- Improve element interaction reliability

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 18:52:27 -06:00
Nicholas Tindle
3f65da03e7 feat(classic): add new exception types for enhanced error handling
Add specialized exception classes for better error reporting:
- CodeTimeoutError: For code execution timeouts
- HTTPError: For HTTP request failures with status code/URL
- DataProcessingError: For JSON/CSV processing errors

Each exception includes helpful hints for users.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 18:52:10 -06:00
Nicholas Tindle
9e96d11b2d feat(classic): add utility components for agent capabilities
Add 6 new utility components to expand agent functionality:

- ArchiveHandlerComponent: ZIP/TAR archive operations (create, extract, list)
- ClipboardComponent: In-memory clipboard for copy/paste operations
- DataProcessorComponent: CSV/JSON data manipulation and analysis
- HTTPClientComponent: HTTP requests (GET, POST, PUT, DELETE)
- MathUtilsComponent: Mathematical calculations and statistics
- TextUtilsComponent: Text processing (regex, diff, encoding, hashing)

All components follow the forge component pattern with:
- CommandProvider for exposing commands
- DirectiveProvider for resources/best practices
- Comprehensive parameter validation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 18:50:52 -06:00
Nicholas Tindle
4c264b7ae9 feat(classic): add TodoComponent with LLM-powered decomposition
Add a task management component modeled after Claude Code's TodoWrite:
- TodoItem with recursive sub_items for hierarchical task structure
- todo_write: atomic list replacement with sub-items support
- todo_read: retrieve current todos with nested structure
- todo_clear: clear all todos
- todo_decompose: use smart LLM to break down tasks into sub-steps

Features:
- Hierarchical task tracking with independent status per sub-item
- MessageProvider shows todos in LLM context with proper indentation
- DirectiveProvider adds best practices for task management
- Graceful fallback when LLM provider not configured

Integrates with:
- original_autogpt Agent (full LLM decomposition support)
- ForgeAgent (basic task tracking, no decomposition)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 18:49:48 -06:00
Nicholas Tindle
0adbc0bd05 fix(classic): update CI for removed frontend and helper scripts
Remove references to deleted files (./run, cli.py, setup.py, frontend/)
from CI workflows. Replace ./run agent start with direct poetry commands
to start agent servers in background.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 17:41:11 -06:00
Nicholas Tindle
8f3291bc92 feat(classic): add workspace permissions system for agent commands
Add a layered permission system that controls agent command execution:

- Create autogpt.yaml in .autogpt/ folder with default allow/deny rules
- File operations in workspace allowed by default
- Sensitive files (.env, .key, .pem) blocked by default
- Dangerous shell commands (sudo, rm -rf) blocked by default
- Interactive prompts for unknown commands (y=agent, Y=workspace, n=deny)
- Agent-specific permissions stored in .autogpt/agents/{id}/permissions.yaml

Files added:
- forge/forge/config/workspace_settings.py - Pydantic models for settings
- forge/forge/permissions.py - CommandPermissionManager with pattern matching

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 17:39:33 -06:00
Nicholas Tindle
7a20de880d chore: add .autogpt/ to gitignore
The .autogpt/ directory is where AutoGPT stores agent data when running
from any directory. This should not be committed to version control.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 17:02:47 -06:00
Nicholas Tindle
ef8a6d2528 feat(classic): make AutoGPT installable and runnable from any directory
Add --workspace option to CLI that defaults to current working directory,
allowing users to run `autogpt` from any folder. Agent data is now stored
in `.autogpt/` subdirectory of the workspace instead of a hardcoded path.

Changes:
- Add -w/--workspace CLI option to run and serve commands
- Remove dependency on forge package location for PROJECT_ROOT
- Update config to use workspace instead of project_root
- Store agent data in .autogpt/ within workspace directory
- Update pyproject.toml files with proper PyPI metadata
- Fix outdated tests to match current implementation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 17:00:36 -06:00
Nicholas Tindle
fd66be2aaa chore(classic): remove unneeded files and add CLAUDE.md docs
- Remove deprecated Flutter frontend (replaced by autogpt_platform)
- Remove shell scripts (run, setup, autogpt.sh, etc.)
- Remove tutorials (outdated)
- Remove CLI-USAGE.md and FORGE-QUICKSTART.md
- Add CLAUDE.md files for Claude Code guidance

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 16:17:35 -06:00
Nicholas Tindle
ae2cc97dc4 feat(classic): add modern Anthropic models and fix deprecated API
- Add Claude 3.5 v2, Claude 4 Sonnet, Claude 4 Opus, and Claude 4.5 Opus models
- Add rolling aliases (CLAUDE_SONNET, CLAUDE_OPUS, CLAUDE_HAIKU)
- Fix deprecated beta.tools.messages.create API call to use standard messages.create
- Update anthropic SDK from ^0.25.1 to >=0.40,<1.0

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 16:15:16 -06:00
Nicholas Tindle
ea521eed26 wip: add supprot for new openai models (non working) 2025-12-26 10:02:17 -06:00
2312 changed files with 33985 additions and 820465 deletions

View File

@@ -6,11 +6,15 @@ on:
paths:
- '.github/workflows/classic-autogpt-ci.yml'
- 'classic/original_autogpt/**'
- 'classic/direct_benchmark/**'
- 'classic/forge/**'
pull_request:
branches: [ master, dev, release-* ]
paths:
- '.github/workflows/classic-autogpt-ci.yml'
- 'classic/original_autogpt/**'
- 'classic/direct_benchmark/**'
- 'classic/forge/**'
concurrency:
group: ${{ format('classic-autogpt-ci-{0}', github.head_ref && format('{0}-{1}', github.event_name, github.event.pull_request.number) || github.sha) }}
@@ -19,47 +23,22 @@ concurrency:
defaults:
run:
shell: bash
working-directory: classic/original_autogpt
working-directory: classic
jobs:
test:
permissions:
contents: read
timeout-minutes: 30
strategy:
fail-fast: false
matrix:
python-version: ["3.10"]
platform-os: [ubuntu, macos, macos-arm64, windows]
runs-on: ${{ matrix.platform-os != 'macos-arm64' && format('{0}-latest', matrix.platform-os) || 'macos-14' }}
runs-on: ubuntu-latest
steps:
# Quite slow on macOS (2~4 minutes to set up Docker)
# - name: Set up Docker (macOS)
# if: runner.os == 'macOS'
# uses: crazy-max/ghaction-setup-docker@v3
- name: Start MinIO service (Linux)
if: runner.os == 'Linux'
- name: Start MinIO service
working-directory: '.'
run: |
docker pull minio/minio:edge-cicd
docker run -d -p 9000:9000 minio/minio:edge-cicd
- name: Start MinIO service (macOS)
if: runner.os == 'macOS'
working-directory: ${{ runner.temp }}
run: |
brew install minio/stable/minio
mkdir data
minio server ./data &
# No MinIO on Windows:
# - Windows doesn't support running Linux Docker containers
# - It doesn't seem possible to start background processes on Windows. They are
# killed after the step returns.
# See: https://github.com/actions/runner/issues/598#issuecomment-2011890429
- name: Checkout repository
uses: actions/checkout@v4
with:
@@ -71,41 +50,23 @@ jobs:
git config --global user.name "Auto-GPT-Bot"
git config --global user.email "github-bot@agpt.co"
- name: Set up Python ${{ matrix.python-version }}
- name: Set up Python 3.12
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
python-version: "3.12"
- id: get_date
name: Get date
run: echo "date=$(date +'%Y-%m-%d')" >> $GITHUB_OUTPUT
- name: Set up Python dependency cache
# On Windows, unpacking cached dependencies takes longer than just installing them
if: runner.os != 'Windows'
uses: actions/cache@v4
with:
path: ${{ runner.os == 'macOS' && '~/Library/Caches/pypoetry' || '~/.cache/pypoetry' }}
key: poetry-${{ runner.os }}-${{ hashFiles('classic/original_autogpt/poetry.lock') }}
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('classic/poetry.lock') }}
- name: Install Poetry (Unix)
if: runner.os != 'Windows'
run: |
curl -sSL https://install.python-poetry.org | python3 -
if [ "${{ runner.os }}" = "macOS" ]; then
PATH="$HOME/.local/bin:$PATH"
echo "$HOME/.local/bin" >> $GITHUB_PATH
fi
- name: Install Poetry (Windows)
if: runner.os == 'Windows'
shell: pwsh
run: |
(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | python -
$env:PATH += ";$env:APPDATA\Python\Scripts"
echo "$env:APPDATA\Python\Scripts" >> $env:GITHUB_PATH
- name: Install Poetry
run: curl -sSL https://install.python-poetry.org | python3 -
- name: Install Python dependencies
run: poetry install
@@ -116,12 +77,12 @@ jobs:
--cov=autogpt --cov-branch --cov-report term-missing --cov-report xml \
--numprocesses=logical --durations=10 \
--junitxml=junit.xml -o junit_family=legacy \
tests/unit tests/integration
original_autogpt/tests/unit original_autogpt/tests/integration
env:
CI: true
PLAIN_OUTPUT: True
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
S3_ENDPOINT_URL: ${{ runner.os != 'Windows' && 'http://127.0.0.1:9000' || '' }}
S3_ENDPOINT_URL: http://127.0.0.1:9000
AWS_ACCESS_KEY_ID: minioadmin
AWS_SECRET_ACCESS_KEY: minioadmin
@@ -135,11 +96,11 @@ jobs:
uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}
flags: autogpt-agent,${{ runner.os }}
flags: autogpt-agent
- name: Upload logs to artifact
if: always()
uses: actions/upload-artifact@v4
with:
name: test-logs
path: classic/original_autogpt/logs/
path: classic/logs/

View File

@@ -11,9 +11,6 @@ on:
- 'classic/original_autogpt/**'
- 'classic/forge/**'
- 'classic/benchmark/**'
- 'classic/run'
- 'classic/cli.py'
- 'classic/setup.py'
- '!**/*.md'
pull_request:
branches: [ master, dev, release-* ]
@@ -22,9 +19,6 @@ on:
- 'classic/original_autogpt/**'
- 'classic/forge/**'
- 'classic/benchmark/**'
- 'classic/run'
- 'classic/cli.py'
- 'classic/setup.py'
- '!**/*.md'
defaults:
@@ -35,13 +29,9 @@ defaults:
jobs:
serve-agent-protocol:
runs-on: ubuntu-latest
strategy:
matrix:
agent-name: [ original_autogpt ]
fail-fast: false
timeout-minutes: 20
env:
min-python-version: '3.10'
min-python-version: '3.12'
steps:
- name: Checkout repository
uses: actions/checkout@v4
@@ -55,22 +45,22 @@ jobs:
python-version: ${{ env.min-python-version }}
- name: Install Poetry
working-directory: ./classic/${{ matrix.agent-name }}/
run: |
curl -sSL https://install.python-poetry.org | python -
- name: Run regression tests
- name: Install dependencies
run: poetry install
- name: Run smoke tests with direct-benchmark
run: |
./run agent start ${{ matrix.agent-name }}
cd ${{ matrix.agent-name }}
poetry run agbenchmark --mock --test=BasicRetrieval --test=Battleship --test=WebArenaTask_0
poetry run agbenchmark --test=WriteFile
poetry run direct-benchmark run \
--strategies one_shot \
--models claude \
--tests ReadFile,WriteFile \
--json
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
AGENT_NAME: ${{ matrix.agent-name }}
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
REQUESTS_CA_BUNDLE: /etc/ssl/certs/ca-certificates.crt
HELICONE_CACHE_ENABLED: false
HELICONE_PROPERTY_AGENT: ${{ matrix.agent-name }}
REPORTS_FOLDER: ${{ format('../../reports/{0}', matrix.agent-name) }}
TELEMETRY_ENVIRONMENT: autogpt-ci
TELEMETRY_OPT_IN: ${{ github.ref_name == 'master' }}
NONINTERACTIVE_MODE: "true"
CI: true

View File

@@ -1,17 +1,21 @@
name: Classic - AGBenchmark CI
name: Classic - Direct Benchmark CI
on:
push:
branches: [ master, dev, ci-test* ]
paths:
- 'classic/benchmark/**'
- '!classic/benchmark/reports/**'
- 'classic/direct_benchmark/**'
- 'classic/benchmark/agbenchmark/challenges/**'
- 'classic/original_autogpt/**'
- 'classic/forge/**'
- .github/workflows/classic-benchmark-ci.yml
pull_request:
branches: [ master, dev, release-* ]
paths:
- 'classic/benchmark/**'
- '!classic/benchmark/reports/**'
- 'classic/direct_benchmark/**'
- 'classic/benchmark/agbenchmark/challenges/**'
- 'classic/original_autogpt/**'
- 'classic/forge/**'
- .github/workflows/classic-benchmark-ci.yml
concurrency:
@@ -23,23 +27,16 @@ defaults:
shell: bash
env:
min-python-version: '3.10'
min-python-version: '3.12'
jobs:
test:
permissions:
contents: read
benchmark-tests:
runs-on: ubuntu-latest
timeout-minutes: 30
strategy:
fail-fast: false
matrix:
python-version: ["3.10"]
platform-os: [ubuntu, macos, macos-arm64, windows]
runs-on: ${{ matrix.platform-os != 'macos-arm64' && format('{0}-latest', matrix.platform-os) || 'macos-14' }}
defaults:
run:
shell: bash
working-directory: classic/benchmark
working-directory: classic
steps:
- name: Checkout repository
uses: actions/checkout@v4
@@ -47,71 +44,84 @@ jobs:
fetch-depth: 0
submodules: true
- name: Set up Python ${{ matrix.python-version }}
- name: Set up Python ${{ env.min-python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
python-version: ${{ env.min-python-version }}
- name: Set up Python dependency cache
# On Windows, unpacking cached dependencies takes longer than just installing them
if: runner.os != 'Windows'
uses: actions/cache@v4
with:
path: ${{ runner.os == 'macOS' && '~/Library/Caches/pypoetry' || '~/.cache/pypoetry' }}
key: poetry-${{ runner.os }}-${{ hashFiles('classic/benchmark/poetry.lock') }}
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('classic/poetry.lock') }}
- name: Install Poetry (Unix)
if: runner.os != 'Windows'
- name: Install Poetry
run: |
curl -sSL https://install.python-poetry.org | python3 -
if [ "${{ runner.os }}" = "macOS" ]; then
PATH="$HOME/.local/bin:$PATH"
echo "$HOME/.local/bin" >> $GITHUB_PATH
fi
- name: Install Poetry (Windows)
if: runner.os == 'Windows'
shell: pwsh
run: |
(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | python -
$env:PATH += ";$env:APPDATA\Python\Scripts"
echo "$env:APPDATA\Python\Scripts" >> $env:GITHUB_PATH
- name: Install Python dependencies
- name: Install dependencies
run: poetry install
- name: Run pytest with coverage
- name: Run basic benchmark tests
run: |
poetry run pytest -vv \
--cov=agbenchmark --cov-branch --cov-report term-missing --cov-report xml \
--durations=10 \
--junitxml=junit.xml -o junit_family=legacy \
tests
echo "Testing ReadFile challenge with one_shot strategy..."
poetry run direct-benchmark run \
--strategies one_shot \
--models claude \
--tests ReadFile \
--json
echo "Testing WriteFile challenge..."
poetry run direct-benchmark run \
--strategies one_shot \
--models claude \
--tests WriteFile \
--json
env:
CI: true
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
NONINTERACTIVE_MODE: "true"
- name: Upload test results to Codecov
if: ${{ !cancelled() }} # Run even if tests fail
uses: codecov/test-results-action@v1
with:
token: ${{ secrets.CODECOV_TOKEN }}
- name: Test category filtering
run: |
echo "Testing coding category..."
poetry run direct-benchmark run \
--strategies one_shot \
--models claude \
--categories coding \
--tests ReadFile,WriteFile \
--json
env:
CI: true
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
NONINTERACTIVE_MODE: "true"
- name: Upload coverage reports to Codecov
uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}
flags: agbenchmark,${{ runner.os }}
- name: Test multiple strategies
run: |
echo "Testing multiple strategies..."
poetry run direct-benchmark run \
--strategies one_shot,plan_execute \
--models claude \
--tests ReadFile \
--parallel 2 \
--json
env:
CI: true
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
NONINTERACTIVE_MODE: "true"
self-test-with-agent:
# Run regression tests on maintain challenges
regression-tests:
runs-on: ubuntu-latest
strategy:
matrix:
agent-name: [forge]
fail-fast: false
timeout-minutes: 20
timeout-minutes: 45
if: github.ref == 'refs/heads/master' || github.ref == 'refs/heads/dev'
defaults:
run:
shell: bash
working-directory: classic
steps:
- name: Checkout repository
uses: actions/checkout@v4
@@ -126,51 +136,22 @@ jobs:
- name: Install Poetry
run: |
curl -sSL https://install.python-poetry.org | python -
curl -sSL https://install.python-poetry.org | python3 -
- name: Install dependencies
run: poetry install
- name: Run regression tests
working-directory: classic
run: |
./run agent start ${{ matrix.agent-name }}
cd ${{ matrix.agent-name }}
set +e # Ignore non-zero exit codes and continue execution
echo "Running the following command: poetry run agbenchmark --maintain --mock"
poetry run agbenchmark --maintain --mock
EXIT_CODE=$?
set -e # Stop ignoring non-zero exit codes
# Check if the exit code was 5, and if so, exit with 0 instead
if [ $EXIT_CODE -eq 5 ]; then
echo "regression_tests.json is empty."
fi
echo "Running the following command: poetry run agbenchmark --mock"
poetry run agbenchmark --mock
echo "Running the following command: poetry run agbenchmark --mock --category=data"
poetry run agbenchmark --mock --category=data
echo "Running the following command: poetry run agbenchmark --mock --category=coding"
poetry run agbenchmark --mock --category=coding
# echo "Running the following command: poetry run agbenchmark --test=WriteFile"
# poetry run agbenchmark --test=WriteFile
cd ../benchmark
poetry install
echo "Adding the BUILD_SKILL_TREE environment variable. This will attempt to add new elements in the skill tree. If new elements are added, the CI fails because they should have been pushed"
export BUILD_SKILL_TREE=true
# poetry run agbenchmark --mock
# CHANGED=$(git diff --name-only | grep -E '(agbenchmark/challenges)|(../classic/frontend/assets)') || echo "No diffs"
# if [ ! -z "$CHANGED" ]; then
# echo "There are unstaged changes please run agbenchmark and commit those changes since they are needed."
# echo "$CHANGED"
# exit 1
# else
# echo "No unstaged changes."
# fi
echo "Running regression tests (previously beaten challenges)..."
poetry run direct-benchmark run \
--strategies one_shot \
--models claude \
--maintain \
--parallel 4 \
--json
env:
CI: true
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
TELEMETRY_ENVIRONMENT: autogpt-benchmark-ci
TELEMETRY_OPT_IN: ${{ github.ref_name == 'master' }}
NONINTERACTIVE_MODE: "true"

View File

@@ -6,13 +6,11 @@ on:
paths:
- '.github/workflows/classic-forge-ci.yml'
- 'classic/forge/**'
- '!classic/forge/tests/vcr_cassettes'
pull_request:
branches: [ master, dev, release-* ]
paths:
- '.github/workflows/classic-forge-ci.yml'
- 'classic/forge/**'
- '!classic/forge/tests/vcr_cassettes'
concurrency:
group: ${{ format('forge-ci-{0}', github.head_ref && format('{0}-{1}', github.event_name, github.event.pull_request.number) || github.sha) }}
@@ -21,115 +19,38 @@ concurrency:
defaults:
run:
shell: bash
working-directory: classic/forge
working-directory: classic
jobs:
test:
permissions:
contents: read
timeout-minutes: 30
strategy:
fail-fast: false
matrix:
python-version: ["3.10"]
platform-os: [ubuntu, macos, macos-arm64, windows]
runs-on: ${{ matrix.platform-os != 'macos-arm64' && format('{0}-latest', matrix.platform-os) || 'macos-14' }}
runs-on: ubuntu-latest
steps:
# Quite slow on macOS (2~4 minutes to set up Docker)
# - name: Set up Docker (macOS)
# if: runner.os == 'macOS'
# uses: crazy-max/ghaction-setup-docker@v3
- name: Start MinIO service (Linux)
if: runner.os == 'Linux'
- name: Start MinIO service
working-directory: '.'
run: |
docker pull minio/minio:edge-cicd
docker run -d -p 9000:9000 minio/minio:edge-cicd
- name: Start MinIO service (macOS)
if: runner.os == 'macOS'
working-directory: ${{ runner.temp }}
run: |
brew install minio/stable/minio
mkdir data
minio server ./data &
# No MinIO on Windows:
# - Windows doesn't support running Linux Docker containers
# - It doesn't seem possible to start background processes on Windows. They are
# killed after the step returns.
# See: https://github.com/actions/runner/issues/598#issuecomment-2011890429
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
submodules: true
- name: Checkout cassettes
if: ${{ startsWith(github.event_name, 'pull_request') }}
env:
PR_BASE: ${{ github.event.pull_request.base.ref }}
PR_BRANCH: ${{ github.event.pull_request.head.ref }}
PR_AUTHOR: ${{ github.event.pull_request.user.login }}
run: |
cassette_branch="${PR_AUTHOR}-${PR_BRANCH}"
cassette_base_branch="${PR_BASE}"
cd tests/vcr_cassettes
if ! git ls-remote --exit-code --heads origin $cassette_base_branch ; then
cassette_base_branch="master"
fi
if git ls-remote --exit-code --heads origin $cassette_branch ; then
git fetch origin $cassette_branch
git fetch origin $cassette_base_branch
git checkout $cassette_branch
# Pick non-conflicting cassette updates from the base branch
git merge --no-commit --strategy-option=ours origin/$cassette_base_branch
echo "Using cassettes from mirror branch '$cassette_branch'," \
"synced to upstream branch '$cassette_base_branch'."
else
git checkout -b $cassette_branch
echo "Branch '$cassette_branch' does not exist in cassette submodule." \
"Using cassettes from '$cassette_base_branch'."
fi
- name: Set up Python ${{ matrix.python-version }}
- name: Set up Python 3.12
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
python-version: "3.12"
- name: Set up Python dependency cache
# On Windows, unpacking cached dependencies takes longer than just installing them
if: runner.os != 'Windows'
uses: actions/cache@v4
with:
path: ${{ runner.os == 'macOS' && '~/Library/Caches/pypoetry' || '~/.cache/pypoetry' }}
key: poetry-${{ runner.os }}-${{ hashFiles('classic/forge/poetry.lock') }}
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('classic/poetry.lock') }}
- name: Install Poetry (Unix)
if: runner.os != 'Windows'
run: |
curl -sSL https://install.python-poetry.org | python3 -
if [ "${{ runner.os }}" = "macOS" ]; then
PATH="$HOME/.local/bin:$PATH"
echo "$HOME/.local/bin" >> $GITHUB_PATH
fi
- name: Install Poetry (Windows)
if: runner.os == 'Windows'
shell: pwsh
run: |
(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | python -
$env:PATH += ";$env:APPDATA\Python\Scripts"
echo "$env:APPDATA\Python\Scripts" >> $env:GITHUB_PATH
- name: Install Poetry
run: curl -sSL https://install.python-poetry.org | python3 -
- name: Install Python dependencies
run: poetry install
@@ -140,12 +61,15 @@ jobs:
--cov=forge --cov-branch --cov-report term-missing --cov-report xml \
--durations=10 \
--junitxml=junit.xml -o junit_family=legacy \
forge
forge/forge forge/tests
env:
CI: true
PLAIN_OUTPUT: True
# API keys - tests that need these will skip if not available
# Secrets are not available to fork PRs (GitHub security feature)
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
S3_ENDPOINT_URL: ${{ runner.os != 'Windows' && 'http://127.0.0.1:9000' || '' }}
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
S3_ENDPOINT_URL: http://127.0.0.1:9000
AWS_ACCESS_KEY_ID: minioadmin
AWS_SECRET_ACCESS_KEY: minioadmin
@@ -159,85 +83,11 @@ jobs:
uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}
flags: forge,${{ runner.os }}
- id: setup_git_auth
name: Set up git token authentication
# Cassettes may be pushed even when tests fail
if: success() || failure()
run: |
config_key="http.${{ github.server_url }}/.extraheader"
if [ "${{ runner.os }}" = 'macOS' ]; then
base64_pat=$(echo -n "pat:${{ secrets.PAT_REVIEW }}" | base64)
else
base64_pat=$(echo -n "pat:${{ secrets.PAT_REVIEW }}" | base64 -w0)
fi
git config "$config_key" \
"Authorization: Basic $base64_pat"
cd tests/vcr_cassettes
git config "$config_key" \
"Authorization: Basic $base64_pat"
echo "config_key=$config_key" >> $GITHUB_OUTPUT
- id: push_cassettes
name: Push updated cassettes
# For pull requests, push updated cassettes even when tests fail
if: github.event_name == 'push' || (! github.event.pull_request.head.repo.fork && (success() || failure()))
env:
PR_BRANCH: ${{ github.event.pull_request.head.ref }}
PR_AUTHOR: ${{ github.event.pull_request.user.login }}
run: |
if [ "${{ startsWith(github.event_name, 'pull_request') }}" = "true" ]; then
is_pull_request=true
cassette_branch="${PR_AUTHOR}-${PR_BRANCH}"
else
cassette_branch="${{ github.ref_name }}"
fi
cd tests/vcr_cassettes
# Commit & push changes to cassettes if any
if ! git diff --quiet; then
git add .
git commit -m "Auto-update cassettes"
git push origin HEAD:$cassette_branch
if [ ! $is_pull_request ]; then
cd ../..
git add tests/vcr_cassettes
git commit -m "Update cassette submodule"
git push origin HEAD:$cassette_branch
fi
echo "updated=true" >> $GITHUB_OUTPUT
else
echo "updated=false" >> $GITHUB_OUTPUT
echo "No cassette changes to commit"
fi
- name: Post Set up git token auth
if: steps.setup_git_auth.outcome == 'success'
run: |
git config --unset-all '${{ steps.setup_git_auth.outputs.config_key }}'
git submodule foreach git config --unset-all '${{ steps.setup_git_auth.outputs.config_key }}'
- name: Apply "behaviour change" label and comment on PR
if: ${{ startsWith(github.event_name, 'pull_request') }}
run: |
PR_NUMBER="${{ github.event.pull_request.number }}"
TOKEN="${{ secrets.PAT_REVIEW }}"
REPO="${{ github.repository }}"
if [[ "${{ steps.push_cassettes.outputs.updated }}" == "true" ]]; then
echo "Adding label and comment..."
echo $TOKEN | gh auth login --with-token
gh issue edit $PR_NUMBER --add-label "behaviour change"
gh issue comment $PR_NUMBER --body "You changed AutoGPT's behaviour on ${{ runner.os }}. The cassettes have been updated and will be merged to the submodule when this Pull Request gets merged."
fi
flags: forge
- name: Upload logs to artifact
if: always()
uses: actions/upload-artifact@v4
with:
name: test-logs
path: classic/forge/logs/
path: classic/logs/

View File

@@ -1,60 +0,0 @@
name: Classic - Frontend CI/CD
on:
push:
branches:
- master
- dev
- 'ci-test*' # This will match any branch that starts with "ci-test"
paths:
- 'classic/frontend/**'
- '.github/workflows/classic-frontend-ci.yml'
pull_request:
paths:
- 'classic/frontend/**'
- '.github/workflows/classic-frontend-ci.yml'
jobs:
build:
permissions:
contents: write
pull-requests: write
runs-on: ubuntu-latest
env:
BUILD_BRANCH: ${{ format('classic-frontend-build/{0}', github.ref_name) }}
steps:
- name: Checkout Repo
uses: actions/checkout@v4
- name: Setup Flutter
uses: subosito/flutter-action@v2
with:
flutter-version: '3.13.2'
- name: Build Flutter to Web
run: |
cd classic/frontend
flutter build web --base-href /app/
# - name: Commit and Push to ${{ env.BUILD_BRANCH }}
# if: github.event_name == 'push'
# run: |
# git config --local user.email "action@github.com"
# git config --local user.name "GitHub Action"
# git add classic/frontend/build/web
# git checkout -B ${{ env.BUILD_BRANCH }}
# git commit -m "Update frontend build to ${GITHUB_SHA:0:7}" -a
# git push -f origin ${{ env.BUILD_BRANCH }}
- name: Create PR ${{ env.BUILD_BRANCH }} -> ${{ github.ref_name }}
if: github.event_name == 'push'
uses: peter-evans/create-pull-request@v7
with:
add-paths: classic/frontend/build/web
base: ${{ github.ref_name }}
branch: ${{ env.BUILD_BRANCH }}
delete-branch: true
title: "Update frontend build in `${{ github.ref_name }}`"
body: "This PR updates the frontend build based on commit ${{ github.sha }}."
commit-message: "Update frontend build based on commit ${{ github.sha }}"

View File

@@ -7,7 +7,9 @@ on:
- '.github/workflows/classic-python-checks-ci.yml'
- 'classic/original_autogpt/**'
- 'classic/forge/**'
- 'classic/benchmark/**'
- 'classic/direct_benchmark/**'
- 'classic/pyproject.toml'
- 'classic/poetry.lock'
- '**.py'
- '!classic/forge/tests/vcr_cassettes'
pull_request:
@@ -16,7 +18,9 @@ on:
- '.github/workflows/classic-python-checks-ci.yml'
- 'classic/original_autogpt/**'
- 'classic/forge/**'
- 'classic/benchmark/**'
- 'classic/direct_benchmark/**'
- 'classic/pyproject.toml'
- 'classic/poetry.lock'
- '**.py'
- '!classic/forge/tests/vcr_cassettes'
@@ -27,44 +31,13 @@ concurrency:
defaults:
run:
shell: bash
working-directory: classic
jobs:
get-changed-parts:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4
- id: changes-in
name: Determine affected subprojects
uses: dorny/paths-filter@v3
with:
filters: |
original_autogpt:
- classic/original_autogpt/autogpt/**
- classic/original_autogpt/tests/**
- classic/original_autogpt/poetry.lock
forge:
- classic/forge/forge/**
- classic/forge/tests/**
- classic/forge/poetry.lock
benchmark:
- classic/benchmark/agbenchmark/**
- classic/benchmark/tests/**
- classic/benchmark/poetry.lock
outputs:
changed-parts: ${{ steps.changes-in.outputs.changes }}
lint:
needs: get-changed-parts
runs-on: ubuntu-latest
env:
min-python-version: "3.10"
strategy:
matrix:
sub-package: ${{ fromJson(needs.get-changed-parts.outputs.changed-parts) }}
fail-fast: false
min-python-version: "3.12"
steps:
- name: Checkout repository
@@ -81,42 +54,31 @@ jobs:
uses: actions/cache@v4
with:
path: ~/.cache/pypoetry
key: ${{ runner.os }}-poetry-${{ hashFiles(format('{0}/poetry.lock', matrix.sub-package)) }}
key: ${{ runner.os }}-poetry-${{ hashFiles('classic/poetry.lock') }}
- name: Install Poetry
run: curl -sSL https://install.python-poetry.org | python3 -
# Install dependencies
- name: Install Python dependencies
run: poetry -C classic/${{ matrix.sub-package }} install
run: poetry install
# Lint
- name: Lint (isort)
run: poetry run isort --check .
working-directory: classic/${{ matrix.sub-package }}
- name: Lint (Black)
if: success() || failure()
run: poetry run black --check .
working-directory: classic/${{ matrix.sub-package }}
- name: Lint (Flake8)
if: success() || failure()
run: poetry run flake8 .
working-directory: classic/${{ matrix.sub-package }}
types:
needs: get-changed-parts
runs-on: ubuntu-latest
env:
min-python-version: "3.10"
strategy:
matrix:
sub-package: ${{ fromJson(needs.get-changed-parts.outputs.changed-parts) }}
fail-fast: false
min-python-version: "3.12"
steps:
- name: Checkout repository
@@ -133,19 +95,16 @@ jobs:
uses: actions/cache@v4
with:
path: ~/.cache/pypoetry
key: ${{ runner.os }}-poetry-${{ hashFiles(format('{0}/poetry.lock', matrix.sub-package)) }}
key: ${{ runner.os }}-poetry-${{ hashFiles('classic/poetry.lock') }}
- name: Install Poetry
run: curl -sSL https://install.python-poetry.org | python3 -
# Install dependencies
- name: Install Python dependencies
run: poetry -C classic/${{ matrix.sub-package }} install
run: poetry install
# Typecheck
- name: Typecheck
if: success() || failure()
run: poetry run pyright
working-directory: classic/${{ matrix.sub-package }}

10
.gitignore vendored
View File

@@ -3,6 +3,7 @@
classic/original_autogpt/keys.py
classic/original_autogpt/*.json
auto_gpt_workspace/*
.autogpt/
*.mpeg
.env
# Root .env files
@@ -159,6 +160,10 @@ CURRENT_BULLETIN.md
# AgBenchmark
classic/benchmark/agbenchmark/reports/
classic/reports/
classic/direct_benchmark/reports/
classic/.benchmark_workspaces/
classic/direct_benchmark/.benchmark_workspaces/
# Nodejs
package-lock.json
@@ -177,5 +182,8 @@ autogpt_platform/backend/settings.py
*.ign.*
.test-contents
.claude/settings.local.json
**/.claude/settings.local.json
/autogpt_platform/backend/logs
# Test database
test.db

3
.gitmodules vendored
View File

@@ -1,3 +0,0 @@
[submodule "classic/forge/tests/vcr_cassettes"]
path = classic/forge/tests/vcr_cassettes
url = https://github.com/Significant-Gravitas/Auto-GPT-test-cassettes

View File

@@ -43,29 +43,10 @@ repos:
pass_filenames: false
- id: poetry-install
name: Check & Install dependencies - Classic - AutoGPT
alias: poetry-install-classic-autogpt
entry: poetry -C classic/original_autogpt install
# include forge source (since it's a path dependency)
files: ^classic/(original_autogpt|forge)/poetry\.lock$
types: [file]
language: system
pass_filenames: false
- id: poetry-install
name: Check & Install dependencies - Classic - Forge
alias: poetry-install-classic-forge
entry: poetry -C classic/forge install
files: ^classic/forge/poetry\.lock$
types: [file]
language: system
pass_filenames: false
- id: poetry-install
name: Check & Install dependencies - Classic - Benchmark
alias: poetry-install-classic-benchmark
entry: poetry -C classic/benchmark install
files: ^classic/benchmark/poetry\.lock$
name: Check & Install dependencies - Classic
alias: poetry-install-classic
entry: poetry -C classic install
files: ^classic/poetry\.lock$
types: [file]
language: system
pass_filenames: false
@@ -116,26 +97,10 @@ repos:
language: system
- id: isort
name: Lint (isort) - Classic - AutoGPT
alias: isort-classic-autogpt
entry: poetry -P classic/original_autogpt run isort -p autogpt
files: ^classic/original_autogpt/
types: [file, python]
language: system
- id: isort
name: Lint (isort) - Classic - Forge
alias: isort-classic-forge
entry: poetry -P classic/forge run isort -p forge
files: ^classic/forge/
types: [file, python]
language: system
- id: isort
name: Lint (isort) - Classic - Benchmark
alias: isort-classic-benchmark
entry: poetry -P classic/benchmark run isort -p agbenchmark
files: ^classic/benchmark/
name: Lint (isort) - Classic
alias: isort-classic
entry: bash -c 'cd classic && poetry run isort $(echo "$@" | sed "s|classic/||g")' --
files: ^classic/(original_autogpt|forge|direct_benchmark)/
types: [file, python]
language: system
@@ -149,26 +114,13 @@ repos:
- repo: https://github.com/PyCQA/flake8
rev: 7.0.0
# To have flake8 load the config of the individual subprojects, we have to call
# them separately.
# Use consolidated flake8 config at classic/.flake8
hooks:
- id: flake8
name: Lint (Flake8) - Classic - AutoGPT
alias: flake8-classic-autogpt
files: ^classic/original_autogpt/(autogpt|scripts|tests)/
args: [--config=classic/original_autogpt/.flake8]
- id: flake8
name: Lint (Flake8) - Classic - Forge
alias: flake8-classic-forge
files: ^classic/forge/(forge|tests)/
args: [--config=classic/forge/.flake8]
- id: flake8
name: Lint (Flake8) - Classic - Benchmark
alias: flake8-classic-benchmark
files: ^classic/benchmark/(agbenchmark|tests)/((?!reports).)*[/.]
args: [--config=classic/benchmark/.flake8]
name: Lint (Flake8) - Classic
alias: flake8-classic
files: ^classic/(original_autogpt|forge|direct_benchmark)/
args: [--config=classic/.flake8]
- repo: local
hooks:
@@ -204,29 +156,10 @@ repos:
pass_filenames: false
- id: pyright
name: Typecheck - Classic - AutoGPT
alias: pyright-classic-autogpt
entry: poetry -C classic/original_autogpt run pyright
# include forge source (since it's a path dependency) but exclude *_test.py files:
files: ^(classic/original_autogpt/((autogpt|scripts|tests)/|poetry\.lock$)|classic/forge/(forge/.*(?<!_test)\.py|poetry\.lock)$)
types: [file]
language: system
pass_filenames: false
- id: pyright
name: Typecheck - Classic - Forge
alias: pyright-classic-forge
entry: poetry -C classic/forge run pyright
files: ^classic/forge/(forge/|poetry\.lock$)
types: [file]
language: system
pass_filenames: false
- id: pyright
name: Typecheck - Classic - Benchmark
alias: pyright-classic-benchmark
entry: poetry -C classic/benchmark run pyright
files: ^classic/benchmark/(agbenchmark/|tests/|poetry\.lock$)
name: Typecheck - Classic
alias: pyright-classic
entry: poetry -C classic run pyright
files: ^classic/(original_autogpt|forge|direct_benchmark)/.*\.py$|^classic/poetry\.lock$
types: [file]
language: system
pass_filenames: false

View File

@@ -218,7 +218,6 @@ async def save_agent_to_library(
library_agents = await library_db.create_library_agent(
graph=created_graph,
user_id=user_id,
sensitive_action_safe_mode=True,
create_library_agents_for_sub_graphs=False,
)

View File

@@ -33,7 +33,7 @@ from .models import (
UserReadiness,
)
from .utils import (
build_missing_credentials_from_graph,
check_user_has_required_credentials,
extract_credentials_from_schema,
fetch_graph_from_store_slug,
get_or_create_library_agent,
@@ -237,13 +237,15 @@ class RunAgentTool(BaseTool):
# Return credentials needed response with input data info
# The UI handles credential setup automatically, so the message
# focuses on asking about input data
requirements_creds_dict = build_missing_credentials_from_graph(
graph, None
credentials = extract_credentials_from_schema(
graph.credentials_input_schema
)
missing_credentials_dict = build_missing_credentials_from_graph(
graph, graph_credentials
missing_creds_check = await check_user_has_required_credentials(
user_id, credentials
)
requirements_creds_list = list(requirements_creds_dict.values())
missing_credentials_dict = {
c.id: c.model_dump() for c in missing_creds_check
}
return SetupRequirementsResponse(
message=self._build_inputs_message(graph, MSG_WHAT_VALUES_TO_USE),
@@ -257,7 +259,7 @@ class RunAgentTool(BaseTool):
ready_to_run=False,
),
requirements={
"credentials": requirements_creds_list,
"credentials": [c.model_dump() for c in credentials],
"inputs": self._get_inputs_list(graph.input_schema),
"execution_modes": self._get_execution_modes(graph),
},

View File

@@ -22,7 +22,6 @@ from .models import (
ToolResponseBase,
UserReadiness,
)
from .utils import build_missing_credentials_from_field_info
logger = logging.getLogger(__name__)
@@ -190,11 +189,7 @@ class RunBlockTool(BaseTool):
if missing_credentials:
# Return setup requirements response with missing credentials
credentials_fields_info = block.input_schema.get_credentials_fields_info()
missing_creds_dict = build_missing_credentials_from_field_info(
credentials_fields_info, set(matched_credentials.keys())
)
missing_creds_list = list(missing_creds_dict.values())
missing_creds_dict = {c.id: c.model_dump() for c in missing_credentials}
return SetupRequirementsResponse(
message=(
@@ -211,7 +206,7 @@ class RunBlockTool(BaseTool):
ready_to_run=False,
),
requirements={
"credentials": missing_creds_list,
"credentials": [c.model_dump() for c in missing_credentials],
"inputs": self._get_inputs_list(block),
"execution_modes": ["immediate"],
},

View File

@@ -8,7 +8,7 @@ from backend.api.features.library import model as library_model
from backend.api.features.store import db as store_db
from backend.data import graph as graph_db
from backend.data.graph import GraphModel
from backend.data.model import CredentialsFieldInfo, CredentialsMetaInput
from backend.data.model import CredentialsMetaInput
from backend.integrations.creds_manager import IntegrationCredentialsManager
from backend.util.exceptions import NotFoundError
@@ -89,59 +89,6 @@ def extract_credentials_from_schema(
return credentials
def _serialize_missing_credential(
field_key: str, field_info: CredentialsFieldInfo
) -> dict[str, Any]:
"""
Convert credential field info into a serializable dict that preserves all supported
credential types (e.g., api_key + oauth2) so the UI can offer multiple options.
"""
supported_types = sorted(field_info.supported_types)
provider = next(iter(field_info.provider), "unknown")
scopes = sorted(field_info.required_scopes or [])
return {
"id": field_key,
"title": field_key.replace("_", " ").title(),
"provider": provider,
"provider_name": provider.replace("_", " ").title(),
"type": supported_types[0] if supported_types else "api_key",
"types": supported_types,
"scopes": scopes,
}
def build_missing_credentials_from_graph(
graph: GraphModel, matched_credentials: dict[str, CredentialsMetaInput] | None
) -> dict[str, Any]:
"""
Build a missing_credentials mapping from a graph's aggregated credentials inputs,
preserving all supported credential types for each field.
"""
matched_keys = set(matched_credentials.keys()) if matched_credentials else set()
aggregated_fields = graph.aggregate_credentials_inputs()
return {
field_key: _serialize_missing_credential(field_key, field_info)
for field_key, (field_info, _node_fields) in aggregated_fields.items()
if field_key not in matched_keys
}
def build_missing_credentials_from_field_info(
credential_fields: dict[str, CredentialsFieldInfo],
matched_keys: set[str],
) -> dict[str, Any]:
"""
Build missing_credentials mapping from a simple credentials field info dictionary.
"""
return {
field_key: _serialize_missing_credential(field_key, field_info)
for field_key, field_info in credential_fields.items()
if field_key not in matched_keys
}
def extract_credentials_as_dict(
credentials_input_schema: dict[str, Any] | None,
) -> dict[str, CredentialsMetaInput]:

View File

@@ -401,11 +401,27 @@ async def add_generated_agent_image(
)
def _initialize_graph_settings(graph: graph_db.GraphModel) -> GraphSettings:
"""
Initialize GraphSettings based on graph content.
Args:
graph: The graph to analyze
Returns:
GraphSettings with appropriate human_in_the_loop_safe_mode value
"""
if graph.has_human_in_the_loop:
# Graph has HITL blocks - set safe mode to True by default
return GraphSettings(human_in_the_loop_safe_mode=True)
else:
# Graph has no HITL blocks - keep None
return GraphSettings(human_in_the_loop_safe_mode=None)
async def create_library_agent(
graph: graph_db.GraphModel,
user_id: str,
hitl_safe_mode: bool = True,
sensitive_action_safe_mode: bool = False,
create_library_agents_for_sub_graphs: bool = True,
) -> list[library_model.LibraryAgent]:
"""
@@ -414,8 +430,6 @@ async def create_library_agent(
Args:
agent: The agent/Graph to add to the library.
user_id: The user to whom the agent will be added.
hitl_safe_mode: Whether HITL blocks require manual review (default True).
sensitive_action_safe_mode: Whether sensitive action blocks require review.
create_library_agents_for_sub_graphs: If True, creates LibraryAgent records for sub-graphs as well.
Returns:
@@ -451,11 +465,7 @@ async def create_library_agent(
}
},
settings=SafeJson(
GraphSettings.from_graph(
graph_entry,
hitl_safe_mode=hitl_safe_mode,
sensitive_action_safe_mode=sensitive_action_safe_mode,
).model_dump()
_initialize_graph_settings(graph_entry).model_dump()
),
),
include=library_agent_include(
@@ -617,6 +627,33 @@ async def update_library_agent(
raise DatabaseError("Failed to update library agent") from e
async def update_library_agent_settings(
user_id: str,
agent_id: str,
settings: GraphSettings,
) -> library_model.LibraryAgent:
"""
Updates the settings for a specific LibraryAgent.
Args:
user_id: The owner of the LibraryAgent.
agent_id: The ID of the LibraryAgent to update.
settings: New GraphSettings to apply.
Returns:
The updated LibraryAgent.
Raises:
NotFoundError: If the specified LibraryAgent does not exist.
DatabaseError: If there's an error in the update operation.
"""
return await update_library_agent(
library_agent_id=agent_id,
user_id=user_id,
settings=settings,
)
async def delete_library_agent(
library_agent_id: str, user_id: str, soft_delete: bool = True
) -> None:
@@ -801,7 +838,7 @@ async def add_store_agent_to_library(
"isCreatedByUser": False,
"useGraphIsActiveVersion": False,
"settings": SafeJson(
GraphSettings.from_graph(graph_model).model_dump()
_initialize_graph_settings(graph_model).model_dump()
),
},
include=library_agent_include(
@@ -1191,15 +1228,8 @@ async def fork_library_agent(
)
new_graph = await on_graph_activate(new_graph, user_id=user_id)
# Create a library agent for the new graph, preserving safe mode settings
return (
await create_library_agent(
new_graph,
user_id,
hitl_safe_mode=original_agent.settings.human_in_the_loop_safe_mode,
sensitive_action_safe_mode=original_agent.settings.sensitive_action_safe_mode,
)
)[0]
# Create a library agent for the new graph
return (await create_library_agent(new_graph, user_id))[0]
except prisma.errors.PrismaError as e:
logger.error(f"Database error cloning library agent: {e}")
raise DatabaseError("Failed to fork library agent") from e

View File

@@ -73,12 +73,6 @@ class LibraryAgent(pydantic.BaseModel):
has_external_trigger: bool = pydantic.Field(
description="Whether the agent has an external trigger (e.g. webhook) node"
)
has_human_in_the_loop: bool = pydantic.Field(
description="Whether the agent has human-in-the-loop blocks"
)
has_sensitive_action: bool = pydantic.Field(
description="Whether the agent has sensitive action blocks"
)
trigger_setup_info: Optional[GraphTriggerInfo] = None
# Indicates whether there's a new output (based on recent runs)
@@ -186,8 +180,6 @@ class LibraryAgent(pydantic.BaseModel):
graph.credentials_input_schema if sub_graphs is not None else None
),
has_external_trigger=graph.has_external_trigger,
has_human_in_the_loop=graph.has_human_in_the_loop,
has_sensitive_action=graph.has_sensitive_action,
trigger_setup_info=graph.trigger_setup_info,
new_output=new_output,
can_access_graph=can_access_graph,

View File

@@ -52,8 +52,6 @@ async def test_get_library_agents_success(
output_schema={"type": "object", "properties": {}},
credentials_input_schema={"type": "object", "properties": {}},
has_external_trigger=False,
has_human_in_the_loop=False,
has_sensitive_action=False,
status=library_model.LibraryAgentStatus.COMPLETED,
recommended_schedule_cron=None,
new_output=False,
@@ -77,8 +75,6 @@ async def test_get_library_agents_success(
output_schema={"type": "object", "properties": {}},
credentials_input_schema={"type": "object", "properties": {}},
has_external_trigger=False,
has_human_in_the_loop=False,
has_sensitive_action=False,
status=library_model.LibraryAgentStatus.COMPLETED,
recommended_schedule_cron=None,
new_output=False,
@@ -154,8 +150,6 @@ async def test_get_favorite_library_agents_success(
output_schema={"type": "object", "properties": {}},
credentials_input_schema={"type": "object", "properties": {}},
has_external_trigger=False,
has_human_in_the_loop=False,
has_sensitive_action=False,
status=library_model.LibraryAgentStatus.COMPLETED,
recommended_schedule_cron=None,
new_output=False,
@@ -224,8 +218,6 @@ def test_add_agent_to_library_success(
output_schema={"type": "object", "properties": {}},
credentials_input_schema={"type": "object", "properties": {}},
has_external_trigger=False,
has_human_in_the_loop=False,
has_sensitive_action=False,
status=library_model.LibraryAgentStatus.COMPLETED,
new_output=False,
can_access_graph=True,

View File

@@ -159,10 +159,10 @@ async def store_content_embedding(
INSERT INTO {schema_prefix}"UnifiedContentEmbedding" (
"id", "contentType", "contentId", "userId", "embedding", "searchableText", "metadata", "createdAt", "updatedAt"
)
VALUES (gen_random_uuid()::text, $1::{schema_prefix}"ContentType", $2, $3, $4::{schema}.vector, $5, $6::jsonb, NOW(), NOW())
VALUES (gen_random_uuid()::text, $1::{schema_prefix}"ContentType", $2, $3, $4::vector, $5, $6::jsonb, NOW(), NOW())
ON CONFLICT ("contentType", "contentId", "userId")
DO UPDATE SET
"embedding" = $4::{schema}.vector,
"embedding" = $4::vector,
"searchableText" = $5,
"metadata" = $6::jsonb,
"updatedAt" = NOW()
@@ -177,6 +177,7 @@ async def store_content_embedding(
searchable_text,
metadata_json,
client=client,
set_public_search_path=True,
)
logger.info(f"Stored embedding for {content_type}:{content_id}")
@@ -235,6 +236,7 @@ async def get_content_embedding(
content_type,
content_id,
user_id,
set_public_search_path=True,
)
if result and len(result) > 0:
@@ -869,44 +871,31 @@ async def semantic_search(
# Add content type parameters and build placeholders dynamically
content_type_start_idx = len(params) + 1
content_type_placeholders = ", ".join(
"$" + str(content_type_start_idx + i) + '::{schema_prefix}"ContentType"'
f'${content_type_start_idx + i}::{{{{schema_prefix}}}}"ContentType"'
for i in range(len(content_types))
)
params.extend([ct.value for ct in content_types])
# Build min_similarity param index before appending
min_similarity_idx = len(params) + 1
params.append(min_similarity)
sql = (
"""
sql = f"""
SELECT
"contentId" as content_id,
"contentType" as content_type,
"searchableText" as searchable_text,
metadata,
1 - (embedding <=> '"""
+ embedding_str
+ """'::{schema}.vector) as similarity
FROM {schema_prefix}"UnifiedContentEmbedding"
WHERE "contentType" IN ("""
+ content_type_placeholders
+ """)
"""
+ user_filter
+ """
AND 1 - (embedding <=> '"""
+ embedding_str
+ """'::{schema}.vector) >= $"""
+ str(min_similarity_idx)
+ """
1 - (embedding <=> '{embedding_str}'::vector) as similarity
FROM {{{{schema_prefix}}}}"UnifiedContentEmbedding"
WHERE "contentType" IN ({content_type_placeholders})
{user_filter}
AND 1 - (embedding <=> '{embedding_str}'::vector) >= ${len(params) + 1}
ORDER BY similarity DESC
LIMIT $1
"""
)
params.append(min_similarity)
try:
results = await query_raw_with_schema(sql, *params)
results = await query_raw_with_schema(
sql, *params, set_public_search_path=True
)
return [
{
"content_id": row["content_id"],
@@ -933,41 +922,31 @@ async def semantic_search(
# Add content type parameters and build placeholders dynamically
content_type_start_idx = len(params_lexical) + 1
content_type_placeholders_lexical = ", ".join(
"$" + str(content_type_start_idx + i) + '::{schema_prefix}"ContentType"'
f'${content_type_start_idx + i}::{{{{schema_prefix}}}}"ContentType"'
for i in range(len(content_types))
)
params_lexical.extend([ct.value for ct in content_types])
# Build query param index before appending
query_param_idx = len(params_lexical) + 1
params_lexical.append(f"%{query}%")
# Use regular string (not f-string) for template to preserve {schema_prefix} placeholders
sql_lexical = (
"""
sql_lexical = f"""
SELECT
"contentId" as content_id,
"contentType" as content_type,
"searchableText" as searchable_text,
metadata,
0.0 as similarity
FROM {schema_prefix}"UnifiedContentEmbedding"
WHERE "contentType" IN ("""
+ content_type_placeholders_lexical
+ """)
"""
+ user_filter
+ """
AND "searchableText" ILIKE $"""
+ str(query_param_idx)
+ """
FROM {{{{schema_prefix}}}}"UnifiedContentEmbedding"
WHERE "contentType" IN ({content_type_placeholders_lexical})
{user_filter}
AND "searchableText" ILIKE ${len(params_lexical) + 1}
ORDER BY "updatedAt" DESC
LIMIT $1
"""
)
params_lexical.append(f"%{query}%")
try:
results = await query_raw_with_schema(sql_lexical, *params_lexical)
results = await query_raw_with_schema(
sql_lexical, *params_lexical, set_public_search_path=True
)
return [
{
"content_id": row["content_id"],

View File

@@ -155,14 +155,18 @@ async def test_store_embedding_success(mocker):
)
assert result is True
# execute_raw is called once for INSERT (no separate SET search_path needed)
assert mock_client.execute_raw.call_count == 1
# execute_raw is called twice: once for SET search_path, once for INSERT
assert mock_client.execute_raw.call_count == 2
# Verify the INSERT query with the actual data
call_args = mock_client.execute_raw.call_args_list[0][0]
assert "test-version-id" in call_args
assert "[0.1,0.2,0.3]" in call_args
assert None in call_args # userId should be None for store agents
# First call: SET search_path
first_call_args = mock_client.execute_raw.call_args_list[0][0]
assert "SET search_path" in first_call_args[0]
# Second call: INSERT query with the actual data
second_call_args = mock_client.execute_raw.call_args_list[1][0]
assert "test-version-id" in second_call_args
assert "[0.1,0.2,0.3]" in second_call_args
assert None in second_call_args # userId should be None for store agents
@pytest.mark.asyncio(loop_scope="session")

View File

@@ -12,7 +12,7 @@ from dataclasses import dataclass
from typing import Any, Literal
from prisma.enums import ContentType
from rank_bm25 import BM25Okapi # type: ignore[import-untyped]
from rank_bm25 import BM25Okapi
from backend.api.features.store.embeddings import (
EMBEDDING_DIM,
@@ -295,7 +295,7 @@ async def unified_hybrid_search(
FROM {{schema_prefix}}"UnifiedContentEmbedding" uce
WHERE uce."contentType" = ANY({content_types_param}::{{schema_prefix}}"ContentType"[])
{user_filter}
ORDER BY uce.embedding <=> {embedding_param}::{{schema}}.vector
ORDER BY uce.embedding <=> {embedding_param}::vector
LIMIT 200
)
),
@@ -307,7 +307,7 @@ async def unified_hybrid_search(
uce.metadata,
uce."updatedAt" as updated_at,
-- Semantic score: cosine similarity (1 - distance)
COALESCE(1 - (uce.embedding <=> {embedding_param}::{{schema}}.vector), 0) as semantic_score,
COALESCE(1 - (uce.embedding <=> {embedding_param}::vector), 0) as semantic_score,
-- Lexical score: ts_rank_cd
COALESCE(ts_rank_cd(uce.search, plainto_tsquery('english', {query_param})), 0) as lexical_raw,
-- Category match from metadata
@@ -363,7 +363,9 @@ async def unified_hybrid_search(
LIMIT {limit_param} OFFSET {offset_param}
"""
results = await query_raw_with_schema(sql_query, *params)
results = await query_raw_with_schema(
sql_query, *params, set_public_search_path=True
)
total = results[0]["total_count"] if results else 0
# Apply BM25 reranking
@@ -583,7 +585,7 @@ async def hybrid_search(
WHERE uce."contentType" = 'STORE_AGENT'::{{schema_prefix}}"ContentType"
AND uce."userId" IS NULL
AND {where_clause}
ORDER BY uce.embedding <=> {embedding_param}::{{schema}}.vector
ORDER BY uce.embedding <=> {embedding_param}::vector
LIMIT 200
) uce
),
@@ -605,7 +607,7 @@ async def hybrid_search(
-- Searchable text for BM25 reranking
COALESCE(sa.agent_name, '') || ' ' || COALESCE(sa.sub_heading, '') || ' ' || COALESCE(sa.description, '') as searchable_text,
-- Semantic score
COALESCE(1 - (uce.embedding <=> {embedding_param}::{{schema}}.vector), 0) as semantic_score,
COALESCE(1 - (uce.embedding <=> {embedding_param}::vector), 0) as semantic_score,
-- Lexical score (raw, will normalize)
COALESCE(ts_rank_cd(uce.search, plainto_tsquery('english', {query_param})), 0) as lexical_raw,
-- Category match
@@ -686,7 +688,9 @@ async def hybrid_search(
LIMIT {limit_param} OFFSET {offset_param}
"""
results = await query_raw_with_schema(sql_query, *params)
results = await query_raw_with_schema(
sql_query, *params, set_public_search_path=True
)
total = results[0]["total_count"] if results else 0

View File

@@ -761,8 +761,10 @@ async def create_new_graph(
graph.reassign_ids(user_id=user_id, reassign_graph_id=True)
graph.validate_graph(for_run=False)
# The return value of the create graph & library function is intentionally not used here,
# as the graph already valid and no sub-graphs are returned back.
await graph_db.create_graph(graph, user_id=user_id)
await library_db.create_library_agent(graph, user_id)
await library_db.create_library_agent(graph, user_id=user_id)
activated_graph = await on_graph_activate(graph, user_id=user_id)
if create_graph.source == "builder":
@@ -886,19 +888,21 @@ async def set_graph_active_version(
async def _update_library_agent_version_and_settings(
user_id: str, agent_graph: graph_db.GraphModel
) -> library_model.LibraryAgent:
# Keep the library agent up to date with the new active version
library = await library_db.update_agent_version_in_library(
user_id, agent_graph.id, agent_graph.version
)
updated_settings = GraphSettings.from_graph(
graph=agent_graph,
hitl_safe_mode=library.settings.human_in_the_loop_safe_mode,
sensitive_action_safe_mode=library.settings.sensitive_action_safe_mode,
)
if updated_settings != library.settings:
library = await library_db.update_library_agent(
library_agent_id=library.id,
# If the graph has HITL node, initialize the setting if it's not already set.
if (
agent_graph.has_human_in_the_loop
and library.settings.human_in_the_loop_safe_mode is None
):
await library_db.update_library_agent_settings(
user_id=user_id,
settings=updated_settings,
agent_id=library.id,
settings=library.settings.model_copy(
update={"human_in_the_loop_safe_mode": True}
),
)
return library
@@ -915,18 +919,21 @@ async def update_graph_settings(
user_id: Annotated[str, Security(get_user_id)],
) -> GraphSettings:
"""Update graph settings for the user's library agent."""
# Get the library agent for this graph
library_agent = await library_db.get_library_agent_by_graph_id(
graph_id=graph_id, user_id=user_id
)
if not library_agent:
raise HTTPException(404, f"Graph #{graph_id} not found in user's library")
updated_agent = await library_db.update_library_agent(
library_agent_id=library_agent.id,
# Update the library agent settings
updated_agent = await library_db.update_library_agent_settings(
user_id=user_id,
agent_id=library_agent.id,
settings=settings,
)
# Return the updated settings
return GraphSettings.model_validate(updated_agent.settings)

View File

@@ -680,58 +680,3 @@ class ListIsEmptyBlock(Block):
async def run(self, input_data: Input, **kwargs) -> BlockOutput:
yield "is_empty", len(input_data.list) == 0
class ConcatenateListsBlock(Block):
class Input(BlockSchemaInput):
lists: List[List[Any]] = SchemaField(
description="A list of lists to concatenate together. All lists will be combined in order into a single list.",
placeholder="e.g., [[1, 2], [3, 4], [5, 6]]",
)
class Output(BlockSchemaOutput):
concatenated_list: List[Any] = SchemaField(
description="The concatenated list containing all elements from all input lists in order."
)
error: str = SchemaField(
description="Error message if concatenation failed due to invalid input types."
)
def __init__(self):
super().__init__(
id="3cf9298b-5817-4141-9d80-7c2cc5199c8e",
description="Concatenates multiple lists into a single list. All elements from all input lists are combined in order.",
categories={BlockCategory.BASIC},
input_schema=ConcatenateListsBlock.Input,
output_schema=ConcatenateListsBlock.Output,
test_input=[
{"lists": [[1, 2, 3], [4, 5, 6]]},
{"lists": [["a", "b"], ["c"], ["d", "e", "f"]]},
{"lists": [[1, 2], []]},
{"lists": []},
],
test_output=[
("concatenated_list", [1, 2, 3, 4, 5, 6]),
("concatenated_list", ["a", "b", "c", "d", "e", "f"]),
("concatenated_list", [1, 2]),
("concatenated_list", []),
],
)
async def run(self, input_data: Input, **kwargs) -> BlockOutput:
concatenated = []
for idx, lst in enumerate(input_data.lists):
if lst is None:
# Skip None values to avoid errors
continue
if not isinstance(lst, list):
# Type validation: each item must be a list
# Strings are iterable and would cause extend() to iterate character-by-character
# Non-iterable types would raise TypeError
yield "error", (
f"Invalid input at index {idx}: expected a list, got {type(lst).__name__}. "
f"All items in 'lists' must be lists (e.g., [[1, 2], [3, 4]])."
)
return
concatenated.extend(lst)
yield "concatenated_list", concatenated

View File

@@ -84,7 +84,7 @@ class HITLReviewHelper:
Exception: If review creation or status update fails
"""
# Skip review if safe mode is disabled - return auto-approved result
if not execution_context.human_in_the_loop_safe_mode:
if not execution_context.safe_mode:
logger.info(
f"Block {block_name} skipping review for node {node_exec_id} - safe mode disabled"
)

View File

@@ -104,7 +104,7 @@ class HumanInTheLoopBlock(Block):
execution_context: ExecutionContext,
**_kwargs,
) -> BlockOutput:
if not execution_context.human_in_the_loop_safe_mode:
if not execution_context.safe_mode:
logger.info(
f"HITL block skipping review for node {node_exec_id} - safe mode disabled"
)

View File

@@ -79,10 +79,6 @@ class ModelMetadata(NamedTuple):
provider: str
context_window: int
max_output_tokens: int | None
display_name: str
provider_name: str
creator_name: str
price_tier: Literal[1, 2, 3]
class LlmModelMeta(EnumMeta):
@@ -175,26 +171,6 @@ class LlmModel(str, Enum, metaclass=LlmModelMeta):
V0_1_5_LG = "v0-1.5-lg"
V0_1_0_MD = "v0-1.0-md"
@classmethod
def __get_pydantic_json_schema__(cls, schema, handler):
json_schema = handler(schema)
llm_model_metadata = {}
for model in cls:
model_name = model.value
metadata = model.metadata
llm_model_metadata[model_name] = {
"creator": metadata.creator_name,
"creator_name": metadata.creator_name,
"title": metadata.display_name,
"provider": metadata.provider,
"provider_name": metadata.provider_name,
"name": model_name,
"price_tier": metadata.price_tier,
}
json_schema["llm_model"] = True
json_schema["llm_model_metadata"] = llm_model_metadata
return json_schema
@property
def metadata(self) -> ModelMetadata:
return MODEL_METADATA[self]
@@ -214,291 +190,119 @@ class LlmModel(str, Enum, metaclass=LlmModelMeta):
MODEL_METADATA = {
# https://platform.openai.com/docs/models
LlmModel.O3: ModelMetadata("openai", 200000, 100000, "O3", "OpenAI", "OpenAI", 2),
LlmModel.O3_MINI: ModelMetadata(
"openai", 200000, 100000, "O3 Mini", "OpenAI", "OpenAI", 1
), # o3-mini-2025-01-31
LlmModel.O1: ModelMetadata(
"openai", 200000, 100000, "O1", "OpenAI", "OpenAI", 3
), # o1-2024-12-17
LlmModel.O1_MINI: ModelMetadata(
"openai", 128000, 65536, "O1 Mini", "OpenAI", "OpenAI", 2
), # o1-mini-2024-09-12
LlmModel.O3: ModelMetadata("openai", 200000, 100000),
LlmModel.O3_MINI: ModelMetadata("openai", 200000, 100000), # o3-mini-2025-01-31
LlmModel.O1: ModelMetadata("openai", 200000, 100000), # o1-2024-12-17
LlmModel.O1_MINI: ModelMetadata("openai", 128000, 65536), # o1-mini-2024-09-12
# GPT-5 models
LlmModel.GPT5_2: ModelMetadata(
"openai", 400000, 128000, "GPT-5.2", "OpenAI", "OpenAI", 3
),
LlmModel.GPT5_1: ModelMetadata(
"openai", 400000, 128000, "GPT-5.1", "OpenAI", "OpenAI", 2
),
LlmModel.GPT5: ModelMetadata(
"openai", 400000, 128000, "GPT-5", "OpenAI", "OpenAI", 1
),
LlmModel.GPT5_MINI: ModelMetadata(
"openai", 400000, 128000, "GPT-5 Mini", "OpenAI", "OpenAI", 1
),
LlmModel.GPT5_NANO: ModelMetadata(
"openai", 400000, 128000, "GPT-5 Nano", "OpenAI", "OpenAI", 1
),
LlmModel.GPT5_CHAT: ModelMetadata(
"openai", 400000, 16384, "GPT-5 Chat Latest", "OpenAI", "OpenAI", 2
),
LlmModel.GPT41: ModelMetadata(
"openai", 1047576, 32768, "GPT-4.1", "OpenAI", "OpenAI", 1
),
LlmModel.GPT41_MINI: ModelMetadata(
"openai", 1047576, 32768, "GPT-4.1 Mini", "OpenAI", "OpenAI", 1
),
LlmModel.GPT5_2: ModelMetadata("openai", 400000, 128000),
LlmModel.GPT5_1: ModelMetadata("openai", 400000, 128000),
LlmModel.GPT5: ModelMetadata("openai", 400000, 128000),
LlmModel.GPT5_MINI: ModelMetadata("openai", 400000, 128000),
LlmModel.GPT5_NANO: ModelMetadata("openai", 400000, 128000),
LlmModel.GPT5_CHAT: ModelMetadata("openai", 400000, 16384),
LlmModel.GPT41: ModelMetadata("openai", 1047576, 32768),
LlmModel.GPT41_MINI: ModelMetadata("openai", 1047576, 32768),
LlmModel.GPT4O_MINI: ModelMetadata(
"openai", 128000, 16384, "GPT-4o Mini", "OpenAI", "OpenAI", 1
"openai", 128000, 16384
), # gpt-4o-mini-2024-07-18
LlmModel.GPT4O: ModelMetadata(
"openai", 128000, 16384, "GPT-4o", "OpenAI", "OpenAI", 2
), # gpt-4o-2024-08-06
LlmModel.GPT4O: ModelMetadata("openai", 128000, 16384), # gpt-4o-2024-08-06
LlmModel.GPT4_TURBO: ModelMetadata(
"openai", 128000, 4096, "GPT-4 Turbo", "OpenAI", "OpenAI", 3
"openai", 128000, 4096
), # gpt-4-turbo-2024-04-09
LlmModel.GPT3_5_TURBO: ModelMetadata(
"openai", 16385, 4096, "GPT-3.5 Turbo", "OpenAI", "OpenAI", 1
), # gpt-3.5-turbo-0125
LlmModel.GPT3_5_TURBO: ModelMetadata("openai", 16385, 4096), # gpt-3.5-turbo-0125
# https://docs.anthropic.com/en/docs/about-claude/models
LlmModel.CLAUDE_4_1_OPUS: ModelMetadata(
"anthropic", 200000, 32000, "Claude Opus 4.1", "Anthropic", "Anthropic", 3
"anthropic", 200000, 32000
), # claude-opus-4-1-20250805
LlmModel.CLAUDE_4_OPUS: ModelMetadata(
"anthropic", 200000, 32000, "Claude Opus 4", "Anthropic", "Anthropic", 3
"anthropic", 200000, 32000
), # claude-4-opus-20250514
LlmModel.CLAUDE_4_SONNET: ModelMetadata(
"anthropic", 200000, 64000, "Claude Sonnet 4", "Anthropic", "Anthropic", 2
"anthropic", 200000, 64000
), # claude-4-sonnet-20250514
LlmModel.CLAUDE_4_5_OPUS: ModelMetadata(
"anthropic", 200000, 64000, "Claude Opus 4.5", "Anthropic", "Anthropic", 3
"anthropic", 200000, 64000
), # claude-opus-4-5-20251101
LlmModel.CLAUDE_4_5_SONNET: ModelMetadata(
"anthropic", 200000, 64000, "Claude Sonnet 4.5", "Anthropic", "Anthropic", 3
"anthropic", 200000, 64000
), # claude-sonnet-4-5-20250929
LlmModel.CLAUDE_4_5_HAIKU: ModelMetadata(
"anthropic", 200000, 64000, "Claude Haiku 4.5", "Anthropic", "Anthropic", 2
"anthropic", 200000, 64000
), # claude-haiku-4-5-20251001
LlmModel.CLAUDE_3_7_SONNET: ModelMetadata(
"anthropic", 200000, 64000, "Claude 3.7 Sonnet", "Anthropic", "Anthropic", 2
"anthropic", 200000, 64000
), # claude-3-7-sonnet-20250219
LlmModel.CLAUDE_3_HAIKU: ModelMetadata(
"anthropic", 200000, 4096, "Claude 3 Haiku", "Anthropic", "Anthropic", 1
"anthropic", 200000, 4096
), # claude-3-haiku-20240307
# https://docs.aimlapi.com/api-overview/model-database/text-models
LlmModel.AIML_API_QWEN2_5_72B: ModelMetadata(
"aiml_api", 32000, 8000, "Qwen 2.5 72B Instruct Turbo", "AI/ML", "Qwen", 1
),
LlmModel.AIML_API_LLAMA3_1_70B: ModelMetadata(
"aiml_api",
128000,
40000,
"Llama 3.1 Nemotron 70B Instruct",
"AI/ML",
"Nvidia",
1,
),
LlmModel.AIML_API_LLAMA3_3_70B: ModelMetadata(
"aiml_api", 128000, None, "Llama 3.3 70B Instruct Turbo", "AI/ML", "Meta", 1
),
LlmModel.AIML_API_META_LLAMA_3_1_70B: ModelMetadata(
"aiml_api", 131000, 2000, "Llama 3.1 70B Instruct Turbo", "AI/ML", "Meta", 1
),
LlmModel.AIML_API_LLAMA_3_2_3B: ModelMetadata(
"aiml_api", 128000, None, "Llama 3.2 3B Instruct Turbo", "AI/ML", "Meta", 1
),
LlmModel.AIML_API_QWEN2_5_72B: ModelMetadata("aiml_api", 32000, 8000),
LlmModel.AIML_API_LLAMA3_1_70B: ModelMetadata("aiml_api", 128000, 40000),
LlmModel.AIML_API_LLAMA3_3_70B: ModelMetadata("aiml_api", 128000, None),
LlmModel.AIML_API_META_LLAMA_3_1_70B: ModelMetadata("aiml_api", 131000, 2000),
LlmModel.AIML_API_LLAMA_3_2_3B: ModelMetadata("aiml_api", 128000, None),
# https://console.groq.com/docs/models
LlmModel.LLAMA3_3_70B: ModelMetadata(
"groq", 128000, 32768, "Llama 3.3 70B Versatile", "Groq", "Meta", 1
),
LlmModel.LLAMA3_1_8B: ModelMetadata(
"groq", 128000, 8192, "Llama 3.1 8B Instant", "Groq", "Meta", 1
),
LlmModel.LLAMA3_3_70B: ModelMetadata("groq", 128000, 32768),
LlmModel.LLAMA3_1_8B: ModelMetadata("groq", 128000, 8192),
# https://ollama.com/library
LlmModel.OLLAMA_LLAMA3_3: ModelMetadata(
"ollama", 8192, None, "Llama 3.3", "Ollama", "Meta", 1
),
LlmModel.OLLAMA_LLAMA3_2: ModelMetadata(
"ollama", 8192, None, "Llama 3.2", "Ollama", "Meta", 1
),
LlmModel.OLLAMA_LLAMA3_8B: ModelMetadata(
"ollama", 8192, None, "Llama 3", "Ollama", "Meta", 1
),
LlmModel.OLLAMA_LLAMA3_405B: ModelMetadata(
"ollama", 8192, None, "Llama 3.1 405B", "Ollama", "Meta", 1
),
LlmModel.OLLAMA_DOLPHIN: ModelMetadata(
"ollama", 32768, None, "Dolphin Mistral Latest", "Ollama", "Mistral AI", 1
),
LlmModel.OLLAMA_LLAMA3_3: ModelMetadata("ollama", 8192, None),
LlmModel.OLLAMA_LLAMA3_2: ModelMetadata("ollama", 8192, None),
LlmModel.OLLAMA_LLAMA3_8B: ModelMetadata("ollama", 8192, None),
LlmModel.OLLAMA_LLAMA3_405B: ModelMetadata("ollama", 8192, None),
LlmModel.OLLAMA_DOLPHIN: ModelMetadata("ollama", 32768, None),
# https://openrouter.ai/models
LlmModel.GEMINI_2_5_PRO: ModelMetadata(
"open_router",
1050000,
8192,
"Gemini 2.5 Pro Preview 03.25",
"OpenRouter",
"Google",
2,
),
LlmModel.GEMINI_3_PRO_PREVIEW: ModelMetadata(
"open_router", 1048576, 65535, "Gemini 3 Pro Preview", "OpenRouter", "Google", 2
),
LlmModel.GEMINI_2_5_FLASH: ModelMetadata(
"open_router", 1048576, 65535, "Gemini 2.5 Flash", "OpenRouter", "Google", 1
),
LlmModel.GEMINI_2_0_FLASH: ModelMetadata(
"open_router", 1048576, 8192, "Gemini 2.0 Flash 001", "OpenRouter", "Google", 1
),
LlmModel.GEMINI_2_5_PRO: ModelMetadata("open_router", 1050000, 8192),
LlmModel.GEMINI_3_PRO_PREVIEW: ModelMetadata("open_router", 1048576, 65535),
LlmModel.GEMINI_2_5_FLASH: ModelMetadata("open_router", 1048576, 65535),
LlmModel.GEMINI_2_0_FLASH: ModelMetadata("open_router", 1048576, 8192),
LlmModel.GEMINI_2_5_FLASH_LITE_PREVIEW: ModelMetadata(
"open_router",
1048576,
65535,
"Gemini 2.5 Flash Lite Preview 06.17",
"OpenRouter",
"Google",
1,
),
LlmModel.GEMINI_2_0_FLASH_LITE: ModelMetadata(
"open_router",
1048576,
8192,
"Gemini 2.0 Flash Lite 001",
"OpenRouter",
"Google",
1,
),
LlmModel.MISTRAL_NEMO: ModelMetadata(
"open_router", 128000, 4096, "Mistral Nemo", "OpenRouter", "Mistral AI", 1
),
LlmModel.COHERE_COMMAND_R_08_2024: ModelMetadata(
"open_router", 128000, 4096, "Command R 08.2024", "OpenRouter", "Cohere", 1
),
LlmModel.COHERE_COMMAND_R_PLUS_08_2024: ModelMetadata(
"open_router", 128000, 4096, "Command R Plus 08.2024", "OpenRouter", "Cohere", 2
),
LlmModel.DEEPSEEK_CHAT: ModelMetadata(
"open_router", 64000, 2048, "DeepSeek Chat", "OpenRouter", "DeepSeek", 1
),
LlmModel.DEEPSEEK_R1_0528: ModelMetadata(
"open_router", 163840, 163840, "DeepSeek R1 0528", "OpenRouter", "DeepSeek", 1
),
LlmModel.PERPLEXITY_SONAR: ModelMetadata(
"open_router", 127000, 8000, "Sonar", "OpenRouter", "Perplexity", 1
),
LlmModel.PERPLEXITY_SONAR_PRO: ModelMetadata(
"open_router", 200000, 8000, "Sonar Pro", "OpenRouter", "Perplexity", 2
"open_router", 1048576, 65535
),
LlmModel.GEMINI_2_0_FLASH_LITE: ModelMetadata("open_router", 1048576, 8192),
LlmModel.MISTRAL_NEMO: ModelMetadata("open_router", 128000, 4096),
LlmModel.COHERE_COMMAND_R_08_2024: ModelMetadata("open_router", 128000, 4096),
LlmModel.COHERE_COMMAND_R_PLUS_08_2024: ModelMetadata("open_router", 128000, 4096),
LlmModel.DEEPSEEK_CHAT: ModelMetadata("open_router", 64000, 2048),
LlmModel.DEEPSEEK_R1_0528: ModelMetadata("open_router", 163840, 163840),
LlmModel.PERPLEXITY_SONAR: ModelMetadata("open_router", 127000, 8000),
LlmModel.PERPLEXITY_SONAR_PRO: ModelMetadata("open_router", 200000, 8000),
LlmModel.PERPLEXITY_SONAR_DEEP_RESEARCH: ModelMetadata(
"open_router",
128000,
16000,
"Sonar Deep Research",
"OpenRouter",
"Perplexity",
3,
),
LlmModel.NOUSRESEARCH_HERMES_3_LLAMA_3_1_405B: ModelMetadata(
"open_router",
131000,
4096,
"Hermes 3 Llama 3.1 405B",
"OpenRouter",
"Nous Research",
1,
"open_router", 131000, 4096
),
LlmModel.NOUSRESEARCH_HERMES_3_LLAMA_3_1_70B: ModelMetadata(
"open_router",
12288,
12288,
"Hermes 3 Llama 3.1 70B",
"OpenRouter",
"Nous Research",
1,
),
LlmModel.OPENAI_GPT_OSS_120B: ModelMetadata(
"open_router", 131072, 131072, "GPT-OSS 120B", "OpenRouter", "OpenAI", 1
),
LlmModel.OPENAI_GPT_OSS_20B: ModelMetadata(
"open_router", 131072, 32768, "GPT-OSS 20B", "OpenRouter", "OpenAI", 1
),
LlmModel.AMAZON_NOVA_LITE_V1: ModelMetadata(
"open_router", 300000, 5120, "Nova Lite V1", "OpenRouter", "Amazon", 1
),
LlmModel.AMAZON_NOVA_MICRO_V1: ModelMetadata(
"open_router", 128000, 5120, "Nova Micro V1", "OpenRouter", "Amazon", 1
),
LlmModel.AMAZON_NOVA_PRO_V1: ModelMetadata(
"open_router", 300000, 5120, "Nova Pro V1", "OpenRouter", "Amazon", 1
),
LlmModel.MICROSOFT_WIZARDLM_2_8X22B: ModelMetadata(
"open_router", 65536, 4096, "WizardLM 2 8x22B", "OpenRouter", "Microsoft", 1
),
LlmModel.GRYPHE_MYTHOMAX_L2_13B: ModelMetadata(
"open_router", 4096, 4096, "MythoMax L2 13B", "OpenRouter", "Gryphe", 1
),
LlmModel.META_LLAMA_4_SCOUT: ModelMetadata(
"open_router", 131072, 131072, "Llama 4 Scout", "OpenRouter", "Meta", 1
),
LlmModel.META_LLAMA_4_MAVERICK: ModelMetadata(
"open_router", 1048576, 1000000, "Llama 4 Maverick", "OpenRouter", "Meta", 1
),
LlmModel.GROK_4: ModelMetadata(
"open_router", 256000, 256000, "Grok 4", "OpenRouter", "xAI", 3
),
LlmModel.GROK_4_FAST: ModelMetadata(
"open_router", 2000000, 30000, "Grok 4 Fast", "OpenRouter", "xAI", 1
),
LlmModel.GROK_4_1_FAST: ModelMetadata(
"open_router", 2000000, 30000, "Grok 4.1 Fast", "OpenRouter", "xAI", 1
),
LlmModel.GROK_CODE_FAST_1: ModelMetadata(
"open_router", 256000, 10000, "Grok Code Fast 1", "OpenRouter", "xAI", 1
),
LlmModel.KIMI_K2: ModelMetadata(
"open_router", 131000, 131000, "Kimi K2", "OpenRouter", "Moonshot AI", 1
),
LlmModel.QWEN3_235B_A22B_THINKING: ModelMetadata(
"open_router",
262144,
262144,
"Qwen 3 235B A22B Thinking 2507",
"OpenRouter",
"Qwen",
1,
),
LlmModel.QWEN3_CODER: ModelMetadata(
"open_router", 262144, 262144, "Qwen 3 Coder", "OpenRouter", "Qwen", 3
"open_router", 12288, 12288
),
LlmModel.OPENAI_GPT_OSS_120B: ModelMetadata("open_router", 131072, 131072),
LlmModel.OPENAI_GPT_OSS_20B: ModelMetadata("open_router", 131072, 32768),
LlmModel.AMAZON_NOVA_LITE_V1: ModelMetadata("open_router", 300000, 5120),
LlmModel.AMAZON_NOVA_MICRO_V1: ModelMetadata("open_router", 128000, 5120),
LlmModel.AMAZON_NOVA_PRO_V1: ModelMetadata("open_router", 300000, 5120),
LlmModel.MICROSOFT_WIZARDLM_2_8X22B: ModelMetadata("open_router", 65536, 4096),
LlmModel.GRYPHE_MYTHOMAX_L2_13B: ModelMetadata("open_router", 4096, 4096),
LlmModel.META_LLAMA_4_SCOUT: ModelMetadata("open_router", 131072, 131072),
LlmModel.META_LLAMA_4_MAVERICK: ModelMetadata("open_router", 1048576, 1000000),
LlmModel.GROK_4: ModelMetadata("open_router", 256000, 256000),
LlmModel.GROK_4_FAST: ModelMetadata("open_router", 2000000, 30000),
LlmModel.GROK_4_1_FAST: ModelMetadata("open_router", 2000000, 30000),
LlmModel.GROK_CODE_FAST_1: ModelMetadata("open_router", 256000, 10000),
LlmModel.KIMI_K2: ModelMetadata("open_router", 131000, 131000),
LlmModel.QWEN3_235B_A22B_THINKING: ModelMetadata("open_router", 262144, 262144),
LlmModel.QWEN3_CODER: ModelMetadata("open_router", 262144, 262144),
# Llama API models
LlmModel.LLAMA_API_LLAMA_4_SCOUT: ModelMetadata(
"llama_api",
128000,
4028,
"Llama 4 Scout 17B 16E Instruct FP8",
"Llama API",
"Meta",
1,
),
LlmModel.LLAMA_API_LLAMA4_MAVERICK: ModelMetadata(
"llama_api",
128000,
4028,
"Llama 4 Maverick 17B 128E Instruct FP8",
"Llama API",
"Meta",
1,
),
LlmModel.LLAMA_API_LLAMA3_3_8B: ModelMetadata(
"llama_api", 128000, 4028, "Llama 3.3 8B Instruct", "Llama API", "Meta", 1
),
LlmModel.LLAMA_API_LLAMA3_3_70B: ModelMetadata(
"llama_api", 128000, 4028, "Llama 3.3 70B Instruct", "Llama API", "Meta", 1
),
LlmModel.LLAMA_API_LLAMA_4_SCOUT: ModelMetadata("llama_api", 128000, 4028),
LlmModel.LLAMA_API_LLAMA4_MAVERICK: ModelMetadata("llama_api", 128000, 4028),
LlmModel.LLAMA_API_LLAMA3_3_8B: ModelMetadata("llama_api", 128000, 4028),
LlmModel.LLAMA_API_LLAMA3_3_70B: ModelMetadata("llama_api", 128000, 4028),
# v0 by Vercel models
LlmModel.V0_1_5_MD: ModelMetadata("v0", 128000, 64000, "v0 1.5 MD", "V0", "V0", 1),
LlmModel.V0_1_5_LG: ModelMetadata("v0", 512000, 64000, "v0 1.5 LG", "V0", "V0", 1),
LlmModel.V0_1_0_MD: ModelMetadata("v0", 128000, 64000, "v0 1.0 MD", "V0", "V0", 1),
LlmModel.V0_1_5_MD: ModelMetadata("v0", 128000, 64000),
LlmModel.V0_1_5_LG: ModelMetadata("v0", 512000, 64000),
LlmModel.V0_1_0_MD: ModelMetadata("v0", 128000, 64000),
}
DEFAULT_LLM_MODEL = LlmModel.GPT5_2

View File

@@ -242,7 +242,7 @@ async def test_smart_decision_maker_tracks_llm_stats():
outputs = {}
# Create execution context
mock_execution_context = ExecutionContext(human_in_the_loop_safe_mode=False)
mock_execution_context = ExecutionContext(safe_mode=False)
# Create a mock execution processor for tests
@@ -343,7 +343,7 @@ async def test_smart_decision_maker_parameter_validation():
# Create execution context
mock_execution_context = ExecutionContext(human_in_the_loop_safe_mode=False)
mock_execution_context = ExecutionContext(safe_mode=False)
# Create a mock execution processor for tests
@@ -409,7 +409,7 @@ async def test_smart_decision_maker_parameter_validation():
# Create execution context
mock_execution_context = ExecutionContext(human_in_the_loop_safe_mode=False)
mock_execution_context = ExecutionContext(safe_mode=False)
# Create a mock execution processor for tests
@@ -471,7 +471,7 @@ async def test_smart_decision_maker_parameter_validation():
outputs = {}
# Create execution context
mock_execution_context = ExecutionContext(human_in_the_loop_safe_mode=False)
mock_execution_context = ExecutionContext(safe_mode=False)
# Create a mock execution processor for tests
@@ -535,7 +535,7 @@ async def test_smart_decision_maker_parameter_validation():
outputs = {}
# Create execution context
mock_execution_context = ExecutionContext(human_in_the_loop_safe_mode=False)
mock_execution_context = ExecutionContext(safe_mode=False)
# Create a mock execution processor for tests
@@ -658,7 +658,7 @@ async def test_smart_decision_maker_raw_response_conversion():
outputs = {}
# Create execution context
mock_execution_context = ExecutionContext(human_in_the_loop_safe_mode=False)
mock_execution_context = ExecutionContext(safe_mode=False)
# Create a mock execution processor for tests
@@ -730,7 +730,7 @@ async def test_smart_decision_maker_raw_response_conversion():
outputs = {}
# Create execution context
mock_execution_context = ExecutionContext(human_in_the_loop_safe_mode=False)
mock_execution_context = ExecutionContext(safe_mode=False)
# Create a mock execution processor for tests
@@ -786,7 +786,7 @@ async def test_smart_decision_maker_raw_response_conversion():
outputs = {}
# Create execution context
mock_execution_context = ExecutionContext(human_in_the_loop_safe_mode=False)
mock_execution_context = ExecutionContext(safe_mode=False)
# Create a mock execution processor for tests
@@ -905,7 +905,7 @@ async def test_smart_decision_maker_agent_mode():
# Create a mock execution context
mock_execution_context = ExecutionContext(
human_in_the_loop_safe_mode=False,
safe_mode=False,
)
# Create a mock execution processor for agent mode tests
@@ -1027,7 +1027,7 @@ async def test_smart_decision_maker_traditional_mode_default():
# Create execution context
mock_execution_context = ExecutionContext(human_in_the_loop_safe_mode=False)
mock_execution_context = ExecutionContext(safe_mode=False)
# Create a mock execution processor for tests

View File

@@ -386,7 +386,7 @@ async def test_output_yielding_with_dynamic_fields():
outputs = {}
from backend.data.execution import ExecutionContext
mock_execution_context = ExecutionContext(human_in_the_loop_safe_mode=False)
mock_execution_context = ExecutionContext(safe_mode=False)
mock_execution_processor = MagicMock()
async for output_name, output_value in block.run(
@@ -609,9 +609,7 @@ async def test_validation_errors_dont_pollute_conversation():
outputs = {}
from backend.data.execution import ExecutionContext
mock_execution_context = ExecutionContext(
human_in_the_loop_safe_mode=False
)
mock_execution_context = ExecutionContext(safe_mode=False)
# Create a proper mock execution processor for agent mode
from collections import defaultdict

View File

@@ -474,7 +474,7 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
self.block_type = block_type
self.webhook_config = webhook_config
self.execution_stats: NodeExecutionStats = NodeExecutionStats()
self.is_sensitive_action: bool = False
self.requires_human_review: bool = False
if self.webhook_config:
if isinstance(self.webhook_config, BlockWebhookConfig):
@@ -637,9 +637,8 @@ class Block(ABC, Generic[BlockSchemaInputType, BlockSchemaOutputType]):
- should_pause: True if execution should be paused for review
- input_data_to_use: The input data to use (may be modified by reviewer)
"""
if not (
self.is_sensitive_action and execution_context.sensitive_action_safe_mode
):
# Skip review if not required or safe mode is disabled
if not self.requires_human_review or not execution_context.safe_mode:
return False, input_data
from backend.blocks.helpers.review import HITLReviewHelper

View File

@@ -99,15 +99,10 @@ MODEL_COST: dict[LlmModel, int] = {
LlmModel.OPENAI_GPT_OSS_20B: 1,
LlmModel.GEMINI_2_5_PRO: 4,
LlmModel.GEMINI_3_PRO_PREVIEW: 5,
LlmModel.GEMINI_2_5_FLASH: 1,
LlmModel.GEMINI_2_0_FLASH: 1,
LlmModel.GEMINI_2_5_FLASH_LITE_PREVIEW: 1,
LlmModel.GEMINI_2_0_FLASH_LITE: 1,
LlmModel.MISTRAL_NEMO: 1,
LlmModel.COHERE_COMMAND_R_08_2024: 1,
LlmModel.COHERE_COMMAND_R_PLUS_08_2024: 3,
LlmModel.DEEPSEEK_CHAT: 2,
LlmModel.DEEPSEEK_R1_0528: 1,
LlmModel.PERPLEXITY_SONAR: 1,
LlmModel.PERPLEXITY_SONAR_PRO: 5,
LlmModel.PERPLEXITY_SONAR_DEEP_RESEARCH: 10,
@@ -131,6 +126,11 @@ MODEL_COST: dict[LlmModel, int] = {
LlmModel.KIMI_K2: 1,
LlmModel.QWEN3_235B_A22B_THINKING: 1,
LlmModel.QWEN3_CODER: 9,
LlmModel.GEMINI_2_5_FLASH: 1,
LlmModel.GEMINI_2_0_FLASH: 1,
LlmModel.GEMINI_2_5_FLASH_LITE_PREVIEW: 1,
LlmModel.GEMINI_2_0_FLASH_LITE: 1,
LlmModel.DEEPSEEK_R1_0528: 1,
# v0 by Vercel models
LlmModel.V0_1_5_MD: 1,
LlmModel.V0_1_5_LG: 2,

View File

@@ -38,6 +38,20 @@ POOL_TIMEOUT = os.getenv("DB_POOL_TIMEOUT")
if POOL_TIMEOUT:
DATABASE_URL = add_param(DATABASE_URL, "pool_timeout", POOL_TIMEOUT)
# Add public schema to search_path for pgvector type access
# The vector extension is in public schema, but search_path is determined by schema parameter
# Extract the schema from DATABASE_URL or default to 'public' (matching get_database_schema())
parsed_url = urlparse(DATABASE_URL)
url_params = dict(parse_qsl(parsed_url.query))
db_schema = url_params.get("schema", "public")
# Build search_path, avoiding duplicates if db_schema is already 'public'
search_path_schemas = list(
dict.fromkeys([db_schema, "public"])
) # Preserves order, removes duplicates
search_path = ",".join(search_path_schemas)
# This allows using ::vector without schema qualification
DATABASE_URL = add_param(DATABASE_URL, "options", f"-c search_path={search_path}")
HTTP_TIMEOUT = int(POOL_TIMEOUT) if POOL_TIMEOUT else None
prisma = Prisma(
@@ -113,40 +127,38 @@ async def _raw_with_schema(
*args,
execute: bool = False,
client: Prisma | None = None,
set_public_search_path: bool = False,
) -> list[dict] | int:
"""Internal: Execute raw SQL with proper schema handling.
Use query_raw_with_schema() or execute_raw_with_schema() instead.
Supports placeholders:
- {schema_prefix}: Table/type prefix (e.g., "platform".)
- {schema}: Raw schema name (e.g., platform) for pgvector types
Args:
query_template: SQL query with {schema_prefix} and/or {schema} placeholders
query_template: SQL query with {schema_prefix} placeholder
*args: Query parameters
execute: If False, executes SELECT query. If True, executes INSERT/UPDATE/DELETE.
client: Optional Prisma client for transactions (only used when execute=True).
set_public_search_path: If True, sets search_path to include public schema.
Needed for pgvector types and other public schema objects.
Returns:
- list[dict] if execute=False (query results)
- int if execute=True (number of affected rows)
Example:
await execute_raw_with_schema(
'INSERT INTO {schema_prefix}"Embedding" (vec) VALUES ($1::{schema}.vector)',
embedding_data
)
"""
schema = get_database_schema()
schema_prefix = f'"{schema}".' if schema != "public" else ""
formatted_query = query_template.format(schema_prefix=schema_prefix, schema=schema)
formatted_query = query_template.format(schema_prefix=schema_prefix)
import prisma as prisma_module
db_client = client if client else prisma_module.get_client()
# Set search_path to include public schema if requested
# Prisma doesn't support the 'options' connection parameter, so we set it per-session
# This is idempotent and safe to call multiple times
if set_public_search_path:
await db_client.execute_raw(f"SET search_path = {schema}, public") # type: ignore
if execute:
result = await db_client.execute_raw(formatted_query, *args) # type: ignore
else:
@@ -155,12 +167,16 @@ async def _raw_with_schema(
return result
async def query_raw_with_schema(query_template: str, *args) -> list[dict]:
async def query_raw_with_schema(
query_template: str, *args, set_public_search_path: bool = False
) -> list[dict]:
"""Execute raw SQL SELECT query with proper schema handling.
Args:
query_template: SQL query with {schema_prefix} and/or {schema} placeholders
query_template: SQL query with {schema_prefix} placeholder
*args: Query parameters
set_public_search_path: If True, sets search_path to include public schema.
Needed for pgvector types and other public schema objects.
Returns:
List of result rows as dictionaries
@@ -171,20 +187,23 @@ async def query_raw_with_schema(query_template: str, *args) -> list[dict]:
user_id
)
"""
return await _raw_with_schema(query_template, *args, execute=False) # type: ignore
return await _raw_with_schema(query_template, *args, execute=False, set_public_search_path=set_public_search_path) # type: ignore
async def execute_raw_with_schema(
query_template: str,
*args,
client: Prisma | None = None,
set_public_search_path: bool = False,
) -> int:
"""Execute raw SQL command (INSERT/UPDATE/DELETE) with proper schema handling.
Args:
query_template: SQL query with {schema_prefix} and/or {schema} placeholders
query_template: SQL query with {schema_prefix} placeholder
*args: Query parameters
client: Optional Prisma client for transactions
set_public_search_path: If True, sets search_path to include public schema.
Needed for pgvector types and other public schema objects.
Returns:
Number of affected rows
@@ -196,7 +215,7 @@ async def execute_raw_with_schema(
client=tx # Optional transaction client
)
"""
return await _raw_with_schema(query_template, *args, execute=True, client=client) # type: ignore
return await _raw_with_schema(query_template, *args, execute=True, client=client, set_public_search_path=set_public_search_path) # type: ignore
class BaseDbModel(BaseModel):

View File

@@ -81,8 +81,7 @@ class ExecutionContext(BaseModel):
This includes information needed by blocks, sub-graphs, and execution management.
"""
human_in_the_loop_safe_mode: bool = True
sensitive_action_safe_mode: bool = False
safe_mode: bool = True
user_timezone: str = "UTC"
root_execution_id: Optional[str] = None
parent_execution_id: Optional[str] = None

View File

@@ -3,7 +3,7 @@ import logging
import uuid
from collections import defaultdict
from datetime import datetime, timezone
from typing import TYPE_CHECKING, Annotated, Any, Literal, Optional, cast
from typing import TYPE_CHECKING, Any, Literal, Optional, cast
from prisma.enums import SubmissionStatus
from prisma.models import (
@@ -20,7 +20,7 @@ from prisma.types import (
AgentNodeLinkCreateInput,
StoreListingVersionWhereInput,
)
from pydantic import BaseModel, BeforeValidator, Field, create_model
from pydantic import BaseModel, Field, create_model
from pydantic.fields import computed_field
from backend.blocks.agent import AgentExecutorBlock
@@ -62,29 +62,7 @@ logger = logging.getLogger(__name__)
class GraphSettings(BaseModel):
# Use Annotated with BeforeValidator to coerce None to default values.
# This handles cases where the database has null values for these fields.
human_in_the_loop_safe_mode: Annotated[
bool, BeforeValidator(lambda v: v if v is not None else True)
] = True
sensitive_action_safe_mode: Annotated[
bool, BeforeValidator(lambda v: v if v is not None else False)
] = False
@classmethod
def from_graph(
cls,
graph: "GraphModel",
hitl_safe_mode: bool | None = None,
sensitive_action_safe_mode: bool = False,
) -> "GraphSettings":
# Default to True if not explicitly set
if hitl_safe_mode is None:
hitl_safe_mode = True
return cls(
human_in_the_loop_safe_mode=hitl_safe_mode,
sensitive_action_safe_mode=sensitive_action_safe_mode,
)
human_in_the_loop_safe_mode: bool | None = None
class Link(BaseDbModel):
@@ -266,14 +244,10 @@ class BaseGraph(BaseDbModel):
return any(
node.block_id
for node in self.nodes
if node.block.block_type == BlockType.HUMAN_IN_THE_LOOP
)
@computed_field
@property
def has_sensitive_action(self) -> bool:
return any(
node.block_id for node in self.nodes if node.block.is_sensitive_action
if (
node.block.block_type == BlockType.HUMAN_IN_THE_LOOP
or node.block.requires_human_review
)
)
@property

View File

@@ -309,7 +309,7 @@ def ensure_embeddings_coverage():
# Process in batches until no more missing embeddings
while True:
result = db_client.backfill_missing_embeddings(batch_size=100)
result = db_client.backfill_missing_embeddings(batch_size=10)
total_processed += result["processed"]
total_success += result["success"]

View File

@@ -873,8 +873,11 @@ async def add_graph_execution(
settings = await gdb.get_graph_settings(user_id=user_id, graph_id=graph_id)
execution_context = ExecutionContext(
human_in_the_loop_safe_mode=settings.human_in_the_loop_safe_mode,
sensitive_action_safe_mode=settings.sensitive_action_safe_mode,
safe_mode=(
settings.human_in_the_loop_safe_mode
if settings.human_in_the_loop_safe_mode is not None
else True
),
user_timezone=(
user.timezone if user.timezone != USER_TIMEZONE_NOT_SET else "UTC"
),

View File

@@ -386,7 +386,6 @@ async def test_add_graph_execution_is_repeatable(mocker: MockerFixture):
mock_user.timezone = "UTC"
mock_settings = mocker.MagicMock()
mock_settings.human_in_the_loop_safe_mode = True
mock_settings.sensitive_action_safe_mode = False
mock_udb.get_user_by_id = mocker.AsyncMock(return_value=mock_user)
mock_gdb.get_graph_settings = mocker.AsyncMock(return_value=mock_settings)
@@ -652,7 +651,6 @@ async def test_add_graph_execution_with_nodes_to_skip(mocker: MockerFixture):
mock_user.timezone = "UTC"
mock_settings = mocker.MagicMock()
mock_settings.human_in_the_loop_safe_mode = True
mock_settings.sensitive_action_safe_mode = False
mock_udb.get_user_by_id = mocker.AsyncMock(return_value=mock_user)
mock_gdb.get_graph_settings = mocker.AsyncMock(return_value=mock_settings)

View File

@@ -11,7 +11,6 @@
"forked_from_version": null,
"has_external_trigger": false,
"has_human_in_the_loop": false,
"has_sensitive_action": false,
"id": "graph-123",
"input_schema": {
"properties": {},

View File

@@ -11,7 +11,6 @@
"forked_from_version": null,
"has_external_trigger": false,
"has_human_in_the_loop": false,
"has_sensitive_action": false,
"id": "graph-123",
"input_schema": {
"properties": {},

View File

@@ -27,8 +27,6 @@
"properties": {}
},
"has_external_trigger": false,
"has_human_in_the_loop": false,
"has_sensitive_action": false,
"trigger_setup_info": null,
"new_output": false,
"can_access_graph": true,
@@ -36,8 +34,7 @@
"is_favorite": false,
"recommended_schedule_cron": null,
"settings": {
"human_in_the_loop_safe_mode": true,
"sensitive_action_safe_mode": false
"human_in_the_loop_safe_mode": null
},
"marketplace_listing": null
},
@@ -68,8 +65,6 @@
"properties": {}
},
"has_external_trigger": false,
"has_human_in_the_loop": false,
"has_sensitive_action": false,
"trigger_setup_info": null,
"new_output": false,
"can_access_graph": false,
@@ -77,8 +72,7 @@
"is_favorite": false,
"recommended_schedule_cron": null,
"settings": {
"human_in_the_loop_safe_mode": true,
"sensitive_action_safe_mode": false
"human_in_the_loop_safe_mode": null
},
"marketplace_listing": null
}

Binary file not shown.

Before

Width:  |  Height:  |  Size: 5.9 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 19 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 26 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 25 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 72 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 21 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 374 B

Binary file not shown.

Before

Width:  |  Height:  |  Size: 663 B

Binary file not shown.

Before

Width:  |  Height:  |  Size: 40 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 4.1 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 2.5 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 52 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 1.8 KiB

View File

@@ -5,11 +5,10 @@ import {
TooltipContent,
TooltipTrigger,
} from "@/components/atoms/Tooltip/BaseTooltip";
import { CircleNotchIcon, PlayIcon, StopIcon } from "@phosphor-icons/react";
import { PlayIcon, StopIcon } from "@phosphor-icons/react";
import { useShallow } from "zustand/react/shallow";
import { RunInputDialog } from "../RunInputDialog/RunInputDialog";
import { useRunGraph } from "./useRunGraph";
import { cn } from "@/lib/utils";
export const RunGraph = ({ flowID }: { flowID: string | null }) => {
const {
@@ -25,31 +24,6 @@ export const RunGraph = ({ flowID }: { flowID: string | null }) => {
useShallow((state) => state.isGraphRunning),
);
const isLoading = isExecutingGraph || isTerminatingGraph || isSaving;
// Determine which icon to show with proper animation
const renderIcon = () => {
const iconClass = cn(
"size-4 transition-transform duration-200 ease-out",
!isLoading && "group-hover:scale-110",
);
if (isLoading) {
return (
<CircleNotchIcon
className={cn(iconClass, "animate-spin")}
weight="bold"
/>
);
}
if (isGraphRunning) {
return <StopIcon className={iconClass} weight="fill" />;
}
return <PlayIcon className={iconClass} weight="fill" />;
};
return (
<>
<Tooltip>
@@ -59,18 +33,18 @@ export const RunGraph = ({ flowID }: { flowID: string | null }) => {
variant={isGraphRunning ? "destructive" : "primary"}
data-id={isGraphRunning ? "stop-graph-button" : "run-graph-button"}
onClick={isGraphRunning ? handleStopGraph : handleRunGraph}
disabled={!flowID || isLoading}
className="group"
disabled={!flowID || isExecutingGraph || isTerminatingGraph}
loading={isExecutingGraph || isTerminatingGraph || isSaving}
>
{renderIcon()}
{!isGraphRunning ? (
<PlayIcon className="size-4" />
) : (
<StopIcon className="size-4" />
)}
</Button>
</TooltipTrigger>
<TooltipContent>
{isLoading
? "Processing..."
: isGraphRunning
? "Stop agent"
: "Run agent"}
{isGraphRunning ? "Stop agent" : "Run agent"}
</TooltipContent>
</Tooltip>
<RunInputDialog

View File

@@ -10,7 +10,6 @@ import { useRunInputDialog } from "./useRunInputDialog";
import { CronSchedulerDialog } from "../CronSchedulerDialog/CronSchedulerDialog";
import { useTutorialStore } from "@/app/(platform)/build/stores/tutorialStore";
import { useEffect } from "react";
import { CredentialsGroupedView } from "@/components/contextual/CredentialsInput/components/CredentialsGroupedView/CredentialsGroupedView";
export const RunInputDialog = ({
isOpen,
@@ -24,17 +23,19 @@ export const RunInputDialog = ({
const hasInputs = useGraphStore((state) => state.hasInputs);
const hasCredentials = useGraphStore((state) => state.hasCredentials);
const inputSchema = useGraphStore((state) => state.inputSchema);
const credentialsSchema = useGraphStore(
(state) => state.credentialsInputSchema,
);
const {
credentialFields,
requiredCredentials,
credentialsUiSchema,
handleManualRun,
handleInputChange,
openCronSchedulerDialog,
setOpenCronSchedulerDialog,
inputValues,
credentialValues,
handleCredentialFieldChange,
handleCredentialChange,
isExecutingGraph,
} = useRunInputDialog({ setIsOpen });
@@ -61,67 +62,67 @@ export const RunInputDialog = ({
isOpen,
set: setIsOpen,
}}
styling={{ maxWidth: "700px", minWidth: "700px" }}
styling={{ maxWidth: "600px", minWidth: "600px" }}
>
<Dialog.Content>
<div
className="grid grid-cols-[1fr_auto] gap-10 p-1"
data-id="run-input-dialog-content"
>
<div className="space-y-6">
{/* Credentials Section */}
{hasCredentials() && credentialFields.length > 0 && (
<div data-id="run-input-credentials-section">
<div className="mb-4">
<Text variant="h4" className="text-gray-900">
Credentials
</Text>
</div>
<div className="px-2" data-id="run-input-credentials-form">
<CredentialsGroupedView
credentialFields={credentialFields}
requiredCredentials={requiredCredentials}
inputCredentials={credentialValues}
inputValues={inputValues}
onCredentialChange={handleCredentialFieldChange}
/>
</div>
<div className="space-y-6 p-1" data-id="run-input-dialog-content">
{/* Credentials Section */}
{hasCredentials() && (
<div data-id="run-input-credentials-section">
<div className="mb-4">
<Text variant="h4" className="text-gray-900">
Credentials
</Text>
</div>
)}
{/* Inputs Section */}
{hasInputs() && (
<div data-id="run-input-inputs-section">
<div className="mb-4">
<Text variant="h4" className="text-gray-900">
Inputs
</Text>
</div>
<div data-id="run-input-inputs-form">
<FormRenderer
jsonSchema={inputSchema as RJSFSchema}
handleChange={(v) => handleInputChange(v.formData)}
uiSchema={uiSchema}
initialValues={{}}
formContext={{
showHandles: false,
size: "large",
}}
/>
</div>
<div className="px-2" data-id="run-input-credentials-form">
<FormRenderer
jsonSchema={credentialsSchema as RJSFSchema}
handleChange={(v) => handleCredentialChange(v.formData)}
uiSchema={credentialsUiSchema}
initialValues={{}}
formContext={{
showHandles: false,
size: "large",
showOptionalToggle: false,
}}
/>
</div>
)}
</div>
</div>
)}
{/* Inputs Section */}
{hasInputs() && (
<div data-id="run-input-inputs-section">
<div className="mb-4">
<Text variant="h4" className="text-gray-900">
Inputs
</Text>
</div>
<div data-id="run-input-inputs-form">
<FormRenderer
jsonSchema={inputSchema as RJSFSchema}
handleChange={(v) => handleInputChange(v.formData)}
uiSchema={uiSchema}
initialValues={{}}
formContext={{
showHandles: false,
size: "large",
}}
/>
</div>
</div>
)}
{/* Action Button */}
<div
className="flex flex-col items-end justify-start"
className="flex justify-end pt-2"
data-id="run-input-actions-section"
>
{purpose === "run" && (
<Button
variant="primary"
size="large"
className="group h-fit min-w-0 gap-2 px-10"
className="group h-fit min-w-0 gap-2"
onClick={handleManualRun}
loading={isExecutingGraph}
data-id="run-input-manual-run-button"
@@ -136,7 +137,7 @@ export const RunInputDialog = ({
<Button
variant="primary"
size="large"
className="group h-fit min-w-0 gap-2 px-10"
className="group h-fit min-w-0 gap-2"
onClick={() => setOpenCronSchedulerDialog(true)}
data-id="run-input-schedule-button"
>

View File

@@ -7,11 +7,12 @@ import {
GraphExecutionMeta,
} from "@/lib/autogpt-server-api";
import { parseAsInteger, parseAsString, useQueryStates } from "nuqs";
import { useCallback, useMemo, useState } from "react";
import { useMemo, useState } from "react";
import { uiSchema } from "../../../FlowEditor/nodes/uiSchema";
import { isCredentialFieldSchema } from "@/components/renderers/InputRenderer/custom/CredentialField/helpers";
import { useNodeStore } from "@/app/(platform)/build/stores/nodeStore";
import { useToast } from "@/components/molecules/Toast/use-toast";
import { useReactFlow } from "@xyflow/react";
import type { CredentialField } from "@/components/contextual/CredentialsInput/components/CredentialsGroupedView/helpers";
export const useRunInputDialog = ({
setIsOpen,
@@ -119,32 +120,27 @@ export const useRunInputDialog = ({
},
});
// Convert credentials schema to credential fields array for CredentialsGroupedView
const credentialFields: CredentialField[] = useMemo(() => {
if (!credentialsSchema?.properties) return [];
return Object.entries(credentialsSchema.properties);
}, [credentialsSchema]);
// We are rendering the credentials field differently compared to other fields.
// In the node, we have the field name as "credential" - so our library catches it and renders it differently.
// But here we have a different name, something like `Firecrawl credentials`, so here we are telling the library that this field is a credential field type.
// Get required credentials as a Set
const requiredCredentials = useMemo(() => {
return new Set<string>(credentialsSchema?.required || []);
}, [credentialsSchema]);
const credentialsUiSchema = useMemo(() => {
const dynamicUiSchema: any = { ...uiSchema };
// Handler for individual credential changes
const handleCredentialFieldChange = useCallback(
(key: string, value?: CredentialsMetaInput) => {
setCredentialValues((prev) => {
if (value) {
return { ...prev, [key]: value };
} else {
const next = { ...prev };
delete next[key];
return next;
if (credentialsSchema?.properties) {
Object.keys(credentialsSchema.properties).forEach((fieldName) => {
const fieldSchema = credentialsSchema.properties[fieldName];
if (isCredentialFieldSchema(fieldSchema)) {
dynamicUiSchema[fieldName] = {
...dynamicUiSchema[fieldName],
"ui:field": "custom/credential_field",
};
}
});
},
[],
);
}
return dynamicUiSchema;
}, [credentialsSchema]);
const handleManualRun = async () => {
// Filter out incomplete credentials (those without a valid id)
@@ -177,14 +173,12 @@ export const useRunInputDialog = ({
};
return {
credentialFields,
requiredCredentials,
credentialsUiSchema,
inputValues,
credentialValues,
isExecutingGraph,
handleInputChange,
handleCredentialChange,
handleCredentialFieldChange,
handleManualRun,
openCronSchedulerDialog,
setOpenCronSchedulerDialog,

View File

@@ -18,118 +18,69 @@ interface Props {
fullWidth?: boolean;
}
interface SafeModeButtonProps {
isEnabled: boolean;
label: string;
tooltipEnabled: string;
tooltipDisabled: string;
onToggle: () => void;
isPending: boolean;
fullWidth?: boolean;
}
function SafeModeButton({
isEnabled,
label,
tooltipEnabled,
tooltipDisabled,
onToggle,
isPending,
fullWidth = false,
}: SafeModeButtonProps) {
return (
<Tooltip delayDuration={100}>
<TooltipTrigger asChild>
<Button
variant={isEnabled ? "primary" : "outline"}
size="small"
onClick={onToggle}
disabled={isPending}
className={cn("justify-start", fullWidth ? "w-full" : "")}
>
{isEnabled ? (
<>
<ShieldCheckIcon weight="bold" size={16} />
<Text variant="body" className="text-zinc-200">
{label}: ON
</Text>
</>
) : (
<>
<ShieldIcon weight="bold" size={16} />
<Text variant="body" className="text-zinc-600">
{label}: OFF
</Text>
</>
)}
</Button>
</TooltipTrigger>
<TooltipContent>
<div className="text-center">
<div className="font-medium">
{label}: {isEnabled ? "ON" : "OFF"}
</div>
<div className="mt-1 text-xs text-muted-foreground">
{isEnabled ? tooltipEnabled : tooltipDisabled}
</div>
</div>
</TooltipContent>
</Tooltip>
);
}
export function FloatingSafeModeToggle({
graph,
className,
fullWidth = false,
}: Props) {
const {
currentHITLSafeMode,
showHITLToggle,
isHITLStateUndetermined,
handleHITLToggle,
currentSensitiveActionSafeMode,
showSensitiveActionToggle,
handleSensitiveActionToggle,
currentSafeMode,
isPending,
shouldShowToggle,
isStateUndetermined,
handleToggle,
} = useAgentSafeMode(graph);
if (!shouldShowToggle || isPending) {
return null;
}
const showHITL = showHITLToggle && !isHITLStateUndetermined;
const showSensitive = showSensitiveActionToggle;
if (!showHITL && !showSensitive) {
if (!shouldShowToggle || isStateUndetermined || isPending) {
return null;
}
return (
<div className={cn("fixed z-50 flex flex-col gap-2", className)}>
{showHITL && (
<SafeModeButton
isEnabled={currentHITLSafeMode}
label="Human in the loop block approval"
tooltipEnabled="The agent will pause at human-in-the-loop blocks and wait for your approval"
tooltipDisabled="Human in the loop blocks will proceed automatically"
onToggle={handleHITLToggle}
isPending={isPending}
fullWidth={fullWidth}
/>
)}
{showSensitive && (
<SafeModeButton
isEnabled={currentSensitiveActionSafeMode}
label="Sensitive actions blocks approval"
tooltipEnabled="The agent will pause at sensitive action blocks and wait for your approval"
tooltipDisabled="Sensitive action blocks will proceed automatically"
onToggle={handleSensitiveActionToggle}
isPending={isPending}
fullWidth={fullWidth}
/>
)}
<div className={cn("fixed z-50", className)}>
<Tooltip delayDuration={100}>
<TooltipTrigger asChild>
<Button
variant={currentSafeMode! ? "primary" : "outline"}
key={graph.id}
size="small"
title={
currentSafeMode!
? "Safe Mode: ON. Human in the loop blocks require manual review"
: "Safe Mode: OFF. Human in the loop blocks proceed automatically"
}
onClick={handleToggle}
className={cn(fullWidth ? "w-full" : "")}
>
{currentSafeMode! ? (
<>
<ShieldCheckIcon weight="bold" size={16} />
<Text variant="body" className="text-zinc-200">
Safe Mode: ON
</Text>
</>
) : (
<>
<ShieldIcon weight="bold" size={16} />
<Text variant="body" className="text-zinc-600">
Safe Mode: OFF
</Text>
</>
)}
</Button>
</TooltipTrigger>
<TooltipContent>
<div className="text-center">
<div className="font-medium">
Safe Mode: {currentSafeMode! ? "ON" : "OFF"}
</div>
<div className="mt-1 text-xs text-muted-foreground">
{currentSafeMode!
? "Human in the loop blocks require manual review"
: "Human in the loop blocks proceed automatically"}
</div>
</div>
</TooltipContent>
</Tooltip>
</div>
);
}

View File

@@ -53,14 +53,14 @@ export const CustomControls = memo(
const controls = [
{
id: "zoom-in-button",
icon: <PlusIcon className="size-3.5 text-zinc-600" />,
icon: <PlusIcon className="size-4" />,
label: "Zoom In",
onClick: () => zoomIn(),
className: "h-10 w-10 border-none",
},
{
id: "zoom-out-button",
icon: <MinusIcon className="size-3.5 text-zinc-600" />,
icon: <MinusIcon className="size-4" />,
label: "Zoom Out",
onClick: () => zoomOut(),
className: "h-10 w-10 border-none",
@@ -68,9 +68,9 @@ export const CustomControls = memo(
{
id: "tutorial-button",
icon: isTutorialLoading ? (
<CircleNotchIcon className="size-3.5 animate-spin text-zinc-600" />
<CircleNotchIcon className="size-4 animate-spin" />
) : (
<ChalkboardIcon className="size-3.5 text-zinc-600" />
<ChalkboardIcon className="size-4" />
),
label: isTutorialLoading ? "Loading Tutorial..." : "Start Tutorial",
onClick: handleTutorialClick,
@@ -79,7 +79,7 @@ export const CustomControls = memo(
},
{
id: "fit-view-button",
icon: <FrameCornersIcon className="size-3.5 text-zinc-600" />,
icon: <FrameCornersIcon className="size-4" />,
label: "Fit View",
onClick: () => fitView({ padding: 0.2, duration: 800, maxZoom: 1 }),
className: "h-10 w-10 border-none",
@@ -87,9 +87,9 @@ export const CustomControls = memo(
{
id: "lock-button",
icon: !isLocked ? (
<LockOpenIcon className="size-3.5 text-zinc-600" />
<LockOpenIcon className="size-4" />
) : (
<LockIcon className="size-3.5 text-zinc-600" />
<LockIcon className="size-4" />
),
label: "Toggle Lock",
onClick: () => setIsLocked(!isLocked),

View File

@@ -139,6 +139,14 @@ export const useFlow = () => {
useNodeStore.getState().setNodes([]);
useNodeStore.getState().clearResolutionState();
addNodes(customNodes);
// Sync hardcoded values with handle IDs.
// If a keyvalue field has a key without a value, the backend omits it from hardcoded values.
// But if a handleId exists for that key, it causes inconsistency.
// This ensures hardcoded values stay in sync with handle IDs.
customNodes.forEach((node) => {
useNodeStore.getState().syncHardcodedValuesWithHandleIds(node.id);
});
}
}, [customNodes, addNodes]);
@@ -150,14 +158,6 @@ export const useFlow = () => {
}
}, [graph?.links, addLinks]);
useEffect(() => {
if (customNodes.length > 0 && graph?.links) {
customNodes.forEach((node) => {
useNodeStore.getState().syncHardcodedValuesWithHandleIds(node.id);
});
}
}, [customNodes, graph?.links]);
// update node execution status in nodes
useEffect(() => {
if (

View File

@@ -19,8 +19,6 @@ export type CustomEdgeData = {
beadUp?: number;
beadDown?: number;
beadData?: Map<string, NodeExecutionResult["status"]>;
edgeColorClass?: string;
edgeHexColor?: string;
};
export type CustomEdge = XYEdge<CustomEdgeData, "custom">;
@@ -38,6 +36,7 @@ const CustomEdge = ({
selected,
}: EdgeProps<CustomEdge>) => {
const removeConnection = useEdgeStore((state) => state.removeEdge);
// Subscribe to the brokenEdgeIDs map and check if this edge is broken across any node
const isBroken = useNodeStore((state) => state.isEdgeBroken(id));
const [isHovered, setIsHovered] = useState(false);
@@ -53,7 +52,6 @@ const CustomEdge = ({
const isStatic = data?.isStatic ?? false;
const beadUp = data?.beadUp ?? 0;
const beadDown = data?.beadDown ?? 0;
const edgeColorClass = data?.edgeColorClass;
const handleRemoveEdge = () => {
removeConnection(id);
@@ -72,9 +70,7 @@ const CustomEdge = ({
? "!stroke-red-500 !stroke-[2px] [stroke-dasharray:4]"
: selected
? "stroke-zinc-800"
: edgeColorClass
? cn(edgeColorClass, "opacity-70 hover:opacity-100")
: "stroke-zinc-500/50 hover:stroke-zinc-500",
: "stroke-zinc-500/50 hover:stroke-zinc-500",
)}
/>
<JSBeads

View File

@@ -8,7 +8,6 @@ import { useCallback } from "react";
import { useNodeStore } from "../../../stores/nodeStore";
import { useHistoryStore } from "../../../stores/historyStore";
import { CustomEdge } from "./CustomEdge";
import { getEdgeColorFromOutputType } from "../nodes/helpers";
export const useCustomEdge = () => {
const edges = useEdgeStore((s) => s.edges);
@@ -35,13 +34,8 @@ export const useCustomEdge = () => {
if (exists) return;
const nodes = useNodeStore.getState().nodes;
const sourceNode = nodes.find((n) => n.id === conn.source);
const isStatic = sourceNode?.data?.staticOutput;
const { colorClass, hexColor } = getEdgeColorFromOutputType(
sourceNode?.data?.outputSchema,
conn.sourceHandle,
);
const isStatic = nodes.find((n) => n.id === conn.source)?.data
?.staticOutput;
addEdge({
source: conn.source,
@@ -50,8 +44,6 @@ export const useCustomEdge = () => {
targetHandle: conn.targetHandle,
data: {
isStatic,
edgeColorClass: colorClass,
edgeHexColor: hexColor,
},
});
},

View File

@@ -1,21 +1,22 @@
import { Button } from "@/components/atoms/Button/Button";
import { Text } from "@/components/atoms/Text/Text";
import {
Accordion,
AccordionContent,
AccordionItem,
AccordionTrigger,
} from "@/components/molecules/Accordion/Accordion";
import { beautifyString, cn } from "@/lib/utils";
import { CopyIcon, CheckIcon } from "@phosphor-icons/react";
import { CaretDownIcon, CopyIcon, CheckIcon } from "@phosphor-icons/react";
import { NodeDataViewer } from "./components/NodeDataViewer/NodeDataViewer";
import { ContentRenderer } from "./components/ContentRenderer";
import { useNodeOutput } from "./useNodeOutput";
import { ViewMoreData } from "./components/ViewMoreData";
export const NodeDataRenderer = ({ nodeId }: { nodeId: string }) => {
const { outputData, copiedKey, handleCopy, executionResultId, inputData } =
useNodeOutput(nodeId);
const {
outputData,
isExpanded,
setIsExpanded,
copiedKey,
handleCopy,
executionResultId,
inputData,
} = useNodeOutput(nodeId);
if (Object.keys(outputData).length === 0) {
return null;
@@ -24,117 +25,122 @@ export const NodeDataRenderer = ({ nodeId }: { nodeId: string }) => {
return (
<div
data-tutorial-id={`node-output`}
className="rounded-b-xl border-t border-zinc-200 px-4 py-2"
className="flex flex-col gap-3 rounded-b-xl border-t border-zinc-200 px-4 py-4"
>
<Accordion type="single" collapsible defaultValue="node-output">
<AccordionItem value="node-output" className="border-none">
<AccordionTrigger className="py-2 hover:no-underline">
<Text
variant="body-medium"
className="!font-semibold text-slate-700"
>
Node Output
</Text>
</AccordionTrigger>
<AccordionContent className="pt-2">
<div className="flex max-w-[350px] flex-col gap-4">
<div className="space-y-2">
<Text variant="small-medium">Input</Text>
<div className="flex items-center justify-between">
<Text variant="body-medium" className="!font-semibold text-slate-700">
Node Output
</Text>
<Button
variant="ghost"
size="small"
onClick={() => setIsExpanded(!isExpanded)}
className="h-fit min-w-0 p-1 text-slate-600 hover:text-slate-900"
>
<CaretDownIcon
size={16}
weight="bold"
className={`transition-transform ${isExpanded ? "rotate-180" : ""}`}
/>
</Button>
</div>
<ContentRenderer value={inputData} shortContent={false} />
{isExpanded && (
<>
<div className="flex max-w-[350px] flex-col gap-4">
<div className="space-y-2">
<Text variant="small-medium">Input</Text>
<div className="mt-1 flex justify-end gap-1">
<NodeDataViewer
data={inputData}
pinName="Input"
execId={executionResultId}
/>
<Button
variant="secondary"
size="small"
onClick={() => handleCopy("input", inputData)}
className={cn(
"h-fit min-w-0 gap-1.5 border border-zinc-200 p-2 text-black hover:text-slate-900",
copiedKey === "input" &&
"border-green-400 bg-green-100 hover:border-green-400 hover:bg-green-200",
)}
>
{copiedKey === "input" ? (
<CheckIcon size={12} className="text-green-600" />
) : (
<CopyIcon size={12} />
)}
</Button>
</div>
<ContentRenderer value={inputData} shortContent={false} />
<div className="mt-1 flex justify-end gap-1">
<NodeDataViewer
data={inputData}
pinName="Input"
execId={executionResultId}
/>
<Button
variant="secondary"
size="small"
onClick={() => handleCopy("input", inputData)}
className={cn(
"h-fit min-w-0 gap-1.5 border border-zinc-200 p-2 text-black hover:text-slate-900",
copiedKey === "input" &&
"border-green-400 bg-green-100 hover:border-green-400 hover:bg-green-200",
)}
>
{copiedKey === "input" ? (
<CheckIcon size={12} className="text-green-600" />
) : (
<CopyIcon size={12} />
)}
</Button>
</div>
</div>
{Object.entries(outputData)
.slice(0, 2)
.map(([key, value]) => (
<div key={key} className="flex flex-col gap-2">
<div className="flex items-center gap-2">
<Text
variant="small-medium"
className="!font-semibold text-slate-600"
>
Pin:
</Text>
<Text variant="small" className="text-slate-700">
{beautifyString(key)}
</Text>
</div>
<div className="w-full space-y-2">
<Text
variant="small"
className="!font-semibold text-slate-600"
>
Data:
</Text>
<div className="relative space-y-2">
{value.map((item, index) => (
<div key={index}>
<ContentRenderer value={item} shortContent={true} />
</div>
))}
<div className="mt-1 flex justify-end gap-1">
<NodeDataViewer
data={value}
pinName={key}
execId={executionResultId}
/>
<Button
variant="secondary"
size="small"
onClick={() => handleCopy(key, value)}
className={cn(
"h-fit min-w-0 gap-1.5 border border-zinc-200 p-2 text-black hover:text-slate-900",
copiedKey === key &&
"border-green-400 bg-green-100 hover:border-green-400 hover:bg-green-200",
)}
>
{copiedKey === key ? (
<CheckIcon size={12} className="text-green-600" />
) : (
<CopyIcon size={12} />
)}
</Button>
{Object.entries(outputData)
.slice(0, 2)
.map(([key, value]) => (
<div key={key} className="flex flex-col gap-2">
<div className="flex items-center gap-2">
<Text
variant="small-medium"
className="!font-semibold text-slate-600"
>
Pin:
</Text>
<Text variant="small" className="text-slate-700">
{beautifyString(key)}
</Text>
</div>
<div className="w-full space-y-2">
<Text
variant="small"
className="!font-semibold text-slate-600"
>
Data:
</Text>
<div className="relative space-y-2">
{value.map((item, index) => (
<div key={index}>
<ContentRenderer value={item} shortContent={true} />
</div>
))}
<div className="mt-1 flex justify-end gap-1">
<NodeDataViewer
data={value}
pinName={key}
execId={executionResultId}
/>
<Button
variant="secondary"
size="small"
onClick={() => handleCopy(key, value)}
className={cn(
"h-fit min-w-0 gap-1.5 border border-zinc-200 p-2 text-black hover:text-slate-900",
copiedKey === key &&
"border-green-400 bg-green-100 hover:border-green-400 hover:bg-green-200",
)}
>
{copiedKey === key ? (
<CheckIcon size={12} className="text-green-600" />
) : (
<CopyIcon size={12} />
)}
</Button>
</div>
</div>
</div>
))}
</div>
</div>
))}
</div>
{Object.keys(outputData).length > 2 && (
<ViewMoreData
outputData={outputData}
execId={executionResultId}
/>
)}
</AccordionContent>
</AccordionItem>
</Accordion>
{Object.keys(outputData).length > 2 && (
<ViewMoreData outputData={outputData} execId={executionResultId} />
)}
</>
)}
</div>
);
};

View File

@@ -4,6 +4,7 @@ import { useShallow } from "zustand/react/shallow";
import { useState } from "react";
export const useNodeOutput = (nodeId: string) => {
const [isExpanded, setIsExpanded] = useState(true);
const [copiedKey, setCopiedKey] = useState<string | null>(null);
const { toast } = useToast();
@@ -36,10 +37,13 @@ export const useNodeOutput = (nodeId: string) => {
}
};
return {
outputData,
inputData,
copiedKey,
handleCopy,
outputData: outputData,
inputData: inputData,
isExpanded: isExpanded,
setIsExpanded: setIsExpanded,
copiedKey: copiedKey,
setCopiedKey: setCopiedKey,
handleCopy: handleCopy,
executionResultId: nodeExecutionResult?.node_exec_id,
};
};

View File

@@ -187,38 +187,3 @@ export const getTypeDisplayInfo = (schema: any) => {
hexColor,
};
};
export function getEdgeColorFromOutputType(
outputSchema: RJSFSchema | undefined,
sourceHandle: string,
): { colorClass: string; hexColor: string } {
const defaultColor = {
colorClass: "stroke-zinc-500/50",
hexColor: "#6b7280",
};
if (!outputSchema?.properties) return defaultColor;
const properties = outputSchema.properties as Record<string, unknown>;
const handleParts = sourceHandle.split("_#_");
let currentSchema: Record<string, unknown> = properties;
for (let i = 0; i < handleParts.length; i++) {
const part = handleParts[i];
const fieldSchema = currentSchema[part] as Record<string, unknown>;
if (!fieldSchema) return defaultColor;
if (i === handleParts.length - 1) {
const { hexColor, colorClass } = getTypeDisplayInfo(fieldSchema);
return { colorClass: colorClass.replace("!text-", "stroke-"), hexColor };
}
if (fieldSchema.properties) {
currentSchema = fieldSchema.properties as Record<string, unknown>;
} else {
return defaultColor;
}
}
return defaultColor;
}

View File

@@ -1,32 +1,7 @@
type IconOptions = {
size?: number;
color?: string;
};
const DEFAULT_SIZE = 16;
const DEFAULT_COLOR = "#52525b"; // zinc-600
const iconPaths = {
ClickIcon: `M88,24V16a8,8,0,0,1,16,0v8a8,8,0,0,1-16,0ZM16,104h8a8,8,0,0,0,0-16H16a8,8,0,0,0,0,16ZM124.42,39.16a8,8,0,0,0,10.74-3.58l8-16a8,8,0,0,0-14.31-7.16l-8,16A8,8,0,0,0,124.42,39.16Zm-96,81.69-16,8a8,8,0,0,0,7.16,14.31l16-8a8,8,0,1,0-7.16-14.31ZM219.31,184a16,16,0,0,1,0,22.63l-12.68,12.68a16,16,0,0,1-22.63,0L132.7,168,115,214.09c0,.1-.08.21-.13.32a15.83,15.83,0,0,1-14.6,9.59l-.79,0a15.83,15.83,0,0,1-14.41-11L32.8,52.92A16,16,0,0,1,52.92,32.8L213,85.07a16,16,0,0,1,1.41,29.8l-.32.13L168,132.69ZM208,195.31,156.69,144h0a16,16,0,0,1,4.93-26l.32-.14,45.95-17.64L48,48l52.2,159.86,17.65-46c0-.11.08-.22.13-.33a16,16,0,0,1,11.69-9.34,16.72,16.72,0,0,1,3-.28,16,16,0,0,1,11.3,4.69L195.31,208Z`,
Keyboard: `M224,48H32A16,16,0,0,0,16,64V192a16,16,0,0,0,16,16H224a16,16,0,0,0,16-16V64A16,16,0,0,0,224,48Zm0,144H32V64H224V192Zm-16-64a8,8,0,0,1-8,8H56a8,8,0,0,1,0-16H200A8,8,0,0,1,208,128Zm0-32a8,8,0,0,1-8,8H56a8,8,0,0,1,0-16H200A8,8,0,0,1,208,96ZM72,160a8,8,0,0,1-8,8H56a8,8,0,0,1,0-16h8A8,8,0,0,1,72,160Zm96,0a8,8,0,0,1-8,8H96a8,8,0,0,1,0-16h64A8,8,0,0,1,168,160Zm40,0a8,8,0,0,1-8,8h-8a8,8,0,0,1,0-16h8A8,8,0,0,1,208,160Z`,
Drag: `M188,80a27.79,27.79,0,0,0-13.36,3.4,28,28,0,0,0-46.64-11A28,28,0,0,0,80,92v20H68a28,28,0,0,0-28,28v12a88,88,0,0,0,176,0V108A28,28,0,0,0,188,80Zm12,72a72,72,0,0,1-144,0V140a12,12,0,0,1,12-12H80v24a8,8,0,0,0,16,0V92a12,12,0,0,1,24,0v28a8,8,0,0,0,16,0V92a12,12,0,0,1,24,0v28a8,8,0,0,0,16,0V108a12,12,0,0,1,24,0Z`,
};
function createIcon(path: string, options: IconOptions = {}): string {
const size = options.size ?? DEFAULT_SIZE;
const color = options.color ?? DEFAULT_COLOR;
return `<svg xmlns="http://www.w3.org/2000/svg" width="${size}" height="${size}" fill="${color}" viewBox="0 0 256 256"><path d="${path}"></path></svg>`;
}
// These are SVG Phosphor icons
export const ICONS = {
ClickIcon: createIcon(iconPaths.ClickIcon),
Keyboard: createIcon(iconPaths.Keyboard),
Drag: createIcon(iconPaths.Drag),
ClickIcon: `<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" fill="#000000" viewBox="0 0 256 256"><path d="M88,24V16a8,8,0,0,1,16,0v8a8,8,0,0,1-16,0ZM16,104h8a8,8,0,0,0,0-16H16a8,8,0,0,0,0,16ZM124.42,39.16a8,8,0,0,0,10.74-3.58l8-16a8,8,0,0,0-14.31-7.16l-8,16A8,8,0,0,0,124.42,39.16Zm-96,81.69-16,8a8,8,0,0,0,7.16,14.31l16-8a8,8,0,1,0-7.16-14.31ZM219.31,184a16,16,0,0,1,0,22.63l-12.68,12.68a16,16,0,0,1-22.63,0L132.7,168,115,214.09c0,.1-.08.21-.13.32a15.83,15.83,0,0,1-14.6,9.59l-.79,0a15.83,15.83,0,0,1-14.41-11L32.8,52.92A16,16,0,0,1,52.92,32.8L213,85.07a16,16,0,0,1,1.41,29.8l-.32.13L168,132.69ZM208,195.31,156.69,144h0a16,16,0,0,1,4.93-26l.32-.14,45.95-17.64L48,48l52.2,159.86,17.65-46c0-.11.08-.22.13-.33a16,16,0,0,1,11.69-9.34,16.72,16.72,0,0,1,3-.28,16,16,0,0,1,11.3,4.69L195.31,208Z"></path></svg>`,
Keyboard: `<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" fill="#000000" viewBox="0 0 256 256"><path d="M224,48H32A16,16,0,0,0,16,64V192a16,16,0,0,0,16,16H224a16,16,0,0,0,16-16V64A16,16,0,0,0,224,48Zm0,144H32V64H224V192Zm-16-64a8,8,0,0,1-8,8H56a8,8,0,0,1,0-16H200A8,8,0,0,1,208,128Zm0-32a8,8,0,0,1-8,8H56a8,8,0,0,1,0-16H200A8,8,0,0,1,208,96ZM72,160a8,8,0,0,1-8,8H56a8,8,0,0,1,0-16h8A8,8,0,0,1,72,160Zm96,0a8,8,0,0,1-8,8H96a8,8,0,0,1,0-16h64A8,8,0,0,1,168,160Zm40,0a8,8,0,0,1-8,8h-8a8,8,0,0,1,0-16h8A8,8,0,0,1,208,160Z"></path></svg>`,
Drag: `<svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" fill="#000000" viewBox="0 0 256 256"><path d="M188,80a27.79,27.79,0,0,0-13.36,3.4,28,28,0,0,0-46.64-11A28,28,0,0,0,80,92v20H68a28,28,0,0,0-28,28v12a88,88,0,0,0,176,0V108A28,28,0,0,0,188,80Zm12,72a72,72,0,0,1-144,0V140a12,12,0,0,1,12-12H80v24a8,8,0,0,0,16,0V92a12,12,0,0,1,24,0v28a8,8,0,0,0,16,0V92a12,12,0,0,1,24,0v28a8,8,0,0,0,16,0V108a12,12,0,0,1,24,0Z"></path></svg>`,
};
export function getIcon(
name: keyof typeof iconPaths,
options?: IconOptions,
): string {
return createIcon(iconPaths[name], options);
}

View File

@@ -11,7 +11,6 @@ import {
} from "./helpers";
import { useNodeStore } from "../../../stores/nodeStore";
import { useEdgeStore } from "../../../stores/edgeStore";
import { useTutorialStore } from "../../../stores/tutorialStore";
let isTutorialLoading = false;
let tutorialLoadingCallback: ((loading: boolean) => void) | null = null;
@@ -61,14 +60,12 @@ export const startTutorial = async () => {
handleTutorialComplete();
removeTutorialStyles();
clearPrefetchedBlocks();
useTutorialStore.getState().setIsTutorialRunning(false);
});
tour.on("cancel", () => {
handleTutorialCancel(tour);
removeTutorialStyles();
clearPrefetchedBlocks();
useTutorialStore.getState().setIsTutorialRunning(false);
});
for (const step of tour.steps) {

View File

@@ -61,18 +61,12 @@ export const convertNodesPlusBlockInfoIntoCustomNodes = (
return customNode;
};
const isToolSourceName = (sourceName: string): boolean =>
sourceName.startsWith("tools_^_");
const cleanupSourceName = (sourceName: string): string =>
isToolSourceName(sourceName) ? "tools" : sourceName;
export const linkToCustomEdge = (link: Link): CustomEdge => ({
id: link.id ?? "",
type: "custom" as const,
source: link.source_id,
target: link.sink_id,
sourceHandle: cleanupSourceName(link.source_name),
sourceHandle: link.source_name,
targetHandle: link.sink_name,
data: {
isStatic: link.is_static,

View File

@@ -267,34 +267,23 @@ export function extractCredentialsNeeded(
| undefined;
if (missingCreds && Object.keys(missingCreds).length > 0) {
const agentName = (setupInfo?.agent_name as string) || "this block";
const credentials = Object.values(missingCreds).map((credInfo) => {
// Normalize to array at boundary - prefer 'types' array, fall back to single 'type'
const typesArray = credInfo.types as
| Array<"api_key" | "oauth2" | "user_password" | "host_scoped">
| undefined;
const singleType =
const credentials = Object.values(missingCreds).map((credInfo) => ({
provider: (credInfo.provider as string) || "unknown",
providerName:
(credInfo.provider_name as string) ||
(credInfo.provider as string) ||
"Unknown Provider",
credentialType:
(credInfo.type as
| "api_key"
| "oauth2"
| "user_password"
| "host_scoped"
| undefined) || "api_key";
const credentialTypes =
typesArray && typesArray.length > 0 ? typesArray : [singleType];
return {
provider: (credInfo.provider as string) || "unknown",
providerName:
(credInfo.provider_name as string) ||
(credInfo.provider as string) ||
"Unknown Provider",
credentialTypes,
title:
(credInfo.title as string) ||
`${(credInfo.provider_name as string) || (credInfo.provider as string)} credentials`,
scopes: credInfo.scopes as string[] | undefined,
};
});
| "host_scoped") || "api_key",
title:
(credInfo.title as string) ||
`${(credInfo.provider_name as string) || (credInfo.provider as string)} credentials`,
scopes: credInfo.scopes as string[] | undefined,
}));
return {
type: "credentials_needed",
toolName,
@@ -369,14 +358,11 @@ export function extractInputsNeeded(
credentials.forEach((cred) => {
const id = cred.id as string;
if (id) {
const credentialTypes = Array.isArray(cred.types)
? cred.types
: [(cred.type as string) || "api_key"];
credentialsSchema[id] = {
type: "object",
properties: {},
credentials_provider: [cred.provider as string],
credentials_types: credentialTypes,
credentials_types: [(cred.type as string) || "api_key"],
credentials_scopes: cred.scopes as string[] | undefined,
};
}

View File

@@ -9,9 +9,7 @@ import { useChatCredentialsSetup } from "./useChatCredentialsSetup";
export interface CredentialInfo {
provider: string;
providerName: string;
credentialTypes: Array<
"api_key" | "oauth2" | "user_password" | "host_scoped"
>;
credentialType: "api_key" | "oauth2" | "user_password" | "host_scoped";
title: string;
scopes?: string[];
}
@@ -32,7 +30,7 @@ function createSchemaFromCredentialInfo(
type: "object",
properties: {},
credentials_provider: [credential.provider],
credentials_types: credential.credentialTypes,
credentials_types: [credential.credentialType],
credentials_scopes: credential.scopes,
discriminator: undefined,
discriminator_mapping: undefined,

View File

@@ -41,9 +41,7 @@ export type ChatMessageData =
credentials: Array<{
provider: string;
providerName: string;
credentialTypes: Array<
"api_key" | "oauth2" | "user_password" | "host_scoped"
>;
credentialType: "api_key" | "oauth2" | "user_password" | "host_scoped";
title: string;
scopes?: string[];
}>;

View File

@@ -31,18 +31,10 @@ export function AgentSettingsModal({
}
}
const {
currentHITLSafeMode,
showHITLToggle,
handleHITLToggle,
currentSensitiveActionSafeMode,
showSensitiveActionToggle,
handleSensitiveActionToggle,
isPending,
shouldShowToggle,
} = useAgentSafeMode(agent);
const { currentSafeMode, isPending, hasHITLBlocks, handleToggle } =
useAgentSafeMode(agent);
if (!shouldShowToggle) return null;
if (!hasHITLBlocks) return null;
return (
<Dialog
@@ -65,48 +57,23 @@ export function AgentSettingsModal({
)}
<Dialog.Content>
<div className="space-y-6">
{showHITLToggle && (
<div className="flex w-full flex-col items-start gap-4 rounded-xl border border-zinc-100 bg-white p-6">
<div className="flex w-full items-start justify-between gap-4">
<div className="flex-1">
<Text variant="large-semibold">
Human-in-the-loop approval
</Text>
<Text variant="large" className="mt-1 text-zinc-900">
The agent will pause at human-in-the-loop blocks and wait
for your review before continuing
</Text>
</div>
<Switch
checked={currentHITLSafeMode || false}
onCheckedChange={handleHITLToggle}
disabled={isPending}
className="mt-1"
/>
<div className="flex w-full flex-col items-start gap-4 rounded-xl border border-zinc-100 bg-white p-6">
<div className="flex w-full items-start justify-between gap-4">
<div className="flex-1">
<Text variant="large-semibold">Require human approval</Text>
<Text variant="large" className="mt-1 text-zinc-900">
The agent will pause and wait for your review before
continuing
</Text>
</div>
<Switch
checked={currentSafeMode || false}
onCheckedChange={handleToggle}
disabled={isPending}
className="mt-1"
/>
</div>
)}
{showSensitiveActionToggle && (
<div className="flex w-full flex-col items-start gap-4 rounded-xl border border-zinc-100 bg-white p-6">
<div className="flex w-full items-start justify-between gap-4">
<div className="flex-1">
<Text variant="large-semibold">
Sensitive action approval
</Text>
<Text variant="large" className="mt-1 text-zinc-900">
The agent will pause at sensitive action blocks and wait for
your review before continuing
</Text>
</div>
<Switch
checked={currentSensitiveActionSafeMode}
onCheckedChange={handleSensitiveActionToggle}
disabled={isPending}
className="mt-1"
/>
</div>
</div>
)}
</div>
</div>
</Dialog.Content>
</Dialog>

View File

@@ -5,37 +5,30 @@ import {
AccordionItem,
AccordionTrigger,
} from "@/components/molecules/Accordion/Accordion";
import {
CredentialsMetaInput,
CredentialsType,
} from "@/lib/autogpt-server-api/types";
import { CredentialsProvidersContext } from "@/providers/agent-credentials/credentials-provider";
import { SlidersHorizontalIcon } from "@phosphor-icons/react";
import { SlidersHorizontal } from "@phosphor-icons/react";
import { useContext, useEffect, useMemo, useRef } from "react";
import { useRunAgentModalContext } from "../../context";
import {
areSystemCredentialProvidersLoading,
CredentialField,
findSavedCredentialByProviderAndType,
hasMissingRequiredSystemCredentials,
splitCredentialFieldsBySystem,
} from "./helpers";
} from "../helpers";
type Props = {
credentialFields: CredentialField[];
requiredCredentials: Set<string>;
inputCredentials: Record<string, CredentialsMetaInput | undefined>;
inputValues: Record<string, any>;
onCredentialChange: (key: string, value?: CredentialsMetaInput) => void;
};
export function CredentialsGroupedView({
credentialFields,
requiredCredentials,
inputCredentials,
inputValues,
onCredentialChange,
}: Props) {
const allProviders = useContext(CredentialsProvidersContext);
const { inputCredentials, setInputCredentialsValue, inputValues } =
useRunAgentModalContext();
const { userCredentialFields, systemCredentialFields } = useMemo(
() =>
@@ -94,11 +87,11 @@ export function CredentialsGroupedView({
);
if (savedCredential) {
onCredentialChange(key, {
setInputCredentialsValue(key, {
id: savedCredential.id,
provider: savedCredential.provider,
type: savedCredential.type as CredentialsType,
title: savedCredential.title,
type: savedCredential.type,
title: (savedCredential as { title?: string }).title,
});
}
}
@@ -110,7 +103,7 @@ export function CredentialsGroupedView({
systemCredentialFields,
requiredCredentials,
inputCredentials,
onCredentialChange,
setInputCredentialsValue,
isLoadingProviders,
]);
@@ -130,7 +123,7 @@ export function CredentialsGroupedView({
}
selectedCredentials={selectedCred}
onSelectCredentials={(value) => {
onCredentialChange(key, value);
setInputCredentialsValue(key, value);
}}
siblingInputs={inputValues}
isOptional={!requiredCredentials.has(key)}
@@ -150,8 +143,7 @@ export function CredentialsGroupedView({
<AccordionItem value="system-credentials" className="border-none">
<AccordionTrigger className="py-2 text-sm text-muted-foreground hover:no-underline">
<div className="flex items-center gap-1">
<SlidersHorizontalIcon size={16} weight="bold" /> System
credentials
<SlidersHorizontal size={16} weight="bold" /> System credentials
{hasMissingSystemCredentials && (
<span className="text-destructive">(missing)</span>
)}
@@ -171,7 +163,7 @@ export function CredentialsGroupedView({
}
selectedCredentials={selectedCred}
onSelectCredentials={(value) => {
onCredentialChange(key, value);
setInputCredentialsValue(key, value);
}}
siblingInputs={inputValues}
isOptional={!requiredCredentials.has(key)}

View File

@@ -1,9 +1,9 @@
import { Input } from "@/components/atoms/Input/Input";
import { CredentialsGroupedView } from "@/components/contextual/CredentialsInput/components/CredentialsGroupedView/CredentialsGroupedView";
import { InformationTooltip } from "@/components/molecules/InformationTooltip/InformationTooltip";
import { useMemo } from "react";
import { RunAgentInputs } from "../../../RunAgentInputs/RunAgentInputs";
import { useRunAgentModalContext } from "../../context";
import { CredentialsGroupedView } from "../CredentialsGroupedView/CredentialsGroupedView";
import { ModalSection } from "../ModalSection/ModalSection";
import { WebhookTriggerBanner } from "../WebhookTriggerBanner/WebhookTriggerBanner";
@@ -19,8 +19,6 @@ export function ModalRunSection() {
setInputValue,
agentInputFields,
agentCredentialsInputFields,
inputCredentials,
setInputCredentialsValue,
} = useRunAgentModalContext();
const inputFields = Object.entries(agentInputFields || {});
@@ -104,9 +102,6 @@ export function ModalRunSection() {
<CredentialsGroupedView
credentialFields={credentialFields}
requiredCredentials={requiredCredentials}
inputCredentials={inputCredentials}
inputValues={inputValues}
onCredentialChange={setInputCredentialsValue}
/>
</ModalSection>
) : null}

View File

@@ -1,5 +1,5 @@
import { CredentialsProvidersContextType } from "@/providers/agent-credentials/credentials-provider";
import { getSystemCredentials } from "../../helpers";
import { getSystemCredentials } from "../../../../../../../../../../../components/contextual/CredentialsInput/helpers";
export type CredentialField = [string, any];

View File

@@ -5,112 +5,48 @@ import { Graph } from "@/lib/autogpt-server-api/types";
import { cn } from "@/lib/utils";
import { ShieldCheckIcon, ShieldIcon } from "@phosphor-icons/react";
import { useAgentSafeMode } from "@/hooks/useAgentSafeMode";
import {
Tooltip,
TooltipContent,
TooltipTrigger,
} from "@/components/atoms/Tooltip/BaseTooltip";
interface Props {
graph: GraphModel | LibraryAgent | Graph;
className?: string;
fullWidth?: boolean;
}
interface SafeModeIconButtonProps {
isEnabled: boolean;
label: string;
tooltipEnabled: string;
tooltipDisabled: string;
onToggle: () => void;
isPending: boolean;
}
function SafeModeIconButton({
isEnabled,
label,
tooltipEnabled,
tooltipDisabled,
onToggle,
isPending,
}: SafeModeIconButtonProps) {
return (
<Tooltip delayDuration={100}>
<TooltipTrigger asChild>
<Button
variant="icon"
size="icon"
aria-label={`${label}: ${isEnabled ? "ON" : "OFF"}. ${isEnabled ? tooltipEnabled : tooltipDisabled}`}
onClick={onToggle}
disabled={isPending}
className={cn(isPending ? "opacity-0" : "opacity-100")}
>
{isEnabled ? (
<ShieldCheckIcon weight="bold" size={16} />
) : (
<ShieldIcon weight="bold" size={16} />
)}
</Button>
</TooltipTrigger>
<TooltipContent>
<div className="text-center">
<div className="font-medium">
{label}: {isEnabled ? "ON" : "OFF"}
</div>
<div className="mt-1 text-xs text-muted-foreground">
{isEnabled ? tooltipEnabled : tooltipDisabled}
</div>
</div>
</TooltipContent>
</Tooltip>
);
}
export function SafeModeToggle({ graph, className }: Props) {
export function SafeModeToggle({ graph }: Props) {
const {
currentHITLSafeMode,
showHITLToggle,
isHITLStateUndetermined,
handleHITLToggle,
currentSensitiveActionSafeMode,
showSensitiveActionToggle,
handleSensitiveActionToggle,
currentSafeMode,
isPending,
shouldShowToggle,
isStateUndetermined,
handleToggle,
} = useAgentSafeMode(graph);
if (!shouldShowToggle || isHITLStateUndetermined) {
return null;
}
const showHITL = showHITLToggle && !isHITLStateUndetermined;
const showSensitive = showSensitiveActionToggle;
if (!showHITL && !showSensitive) {
if (!shouldShowToggle || isStateUndetermined) {
return null;
}
return (
<div className={cn("flex gap-1", className)}>
{showHITL && (
<SafeModeIconButton
isEnabled={currentHITLSafeMode}
label="Human-in-the-loop"
tooltipEnabled="The agent will pause at human-in-the-loop blocks and wait for your approval"
tooltipDisabled="Human-in-the-loop blocks will proceed automatically"
onToggle={handleHITLToggle}
isPending={isPending}
/>
<Button
variant="icon"
key={graph.id}
size="icon"
aria-label={
currentSafeMode!
? "Safe Mode: ON. Human in the loop blocks require manual review"
: "Safe Mode: OFF. Human in the loop blocks proceed automatically"
}
onClick={handleToggle}
className={cn(isPending ? "opacity-0" : "opacity-100")}
>
{currentSafeMode! ? (
<>
<ShieldCheckIcon weight="bold" size={16} />
</>
) : (
<>
<ShieldIcon weight="bold" size={16} />
</>
)}
{showSensitive && (
<SafeModeIconButton
isEnabled={currentSensitiveActionSafeMode}
label="Sensitive actions"
tooltipEnabled="The agent will pause at sensitive action blocks and wait for your approval"
tooltipDisabled="Sensitive action blocks will proceed automatically"
onToggle={handleSensitiveActionToggle}
isPending={isPending}
/>
)}
</div>
</Button>
);
}

View File

@@ -13,16 +13,8 @@ interface Props {
}
export function SelectedSettingsView({ agent, onClearSelectedRun }: Props) {
const {
currentHITLSafeMode,
showHITLToggle,
handleHITLToggle,
currentSensitiveActionSafeMode,
showSensitiveActionToggle,
handleSensitiveActionToggle,
isPending,
shouldShowToggle,
} = useAgentSafeMode(agent);
const { currentSafeMode, isPending, hasHITLBlocks, handleToggle } =
useAgentSafeMode(agent);
return (
<SelectedViewLayout agent={agent}>
@@ -42,51 +34,24 @@ export function SelectedSettingsView({ agent, onClearSelectedRun }: Props) {
</div>
<div className={`${AGENT_LIBRARY_SECTION_PADDING_X} space-y-6`}>
{shouldShowToggle ? (
<>
{showHITLToggle && (
<div className="flex w-full max-w-2xl flex-col items-start gap-4 rounded-xl border border-zinc-100 bg-white p-6">
<div className="flex w-full items-start justify-between gap-4">
<div className="flex-1">
<Text variant="large-semibold">
Human-in-the-loop approval
</Text>
<Text variant="large" className="mt-1 text-zinc-900">
The agent will pause at human-in-the-loop blocks and
wait for your review before continuing
</Text>
</div>
<Switch
checked={currentHITLSafeMode || false}
onCheckedChange={handleHITLToggle}
disabled={isPending}
className="mt-1"
/>
</div>
{hasHITLBlocks ? (
<div className="flex w-full max-w-2xl flex-col items-start gap-4 rounded-xl border border-zinc-100 bg-white p-6">
<div className="flex w-full items-start justify-between gap-4">
<div className="flex-1">
<Text variant="large-semibold">Require human approval</Text>
<Text variant="large" className="mt-1 text-zinc-900">
The agent will pause and wait for your review before
continuing
</Text>
</div>
)}
{showSensitiveActionToggle && (
<div className="flex w-full max-w-2xl flex-col items-start gap-4 rounded-xl border border-zinc-100 bg-white p-6">
<div className="flex w-full items-start justify-between gap-4">
<div className="flex-1">
<Text variant="large-semibold">
Sensitive action approval
</Text>
<Text variant="large" className="mt-1 text-zinc-900">
The agent will pause at sensitive action blocks and wait
for your review before continuing
</Text>
</div>
<Switch
checked={currentSensitiveActionSafeMode}
onCheckedChange={handleSensitiveActionToggle}
disabled={isPending}
className="mt-1"
/>
</div>
</div>
)}
</>
<Switch
checked={currentSafeMode || false}
onCheckedChange={handleToggle}
disabled={isPending}
className="mt-1"
/>
</div>
</div>
) : (
<div className="rounded-xl border border-zinc-100 bg-white p-6">
<Text variant="body" className="text-muted-foreground">

View File

@@ -2,7 +2,6 @@
import { Button } from "@/components/atoms/Button/Button";
import { FileInput } from "@/components/atoms/FileInput/FileInput";
import { Input } from "@/components/atoms/Input/Input";
import { LoadingSpinner } from "@/components/atoms/LoadingSpinner/LoadingSpinner";
import { Dialog } from "@/components/molecules/Dialog/Dialog";
import {
Form,
@@ -121,7 +120,7 @@ export default function LibraryUploadAgentDialog() {
>
{isUploading ? (
<div className="flex items-center gap-2">
<LoadingSpinner size="small" className="text-white" />
<div className="h-4 w-4 animate-spin rounded-full border-b-2 border-t-2 border-white"></div>
<span>Uploading...</span>
</div>
) : (

View File

@@ -6383,11 +6383,6 @@
"title": "Has Human In The Loop",
"readOnly": true
},
"has_sensitive_action": {
"type": "boolean",
"title": "Has Sensitive Action",
"readOnly": true
},
"trigger_setup_info": {
"anyOf": [
{ "$ref": "#/components/schemas/GraphTriggerInfo" },
@@ -6404,7 +6399,6 @@
"output_schema",
"has_external_trigger",
"has_human_in_the_loop",
"has_sensitive_action",
"trigger_setup_info"
],
"title": "BaseGraph"
@@ -7635,11 +7629,6 @@
"title": "Has Human In The Loop",
"readOnly": true
},
"has_sensitive_action": {
"type": "boolean",
"title": "Has Sensitive Action",
"readOnly": true
},
"trigger_setup_info": {
"anyOf": [
{ "$ref": "#/components/schemas/GraphTriggerInfo" },
@@ -7663,7 +7652,6 @@
"output_schema",
"has_external_trigger",
"has_human_in_the_loop",
"has_sensitive_action",
"trigger_setup_info",
"credentials_input_schema"
],
@@ -7742,11 +7730,6 @@
"title": "Has Human In The Loop",
"readOnly": true
},
"has_sensitive_action": {
"type": "boolean",
"title": "Has Sensitive Action",
"readOnly": true
},
"trigger_setup_info": {
"anyOf": [
{ "$ref": "#/components/schemas/GraphTriggerInfo" },
@@ -7771,7 +7754,6 @@
"output_schema",
"has_external_trigger",
"has_human_in_the_loop",
"has_sensitive_action",
"trigger_setup_info",
"credentials_input_schema"
],
@@ -7780,14 +7762,8 @@
"GraphSettings": {
"properties": {
"human_in_the_loop_safe_mode": {
"type": "boolean",
"title": "Human In The Loop Safe Mode",
"default": true
},
"sensitive_action_safe_mode": {
"type": "boolean",
"title": "Sensitive Action Safe Mode",
"default": false
"anyOf": [{ "type": "boolean" }, { "type": "null" }],
"title": "Human In The Loop Safe Mode"
}
},
"type": "object",
@@ -7945,16 +7921,6 @@
"title": "Has External Trigger",
"description": "Whether the agent has an external trigger (e.g. webhook) node"
},
"has_human_in_the_loop": {
"type": "boolean",
"title": "Has Human In The Loop",
"description": "Whether the agent has human-in-the-loop blocks"
},
"has_sensitive_action": {
"type": "boolean",
"title": "Has Sensitive Action",
"description": "Whether the agent has sensitive action blocks"
},
"trigger_setup_info": {
"anyOf": [
{ "$ref": "#/components/schemas/GraphTriggerInfo" },
@@ -8001,8 +7967,6 @@
"output_schema",
"credentials_input_schema",
"has_external_trigger",
"has_human_in_the_loop",
"has_sensitive_action",
"new_output",
"can_access_graph",
"is_latest_version",

View File

@@ -1,33 +0,0 @@
"use client";
import * as PopoverPrimitive from "@radix-ui/react-popover";
import * as React from "react";
import { cn } from "@/lib/utils";
const Popover = PopoverPrimitive.Root;
const PopoverTrigger = PopoverPrimitive.Trigger;
const PopoverAnchor = PopoverPrimitive.Anchor;
const PopoverContent = React.forwardRef<
React.ElementRef<typeof PopoverPrimitive.Content>,
React.ComponentPropsWithoutRef<typeof PopoverPrimitive.Content>
>(({ className, align = "center", sideOffset = 4, ...props }, ref) => (
<PopoverPrimitive.Portal>
<PopoverPrimitive.Content
ref={ref}
align={align}
sideOffset={sideOffset}
className={cn(
"z-50 w-72 rounded-lg border border-zinc-200 bg-white p-4 text-zinc-900 shadow-md outline-none data-[state=open]:animate-in data-[state=closed]:animate-out data-[state=closed]:fade-out-0 data-[state=open]:fade-in-0 data-[state=closed]:zoom-out-95 data-[state=open]:zoom-in-95 data-[side=bottom]:slide-in-from-top-2 data-[side=left]:slide-in-from-right-2 data-[side=right]:slide-in-from-left-2 data-[side=top]:slide-in-from-bottom-2",
className,
)}
{...props}
/>
</PopoverPrimitive.Portal>
));
PopoverContent.displayName = PopoverPrimitive.Content.displayName;
export { Popover, PopoverAnchor, PopoverContent, PopoverTrigger };

View File

@@ -35,13 +35,12 @@ export const CredentialFieldTitle = (props: {
uiOptions,
);
const provider = getCredentialProviderFromSchema(
useNodeStore.getState().getHardCodedValues(nodeId),
schema as BlockIOCredentialsSubSchema,
const credentialProvider = toDisplayName(
getCredentialProviderFromSchema(
useNodeStore.getState().getHardCodedValues(nodeId),
schema as BlockIOCredentialsSubSchema,
) ?? "",
);
const credentialProvider = provider
? `${toDisplayName(provider)} credential`
: "credential";
const updatedUiSchema = updateUiOption(uiSchema, {
showHandles: false,

View File

@@ -1,92 +0,0 @@
"use client";
import {
descriptionId,
FieldProps,
getTemplate,
RJSFSchema,
titleId,
} from "@rjsf/utils";
import { useMemo } from "react";
import { LlmModelPicker } from "./components/LlmModelPicker";
import { LlmModelMetadataMap } from "./types";
import { updateUiOption } from "../../helpers";
type LlmModelSchema = RJSFSchema & {
llm_model_metadata?: LlmModelMetadataMap;
};
export function LlmModelField(props: FieldProps) {
const { schema, formData, onChange, disabled, readonly, fieldPathId } = props;
const metadata = useMemo(() => {
return (schema as LlmModelSchema)?.llm_model_metadata ?? {};
}, [schema]);
const models = useMemo(() => {
return Object.values(metadata);
}, [metadata]);
const selectedName =
typeof formData === "string"
? formData
: typeof schema.default === "string"
? schema.default
: "";
const selectedModel = selectedName
? (metadata[selectedName] ??
models.find((model) => model.name === selectedName))
: undefined;
const recommendedName =
typeof schema.default === "string" ? schema.default : models[0]?.name;
const recommendedModel =
recommendedName && metadata[recommendedName]
? metadata[recommendedName]
: undefined;
if (models.length === 0) {
return null;
}
const TitleFieldTemplate = getTemplate("TitleFieldTemplate", props.registry);
const DescriptionFieldTemplate = getTemplate(
"DescriptionFieldTemplate",
props.registry,
);
const updatedUiSchema = updateUiOption(props.uiSchema, {
showHandles: false,
});
return (
<>
<div className="flex items-center gap-2">
<TitleFieldTemplate
id={titleId(fieldPathId)}
title={schema.title || ""}
required={true}
schema={schema}
uiSchema={updatedUiSchema}
registry={props.registry}
/>
<DescriptionFieldTemplate
id={descriptionId(fieldPathId)}
description={schema.description || ""}
schema={schema}
registry={props.registry}
/>
</div>
<LlmModelPicker
models={models}
selectedModel={selectedModel}
recommendedModel={recommendedModel}
onSelect={(value) => onChange(value, fieldPathId?.path)}
disabled={disabled || readonly}
/>
</>
);
}

View File

@@ -1,66 +0,0 @@
"use client";
import Image from "next/image";
import { Text } from "@/components/atoms/Text/Text";
const creatorIconMap: Record<string, string> = {
anthropic: "/integrations/anthropic-color.png",
openai: "/integrations/openai.png",
google: "/integrations/gemini.png",
nvidia: "/integrations/nvidia.png",
groq: "/integrations/groq.png",
ollama: "/integrations/ollama.png",
openrouter: "/integrations/open_router.png",
v0: "/integrations/v0.png",
xai: "/integrations/xai.webp",
meta: "/integrations/llama_api.png",
amazon: "/integrations/amazon.png",
cohere: "/integrations/cohere.png",
deepseek: "/integrations/deepseek.png",
gryphe: "/integrations/gryphe.png",
microsoft: "/integrations/microsoft.webp",
moonshotai: "/integrations/moonshot.png",
mistral: "/integrations/mistral.png",
mistralai: "/integrations/mistral.png",
nousresearch: "/integrations/nousresearch.avif",
perplexity: "/integrations/perplexity.webp",
qwen: "/integrations/qwen.png",
};
type Props = {
value: string;
size?: number;
};
export function LlmIcon({ value, size = 20 }: Props) {
const normalized = value.trim().toLowerCase().replace(/\s+/g, "");
const src = creatorIconMap[normalized];
if (src) {
return (
<div
className="flex items-center justify-center overflow-hidden rounded-xsmall"
style={{ width: size, height: size }}
>
<Image
src={src}
alt={value}
width={size}
height={size}
className="h-full w-full object-cover"
/>
</div>
);
}
const fallback = value?.trim().slice(0, 1).toUpperCase() || "?";
return (
<div
className="flex items-center justify-center rounded-xsmall bg-zinc-100"
style={{ width: size, height: size }}
>
<Text variant="small" className="text-zinc-500">
{fallback}
</Text>
</div>
);
}

View File

@@ -1,24 +0,0 @@
"use client";
import { ArrowLeftIcon } from "@phosphor-icons/react";
import { Text } from "@/components/atoms/Text/Text";
type Props = {
label: string;
onBack: () => void;
};
export function LlmMenuHeader({ label, onBack }: Props) {
return (
<button
type="button"
onClick={onBack}
className="flex w-full items-center gap-2 px-2 py-2 text-left hover:bg-zinc-100"
>
<ArrowLeftIcon className="h-4 w-4 text-zinc-800" weight="bold" />
<Text variant="body" className="text-zinc-900">
{label}
</Text>
</button>
);
}

View File

@@ -1,61 +0,0 @@
"use client";
import { CaretRightIcon, CheckIcon } from "@phosphor-icons/react";
import { Text } from "@/components/atoms/Text/Text";
import { cn } from "@/lib/utils";
type Props = {
title: string;
subtitle?: string;
icon?: React.ReactNode;
showChevron?: boolean;
rightSlot?: React.ReactNode;
onClick: () => void;
isActive?: boolean;
};
export function LlmMenuItem({
title,
subtitle,
icon,
showChevron,
rightSlot,
onClick,
isActive,
}: Props) {
const hasIcon = Boolean(icon);
return (
<button
type="button"
onClick={onClick}
className={cn("w-full py-1 pl-2 pr-4 text-left hover:bg-zinc-100")}
>
<div className="flex items-center justify-between gap-3">
<div className="flex items-center gap-2">
{icon}
<Text variant="body" className="text-zinc-900">
{title}
</Text>
</div>
<div className="flex items-center gap-2">
{isActive && (
<CheckIcon className="h-4 w-4 text-emerald-600" weight="bold" />
)}
{rightSlot}
{showChevron && (
<CaretRightIcon className="h-4 w-4 text-zinc-900" weight="bold" />
)}
</div>
</div>
{subtitle && (
<Text
variant="small"
className={cn("mb-1 text-zinc-500", hasIcon && "pl-0")}
>
{subtitle}
</Text>
)}
</button>
);
}

View File

@@ -1,235 +0,0 @@
"use client";
import { useCallback, useEffect, useMemo, useState } from "react";
import { CaretDownIcon } from "@phosphor-icons/react";
import {
Popover,
PopoverContent,
PopoverTrigger,
} from "@/components/molecules/Popover/Popover";
import { Text } from "@/components/atoms/Text/Text";
import { cn } from "@/lib/utils";
import {
getCreatorDisplayName,
getModelDisplayName,
getProviderDisplayName,
groupByCreator,
groupByTitle,
} from "../helpers";
import { LlmModelMetadata } from "../types";
import { LlmIcon } from "./LlmIcon";
import { LlmMenuHeader } from "./LlmMenuHeader";
import { LlmMenuItem } from "./LlmMenuItem";
import { LlmPriceTier } from "./LlmPriceTier";
type MenuView = "creator" | "model" | "provider";
type Props = {
models: LlmModelMetadata[];
selectedModel?: LlmModelMetadata;
recommendedModel?: LlmModelMetadata;
onSelect: (value: string) => void;
disabled?: boolean;
};
export function LlmModelPicker({
models,
selectedModel,
recommendedModel,
onSelect,
disabled,
}: Props) {
const [open, setOpen] = useState(false);
const [view, setView] = useState<MenuView>("creator");
const [activeCreator, setActiveCreator] = useState<string | null>(null);
const [activeTitle, setActiveTitle] = useState<string | null>(null);
const modelsByCreator = useMemo(() => groupByCreator(models), [models]);
const creators = useMemo(() => {
return Array.from(modelsByCreator.keys()).sort((a, b) =>
a.localeCompare(b),
);
}, [modelsByCreator]);
const creatorIconValues = useMemo(() => {
const map = new Map<string, string>();
for (const [creator, entries] of modelsByCreator.entries()) {
map.set(creator, entries[0]?.creator ?? creator);
}
return map;
}, [modelsByCreator]);
useEffect(() => {
if (!open) {
return;
}
setView("creator");
setActiveCreator(
selectedModel
? getCreatorDisplayName(selectedModel)
: (creators[0] ?? null),
);
setActiveTitle(selectedModel ? getModelDisplayName(selectedModel) : null);
}, [open, selectedModel, creators]);
const currentCreator = activeCreator ?? creators[0] ?? null;
const currentModels = useMemo(() => {
return currentCreator ? (modelsByCreator.get(currentCreator) ?? []) : [];
}, [currentCreator, modelsByCreator]);
const currentCreatorIcon = useMemo(() => {
return currentModels[0]?.creator ?? currentCreator;
}, [currentModels, currentCreator]);
const modelsByTitle = useMemo(
() => groupByTitle(currentModels),
[currentModels],
);
const modelEntries = useMemo(() => {
return Array.from(modelsByTitle.entries())
.map(([title, entries]) => {
const providers = new Set(entries.map((entry) => entry.provider));
return {
title,
entries,
providerCount: providers.size,
};
})
.sort((a, b) => a.title.localeCompare(b.title));
}, [modelsByTitle]);
const providerEntries = useMemo(() => {
if (!activeTitle) {
return [];
}
return modelsByTitle.get(activeTitle) ?? [];
}, [activeTitle, modelsByTitle]);
const handleSelectModel = useCallback(
(modelName: string) => {
onSelect(modelName);
setOpen(false);
},
[onSelect],
);
const triggerModel = selectedModel ?? recommendedModel ?? models[0];
const triggerTitle = triggerModel
? getModelDisplayName(triggerModel)
: "Select model";
const triggerCreator = triggerModel?.creator ?? "";
return (
<Popover open={open} onOpenChange={setOpen}>
<PopoverTrigger asChild>
<button
type="button"
disabled={disabled}
className={cn(
"flex w-full min-w-[15rem] items-center rounded-lg border border-zinc-200 bg-white px-3 py-2 text-left",
"hover:border-zinc-300 focus:outline-none focus:ring-2 focus:ring-zinc-200",
disabled && "cursor-not-allowed opacity-60",
)}
>
<LlmIcon value={triggerCreator} />
<Text variant="body" className="ml-1 flex-1 text-zinc-900">
{triggerTitle}
</Text>
<CaretDownIcon className="h-3 w-3 text-zinc-900" weight="bold" />
</button>
</PopoverTrigger>
<PopoverContent
align="start"
sideOffset={4}
className="max-h-[45vh] w-[--radix-popover-trigger-width] min-w-[16rem] overflow-y-auto rounded-md border border-zinc-200 bg-white p-0 shadow-[0px_1px_4px_rgba(12,12,13,0.12)]"
>
{view === "creator" && (
<div className="flex flex-col">
{recommendedModel && (
<>
<LlmMenuItem
title={getModelDisplayName(recommendedModel)}
subtitle="Recommended"
icon={<LlmIcon value={recommendedModel.creator} />}
onClick={() => handleSelectModel(recommendedModel.name)}
/>
<div className="border-b border-zinc-200" />
</>
)}
{creators.map((creator) => (
<LlmMenuItem
key={creator}
title={creator}
icon={
<LlmIcon value={creatorIconValues.get(creator) ?? creator} />
}
showChevron={true}
isActive={
selectedModel
? getCreatorDisplayName(selectedModel) === creator
: false
}
onClick={() => {
setActiveCreator(creator);
setView("model");
}}
/>
))}
</div>
)}
{view === "model" && currentCreator && (
<div className="flex flex-col">
<LlmMenuHeader
label={currentCreator}
onBack={() => setView("creator")}
/>
<div className="border-b border-zinc-200" />
{modelEntries.map((entry) => (
<LlmMenuItem
key={entry.title}
title={entry.title}
icon={<LlmIcon value={currentCreatorIcon} />}
rightSlot={<LlmPriceTier tier={entry.entries[0]?.price_tier} />}
showChevron={entry.providerCount > 1}
isActive={
selectedModel
? getModelDisplayName(selectedModel) === entry.title
: false
}
onClick={() => {
if (entry.providerCount > 1) {
setActiveTitle(entry.title);
setView("provider");
return;
}
handleSelectModel(entry.entries[0].name);
}}
/>
))}
</div>
)}
{view === "provider" && activeTitle && (
<div className="flex flex-col">
<LlmMenuHeader
label={activeTitle}
onBack={() => setView("model")}
/>
<div className="border-b border-zinc-200" />
{providerEntries.map((entry) => (
<LlmMenuItem
key={`${entry.title}-${entry.provider}`}
title={getProviderDisplayName(entry)}
icon={<LlmIcon value={entry.provider} />}
isActive={selectedModel?.provider === entry.provider}
onClick={() => handleSelectModel(entry.name)}
/>
))}
</div>
)}
</PopoverContent>
</Popover>
);
}

View File

@@ -1,25 +0,0 @@
"use client";
import { CurrencyDollarSimpleIcon } from "@phosphor-icons/react";
type Props = {
tier?: number;
};
export function LlmPriceTier({ tier }: Props) {
if (!tier || tier <= 0) {
return null;
}
const clamped = Math.min(3, Math.max(1, tier));
return (
<div className="flex items-center text-zinc-900">
{Array.from({ length: clamped }).map((_, index) => (
<CurrencyDollarSimpleIcon
key={`price-${index}`}
className="-mr-0.5 h-3 w-3"
weight="bold"
/>
))}
</div>
);
}

View File

@@ -1,35 +0,0 @@
import { LlmModelMetadata } from "./types";
export function groupByCreator(models: LlmModelMetadata[]) {
const map = new Map<string, LlmModelMetadata[]>();
for (const model of models) {
const key = getCreatorDisplayName(model);
const existing = map.get(key) ?? [];
existing.push(model);
map.set(key, existing);
}
return map;
}
export function groupByTitle(models: LlmModelMetadata[]) {
const map = new Map<string, LlmModelMetadata[]>();
for (const model of models) {
const displayName = getModelDisplayName(model);
const existing = map.get(displayName) ?? [];
existing.push(model);
map.set(displayName, existing);
}
return map;
}
export function getCreatorDisplayName(model: LlmModelMetadata): string {
return model.creator_name || model.creator || "";
}
export function getModelDisplayName(model: LlmModelMetadata): string {
return model.title || model.name || "";
}
export function getProviderDisplayName(model: LlmModelMetadata): string {
return model.provider_name || model.provider || "";
}

View File

@@ -1,11 +0,0 @@
export type LlmModelMetadata = {
creator: string;
creator_name: string;
title: string;
provider: string;
provider_name: string;
name: string;
price_tier?: number;
};
export type LlmModelMetadataMap = Record<string, LlmModelMetadata>;

View File

@@ -8,7 +8,6 @@ import {
isMultiSelectSchema,
} from "../utils/schema-utils";
import { TableField } from "./TableField/TableField";
import { LlmModelField } from "./LlmModelField/LlmModelField";
export interface CustomFieldDefinition {
id: string;
@@ -58,15 +57,6 @@ export const CUSTOM_FIELDS: CustomFieldDefinition[] = [
},
component: TableField,
},
{
id: "custom/llm_model_field",
matcher: (schema: any) => {
return (
typeof schema === "object" && schema !== null && "llm_model" in schema
);
},
component: LlmModelField,
},
];
export function findCustomFieldId(schema: any): string | null {

View File

@@ -20,15 +20,11 @@ function hasHITLBlocks(graph: GraphModel | LibraryAgent | Graph): boolean {
if ("has_human_in_the_loop" in graph) {
return !!graph.has_human_in_the_loop;
}
return false;
}
function hasSensitiveActionBlocks(
graph: GraphModel | LibraryAgent | Graph,
): boolean {
if ("has_sensitive_action" in graph) {
return !!graph.has_sensitive_action;
if (isLibraryAgent(graph)) {
return graph.settings?.human_in_the_loop_safe_mode !== null;
}
return false;
}
@@ -44,9 +40,7 @@ export function useAgentSafeMode(graph: GraphModel | LibraryAgent | Graph) {
const graphId = getGraphId(graph);
const isAgent = isLibraryAgent(graph);
const showHITLToggle = hasHITLBlocks(graph);
const showSensitiveActionToggle = hasSensitiveActionBlocks(graph);
const shouldShowToggle = showHITLToggle || showSensitiveActionToggle;
const shouldShowToggle = hasHITLBlocks(graph);
const { mutateAsync: updateGraphSettings, isPending } =
usePatchV1UpdateGraphSettings();
@@ -62,37 +56,27 @@ export function useAgentSafeMode(graph: GraphModel | LibraryAgent | Graph) {
},
);
const [localHITLSafeMode, setLocalHITLSafeMode] = useState<boolean>(true);
const [localSensitiveActionSafeMode, setLocalSensitiveActionSafeMode] =
useState<boolean>(false);
const [isLocalStateLoaded, setIsLocalStateLoaded] = useState<boolean>(false);
const [localSafeMode, setLocalSafeMode] = useState<boolean | null>(null);
useEffect(() => {
if (!isAgent && libraryAgent) {
setLocalHITLSafeMode(
libraryAgent.settings?.human_in_the_loop_safe_mode ?? true,
);
setLocalSensitiveActionSafeMode(
libraryAgent.settings?.sensitive_action_safe_mode ?? false,
);
setIsLocalStateLoaded(true);
const backendValue = libraryAgent.settings?.human_in_the_loop_safe_mode;
if (backendValue !== undefined) {
setLocalSafeMode(backendValue);
}
}
}, [isAgent, libraryAgent]);
const currentHITLSafeMode = isAgent
? (graph.settings?.human_in_the_loop_safe_mode ?? true)
: localHITLSafeMode;
const currentSafeMode = isAgent
? graph.settings?.human_in_the_loop_safe_mode
: localSafeMode;
const currentSensitiveActionSafeMode = isAgent
? (graph.settings?.sensitive_action_safe_mode ?? false)
: localSensitiveActionSafeMode;
const isStateUndetermined = isAgent
? graph.settings?.human_in_the_loop_safe_mode == null
: isLoading || localSafeMode === null;
const isHITLStateUndetermined = isAgent
? false
: isLoading || !isLocalStateLoaded;
const handleHITLToggle = useCallback(async () => {
const newSafeMode = !currentHITLSafeMode;
const handleToggle = useCallback(async () => {
const newSafeMode = !currentSafeMode;
try {
await updateGraphSettings({
@@ -101,7 +85,7 @@ export function useAgentSafeMode(graph: GraphModel | LibraryAgent | Graph) {
});
if (!isAgent) {
setLocalHITLSafeMode(newSafeMode);
setLocalSafeMode(newSafeMode);
}
if (isAgent) {
@@ -117,62 +101,37 @@ export function useAgentSafeMode(graph: GraphModel | LibraryAgent | Graph) {
queryClient.invalidateQueries({ queryKey: ["v2", "executions"] });
toast({
title: `HITL safe mode ${newSafeMode ? "enabled" : "disabled"}`,
title: `Safe mode ${newSafeMode ? "enabled" : "disabled"}`,
description: newSafeMode
? "Human-in-the-loop blocks will require manual review"
: "Human-in-the-loop blocks will proceed automatically",
duration: 2000,
});
} catch (error) {
handleToggleError(error, isAgent, toast);
}
}, [
currentHITLSafeMode,
graphId,
isAgent,
graph.id,
updateGraphSettings,
queryClient,
toast,
]);
const isNotFoundError =
error instanceof Error &&
(error.message.includes("404") || error.message.includes("not found"));
const handleSensitiveActionToggle = useCallback(async () => {
const newSafeMode = !currentSensitiveActionSafeMode;
try {
await updateGraphSettings({
graphId,
data: { sensitive_action_safe_mode: newSafeMode },
});
if (!isAgent) {
setLocalSensitiveActionSafeMode(newSafeMode);
}
if (isAgent) {
queryClient.invalidateQueries({
queryKey: getGetV2GetLibraryAgentQueryOptions(graph.id.toString())
.queryKey,
if (!isAgent && isNotFoundError) {
toast({
title: "Safe mode not available",
description:
"To configure safe mode, please save this graph to your library first.",
variant: "destructive",
});
} else {
toast({
title: "Failed to update safe mode",
description:
error instanceof Error
? error.message
: "An unexpected error occurred.",
variant: "destructive",
});
}
queryClient.invalidateQueries({
queryKey: ["v1", "graphs", graphId, "executions"],
});
queryClient.invalidateQueries({ queryKey: ["v2", "executions"] });
toast({
title: `Sensitive action safe mode ${newSafeMode ? "enabled" : "disabled"}`,
description: newSafeMode
? "Sensitive action blocks will require manual review"
: "Sensitive action blocks will proceed automatically",
duration: 2000,
});
} catch (error) {
handleToggleError(error, isAgent, toast);
}
}, [
currentSensitiveActionSafeMode,
currentSafeMode,
graphId,
isAgent,
graph.id,
@@ -182,53 +141,11 @@ export function useAgentSafeMode(graph: GraphModel | LibraryAgent | Graph) {
]);
return {
// HITL safe mode
currentHITLSafeMode,
showHITLToggle,
isHITLStateUndetermined,
handleHITLToggle,
// Sensitive action safe mode
currentSensitiveActionSafeMode,
showSensitiveActionToggle,
handleSensitiveActionToggle,
// General
currentSafeMode,
isPending,
shouldShowToggle,
// Backwards compatibility
currentSafeMode: currentHITLSafeMode,
isStateUndetermined: isHITLStateUndetermined,
handleToggle: handleHITLToggle,
hasHITLBlocks: showHITLToggle,
isStateUndetermined,
handleToggle,
hasHITLBlocks: shouldShowToggle,
};
}
function handleToggleError(
error: unknown,
isAgent: boolean,
toast: ReturnType<typeof useToast>["toast"],
) {
const isNotFoundError =
error instanceof Error &&
(error.message.includes("404") || error.message.includes("not found"));
if (!isAgent && isNotFoundError) {
toast({
title: "Safe mode not available",
description:
"To configure safe mode, please save this graph to your library first.",
variant: "destructive",
});
} else {
toast({
title: "Failed to update safe mode",
description:
error instanceof Error
? error.message
: "An unexpected error occurred.",
variant: "destructive",
});
}
}

View File

@@ -5,7 +5,7 @@ import isEqual from "lodash/isEqual";
export function cleanNode(node: CustomNode) {
return {
id: node.id,
// Note: position is intentionally excluded to prevent draft saves when dragging nodes
position: node.position,
data: {
hardcodedValues: node.data.hardcodedValues,
title: node.data.title,

View File

@@ -1,12 +1,15 @@
[flake8]
max-line-length = 88
extend-ignore = E203
exclude =
.tox,
__pycache__,
*.pyc,
.env
venv*/*,
.venv/*,
reports/*,
dist/*,
data/*,
.env,
venv*,
.venv,
reports,
dist,
data,
.benchmark_workspaces,
.autogpt,

291
classic/CLAUDE.md Normal file
View File

@@ -0,0 +1,291 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
AutoGPT Classic is an experimental, **unsupported** project demonstrating autonomous GPT-4 operation. Dependencies will not be updated, and the codebase contains known vulnerabilities. This is preserved for educational/historical purposes.
## Repository Structure
```
classic/
├── pyproject.toml # Single consolidated Poetry project
├── poetry.lock # Single lock file
├── forge/
│ └── forge/ # Core agent framework package
├── original_autogpt/
│ └── autogpt/ # AutoGPT agent package
├── direct_benchmark/
│ └── direct_benchmark/ # Benchmark harness package
└── benchmark/ # Challenge definitions (data, not code)
```
All packages are managed by a single `pyproject.toml` at the classic/ root.
## Common Commands
### Setup & Install
```bash
# Install everything from classic/ directory
cd classic
poetry install
```
### Running Agents
```bash
# Run forge agent
poetry run python -m forge
# Run original autogpt server
poetry run serve --debug
# Run autogpt CLI
poetry run autogpt
```
Agents run on `http://localhost:8000` by default.
### Benchmarking
```bash
# Run benchmarks
poetry run direct-benchmark run
# Run specific strategies and models
poetry run direct-benchmark run \
--strategies one_shot,rewoo \
--models claude \
--parallel 4
# Run a single test
poetry run direct-benchmark run --tests ReadFile
# List available commands
poetry run direct-benchmark --help
```
### Testing
```bash
poetry run pytest # All tests
poetry run pytest forge/tests/ # Forge tests only
poetry run pytest original_autogpt/tests/ # AutoGPT tests only
poetry run pytest -k test_name # Single test by name
poetry run pytest path/to/test.py # Specific test file
poetry run pytest --cov # With coverage
```
### Linting & Formatting
Run from the classic/ directory:
```bash
# Format everything (recommended to run together)
poetry run black . && poetry run isort .
# Check formatting (CI-style, no changes)
poetry run black --check . && poetry run isort --check-only .
# Lint
poetry run flake8 # Style linting
# Type check
poetry run pyright # Type checking (some errors are expected in infrastructure code)
```
Note: Always run linters over the entire directory, not specific files, for best results.
## Architecture
### Forge (Core Framework)
The `forge` package is the foundation that other components depend on:
- `forge/agent/` - Agent implementation and protocols
- `forge/llm/` - Multi-provider LLM integrations (OpenAI, Anthropic, Groq, LiteLLM)
- `forge/components/` - Reusable agent components
- `forge/file_storage/` - File system abstraction
- `forge/config/` - Configuration management
### Original AutoGPT
- `original_autogpt/autogpt/app/` - CLI application entry points
- `original_autogpt/autogpt/agents/` - Agent implementations
- `original_autogpt/autogpt/agent_factory/` - Agent creation logic
### Direct Benchmark
Benchmark harness for testing agent performance:
- `direct_benchmark/direct_benchmark/` - CLI and harness code
- `benchmark/agbenchmark/challenges/` - Test cases organized by category (code, retrieval, data, etc.)
- Reports generated in `direct_benchmark/reports/`
### Package Structure
All three packages are included in a single Poetry project. Imports are fully qualified:
- `from forge.agent.base import BaseAgent`
- `from autogpt.agents.agent import Agent`
- `from direct_benchmark.harness import BenchmarkHarness`
## Code Style
- Python 3.12 target
- Line length: 88 characters (Black default)
- Black for formatting, isort for imports (profile="black")
- Type hints with Pyright checking
## Testing Patterns
- Async support via pytest-asyncio
- Fixtures defined in `conftest.py` files provide: `tmp_project_root`, `storage`, `config`, `llm_provider`, `agent`
- Tests requiring API keys (OPENAI_API_KEY, ANTHROPIC_API_KEY) will skip if not set
## Environment Setup
Copy `.env.example` to `.env` in the relevant directory and add your API keys:
```bash
cp .env.example .env
# Edit .env with your OPENAI_API_KEY, etc.
```
## Workspaces
Agents operate within a **workspace** - a directory containing all agent data and files. The workspace root defaults to the current working directory.
### Workspace Structure
```
{workspace}/
├── .autogpt/
│ ├── autogpt.yaml # Workspace-level permissions
│ ├── ap_server.db # Agent Protocol database (server mode)
│ └── agents/
│ └── AutoGPT-{agent_id}/
│ ├── state.json # Agent profile, directives, action history
│ ├── permissions.yaml # Agent-specific permission overrides
│ └── workspace/ # Agent's sandboxed working directory
```
### Key Concepts
- **Multiple agents** can coexist in the same workspace (each gets its own subdirectory)
- **File access** is sandboxed to the agent's `workspace/` directory by default
- **State persistence** - agent state saves to `state.json` and survives across sessions
- **Storage backends** - supports local filesystem, S3, and GCS (via `FILE_STORAGE_BACKEND` env var)
### Specifying a Workspace
```bash
# Default: uses current directory
cd /path/to/my/project && poetry run autogpt
# Or specify explicitly via CLI (if supported)
poetry run autogpt --workspace /path/to/workspace
```
## Settings Location
Configuration uses a **layered system** with three levels (in order of precedence):
### 1. Environment Variables (Global)
Loaded from `.env` file in the working directory:
```bash
# Required
OPENAI_API_KEY=sk-...
# Optional LLM settings
SMART_LLM=gpt-4o # Model for complex reasoning
FAST_LLM=gpt-4o-mini # Model for simple tasks
EMBEDDING_MODEL=text-embedding-3-small
# Optional search providers (for web search component)
TAVILY_API_KEY=tvly-...
SERPER_API_KEY=...
GOOGLE_API_KEY=...
GOOGLE_CUSTOM_SEARCH_ENGINE_ID=...
# Optional infrastructure
LOG_LEVEL=DEBUG # DEBUG, INFO, WARNING, ERROR
DATABASE_STRING=sqlite:///agent.db # Agent Protocol database
PORT=8000 # Server port
FILE_STORAGE_BACKEND=local # local, s3, or gcs
```
### 2. Workspace Settings (`{workspace}/.autogpt/autogpt.yaml`)
Workspace-wide permissions that apply to **all agents** in this workspace:
```yaml
allow:
- read_file({workspace}/**)
- write_to_file({workspace}/**)
- list_folder({workspace}/**)
- web_search(*)
deny:
- read_file(**.env)
- read_file(**.env.*)
- read_file(**.key)
- read_file(**.pem)
- execute_shell(rm -rf:*)
- execute_shell(sudo:*)
```
Auto-generated with sensible defaults if missing.
### 3. Agent Settings (`{workspace}/.autogpt/agents/{id}/permissions.yaml`)
Agent-specific permission overrides:
```yaml
allow:
- execute_python(*)
- web_search(*)
deny:
- execute_shell(*)
```
## Permissions
The permission system uses **pattern matching** with a **first-match-wins** evaluation order.
### Permission Check Order
1. Agent deny list → **Block**
2. Workspace deny list → **Block**
3. Agent allow list → **Allow**
4. Workspace allow list → **Allow**
5. Session denied list → **Block** (commands denied during this session)
6. **Prompt user** → Interactive approval (if in interactive mode)
### Pattern Syntax
Format: `command_name(glob_pattern)`
| Pattern | Description |
|---------|-------------|
| `read_file({workspace}/**)` | Read any file in workspace (recursive) |
| `write_to_file({workspace}/*.txt)` | Write only .txt files in workspace root |
| `execute_shell(python:**)` | Execute Python commands only |
| `execute_shell(git:*)` | Execute any git command |
| `web_search(*)` | Allow all web searches |
Special tokens:
- `{workspace}` - Replaced with actual workspace path
- `**` - Matches any path including `/`
- `*` - Matches any characters except `/`
### Interactive Approval Scopes
When prompted for permission, users can choose:
| Scope | Effect |
|-------|--------|
| **Once** | Allow this one time only (not saved) |
| **Agent** | Always allow for this agent (saves to agent `permissions.yaml`) |
| **Workspace** | Always allow for all agents (saves to `autogpt.yaml`) |
| **Deny** | Deny this command (saves to appropriate deny list) |
### Default Security
Out of the box, the following are **denied by default**:
- Reading sensitive files (`.env`, `.key`, `.pem`)
- Destructive shell commands (`rm -rf`, `sudo`)
- Operations outside the workspace directory

View File

@@ -1,182 +0,0 @@
## CLI Documentation
This document describes how to interact with the project's CLI (Command Line Interface). It includes the types of outputs you can expect from each command. Note that the `agents stop` command will terminate any process running on port 8000.
### 1. Entry Point for the CLI
Running the `./run` command without any parameters will display the help message, which provides a list of available commands and options. Additionally, you can append `--help` to any command to view help information specific to that command.
```sh
./run
```
**Output**:
```
Usage: cli.py [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
agent Commands to create, start and stop agents
benchmark Commands to start the benchmark and list tests and categories
setup Installs dependencies needed for your system.
```
If you need assistance with any command, simply add the `--help` parameter to the end of your command, like so:
```sh
./run COMMAND --help
```
This will display a detailed help message regarding that specific command, including a list of any additional options and arguments it accepts.
### 2. Setup Command
```sh
./run setup
```
**Output**:
```
Setup initiated
Installation has been completed.
```
This command initializes the setup of the project.
### 3. Agents Commands
**a. List All Agents**
```sh
./run agent list
```
**Output**:
```
Available agents: 🤖
🐙 forge
🐙 autogpt
```
Lists all the available agents.
**b. Create a New Agent**
```sh
./run agent create my_agent
```
**Output**:
```
🎉 New agent 'my_agent' created and switched to the new directory in agents folder.
```
Creates a new agent named 'my_agent'.
**c. Start an Agent**
```sh
./run agent start my_agent
```
**Output**:
```
... (ASCII Art representing the agent startup)
[Date and Time] [forge.sdk.db] [DEBUG] 🐛 Initializing AgentDB with database_string: sqlite:///agent.db
[Date and Time] [forge.sdk.agent] [INFO] 📝 Agent server starting on http://0.0.0.0:8000
```
Starts the 'my_agent' and displays startup ASCII art and logs.
**d. Stop an Agent**
```sh
./run agent stop
```
**Output**:
```
Agent stopped
```
Stops the running agent.
### 4. Benchmark Commands
**a. List Benchmark Categories**
```sh
./run benchmark categories list
```
**Output**:
```
Available categories: 📚
📖 code
📖 safety
📖 memory
... (and so on)
```
Lists all available benchmark categories.
**b. List Benchmark Tests**
```sh
./run benchmark tests list
```
**Output**:
```
Available tests: 📚
📖 interface
🔬 Search - TestSearch
🔬 Write File - TestWriteFile
... (and so on)
```
Lists all available benchmark tests.
**c. Show Details of a Benchmark Test**
```sh
./run benchmark tests details TestWriteFile
```
**Output**:
```
TestWriteFile
-------------
Category: interface
Task: Write the word 'Washington' to a .txt file
... (and other details)
```
Displays the details of the 'TestWriteFile' benchmark test.
**d. Start Benchmark for the Agent**
```sh
./run benchmark start my_agent
```
**Output**:
```
(more details about the testing process shown whilst the test are running)
============= 13 failed, 1 passed in 0.97s ============...
```
Displays the results of the benchmark tests on 'my_agent'.

View File

@@ -2,7 +2,7 @@
ARG BUILD_TYPE=dev
# Use an official Python base image from the Docker Hub
FROM python:3.10-slim AS autogpt-base
FROM python:3.12-slim AS autogpt-base
# Install browsers
RUN apt-get update && apt-get install -y \
@@ -34,9 +34,6 @@ COPY original_autogpt/pyproject.toml original_autogpt/poetry.lock ./
# Include forge so it can be used as a path dependency
COPY forge/ ../forge
# Include frontend
COPY frontend/ ../frontend
# Set the entrypoint
ENTRYPOINT ["poetry", "run", "autogpt"]
CMD []

View File

@@ -1,173 +0,0 @@
# Quickstart Guide
> For the complete getting started [tutorial series](https://aiedge.medium.com/autogpt-forge-e3de53cc58ec) <- click here
Welcome to the Quickstart Guide! This guide will walk you through setting up, building, and running your own AutoGPT agent. Whether you're a seasoned AI developer or just starting out, this guide will provide you with the steps to jumpstart your journey in AI development with AutoGPT.
## System Requirements
This project supports Linux (Debian-based), Mac, and Windows Subsystem for Linux (WSL). If you use a Windows system, you must install WSL. You can find the installation instructions for WSL [here](https://learn.microsoft.com/en-us/windows/wsl/).
## Getting Setup
1. **Fork the Repository**
To fork the repository, follow these steps:
- Navigate to the main page of the repository.
![Repository](../docs/content/imgs/quickstart/001_repo.png)
- In the top-right corner of the page, click Fork.
![Create Fork UI](../docs/content/imgs/quickstart/002_fork.png)
- On the next page, select your GitHub account to create the fork.
- Wait for the forking process to complete. You now have a copy of the repository in your GitHub account.
2. **Clone the Repository**
To clone the repository, you need to have Git installed on your system. If you don't have Git installed, download it from [here](https://git-scm.com/downloads). Once you have Git installed, follow these steps:
- Open your terminal.
- Navigate to the directory where you want to clone the repository.
- Run the git clone command for the fork you just created
![Clone the Repository](../docs/content/imgs/quickstart/003_clone.png)
- Then open your project in your ide
![Open the Project in your IDE](../docs/content/imgs/quickstart/004_ide.png)
4. **Setup the Project**
Next, we need to set up the required dependencies. We have a tool to help you perform all the tasks on the repo.
It can be accessed by running the `run` command by typing `./run` in the terminal.
The first command you need to use is `./run setup.` This will guide you through setting up your system.
Initially, you will get instructions for installing Flutter and Chrome and setting up your GitHub access token like the following image:
![Setup the Project](../docs/content/imgs/quickstart/005_setup.png)
### For Windows Users
If you're a Windows user and experience issues after installing WSL, follow the steps below to resolve them.
#### Update WSL
Run the following command in Powershell or Command Prompt:
1. Enable the optional WSL and Virtual Machine Platform components.
2. Download and install the latest Linux kernel.
3. Set WSL 2 as the default.
4. Download and install the Ubuntu Linux distribution (a reboot may be required).
```shell
wsl --install
```
For more detailed information and additional steps, refer to [Microsoft's WSL Setup Environment Documentation](https://learn.microsoft.com/en-us/windows/wsl/setup/environment).
#### Resolve FileNotFoundError or "No such file or directory" Errors
When you run `./run setup`, if you encounter errors like `No such file or directory` or `FileNotFoundError`, it might be because Windows-style line endings (CRLF - Carriage Return Line Feed) are not compatible with Unix/Linux style line endings (LF - Line Feed).
To resolve this, you can use the `dos2unix` utility to convert the line endings in your script from CRLF to LF. Heres how to install and run `dos2unix` on the script:
```shell
sudo apt update
sudo apt install dos2unix
dos2unix ./run
```
After executing the above commands, running `./run setup` should work successfully.
#### Store Project Files within the WSL File System
If you continue to experience issues, consider storing your project files within the WSL file system instead of the Windows file system. This method avoids path translations and permissions issues and provides a more consistent development environment.
You can keep running the command to get feedback on where you are up to with your setup.
When setup has been completed, the command will return an output like this:
![Setup Complete](../docs/content/imgs/quickstart/006_setup_complete.png)
## Creating Your Agent
After completing the setup, the next step is to create your agent template.
Execute the command `./run agent create YOUR_AGENT_NAME`, where `YOUR_AGENT_NAME` should be replaced with your chosen name.
Tips for naming your agent:
* Give it its own unique name, or name it after yourself
* Include an important aspect of your agent in the name, such as its purpose
Examples: `SwiftyosAssistant`, `PwutsPRAgent`, `MySuperAgent`
![Create an Agent](../docs/content/imgs/quickstart/007_create_agent.png)
## Running your Agent
Your agent can be started using the command: `./run agent start YOUR_AGENT_NAME`
This starts the agent on the URL: `http://localhost:8000/`
![Start the Agent](../docs/content/imgs/quickstart/009_start_agent.png)
The front end can be accessed from `http://localhost:8000/`; first, you must log in using either a Google account or your GitHub account.
![Login](../docs/content/imgs/quickstart/010_login.png)
Upon logging in, you will get a page that looks something like this: your task history down the left-hand side of the page, and the 'chat' window to send tasks to your agent.
![Login](../docs/content/imgs/quickstart/011_home.png)
When you have finished with your agent or just need to restart it, use Ctl-C to end the session. Then, you can re-run the start command.
If you are having issues and want to ensure the agent has been stopped, there is a `./run agent stop` command, which will kill the process using port 8000, which should be the agent.
## Benchmarking your Agent
The benchmarking system can also be accessed using the CLI too:
```bash
agpt % ./run benchmark
Usage: cli.py benchmark [OPTIONS] COMMAND [ARGS]...
Commands to start the benchmark and list tests and categories
Options:
--help Show this message and exit.
Commands:
categories Benchmark categories group command
start Starts the benchmark command
tests Benchmark tests group command
agpt % ./run benchmark categories
Usage: cli.py benchmark categories [OPTIONS] COMMAND [ARGS]...
Benchmark categories group command
Options:
--help Show this message and exit.
Commands:
list List benchmark categories command
agpt % ./run benchmark tests
Usage: cli.py benchmark tests [OPTIONS] COMMAND [ARGS]...
Benchmark tests group command
Options:
--help Show this message and exit.
Commands:
details Benchmark test details command
list List benchmark tests command
```
The benchmark has been split into different categories of skills you can test your agent on. You can see what categories are available with
```bash
./run benchmark categories list
# And what tests are available with
./run benchmark tests list
```
![Login](../docs/content/imgs/quickstart/012_tests.png)
Finally, you can run the benchmark with
```bash
./run benchmark start YOUR_AGENT_NAME
```
>

View File

@@ -4,7 +4,7 @@ AutoGPT Classic was an experimental project to demonstrate autonomous GPT-4 oper
## Project Status
⚠️ **This project is unsupported, and dependencies will not be updated. It was an experiment that has concluded its initial research phase. If you want to use AutoGPT, you should use the [AutoGPT Platform](/autogpt_platform)**
**This project is unsupported, and dependencies will not be updated.** It was an experiment that has concluded its initial research phase. If you want to use AutoGPT, you should use the [AutoGPT Platform](/autogpt_platform).
For those interested in autonomous AI agents, we recommend exploring more actively maintained alternatives or referring to this codebase for educational purposes only.
@@ -16,37 +16,171 @@ AutoGPT Classic was one of the first implementations of autonomous AI agents - A
- Learn from the results and adjust its approach
- Chain multiple actions together to achieve an objective
## Key Features
- 🔄 Autonomous task chaining
- 🛠 Tool and API integration capabilities
- 💾 Memory management for context retention
- 🔍 Web browsing and information gathering
- 📝 File operations and content creation
- 🔄 Self-prompting and task breakdown
## Structure
The project is organized into several key components:
- `/benchmark` - Performance testing tools
- `/forge` - Core autonomous agent framework
- `/frontend` - User interface components
- `/original_autogpt` - Original implementation
```
classic/
├── pyproject.toml # Single consolidated Poetry project
├── poetry.lock # Single lock file
├── forge/ # Core autonomous agent framework
├── original_autogpt/ # Original implementation
├── direct_benchmark/ # Benchmark harness
└── benchmark/ # Challenge definitions (data)
```
## Getting Started
While this project is no longer actively maintained, you can still explore the codebase:
### Prerequisites
- Python 3.12+
- [Poetry](https://python-poetry.org/docs/#installation)
### Installation
1. Clone the repository:
```bash
# Clone the repository
git clone https://github.com/Significant-Gravitas/AutoGPT.git
cd classic
# Install everything
poetry install
```
2. Review the documentation:
- For reference, see the [documentation](https://docs.agpt.co). You can browse at the same point in time as this commit so the docs don't change.
- Check `CLI-USAGE.md` for command-line interface details
- Refer to `TROUBLESHOOTING.md` for common issues
### Configuration
Configuration uses a layered system:
1. **Environment variables** (`.env` file)
2. **Workspace settings** (`.autogpt/autogpt.yaml`)
3. **Agent settings** (`.autogpt/agents/{id}/permissions.yaml`)
Copy the example environment file and add your API keys:
```bash
cp .env.example .env
```
Key environment variables:
```bash
# Required
OPENAI_API_KEY=sk-...
# Optional LLM settings
SMART_LLM=gpt-4o # Model for complex reasoning
FAST_LLM=gpt-4o-mini # Model for simple tasks
# Optional search providers
TAVILY_API_KEY=tvly-...
SERPER_API_KEY=...
# Optional infrastructure
LOG_LEVEL=DEBUG
PORT=8000
FILE_STORAGE_BACKEND=local # local, s3, or gcs
```
### Running
All commands run from the `classic/` directory:
```bash
# Run forge agent
poetry run python -m forge
# Run original autogpt server
poetry run serve --debug
# Run autogpt CLI
poetry run autogpt
```
Agents run on `http://localhost:8000` by default.
### Benchmarking
```bash
poetry run direct-benchmark run
```
### Testing
```bash
poetry run pytest # All tests
poetry run pytest forge/tests/ # Forge tests only
poetry run pytest original_autogpt/tests/ # AutoGPT tests only
```
## Workspaces
Agents operate within a **workspace** directory that contains all agent data and files:
```
{workspace}/
├── .autogpt/
│ ├── autogpt.yaml # Workspace-level permissions
│ ├── ap_server.db # Agent Protocol database (server mode)
│ └── agents/
│ └── AutoGPT-{agent_id}/
│ ├── state.json # Agent profile, directives, history
│ ├── permissions.yaml # Agent-specific permissions
│ └── workspace/ # Agent's sandboxed working directory
```
- The workspace defaults to the current working directory
- Multiple agents can coexist in the same workspace
- Agent file access is sandboxed to their `workspace/` subdirectory
- State persists across sessions via `state.json`
## Permissions
AutoGPT uses a **layered permission system** with pattern matching:
### Permission Files
| File | Scope | Location |
|------|-------|----------|
| `autogpt.yaml` | All agents in workspace | `.autogpt/autogpt.yaml` |
| `permissions.yaml` | Single agent | `.autogpt/agents/{id}/permissions.yaml` |
### Permission Format
```yaml
allow:
- read_file({workspace}/**) # Read any file in workspace
- write_to_file({workspace}/**) # Write any file in workspace
- web_search(*) # All web searches
deny:
- read_file(**.env) # Block .env files
- execute_shell(sudo:*) # Block sudo commands
```
### Check Order (First Match Wins)
1. Agent deny → Block
2. Workspace deny → Block
3. Agent allow → Allow
4. Workspace allow → Allow
5. Prompt user → Interactive approval
### Interactive Approval
When prompted, users can approve commands with different scopes:
- **Once** - Allow this one time only
- **Agent** - Always allow for this agent
- **Workspace** - Always allow for all agents
- **Deny** - Block this command
### Default Security
Denied by default:
- Sensitive files (`.env`, `.key`, `.pem`)
- Destructive commands (`rm -rf`, `sudo`)
- Operations outside the workspace
## Security Notice
This codebase has **known vulnerabilities** and issues with its dependencies. It will not be updated to new dependencies. Use for educational purposes only.
## License
@@ -55,27 +189,3 @@ This project segment is licensed under the MIT License - see the [LICENSE](LICEN
## Documentation
Please refer to the [documentation](https://docs.agpt.co) for more detailed information about the project's architecture and concepts.
You can browse at the same point in time as this commit so the docs don't change.
## Historical Impact
AutoGPT Classic played a significant role in advancing the field of autonomous AI agents:
- Demonstrated practical implementation of AI autonomy
- Inspired numerous derivative projects and research
- Contributed to the development of AI agent architectures
- Helped identify key challenges in AI autonomy
## Security Notice
If you're studying this codebase, please understand this has KNOWN vulnerabilities and issues with its dependencies. It will not be updated to new dependencies.
## Community & Support
While active development has concluded:
- The codebase remains available for study and reference
- Historical discussions can be found in project issues
- Related research and developments continue in the broader AI agent community
## Acknowledgments
Thanks to all contributors who participated in this experimental project and helped advance the field of autonomous AI agents.

View File

@@ -1,4 +0,0 @@
AGENT_NAME=mini-agi
REPORTS_FOLDER="reports/mini-agi"
OPENAI_API_KEY="sk-" # for LLM eval
BUILD_SKILL_TREE=false # set to true to build the skill tree.

View File

@@ -1,12 +0,0 @@
[flake8]
max-line-length = 88
# Ignore rules that conflict with Black code style
extend-ignore = E203, W503
exclude =
__pycache__/,
*.pyc,
.pytest_cache/,
venv*/,
.venv/,
reports/,
agbenchmark/reports/,

View File

@@ -1,174 +0,0 @@
agbenchmark_config/workspace/
backend/backend_stdout.txt
reports/df*.pkl
reports/raw*
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/
.DS_Store
```
secrets.json
agbenchmark_config/challenges_already_beaten.json
agbenchmark_config/challenges/pri_*
agbenchmark_config/updates.json
agbenchmark_config/reports/*
agbenchmark_config/reports/success_rate.json
agbenchmark_config/reports/regression_tests.json

View File

@@ -1,21 +0,0 @@
MIT License
Copyright (c) 2024 AutoGPT
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@@ -1,25 +0,0 @@
# Auto-GPT Benchmarks
Built for the purpose of benchmarking the performance of agents regardless of how they work.
Objectively know how well your agent is performing in categories like code, retrieval, memory, and safety.
Save time and money while doing it through smart dependencies. The best part? It's all automated.
## Scores:
<img width="733" alt="Screenshot 2023-07-25 at 10 35 01 AM" src="https://github.com/Significant-Gravitas/Auto-GPT-Benchmarks/assets/9652976/98963e0b-18b9-4b17-9a6a-4d3e4418af70">
## Ranking overall:
- 1- [Beebot](https://github.com/AutoPackAI/beebot)
- 2- [mini-agi](https://github.com/muellerberndt/mini-agi)
- 3- [Auto-GPT](https://github.com/Significant-Gravitas/AutoGPT)
## Detailed results:
<img width="733" alt="Screenshot 2023-07-25 at 10 42 15 AM" src="https://github.com/Significant-Gravitas/Auto-GPT-Benchmarks/assets/9652976/39be464c-c842-4437-b28a-07d878542a83">
[Click here to see the results and the raw data!](https://docs.google.com/spreadsheets/d/1WXm16P2AHNbKpkOI0LYBpcsGG0O7D8HYTG5Uj0PaJjA/edit#gid=203558751)!
More agents coming soon !

View File

@@ -1,69 +0,0 @@
## As a user
1. `pip install auto-gpt-benchmarks`
2. Add boilerplate code to run and kill agent
3. `agbenchmark`
- `--category challenge_category` to run tests in a specific category
- `--mock` to only run mock tests if they exists for each test
- `--noreg` to skip any tests that have passed in the past. When you run without this flag and a previous challenge that passed fails, it will now not be regression tests
4. We call boilerplate code for your agent
5. Show pass rate of tests, logs, and any other metrics
## Contributing
##### Diagrams: https://whimsical.com/agbenchmark-5n4hXBq1ZGzBwRsK4TVY7x
### To run the existing mocks
1. clone the repo `auto-gpt-benchmarks`
2. `pip install poetry`
3. `poetry shell`
4. `poetry install`
5. `cp .env_example .env`
6. `git submodule update --init --remote --recursive`
7. `uvicorn server:app --reload`
8. `agbenchmark --mock`
Keep config the same and watch the logs :)
### To run with mini-agi
1. Navigate to `auto-gpt-benchmarks/agent/mini-agi`
2. `pip install -r requirements.txt`
3. `cp .env_example .env`, set `PROMPT_USER=false` and add your `OPENAI_API_KEY=`. Sset `MODEL="gpt-3.5-turbo"` if you don't have access to `gpt-4` yet. Also make sure you have Python 3.10^ installed
4. set `AGENT_NAME=mini-agi` in `.env` file and where you want your `REPORTS_FOLDER` to be
5. Make sure to follow the commands above, and remove mock flag `agbenchmark`
- To add requirements `poetry add requirement`.
Feel free to create prs to merge with `main` at will (but also feel free to ask for review) - if you can't send msg in R&D chat for access.
If you push at any point and break things - it'll happen to everyone - fix it asap. Step 1 is to revert `master` to last working commit
Let people know what beautiful code you write does, document everything well
Share your progress :)
#### Dataset
Manually created, existing challenges within Auto-Gpt, https://osu-nlp-group.github.io/Mind2Web/
## How do I add new agents to agbenchmark ?
Example with smol developer.
1- Create a github branch with your agent following the same pattern as this example:
https://github.com/smol-ai/developer/pull/114/files
2- Create the submodule and the github workflow by following the same pattern as this example:
https://github.com/Significant-Gravitas/Auto-GPT-Benchmarks/pull/48/files
## How do I run agent in different environments?
**To just use as the benchmark for your agent**. `pip install` the package and run `agbenchmark`
**For internal Auto-GPT ci runs**, specify the `AGENT_NAME` you want you use and set the `HOME_ENV`.
Ex. `AGENT_NAME=mini-agi`
**To develop agent alongside benchmark**, you can specify the `AGENT_NAME` you want you use and add as a submodule to the repo

Some files were not shown because too many files have changed in this diff Show More