Compare commits

31 Commits

Author SHA1 Message Date
Zamil Majdy
0470e2e618 Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT into feat/autopilot-dry-run-flag 2026-04-01 06:15:15 +02:00
Zamil Majdy
c0367430e3 Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT into feat/autopilot-dry-run-flag 2026-03-31 19:07:25 +02:00
Zamil Majdy
28ede02a8e Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT into feat/autopilot-dry-run-flag 2026-03-31 15:17:53 +02:00
Zamil Majdy
ba2699aad2 merge: resolve import conflict in copilot/db.py after merging dev
Keep both ChatSessionMetadata (from feature branch) and
invalidate_session_cache (from dev) imports.
2026-03-31 09:17:27 +02:00
Zamil Majdy
1f828ada7b test(copilot): add backward-compat tests for ChatSessionMetadata deserialization
Verify that sessions created before the dry_run field existed still
deserialize correctly — empty dicts, missing keys, and JSON round-trips
all default to dry_run=False without errors.
2026-03-29 20:42:47 +02:00
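
The backward-compat cases listed in this commit (empty dict, missing key, JSON round-trip) can be illustrated with a stdlib-only sketch. `SessionMetadataSketch` and `from_raw` are hypothetical stand-ins; the real `ChatSessionMetadata` is a Pydantic model whose code is not shown here, and silently ignoring unknown keys is an assumption of this sketch.

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class SessionMetadataSketch:
    """Hypothetical stand-in for ChatSessionMetadata."""

    dry_run: bool = False  # default covers rows written before the field existed

    @classmethod
    def from_raw(cls, raw):
        # Accept a JSON string, a dict, or None (as a legacy row might hold).
        if isinstance(raw, str):
            raw = json.loads(raw) if raw else {}
        raw = raw or {}
        known = {f.name for f in fields(cls)}
        # Assumption: keys this sketch does not know about are dropped.
        return cls(**{k: v for k, v in raw.items() if k in known})


legacy = SessionMetadataSketch.from_raw({})
round_trip = SessionMetadataSketch.from_raw(
    json.dumps(asdict(SessionMetadataSketch(dry_run=True)))
)
```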
Zamil Majdy
356ac09cba fix(copilot): respect session.dry_run in run_mcp_tool
When a chat session has dry_run=True, run_mcp_tool now returns a
synthetic dry-run response instead of executing the MCP tool for real.
Discovery (listing tools) is still allowed so the agent can inspect
schemas. Adds tests for all three cases: blocked execution, allowed
discovery, and normal execution when dry_run is False.
2026-03-29 08:04:27 +02:00
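
The three cases this commit tests (blocked execution, allowed discovery, normal execution) might look roughly like the sketch below. Everything here is illustrative: `FakeMcpClient`, `SessionSketch`, `run_mcp_tool_sketch`, and the response dict shape are assumptions, not the actual `run_mcp_tool` code.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMcpClient:
    """Hypothetical client that records real tool executions."""

    calls: list = field(default_factory=list)

    def list_tools(self):
        return ["search", "send_email"]

    def call_tool(self, name, arguments):
        self.calls.append((name, arguments))
        return f"ran {name}"


@dataclass
class SessionSketch:
    dry_run: bool


def run_mcp_tool_sketch(session, client, tool_name=None, arguments=None):
    # Discovery (no tool name given) is always allowed, even in dry-run.
    if tool_name is None:
        return {"tools": client.list_tools()}
    # In a dry-run session, return a synthetic response without executing.
    if session.dry_run:
        return {"tool": tool_name, "result": None, "is_dry_run": True}
    result = client.call_tool(tool_name, arguments or {})
    return {"tool": tool_name, "result": result, "is_dry_run": False}
```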
Zamil Majdy
a75b521e6b fix(copilot): add missing dry_run param to http_credentials_test calls
After the dev merge, RunBlockTool._execute() gained a required
keyword-only `dry_run` parameter. The two integration tests in
TestRunBlockToolAuthenticatedHttp were not updated, causing type-check
and test failures across all Python versions.
2026-03-29 04:33:41 +02:00
Zamil Majdy
ae93d62378 Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT into feat/autopilot-dry-run-flag 2026-03-29 04:17:37 +02:00
Zamil Majdy
1d86cd11a5 chore(frontend): regenerate openapi.json from backend schema
Re-export and prettier-format the API schema to match the current
backend after merging dev.
2026-03-28 07:38:54 +03:00
Zamil Majdy
5ac5cd594a merge: resolve openapi.json conflict with remote feature branch
Keep the dev branch version with extended rate-limit admin fields
(optional user_id, email param, user_email property).
2026-03-28 07:30:39 +03:00
Zamil Majdy
c399d61e2a fix(copilot): add null-safety guards for LLM-supplied string params
Reintroduce `(param or "").strip()` null-coalescing in
connect_integration, find_block, and feature_requests tools. If the
LLM sends `null` for a string parameter the previous `.strip()` call
would raise an AttributeError instead of a helpful validation error.
2026-03-28 07:28:59 +03:00
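
A minimal sketch of the `(param or "").strip()` guard described in this commit. The function name and the error dict are hypothetical; the real tools return their own `ErrorResponse` type.

```python
def validate_query_sketch(query):
    # `None or ""` evaluates to "", so .strip() never sees None.
    cleaned = (query or "").strip()
    if not cleaned:
        # Hypothetical error shape standing in for a validation error.
        return {"error": "query is required"}
    return {"query": cleaned}


def unguarded_sketch(query):
    # The pre-fix behaviour: raises AttributeError when query is None.
    return query.strip()
```

With the guard, a `null` from the LLM and a whitespace-only string both take the same validation path.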
Zamil Majdy
de866f4686 Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT into feat/autopilot-dry-run-flag 2026-03-28 07:28:03 +03:00
Zamil Majdy
497499a5e6 fix(frontend): regenerate openapi.json with CreateSessionRequest extra=forbid
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 03:15:45 +00:00
Zamil Majdy
cc2205e0c4 fix(copilot): reject nested metadata.dry_run with 422 and add contract tests
The CreateSessionRequest accepted extra fields silently, so sending
{"metadata": {"dry_run": true}} resulted in dry_run defaulting to false
instead of returning a validation error. Add `extra="forbid"` to reject
unknown fields and add tests for dry_run=true, default=false, and the
nested-metadata rejection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 00:06:12 +00:00
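
The commit relies on Pydantic's `extra="forbid"` setting. The sketch below imitates that behaviour with the standard library only, so that it runs without Pydantic installed; `parse_create_session_request` and `RequestValidationError` are hypothetical names, and in the real API the failure surfaces as an HTTP 422.

```python
ALLOWED_FIELDS = {"dry_run"}


class RequestValidationError(ValueError):
    """Hypothetical stand-in for the validation error behind a 422."""


def parse_create_session_request(body):
    # Reject unknown top-level fields instead of silently dropping them.
    unknown = set(body) - ALLOWED_FIELDS
    if unknown:
        raise RequestValidationError(
            f"extra fields not permitted: {sorted(unknown)}"
        )
    dry_run = body.get("dry_run", False)
    if not isinstance(dry_run, bool):
        raise RequestValidationError("dry_run must be a boolean")
    return {"dry_run": dry_run}
```

Without the unknown-field check, `{"metadata": {"dry_run": true}}` would parse successfully with `dry_run` left at its default, which is the silent failure this commit fixes.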
Zamil Majdy
5231276ce9 fix(copilot): prevent dry-run metadata from leaking into LLM context
When dry_run=True, the tool output message contained "[DRY RUN] Block
'X' simulated successfully — no real execution occurred." This text was
fed directly into the LLM conversation context, causing the autopilot to
realize it was in simulation mode and change its evaluation behavior —
defeating the purpose of dry-run testing.

Changes:
- Remove [DRY RUN] prefix from execute_block() dry-run message — now
  identical to real execution ("Block 'X' executed successfully")
- Shorten dry_run tool schema descriptions to "Execute in preview mode"
  instead of detailed simulation language that steered LLM behavior
- Update tests to assert dry-run output matches real execution format

The is_dry_run field is preserved in the response model so the frontend
can still show simulation badges via the SSE tool output event.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 08:41:49 +00:00
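
One way to picture the split this commit describes: the text that reaches the LLM is identical in both modes, while the flag travels in a separate field for the frontend. The function names and the dict shape are assumptions; only the message wording comes from the commit.

```python
def block_result_sketch(block_name, *, dry_run):
    # Same message in both modes, so the model cannot tell them apart.
    return {
        "message": f"Block '{block_name}' executed successfully",
        "is_dry_run": dry_run,  # read by the UI only, never by the LLM
    }


def llm_visible_text_sketch(result):
    # Only the message is forwarded into the conversation context.
    return result["message"]
```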
Zamil Majdy
4a581ef3d7 fix(copilot): add explicit dry_run= to all callers after making it mandatory
All callers of RunAgentTool.execute(), ChatSession.new(), and
create_chat_session() now pass dry_run= explicitly, fixing the
"Field required" validation errors in run_agent_test.py and other
test files.
2026-03-27 12:12:18 +07:00
Zamil Majdy
c10389a88c refactor(copilot): make dry_run a mandatory parameter in tool signatures
Remove `= False` default from dry_run in _execute, execute_block,
prepare_block_for_execution, RunAgentInput, and _run_agent so callers
must always pass it explicitly. This prevents ambiguity about which
execution mode is active.

- Add dry_run to "required" in RunBlockTool and RunAgentTool JSON schemas
- Make RunBlockTool._execute params keyword-only (after *) for pyright compat
- Update all test call sites to pass dry_run=True or dry_run=False explicitly
- Leave ChatSessionMetadata.dry_run default intact (DB model)
2026-03-27 11:57:58 +07:00
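
A small sketch of what a mandatory keyword-only `dry_run` means for callers. `execute_block_sketch` is a hypothetical function, not the real `execute_block`; the point is the `*` in the signature and the missing default.

```python
def execute_block_sketch(block_id, *, dry_run):
    # No default: every caller must state the execution mode explicitly.
    mode = "simulated" if dry_run else "real"
    return f"{block_id}:{mode}"
```

Omitting `dry_run`, or passing it positionally, raises `TypeError` at call time, and a static checker such as pyright can flag the same mistake earlier.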
Zamil Majdy
880d304e9e fix(copilot): handle None in folder tool .strip() calls
When the LLM passes None for name/folder_id parameters, .strip()
raises AttributeError. Coalesce to empty string before stripping
so the existing validation returns the proper ErrorResponse.
2026-03-27 11:18:46 +07:00
Zamil Majdy
88364fada7 fix(copilot): document metadata immutability and remove redundant or-guard
- Add docstring to update_chat_session clarifying that metadata is
  intentionally omitted (set once at creation, immutable thereafter)
- Remove redundant `or {}` in model.py since _parse_json_field already
  returns {} as default
2026-03-27 11:05:47 +07:00
Zamil Majdy
e8ce06c92f fix(copilot): align test docstring with narrowed dry_run scope wording 2026-03-27 10:44:38 +07:00
Zamil Majdy
e680738074 fix(copilot): fix broken migration and narrow dry_run scope in docs
- Fix migration SQL: remove references to nonexistent dryRun column
  (steps 2-3 referenced a column that was never created in dev)
- Rename migration directory to add_metadata_to_chat_session
- Narrow "ALL tool calls" to "run_block and run_agent" in descriptions,
  docstrings, and docs to accurately reflect what session dry_run covers
- Simplify baseline service context propagation comment
2026-03-27 10:40:55 +07:00
Zamil Majdy
96554732ee refactor(copilot): remove _session_dry_run ContextVar; use session.dry_run directly
session.dry_run (via ChatSession.metadata) is the single source of truth.
Remove the redundant _session_dry_run ContextVar, is_session_dry_run()
helper, and dry_run parameter from set_execution_context(). Tool handlers
now read session.dry_run directly.
2026-03-27 10:04:13 +07:00
Zamil Majdy
3d8c040e33 refactor(copilot): replace **kwargs with typed parameters in tool _execute methods
Convert all copilot tool `_execute` methods from duck-typed `**kwargs` to
proper typed parameter signatures. This makes each tool's interface explicit,
enables IDE autocompletion and static analysis, and eliminates silent typo
bugs from `kwargs.get("wrong_name")`.

Three tools intentionally retain **kwargs:
- AddUnderstandingTool: parameters derived dynamically from Pydantic model
- RunAgentTool/AgentOutputTool: delegates to Pydantic model with cross-field
  validators (RunAgentInput/AgentOutputInput)

The BaseTool.execute() dispatcher keeps **kwargs since it forwards unknown
params to each tool's typed _execute signature.
2026-03-27 09:59:07 +07:00
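
The typo problem mentioned in this commit can be shown with two hypothetical tools: one that reads from `**kwargs` and one with a typed signature behind a `**kwargs` dispatcher. The class and parameter names are invented for illustration.

```python
class KwargsToolSketch:
    def _execute(self, **kwargs):
        # A misspelled key silently falls back to the default.
        return kwargs.get("query", "<empty>")


class TypedToolSketch:
    def execute(self, **kwargs):
        # Dispatcher keeps **kwargs and forwards to the typed signature.
        return self._execute(**kwargs)

    def _execute(self, *, query: str, limit: int = 5) -> str:
        return f"{query}:{limit}"
```

With the typed version, `execute(qeury="x")` fails loudly with `TypeError` instead of quietly running with an empty query.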
Zamil Majdy
1e9a8cb072 refactor(copilot): replace dryRun Boolean column with metadata Json column
Replace the single-purpose `dryRun Boolean` column on `AgentChatSession`
with an extensible `metadata Json` column backed by a typed Pydantic model
(`ChatSessionMetadata`). This avoids DB migrations for future session-level
flags — new fields can be added to the model with defaults.

- Schema: `dryRun Boolean @default(false)` -> `metadata Json @default("{}")`
- Model: `ChatSessionMetadata(dry_run=False)` with convenience property
- API: `SessionDetailResponse.dry_run` -> `.metadata`
- Migration: data-preserving (migrates existing dryRun=true rows)
- All existing `session.dry_run` reads still work via the property
2026-03-27 09:41:25 +07:00
Zamil Majdy
40f6087f52 fix(copilot): expose dry_run in SessionDetailResponse for session reload
Add dry_run field to SessionDetailResponse and populate it in
get_session() so the frontend can know if a session is in dry-run mode
on page reload or reconnection.
2026-03-27 01:04:04 +07:00
Zamil Majdy
1d24765b8e Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT into feat/autopilot-dry-run-flag 2026-03-26 23:38:54 +07:00
Zamil Majdy
4b47c6bef2 fix(copilot): propagate session dry_run context in baseline service path
The baseline LLM fallback path (OpenAI-compatible) was not calling
set_execution_context(), so the _session_dry_run ContextVar was never
set. This meant tools executed via the baseline path would ignore the
session's dry_run flag and perform real API calls.

Add set_execution_context() call before tool execution begins, matching
the SDK path behavior.
2026-03-26 21:23:09 +07:00
Zamil Majdy
cfd88cc7ba fix(copilot): block scheduling in dry-run sessions and isolate test ContextVar state
- Return an error when scheduling (cron/schedule_name) is attempted in a
  dry-run session, since schedules create real side effects that cannot
  be simulated.
- Add proper ContextVar token save/restore in TestSetExecutionContextDryRun
  tests to prevent state leakage between test methods.
- Add test for the new scheduling-blocked-in-dry-run behavior.
2026-03-26 21:03:42 +07:00
Zamil Majdy
456f46b5a6 fix(backend): remove platform schema prefix from migration and sync block docs
- Migration SQL referenced "platform"."ChatSession" which fails in CI
  where the platform schema doesn't exist. Use unqualified table name.
- Regenerate block docs after adding dry_run to AutoPilot block input.
2026-03-26 20:47:25 +07:00
Zamil Majdy
951c9e6d29 fix: address PR review comments for session-level dry_run
- Pass dry_run=session.dry_run in _save_session_to_db fallback path
- Document that dry_run only applies to new sessions in field description
- Tighten test assertion to check kwargs directly
2026-03-26 20:39:56 +07:00
Zamil Majdy
7868080817 feat(copilot): add session-level dry_run flag to autopilot sessions
When dry_run is enabled at the session level, ALL tool calls to run_block
and run_agent are forced to use dry-run simulation mode, regardless of
the individual tool call's dry_run parameter. This enables testing agent
wiring end-to-end without real API calls, side effects, or credit charges.

Changes:
- Add dryRun field to ChatSession Prisma schema with migration
- Add dry_run field to ChatSession/ChatSessionInfo Python models
- Add _session_dry_run ContextVar in copilot context for propagation
- Update set_execution_context to accept and propagate dry_run flag
- Update AutoPilot block Input schema with dry_run parameter
- Update run_block tool to check session-level dry_run ContextVar
- Update run_agent tool to check session-level dry_run ContextVar
- Add dry_run support to chat session creation API endpoint
- Update frontend to pass body to create session mutation
- Add comprehensive tests for session-level dry_run propagation
2026-03-26 20:34:46 +07:00
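
The override rule in this commit ("forced to use dry-run simulation mode, regardless of the individual tool call's dry_run parameter") reduces to a logical OR. The helper name is hypothetical, and the assumption is that a per-call `dry_run=True` still applies in a non-dry-run session.

```python
def effective_dry_run_sketch(session_dry_run: bool, call_dry_run: bool) -> bool:
    # The session flag wins whenever it is set; otherwise the
    # per-call flag decides.
    return session_dry_run or call_dry_run
```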
