Compare commits

...

22 Commits

Author SHA1 Message Date
Zamil Majdy
f1f639a4f7 chore: merge dev into fix/copilot-static-system-prompt
Resolve conflict in copilot/sdk/service.py — take incoming imports
(TranscriptDownload, cli_session_path, extract_context_messages,
projects_base, strip_for_upload) and graphiti turn-scoped assistant
message extraction from dev.
2026-04-16 22:38:57 +07:00
majdyz
eff087ec4c fix(backend/copilot): replace pyright ignore with _SystemPromptPreset TypedDict subclass
sdk 0.1.58's SystemPromptPreset TypedDict doesn't declare exclude_dynamic_sections
(added in 0.1.59). Introduce a local _SystemPromptPreset subclass that adds the
field as NotRequired[bool] instead of masking the type error with a pyright ignore.
2026-04-15 20:25:05 +07:00
majdyz
6279477bf7 style(backend/copilot): add isort skip_file guard to sdk/service.py
Prevents isort from converting double-dot relative imports back to
absolute imports on future format runs, which would re-introduce the
Pyright type-collision fixed in the previous commit.
2026-04-15 20:11:16 +07:00
majdyz
ab5b96f35e fix(backend/copilot): unify relative imports in sdk/service.py to fix Pyright type collision
Mixed absolute (backend.copilot.*) and relative (..*) imports for the same
modules caused Pyright to treat them as distinct types, producing errors like
"response_model.StreamBaseResponse is not assignable to
backend.copilot.response_model.StreamBaseResponse". Convert all backend.copilot.*
top-level and inline imports to relative imports for consistency.
2026-04-15 20:08:20 +07:00
majdyz
08352cff21 fix(backend/copilot): suppress Pyright false-positive on SystemPromptPreset TypedDict init
TypedDict keyword-arg construction is valid Python but Pyright misreports it
as a call-issue. Add pyright: ignore[reportCallIssue] to silence the noise.
2026-04-15 20:06:37 +07:00
majdyz
e1d60ab2c6 style(backend/copilot): reformat prompt_cache_test.py per black 2026-04-15 20:02:24 +07:00
majdyz
951c89699b fix(backend/copilot): document <env_context> tag in system prompt
Sentry flagged LOW severity: the newly-introduced <env_context> tag was
not documented in _CACHEABLE_SYSTEM_PROMPT, unlike <user_context> and
<memory_context>. Add a sentence instructing the LLM to treat its
contents as the trusted real working directory, and add a corresponding
smoke-test to prompt_cache_test.py.
2026-04-15 19:59:15 +07:00
majdyz
2fcea229cb perf(backend/copilot): memoize _get_cloud_sandbox_supplement with @cache
Apply the same @cache pattern used by get_sdk_supplement to the E2B path
_get_cloud_sandbox_supplement so its result is computed once per process
lifetime rather than on every SDK turn invocation.
2026-04-15 19:57:17 +07:00
majdyz
6f540fc3a9 test(backend/copilot): add warm_ctx resume-skip and None-coalescing tests
Cover the two gaps flagged in the autogpt-pr-reviewer review:
- no_user_message_in_session_returns_none: verifies inject_user_context
  returns None when session_messages contains no user-role message,
  mirroring the has_history=True / resume path in stream_chat_completion_sdk
- none_warm_ctx_coalesces_to_empty: verifies empty warm_ctx omits the
  <memory_context> block (covers fetch_warm_context returning None → "")
2026-04-15 19:55:29 +07:00
majdyz
e3d5a89e20 fix(backend/copilot): sanitize <env_context> tags and inject after sanitization
- extend sanitize_user_supplied_context to strip <env_context> blocks
  and lone tags, preventing users from spoofing working-directory context
- move <env_context> injection from before sanitization to after, by
  passing env_ctx to inject_user_context (same pattern as warm_ctx /
  memory_context) — ensures server-injected block is never stripped
- add tests for env_context stripping and inject_user_context env_ctx param

Addresses sentry bot predictions (13341646, 13342337).
2026-04-15 19:40:34 +07:00
majdyz
6a67336556 Merge remote-tracking branch 'origin/dev' into fix/copilot-static-system-prompt 2026-04-15 19:35:45 +07:00
Zamil Majdy
a91182c7ce Merge branch 'dev' of https://github.com/Significant-Gravitas/AutoGPT into fix/copilot-static-system-prompt 2026-04-15 19:01:57 +07:00
Zamil Majdy
408c67f642 fix(backend/copilot): use anchored prefix stripping in strip_injected_context_for_display
The previous implementation delegated to sanitize_user_supplied_context
which uses greedy "anywhere" regexes.  This broke existing tests that
verify mid-message <user_context> tags are preserved, and would also
mangle user text that happened to include these strings.

Add _MEMORY_CONTEXT_PREFIX_RE and _ENV_CONTEXT_PREFIX_RE anchored regexes.
Rewrite strip_injected_context_for_display to iteratively strip leading
injected blocks using anchored regexes, handling any permutation of the
three block types at the start of a stored message without touching
mid-message content.
2026-04-15 18:46:20 +07:00
majdyz
e3f5afa7c1 fix(backend/copilot): strip all injected context blocks in chat history API
The previous strip_user_context_prefix used an anchored ^<user_context>
regex that silently failed when messages started with <memory_context> or
<env_context>, leaking server-injected blocks to API consumers.

Add strip_injected_context_for_display that removes all three injected
block types (<memory_context>, <env_context>, <user_context>) regardless
of order, and use it in the chat-history GET endpoint.
2026-04-15 18:32:33 +07:00
majdyz
f2768ce7f7 fix(backend/copilot): prepend <memory_context> after sanitization to prevent stripping
The warm_ctx block was prepended before inject_user_context, which internally
calls sanitize_user_supplied_context — stripping all <memory_context> tags to
prevent user spoofing. This silently dropped Graphiti warm context on every
first turn.

Fix: pass warm_ctx as a parameter to inject_user_context and prepend the
<memory_context> block AFTER sanitization, so the server-trusted content is
never stripped. Adds 4 unit tests covering: prepend on first turn, empty
warm_ctx omitted, survival through sanitizer, and a round-trip contract test
binding injection format to the stripping regex.
2026-04-15 18:13:53 +07:00
majdyz
799ff49932 fix(backend/copilot): document <memory_context> tag in system prompt 2026-04-15 15:32:52 +07:00
majdyz
6c89188e15 refactor(backend/copilot): use @cache decorator + inject real cwd as env_context
Replace the manual _LOCAL_STORAGE_SUPPLEMENT global + if-None pattern with
functools.cache on get_sdk_supplement — cleaner, same behaviour, and the
importlib.reload in tests resets the cache naturally.

Inject the actual sdk_cwd into the first user message as <env_context> so
the model always knows its real working directory. The system prompt keeps
the static /tmp/copilot-<session-id> placeholder for cross-user prompt
cache hits; the dynamic path goes in the user-message context instead.
2026-04-15 15:11:14 +07:00
majdyz
e0a8bc5224 fix(backend/copilot): reformat assert statements to pass black lint 2026-04-15 15:05:46 +07:00
majdyz
c68c7d7560 refactor(backend/copilot): remove unused cwd param from get_sdk_supplement
The cwd argument was already ignored (del cwd on entry) — kept only for
call-site compatibility. Remove it entirely and update all three call sites:
sdk/service.py, sdk/service_test.py, and test/copilot/dry_run_loop_test.py.
2026-04-15 15:01:11 +07:00
majdyz
dae497272e fix(backend/copilot): sanitize memory_context tags from user input and test static supplement
- Extend sanitize_user_supplied_context to strip <memory_context> blocks in
  addition to <user_context>, preventing context-spoofing via the new tag
  introduced for Graphiti warm context injection
- Add MEMORY_CONTEXT_TAG constant and matching regexes to service.py
- Add TestStripUserContextTags tests for memory_context stripping
- Add TestGetSdkSupplementStaticPlaceholder to verify the local-mode
  supplement uses a static placeholder path (not a session UUID) and
  returns identical output regardless of cwd argument
2026-04-15 14:51:35 +07:00
majdyz
0279bab8d4 Merge branch 'dev' of https://github.com/Significant-Gravitas/AutoGPT into fix/copilot-static-system-prompt 2026-04-15 14:48:24 +07:00
majdyz
1b67d4cf8a fix(backend/copilot): make system prompt fully static for cross-user prompt caching
The system prompt was not cacheable across sessions or users due to two
sources of per-session dynamic content leaking into it:

1. sdk_cwd (/tmp/copilot-<uuid>) was embedded in the storage supplement
   via get_sdk_supplement(cwd=sdk_cwd). Every session has a unique UUID,
   making the system prompt unique per session — cache miss every first
   message.

2. Graphiti warm_ctx (user-specific memory facts) was appended directly
   to the system prompt on the first turn, making it unique per user per
   turn.

Fix both by keeping the system prompt fully static:
- get_sdk_supplement now ignores cwd and uses the constant placeholder
  "/tmp/copilot-<session-id>" in the supplement text. The actual cwd is
  still passed to ClaudeAgentOptions.cwd so the subprocess uses the right
  directory.
- warm_ctx is now injected into the first user message as a trusted
  <memory_context> block (before inject_user_context runs), so it is
  persisted to DB alongside the <user_context> prefix and replayed
  correctly on --resume without re-fetching.

After this change all users share the same system prompt text — one
cache write globally per model, then cache reads for everyone.
2026-04-15 14:38:39 +07:00

Diff Content Not Available