Compare commits

..

5 Commits

Author SHA1 Message Date
Zamil Majdy
3cfe2f384a fix(copilot): remove redundant success-path transcript upload
The success path always uploaded the resume file (old downloaded data),
then the finally block overwrote with the stop hook (new turn data).
With always-upload, this caused the smaller stop hook to overwrite
larger (but stale) data from the resume file. Remove the success path
upload — the finally block handles it correctly by preferring stop hook
content and falling back to the resume file when empty.
2026-03-06 12:45:12 +07:00
Zamil Majdy
fea6711ae7 fix(copilot): always upload transcript instead of size-based skip
The size comparison (existing >= new) prevented transcript uploads when
the CLI compacted old tool results via --resume, causing the stored
transcript to become permanently stale. Since the executor holds a
cluster lock per session, concurrent uploads cannot race — just always
overwrite.
2026-03-06 02:16:36 +07:00
Zamil Majdy
21c705af6e fix(backend/copilot): prevent title update from overwriting session messages (#12302)
### Changes 🏗️

Fixes a race condition in `update_session_title()` where the background
title generation task could overwrite the Redis session cache with a
stale snapshot, causing the copilot to "forget" its previous turns.

**Root cause:** `update_session_title()` performs a read-modify-write on
the Redis cache (read full session → set title → write back). Meanwhile,
`upsert_chat_session()` writes a newer version with more messages during
streaming. If the title task reads early (e.g., 34 messages) and writes
late (after streaming persisted 101 messages), the stale 34-message
version overwrites the 101-message version. When the next message lands
on a different pod, it loads the stale session from Redis.

**Fix:** Replace the read-modify-write with a simple cache invalidation
(`invalidate_session_cache`). The title is already updated in the DB;
the next access just reloads from DB with the correct title and
messages. No locks, no deserialization of the full session blob, no risk
of stale overwrites.

**Evidence from prod logs (session `41a3814c`):**
- Pod `tm2jb` persisted session with 101 messages
- Pod `phflm` loaded session from Redis cache with only 35 messages (66
messages lost)
- The title background task ran between these events, overwriting the
cache

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] `poetry run pytest backend/copilot/model_test.py` — 15/15 pass
  - [x] All pre-commit hooks pass (ruff, black, isort, pyright)
- [ ] After deploy: verify long sessions no longer lose context on
multi-pod setups
2026-03-05 18:49:41 +00:00
Zamil Majdy
a576be9db2 fix(backend): install agent-browser + Chromium in Docker image (#12301)
The Copilot browser tool (`browser_navigate`, `browser_act`,
`browser_screenshot`) has been broken on dev because `agent-browser` CLI
+ Chromium were never installed in the backend Docker image.

### Changes 🏗️

- Added `npx playwright install-deps chromium` to install Chromium
runtime libraries (libnss3, libatk, etc.)
- Added `npm install -g agent-browser` to install the CLI
- Added `agent-browser install` to download the Chromium binary
- Layer is placed after existing COPY-from-builder lines to preserve
Docker cache ordering

### Root cause

Every `browser_navigate` call fails with:
```
WARNING  [browser_navigate] open failed for <url>: agent-browser is not installed
(run: npm install -g agent-browser && agent-browser install).
```
The error originates from `FileNotFoundError` in `agent_browser.py:101`
when the subprocess tries to execute the `agent-browser` binary which
doesn't exist in the container.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified `agent-browser` binary is missing from current dev pod
via `kubectl logs`
- [x] Confirmed session `01eeac29-5a7` shows repeated failures for all
URLs
- [ ] After deploy: verify browser_navigate works in a Copilot session
on dev

#### For configuration changes:
- [x] `.env.default` is updated or already compatible with my changes
- [x] `docker-compose.yml` is updated or already compatible with my
changes
- [x] I have included a list of my configuration changes in the PR
description (under **Changes**)
2026-03-05 18:44:55 +00:00
dependabot[bot]
5e90585f10 chore(deps): bump crazy-max/ghaction-github-runtime from 3 to 4 (#12262)
Bumps
[crazy-max/ghaction-github-runtime](https://github.com/crazy-max/ghaction-github-runtime)
from 3 to 4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/crazy-max/ghaction-github-runtime/releases">crazy-max/ghaction-github-runtime's
releases</a>.</em></p>
<blockquote>
<h2>v3.1.0</h2>
<ul>
<li>Bump <code>@​actions/core</code> from 1.10.0 to 1.11.1 in <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/pull/58">crazy-max/ghaction-github-runtime#58</a></li>
<li>Bump braces from 3.0.2 to 3.0.3 in <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/pull/54">crazy-max/ghaction-github-runtime#54</a></li>
<li>Bump cross-spawn from 7.0.3 to 7.0.6 in <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/pull/59">crazy-max/ghaction-github-runtime#59</a></li>
<li>Bump ip from 2.0.0 to 2.0.1 in <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/pull/50">crazy-max/ghaction-github-runtime#50</a></li>
<li>Bump micromatch from 4.0.5 to 4.0.8 in <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/pull/55">crazy-max/ghaction-github-runtime#55</a></li>
<li>Bump tar from 6.1.14 to 6.2.1 in <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/pull/51">crazy-max/ghaction-github-runtime#51</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/crazy-max/ghaction-github-runtime/compare/v3.0.0...v3.1.0">https://github.com/crazy-max/ghaction-github-runtime/compare/v3.0.0...v3.1.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="04d248b846"><code>04d248b</code></a>
Merge pull request <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/issues/76">#76</a>
from crazy-max/node24</li>
<li><a
href="c8f8e4e4e2"><code>c8f8e4e</code></a>
node 24 as default runtime</li>
<li><a
href="494a382acb"><code>494a382</code></a>
Merge pull request <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/issues/68">#68</a>
from crazy-max/dependabot/npm_and_yarn/actions/core-2.0.1</li>
<li><a
href="5d51b8ef32"><code>5d51b8e</code></a>
Merge pull request <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/issues/74">#74</a>
from crazy-max/dependabot/npm_and_yarn/minimatch-3.1.5</li>
<li><a
href="f7077dccce"><code>f7077dc</code></a>
chore: update generated content</li>
<li><a
href="4d1e03547a"><code>4d1e035</code></a>
chore(deps): bump minimatch from 3.1.2 to 3.1.5</li>
<li><a
href="b59d56d5bc"><code>b59d56d</code></a>
chore(deps): bump <code>@​actions/core</code> from 1.11.1 to 2.0.1</li>
<li><a
href="6d0e2ef281"><code>6d0e2ef</code></a>
Merge pull request <a
href="https://redirect.github.com/crazy-max/ghaction-github-runtime/issues/75">#75</a>
from crazy-max/esm</li>
<li><a
href="41d6f6acdb"><code>41d6f6a</code></a>
remove codecov config</li>
<li><a
href="b5018eca65"><code>b5018ec</code></a>
chore: update generated content</li>
<li>Additional commits viewable in <a
href="https://github.com/crazy-max/ghaction-github-runtime/compare/v3...v4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=crazy-max/ghaction-github-runtime&package-manager=github_actions&previous-version=3&new-version=4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-03-05 15:59:06 +00:00
7 changed files with 47 additions and 93 deletions

View File

@@ -139,7 +139,7 @@ jobs:
- name: Upload logs to artifact
if: always()
uses: actions/upload-artifact@v7
uses: actions/upload-artifact@v4
with:
name: test-logs
path: classic/original_autogpt/logs/

View File

@@ -237,7 +237,7 @@ jobs:
- name: Upload logs to artifact
if: always()
uses: actions/upload-artifact@v7
uses: actions/upload-artifact@v4
with:
name: test-logs
path: classic/forge/logs/

View File

@@ -149,7 +149,7 @@ jobs:
driver-opts: network=host
- name: Set up Platform - Expose GHA cache to docker buildx CLI
uses: crazy-max/ghaction-github-runtime@v3
uses: crazy-max/ghaction-github-runtime@v4
- name: Set up Platform - Build Docker images (with cache)
working-directory: autogpt_platform
@@ -269,7 +269,7 @@ jobs:
- name: Upload Playwright report
if: always()
uses: actions/upload-artifact@v7
uses: actions/upload-artifact@v4
with:
name: playwright-report
path: playwright-report
@@ -278,7 +278,7 @@ jobs:
- name: Upload Playwright test results
if: always()
uses: actions/upload-artifact@v7
uses: actions/upload-artifact@v4
with:
name: playwright-test-results
path: test-results

View File

@@ -111,13 +111,29 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
# Copy poetry (build-time only, for `poetry install --only-root` to create entry points)
COPY --from=builder /usr/local/lib/python3* /usr/local/lib/python3*
COPY --from=builder /usr/local/bin/poetry /usr/local/bin/poetry
# Copy Node.js installation for Prisma
# Copy Node.js installation for Prisma and agent-browser.
# npm/npx are symlinks in the builder (-> ../lib/node_modules/npm/bin/*-cli.js);
# COPY resolves them to regular files, breaking require() paths. Recreate as
# proper symlinks so npm/npx can find their modules.
COPY --from=builder /usr/bin/node /usr/bin/node
COPY --from=builder /usr/lib/node_modules /usr/lib/node_modules
COPY --from=builder /usr/bin/npm /usr/bin/npm
COPY --from=builder /usr/bin/npx /usr/bin/npx
RUN ln -s ../lib/node_modules/npm/bin/npm-cli.js /usr/bin/npm \
&& ln -s ../lib/node_modules/npm/bin/npx-cli.js /usr/bin/npx
COPY --from=builder /root/.cache/prisma-python/binaries /root/.cache/prisma-python/binaries
# Install agent-browser (Copilot browser tool) + Chromium runtime dependencies.
# These are the runtime libraries Chromium/Playwright needs on Debian 13 (trixie).
RUN apt-get update && apt-get install -y --no-install-recommends \
libnss3 libnspr4 libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 \
libdbus-1-3 libxkbcommon0 libatspi2.0-0t64 libxcomposite1 libxdamage1 \
libxfixes3 libxrandr2 libgbm1 libasound2t64 libpango-1.0-0 libcairo2 \
libx11-6 libx11-xcb1 libxcb1 libxext6 libglib2.0-0t64 \
fonts-liberation libfontconfig1 \
&& rm -rf /var/lib/apt/lists/* \
&& npm install -g agent-browser \
&& agent-browser install \
&& rm -rf /tmp/* /root/.npm
WORKDIR /app/autogpt_platform/backend
# Copy only the .venv from builder (not the entire /app directory)

View File

@@ -705,19 +705,10 @@ async def update_session_title(session_id: str, title: str) -> bool:
logger.warning(f"Session {session_id} not found for title update")
return False
# Update title in cache if it exists (instead of invalidating).
# This prevents race conditions where cache invalidation causes
# the frontend to see stale DB data while streaming is still in progress.
try:
cached = await _get_session_from_cache(session_id)
if cached:
cached.title = title
await cache_chat_session(cached)
except Exception as e:
# Not critical - title will be correct on next full cache refresh
logger.warning(
f"Failed to update title in cache for session {session_id}: {e}"
)
# Invalidate the cache so the next access reloads from DB with the
# updated title. This avoids a read-modify-write on the full session
# blob, which could overwrite concurrent message updates.
await invalidate_session_cache(session_id)
return True
except Exception as e:

View File

@@ -1408,46 +1408,11 @@ async def stream_chat_completion_sdk(
) and not has_appended_assistant:
session.messages.append(assistant_response)
# --- Upload transcript for next-turn --resume ---
# After async with the SDK task group has exited, so the Stop
# hook has already fired and the CLI has been SIGTERMed. The
# CLI uses appendFileSync, so all writes are safely on disk.
if config.claude_agent_use_resume and user_id:
# With --resume the CLI appends to the resume file (most
# complete). Otherwise use the Stop hook path.
if use_resume and resume_file:
raw_transcript = read_transcript_file(resume_file)
logger.debug("[SDK] Transcript source: resume file")
elif captured_transcript.path:
raw_transcript = read_transcript_file(captured_transcript.path)
logger.debug(
"[SDK] Transcript source: stop hook (%s), read result: %s",
captured_transcript.path,
f"{len(raw_transcript)}B" if raw_transcript else "None",
)
else:
raw_transcript = None
if not raw_transcript:
logger.debug(
"[SDK] No usable transcript — CLI file had no "
"conversation entries (expected for first turn "
"without --resume)"
)
if raw_transcript:
# Shield the upload from generator cancellation so a
# client disconnect / page refresh doesn't lose the
# transcript. The upload must finish even if the SSE
# connection is torn down.
await asyncio.shield(
_try_upload_transcript(
user_id,
session_id,
raw_transcript,
message_count=len(session.messages),
)
)
# Transcript upload is handled in the finally block below — it
# correctly prefers the stop hook content (new turn data) over the
# resume file (old downloaded data). Uploading here would write
# stale data that the finally block then overwrites with potentially
# smaller (but newer) stop hook content.
logger.info(
"[SDK] [%s] Stream completed successfully with %d messages",

View File

@@ -331,10 +331,10 @@ async def upload_transcript(
) -> None:
"""Strip progress entries and upload transcript to bucket storage.
Safety: only overwrites when the new (stripped) transcript is larger than
what is already stored. Since JSONL is append-only, the latest transcript
is always the longest. This prevents a slow/stale background task from
clobbering a newer upload from a concurrent turn.
The executor holds a cluster lock per session, so concurrent uploads for
the same session cannot happen. We always overwrite — with ``--resume``
the CLI may compact old tool results, so neither byte size nor line count
is a reliable proxy for "newer".
Args:
message_count: ``len(session.messages)`` at upload time — used by
@@ -353,33 +353,16 @@ async def upload_transcript(
storage = await get_workspace_storage()
wid, fid, fname = _storage_path_parts(user_id, session_id)
encoded = stripped.encode("utf-8")
new_size = len(encoded)
# Check existing transcript size to avoid overwriting newer with older
path = _build_storage_path(user_id, session_id, storage)
content_skipped = False
try:
existing = await storage.retrieve(path)
if len(existing) >= new_size:
logger.info(
f"[Transcript] Skipping content upload — existing ({len(existing)}B) "
f">= new ({new_size}B) for session {session_id}"
)
content_skipped = True
except (FileNotFoundError, Exception):
pass # No existing transcript or retrieval error — proceed with upload
await storage.store(
workspace_id=wid,
file_id=fid,
filename=fname,
content=encoded,
)
if not content_skipped:
await storage.store(
workspace_id=wid,
file_id=fid,
filename=fname,
content=encoded,
)
# Always update metadata (even when content is skipped) so message_count
# stays current. The gap-fill logic in _build_query_message relies on
# message_count to avoid re-compressing the same messages every turn.
# Update metadata so message_count stays current. The gap-fill logic
# in _build_query_message relies on it to avoid re-compressing messages.
try:
meta = {"message_count": message_count, "uploaded_at": time.time()}
mwid, mfid, mfname = _meta_storage_path_parts(user_id, session_id)
@@ -393,9 +376,8 @@ async def upload_transcript(
logger.warning(f"[Transcript] Failed to write metadata for {session_id}: {e}")
logger.info(
f"[Transcript] Uploaded {new_size}B "
f"(stripped from {len(content)}B, msg_count={message_count}, "
f"content_skipped={content_skipped}) "
f"[Transcript] Uploaded {len(encoded)}B "
f"(stripped from {len(content)}B, msg_count={message_count}) "
f"for session {session_id}"
)