Compare commits


7 Commits

Author SHA1 Message Date
Zamil Majdy
2b4727e8b2 chore: merge master into dev, resolve baseline/transcript conflicts
Conflicts in baseline/service.py, baseline/transcript_integration_test.py,
and transcript.py arose because dev-only commit 0cd0a76305
(baseline upload fix) overlapped with the same fix in PR #12804 which
landed in master. Took master's version for all three files — it is the
complete, reviewed implementation.
2026-04-16 15:38:46 +07:00
Zamil Majdy
0cd0a76305 fix(backend/copilot): baseline always uploads when GCS has no transcript
_load_prior_transcript was returning False for missing/invalid transcripts,
which caused should_upload_transcript to suppress the upload. The original
intent was to protect against overwriting a *newer* GCS version — but a
missing or corrupt file is not 'newer'. Only a stale transcript (GCS
watermark ahead) and download errors (unknown GCS state) should suppress
the upload.

Also renames transcript_covers_prefix → transcript_upload_safe throughout
to accurately describe what the flag means.
2026-04-16 14:58:42 +07:00
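
To make the suppression rule concrete, here is a minimal sketch of the logic
this message describes. The enum and function shapes are illustrative
assumptions, not the repo's actual signatures; only _load_prior_transcript
and should_upload_transcript are named in the commit.

    from enum import Enum, auto

    class PriorTranscript(Enum):
        MISSING = auto()         # no transcript in GCS, or unparseable contents
        STALE = auto()           # GCS watermark ahead of ours (genuinely newer)
        DOWNLOAD_ERROR = auto()  # GCS state unknown; don't risk overwriting
        OK = auto()              # prior version loaded and not newer than ours

    def should_upload_transcript(prior: PriorTranscript) -> bool:
        # The old bug: MISSING was treated like STALE, so a session whose GCS
        # transcript was absent or corrupt never uploaded at all.
        return prior not in (PriorTranscript.STALE, PriorTranscript.DOWNLOAD_ERROR)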
chernistry
bd2efed080 fix(frontend): allow zooming out more in the builder (#12690)
Reduced minZoom on the builder canvas from 0.1 to 0.05 to allow zooming
out further when working with large agent graphs.

Fixes #9325

Co-authored-by: Nicholas Tindle <nicholas.tindle@agpt.co>
2026-04-15 21:25:07 +00:00
Zamil Majdy
5fccd8a762 Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev 2026-04-16 01:23:07 +07:00
Zamil Majdy
d27d22159d Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev 2026-04-16 00:05:32 +07:00
Zamil Majdy
df205b5444 fix(backend/copilot): strip CLI session file to prevent auto-compaction context loss
The Claude Code CLI auto-compacts its native session JSONL when the context
approaches the model's token limit (~200K for Sonnet).  After compaction the
detailed conversation history is replaced by a ~27K-token summary, causing
the silent context loss users see as memory failures in long sessions.

Root cause identified from production logs for session 93ecf7c9:
- T6 CLI session: 233KB / ~207K tokens (near Sonnet limit)
- T7 CLI compacted session -> ~167KB / ~47K tokens (PreCompact hook missed)
- T12 second compaction -> ~176KB / ~27K tokens (just system prompt + summary)
- T14-T21: cache_read=26714 constantly -- only system prompt visible to Claude

The same stripping we already apply to our transcript (stale thinking blocks,
progress/metadata entries) now also runs on the CLI native session file.  At
~2x the size of the stripped transcript, unstripped sessions routinely hit the
compaction threshold within 6-10 turns of a heavy Opus/thinking session.
After stripping:
- same-pod turns reuse the stripped local file (no compaction trigger)
- cross-pod turns restore the stripped GCS file (same benefit)
2026-04-15 23:19:12 +07:00
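
For clarity, a hedged sketch of the stripping pass this message describes. The
entry shapes mirror the test fixtures further down this compare, but the helper
itself (strip_cli_session) is an illustrative name, not necessarily the repo's.

    import json

    def strip_cli_session(raw: bytes) -> bytes:
        entries = [json.loads(line) for line in raw.decode("utf-8").splitlines() if line]
        # Progress/metadata entries carry no conversational state: drop them.
        entries = [e for e in entries if e.get("type") != "progress"]
        # Thinking blocks are only needed on the latest assistant turn.
        last_asst = max(
            (i for i, e in enumerate(entries) if e.get("type") == "assistant"),
            default=None,
        )
        for i, e in enumerate(entries):
            if e.get("type") != "assistant" or i == last_asst:
                continue
            content = e.get("message", {}).get("content")
            if isinstance(content, list):
                # Strip stale thinking blocks; keep text/tool blocks intact.
                e["message"]["content"] = [
                    b for b in content if b.get("type") != "thinking"
                ]
        return ("\n".join(json.dumps(e) for e in entries) + "\n").encode("utf-8")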
majdyz
4efa1c4310 fix(copilot): set session_id on mode-switch T1 to enable --resume on subsequent turns
When a user switches from baseline (fast) mode to SDK (extended_thinking)
mode mid-session, the first SDK turn has has_history=True (prior baseline
messages in DB) but no CLI session file in storage.

The old code gated session_id on `not has_history`, so mode-switch T1
never received a session_id — the CLI generated a random ID that wasn't
uploaded under the expected key.  Every subsequent SDK turn would fail to
restore the CLI session and run without --resume, injecting the full
compressed history on each turn, causing model confusion.

Fix: set session_id whenever not using --resume (the `else` branch),
covering T1 fresh, mode-switch T1, and T2+ fallback turns.  The retry
path is updated to use `"session_id" in sdk_options_kwargs` as the
discriminator (instead of `not has_history`) so mode-switch T1 retries
also keep the session_id while T2+ retries (where T1 restored a session
file via restore_cli_session) still remove it to avoid "Session ID
already in use".
2026-04-15 23:19:11 +07:00
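
A hedged reconstruction of the gate; the kwarg names follow the message, while
the wrapper function is assumed for illustration.

    def make_sdk_options(session_id: str, resume_available: bool) -> dict:
        sdk_options_kwargs: dict = {}
        if resume_available:
            # A CLI session file was restored: let --resume carry the history.
            sdk_options_kwargs["resume"] = session_id
        else:
            # Fresh T1, mode-switch T1, and T2+ fallback turns: pin the
            # session_id so the CLI session uploads under the expected key.
            sdk_options_kwargs["session_id"] = session_id
        return sdk_options_kwargs

On retry, `"session_id" in sdk_options_kwargs` then distinguishes the two
branches exactly as the message describes: present means we pinned the ID and
the retry keeps it; absent means a session file was restored and re-sending an
ID would trip "Session ID already in use".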
5 changed files with 202 additions and 56 deletions

View File

@@ -124,15 +124,6 @@ def create_copilot_queue_config() -> RabbitMQConfig:
         vhost="/",
         exchanges=[COPILOT_EXECUTION_EXCHANGE, COPILOT_CANCEL_EXCHANGE],
         queues=[run_queue, cancel_queue],
-        # The consumer threads sit in pika's blocking ``start_consuming()`` for
-        # the full lifetime of the process. If the TCP connection is dropped
-        # (server restart, NAT timeout, laptop sleep) while pika's IO thread is
-        # starved, the socket rots in CLOSE_WAIT and no message is ever
-        # consumed — see zombie-consumer incident notes. A short heartbeat plus
-        # kernel-level TCP keepalive makes both the app and the OS notice a
-        # dead peer within a couple of minutes instead of hours.
-        heartbeat=60,
-        tcp_keepalive=True,
     )

View File

@@ -174,18 +174,14 @@ sandbox so `bash_exec` can access it for further processing.
 The exact sandbox path is shown in the `[Sandbox copy available at ...]` note.
 
 ### GitHub CLI (`gh`) and git
-- To check if the user has their GitHub account already connected, run `gh auth status`. Always check this before running `connect_integration(provider="github")` which will ask the user to connect their GitHub regardless if it's already connected.
+- To check if the user has their GitHub account already connected, run `gh auth status`. Always check this before asking them to connect it.
 - If the user has connected their GitHub account, both `gh` and `git` are
   pre-authenticated — use them directly without any manual login step.
   `git` HTTPS operations (clone, push, pull) work automatically.
 - If the token changes mid-session (e.g. user reconnects with a new token),
   run `gh auth setup-git` to re-register the credential helper.
-- **MANDATORY:** You MUST run `gh auth status` before EVER calling
-  `connect_integration(provider="github")`. If it shows `Logged in`,
-  proceed directly — no integration connection needed. Never skip this check.
-- If `gh auth status` shows NOT logged in, or `gh`/`git` fails with an
-  authentication error (e.g. "authentication required", "could not read
-  Username", or exit code 128), THEN call
+- If `gh` or `git` fails with an authentication error (e.g. "authentication
+  required", "could not read Username", or exit code 128), call
   `connect_integration(provider="github")` to surface the GitHub credentials
   setup card so the user can connect their account. Once connected, retry
   the operation.

View File

@@ -880,6 +880,202 @@ class TestUploadCliSession:
         assert meta_content["mode"] == "baseline"
         assert meta_content["message_count"] == 4
+
+    def test_strips_session_before_upload_and_writes_back(self, tmp_path):
+        """Strippable entries (progress, thinking blocks) are removed before upload.
+
+        The stripped content is written back to disk (so same-pod turns benefit)
+        and the smaller bytes are uploaded to GCS.
+        """
+        import asyncio
+        import json
+        import os
+        import re
+        from unittest.mock import AsyncMock, patch
+
+        from .transcript import _sanitize_id, upload_cli_session
+
+        projects_base = str(tmp_path)
+        session_id = "12345678-0000-0000-0000-000000000010"
+        sdk_cwd = str(tmp_path)
+        encoded_cwd = re.sub(r"[^a-zA-Z0-9]", "-", os.path.realpath(sdk_cwd))
+        session_dir = tmp_path / encoded_cwd
+        session_dir.mkdir(parents=True, exist_ok=True)
+        session_file = session_dir / f"{_sanitize_id(session_id)}.jsonl"
+
+        # A CLI session with a progress entry (strippable) and a real assistant message.
+        progress_entry = {
+            "type": "progress",
+            "uuid": "p1",
+            "parentUuid": "u1",
+            "data": {"type": "bash_progress", "stdout": "running..."},
+        }
+        user_entry = {
+            "type": "user",
+            "uuid": "u1",
+            "message": {"role": "user", "content": "hello"},
+        }
+        asst_entry = {
+            "type": "assistant",
+            "uuid": "a1",
+            "parentUuid": "u1",
+            "message": {"role": "assistant", "content": "world"},
+        }
+        raw_content = (
+            json.dumps(progress_entry)
+            + "\n"
+            + json.dumps(user_entry)
+            + "\n"
+            + json.dumps(asst_entry)
+            + "\n"
+        )
+        raw_bytes = raw_content.encode("utf-8")
+        session_file.write_bytes(raw_bytes)
+
+        mock_storage = AsyncMock()
+        with (
+            patch(
+                "backend.copilot.transcript._projects_base",
+                return_value=projects_base,
+            ),
+            patch(
+                "backend.copilot.transcript.get_workspace_storage",
+                new_callable=AsyncMock,
+                return_value=mock_storage,
+            ),
+        ):
+            asyncio.run(
+                upload_cli_session(
+                    user_id="user-1",
+                    session_id=session_id,
+                    sdk_cwd=sdk_cwd,
+                )
+            )
+
+        # Upload should have been called with stripped bytes (no progress entry).
+        mock_storage.store.assert_called_once()
+        stored_content: bytes = mock_storage.store.call_args.kwargs["content"]
+        stored_lines = stored_content.decode("utf-8").strip().split("\n")
+        stored_types = [json.loads(line).get("type") for line in stored_lines]
+        assert "progress" not in stored_types
+        assert "user" in stored_types
+        assert "assistant" in stored_types
+        # Stripped bytes should be smaller than raw.
+        assert len(stored_content) < len(raw_bytes)
+        # File on disk should also be the stripped version.
+        disk_content = session_file.read_bytes()
+        assert disk_content == stored_content
+
+    def test_strips_stale_thinking_blocks_before_upload(self, tmp_path):
+        """Thinking blocks in non-last assistant turns are stripped to reduce size."""
+        import asyncio
+        import json
+        import os
+        import re
+        from unittest.mock import AsyncMock, patch
+
+        from .transcript import _sanitize_id, upload_cli_session
+
+        projects_base = str(tmp_path)
+        session_id = "12345678-0000-0000-0000-000000000011"
+        sdk_cwd = str(tmp_path)
+        encoded_cwd = re.sub(r"[^a-zA-Z0-9]", "-", os.path.realpath(sdk_cwd))
+        session_dir = tmp_path / encoded_cwd
+        session_dir.mkdir(parents=True, exist_ok=True)
+        session_file = session_dir / f"{_sanitize_id(session_id)}.jsonl"
+
+        # Two turns: first assistant has a thinking block (stale), second doesn't.
+        u1 = {
+            "type": "user",
+            "uuid": "u1",
+            "message": {"role": "user", "content": "q1"},
+        }
+        a1_with_thinking = {
+            "type": "assistant",
+            "uuid": "a1",
+            "parentUuid": "u1",
+            "message": {
+                "id": "msg_a1",
+                "role": "assistant",
+                "content": [
+                    {"type": "thinking", "thinking": "A" * 5000},
+                    {"type": "text", "text": "answer1"},
+                ],
+            },
+        }
+        u2 = {
+            "type": "user",
+            "uuid": "u2",
+            "parentUuid": "a1",
+            "message": {"role": "user", "content": "q2"},
+        }
+        a2_no_thinking = {
+            "type": "assistant",
+            "uuid": "a2",
+            "parentUuid": "u2",
+            "message": {
+                "id": "msg_a2",
+                "role": "assistant",
+                "content": [{"type": "text", "text": "answer2"}],
+            },
+        }
+        raw_content = (
+            json.dumps(u1)
+            + "\n"
+            + json.dumps(a1_with_thinking)
+            + "\n"
+            + json.dumps(u2)
+            + "\n"
+            + json.dumps(a2_no_thinking)
+            + "\n"
+        )
+        raw_bytes = raw_content.encode("utf-8")
+        session_file.write_bytes(raw_bytes)
+
+        mock_storage = AsyncMock()
+        with (
+            patch(
+                "backend.copilot.transcript._projects_base",
+                return_value=projects_base,
+            ),
+            patch(
+                "backend.copilot.transcript.get_workspace_storage",
+                new_callable=AsyncMock,
+                return_value=mock_storage,
+            ),
+        ):
+            asyncio.run(
+                upload_cli_session(
+                    user_id="user-1",
+                    session_id=session_id,
+                    sdk_cwd=sdk_cwd,
+                )
+            )
+
+        stored_content: bytes = mock_storage.store.call_args.kwargs["content"]
+        stored_lines = stored_content.decode("utf-8").strip().split("\n")
+        # a1 should have its thinking block stripped (it's not the last assistant turn).
+        a1_stored = json.loads(stored_lines[1])
+        a1_content = a1_stored["message"]["content"]
+        assert all(
+            b["type"] != "thinking" for b in a1_content
+        ), "stale thinking block should be stripped from a1"
+        assert any(
+            b["type"] == "text" for b in a1_content
+        ), "text block should be kept in a1"
+        # a2 (last turn) should be unchanged.
+        a2_stored = json.loads(stored_lines[3])
+        assert a2_stored["message"]["content"] == [{"type": "text", "text": "answer2"}]
+        # Stripped bytes smaller than raw.
+        assert len(stored_content) < len(raw_bytes)
 
 
 class TestRestoreCliSession:
     def test_returns_none_when_file_not_found_in_storage(self):

View File

@@ -1,6 +1,5 @@
 import asyncio
 import logging
-import socket
 from abc import ABC, abstractmethod
 from enum import Enum
 from typing import Awaitable, Optional
@@ -43,39 +42,6 @@ CONNECTION_ATTEMPTS = 5
 # Use case: Faster reconnection for long-running executions that need to resume quickly
 RETRY_DELAY = 1
-
-# DEFAULT_HEARTBEAT (300s = 5 min)
-# AMQP application-level heartbeat. Server drops the connection if no heartbeat
-# is seen within ~2x this interval. Consumers that sit in CLOSE_WAIT because
-# pika's IO loop was starved (e.g. laptop sleep, blocking main thread) recover
-# faster with a lower value. See `create_copilot_queue_config` for a case that
-# overrides this.
-DEFAULT_HEARTBEAT = 300
-
-
-def _tcp_keepalive_options() -> dict[str, int]:
-    """Platform-aware TCP keepalive socket options for pika.
-
-    pika enables ``SO_KEEPALIVE`` on every socket by default; this dict tunes
-    how quickly the kernel declares a silent peer dead. Without these knobs,
-    the OS default on Linux is ~2 hours of idle before the first probe — long
-    enough for a half-closed socket to sit in CLOSE_WAIT forever while the
-    consumer thread is blocked inside ``start_consuming()``.
-
-    pika passes each key through ``getattr(socket, key)`` at ``IPPROTO_TCP``
-    level, so names must exist on the current platform. Linux has
-    ``TCP_KEEPIDLE``; macOS uses ``TCP_KEEPALIVE`` for the equivalent knob.
-    """
-    opts: dict[str, int] = {}
-    if hasattr(socket, "TCP_KEEPIDLE"):
-        opts["TCP_KEEPIDLE"] = 60
-    elif hasattr(socket, "TCP_KEEPALIVE"):
-        opts["TCP_KEEPALIVE"] = 60
-    if hasattr(socket, "TCP_KEEPINTVL"):
-        opts["TCP_KEEPINTVL"] = 20
-    if hasattr(socket, "TCP_KEEPCNT"):
-        opts["TCP_KEEPCNT"] = 3
-    return opts
 
 
 class ExchangeType(str, Enum):
     DIRECT = "direct"
@@ -107,8 +73,6 @@ class RabbitMQConfig(BaseModel):
     vhost: str = "/"
     exchanges: list[Exchange]
     queues: list[Queue]
-    heartbeat: int = DEFAULT_HEARTBEAT
-    tcp_keepalive: bool = False
 
 
 class RabbitMQBase(ABC):
@@ -177,8 +141,7 @@ class SyncRabbitMQ(RabbitMQBase):
             socket_timeout=SOCKET_TIMEOUT,
             connection_attempts=CONNECTION_ATTEMPTS,
             retry_delay=RETRY_DELAY,
-            heartbeat=self.config.heartbeat,
-            tcp_options=_tcp_keepalive_options() if self.config.tcp_keepalive else None,
+            heartbeat=300,  # 5 minute timeout (heartbeats sent every 2.5 min)
         )
         self._connection = pika.BlockingConnection(parameters)
@@ -297,7 +260,7 @@ class AsyncRabbitMQ(RabbitMQBase):
             password=self.password,
             virtualhost=self.config.vhost.lstrip("/"),
             blocked_connection_timeout=BLOCKED_CONNECTION_TIMEOUT,
-            heartbeat=self.config.heartbeat,
+            heartbeat=300,  # 5 minute timeout (heartbeats sent every 2.5 min)
         )
         self._channel = await self._connection.channel()
         await self._channel.set_qos(prefetch_count=1)
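
For reference, what the removed _tcp_keepalive_options values do at the kernel
level (a self-contained illustration, not code from this compare): a silent
peer is declared dead after TCP_KEEPIDLE + TCP_KEEPCNT * TCP_KEEPINTVL =
60 + 3*20 = 120 seconds of idle, versus roughly two hours with stock Linux
defaults.

    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):   # Linux; macOS names it TCP_KEEPALIVE
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)
    if hasattr(socket, "TCP_KEEPINTVL"):  # seconds between unacknowledged probes
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 20)
    if hasattr(socket, "TCP_KEEPCNT"):    # failed probes before declaring the peer dead
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)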

View File

@@ -110,7 +110,7 @@ export const Flow = () => {
         event.preventDefault();
       }}
       maxZoom={2}
-      minZoom={0.1}
+      minZoom={0.05}
       onDragOver={onDragOver}
       onDrop={onDrop}
       nodesDraggable={!isLocked}