### Changes 🏗️

Fixes [**AUTOGPT-SERVER-1TN**](https://autoagpt.sentry.io/issues/?query=AUTOGPT-SERVER-1TN) (~39K events since Feb 2025) and related connection issues **6JC/6JD/6JE/6JF** (~6K combined).

#### Problem

When the RabbitMQ TCP connection drops (network blip, server restart, etc.):

1. `connect_robust` (aio_pika) automatically reconnects the underlying AMQP connection
2. But `AsyncRabbitMQ._channel` still references the **old dead channel**
3. `is_ready` checks `not self._channel.is_closed`, but the channel object doesn't know the transport is gone
4. `publish_message` tries to use the stale channel → `ChannelInvalidStateError: No active transport in channel`
5. `@func_retry` retries 5 times, but each retry hits the same stale channel (it passes `is_ready`)

This means every connection drop generates errors until the process is restarted.

#### Fix

**New `_ensure_channel()` helper** that resets stale channels before reconnecting, so `connect()` creates a fresh one instead of short-circuiting on `is_connected`.

**Explicit `ChannelInvalidStateError` handling in `publish_message`:**

1. First attempt uses `_ensure_channel()` (handles normal staleness)
2. If publish throws `ChannelInvalidStateError`, does a full reconnect (resets both `_channel` and `_connection`) and retries once
3. `@func_retry` provides additional retry resilience on top

**Simplified `get_channel()`** to use the same resilient helper.

**1 file changed, 62 insertions, 24 deletions.**

#### Impact

- Eliminates ~39K `ChannelInvalidStateError` Sentry events
- RabbitMQ operations self-heal after connection drops without process restart
- Related transport EOF errors (6JC/6JD/6JE/6JF) should also reduce
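
#### Illustrative sketch

The recovery flow described under Fix can be sketched roughly as follows. This is not the code from the diff: `FakeChannel`, `FakeConnection`, `ResilientPublisher`, and the `transport_alive` flag are hypothetical stand-ins so the example runs without a broker, and `ChannelInvalidStateError` is redefined locally in place of the library's exception. Only the names `_ensure_channel`, `connect`, `publish_message`, `_channel`, and `_connection` come from the description above, and their bodies here are assumptions.

```python
import asyncio


class ChannelInvalidStateError(RuntimeError):
    """Local stand-in for the library exception, so the sketch is self-contained."""


class FakeChannel:
    """Hypothetical channel that still reports is_closed=False after its transport dies."""

    def __init__(self):
        self.is_closed = False
        self.transport_alive = True
        self.published = []

    async def publish(self, body):
        if not self.transport_alive:
            raise ChannelInvalidStateError("No active transport in channel")
        self.published.append(body)


class FakeConnection:
    """Hypothetical connection that hands out fresh channels."""

    def __init__(self):
        self.is_closed = False

    async def channel(self):
        return FakeChannel()


class ResilientPublisher:
    """Assumed shape of the fix: ensure a channel, then reconnect once on a stale one."""

    def __init__(self):
        self._connection = None
        self._channel = None
        self.connects = 0  # counts how many connections were opened

    async def connect(self):
        if self._connection is None or self._connection.is_closed:
            self._connection = FakeConnection()
            self.connects += 1
        if self._channel is None or self._channel.is_closed:
            self._channel = await self._connection.channel()

    async def _ensure_channel(self):
        # Drop a missing or closed channel so connect() builds a fresh one.
        if self._channel is None or self._channel.is_closed:
            self._channel = None
            await self.connect()
        return self._channel

    async def publish_message(self, body):
        channel = await self._ensure_channel()
        try:
            await channel.publish(body)
        except ChannelInvalidStateError:
            # The channel looked open but its transport is gone:
            # reset both references, reconnect, and retry exactly once.
            self._channel = None
            self._connection = None
            channel = await self._ensure_channel()
            await channel.publish(body)


async def demo():
    pub = ResilientPublisher()
    await pub.publish_message("first")
    stale = pub._channel
    stale.transport_alive = False  # simulate the TCP connection dropping
    await pub.publish_message("second")  # recovers without a restart
    return pub, stale


if __name__ == "__main__":
    pub, stale = asyncio.run(demo())
    print(pub._channel is not stale, pub._channel.published)
```

In this sketch the second publish first hits the stale channel, because the channel still reports itself as open, and only succeeds after the full reset. That is the case a retry decorator alone cannot cover, since every retry would otherwise reuse the same dead channel.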