Calling the LLM from the current block can sometimes fail because the prompt exceeds the model's context window. This PR applies a prompt compaction algorithm (enabled by default) to ensure the prompt sent stays within the context-window limit.

### Changes 🏗️

````
Heuristics
--------
* Prefer shrinking message content over truncating the conversation.
* If compacting the conversation content is still not enough, reduce the
  conversation list itself.
* The rest of the implementation is adjusted to minimize LLM call failures.

Strategy
--------
1. **Token-aware truncation** – progressively halve a per-message cap
   (`start_cap`, `start_cap/2`, … `floor_cap`) and apply it to the *content*
   of every message except the first and last. Tool shells are included: we
   keep the envelope but shorten huge payloads.
2. **Middle-out deletion** – if still over the limit, delete whole messages,
   working outward from the centre, **skipping** any message that contains
   ``tool_calls`` or has ``role == "tool"``.
3. **Last-chance trim** – if still too big, truncate the *first* and *last*
   message bodies down to `floor_cap` tokens.
4. If the prompt is *still* too large:
   • raise ``ValueError`` when ``lossy_ok == False`` (default)
   • return the partially-trimmed prompt when ``lossy_ok == True``
````

A minimal sketch of this strategy is included after the checklist below.

### Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  <!-- Put your test plan here: -->
  - [x] Run an SDM block in a loop until it hits 200,000 tokens using the OpenAI o3 model.
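For concreteness, here is a minimal Python sketch of the four-step strategy above. All names (`compact_prompt`, `count_tokens`, `truncate_text`) and the characters-per-token heuristic are illustrative assumptions, not this PR's actual implementation; real code would count tokens with the model's tokenizer (e.g. `tiktoken` for OpenAI models).

```python
from __future__ import annotations


def count_tokens(messages: list[dict]) -> int:
    """Crude token estimate (~4 chars/token); stand-in for a real tokenizer."""
    return sum(len(str(m.get("content") or "")) // 4 for m in messages)


def truncate_text(text: str, cap: int) -> str:
    """Keep roughly `cap` tokens by preserving the head and tail of the text."""
    max_chars = cap * 4
    if len(text) <= max_chars:
        return text
    half = max_chars // 2
    return text[:half] + "\n...[truncated]...\n" + text[-half:]


def compact_prompt(
    messages: list[dict],
    limit: int,
    start_cap: int = 8192,
    floor_cap: int = 512,
    lossy_ok: bool = False,
) -> list[dict]:
    msgs = [dict(m) for m in messages]  # shallow copies; don't mutate the caller's list

    # 1. Token-aware truncation: progressively halve a per-message cap and
    #    shrink the content of every message except the first and last.
    #    Tool shells keep their envelope; only the payload is shortened.
    cap = start_cap
    while count_tokens(msgs) > limit and cap >= floor_cap:
        for m in msgs[1:-1]:
            if isinstance(m.get("content"), str):
                m["content"] = truncate_text(m["content"], cap)
        cap //= 2

    # 2. Middle-out deletion: drop whole messages from the centre outward,
    #    skipping tool calls and tool results so call/result pairs stay intact.
    deletable = [
        i for i in range(1, len(msgs) - 1)
        if not msgs[i].get("tool_calls") and msgs[i].get("role") != "tool"
    ]
    deletable.sort(key=lambda i: abs(i - len(msgs) // 2))  # centre-most first
    doomed: set[int] = set()
    for i in deletable:
        if count_tokens([m for j, m in enumerate(msgs) if j not in doomed]) <= limit:
            break
        doomed.add(i)
    msgs = [m for j, m in enumerate(msgs) if j not in doomed]

    # 3. Last-chance trim: cut the first and last message bodies to floor_cap.
    if count_tokens(msgs) > limit:
        for m in (msgs[0], msgs[-1]):
            if isinstance(m.get("content"), str):
                m["content"] = truncate_text(m["content"], floor_cap)

    # 4. Still too large: fail loudly unless the caller opted into lossy output.
    if count_tokens(msgs) > limit and not lossy_ok:
        raise ValueError("Prompt still exceeds the context window after compaction")
    return msgs
```

Under these assumptions, the lossless steps (1 and 2 preserve the conversation's shape, 3 preserves its endpoints) run before any exception is raised, so `lossy_ok=True` only changes what happens when every reduction has already been exhausted.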