Zamil Majdy c4056cbae9 feat(block): Introduce context-window aware prompt compaction for LLM & SmartDecision blocks (#10252)
LLM calls made by the current blocks can sometimes break when the prompt exceeds the model's context window.
A prompt compaction algorithm (enabled by default) now ensures the prompt sent stays within the context-window limit.


### Changes 🏗️

#### Heuristics

* Prefer shrinking message content over truncating the conversation.
* If compacting the message content is still not enough, reduce the number of messages in the conversation.
* The rest of the implementation is adjusted to minimize LLM call failures.

#### Strategy

1. **Token-aware truncation** – progressively halve a per-message cap
   (`start_cap`, `start_cap/2`, … `floor_cap`) and apply it to the
   *content* of every message except the first and last. Tool shells
   are included: the envelope is kept but huge payloads are shortened.
2. **Middle-out deletion** – if still over the limit, delete whole
   messages working outward from the centre, **skipping** any message
   that contains `tool_calls` or has `role == "tool"`.
3. **Last-chance trim** – if still too big, truncate the *first* and
   *last* message bodies down to `floor_cap` tokens.
4. If the prompt is *still* too large:
   * raise `ValueError` when `lossy_ok == False` (default)
   * return the partially trimmed prompt when `lossy_ok == True`
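The strategy maps onto roughly the following shape. This is a minimal sketch, not the code from this PR: `compact_prompt`, `count_tokens`, and the default cap values are illustrative assumptions, and a real implementation would count tokens with the model's tokenizer rather than this whitespace approximation.

```python
from typing import Any

Message = dict[str, Any]


def count_tokens(messages: list[Message]) -> int:
    # Crude stand-in for a real tokenizer: ~1 token per whitespace word.
    return sum(len(str(m.get("content", "")).split()) for m in messages)


def _truncate(text: str, cap: int) -> str:
    words = text.split()
    return " ".join(words[:cap]) if len(words) > cap else text


def compact_prompt(
    messages: list[Message],
    limit: int,
    start_cap: int = 2048,   # illustrative defaults, not the PR's values
    floor_cap: int = 128,
    lossy_ok: bool = False,
) -> list[Message]:
    msgs = [dict(m) for m in messages]  # work on a copy

    # 1. Token-aware truncation: halve the per-message cap each round
    #    (start_cap, start_cap/2, ... floor_cap) and apply it to the
    #    content of every message except the first and last.
    cap = start_cap
    while cap >= floor_cap and count_tokens(msgs) > limit:
        for m in msgs[1:-1]:
            m["content"] = _truncate(str(m.get("content", "")), cap)
        cap //= 2

    # 2. Middle-out deletion: drop whole messages working outward from
    #    the centre, skipping tool calls/results so tool pairs stay intact.
    while count_tokens(msgs) > limit:
        candidates = [
            i for i in range(1, len(msgs) - 1)
            if not msgs[i].get("tool_calls") and msgs[i].get("role") != "tool"
        ]
        if not candidates:
            break
        mid = len(msgs) / 2
        del msgs[min(candidates, key=lambda i: abs(i - mid))]

    # 3. Last-chance trim: shorten the first and last message bodies.
    if count_tokens(msgs) > limit:
        for m in (msgs[0], msgs[-1]):
            m["content"] = _truncate(str(m.get("content", "")), floor_cap)

    # 4. Still too large: fail loudly unless the caller accepts loss.
    if count_tokens(msgs) > limit and not lossy_ok:
        raise ValueError("prompt exceeds context window after compaction")
    return msgs
```

Under these assumptions, `compact_prompt(history, limit=8000)` returns a trimmed copy of the conversation, or raises once every lossless and lossy step has been exhausted and `lossy_ok` is left at its default.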

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] Ran an SDM block in a loop until it hit 200,000 tokens using the
    OpenAI o3 model.