diff --git a/docs/concepts/session-pruning.md b/docs/concepts/session-pruning.md
index 5e91b9fb80..fdd5080e0f 100644
--- a/docs/concepts/session-pruning.md
+++ b/docs/concepts/session-pruning.md
@@ -13,6 +13,7 @@ Session pruning trims **old tool results** from the in-memory context right befo
 - Only affects the messages sent to the model for that request.
 - Only active for Anthropic API calls (and OpenRouter Anthropic models).
 - For best results, match `ttl` to your model `cacheControlTtl`.
+- After a prune, the TTL window resets so subsequent requests keep cache until `ttl` expires again.
 
 ## What can be pruned
 - Only `toolResult` messages.
diff --git a/docs/gateway/configuration.md b/docs/gateway/configuration.md
index ddce68e796..bdf2ad29be 100644
--- a/docs/gateway/configuration.md
+++ b/docs/gateway/configuration.md
@@ -1597,6 +1597,7 @@ Notes / current limitations:
 - The estimated “context ratio” is based on **characters** (approximate), not exact tokens.
 - If the session doesn’t contain at least `keepLastAssistants` assistant messages yet, pruning is skipped.
 - `cache-ttl` only activates for Anthropic API calls (and OpenRouter Anthropic models).
+- After a prune, the TTL window resets so subsequent requests keep cache until `ttl` expires again.
 - For best results, match `contextPruning.ttl` to the model `cacheControlTtl` you set in `agents.defaults.models.*.params`.
 
 Default (off):
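
For reviewers, a minimal sketch of the TTL matching that the added notes describe. This is not taken from the docs in this diff: the JSON5 shape, the top-level placement of `contextPruning`, the duration format, and the placeholder model id are assumptions; only the names `contextPruning.ttl`, `cache-ttl`, `keepLastAssistants`, and `agents.defaults.models.*.params.cacheControlTtl` appear in the patched text.

```json5
// Hypothetical config sketch (structure and values assumed, not verbatim from the docs).
// The point: contextPruning.ttl is kept equal to the model's cacheControlTtl,
// so pruning only kicks in once the prompt-cache window would have lapsed anyway.
{
  contextPruning: {
    mode: "cache-ttl",      // mode name taken from the notes above
    ttl: "1h",              // keep in sync with cacheControlTtl below (duration format assumed)
    keepLastAssistants: 3,  // per the notes, pruning is skipped until this many assistant messages exist
  },
  agents: {
    defaults: {
      models: {
        "<anthropic-model-id>": {            // placeholder; matches agents.defaults.models.*
          params: { cacheControlTtl: "1h" }, // same window as contextPruning.ttl
        },
      },
    },
  },
}
```

Presumably the intent of matching the two values is that a prune never lands while a still-valid cache could be reused, and, per the added line, each prune restarts the `ttl` window for the requests that follow.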