docs(tokens): document image dimension token tradeoffs

2026-02-19 18:39:20 -05:00 · 2026-02-18 00:56:57 +01:00
parent b05e89e5e6
commit 4c569ce246
4 changed files with 25 additions and 1 deletion
--- a/docs/gateway/configuration-reference.md
+++ b/docs/gateway/configuration-reference.md
@@ -597,6 +597,20 @@ Max total characters injected across all workspace bootstrap files. Default: `15
 }
 ```

+### `agents.defaults.imageMaxDimensionPx`
+
+Maximum length, in pixels, of the longest side of images in transcript/tool image blocks; larger images are downscaled before provider calls.
+Default: `1200`.
+
+Lower values usually reduce vision-token usage and request payload size for screenshot-heavy runs.
+Higher values preserve more visual detail.
+
+```json5
+{
+  agents: { defaults: { imageMaxDimensionPx: 1200 } },
+}
+```
+
 ### `agents.defaults.userTimezone`

 Timezone for system prompt context (not message timestamps). Falls back to host timezone.
--- a/docs/gateway/configuration.md
+++ b/docs/gateway/configuration.md
@@ -126,7 +126,7 @@ When validation fails:

    - `agents.defaults.models` defines the model catalog and acts as the allowlist for `/model`.
    - Model refs use `provider/model` format (e.g. `anthropic/claude-opus-4-6`).
-    - `agents.defaults.imageMaxDimensionPx` controls transcript/tool image downscaling (default `1200`).
+    - `agents.defaults.imageMaxDimensionPx` controls transcript/tool image downscaling (default `1200`); lower values usually reduce vision-token usage on screenshot-heavy runs.
    - See [Models CLI](/concepts/models) for switching models in chat and [Model Failover](/concepts/model-failover) for auth rotation and fallback behavior.
    - For custom/self-hosted providers, see [Custom providers](/gateway/configuration-reference#custom-providers-and-base-urls) in the reference.

--- a/docs/reference/token-use.md
+++ b/docs/reference/token-use.md
@@ -36,6 +36,12 @@ Everything the model receives counts toward the context limit:
 - Compaction summaries and pruning artifacts
 - Provider wrappers or safety headers (not visible, but still counted)

+For images, OpenClaw downscales transcript/tool image payloads before provider calls.
+Use `agents.defaults.imageMaxDimensionPx` (default: `1200`) to tune this:
+
+- Lower values usually reduce vision-token usage and payload size.
+- Higher values preserve more visual detail for OCR/UI-heavy screenshots.
+
 For a practical breakdown (per injected file, tools, skills, and system prompt size), use `/context list` or `/context detail`. See [Context](/concepts/context).

 ## How to see current token usage
@@ -106,6 +112,7 @@ agents:

 - Use `/compact` to summarize long sessions.
 - Trim large tool outputs in your workflows.
+- Lower `agents.defaults.imageMaxDimensionPx` for screenshot-heavy sessions.
 - Keep skill descriptions short (skill list is injected into the prompt).
 - Prefer smaller models for verbose, exploratory work.

--- a/docs/reference/transcript-hygiene.md
+++ b/docs/reference/transcript-hygiene.md
@@ -53,6 +53,9 @@ Separate from transcript hygiene, session files are repaired (if needed) before
 Image payloads are always sanitized to prevent provider-side rejection due to size
 limits (downscale/recompress oversized base64 images).

+Downscaling also helps limit image-driven token usage for vision-capable models.
+Lower max dimensions generally reduce token usage; higher dimensions preserve detail.
+
 Implementation:

 - `sanitizeSessionMessagesImages` in `src/agents/pi-embedded-helpers/images.ts`
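
The longest-side cap that `agents.defaults.imageMaxDimensionPx` describes can be sketched as below. This is a minimal illustration, not the code in `sanitizeSessionMessagesImages`: the helper names `fitWithinMaxDimension` and `pixelAreaRatio` are hypothetical, and the sketch assumes that aspect ratio is preserved and that images already under the cap are left alone, neither of which these docs state explicitly.

```typescript
// Hypothetical helpers for illustration only; not part of OpenClaw.
interface Size {
  width: number;
  height: number;
}

// Scale an image so its longest side is at most maxDimensionPx.
// Assumption: aspect ratio is preserved and small images are never upscaled.
function fitWithinMaxDimension(size: Size, maxDimensionPx: number): Size {
  if (!(size.width > 0) || !(size.height > 0) || !(maxDimensionPx > 0)) {
    throw new RangeError("width, height, and maxDimensionPx must be positive");
  }
  const longest = Math.max(size.width, size.height);
  if (longest <= maxDimensionPx) {
    return { width: size.width, height: size.height };
  }
  const scale = maxDimensionPx / longest;
  return {
    width: Math.max(1, Math.round(size.width * scale)),
    height: Math.max(1, Math.round(size.height * scale)),
  };
}

// Fraction of the original pixel area that remains after resizing.
function pixelAreaRatio(original: Size, resized: Size): number {
  return (resized.width * resized.height) / (original.width * original.height);
}

// Example: a 2560x1440 screenshot at the default cap and at a lower cap.
const screenshot: Size = { width: 2560, height: 1440 };
const atDefault = fitWithinMaxDimension(screenshot, 1200); // 1200 x 675
const atLower = fitWithinMaxDimension(screenshot, 800); // 800 x 450

console.log(atDefault, pixelAreaRatio(screenshot, atDefault));
console.log(atLower, pixelAreaRatio(screenshot, atLower));
```

Pixel area is only a rough proxy for cost here. How a provider converts image size into vision tokens varies, which is why the docs say lower values "usually" reduce usage rather than promising a fixed saving.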