feat: add configurable web_fetch maxChars cap

This commit is contained in:
Peter Steinberger
2026-02-03 17:35:51 -08:00
parent 6b4b6049b4
commit d3ba57b7d7
8 changed files with 74 additions and 3 deletions

View File

@@ -2039,6 +2039,7 @@ of `every`, keep `HEARTBEAT.md` tiny, and/or choose a cheaper `model`.
- `tools.web.search.cacheTtlMinutes` (default 15)
- `tools.web.fetch.enabled` (default true)
- `tools.web.fetch.maxChars` (default 50000)
- `tools.web.fetch.maxCharsCap` (default 50000; clamps maxChars from config/tool calls)
- `tools.web.fetch.timeoutSeconds` (default 30)
- `tools.web.fetch.cacheTtlMinutes` (default 15)
- `tools.web.fetch.userAgent` (optional override)

View File

@@ -252,6 +252,7 @@ Core parameters:
Notes:
- Enable via `tools.web.fetch.enabled`.
- `maxChars` is clamped by `tools.web.fetch.maxCharsCap` (default 50000).
- Responses are cached (default 15 min).
- For JS-heavy sites, prefer the browser tool.
- See [Web tools](/tools/web) for setup.

View File

@@ -221,6 +221,7 @@ Fetch a URL and extract readable content.
fetch: {
enabled: true,
maxChars: 50000,
maxCharsCap: 50000,
timeoutSeconds: 30,
cacheTtlMinutes: 15,
maxRedirects: 3,
@@ -252,6 +253,7 @@ Notes:
- Firecrawl requests use bot-circumvention mode and cache results by default.
- `web_fetch` sends a Chrome-like User-Agent and `Accept-Language` by default; override `userAgent` if needed.
- `web_fetch` blocks private/internal hostnames and re-checks redirects (limit with `maxRedirects`).
- `maxChars` is clamped to `tools.web.fetch.maxCharsCap`.
- `web_fetch` is best-effort extraction; some sites will need the browser tool.
- See [Firecrawl](/tools/firecrawl) for key setup and service details.
- Responses are cached (default 15 minutes) to reduce repeated fetches.