Commit Graph

5016 Commits

Author SHA1 Message Date
Onur
424d2dddf5 fix: prevent act:evaluate hangs from getting browser tool stuck/killed (#13498)
* fix(browser): prevent permanent timeout after stuck evaluate

Thread AbortSignal from client-fetch through dispatcher to Playwright
operations. When a timeout fires, force-disconnect the Playwright CDP
connection to unblock the serialized command queue, allowing the next
call to reconnect transparently.

Key changes:
- client-fetch.ts: proper AbortController with signal propagation
- pw-session.ts: new forceDisconnectPlaywrightForTarget()
- pw-tools-core.interactions.ts: accept signal, align inner timeout
  to outer-500ms, inject in-browser Promise.race for async evaluates
- routes/dispatcher.ts + types.ts: propagate signal through dispatch
- server.ts + bridge-server.ts: Express middleware creates AbortSignal
  from request lifecycle
- client-actions-core.ts: add timeoutMs to evaluate type

Fixes #10994

* fix(browser): v2 - force-disconnect via Connection.close() instead of browser.close()

When page.evaluate() is stuck on a hung CDP transport, browser.close() also
hangs because it tries to send a close command through the same stuck pipe.

v2 fix: forceDisconnectPlaywrightForTarget now directly calls Playwright's
internal Connection.close() which locally rejects all pending callbacks and
emits 'disconnected' without touching the network. This instantly unblocks
all stuck Playwright operations.

closePlaywrightBrowserConnection (clean shutdown) now also has a 3s timeout
fallback that drops to forceDropConnection if browser.close() hangs.

Fixes permanent browser timeout after stuck evaluate.

* fix(browser): v3 - fire-and-forget browser.close() instead of Connection.close()

v2's forceDropConnection called browser._connection.close() which corrupts
the entire Playwright instance because Connection is shared across all
objects (BrowserType, Browser, Page, etc.). This prevented reconnection
with cascading 'connectOverCDP: Force-disconnected' errors.

v3 fix: forceDisconnectPlaywrightForTarget now:
1. Nulls cached connection immediately
2. Fire-and-forgets browser.close() (doesn't await — it may hang)
3. Next connectBrowser() creates a fresh connectOverCDP WebSocket

Each connectOverCDP creates an independent WebSocket to the CDP endpoint,
so the new connection is unaffected by the old one's pending close.
The old browser.close() eventually resolves when the in-browser evaluate
timeout fires, or the old connection gets GC'd.

* fix(browser): v4 - clear connecting state and remove stale disconnect listeners

The reconnect was failing because:
1. forceDisconnectPlaywrightForTarget nulled cached but not connecting,
   so subsequent calls could await a stale promise
2. The old browser's 'disconnected' event handler raced with new
   connections, nulling the fresh cached reference

Fix: null both cached and connecting, and removeAllListeners on the
old browser before fire-and-forget close.

* fix(browser): v5 - use raw CDP Runtime.terminateExecution to kill stuck evaluate

When forceDisconnectPlaywrightForTarget fires, open a raw WebSocket
to the stuck page's CDP endpoint and send Runtime.terminateExecution.
This kills running JS without navigating away or crashing the page.
Also clear connecting state and remove stale disconnect listeners.

* fix(browser): abort cancels stuck evaluate

* Browser: always cleanup evaluate abort listener

* Chore: remove Playwright debug scripts

* Docs: add CDP evaluate refactor plan

* Browser: refactor Playwright force-disconnect

* Browser: abort stops evaluate promptly

* Node host: extract withTimeout helper

* Browser: remove disconnected listener safely

* Changelog: note act:evaluate hang fix

---------

Co-authored-by: Bob <bob@dutifulbob.com>
2026-02-11 07:54:48 +08:00
Riccardo Giorato
be6de9bb75 Update Together default model to together/moonshotai/Kimi-K2.5 (#13324) 2026-02-11 08:39:15 +09:00
Vignesh
fa906b26ad feat: IRC — add first-class channel support
Adds IRC as a first-class channel with core config surfaces (schema/hints/dock), plugin auto-enable detection, routing/policy alignment, and docs/tests.

Co-authored-by: Vignesh <vigneshnatarajan92@gmail.com>
2026-02-10 17:33:57 -06:00
Sebastian
90f58333e9 test(docker): make bash 3.2 compatibility check portable 2026-02-10 18:04:48 -05:00
Gustavo Madeira Santana
c4d3800c29 fix: resolve message tool lint error (#13453) (thanks @liebertar) 2026-02-10 15:09:58 -05:00
Cklee
22458f57f2 fix(agents): strip [Historical context: ...] and tool call text from streaming path (#13453)
- Add [Historical context: ...] marker pattern to stripDowngradedToolCallText
- Apply stripDowngradedToolCallText in emitBlockChunk streaming path
- Previously only stripBlockTags ran during streaming, leaking [Tool Call: ...] markers to users
- Add 7 test cases for the new pattern stripping
2026-02-10 15:04:52 -05:00
meaadore1221-afk
67d25c6533 fix: strip reasoning tags from messaging tool text to prevent <think> leakage (#11053)
Co-authored-by: MEA <mea@MEAdeMac-mini.local>
2026-02-10 14:48:17 -05:00
Mateusz Michalik
6731c6a1cd fix(docker): support Bash 3.2 in docker-setup.sh (#9441)
* fix(docker): use Bash 3.2-compatible upsert_env in docker-setup.sh

* refactor(docker): simplify argument handling in write_extra_compose function

* fix(docker): add bash 3.2 regression coverage (#9441) (thanks @mateusz-michalik)

---------

Co-authored-by: Sebastian <19554889+sebslight@users.noreply.github.com>
2026-02-10 09:55:43 -05:00
Gustavo Madeira Santana
2914cb1d48 Onboard: rename Custom API Endpoint to Custom Provider 2026-02-10 07:36:04 -05:00
Blossom
c0befdee0b feat(onboard): add custom/local API configuration flow (#11106)
* feat(onboard): add custom/local API configuration flow

* ci: retry macos check

* fix: expand custom API onboarding (#11106) (thanks @MackDing)

* fix: refine custom endpoint detection (#11106) (thanks @MackDing)

* fix: streamline custom endpoint onboarding (#11106) (thanks @MackDing)

* fix: skip model picker for custom endpoint (#11106) (thanks @MackDing)

* fix: avoid allowlist picker for custom endpoint (#11106) (thanks @MackDing)

* Onboard: reuse shared fetch timeout helper (#11106) (thanks @MackDing)

* Onboard: clarify default base URL name (#11106) (thanks @MackDing)

---------

Co-authored-by: OpenClaw Contributor <contributor@openclaw.ai>
Co-authored-by: Gustavo Madeira Santana <gumadeiras@gmail.com>
2026-02-10 07:31:02 -05:00
Vignesh Natarajan
a76dea0d23 Config: clarify memorySearch migration precedence 2026-02-10 00:21:27 -08:00
Vignesh Natarajan
8688730161 Config: migrate legacy top-level memorySearch 2026-02-10 00:21:27 -08:00
Vignesh Natarajan
efc79f69a2 Gateway: eager-init QMD backend on startup 2026-02-09 23:58:34 -08:00
Vignesh
ef4a0e92b7 fix(memory/qmd): scope query to managed collections (#11645) 2026-02-09 23:35:27 -08:00
quotentiroler
40919b1fc8 fix(test): add StringSelectMenu to @buape/carbon mock 2026-02-09 23:11:53 -08:00
Peter Steinberger
53273b490b fix(auto-reply): prevent sender spoofing in group prompts 2026-02-10 00:44:38 -06:00
Shadow
8ff1618bfc Discord: add exec approval cleanup option (#13205) 2026-02-10 00:39:42 -06:00
Shadow
4537ebc43a fix: enforce Discord agent component DM auth (#11254) (thanks @thedudeabidesai) 2026-02-10 00:26:59 -06:00
max
f17c978f5c refactor(security,config): split oversized files (#13182)
refactor(security,config): split oversized files using dot-naming convention

- audit-extra.ts (1,199 LOC) -> barrel (31) + sync (559) + async (668)
- schema.ts (1,114 LOC) -> schema (353) + field-metadata (729)
- Add tmp-refactoring-strategy.md documenting Wave 1-4 plan

PR #13182
2026-02-09 22:22:29 -08:00
Shadow
47f6bb4146 Commands: add commands.allowFrom config 2026-02-09 23:58:52 -06:00
zerone0x
1d46ca3a95 fix(signal): enforce mention gating for group messages (#13124)
* fix(signal): enforce mention gating for group messages

Signal group messages bypassed mention gating, causing the bot to reply
even when requireMention was enabled and the message did not mention
the bot. This aligns Signal with Slack, Discord, Telegram, and iMessage
which all enforce mention gating correctly.

Fixes #13106

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(signal): keep pending history context for mention-gated skips (#13124) (thanks @zerone0x)

---------

Co-authored-by: Yansu <no-reply@yansu.ai>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-09 23:19:07 -06:00
Gustavo Madeira Santana
e19a23520c fix: unify session maintenance and cron run pruning (#13083)
* fix: prune stale session entries, cap entry count, and rotate sessions.json

The sessions.json file grows unbounded over time. Every heartbeat tick (default: 30m)
triggers multiple full rewrites, and session keys from groups, threads, and DMs
accumulate indefinitely with large embedded objects (skillsSnapshot,
systemPromptReport). At >50MB the synchronous JSON parse blocks the event loop,
causing Telegram webhook timeouts and effectively taking the bot down.

Three mitigations, all running inside saveSessionStoreUnlocked() on every write:

1. Prune stale entries: remove entries with updatedAt older than 30 days
   (configurable via session.maintenance.pruneDays in openclaw.json)

2. Cap entry count: keep only the 500 most recently updated entries
   (configurable via session.maintenance.maxEntries). Entries without updatedAt
   are evicted first.

3. File rotation: if the existing sessions.json exceeds 10MB before a write,
   rename it to sessions.json.bak.{timestamp} and keep only the 3 most recent
   backups (configurable via session.maintenance.rotateBytes).

All three thresholds are configurable under session.maintenance in openclaw.json
with Zod validation. No env vars.

Existing tests updated to use Date.now() instead of epoch-relative timestamps
(1, 2, 3) that would be incorrectly pruned as stale.

27 new tests covering pruning, capping, rotation, and integration scenarios.

* feat: auto-prune expired cron run sessions (#12289)

Add TTL-based reaper for isolated cron run sessions that accumulate
indefinitely in sessions.json.

New config option:
  cron.sessionRetention: string | false  (default: '24h')

The reaper runs piggy-backed on the cron timer tick, self-throttled
to sweep at most every 5 minutes. It removes session entries matching
the pattern cron:<jobId>:run:<uuid> whose updatedAt + retention < now.

Design follows the Kubernetes ttlSecondsAfterFinished pattern:
- Sessions are persisted normally (observability/debugging)
- A periodic reaper prunes expired entries
- Configurable retention with sensible default
- Set to false to disable pruning entirely

Files changed:
- src/config/types.cron.ts: Add sessionRetention to CronConfig
- src/config/zod-schema.ts: Add Zod validation for sessionRetention
- src/cron/session-reaper.ts: New reaper module (sweepCronRunSessions)
- src/cron/session-reaper.test.ts: 12 tests covering all paths
- src/cron/service/state.ts: Add cronConfig/sessionStorePath to deps
- src/cron/service/timer.ts: Wire reaper into onTimer tick
- src/gateway/server-cron.ts: Pass config and session store path to deps

Closes #12289

* fix: sweep cron session stores per agent

* docs: add changelog for session maintenance (#13083) (thanks @skyfallsin, @Glucksberg)

* fix: add warn-only session maintenance mode

* fix: warn-only maintenance defaults to active session

* fix: deliver maintenance warnings to active session

* docs: add session maintenance examples

* fix: accept duration and size maintenance thresholds

* refactor: share cron run session key check

* fix: format issues and replace defaultRuntime.warn with console.warn

---------

Co-authored-by: Pradeep Elankumaran <pradeepe@gmail.com>
Co-authored-by: Glucksberg <markuscontasul@gmail.com>
Co-authored-by: max <40643627+quotentiroler@users.noreply.github.com>
Co-authored-by: quotentiroler <max.nussbaumer@maxhealth.tech>
2026-02-09 20:42:35 -08:00
quotentiroler
a26670a2fb refactor: consolidate fetchWithTimeout into shared utility 2026-02-09 20:34:56 -08:00
Jake
757522fb48 fix(memory): default batch embeddings to off
Disables async batch embeddings by default for memory indexing; batch remains opt-in via agents.defaults.memorySearch.remote.batch.enabled.

(#13069) Thanks @mcinteerj.

Co-authored-by: Jake McInteer <mcinteerj@gmail.com>
2026-02-09 22:31:58 -06:00
Evan Reid
0c7bc303c9 fix(tools): correct Grok response parsing for xAI Responses API (#13049)
* fix(tools): correct Grok response parsing for xAI Responses API

The xAI Responses API returns content in output[0].content[0].text,
not in output_text field. Updated GrokSearchResponse type and
runGrokSearch to extract content from the correct path.

Fixes the 'No response' issue when using Grok web search.

* fix(tools): harden Grok web_search parsing (#13049) (thanks @ereid7)

---------

Co-authored-by: erai <erai@erais-Mac-mini.local>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-09 21:51:24 -06:00
quotentiroler
cc87c0ed7c Update contributing, deduplicate more functions 2026-02-09 19:21:33 -08:00
quotentiroler
453eaed4dc improve pre-commit hook 2026-02-09 18:59:42 -08:00
quotentiroler
53910f3643 Deduplicate more 2026-02-09 18:56:58 -08:00
Rami Abdelrazzaq
c2b2d535fb fix: suggest /clear in context overflow error message (#12973)
* fix: suggest /reset in context overflow error message

When the context window overflows, the error message now suggests
using /reset to clear session history, giving users an actionable
recovery path instead of a dead-end error.

Closes #12940

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: suggest /reset in context overflow error message (#12973) (thanks @RamiNoodle733)

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Rami Abdelrazzaq <RamiNoodle733@users.noreply.github.com>
2026-02-09 20:44:37 -06:00
Liu Yuan
33ee8bbf1d feat: add zai/glm-4.6v image understanding support (#10267)
Fixes #10265. Thanks @liuy.
2026-02-09 18:38:09 -08:00
Yida-Dev
d3c71875e4 fix: cap Discord gateway reconnect at 50 attempts to prevent infinite loop (#12230)
* fix: cap Discord gateway reconnect attempts to prevent infinite loop

The Discord GatewayPlugin was configured with maxAttempts: Infinity,
which causes an unbounded reconnection loop when the Discord gateway
enters a persistent failure state (e.g. code 1005 with stalled HELLO).

In production, this manifested as 2,483+ reconnection attempts in a
single log file, starving the Node.js event loop and preventing cron,
heartbeat, and other subsystems from functioning.

Cap maxAttempts at 50, which provides ~25 minutes of retry time
(with 30s HELLO timeout between attempts) before cleanly exiting
via the existing "Max reconnect attempts" error handler.

Closes #11836

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Changelog: note Discord gateway reconnect cap (#12230) (thanks @Yida-Dev)

---------

Co-authored-by: Yida-Dev <reyifeijun@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Shadow <shadow@clawd.bot>
2026-02-09 20:36:43 -06:00
magendary
ead3bb645f discord: auto-create thread when sending to Forum/Media channels (#12380)
* discord: auto-create thread when sending to Forum/Media channels

* Discord: harden forum thread sends (#12380) (thanks @magendary)

* fix: clean up discord send exports (#12380) (thanks @magendary)

---------

Co-authored-by: Shadow <shadow@clawd.bot>
2026-02-09 20:26:42 -06:00
quotentiroler
59a4aaf376 Merge branch 'main' of https://github.com/openclaw/openclaw 2026-02-09 17:57:28 -08:00
quotentiroler
039aaf176e CI: cleanup and fix broken job references
- Fix code-size -> code-analysis job name (5 jobs had wrong dependency)
- Remove useless install-check job (was no-op)
- Add explicit docs_only guard to release-check
- Remove dead submodule checkout steps (no submodules in repo)
- Rename detect-docs-only -> detect-docs-changes, add docs_changed output
- Reorder check script: format first for faster fail
- Fix billing error test (PR #12946 removed fallback detection but not test)
2026-02-09 17:52:51 -08:00
Tak Hoffman
54315aeacf Agents: scope sanitizeUserFacingText rewrites to errorContext
Squash-merge #12988.

Refs: #12889 #12309 #3594 #7483 #10094 #10368 #11317 #11359 #11649 #12022 #12432 #12676 #12711
2026-02-09 19:52:24 -06:00
Shadow
8e607d927c Docs: require labeler + label updates for channels/extensions 2026-02-09 17:08:18 -08:00
max
8d75a496bf refactor: centralize isPlainObject, isRecord, isErrno, isLoopbackHost utilities (#12926) 2026-02-09 17:02:55 -08:00
Jabez Borja
8c73dbe705 fix(telegram): prevent false-positive billing error detection in conversation text (#12946) thanks @jabezborja 2026-02-09 19:49:31 -05:00
George Pickett
afec0f11f8 test: lock /think off persistence (#9564) 2026-02-09 16:08:15 -08:00
Liu Yuan
97b3ee7ec0 Fix: Honor /think off for reasoning-capable models
Problem:
When users execute `/think off`, they still receive `reasoning_content`
from models configured with `reasoning: true` (e.g., GLM-4.7, GLM-4.6,
Kimi K2.5, MiniMax-M2.1).

Expected: `/think off` should completely disable reasoning content.
Actual: Reasoning content is still returned.

Root Cause:
The directive handlers delete `sessionEntry.thinkingLevel` when user
executes `/think off`. This causes the thinking level to become undefined,
and the system falls back to `resolveThinkingDefault()`, which checks the
model catalog and returns "low" for reasoning-capable models, ignoring the
user's explicit intent.

Why We Must Persist "off" (Design Rationale):

1. **Model-dependent defaults**: Unlike other directives where "off" means
   use a global default, `thinkingLevel` has model-dependent defaults:
   - Reasoning-capable models (GLM-4.7, etc.) → default "low"
   - Other models → default "off"

2. **Existing pattern**: The codebase already follows this pattern for
   `elevatedLevel`, which persists "off" explicitly to override defaults
   that may be "on". The comment explains:
   "Persist 'off' explicitly so `/elevated off` actually overrides defaults."

3. **User intent**: When a user explicitly executes `/think off`, they want
   to disable thinking regardless of the model's capabilities. Deleting the
   field breaks this intent by falling back to the model's default.

Solution:
Persist "off" value instead of deleting the field in all internal directive handlers:
- `src/auto-reply/reply/directive-handling.impl.ts`: Directive-only messages
- `src/auto-reply/reply/directive-handling.persist.ts`: Inline directives
- `src/commands/agent.ts`: CLI command-line flags

Gateway API Backward Compatibility:
The original implementation incorrectly mapped `null` to "off" in
`sessions-patch.ts` for consistency with internal handlers. This was a
breaking change because:
- Previously, `null` cleared the override (deleted the field)
- API clients lost the ability to "clear to default" via `null`
- This contradicts standard JSON semantics where `null` means "no value"

Restored original null semantics in `src/gateway/sessions-patch.ts`:
- `null` → delete field, fall back to model default (clear override)
- `"off"` → persist explicit override
- Other values → normalize and persist

This ensures backward compatibility for API clients while fixing the `/think off`
issue in internal handlers.

Signed-off-by: Liu Yuan <namei.unix@gmail.com>
2026-02-09 16:08:15 -08:00
Riccardo Giorato
661279cbfa feat: adding support for Together ai provider (#10304) 2026-02-10 08:49:34 +09:00
Tak Hoffman
4df252d895 Gateway: add CLAUDE.md symlink for AGENTS.md 2026-02-09 17:02:55 -06:00
Rodrigo Uroz
ae99e656af (fix): .env vars not available during runtime config reloads (healthchecks fail with MissingEnvVarError) (#12748)
* Config: reload dotenv before env substitution on runtime loads

* Test: isolate config env var regression from host state env

* fix: keep dotenv vars resolvable on runtime config reloads (#12748) (thanks @rodrigouroz)

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-09 16:31:41 -06:00
Sk Akram
1cee5135e4 fix: preserve original filename for WhatsApp inbound documents (#12691)
* fix: preserve original filename for WhatsApp inbound documents

* fix: cover WhatsApp document filenames (#12691) (thanks @akramcodez)

* test: streamline inbound media waits (#12691) (thanks @akramcodez)

---------

Co-authored-by: Gustavo Madeira Santana <gumadeiras@gmail.com>
2026-02-09 16:56:19 -05:00
quotentiroler
1fad19008e fix: improve code-size gate output and duplicate detection, fix Windows path in source-display 2026-02-09 13:18:51 -08:00
Peter Steinberger
268094938b fix: reduce brew noise in onboarding 2026-02-09 13:27:21 -06:00
Peter Steinberger
3e63b2a4fa fix(cli): improve plugins list source display 2026-02-09 13:05:48 -06:00
Peter Steinberger
394d60c1fb fix(onboarding): auto-install shell completion in QuickStart 2026-02-09 12:56:12 -06:00
Chase Dorsey
512b2053c5 fix(web_search): Fix invalid model name sent to Perplexity (#12795)
* fix(web_search): Fix invalid model name sent to Perplexity

* chore: Only apply fix to direct Perplexity calls

* fix(web_search): normalize direct Perplexity model IDs

* fix: add changelog note for perplexity model normalization (#12795) (thanks @cdorsey)

* fix: align tests and fetch type for gate stability (#12795) (thanks @cdorsey)

* chore: keep #12795 scoped to web_search changes

---------

Co-authored-by: Sebastian <19554889+sebslight@users.noreply.github.com>
2026-02-09 13:43:57 -05:00
max
40b11db80e TypeScript: add extensions to tsconfig and fix type errors (#12781)
* TypeScript: add extensions to tsconfig and fix type errors

- Add extensions/**/* to tsconfig.json includes
- Export ProviderAuthResult, AnyAgentTool from plugin-sdk
- Fix optional chaining for messageActions across channels
- Add missing type imports (MSTeamsConfig, GroupPolicy, etc.)
- Add type annotations for provider auth handlers
- Fix undici/fetch type compatibility in zalo proxy
- Correct ChannelAccountSnapshot property usage
- Add type casts for tool registrations
- Extract usage view styles and types to separate files

* TypeScript: fix optional debug calls and handleAction guards
2026-02-09 10:05:38 -08:00