Mirror of https://github.com/openclaw/openclaw.git, synced 2026-04-03 03:03:24 -04:00
feat(telegram): add outbound sanitizer leak corpus and docs
- Add leak corpus test cases (tests/data/telegram_leak_cases.json)
- Add sanitizer documentation (docs/telegram-sanitizer.md)
- Block internal diagnostics from reaching users
- Strip wrapper artifacts from LLM output
- Static response for unknown slash commands
Committed by Peter Steinberger
Parent: c08e8c0359
Commit: 5801c4f983
71
docs/telegram-sanitizer.md
Normal file
@@ -0,0 +1,71 @@
# Telegram Outbound Sanitizer

This document describes how the Telegram outbound sanitizer prevents internal diagnostics and wrapper artifacts from reaching end users.

## Overview

The sanitizer intercepts Telegram outbound messages and:

1. Strips wrapper artifacts (`<reply>`, `<NO_REPLY>`, `<tool_schema>`, etc.)
2. Drops internal diagnostics (error codes, run IDs, gateway details)
3. Returns static responses for unknown slash commands

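
As an illustration of step 1, here is a minimal TypeScript sketch of wrapper stripping for the `<reply>` case. The function name `stripReplyWrapper` and the regular expression are assumptions made for this example, not the sanitizer's actual code.

```typescript
// Hypothetical helper: remove <reply> / </reply> tags but keep the inner text.
// The pattern is an assumption; the real sanitizer may handle more wrappers.
const REPLY_TAG = /<\/?reply>/gi;

function stripReplyWrapper(text: string): string {
  // String.prototype.replace with a global regex removes every occurrence.
  return text.replace(REPLY_TAG, "").trim();
}

// Usage: the corpus case "wrapper_reply_tag" has the text below.
const cleaned = stripReplyWrapper("<reply>Hello</reply>");
console.log(cleaned);
```

Text without wrapper tags passes through unchanged apart from trimming, which keeps the helper safe to apply to every outbound message.
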
## Marker Families

Static checks verify these marker families:

- `OPENCLAW_TELEGRAM_OUTBOUND_SANITIZER`
- `OPENCLAW_TELEGRAM_INTERNAL_ERROR_SUPPRESSOR`

## Leakage Patterns Blocked

### Tool/Runtime Leakage

- `tool call validation failed`
- `not in request.tools`
- `sessions_send` templates / `function_call`
- `Run ID`, `Status: error`, gateway timeout/connect details

### Media/Tool Scaffolding

- `MEDIA:`/`.MEDIA:` leak lines
- TTS scaffolding text

### Sentinel/Garbage Markers

- `NO_CONTEXT`, `NOCONTENT`, `NO_MESSAGE_CONTENT_HERE`
- `NO_DATA_FOUND`, `NO_API_KEY`

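
The pattern families above could be checked with a list of regular expressions. The sketch below is one possible shape, assuming a hypothetical `LEAK_PATTERNS` list and `isLeak` function. The expressions are derived from the marker strings in this document and are not taken from the actual implementation.

```typescript
// Hypothetical pattern list built from the marker strings listed above.
// None of the regexes use the "g" flag, so RegExp.prototype.test is stateless.
const LEAK_PATTERNS: RegExp[] = [
  /tool call validation failed/i,
  /not in request\.tools/i,
  /\bsessions_send\b/,
  /\bfunction_call\b/,
  /\bRun ID\b/i,
  /\bStatus: error\b/i,
  /gateway timeout/i,
  /^\.?MEDIA:/m, // MEDIA: or .MEDIA: at the start of any line
  /\b(NO_CONTEXT|NOCONTENT|NO_MESSAGE_CONTENT_HERE|NO_DATA_FOUND|NO_API_KEY)\b/,
];

// Returns true when the text contains at least one known leak marker.
function isLeak(text: string): boolean {
  return LEAK_PATTERNS.some((pattern) => pattern.test(text));
}

console.log(isLeak("Run ID: 123456"));
console.log(isLeak("Hello! How can I help you today?"));
```

A substring match of this kind also drops legitimate replies that happen to mention a marker such as `function_call`, so a real implementation may need stricter anchoring than this sketch uses.
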
## Enforced Behavior

1. **Unknown slash commands** → static text response
2. **Unknown slash commands** → does NOT call LLM
3. **Telegram output** → never emits tool diagnostics/internal runtime details
4. **Optional debug override** → owner-only with `TELEGRAM_DEBUG=true`

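
Rules 1 and 2 amount to routing unknown slash commands before any LLM call is made. The following is a minimal sketch of that routing. The `Route` type, the `routeInbound` function, and the contents of `KNOWN_COMMANDS` are assumptions for illustration. Only the reply text comes from this document.

```typescript
// Hypothetical set of registered commands; the real list is not shown here.
const KNOWN_COMMANDS = new Set<string>(["/help", "/status"]);

// Static reply text as given in the Verification section of this document.
const UNKNOWN_COMMAND_REPLY = "Unknown command. Use /help.";

type Route =
  | { kind: "static"; text: string } // answer directly, no LLM call
  | { kind: "command"; name: string } // dispatch to a command handler
  | { kind: "llm" }; // ordinary message, forward to the LLM

function routeInbound(text: string): Route {
  const trimmed = text.trim();
  if (!trimmed.startsWith("/")) return { kind: "llm" };

  // Telegram commands may carry arguments and a bot suffix (/help@my_bot).
  const [firstToken = ""] = trimmed.split(/\s+/);
  const [rawName = ""] = firstToken.split("@");
  const name = rawName.toLowerCase();

  if (KNOWN_COMMANDS.has(name)) return { kind: "command", name };
  return { kind: "static", text: UNKNOWN_COMMAND_REPLY };
}

console.log(routeInbound("/unknown_command"));
```

Because the static branch returns before any model is invoked, an unknown command cannot produce LLM output that would then need sanitizing.
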
## Verification

Run the leak corpus tests:

```bash
# Run leak case corpus validation
pnpm test src/telegram/sanitizer.test.ts

# Manual smoke check
# In any Telegram chat: /unknown_command
# Expected: "Unknown command. Use /help."
```

## Test Corpus

The test corpus at `tests/data/telegram_leak_cases.json` contains:

- `expect: "allow"` - Messages that should pass through
- `expect: "drop"` - Messages that should be blocked
- `expect: "strip_wrapper"` - Messages that need wrapper removal

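
A corpus of this shape lends itself to a table-driven check. The sketch below types one corpus entry after the fields in `tests/data/telegram_leak_cases.json` and compares a classifier against each case. The names `LeakCase`, `Expectation`, and `findMismatches` are hypothetical, and the classifier passed in is a stand-in for the real sanitizer.

```typescript
// Field names follow the JSON corpus; the type names are assumptions.
type Expectation = "allow" | "drop" | "strip_wrapper";

interface LeakCase {
  id: string;
  text: string;
  expect: Expectation;
  description: string;
}

// Returns the ids of all cases where the classifier disagrees with the corpus.
function findMismatches(
  cases: LeakCase[],
  classify: (text: string) => Expectation,
): string[] {
  return cases
    .filter((leakCase) => classify(leakCase.text) !== leakCase.expect)
    .map((leakCase) => leakCase.id);
}

// Usage with two entries copied from the corpus and a toy classifier.
const sampleCases: LeakCase[] = [
  { id: "normal_reply_unicode", text: "Hello! How can I help you today?", expect: "allow", description: "Normal assistant reply" },
  { id: "diag_run_id", text: "Run ID: 123456", expect: "drop", description: "Internal run IDs should not reach users" },
];
const toyClassifier = (text: string): Expectation =>
  text.startsWith("Run ID") ? "drop" : "allow";

console.log(findMismatches(sampleCases, toyClassifier));
```

Returning the failing ids rather than a boolean makes a test failure point at the exact corpus entry that regressed.
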
## Implementation Notes

- Sanitization runs after the LLM response and before the Telegram API send
- Payloads left empty by sanitization return a fallback message
- Return shape `{ queuedFinal, counts }` is preserved for caller safety

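
The empty-payload rule can be sketched as a small guard applied after sanitization. The function name `withFallback` and the fallback wording are assumptions for this example. This document does not give the actual fallback text.

```typescript
// Hypothetical fallback wording; the real message is not specified here.
const FALLBACK_TEXT = "Sorry, something went wrong. Please try again.";

// Returns the sanitized text, or the fallback when nothing usable remains.
function withFallback(sanitized: string | null | undefined): string {
  const text = (sanitized ?? "").trim();
  return text.length > 0 ? text : FALLBACK_TEXT;
}

console.log(withFallback("   "));
console.log(withFallback("Hello"));
```

Applying the guard as the last step means the caller always has text to send, whichever earlier rule emptied the payload.
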
200
tests/data/telegram_leak_cases.json
Normal file
@@ -0,0 +1,200 @@
[
  {
    "id": "unknown_command_static",
    "text": "Unknown command. Use /help.",
    "expect": "allow",
    "description": "Default response for unknown slash commands"
  },
  {
    "id": "owner_unknown_command_debug",
    "text": "Unknown command. Use /help, /status.",
    "expect": "allow",
    "description": "Owner-only debug response with status hint"
  },
  {
    "id": "normal_reply_unicode",
    "text": "Hello! How can I help you today?",
    "expect": "allow",
    "description": "Normal assistant reply"
  },
  {
    "id": "diag_heartbeat",
    "text": "HEARTBEAT REPORT",
    "expect": "drop",
    "description": "Internal heartbeat diagnostics should not reach users"
  },
  {
    "id": "diag_cron_gateway_disconnected",
    "text": "CRON GATEWAY DISCONNECTED",
    "expect": "drop",
    "description": "Cron system diagnostics leak"
  },
  {
    "id": "diag_gateway_closed",
    "text": "Gateway closed (normal closure, no specific reason)",
    "expect": "drop",
    "description": "Gateway lifecycle leak"
  },
  {
    "id": "diag_http_400",
    "text": "400 status code (no body)",
    "expect": "drop",
    "description": "HTTP error diagnostics should not reach users"
  },
  {
    "id": "diag_http_401",
    "text": "401 status code",
    "expect": "drop",
    "description": "Auth error diagnostics leak"
  },
  {
    "id": "diag_http_403",
    "text": "403 status code",
    "expect": "drop",
    "description": "Forbidden error diagnostics leak"
  },
  {
    "id": "diag_ws_localhost",
    "text": "ws://127.0.0.1:8787",
    "expect": "drop",
    "description": "Internal WebSocket URLs should not leak"
  },
  {
    "id": "diag_run_id",
    "text": "Run ID: 123456",
    "expect": "drop",
    "description": "Internal run IDs should not reach users"
  },
  {
    "id": "diag_status_error",
    "text": "Status: error",
    "expect": "drop",
    "description": "Internal status diagnostics"
  },
  {
    "id": "diag_configuration_file",
    "text": "Configuration file: ~/.openclaw/openclaw.json",
    "expect": "drop",
    "description": "Config path leaks reveal system structure"
  },
  {
    "id": "diag_bind_address",
    "text": "Bind address: 127.0.0.1",
    "expect": "drop",
    "description": "Internal bind address leak"
  },
  {
    "id": "diag_no_data",
    "text": "NO_DATA",
    "expect": "drop",
    "description": "Sentinel/placeholder leak"
  },
  {
    "id": "diag_no_context",
    "text": "NO_CONTEXT",
    "expect": "drop",
    "description": "Internal context sentinel leak"
  },
  {
    "id": "diag_no_content",
    "text": "NOCONTENT",
    "expect": "drop",
    "description": "Missing content sentinel leak"
  },
  {
    "id": "diag_no_message_content",
    "text": "NO_MESSAGE_CONTENT_HERE",
    "expect": "drop",
    "description": "Template placeholder leak"
  },
  {
    "id": "diag_no_api_key",
    "text": "NO_API_KEY",
    "expect": "drop",
    "description": "Missing credential sentinel leak"
  },
  {
    "id": "diag_tool_validation_failed",
    "text": "tool call validation failed",
    "expect": "drop",
    "description": "Tool runtime error should not reach users"
  },
  {
    "id": "diag_not_in_request_tools",
    "text": "not in request.tools",
    "expect": "drop",
    "description": "Tool policy error leak"
  },
  {
    "id": "diag_sessions_send",
    "text": "sessions_send",
    "expect": "drop",
    "description": "Internal tool name leak"
  },
  {
    "id": "diag_function_call",
    "text": "function_call",
    "expect": "drop",
    "description": "Function call scaffolding leak"
  },
  {
    "id": "diag_media_prefix",
    "text": "MEDIA:/path/to/file.jpg",
    "expect": "drop",
    "description": "Internal media path prefix leak"
  },
  {
    "id": "wrapper_reply_tag",
    "text": "<reply>Hello</reply>",
    "expect": "strip_wrapper",
    "description": "Reply wrapper tags should be stripped"
  },
  {
    "id": "wrapper_no_reply",
    "text": "<NO_REPLY>",
    "expect": "drop",
    "description": "NO_REPLY sentinel should not reach users"
  },
  {
    "id": "wrapper_no_reply_underscore",
    "text": "_NO_REPLY",
    "expect": "drop",
    "description": "Underscore NO_REPLY variant"
  },
  {
    "id": "wrapper_no_reply_dash",
    "text": "NO-REPLY",
    "expect": "drop",
    "description": "Dash NO-REPLY variant"
  },
  {
    "id": "wrapper_tool_schema",
    "text": "<tool_schema>",
    "expect": "drop",
    "description": "Tool schema scaffolding leak"
  },
  {
    "id": "wrapper_search_web",
    "text": "<searchWeb>",
    "expect": "drop",
    "description": "Web search scaffolding leak"
  },
  {
    "id": "diag_gateway_timeout",
    "text": "gateway timeout after 30000ms",
    "expect": "drop",
    "description": "Gateway timeout diagnostics"
  },
  {
    "id": "diag_connect_refused",
    "text": "connect ECONNREFUSED",
    "expect": "drop",
    "description": "Connection error diagnostics"
  },
  {
    "id": "diag_tts_scaffolding",
    "text": "TTS_AUDIO_PLACEHOLDER",
    "expect": "drop",
    "description": "TTS scaffolding placeholder"
  }
]