From 43f4548e3d33875da0fb22909054a138e196db92 Mon Sep 17 00:00:00 2001 From: quotentiroler Date: Sun, 8 Feb 2026 04:13:27 -0800 Subject: [PATCH] fmt: security docs --- docs/security/CONTRIBUTING-THREAT-MODEL.md | 52 +- docs/security/THREAT-MODEL-ATLAS.md | 588 +++++++++++---------- 2 files changed, 323 insertions(+), 317 deletions(-) diff --git a/docs/security/CONTRIBUTING-THREAT-MODEL.md b/docs/security/CONTRIBUTING-THREAT-MODEL.md index 65aebd8125..884a8ff9bc 100644 --- a/docs/security/CONTRIBUTING-THREAT-MODEL.md +++ b/docs/security/CONTRIBUTING-THREAT-MODEL.md @@ -1,12 +1,12 @@ # Contributing to the OpenClaw Threat Model -Thanks for helping make OpenClaw more secure. This threat model is a living document and we welcome contributions from anyone - you don't need to be a security expert. +Thanks for helping make OpenClaw more secure. This threat model is a living document and we welcome contributions from anyone - you don't need to be a security expert. ## Ways to Contribute ### Add a Threat -Spotted an attack vector or risk we haven't covered? Open an issue on [openclaw/trust](https://github.com/openclaw/trust/issues) and describe it in your own words. You don't need to know any frameworks or fill in every field - just describe the scenario. +Spotted an attack vector or risk we haven't covered? Open an issue on [openclaw/trust](https://github.com/openclaw/trust/issues) and describe it in your own words. You don't need to know any frameworks or fill in every field - just describe the scenario. **Helpful to include (but not required):** @@ -15,13 +15,13 @@ Spotted an attack vector or risk we haven't covered? Open an issue on [openclaw/ - How severe you think it is (low / medium / high / critical) - Any links to related research, CVEs, or real-world examples -We'll handle the ATLAS mapping, threat IDs, and risk assessment during review. If you want to include those details, great - but it's not expected. 
+We'll handle the ATLAS mapping, threat IDs, and risk assessment during review. If you want to include those details, great - but it's not expected. > **This is for adding to the threat model, not reporting live vulnerabilities.** If you've found an exploitable vulnerability, see our [Trust page](https://trust.openclaw.ai) for responsible disclosure instructions. ### Suggest a Mitigation -Have an idea for how to address an existing threat? Open an issue or PR referencing the threat. Useful mitigations are specific and actionable - for example, "per-sender rate limiting of 10 messages/minute at the gateway" is better than "implement rate limiting." +Have an idea for how to address an existing threat? Open an issue or PR referencing the threat. Useful mitigations are specific and actionable - for example, "per-sender rate limiting of 10 messages/minute at the gateway" is better than "implement rate limiting." ### Propose an Attack Chain @@ -29,48 +29,48 @@ Attack chains show how multiple threats combine into a realistic attack scenario ### Fix or Improve Existing Content -Typos, clarifications, outdated info, better examples - PRs welcome, no issue needed. +Typos, clarifications, outdated info, better examples - PRs welcome, no issue needed. ## What We Use ### MITRE ATLAS -This threat model is built on [MITRE ATLAS](https://atlas.mitre.org/) (Adversarial Threat Landscape for AI Systems), a framework designed specifically for AI/ML threats like prompt injection, tool misuse, and agent exploitation. You don't need to know ATLAS to contribute - we map submissions to the framework during review. +This threat model is built on [MITRE ATLAS](https://atlas.mitre.org/) (Adversarial Threat Landscape for AI Systems), a framework designed specifically for AI/ML threats like prompt injection, tool misuse, and agent exploitation. You don't need to know ATLAS to contribute - we map submissions to the framework during review. ### Threat IDs Each threat gets an ID like `T-EXEC-003`. 
The categories are: -| Code | Category | -|------|----------| -| RECON | Reconnaissance - information gathering | -| ACCESS | Initial access - gaining entry | -| EXEC | Execution - running malicious actions | -| PERSIST | Persistence - maintaining access | -| EVADE | Defense evasion - avoiding detection | -| DISC | Discovery - learning about the environment | -| EXFIL | Exfiltration - stealing data | -| IMPACT | Impact - damage or disruption | +| Code | Category | +| ------- | ------------------------------------------ | +| RECON | Reconnaissance - information gathering | +| ACCESS | Initial access - gaining entry | +| EXEC | Execution - running malicious actions | +| PERSIST | Persistence - maintaining access | +| EVADE | Defense evasion - avoiding detection | +| DISC | Discovery - learning about the environment | +| EXFIL | Exfiltration - stealing data | +| IMPACT | Impact - damage or disruption | IDs are assigned by maintainers during review. You don't need to pick one. ### Risk Levels -| Level | Meaning | -|-------|---------| -| **Critical** | Full system compromise, or high likelihood + critical impact | -| **High** | Significant damage likely, or medium likelihood + critical impact | -| **Medium** | Moderate risk, or low likelihood + high impact | -| **Low** | Unlikely and limited impact | +| Level | Meaning | +| ------------ | ----------------------------------------------------------------- | +| **Critical** | Full system compromise, or high likelihood + critical impact | +| **High** | Significant damage likely, or medium likelihood + critical impact | +| **Medium** | Moderate risk, or low likelihood + high impact | +| **Low** | Unlikely and limited impact | If you're unsure about the risk level, just describe the impact and we'll assess it. ## Review Process -1. **Triage** - We review new submissions within 48 hours -2. **Assessment** - We verify feasibility, assign ATLAS mapping and threat ID, validate risk level -3. 
**Documentation** - We ensure everything is formatted and complete -4. **Merge** - Added to the threat model and visualization +1. **Triage** - We review new submissions within 48 hours +2. **Assessment** - We verify feasibility, assign ATLAS mapping and threat ID, validate risk level +3. **Documentation** - We ensure everything is formatted and complete +4. **Merge** - Added to the threat model and visualization ## Resources diff --git a/docs/security/THREAT-MODEL-ATLAS.md b/docs/security/THREAT-MODEL-ATLAS.md index 60582cdcf2..c5d0387a51 100644 --- a/docs/security/THREAT-MODEL-ATLAS.md +++ b/docs/security/THREAT-MODEL-ATLAS.md @@ -12,6 +12,7 @@ This threat model is built on [MITRE ATLAS](https://atlas.mitre.org/), the industry-standard framework for documenting adversarial threats to AI/ML systems. ATLAS is maintained by [MITRE](https://www.mitre.org/) in collaboration with the AI security community. **Key ATLAS Resources:** + - [ATLAS Techniques](https://atlas.mitre.org/techniques/) - [ATLAS Tactics](https://atlas.mitre.org/tactics/) - [ATLAS Case Studies](https://atlas.mitre.org/studies/) @@ -21,6 +22,7 @@ This threat model is built on [MITRE ATLAS](https://atlas.mitre.org/), the indus ### Contributing to This Threat Model This is a living document maintained by the OpenClaw community. See [CONTRIBUTING-THREAT-MODEL.md](./CONTRIBUTING-THREAT-MODEL.md) for guidelines on contributing: + - Reporting new threats - Updating existing threats - Proposing attack chains @@ -36,14 +38,14 @@ This threat model documents adversarial threats to the OpenClaw AI agent platfor ### 1.2 Scope -| Component | Included | Notes | -|-----------|----------|-------| -| OpenClaw Agent Runtime | Yes | Core agent execution, tool calls, sessions | -| Gateway | Yes | Authentication, routing, channel integration | -| Channel Integrations | Yes | WhatsApp, Telegram, Discord, Signal, Slack, etc. 
| -| ClawHub Marketplace | Yes | Skill publishing, moderation, distribution | -| MCP Servers | Yes | External tool providers | -| User Devices | Partial | Mobile apps, desktop clients | +| Component | Included | Notes | +| ---------------------- | -------- | ------------------------------------------------ | +| OpenClaw Agent Runtime | Yes | Core agent execution, tool calls, sessions | +| Gateway | Yes | Authentication, routing, channel integration | +| Channel Integrations | Yes | WhatsApp, Telegram, Discord, Signal, Slack, etc. | +| ClawHub Marketplace | Yes | Skill publishing, moderation, distribution | +| MCP Servers | Yes | External tool providers | +| User Devices | Partial | Mobile apps, desktop clients | ### 1.3 Out of Scope @@ -122,14 +124,14 @@ Nothing is explicitly out of scope for this threat model. ### 2.2 Data Flows -| Flow | Source | Destination | Data | Protection | -|------|--------|-------------|------|------------| -| F1 | Channel | Gateway | User messages | TLS, AllowFrom | -| F2 | Gateway | Agent | Routed messages | Session isolation | -| F3 | Agent | Tools | Tool invocations | Policy enforcement | -| F4 | Agent | External | web_fetch requests | SSRF blocking | -| F5 | ClawHub | Agent | Skill code | Moderation, scanning | -| F6 | Agent | Channel | Responses | Output filtering | +| Flow | Source | Destination | Data | Protection | +| ---- | ------- | ----------- | ------------------ | -------------------- | +| F1 | Channel | Gateway | User messages | TLS, AllowFrom | +| F2 | Gateway | Agent | Routed messages | Session isolation | +| F3 | Agent | Tools | Tool invocations | Policy enforcement | +| F4 | Agent | External | web_fetch requests | SSRF blocking | +| F5 | ClawHub | Agent | Skill code | Moderation, scanning | +| F6 | Agent | Channel | Responses | Output filtering | --- @@ -139,27 +141,27 @@ Nothing is explicitly out of scope for this threat model. 
#### T-RECON-001: Agent Endpoint Discovery -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0006 - Active Scanning | -| **Description** | Attacker scans for exposed OpenClaw gateway endpoints | -| **Attack Vector** | Network scanning, shodan queries, DNS enumeration | -| **Affected Components** | Gateway, exposed API endpoints | -| **Current Mitigations** | Tailscale auth option, bind to loopback by default | -| **Residual Risk** | Medium - Public gateways discoverable | -| **Recommendations** | Document secure deployment, add rate limiting on discovery endpoints | +| Attribute | Value | +| ----------------------- | -------------------------------------------------------------------- | +| **ATLAS ID** | AML.T0006 - Active Scanning | +| **Description** | Attacker scans for exposed OpenClaw gateway endpoints | +| **Attack Vector** | Network scanning, shodan queries, DNS enumeration | +| **Affected Components** | Gateway, exposed API endpoints | +| **Current Mitigations** | Tailscale auth option, bind to loopback by default | +| **Residual Risk** | Medium - Public gateways discoverable | +| **Recommendations** | Document secure deployment, add rate limiting on discovery endpoints | #### T-RECON-002: Channel Integration Probing -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0006 - Active Scanning | -| **Description** | Attacker probes messaging channels to identify AI-managed accounts | -| **Attack Vector** | Sending test messages, observing response patterns | -| **Affected Components** | All channel integrations | -| **Current Mitigations** | None specific | -| **Residual Risk** | Low - Limited value from discovery alone | -| **Recommendations** | Consider response timing randomization | +| Attribute | Value | +| ----------------------- | ------------------------------------------------------------------ | +| **ATLAS ID** | AML.T0006 - Active Scanning | +| **Description** | Attacker probes messaging channels to identify 
AI-managed accounts | +| **Attack Vector** | Sending test messages, observing response patterns | +| **Affected Components** | All channel integrations | +| **Current Mitigations** | None specific | +| **Residual Risk** | Low - Limited value from discovery alone | +| **Recommendations** | Consider response timing randomization | --- @@ -167,39 +169,39 @@ Nothing is explicitly out of scope for this threat model. #### T-ACCESS-001: Pairing Code Interception -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0040 - AI Model Inference API Access | -| **Description** | Attacker intercepts pairing code during 30s grace period | -| **Attack Vector** | Shoulder surfing, network sniffing, social engineering | -| **Affected Components** | Device pairing system | -| **Current Mitigations** | 30s expiry, codes sent via existing channel | -| **Residual Risk** | Medium - Grace period exploitable | -| **Recommendations** | Reduce grace period, add confirmation step | +| Attribute | Value | +| ----------------------- | -------------------------------------------------------- | +| **ATLAS ID** | AML.T0040 - AI Model Inference API Access | +| **Description** | Attacker intercepts pairing code during 30s grace period | +| **Attack Vector** | Shoulder surfing, network sniffing, social engineering | +| **Affected Components** | Device pairing system | +| **Current Mitigations** | 30s expiry, codes sent via existing channel | +| **Residual Risk** | Medium - Grace period exploitable | +| **Recommendations** | Reduce grace period, add confirmation step | #### T-ACCESS-002: AllowFrom Spoofing -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0040 - AI Model Inference API Access | -| **Description** | Attacker spoofs allowed sender identity in channel | -| **Attack Vector** | Depends on channel - phone number spoofing, username impersonation | -| **Affected Components** | AllowFrom validation per channel | -| **Current Mitigations** | Channel-specific 
identity verification | -| **Residual Risk** | Medium - Some channels vulnerable to spoofing | -| **Recommendations** | Document channel-specific risks, add cryptographic verification where possible | +| Attribute | Value | +| ----------------------- | ------------------------------------------------------------------------------ | +| **ATLAS ID** | AML.T0040 - AI Model Inference API Access | +| **Description** | Attacker spoofs allowed sender identity in channel | +| **Attack Vector** | Depends on channel - phone number spoofing, username impersonation | +| **Affected Components** | AllowFrom validation per channel | +| **Current Mitigations** | Channel-specific identity verification | +| **Residual Risk** | Medium - Some channels vulnerable to spoofing | +| **Recommendations** | Document channel-specific risks, add cryptographic verification where possible | #### T-ACCESS-003: Token Theft -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0040 - AI Model Inference API Access | -| **Description** | Attacker steals authentication tokens from config files | -| **Attack Vector** | Malware, unauthorized device access, config backup exposure | -| **Affected Components** | ~/.openclaw/credentials/, config storage | -| **Current Mitigations** | File permissions | -| **Residual Risk** | High - Tokens stored in plaintext | -| **Recommendations** | Implement token encryption at rest, add token rotation | +| Attribute | Value | +| ----------------------- | ----------------------------------------------------------- | +| **ATLAS ID** | AML.T0040 - AI Model Inference API Access | +| **Description** | Attacker steals authentication tokens from config files | +| **Attack Vector** | Malware, unauthorized device access, config backup exposure | +| **Affected Components** | ~/.openclaw/credentials/, config storage | +| **Current Mitigations** | File permissions | +| **Residual Risk** | High - Tokens stored in plaintext | +| **Recommendations** | Implement token 
encryption at rest, add token rotation | --- @@ -207,51 +209,51 @@ Nothing is explicitly out of scope for this threat model. #### T-EXEC-001: Direct Prompt Injection -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0051.000 - LLM Prompt Injection: Direct | -| **Description** | Attacker sends crafted prompts to manipulate agent behavior | -| **Attack Vector** | Channel messages containing adversarial instructions | -| **Affected Components** | Agent LLM, all input surfaces | -| **Current Mitigations** | Pattern detection, external content wrapping | -| **Residual Risk** | Critical - Detection only, no blocking; sophisticated attacks bypass | -| **Recommendations** | Implement multi-layer defense, output validation, user confirmation for sensitive actions | +| Attribute | Value | +| ----------------------- | ----------------------------------------------------------------------------------------- | +| **ATLAS ID** | AML.T0051.000 - LLM Prompt Injection: Direct | +| **Description** | Attacker sends crafted prompts to manipulate agent behavior | +| **Attack Vector** | Channel messages containing adversarial instructions | +| **Affected Components** | Agent LLM, all input surfaces | +| **Current Mitigations** | Pattern detection, external content wrapping | +| **Residual Risk** | Critical - Detection only, no blocking; sophisticated attacks bypass | +| **Recommendations** | Implement multi-layer defense, output validation, user confirmation for sensitive actions | #### T-EXEC-002: Indirect Prompt Injection -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0051.001 - LLM Prompt Injection: Indirect | -| **Description** | Attacker embeds malicious instructions in fetched content | -| **Attack Vector** | Malicious URLs, poisoned emails, compromised webhooks | -| **Affected Components** | web_fetch, email ingestion, external data sources | -| **Current Mitigations** | Content wrapping with XML tags and security notice | -| **Residual 
Risk** | High - LLM may ignore wrapper instructions | -| **Recommendations** | Implement content sanitization, separate execution contexts | +| Attribute | Value | +| ----------------------- | ----------------------------------------------------------- | +| **ATLAS ID** | AML.T0051.001 - LLM Prompt Injection: Indirect | +| **Description** | Attacker embeds malicious instructions in fetched content | +| **Attack Vector** | Malicious URLs, poisoned emails, compromised webhooks | +| **Affected Components** | web_fetch, email ingestion, external data sources | +| **Current Mitigations** | Content wrapping with XML tags and security notice | +| **Residual Risk** | High - LLM may ignore wrapper instructions | +| **Recommendations** | Implement content sanitization, separate execution contexts | #### T-EXEC-003: Tool Argument Injection -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0051.000 - LLM Prompt Injection: Direct | -| **Description** | Attacker manipulates tool arguments through prompt injection | -| **Attack Vector** | Crafted prompts that influence tool parameter values | -| **Affected Components** | All tool invocations | -| **Current Mitigations** | Exec approvals for dangerous commands | -| **Residual Risk** | High - Relies on user judgment | -| **Recommendations** | Implement argument validation, parameterized tool calls | +| Attribute | Value | +| ----------------------- | ------------------------------------------------------------ | +| **ATLAS ID** | AML.T0051.000 - LLM Prompt Injection: Direct | +| **Description** | Attacker manipulates tool arguments through prompt injection | +| **Attack Vector** | Crafted prompts that influence tool parameter values | +| **Affected Components** | All tool invocations | +| **Current Mitigations** | Exec approvals for dangerous commands | +| **Residual Risk** | High - Relies on user judgment | +| **Recommendations** | Implement argument validation, parameterized tool calls | #### T-EXEC-004: Exec 
Approval Bypass -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0043 - Craft Adversarial Data | -| **Description** | Attacker crafts commands that bypass approval allowlist | -| **Attack Vector** | Command obfuscation, alias exploitation, path manipulation | -| **Affected Components** | exec-approvals.ts, command allowlist | -| **Current Mitigations** | Allowlist + ask mode | -| **Residual Risk** | High - No command sanitization | -| **Recommendations** | Implement command normalization, expand blocklist | +| Attribute | Value | +| ----------------------- | ---------------------------------------------------------- | +| **ATLAS ID** | AML.T0043 - Craft Adversarial Data | +| **Description** | Attacker crafts commands that bypass approval allowlist | +| **Attack Vector** | Command obfuscation, alias exploitation, path manipulation | +| **Affected Components** | exec-approvals.ts, command allowlist | +| **Current Mitigations** | Allowlist + ask mode | +| **Residual Risk** | High - No command sanitization | +| **Recommendations** | Implement command normalization, expand blocklist | --- @@ -259,39 +261,39 @@ Nothing is explicitly out of scope for this threat model. 
#### T-PERSIST-001: Malicious Skill Installation -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0010.001 - Supply Chain Compromise: AI Software | -| **Description** | Attacker publishes malicious skill to ClawHub | -| **Attack Vector** | Create account, publish skill with hidden malicious code | -| **Affected Components** | ClawHub, skill loading, agent execution | -| **Current Mitigations** | GitHub account age verification, pattern-based moderation flags | -| **Residual Risk** | Critical - No sandboxing, limited review | -| **Recommendations** | VirusTotal integration (in progress), skill sandboxing, community review | +| Attribute | Value | +| ----------------------- | ------------------------------------------------------------------------ | +| **ATLAS ID** | AML.T0010.001 - Supply Chain Compromise: AI Software | +| **Description** | Attacker publishes malicious skill to ClawHub | +| **Attack Vector** | Create account, publish skill with hidden malicious code | +| **Affected Components** | ClawHub, skill loading, agent execution | +| **Current Mitigations** | GitHub account age verification, pattern-based moderation flags | +| **Residual Risk** | Critical - No sandboxing, limited review | +| **Recommendations** | VirusTotal integration (in progress), skill sandboxing, community review | #### T-PERSIST-002: Skill Update Poisoning -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0010.001 - Supply Chain Compromise: AI Software | -| **Description** | Attacker compromises popular skill and pushes malicious update | -| **Attack Vector** | Account compromise, social engineering of skill owner | -| **Affected Components** | ClawHub versioning, auto-update flows | -| **Current Mitigations** | Version fingerprinting | -| **Residual Risk** | High - Auto-updates may pull malicious versions | -| **Recommendations** | Implement update signing, rollback capability, version pinning | +| Attribute | Value | +| ----------------------- | 
-------------------------------------------------------------- | +| **ATLAS ID** | AML.T0010.001 - Supply Chain Compromise: AI Software | +| **Description** | Attacker compromises popular skill and pushes malicious update | +| **Attack Vector** | Account compromise, social engineering of skill owner | +| **Affected Components** | ClawHub versioning, auto-update flows | +| **Current Mitigations** | Version fingerprinting | +| **Residual Risk** | High - Auto-updates may pull malicious versions | +| **Recommendations** | Implement update signing, rollback capability, version pinning | #### T-PERSIST-003: Agent Configuration Tampering -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0010.002 - Supply Chain Compromise: Data | -| **Description** | Attacker modifies agent configuration to persist access | -| **Attack Vector** | Config file modification, settings injection | -| **Affected Components** | Agent config, tool policies | -| **Current Mitigations** | File permissions | -| **Residual Risk** | Medium - Requires local access | -| **Recommendations** | Config integrity verification, audit logging for config changes | +| Attribute | Value | +| ----------------------- | --------------------------------------------------------------- | +| **ATLAS ID** | AML.T0010.002 - Supply Chain Compromise: Data | +| **Description** | Attacker modifies agent configuration to persist access | +| **Attack Vector** | Config file modification, settings injection | +| **Affected Components** | Agent config, tool policies | +| **Current Mitigations** | File permissions | +| **Residual Risk** | Medium - Requires local access | +| **Recommendations** | Config integrity verification, audit logging for config changes | --- @@ -299,27 +301,27 @@ Nothing is explicitly out of scope for this threat model. 
#### T-EVADE-001: Moderation Pattern Bypass -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0043 - Craft Adversarial Data | -| **Description** | Attacker crafts skill content to evade moderation patterns | -| **Attack Vector** | Unicode homoglyphs, encoding tricks, dynamic loading | -| **Affected Components** | ClawHub moderation.ts | -| **Current Mitigations** | Pattern-based FLAG_RULES | -| **Residual Risk** | High - Simple regex easily bypassed | -| **Recommendations** | Add behavioral analysis (VirusTotal Code Insight), AST-based detection | +| Attribute | Value | +| ----------------------- | ---------------------------------------------------------------------- | +| **ATLAS ID** | AML.T0043 - Craft Adversarial Data | +| **Description** | Attacker crafts skill content to evade moderation patterns | +| **Attack Vector** | Unicode homoglyphs, encoding tricks, dynamic loading | +| **Affected Components** | ClawHub moderation.ts | +| **Current Mitigations** | Pattern-based FLAG_RULES | +| **Residual Risk** | High - Simple regex easily bypassed | +| **Recommendations** | Add behavioral analysis (VirusTotal Code Insight), AST-based detection | #### T-EVADE-002: Content Wrapper Escape -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0043 - Craft Adversarial Data | -| **Description** | Attacker crafts content that escapes XML wrapper context | -| **Attack Vector** | Tag manipulation, context confusion, instruction override | -| **Affected Components** | External content wrapping | -| **Current Mitigations** | XML tags + security notice | -| **Residual Risk** | Medium - Novel escapes discovered regularly | -| **Recommendations** | Multiple wrapper layers, output-side validation | +| Attribute | Value | +| ----------------------- | --------------------------------------------------------- | +| **ATLAS ID** | AML.T0043 - Craft Adversarial Data | +| **Description** | Attacker crafts content that escapes XML wrapper context | +| 
**Attack Vector** | Tag manipulation, context confusion, instruction override | +| **Affected Components** | External content wrapping | +| **Current Mitigations** | XML tags + security notice | +| **Residual Risk** | Medium - Novel escapes discovered regularly | +| **Recommendations** | Multiple wrapper layers, output-side validation | --- @@ -327,27 +329,27 @@ Nothing is explicitly out of scope for this threat model. #### T-DISC-001: Tool Enumeration -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0040 - AI Model Inference API Access | -| **Description** | Attacker enumerates available tools through prompting | -| **Attack Vector** | "What tools do you have?" style queries | -| **Affected Components** | Agent tool registry | -| **Current Mitigations** | None specific | -| **Residual Risk** | Low - Tools generally documented | -| **Recommendations** | Consider tool visibility controls | +| Attribute | Value | +| ----------------------- | ----------------------------------------------------- | +| **ATLAS ID** | AML.T0040 - AI Model Inference API Access | +| **Description** | Attacker enumerates available tools through prompting | +| **Attack Vector** | "What tools do you have?" style queries | +| **Affected Components** | Agent tool registry | +| **Current Mitigations** | None specific | +| **Residual Risk** | Low - Tools generally documented | +| **Recommendations** | Consider tool visibility controls | #### T-DISC-002: Session Data Extraction -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0040 - AI Model Inference API Access | -| **Description** | Attacker extracts sensitive data from session context | -| **Attack Vector** | "What did we discuss?" 
queries, context probing | -| **Affected Components** | Session transcripts, context window | -| **Current Mitigations** | Session isolation per sender | -| **Residual Risk** | Medium - Within-session data accessible | -| **Recommendations** | Implement sensitive data redaction in context | +| Attribute | Value | +| ----------------------- | ----------------------------------------------------- | +| **ATLAS ID** | AML.T0040 - AI Model Inference API Access | +| **Description** | Attacker extracts sensitive data from session context | +| **Attack Vector** | "What did we discuss?" queries, context probing | +| **Affected Components** | Session transcripts, context window | +| **Current Mitigations** | Session isolation per sender | +| **Residual Risk** | Medium - Within-session data accessible | +| **Recommendations** | Implement sensitive data redaction in context | --- @@ -355,39 +357,39 @@ Nothing is explicitly out of scope for this threat model. #### T-EXFIL-001: Data Theft via web_fetch -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0009 - Collection | -| **Description** | Attacker exfiltrates data by instructing agent to send to external URL | -| **Attack Vector** | Prompt injection causing agent to POST data to attacker server | -| **Affected Components** | web_fetch tool | -| **Current Mitigations** | SSRF blocking for internal networks | -| **Residual Risk** | High - External URLs permitted | -| **Recommendations** | Implement URL allowlisting, data classification awareness | +| Attribute | Value | +| ----------------------- | ---------------------------------------------------------------------- | +| **ATLAS ID** | AML.T0009 - Collection | +| **Description** | Attacker exfiltrates data by instructing agent to send to external URL | +| **Attack Vector** | Prompt injection causing agent to POST data to attacker server | +| **Affected Components** | web_fetch tool | +| **Current Mitigations** | SSRF blocking for internal networks | +| 
**Residual Risk** | High - External URLs permitted | +| **Recommendations** | Implement URL allowlisting, data classification awareness | #### T-EXFIL-002: Unauthorized Message Sending -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0009 - Collection | -| **Description** | Attacker causes agent to send messages containing sensitive data | -| **Attack Vector** | Prompt injection causing agent to message attacker | -| **Affected Components** | Message tool, channel integrations | -| **Current Mitigations** | Outbound messaging gating | -| **Residual Risk** | Medium - Gating may be bypassed | -| **Recommendations** | Require explicit confirmation for new recipients | +| Attribute | Value | +| ----------------------- | ---------------------------------------------------------------- | +| **ATLAS ID** | AML.T0009 - Collection | +| **Description** | Attacker causes agent to send messages containing sensitive data | +| **Attack Vector** | Prompt injection causing agent to message attacker | +| **Affected Components** | Message tool, channel integrations | +| **Current Mitigations** | Outbound messaging gating | +| **Residual Risk** | Medium - Gating may be bypassed | +| **Recommendations** | Require explicit confirmation for new recipients | #### T-EXFIL-003: Credential Harvesting -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0009 - Collection | -| **Description** | Malicious skill harvests credentials from agent context | -| **Attack Vector** | Skill code reads environment variables, config files | -| **Affected Components** | Skill execution environment | -| **Current Mitigations** | None specific to skills | -| **Residual Risk** | Critical - Skills run with agent privileges | -| **Recommendations** | Skill sandboxing, credential isolation | +| Attribute | Value | +| ----------------------- | ------------------------------------------------------- | +| **ATLAS ID** | AML.T0009 - Collection | +| **Description** | Malicious 
skill harvests credentials from agent context | +| **Attack Vector** | Skill code reads environment variables, config files | +| **Affected Components** | Skill execution environment | +| **Current Mitigations** | None specific to skills | +| **Residual Risk** | Critical - Skills run with agent privileges | +| **Recommendations** | Skill sandboxing, credential isolation | --- @@ -395,39 +397,39 @@ Nothing is explicitly out of scope for this threat model. #### T-IMPACT-001: Unauthorized Command Execution -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0031 - Erode AI Model Integrity | -| **Description** | Attacker executes arbitrary commands on user system | -| **Attack Vector** | Prompt injection combined with exec approval bypass | -| **Affected Components** | Bash tool, command execution | -| **Current Mitigations** | Exec approvals, Docker sandbox option | -| **Residual Risk** | Critical - Host execution without sandbox | -| **Recommendations** | Default to sandbox, improve approval UX | +| Attribute | Value | +| ----------------------- | --------------------------------------------------- | +| **ATLAS ID** | AML.T0031 - Erode AI Model Integrity | +| **Description** | Attacker executes arbitrary commands on user system | +| **Attack Vector** | Prompt injection combined with exec approval bypass | +| **Affected Components** | Bash tool, command execution | +| **Current Mitigations** | Exec approvals, Docker sandbox option | +| **Residual Risk** | Critical - Host execution without sandbox | +| **Recommendations** | Default to sandbox, improve approval UX | #### T-IMPACT-002: Resource Exhaustion (DoS) -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0031 - Erode AI Model Integrity | -| **Description** | Attacker exhausts API credits or compute resources | -| **Attack Vector** | Automated message flooding, expensive tool calls | -| **Affected Components** | Gateway, agent sessions, API provider | -| **Current Mitigations** | 
None | -| **Residual Risk** | High - No rate limiting | -| **Recommendations** | Implement per-sender rate limits, cost budgets | +| Attribute | Value | +| ----------------------- | -------------------------------------------------- | +| **ATLAS ID** | AML.T0031 - Erode AI Model Integrity | +| **Description** | Attacker exhausts API credits or compute resources | +| **Attack Vector** | Automated message flooding, expensive tool calls | +| **Affected Components** | Gateway, agent sessions, API provider | +| **Current Mitigations** | None | +| **Residual Risk** | High - No rate limiting | +| **Recommendations** | Implement per-sender rate limits, cost budgets | #### T-IMPACT-003: Reputation Damage -| Attribute | Value | -|-----------|-------| -| **ATLAS ID** | AML.T0031 - Erode AI Model Integrity | -| **Description** | Attacker causes agent to send harmful/offensive content | -| **Attack Vector** | Prompt injection causing inappropriate responses | -| **Affected Components** | Output generation, channel messaging | -| **Current Mitigations** | LLM provider content policies | -| **Residual Risk** | Medium - Provider filters imperfect | -| **Recommendations** | Output filtering layer, user controls | +| Attribute | Value | +| ----------------------- | ------------------------------------------------------- | +| **ATLAS ID** | AML.T0031 - Erode AI Model Integrity | +| **Description** | Attacker causes agent to send harmful/offensive content | +| **Attack Vector** | Prompt injection causing inappropriate responses | +| **Affected Components** | Output generation, channel messaging | +| **Current Mitigations** | LLM provider content policies | +| **Residual Risk** | Medium - Provider filters imperfect | +| **Recommendations** | Output filtering layer, user controls | --- @@ -435,15 +437,15 @@ Nothing is explicitly out of scope for this threat model. 
### 4.1 Current Security Controls -| Control | Implementation | Effectiveness | -|---------|----------------|---------------| -| GitHub Account Age | `requireGitHubAccountAge()` | Medium - Raises bar for new attackers | -| Path Sanitization | `sanitizePath()` | High - Prevents path traversal | -| File Type Validation | `isTextFile()` | Medium - Only text files, but can still be malicious | -| Size Limits | 50MB total bundle | High - Prevents resource exhaustion | -| Required SKILL.md | Mandatory readme | Low security value - Informational only | -| Pattern Moderation | FLAG_RULES in moderation.ts | Low - Easily bypassed | -| Moderation Status | `moderationStatus` field | Medium - Manual review possible | +| Control | Implementation | Effectiveness | +| -------------------- | --------------------------- | ---------------------------------------------------- | +| GitHub Account Age | `requireGitHubAccountAge()` | Medium - Raises bar for new attackers | +| Path Sanitization | `sanitizePath()` | High - Prevents path traversal | +| File Type Validation | `isTextFile()` | Medium - Only text files, but can still be malicious | +| Size Limits | 50MB total bundle | High - Prevents resource exhaustion | +| Required SKILL.md | Mandatory readme | Low security value - Informational only | +| Pattern Moderation | FLAG_RULES in moderation.ts | Low - Easily bypassed | +| Moderation Status | `moderationStatus` field | Medium - Manual review possible | ### 4.2 Moderation Flag Patterns @@ -463,6 +465,7 @@ Current patterns in `moderation.ts`: ``` **Limitations:** + - Only checks slug, displayName, summary, frontmatter, metadata, file paths - Does not analyze actual skill code content - Simple regex easily bypassed with obfuscation @@ -470,12 +473,12 @@ Current patterns in `moderation.ts`: ### 4.3 Planned Improvements -| Improvement | Status | Impact | -|-------------|--------|--------| -| VirusTotal Integration | In Progress | High - Code Insight behavioral analysis | -| Community 
Reporting | Partial (`skillReports` table exists) | Medium | -| Audit Logging | Partial (`auditLogs` table exists) | Medium | -| Badge System | Implemented | Medium - `highlighted`, `official`, `deprecated`, `redactionApproved` | +| Improvement | Status | Impact | +| ---------------------- | ------------------------------------- | --------------------------------------------------------------------- | +| VirusTotal Integration | In Progress | High - Code Insight behavioral analysis | +| Community Reporting | Partial (`skillReports` table exists) | Medium | +| Audit Logging | Partial (`auditLogs` table exists) | Medium | +| Badge System | Implemented | Medium - `highlighted`, `official`, `deprecated`, `redactionApproved` | --- @@ -483,37 +486,40 @@ Current patterns in `moderation.ts`: ### 5.1 Likelihood vs Impact -| Threat ID | Likelihood | Impact | Risk Level | Priority | -|-----------|------------|--------|------------|----------| -| T-EXEC-001 | High | Critical | **Critical** | P0 | -| T-PERSIST-001 | High | Critical | **Critical** | P0 | -| T-EXFIL-003 | Medium | Critical | **Critical** | P0 | -| T-IMPACT-001 | Medium | Critical | **High** | P1 | -| T-EXEC-002 | High | High | **High** | P1 | -| T-EXEC-004 | Medium | High | **High** | P1 | -| T-ACCESS-003 | Medium | High | **High** | P1 | -| T-EXFIL-001 | Medium | High | **High** | P1 | -| T-IMPACT-002 | High | Medium | **High** | P1 | -| T-EVADE-001 | High | Medium | **Medium** | P2 | -| T-ACCESS-001 | Low | High | **Medium** | P2 | -| T-ACCESS-002 | Low | High | **Medium** | P2 | -| T-PERSIST-002 | Low | High | **Medium** | P2 | +| Threat ID | Likelihood | Impact | Risk Level | Priority | +| ------------- | ---------- | -------- | ------------ | -------- | +| T-EXEC-001 | High | Critical | **Critical** | P0 | +| T-PERSIST-001 | High | Critical | **Critical** | P0 | +| T-EXFIL-003 | Medium | Critical | **Critical** | P0 | +| T-IMPACT-001 | Medium | Critical | **High** | P1 | +| T-EXEC-002 | High | High | 
**High** | P1 | +| T-EXEC-004 | Medium | High | **High** | P1 | +| T-ACCESS-003 | Medium | High | **High** | P1 | +| T-EXFIL-001 | Medium | High | **High** | P1 | +| T-IMPACT-002 | High | Medium | **High** | P1 | +| T-EVADE-001 | High | Medium | **Medium** | P2 | +| T-ACCESS-001 | Low | High | **Medium** | P2 | +| T-ACCESS-002 | Low | High | **Medium** | P2 | +| T-PERSIST-002 | Low | High | **Medium** | P2 | ### 5.2 Critical Path Attack Chains **Attack Chain 1: Skill-Based Data Theft** + ``` T-PERSIST-001 → T-EVADE-001 → T-EXFIL-003 (Publish malicious skill) → (Evade moderation) → (Harvest credentials) ``` **Attack Chain 2: Prompt Injection to RCE** + ``` T-EXEC-001 → T-EXEC-004 → T-IMPACT-001 (Inject prompt) → (Bypass exec approval) → (Execute commands) ``` **Attack Chain 3: Indirect Injection via Fetched Content** + ``` T-EXEC-002 → T-EXFIL-001 → External exfiltration (Poison URL content) → (Agent fetches & follows instructions) → (Data sent to attacker) @@ -525,28 +531,28 @@ T-EXEC-002 → T-EXFIL-001 → External exfiltration ### 6.1 Immediate (P0) -| ID | Recommendation | Addresses | -|----|----------------|-----------| -| R-001 | Complete VirusTotal integration | T-PERSIST-001, T-EVADE-001 | -| R-002 | Implement skill sandboxing | T-PERSIST-001, T-EXFIL-003 | -| R-003 | Add output validation for sensitive actions | T-EXEC-001, T-EXEC-002 | +| ID | Recommendation | Addresses | +| ----- | ------------------------------------------- | -------------------------- | +| R-001 | Complete VirusTotal integration | T-PERSIST-001, T-EVADE-001 | +| R-002 | Implement skill sandboxing | T-PERSIST-001, T-EXFIL-003 | +| R-003 | Add output validation for sensitive actions | T-EXEC-001, T-EXEC-002 | ### 6.2 Short-term (P1) -| ID | Recommendation | Addresses | -|----|----------------|-----------| -| R-004 | Implement rate limiting | T-IMPACT-002 | -| R-005 | Add token encryption at rest | T-ACCESS-003 | -| R-006 | Improve exec approval UX and validation | T-EXEC-004 | -| R-007 | 
Implement URL allowlisting for web_fetch | T-EXFIL-001 | +| ID | Recommendation | Addresses | +| ----- | ---------------------------------------- | ------------ | +| R-004 | Implement rate limiting | T-IMPACT-002 | +| R-005 | Add token encryption at rest | T-ACCESS-003 | +| R-006 | Improve exec approval UX and validation | T-EXEC-004 | +| R-007 | Implement URL allowlisting for web_fetch | T-EXFIL-001 | ### 6.3 Medium-term (P2) -| ID | Recommendation | Addresses | -|----|----------------|-----------| -| R-008 | Add cryptographic channel verification where possible | T-ACCESS-002 | -| R-009 | Implement config integrity verification | T-PERSIST-003 | -| R-010 | Add update signing and version pinning | T-PERSIST-002 | +| ID | Recommendation | Addresses | +| ----- | ----------------------------------------------------- | ------------- | +| R-008 | Add cryptographic channel verification where possible | T-ACCESS-002 | +| R-009 | Implement config integrity verification | T-PERSIST-003 | +| R-010 | Add update signing and version pinning | T-PERSIST-002 | --- @@ -554,44 +560,44 @@ T-EXEC-002 → T-EXFIL-001 → External exfiltration ### 7.1 ATLAS Technique Mapping -| ATLAS ID | Technique Name | OpenClaw Threats | -|----------|----------------|------------------| -| AML.T0006 | Active Scanning | T-RECON-001, T-RECON-002 | -| AML.T0009 | Collection | T-EXFIL-001, T-EXFIL-002, T-EXFIL-003 | -| AML.T0010.001 | Supply Chain: AI Software | T-PERSIST-001, T-PERSIST-002 | -| AML.T0010.002 | Supply Chain: Data | T-PERSIST-003 | -| AML.T0031 | Erode AI Model Integrity | T-IMPACT-001, T-IMPACT-002, T-IMPACT-003 | -| AML.T0040 | AI Model Inference API Access | T-ACCESS-001, T-ACCESS-002, T-ACCESS-003, T-DISC-001, T-DISC-002 | -| AML.T0043 | Craft Adversarial Data | T-EXEC-004, T-EVADE-001, T-EVADE-002 | -| AML.T0051.000 | LLM Prompt Injection: Direct | T-EXEC-001, T-EXEC-003 | -| AML.T0051.001 | LLM Prompt Injection: Indirect | T-EXEC-002 | +| ATLAS ID | Technique Name | OpenClaw Threats | 
+| ------------- | ------------------------------ | ---------------------------------------------------------------- | +| AML.T0006 | Active Scanning | T-RECON-001, T-RECON-002 | +| AML.T0009 | Collection | T-EXFIL-001, T-EXFIL-002, T-EXFIL-003 | +| AML.T0010.001 | Supply Chain: AI Software | T-PERSIST-001, T-PERSIST-002 | +| AML.T0010.002 | Supply Chain: Data | T-PERSIST-003 | +| AML.T0031 | Erode AI Model Integrity | T-IMPACT-001, T-IMPACT-002, T-IMPACT-003 | +| AML.T0040 | AI Model Inference API Access | T-ACCESS-001, T-ACCESS-002, T-ACCESS-003, T-DISC-001, T-DISC-002 | +| AML.T0043 | Craft Adversarial Data | T-EXEC-004, T-EVADE-001, T-EVADE-002 | +| AML.T0051.000 | LLM Prompt Injection: Direct | T-EXEC-001, T-EXEC-003 | +| AML.T0051.001 | LLM Prompt Injection: Indirect | T-EXEC-002 | ### 7.2 Key Security Files -| Path | Purpose | Risk Level | -|------|---------|------------| -| `src/infra/exec-approvals.ts` | Command approval logic | **Critical** | -| `src/gateway/auth.ts` | Gateway authentication | **Critical** | -| `src/web/inbound/access-control.ts` | Channel access control | **Critical** | -| `src/infra/net/ssrf.ts` | SSRF protection | **Critical** | -| `src/security/external-content.ts` | Prompt injection mitigation | **Critical** | -| `src/agents/sandbox/tool-policy.ts` | Tool policy enforcement | **Critical** | -| `convex/lib/moderation.ts` | ClawHub moderation | **High** | -| `convex/lib/skillPublish.ts` | Skill publishing flow | **High** | -| `src/routing/resolve-route.ts` | Session isolation | **Medium** | +| Path | Purpose | Risk Level | +| ----------------------------------- | --------------------------- | ------------ | +| `src/infra/exec-approvals.ts` | Command approval logic | **Critical** | +| `src/gateway/auth.ts` | Gateway authentication | **Critical** | +| `src/web/inbound/access-control.ts` | Channel access control | **Critical** | +| `src/infra/net/ssrf.ts` | SSRF protection | **Critical** | +| `src/security/external-content.ts` | Prompt 
injection mitigation | **Critical** | +| `src/agents/sandbox/tool-policy.ts` | Tool policy enforcement | **Critical** | +| `convex/lib/moderation.ts` | ClawHub moderation | **High** | +| `convex/lib/skillPublish.ts` | Skill publishing flow | **High** | +| `src/routing/resolve-route.ts` | Session isolation | **Medium** | ### 7.3 Glossary -| Term | Definition | -|------|------------| -| **ATLAS** | MITRE's Adversarial Threat Landscape for AI Systems | -| **ClawHub** | OpenClaw's skill marketplace | -| **Gateway** | OpenClaw's message routing and authentication layer | -| **MCP** | Model Context Protocol - tool provider interface | +| Term | Definition | +| -------------------- | --------------------------------------------------------- | +| **ATLAS** | MITRE's Adversarial Threat Landscape for AI Systems | +| **ClawHub** | OpenClaw's skill marketplace | +| **Gateway** | OpenClaw's message routing and authentication layer | +| **MCP** | Model Context Protocol - tool provider interface | | **Prompt Injection** | Attack where malicious instructions are embedded in input | -| **Skill** | Downloadable extension for OpenClaw agents | -| **SSRF** | Server-Side Request Forgery | +| **Skill** | Downloadable extension for OpenClaw agents | +| **SSRF** | Server-Side Request Forgery | --- -*This threat model is a living document. Report security issues to security@openclaw.ai* +_This threat model is a living document. Report security issues to security@openclaw.ai_
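
Recommendation R-004 (rate limiting, addressing T-IMPACT-002) is stated only at a high level above. As an illustration, here is a minimal sketch of a per-sender sliding-window limiter. The class name `PerSenderRateLimiter`, its method `allow`, and the 10-messages-per-minute setting are assumptions made for this example (the figure echoes the example in the contributing guide). They are not taken from the OpenClaw codebase, and a real gateway would likely need persistence and eviction on top of this.

```typescript
// Hypothetical sketch: these names do not exist in the OpenClaw codebase.
// Sliding-window limiter: each sender may send at most `limit` messages
// within any interval of `windowMs` milliseconds.
class PerSenderRateLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  // Timestamps (ms) of recently accepted messages, keyed by sender ID.
  private readonly history: Map<string, number[]> = new Map();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the message is accepted, false if the sender is over budget.
  // `nowMs` is passed in explicitly so the logic is easy to test.
  allow(senderId: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Keep only the timestamps that are still inside the window.
    const recent = (this.history.get(senderId) ?? []).filter((t) => t > cutoff);
    const accepted = recent.length < this.limit;
    if (accepted) {
      recent.push(nowMs);
    }
    this.history.set(senderId, recent);
    return accepted;
  }
}

// Assumed example budget: 10 messages per sender per minute.
const limiter = new PerSenderRateLimiter(10, 60000);
const firstAccepted = limiter.allow("sender-a", Date.now());
console.log(firstAccepted);
```

Rejected messages are not recorded, so a flooding sender cannot extend its own lockout, but it also regains budget as soon as older messages age out of the window. This sketch keeps one entry per sender indefinitely; a production version would need to evict idle senders and would probably pair the message count with a cost budget, as the T-IMPACT-002 recommendation suggests.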