Layer 1 — Recall-time filter (task-filter.ts):
- New module that reads completed tasks from TASKS.md and filters out
recalled memories that match completed task IDs or keywords
- Integrated into auto-recall hook as Feature 3 (after score/dedup filters)
- 60-second cache to avoid re-parsing TASKS.md on every message
- 29 new tests
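A minimal sketch of the filtering shape, assuming a `- [x] <taskId> <description>`
ledger format; the names and matching rules here are illustrative, not the
actual task-filter.ts API:

```typescript
import { readFileSync } from "node:fs";

interface CompletedTask {
  id: string;
  keywords: string[];
}

let cache: { tasks: CompletedTask[]; at: number } | null = null;
const CACHE_TTL_MS = 60_000; // 60-second cache, matching the note above

function completedTasks(ledgerPath: string): CompletedTask[] {
  if (cache && Date.now() - cache.at < CACHE_TTL_MS) return cache.tasks;
  const tasks: CompletedTask[] = [];
  for (const line of readFileSync(ledgerPath, "utf8").split("\n")) {
    const m = line.match(/^- \[x\] (\S+) (.+)$/); // assumed "- [x] <id> <description>" format
    if (m) tasks.push({ id: m[1], keywords: m[2].toLowerCase().split(/\s+/) });
  }
  cache = { tasks, at: Date.now() };
  return tasks;
}

// Drop recalled memories that reference a completed task by ID or keyword.
function filterRecalled<M extends { content: string }>(memories: M[], ledgerPath: string): M[] {
  const done = completedTasks(ledgerPath);
  return memories.filter((mem) => {
    const text = mem.content.toLowerCase();
    return !done.some(
      (t) =>
        text.includes(t.id.toLowerCase()) ||
        t.keywords.some((k) => k.length > 4 && text.includes(k)), // crude keyword match
    );
  });
}
```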
Layer 2 — Sleep cycle Phase 7 (task-memory cleanup):
- New phase cross-references completed tasks with stored memories
- LLM classifies each matched memory as 'lasting' (keep) or 'noise' (delete)
- Conservative: keeps memories on any doubt or LLM failure
- Scans only tasks completed within last 7 days
- New searchMemoriesByKeywords() method on neo4j client
- 16 new tests
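The conservative keep-on-doubt rule could look like this (hypothetical names,
not the shipped phase code):

```typescript
type Verdict = "lasting" | "noise";

async function classifyTaskMemory(
  llm: (prompt: string) => Promise<string>,
  memory: string,
  task: string,
): Promise<Verdict> {
  try {
    const answer = await llm(
      `Task "${task}" is complete. Is this memory lasting knowledge or task-scoped noise?\n` +
      `${memory}\nAnswer with exactly "lasting" or "noise".`,
    );
    // Only an unambiguous "noise" deletes; anything else keeps the memory.
    return answer.trim().toLowerCase() === "noise" ? "noise" : "lasting";
  } catch {
    return "lasting"; // conservative: keep on LLM failure
  }
}
```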
Layer 3 — Memory task metadata (taskId field):
- Optional taskId field on MemoryNode, StoreMemoryInput, and search results
- Auto-tags memories during auto-capture when exactly 1 active task exists
- Precise taskId-based filtering at recall time (complements Layer 1)
- findMemoriesByTaskId() and clearTaskIdFromMemories() on neo4j client
- taskId flows through vector, BM25, and graph search signals + RRF fusion
- 20 new tests
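A sketch of the auto-tagging rule (shapes abbreviated; the real
StoreMemoryInput has more fields):

```typescript
interface StoreMemoryInput {
  content: string;
  taskId?: string; // optional, backward compatible
}

function tagWithActiveTask(input: StoreMemoryInput, activeTaskIds: string[]): StoreMemoryInput {
  // Only tag when attribution is unambiguous: exactly one active task.
  return activeTaskIds.length === 1 ? { ...input, taskId: activeTaskIds[0] } : input;
}
```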
All 669 memory-neo4j tests pass. Zero regressions in full suite.
All changes are backward compatible — existing memories without taskId
continue to work. No migration needed.
- Strengthen extraction prompt to always generate 2-4 tags per memory
- Add Phase 2b: Retroactive Tagging to the sleep cycle for untagged memories
  (wiring sketched below)
- Include 'skipped' memories (i.e. imported memories) in the extraction pipeline
- Add listUntaggedMemories() helper to neo4j-client
- Add extractTagsOnly() lightweight prompt for tag-only extraction
- Add CLI display for Phase 2b stats
Fixes: 79% of memories had zero tags, caused by weak prompt guidance
and by imported memories never going through extraction.
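Roughly how Phase 2b could be wired; beyond listUntaggedMemories() and
extractTagsOnly(), the names here (e.g. addTags) are hypothetical:

```typescript
async function retroactiveTagging(
  client: {
    listUntaggedMemories(limit: number): Promise<{ id: string; content: string }[]>;
    addTags(id: string, tags: string[]): Promise<void>; // hypothetical helper
  },
  extractTagsOnly: (content: string) => Promise<string[]>,
): Promise<number> {
  let tagged = 0;
  for (const mem of await client.listUntaggedMemories(200)) {
    const tags = await extractTagsOnly(mem.content); // prompt asks for 2-4 tags
    if (tags.length > 0) {
      await client.addTags(mem.id, tags);
      tagged++;
    }
  }
  return tagged; // surfaced in the Phase 2b CLI stats
}
```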
Sleep cycle is now triggered by a system cron job (`0 3 * * *`) calling
`openclaw memory neo4j sleep` rather than an in-process 6-hour interval
timer with mutex. Simpler, more reliable, and easier to manage.
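For reference, the corresponding crontab entry (schedule and command exactly
as above):

```
# run the nightly memory sleep cycle at 03:00
0 3 * * * openclaw memory neo4j sleep
```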
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Search results now include per-signal attribution (vec/bm25/graph rank+score)
threaded through RRF fusion to memory_recall output and auto-recall debug logs
- New --report flag on sleep command shows post-cycle quality metrics
(extraction coverage, entity graph density, decay distribution)
- New `health` subcommand with 5-section dashboard: memory overview,
extraction health, entity graph, tag health, decay distribution;
supports --agent scoping and --json output
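A sketch of RRF fusion that threads per-signal attribution through to the
output (shapes illustrative; signal names vec/bm25/graph as above):

```typescript
interface Ranked { id: string; score: number }
interface Attribution { [signal: string]: { rank: number; score: number } }

function rrfFuse(
  signals: Record<string, Ranked[]>, // e.g. { vec: [...], bm25: [...], graph: [...] }
  k = 60,
): Map<string, { rrf: number; attribution: Attribution }> {
  const fused = new Map<string, { rrf: number; attribution: Attribution }>();
  for (const [name, results] of Object.entries(signals)) {
    results.forEach((r, i) => {
      const entry = fused.get(r.id) ?? { rrf: 0, attribution: {} };
      entry.rrf += 1 / (k + i + 1); // standard RRF term
      entry.attribution[name] = { rank: i + 1, score: r.score }; // kept for output/debug logs
      fused.set(r.id, entry);
    });
  }
  return fused;
}
```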
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Auto-trigger the sleep cycle (dedup, extraction, decay, cleanup) in the
background after agent_end when 6h+ have elapsed since the last cycle.
Configurable via sleepCycle.auto and sleepCycle.autoIntervalMs. Removes
the need for an external cron job under regular gateway usage.
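Example config shape (keys from above; values illustrative):

```typescript
const config = {
  sleepCycle: {
    auto: true,                          // run in the background after agent_end
    autoIntervalMs: 6 * 60 * 60 * 1000,  // skip if a cycle ran within the last 6h
  },
};
```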
Also includes: removal of Pareto promotion (replaced by manual core
promotion), entity dedup in sleep cycle, and sleep cycle pipeline
cleanup.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add task ledger (TASKS.md) parsing and stale-task archival for maintaining
agent task state across context compactions. Post-compaction recovery injects
memory_recall + TASKS.md read steps after auto-compaction. Sleep cycle gains
entity dedup (Phase 1d) and credential scanning. Memory flush now extracts
active task checkpoints. Compaction instructions prioritize active tasks.
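The recovery injection might look like this sketch (tool names and shapes
hypothetical):

```typescript
// After auto-compaction, queue steps that re-anchor the agent on its tasks.
function postCompactionRecoverySteps(topic: string) {
  return [
    { tool: "memory_recall", args: { query: topic, limit: 5 } },
    { tool: "read", args: { path: "TASKS.md" } }, // "read" is a stand-in tool name
  ];
}
```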
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add findSingleUseTags() to prune tags with only 1 reference after 14 days
- Enhance findDuplicateEntityPairs() to match on entity aliases
- Add normalizeTagName() to collapse hyphens/underscores to spaces (sketched below)
- Monitor 'other' category accumulation in sleep cycle Phase 2
- Tighten extraction prompt with explicit entity blocklist (80 terms)
- Raise auto-capture threshold from 0.5 to 0.65
- Fix tests for entity dedup phase and skipPromotion default
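A plausible shape for normalizeTagName() (sketch):

```typescript
function normalizeTagName(name: string): string {
  // "follow-up", "follow_up", "Follow Up" all normalize to "follow up"
  return name.trim().toLowerCase().replace(/[-_]+/g, " ").replace(/\s+/g, " ");
}
```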
Open proposals ("Want me to...?", "Should I...?") are dangerous in
long-term memory because other sessions interpret them as active
instructions and attempt to carry them out. This adds:
- Attention gate patterns for cron delivery outputs and assistant proposals
- Extractor scoring rules to rate proposals/action items as low importance
- Sleep-cycle Phase 7 to retroactively clean existing noise-pattern memories
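Illustrative examples of the proposal patterns (not the shipped list):

```typescript
const PROPOSAL_PATTERNS: RegExp[] = [
  /^want me to\b/i,
  /^should i\b/i,
  /^shall i\b/i,
  /\bwould you like me to\b/i,
];

const isOpenProposal = (text: string): boolean =>
  PROPOSAL_PATTERNS.some((p) => p.test(text.trim()));
```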
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add attention gate patterns for voice mode context and session
completion summaries (ephemeral, not user knowledge)
- Rewrite importance rating prompt with detailed scoring guide and
concrete examples to reduce over-scoring of assistant narration
- Raise dedup safety bound from 500 to 2000 pairs
- Add skipPromotion option (default true) so core tier stays
user-curated only
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- stats: show per-agent bar graphs for category counts and avg importance
- list: show actual memory contents grouped by agent/category with importance bars
- list: add --agent, --category, --limit filters
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The agentFilter used `m.agentId` in both the direct-mentions and N-hop
sections of the Cypher query, but `m` is out of scope in the N-hop
section where the Memory node is aliased as `m2`.
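Simplified illustration of the fix (not the full query; the relationship
shape here is invented):

```typescript
const nHopSection = `
  MATCH (e)<-[:MENTIONS]-(m2:Memory)
  WHERE m2.agentId = $agentId  // was m.agentId, but m is not bound in this section
  RETURN m2
`;
```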
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add userPinned boolean on Memory nodes: user-stored core memories are
immune from importance recalculation, decay, and pruning. Only removable
via memory_forget. Importance locked at 1.0.
- Add listCoreForInjection(): always injects ALL userPinned core memories
plus top N non-pinned core memories by importance (no silent drop-off
for user-pinned memories regardless of maxEntries cap); selection rule
sketched after this list.
- Remove core demotion entirely: promotion is now one-way. Bad core
memories are handled manually via memory_forget.
- Add [bench] performance timing to auto-recall, auto-capture, core
memory injection, core refresh, and hybridSearch.
- Audit fixes: remove dead entity/tag methods, dead test blocks, orphaned
demoteFromCore docstring, unnecessary .slice() in graphSearch.
- Refactor attention gate into shared checks for user/assistant gates.
- Consolidate LLM client, message utils, and config helpers.
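The selection rule, roughly (shapes abbreviated; whether non-pinned memories
get the full maxEntries budget or only the remainder is an assumption here):

```typescript
interface CoreMemory { content: string; importance: number; userPinned: boolean }

function listCoreForInjection(core: CoreMemory[], maxEntries: number): CoreMemory[] {
  const pinned = core.filter((m) => m.userPinned); // always injected, never capped
  const topRest = core
    .filter((m) => !m.userPinned)
    .sort((a, b) => b.importance - a.importance)
    .slice(0, maxEntries); // only non-pinned memories compete for the cap
  return [...pinned, ...topRest];
}
```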
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The hardcoded MAX_SEMANTIC_DEDUP_PAIRS (50) and LLM_CONCURRENCY (8) were
designed for expensive cloud LLM calls. For local Ollama inference these
caps are unnecessarily restrictive, especially during long sleep windows.
- Add maxSemanticDedupPairs to SleepCycleOptions (default: 500)
- Add llmConcurrency to SleepCycleOptions (default: 8)
- Add --max-semantic-pairs and --concurrency CLI flags
- Raise semantic dedup default from 50 to 500 pairs
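The new options as they might appear on SleepCycleOptions (sketch; other
fields elided):

```typescript
interface SleepCycleOptions {
  maxSemanticDedupPairs?: number; // default 500 (previously a hardcoded 50)
  llmConcurrency?: number;        // default 8
}
```

A long local-Ollama window might then use, say,
`openclaw memory neo4j sleep --max-semantic-pairs 1000 --concurrency 16`
(flag names from above; values illustrative).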
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add 11 ASSISTANT_NARRATION_PATTERNS to reject play-by-play self-talk
("Let me check...", "I'll run...", "Starting...", "Good! The...", etc.)
- Cap Phase 1b semantic dedup to 50 pairs (sorted by similarity desc)
to prevent sleep cycle timeouts on large memory sets
- Raise user auto-capture importance threshold from 0.3 to 0.5
- Raise assistant auto-capture importance threshold from 0.7 to 0.8
- Raise MIN_WORD_COUNT from 5 to 8 for user attention gate
- Neo4j cleanup: deleted 155 noise entries (394→242 memories),
recategorized 2 misplaced entries, stripped Slack metadata from 1
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds skipSemanticDedup option to runSleepCycle that skips Phase 1b
(semantic dedup) and Phase 1c (conflict detection), both of which
require LLM calls. Useful for fast/cheap sleep runs that only need
vector-based dedup and decay.
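Usage sketch (signature abbreviated, inside an async context):

```typescript
// Fast/cheap run: vector-based dedup and decay only, no LLM phases.
const result = await runSleepCycle({ skipSemanticDedup: true });
```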
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Gateway (pipecat compatibility):
- openai-http: add finish_reason:"stop" on final SSE chunk, fix ID format
(chatcmpl- not chatcmpl_), capture timestamp once, use delta only, add
writable checks and flush after writes
- http-common: add TCP_NODELAY, X-Accel-Buffering:no, flush after writes,
writable checks on writeDone
- agent-events: fix seqByRun memory leak in clearAgentRunContext
Voice-call security:
- manager.ts, twiml.ts, twilio.ts: escape voice/language XML attributes
to prevent XML injection
- voice-mapping: strip control characters in escapeXml
Voice-call bugs:
- tts-openai: fix broken resample24kTo8k (interpolation frac always 0);
  see the sketch after this list
- stt-openai-realtime: close zombie WebSocket on connection timeout
- telnyx: extract direction/from/to for inbound calls (were silently dropped)
- plivo: clean up 5 internal maps on terminal call states (memory leak)
- twilio: clean up callWebhookUrls on terminal call states
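For the resampler item above: 24 kHz to 8 kHz is an exact 3:1 decimation,
so one correct shape is averaging each group of three samples before
dropping them. An illustrative sketch, not the actual patch:

```typescript
function resample24kTo8k(input: Int16Array): Int16Array {
  const out = new Int16Array(Math.floor(input.length / 3));
  for (let i = 0; i < out.length; i++) {
    const j = i * 3;
    // Average each group of three samples: crude anti-aliasing before decimation.
    out[i] = Math.round((input[j] + input[j + 1] + input[j + 2]) / 3);
  }
  return out;
}
```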
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Search: fix entity classification order (proper nouns before word count);
add BM25 min-max normalization with a floor and an empty-query guard.
Decay: retrieval-reinforced half-life with effective age anchored to
lastRetrievedAt, parameterized category curves (no string interpolation);
sketched after this list.
Dedup: transfer TAGGED relationships to the survivor during merge.
Orphans: use EXISTS pattern instead of stale mentionCount.
Embeddings: Ollama retry with exponential backoff (2 retries, 1s base).
Config: resolve env vars in neo4j.uri, re-export MemoryCategory from schema.
Extractor: abort-aware batch delay, anonymize prompt examples.
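The decay anchoring, roughly (constants and shape illustrative; the real
curves are parameterized per category):

```typescript
function decayFactor(lastRetrievedAt: number, halfLifeDays: number, now = Date.now()): number {
  const effectiveAgeDays = (now - lastRetrievedAt) / 86_400_000; // clock resets on retrieval
  return Math.pow(0.5, effectiveAgeDays / halfLifeDays); // halves every halfLifeDays
}
```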
Tests: add 80 tests for index.ts (attention gates, message extraction,
wrapper stripping). Full suite: 480 tests across 8 files, all passing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add composite index on (agentId, category) for faster filtered queries
- Combine graph search into single UNION Cypher query (was 2 sequential)
- Parallelize conflict resolution with LLM_CONCURRENCY chunks
- Batch entity operations (merge, mentions, relationships, tags, category,
  extraction status) into a single managed transaction (sketched after this list)
- Make auto-capture fire-and-forget with shared captureMessage helper
- Extract attention-gate.ts and message-utils.ts modules from index.ts
and extractor.ts for better separation of concerns
- Update tests to match new batched/combined APIs
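The batched-transaction item might look like this sketch (statements and
shapes hypothetical; assumes a neo4j-driver session):

```typescript
import neo4j from "neo4j-driver";

const driver = neo4j.driver("bolt://localhost:7687", neo4j.auth.basic("neo4j", "pass"));
const session = driver.session();

const entities = [{ name: "Neo4j" }];
const tags = [{ memoryId: "mem-1", tag: "databases" }];

// One managed transaction instead of one round trip per operation.
await session.executeWrite(async (tx) => {
  await tx.run("UNWIND $entities AS e MERGE (n:Entity {name: e.name})", { entities });
  await tx.run(
    "UNWIND $tags AS t MATCH (m:Memory {id: t.memoryId}) " +
      "MERGE (g:Tag {name: t.tag}) MERGE (m)-[:TAGGED]->(g)",
    { tags },
  );
});
await session.close();
```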
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix initPromise retry: reset to null on failure so subsequent calls
  retry instead of returning a cached rejected promise (pattern sketched
  after this list)
- Remove dead code: findPromotionCandidates, findDemotionCandidates,
calculateEffectiveImportance (~190 lines, never called)
- Add agentId filter to deleteMemory() to prevent cross-agent deletion
- Fix phase label swaps: 1b=Semantic Dedup, 1c=Conflict Detection
(CLI banner, phaseNames map, SleepCycleResult/Options type comments)
- Add autoRecallMinScore and coreMemory config to plugin JSON schema
so the UI can validate and display these options
- Add embedding LRU cache (200 entries, SHA-256 keyed) to eliminate
redundant API calls across auto-recall, auto-capture, and tools
- Add Ollama concurrency limiter (chunks of 4) to prevent thundering
herd on single-threaded embedding server
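The initPromise fix is the classic reset-on-failure pattern (sketch):

```typescript
let initPromise: Promise<void> | null = null;

function ensureInit(doInit: () => Promise<void>): Promise<void> {
  if (!initPromise) {
    initPromise = doInit().catch((err) => {
      initPromise = null; // reset so the next caller retries
      throw err;
    });
  }
  return initPromise;
}
```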
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Raise MIN_CAPTURE_CHARS from 10 to 30 to reject trivially short messages
- Add noise patterns for conversational filler (haha, lol, hmm, etc.)
- Add noise pattern to reject /new and /reset session prompts
- Raise importance threshold for assistant auto-captures to >= 0.7
- Add Slack protocol prefix/suffix stripping in stripMessageWrappers()
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Five high-impact improvements to the memory system:
1. Min RRF score threshold on auto-recall (default 0.25) — filters low-relevance
results before injecting into context
2. Deduplicate auto-recall against core memories already present in context
3. Capture assistant messages (decisions, recommendations, synthesized facts)
with stricter attention gating and "auto-capture-assistant" source type
4. LLM-judged importance scoring at capture time (0.1-1.0) with 5s timeout
   fallback to 0.5, replacing the flat 0.5 default (fallback sketched below)
5. Conflict detection in sleep cycle (Phase 1b) — finds contradictory memories
sharing entities, uses LLM to resolve, invalidates the loser
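Item 4's timeout fallback, roughly (sketch):

```typescript
async function rateImportance(llmScore: () => Promise<number>): Promise<number> {
  const fallback = new Promise<number>((resolve) =>
    setTimeout(() => resolve(0.5), 5_000), // 5s timeout, flat 0.5 (timer cleanup elided)
  );
  try {
    return await Promise.race([llmScore(), fallback]);
  } catch {
    return 0.5; // same fallback on LLM error
  }
}
```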
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a CLI command to re-embed all Memory and Entity nodes after
changing the embedding model or provider. Drops old vector indexes,
re-embeds in batches via the configured provider, and recreates
indexes with the correct dimensions.
- Add extraction config section (apiKey, model, baseUrl) to plugin schema
with env-var fallback and Ollama/local LLM support (no API key required)
- Add category classification to extraction prompt; update memories from
'other' to LLM-assigned category
- Reorder sleep phases: extraction before decay
- Parallelize extraction (3 concurrent via Promise.allSettled)
- Pre-compute effective scores once and reuse for promotion/demotion
- Replace O(n²) Cartesian dedup with per-memory HNSW vector index queries
  (sketched below)
- Use mentionCount for orphan entity detection instead of subquery
- Remove dead auto-capture code (evaluateAutoCapture, CaptureItem, etc.)
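The HNSW replacement in sketch form: instead of comparing every memory
against every other, each memory queries the vector index for its own
nearest neighbours (`db.index.vector.queryNodes` is the Neo4j 5 procedure;
the index name and similarity cutoff here are illustrative):

```typescript
const nearDuplicatePairs = `
  MATCH (m:Memory)
  CALL db.index.vector.queryNodes('memory-embeddings', 5, m.embedding)
  YIELD node, score
  WHERE node <> m AND m.id < node.id AND score > 0.92  // report each pair once
  RETURN m.id AS a, node.id AS b, score
`;
```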
Add `coreMemory.refreshAtContextPercent` config option to re-inject
core memories when context usage exceeds a threshold. This counters
the "lost in the middle" phenomenon documented by Liu et al. (2023).
Implementation:
- Extend before_agent_start hook event with context usage info
- Pass contextWindowTokens and estimatedUsedTokens to hooks
- Track mid-session refresh per session to prevent over-refreshing
- Clear refresh tracking on compaction
- Add comprehensive tests
Based on research: Liu et al., "Lost in the Middle: How Language
Models Use Long Contexts" (Stanford, 2023)
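The trigger condition in sketch form (field names from the hook extension
above; threshold value illustrative):

```typescript
function shouldRefreshCore(
  estimatedUsedTokens: number,
  contextWindowTokens: number,
  refreshAtContextPercent: number,      // e.g. 0.6 = refresh past 60% usage
  alreadyRefreshedThisSession: boolean, // per-session flag, cleared on compaction
): boolean {
  if (alreadyRefreshedThisSession) return false; // prevent over-refreshing
  return estimatedUsedTokens / contextWindowTokens >= refreshAtContextPercent;
}
```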
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implement retrieval tracking and Pareto-based memory consolidation:
- Track retrievalCount and lastRetrievedAt on every search
- Effective importance formula: importance × freq_boost × recency_factor
  (sketched below)
- Seven-phase sleep cycle: dedup, Pareto scoring, promotion, demotion,
  decay/pruning, extraction, cleanup
- Bidirectional mobility between core (≤20%) and regular memory tiers
- Core memories ranked by pure usage (no importance multiplier)
Based on ACT-R memory model and Ebbinghaus forgetting curve research.
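The formula in sketch form; the freq_boost and recency_factor shapes below
are illustrative stand-ins, not the exact curves:

```typescript
function effectiveImportance(
  importance: number,
  retrievalCount: number,
  daysSinceLastRetrieved: number,
): number {
  const freqBoost = 1 + 0.2 * Math.log1p(retrievalCount);       // assumed boost shape
  const recencyFactor = Math.exp(-daysSinceLastRetrieved / 30); // assumed ~30-day fade
  return importance * freqBoost * recencyFactor;
}
```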
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>