mirror of
https://github.com/openclaw/openclaw.git
synced 2026-04-25 03:04:29 -04:00
feat (memory): Implement new (opt-in) QMD memory backend
This commit is contained in:
committed by
vignesh07
parent
e25fedf932
commit
8af37aa15f
@@ -118,6 +118,83 @@ Notes:
|
||||
- Only Markdown files are indexed.
|
||||
- Symlinks are ignored (files or directories).
|
||||
|
||||
### QMD backend (experimental)
|
||||
|
||||
Set `memory.backend = "qmd"` to swap the built-in SQLite indexer for
|
||||
[QMD](https://github.com/tobi/qmd): a local-first search sidecar that combines
|
||||
BM25 + vectors + reranking. Markdown stays the source of truth; OpenClaw shells
|
||||
out to QMD for retrieval. Key points:
|
||||
|
||||
**Prereqs**
|
||||
- Disabled by default. Opt in per-config (`memory.backend = "qmd"`).
|
||||
- Install the QMD CLI separately (`bun install -g github.com/tobi/qmd` or grab
|
||||
a release) and make sure the `qmd` binary is on the gateway’s `PATH`.
|
||||
- QMD needs an SQLite build that allows extensions (`brew install sqlite` on
|
||||
macOS). The gateway sets `INDEX_PATH`/`QMD_CONFIG_DIR` automatically.
|
||||
|
||||
**How the sidecar runs**
|
||||
- The gateway writes a self-contained QMD home under
|
||||
`~/.clawdbot/agents/<agentId>/qmd/` (config + cache + sqlite DB).
|
||||
- Collections are rewritten from `memory.qmd.paths` (plus default workspace
|
||||
memory files) into `index.yml`, then `qmd update` + `qmd embed` run on boot and
|
||||
on a configurable interval (`memory.qmd.update.interval`, default 5 m).
|
||||
- Searches run via `qmd query --json`. If QMD fails or the binary is missing,
|
||||
OpenClaw automatically falls back to the builtin SQLite manager so memory tools
|
||||
keep working.
|
||||
|
||||
**Config surface (`memory.qmd.*`)**
|
||||
- `command` (default `qmd`): override the executable path.
|
||||
- `includeDefaultMemory` (default `true`): auto-index `MEMORY.md` + `memory/**/*.md`.
|
||||
- `paths[]`: add extra directories/files (`path`, optional `pattern`, optional
|
||||
stable `name`).
|
||||
- `sessions`: opt into session JSONL indexing (`enabled`, `retentionDays`,
|
||||
`exportDir`, `redactToolOutputs`—defaults to redacting tool payloads).
|
||||
- `update`: controls refresh cadence (`interval`, `debounceMs`, `onBoot`).
|
||||
- `limits`: clamp recall payload (`maxResults`, `maxSnippetChars`,
|
||||
`maxInjectedChars`, `timeoutMs`).
|
||||
- `scope`: same schema as [`session.sendPolicy`](/reference/configuration#session-sendpolicy).
|
||||
Default is DM-only (`deny` all, `allow` direct chats); loosen it to surface QMD
|
||||
hits in groups/channels.
|
||||
- Snippets sourced outside the workspace show up as
|
||||
`qmd/<collection>/<relative-path>` in `memory_search` results; `memory_get`
|
||||
understands that prefix and reads from the configured QMD collection root.
|
||||
- When `memory.qmd.sessions.enabled = true`, OpenClaw exports sanitized session
|
||||
transcripts (User/Assistant turns) into a dedicated QMD collection under
|
||||
`~/.clawdbot/agents/<id>/qmd/sessions/`, so `memory_search` can recall recent
|
||||
conversations without touching the builtin SQLite index.
|
||||
- `memory_search` snippets now include a `Source: <path#line>` footer when
|
||||
`memory.citations` is `auto`/`on`; set `memory.citations = "off"` to keep
|
||||
the path metadata internal (the agent still receives the path for
|
||||
`memory_get`, but the snippet text omits the footer and the system prompt
|
||||
warns the agent not to cite it).
|
||||
|
||||
**Example**
|
||||
|
||||
```json5
|
||||
memory: {
|
||||
backend: "qmd",
|
||||
citations: "auto",
|
||||
qmd: {
|
||||
includeDefaultMemory: true,
|
||||
update: { interval: "5m", debounceMs: 15000 },
|
||||
limits: { maxResults: 6, timeoutMs: 4000 },
|
||||
scope: {
|
||||
default: "deny",
|
||||
rules: [{ action: "allow", match: { chatType: "direct" } }]
|
||||
},
|
||||
paths: [
|
||||
{ name: "docs", path: "~/notes", pattern: "**/*.md" }
|
||||
]
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Citations & fallback**
|
||||
- `memory.citations` applies regardless of backend (`auto`/`on`/`off`).
|
||||
- When `qmd` runs, we tag `status().backend = "qmd"` so diagnostics show which engine served the results.
|
||||
If the QMD subprocess exits or JSON output can’t be parsed, the search manager logs a warning and
|
||||
returns the builtin provider (existing Markdown embeddings) until QMD recovers.
|
||||
|
||||
### Gemini embeddings (native)
|
||||
|
||||
Set the provider to `gemini` to use the Gemini embeddings API directly:
|
||||
|
||||
Reference in New Issue
Block a user