Compare commits

..

14 Commits

Author SHA1 Message Date
Nicholas Tindle
8892bcd230 docs: Add workspace and media file architecture documentation (#11989)
### Changes 🏗️

- Added comprehensive architecture documentation at
`docs/platform/workspace-media-architecture.md` covering:
  - Database models (`UserWorkspace`, `UserWorkspaceFile`)
  - `WorkspaceManager` API with session scoping
- `store_media_file()` media normalization pipeline (input types, return
formats)
  - Virus scanning responsibility boundaries
- Decision tree for choosing `WorkspaceManager` vs `store_media_file()`
- Configuration reference including `clamav_max_concurrency` and
`clamav_mark_failed_scans_as_clean`
  - Common patterns with error handling examples
- Updated `autogpt_platform/backend/CLAUDE.md` with a "Workspace & Media
Files" section referencing the new docs
- Removed duplicate `scan_content_safe()` call from
`WriteWorkspaceFileTool` — `WorkspaceManager.write_file()` already scans
internally, so the tool was double-scanning every file
- Replaced removed comment in `workspace.py` with explicit ownership
comment clarifying that `WorkspaceManager` is the single scanning
boundary

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified `scan_content_safe()` is called inside
`WorkspaceManager.write_file()` (workspace.py:186)
- [x] Verified `store_media_file()` scans all input branches including
local paths (file.py:351)
- [x] Verified documentation accuracy against current source code after
merge with dev
  - [x] CI checks all passing

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Mostly adds documentation and internal developer guidance; the only
code change is a comment clarifying `WorkspaceManager.write_file()` as
the single virus-scanning boundary, with no behavior change.
> 
> **Overview**
> Adds a new `docs/platform/workspace-media-architecture.md` describing
the Workspace storage layer vs the `store_media_file()` media pipeline,
including session scoping and virus-scanning/persistence responsibility
boundaries.
> 
> Updates backend `CLAUDE.md` to point contributors to the new doc when
working on CoPilot uploads/downloads or
`WorkspaceManager`/`store_media_file()`, and clarifies in
`WorkspaceManager.write_file()` (comment-only) that callers should not
duplicate virus scanning.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
18fcfa03f8. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 06:12:26 +00:00
Zamil Majdy
48ff8300a4 Merge branch 'master' of github.com:Significant-Gravitas/AutoGPT into dev 2026-03-17 13:13:42 +07:00
Abhimanyu Yadav
c268fc6464 test(frontend/builder): add integration tests for builder stores, components, and hooks (part-1) (#12433)
### Changes
- Add 329 integration tests across 11 test files for the builder (visual
  workflow editor)
- Cover all Zustand stores (nodeStore, edgeStore, historyStore,
graphStore,
  copyPasteStore, blockMenuStore, controlPanelStore)
- Cover key components (CustomNode, NewBlockMenu, NewSaveControl,
RunGraph)
- Cover hooks (useFlow, useCopyPaste)

### Test files

  | File | Tests | Coverage |
  |------|-------|----------|
| `nodeStore.test.ts` | 58 | Node lifecycle, bulk ops, backend
conversion,
  execution tracking, status, errors, resolution mode |
  | `edgeStore.test.ts` | 37 | Edge CRUD, duplicate rejection, bead
  visualization, backend link conversion, upsert |
| `historyStore.test.ts` | 22 | Undo/redo, history limits (50),
microtask
  batching, deduplication, canUndo/canRedo |
| `graphStore.test.ts` | 28 | Execution status transitions,
isGraphRunning,
  schema management, sub-graphs |
| `copyPasteStore.test.ts` | 8 | Copy/paste with ID remapping, position
offset,
   edge preservation |
| `CustomNode.test.tsx` | 25 | Rendering by block type (NOTE, WEBHOOK,
AGENT,
  OUTPUT, AYRSHARE), error states |
| `NewBlockMenu.test.tsx` | 29 | Store state (search, filters, creators,
  categories), search/default view routing |
| `NewSaveControl.test.tsx` | 11 | Save dialog rendering, form
validation,
  version display, popover state |
| `RunGraph.test.tsx` | 11 | Run/stop button states, loading, click
handlers,
  RunInputDialog visibility |
  | `useFlow.test.ts` | 4 | Loading states, initial load completion |
| `useCopyPaste.test.ts` | 16 | Clipboard copy/paste, UUID remapping,
viewport
  centering, input field guard |
2026-03-17 05:24:55 +00:00
Reinier van der Leer
aff3fb44af ci(platform): Improve end-to-end CI & reduce its cost (#12437)
Our CI costs are skyrocketing, most of it because of
`platform-fullstack-ci.yml`. The `types` job currently uses in a
`big-boi` runner (= expensive), but doesn't need to.
Additionally, the "end-to-end tests" job is currently in
`platform-frontend-ci.yml` instead of `platform-fullstack-ci.yml`,
causing it not to run on backend changes (which it should).

### Changes 🏗️

- Simplify `check-api-types` job (renamed from `types`) and make it use
regular `ubuntu-latest` runner
- Export API schema from backend through CLI (instead of spinning it up
in docker)
- Fix dependency caching in `platform-fullstack-ci.yml` (based on recent
improvements in `platform-frontend-ci.yml`)
- Move `e2e_tests` job to `platform-fullstack-ci.yml`

Out-of-scope but necessary:
- Eliminate module-level init of OpenAI client in
`backend.copilot.service`

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - CI
2026-03-16 23:08:18 +00:00
Zamil Majdy
9a41312769 feat(backend/copilot): parse @@agptfile bare refs by file extension (#12392)
The `@@agptfile:` expansion system previously used content-sniffing
(trying
`json.loads` then `csv.Sniffer`) to decide whether to parse file content
as
structured data. This was fragile — a file containing just `"42"` would
be
parsed as an integer, and the heuristics could misfire on ambiguous
content.

This PR replaces content-sniffing with **extension/MIME-based format
detection**.
When the file has a well-known extension (`.json`, `.csv`, etc.) or MIME
type
fragment (`workspace://id#application/json`), the content is parsed
accordingly.
Unknown formats or parse failures always fall back to plain string — no
surprises.

> [!NOTE]
> This PR builds on the `@@agptfile:` file reference protocol introduced
in #12332 and the structured data auto-parsing added in #12390.
>
> **What is `@@agptfile:`?**
> It is a special URI prefix (e.g. `@@agptfile:workspace:///report.csv`)
that the CoPilot SDK expands inline before sending tool arguments to
blocks. This lets the AI reference workspace files by name, and the SDK
automatically reads and injects the file content. See #12332 for the
full design.

### Changes 🏗️

**New utility: `backend/util/file_content_parser.py`**
- `infer_format(uri)` — determines format from file extension or MIME
fragment
- `parse_file_content(content, fmt)` — parses content, never raises
- Supported text formats: JSON, JSONL/NDJSON, CSV, TSV, YAML, TOML
- Supported binary formats: Parquet (via pyarrow), Excel/XLSX (via
openpyxl)
- JSON scalars (strings, numbers, booleans, null) stay as strings — only
  containers (arrays, objects) are promoted
- CSV/TSV require ≥1 row and ≥2 columns to qualify as tabular data
- Added `openpyxl` dependency for Excel reading via pandas
- Case-insensitive MIME fragment matching per RFC 2045
- Shared `PARSE_EXCEPTIONS` constant to avoid duplication between
modules

**Updated `expand_file_refs_in_args` in `file_ref.py`**
- Bare refs now use `infer_format` + `parse_file_content` instead of the
  old `_try_parse_structured` content-sniffing function
- Binary formats (parquet, xlsx) read raw bytes via `read_file_bytes`
- Embedded refs (text around `@@agptfile:`) still produce plain strings
- **Size guards**: Workspace and sandbox file reads now enforce a 10 MB
limit
  (matching the existing local file limit) to prevent OOM on large files

**Updated `blocks/github/commits.py`**
- Consolidated `_create_blob` and `_create_binary_blob` into a single
function
  with an `encoding` parameter

**Updated copilot system prompt**
- Documents the extension-based structured data parsing and supported
formats

**66 new tests** in `file_content_parser_test.py` covering:
- Format inference (extension, MIME, case-insensitive, precedence)
- All 8 format parsers (happy path + edge cases + fallbacks)
- Binary format handling (string input fallback, invalid bytes fallback)
- Unknown format passthrough

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] All 66 file_content_parser_test.py tests pass
  - [x] All 31 file_ref_test.py tests pass
  - [x] All 13 file_ref_integration_test.py tests pass
  - [x] `poetry run format` passes clean (including pyright)
2026-03-16 22:31:21 +00:00
Ubbe
048fb06b0a feat(frontend): add "Jump Back In" button to Library page (#12387)
Adds a "Jump Back In" CTA at the top of the Library page to encourage
users to quickly rerun their most recently successful agent.

Closes SECRT-1536

### Changes 🏗️

- New `JumpBackIn` component with `useJumpBackIn` hook at
`library/components/JumpBackIn/`
- Fetches first page of library agents sorted by `updatedAt`
- Finds the first agent with a `COMPLETED` execution in
`recent_executions`
- Shows banner with agent name + "Jump Back In" button linking to
`/library/agents/{id}`
- Returns `null` (hidden) when loading or when no agent with a
successful run exists

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] `pnpm format`, `pnpm lint`, `pnpm types` all pass
- [x] Verified banner is hidden when no successful runs exist (edge
case)
- [x] Verified library page renders correctly with no visual regressions

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 21:35:03 +08:00
Zamil Majdy
3f653e6614 dx(.claude): refactor and consolidate Claude Code skills (#12424)
Refactors the Claude Code skills for a cleaner, more intuitive dev loop.

### Changes 🏗️

- **`/pr-review` (new)**: Actual code review skill — reads the PR diff,
fetches existing comments to avoid duplicates, and posts inline GitHub
comments with structured feedback (Blockers / Should Fix / Nice to Have
/ Nit) covering correctness, security, code quality, architecture, and
testing.

- **`/pr-address` (was `/babysit-pr`)**: Addresses review comments and
monitors CI until green. Renamed from `/babysit-pr` to `/pr-address` to
better reflect its purpose. Handles bot-specific feedback
(autogpt-reviewer, sentry, coderabbitai) and loops until all comments
are addressed and CI is green.

- **`/backend-check` + `/frontend-check` → `/check`**: Unified into a
single `/check` skill that auto-detects whether backend (Python) or
frontend (TypeScript) code changed and runs the appropriate formatting,
linting, type checking, and tests. Shared code quality rules applied to
both.

- **`/code-style` enhanced**: Now covers both Python and
TypeScript/React. Added learnings from real PR work: lazy `%s` logging,
TOCTOU awareness, SSE protocol rules (`data:` vs `: comment`), FastAPI
`Security()` vs `Depends()`, Redis pipeline atomicity, error path
sanitization, mock target rules after refactoring.

- **`/worktree` fixed**: Normal `git worktree` is now the default (was
branchlet-first). Branchlet moved to optional section. All paths derived
from `git rev-parse --show-toplevel`.

- **`/pr-create`, `/openapi-regen`, `/new-block` cleaned up**: Reference
`/check` and `/code-style` instead of duplicating instructions.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] Verified all skill files parse correctly (valid YAML frontmatter)
  - [x] Verified skill auto-detection triggers updated in descriptions
- [x] Verified old backend-check and frontend-check directories removed
- [x] Verified pr-review and pr-address directories created with correct
content
2026-03-16 10:35:05 +00:00
Zamil Majdy
c9c3d54b2b fix(platform): reduce Sentry noise by filtering expected errors and downgrading log levels (#12430)
## Summary

Reduces Sentry error noise by ~90% by filtering out expected/transient
errors and downgrading inappropriate error-level logs to warnings. Most
of the top Sentry issues are not actual bugs but expected conditions
(user errors, transient infra, business logic) that were incorrectly
logged at ERROR level, causing them to be captured as Sentry events.

## Changes

### 1. Sentry `before_send` filter (`metrics.py`)
Added a `before_send` hook to filter known expected errors before they
reach Sentry:
- **AMQP/RabbitMQ connection errors** — transient during
deploys/restarts
- **User credential errors** — invalid API keys, missing auth headers
(user error, not platform bug)
- **Insufficient balance** — expected business logic
- **Blocked IP access** — security check working as intended
- **Discord bot token errors** — misconfiguration, not runtime error
- **Google metadata DNS errors** — expected in non-GCP environments
- **Inactive email recipients** — expected for bounced addresses
- **Unclosed client sessions/connectors** — resource cleanup noise

### 2. Connection retry log levels (`retry.py`)
- `conn_retry` final failure: `error` → `warning` (these are infra
retries, not bugs)
- `conn_retry` wrapper final failure: `error` → `warning`
- Discord alert send failure: `error` → `warning`

### 3. Block execution Sentry capture (`manager.py`)
- Skip `sentry_sdk.capture_exception()` for `ValueError` subclasses
(BlockExecutionError, BlockInputError, InsufficientBalanceError, etc.) —
these are user-caused errors, not platform bugs
- Downgrade executor shutdown/disconnect errors to warning

### 4. Scheduler log levels (`scheduler.py`)
- Graph validation failure: `error` → `warning` (expected for
old/invalid graphs)
- Unable to unschedule graph: `error` → `warning`
- Job listener failure: `error` → `warning`
- Async operation failure: `error` → `warning`

### 5. Discord system alert (`notifications.py`)
- Wrapped `discord_system_alert` endpoint with try/catch to prevent
unhandled exceptions (fixes AUTOGPT-SERVER-743, AUTOGPT-SERVER-7MW)

### 6. Notification system log levels (`notifications.py`)
- All batch processing errors: `error` → `warning`
- User email not found: `error` → `warning`
- Notification parsing errors: `error` → `warning`
- Email sending failures: `error` → `warning`
- Summary data gathering failure: `error` → `warning`
- Cleaned up unprofessional error messages

### 7. Cloud storage cleanup (`cloud_storage.py`)
- Cleanup error: `error` → `warning`

## Sentry Issues Addressed

### AMQP/RabbitMQ (~3.4M events total)
-
[AUTOGPT-SERVER-3H2](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H2)
— AMQPConnector ConnectionRefusedError (1.2M events)
-
[AUTOGPT-SERVER-3H3](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H3)
— AMQPConnectionWorkflowFailed (770K events)
-
[AUTOGPT-SERVER-3H4](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H4)
— AMQP connection workflow failed (770K events)
-
[AUTOGPT-SERVER-3H5](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H5)
— AMQPConnectionWorkflow reporting failure (770K events)
-
[AUTOGPT-SERVER-3H7](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H7)
— Socket failed to connect (514K events)
-
[AUTOGPT-SERVER-3H8](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H8)
— TCP Connection attempt failed (514K events)
-
[AUTOGPT-SERVER-3H6](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-3H6)
— AMQPConnectionError (93K events)
-
[AUTOGPT-SERVER-7SX](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-7SX)
— Error creating transport (69K events)
-
[AUTOGPT-SERVER-1TN](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-1TN)
— ChannelInvalidStateError (39K events)
-
[AUTOGPT-SERVER-6JC](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6JC)
— ConnectionClosedByBroker (2K events)
-
[AUTOGPT-SERVER-6RJ/6RK/6RN/6RQ/6RP/6RR](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6RJ)
— Various connection failures (~15K events)
-
[AUTOGPT-SERVER-4A5/6RM/7XN](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-4A5)
— Connection close/transport errors (~540 events)

### User Credential Errors (~15K events)
-
[AUTOGPT-SERVER-6S5](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6S5)
— Incorrect OpenAI API key (9.2K events)
-
[AUTOGPT-SERVER-7W4](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-7W4)
— Incorrect API key in AIConditionBlock (3.4K events)
-
[AUTOGPT-SERVER-83Y](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-83Y)
— AI condition invalid key (2.3K events)
-
[AUTOGPT-SERVER-7ZP](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-7ZP)
— Perplexity missing auth header (451 events)
-
[AUTOGPT-SERVER-7XK/7XM](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-7XK)
— Anthropic invalid key (125 events)
-
[AUTOGPT-SERVER-82C](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-82C)
— Missing auth header (27 events)
-
[AUTOGPT-SERVER-721](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-721)
— Ideogram invalid token (165 events)

### Business Logic / Validation (~120K events)
-
[AUTOGPT-SERVER-7YQ](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-7YQ)
— Disabled block used in graph (56K events)
-
[AUTOGPT-SERVER-6W3](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6W3)
— Graph failed validation (46K events)
-
[AUTOGPT-SERVER-6W2](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6W2)
— Unable to unschedule graph (46K events)
-
[AUTOGPT-SERVER-83X](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-83X)
— Blocked IP access (15K events)
-
[AUTOGPT-SERVER-6K9](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6K9)
— Insufficient balance (4K events)

### Discord Alert Failures (~24K events)
-
[AUTOGPT-SERVER-743](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-743)
— Discord improper token (22K events)
-
[AUTOGPT-SERVER-7MW](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-7MW)
— Discord 403 Missing Access (1.5K events)

### Notification System (~16K events)
-
[AUTOGPT-SERVER-550](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-550)
— Notification batch create error (8.3K events)
-
[AUTOGPT-SERVER-58H](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-58H)
— ValidationError for NotificationEventModel (3K events)
-
[AUTOGPT-SERVER-5C6](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-5C6)
— Get notification batch error (2.1K events)
-
[AUTOGPT-SERVER-4BT](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-4BT)
— Notification batch create error (1.8K events)
-
[AUTOGPT-SERVER-5E4](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-5E4)
— NotificationPreference validation (1.4K events)
-
[AUTOGPT-SERVER-508](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-508)
— Inactive email recipients (702 events)

### Infrastructure / Transient (~20K events)
-
[AUTOGPT-SERVER-6WJ](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-6WJ)
— Unclosed client session (13K events)
-
[AUTOGPT-SERVER-745](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-745)
— Unclosed connector (5.8K events)
-
[AUTOGPT-SERVER-4V1](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-4V1)
— Google metadata DNS error (2.2K events)
-
[AUTOGPT-SERVER-80J](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-80J)
— CloudStorage DNS error (35 events)

### Executor Shutdown
-
[AUTOGPT-SERVER-55J](https://significant-gravitas.sentry.io/issues/AUTOGPT-SERVER-55J)
— Error disconnecting run client (118 events)

## Test plan
- [x] All pre-commit hooks pass (Ruff, isort, Black, Pyright typecheck)
- [x] All changed modules import successfully
- [ ] Deploy to staging and verify Sentry event volume drops
significantly
- [ ] Verify legitimate errors still appear in Sentry
2026-03-16 10:29:01 +00:00
Ubbe
53d58e21d3 feat(frontend): replace technical block terminology with user-friendly labels (#12389)
## Summary
- Replaces all user-facing "block" terminology in the CoPilot activity
stream with plain-English labels ("Step failed", "action",
"Credentials", etc.)
- Adds `humanizeFileName()` utility to display file names without
extensions, with title-case and spaces (e.g. `executive_memo.md` →
`"Executive Memo"`)
- Updates error messages across RunBlock, RunAgent, and FindBlocks tools
to use friendly language

## Test plan
- [ ] Open CoPilot and trigger a block execution — verify animation text
says "Running" / "Step failed" instead of "Running the block" / "Error
running block"
- [ ] Trigger a file read/write action — verify the activity shows
humanized file names (e.g. `Reading "Executive Memo"` not `Reading
executive_memo.md`)
- [ ] Trigger FindBlocks — verify labels say "Searching for actions" and
"Results" instead of "Searching for blocks" and "Block results"
- [ ] Check the work-done stats bar — verify it shows "action" /
"actions" instead of "block run" / "block runs"
- [ ] Trigger a setup requirements card — verify labels say
"Credentials" and "Inputs" instead of "Block credentials" and "Block
inputs"
- [ ] Visit `/copilot/styleguide` — verify error test data no longer
contains "Block execution" text

Resolves: [SECRT-2025](https://linear.app/autogpt/issue/SECRT-2025)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 09:00:25 +00:00
Ubbe
fa04fb41d8 feat(frontend): add "Run now" button to schedule view (#12388)
Adds a "Run now" action to the schedule detail view and sidebar
dropdown, allowing users to immediately trigger a scheduled agent run
without waiting for the next cron execution.

### Changes 🏗️

- **`useSelectedScheduleActions.ts`**: Added
`usePostV1ExecuteGraphAgent` hook and `handleRunNow` function that
executes the agent using the schedule's stored `input_data` and
`input_credentials`. On success, invalidates runs query and navigates to
the new run
- **`SelectedScheduleActions.tsx`**: Added Play icon button as first
action button, with loading spinner while running
- **`SelectedScheduleView.tsx`**: Threads `onSelectRun` prop and
`schedule` object to action components (both mobile and desktop layouts)
- **`NewAgentLibraryView.tsx`**: Passes `onSelectRun` handler to enable
navigation to the new run after execution
- **`ScheduleActionsDropdown.tsx`**: Added "Run now" dropdown menu item
with same execution logic
- **`ScheduleListItem.tsx`**: Added `onRunCreated` prop passed to
dropdown
- **`SidebarRunsList.tsx`**: Connects sidebar dropdown to run
selection/navigation

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] `pnpm format`, `pnpm lint`, `pnpm types` all pass
- [x] Code review: follows existing patterns (mirrors "Run Again" in
SelectedRunActions)
  - [x] No visual regressions on agent detail page

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 17:00:41 +08:00
Otto
d9c16ded65 fix(copilot): prioritize block discovery over MCP and sanitize HTML errors (#12394)
Requested by @majdyz

When a user asks for Google Sheets integration, the CoPilot agent skips
block discovery entirely (despite 55+ Google Sheets blocks being
available), jumps straight to MCP, guesses a fake URL
(`https://sheets.googleapis.com/mcp`), and gets a raw HTML 404 error
page dumped into the conversation.

**Changes:**

1. **MCP guide** (`mcp_tool_guide.md`): Added "Check blocks first"
section directing the agent to use `find_block` before attempting MCP
for any service not in the known servers list. Explicitly prohibits
guessing/constructing MCP server URLs.

2. **Error handling** (`run_mcp_tool.py`): Detects HTML error pages in
HTTP responses (e.g. raw 404 pages from non-MCP endpoints) and returns a
clean one-liner like "This URL does not appear to host an MCP server"
instead of dumping the full HTML body.

**Note:** The main CoPilot system prompt (managed externally, not in
repo) should also be updated to reinforce block-first behavior in the
Capability Check section. This PR covers the in-repo changes.

Session reference: `9216df83-5f4a-48eb-9457-3ba2057638ae` (turn 3)
Ticket: [SECRT-2116](https://linear.app/autogpt/issue/SECRT-2116)

---
Co-authored-by: Zamil Majdy (@majdyz) <majdyz@gmail.com>

---------

Co-authored-by: Zamil Majdy (@majdyz) <majdyz@gmail.com>
Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2026-03-14 12:49:03 +00:00
Otto
6dc8429ae7 fix(copilot): downgrade agent validation failure log from error to warning (#12409)
Agent validation failures are expected when the LLM generates invalid
agent graphs (wrong block IDs, missing required inputs, bad output field
names). The validator catches these and returns proper error responses.

However, `validator.py:938` used `logger.error()`, which Sentry captures
as error events — flooding #platform-alerts with non-errors.

This changes it to `logger.warning()`, keeping the log visible for
debugging without triggering Sentry alerts.

Fixes SECRT-2120

---
Co-authored-by: Zamil Majdy (@majdyz) <zamil.majdy@agpt.co>
2026-03-14 12:48:36 +00:00
Zamil Majdy
cfe22e5a8f fix(backend/copilot): sync TranscriptBuilder with CLI on mid-stream compaction (#12401)
## Summary
- **Root cause**: `TranscriptBuilder` accumulates all raw SDK stream
messages including pre-compaction content. When the CLI compacts
mid-stream, the uploaded transcript was still uncompacted, causing
"Prompt is too long" errors on the next `--resume` turn.
- **Fix**: Detect mid-stream compaction via the `PreCompact` hook, read
the CLI's session file to get the compacted entries (summary +
post-compaction messages), and call
`TranscriptBuilder.replace_entries()` to sync it with the CLI's active
context. This ensures the uploaded transcript always matches what the
CLI sees.
- **Key changes**:
- `CompactionTracker`: stores `transcript_path` from `PreCompact` hook,
one-shot `compaction_just_ended` flag that correctly resets for multiple
compactions
- `read_compacted_entries()`: reads CLI session JSONL, finds
`isCompactSummary: true` entry, returns it + all entries after. Includes
path validation against the CLI projects directory.
- `TranscriptBuilder.replace_entries()`: clears and replaces all entries
with compacted ones, preserving `isCompactSummary` entries (which have
`type: "summary"` that would normally be stripped)
- `load_previous()`: also preserves `isCompactSummary` entries when
loading a previously compacted transcript
- Service stream loop: after compaction ends, reads compacted entries
and syncs TranscriptBuilder

## Test plan
- [x] 69 tests pass across `compaction_test.py` and `transcript_test.py`
- [x] Tests cover: one-shot flag behavior, multiple compactions within a
query, transcript path storage, path traversal rejection,
`read_compacted_entries` (7 tests), `replace_entries` (4 tests),
`load_previous` with compacted content (2 tests)
- [x] Pre-commit hooks pass (lint, format, typecheck)
- [ ] Manual test: trigger compaction in a multi-turn session and verify
the uploaded transcript reflects compaction
2026-03-13 22:17:46 +00:00
Otto
0b594a219c feat(copilot): support prompt-in-URL for shareable prompt links (#12406)
Requested by @torantula

Add support for shareable AutoPilot URLs that contain a prompt in the
URL hash fragment, inspired by [Lovable's
implementation](https://docs.lovable.dev/integrations/build-with-url).

**URL format:**
- `/copilot#prompt=URL-encoded-text` — pre-fills the input for the user
to review before sending
- `/copilot?autosubmit=true#prompt=...` — auto-creates a session and
sends the prompt immediately

**Example:**
```
https://platform.agpt.co/copilot#prompt=Create%20a%20todo%20app
https://platform.agpt.co/copilot?autosubmit=true#prompt=Create%20a%20todo%20app
```

**Key design decisions:**
- Uses URL fragment (`#`) instead of query params — fragments never hit
the server, so prompts stay client-side only (better for privacy, no
backend URL length limits)
- URL is cleaned via `history.replaceState` immediately after extraction
to prevent re-triggering on navigation/reload
- Leverages existing `pendingMessage` + `createSession()` flow for
auto-submit — no new backend APIs needed
- For populate-only mode, passes `initialPrompt` down through component
tree to pre-fill the chat input

**Files changed:**
- `useCopilotPage.ts` — URL hash extraction logic + `initialPrompt`
state
- `CopilotPage.tsx` — passes `initialPrompt` to `ChatContainer`
- `ChatContainer.tsx` — passes `initialPrompt` to `EmptySession`
- `EmptySession.tsx` — passes `initialPrompt` to `ChatInput`
- `ChatInput.tsx` / `useChatInput.ts` — accepts `initialValue` to
pre-fill the textarea

Fixes SECRT-2119

---
Co-authored-by: Toran Bruce Richards (@Torantulino) <toran@agpt.co>
2026-03-13 23:54:54 +07:00
115 changed files with 10902 additions and 3534 deletions

View File

@@ -1,17 +0,0 @@
---
name: backend-check
description: Run the full backend formatting, linting, and test suite. Ensures code quality before commits and PRs. TRIGGER when backend Python code has been modified and needs validation.
user-invocable: true
metadata:
author: autogpt-team
version: "1.0.0"
---
# Backend Check
## Steps
1. **Format**: `poetry run format` — runs formatting AND linting. NEVER run ruff/black/isort individually
2. **Fix** any remaining errors manually, re-run until clean
3. **Test**: `poetry run test` (runs DB setup + pytest). For specific files: `poetry run pytest -s -vvv <test_files>`
4. **Snapshots** (if needed): `poetry run pytest path/to/test.py --snapshot-update` — review with `git diff`

View File

@@ -1,35 +0,0 @@
---
name: code-style
description: Python code style preferences for the AutoGPT backend. Apply when writing or reviewing Python code. TRIGGER when writing new Python code, reviewing PRs, or refactoring backend code.
user-invocable: false
metadata:
author: autogpt-team
version: "1.0.0"
---
# Code Style
## Imports
- **Top-level only** — no local/inner imports. Move all imports to the top of the file.
## Typing
- **No duck typing** — avoid `hasattr`, `getattr`, `isinstance` for type dispatch. Use proper typed interfaces, unions, or protocols.
- **Pydantic models** over dataclass, namedtuple, or raw dict for structured data.
- **No linter suppressors** — avoid `# type: ignore`, `# noqa`, `# pyright: ignore` etc. 99% of the time the right fix is fixing the type/code, not silencing the tool.
## Code Structure
- **List comprehensions** over manual loop-and-append.
- **Early return** — guard clauses first, avoid deep nesting.
- **Flatten inline** — prefer short, concise expressions. Reduce `if/else` chains with direct returns or ternaries when readable.
- **Modular functions** — break complex logic into small, focused functions rather than long blocks with nested conditionals.
## Review Checklist
Before finishing, always ask:
- Can any function be split into smaller pieces?
- Is there unnecessary nesting that an early return would eliminate?
- Can any loop be a comprehension?
- Is there a simpler way to express this logic?

View File

@@ -1,16 +0,0 @@
---
name: frontend-check
description: Run the full frontend formatting, linting, and type checking suite. Ensures code quality before commits and PRs. TRIGGER when frontend TypeScript/React code has been modified and needs validation.
user-invocable: true
metadata:
author: autogpt-team
version: "1.0.0"
---
# Frontend Check
## Steps (in order)
1. **Format**: `pnpm format` — NEVER run individual formatters
2. **Lint**: `pnpm lint` — fix errors, re-run until clean
3. **Types**: `pnpm types` — if it keeps failing after multiple attempts, stop and ask the user

View File

@@ -1,29 +0,0 @@
---
name: new-block
description: Create a new backend block following the Block SDK Guide. Guides through provider configuration, schema definition, authentication, and testing. TRIGGER when user asks to create a new block, add a new integration, or build a new node for the graph editor.
user-invocable: true
metadata:
author: autogpt-team
version: "1.0.0"
---
# New Block Creation
Read `docs/platform/block-sdk-guide.md` first for the full guide.
## Steps
1. **Provider config** (if external service): create `_config.py` with `ProviderBuilder`
2. **Block file** in `backend/blocks/` (from `autogpt_platform/backend/`):
- Generate a UUID once with `uuid.uuid4()`, then **hard-code that string** as `id` (IDs must be stable across imports)
- `Input(BlockSchema)` and `Output(BlockSchema)` classes
- `async def run` that `yield`s output fields
3. **Files**: use `store_media_file()` with `"for_block_output"` for outputs
4. **Test**: `poetry run pytest 'backend/blocks/test/test_block.py::test_available_blocks[MyBlock]' -xvs`
5. **Format**: `poetry run format`
## Rules
- Analyze interfaces: do inputs/outputs connect well with other blocks in a graph?
- Use top-level imports, avoid duck typing
- Always use `for_block_output` for block outputs

View File

@@ -1,28 +0,0 @@
---
name: openapi-regen
description: Regenerate the OpenAPI spec and frontend API client. Starts the backend REST server, fetches the spec, and regenerates the typed frontend hooks. TRIGGER when API routes change, new endpoints are added, or frontend API types are stale.
user-invocable: true
metadata:
author: autogpt-team
version: "1.0.0"
---
# OpenAPI Spec Regeneration
## Steps
1. **Run end-to-end** in a single shell block (so `REST_PID` persists):
```bash
cd autogpt_platform/backend && poetry run rest &
REST_PID=$!
WAIT=0; until curl -sf http://localhost:8006/health > /dev/null 2>&1; do sleep 1; WAIT=$((WAIT+1)); [ $WAIT -ge 60 ] && echo "Timed out" && kill $REST_PID && exit 1; done
cd ../frontend && pnpm generate:api:force
kill $REST_PID
pnpm types && pnpm lint && pnpm format
```
## Rules
- Always use `pnpm generate:api:force` (not `pnpm generate:api`)
- Don't manually edit files in `src/app/api/__generated__/`
- Generated hooks follow: `use{Method}{Version}{OperationName}`

View File

@@ -0,0 +1,79 @@
---
name: pr-address
description: Address PR review comments and loop until CI green and all comments resolved. TRIGGER when user asks to address comments, fix PR feedback, respond to reviewers, or babysit/monitor a PR.
user-invocable: true
args: "[PR number or URL] — if omitted, finds PR for current branch."
metadata:
author: autogpt-team
version: "1.0.0"
---
# PR Address
## Find the PR
```bash
gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT
gh pr view {N}
```
## Fetch comments (all sources)
```bash
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews # top-level reviews
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments # inline review comments
gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments # PR conversation comments
```
**Bots to watch for:**
- `autogpt-reviewer` — posts "Blockers", "Should Fix", "Nice to Have". Address ALL of them.
- `sentry[bot]` — bug predictions. Fix real bugs, explain false positives.
- `coderabbitai[bot]` — automated review. Address actionable items.
## For each unaddressed comment
Address comments **one at a time**: fix → commit → push → inline reply → next.
1. Read the referenced code, make the fix (or reply explaining why it's not needed)
2. Commit and push the fix
3. Reply **inline** (not as a new top-level comment) referencing the fixing commit — this is what resolves the conversation for bot reviewers (coderabbitai, sentry):
| Comment type | How to reply |
|---|---|
| Inline review (`pulls/{N}/comments`) | `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID}/replies -f body="Fixed in <commit-sha>: <description>"` |
| Conversation (`issues/{N}/comments`) | `gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments -f body="Fixed in <commit-sha>: <description>"` |
## Format and commit
After fixing, format the changed code:
- **Backend** (from `autogpt_platform/backend/`): `poetry run format`
- **Frontend** (from `autogpt_platform/frontend/`): `pnpm format && pnpm lint && pnpm types`
If API routes changed, regenerate the frontend client:
```bash
cd autogpt_platform/backend && poetry run rest &
REST_PID=$!
trap "kill $REST_PID 2>/dev/null" EXIT
WAIT=0; until curl -sf http://localhost:8006/health > /dev/null 2>&1; do sleep 1; WAIT=$((WAIT+1)); [ $WAIT -ge 60 ] && echo "Timed out" && exit 1; done
cd ../frontend && pnpm generate:api:force
kill $REST_PID 2>/dev/null; trap - EXIT
```
Never manually edit files in `src/app/api/__generated__/`.
Then commit and **push immediately** — never batch commits without pushing.
For backend commits in worktrees: `poetry run git commit` (pre-commit hooks).
## The loop
```text
address comments → format → commit → push
→ re-check comments → fix new ones → push
→ wait for CI → re-check comments after CI settles
→ repeat until: all comments addressed AND CI green AND no new comments arriving
```
While CI runs, stay productive: run local tests, address remaining comments.
**The loop ends when:** CI fully green + all comments addressed + no new comments since CI settled.

View File

@@ -1,31 +0,0 @@
---
name: pr-create
description: Create a pull request for the current branch. TRIGGER when user asks to create a PR, open a pull request, push changes for review, or submit work for merging.
user-invocable: true
metadata:
author: autogpt-team
version: "1.0.0"
---
# Create Pull Request
## Steps
1. **Check for existing PR**: `gh pr view --json url -q .url 2>/dev/null` — if a PR already exists, output its URL and stop
2. **Understand changes**: `git status`, `git diff dev...HEAD`, `git log dev..HEAD --oneline`
3. **Read PR template**: `.github/PULL_REQUEST_TEMPLATE.md`
4. **Draft PR title**: Use conventional commits format (see CLAUDE.md for types and scopes)
5. **Fill out PR template** as the body — be thorough in the Changes section
6. **Format first** (if relevant changes exist):
- Backend: `cd autogpt_platform/backend && poetry run format`
- Frontend: `cd autogpt_platform/frontend && pnpm format`
- Fix any lint errors, then commit formatting changes before pushing
7. **Push**: `git push -u origin HEAD`
8. **Create PR**: `gh pr create --base dev`
9. **Output** the PR URL
## Rules
- Always target `dev` branch
- Do NOT run tests — CI will handle that
- Use the PR template from `.github/PULL_REQUEST_TEMPLATE.md`

View File

@@ -1,51 +1,74 @@
---
name: pr-review
description: Address all open PR review comments systematically. Fetches comments, addresses each one, reacts +1/-1, and replies when clarification is needed. Keeps iterating until all comments are addressed and CI is green. TRIGGER when user shares a PR URL, asks to address review comments, fix PR feedback, or respond to reviewer comments.
description: Review a PR for correctness, security, code quality, and testing issues. TRIGGER when user asks to review a PR, check PR quality, or give feedback on a PR.
user-invocable: true
args: "[PR number or URL] — if omitted, finds PR for current branch."
metadata:
author: autogpt-team
version: "1.0.0"
---
# PR Review Comment Workflow
# PR Review
## Steps
## Find the PR
1. **Find PR**: `gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT`
2. **Fetch comments** (all three sources):
- `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews` (top-level reviews)
- `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments` (inline review comments)
- `gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments` (PR conversation comments)
3. **Skip** comments already reacted to by PR author
4. **For each unreacted comment**:
- Read referenced code, make the fix (or reply if you disagree/need info)
- **Inline review comments** (`pulls/{N}/comments`):
- React: `gh api repos/.../pulls/comments/{ID}/reactions -f content="+1"` (or `-1`)
- Reply: `gh api repos/.../pulls/{N}/comments/{ID}/replies -f body="..."`
- **PR conversation comments** (`issues/{N}/comments`):
- React: `gh api repos/.../issues/comments/{ID}/reactions -f content="+1"` (or `-1`)
- No threaded replies — post a new issue comment if needed
- **Top-level reviews**: no reaction API — address in code, reply via issue comment if needed
5. **Include autogpt-reviewer bot fixes** too
6. **Format**: `cd autogpt_platform/backend && poetry run format`, `cd autogpt_platform/frontend && pnpm format`
7. **Commit & push**
8. **Re-fetch comments** immediately — address any new unreacted ones before waiting on CI
9. **Stay productive while CI runs** — don't idle. In priority order:
- Run any pending local tests (`poetry run pytest`, e2e, etc.) and fix failures
- Address any remaining comments
- Only poll `gh pr checks {N}` as the last resort when there's truly nothing left to do
10. **If CI fails** — fix, go back to step 6
11. **Re-fetch comments again** after CI is green — address anything that appeared while CI was running
12. **Done** only when: all comments reacted AND CI is green.
```bash
gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT
gh pr view {N}
```
## CRITICAL: Do Not Stop
## Read the diff
**Loop is: address → format → commit → push → re-check comments → run local tests → wait CI → re-check comments → repeat.**
```bash
gh pr diff {N}
```
Never idle. If CI is running and you have nothing to address, run local tests. Waiting on CI is the last resort.
## Fetch existing review comments
## Rules
Before posting anything, fetch existing inline comments to avoid duplicates:
- One todo per comment
- For inline review comments: reply on existing threads. For PR conversation comments: post a new issue comment (API doesn't support threaded replies)
- React to every comment: +1 addressed, -1 disagreed (with explanation)
```bash
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews
```
## What to check
**Correctness:** logic errors, off-by-one, missing edge cases, race conditions (TOCTOU in file access, credit charging), error handling gaps, async correctness (missing `await`, unclosed resources).
**Security:** input validation at boundaries, no injection (command, XSS, SQL), secrets not logged, file paths sanitized (`os.path.basename()` in error messages).
**Code quality:** apply rules from backend/frontend CLAUDE.md files.
**Architecture:** DRY, single responsibility, modular functions. `Security()` vs `Depends()` for FastAPI auth. `data:` for SSE events, `: comment` for heartbeats. `transaction=True` for Redis pipelines.
**Testing:** edge cases covered, colocated `*_test.py` (backend) / `__tests__/` (frontend), mocks target where symbol is **used** not defined, `AsyncMock` for async.
## Output format
Every comment **must** be prefixed with `🤖` and a criticality badge:
| Tier | Badge | Meaning |
|---|---|---|
| Blocker | `🔴 **Blocker**` | Must fix before merge |
| Should Fix | `🟠 **Should Fix**` | Important improvement |
| Nice to Have | `🟡 **Nice to Have**` | Minor suggestion |
| Nit | `🔵 **Nit**` | Style / wording |
Example: `🤖 🔴 **Blocker**: Missing error handling for X — suggest wrapping in try/except.`
## Post inline comments
For each finding, post an inline comment on the PR (do not just write a local report):
```bash
# Get the latest commit SHA for the PR
COMMIT_SHA=$(gh api repos/Significant-Gravitas/AutoGPT/pulls/{N} --jq '.head.sha')
# Post an inline comment on a specific file/line
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments \
-f body="🤖 🔴 **Blocker**: <description>" \
-f commit_id="$COMMIT_SHA" \
-f path="<file path>" \
-F line=<line number>
```

View File

@@ -1,45 +0,0 @@
---
name: worktree-setup
description: Set up a new git worktree for parallel development. Creates the worktree, copies .env files, installs dependencies, generates Prisma client, and optionally starts the app (with port conflict resolution) or runs tests. TRIGGER when user asks to set up a worktree, work on a branch in isolation, or needs a separate environment for a branch or PR.
user-invocable: true
metadata:
author: autogpt-team
version: "1.0.0"
---
# Worktree Setup
## Preferred: Use Branchlet
The repo has a `.branchlet.json` config — it handles env file copying, dependency installation, and Prisma generation automatically.
```bash
npm install -g branchlet # install once
branchlet create -n <name> -s <source-branch> -b <new-branch>
branchlet list --json # list all worktrees
```
## Manual Fallback
If branchlet isn't available:
1. `git worktree add ../<RepoName><N> <branch-name>`
2. Copy `.env` files: `backend/.env`, `frontend/.env`, `autogpt_platform/.env`, `db/docker/.env`
3. Install deps:
- `cd autogpt_platform/backend && poetry install && poetry run prisma generate`
- `cd autogpt_platform/frontend && pnpm install`
## Running the App
Free ports first — backend uses: 8001, 8002, 8003, 8005, 8006, 8007, 8008.
```bash
for port in 8001 8002 8003 8005 8006 8007 8008; do
lsof -ti :$port | xargs kill -9 2>/dev/null || true
done
cd <worktree>/autogpt_platform/backend && poetry run app
```
## CoPilot Testing Gotcha
SDK mode spawns a Claude subprocess — **won't work inside Claude Code**. Set `CHAT_USE_CLAUDE_AGENT_SDK=false` in `backend/.env` to use baseline mode.

View File

@@ -0,0 +1,85 @@
---
name: worktree
description: Set up a new git worktree for parallel development. Creates the worktree, copies .env files, installs dependencies, and generates Prisma client. TRIGGER when user asks to set up a worktree, work on a branch in isolation, or needs a separate environment for a branch or PR.
user-invocable: true
args: "[name] — optional worktree name (e.g., 'AutoGPT7'). If omitted, uses next available AutoGPT<N>."
metadata:
author: autogpt-team
version: "3.0.0"
---
# Worktree Setup
## Create the worktree
Derive paths from the git toplevel. If a name is provided as argument, use it. Otherwise, check `git worktree list` and pick the next `AutoGPT<N>`.
```bash
ROOT=$(git rev-parse --show-toplevel)
PARENT=$(dirname "$ROOT")
# From an existing branch
git worktree add "$PARENT/<NAME>" <branch-name>
# From a new branch off dev
git worktree add -b <new-branch> "$PARENT/<NAME>" dev
```
## Copy environment files
Copy `.env` from the root worktree. Falls back to `.env.default` if `.env` doesn't exist.
```bash
ROOT=$(git rev-parse --show-toplevel)
TARGET="$(dirname "$ROOT")/<NAME>"
for envpath in autogpt_platform/backend autogpt_platform/frontend autogpt_platform; do
if [ -f "$ROOT/$envpath/.env" ]; then
cp "$ROOT/$envpath/.env" "$TARGET/$envpath/.env"
elif [ -f "$ROOT/$envpath/.env.default" ]; then
cp "$ROOT/$envpath/.env.default" "$TARGET/$envpath/.env"
fi
done
```
## Install dependencies
```bash
TARGET="$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
cd "$TARGET/autogpt_platform/autogpt_libs" && poetry install
cd "$TARGET/autogpt_platform/backend" && poetry install && poetry run prisma generate
cd "$TARGET/autogpt_platform/frontend" && pnpm install
```
Replace `<NAME>` with the actual worktree name (e.g., `AutoGPT7`).
## Running the app (optional)
Backend uses ports: 8001, 8002, 8003, 8005, 8006, 8007, 8008. Free them first if needed:
```bash
TARGET="$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
for port in 8001 8002 8003 8005 8006 8007 8008; do
lsof -ti :$port | xargs kill -9 2>/dev/null || true
done
cd "$TARGET/autogpt_platform/backend" && poetry run app
```
## CoPilot testing
SDK mode spawns a Claude subprocess — won't work inside Claude Code. Set `CHAT_USE_CLAUDE_AGENT_SDK=false` in `backend/.env` to use baseline mode.
## Cleanup
```bash
# Replace <NAME> with the actual worktree name (e.g., AutoGPT7)
git worktree remove "$(dirname "$(git rev-parse --show-toplevel)")/<NAME>"
```
## Alternative: Branchlet (optional)
If [branchlet](https://www.npmjs.com/package/branchlet) is installed:
```bash
branchlet create -n <name> -s <source-branch> -b <new-branch>
```

View File

@@ -5,12 +5,14 @@ on:
branches: [master, dev, ci-test*]
paths:
- ".github/workflows/platform-backend-ci.yml"
- ".github/workflows/scripts/get_package_version_from_lockfile.py"
- "autogpt_platform/backend/**"
- "autogpt_platform/autogpt_libs/**"
pull_request:
branches: [master, dev, release-*]
paths:
- ".github/workflows/platform-backend-ci.yml"
- ".github/workflows/scripts/get_package_version_from_lockfile.py"
- "autogpt_platform/backend/**"
- "autogpt_platform/autogpt_libs/**"
merge_group:

View File

@@ -120,175 +120,6 @@ jobs:
token: ${{ secrets.GITHUB_TOKEN }}
exitOnceUploaded: true
e2e_test:
name: end-to-end tests
runs-on: big-boi
steps:
- name: Checkout repository
uses: actions/checkout@v6
with:
submodules: recursive
- name: Set up Platform - Copy default supabase .env
run: |
cp ../.env.default ../.env
- name: Set up Platform - Copy backend .env and set OpenAI API key
run: |
cp ../backend/.env.default ../backend/.env
echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
env:
# Used by E2E test data script to generate embeddings for approved store agents
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
- name: Set up Platform - Set up Docker Buildx
uses: docker/setup-buildx-action@v3
with:
driver: docker-container
driver-opts: network=host
- name: Set up Platform - Expose GHA cache to docker buildx CLI
uses: crazy-max/ghaction-github-runtime@v4
- name: Set up Platform - Build Docker images (with cache)
working-directory: autogpt_platform
run: |
pip install pyyaml
# Resolve extends and generate a flat compose file that bake can understand
docker compose -f docker-compose.yml config > docker-compose.resolved.yml
# Add cache configuration to the resolved compose file
python ../.github/workflows/scripts/docker-ci-fix-compose-build-cache.py \
--source docker-compose.resolved.yml \
--cache-from "type=gha" \
--cache-to "type=gha,mode=max" \
--backend-hash "${{ hashFiles('autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/poetry.lock', 'autogpt_platform/backend/backend') }}" \
--frontend-hash "${{ hashFiles('autogpt_platform/frontend/Dockerfile', 'autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/src') }}" \
--git-ref "${{ github.ref }}"
# Build with bake using the resolved compose file (now includes cache config)
docker buildx bake --allow=fs.read=.. -f docker-compose.resolved.yml --load
env:
NEXT_PUBLIC_PW_TEST: true
- name: Set up tests - Cache E2E test data
id: e2e-data-cache
uses: actions/cache@v5
with:
path: /tmp/e2e_test_data.sql
key: e2e-test-data-${{ hashFiles('autogpt_platform/backend/test/e2e_test_data.py', 'autogpt_platform/backend/migrations/**', '.github/workflows/platform-frontend-ci.yml') }}
- name: Set up Platform - Start Supabase DB + Auth
run: |
docker compose -f ../docker-compose.resolved.yml up -d db auth --no-build
echo "Waiting for database to be ready..."
timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done'
echo "Waiting for auth service to be ready..."
timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -c "SELECT 1 FROM auth.users LIMIT 1" 2>/dev/null; do sleep 2; done' || echo "Auth schema check timeout, continuing..."
- name: Set up Platform - Run migrations
run: |
echo "Running migrations..."
docker compose -f ../docker-compose.resolved.yml run --rm migrate
echo "✅ Migrations completed"
env:
NEXT_PUBLIC_PW_TEST: true
- name: Set up tests - Load cached E2E test data
if: steps.e2e-data-cache.outputs.cache-hit == 'true'
run: |
echo "✅ Found cached E2E test data, restoring..."
{
echo "SET session_replication_role = 'replica';"
cat /tmp/e2e_test_data.sql
echo "SET session_replication_role = 'origin';"
} | docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -b
# Refresh materialized views after restore
docker compose -f ../docker-compose.resolved.yml exec -T db \
psql -U postgres -d postgres -b -c "SET search_path TO platform; SELECT refresh_store_materialized_views();" || true
echo "✅ E2E test data restored from cache"
- name: Set up Platform - Start (all other services)
run: |
docker compose -f ../docker-compose.resolved.yml up -d --no-build
echo "Waiting for rest_server to be ready..."
timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
env:
NEXT_PUBLIC_PW_TEST: true
- name: Set up tests - Create E2E test data
if: steps.e2e-data-cache.outputs.cache-hit != 'true'
run: |
echo "Creating E2E test data..."
docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.resolved.yml ps -q rest_server):/tmp/e2e_test_data.py
docker compose -f ../docker-compose.resolved.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
echo "❌ E2E test data creation failed!"
docker compose -f ../docker-compose.resolved.yml logs --tail=50 rest_server
exit 1
}
# Dump auth.users + platform schema for cache (two separate dumps)
echo "Dumping database for cache..."
{
docker compose -f ../docker-compose.resolved.yml exec -T db \
pg_dump -U postgres --data-only --column-inserts \
--table='auth.users' postgres
docker compose -f ../docker-compose.resolved.yml exec -T db \
pg_dump -U postgres --data-only --column-inserts \
--schema=platform \
--exclude-table='platform._prisma_migrations' \
--exclude-table='platform.apscheduler_jobs' \
--exclude-table='platform.apscheduler_jobs_batched_notifications' \
postgres
} > /tmp/e2e_test_data.sql
echo "✅ Database dump created for caching ($(wc -l < /tmp/e2e_test_data.sql) lines)"
- name: Set up tests - Enable corepack
run: corepack enable
- name: Set up tests - Set up Node
uses: actions/setup-node@v6
with:
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Set up tests - Install dependencies
run: pnpm install --frozen-lockfile
- name: Set up tests - Install browser 'chromium'
run: pnpm playwright install --with-deps chromium
- name: Run Playwright tests
run: pnpm test:no-build
continue-on-error: false
- name: Upload Playwright report
if: always()
uses: actions/upload-artifact@v4
with:
name: playwright-report
path: playwright-report
if-no-files-found: ignore
retention-days: 3
- name: Upload Playwright test results
if: always()
uses: actions/upload-artifact@v4
with:
name: playwright-test-results
path: test-results
if-no-files-found: ignore
retention-days: 3
- name: Print Final Docker Compose logs
if: always()
run: docker compose -f ../docker-compose.resolved.yml logs
integration_test:
runs-on: ubuntu-latest
needs: setup

View File

@@ -1,14 +1,18 @@
name: AutoGPT Platform - Frontend CI
name: AutoGPT Platform - Full-stack CI
on:
push:
branches: [master, dev]
paths:
- ".github/workflows/platform-fullstack-ci.yml"
- ".github/workflows/scripts/docker-ci-fix-compose-build-cache.py"
- ".github/workflows/scripts/get_package_version_from_lockfile.py"
- "autogpt_platform/**"
pull_request:
paths:
- ".github/workflows/platform-fullstack-ci.yml"
- ".github/workflows/scripts/docker-ci-fix-compose-build-cache.py"
- ".github/workflows/scripts/get_package_version_from_lockfile.py"
- "autogpt_platform/**"
merge_group:
@@ -24,42 +28,28 @@ defaults:
jobs:
setup:
runs-on: ubuntu-latest
outputs:
cache-key: ${{ steps.cache-key.outputs.key }}
steps:
- name: Checkout repository
uses: actions/checkout@v6
- name: Set up Node.js
uses: actions/setup-node@v6
with:
node-version: "22.18.0"
- name: Enable corepack
run: corepack enable
- name: Generate cache key
id: cache-key
run: echo "key=${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/package.json') }}" >> $GITHUB_OUTPUT
- name: Cache dependencies
uses: actions/cache@v5
- name: Set up Node
uses: actions/setup-node@v6
with:
path: ~/.pnpm-store
key: ${{ steps.cache-key.outputs.key }}
restore-keys: |
${{ runner.os }}-pnpm-${{ hashFiles('autogpt_platform/frontend/pnpm-lock.yaml') }}
${{ runner.os }}-pnpm-
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Install dependencies
- name: Install dependencies to populate cache
run: pnpm install --frozen-lockfile
types:
runs-on: big-boi
check-api-types:
name: check API types
runs-on: ubuntu-latest
needs: setup
strategy:
fail-fast: false
steps:
- name: Checkout repository
@@ -67,70 +57,256 @@ jobs:
with:
submodules: recursive
- name: Set up Node.js
# ------------------------ Backend setup ------------------------
- name: Set up Backend - Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.12"
- name: Set up Backend - Install Poetry
working-directory: autogpt_platform/backend
run: |
POETRY_VERSION=$(python ../../.github/workflows/scripts/get_package_version_from_lockfile.py poetry)
echo "Installing Poetry version ${POETRY_VERSION}"
curl -sSL https://install.python-poetry.org | POETRY_VERSION=$POETRY_VERSION python3 -
- name: Set up Backend - Set up dependency cache
uses: actions/cache@v5
with:
path: ~/.cache/pypoetry
key: poetry-${{ runner.os }}-${{ hashFiles('autogpt_platform/backend/poetry.lock') }}
- name: Set up Backend - Install dependencies
working-directory: autogpt_platform/backend
run: poetry install
- name: Set up Backend - Generate Prisma client
working-directory: autogpt_platform/backend
run: poetry run prisma generate && poetry run gen-prisma-stub
- name: Set up Frontend - Export OpenAPI schema from Backend
working-directory: autogpt_platform/backend
run: poetry run export-api-schema --output ../frontend/src/app/api/openapi.json
# ------------------------ Frontend setup ------------------------
- name: Set up Frontend - Enable corepack
run: corepack enable
- name: Set up Frontend - Set up Node
uses: actions/setup-node@v6
with:
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Enable corepack
run: corepack enable
- name: Copy default supabase .env
run: |
cp ../.env.default ../.env
- name: Copy backend .env
run: |
cp ../backend/.env.default ../backend/.env
- name: Run docker compose
run: |
docker compose -f ../docker-compose.yml --profile local up -d deps_backend
- name: Restore dependencies cache
uses: actions/cache@v5
with:
path: ~/.pnpm-store
key: ${{ needs.setup.outputs.cache-key }}
restore-keys: |
${{ runner.os }}-pnpm-
- name: Install dependencies
- name: Set up Frontend - Install dependencies
run: pnpm install --frozen-lockfile
- name: Setup .env
run: cp .env.default .env
- name: Wait for services to be ready
run: |
echo "Waiting for rest_server to be ready..."
timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
echo "Waiting for database to be ready..."
timeout 60 sh -c 'until docker compose -f ../docker-compose.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done' || echo "Database ready check timeout, continuing..."
- name: Generate API queries
run: pnpm generate:api:force
- name: Set up Frontend - Format OpenAPI schema
id: format-schema
run: pnpm prettier --write ./src/app/api/openapi.json
- name: Check for API schema changes
run: |
if ! git diff --exit-code src/app/api/openapi.json; then
echo "❌ API schema changes detected in src/app/api/openapi.json"
echo ""
echo "The openapi.json file has been modified after running 'pnpm generate:api-all'."
echo "The openapi.json file has been modified after exporting the API schema."
echo "This usually means changes have been made in the BE endpoints without updating the Frontend."
echo "The API schema is now out of sync with the Front-end queries."
echo ""
echo "To fix this:"
echo "1. Pull the backend 'docker compose pull && docker compose up -d --build --force-recreate'"
echo "2. Run 'pnpm generate:api' locally"
echo "3. Run 'pnpm types' locally"
echo "4. Fix any TypeScript errors that may have been introduced"
echo "5. Commit and push your changes"
echo "\nIn the backend directory:"
echo "1. Run 'poetry run export-api-schema --output ../frontend/src/app/api/openapi.json'"
echo "\nIn the frontend directory:"
echo "2. Run 'pnpm prettier --write src/app/api/openapi.json'"
echo "3. Run 'pnpm generate:api'"
echo "4. Run 'pnpm types'"
echo "5. Fix any TypeScript errors that may have been introduced"
echo "6. Commit and push your changes"
echo ""
exit 1
else
echo "✅ No API schema changes detected"
fi
- name: Run Typescript checks
- name: Set up Frontend - Generate API client
id: generate-api-client
run: pnpm orval --config ./orval.config.ts
# Continue with type generation & check even if there are schema changes
if: success() || (steps.format-schema.outcome == 'success')
- name: Check for TypeScript errors
run: pnpm types
if: success() || (steps.generate-api-client.outcome == 'success')
e2e_test:
name: end-to-end tests
runs-on: big-boi
steps:
- name: Checkout repository
uses: actions/checkout@v6
with:
submodules: recursive
- name: Set up Platform - Copy default supabase .env
run: |
cp ../.env.default ../.env
- name: Set up Platform - Copy backend .env and set OpenAI API key
run: |
cp ../backend/.env.default ../backend/.env
echo "OPENAI_INTERNAL_API_KEY=${{ secrets.OPENAI_API_KEY }}" >> ../backend/.env
env:
# Used by E2E test data script to generate embeddings for approved store agents
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
- name: Set up Platform - Set up Docker Buildx
uses: docker/setup-buildx-action@v3
with:
driver: docker-container
driver-opts: network=host
- name: Set up Platform - Expose GHA cache to docker buildx CLI
uses: crazy-max/ghaction-github-runtime@v4
- name: Set up Platform - Build Docker images (with cache)
working-directory: autogpt_platform
run: |
pip install pyyaml
# Resolve extends and generate a flat compose file that bake can understand
docker compose -f docker-compose.yml config > docker-compose.resolved.yml
# Add cache configuration to the resolved compose file
python ../.github/workflows/scripts/docker-ci-fix-compose-build-cache.py \
--source docker-compose.resolved.yml \
--cache-from "type=gha" \
--cache-to "type=gha,mode=max" \
--backend-hash "${{ hashFiles('autogpt_platform/backend/Dockerfile', 'autogpt_platform/backend/poetry.lock', 'autogpt_platform/backend/backend/**') }}" \
--frontend-hash "${{ hashFiles('autogpt_platform/frontend/Dockerfile', 'autogpt_platform/frontend/pnpm-lock.yaml', 'autogpt_platform/frontend/src/**') }}" \
--git-ref "${{ github.ref }}"
# Build with bake using the resolved compose file (now includes cache config)
docker buildx bake --allow=fs.read=.. -f docker-compose.resolved.yml --load
env:
NEXT_PUBLIC_PW_TEST: true
- name: Set up tests - Cache E2E test data
id: e2e-data-cache
uses: actions/cache@v5
with:
path: /tmp/e2e_test_data.sql
key: e2e-test-data-${{ hashFiles('autogpt_platform/backend/test/e2e_test_data.py', 'autogpt_platform/backend/migrations/**', '.github/workflows/platform-fullstack-ci.yml') }}
- name: Set up Platform - Start Supabase DB + Auth
run: |
docker compose -f ../docker-compose.resolved.yml up -d db auth --no-build
echo "Waiting for database to be ready..."
timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db pg_isready -U postgres 2>/dev/null; do sleep 2; done'
echo "Waiting for auth service to be ready..."
timeout 60 sh -c 'until docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -c "SELECT 1 FROM auth.users LIMIT 1" 2>/dev/null; do sleep 2; done' || echo "Auth schema check timeout, continuing..."
- name: Set up Platform - Run migrations
run: |
echo "Running migrations..."
docker compose -f ../docker-compose.resolved.yml run --rm migrate
echo "✅ Migrations completed"
env:
NEXT_PUBLIC_PW_TEST: true
- name: Set up tests - Load cached E2E test data
if: steps.e2e-data-cache.outputs.cache-hit == 'true'
run: |
echo "✅ Found cached E2E test data, restoring..."
{
echo "SET session_replication_role = 'replica';"
cat /tmp/e2e_test_data.sql
echo "SET session_replication_role = 'origin';"
} | docker compose -f ../docker-compose.resolved.yml exec -T db psql -U postgres -d postgres -b
# Refresh materialized views after restore
docker compose -f ../docker-compose.resolved.yml exec -T db \
psql -U postgres -d postgres -b -c "SET search_path TO platform; SELECT refresh_store_materialized_views();" || true
echo "✅ E2E test data restored from cache"
- name: Set up Platform - Start (all other services)
run: |
docker compose -f ../docker-compose.resolved.yml up -d --no-build
echo "Waiting for rest_server to be ready..."
timeout 60 sh -c 'until curl -f http://localhost:8006/health 2>/dev/null; do sleep 2; done' || echo "Rest server health check timeout, continuing..."
env:
NEXT_PUBLIC_PW_TEST: true
- name: Set up tests - Create E2E test data
if: steps.e2e-data-cache.outputs.cache-hit != 'true'
run: |
echo "Creating E2E test data..."
docker cp ../backend/test/e2e_test_data.py $(docker compose -f ../docker-compose.resolved.yml ps -q rest_server):/tmp/e2e_test_data.py
docker compose -f ../docker-compose.resolved.yml exec -T rest_server sh -c "cd /app/autogpt_platform && python /tmp/e2e_test_data.py" || {
echo "❌ E2E test data creation failed!"
docker compose -f ../docker-compose.resolved.yml logs --tail=50 rest_server
exit 1
}
# Dump auth.users + platform schema for cache (two separate dumps)
echo "Dumping database for cache..."
{
docker compose -f ../docker-compose.resolved.yml exec -T db \
pg_dump -U postgres --data-only --column-inserts \
--table='auth.users' postgres
docker compose -f ../docker-compose.resolved.yml exec -T db \
pg_dump -U postgres --data-only --column-inserts \
--schema=platform \
--exclude-table='platform._prisma_migrations' \
--exclude-table='platform.apscheduler_jobs' \
--exclude-table='platform.apscheduler_jobs_batched_notifications' \
postgres
} > /tmp/e2e_test_data.sql
echo "✅ Database dump created for caching ($(wc -l < /tmp/e2e_test_data.sql) lines)"
- name: Set up tests - Enable corepack
run: corepack enable
- name: Set up tests - Set up Node
uses: actions/setup-node@v6
with:
node-version: "22.18.0"
cache: "pnpm"
cache-dependency-path: autogpt_platform/frontend/pnpm-lock.yaml
- name: Set up tests - Install dependencies
run: pnpm install --frozen-lockfile
- name: Set up tests - Install browser 'chromium'
run: pnpm playwright install --with-deps chromium
- name: Run Playwright tests
run: pnpm test:no-build
continue-on-error: false
- name: Upload Playwright report
if: always()
uses: actions/upload-artifact@v4
with:
name: playwright-report
path: playwright-report
if-no-files-found: ignore
retention-days: 3
- name: Upload Playwright test results
if: always()
uses: actions/upload-artifact@v4
with:
name: playwright-test-results
path: test-results
if-no-files-found: ignore
retention-days: 3
- name: Print Final Docker Compose logs
if: always()
run: docker compose -f ../docker-compose.resolved.yml logs

View File

@@ -60,9 +60,12 @@ AutoGPT Platform is a monorepo containing:
### Reviewing/Revising Pull Requests
- When the user runs /pr-comments or tries to fetch them, also run gh api /repos/Significant-Gravitas/AutoGPT/pulls/[issuenum]/reviews to get the reviews
- Use gh api /repos/Significant-Gravitas/AutoGPT/pulls/[issuenum]/reviews/[review_id]/comments to get the review contents
- Use gh api /repos/Significant-Gravitas/AutoGPT/issues/9924/comments to get the pr specific comments
Use `/pr-review` to review a PR or `/pr-address` to address comments.
When fetching comments manually:
- `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews` — top-level reviews
- `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments` — inline review comments
- `gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments` — PR conversation comments
### Conventional Commits

View File

@@ -58,10 +58,31 @@ poetry run pytest path/to/test.py --snapshot-update
- **Authentication**: JWT-based with Supabase integration
- **Security**: Cache protection middleware prevents sensitive data caching in browsers/proxies
## Code Style
- **Top-level imports only** — no local/inner imports (lazy imports only for heavy optional deps like `openpyxl`)
- **No duck typing** — no `hasattr`/`getattr`/`isinstance` for type dispatch; use typed interfaces/unions/protocols
- **Pydantic models** over dataclass/namedtuple/dict for structured data
- **No linter suppressors** — no `# type: ignore`, `# noqa`, `# pyright: ignore`; fix the type/code
- **List comprehensions** over manual loop-and-append
- **Early return** — guard clauses first, avoid deep nesting
- **Lazy `%s` logging** — `logger.info("Processing %s items", count)` not `logger.info(f"Processing {count} items")`
- **Sanitize error paths** — `os.path.basename()` in error messages to avoid leaking directory structure
- **TOCTOU awareness** — avoid check-then-act patterns for file access and credit charging
- **`Security()` vs `Depends()`** — use `Security()` for auth deps to get proper OpenAPI security spec
- **Redis pipelines** — `transaction=True` for atomicity on multi-step operations
- **`max(0, value)` guards** — for computed values that should never be negative
- **SSE protocol** — `data:` lines for frontend-parsed events (must match Zod schema), `: comment` lines for heartbeats/status
- **File length** — keep files under ~300 lines; if a file grows beyond this, split by responsibility (e.g. extract helpers, models, or a sub-module into a new file). Never keep appending to a long file.
- **Function length** — keep functions under ~40 lines; extract named helpers when a function grows longer. Long functions are a sign of mixed concerns, not complexity.
## Testing Approach
- Uses pytest with snapshot testing for API responses
- Test files are colocated with source files (`*_test.py`)
- Mock at boundaries — mock where the symbol is **used**, not where it's **defined**
- After refactoring, update mock targets to match new module paths
- Use `AsyncMock` for async functions (`from unittest.mock import AsyncMock`)
## Database Schema
@@ -157,6 +178,16 @@ yield "image_url", result_url
3. Write tests alongside the route file
4. Run `poetry run test` to verify
## Workspace & Media Files
**Read [Workspace & Media Architecture](../../docs/platform/workspace-media-architecture.md) when:**
- Working on CoPilot file upload/download features
- Building blocks that handle `MediaFileType` inputs/outputs
- Modifying `WorkspaceManager` or `store_media_file()`
- Debugging file persistence or virus scanning issues
Covers: `WorkspaceManager` (persistent storage with session scoping), `store_media_file()` (media normalization pipeline), and responsibility boundaries for virus scanning and persistence.
## Security Implementation
### Cache Protection Middleware

View File

@@ -27,12 +27,6 @@ from backend.copilot.model import (
get_user_sessions,
update_session_title,
)
from backend.copilot.rate_limit import (
CoPilotUsageStatus,
RateLimitExceeded,
check_rate_limit,
get_usage_status,
)
from backend.copilot.response_model import StreamError, StreamFinish, StreamHeartbeat
from backend.copilot.tools.e2b_sandbox import kill_sandbox
from backend.copilot.tools.models import (
@@ -126,8 +120,6 @@ class SessionDetailResponse(BaseModel):
user_id: str | None
messages: list[dict]
active_stream: ActiveStreamInfo | None = None # Present if stream is still active
total_prompt_tokens: int = 0
total_completion_tokens: int = 0
class SessionSummaryResponse(BaseModel):
@@ -397,10 +389,6 @@ async def get_session(
last_message_id=last_message_id,
)
# Sum token usage from session
total_prompt = sum(u.prompt_tokens for u in session.usage)
total_completion = sum(u.completion_tokens for u in session.usage)
return SessionDetailResponse(
id=session.session_id,
created_at=session.started_at.isoformat(),
@@ -408,26 +396,6 @@ async def get_session(
user_id=session.user_id or None,
messages=messages,
active_stream=active_stream_info,
total_prompt_tokens=total_prompt,
total_completion_tokens=total_completion,
)
@router.get("/usage")
async def get_copilot_usage(
user_id: Annotated[str | None, Depends(auth.get_user_id)],
) -> CoPilotUsageStatus:
"""Get CoPilot usage status for the authenticated user.
Returns current token usage vs limits for daily and weekly windows.
"""
if not user_id:
raise HTTPException(status_code=401, detail="Authentication required")
return await get_usage_status(
user_id=user_id,
daily_token_limit=config.daily_token_limit,
weekly_token_limit=config.weekly_token_limit,
)
@@ -528,17 +496,6 @@ async def stream_chat_post(
},
)
# Pre-turn rate limit check (token-based)
if user_id and (config.daily_token_limit > 0 or config.weekly_token_limit > 0):
try:
await check_rate_limit(
user_id=user_id,
daily_token_limit=config.daily_token_limit,
weekly_token_limit=config.weekly_token_limit,
)
except RateLimitExceeded as e:
raise HTTPException(status_code=429, detail=str(e)) from e
# Enrich message with file metadata if file_ids are provided.
# Also sanitise file_ids so only validated, workspace-scoped IDs are
# forwarded downstream (e.g. to the executor via enqueue_copilot_turn).

View File

@@ -1,6 +1,5 @@
"""Tests for chat API routes: session title update, file attachment validation, usage, and suggested prompts."""
"""Tests for chat API routes: session title update, file attachment validation, and suggested prompts."""
from datetime import UTC, datetime, timedelta
from unittest.mock import AsyncMock, MagicMock
import fastapi
@@ -252,74 +251,6 @@ def test_file_ids_scoped_to_workspace(mocker: pytest_mock.MockFixture):
assert call_kwargs["where"]["isDeleted"] is False
# ─── Usage endpoint ───────────────────────────────────────────────────
def _mock_usage(
mocker: pytest_mock.MockerFixture,
*,
daily_used: int = 500,
weekly_used: int = 2000,
) -> AsyncMock:
"""Mock get_usage_status to return a predictable CoPilotUsageStatus."""
from backend.copilot.rate_limit import CoPilotUsageStatus, UsageWindow
resets_at = datetime.now(UTC) + timedelta(days=1)
status = CoPilotUsageStatus(
daily=UsageWindow(used=daily_used, limit=10000, resets_at=resets_at),
weekly=UsageWindow(used=weekly_used, limit=50000, resets_at=resets_at),
)
return mocker.patch(
"backend.api.features.chat.routes.get_usage_status",
new_callable=AsyncMock,
return_value=status,
)
def test_usage_returns_daily_and_weekly(
mocker: pytest_mock.MockerFixture,
test_user_id: str,
) -> None:
"""GET /usage returns daily and weekly usage."""
mock_get = _mock_usage(mocker, daily_used=500, weekly_used=2000)
mocker.patch.object(chat_routes.config, "daily_token_limit", 10000)
mocker.patch.object(chat_routes.config, "weekly_token_limit", 50000)
response = client.get("/usage")
assert response.status_code == 200
data = response.json()
assert data["daily"]["used"] == 500
assert data["weekly"]["used"] == 2000
mock_get.assert_called_once_with(
user_id=test_user_id,
daily_token_limit=10000,
weekly_token_limit=50000,
)
def test_usage_uses_config_limits(
mocker: pytest_mock.MockerFixture,
test_user_id: str,
) -> None:
"""The endpoint forwards daily_token_limit and weekly_token_limit from config."""
mock_get = _mock_usage(mocker)
mocker.patch.object(chat_routes.config, "daily_token_limit", 99999)
mocker.patch.object(chat_routes.config, "weekly_token_limit", 77777)
response = client.get("/usage")
assert response.status_code == 200
mock_get.assert_called_once_with(
user_id=test_user_id,
daily_token_limit=99999,
weekly_token_limit=77777,
)
# ─── Suggested prompts endpoint ──────────────────────────────────────

View File

@@ -11,7 +11,10 @@ from backend.blocks._base import (
BlockSchemaInput,
BlockSchemaOutput,
)
from backend.data.execution import ExecutionContext
from backend.data.model import SchemaField
from backend.util.file import parse_data_uri, resolve_media_content
from backend.util.type import MediaFileType
from ._api import get_api
from ._auth import (
@@ -178,7 +181,8 @@ class FileOperation(StrEnum):
class FileOperationInput(TypedDict):
path: str
content: str
# MediaFileType is a str NewType — no runtime breakage for existing callers.
content: MediaFileType
operation: FileOperation
@@ -275,11 +279,11 @@ class GithubMultiFileCommitBlock(Block):
base_tree_sha = commit_data["tree"]["sha"]
# 3. Build tree entries for each file operation (blobs created concurrently)
async def _create_blob(content: str) -> str:
async def _create_blob(content: str, encoding: str = "utf-8") -> str:
blob_url = repo_url + "/git/blobs"
blob_response = await api.post(
blob_url,
json={"content": content, "encoding": "utf-8"},
json={"content": content, "encoding": encoding},
)
return blob_response.json()["sha"]
@@ -301,10 +305,19 @@ class GithubMultiFileCommitBlock(Block):
else:
upsert_files.append((path, file_op.get("content", "")))
# Create all blobs concurrently
# Create all blobs concurrently. Data URIs (from store_media_file)
# are sent as base64 blobs to preserve binary content.
if upsert_files:
async def _make_blob(content: str) -> str:
parsed = parse_data_uri(content)
if parsed is not None:
_, b64_payload = parsed
return await _create_blob(b64_payload, encoding="base64")
return await _create_blob(content)
blob_shas = await asyncio.gather(
*[_create_blob(content) for _, content in upsert_files]
*[_make_blob(content) for _, content in upsert_files]
)
for (path, _), blob_sha in zip(upsert_files, blob_shas):
tree_entries.append(
@@ -358,15 +371,36 @@ class GithubMultiFileCommitBlock(Block):
input_data: Input,
*,
credentials: GithubCredentials,
execution_context: ExecutionContext,
**kwargs,
) -> BlockOutput:
try:
# Resolve media references (workspace://, data:, URLs) to data
# URIs so _make_blob can send binary content correctly.
resolved_files: list[FileOperationInput] = []
for file_op in input_data.files:
content = file_op.get("content", "")
operation = FileOperation(file_op.get("operation", "upsert"))
if operation != FileOperation.DELETE:
content = await resolve_media_content(
MediaFileType(content),
execution_context,
return_format="for_external_api",
)
resolved_files.append(
FileOperationInput(
path=file_op["path"],
content=MediaFileType(content),
operation=operation,
)
)
sha, url = await self.multi_file_commit(
credentials,
input_data.repo_url,
input_data.branch,
input_data.commit_message,
input_data.files,
resolved_files,
)
yield "sha", sha
yield "url", url

View File

@@ -8,6 +8,7 @@ from backend.blocks.github.pull_requests import (
GithubMergePullRequestBlock,
prepare_pr_api_url,
)
from backend.data.execution import ExecutionContext
from backend.util.exceptions import BlockExecutionError
# ── prepare_pr_api_url tests ──
@@ -97,7 +98,11 @@ async def test_multi_file_commit_error_path():
"credentials": TEST_CREDENTIALS_INPUT,
}
with pytest.raises(BlockExecutionError, match="ref update failed"):
async for _ in block.execute(input_data, credentials=TEST_CREDENTIALS):
async for _ in block.execute(
input_data,
credentials=TEST_CREDENTIALS,
execution_context=ExecutionContext(),
):
pass

View File

@@ -18,13 +18,11 @@ from langfuse import propagate_attributes
from backend.copilot.model import (
ChatMessage,
ChatSession,
Usage,
get_chat_session,
update_session_title,
upsert_chat_session,
)
from backend.copilot.prompting import get_baseline_supplement
from backend.copilot.rate_limit import record_token_usage
from backend.copilot.response_model import (
StreamBaseResponse,
StreamError,
@@ -38,22 +36,17 @@ from backend.copilot.response_model import (
StreamToolInputAvailable,
StreamToolInputStart,
StreamToolOutputAvailable,
StreamUsage,
)
from backend.copilot.service import (
_build_system_prompt,
_generate_session_title,
client,
_get_openai_client,
config,
)
from backend.copilot.tools import execute_tool, get_available_tools
from backend.copilot.tracking import track_user_message
from backend.util.exceptions import NotFoundError
from backend.util.prompt import (
compress_context,
estimate_token_count,
estimate_token_count_str,
)
from backend.util.prompt import compress_context
logger = logging.getLogger(__name__)
@@ -96,7 +89,7 @@ async def _compress_session_messages(
result = await compress_context(
messages=messages_dict,
model=config.model,
client=client,
client=_get_openai_client(),
)
except Exception as e:
logger.warning("[Baseline] Context compression with LLM failed: %s", e)
@@ -228,9 +221,6 @@ async def stream_chat_completion_baseline(
text_block_id = str(uuid.uuid4())
text_started = False
step_open = False
# Token usage accumulators — populated from streaming chunks
turn_prompt_tokens = 0
turn_completion_tokens = 0
try:
for _round in range(_MAX_TOOL_ROUNDS):
# Open a new step for each LLM round
@@ -242,29 +232,17 @@ async def stream_chat_completion_baseline(
model=config.model,
messages=openai_messages,
stream=True,
stream_options={"include_usage": True},
)
if tools:
create_kwargs["tools"] = tools
response = await client.chat.completions.create(**create_kwargs) # type: ignore[arg-type] # dynamic kwargs
response = await _get_openai_client().chat.completions.create(**create_kwargs) # type: ignore[arg-type] # dynamic kwargs
# Accumulate streamed response (text + tool calls)
round_text = ""
tool_calls_by_index: dict[int, dict[str, str]] = {}
async for chunk in response:
# Capture token usage from the streaming chunk.
# OpenRouter normalises all providers into OpenAI format
# where prompt_tokens already includes cached tokens
# (unlike Anthropic's native API). Use += to sum all
# tool-call rounds since each API call is independent.
if chunk.usage:
turn_prompt_tokens += chunk.usage.prompt_tokens or 0
turn_completion_tokens += chunk.usage.completion_tokens or 0
if not chunk.choices:
continue
delta = chunk.choices[0].delta
delta = chunk.choices[0].delta if chunk.choices else None
if not delta:
continue
@@ -433,53 +411,6 @@ async def stream_chat_completion_baseline(
except Exception:
logger.warning("[Baseline] Langfuse trace context teardown failed")
# Fallback: estimate tokens via tiktoken when the provider does
# not honour stream_options={"include_usage": True}.
# Count the full message list (system + history + turn) since
# each API call sends the complete context window.
if turn_prompt_tokens == 0 and turn_completion_tokens == 0:
turn_prompt_tokens = max(
estimate_token_count(openai_messages, model=config.model), 0
)
turn_completion_tokens = max(
estimate_token_count_str(assistant_text, model=config.model), 0
)
logger.info(
"[Baseline] No streaming usage reported; estimated tokens: "
"prompt=%d, completion=%d",
turn_prompt_tokens,
turn_completion_tokens,
)
# Emit token usage and update session for persistence
if turn_prompt_tokens > 0 or turn_completion_tokens > 0:
total_tokens = turn_prompt_tokens + turn_completion_tokens
session.usage.append(
Usage(
prompt_tokens=turn_prompt_tokens,
completion_tokens=turn_completion_tokens,
total_tokens=total_tokens,
)
)
logger.info(
"[Baseline] Turn usage: prompt=%d, completion=%d, total=%d",
turn_prompt_tokens,
turn_completion_tokens,
total_tokens,
)
# Record for rate limiting counters
if user_id:
try:
await record_token_usage(
user_id=user_id,
prompt_tokens=turn_prompt_tokens,
completion_tokens=turn_completion_tokens,
)
except Exception as usage_err:
logger.warning(
"[Baseline] Failed to record token usage: %s", usage_err
)
# Persist assistant response
if assistant_text:
session.messages.append(
@@ -490,16 +421,4 @@ async def stream_chat_completion_baseline(
except Exception as persist_err:
logger.error("[Baseline] Failed to persist session: %s", persist_err)
# Yield usage and finish AFTER try/finally (not inside finally).
# PEP 525 prohibits yielding from finally in async generators during
# aclose() — doing so raises RuntimeError on client disconnect.
# On GeneratorExit the client is already gone, so unreachable yields
# are harmless; on normal completion they reach the SSE stream.
if turn_prompt_tokens > 0 or turn_completion_tokens > 0:
yield StreamUsage(
promptTokens=turn_prompt_tokens,
completionTokens=turn_completion_tokens,
totalTokens=turn_prompt_tokens + turn_completion_tokens,
)
yield StreamFinish()

View File

@@ -70,20 +70,6 @@ class ChatConfig(BaseSettings):
description="Cache TTL in seconds for Langfuse prompt (0 to disable caching)",
)
# Rate limiting — token-based limits per day and per week.
# Each CoPilot turn consumes ~10-15K tokens (system prompt + tool schemas + response),
# so 2.5M daily allows ~170-250 turns/day which is reasonable for normal use.
# TODO: These are global deploy-time constants. For per-user or per-plan limits,
# move to the database (e.g. UserPlan table) and look up in get_usage_status.
daily_token_limit: int = Field(
default=2_500_000,
description="Max tokens per day, resets at midnight UTC (0 = unlimited)",
)
weekly_token_limit: int = Field(
default=12_500_000,
description="Max tokens per week, resets Monday 00:00 UTC (0 = unlimited)",
)
# Claude Agent SDK Configuration
use_claude_agent_sdk: bool = Field(
default=True,

View File

@@ -11,6 +11,8 @@ from contextvars import ContextVar
from typing import TYPE_CHECKING
from backend.copilot.model import ChatSession
from backend.data.db_accessors import workspace_db
from backend.util.workspace import WorkspaceManager
if TYPE_CHECKING:
from e2b import AsyncSandbox
@@ -82,6 +84,17 @@ def resolve_sandbox_path(path: str) -> str:
return normalized
async def get_workspace_manager(user_id: str, session_id: str) -> WorkspaceManager:
"""Create a session-scoped :class:`WorkspaceManager`.
Placed here (rather than in ``tools/workspace_files``) so that modules
like ``sdk/file_ref`` can import it without triggering the heavy
``tools/__init__`` import chain.
"""
workspace = await workspace_db().get_or_create_workspace(user_id)
return WorkspaceManager(user_id, workspace.id, session_id)
def is_allowed_local_path(path: str, sdk_cwd: str | None = None) -> bool:
"""Return True if *path* is within an allowed host-filesystem location.

View File

@@ -73,9 +73,6 @@ class Usage(BaseModel):
prompt_tokens: int
completion_tokens: int
total_tokens: int
# Cache breakdown (Anthropic-specific; zero for non-Anthropic models)
cache_read_tokens: int = 0
cache_creation_tokens: int = 0
class ChatSessionInfo(BaseModel):

View File

@@ -52,11 +52,43 @@ Examples:
You can embed a reference inside any string argument, or use it as the entire
value. Multiple references in one argument are all expanded.
**Type coercion**: The platform automatically coerces expanded string values
to match the block's expected input types. For example, if a block expects
`list[list[str]]` and you pass a string containing a JSON array (e.g. from
an @@agptfile: expansion), the string will be parsed into the correct type.
**Structured data**: When the **entire** argument value is a single file
reference (no surrounding text), the platform automatically parses the file
content based on its extension or MIME type. Supported formats: JSON, JSONL,
CSV, TSV, YAML, TOML, Parquet, and Excel (.xlsx — first sheet only).
For example, pass `@@agptfile:workspace://<id>` where the file is a `.csv` and
the rows will be parsed into `list[list[str]]` automatically. If the format is
unrecognised or parsing fails, the content is returned as a plain string.
Legacy `.xls` files are **not** supported — only the modern `.xlsx` format.
**Type coercion**: The platform also coerces expanded values to match the
block's expected input types. For example, if a block expects `list[list[str]]`
and the expanded value is a JSON string, it will be parsed into the correct type.
### Media file inputs (format: "file")
Some block inputs accept media files — their schema shows `"format": "file"`.
These fields accept:
- **`workspace://<file_id>`** or **`workspace://<file_id>#<mime>`** — preferred
for large files (images, videos, PDFs). The platform passes the reference
directly to the block without reading the content into memory.
- **`data:<mime>;base64,<payload>`** — inline base64 data URI, suitable for
small files only.
When a block input has `format: "file"`, **pass the `workspace://` URI
directly as the value** (do NOT wrap it in `@@agptfile:`). This avoids large
payloads in tool arguments and preserves binary content (images, videos)
that would be corrupted by text encoding.
Example — committing an image file to GitHub:
```json
{
"files": [{
"path": "docs/hero.png",
"content": "workspace://abc123#image/png",
"operation": "upsert"
}]
}
```
### Sub-agent tasks
- When using the Task tool, NEVER set `run_in_background` to true.

View File

@@ -1,253 +0,0 @@
"""CoPilot rate limiting based on token usage.
Uses Redis fixed-window counters to track per-user token consumption
with configurable daily and weekly limits. Daily windows reset at
midnight UTC; weekly windows reset at ISO week boundary (Monday 00:00
UTC). Fails open when Redis is unavailable to avoid blocking users.
"""
import asyncio
import logging
from datetime import UTC, datetime, timedelta
from pydantic import BaseModel, Field
from backend.data.redis_client import get_redis_async
logger = logging.getLogger(__name__)
# Redis key prefixes
_PREFIX = "copilot:usage"
class UsageWindow(BaseModel):
"""Usage within a single time window."""
used: int
limit: int = Field(
description="Maximum tokens allowed in this window. 0 means unlimited."
)
resets_at: datetime
class CoPilotUsageStatus(BaseModel):
"""Current usage status for a user across all windows."""
daily: UsageWindow
weekly: UsageWindow
class RateLimitExceeded(Exception):
"""Raised when a user exceeds their CoPilot usage limit."""
def __init__(self, window: str, resets_at: datetime):
self.window = window
self.resets_at = resets_at
delta = resets_at - datetime.now(UTC)
total_secs = delta.total_seconds()
if total_secs <= 0:
time_str = "now"
else:
hours = int(total_secs // 3600)
minutes = int((total_secs % 3600) // 60)
time_str = f"{hours}h {minutes}m" if hours > 0 else f"{minutes}m"
super().__init__(
f"You've reached your {window} usage limit. Resets in {time_str}."
)
def _daily_key(user_id: str, now: datetime | None = None) -> str:
if now is None:
now = datetime.now(UTC)
return f"{_PREFIX}:daily:{user_id}:{now.strftime('%Y-%m-%d')}"
def _weekly_key(user_id: str, now: datetime | None = None) -> str:
if now is None:
now = datetime.now(UTC)
year, week, _ = now.isocalendar()
return f"{_PREFIX}:weekly:{user_id}:{year}-W{week:02d}"
def _daily_reset_time(now: datetime | None = None) -> datetime:
"""Calculate when the current daily window resets (next midnight UTC)."""
if now is None:
now = datetime.now(UTC)
return now.replace(hour=0, minute=0, second=0, microsecond=0) + timedelta(days=1)
def _weekly_reset_time(now: datetime | None = None) -> datetime:
"""Calculate when the current weekly window resets (next Monday 00:00 UTC).
On Monday itself, ``(7 - weekday) % 7`` is 0; the ``or 7`` fallback
pushes to *next* Monday so the current week's window stays open.
"""
if now is None:
now = datetime.now(UTC)
days_until_monday = (7 - now.weekday()) % 7 or 7
return now.replace(hour=0, minute=0, second=0, microsecond=0) + timedelta(
days=days_until_monday
)
async def _fetch_counters(user_id: str, now: datetime) -> tuple[int, int]:
"""Fetch daily and weekly token counters from Redis.
Returns (daily_used, weekly_used). Returns (0, 0) if Redis is unavailable.
"""
redis = await get_redis_async()
daily_raw, weekly_raw = await asyncio.gather(
redis.get(_daily_key(user_id, now=now)),
redis.get(_weekly_key(user_id, now=now)),
)
return int(daily_raw or 0), int(weekly_raw or 0)
async def get_usage_status(
user_id: str,
daily_token_limit: int,
weekly_token_limit: int,
) -> CoPilotUsageStatus:
"""Get current usage status for a user.
Args:
user_id: The user's ID.
daily_token_limit: Max tokens per day (0 = unlimited).
weekly_token_limit: Max tokens per week (0 = unlimited).
Returns:
CoPilotUsageStatus with current usage and limits.
"""
now = datetime.now(UTC)
try:
daily_used, weekly_used = await _fetch_counters(user_id, now)
except Exception:
logger.warning(
"Redis unavailable for usage status, returning zeros", exc_info=True
)
daily_used, weekly_used = 0, 0
return CoPilotUsageStatus(
daily=UsageWindow(
used=daily_used,
limit=daily_token_limit,
resets_at=_daily_reset_time(now=now),
),
weekly=UsageWindow(
used=weekly_used,
limit=weekly_token_limit,
resets_at=_weekly_reset_time(now=now),
),
)
async def check_rate_limit(
user_id: str,
daily_token_limit: int,
weekly_token_limit: int,
) -> None:
"""Check if user is within rate limits. Raises RateLimitExceeded if not.
This is a pre-turn soft check. The authoritative usage counter is updated
by ``record_token_usage()`` after the turn completes. Under concurrency,
two parallel turns may both pass this check against the same snapshot.
This is acceptable because token-based limits are approximate by nature
(the exact token count is unknown until after generation).
Fails open: if Redis is unavailable, allows the request.
"""
now = datetime.now(UTC)
try:
daily_used, weekly_used = await _fetch_counters(user_id, now)
except Exception:
logger.warning(
"Redis unavailable for rate limit check, allowing request", exc_info=True
)
return
if daily_token_limit > 0 and daily_used >= daily_token_limit:
raise RateLimitExceeded("daily", _daily_reset_time(now=now))
if weekly_token_limit > 0 and weekly_used >= weekly_token_limit:
raise RateLimitExceeded("weekly", _weekly_reset_time(now=now))
async def record_token_usage(
user_id: str,
prompt_tokens: int,
completion_tokens: int,
*,
cache_read_tokens: int = 0,
cache_creation_tokens: int = 0,
) -> None:
"""Record token usage for a user across all windows.
Uses cost-weighted counting so cached tokens don't unfairly penalise
multi-turn conversations. Anthropic's pricing:
- uncached input: 100%
- cache creation: 25%
- cache read: 10%
- output: 100%
``prompt_tokens`` should be the *uncached* input count (``input_tokens``
from the API response). Cache counts are passed separately.
Args:
user_id: The user's ID.
prompt_tokens: Uncached input tokens.
completion_tokens: Output tokens.
cache_read_tokens: Tokens served from prompt cache (10% cost).
cache_creation_tokens: Tokens written to prompt cache (25% cost).
"""
weighted_input = (
prompt_tokens
+ round(cache_creation_tokens * 0.25)
+ round(cache_read_tokens * 0.1)
)
total = weighted_input + completion_tokens
if total <= 0:
return
raw_total = (
prompt_tokens + cache_read_tokens + cache_creation_tokens + completion_tokens
)
logger.info(
"Recording token usage for %s: raw=%d, weighted=%d "
"(uncached=%d, cache_read=%d@10%%, cache_create=%d@25%%, output=%d)",
user_id[:8],
raw_total,
total,
prompt_tokens,
cache_read_tokens,
cache_creation_tokens,
completion_tokens,
)
now = datetime.now(UTC)
try:
redis = await get_redis_async()
pipe = redis.pipeline(transaction=False)
# Daily counter (expires at next midnight UTC)
d_key = _daily_key(user_id, now=now)
pipe.incrby(d_key, total)
seconds_until_daily_reset = int(
(_daily_reset_time(now=now) - now).total_seconds()
)
pipe.expire(d_key, max(seconds_until_daily_reset, 1))
# Weekly counter (expires end of week)
w_key = _weekly_key(user_id, now=now)
pipe.incrby(w_key, total)
seconds_until_weekly_reset = int(
(_weekly_reset_time(now=now) - now).total_seconds()
)
pipe.expire(w_key, max(seconds_until_weekly_reset, 1))
await pipe.execute()
except Exception:
logger.warning(
"Redis unavailable for recording token usage (tokens=%d)",
total,
exc_info=True,
)

View File

@@ -1,334 +0,0 @@
"""Unit tests for CoPilot rate limiting."""
from datetime import UTC, datetime, timedelta
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from redis.exceptions import RedisError
from .rate_limit import (
CoPilotUsageStatus,
RateLimitExceeded,
check_rate_limit,
get_usage_status,
record_token_usage,
)
_USER = "test-user-rl"
# ---------------------------------------------------------------------------
# RateLimitExceeded
# ---------------------------------------------------------------------------
class TestRateLimitExceeded:
def test_message_contains_window_name(self):
exc = RateLimitExceeded("daily", datetime.now(UTC) + timedelta(hours=1))
assert "daily" in str(exc)
def test_message_contains_reset_time(self):
exc = RateLimitExceeded(
"weekly", datetime.now(UTC) + timedelta(hours=2, minutes=30)
)
msg = str(exc)
# Allow for slight timing drift (29m or 30m)
assert "2h " in msg
assert "Resets in" in msg
def test_message_minutes_only_when_under_one_hour(self):
exc = RateLimitExceeded("daily", datetime.now(UTC) + timedelta(minutes=15))
msg = str(exc)
assert "Resets in" in msg
# Should not have "0h"
assert "0h" not in msg
def test_message_says_now_when_resets_at_is_in_the_past(self):
"""Negative delta (clock skew / stale TTL) should say 'now', not '-1h -30m'."""
exc = RateLimitExceeded("daily", datetime.now(UTC) - timedelta(minutes=5))
assert "Resets in now" in str(exc)
# ---------------------------------------------------------------------------
# get_usage_status
# ---------------------------------------------------------------------------
class TestGetUsageStatus:
@pytest.mark.asyncio
async def test_returns_redis_values(self):
mock_redis = AsyncMock()
mock_redis.get = AsyncMock(side_effect=["500", "2000"])
with patch(
"backend.copilot.rate_limit.get_redis_async",
return_value=mock_redis,
):
status = await get_usage_status(
_USER, daily_token_limit=10000, weekly_token_limit=50000
)
assert isinstance(status, CoPilotUsageStatus)
assert status.daily.used == 500
assert status.daily.limit == 10000
assert status.weekly.used == 2000
assert status.weekly.limit == 50000
@pytest.mark.asyncio
async def test_returns_zeros_when_redis_unavailable(self):
with patch(
"backend.copilot.rate_limit.get_redis_async",
side_effect=ConnectionError("Redis down"),
):
status = await get_usage_status(
_USER, daily_token_limit=10000, weekly_token_limit=50000
)
assert status.daily.used == 0
assert status.weekly.used == 0
@pytest.mark.asyncio
async def test_partial_none_daily_counter(self):
"""Daily counter is None (new day), weekly has usage."""
mock_redis = AsyncMock()
mock_redis.get = AsyncMock(side_effect=[None, "3000"])
with patch(
"backend.copilot.rate_limit.get_redis_async",
return_value=mock_redis,
):
status = await get_usage_status(
_USER, daily_token_limit=10000, weekly_token_limit=50000
)
assert status.daily.used == 0
assert status.weekly.used == 3000
@pytest.mark.asyncio
async def test_partial_none_weekly_counter(self):
"""Weekly counter is None (start of week), daily has usage."""
mock_redis = AsyncMock()
mock_redis.get = AsyncMock(side_effect=["500", None])
with patch(
"backend.copilot.rate_limit.get_redis_async",
return_value=mock_redis,
):
status = await get_usage_status(
_USER, daily_token_limit=10000, weekly_token_limit=50000
)
assert status.daily.used == 500
assert status.weekly.used == 0
@pytest.mark.asyncio
async def test_resets_at_daily_is_next_midnight_utc(self):
mock_redis = AsyncMock()
mock_redis.get = AsyncMock(side_effect=["0", "0"])
with patch(
"backend.copilot.rate_limit.get_redis_async",
return_value=mock_redis,
):
status = await get_usage_status(
_USER, daily_token_limit=10000, weekly_token_limit=50000
)
now = datetime.now(UTC)
# Daily reset should be within 24h
assert status.daily.resets_at > now
assert status.daily.resets_at <= now + timedelta(hours=24, seconds=5)
# ---------------------------------------------------------------------------
# check_rate_limit
# ---------------------------------------------------------------------------
class TestCheckRateLimit:
@pytest.mark.asyncio
async def test_allows_when_under_limit(self):
mock_redis = AsyncMock()
mock_redis.get = AsyncMock(side_effect=["100", "200"])
with patch(
"backend.copilot.rate_limit.get_redis_async",
return_value=mock_redis,
):
# Should not raise
await check_rate_limit(
_USER, daily_token_limit=10000, weekly_token_limit=50000
)
@pytest.mark.asyncio
async def test_raises_when_daily_limit_exceeded(self):
mock_redis = AsyncMock()
mock_redis.get = AsyncMock(side_effect=["10000", "200"])
with patch(
"backend.copilot.rate_limit.get_redis_async",
return_value=mock_redis,
):
with pytest.raises(RateLimitExceeded) as exc_info:
await check_rate_limit(
_USER, daily_token_limit=10000, weekly_token_limit=50000
)
assert exc_info.value.window == "daily"
@pytest.mark.asyncio
async def test_raises_when_weekly_limit_exceeded(self):
mock_redis = AsyncMock()
mock_redis.get = AsyncMock(side_effect=["100", "50000"])
with patch(
"backend.copilot.rate_limit.get_redis_async",
return_value=mock_redis,
):
with pytest.raises(RateLimitExceeded) as exc_info:
await check_rate_limit(
_USER, daily_token_limit=10000, weekly_token_limit=50000
)
assert exc_info.value.window == "weekly"
@pytest.mark.asyncio
async def test_allows_when_redis_unavailable(self):
"""Fail-open: allow requests when Redis is down."""
with patch(
"backend.copilot.rate_limit.get_redis_async",
side_effect=ConnectionError("Redis down"),
):
# Should not raise
await check_rate_limit(
_USER, daily_token_limit=10000, weekly_token_limit=50000
)
@pytest.mark.asyncio
async def test_skips_check_when_limit_is_zero(self):
mock_redis = AsyncMock()
mock_redis.get = AsyncMock(side_effect=["999999", "999999"])
with patch(
"backend.copilot.rate_limit.get_redis_async",
return_value=mock_redis,
):
# Should not raise — limits of 0 mean unlimited
await check_rate_limit(_USER, daily_token_limit=0, weekly_token_limit=0)
# ---------------------------------------------------------------------------
# record_token_usage
# ---------------------------------------------------------------------------
class TestRecordTokenUsage:
@staticmethod
def _make_pipeline_mock() -> MagicMock:
"""Create a pipeline mock with sync methods and async execute."""
pipe = MagicMock()
pipe.execute = AsyncMock(return_value=[])
return pipe
@pytest.mark.asyncio
async def test_increments_redis_counters(self):
mock_pipe = self._make_pipeline_mock()
mock_redis = AsyncMock()
mock_redis.pipeline = lambda **_kw: mock_pipe
with patch(
"backend.copilot.rate_limit.get_redis_async",
return_value=mock_redis,
):
await record_token_usage(_USER, prompt_tokens=100, completion_tokens=50)
# Should call incrby twice (daily + weekly) with total=150
incrby_calls = mock_pipe.incrby.call_args_list
assert len(incrby_calls) == 2
assert incrby_calls[0].args[1] == 150 # daily
assert incrby_calls[1].args[1] == 150 # weekly
@pytest.mark.asyncio
async def test_skips_when_zero_tokens(self):
mock_redis = AsyncMock()
with patch(
"backend.copilot.rate_limit.get_redis_async",
return_value=mock_redis,
):
await record_token_usage(_USER, prompt_tokens=0, completion_tokens=0)
# Should not call pipeline at all
mock_redis.pipeline.assert_not_called()
@pytest.mark.asyncio
async def test_sets_expire_on_both_keys(self):
"""Pipeline should call expire for both daily and weekly keys."""
mock_pipe = self._make_pipeline_mock()
mock_redis = AsyncMock()
mock_redis.pipeline = lambda **_kw: mock_pipe
with patch(
"backend.copilot.rate_limit.get_redis_async",
return_value=mock_redis,
):
await record_token_usage(_USER, prompt_tokens=100, completion_tokens=50)
expire_calls = mock_pipe.expire.call_args_list
assert len(expire_calls) == 2
# Daily key TTL should be positive (seconds until next midnight)
daily_ttl = expire_calls[0].args[1]
assert daily_ttl >= 1
# Weekly key TTL should be positive (seconds until next Monday)
weekly_ttl = expire_calls[1].args[1]
assert weekly_ttl >= 1
@pytest.mark.asyncio
async def test_handles_redis_failure_gracefully(self):
"""Should not raise when Redis is unavailable."""
with patch(
"backend.copilot.rate_limit.get_redis_async",
side_effect=ConnectionError("Redis down"),
):
# Should not raise
await record_token_usage(_USER, prompt_tokens=100, completion_tokens=50)
@pytest.mark.asyncio
async def test_cost_weighted_counting(self):
"""Cached tokens should be weighted: cache_read=10%, cache_create=25%."""
mock_pipe = self._make_pipeline_mock()
mock_redis = AsyncMock()
mock_redis.pipeline = lambda **_kw: mock_pipe
with patch(
"backend.copilot.rate_limit.get_redis_async",
return_value=mock_redis,
):
await record_token_usage(
_USER,
prompt_tokens=100, # uncached → 100
completion_tokens=50, # output → 50
cache_read_tokens=10000, # 10% → 1000
cache_creation_tokens=400, # 25% → 100
)
# Expected weighted total: 100 + 1000 + 100 + 50 = 1250
incrby_calls = mock_pipe.incrby.call_args_list
assert len(incrby_calls) == 2
assert incrby_calls[0].args[1] == 1250 # daily
assert incrby_calls[1].args[1] == 1250 # weekly
@pytest.mark.asyncio
async def test_handles_redis_error_during_pipeline_execute(self):
"""Should not raise when pipeline.execute() fails with RedisError."""
mock_pipe = self._make_pipeline_mock()
mock_pipe.execute = AsyncMock(side_effect=RedisError("Pipeline failed"))
mock_redis = AsyncMock()
mock_redis.pipeline = lambda **_kw: mock_pipe
with patch(
"backend.copilot.rate_limit.get_redis_async",
return_value=mock_redis,
):
# Should not raise — fail-open
await record_token_usage(_USER, prompt_tokens=100, completion_tokens=50)

View File

@@ -186,29 +186,12 @@ class StreamToolOutputAvailable(StreamBaseResponse):
class StreamUsage(StreamBaseResponse):
"""Token usage statistics.
Emitted as an SSE comment so the Vercel AI SDK parser ignores it
(it uses z.strictObject() and rejects unknown event types).
Usage data is recorded server-side (session DB + Redis counters).
"""
"""Token usage statistics."""
type: ResponseType = ResponseType.USAGE
promptTokens: int = Field(..., description="Number of uncached prompt tokens")
promptTokens: int = Field(..., description="Number of prompt tokens")
completionTokens: int = Field(..., description="Number of completion tokens")
totalTokens: int = Field(
..., description="Total number of tokens (raw, not weighted)"
)
cacheReadTokens: int = Field(
default=0, description="Prompt tokens served from cache (10% cost)"
)
cacheCreationTokens: int = Field(
default=0, description="Prompt tokens written to cache (25% cost)"
)
def to_sse(self) -> str:
"""Emit as SSE comment so the AI SDK parser ignores it."""
return f": usage {self.model_dump_json(exclude_none=True)}\n\n"
totalTokens: int = Field(..., description="Total number of tokens")
class StreamError(StreamBaseResponse):

View File

@@ -3,12 +3,45 @@
This module provides the integration layer between the Claude Agent SDK
and the existing CoPilot tool system, enabling drop-in replacement of
the current LLM orchestration with the battle-tested Claude Agent SDK.
Submodule imports are deferred via PEP 562 ``__getattr__`` to break a
circular import cycle::
sdk/__init__ → tool_adapter → copilot.tools (TOOL_REGISTRY)
copilot.tools → run_block → sdk.file_ref (no cycle here, but…)
sdk/__init__ → service → copilot.prompting → copilot.tools (cycle!)
``tool_adapter`` uses ``TOOL_REGISTRY`` at **module level** to build the
static ``COPILOT_TOOL_NAMES`` list, so the import cannot be deferred to
function scope without a larger refactor (moving tool-name registration
to a separate lightweight module). The lazy-import pattern here is the
least invasive way to break the cycle while keeping module-level constants
intact.
"""
from .service import stream_chat_completion_sdk
from .tool_adapter import create_copilot_mcp_server
from typing import Any
__all__ = [
"stream_chat_completion_sdk",
"create_copilot_mcp_server",
]
# Dispatch table for PEP 562 lazy imports. Each entry is a (module, attr)
# pair so new exports can be added without touching __getattr__ itself.
_LAZY_IMPORTS: dict[str, tuple[str, str]] = {
"stream_chat_completion_sdk": (".service", "stream_chat_completion_sdk"),
"create_copilot_mcp_server": (".tool_adapter", "create_copilot_mcp_server"),
}
def __getattr__(name: str) -> Any:
entry = _LAZY_IMPORTS.get(name)
if entry is not None:
module_path, attr = entry
import importlib
module = importlib.import_module(module_path, package=__name__)
value = getattr(module, attr)
globals()[name] = value
return value
raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

View File

@@ -11,7 +11,7 @@ persistence, and the ``CompactionTracker`` state machine.
import asyncio
import logging
import uuid
from collections.abc import Callable
from dataclasses import dataclass, field
from ..constants import COMPACTION_DONE_MSG, COMPACTION_TOOL_NAME
from ..model import ChatMessage, ChatSession
@@ -27,6 +27,19 @@ from ..response_model import (
logger = logging.getLogger(__name__)
@dataclass
class CompactionResult:
"""Result of emit_end_if_ready — bundles events with compaction metadata.
Eliminates the need for separate ``compaction_just_ended`` checks,
preventing TOCTOU races between the emit call and the flag read.
"""
events: list[StreamBaseResponse] = field(default_factory=list)
just_ended: bool = False
transcript_path: str = ""
# ---------------------------------------------------------------------------
# Event builders (private — use CompactionTracker or compaction_events)
# ---------------------------------------------------------------------------
@@ -177,11 +190,22 @@ class CompactionTracker:
self._start_emitted = False
self._done = False
self._tool_call_id = ""
self._transcript_path: str = ""
@property
def on_compact(self) -> Callable[[], None]:
"""Callback for the PreCompact hook."""
return self._compact_start.set
def on_compact(self, transcript_path: str = "") -> None:
"""Callback for the PreCompact hook. Stores transcript_path."""
if (
self._transcript_path
and transcript_path
and self._transcript_path != transcript_path
):
logger.warning(
"[Compaction] Overwriting transcript_path %s -> %s",
self._transcript_path,
transcript_path,
)
self._transcript_path = transcript_path
self._compact_start.set()
# ------------------------------------------------------------------
# Pre-query compaction
@@ -198,10 +222,10 @@ class CompactionTracker:
def reset_for_query(self) -> None:
"""Reset per-query state before a new SDK query."""
self._compact_start.clear()
self._done = False
self._start_emitted = False
self._tool_call_id = ""
self._transcript_path = ""
def emit_start_if_ready(self) -> list[StreamBaseResponse]:
"""If the PreCompact hook fired, emit start events (spinning tool)."""
@@ -212,15 +236,20 @@ class CompactionTracker:
return _start_events(self._tool_call_id)
return []
async def emit_end_if_ready(self, session: ChatSession) -> list[StreamBaseResponse]:
"""If compaction is in progress, emit end events and persist."""
async def emit_end_if_ready(self, session: ChatSession) -> CompactionResult:
"""If compaction is in progress, emit end events and persist.
Returns a ``CompactionResult`` with ``just_ended=True`` and the
captured ``transcript_path`` when a compaction cycle completes.
This avoids a separate flag check (TOCTOU-safe).
"""
# Yield so pending hook tasks can set compact_start
await asyncio.sleep(0)
if self._done:
return []
return CompactionResult()
if not self._start_emitted and not self._compact_start.is_set():
return []
return CompactionResult()
if self._start_emitted:
# Close the open spinner
@@ -233,8 +262,12 @@ class CompactionTracker:
COMPACTION_DONE_MSG, tool_call_id=persist_id
)
transcript_path = self._transcript_path
self._compact_start.clear()
self._start_emitted = False
self._done = True
self._transcript_path = ""
_persist(session, persist_id, COMPACTION_DONE_MSG)
return done_events
return CompactionResult(
events=done_events, just_ended=True, transcript_path=transcript_path
)

View File

@@ -195,10 +195,11 @@ class TestCompactionTracker:
session = _make_session()
tracker.on_compact()
tracker.emit_start_if_ready()
evts = await tracker.emit_end_if_ready(session)
assert len(evts) == 2
assert isinstance(evts[0], StreamToolOutputAvailable)
assert isinstance(evts[1], StreamFinishStep)
result = await tracker.emit_end_if_ready(session)
assert result.just_ended is True
assert len(result.events) == 2
assert isinstance(result.events[0], StreamToolOutputAvailable)
assert isinstance(result.events[1], StreamFinishStep)
# Should persist
assert len(session.messages) == 2
@@ -210,28 +211,32 @@ class TestCompactionTracker:
session = _make_session()
tracker.on_compact()
# Don't call emit_start_if_ready
evts = await tracker.emit_end_if_ready(session)
assert len(evts) == 5 # Full self-contained event
assert isinstance(evts[0], StreamStartStep)
result = await tracker.emit_end_if_ready(session)
assert result.just_ended is True
assert len(result.events) == 5 # Full self-contained event
assert isinstance(result.events[0], StreamStartStep)
assert len(session.messages) == 2
@pytest.mark.asyncio
async def test_emit_end_no_op_when_done(self):
async def test_emit_end_no_op_when_no_new_compaction(self):
tracker = CompactionTracker()
session = _make_session()
tracker.on_compact()
tracker.emit_start_if_ready()
await tracker.emit_end_if_ready(session)
# Second call should be no-op
evts = await tracker.emit_end_if_ready(session)
assert evts == []
result1 = await tracker.emit_end_if_ready(session)
assert result1.just_ended is True
# Second call should be no-op (no new on_compact)
result2 = await tracker.emit_end_if_ready(session)
assert result2.just_ended is False
assert result2.events == []
@pytest.mark.asyncio
async def test_emit_end_no_op_when_nothing_happened(self):
tracker = CompactionTracker()
session = _make_session()
evts = await tracker.emit_end_if_ready(session)
assert evts == []
result = await tracker.emit_end_if_ready(session)
assert result.just_ended is False
assert result.events == []
def test_emit_pre_query(self):
tracker = CompactionTracker()
@@ -246,20 +251,29 @@ class TestCompactionTracker:
tracker._done = True
tracker._start_emitted = True
tracker._tool_call_id = "old"
tracker._transcript_path = "/some/path"
tracker.reset_for_query()
assert tracker._done is False
assert tracker._start_emitted is False
assert tracker._tool_call_id == ""
assert tracker._transcript_path == ""
@pytest.mark.asyncio
async def test_pre_query_blocks_sdk_compaction(self):
"""After pre-query compaction, SDK compaction events are suppressed."""
async def test_pre_query_blocks_sdk_compaction_until_reset(self):
"""After pre-query compaction, SDK compaction is blocked until
reset_for_query is called."""
tracker = CompactionTracker()
session = _make_session()
tracker.emit_pre_query(session)
tracker.on_compact()
# _done is True so emit_start_if_ready is blocked
evts = tracker.emit_start_if_ready()
assert evts == [] # _done blocks it
assert evts == []
# Reset clears _done, allowing subsequent compaction
tracker.reset_for_query()
tracker.on_compact()
evts = tracker.emit_start_if_ready()
assert len(evts) == 3
@pytest.mark.asyncio
async def test_reset_allows_new_compaction(self):
@@ -279,9 +293,9 @@ class TestCompactionTracker:
session = _make_session()
tracker.on_compact()
start_evts = tracker.emit_start_if_ready()
end_evts = await tracker.emit_end_if_ready(session)
result = await tracker.emit_end_if_ready(session)
start_evt = start_evts[1]
end_evt = end_evts[0]
end_evt = result.events[0]
assert isinstance(start_evt, StreamToolInputStart)
assert isinstance(end_evt, StreamToolOutputAvailable)
assert start_evt.toolCallId == end_evt.toolCallId
@@ -289,3 +303,105 @@ class TestCompactionTracker:
tool_calls = session.messages[0].tool_calls
assert tool_calls is not None
assert tool_calls[0]["id"] == start_evt.toolCallId
@pytest.mark.asyncio
async def test_multiple_compactions_within_query(self):
"""Two mid-stream compactions within a single query both trigger."""
tracker = CompactionTracker()
session = _make_session()
# First compaction cycle
tracker.on_compact("/path/1")
tracker.emit_start_if_ready()
result1 = await tracker.emit_end_if_ready(session)
assert result1.just_ended is True
assert len(result1.events) == 2
assert result1.transcript_path == "/path/1"
# Second compaction cycle (should NOT be blocked — _done resets
# because emit_end_if_ready sets it True, but the next on_compact
# + emit_start_if_ready checks !_done which IS True now.
# So we need reset_for_query between queries, but within a single
# query multiple compactions work because _done blocks emit_start
# until the next message arrives, at which point emit_end detects it)
#
# Actually: _done=True blocks emit_start_if_ready, so we need
# the stream loop to reset. In practice service.py doesn't call
# reset between compactions within the same query — let's verify
# the actual behavior.
tracker.on_compact("/path/2")
# _done is True from first compaction, so start is blocked
start_evts = tracker.emit_start_if_ready()
assert start_evts == []
# But emit_end returns no-op because _done is True
result2 = await tracker.emit_end_if_ready(session)
assert result2.just_ended is False
@pytest.mark.asyncio
async def test_multiple_compactions_with_intervening_message(self):
"""Multiple compactions work when the stream loop processes messages between them.
In the real service.py flow:
1. PreCompact fires → on_compact()
2. emit_start shows spinner
3. Next message arrives → emit_end completes compaction (_done=True)
4. Stream continues processing messages...
5. If a second PreCompact fires, _done=True blocks emit_start
6. But the next message triggers emit_end, which sees _done=True → no-op
7. The stream loop needs to detect this and handle accordingly
The actual flow for multiple compactions within a query requires
_done to be cleared between them. The service.py code uses
CompactionResult.just_ended to trigger replace_entries, and _done
stays True until reset_for_query.
"""
tracker = CompactionTracker()
session = _make_session()
# First compaction
tracker.on_compact("/path/1")
tracker.emit_start_if_ready()
result1 = await tracker.emit_end_if_ready(session)
assert result1.just_ended is True
assert result1.transcript_path == "/path/1"
# Simulate reset between queries
tracker.reset_for_query()
# Second compaction in new query
tracker.on_compact("/path/2")
start_evts = tracker.emit_start_if_ready()
assert len(start_evts) == 3
result2 = await tracker.emit_end_if_ready(session)
assert result2.just_ended is True
assert result2.transcript_path == "/path/2"
def test_on_compact_stores_transcript_path(self):
tracker = CompactionTracker()
tracker.on_compact("/some/path.jsonl")
assert tracker._transcript_path == "/some/path.jsonl"
@pytest.mark.asyncio
async def test_emit_end_returns_transcript_path(self):
"""CompactionResult includes the transcript_path from on_compact."""
tracker = CompactionTracker()
session = _make_session()
tracker.on_compact("/my/session.jsonl")
tracker.emit_start_if_ready()
result = await tracker.emit_end_if_ready(session)
assert result.just_ended is True
assert result.transcript_path == "/my/session.jsonl"
# transcript_path is cleared after emit_end
assert tracker._transcript_path == ""
@pytest.mark.asyncio
async def test_emit_end_clears_transcript_path(self):
"""After emit_end, _transcript_path is reset so it doesn't leak to
subsequent non-compaction emit_end calls."""
tracker = CompactionTracker()
session = _make_session()
tracker.on_compact("/first/path.jsonl")
tracker.emit_start_if_ready()
await tracker.emit_end_if_ready(session)
# After compaction, _transcript_path is cleared
assert tracker._transcript_path == ""

View File

@@ -7,8 +7,8 @@ JSONL session files — no SDK subprocess needed. Exercises:
2. User query appended, assistant response streamed
3. PreCompact hook fires → CompactionTracker.on_compact()
4. Next message → emit_start_if_ready() yields spinner events
5. Message after that → emit_end_if_ready() returns end events
6. _read_compacted_entries() reads the CLI session file
5. Message after that → emit_end_if_ready() returns CompactionResult
6. read_compacted_entries() reads the CLI session file
7. TranscriptBuilder.replace_entries() syncs state
8. More messages appended post-compaction
9. to_jsonl() exports full state for upload
@@ -16,7 +16,6 @@ JSONL session files — no SDK subprocess needed. Exercises:
"""
import asyncio
from pathlib import Path
from backend.copilot.model import ChatSession
from backend.copilot.response_model import (
@@ -27,7 +26,10 @@ from backend.copilot.response_model import (
StreamToolOutputAvailable,
)
from backend.copilot.sdk.compaction import CompactionTracker
from backend.copilot.sdk.transcript import strip_progress_entries
from backend.copilot.sdk.transcript import (
read_compacted_entries,
strip_progress_entries,
)
from backend.copilot.sdk.transcript_builder import TranscriptBuilder
from backend.util import json
@@ -41,32 +43,6 @@ def _run(coro):
return asyncio.run(coro)
def _read_compacted_entries(path: str) -> tuple[list[dict], str] | None:
"""Test-only: read compacted entries from a session JSONL file.
Returns (parsed_dicts, jsonl_string) from the first ``isCompactSummary``
entry onward, or ``None`` if no summary is found.
"""
content = Path(path).read_text()
lines = content.strip().split("\n")
compact_idx: int | None = None
parsed: list[dict] = []
raw_lines: list[str] = []
for line in lines:
if not line.strip():
continue
entry = json.loads(line, fallback=None)
if not isinstance(entry, dict):
continue
parsed.append(entry)
raw_lines.append(line.strip())
if compact_idx is None and entry.get("isCompactSummary"):
compact_idx = len(parsed) - 1
if compact_idx is None:
return None
return parsed[compact_idx:], "\n".join(raw_lines[compact_idx:]) + "\n"
# ---------------------------------------------------------------------------
# Fixtures: realistic CLI session file content
# ---------------------------------------------------------------------------
@@ -229,7 +205,7 @@ class TestCompactionE2E:
path.write_text(_make_jsonl(*entries))
return path
def test_full_compaction_lifecycle(self, tmp_path):
def test_full_compaction_lifecycle(self, tmp_path, monkeypatch):
"""Simulate the complete service.py compaction flow.
Timeline:
@@ -240,14 +216,18 @@ class TestCompactionE2E:
5. Mid-stream: PreCompact hook fires (context too large)
6. CLI writes compaction summary to session file
7. Next SDK message → emit_start (spinner)
8. Following message → emit_end (end events)
9. _read_compacted_entries reads the session file
8. Following message → emit_end (CompactionResult)
9. read_compacted_entries reads the session file
10. replace_entries syncs TranscriptBuilder
11. More assistant messages appended
12. Export → upload → next turn downloads it
"""
session_dir = tmp_path / "session"
# --- Setup CLI projects directory ---
config_dir = tmp_path / "config"
projects_dir = config_dir / "projects"
session_dir = projects_dir / "proj"
session_dir.mkdir(parents=True)
monkeypatch.setenv("CLAUDE_CONFIG_DIR", str(config_dir))
# --- Step 1-2: Load "downloaded" transcript from previous turn ---
previous_transcript = _make_jsonl(
@@ -296,8 +276,7 @@ class TestCompactionE2E:
# --- Step 7: CompactionTracker receives PreCompact hook ---
tracker = CompactionTracker()
session = ChatSession.new(user_id="test-user")
# on_compact is a property returning Event.set callable
tracker.on_compact()
tracker.on_compact(str(session_file))
# --- Step 8: Next SDK message arrives → emit_start ---
start_events = tracker.emit_start_if_ready()
@@ -311,30 +290,31 @@ class TestCompactionE2E:
assert tool_call_id.startswith("compaction-")
# --- Step 9: Following message → emit_end ---
end_events = _run(tracker.emit_end_if_ready(session))
assert len(end_events) == 2
assert isinstance(end_events[0], StreamToolOutputAvailable)
assert isinstance(end_events[1], StreamFinishStep)
result = _run(tracker.emit_end_if_ready(session))
assert result.just_ended is True
assert result.transcript_path == str(session_file)
assert len(result.events) == 2
assert isinstance(result.events[0], StreamToolOutputAvailable)
assert isinstance(result.events[1], StreamFinishStep)
# Verify same tool_call_id
assert end_events[0].toolCallId == tool_call_id
assert result.events[0].toolCallId == tool_call_id
# Session should have compaction messages persisted
assert len(session.messages) == 2
assert session.messages[0].role == "assistant"
assert session.messages[1].role == "tool"
# --- Step 10: _read_compacted_entries + replace_entries ---
result = _read_compacted_entries(str(session_file))
assert result is not None
compacted_dicts, compacted_jsonl = result
# --- Step 10: read_compacted_entries + replace_entries ---
compacted = read_compacted_entries(str(session_file))
assert compacted is not None
# Should have: COMPACT_SUMMARY + POST_COMPACT_ASST + USER_3 + ASST_3
assert len(compacted_dicts) == 4
assert compacted_dicts[0]["uuid"] == "cs1"
assert compacted_dicts[0]["isCompactSummary"] is True
assert len(compacted) == 4
assert compacted[0]["uuid"] == "cs1"
assert compacted[0]["isCompactSummary"] is True
# Replace builder state with compacted JSONL
# Replace builder state with compacted entries
old_count = builder.entry_count
builder.replace_entries(compacted_jsonl)
builder.replace_entries(compacted)
assert builder.entry_count == 4 # Only compacted entries
assert builder.entry_count < old_count # Compaction reduced entries
@@ -387,10 +367,13 @@ class TestCompactionE2E:
# Parented to the last entry from previous turn
assert last_entry["parentUuid"] == output_entries[-1]["uuid"]
def test_double_compaction_within_session(self, tmp_path):
def test_double_compaction_within_session(self, tmp_path, monkeypatch):
"""Two compactions in the same session (across reset_for_query)."""
session_dir = tmp_path / "session"
config_dir = tmp_path / "config"
projects_dir = config_dir / "projects"
session_dir = projects_dir / "proj"
session_dir.mkdir(parents=True)
monkeypatch.setenv("CLAUDE_CONFIG_DIR", str(config_dir))
tracker = CompactionTracker()
session = ChatSession.new(user_id="test")
@@ -416,15 +399,14 @@ class TestCompactionE2E:
file1 = session_dir / "session1.jsonl"
file1.write_text(_make_jsonl(first_summary, first_post))
tracker.on_compact()
tracker.on_compact(str(file1))
tracker.emit_start_if_ready()
end_events1 = _run(tracker.emit_end_if_ready(session))
assert len(end_events1) == 2 # output + finish
result1 = _run(tracker.emit_end_if_ready(session))
assert result1.just_ended is True
result1_entries = _read_compacted_entries(str(file1))
assert result1_entries is not None
_, compacted1_jsonl = result1_entries
builder.replace_entries(compacted1_jsonl)
compacted1 = read_compacted_entries(str(file1))
assert compacted1 is not None
builder.replace_entries(compacted1)
assert builder.entry_count == 2
# --- Reset for second query ---
@@ -449,15 +431,14 @@ class TestCompactionE2E:
file2 = session_dir / "session2.jsonl"
file2.write_text(_make_jsonl(second_summary, second_post))
tracker.on_compact()
tracker.on_compact(str(file2))
tracker.emit_start_if_ready()
end_events2 = _run(tracker.emit_end_if_ready(session))
assert len(end_events2) == 2 # output + finish
result2 = _run(tracker.emit_end_if_ready(session))
assert result2.just_ended is True
result2_entries = _read_compacted_entries(str(file2))
assert result2_entries is not None
_, compacted2_jsonl = result2_entries
builder.replace_entries(compacted2_jsonl)
compacted2 = read_compacted_entries(str(file2))
assert compacted2 is not None
builder.replace_entries(compacted2)
assert builder.entry_count == 2 # Only second compaction entries
# Export and verify
@@ -466,7 +447,9 @@ class TestCompactionE2E:
assert entries[0]["uuid"] == "cs-second"
assert entries[0].get("isCompactSummary") is True
def test_strip_progress_then_load_then_compact_roundtrip(self, tmp_path):
def test_strip_progress_then_load_then_compact_roundtrip(
self, tmp_path, monkeypatch
):
"""Full pipeline: strip → load → compact → replace → export → reload.
This tests the exact sequence that happens across two turns:
@@ -475,8 +458,11 @@ class TestCompactionE2E:
Turn 2: Download → load_previous → compaction fires → replace → export
Turn 3: Download the Turn 2 export → load_previous (roundtrip)
"""
session_dir = tmp_path / "session"
config_dir = tmp_path / "config"
projects_dir = config_dir / "projects"
session_dir = projects_dir / "proj"
session_dir.mkdir(parents=True)
monkeypatch.setenv("CLAUDE_CONFIG_DIR", str(config_dir))
# --- Turn 1: SDK produces raw transcript ---
raw_content = _make_jsonl(
@@ -525,10 +511,9 @@ class TestCompactionE2E:
],
)
result = _read_compacted_entries(str(session_file))
assert result is not None
_, compacted_jsonl = result
builder.replace_entries(compacted_jsonl)
compacted = read_compacted_entries(str(session_file))
assert compacted is not None
builder.replace_entries(compacted)
# Append post-compaction message
builder.append_user("Thanks!")

View File

@@ -41,12 +41,20 @@ from typing import Any
from backend.copilot.context import (
get_current_sandbox,
get_sdk_cwd,
get_workspace_manager,
is_allowed_local_path,
resolve_sandbox_path,
)
from backend.copilot.model import ChatSession
from backend.copilot.tools.workspace_files import get_manager
from backend.util.file import parse_workspace_uri
from backend.util.file_content_parser import (
BINARY_FORMATS,
MIME_TO_FORMAT,
PARSE_EXCEPTIONS,
infer_format_from_uri,
parse_file_content,
)
from backend.util.type import MediaFileType
class FileRefExpansionError(Exception):
@@ -74,6 +82,8 @@ _FILE_REF_RE = re.compile(
_MAX_EXPAND_CHARS = 200_000
# Maximum total characters across all @@agptfile: expansions in one string.
_MAX_TOTAL_EXPAND_CHARS = 1_000_000
# Maximum raw byte size for bare ref structured parsing (10 MB).
_MAX_BARE_REF_BYTES = 10_000_000
@dataclass
@@ -83,6 +93,11 @@ class FileRef:
end_line: int | None # 1-indexed, inclusive
# ---------------------------------------------------------------------------
# Public API (top-down: main functions first, helpers below)
# ---------------------------------------------------------------------------
def parse_file_ref(text: str) -> FileRef | None:
"""Return a :class:`FileRef` if *text* is a bare file reference token.
@@ -104,17 +119,6 @@ def parse_file_ref(text: str) -> FileRef | None:
return FileRef(uri=m.group(1), start_line=start, end_line=end)
def _apply_line_range(text: str, start: int | None, end: int | None) -> str:
"""Slice *text* to the requested 1-indexed line range (inclusive)."""
if start is None and end is None:
return text
lines = text.splitlines(keepends=True)
s = (start - 1) if start is not None else 0
e = end if end is not None else len(lines)
selected = list(itertools.islice(lines, s, e))
return "".join(selected)
async def read_file_bytes(
uri: str,
user_id: str | None,
@@ -130,27 +134,47 @@ async def read_file_bytes(
if plain.startswith("workspace://"):
if not user_id:
raise ValueError("workspace:// file references require authentication")
manager = await get_manager(user_id, session.session_id)
manager = await get_workspace_manager(user_id, session.session_id)
ws = parse_workspace_uri(plain)
try:
return await (
data = await (
manager.read_file(ws.file_ref)
if ws.is_path
else manager.read_file_by_id(ws.file_ref)
)
except FileNotFoundError:
raise ValueError(f"File not found: {plain}")
except Exception as exc:
except (PermissionError, OSError) as exc:
raise ValueError(f"Failed to read {plain}: {exc}") from exc
except (AttributeError, TypeError, RuntimeError) as exc:
# AttributeError/TypeError: workspace manager returned an
# unexpected type or interface; RuntimeError: async runtime issues.
logger.warning("Unexpected error reading %s: %s", plain, exc)
raise ValueError(f"Failed to read {plain}: {exc}") from exc
# NOTE: Workspace API does not support pre-read size checks;
# the full file is loaded before the size guard below.
if len(data) > _MAX_BARE_REF_BYTES:
raise ValueError(
f"File too large ({len(data)} bytes, limit {_MAX_BARE_REF_BYTES})"
)
return data
if is_allowed_local_path(plain, get_sdk_cwd()):
resolved = os.path.realpath(os.path.expanduser(plain))
try:
# Read with a one-byte overshoot to detect files that exceed the limit
# without a separate os.path.getsize call (avoids TOCTOU race).
with open(resolved, "rb") as fh:
return fh.read()
data = fh.read(_MAX_BARE_REF_BYTES + 1)
if len(data) > _MAX_BARE_REF_BYTES:
raise ValueError(
f"File too large (>{_MAX_BARE_REF_BYTES} bytes, "
f"limit {_MAX_BARE_REF_BYTES})"
)
return data
except FileNotFoundError:
raise ValueError(f"File not found: {plain}")
except Exception as exc:
except OSError as exc:
raise ValueError(f"Failed to read {plain}: {exc}") from exc
sandbox = get_current_sandbox()
@@ -162,9 +186,33 @@ async def read_file_bytes(
f"Path is not allowed (not in workspace, sdk_cwd, or sandbox): {plain}"
) from exc
try:
return bytes(await sandbox.files.read(remote, format="bytes"))
except Exception as exc:
data = bytes(await sandbox.files.read(remote, format="bytes"))
except (FileNotFoundError, OSError, UnicodeDecodeError) as exc:
raise ValueError(f"Failed to read from sandbox: {plain}: {exc}") from exc
except Exception as exc:
# E2B SDK raises SandboxException subclasses (NotFoundException,
# TimeoutException, NotEnoughSpaceException, etc.) which don't
# inherit from standard exceptions. Import lazily to avoid a
# hard dependency on e2b at module level.
try:
from e2b.exceptions import SandboxException # noqa: PLC0415
if isinstance(exc, SandboxException):
raise ValueError(
f"Failed to read from sandbox: {plain}: {exc}"
) from exc
except ImportError:
pass
# Re-raise unexpected exceptions (TypeError, AttributeError, etc.)
# so they surface as real bugs rather than being silently masked.
raise
# NOTE: E2B sandbox API does not support pre-read size checks;
# the full file is loaded before the size guard below.
if len(data) > _MAX_BARE_REF_BYTES:
raise ValueError(
f"File too large ({len(data)} bytes, limit {_MAX_BARE_REF_BYTES})"
)
return data
raise ValueError(
f"Path is not allowed (not in workspace, sdk_cwd, or sandbox): {plain}"
@@ -178,15 +226,13 @@ async def resolve_file_ref(
) -> str:
"""Resolve a :class:`FileRef` to its text content."""
raw = await read_file_bytes(ref.uri, user_id, session)
return _apply_line_range(
raw.decode("utf-8", errors="replace"), ref.start_line, ref.end_line
)
return _apply_line_range(_to_str(raw), ref.start_line, ref.end_line)
async def expand_file_refs_in_string(
text: str,
user_id: str | None,
session: "ChatSession",
session: ChatSession,
*,
raise_on_error: bool = False,
) -> str:
@@ -232,6 +278,9 @@ async def expand_file_refs_in_string(
if len(content) > _MAX_EXPAND_CHARS:
content = content[:_MAX_EXPAND_CHARS] + "\n... [truncated]"
remaining = _MAX_TOTAL_EXPAND_CHARS - total_chars
# remaining == 0 means the budget was exactly exhausted by the
# previous ref. The elif below (len > remaining) won't catch
# this since 0 > 0 is false, so we need the <= 0 check.
if remaining <= 0:
content = "[file-ref budget exhausted: total expansion limit reached]"
elif len(content) > remaining:
@@ -252,13 +301,31 @@ async def expand_file_refs_in_string(
async def expand_file_refs_in_args(
args: dict[str, Any],
user_id: str | None,
session: "ChatSession",
session: ChatSession,
*,
input_schema: dict[str, Any] | None = None,
) -> dict[str, Any]:
"""Recursively expand ``@@agptfile:...`` references in tool call arguments.
String values are expanded in-place. Nested dicts and lists are
traversed. Non-string scalars are returned unchanged.
**Bare references** (the entire argument value is a single
``@@agptfile:...`` token with no surrounding text) are resolved and then
parsed according to the file's extension or MIME type. See
:mod:`backend.util.file_content_parser` for the full list of supported
formats (JSON, JSONL, CSV, TSV, YAML, TOML, Parquet, Excel).
When *input_schema* is provided and the target property has
``"type": "string"``, structured parsing is skipped — the raw file content
is returned as a plain string so blocks receive the original text.
If the format is unrecognised or parsing fails, the content is returned as
a plain string (the fallback).
**Embedded references** (``@@agptfile:`` mixed with other text) always
produce a plain string — structured parsing only applies to bare refs.
Raises :class:`FileRefExpansionError` if any reference fails to resolve,
so the tool is *not* executed with an error string as its input. The
caller (the MCP tool wrapper) should convert this into an MCP error
@@ -267,15 +334,382 @@ async def expand_file_refs_in_args(
if not args:
return args
async def _expand(value: Any) -> Any:
properties = (input_schema or {}).get("properties", {})
async def _expand(
value: Any,
*,
prop_schema: dict[str, Any] | None = None,
) -> Any:
"""Recursively expand a single argument value.
Strings are checked for ``@@agptfile:`` references and expanded
(bare refs get structured parsing; embedded refs get inline
substitution). Dicts and lists are traversed recursively,
threading the corresponding sub-schema from *prop_schema* so
that nested fields also receive correct type-aware expansion.
Non-string scalars pass through unchanged.
"""
if isinstance(value, str):
ref = parse_file_ref(value)
if ref is not None:
# MediaFileType fields: return the raw URI immediately —
# no file reading, no format inference, no content parsing.
if _is_media_file_field(prop_schema):
return ref.uri
fmt = infer_format_from_uri(ref.uri)
# Workspace URIs by ID (workspace://abc123) have no extension.
# When the MIME fragment is also missing, fall back to the
# workspace file manager's metadata for format detection.
if fmt is None and ref.uri.startswith("workspace://"):
fmt = await _infer_format_from_workspace(ref.uri, user_id, session)
return await _expand_bare_ref(ref, fmt, user_id, session, prop_schema)
# Not a bare ref — do normal inline expansion.
return await expand_file_refs_in_string(
value, user_id, session, raise_on_error=True
)
if isinstance(value, dict):
return {k: await _expand(v) for k, v in value.items()}
# When the schema says this is an object but doesn't define
# inner properties, skip expansion — the caller (e.g.
# RunBlockTool) will expand with the actual nested schema.
if (
prop_schema is not None
and prop_schema.get("type") == "object"
and "properties" not in prop_schema
):
return value
nested_props = (prop_schema or {}).get("properties", {})
return {
k: await _expand(v, prop_schema=nested_props.get(k))
for k, v in value.items()
}
if isinstance(value, list):
return [await _expand(item) for item in value]
items_schema = (prop_schema or {}).get("items")
return [await _expand(item, prop_schema=items_schema) for item in value]
return value
return {k: await _expand(v) for k, v in args.items()}
return {k: await _expand(v, prop_schema=properties.get(k)) for k, v in args.items()}
# ---------------------------------------------------------------------------
# Private helpers (used by the public functions above)
# ---------------------------------------------------------------------------
def _apply_line_range(text: str, start: int | None, end: int | None) -> str:
"""Slice *text* to the requested 1-indexed line range (inclusive).
When the requested range extends beyond the file, a note is appended
so the LLM knows it received the entire remaining content.
"""
if start is None and end is None:
return text
lines = text.splitlines(keepends=True)
total = len(lines)
s = (start - 1) if start is not None else 0
e = end if end is not None else total
selected = list(itertools.islice(lines, s, e))
result = "".join(selected)
if end is not None and end > total:
result += f"\n[Note: file has only {total} lines]\n"
return result
def _to_str(content: str | bytes) -> str:
"""Decode *content* to a string if it is bytes, otherwise return as-is."""
if isinstance(content, str):
return content
return content.decode("utf-8", errors="replace")
def _check_content_size(content: str | bytes) -> None:
"""Raise :class:`ValueError` if *content* exceeds the byte limit.
Raises ``ValueError`` (not ``FileRefExpansionError``) so that the caller
(``_expand_bare_ref``) can unify all resolution errors into a single
``except ValueError`` → ``FileRefExpansionError`` handler, keeping the
error-flow consistent with ``read_file_bytes`` and ``resolve_file_ref``.
For ``bytes``, the length is the byte count directly. For ``str``,
we encode to UTF-8 first because multi-byte characters (e.g. emoji)
mean the byte size can be up to 4x the character count.
"""
if isinstance(content, bytes):
size = len(content)
else:
char_len = len(content)
# Fast lower bound: UTF-8 byte count >= char count.
# If char count already exceeds the limit, reject immediately
# without allocating an encoded copy.
if char_len > _MAX_BARE_REF_BYTES:
size = char_len # real byte size is even larger
# Fast upper bound: each char is at most 4 UTF-8 bytes.
# If worst-case is still under the limit, skip encoding entirely.
elif char_len * 4 <= _MAX_BARE_REF_BYTES:
return
else:
# Edge case: char count is under limit but multibyte chars
# might push byte count over. Encode to get exact size.
size = len(content.encode("utf-8"))
if size > _MAX_BARE_REF_BYTES:
raise ValueError(
f"File too large for structured parsing "
f"({size} bytes, limit {_MAX_BARE_REF_BYTES})"
)
async def _infer_format_from_workspace(
uri: str,
user_id: str | None,
session: ChatSession,
) -> str | None:
"""Look up workspace file metadata to infer the format.
Workspace URIs by ID (``workspace://abc123``) have no file extension.
When the MIME fragment is also absent, we query the workspace file
manager for the file's stored MIME type and original filename.
"""
if not user_id:
return None
try:
ws = parse_workspace_uri(uri)
manager = await get_workspace_manager(user_id, session.session_id)
info = await (
manager.get_file_info(ws.file_ref)
if not ws.is_path
else manager.get_file_info_by_path(ws.file_ref)
)
if info is None:
return None
# Try MIME type first, then filename extension.
mime = (info.mime_type or "").split(";", 1)[0].strip().lower()
return MIME_TO_FORMAT.get(mime) or infer_format_from_uri(info.name)
except (
ValueError,
FileNotFoundError,
OSError,
PermissionError,
AttributeError,
TypeError,
):
# Expected failures: bad URI, missing file, permission denied, or
# workspace manager returning unexpected types. Propagate anything
# else (e.g. programming errors) so they don't get silently swallowed.
logger.debug("workspace metadata lookup failed for %s", uri, exc_info=True)
return None
def _is_media_file_field(prop_schema: dict[str, Any] | None) -> bool:
"""Return True if *prop_schema* describes a MediaFileType field (format: file)."""
if prop_schema is None:
return False
return (
prop_schema.get("type") == "string"
and prop_schema.get("format") == MediaFileType.string_format
)
async def _expand_bare_ref(
ref: FileRef,
fmt: str | None,
user_id: str | None,
session: ChatSession,
prop_schema: dict[str, Any] | None,
) -> Any:
"""Resolve and parse a bare ``@@agptfile:`` reference.
This is the structured-parsing path: the file is read, optionally parsed
according to *fmt*, and adapted to the target *prop_schema*.
Raises :class:`FileRefExpansionError` on resolution or parsing failure.
Note: MediaFileType fields (format: "file") are handled earlier in
``_expand`` to avoid unnecessary format inference and file I/O.
"""
try:
if fmt is not None and fmt in BINARY_FORMATS:
# Binary formats need raw bytes, not UTF-8 text.
# Line ranges are meaningless for binary formats (parquet/xlsx)
# — ignore them and parse full bytes. Warn so the caller/model
# knows the range was silently dropped.
if ref.start_line is not None or ref.end_line is not None:
logger.warning(
"Line range [%s-%s] ignored for binary format %s (%s); "
"binary formats are always parsed in full.",
ref.start_line,
ref.end_line,
fmt,
ref.uri,
)
content: str | bytes = await read_file_bytes(ref.uri, user_id, session)
else:
content = await resolve_file_ref(ref, user_id, session)
except ValueError as exc:
raise FileRefExpansionError(str(exc)) from exc
# For known formats this rejects files >10 MB before parsing.
# For unknown formats _MAX_EXPAND_CHARS (200K chars) below is stricter,
# but this check still guards the parsing path which has no char limit.
# _check_content_size raises ValueError, which we unify here just like
# resolution errors above.
try:
_check_content_size(content)
except ValueError as exc:
raise FileRefExpansionError(str(exc)) from exc
# When the schema declares this parameter as "string",
# return raw file content — don't parse into a structured
# type that would need json.dumps() serialisation.
expect_string = (prop_schema or {}).get("type") == "string"
if expect_string:
if isinstance(content, bytes):
raise FileRefExpansionError(
f"Cannot use {fmt} file as text input: "
f"binary formats (parquet, xlsx) must be passed "
f"to a block that accepts structured data (list/object), "
f"not a string-typed parameter."
)
return content
if fmt is not None:
# Use strict mode for binary formats so we surface the
# actual error (e.g. missing pyarrow/openpyxl, corrupt
# file) instead of silently returning garbled bytes.
strict = fmt in BINARY_FORMATS
try:
parsed = parse_file_content(content, fmt, strict=strict)
except PARSE_EXCEPTIONS as exc:
raise FileRefExpansionError(f"Failed to parse {fmt} file: {exc}") from exc
# Normalize bytes fallback to str so tools never
# receive raw bytes when parsing fails.
if isinstance(parsed, bytes):
parsed = _to_str(parsed)
return _adapt_to_schema(parsed, prop_schema)
# Unknown format — return as plain string, but apply
# the same per-ref character limit used by inline refs
# to prevent injecting unexpectedly large content.
text = _to_str(content)
if len(text) > _MAX_EXPAND_CHARS:
text = text[:_MAX_EXPAND_CHARS] + "\n... [truncated]"
return text
def _adapt_to_schema(parsed: Any, prop_schema: dict[str, Any] | None) -> Any:
"""Adapt a parsed file value to better fit the target schema type.
When the parser returns a natural type (e.g. dict from YAML, list from CSV)
that doesn't match the block's expected type, this function converts it to
a more useful representation instead of relying on pydantic's generic
coercion (which can produce awkward results like flattened dicts → lists).
Returns *parsed* unchanged when no adaptation is needed.
"""
if prop_schema is None:
return parsed
target_type = prop_schema.get("type")
# Dict → array: delegate to helper.
if isinstance(parsed, dict) and target_type == "array":
return _adapt_dict_to_array(parsed, prop_schema)
# List → object: delegate to helper (raises for non-tabular lists).
if isinstance(parsed, list) and target_type == "object":
return _adapt_list_to_object(parsed)
# Tabular list → Any (no type): convert to list of dicts.
# Blocks like FindInDictionaryBlock have `input: Any` which produces
# a schema with no "type" key. Tabular [[header],[rows]] is unusable
# for key lookup, but [{col: val}, ...] works with FindInDict's
# list-of-dicts branch (line 195-199 in data_manipulation.py).
if isinstance(parsed, list) and target_type is None and _is_tabular(parsed):
return _tabular_to_list_of_dicts(parsed)
return parsed
def _adapt_dict_to_array(parsed: dict, prop_schema: dict[str, Any]) -> Any:
"""Adapt a parsed dict to an array-typed field.
Extracts list-valued entries when the target item type is ``array``,
passes through unchanged when item type is ``string`` (lets pydantic error),
or wraps in ``[parsed]`` as a fallback.
"""
items_type = (prop_schema.get("items") or {}).get("type")
if items_type == "array":
# Target is List[List[Any]] — extract list-typed values from the
# dict as inner lists. E.g. YAML {"fruits": [{...},...]}} with
# ConcatenateLists (List[List[Any]]) → [[{...},...]].
list_values = [v for v in parsed.values() if isinstance(v, list)]
if list_values:
return list_values
if items_type == "string":
# Target is List[str] — wrapping a dict would give [dict]
# which can't coerce to strings. Return unchanged and let
# pydantic surface a clear validation error.
return parsed
# Fallback: wrap in a single-element list so the block gets [dict]
# instead of pydantic flattening keys/values into a flat list.
return [parsed]
def _adapt_list_to_object(parsed: list) -> Any:
"""Adapt a parsed list to an object-typed field.
Converts tabular lists to column-dicts; raises for non-tabular lists.
"""
if _is_tabular(parsed):
return _tabular_to_column_dict(parsed)
# Non-tabular list (e.g. a plain Python list from a YAML file) cannot
# be meaningfully coerced to an object. Raise explicitly so callers
# get a clear error rather than pydantic silently wrapping the list.
raise FileRefExpansionError(
"Cannot adapt a non-tabular list to an object-typed field. "
"Expected a tabular structure ([[header], [row1], ...]) or a dict."
)
def _is_tabular(parsed: Any) -> bool:
"""Check if parsed data is in tabular format: [[header], [row1], ...].
Uses isinstance checks because this is a structural type guard on
opaque parser output (Any), not duck typing. A Protocol wouldn't
help here — we need to verify exact list-of-lists shape.
"""
if not isinstance(parsed, list) or len(parsed) < 2:
return False
header = parsed[0]
if not isinstance(header, list) or not header:
return False
if not all(isinstance(h, str) for h in header):
return False
return all(isinstance(row, list) for row in parsed[1:])
def _tabular_to_list_of_dicts(parsed: list) -> list[dict[str, Any]]:
"""Convert [[header], [row1], ...] → [{header[0]: row[0], ...}, ...].
Ragged rows (fewer columns than the header) get None for missing values.
Extra values beyond the header length are silently dropped.
"""
header = parsed[0]
return [
dict(itertools.zip_longest(header, row[: len(header)], fillvalue=None))
for row in parsed[1:]
]
def _tabular_to_column_dict(parsed: list) -> dict[str, list]:
"""Convert [[header], [row1], ...] → {"col1": [val1, ...], ...}.
Ragged rows (fewer columns than the header) get None for missing values,
ensuring all columns have equal length.
"""
header = parsed[0]
return {
col: [row[i] if i < len(row) else None for row in parsed[1:]]
for i, col in enumerate(header)
}

View File

@@ -175,6 +175,199 @@ async def test_expand_args_replaces_file_ref_in_nested_dict():
assert result["count"] == 42
# ---------------------------------------------------------------------------
# expand_file_refs_in_args — bare ref structured parsing
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_bare_ref_json_returns_parsed_dict():
"""Bare ref to a .json file returns parsed dict, not raw string."""
with tempfile.TemporaryDirectory() as sdk_cwd:
json_file = os.path.join(sdk_cwd, "data.json")
with open(json_file, "w") as f:
f.write('{"key": "value", "count": 42}')
with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
mock_cwd_var.get.return_value = sdk_cwd
result = await expand_file_refs_in_args(
{"data": f"@@agptfile:{json_file}"},
user_id="u1",
session=_make_session(),
)
assert result["data"] == {"key": "value", "count": 42}
@pytest.mark.asyncio
async def test_bare_ref_csv_returns_parsed_table():
"""Bare ref to a .csv file returns list[list[str]] table."""
with tempfile.TemporaryDirectory() as sdk_cwd:
csv_file = os.path.join(sdk_cwd, "data.csv")
with open(csv_file, "w") as f:
f.write("Name,Score\nAlice,90\nBob,85")
with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
mock_cwd_var.get.return_value = sdk_cwd
result = await expand_file_refs_in_args(
{"input": f"@@agptfile:{csv_file}"},
user_id="u1",
session=_make_session(),
)
assert result["input"] == [
["Name", "Score"],
["Alice", "90"],
["Bob", "85"],
]
@pytest.mark.asyncio
async def test_bare_ref_unknown_extension_returns_string():
"""Bare ref to a file with unknown extension returns plain string."""
with tempfile.TemporaryDirectory() as sdk_cwd:
txt_file = os.path.join(sdk_cwd, "readme.txt")
with open(txt_file, "w") as f:
f.write("plain text content")
with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
mock_cwd_var.get.return_value = sdk_cwd
result = await expand_file_refs_in_args(
{"data": f"@@agptfile:{txt_file}"},
user_id="u1",
session=_make_session(),
)
assert result["data"] == "plain text content"
assert isinstance(result["data"], str)
@pytest.mark.asyncio
async def test_bare_ref_invalid_json_falls_back_to_string():
"""Bare ref to a .json file with invalid JSON falls back to string."""
with tempfile.TemporaryDirectory() as sdk_cwd:
json_file = os.path.join(sdk_cwd, "bad.json")
with open(json_file, "w") as f:
f.write("not valid json {{{")
with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
mock_cwd_var.get.return_value = sdk_cwd
result = await expand_file_refs_in_args(
{"data": f"@@agptfile:{json_file}"},
user_id="u1",
session=_make_session(),
)
assert result["data"] == "not valid json {{{"
assert isinstance(result["data"], str)
@pytest.mark.asyncio
async def test_embedded_ref_always_returns_string_even_for_json():
"""Embedded ref (text around it) returns plain string, not parsed JSON."""
with tempfile.TemporaryDirectory() as sdk_cwd:
json_file = os.path.join(sdk_cwd, "data.json")
with open(json_file, "w") as f:
f.write('{"key": "value"}')
with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
mock_cwd_var.get.return_value = sdk_cwd
result = await expand_file_refs_in_args(
{"data": f"prefix @@agptfile:{json_file} suffix"},
user_id="u1",
session=_make_session(),
)
assert isinstance(result["data"], str)
assert result["data"].startswith("prefix ")
assert result["data"].endswith(" suffix")
@pytest.mark.asyncio
async def test_bare_ref_yaml_returns_parsed_dict():
"""Bare ref to a .yaml file returns parsed dict."""
with tempfile.TemporaryDirectory() as sdk_cwd:
yaml_file = os.path.join(sdk_cwd, "config.yaml")
with open(yaml_file, "w") as f:
f.write("name: test\ncount: 42\n")
with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
mock_cwd_var.get.return_value = sdk_cwd
result = await expand_file_refs_in_args(
{"config": f"@@agptfile:{yaml_file}"},
user_id="u1",
session=_make_session(),
)
assert result["config"] == {"name": "test", "count": 42}
@pytest.mark.asyncio
async def test_bare_ref_binary_with_line_range_ignores_range():
"""Bare ref to a binary file (.parquet) with line range parses the full file.
Binary formats (parquet, xlsx) ignore line ranges — the full content is
parsed and the range is silently dropped with a log warning.
"""
try:
import pandas as pd
except ImportError:
pytest.skip("pandas not installed")
try:
import pyarrow # noqa: F401 # pyright: ignore[reportMissingImports]
except ImportError:
pytest.skip("pyarrow not installed")
with tempfile.TemporaryDirectory() as sdk_cwd:
parquet_file = os.path.join(sdk_cwd, "data.parquet")
import io as _io
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
buf = _io.BytesIO()
df.to_parquet(buf, index=False)
with open(parquet_file, "wb") as f:
f.write(buf.getvalue())
with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
mock_cwd_var.get.return_value = sdk_cwd
# Line range [1-2] should be silently ignored for binary formats.
result = await expand_file_refs_in_args(
{"data": f"@@agptfile:{parquet_file}[1-2]"},
user_id="u1",
session=_make_session(),
)
# Full file is returned despite the line range.
assert result["data"] == [["A", "B"], [1, 4], [2, 5], [3, 6]]
@pytest.mark.asyncio
async def test_bare_ref_toml_returns_parsed_dict():
"""Bare ref to a .toml file returns parsed dict."""
with tempfile.TemporaryDirectory() as sdk_cwd:
toml_file = os.path.join(sdk_cwd, "config.toml")
with open(toml_file, "w") as f:
f.write('name = "test"\ncount = 42\n')
with patch("backend.copilot.context._current_sdk_cwd") as mock_cwd_var:
mock_cwd_var.get.return_value = sdk_cwd
result = await expand_file_refs_in_args(
{"config": f"@@agptfile:{toml_file}"},
user_id="u1",
session=_make_session(),
)
assert result["config"] == {"name": "test", "count": 42}
# ---------------------------------------------------------------------------
# _read_file_handler — extended to accept workspace:// and local paths
# ---------------------------------------------------------------------------
@@ -219,7 +412,7 @@ async def test_read_file_handler_workspace_uri():
"backend.copilot.sdk.tool_adapter.get_execution_context",
return_value=("user-1", mock_session),
), patch(
"backend.copilot.sdk.file_ref.get_manager",
"backend.copilot.sdk.file_ref.get_workspace_manager",
new=AsyncMock(return_value=mock_manager),
):
result = await _read_file_handler(
@@ -276,7 +469,7 @@ async def test_read_file_bytes_workspace_virtual_path():
mock_manager.read_file.return_value = b"virtual path content"
with patch(
"backend.copilot.sdk.file_ref.get_manager",
"backend.copilot.sdk.file_ref.get_workspace_manager",
new=AsyncMock(return_value=mock_manager),
):
result = await read_file_bytes("workspace:///reports/q1.md", "user-1", session)

File diff suppressed because it is too large Load Diff

View File

@@ -20,7 +20,24 @@ Use these URLs directly without asking the user:
| Cloudflare | `https://mcp.cloudflare.com/mcp` |
| Atlassian / Jira | `https://mcp.atlassian.com/mcp` |
For other services, search the MCP registry at https://registry.modelcontextprotocol.io/.
For other services, search the MCP registry API:
```http
GET https://registry.modelcontextprotocol.io/v0/servers?q=<search_term>
```
Each result includes a `remotes` array with the exact server URL to use.
### Important: Check blocks first
Before using `run_mcp_tool`, always check if the platform already has blocks for the service
using `find_block`. The platform has hundreds of built-in blocks (Google Sheets, Google Docs,
Google Calendar, Gmail, etc.) that work without MCP setup.
Only use `run_mcp_tool` when:
- The service is in the known hosted MCP servers list above, OR
- You searched `find_block` first and found no matching blocks
**Never guess or construct MCP server URLs.** Only use URLs from the known servers list above
or from the `remotes[].url` field in MCP registry search results.
### Authentication

View File

@@ -221,12 +221,12 @@ class SDKResponseAdapter:
responses.append(StreamFinish())
else:
logger.warning(
"Unexpected ResultMessage subtype: %s", sdk_message.subtype
f"Unexpected ResultMessage subtype: {sdk_message.subtype}"
)
responses.append(StreamFinish())
else:
logger.debug("Unhandled SDK message type: %s", type(sdk_message).__name__)
logger.debug(f"Unhandled SDK message type: {type(sdk_message).__name__}")
return responses

View File

@@ -52,7 +52,7 @@ def _validate_workspace_path(
if is_allowed_local_path(path, sdk_cwd):
return {}
logger.warning("Blocked %s outside workspace: %s", tool_name, path)
logger.warning(f"Blocked {tool_name} outside workspace: {path}")
workspace_hint = f" Allowed workspace: {sdk_cwd}" if sdk_cwd else ""
return _deny(
f"[SECURITY] Tool '{tool_name}' can only access files within the workspace "
@@ -71,7 +71,7 @@ def _validate_tool_access(
"""
# Block forbidden tools
if tool_name in BLOCKED_TOOLS:
logger.warning("Blocked tool access attempt: %s", tool_name)
logger.warning(f"Blocked tool access attempt: {tool_name}")
return _deny(
f"[SECURITY] Tool '{tool_name}' is blocked for security. "
"This is enforced by the platform and cannot be bypassed. "
@@ -111,9 +111,7 @@ def _validate_user_isolation(
# the tool itself via _validate_ephemeral_path.
path = tool_input.get("path", "") or tool_input.get("file_path", "")
if path and ".." in path:
logger.warning(
"Blocked path traversal attempt: %s by user %s", path, user_id
)
logger.warning(f"Blocked path traversal attempt: {path} by user {user_id}")
return {
"hookSpecificOutput": {
"hookEventName": "PreToolUse",
@@ -129,7 +127,7 @@ def create_security_hooks(
user_id: str | None,
sdk_cwd: str | None = None,
max_subtasks: int = 3,
on_compact: Callable[[], None] | None = None,
on_compact: Callable[[str], None] | None = None,
) -> dict[str, Any]:
"""Create the security hooks configuration for Claude Agent SDK.
@@ -144,6 +142,7 @@ def create_security_hooks(
sdk_cwd: SDK working directory for workspace-scoped tool validation
max_subtasks: Maximum concurrent Task (sub-agent) spawns allowed per session
on_compact: Callback invoked when SDK starts compacting context.
Receives the transcript_path from the hook input.
Returns:
Hooks configuration dict for ClaudeAgentOptions
@@ -171,7 +170,7 @@ def create_security_hooks(
# Block background task execution first — denied calls
# should not consume a subtask slot.
if tool_input.get("run_in_background"):
logger.info("[SDK] Blocked background Task, user=%s", user_id)
logger.info(f"[SDK] Blocked background Task, user={user_id}")
return cast(
SyncHookJSONOutput,
_deny(
@@ -213,7 +212,7 @@ def create_security_hooks(
if tool_name == "Task" and tool_use_id is not None:
task_tool_use_ids.add(tool_use_id)
logger.debug("[SDK] Tool start: %s, user=%s", tool_name, user_id)
logger.debug(f"[SDK] Tool start: {tool_name}, user={user_id}")
return cast(SyncHookJSONOutput, {})
def _release_task_slot(tool_name: str, tool_use_id: str | None) -> None:
@@ -303,11 +302,21 @@ def create_security_hooks(
"""
_ = context, tool_use_id
trigger = input_data.get("trigger", "auto")
# Sanitize untrusted input before logging to prevent log injection
transcript_path = (
str(input_data.get("transcript_path", ""))
.replace("\n", "")
.replace("\r", "")
)
logger.info(
f"[SDK] Context compaction triggered: {trigger}, user={user_id}"
"[SDK] Context compaction triggered: %s, user=%s, "
"transcript_path=%s",
trigger,
user_id,
transcript_path,
)
if on_compact is not None:
on_compact()
on_compact(transcript_path)
return cast(SyncHookJSONOutput, {})
hooks: dict[str, Any] = {

View File

@@ -29,6 +29,7 @@ from langfuse import propagate_attributes
from langsmith.integrations.claude_agent_sdk import configure_claude_agent_sdk
from pydantic import BaseModel
from backend.copilot.context import get_workspace_manager
from backend.data.redis_client import get_redis_async
from backend.executor.cluster_lock import AsyncClusterLock
from backend.util.exceptions import NotFoundError
@@ -40,13 +41,11 @@ from ..constants import COPILOT_ERROR_PREFIX, COPILOT_SYSTEM_PREFIX
from ..model import (
ChatMessage,
ChatSession,
Usage,
get_chat_session,
update_session_title,
upsert_chat_session,
)
from ..prompting import get_sdk_supplement
from ..rate_limit import record_token_usage
from ..response_model import (
StreamBaseResponse,
StreamError,
@@ -56,7 +55,6 @@ from ..response_model import (
StreamTextDelta,
StreamToolInputAvailable,
StreamToolOutputAvailable,
StreamUsage,
)
from ..service import (
_build_system_prompt,
@@ -65,7 +63,6 @@ from ..service import (
)
from ..tools.e2b_sandbox import get_or_create_sandbox, pause_sandbox_direct
from ..tools.sandbox import WORKSPACE_PREFIX, make_session_path
from ..tools.workspace_files import get_manager
from ..tracking import track_user_message
from .compaction import CompactionTracker, filter_compaction_messages
from .response_adapter import SDKResponseAdapter
@@ -78,12 +75,9 @@ from .tool_adapter import (
wait_for_stash,
)
from .transcript import (
COMPACT_THRESHOLD_BYTES,
TranscriptDownload,
cleanup_cli_project_dir,
compact_transcript,
download_transcript,
read_cli_session_file,
read_compacted_entries,
upload_transcript,
validate_transcript,
write_transcript_to_tempfile,
@@ -301,7 +295,7 @@ def _cleanup_sdk_tool_results(cwd: str) -> None:
"""
normalized = os.path.normpath(cwd)
if not normalized.startswith(_SDK_CWD_PREFIX):
logger.warning("[SDK] Rejecting cleanup for path outside workspace: %s", cwd)
logger.warning(f"[SDK] Rejecting cleanup for path outside workspace: {cwd}")
return
# Clean the CLI's project directory (transcripts + tool-results).
@@ -395,7 +389,7 @@ async def _compress_messages(
client=client,
)
except Exception as e:
logger.warning("[SDK] Context compression with LLM failed: %s", e)
logger.warning(f"[SDK] Context compression with LLM failed: {e}")
# Fall back to truncation-only (no LLM summarization)
result = await compress_context(
messages=messages_dict,
@@ -571,7 +565,7 @@ async def _prepare_file_attachments(
return empty
try:
manager = await get_manager(user_id, session_id)
manager = await get_workspace_manager(user_id, session_id)
except Exception:
logger.warning(
"Failed to create workspace manager for file attachments",
@@ -631,56 +625,6 @@ async def _prepare_file_attachments(
return PreparedAttachments(hint=hint, image_blocks=image_blocks)
async def _maybe_compact_and_upload(
dl: TranscriptDownload,
user_id: str,
session_id: str,
log_prefix: str = "[Transcript]",
) -> str:
"""Compact an oversized transcript and upload the compacted version.
Returns the (possibly compacted) transcript content, or an empty string
if compaction was needed but failed.
"""
content = dl.content
if len(content) <= COMPACT_THRESHOLD_BYTES:
return content
logger.warning(
"%s Transcript oversized (%dB > %dB), compacting",
log_prefix,
len(content),
COMPACT_THRESHOLD_BYTES,
)
compacted = await compact_transcript(content, log_prefix=log_prefix)
if not compacted:
logger.warning(
"%s Compaction failed, skipping resume for this turn", log_prefix
)
return ""
# Keep the original message_count: it reflects the number of
# session.messages covered by this transcript, which the gap-fill
# logic uses as a slice index. Counting JSONL lines would give a
# smaller number (compacted messages != session message count) and
# cause already-covered messages to be re-injected.
try:
await upload_transcript(
user_id=user_id,
session_id=session_id,
content=compacted,
message_count=dl.message_count,
log_prefix=log_prefix,
)
except Exception:
logger.warning(
"%s Failed to upload compacted transcript",
log_prefix,
exc_info=True,
)
return compacted
async def stream_chat_completion_sdk(
session_id: str,
message: str | None = None,
@@ -792,14 +736,6 @@ async def stream_chat_completion_sdk(
_otel_ctx: Any = None
# Make sure there is no more code between the lock acquisition and try-block.
# Token usage accumulators — populated from ResultMessage at end of turn
turn_prompt_tokens = 0 # uncached input tokens only
turn_completion_tokens = 0
turn_cache_read_tokens = 0
turn_cache_creation_tokens = 0
total_tokens = 0 # computed once before StreamUsage, reused in finally
turn_cost_usd: float | None = None
try:
# Build system prompt (reuses non-SDK path with Langfuse support).
# Pre-compute the cwd here so the exact working directory path can be
@@ -892,33 +828,20 @@ async def stream_chat_completion_sdk(
is_valid,
)
if is_valid:
transcript_content = await _maybe_compact_and_upload(
dl,
user_id=user_id or "",
session_id=session_id,
log_prefix=log_prefix,
)
# Load previous context into builder (empty string is a no-op)
if transcript_content:
transcript_builder.load_previous(
transcript_content, log_prefix=log_prefix
)
resume_file = (
write_transcript_to_tempfile(
transcript_content, session_id, sdk_cwd
)
if transcript_content
else None
# Load previous FULL context into builder
transcript_builder.load_previous(dl.content, log_prefix=log_prefix)
resume_file = write_transcript_to_tempfile(
dl.content, session_id, sdk_cwd
)
if resume_file:
use_resume = True
transcript_msg_count = dl.message_count
logger.debug(
f"{log_prefix} Using --resume ({len(transcript_content)}B, "
f"{log_prefix} Using --resume ({len(dl.content)}B, "
f"msg_count={transcript_msg_count})"
)
else:
logger.warning("%s Transcript downloaded but invalid", log_prefix)
logger.warning(f"{log_prefix} Transcript downloaded but invalid")
elif config.claude_agent_use_resume and user_id and len(session.messages) > 1:
logger.warning(
f"{log_prefix} No transcript available "
@@ -1123,6 +1046,7 @@ async def stream_chat_completion_sdk(
exc_info=True,
)
ended_with_stream_error = True
yield StreamError(
errorText=f"SDK stream error: {stream_err}",
code="sdk_stream_error",
@@ -1188,7 +1112,7 @@ async def stream_chat_completion_sdk(
- len(adapter.resolved_tool_calls),
)
# Log ResultMessage details and capture token usage
# Log ResultMessage details for debugging
if isinstance(sdk_msg, ResultMessage):
logger.info(
"%s Received: ResultMessage %s "
@@ -1207,46 +1131,26 @@ async def stream_chat_completion_sdk(
sdk_msg.result or "(no error message provided)",
)
# Capture token usage from ResultMessage.
# Anthropic reports cached tokens separately:
# input_tokens = uncached only
# cache_read_input_tokens = served from cache
# cache_creation_input_tokens = written to cache
if sdk_msg.usage:
turn_prompt_tokens += sdk_msg.usage.get("input_tokens", 0)
turn_cache_read_tokens += sdk_msg.usage.get(
"cache_read_input_tokens", 0
)
turn_cache_creation_tokens += sdk_msg.usage.get(
"cache_creation_input_tokens", 0
)
turn_completion_tokens += sdk_msg.usage.get(
"output_tokens", 0
)
logger.info(
"%s Token usage: uncached=%d, cache_read=%d, cache_create=%d, output=%d",
log_prefix,
turn_prompt_tokens,
turn_cache_read_tokens,
turn_cache_creation_tokens,
turn_completion_tokens,
)
if sdk_msg.total_cost_usd is not None:
turn_cost_usd = sdk_msg.total_cost_usd
# Emit compaction end if SDK finished compacting.
# When compaction ends, sync TranscriptBuilder with
# the CLI's compacted session file so the uploaded
# transcript reflects compaction.
compaction_events = await compaction.emit_end_if_ready(session)
for ev in compaction_events:
# When compaction ends, sync TranscriptBuilder with the
# CLI's active context so they stay identical.
compact_result = await compaction.emit_end_if_ready(session)
for ev in compact_result.events:
yield ev
if compaction_events and sdk_cwd:
cli_content = await read_cli_session_file(sdk_cwd)
if cli_content:
# After replace_entries, skip append_assistant for this
# sdk_msg — the CLI session file already contains it,
# so appending again would create a duplicate.
entries_replaced = False
if compact_result.just_ended:
compacted = await asyncio.to_thread(
read_compacted_entries,
compact_result.transcript_path,
)
if compacted is not None:
transcript_builder.replace_entries(
cli_content, log_prefix=log_prefix
compacted, log_prefix=log_prefix
)
entries_replaced = True
for response in adapter.convert_message(sdk_msg):
if isinstance(response, StreamStart):
@@ -1333,10 +1237,11 @@ async def stream_chat_completion_sdk(
tool_call_id=response.toolCallId,
)
)
transcript_builder.append_tool_result(
tool_use_id=response.toolCallId,
content=content,
)
if not entries_replaced:
transcript_builder.append_tool_result(
tool_use_id=response.toolCallId,
content=content,
)
has_tool_results = True
elif isinstance(response, StreamFinish):
@@ -1346,7 +1251,9 @@ async def stream_chat_completion_sdk(
# any stashed tool results from the previous turn are
# recorded first, preserving the required API order:
# assistant(tool_use) → tool_result → assistant(text).
if isinstance(sdk_msg, AssistantMessage):
# Skip if replace_entries just ran — the CLI session
# file already contains this message.
if isinstance(sdk_msg, AssistantMessage) and not entries_replaced:
transcript_builder.append_assistant(
content_blocks=_format_sdk_content_blocks(sdk_msg.content),
model=sdk_msg.model,
@@ -1440,27 +1347,6 @@ async def stream_chat_completion_sdk(
) and not has_appended_assistant:
session.messages.append(assistant_response)
# Emit token usage to the client (must be in try to reach SSE stream).
# Session persistence of usage is in finally to stay consistent with
# rate-limit recording even if an exception interrupts between here
# and the finally block.
# Compute total_tokens once; reused in the finally block for
# session persistence and rate-limit recording.
total_tokens = (
turn_prompt_tokens
+ turn_cache_read_tokens
+ turn_cache_creation_tokens
+ turn_completion_tokens
)
if total_tokens > 0:
yield StreamUsage(
promptTokens=turn_prompt_tokens,
completionTokens=turn_completion_tokens,
totalTokens=total_tokens,
cacheReadTokens=turn_cache_read_tokens,
cacheCreationTokens=turn_cache_creation_tokens,
)
# Transcript upload is handled exclusively in the finally block
# to avoid double-uploads (the success path used to upload the
# old resume file, then the finally block overwrote it with the
@@ -1525,48 +1411,6 @@ async def stream_chat_completion_sdk(
except Exception:
logger.warning("OTEL context teardown failed", exc_info=True)
# --- Persist token usage to session + rate-limit counters ---
# Both must live in finally so they stay consistent even when an
# exception interrupts the try block after StreamUsage was yielded.
# total_tokens is computed once before StreamUsage yield above.
if total_tokens > 0:
if session is not None:
session.usage.append(
Usage(
prompt_tokens=turn_prompt_tokens,
completion_tokens=turn_completion_tokens,
total_tokens=total_tokens,
cache_read_tokens=turn_cache_read_tokens,
cache_creation_tokens=turn_cache_creation_tokens,
)
)
logger.info(
"%s Turn usage: uncached=%d, cache_read=%d, cache_create=%d, "
"output=%d, total=%d, cost_usd=%s",
log_prefix,
turn_prompt_tokens,
turn_cache_read_tokens,
turn_cache_creation_tokens,
turn_completion_tokens,
total_tokens,
turn_cost_usd,
)
if user_id and total_tokens > 0:
try:
await record_token_usage(
user_id=user_id,
prompt_tokens=turn_prompt_tokens,
completion_tokens=turn_completion_tokens,
cache_read_tokens=turn_cache_read_tokens,
cache_creation_tokens=turn_cache_creation_tokens,
)
except Exception as usage_err:
logger.warning(
"%s Failed to record token usage: %s",
log_prefix,
usage_err,
)
# --- Persist session messages ---
# This MUST run in finally to persist messages even when the generator
# is stopped early (e.g., user clicks stop, processor breaks stream loop).
@@ -1600,13 +1444,13 @@ async def stream_chat_completion_sdk(
task.add_done_callback(_background_tasks.discard)
# --- Upload transcript for next-turn --resume ---
# This MUST run in finally so the transcript is uploaded even when
# the streaming loop raises an exception.
# The transcript represents the COMPLETE active context (atomic).
# TranscriptBuilder is the single source of truth. It mirrors the
# CLI's active context: on compaction, replace_entries() syncs it
# with the compacted session file. No CLI file read needed here.
if config.claude_agent_use_resume and user_id and session is not None:
try:
# Build complete transcript from captured SDK messages
transcript_content = transcript_builder.to_jsonl()
entry_count = transcript_builder.entry_count
if not transcript_content:
logger.warning(
@@ -1616,18 +1460,15 @@ async def stream_chat_completion_sdk(
logger.warning(
"%s Transcript invalid, skipping upload (entries=%d)",
log_prefix,
transcript_builder.entry_count,
entry_count,
)
else:
logger.info(
"%s Uploading complete transcript (entries=%d, bytes=%d)",
"%s Uploading transcript (entries=%d, bytes=%d)",
log_prefix,
transcript_builder.entry_count,
entry_count,
len(transcript_content),
)
# Shield upload from cancellation - let it complete even if
# the finally block is interrupted. No timeout to avoid race
# conditions where backgrounded uploads overwrite newer transcripts.
await asyncio.shield(
upload_transcript(
user_id=user_id,
@@ -1662,6 +1503,6 @@ async def _update_title_async(
)
if title and user_id:
await update_session_title(session_id, user_id, title, only_if_empty=True)
logger.debug("[SDK] Generated title for %s: %s", session_id, title)
logger.debug(f"[SDK] Generated title for {session_id}: {title}")
except Exception as e:
logger.warning("[SDK] Failed to update session title: %s", e)
logger.warning(f"[SDK] Failed to update session title: {e}")

View File

@@ -20,7 +20,7 @@ class _FakeFileInfo:
size_bytes: int
_PATCH_TARGET = "backend.copilot.sdk.service.get_manager"
_PATCH_TARGET = "backend.copilot.sdk.service.get_workspace_manager"
class TestPrepareFileAttachments:

View File

@@ -234,9 +234,7 @@ def create_tool_handler(base_tool: BaseTool):
try:
return await _execute_tool_sync(base_tool, user_id, session, args)
except Exception as e:
logger.error(
"Error executing tool %s: %s", base_tool.name, e, exc_info=True
)
logger.error(f"Error executing tool {base_tool.name}: {e}", exc_info=True)
return _mcp_error(f"Failed to execute {base_tool.name}: {e}")
return tool_handler
@@ -349,7 +347,7 @@ def create_copilot_mcp_server(*, use_e2b: bool = False):
:func:`get_sdk_disallowed_tools`.
"""
def _truncating(fn, tool_name: str):
def _truncating(fn, tool_name: str, input_schema: dict[str, Any] | None = None):
"""Wrap a tool handler so its response is truncated to stay under the
SDK's 10 MB JSON buffer, and stash the (truncated) output for the
response adapter before the SDK can apply its own head-truncation.
@@ -363,7 +361,9 @@ def create_copilot_mcp_server(*, use_e2b: bool = False):
user_id, session = get_execution_context()
if session is not None:
try:
args = await expand_file_refs_in_args(args, user_id, session)
args = await expand_file_refs_in_args(
args, user_id, session, input_schema=input_schema
)
except FileRefExpansionError as exc:
return _mcp_error(
f"@@agptfile: reference could not be resolved: {exc}. "
@@ -391,11 +391,12 @@ def create_copilot_mcp_server(*, use_e2b: bool = False):
for tool_name, base_tool in TOOL_REGISTRY.items():
handler = create_tool_handler(base_tool)
schema = _build_input_schema(base_tool)
decorated = tool(
tool_name,
base_tool.description,
_build_input_schema(base_tool),
)(_truncating(handler, tool_name))
schema,
)(_truncating(handler, tool_name, input_schema=schema))
sdk_tools.append(decorated)
# E2B file tools replace SDK built-in Read/Write/Edit/Glob/Grep.

View File

@@ -17,13 +17,8 @@ import shutil
import time
from dataclasses import dataclass
from pathlib import Path
from uuid import uuid4
import openai
from backend.copilot.config import ChatConfig
from backend.util import json
from backend.util.prompt import CompressResult, compress_context
logger = logging.getLogger(__name__)
@@ -41,11 +36,6 @@ STRIPPABLE_TYPES = frozenset(
{"progress", "file-history-snapshot", "queue-operation", "summary", "pr-link"}
)
# JSONL protocol values used in transcript serialization.
STOP_REASON_END_TURN = "end_turn"
COMPACT_MSG_ID_PREFIX = "msg_compact_"
ENTRY_TYPE_MESSAGE = "message"
@dataclass
class TranscriptDownload:
@@ -109,9 +99,7 @@ def strip_progress_entries(content: str) -> str:
continue
parent = entry.get("parentUuid", "")
original_parent = parent
seen_parents: set[str] = set()
while parent in stripped_uuids and parent not in seen_parents:
seen_parents.add(parent)
while parent in stripped_uuids:
parent = uuid_to_parent.get(parent, "")
if parent != original_parent:
entry["parentUuid"] = parent
@@ -157,60 +145,147 @@ def _sanitize_id(raw_id: str, max_len: int = 36) -> str:
_SAFE_CWD_PREFIX = os.path.realpath("/tmp/copilot-")
def _projects_base() -> str:
"""Return the resolved path to the CLI's projects directory."""
config_dir = os.environ.get("CLAUDE_CONFIG_DIR") or os.path.expanduser("~/.claude")
return os.path.realpath(os.path.join(config_dir, "projects"))
def _cli_project_dir(sdk_cwd: str) -> str | None:
"""Return the CLI's project directory for a given working directory.
Returns ``None`` if the path would escape the projects base.
"""
cwd_encoded = re.sub(r"[^a-zA-Z0-9]", "-", os.path.realpath(sdk_cwd))
config_dir = os.environ.get("CLAUDE_CONFIG_DIR") or os.path.expanduser("~/.claude")
projects_base = os.path.realpath(os.path.join(config_dir, "projects"))
projects_base = _projects_base()
project_dir = os.path.realpath(os.path.join(projects_base, cwd_encoded))
if not project_dir.startswith(projects_base + os.sep):
logger.warning("[Transcript] Project dir escaped base: %s", project_dir)
logger.warning(
"[Transcript] Project dir escaped projects base: %s", project_dir
)
return None
return project_dir
async def read_cli_session_file(sdk_cwd: str) -> str | None:
"""Read the CLI's own session file, which reflects any mid-stream compaction.
def _safe_glob_jsonl(project_dir: str) -> list[Path]:
"""Glob ``*.jsonl`` files, filtering out symlinks that escape the directory."""
try:
resolved_base = Path(project_dir).resolve()
except OSError as e:
logger.warning("[Transcript] Failed to resolve project dir: %s", e)
return []
After the CLI compacts context, its session file contains the compacted
conversation. Reading this file lets ``TranscriptBuilder`` replace its
uncompacted entries with the CLI's compacted version.
result: list[Path] = []
for candidate in Path(project_dir).glob("*.jsonl"):
try:
resolved = candidate.resolve()
if resolved.is_relative_to(resolved_base):
result.append(resolved)
except (OSError, RuntimeError) as e:
logger.debug(
"[Transcript] Skipping invalid CLI session candidate %s: %s",
candidate,
e,
)
return result
def read_compacted_entries(transcript_path: str) -> list[dict] | None:
"""Read compacted entries from the CLI session file after compaction.
Parses the JSONL file line-by-line, finds the ``isCompactSummary: true``
entry, and returns it plus all entries after it.
The CLI writes the compaction summary BEFORE sending the next message,
so the file is guaranteed to be flushed by the time we read it.
Returns a list of parsed dicts, or ``None`` if the file cannot be read
or no compaction summary is found.
"""
import aiofiles
if not transcript_path:
return None
projects_base = _projects_base()
real_path = os.path.realpath(transcript_path)
if not real_path.startswith(projects_base + os.sep):
logger.warning(
"[Transcript] transcript_path outside projects base: %s", transcript_path
)
return None
try:
content = Path(real_path).read_text()
except OSError as e:
logger.warning(
"[Transcript] Failed to read session file %s: %s", transcript_path, e
)
return None
lines = content.strip().split("\n")
compact_idx: int | None = None
for idx, line in enumerate(lines):
if not line.strip():
continue
entry = json.loads(line, fallback=None)
if not isinstance(entry, dict):
continue
if entry.get("isCompactSummary"):
compact_idx = idx # don't break — find the LAST summary
if compact_idx is None:
logger.debug("[Transcript] No compaction summary found in %s", transcript_path)
return None
entries: list[dict] = []
for line in lines[compact_idx:]:
if not line.strip():
continue
entry = json.loads(line, fallback=None)
if isinstance(entry, dict):
entries.append(entry)
logger.info(
"[Transcript] Read %d compacted entries from %s (summary at line %d)",
len(entries),
transcript_path,
compact_idx + 1,
)
return entries
def read_cli_session_file(sdk_cwd: str) -> str | None:
"""Read the CLI's own session file, which reflects any compaction.
The CLI writes its session transcript to
``~/.claude/projects/<encoded_cwd>/<session_id>.jsonl``.
Since each SDK turn uses a unique ``sdk_cwd``, there should be
exactly one ``.jsonl`` file in that directory.
Returns the file content, or ``None`` if not found.
"""
project_dir = _cli_project_dir(sdk_cwd)
if not project_dir or not os.path.isdir(project_dir):
return None
jsonl_files = list(Path(project_dir).glob("*.jsonl"))
jsonl_files = _safe_glob_jsonl(project_dir)
if not jsonl_files:
logger.debug("[Transcript] No CLI session file in %s", project_dir)
return None
# Pick the most recently modified file (there should only be one per turn).
# Guard against races where a file is deleted between glob and stat.
candidates: list[tuple[float, Path]] = []
for p in jsonl_files:
try:
candidates.append((p.stat().st_mtime, p))
except OSError:
continue
if not candidates:
logger.debug("[Transcript] No readable CLI session file in %s", project_dir)
return None
# Resolve + prefix check to prevent symlink escapes.
session_file = max(candidates, key=lambda item: item[0])[1]
real_path = str(session_file.resolve())
if not real_path.startswith(project_dir + os.sep):
logger.warning("[Transcript] Session file escaped project dir: %s", real_path)
logger.debug("[Transcript] No CLI session file found in %s", project_dir)
return None
# Pick the most recently modified file (should be only one per turn).
try:
async with aiofiles.open(real_path) as f:
content = await f.read()
session_file = max(jsonl_files, key=lambda p: p.stat().st_mtime)
except OSError as e:
logger.warning("[Transcript] Failed to inspect CLI session files: %s", e)
return None
try:
content = session_file.read_text()
logger.info(
"[Transcript] Read CLI session file: %s (%d bytes)",
real_path,
session_file,
len(content),
)
return content
@@ -220,10 +295,16 @@ async def read_cli_session_file(sdk_cwd: str) -> str | None:
def cleanup_cli_project_dir(sdk_cwd: str) -> None:
"""Remove the CLI's project directory for a specific working directory."""
"""Remove the CLI's project directory for a specific working directory.
The CLI stores session data under ``~/.claude/projects/<encoded_cwd>/``.
Each SDK turn uses a unique ``sdk_cwd``, so the project directory is
safe to remove entirely after the transcript has been uploaded.
"""
project_dir = _cli_project_dir(sdk_cwd)
if not project_dir:
return
if os.path.isdir(project_dir):
shutil.rmtree(project_dir, ignore_errors=True)
logger.debug("[Transcript] Cleaned up CLI project dir: %s", project_dir)
@@ -246,7 +327,7 @@ def write_transcript_to_tempfile(
# Validate cwd is under the expected sandbox prefix (CodeQL sanitizer).
real_cwd = os.path.realpath(cwd)
if not real_cwd.startswith(_SAFE_CWD_PREFIX):
logger.warning("[Transcript] cwd outside sandbox: %s", cwd)
logger.warning(f"[Transcript] cwd outside sandbox: {cwd}")
return None
try:
@@ -256,17 +337,17 @@ def write_transcript_to_tempfile(
os.path.join(real_cwd, f"transcript-{safe_id}.jsonl")
)
if not jsonl_path.startswith(real_cwd):
logger.warning("[Transcript] Path escaped cwd: %s", jsonl_path)
logger.warning(f"[Transcript] Path escaped cwd: {jsonl_path}")
return None
with open(jsonl_path, "w") as f:
f.write(transcript_content)
logger.info("[Transcript] Wrote resume file: %s", jsonl_path)
logger.info(f"[Transcript] Wrote resume file: {jsonl_path}")
return jsonl_path
except OSError as e:
logger.warning("[Transcript] Failed to write resume file: %s", e)
logger.warning(f"[Transcript] Failed to write resume file: {e}")
return None
@@ -325,24 +406,27 @@ def _meta_storage_path_parts(user_id: str, session_id: str) -> tuple[str, str, s
)
def _build_storage_path(user_id: str, session_id: str, backend: object) -> str:
"""Build the full storage path string that ``retrieve()`` expects.
``store()`` returns a path like ``gcs://bucket/workspaces/...`` or
``local://workspace_id/file_id/filename``. Since we use deterministic
arguments we can reconstruct the same path for download/delete without
having stored the return value.
"""
def _build_path_from_parts(parts: tuple[str, str, str], backend: object) -> str:
"""Build a full storage path from (workspace_id, file_id, filename) parts."""
from backend.util.workspace_storage import GCSWorkspaceStorage
wid, fid, fname = _storage_path_parts(user_id, session_id)
wid, fid, fname = parts
if isinstance(backend, GCSWorkspaceStorage):
blob = f"workspaces/{wid}/{fid}/{fname}"
return f"gcs://{backend.bucket_name}/{blob}"
else:
# LocalWorkspaceStorage returns local://{relative_path}
return f"local://{wid}/{fid}/{fname}"
return f"local://{wid}/{fid}/{fname}"
def _build_storage_path(user_id: str, session_id: str, backend: object) -> str:
"""Build the full storage path string that ``retrieve()`` expects."""
return _build_path_from_parts(_storage_path_parts(user_id, session_id), backend)
def _build_meta_storage_path(user_id: str, session_id: str, backend: object) -> str:
"""Build the full storage path for the companion .meta.json file."""
return _build_path_from_parts(
_meta_storage_path_parts(user_id, session_id), backend
)
async def upload_transcript(
@@ -410,14 +494,11 @@ async def upload_transcript(
content=json.dumps(meta).encode("utf-8"),
)
except Exception as e:
logger.warning("%s Failed to write metadata: %s", log_prefix, e)
logger.warning(f"{log_prefix} Failed to write metadata: {e}")
logger.info(
"%s Uploaded %dB (stripped from %dB, msg_count=%d)",
log_prefix,
len(encoded),
len(content),
message_count,
f"{log_prefix} Uploaded {len(encoded)}B "
f"(stripped from {len(content)}B, msg_count={message_count})"
)
@@ -440,37 +521,25 @@ async def download_transcript(
data = await storage.retrieve(path)
content = data.decode("utf-8")
except FileNotFoundError:
logger.debug("%s No transcript in storage", log_prefix)
logger.debug(f"{log_prefix} No transcript in storage")
return None
except Exception as e:
logger.warning("%s Failed to download transcript: %s", log_prefix, e)
logger.warning(f"{log_prefix} Failed to download transcript: {e}")
return None
# Try to load metadata (best-effort — old transcripts won't have it)
message_count = 0
uploaded_at = 0.0
try:
from backend.util.workspace_storage import GCSWorkspaceStorage
mwid, mfid, mfname = _meta_storage_path_parts(user_id, session_id)
if isinstance(storage, GCSWorkspaceStorage):
blob = f"workspaces/{mwid}/{mfid}/{mfname}"
meta_path = f"gcs://{storage.bucket_name}/{blob}"
else:
meta_path = f"local://{mwid}/{mfid}/{mfname}"
meta_path = _build_meta_storage_path(user_id, session_id, storage)
meta_data = await storage.retrieve(meta_path)
meta = json.loads(meta_data.decode("utf-8"), fallback={})
message_count = meta.get("message_count", 0)
uploaded_at = meta.get("uploaded_at", 0.0)
except FileNotFoundError:
except (FileNotFoundError, Exception):
pass # No metadata — treat as unknown (msg_count=0 → always fill gap)
except Exception as e:
logger.debug("%s Failed to load transcript metadata: %s", log_prefix, e)
logger.info(
"%s Downloaded %dB (msg_count=%d)", log_prefix, len(content), message_count
)
logger.info(f"{log_prefix} Downloaded {len(content)}B (msg_count={message_count})")
return TranscriptDownload(
content=content,
message_count=message_count,
@@ -478,171 +547,27 @@ async def download_transcript(
)
# ---------------------------------------------------------------------------
# Transcript compaction
# ---------------------------------------------------------------------------
async def delete_transcript(user_id: str, session_id: str) -> None:
"""Delete transcript and its metadata from bucket storage.
# Transcripts above this byte threshold are compacted at download time.
COMPACT_THRESHOLD_BYTES = 400_000
def _flatten_assistant_content(blocks: list) -> str:
"""Flatten assistant content blocks into a single plain-text string."""
parts: list[str] = []
for block in blocks:
if isinstance(block, dict):
if block.get("type") == "text":
parts.append(block.get("text", ""))
elif block.get("type") == "tool_use":
parts.append(f"[tool_use: {block.get('name', '?')}]")
elif isinstance(block, str):
parts.append(block)
return "\n".join(parts) if parts else ""
def _flatten_tool_result_content(blocks: list) -> str:
"""Flatten tool_result and other content blocks into plain text.
Handles nested tool_result structures, text blocks, and raw strings.
Uses ``json.dumps`` as fallback for dict blocks without a ``text`` key
or where ``text`` is ``None``.
Removes both the ``.jsonl`` transcript and the companion ``.meta.json``
so stale ``message_count`` watermarks cannot corrupt gap-fill logic.
"""
str_parts: list[str] = []
for block in blocks:
if isinstance(block, dict) and block.get("type") == "tool_result":
inner = block.get("content", "")
if isinstance(inner, list):
for sub in inner:
if isinstance(sub, dict):
text = sub.get("text")
str_parts.append(
str(text) if text is not None else json.dumps(sub)
)
else:
str_parts.append(str(sub))
else:
str_parts.append(str(inner))
elif isinstance(block, dict) and block.get("type") == "text":
str_parts.append(str(block.get("text", "")))
elif isinstance(block, str):
str_parts.append(block)
return "\n".join(str_parts) if str_parts else ""
from backend.util.workspace_storage import get_workspace_storage
storage = await get_workspace_storage()
path = _build_storage_path(user_id, session_id, storage)
def _transcript_to_messages(content: str) -> list[dict]:
"""Convert JSONL transcript entries to message dicts for compress_context."""
messages: list[dict] = []
for line in content.strip().split("\n"):
if not line.strip():
continue
entry = json.loads(line, fallback=None)
if not isinstance(entry, dict):
continue
if entry.get("type", "") in STRIPPABLE_TYPES and not entry.get(
"isCompactSummary"
):
continue
msg = entry.get("message", {})
role = msg.get("role", "")
if not role:
continue
msg_dict: dict = {"role": role}
raw_content = msg.get("content")
if role == "assistant" and isinstance(raw_content, list):
msg_dict["content"] = _flatten_assistant_content(raw_content)
elif isinstance(raw_content, list):
msg_dict["content"] = _flatten_tool_result_content(raw_content)
else:
msg_dict["content"] = raw_content or ""
messages.append(msg_dict)
return messages
def _messages_to_transcript(messages: list[dict]) -> str:
"""Convert compressed message dicts back to JSONL transcript format."""
lines: list[str] = []
last_uuid: str | None = None
for msg in messages:
role = msg.get("role", "user")
entry_type = "assistant" if role == "assistant" else "user"
uid = str(uuid4())
content = msg.get("content", "")
if role == "assistant":
message: dict = {
"role": "assistant",
"model": "",
"id": f"{COMPACT_MSG_ID_PREFIX}{uuid4().hex[:24]}",
"type": ENTRY_TYPE_MESSAGE,
"content": [{"type": "text", "text": content}] if content else [],
"stop_reason": STOP_REASON_END_TURN,
"stop_sequence": None,
}
else:
message = {"role": role, "content": content}
entry = {
"type": entry_type,
"uuid": uid,
"parentUuid": last_uuid,
"message": message,
}
lines.append(json.dumps(entry, separators=(",", ":")))
last_uuid = uid
return "\n".join(lines) + "\n" if lines else ""
async def _run_compression(
messages: list[dict],
model: str,
cfg: ChatConfig,
log_prefix: str,
) -> CompressResult:
"""Run LLM-based compression with truncation fallback."""
try:
async with openai.AsyncOpenAI(
api_key=cfg.api_key, base_url=cfg.base_url, timeout=30.0
) as client:
return await compress_context(messages=messages, model=model, client=client)
await storage.delete(path)
logger.info("[Transcript] Deleted transcript for session %s", session_id)
except Exception as e:
logger.warning("%s LLM compaction failed, using truncation: %s", log_prefix, e)
return await compress_context(messages=messages, model=model, client=None)
logger.warning("[Transcript] Failed to delete transcript: %s", e)
async def compact_transcript(
content: str,
log_prefix: str = "[Transcript]",
) -> str | None:
"""Compact an oversized JSONL transcript using LLM summarization.
Converts transcript entries to plain messages, runs ``compress_context``
(the same compressor used for pre-query history), and rebuilds JSONL.
Returns the compacted JSONL string, or ``None`` on failure.
"""
cfg = ChatConfig()
messages = _transcript_to_messages(content)
if len(messages) < 2:
logger.warning("%s Too few messages to compact (%d)", log_prefix, len(messages))
return None
# Also delete the companion .meta.json to avoid orphaned metadata.
try:
result = await _run_compression(messages, cfg.model, cfg, log_prefix)
if not result.was_compacted:
logger.info("%s Transcript already within token budget", log_prefix)
return content
logger.info(
"%s Compacted transcript: %d->%d tokens (%d summarized, %d dropped)",
log_prefix,
result.original_token_count,
result.token_count,
result.messages_summarized,
result.messages_dropped,
)
compacted = _messages_to_transcript(result.messages)
if not validate_transcript(compacted):
logger.warning("%s Compacted transcript failed validation", log_prefix)
return None
return compacted
meta_path = _build_meta_storage_path(user_id, session_id, storage)
await storage.delete(meta_path)
logger.info("[Transcript] Deleted metadata for session %s", session_id)
except Exception as e:
logger.error(
"%s Transcript compaction failed: %s", log_prefix, e, exc_info=True
)
return None
logger.warning("[Transcript] Failed to delete metadata: %s", e)

View File

@@ -30,8 +30,8 @@ class TranscriptEntry(BaseModel):
type: str
uuid: str
parentUuid: str | None
message: dict[str, Any]
isCompactSummary: bool | None = None
message: dict[str, Any]
class TranscriptBuilder:
@@ -54,6 +54,24 @@ class TranscriptBuilder:
return self._entries[-1].message.get("id", "")
return ""
@staticmethod
def _parse_entry(data: dict) -> TranscriptEntry | None:
"""Parse a single transcript entry, filtering strippable types.
Returns ``None`` for entries that should be skipped (strippable types
that are not compaction summaries).
"""
entry_type = data.get("type", "")
if entry_type in STRIPPABLE_TYPES and not data.get("isCompactSummary"):
return None
return TranscriptEntry(
type=entry_type,
uuid=data.get("uuid") or str(uuid4()),
parentUuid=data.get("parentUuid"),
isCompactSummary=data.get("isCompactSummary") or None,
message=data.get("message", {}),
)
def load_previous(self, content: str, log_prefix: str = "[Transcript]") -> None:
"""Load complete previous transcript.
@@ -79,21 +97,9 @@ class TranscriptBuilder:
)
continue
# Skip STRIPPABLE_TYPES unless the entry is a compaction summary.
# Compaction summaries may have type "summary" but must be preserved
# so --resume can reconstruct the compacted conversation.
entry_type = data.get("type", "")
is_compact = data.get("isCompactSummary", False)
if entry_type in STRIPPABLE_TYPES and not is_compact:
entry = self._parse_entry(data)
if entry is None:
continue
entry = TranscriptEntry(
type=data["type"],
uuid=data.get("uuid") or str(uuid4()),
parentUuid=data.get("parentUuid"),
message=data.get("message", {}),
isCompactSummary=True if is_compact else None,
)
self._entries.append(entry)
self._last_uuid = entry.uuid
@@ -166,6 +172,43 @@ class TranscriptBuilder:
)
self._last_uuid = msg_uuid
def replace_entries(
self, compacted_entries: list[dict], log_prefix: str = "[Transcript]"
) -> None:
"""Replace all entries with compacted entries from the CLI session file.
Called after mid-stream compaction so TranscriptBuilder mirrors the
CLI's active context (compaction summary + post-compaction entries).
Builds the new list first and validates it's non-empty before swapping,
so corrupt input cannot wipe the conversation history.
"""
new_entries: list[TranscriptEntry] = []
for data in compacted_entries:
entry = self._parse_entry(data)
if entry is not None:
new_entries.append(entry)
if not new_entries:
logger.warning(
"%s replace_entries produced 0 entries from %d inputs, keeping old (%d entries)",
log_prefix,
len(compacted_entries),
len(self._entries),
)
return
old_count = len(self._entries)
self._entries = new_entries
self._last_uuid = new_entries[-1].uuid
logger.info(
"%s TranscriptBuilder compacted: %d entries -> %d entries",
log_prefix,
old_count,
len(self._entries),
)
def to_jsonl(self) -> str:
"""Export complete context as JSONL.
@@ -181,33 +224,6 @@ class TranscriptBuilder:
lines = [entry.model_dump_json(exclude_none=True) for entry in self._entries]
return "\n".join(lines) + "\n"
def replace_entries(self, content: str, log_prefix: str = "[Transcript]") -> None:
"""Replace all entries with compacted JSONL content.
Called after the CLI performs mid-stream compaction so the builder's
state reflects the compacted conversation instead of the full
pre-compaction history.
"""
prev_count = len(self._entries)
temp = TranscriptBuilder()
try:
temp.load_previous(content, log_prefix=log_prefix)
except Exception:
logger.exception(
"%s Failed to parse compacted transcript; keeping %d existing entries",
log_prefix,
prev_count,
)
return
self._entries = temp._entries
self._last_uuid = temp._last_uuid
logger.info(
"%s Replaced %d entries with %d compacted entries",
log_prefix,
prev_count,
len(self._entries),
)
@property
def entry_count(self) -> int:
"""Total number of entries in the complete context."""

File diff suppressed because it is too large Load Diff

View File

@@ -28,10 +28,24 @@ logger = logging.getLogger(__name__)
config = ChatConfig()
settings = Settings()
client = LangfuseAsyncOpenAI(api_key=config.api_key, base_url=config.base_url)
_client: LangfuseAsyncOpenAI | None = None
_langfuse = None
langfuse = get_client()
def _get_openai_client() -> LangfuseAsyncOpenAI:
global _client
if _client is None:
_client = LangfuseAsyncOpenAI(api_key=config.api_key, base_url=config.base_url)
return _client
def _get_langfuse():
global _langfuse
if _langfuse is None:
_langfuse = get_client()
return _langfuse
# Default system prompt used when Langfuse is not configured
# Provides minimal baseline tone and personality - all workflow, tools, and
@@ -84,7 +98,7 @@ async def _get_system_prompt_template(context: str) -> str:
else "latest"
)
prompt = await asyncio.to_thread(
langfuse.get_prompt,
_get_langfuse().get_prompt,
config.langfuse_prompt_name,
label=label,
cache_ttl_seconds=config.langfuse_prompt_cache_ttl,
@@ -158,7 +172,7 @@ async def _generate_session_title(
"environment": settings.config.app_env.value,
}
response = await client.chat.completions.create(
response = await _get_openai_client().chat.completions.create(
model=config.title_model,
messages=[
{

View File

@@ -32,6 +32,7 @@ import shutil
import tempfile
from typing import Any
from backend.copilot.context import get_workspace_manager
from backend.copilot.model import ChatSession
from backend.util.request import validate_url_host
@@ -43,7 +44,6 @@ from .models import (
ErrorResponse,
ToolResponseBase,
)
from .workspace_files import get_manager
logger = logging.getLogger(__name__)
@@ -194,7 +194,7 @@ async def _save_browser_state(
),
}
manager = await get_manager(user_id, session.session_id)
manager = await get_workspace_manager(user_id, session.session_id)
await manager.write_file(
content=json.dumps(state).encode("utf-8"),
filename=_STATE_FILENAME,
@@ -218,7 +218,7 @@ async def _restore_browser_state(
Returns True on success (or no state to restore), False on failure.
"""
try:
manager = await get_manager(user_id, session.session_id)
manager = await get_workspace_manager(user_id, session.session_id)
file_info = await manager.get_file_info_by_path(_STATE_FILENAME)
if file_info is None:
@@ -360,7 +360,7 @@ async def close_browser_session(session_name: str, user_id: str | None = None) -
# Delete persisted browser state (cookies, localStorage) from workspace.
if user_id:
try:
manager = await get_manager(user_id, session_name)
manager = await get_workspace_manager(user_id, session_name)
file_info = await manager.get_file_info_by_path(_STATE_FILENAME)
if file_info is not None:
await manager.delete_file(file_info.id)

View File

@@ -897,7 +897,7 @@ class TestHasLocalSession:
# _save_browser_state
# ---------------------------------------------------------------------------
_GET_MANAGER = "backend.copilot.tools.agent_browser.get_manager"
_GET_MANAGER = "backend.copilot.tools.agent_browser.get_workspace_manager"
def _make_mock_manager():

View File

@@ -935,5 +935,5 @@ class AgentValidator:
for i, error in enumerate(self.errors, 1):
error_message += f"{i}. {error}\n"
logger.error(f"Agent validation failed: {error_message}")
logger.warning(f"Agent validation failed: {error_message}")
return False, error_message

View File

@@ -8,15 +8,11 @@ from pydantic_core import PydanticUndefined
from backend.blocks._base import AnyBlockSchema
from backend.copilot.constants import COPILOT_NODE_PREFIX, COPILOT_SESSION_PREFIX
from backend.data import db
from backend.data.credit import UsageTransactionMetadata, get_user_credit_model
from backend.data.db_accessors import workspace_db
from backend.data.execution import ExecutionContext
from backend.data.model import CredentialsFieldInfo, CredentialsMetaInput
from backend.executor.utils import block_usage_cost
from backend.integrations.creds_manager import IntegrationCredentialsManager
from backend.util.clients import get_database_manager_async_client
from backend.util.exceptions import BlockError, InsufficientBalanceError
from backend.util.exceptions import BlockError
from backend.util.type import coerce_inputs_to_schema
from .models import BlockOutputResponse, ErrorResponse, ToolResponseBase
@@ -25,26 +21,6 @@ from .utils import match_credentials_to_requirements
logger = logging.getLogger(__name__)
async def _get_credits(user_id: str) -> int:
"""Get user credits using the adapter pattern (RPC when Prisma unavailable)."""
if not db.is_connected():
return await get_database_manager_async_client().get_credits(user_id)
credit_model = await get_user_credit_model(user_id)
return await credit_model.get_credits(user_id)
async def _spend_credits(
user_id: str, cost: int, metadata: UsageTransactionMetadata
) -> int:
"""Spend user credits using the adapter pattern (RPC when Prisma unavailable)."""
if not db.is_connected():
return await get_database_manager_async_client().spend_credits(
user_id, cost, metadata
)
credit_model = await get_user_credit_model(user_id)
return await credit_model.spend_credits(user_id, cost, metadata)
def get_inputs_from_schema(
input_schema: dict[str, Any],
exclude_fields: set[str] | None = None,
@@ -139,20 +115,6 @@ async def execute_block(
# Coerce non-matching data types to the expected input schema.
coerce_inputs_to_schema(input_data, block.input_schema)
# Pre-execution credit check
cost, cost_filter = block_usage_cost(block, input_data)
has_cost = cost > 0
if has_cost:
balance = await _get_credits(user_id)
if balance < cost:
return ErrorResponse(
message=(
f"Insufficient credits to run '{block.name}'. "
"Please top up your credits to continue."
),
session_id=session_id,
)
# Execute the block and collect outputs
outputs: dict[str, list[Any]] = defaultdict(list)
async for output_name, output_data in block.execute(
@@ -161,37 +123,6 @@ async def execute_block(
):
outputs[output_name].append(output_data)
# Charge credits for block execution
if has_cost:
try:
await _spend_credits(
user_id=user_id,
cost=cost,
metadata=UsageTransactionMetadata(
graph_exec_id=synthetic_graph_id,
graph_id=synthetic_graph_id,
node_id=synthetic_node_id,
node_exec_id=node_exec_id,
block_id=block_id,
block=block.name,
input=cost_filter,
reason="copilot_block_execution",
),
)
except InsufficientBalanceError:
logger.warning(
"Post-exec credit charge failed for block %s (cost=%d)",
block.name,
cost,
)
return ErrorResponse(
message=(
f"Insufficient credits to complete '{block.name}'. "
"Please top up your credits to continue."
),
session_id=session_id,
)
return BlockOutputResponse(
message=f"Block '{block.name}' executed successfully",
block_id=block_id,
@@ -202,16 +133,16 @@ async def execute_block(
)
except BlockError as e:
logger.warning("Block execution failed: %s", e)
logger.warning(f"Block execution failed: {e}")
return ErrorResponse(
message=f"Block execution failed: {e}",
error=str(e),
session_id=session_id,
)
except Exception as e:
logger.error("Unexpected error executing block: %s", e, exc_info=True)
logger.error(f"Unexpected error executing block: {e}", exc_info=True)
return ErrorResponse(
message="An unexpected error occurred while executing the block",
message=f"Failed to execute block: {str(e)}",
error=str(e),
session_id=session_id,
)

View File

@@ -1,202 +1,24 @@
"""Tests for execute_block — credit charging and type coercion."""
"""Tests for execute_block type coercion in helpers.py.
Verifies that execute_block() coerces string input values to match the block's
expected input types, mirroring the executor's validate_exec() logic.
This is critical for @@agptfile: expansion, where file content is always a string
but the block may expect structured types (e.g. list[list[str]]).
"""
from collections.abc import AsyncIterator
from typing import Any
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from backend.blocks._base import BlockType
from backend.copilot.tools.helpers import execute_block
from backend.copilot.tools.models import BlockOutputResponse, ErrorResponse
_USER = "test-user-helpers"
_SESSION = "test-session-helpers"
def _make_block(block_id: str = "block-1", name: str = "TestBlock"):
"""Create a minimal mock block for execute_block()."""
mock = MagicMock()
mock.id = block_id
mock.name = name
mock.block_type = BlockType.STANDARD
mock.input_schema = MagicMock()
mock.input_schema.get_credentials_fields_info.return_value = {}
async def _execute(
input_data: dict, **kwargs: Any
) -> AsyncIterator[tuple[str, Any]]:
yield "result", "ok"
mock.execute = _execute
return mock
def _patch_workspace():
"""Patch workspace_db to return a mock workspace."""
mock_workspace = MagicMock()
mock_workspace.id = "ws-1"
mock_ws_db = MagicMock()
mock_ws_db.get_or_create_workspace = AsyncMock(return_value=mock_workspace)
return patch("backend.copilot.tools.helpers.workspace_db", return_value=mock_ws_db)
# ---------------------------------------------------------------------------
# Credit charging tests
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
class TestExecuteBlockCreditCharging:
async def test_charges_credits_when_cost_is_positive(self):
"""Block with cost > 0 should call spend_credits after execution."""
block = _make_block()
mock_spend = AsyncMock()
with (
_patch_workspace(),
patch(
"backend.copilot.tools.helpers.block_usage_cost",
return_value=(10, {"key": "val"}),
),
patch(
"backend.copilot.tools.helpers._get_credits",
new_callable=AsyncMock,
return_value=100,
),
patch(
"backend.copilot.tools.helpers._spend_credits",
new_callable=AsyncMock,
side_effect=mock_spend,
),
):
result = await execute_block(
block=block,
block_id="block-1",
input_data={"text": "hello"},
user_id=_USER,
session_id=_SESSION,
node_exec_id="exec-1",
matched_credentials={},
)
assert isinstance(result, BlockOutputResponse)
assert result.success is True
mock_spend.assert_awaited_once()
call_kwargs = mock_spend.call_args.kwargs
assert call_kwargs["cost"] == 10
assert call_kwargs["metadata"].reason == "copilot_block_execution"
async def test_returns_error_when_insufficient_credits_before_exec(self):
"""Pre-execution check should return ErrorResponse when balance < cost."""
block = _make_block()
with (
_patch_workspace(),
patch(
"backend.copilot.tools.helpers.block_usage_cost",
return_value=(10, {}),
),
patch(
"backend.copilot.tools.helpers._get_credits",
new_callable=AsyncMock,
return_value=5, # balance < cost (10)
),
):
result = await execute_block(
block=block,
block_id="block-1",
input_data={},
user_id=_USER,
session_id=_SESSION,
node_exec_id="exec-1",
matched_credentials={},
)
assert isinstance(result, ErrorResponse)
assert "Insufficient credits" in result.message
async def test_no_charge_when_cost_is_zero(self):
"""Block with cost 0 should not call spend_credits."""
block = _make_block()
with (
_patch_workspace(),
patch(
"backend.copilot.tools.helpers.block_usage_cost",
return_value=(0, {}),
),
patch(
"backend.copilot.tools.helpers._get_credits",
) as mock_get_credits,
patch(
"backend.copilot.tools.helpers._spend_credits",
) as mock_spend_credits,
):
result = await execute_block(
block=block,
block_id="block-1",
input_data={},
user_id=_USER,
session_id=_SESSION,
node_exec_id="exec-1",
matched_credentials={},
)
assert isinstance(result, BlockOutputResponse)
assert result.success is True
# Credit functions should not be called at all for zero-cost blocks
mock_get_credits.assert_not_awaited()
mock_spend_credits.assert_not_awaited()
async def test_returns_error_on_post_exec_insufficient_balance(self):
"""If charging fails after execution, return ErrorResponse."""
from backend.util.exceptions import InsufficientBalanceError
block = _make_block()
with (
_patch_workspace(),
patch(
"backend.copilot.tools.helpers.block_usage_cost",
return_value=(10, {}),
),
patch(
"backend.copilot.tools.helpers._get_credits",
new_callable=AsyncMock,
return_value=15, # passes pre-check
),
patch(
"backend.copilot.tools.helpers._spend_credits",
new_callable=AsyncMock,
side_effect=InsufficientBalanceError(
"Low balance", _USER, 5, 10
), # fails during actual charge (race with concurrent spend)
),
):
result = await execute_block(
block=block,
block_id="block-1",
input_data={},
user_id=_USER,
session_id=_SESSION,
node_exec_id="exec-1",
matched_credentials={},
)
assert isinstance(result, ErrorResponse)
assert "Insufficient credits" in result.message
# ---------------------------------------------------------------------------
# Type coercion tests
# ---------------------------------------------------------------------------
from backend.copilot.tools.models import BlockOutputResponse
def _make_block_schema(annotations: dict[str, Any]) -> MagicMock:
"""Create a mock input_schema with model_fields matching the given annotations."""
schema = MagicMock()
# coerce_inputs_to_schema uses model_fields (Pydantic v2 API)
model_fields = {}
for name, ann in annotations.items():
field = MagicMock()
@@ -206,7 +28,7 @@ def _make_block_schema(annotations: dict[str, Any]) -> MagicMock:
return schema
def _make_coerce_block(
def _make_block(
block_id: str,
name: str,
annotations: dict[str, Any],
@@ -238,7 +60,7 @@ _TEST_USER_ID = "test-user-coerce"
@pytest.mark.asyncio(loop_scope="session")
async def test_coerce_json_string_to_nested_list():
"""JSON string → list[list[str]] (Google Sheets CSV import case)."""
block = _make_coerce_block(
block = _make_block(
"sheets-write",
"Google Sheets Write",
{"values": list[list[str]], "spreadsheet_id": str},
@@ -268,6 +90,7 @@ async def test_coerce_json_string_to_nested_list():
assert isinstance(response, BlockOutputResponse)
assert response.success is True
# Verify the input was coerced from string to list[list[str]]
assert block._captured_inputs["values"] == [
["Name", "Score"],
["Alice", "90"],
@@ -280,7 +103,7 @@ async def test_coerce_json_string_to_nested_list():
@pytest.mark.asyncio(loop_scope="session")
async def test_coerce_json_string_to_list():
"""JSON string → list[str]."""
block = _make_coerce_block(
block = _make_block(
"list-block",
"List Block",
{"items": list[str]},
@@ -312,7 +135,7 @@ async def test_coerce_json_string_to_list():
@pytest.mark.asyncio(loop_scope="session")
async def test_coerce_json_string_to_dict():
"""JSON string → dict[str, str]."""
block = _make_coerce_block(
block = _make_block(
"dict-block",
"Dict Block",
{"config": dict[str, str]},
@@ -344,7 +167,7 @@ async def test_coerce_json_string_to_dict():
@pytest.mark.asyncio(loop_scope="session")
async def test_no_coercion_when_type_matches():
"""Already-correct types pass through without coercion."""
block = _make_coerce_block(
block = _make_block(
"pass-through",
"Pass Through",
{"values": list[list[str]], "name": str},
@@ -378,7 +201,7 @@ async def test_no_coercion_when_type_matches():
@pytest.mark.asyncio(loop_scope="session")
async def test_coerce_string_to_int():
"""String number → int."""
block = _make_coerce_block(
block = _make_block(
"int-block",
"Int Block",
{"count": int},
@@ -411,7 +234,7 @@ async def test_coerce_string_to_int():
@pytest.mark.asyncio(loop_scope="session")
async def test_coerce_skips_none_values():
"""None values are not coerced (they may be optional fields)."""
block = _make_coerce_block(
block = _make_block(
"optional-block",
"Optional Block",
{"data": list[str], "label": str},
@@ -437,13 +260,14 @@ async def test_coerce_skips_none_values():
)
assert isinstance(response, BlockOutputResponse)
# 'data' was not provided, so it should not appear in captured inputs
assert "data" not in block._captured_inputs
@pytest.mark.asyncio(loop_scope="session")
async def test_coerce_union_type_preserves_valid_member():
"""Union-typed fields should not be coerced when the value matches a member."""
block = _make_coerce_block(
block = _make_block(
"union-block",
"Union Block",
{"content": str | list[str]},
@@ -469,6 +293,7 @@ async def test_coerce_union_type_preserves_valid_member():
)
assert isinstance(response, BlockOutputResponse)
# list[str] should NOT be stringified to '["a", "b"]'
assert block._captured_inputs["content"] == ["a", "b"]
assert isinstance(block._captured_inputs["content"], list)
@@ -476,7 +301,7 @@ async def test_coerce_union_type_preserves_valid_member():
@pytest.mark.asyncio(loop_scope="session")
async def test_coerce_inner_elements_of_generic():
"""Inner elements of generic containers are recursively coerced."""
block = _make_coerce_block(
block = _make_block(
"inner-coerce",
"Inner Coerce",
{"values": list[str]},
@@ -494,6 +319,7 @@ async def test_coerce_inner_elements_of_generic():
response = await execute_block(
block=block,
block_id="inner-coerce",
# Inner elements are ints, but target is list[str]
input_data={"values": [1, 2, 3]},
user_id=_TEST_USER_ID,
session_id=_TEST_SESSION_ID,
@@ -502,5 +328,6 @@ async def test_coerce_inner_elements_of_generic():
)
assert isinstance(response, BlockOutputResponse)
# Inner elements should be coerced from int to str
assert block._captured_inputs["values"] == ["1", "2", "3"]
assert all(isinstance(v, str) for v in block._captured_inputs["values"])

View File

@@ -12,6 +12,7 @@ from backend.copilot.constants import (
COPILOT_SESSION_PREFIX,
)
from backend.copilot.model import ChatSession
from backend.copilot.sdk.file_ref import FileRefExpansionError, expand_file_refs_in_args
from backend.data.db_accessors import review_db
from backend.data.execution import ExecutionContext
@@ -197,6 +198,29 @@ class RunBlockTool(BaseTool):
session_id=session_id,
)
# Expand @@agptfile: refs in input_data with the block's input
# schema. The generic _truncating wrapper skips opaque object
# properties (input_data has no declared inner properties in the
# tool schema), so file ref tokens are still intact here.
# Using the block's schema lets us return raw text for string-typed
# fields and parsed structures for list/dict-typed fields.
if input_data:
try:
input_data = await expand_file_refs_in_args(
input_data,
user_id,
session,
input_schema=input_schema,
)
except FileRefExpansionError as exc:
return ErrorResponse(
message=(
f"Failed to resolve file reference: {exc}. "
"Ensure the file exists before referencing it."
),
session_id=session_id,
)
if missing_credentials:
# Return setup requirements response with missing credentials
credentials_fields_info = block.input_schema.get_credentials_fields_info()

View File

@@ -184,10 +184,12 @@ class RunMCPToolTool(BaseTool):
if e.status_code in _AUTH_STATUS_CODES and not creds:
# Server requires auth and user has no stored credentials
return self._build_setup_requirements(server_url, session_id)
logger.warning("MCP HTTP error for %s: %s", server_host(server_url), e)
host = server_host(server_url)
logger.warning("MCP HTTP error for %s: status=%s", host, e.status_code)
return ErrorResponse(
message=f"MCP server returned HTTP {e.status_code}: {e}",
message=(f"MCP request to {host} failed with HTTP {e.status_code}."),
session_id=session_id,
error=f"HTTP {e.status_code}: {str(e)[:300]}",
)
except MCPClientError as e:

View File

@@ -580,6 +580,49 @@ async def test_auth_error_with_existing_creds_returns_error():
assert "403" in response.message
@pytest.mark.asyncio(loop_scope="session")
async def test_http_error_returns_clean_message_with_collapsible_detail():
"""Non-auth HTTP errors return a clean message with raw detail in the `error` field."""
from backend.util.request import HTTPClientError
tool = RunMCPToolTool()
session = make_session(_USER_ID)
with patch(
"backend.copilot.tools.run_mcp_tool.validate_url_host", new_callable=AsyncMock
):
with patch(
"backend.copilot.tools.run_mcp_tool.auto_lookup_mcp_credential",
new_callable=AsyncMock,
return_value=None,
):
mock_client = AsyncMock()
mock_client.initialize = AsyncMock(
side_effect=HTTPClientError(
"<!doctype html><html><body>Not Found</body></html>",
status_code=404,
)
)
with patch(
"backend.copilot.tools.run_mcp_tool.MCPClient",
return_value=mock_client,
):
response = await tool._execute(
user_id=_USER_ID,
session=session,
server_url=_SERVER_URL,
)
assert isinstance(response, ErrorResponse)
assert "404" in response.message
# Raw HTML body must NOT leak into the user-facing message
assert "<!doctype" not in response.message
# Raw detail (including original body) goes in the collapsible `error` field
assert response.error is not None
assert "404" in response.error
assert "<!doctype" in response.error.lower()
@pytest.mark.asyncio(loop_scope="session")
async def test_mcp_client_error_returns_error_response():
"""MCPClientError (protocol-level) maps to a clean ErrorResponse."""

View File

@@ -10,11 +10,11 @@ from pydantic import BaseModel
from backend.copilot.context import (
E2B_WORKDIR,
get_current_sandbox,
get_workspace_manager,
resolve_sandbox_path,
)
from backend.copilot.model import ChatSession
from backend.copilot.tools.sandbox import make_session_path
from backend.data.db_accessors import workspace_db
from backend.util.settings import Config
from backend.util.virus_scanner import scan_content_safe
from backend.util.workspace import WorkspaceManager
@@ -218,12 +218,6 @@ def _is_text_mime(mime_type: str) -> bool:
return any(mime_type.startswith(t) for t in _TEXT_MIME_PREFIXES)
async def get_manager(user_id: str, session_id: str) -> WorkspaceManager:
"""Create a session-scoped WorkspaceManager."""
workspace = await workspace_db().get_or_create_workspace(user_id)
return WorkspaceManager(user_id, workspace.id, session_id)
async def _resolve_file(
manager: WorkspaceManager,
file_id: str | None,
@@ -386,7 +380,7 @@ class ListWorkspaceFilesTool(BaseTool):
include_all_sessions: bool = kwargs.get("include_all_sessions", False)
try:
manager = await get_manager(user_id, session_id)
manager = await get_workspace_manager(user_id, session_id)
files = await manager.list_files(
path=path_prefix, limit=limit, include_all_sessions=include_all_sessions
)
@@ -536,7 +530,7 @@ class ReadWorkspaceFileTool(BaseTool):
)
try:
manager = await get_manager(user_id, session_id)
manager = await get_workspace_manager(user_id, session_id)
resolved = await _resolve_file(manager, file_id, path, session_id)
if isinstance(resolved, ErrorResponse):
return resolved
@@ -772,7 +766,7 @@ class WriteWorkspaceFileTool(BaseTool):
try:
await scan_content_safe(content, filename=filename)
manager = await get_manager(user_id, session_id)
manager = await get_workspace_manager(user_id, session_id)
rec = await manager.write_file(
content=content,
filename=filename,
@@ -899,7 +893,7 @@ class DeleteWorkspaceFileTool(BaseTool):
)
try:
manager = await get_manager(user_id, session_id)
manager = await get_workspace_manager(user_id, session_id)
resolved = await _resolve_file(manager, file_id, path, session_id)
if isinstance(resolved, ErrorResponse):
return resolved

View File

@@ -512,10 +512,6 @@ class DatabaseManagerAsyncClient(AppServiceClient):
list_workspace_files = d.list_workspace_files
soft_delete_workspace_file = d.soft_delete_workspace_file
# ============ Credits ============ #
spend_credits = d.spend_credits
get_credits = d.get_credits
# ============ Understanding ============ #
get_business_understanding = d.get_business_understanding
upsert_business_understanding = d.upsert_business_understanding

View File

@@ -61,7 +61,12 @@ from backend.util.decorator import (
error_logged,
time_measured,
)
from backend.util.exceptions import InsufficientBalanceError, ModerationError
from backend.util.exceptions import (
GraphNotFoundError,
InsufficientBalanceError,
ModerationError,
NotFoundError,
)
from backend.util.file import clean_exec_files
from backend.util.logging import TruncatedLogger, configure_logging
from backend.util.metrics import DiscordChannel
@@ -375,9 +380,16 @@ async def execute_node(
log_metadata.debug("Node produced output", **{output_name: output_data})
yield output_name, output_data
except Exception as ex:
# Capture exception WITH context still set before restoring scope
sentry_sdk.capture_exception(error=ex, scope=scope)
sentry_sdk.flush() # Ensure it's sent before we restore scope
# Only capture unexpected errors to Sentry, not user-caused ones.
# Most ValueError subclasses here are expected (BlockExecutionError,
# InsufficientBalanceError, plain ValueError for auth/disabled blocks, etc.)
# but NotFoundError/GraphNotFoundError could indicate real platform issues.
is_expected = isinstance(ex, ValueError) and not isinstance(
ex, (NotFoundError, GraphNotFoundError)
)
if not is_expected:
sentry_sdk.capture_exception(error=ex, scope=scope)
sentry_sdk.flush()
# Re-raise to maintain normal error flow
raise
finally:
@@ -1478,7 +1490,7 @@ class ExecutionProcessor:
alert_message, DiscordChannel.PRODUCT
)
except Exception as e:
logger.error(f"Failed to send low balance Discord alert: {e}")
logger.warning(f"Failed to send low balance Discord alert: {e}")
class ExecutionManager(AppProcess):
@@ -1900,17 +1912,16 @@ class ExecutionManager(AppProcess):
channel = client.get_channel()
channel.connection.add_callback_threadsafe(lambda: channel.stop_consuming())
try:
thread.join(timeout=300)
except TimeoutError:
logger.error(
thread.join(timeout=300)
if thread.is_alive():
logger.warning(
f"{prefix} ⚠️ Run thread did not finish in time, forcing disconnect"
)
client.disconnect()
logger.info(f"{prefix} ✅ Run client disconnected")
except Exception as e:
logger.error(f"{prefix} ⚠️ Error disconnecting run client: {type(e)} {e}")
logger.warning(f"{prefix} ⚠️ Error disconnecting run client: {type(e)} {e}")
def cleanup(self):
"""Override cleanup to implement graceful shutdown with active execution waiting."""
@@ -1926,7 +1937,9 @@ class ExecutionManager(AppProcess):
)
logger.info(f"{prefix} ✅ Exec consumer has been signaled to stop")
except Exception as e:
logger.error(f"{prefix} ⚠️ Error signaling consumer to stop: {type(e)} {e}")
logger.warning(
f"{prefix} ⚠️ Error signaling consumer to stop: {type(e)} {e}"
)
# Wait for active executions to complete
if self.active_graph_runs:
@@ -1957,7 +1970,7 @@ class ExecutionManager(AppProcess):
waited += wait_interval
if self.active_graph_runs:
logger.error(
logger.warning(
f"{prefix} ⚠️ {len(self.active_graph_runs)} executions still running after {max_wait}s"
)
else:
@@ -1968,7 +1981,7 @@ class ExecutionManager(AppProcess):
self.executor.shutdown(cancel_futures=True, wait=False)
logger.info(f"{prefix} ✅ Executor shutdown completed")
except Exception as e:
logger.error(f"{prefix} ⚠️ Error during executor shutdown: {type(e)} {e}")
logger.warning(f"{prefix} ⚠️ Error during executor shutdown: {type(e)} {e}")
# Release remaining execution locks
try:

View File

@@ -94,7 +94,7 @@ SCHEDULER_OPERATION_TIMEOUT_SECONDS = 300 # 5 minutes for scheduler operations
def job_listener(event):
"""Logs job execution outcomes for better monitoring."""
if event.exception:
logger.error(
logger.warning(
f"Job {event.job_id} failed: {type(event.exception).__name__}: {event.exception}"
)
else:
@@ -137,7 +137,7 @@ def run_async(coro, timeout: float = SCHEDULER_OPERATION_TIMEOUT_SECONDS):
try:
return future.result(timeout=timeout)
except Exception as e:
logger.error(f"Async operation failed: {type(e).__name__}: {e}")
logger.warning(f"Async operation failed: {type(e).__name__}: {e}")
raise
@@ -186,7 +186,7 @@ async def _execute_graph(**kwargs):
async def _handle_graph_validation_error(args: "GraphExecutionJobArgs") -> None:
logger.error(
logger.warning(
f"Scheduled Graph {args.graph_id} failed validation. Unscheduling graph"
)
if args.schedule_id:
@@ -196,8 +196,9 @@ async def _handle_graph_validation_error(args: "GraphExecutionJobArgs") -> None:
user_id=args.user_id,
)
else:
logger.error(
f"Unable to unschedule graph: {args.graph_id} as this is an old job with no associated schedule_id please remove manually"
logger.warning(
f"Unable to unschedule graph: {args.graph_id} as this is an old job "
f"with no associated schedule_id please remove manually"
)

View File

@@ -303,9 +303,9 @@ class NotificationManager(AppService):
)
if not oldest_message:
# this should never happen
logger.error(
f"Batch for user {batch.user_id} and type {notification_type} has no oldest message whichshould never happen!!!!!!!!!!!!!!!!"
logger.warning(
f"Batch for user {batch.user_id} and type {notification_type} "
f"has no oldest message — batch may have been cleared concurrently"
)
continue
@@ -318,7 +318,7 @@ class NotificationManager(AppService):
).get_user_email_by_id(batch.user_id)
if not recipient_email:
logger.error(
logger.warning(
f"User email not found for user {batch.user_id}"
)
continue
@@ -344,7 +344,7 @@ class NotificationManager(AppService):
).get_user_notification_batch(batch.user_id, notification_type)
if not batch_data or not batch_data.notifications:
logger.error(
logger.warning(
f"Batch data not found for user {batch.user_id}"
)
# Clear the batch
@@ -372,7 +372,7 @@ class NotificationManager(AppService):
)
)
except Exception as e:
logger.error(
logger.warning(
f"Error parsing notification event: {e=}, {db_event=}"
)
continue
@@ -415,7 +415,10 @@ class NotificationManager(AppService):
async def discord_system_alert(
self, content: str, channel: DiscordChannel = DiscordChannel.PLATFORM
):
await discord_send_alert(content, channel)
try:
await discord_send_alert(content, channel)
except Exception as e:
logger.warning(f"Failed to send Discord system alert: {e}")
async def _queue_scheduled_notification(self, event: SummaryParamsEventModel):
"""Queue a scheduled notification - exposed method for other services to call"""
@@ -516,7 +519,7 @@ class NotificationManager(AppService):
raise ValueError("Invalid event type or params")
except Exception as e:
logger.error(f"Failed to gather summary data: {e}")
logger.warning(f"Failed to gather summary data: {e}")
# Return sensible defaults in case of error
if event_type == NotificationType.DAILY_SUMMARY and isinstance(
params, DailySummaryParams
@@ -562,8 +565,9 @@ class NotificationManager(AppService):
should_retry=False
).get_user_notification_oldest_message_in_batch(user_id, event_type)
if not oldest_message:
logger.error(
f"Batch for user {user_id} and type {event_type} has no oldest message whichshould never happen!!!!!!!!!!!!!!!!"
logger.warning(
f"Batch for user {user_id} and type {event_type} "
f"has no oldest message — batch may have been cleared concurrently"
)
return False
oldest_age = oldest_message.created_at
@@ -585,7 +589,7 @@ class NotificationManager(AppService):
get_notif_data_type(event.type)
].model_validate_json(message)
except Exception as e:
logger.error(f"Error parsing message due to non matching schema {e}")
logger.warning(f"Error parsing message due to non matching schema {e}")
return None
async def _process_admin_message(self, message: str) -> bool:
@@ -614,7 +618,7 @@ class NotificationManager(AppService):
should_retry=False
).get_user_email_by_id(event.user_id)
if not recipient_email:
logger.error(f"User email not found for user {event.user_id}")
logger.warning(f"User email not found for user {event.user_id}")
return False
should_send = await self._should_email_user_based_on_preference(
@@ -651,7 +655,7 @@ class NotificationManager(AppService):
should_retry=False
).get_user_email_by_id(event.user_id)
if not recipient_email:
logger.error(f"User email not found for user {event.user_id}")
logger.warning(f"User email not found for user {event.user_id}")
return False
should_send = await self._should_email_user_based_on_preference(
@@ -672,7 +676,7 @@ class NotificationManager(AppService):
should_retry=False
).get_user_notification_batch(event.user_id, event.type)
if not batch or not batch.notifications:
logger.error(f"Batch not found for user {event.user_id}")
logger.warning(f"Batch not found for user {event.user_id}")
return False
unsub_link = generate_unsubscribe_link(event.user_id)
@@ -745,7 +749,7 @@ class NotificationManager(AppService):
f"Removed {len(chunk_ids)} sent notifications from batch"
)
except Exception as e:
logger.error(
logger.warning(
f"Failed to remove sent notifications: {e}"
)
# Continue anyway - better to risk duplicates than lose emails
@@ -770,7 +774,7 @@ class NotificationManager(AppService):
else:
# Message is too large even after size reduction
if attempt_size == 1:
logger.error(
logger.warning(
f"Failed to send notification at index {i}: "
f"Single notification exceeds email size limit "
f"({len(test_message):,} chars > {MAX_EMAIL_SIZE:,} chars). "
@@ -789,7 +793,7 @@ class NotificationManager(AppService):
f"Removed oversized notification {chunk_ids[0]} from batch permanently"
)
except Exception as e:
logger.error(
logger.warning(
f"Failed to remove oversized notification: {e}"
)
@@ -823,7 +827,7 @@ class NotificationManager(AppService):
f"Set email verification to false for user {event.user_id}"
)
except Exception as deactivation_error:
logger.error(
logger.warning(
f"Failed to deactivate email for user {event.user_id}: "
f"{deactivation_error}"
)
@@ -835,7 +839,7 @@ class NotificationManager(AppService):
f"Disabled all notification preferences for user {event.user_id}"
)
except Exception as disable_error:
logger.error(
logger.warning(
f"Failed to disable notification preferences: {disable_error}"
)
@@ -848,7 +852,7 @@ class NotificationManager(AppService):
f"Cleared ALL notification batches for user {event.user_id}"
)
except Exception as remove_error:
logger.error(
logger.warning(
f"Failed to clear batches for inactive recipient: {remove_error}"
)
@@ -859,7 +863,7 @@ class NotificationManager(AppService):
"422" in error_message
or "unprocessable" in error_message
):
logger.error(
logger.warning(
f"Failed to send notification at index {i}: "
f"Malformed notification data rejected by Postmark. "
f"Error: {e}. Removing from batch permanently."
@@ -877,7 +881,7 @@ class NotificationManager(AppService):
"Removed malformed notification from batch permanently"
)
except Exception as remove_error:
logger.error(
logger.warning(
f"Failed to remove malformed notification: {remove_error}"
)
# Check if it's a ValueError for size limit
@@ -885,14 +889,14 @@ class NotificationManager(AppService):
isinstance(e, ValueError)
and "too large" in error_message
):
logger.error(
logger.warning(
f"Failed to send notification at index {i}: "
f"Notification size exceeds email limit. "
f"Error: {e}. Skipping this notification."
)
# Other API errors
else:
logger.error(
logger.warning(
f"Failed to send notification at index {i}: "
f"Email API error ({error_type}): {e}. "
f"Skipping this notification."
@@ -907,7 +911,9 @@ class NotificationManager(AppService):
if not chunk_sent:
# Should not reach here due to single notification handling
logger.error(f"Failed to send notifications starting at index {i}")
logger.warning(
f"Failed to send notifications starting at index {i}"
)
failed_indices.append(i)
i += 1
@@ -946,7 +952,7 @@ class NotificationManager(AppService):
should_retry=False
).get_user_email_by_id(event.user_id)
if not recipient_email:
logger.error(f"User email not found for user {event.user_id}")
logger.warning(f"User email not found for user {event.user_id}")
return False
should_send = await self._should_email_user_based_on_preference(
event.user_id, event.type
@@ -1007,7 +1013,10 @@ class NotificationManager(AppService):
# Let message.process() handle the rejection
pass
except Exception as e:
logger.error(f"Error processing message in {queue_name}: {e}")
logger.warning(
f"Error processing message in {queue_name}: {e}",
exc_info=True,
)
# Let message.process() handle the rejection
raise
except asyncio.CancelledError:

View File

@@ -256,9 +256,9 @@ class TestNotificationErrorHandling:
assert 2 not in successful_indices # Index 2 failed
# Verify 422 error was logged
error_calls = [call[0][0] for call in mock_logger.error.call_args_list]
warning_calls = [call[0][0] for call in mock_logger.warning.call_args_list]
assert any(
"422" in call or "malformed" in call.lower() for call in error_calls
"422" in call or "malformed" in call.lower() for call in warning_calls
)
# Verify all notifications were removed (4 successful + 1 malformed)
@@ -371,10 +371,10 @@ class TestNotificationErrorHandling:
assert 3 not in successful_indices # Index 3 was not sent
# Verify oversized error was logged
error_calls = [call[0][0] for call in mock_logger.error.call_args_list]
warning_calls = [call[0][0] for call in mock_logger.warning.call_args_list]
assert any(
"exceeds email size limit" in call or "oversized" in call.lower()
for call in error_calls
for call in warning_calls
)
@pytest.mark.asyncio
@@ -478,10 +478,10 @@ class TestNotificationErrorHandling:
assert 1 in failed_indices # Index 1 failed
# Verify generic error was logged
error_calls = [call[0][0] for call in mock_logger.error.call_args_list]
warning_calls = [call[0][0] for call in mock_logger.warning.call_args_list]
assert any(
"api error" in call.lower() or "skipping" in call.lower()
for call in error_calls
for call in warning_calls
)
# Only successful ones should be removed from batch (failed one stays for retry)

View File

@@ -613,5 +613,5 @@ async def cleanup_expired_files_async() -> int:
)
return deleted_count
except Exception as e:
logger.error(f"[CloudStorage] Error during cloud storage cleanup: {e}")
logger.warning(f"[CloudStorage] Error during cloud storage cleanup: {e}")
return 0

View File

@@ -275,13 +275,12 @@ async def store_media_file(
# Process file
elif file.startswith("data:"):
# Data URI
match = re.match(r"^data:([^;]+);base64,(.*)$", file, re.DOTALL)
if not match:
parsed_uri = parse_data_uri(file)
if parsed_uri is None:
raise ValueError(
"Invalid data URI format. Expected data:<mime>;base64,<data>"
)
mime_type = match.group(1).strip().lower()
b64_content = match.group(2).strip()
mime_type, b64_content = parsed_uri
# Generate filename and decode
extension = _extension_from_mime(mime_type)
@@ -415,13 +414,70 @@ def get_dir_size(path: Path) -> int:
return total
async def resolve_media_content(
content: MediaFileType,
execution_context: "ExecutionContext",
*,
return_format: MediaReturnFormat,
) -> MediaFileType:
"""Resolve a ``MediaFileType`` value if it is a media reference, pass through otherwise.
Convenience wrapper around :func:`is_media_file_ref` + :func:`store_media_file`.
Plain text content (source code, filenames) is returned unchanged. Media
references (``data:``, ``workspace://``, ``http(s)://``) are resolved via
:func:`store_media_file` using *return_format*.
Use this when a block field is typed as ``MediaFileType`` but may contain
either literal text or a media reference.
"""
if not content or not is_media_file_ref(content):
return content
return await store_media_file(
content, execution_context, return_format=return_format
)
def is_media_file_ref(value: str) -> bool:
"""Return True if *value* looks like a ``MediaFileType`` reference.
Detects data URIs, workspace:// references, and HTTP(S) URLs — the
formats accepted by :func:`store_media_file`. Plain text content
(e.g. source code, filenames) returns False.
Known limitation: HTTP(S) URL detection is heuristic. Any string that
starts with ``http://`` or ``https://`` is treated as a media URL, even
if it appears as a URL inside source-code comments or documentation.
Blocks that produce source code or Markdown as output may therefore
trigger false positives. Callers that need higher precision should
inspect the string further (e.g. verify the URL is reachable or has a
media-friendly extension).
Note: this does *not* match local file paths, which are ambiguous
(could be filenames or actual paths). Blocks that need to resolve
local paths should check for them separately.
"""
return value.startswith(("data:", "workspace://", "http://", "https://"))
def parse_data_uri(value: str) -> tuple[str, str] | None:
"""Parse a ``data:<mime>;base64,<payload>`` URI.
Returns ``(mime_type, base64_payload)`` if *value* is a valid data URI,
or ``None`` if it is not.
"""
match = re.match(r"^data:([^;]+);base64,(.*)$", value, re.DOTALL)
if not match:
return None
return match.group(1).strip().lower(), match.group(2).strip()
def get_mime_type(file: str) -> str:
"""
Get the MIME type of a file, whether it's a data URI, URL, or local path.
"""
if file.startswith("data:"):
match = re.match(r"^data:([^;]+);base64,", file)
return match.group(1) if match else "application/octet-stream"
parsed_uri = parse_data_uri(file)
return parsed_uri[0] if parsed_uri else "application/octet-stream"
elif file.startswith(("http://", "https://")):
parsed_url = urlparse(file)

View File

@@ -0,0 +1,375 @@
"""Parse file content into structured Python objects based on file format.
Used by the ``@@agptfile:`` expansion system to eagerly parse well-known file
formats into native Python types *before* schema-driven coercion runs. This
lets blocks with ``Any``-typed inputs receive structured data rather than raw
strings, while blocks expecting strings get the value coerced back via
``convert()``.
Supported formats:
- **JSON** (``.json``) — arrays and objects are promoted; scalars stay as strings
- **JSON Lines** (``.jsonl``, ``.ndjson``) — each non-empty line parsed as JSON;
when all lines are dicts with the same keys (tabular data), output is
``list[list[Any]]`` with a header row, consistent with CSV/Parquet/Excel;
otherwise returns a plain ``list`` of parsed values
- **CSV** (``.csv``) — ``csv.reader`` → ``list[list[str]]``
- **TSV** (``.tsv``) — tab-delimited → ``list[list[str]]``
- **YAML** (``.yaml``, ``.yml``) — parsed via PyYAML; containers only
- **TOML** (``.toml``) — parsed via stdlib ``tomllib``
- **Parquet** (``.parquet``) — via pandas/pyarrow → ``list[list[Any]]`` with header row
- **Excel** (``.xlsx``) — via pandas/openpyxl → ``list[list[Any]]`` with header row
(legacy ``.xls`` is **not** supported — only the modern OOXML format)
The **fallback contract** is enforced by :func:`parse_file_content`, not by
individual parser functions. If any parser raises, ``parse_file_content``
catches the exception and returns the original content unchanged (string for
text formats, bytes for binary formats). Callers should never see an
exception from the public API when ``strict=False``.
"""
import csv
import io
import json
import logging
import tomllib
import zipfile
from collections.abc import Callable
# posixpath.splitext handles forward-slash URI paths correctly on all platforms,
# unlike os.path.splitext which uses platform-native separators.
from posixpath import splitext
from typing import Any
import yaml
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Extension / MIME → format label mapping
# ---------------------------------------------------------------------------
_EXT_TO_FORMAT: dict[str, str] = {
".json": "json",
".jsonl": "jsonl",
".ndjson": "jsonl",
".csv": "csv",
".tsv": "tsv",
".yaml": "yaml",
".yml": "yaml",
".toml": "toml",
".parquet": "parquet",
".xlsx": "xlsx",
}
MIME_TO_FORMAT: dict[str, str] = {
"application/json": "json",
"application/x-ndjson": "jsonl",
"application/jsonl": "jsonl",
"text/csv": "csv",
"text/tab-separated-values": "tsv",
"application/x-yaml": "yaml",
"application/yaml": "yaml",
"text/yaml": "yaml",
"application/toml": "toml",
"application/vnd.apache.parquet": "parquet",
"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": "xlsx",
}
# Formats that require raw bytes rather than decoded text.
BINARY_FORMATS: frozenset[str] = frozenset({"parquet", "xlsx"})
# ---------------------------------------------------------------------------
# Public API (top-down: main functions first, helpers below)
# ---------------------------------------------------------------------------
def infer_format_from_uri(uri: str) -> str | None:
"""Return a format label based on URI extension or MIME fragment.
Returns ``None`` when the format cannot be determined — the caller should
fall back to returning the content as a plain string.
"""
# 1. Check MIME fragment (workspace://abc123#application/json)
if "#" in uri:
_, fragment = uri.rsplit("#", 1)
fmt = MIME_TO_FORMAT.get(fragment.lower())
if fmt:
return fmt
# 2. Check file extension from the path portion.
# Strip the fragment first so ".json#mime" doesn't confuse splitext.
path = uri.split("#")[0].split("?")[0]
_, ext = splitext(path)
fmt = _EXT_TO_FORMAT.get(ext.lower())
if fmt is not None:
return fmt
# Legacy .xls is not supported — map it so callers can produce a
# user-friendly error instead of returning garbled binary.
if ext.lower() == ".xls":
return "xls"
return None
def parse_file_content(content: str | bytes, fmt: str, *, strict: bool = False) -> Any:
"""Parse *content* according to *fmt* and return a native Python value.
When *strict* is ``False`` (default), returns the original *content*
unchanged if *fmt* is not recognised or parsing fails for any reason.
This mode **never raises**.
When *strict* is ``True``, parsing errors are propagated to the caller.
Unrecognised formats or type mismatches (e.g. text for a binary format)
still return *content* unchanged without raising.
"""
if fmt == "xls":
return (
"[Unsupported format] Legacy .xls files are not supported. "
"Please re-save the file as .xlsx (Excel 2007+) and upload again."
)
try:
if fmt in BINARY_FORMATS:
parser = _BINARY_PARSERS.get(fmt)
if parser is None:
return content
if isinstance(content, str):
# Caller gave us text for a binary format — can't parse.
return content
return parser(content)
parser = _TEXT_PARSERS.get(fmt)
if parser is None:
return content
if isinstance(content, bytes):
content = content.decode("utf-8", errors="replace")
return parser(content)
except PARSE_EXCEPTIONS:
if strict:
raise
logger.debug("Structured parsing failed for format=%s, falling back", fmt)
return content
# ---------------------------------------------------------------------------
# Exception loading helpers
# ---------------------------------------------------------------------------
def _load_openpyxl_exception() -> type[Exception]:
"""Return openpyxl's InvalidFileException, raising ImportError if absent."""
from openpyxl.utils.exceptions import InvalidFileException # noqa: PLC0415
return InvalidFileException
def _load_arrow_exception() -> type[Exception]:
"""Return pyarrow's ArrowException, raising ImportError if absent."""
from pyarrow import ArrowException # noqa: PLC0415
return ArrowException
def _optional_exc(loader: "Callable[[], type[Exception]]") -> "type[Exception] | None":
"""Return the exception class from *loader*, or ``None`` if the dep is absent."""
try:
return loader()
except ImportError:
return None
# Exception types that can be raised during file content parsing.
# Shared between ``parse_file_content`` (which catches them in non-strict mode)
# and ``file_ref._expand_bare_ref`` (which re-raises them as FileRefExpansionError).
#
# Optional-dependency exception types are loaded via a helper that raises
# ``ImportError`` at *parse time* rather than silently becoming ``None`` here.
# This ensures mypy sees clean types and missing deps surface as real errors.
PARSE_EXCEPTIONS: tuple[type[BaseException], ...] = tuple(
exc
for exc in (
json.JSONDecodeError,
csv.Error,
yaml.YAMLError,
tomllib.TOMLDecodeError,
ValueError,
UnicodeDecodeError,
ImportError,
OSError,
KeyError,
TypeError,
zipfile.BadZipFile,
_optional_exc(_load_openpyxl_exception),
# ArrowException covers ArrowIOError and ArrowCapacityError which
# do not inherit from standard exceptions; ArrowInvalid/ArrowTypeError
# already map to ValueError/TypeError but this catches the rest.
_optional_exc(_load_arrow_exception),
)
if exc is not None
)
# ---------------------------------------------------------------------------
# Text-based parsers (content: str → Any)
# ---------------------------------------------------------------------------
def _parse_container(parser: Callable[[str], Any], content: str) -> list | dict | str:
"""Parse *content* and return the result only if it is a container (list/dict).
Scalar values (strings, numbers, booleans, None) are discarded and the
original *content* string is returned instead. This prevents e.g. a JSON
file containing just ``"42"`` from silently becoming an int.
"""
parsed = parser(content)
if isinstance(parsed, (list, dict)):
return parsed
return content
def _parse_json(content: str) -> list | dict | str:
return _parse_container(json.loads, content)
def _parse_jsonl(content: str) -> Any:
lines = [json.loads(line) for line in content.splitlines() if line.strip()]
if not lines:
return content
# When every line is a dict with the same keys, convert to table format
# (header row + data rows) — consistent with CSV/TSV/Parquet/Excel output.
# Require ≥2 dicts so a single-line JSONL stays as [dict] (not a table).
if len(lines) >= 2 and all(isinstance(obj, dict) for obj in lines):
keys = list(lines[0].keys())
# Cache as tuple to avoid O(n×k) list allocations in the all() call.
keys_tuple = tuple(keys)
if keys and all(tuple(obj.keys()) == keys_tuple for obj in lines[1:]):
return [keys] + [[obj[k] for k in keys] for obj in lines]
return lines
def _parse_csv(content: str) -> Any:
return _parse_delimited(content, delimiter=",")
def _parse_tsv(content: str) -> Any:
return _parse_delimited(content, delimiter="\t")
def _parse_delimited(content: str, *, delimiter: str) -> Any:
reader = csv.reader(io.StringIO(content), delimiter=delimiter)
# csv.reader never yields [] — blank lines yield [""]. Filter out
# rows where every cell is empty (i.e. truly blank lines).
rows = [row for row in reader if _row_has_content(row)]
if not rows:
return content
# If the declared delimiter produces only single-column rows, try
# sniffing the actual delimiter — catches misidentified files (e.g.
# a tab-delimited file with a .csv extension).
if len(rows[0]) == 1:
try:
dialect = csv.Sniffer().sniff(content[:8192])
if dialect.delimiter != delimiter:
reader = csv.reader(io.StringIO(content), dialect)
rows = [row for row in reader if _row_has_content(row)]
except csv.Error:
pass
if rows and len(rows[0]) >= 2:
return rows
return content
def _row_has_content(row: list[str]) -> bool:
"""Return True when *row* contains at least one non-empty cell.
``csv.reader`` never yields ``[]`` — truly blank lines yield ``[""]``.
This predicate filters those out consistently across the initial read
and the sniffer-fallback re-read.
"""
return any(cell for cell in row)
def _parse_yaml(content: str) -> list | dict | str:
# NOTE: YAML anchor/alias expansion can amplify input beyond the 10MB cap.
# safe_load prevents code execution; for production hardening consider
# a YAML parser with expansion limits (e.g. ruamel.yaml with max_alias_count).
if "\n---" in content or content.startswith("---\n"):
# Multi-document YAML: only the first document is parsed; the rest
# are silently ignored by yaml.safe_load. Warn so callers are aware.
logger.warning(
"Multi-document YAML detected (--- separator); "
"only the first document will be parsed."
)
return _parse_container(yaml.safe_load, content)
def _parse_toml(content: str) -> Any:
parsed = tomllib.loads(content)
# tomllib.loads always returns a dict — return it even if empty.
return parsed
_TEXT_PARSERS: dict[str, Callable[[str], Any]] = {
"json": _parse_json,
"jsonl": _parse_jsonl,
"csv": _parse_csv,
"tsv": _parse_tsv,
"yaml": _parse_yaml,
"toml": _parse_toml,
}
# ---------------------------------------------------------------------------
# Binary-based parsers (content: bytes → Any)
# ---------------------------------------------------------------------------
def _parse_parquet(content: bytes) -> list[list[Any]]:
import pandas as pd
df = pd.read_parquet(io.BytesIO(content))
return _df_to_rows(df)
def _parse_xlsx(content: bytes) -> list[list[Any]]:
import pandas as pd
# Explicitly specify openpyxl engine; the default engine varies by pandas
# version and does not support legacy .xls (which is excluded by our format map).
df = pd.read_excel(io.BytesIO(content), engine="openpyxl")
return _df_to_rows(df)
def _df_to_rows(df: Any) -> list[list[Any]]:
"""Convert a DataFrame to ``list[list[Any]]`` with a header row.
NaN values are replaced with ``None`` so the result is JSON-serializable.
Uses explicit cell-level checking because ``df.where(df.notna(), None)``
silently converts ``None`` back to ``NaN`` in float64 columns.
"""
header = df.columns.tolist()
rows = [
[None if _is_nan(cell) else cell for cell in row] for row in df.values.tolist()
]
return [header] + rows
def _is_nan(cell: Any) -> bool:
"""Check if a cell value is NaN, handling non-scalar types (lists, dicts).
``pd.isna()`` on a list/dict returns a boolean array which raises
``ValueError`` in a boolean context. Guard with a scalar check first.
"""
import pandas as pd
return bool(pd.api.types.is_scalar(cell) and pd.isna(cell))
_BINARY_PARSERS: dict[str, Callable[[bytes], Any]] = {
"parquet": _parse_parquet,
"xlsx": _parse_xlsx,
}

View File

@@ -0,0 +1,624 @@
"""Tests for file_content_parser — format inference and structured parsing."""
import io
import json
import pytest
from backend.util.file_content_parser import (
BINARY_FORMATS,
infer_format_from_uri,
parse_file_content,
)
# ---------------------------------------------------------------------------
# infer_format_from_uri
# ---------------------------------------------------------------------------
class TestInferFormat:
# --- extension-based ---
def test_json_extension(self):
assert infer_format_from_uri("/home/user/data.json") == "json"
def test_jsonl_extension(self):
assert infer_format_from_uri("/tmp/events.jsonl") == "jsonl"
def test_ndjson_extension(self):
assert infer_format_from_uri("/tmp/events.ndjson") == "jsonl"
def test_csv_extension(self):
assert infer_format_from_uri("workspace:///reports/sales.csv") == "csv"
def test_tsv_extension(self):
assert infer_format_from_uri("/home/user/data.tsv") == "tsv"
def test_yaml_extension(self):
assert infer_format_from_uri("/home/user/config.yaml") == "yaml"
def test_yml_extension(self):
assert infer_format_from_uri("/home/user/config.yml") == "yaml"
def test_toml_extension(self):
assert infer_format_from_uri("/home/user/config.toml") == "toml"
def test_parquet_extension(self):
assert infer_format_from_uri("/data/table.parquet") == "parquet"
def test_xlsx_extension(self):
assert infer_format_from_uri("/data/spreadsheet.xlsx") == "xlsx"
def test_xls_extension_returns_xls_label(self):
# Legacy .xls is mapped so callers can produce a helpful error.
assert infer_format_from_uri("/data/old_spreadsheet.xls") == "xls"
def test_case_insensitive(self):
assert infer_format_from_uri("/data/FILE.JSON") == "json"
assert infer_format_from_uri("/data/FILE.CSV") == "csv"
def test_unicode_filename(self):
assert infer_format_from_uri("/home/user/\u30c7\u30fc\u30bf.json") == "json"
assert infer_format_from_uri("/home/user/\u00e9t\u00e9.csv") == "csv"
def test_unknown_extension(self):
assert infer_format_from_uri("/home/user/readme.txt") is None
def test_no_extension(self):
assert infer_format_from_uri("workspace://abc123") is None
# --- MIME-based ---
def test_mime_json(self):
assert infer_format_from_uri("workspace://abc123#application/json") == "json"
def test_mime_csv(self):
assert infer_format_from_uri("workspace://abc123#text/csv") == "csv"
def test_mime_tsv(self):
assert (
infer_format_from_uri("workspace://abc123#text/tab-separated-values")
== "tsv"
)
def test_mime_ndjson(self):
assert (
infer_format_from_uri("workspace://abc123#application/x-ndjson") == "jsonl"
)
def test_mime_yaml(self):
assert infer_format_from_uri("workspace://abc123#application/x-yaml") == "yaml"
def test_mime_xlsx(self):
uri = "workspace://abc123#application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
assert infer_format_from_uri(uri) == "xlsx"
def test_mime_parquet(self):
assert (
infer_format_from_uri("workspace://abc123#application/vnd.apache.parquet")
== "parquet"
)
def test_unknown_mime(self):
assert infer_format_from_uri("workspace://abc123#text/plain") is None
def test_unknown_mime_falls_through_to_extension(self):
# Unknown MIME (text/plain) should fall through to extension-based detection.
assert infer_format_from_uri("workspace:///data.csv#text/plain") == "csv"
# --- MIME takes precedence over extension ---
def test_mime_overrides_extension(self):
# .txt extension but JSON MIME → json
assert infer_format_from_uri("workspace:///file.txt#application/json") == "json"
# ---------------------------------------------------------------------------
# parse_file_content — JSON
# ---------------------------------------------------------------------------
class TestParseJson:
def test_array(self):
result = parse_file_content("[1, 2, 3]", "json")
assert result == [1, 2, 3]
def test_object(self):
result = parse_file_content('{"key": "value"}', "json")
assert result == {"key": "value"}
def test_nested(self):
content = json.dumps({"rows": [[1, 2], [3, 4]]})
result = parse_file_content(content, "json")
assert result == {"rows": [[1, 2], [3, 4]]}
def test_scalar_string_stays_as_string(self):
result = parse_file_content('"hello"', "json")
assert result == '"hello"' # original content, not parsed
def test_scalar_number_stays_as_string(self):
result = parse_file_content("42", "json")
assert result == "42"
def test_scalar_boolean_stays_as_string(self):
result = parse_file_content("true", "json")
assert result == "true"
def test_null_stays_as_string(self):
result = parse_file_content("null", "json")
assert result == "null"
def test_invalid_json_fallback(self):
content = "not json at all"
result = parse_file_content(content, "json")
assert result == content
def test_empty_string_fallback(self):
result = parse_file_content("", "json")
assert result == ""
def test_bytes_input_decoded(self):
result = parse_file_content(b"[1, 2, 3]", "json")
assert result == [1, 2, 3]
# ---------------------------------------------------------------------------
# parse_file_content — JSONL
# ---------------------------------------------------------------------------
class TestParseJsonl:
def test_tabular_uniform_dicts_to_table_format(self):
"""JSONL with uniform dict keys → table format (header + rows),
consistent with CSV/TSV/Parquet/Excel output."""
content = '{"name":"apple","color":"red"}\n{"name":"banana","color":"yellow"}\n{"name":"cherry","color":"red"}'
result = parse_file_content(content, "jsonl")
assert result == [
["name", "color"],
["apple", "red"],
["banana", "yellow"],
["cherry", "red"],
]
def test_tabular_single_key_dicts(self):
"""JSONL with single-key uniform dicts → table format."""
content = '{"a": 1}\n{"a": 2}\n{"a": 3}'
result = parse_file_content(content, "jsonl")
assert result == [["a"], [1], [2], [3]]
def test_tabular_blank_lines_skipped(self):
content = '{"a": 1}\n\n{"a": 2}\n'
result = parse_file_content(content, "jsonl")
assert result == [["a"], [1], [2]]
def test_heterogeneous_dicts_stay_as_list(self):
"""JSONL with different keys across objects → list of dicts (no table)."""
content = '{"name":"apple"}\n{"color":"red"}\n{"size":3}'
result = parse_file_content(content, "jsonl")
assert result == [{"name": "apple"}, {"color": "red"}, {"size": 3}]
def test_partially_overlapping_keys_stay_as_list(self):
"""JSONL dicts with partially overlapping keys → list of dicts."""
content = '{"name":"apple","color":"red"}\n{"name":"banana","size":"medium"}'
result = parse_file_content(content, "jsonl")
assert result == [
{"name": "apple", "color": "red"},
{"name": "banana", "size": "medium"},
]
def test_mixed_types_stay_as_list(self):
"""JSONL with non-dict lines → list of parsed values (no table)."""
content = '1\n"hello"\n[1,2]\n'
result = parse_file_content(content, "jsonl")
assert result == [1, "hello", [1, 2]]
def test_mixed_dicts_and_non_dicts_stay_as_list(self):
"""JSONL mixing dicts and non-dicts → list of parsed values."""
content = '{"a": 1}\n42\n{"b": 2}'
result = parse_file_content(content, "jsonl")
assert result == [{"a": 1}, 42, {"b": 2}]
def test_tabular_preserves_key_order(self):
"""Table header should follow the key order of the first object."""
content = '{"z": 1, "a": 2}\n{"z": 3, "a": 4}'
result = parse_file_content(content, "jsonl")
assert result[0] == ["z", "a"] # order from first object
assert result[1] == [1, 2]
assert result[2] == [3, 4]
def test_single_dict_stays_as_list(self):
"""Single-line JSONL with one dict → [dict], NOT a table.
Tabular detection requires ≥2 dicts to avoid vacuously true all()."""
content = '{"a": 1, "b": 2}'
result = parse_file_content(content, "jsonl")
assert result == [{"a": 1, "b": 2}]
def test_tabular_with_none_values(self):
"""Uniform keys but some null values → table with None cells."""
content = '{"name":"apple","color":"red"}\n{"name":"banana","color":null}'
result = parse_file_content(content, "jsonl")
assert result == [
["name", "color"],
["apple", "red"],
["banana", None],
]
def test_empty_file_fallback(self):
result = parse_file_content("", "jsonl")
assert result == ""
def test_all_blank_lines_fallback(self):
result = parse_file_content("\n\n\n", "jsonl")
assert result == "\n\n\n"
def test_invalid_line_fallback(self):
content = '{"a": 1}\nnot json\n'
result = parse_file_content(content, "jsonl")
assert result == content # fallback
# ---------------------------------------------------------------------------
# parse_file_content — CSV
# ---------------------------------------------------------------------------
class TestParseCsv:
def test_basic(self):
content = "Name,Score\nAlice,90\nBob,85"
result = parse_file_content(content, "csv")
assert result == [["Name", "Score"], ["Alice", "90"], ["Bob", "85"]]
def test_quoted_fields(self):
content = 'Name,Bio\nAlice,"Loves, commas"\nBob,Simple'
result = parse_file_content(content, "csv")
assert result[1] == ["Alice", "Loves, commas"]
def test_single_column_fallback(self):
# Only 1 column — not tabular enough.
content = "Name\nAlice\nBob"
result = parse_file_content(content, "csv")
assert result == content
def test_empty_rows_skipped(self):
content = "A,B\n\n1,2\n\n3,4"
result = parse_file_content(content, "csv")
assert result == [["A", "B"], ["1", "2"], ["3", "4"]]
def test_empty_file_fallback(self):
result = parse_file_content("", "csv")
assert result == ""
def test_utf8_bom(self):
"""CSV with a UTF-8 BOM should parse correctly (BOM stripped by decode)."""
bom = "\ufeff"
content = bom + "Name,Score\nAlice,90\nBob,85"
result = parse_file_content(content, "csv")
# The BOM may be part of the first header cell; ensure rows are still parsed.
assert len(result) == 3
assert result[1] == ["Alice", "90"]
assert result[2] == ["Bob", "85"]
# ---------------------------------------------------------------------------
# parse_file_content — TSV
# ---------------------------------------------------------------------------
class TestParseTsv:
def test_basic(self):
content = "Name\tScore\nAlice\t90\nBob\t85"
result = parse_file_content(content, "tsv")
assert result == [["Name", "Score"], ["Alice", "90"], ["Bob", "85"]]
def test_single_column_fallback(self):
content = "Name\nAlice\nBob"
result = parse_file_content(content, "tsv")
assert result == content
# ---------------------------------------------------------------------------
# parse_file_content — YAML
# ---------------------------------------------------------------------------
class TestParseYaml:
def test_list(self):
content = "- apple\n- banana\n- cherry"
result = parse_file_content(content, "yaml")
assert result == ["apple", "banana", "cherry"]
def test_dict(self):
content = "name: Alice\nage: 30"
result = parse_file_content(content, "yaml")
assert result == {"name": "Alice", "age": 30}
def test_nested(self):
content = "users:\n - name: Alice\n - name: Bob"
result = parse_file_content(content, "yaml")
assert result == {"users": [{"name": "Alice"}, {"name": "Bob"}]}
def test_scalar_stays_as_string(self):
result = parse_file_content("hello world", "yaml")
assert result == "hello world"
def test_invalid_yaml_fallback(self):
content = ":\n :\n invalid: - -"
result = parse_file_content(content, "yaml")
# Malformed YAML should fall back to the original string, not raise.
assert result == content
# ---------------------------------------------------------------------------
# parse_file_content — TOML
# ---------------------------------------------------------------------------
class TestParseToml:
def test_basic(self):
content = '[server]\nhost = "localhost"\nport = 8080'
result = parse_file_content(content, "toml")
assert result == {"server": {"host": "localhost", "port": 8080}}
def test_flat(self):
content = 'name = "test"\ncount = 42'
result = parse_file_content(content, "toml")
assert result == {"name": "test", "count": 42}
def test_empty_string_returns_empty_dict(self):
result = parse_file_content("", "toml")
assert result == {}
def test_invalid_toml_fallback(self):
result = parse_file_content("not = [valid toml", "toml")
assert result == "not = [valid toml"
# ---------------------------------------------------------------------------
# parse_file_content — Parquet (binary)
# ---------------------------------------------------------------------------
try:
import pyarrow as _pa # noqa: F401 # pyright: ignore[reportMissingImports]
_has_pyarrow = True
except ImportError:
_has_pyarrow = False
@pytest.mark.skipif(not _has_pyarrow, reason="pyarrow not installed")
class TestParseParquet:
@pytest.fixture
def parquet_bytes(self) -> bytes:
import pandas as pd
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Score": [90, 85]})
buf = io.BytesIO()
df.to_parquet(buf, index=False)
return buf.getvalue()
def test_basic(self, parquet_bytes: bytes):
result = parse_file_content(parquet_bytes, "parquet")
assert result == [["Name", "Score"], ["Alice", 90], ["Bob", 85]]
def test_string_input_fallback(self):
# Parquet is binary — string input can't be parsed.
result = parse_file_content("not parquet", "parquet")
assert result == "not parquet"
def test_invalid_bytes_fallback(self):
result = parse_file_content(b"not parquet bytes", "parquet")
assert result == b"not parquet bytes"
def test_empty_bytes_fallback(self):
"""Empty binary input should return the empty bytes, not crash."""
result = parse_file_content(b"", "parquet")
assert result == b""
def test_nan_replaced_with_none(self):
"""NaN values in Parquet must become None for JSON serializability."""
import math
import pandas as pd
df = pd.DataFrame({"A": [1.0, float("nan"), 3.0], "B": ["x", None, "z"]})
buf = io.BytesIO()
df.to_parquet(buf, index=False)
result = parse_file_content(buf.getvalue(), "parquet")
# Row with NaN in float col → None
assert result[2][0] is None # float NaN → None
assert result[2][1] is None # str None → None
# Ensure no NaN leaks
for row in result[1:]:
for cell in row:
if isinstance(cell, float):
assert not math.isnan(cell), f"NaN leaked: {row}"
# ---------------------------------------------------------------------------
# parse_file_content — Excel (binary)
# ---------------------------------------------------------------------------
class TestParseExcel:
@pytest.fixture
def xlsx_bytes(self) -> bytes:
import pandas as pd
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Score": [90, 85]})
buf = io.BytesIO()
df.to_excel(buf, index=False) # type: ignore[arg-type] # BytesIO is a valid target
return buf.getvalue()
def test_basic(self, xlsx_bytes: bytes):
result = parse_file_content(xlsx_bytes, "xlsx")
assert result == [["Name", "Score"], ["Alice", 90], ["Bob", 85]]
def test_string_input_fallback(self):
result = parse_file_content("not xlsx", "xlsx")
assert result == "not xlsx"
def test_invalid_bytes_fallback(self):
result = parse_file_content(b"not xlsx bytes", "xlsx")
assert result == b"not xlsx bytes"
def test_empty_bytes_fallback(self):
"""Empty binary input should return the empty bytes, not crash."""
result = parse_file_content(b"", "xlsx")
assert result == b""
def test_nan_replaced_with_none(self):
"""NaN values in float columns must become None for JSON serializability."""
import math
import pandas as pd
df = pd.DataFrame({"A": [1.0, float("nan"), 3.0], "B": ["x", "y", None]})
buf = io.BytesIO()
df.to_excel(buf, index=False) # type: ignore[arg-type]
result = parse_file_content(buf.getvalue(), "xlsx")
# Row with NaN in float col → None, not float('nan')
assert result[2][0] is None # float NaN → None
assert result[3][1] is None # str None → None
# Ensure no NaN leaks
for row in result[1:]: # skip header
for cell in row:
if isinstance(cell, float):
assert not math.isnan(cell), f"NaN leaked: {row}"
# ---------------------------------------------------------------------------
# parse_file_content — unknown format / fallback
# ---------------------------------------------------------------------------
class TestFallback:
def test_unknown_format_returns_content(self):
result = parse_file_content("hello world", "xml")
assert result == "hello world"
def test_none_format_returns_content(self):
# Shouldn't normally be called with unrecognised format, but must not crash.
result = parse_file_content("hello", "unknown_format")
assert result == "hello"
# ---------------------------------------------------------------------------
# BINARY_FORMATS
# ---------------------------------------------------------------------------
class TestBinaryFormats:
def test_parquet_is_binary(self):
assert "parquet" in BINARY_FORMATS
def test_xlsx_is_binary(self):
assert "xlsx" in BINARY_FORMATS
def test_text_formats_not_binary(self):
for fmt in ("json", "jsonl", "csv", "tsv", "yaml", "toml"):
assert fmt not in BINARY_FORMATS
# ---------------------------------------------------------------------------
# MIME mapping
# ---------------------------------------------------------------------------
class TestMimeMapping:
def test_application_yaml(self):
assert infer_format_from_uri("workspace://abc123#application/yaml") == "yaml"
# ---------------------------------------------------------------------------
# CSV sniffer fallback
# ---------------------------------------------------------------------------
class TestCsvSnifferFallback:
def test_tab_delimited_with_csv_format(self):
"""Tab-delimited content parsed as csv should use sniffer fallback."""
content = "Name\tScore\nAlice\t90\nBob\t85"
result = parse_file_content(content, "csv")
assert result == [["Name", "Score"], ["Alice", "90"], ["Bob", "85"]]
def test_sniffer_failure_returns_content(self):
"""When sniffer fails, single-column falls back to raw content."""
content = "Name\nAlice\nBob"
result = parse_file_content(content, "csv")
assert result == content
# ---------------------------------------------------------------------------
# OpenpyxlInvalidFile fallback
# ---------------------------------------------------------------------------
class TestOpenpyxlFallback:
def test_invalid_xlsx_non_strict(self):
"""Invalid xlsx bytes should fall back gracefully in non-strict mode."""
result = parse_file_content(b"not xlsx bytes", "xlsx")
assert result == b"not xlsx bytes"
# ---------------------------------------------------------------------------
# Header-only CSV
# ---------------------------------------------------------------------------
class TestHeaderOnlyCsv:
def test_header_only_csv_returns_header_row(self):
"""CSV with only a header row (no data rows) should return [[header]]."""
content = "Name,Score"
result = parse_file_content(content, "csv")
assert result == [["Name", "Score"]]
def test_header_only_csv_with_trailing_newline(self):
content = "Name,Score\n"
result = parse_file_content(content, "csv")
assert result == [["Name", "Score"]]
# ---------------------------------------------------------------------------
# Binary format + line range (line range ignored for binary formats)
# ---------------------------------------------------------------------------
@pytest.mark.skipif(not _has_pyarrow, reason="pyarrow not installed")
class TestBinaryFormatLineRange:
def test_parquet_ignores_line_range(self):
"""Binary formats should parse the full file regardless of line range.
Line ranges are meaningless for binary formats (parquet/xlsx) — the
caller (file_ref._expand_bare_ref) passes raw bytes and the parser
should return the complete structured data.
"""
import pandas as pd
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
buf = io.BytesIO()
df.to_parquet(buf, index=False)
# parse_file_content itself doesn't take a line range — this tests
# that the full content is parsed even though the bytes could have
# been truncated upstream (it's not, by design).
result = parse_file_content(buf.getvalue(), "parquet")
assert result == [["A", "B"], [1, 4], [2, 5], [3, 6]]
# ---------------------------------------------------------------------------
# Legacy .xls UX
# ---------------------------------------------------------------------------
class TestXlsFallback:
def test_xls_returns_helpful_error_string(self):
"""Uploading a .xls file should produce a helpful error, not garbled binary."""
result = parse_file_content(b"\xd0\xcf\x11\xe0garbled", "xls")
assert isinstance(result, str)
assert ".xlsx" in result
assert "not supported" in result.lower()
def test_xls_with_string_content(self):
result = parse_file_content("some text", "xls")
assert isinstance(result, str)
assert ".xlsx" in result

View File

@@ -8,7 +8,12 @@ from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from backend.data.execution import ExecutionContext
from backend.util.file import store_media_file
from backend.util.file import (
is_media_file_ref,
parse_data_uri,
resolve_media_content,
store_media_file,
)
from backend.util.type import MediaFileType
@@ -344,3 +349,162 @@ class TestFileCloudIntegration:
execution_context=make_test_context(graph_exec_id=graph_exec_id),
return_format="for_local_processing",
)
# ---------------------------------------------------------------------------
# is_media_file_ref
# ---------------------------------------------------------------------------
class TestIsMediaFileRef:
def test_data_uri(self):
assert is_media_file_ref("data:image/png;base64,iVBORw0KGg==") is True
def test_workspace_uri(self):
assert is_media_file_ref("workspace://abc123") is True
def test_workspace_uri_with_mime(self):
assert is_media_file_ref("workspace://abc123#image/png") is True
def test_http_url(self):
assert is_media_file_ref("http://example.com/image.png") is True
def test_https_url(self):
assert is_media_file_ref("https://example.com/image.png") is True
def test_plain_text(self):
assert is_media_file_ref("print('hello')") is False
def test_local_path(self):
assert is_media_file_ref("/tmp/file.txt") is False
def test_empty_string(self):
assert is_media_file_ref("") is False
def test_filename(self):
assert is_media_file_ref("image.png") is False
# ---------------------------------------------------------------------------
# parse_data_uri
# ---------------------------------------------------------------------------
class TestParseDataUri:
def test_valid_png(self):
result = parse_data_uri("data:image/png;base64,iVBORw0KGg==")
assert result is not None
mime, payload = result
assert mime == "image/png"
assert payload == "iVBORw0KGg=="
def test_valid_text(self):
result = parse_data_uri("data:text/plain;base64,SGVsbG8=")
assert result is not None
assert result[0] == "text/plain"
assert result[1] == "SGVsbG8="
def test_mime_case_normalized(self):
result = parse_data_uri("data:IMAGE/PNG;base64,abc")
assert result is not None
assert result[0] == "image/png"
def test_not_data_uri(self):
assert parse_data_uri("workspace://abc123") is None
def test_plain_text(self):
assert parse_data_uri("hello world") is None
def test_missing_base64(self):
assert parse_data_uri("data:image/png;utf-8,abc") is None
def test_empty_payload(self):
result = parse_data_uri("data:image/png;base64,")
assert result is not None
assert result[1] == ""
# ---------------------------------------------------------------------------
# resolve_media_content
# ---------------------------------------------------------------------------
class TestResolveMediaContent:
@pytest.mark.asyncio
async def test_plain_text_passthrough(self):
"""Plain text content (not a media ref) passes through unchanged."""
ctx = make_test_context()
result = await resolve_media_content(
MediaFileType("print('hello')"),
ctx,
return_format="for_external_api",
)
assert result == "print('hello')"
@pytest.mark.asyncio
async def test_empty_string_passthrough(self):
"""Empty string passes through unchanged."""
ctx = make_test_context()
result = await resolve_media_content(
MediaFileType(""),
ctx,
return_format="for_external_api",
)
assert result == ""
@pytest.mark.asyncio
async def test_media_ref_delegates_to_store(self):
"""Media references are resolved via store_media_file."""
ctx = make_test_context()
with patch(
"backend.util.file.store_media_file",
new=AsyncMock(return_value=MediaFileType("data:image/png;base64,abc")),
) as mock_store:
result = await resolve_media_content(
MediaFileType("workspace://img123"),
ctx,
return_format="for_external_api",
)
assert result == "data:image/png;base64,abc"
mock_store.assert_called_once_with(
MediaFileType("workspace://img123"),
ctx,
return_format="for_external_api",
)
@pytest.mark.asyncio
async def test_data_uri_delegates_to_store(self):
"""Data URIs are also resolved via store_media_file."""
ctx = make_test_context()
data_uri = "data:image/png;base64,iVBORw0KGg=="
with patch(
"backend.util.file.store_media_file",
new=AsyncMock(return_value=MediaFileType(data_uri)),
) as mock_store:
result = await resolve_media_content(
MediaFileType(data_uri),
ctx,
return_format="for_external_api",
)
assert result == data_uri
mock_store.assert_called_once()
@pytest.mark.asyncio
async def test_https_url_delegates_to_store(self):
"""HTTPS URLs are resolved via store_media_file."""
ctx = make_test_context()
with patch(
"backend.util.file.store_media_file",
new=AsyncMock(return_value=MediaFileType("data:image/png;base64,abc")),
) as mock_store:
result = await resolve_media_content(
MediaFileType("https://example.com/image.png"),
ctx,
return_format="for_local_processing",
)
assert result == "data:image/png;base64,abc"
mock_store.assert_called_once_with(
MediaFileType("https://example.com/image.png"),
ctx,
return_format="for_local_processing",
)

View File

@@ -10,7 +10,7 @@ from sentry_sdk.integrations.launchdarkly import LaunchDarklyIntegration
from sentry_sdk.integrations.logging import LoggingIntegration
from backend.util import feature_flag
from backend.util.settings import Settings
from backend.util.settings import BehaveAs, Settings
settings = Settings()
logger = logging.getLogger(__name__)
@@ -21,6 +21,95 @@ class DiscordChannel(str, Enum):
PRODUCT = "product" # For product alerts (low balance, zero balance, etc.)
def _before_send(event, hint):
"""Filter out expected/transient errors from Sentry to reduce noise."""
if "exc_info" in hint:
exc_type, exc_value, _ = hint["exc_info"]
exc_msg = str(exc_value).lower() if exc_value else ""
# AMQP/RabbitMQ transient connection errors — expected during deploys
amqp_keywords = [
"amqpconnection",
"amqpconnector",
"connection_forced",
"channelinvalidstateerror",
"no active transport",
]
if any(kw in exc_msg for kw in amqp_keywords):
return None
# "connection refused" only for AMQP-related exceptions (not other services)
if "connection refused" in exc_msg:
exc_module = getattr(exc_type, "__module__", "") or ""
exc_name = getattr(exc_type, "__name__", "") or ""
amqp_indicators = ["aio_pika", "aiormq", "amqp", "pika", "rabbitmq"]
if any(
ind in exc_module.lower() or ind in exc_name.lower()
for ind in amqp_indicators
) or any(kw in exc_msg for kw in ["amqp", "pika", "rabbitmq"]):
return None
# User-caused credential/auth errors — not platform bugs
user_auth_keywords = [
"incorrect api key",
"invalid x-api-key",
"missing authentication header",
"invalid api token",
"authentication_error",
]
if any(kw in exc_msg for kw in user_auth_keywords):
return None
# Expected business logic — insufficient balance
if "insufficient balance" in exc_msg or "no credits left" in exc_msg:
return None
# Expected security check — blocked IP access
if "access to blocked or private ip" in exc_msg:
return None
# Discord bot token misconfiguration — not a platform error
if "improper token has been passed" in exc_msg or (
exc_type and exc_type.__name__ == "Forbidden" and "50001" in exc_msg
):
return None
# Google metadata DNS errors — expected in non-GCP environments
if (
"metadata.google.internal" in exc_msg
and settings.config.behave_as != BehaveAs.CLOUD
):
return None
# Inactive email recipients — expected for bounced addresses
if "marked as inactive" in exc_msg or "inactive addresses" in exc_msg:
return None
# Also filter log-based events for known noisy messages.
# Sentry's LoggingIntegration stores log messages under "logentry", not "message".
logentry = event.get("logentry") or {}
log_msg = (
logentry.get("formatted") or logentry.get("message") or event.get("message")
)
if event.get("logger") and log_msg:
msg = log_msg.lower()
noisy_patterns = [
"amqpconnection",
"connection_forced",
"unclosed client session",
"unclosed connector",
]
if any(p in msg for p in noisy_patterns):
return None
# "connection refused" in logs only when AMQP-related context is present
if "connection refused" in msg and any(
ind in msg for ind in ("amqp", "pika", "rabbitmq", "aio_pika", "aiormq")
):
return None
return event
def sentry_init():
sentry_dsn = settings.secrets.sentry_dsn
integrations = []
@@ -35,6 +124,7 @@ def sentry_init():
profiles_sample_rate=1.0,
environment=f"app:{settings.config.app_env.value}-behave:{settings.config.behave_as.value}",
_experiments={"enable_logs": True},
before_send=_before_send,
integrations=[
AsyncioIntegration(),
LoggingIntegration(sentry_logs_level=logging.INFO),

View File

@@ -70,9 +70,6 @@ def _msg_tokens(msg: dict, enc) -> int:
# Count tool result tokens
tool_call_tokens += _tok_len(item.get("tool_use_id", ""), enc)
tool_call_tokens += _tok_len(item.get("content", ""), enc)
elif isinstance(item, dict) and item.get("type") == "text":
# Count text block tokens
tool_call_tokens += _tok_len(item.get("text", ""), enc)
elif isinstance(item, dict) and "content" in item:
# Other content types with content field
tool_call_tokens += _tok_len(item.get("content", ""), enc)
@@ -148,14 +145,10 @@ def _truncate_middle_tokens(text: str, enc, max_tok: int) -> str:
if len(ids) <= max_tok:
return text # nothing to do
# Need at least 3 tokens (head + ellipsis + tail) for meaningful truncation
mid = enc.encode("")
if max_tok < 3:
return enc.decode(mid)
# Split the allowance between the two ends:
head = max_tok // 2 - 1 # -1 for the ellipsis
tail = max_tok - head - 1
mid = enc.encode("")
return enc.decode(ids[:head] + mid + ids[-tail:])
@@ -403,7 +396,7 @@ def validate_and_remove_orphan_tool_responses(
if log_warning:
logger.warning(
"Removing %d orphan tool response(s): %s", len(orphan_ids), orphan_ids
f"Removing {len(orphan_ids)} orphan tool response(s): {orphan_ids}"
)
return _remove_orphan_tool_responses(messages, orphan_ids)
@@ -495,9 +488,8 @@ def _ensure_tool_pairs_intact(
# Some tool_call_ids couldn't be resolved - remove those tool responses
# This shouldn't happen in normal operation but handles edge cases
logger.warning(
"Could not find assistant messages for tool_call_ids: %s. "
"Removing orphan tool responses.",
orphan_tool_call_ids,
f"Could not find assistant messages for tool_call_ids: {orphan_tool_call_ids}. "
"Removing orphan tool responses."
)
recent_messages = _remove_orphan_tool_responses(
recent_messages, orphan_tool_call_ids
@@ -505,8 +497,8 @@ def _ensure_tool_pairs_intact(
if messages_to_prepend:
logger.info(
"Extended recent messages by %d to preserve tool_call/tool_response pairs",
len(messages_to_prepend),
f"Extended recent messages by {len(messages_to_prepend)} to preserve "
f"tool_call/tool_response pairs"
)
return messages_to_prepend + recent_messages
@@ -694,15 +686,11 @@ async def compress_context(
msgs = [summary_msg] + recent_msgs
logger.info(
"Context summarized: %d -> %d tokens, summarized %d messages",
original_count,
total_tokens(),
messages_summarized,
f"Context summarized: {original_count} -> {total_tokens()} tokens, "
f"summarized {messages_summarized} messages"
)
except Exception as e:
logger.warning(
"Summarization failed, continuing with truncation: %s", e
)
logger.warning(f"Summarization failed, continuing with truncation: {e}")
# Fall through to content truncation
# ---- STEP 2: Normalize content ----------------------------------------
@@ -740,12 +728,6 @@ async def compress_context(
# This is more granular than dropping all old messages at once.
while total_tokens() + reserve > target_tokens and len(msgs) > 2:
deletable: list[int] = []
# Count assistant messages to ensure we keep at least one
assistant_indices: set[int] = {
i
for i in range(len(msgs))
if msgs[i] is not None and msgs[i].get("role") == "assistant"
}
for i in range(1, len(msgs) - 1):
msg = msgs[i]
if (
@@ -753,9 +735,6 @@ async def compress_context(
and not _is_tool_message(msg)
and not _is_objective_message(msg)
):
# Skip if this is the last remaining assistant message
if msg.get("role") == "assistant" and len(assistant_indices) <= 1:
continue
deletable.append(i)
if not deletable:
break

View File

@@ -64,7 +64,7 @@ def send_rate_limited_discord_alert(
return True
except Exception as alert_error:
logger.error(f"Failed to send Discord alert: {alert_error}")
logger.warning(f"Failed to send Discord alert: {alert_error}")
return False
@@ -182,7 +182,8 @@ def conn_retry(
func_name = getattr(retry_state.fn, "__name__", "unknown")
if retry_state.outcome.failed and retry_state.next_action is None:
logger.error(f"{prefix} {action_name} failed after retries: {exception}")
# Final failure is logged by sync_wrapper/async_wrapper — skip here to avoid duplicates
pass
else:
if attempt_number == EXCESSIVE_RETRY_THRESHOLD:
if send_rate_limited_discord_alert(
@@ -225,7 +226,7 @@ def conn_retry(
logger.info(f"{prefix} {action_name} completed successfully.")
return result
except Exception as e:
logger.error(f"{prefix} {action_name} failed after retries: {e}")
logger.warning(f"{prefix} {action_name} failed after retries: {e}")
raise
@wraps(func)
@@ -237,7 +238,7 @@ def conn_retry(
logger.info(f"{prefix} {action_name} completed successfully.")
return result
except Exception as e:
logger.error(f"{prefix} {action_name} failed after retries: {e}")
logger.warning(f"{prefix} {action_name} failed after retries: {e}")
raise
return async_wrapper if is_coroutine else sync_wrapper

View File

@@ -183,7 +183,8 @@ class WorkspaceManager:
f"{Config().max_file_size_mb}MB limit"
)
# Virus scan content before persisting (defense in depth)
# Scan here — callers must NOT duplicate this scan.
# WorkspaceManager owns virus scanning for all persisted files.
await scan_content_safe(content, filename=filename)
# Determine path with session scoping

View File

@@ -1360,6 +1360,18 @@ files = [
dnspython = ">=2.0.0"
idna = ">=2.0.0"
[[package]]
name = "et-xmlfile"
version = "2.0.0"
description = "An implementation of lxml.xmlfile for the standard library"
optional = false
python-versions = ">=3.8"
groups = ["main"]
files = [
{file = "et_xmlfile-2.0.0-py3-none-any.whl", hash = "sha256:7a91720bc756843502c3b7504c77b8fe44217c85c537d85037f0f536151b2caa"},
{file = "et_xmlfile-2.0.0.tar.gz", hash = "sha256:dab3f4764309081ce75662649be815c4c9081e88f0837825f90fd28317d4da54"},
]
[[package]]
name = "exa-py"
version = "1.16.1"
@@ -4228,6 +4240,21 @@ datalib = ["numpy (>=1)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
realtime = ["websockets (>=13,<16)"]
voice-helpers = ["numpy (>=2.0.2)", "sounddevice (>=0.5.1)"]
[[package]]
name = "openpyxl"
version = "3.1.5"
description = "A Python library to read/write Excel 2010 xlsx/xlsm files"
optional = false
python-versions = ">=3.8"
groups = ["main"]
files = [
{file = "openpyxl-3.1.5-py2.py3-none-any.whl", hash = "sha256:5282c12b107bffeef825f4617dc029afaf41d0ea60823bbb665ef3079dc79de2"},
{file = "openpyxl-3.1.5.tar.gz", hash = "sha256:cf0e3cf56142039133628b5acffe8ef0c12bc902d2aadd3e0fe5878dc08d1050"},
]
[package.dependencies]
et-xmlfile = "*"
[[package]]
name = "opentelemetry-api"
version = "1.39.1"
@@ -5430,6 +5457,66 @@ files = [
{file = "psycopg2_binary-2.9.11-cp39-cp39-win_amd64.whl", hash = "sha256:875039274f8a2361e5207857899706da840768e2a775bf8c65e82f60b197df02"},
]
[[package]]
name = "pyarrow"
version = "23.0.1"
description = "Python library for Apache Arrow"
optional = false
python-versions = ">=3.10"
groups = ["main"]
files = [
{file = "pyarrow-23.0.1-cp310-cp310-macosx_12_0_arm64.whl", hash = "sha256:3fab8f82571844eb3c460f90a75583801d14ca0cc32b1acc8c361650e006fd56"},
{file = "pyarrow-23.0.1-cp310-cp310-macosx_12_0_x86_64.whl", hash = "sha256:3f91c038b95f71ddfc865f11d5876c42f343b4495535bd262c7b321b0b94507c"},
{file = "pyarrow-23.0.1-cp310-cp310-manylinux_2_28_aarch64.whl", hash = "sha256:d0744403adabef53c985a7f8a082b502a368510c40d184df349a0a8754533258"},
{file = "pyarrow-23.0.1-cp310-cp310-manylinux_2_28_x86_64.whl", hash = "sha256:c33b5bf406284fd0bba436ed6f6c3ebe8e311722b441d89397c54f871c6863a2"},
{file = "pyarrow-23.0.1-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:ddf743e82f69dcd6dbbcb63628895d7161e04e56794ef80550ac6f3315eeb1d5"},
{file = "pyarrow-23.0.1-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:e052a211c5ac9848ae15d5ec875ed0943c0221e2fcfe69eee80b604b4e703222"},
{file = "pyarrow-23.0.1-cp310-cp310-win_amd64.whl", hash = "sha256:5abde149bb3ce524782d838eb67ac095cd3fd6090eba051130589793f1a7f76d"},
{file = "pyarrow-23.0.1-cp311-cp311-macosx_12_0_arm64.whl", hash = "sha256:6f0147ee9e0386f519c952cc670eb4a8b05caa594eeffe01af0e25f699e4e9bb"},
{file = "pyarrow-23.0.1-cp311-cp311-macosx_12_0_x86_64.whl", hash = "sha256:0ae6e17c828455b6265d590100c295193f93cc5675eb0af59e49dbd00d2de350"},
{file = "pyarrow-23.0.1-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:fed7020203e9ef273360b9e45be52a2a47d3103caf156a30ace5247ffb51bdbd"},
{file = "pyarrow-23.0.1-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:26d50dee49d741ac0e82185033488d28d35be4d763ae6f321f97d1140eb7a0e9"},
{file = "pyarrow-23.0.1-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:3c30143b17161310f151f4a2bcfe41b5ff744238c1039338779424e38579d701"},
{file = "pyarrow-23.0.1-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:db2190fa79c80a23fdd29fef4b8992893f024ae7c17d2f5f4db7171fa30c2c78"},
{file = "pyarrow-23.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:f00f993a8179e0e1c9713bcc0baf6d6c01326a406a9c23495ec1ba9c9ebf2919"},
{file = "pyarrow-23.0.1-cp312-cp312-macosx_12_0_arm64.whl", hash = "sha256:f4b0dbfa124c0bb161f8b5ebb40f1a680b70279aa0c9901d44a2b5a20806039f"},
{file = "pyarrow-23.0.1-cp312-cp312-macosx_12_0_x86_64.whl", hash = "sha256:7707d2b6673f7de054e2e83d59f9e805939038eebe1763fe811ee8fa5c0cd1a7"},
{file = "pyarrow-23.0.1-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:86ff03fb9f1a320266e0de855dee4b17da6794c595d207f89bba40d16b5c78b9"},
{file = "pyarrow-23.0.1-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:813d99f31275919c383aab17f0f455a04f5a429c261cc411b1e9a8f5e4aaaa05"},
{file = "pyarrow-23.0.1-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:bf5842f960cddd2ef757d486041d57c96483efc295a8c4a0e20e704cbbf39c67"},
{file = "pyarrow-23.0.1-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:564baf97c858ecc03ec01a41062e8f4698abc3e6e2acd79c01c2e97880a19730"},
{file = "pyarrow-23.0.1-cp312-cp312-win_amd64.whl", hash = "sha256:07deae7783782ac7250989a7b2ecde9b3c343a643f82e8a4df03d93b633006f0"},
{file = "pyarrow-23.0.1-cp313-cp313-macosx_12_0_arm64.whl", hash = "sha256:6b8fda694640b00e8af3c824f99f789e836720aa8c9379fb435d4c4953a756b8"},
{file = "pyarrow-23.0.1-cp313-cp313-macosx_12_0_x86_64.whl", hash = "sha256:8ff51b1addc469b9444b7c6f3548e19dc931b172ab234e995a60aea9f6e6025f"},
{file = "pyarrow-23.0.1-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:71c5be5cbf1e1cb6169d2a0980850bccb558ddc9b747b6206435313c47c37677"},
{file = "pyarrow-23.0.1-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:9b6f4f17b43bc39d56fec96e53fe89d94bac3eb134137964371b45352d40d0c2"},
{file = "pyarrow-23.0.1-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:9fc13fc6c403d1337acab46a2c4346ca6c9dec5780c3c697cf8abfd5e19b6b37"},
{file = "pyarrow-23.0.1-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:5c16ed4f53247fa3ffb12a14d236de4213a4415d127fe9cebed33d51671113e2"},
{file = "pyarrow-23.0.1-cp313-cp313-win_amd64.whl", hash = "sha256:cecfb12ef629cf6be0b1887f9f86463b0dd3dc3195ae6224e74006be4736035a"},
{file = "pyarrow-23.0.1-cp313-cp313t-macosx_12_0_arm64.whl", hash = "sha256:29f7f7419a0e30264ea261fdc0e5fe63ce5a6095003db2945d7cd78df391a7e1"},
{file = "pyarrow-23.0.1-cp313-cp313t-macosx_12_0_x86_64.whl", hash = "sha256:33d648dc25b51fd8055c19e4261e813dfc4d2427f068bcecc8b53d01b81b0500"},
{file = "pyarrow-23.0.1-cp313-cp313t-manylinux_2_28_aarch64.whl", hash = "sha256:cd395abf8f91c673dd3589cadc8cc1ee4e8674fa61b2e923c8dd215d9c7d1f41"},
{file = "pyarrow-23.0.1-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:00be9576d970c31defb5c32eb72ef585bf600ef6d0a82d5eccaae96639cf9d07"},
{file = "pyarrow-23.0.1-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:c2139549494445609f35a5cda4eb94e2c9e4d704ce60a095b342f82460c73a83"},
{file = "pyarrow-23.0.1-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:7044b442f184d84e2351e5084600f0d7343d6117aabcbc1ac78eb1ae11eb4125"},
{file = "pyarrow-23.0.1-cp313-cp313t-win_amd64.whl", hash = "sha256:a35581e856a2fafa12f3f54fce4331862b1cfb0bef5758347a858a4aa9d6bae8"},
{file = "pyarrow-23.0.1-cp314-cp314-macosx_12_0_arm64.whl", hash = "sha256:5df1161da23636a70838099d4aaa65142777185cc0cdba4037a18cee7d8db9ca"},
{file = "pyarrow-23.0.1-cp314-cp314-macosx_12_0_x86_64.whl", hash = "sha256:fa8e51cb04b9f8c9c5ace6bab63af9a1f88d35c0d6cbf53e8c17c098552285e1"},
{file = "pyarrow-23.0.1-cp314-cp314-manylinux_2_28_aarch64.whl", hash = "sha256:0b95a3994f015be13c63148fef8832e8a23938128c185ee951c98908a696e0eb"},
{file = "pyarrow-23.0.1-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:4982d71350b1a6e5cfe1af742c53dfb759b11ce14141870d05d9e540d13bc5d1"},
{file = "pyarrow-23.0.1-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:c250248f1fe266db627921c89b47b7c06fee0489ad95b04d50353537d74d6886"},
{file = "pyarrow-23.0.1-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:5f4763b83c11c16e5f4c15601ba6dfa849e20723b46aa2617cb4bffe8768479f"},
{file = "pyarrow-23.0.1-cp314-cp314-win_amd64.whl", hash = "sha256:3a4c85ef66c134161987c17b147d6bffdca4566f9a4c1d81a0a01cdf08414ea5"},
{file = "pyarrow-23.0.1-cp314-cp314t-macosx_12_0_arm64.whl", hash = "sha256:17cd28e906c18af486a499422740298c52d7c6795344ea5002a7720b4eadf16d"},
{file = "pyarrow-23.0.1-cp314-cp314t-macosx_12_0_x86_64.whl", hash = "sha256:76e823d0e86b4fb5e1cf4a58d293036e678b5a4b03539be933d3b31f9406859f"},
{file = "pyarrow-23.0.1-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:a62e1899e3078bf65943078b3ad2a6ddcacf2373bc06379aac61b1e548a75814"},
{file = "pyarrow-23.0.1-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:df088e8f640c9fae3b1f495b3c64755c4e719091caf250f3a74d095ddf3c836d"},
{file = "pyarrow-23.0.1-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:46718a220d64677c93bc243af1d44b55998255427588e400677d7192671845c7"},
{file = "pyarrow-23.0.1-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:a09f3876e87f48bc2f13583ab551f0379e5dfb83210391e68ace404181a20690"},
{file = "pyarrow-23.0.1-cp314-cp314t-win_amd64.whl", hash = "sha256:527e8d899f14bd15b740cd5a54ad56b7f98044955373a17179d5956ddb93d9ce"},
{file = "pyarrow-23.0.1.tar.gz", hash = "sha256:b8c5873e33440b2bc2f4a79d2b47017a89c5a24116c055625e6f2ee50523f019"},
]
[[package]]
name = "pyasn1"
version = "0.6.2"
@@ -8882,4 +8969,4 @@ cffi = ["cffi (>=1.17,<2.0) ; platform_python_implementation != \"PyPy\" and pyt
[metadata]
lock-version = "2.1"
python-versions = ">=3.10,<3.14"
content-hash = "4e4365721cd3b68c58c237353b74adae1c64233fd4446904c335f23eb866fdca"
content-hash = "86dab25684dd46e635a33bd33281a926e5626a874ecc048c34389fecf34a87d8"

View File

@@ -92,6 +92,8 @@ gravitas-md2gdocs = "^0.1.0"
posthog = "^7.6.0"
fpdf2 = "^2.8.6"
langsmith = "^0.7.7"
openpyxl = "^3.1.5"
pyarrow = "^23.0.0"
[tool.poetry.group.dev.dependencies]
aiohappyeyeballs = "^2.6.1"

View File

@@ -44,6 +44,12 @@ Do NOT skip these steps. If any command reports errors, fix them and re-run unti
- Fully capitalize acronyms in symbols, e.g. `graphID`, `useBackendAPI`
- Use function declarations (not arrow functions) for components/handlers
- No `dark:` Tailwind classes — the design system handles dark mode
- Use Next.js `<Link>` for internal navigation — never raw `<a>` tags
- No `any` types unless the value genuinely can be anything
- No linter suppressors (`// @ts-ignore`, `// eslint-disable`) — fix the actual issue
- **File length** — keep files under ~200 lines; extract sub-components or hooks into their own files when a file grows beyond this
- **Function/component length** — keep render functions and hooks under ~50 lines; extract named helpers or sub-components when they grow longer
## Architecture

View File

@@ -0,0 +1,440 @@
import { describe, it, expect, vi, beforeEach } from "vitest";
import { screen, cleanup } from "@testing-library/react";
import { render } from "@/tests/integrations/test-utils";
import React from "react";
import { BlockUIType } from "../components/types";
import type {
CustomNodeData,
CustomNode as CustomNodeType,
} from "../components/FlowEditor/nodes/CustomNode/CustomNode";
import type { NodeProps } from "@xyflow/react";
import type { NodeExecutionResult } from "@/app/api/__generated__/models/nodeExecutionResult";
// ---- Mock sub-components ----
vi.mock(
"@/app/(platform)/build/components/FlowEditor/nodes/CustomNode/components/NodeContainer",
() => ({
NodeContainer: ({
children,
hasErrors,
}: {
children: React.ReactNode;
hasErrors: boolean;
}) => (
<div data-testid="node-container" data-has-errors={String(!!hasErrors)}>
{children}
</div>
),
}),
);
vi.mock(
"@/app/(platform)/build/components/FlowEditor/nodes/CustomNode/components/NodeHeader",
() => ({
NodeHeader: ({ data }: { data: CustomNodeData }) => (
<div data-testid="node-header">{data.title}</div>
),
}),
);
vi.mock(
"@/app/(platform)/build/components/FlowEditor/nodes/CustomNode/components/StickyNoteBlock",
() => ({
StickyNoteBlock: ({ data }: { data: CustomNodeData }) => (
<div data-testid="sticky-note-block">{data.title}</div>
),
}),
);
vi.mock(
"@/app/(platform)/build/components/FlowEditor/nodes/CustomNode/components/NodeAdvancedToggle",
() => ({
NodeAdvancedToggle: () => <div data-testid="node-advanced-toggle" />,
}),
);
vi.mock(
"@/app/(platform)/build/components/FlowEditor/nodes/CustomNode/components/NodeOutput/NodeOutput",
() => ({
NodeDataRenderer: () => <div data-testid="node-data-renderer" />,
}),
);
vi.mock(
"@/app/(platform)/build/components/FlowEditor/nodes/CustomNode/components/NodeExecutionBadge",
() => ({
NodeExecutionBadge: () => <div data-testid="node-execution-badge" />,
}),
);
vi.mock(
"@/app/(platform)/build/components/FlowEditor/nodes/CustomNode/components/NodeRightClickMenu",
() => ({
NodeRightClickMenu: ({ children }: { children: React.ReactNode }) => (
<div data-testid="node-right-click-menu">{children}</div>
),
}),
);
vi.mock(
"@/app/(platform)/build/components/FlowEditor/nodes/CustomNode/components/WebhookDisclaimer",
() => ({
WebhookDisclaimer: () => <div data-testid="webhook-disclaimer" />,
}),
);
vi.mock(
"@/app/(platform)/build/components/FlowEditor/nodes/CustomNode/components/SubAgentUpdate/SubAgentUpdateFeature",
() => ({
SubAgentUpdateFeature: () => <div data-testid="sub-agent-update" />,
}),
);
vi.mock(
"@/app/(platform)/build/components/FlowEditor/nodes/CustomNode/components/AyrshareConnectButton",
() => ({
AyrshareConnectButton: () => <div data-testid="ayrshare-connect-button" />,
}),
);
vi.mock(
"@/app/(platform)/build/components/FlowEditor/nodes/FormCreator",
() => ({
FormCreator: () => <div data-testid="form-creator" />,
}),
);
vi.mock(
"@/app/(platform)/build/components/FlowEditor/nodes/OutputHandler",
() => ({
OutputHandler: () => <div data-testid="output-handler" />,
}),
);
vi.mock(
"@/components/renderers/InputRenderer/utils/input-schema-pre-processor",
() => ({
preprocessInputSchema: (schema: unknown) => schema,
}),
);
vi.mock(
"@/app/(platform)/build/components/FlowEditor/nodes/CustomNode/useCustomNode",
() => ({
useCustomNode: ({ data }: { data: CustomNodeData }) => ({
inputSchema: data.inputSchema,
outputSchema: data.outputSchema,
isMCPWithTool: false,
}),
}),
);
vi.mock("@xyflow/react", async () => {
const actual = await vi.importActual("@xyflow/react");
return {
...actual,
useReactFlow: () => ({
getNodes: () => [],
getEdges: () => [],
setNodes: vi.fn(),
setEdges: vi.fn(),
getNode: vi.fn(),
}),
useNodeId: () => "test-node-id",
useUpdateNodeInternals: () => vi.fn(),
Handle: ({ children }: { children: React.ReactNode }) => (
<div>{children}</div>
),
Position: { Left: "left", Right: "right", Top: "top", Bottom: "bottom" },
};
});
import { CustomNode } from "../components/FlowEditor/nodes/CustomNode/CustomNode";
// ---- Helpers ----
function buildNodeData(
overrides: Partial<CustomNodeData> = {},
): CustomNodeData {
return {
hardcodedValues: {},
title: "Test Block",
description: "A test block",
inputSchema: { type: "object", properties: {} },
outputSchema: { type: "object", properties: {} },
uiType: BlockUIType.STANDARD,
block_id: "block-123",
costs: [],
categories: [],
...overrides,
};
}
function buildNodeProps(
dataOverrides: Partial<CustomNodeData> = {},
propsOverrides: Partial<NodeProps<CustomNodeType>> = {},
): NodeProps<CustomNodeType> {
return {
id: "node-1",
data: buildNodeData(dataOverrides),
selected: false,
type: "custom",
isConnectable: true,
positionAbsoluteX: 0,
positionAbsoluteY: 0,
zIndex: 0,
dragging: false,
dragHandle: undefined,
draggable: true,
selectable: true,
deletable: true,
parentId: undefined,
width: undefined,
height: undefined,
sourcePosition: undefined,
targetPosition: undefined,
...propsOverrides,
};
}
function renderCustomNode(
dataOverrides: Partial<CustomNodeData> = {},
propsOverrides: Partial<NodeProps<CustomNodeType>> = {},
) {
const props = buildNodeProps(dataOverrides, propsOverrides);
return render(<CustomNode {...props} />);
}
function createExecutionResult(
overrides: Partial<NodeExecutionResult> = {},
): NodeExecutionResult {
return {
node_exec_id: overrides.node_exec_id ?? "exec-1",
node_id: overrides.node_id ?? "node-1",
graph_exec_id: overrides.graph_exec_id ?? "graph-exec-1",
graph_id: overrides.graph_id ?? "graph-1",
graph_version: overrides.graph_version ?? 1,
user_id: overrides.user_id ?? "test-user",
block_id: overrides.block_id ?? "block-1",
status: overrides.status ?? "COMPLETED",
input_data: overrides.input_data ?? {},
output_data: overrides.output_data ?? {},
add_time: overrides.add_time ?? new Date("2024-01-01T00:00:00Z"),
queue_time: overrides.queue_time ?? new Date("2024-01-01T00:00:00Z"),
start_time: overrides.start_time ?? new Date("2024-01-01T00:00:01Z"),
end_time: overrides.end_time ?? new Date("2024-01-01T00:00:02Z"),
};
}
// ---- Tests ----
beforeEach(() => {
cleanup();
});
describe("CustomNode", () => {
describe("STANDARD type rendering", () => {
it("renders NodeHeader with the block title", () => {
renderCustomNode({ title: "My Standard Block" });
const header = screen.getByTestId("node-header");
expect(header).toBeDefined();
expect(header.textContent).toContain("My Standard Block");
});
it("renders NodeContainer, FormCreator, OutputHandler, and NodeExecutionBadge", () => {
renderCustomNode();
expect(screen.getByTestId("node-container")).toBeDefined();
expect(screen.getByTestId("form-creator")).toBeDefined();
expect(screen.getByTestId("output-handler")).toBeDefined();
expect(screen.getByTestId("node-execution-badge")).toBeDefined();
expect(screen.getByTestId("node-data-renderer")).toBeDefined();
expect(screen.getByTestId("node-advanced-toggle")).toBeDefined();
});
it("wraps content in NodeRightClickMenu", () => {
renderCustomNode();
expect(screen.getByTestId("node-right-click-menu")).toBeDefined();
});
it("does not render StickyNoteBlock for STANDARD type", () => {
renderCustomNode();
expect(screen.queryByTestId("sticky-note-block")).toBeNull();
});
});
describe("NOTE type rendering", () => {
it("renders StickyNoteBlock instead of main UI", () => {
renderCustomNode({ uiType: BlockUIType.NOTE, title: "My Note" });
const note = screen.getByTestId("sticky-note-block");
expect(note).toBeDefined();
expect(note.textContent).toContain("My Note");
});
it("does not render NodeContainer or other standard components", () => {
renderCustomNode({ uiType: BlockUIType.NOTE });
expect(screen.queryByTestId("node-container")).toBeNull();
expect(screen.queryByTestId("node-header")).toBeNull();
expect(screen.queryByTestId("form-creator")).toBeNull();
expect(screen.queryByTestId("output-handler")).toBeNull();
});
});
describe("WEBHOOK type rendering", () => {
it("renders WebhookDisclaimer for WEBHOOK type", () => {
renderCustomNode({ uiType: BlockUIType.WEBHOOK });
expect(screen.getByTestId("webhook-disclaimer")).toBeDefined();
});
it("renders WebhookDisclaimer for WEBHOOK_MANUAL type", () => {
renderCustomNode({ uiType: BlockUIType.WEBHOOK_MANUAL });
expect(screen.getByTestId("webhook-disclaimer")).toBeDefined();
});
});
describe("AGENT type rendering", () => {
it("renders SubAgentUpdateFeature for AGENT type", () => {
renderCustomNode({ uiType: BlockUIType.AGENT });
expect(screen.getByTestId("sub-agent-update")).toBeDefined();
});
it("does not render SubAgentUpdateFeature for non-AGENT types", () => {
renderCustomNode({ uiType: BlockUIType.STANDARD });
expect(screen.queryByTestId("sub-agent-update")).toBeNull();
});
});
describe("OUTPUT type rendering", () => {
it("does not render OutputHandler for OUTPUT type", () => {
renderCustomNode({ uiType: BlockUIType.OUTPUT });
expect(screen.queryByTestId("output-handler")).toBeNull();
});
it("still renders FormCreator and other components for OUTPUT type", () => {
renderCustomNode({ uiType: BlockUIType.OUTPUT });
expect(screen.getByTestId("form-creator")).toBeDefined();
expect(screen.getByTestId("node-header")).toBeDefined();
expect(screen.getByTestId("node-execution-badge")).toBeDefined();
});
});
describe("AYRSHARE type rendering", () => {
it("renders AyrshareConnectButton for AYRSHARE type", () => {
renderCustomNode({ uiType: BlockUIType.AYRSHARE });
expect(screen.getByTestId("ayrshare-connect-button")).toBeDefined();
});
it("does not render AyrshareConnectButton for non-AYRSHARE types", () => {
renderCustomNode({ uiType: BlockUIType.STANDARD });
expect(screen.queryByTestId("ayrshare-connect-button")).toBeNull();
});
});
describe("error states", () => {
it("sets hasErrors on NodeContainer when data.errors has non-empty values", () => {
renderCustomNode({
errors: { field1: "This field is required" },
});
const container = screen.getByTestId("node-container");
expect(container.getAttribute("data-has-errors")).toBe("true");
});
it("does not set hasErrors when data.errors is empty", () => {
renderCustomNode({ errors: {} });
const container = screen.getByTestId("node-container");
expect(container.getAttribute("data-has-errors")).toBe("false");
});
it("does not set hasErrors when data.errors values are all empty strings", () => {
renderCustomNode({ errors: { field1: "" } });
const container = screen.getByTestId("node-container");
expect(container.getAttribute("data-has-errors")).toBe("false");
});
it("sets hasErrors when last execution result has error in output_data", () => {
renderCustomNode({
nodeExecutionResults: [
createExecutionResult({
output_data: { error: ["Something went wrong"] },
}),
],
});
const container = screen.getByTestId("node-container");
expect(container.getAttribute("data-has-errors")).toBe("true");
});
it("does not set hasErrors when execution results have no error", () => {
renderCustomNode({
nodeExecutionResults: [
createExecutionResult({
output_data: { result: ["success"] },
}),
],
});
const container = screen.getByTestId("node-container");
expect(container.getAttribute("data-has-errors")).toBe("false");
});
});
describe("NodeExecutionBadge", () => {
it("always renders NodeExecutionBadge for non-NOTE types", () => {
renderCustomNode({ uiType: BlockUIType.STANDARD });
expect(screen.getByTestId("node-execution-badge")).toBeDefined();
});
it("renders NodeExecutionBadge for AGENT type", () => {
renderCustomNode({ uiType: BlockUIType.AGENT });
expect(screen.getByTestId("node-execution-badge")).toBeDefined();
});
it("renders NodeExecutionBadge for OUTPUT type", () => {
renderCustomNode({ uiType: BlockUIType.OUTPUT });
expect(screen.getByTestId("node-execution-badge")).toBeDefined();
});
});
describe("edge cases", () => {
it("renders without nodeExecutionResults", () => {
renderCustomNode({ nodeExecutionResults: undefined });
const container = screen.getByTestId("node-container");
expect(container).toBeDefined();
expect(container.getAttribute("data-has-errors")).toBe("false");
});
it("renders without errors property", () => {
renderCustomNode({ errors: undefined });
const container = screen.getByTestId("node-container");
expect(container).toBeDefined();
expect(container.getAttribute("data-has-errors")).toBe("false");
});
it("renders with empty execution results array", () => {
renderCustomNode({ nodeExecutionResults: [] });
const container = screen.getByTestId("node-container");
expect(container).toBeDefined();
expect(container.getAttribute("data-has-errors")).toBe("false");
});
});
});

View File

@@ -0,0 +1,342 @@
import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
import {
render,
screen,
fireEvent,
waitFor,
cleanup,
} from "@/tests/integrations/test-utils";
import { useBlockMenuStore } from "../stores/blockMenuStore";
import { useControlPanelStore } from "../stores/controlPanelStore";
import { DefaultStateType } from "../components/NewControlPanel/NewBlockMenu/types";
import { SearchEntryFilterAnyOfItem } from "@/app/api/__generated__/models/searchEntryFilterAnyOfItem";
// ---------------------------------------------------------------------------
// Mocks for heavy child components
// ---------------------------------------------------------------------------
vi.mock(
"../components/NewControlPanel/NewBlockMenu/BlockMenuDefault/BlockMenuDefault",
() => ({
BlockMenuDefault: () => (
<div data-testid="block-menu-default">Default Content</div>
),
}),
);
vi.mock(
"../components/NewControlPanel/NewBlockMenu/BlockMenuSearch/BlockMenuSearch",
() => ({
BlockMenuSearch: () => (
<div data-testid="block-menu-search">Search Results</div>
),
}),
);
// Mock query client used by the search bar hook
vi.mock("@/lib/react-query/queryClient", () => ({
getQueryClient: () => ({
invalidateQueries: vi.fn(),
}),
}));
// ---------------------------------------------------------------------------
// Reset stores before each test
// ---------------------------------------------------------------------------
afterEach(() => {
cleanup();
});
beforeEach(() => {
useBlockMenuStore.getState().reset();
useBlockMenuStore.setState({
filters: [],
creators: [],
creators_list: [],
categoryCounts: {
blocks: 0,
integrations: 0,
marketplace_agents: 0,
my_agents: 0,
},
});
useControlPanelStore.getState().reset();
});
// ===========================================================================
// Section 1: blockMenuStore unit tests
// ===========================================================================
describe("blockMenuStore", () => {
describe("searchQuery", () => {
it("defaults to an empty string", () => {
expect(useBlockMenuStore.getState().searchQuery).toBe("");
});
it("sets the search query", () => {
useBlockMenuStore.getState().setSearchQuery("timer");
expect(useBlockMenuStore.getState().searchQuery).toBe("timer");
});
});
describe("defaultState", () => {
it("defaults to SUGGESTION", () => {
expect(useBlockMenuStore.getState().defaultState).toBe(
DefaultStateType.SUGGESTION,
);
});
it("sets the default state", () => {
useBlockMenuStore.getState().setDefaultState(DefaultStateType.ALL_BLOCKS);
expect(useBlockMenuStore.getState().defaultState).toBe(
DefaultStateType.ALL_BLOCKS,
);
});
});
describe("filters", () => {
it("defaults to an empty array", () => {
expect(useBlockMenuStore.getState().filters).toEqual([]);
});
it("adds a filter", () => {
useBlockMenuStore.getState().addFilter(SearchEntryFilterAnyOfItem.blocks);
expect(useBlockMenuStore.getState().filters).toEqual([
SearchEntryFilterAnyOfItem.blocks,
]);
});
it("removes a filter", () => {
useBlockMenuStore
.getState()
.setFilters([
SearchEntryFilterAnyOfItem.blocks,
SearchEntryFilterAnyOfItem.integrations,
]);
useBlockMenuStore
.getState()
.removeFilter(SearchEntryFilterAnyOfItem.blocks);
expect(useBlockMenuStore.getState().filters).toEqual([
SearchEntryFilterAnyOfItem.integrations,
]);
});
it("replaces all filters with setFilters", () => {
useBlockMenuStore.getState().addFilter(SearchEntryFilterAnyOfItem.blocks);
useBlockMenuStore
.getState()
.setFilters([SearchEntryFilterAnyOfItem.marketplace_agents]);
expect(useBlockMenuStore.getState().filters).toEqual([
SearchEntryFilterAnyOfItem.marketplace_agents,
]);
});
});
describe("creators", () => {
it("adds a creator", () => {
useBlockMenuStore.getState().addCreator("alice");
expect(useBlockMenuStore.getState().creators).toEqual(["alice"]);
});
it("removes a creator", () => {
useBlockMenuStore.getState().setCreators(["alice", "bob"]);
useBlockMenuStore.getState().removeCreator("alice");
expect(useBlockMenuStore.getState().creators).toEqual(["bob"]);
});
it("replaces all creators with setCreators", () => {
useBlockMenuStore.getState().addCreator("alice");
useBlockMenuStore.getState().setCreators(["charlie"]);
expect(useBlockMenuStore.getState().creators).toEqual(["charlie"]);
});
});
describe("categoryCounts", () => {
it("sets category counts", () => {
const counts = {
blocks: 10,
integrations: 5,
marketplace_agents: 3,
my_agents: 2,
};
useBlockMenuStore.getState().setCategoryCounts(counts);
expect(useBlockMenuStore.getState().categoryCounts).toEqual(counts);
});
});
describe("searchId", () => {
it("defaults to undefined", () => {
expect(useBlockMenuStore.getState().searchId).toBeUndefined();
});
it("sets and clears searchId", () => {
useBlockMenuStore.getState().setSearchId("search-123");
expect(useBlockMenuStore.getState().searchId).toBe("search-123");
useBlockMenuStore.getState().setSearchId(undefined);
expect(useBlockMenuStore.getState().searchId).toBeUndefined();
});
});
describe("integration", () => {
it("defaults to undefined", () => {
expect(useBlockMenuStore.getState().integration).toBeUndefined();
});
it("sets the integration", () => {
useBlockMenuStore.getState().setIntegration("slack");
expect(useBlockMenuStore.getState().integration).toBe("slack");
});
});
describe("reset", () => {
it("resets searchQuery, searchId, defaultState, and integration", () => {
useBlockMenuStore.getState().setSearchQuery("hello");
useBlockMenuStore.getState().setSearchId("id-1");
useBlockMenuStore.getState().setDefaultState(DefaultStateType.ALL_BLOCKS);
useBlockMenuStore.getState().setIntegration("github");
useBlockMenuStore.getState().reset();
const state = useBlockMenuStore.getState();
expect(state.searchQuery).toBe("");
expect(state.searchId).toBeUndefined();
expect(state.defaultState).toBe(DefaultStateType.SUGGESTION);
expect(state.integration).toBeUndefined();
});
it("does not reset filters or creators (by design)", () => {
useBlockMenuStore
.getState()
.setFilters([SearchEntryFilterAnyOfItem.blocks]);
useBlockMenuStore.getState().setCreators(["alice"]);
useBlockMenuStore.getState().reset();
expect(useBlockMenuStore.getState().filters).toEqual([
SearchEntryFilterAnyOfItem.blocks,
]);
expect(useBlockMenuStore.getState().creators).toEqual(["alice"]);
});
});
});
// ===========================================================================
// Section 2: controlPanelStore unit tests
// ===========================================================================
describe("controlPanelStore", () => {
it("defaults blockMenuOpen to false", () => {
expect(useControlPanelStore.getState().blockMenuOpen).toBe(false);
});
it("sets blockMenuOpen", () => {
useControlPanelStore.getState().setBlockMenuOpen(true);
expect(useControlPanelStore.getState().blockMenuOpen).toBe(true);
});
it("sets forceOpenBlockMenu", () => {
useControlPanelStore.getState().setForceOpenBlockMenu(true);
expect(useControlPanelStore.getState().forceOpenBlockMenu).toBe(true);
});
it("resets all control panel state", () => {
useControlPanelStore.getState().setBlockMenuOpen(true);
useControlPanelStore.getState().setForceOpenBlockMenu(true);
useControlPanelStore.getState().setSaveControlOpen(true);
useControlPanelStore.getState().setForceOpenSave(true);
useControlPanelStore.getState().reset();
const state = useControlPanelStore.getState();
expect(state.blockMenuOpen).toBe(false);
expect(state.forceOpenBlockMenu).toBe(false);
expect(state.saveControlOpen).toBe(false);
expect(state.forceOpenSave).toBe(false);
});
});
// ===========================================================================
// Section 3: BlockMenuContent integration tests
// ===========================================================================
// We import BlockMenuContent directly to avoid dealing with the Popover wrapper.
import { BlockMenuContent } from "../components/NewControlPanel/NewBlockMenu/BlockMenuContent/BlockMenuContent";
describe("BlockMenuContent", () => {
it("shows BlockMenuDefault when there is no search query", () => {
useBlockMenuStore.getState().setSearchQuery("");
render(<BlockMenuContent />);
expect(screen.getByTestId("block-menu-default")).toBeDefined();
expect(screen.queryByTestId("block-menu-search")).toBeNull();
});
it("shows BlockMenuSearch when a search query is present", () => {
useBlockMenuStore.getState().setSearchQuery("timer");
render(<BlockMenuContent />);
expect(screen.getByTestId("block-menu-search")).toBeDefined();
expect(screen.queryByTestId("block-menu-default")).toBeNull();
});
it("renders the search bar", () => {
render(<BlockMenuContent />);
expect(
screen.getByPlaceholderText(
"Blocks, Agents, Integrations or Keywords...",
),
).toBeDefined();
});
it("switches from default to search view when store query changes", () => {
const { rerender } = render(<BlockMenuContent />);
expect(screen.getByTestId("block-menu-default")).toBeDefined();
// Simulate typing by setting the store directly
useBlockMenuStore.getState().setSearchQuery("webhook");
rerender(<BlockMenuContent />);
expect(screen.getByTestId("block-menu-search")).toBeDefined();
expect(screen.queryByTestId("block-menu-default")).toBeNull();
});
it("switches back to default view when search query is cleared", () => {
useBlockMenuStore.getState().setSearchQuery("something");
const { rerender } = render(<BlockMenuContent />);
expect(screen.getByTestId("block-menu-search")).toBeDefined();
useBlockMenuStore.getState().setSearchQuery("");
rerender(<BlockMenuContent />);
expect(screen.getByTestId("block-menu-default")).toBeDefined();
expect(screen.queryByTestId("block-menu-search")).toBeNull();
});
it("typing in the search bar updates the local input value", async () => {
render(<BlockMenuContent />);
const input = screen.getByPlaceholderText(
"Blocks, Agents, Integrations or Keywords...",
);
fireEvent.change(input, { target: { value: "slack" } });
expect((input as HTMLInputElement).value).toBe("slack");
});
it("shows clear button when input has text and clears on click", async () => {
render(<BlockMenuContent />);
const input = screen.getByPlaceholderText(
"Blocks, Agents, Integrations or Keywords...",
);
fireEvent.change(input, { target: { value: "test" } });
// The clear button should appear
const clearButton = screen.getByRole("button");
fireEvent.click(clearButton);
await waitFor(() => {
expect((input as HTMLInputElement).value).toBe("");
});
});
});

View File

@@ -0,0 +1,270 @@
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import {
render,
screen,
fireEvent,
waitFor,
cleanup,
} from "@/tests/integrations/test-utils";
import { UseFormReturn, useForm } from "react-hook-form";
import { zodResolver } from "@hookform/resolvers/zod";
import * as z from "zod";
import { renderHook } from "@testing-library/react";
import { useControlPanelStore } from "../stores/controlPanelStore";
import { TooltipProvider } from "@/components/atoms/Tooltip/BaseTooltip";
import { NewSaveControl } from "../components/NewControlPanel/NewSaveControl/NewSaveControl";
import { useNewSaveControl } from "../components/NewControlPanel/NewSaveControl/useNewSaveControl";
const formSchema = z.object({
name: z.string().min(1, "Name is required").max(100),
description: z.string().max(500),
});
type SaveableGraphFormValues = z.infer<typeof formSchema>;
const mockHandleSave = vi.fn();
vi.mock(
"../components/NewControlPanel/NewSaveControl/useNewSaveControl",
() => ({
useNewSaveControl: vi.fn(),
}),
);
const mockUseNewSaveControl = vi.mocked(useNewSaveControl);
function createMockForm(
defaults: SaveableGraphFormValues = { name: "", description: "" },
): UseFormReturn<SaveableGraphFormValues> {
const { result } = renderHook(() =>
useForm<SaveableGraphFormValues>({
resolver: zodResolver(formSchema),
defaultValues: defaults,
}),
);
return result.current;
}
function setupMock(overrides: {
isSaving?: boolean;
graphVersion?: number;
name?: string;
description?: string;
}) {
const form = createMockForm({
name: overrides.name ?? "",
description: overrides.description ?? "",
});
mockUseNewSaveControl.mockReturnValue({
form,
isSaving: overrides.isSaving ?? false,
graphVersion: overrides.graphVersion,
handleSave: mockHandleSave,
});
return form;
}
function resetStore() {
useControlPanelStore.setState({
blockMenuOpen: false,
saveControlOpen: false,
forceOpenBlockMenu: false,
forceOpenSave: false,
});
}
beforeEach(() => {
cleanup();
resetStore();
mockHandleSave.mockReset();
});
afterEach(() => {
cleanup();
});
describe("NewSaveControl", () => {
it("renders save button trigger", () => {
setupMock({});
render(
<TooltipProvider>
<NewSaveControl />
</TooltipProvider>,
);
expect(screen.getByTestId("save-control-save-button")).toBeDefined();
});
it("renders name and description inputs when popover is open", () => {
useControlPanelStore.setState({ saveControlOpen: true });
setupMock({});
render(
<TooltipProvider>
<NewSaveControl />
</TooltipProvider>,
);
expect(screen.getByTestId("save-control-name-input")).toBeDefined();
expect(screen.getByTestId("save-control-description-input")).toBeDefined();
});
it("does not render popover content when closed", () => {
useControlPanelStore.setState({ saveControlOpen: false });
setupMock({});
render(
<TooltipProvider>
<NewSaveControl />
</TooltipProvider>,
);
expect(screen.queryByTestId("save-control-name-input")).toBeNull();
expect(screen.queryByTestId("save-control-description-input")).toBeNull();
});
it("shows version output when graphVersion is set", () => {
useControlPanelStore.setState({ saveControlOpen: true });
setupMock({ graphVersion: 3 });
render(
<TooltipProvider>
<NewSaveControl />
</TooltipProvider>,
);
const versionInput = screen.getByTestId("save-control-version-output");
expect(versionInput).toBeDefined();
expect((versionInput as HTMLInputElement).disabled).toBe(true);
});
it("hides version output when graphVersion is undefined", () => {
useControlPanelStore.setState({ saveControlOpen: true });
setupMock({ graphVersion: undefined });
render(
<TooltipProvider>
<NewSaveControl />
</TooltipProvider>,
);
expect(screen.queryByTestId("save-control-version-output")).toBeNull();
});
it("enables save button when isSaving is false", () => {
useControlPanelStore.setState({ saveControlOpen: true });
setupMock({ isSaving: false });
render(
<TooltipProvider>
<NewSaveControl />
</TooltipProvider>,
);
const saveButton = screen.getByTestId("save-control-save-agent-button");
expect((saveButton as HTMLButtonElement).disabled).toBe(false);
});
it("disables save button when isSaving is true", () => {
useControlPanelStore.setState({ saveControlOpen: true });
setupMock({ isSaving: true });
render(
<TooltipProvider>
<NewSaveControl />
</TooltipProvider>,
);
const saveButton = screen.getByRole("button", { name: /save agent/i });
expect((saveButton as HTMLButtonElement).disabled).toBe(true);
});
it("calls handleSave on form submission with valid data", async () => {
useControlPanelStore.setState({ saveControlOpen: true });
const form = setupMock({ name: "My Agent", description: "A description" });
form.setValue("name", "My Agent");
form.setValue("description", "A description");
render(
<TooltipProvider>
<NewSaveControl />
</TooltipProvider>,
);
const saveButton = screen.getByTestId("save-control-save-agent-button");
fireEvent.click(saveButton);
await waitFor(() => {
expect(mockHandleSave).toHaveBeenCalledWith(
{ name: "My Agent", description: "A description" },
expect.anything(),
);
});
});
it("does not call handleSave when name is empty (validation fails)", async () => {
useControlPanelStore.setState({ saveControlOpen: true });
setupMock({ name: "", description: "" });
render(
<TooltipProvider>
<NewSaveControl />
</TooltipProvider>,
);
const saveButton = screen.getByTestId("save-control-save-agent-button");
fireEvent.click(saveButton);
await waitFor(() => {
expect(mockHandleSave).not.toHaveBeenCalled();
});
});
it("popover stays open when forceOpenSave is true", () => {
useControlPanelStore.setState({
saveControlOpen: false,
forceOpenSave: true,
});
setupMock({});
render(
<TooltipProvider>
<NewSaveControl />
</TooltipProvider>,
);
expect(screen.getByTestId("save-control-name-input")).toBeDefined();
});
it("allows typing in name and description inputs", () => {
useControlPanelStore.setState({ saveControlOpen: true });
setupMock({});
render(
<TooltipProvider>
<NewSaveControl />
</TooltipProvider>,
);
const nameInput = screen.getByTestId(
"save-control-name-input",
) as HTMLInputElement;
const descriptionInput = screen.getByTestId(
"save-control-description-input",
) as HTMLInputElement;
fireEvent.change(nameInput, { target: { value: "Test Agent" } });
fireEvent.change(descriptionInput, {
target: { value: "Test Description" },
});
expect(nameInput.value).toBe("Test Agent");
expect(descriptionInput.value).toBe("Test Description");
});
it("displays save button text", () => {
useControlPanelStore.setState({ saveControlOpen: true });
setupMock({});
render(
<TooltipProvider>
<NewSaveControl />
</TooltipProvider>,
);
expect(screen.getByText("Save Agent")).toBeDefined();
});
});

View File

@@ -0,0 +1,147 @@
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { screen, fireEvent, cleanup } from "@testing-library/react";
import { render } from "@/tests/integrations/test-utils";
import React from "react";
import { useGraphStore } from "../stores/graphStore";
vi.mock(
"@/app/(platform)/build/components/BuilderActions/components/RunGraph/useRunGraph",
() => ({
useRunGraph: vi.fn(),
}),
);
vi.mock(
"@/app/(platform)/build/components/BuilderActions/components/RunInputDialog/RunInputDialog",
() => ({
RunInputDialog: ({ isOpen }: { isOpen: boolean }) =>
isOpen ? <div data-testid="run-input-dialog">Dialog</div> : null,
}),
);
// Must import after mocks
import { useRunGraph } from "../components/BuilderActions/components/RunGraph/useRunGraph";
import { RunGraph } from "../components/BuilderActions/components/RunGraph/RunGraph";
const mockUseRunGraph = vi.mocked(useRunGraph);
function createMockReturnValue(
overrides: Partial<ReturnType<typeof useRunGraph>> = {},
) {
return {
handleRunGraph: vi.fn(),
handleStopGraph: vi.fn(),
openRunInputDialog: false,
setOpenRunInputDialog: vi.fn(),
isExecutingGraph: false,
isTerminatingGraph: false,
isSaving: false,
...overrides,
};
}
// RunGraph uses Tooltip which requires TooltipProvider
import { TooltipProvider } from "@/components/atoms/Tooltip/BaseTooltip";
function renderRunGraph(flowID: string | null = "test-flow-id") {
return render(
<TooltipProvider>
<RunGraph flowID={flowID} />
</TooltipProvider>,
);
}
describe("RunGraph", () => {
beforeEach(() => {
cleanup();
mockUseRunGraph.mockReturnValue(createMockReturnValue());
useGraphStore.setState({ isGraphRunning: false });
});
afterEach(() => {
cleanup();
});
it("renders an enabled button when flowID is provided", () => {
renderRunGraph("test-flow-id");
const button = screen.getByRole("button");
expect((button as HTMLButtonElement).disabled).toBe(false);
});
it("renders a disabled button when flowID is null", () => {
renderRunGraph(null);
const button = screen.getByRole("button");
expect((button as HTMLButtonElement).disabled).toBe(true);
});
it("disables the button when isExecutingGraph is true", () => {
mockUseRunGraph.mockReturnValue(
createMockReturnValue({ isExecutingGraph: true }),
);
renderRunGraph();
expect((screen.getByRole("button") as HTMLButtonElement).disabled).toBe(
true,
);
});
it("disables the button when isTerminatingGraph is true", () => {
mockUseRunGraph.mockReturnValue(
createMockReturnValue({ isTerminatingGraph: true }),
);
renderRunGraph();
expect((screen.getByRole("button") as HTMLButtonElement).disabled).toBe(
true,
);
});
it("disables the button when isSaving is true", () => {
mockUseRunGraph.mockReturnValue(createMockReturnValue({ isSaving: true }));
renderRunGraph();
expect((screen.getByRole("button") as HTMLButtonElement).disabled).toBe(
true,
);
});
it("uses data-id run-graph-button when not running", () => {
renderRunGraph();
const button = screen.getByRole("button");
expect(button.getAttribute("data-id")).toBe("run-graph-button");
});
it("uses data-id stop-graph-button when running", () => {
useGraphStore.setState({ isGraphRunning: true });
renderRunGraph();
const button = screen.getByRole("button");
expect(button.getAttribute("data-id")).toBe("stop-graph-button");
});
it("calls handleRunGraph when clicked and graph is not running", () => {
const handleRunGraph = vi.fn();
mockUseRunGraph.mockReturnValue(createMockReturnValue({ handleRunGraph }));
renderRunGraph();
fireEvent.click(screen.getByRole("button"));
expect(handleRunGraph).toHaveBeenCalledOnce();
});
it("calls handleStopGraph when clicked and graph is running", () => {
const handleStopGraph = vi.fn();
mockUseRunGraph.mockReturnValue(createMockReturnValue({ handleStopGraph }));
useGraphStore.setState({ isGraphRunning: true });
renderRunGraph();
fireEvent.click(screen.getByRole("button"));
expect(handleStopGraph).toHaveBeenCalledOnce();
});
it("renders RunInputDialog hidden by default", () => {
renderRunGraph();
expect(screen.queryByTestId("run-input-dialog")).toBeNull();
});
it("renders RunInputDialog when openRunInputDialog is true", () => {
mockUseRunGraph.mockReturnValue(
createMockReturnValue({ openRunInputDialog: true }),
);
renderRunGraph();
expect(screen.getByTestId("run-input-dialog")).toBeDefined();
});
});

View File

@@ -0,0 +1,257 @@
import { describe, it, expect, beforeEach, vi } from "vitest";
import { CustomNode } from "../components/FlowEditor/nodes/CustomNode/CustomNode";
import { BlockUIType } from "../components/types";
vi.mock("@/services/storage/local-storage", () => {
const store: Record<string, string> = {};
return {
Key: { COPIED_FLOW_DATA: "COPIED_FLOW_DATA" },
storage: {
get: (key: string) => store[key] ?? null,
set: (key: string, value: string) => {
store[key] = value;
},
clean: (key: string) => {
delete store[key];
},
},
};
});
import { useCopyPasteStore } from "../stores/copyPasteStore";
import { useNodeStore } from "../stores/nodeStore";
import { useEdgeStore } from "../stores/edgeStore";
import { useHistoryStore } from "../stores/historyStore";
import { storage, Key } from "@/services/storage/local-storage";
function createTestNode(
id: string,
overrides: Partial<CustomNode> = {},
): CustomNode {
return {
id,
type: "custom",
position: overrides.position ?? { x: 100, y: 200 },
selected: overrides.selected,
data: {
hardcodedValues: {},
title: `Node ${id}`,
description: "test node",
inputSchema: {},
outputSchema: {},
uiType: BlockUIType.STANDARD,
block_id: `block-${id}`,
costs: [],
categories: [],
...overrides.data,
},
} as CustomNode;
}
describe("useCopyPasteStore", () => {
beforeEach(() => {
useNodeStore.setState({ nodes: [], nodeCounter: 0 });
useEdgeStore.setState({ edges: [] });
useHistoryStore.getState().clear();
storage.clean(Key.COPIED_FLOW_DATA);
});
describe("copySelectedNodes", () => {
it("copies a single selected node to localStorage", () => {
const node = createTestNode("1", { selected: true });
useNodeStore.setState({ nodes: [node] });
useCopyPasteStore.getState().copySelectedNodes();
const stored = storage.get(Key.COPIED_FLOW_DATA);
expect(stored).not.toBeNull();
const parsed = JSON.parse(stored!);
expect(parsed.nodes).toHaveLength(1);
expect(parsed.nodes[0].id).toBe("1");
expect(parsed.edges).toHaveLength(0);
});
it("copies only edges between selected nodes", () => {
const nodeA = createTestNode("a", { selected: true });
const nodeB = createTestNode("b", { selected: true });
const nodeC = createTestNode("c", { selected: false });
useNodeStore.setState({ nodes: [nodeA, nodeB, nodeC] });
useEdgeStore.setState({
edges: [
{
id: "e-ab",
source: "a",
target: "b",
sourceHandle: "out",
targetHandle: "in",
},
{
id: "e-bc",
source: "b",
target: "c",
sourceHandle: "out",
targetHandle: "in",
},
{
id: "e-ac",
source: "a",
target: "c",
sourceHandle: "out",
targetHandle: "in",
},
],
});
useCopyPasteStore.getState().copySelectedNodes();
const parsed = JSON.parse(storage.get(Key.COPIED_FLOW_DATA)!);
expect(parsed.nodes).toHaveLength(2);
expect(parsed.edges).toHaveLength(1);
expect(parsed.edges[0].id).toBe("e-ab");
});
it("stores empty data when no nodes are selected", () => {
const node = createTestNode("1", { selected: false });
useNodeStore.setState({ nodes: [node] });
useCopyPasteStore.getState().copySelectedNodes();
const parsed = JSON.parse(storage.get(Key.COPIED_FLOW_DATA)!);
expect(parsed.nodes).toHaveLength(0);
expect(parsed.edges).toHaveLength(0);
});
});
describe("pasteNodes", () => {
it("creates new nodes with new IDs via incrementNodeCounter", () => {
const node = createTestNode("orig", {
selected: true,
position: { x: 100, y: 200 },
});
useNodeStore.setState({ nodes: [node], nodeCounter: 5 });
useCopyPasteStore.getState().copySelectedNodes();
useCopyPasteStore.getState().pasteNodes();
const { nodes } = useNodeStore.getState();
expect(nodes).toHaveLength(2);
const pastedNode = nodes.find((n) => n.id !== "orig");
expect(pastedNode).toBeDefined();
expect(pastedNode!.id).not.toBe("orig");
});
it("offsets pasted node positions by +50 x/y", () => {
const node = createTestNode("orig", {
selected: true,
position: { x: 100, y: 200 },
});
useNodeStore.setState({ nodes: [node], nodeCounter: 5 });
useCopyPasteStore.getState().copySelectedNodes();
useCopyPasteStore.getState().pasteNodes();
const { nodes } = useNodeStore.getState();
const pastedNode = nodes.find((n) => n.id !== "orig");
expect(pastedNode).toBeDefined();
expect(pastedNode!.position).toEqual({ x: 150, y: 250 });
});
it("preserves internal connections with remapped IDs", () => {
const nodeA = createTestNode("a", {
selected: true,
position: { x: 0, y: 0 },
});
const nodeB = createTestNode("b", {
selected: true,
position: { x: 200, y: 0 },
});
useNodeStore.setState({ nodes: [nodeA, nodeB], nodeCounter: 0 });
useEdgeStore.setState({
edges: [
{
id: "e-ab",
source: "a",
target: "b",
sourceHandle: "output",
targetHandle: "input",
},
],
});
useCopyPasteStore.getState().copySelectedNodes();
useCopyPasteStore.getState().pasteNodes();
const { edges } = useEdgeStore.getState();
const newEdges = edges.filter((e) => e.id !== "e-ab");
expect(newEdges).toHaveLength(1);
const newEdge = newEdges[0];
expect(newEdge.source).not.toBe("a");
expect(newEdge.target).not.toBe("b");
const { nodes } = useNodeStore.getState();
const pastedNodeIDs = nodes
.filter((n) => n.id !== "a" && n.id !== "b")
.map((n) => n.id);
expect(pastedNodeIDs).toContain(newEdge.source);
expect(pastedNodeIDs).toContain(newEdge.target);
});
it("deselects existing nodes and selects pasted ones", () => {
const existingNode = createTestNode("existing", {
selected: true,
position: { x: 0, y: 0 },
});
const nodeToCopy = createTestNode("copy-me", {
selected: true,
position: { x: 100, y: 100 },
});
useNodeStore.setState({
nodes: [existingNode, nodeToCopy],
nodeCounter: 0,
});
useCopyPasteStore.getState().copySelectedNodes();
// Deselect nodeToCopy, keep existingNode selected to verify deselection on paste
useNodeStore.setState({
nodes: [
{ ...existingNode, selected: true },
{ ...nodeToCopy, selected: false },
],
});
useCopyPasteStore.getState().pasteNodes();
const { nodes } = useNodeStore.getState();
const originalNodes = nodes.filter(
(n) => n.id === "existing" || n.id === "copy-me",
);
const pastedNodes = nodes.filter(
(n) => n.id !== "existing" && n.id !== "copy-me",
);
originalNodes.forEach((n) => {
expect(n.selected).toBe(false);
});
pastedNodes.forEach((n) => {
expect(n.selected).toBe(true);
});
});
it("does nothing when clipboard is empty", () => {
const node = createTestNode("1", { position: { x: 0, y: 0 } });
useNodeStore.setState({ nodes: [node], nodeCounter: 0 });
useCopyPasteStore.getState().pasteNodes();
const { nodes } = useNodeStore.getState();
expect(nodes).toHaveLength(1);
expect(nodes[0].id).toBe("1");
});
});
});

View File

@@ -0,0 +1,751 @@
import { describe, it, expect, beforeEach, vi } from "vitest";
import { MarkerType } from "@xyflow/react";
import { useEdgeStore } from "../stores/edgeStore";
import { useNodeStore } from "../stores/nodeStore";
import { useHistoryStore } from "../stores/historyStore";
import type { CustomEdge } from "../components/FlowEditor/edges/CustomEdge";
import type { NodeExecutionResult } from "@/app/api/__generated__/models/nodeExecutionResult";
import type { Link } from "@/app/api/__generated__/models/link";
function makeEdge(overrides: Partial<CustomEdge> & { id: string }): CustomEdge {
return {
type: "custom",
source: "node-a",
target: "node-b",
sourceHandle: "output",
targetHandle: "input",
...overrides,
};
}
function makeExecutionResult(
overrides: Partial<NodeExecutionResult>,
): NodeExecutionResult {
return {
user_id: "user-1",
graph_id: "graph-1",
graph_version: 1,
graph_exec_id: "gexec-1",
node_exec_id: "nexec-1",
node_id: "node-1",
block_id: "block-1",
status: "INCOMPLETE",
input_data: {},
output_data: {},
add_time: new Date(),
queue_time: null,
start_time: null,
end_time: null,
...overrides,
};
}
beforeEach(() => {
useEdgeStore.setState({ edges: [] });
useNodeStore.setState({ nodes: [] });
useHistoryStore.setState({ past: [], future: [] });
});
describe("edgeStore", () => {
describe("setEdges", () => {
it("replaces all edges", () => {
const edges = [
makeEdge({ id: "e1" }),
makeEdge({ id: "e2", source: "node-c" }),
];
useEdgeStore.getState().setEdges(edges);
expect(useEdgeStore.getState().edges).toHaveLength(2);
expect(useEdgeStore.getState().edges[0].id).toBe("e1");
expect(useEdgeStore.getState().edges[1].id).toBe("e2");
});
});
describe("addEdge", () => {
it("adds an edge and auto-generates an ID", () => {
const result = useEdgeStore.getState().addEdge({
source: "n1",
target: "n2",
sourceHandle: "out",
targetHandle: "in",
});
expect(result.id).toBe("n1:out->n2:in");
expect(useEdgeStore.getState().edges).toHaveLength(1);
expect(useEdgeStore.getState().edges[0].id).toBe("n1:out->n2:in");
});
it("uses provided ID when given", () => {
const result = useEdgeStore.getState().addEdge({
id: "custom-id",
source: "n1",
target: "n2",
sourceHandle: "out",
targetHandle: "in",
});
expect(result.id).toBe("custom-id");
});
it("sets type to custom and adds arrow marker", () => {
const result = useEdgeStore.getState().addEdge({
source: "n1",
target: "n2",
sourceHandle: "out",
targetHandle: "in",
});
expect(result.type).toBe("custom");
expect(result.markerEnd).toEqual({
type: MarkerType.ArrowClosed,
strokeWidth: 2,
color: "#555",
});
});
it("rejects duplicate edges without adding", () => {
useEdgeStore.getState().addEdge({
source: "n1",
target: "n2",
sourceHandle: "out",
targetHandle: "in",
});
const pushSpy = vi.spyOn(useHistoryStore.getState(), "pushState");
const duplicate = useEdgeStore.getState().addEdge({
source: "n1",
target: "n2",
sourceHandle: "out",
targetHandle: "in",
});
expect(useEdgeStore.getState().edges).toHaveLength(1);
expect(duplicate.id).toBe("n1:out->n2:in");
expect(pushSpy).not.toHaveBeenCalled();
pushSpy.mockRestore();
});
it("pushes previous state to history store", () => {
const pushSpy = vi.spyOn(useHistoryStore.getState(), "pushState");
useEdgeStore.getState().addEdge({
source: "n1",
target: "n2",
sourceHandle: "out",
targetHandle: "in",
});
expect(pushSpy).toHaveBeenCalledWith({
nodes: [],
edges: [],
});
pushSpy.mockRestore();
});
});
describe("removeEdge", () => {
it("removes an edge by ID", () => {
useEdgeStore.setState({
edges: [makeEdge({ id: "e1" }), makeEdge({ id: "e2" })],
});
useEdgeStore.getState().removeEdge("e1");
expect(useEdgeStore.getState().edges).toHaveLength(1);
expect(useEdgeStore.getState().edges[0].id).toBe("e2");
});
it("does nothing when removing a non-existent edge", () => {
useEdgeStore.setState({ edges: [makeEdge({ id: "e1" })] });
useEdgeStore.getState().removeEdge("nonexistent");
expect(useEdgeStore.getState().edges).toHaveLength(1);
});
it("pushes previous state to history store", () => {
const existingEdges = [makeEdge({ id: "e1" })];
useEdgeStore.setState({ edges: existingEdges });
const pushSpy = vi.spyOn(useHistoryStore.getState(), "pushState");
useEdgeStore.getState().removeEdge("e1");
expect(pushSpy).toHaveBeenCalledWith({
nodes: [],
edges: existingEdges,
});
pushSpy.mockRestore();
});
});
describe("upsertMany", () => {
it("inserts new edges", () => {
useEdgeStore.setState({ edges: [makeEdge({ id: "e1" })] });
useEdgeStore.getState().upsertMany([makeEdge({ id: "e2" })]);
expect(useEdgeStore.getState().edges).toHaveLength(2);
});
it("updates existing edges by ID", () => {
useEdgeStore.setState({
edges: [makeEdge({ id: "e1", source: "old-source" })],
});
useEdgeStore
.getState()
.upsertMany([makeEdge({ id: "e1", source: "new-source" })]);
expect(useEdgeStore.getState().edges).toHaveLength(1);
expect(useEdgeStore.getState().edges[0].source).toBe("new-source");
});
it("handles mixed inserts and updates", () => {
useEdgeStore.setState({
edges: [makeEdge({ id: "e1", source: "old" })],
});
useEdgeStore
.getState()
.upsertMany([
makeEdge({ id: "e1", source: "updated" }),
makeEdge({ id: "e2", source: "new" }),
]);
const edges = useEdgeStore.getState().edges;
expect(edges).toHaveLength(2);
expect(edges.find((e) => e.id === "e1")?.source).toBe("updated");
expect(edges.find((e) => e.id === "e2")?.source).toBe("new");
});
});
describe("removeEdgesByHandlePrefix", () => {
it("removes edges targeting a node with matching handle prefix", () => {
useEdgeStore.setState({
edges: [
makeEdge({ id: "e1", target: "node-b", targetHandle: "input_foo" }),
makeEdge({ id: "e2", target: "node-b", targetHandle: "input_bar" }),
makeEdge({
id: "e3",
target: "node-b",
targetHandle: "other_handle",
}),
makeEdge({ id: "e4", target: "node-c", targetHandle: "input_foo" }),
],
});
useEdgeStore.getState().removeEdgesByHandlePrefix("node-b", "input_");
const edges = useEdgeStore.getState().edges;
expect(edges).toHaveLength(2);
expect(edges.map((e) => e.id).sort()).toEqual(["e3", "e4"]);
});
it("does not remove edges where target does not match nodeId", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
source: "node-b",
target: "node-c",
targetHandle: "input_x",
}),
],
});
useEdgeStore.getState().removeEdgesByHandlePrefix("node-b", "input_");
expect(useEdgeStore.getState().edges).toHaveLength(1);
});
});
describe("getNodeEdges", () => {
it("returns edges where node is source", () => {
useEdgeStore.setState({
edges: [
makeEdge({ id: "e1", source: "node-a", target: "node-b" }),
makeEdge({ id: "e2", source: "node-c", target: "node-d" }),
],
});
const result = useEdgeStore.getState().getNodeEdges("node-a");
expect(result).toHaveLength(1);
expect(result[0].id).toBe("e1");
});
it("returns edges where node is target", () => {
useEdgeStore.setState({
edges: [
makeEdge({ id: "e1", source: "node-a", target: "node-b" }),
makeEdge({ id: "e2", source: "node-c", target: "node-d" }),
],
});
const result = useEdgeStore.getState().getNodeEdges("node-b");
expect(result).toHaveLength(1);
expect(result[0].id).toBe("e1");
});
it("returns edges for both source and target", () => {
useEdgeStore.setState({
edges: [
makeEdge({ id: "e1", source: "node-a", target: "node-b" }),
makeEdge({ id: "e2", source: "node-b", target: "node-c" }),
makeEdge({ id: "e3", source: "node-d", target: "node-e" }),
],
});
const result = useEdgeStore.getState().getNodeEdges("node-b");
expect(result).toHaveLength(2);
expect(result.map((e) => e.id).sort()).toEqual(["e1", "e2"]);
});
it("returns empty array for unconnected node", () => {
useEdgeStore.setState({
edges: [makeEdge({ id: "e1", source: "node-a", target: "node-b" })],
});
expect(useEdgeStore.getState().getNodeEdges("node-z")).toHaveLength(0);
});
});
describe("isInputConnected", () => {
it("returns true when target handle is connected", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
target: "node-b",
targetHandle: "input",
}),
],
});
expect(useEdgeStore.getState().isInputConnected("node-b", "input")).toBe(
true,
);
});
it("returns false when target handle is not connected", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
target: "node-b",
targetHandle: "input",
}),
],
});
expect(useEdgeStore.getState().isInputConnected("node-b", "other")).toBe(
false,
);
});
it("returns false when node is source not target", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
source: "node-b",
target: "node-c",
sourceHandle: "output",
targetHandle: "input",
}),
],
});
expect(useEdgeStore.getState().isInputConnected("node-b", "output")).toBe(
false,
);
});
});
describe("isOutputConnected", () => {
it("returns true when source handle is connected", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
source: "node-a",
sourceHandle: "output",
}),
],
});
expect(
useEdgeStore.getState().isOutputConnected("node-a", "output"),
).toBe(true);
});
it("returns false when source handle is not connected", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
source: "node-a",
sourceHandle: "output",
}),
],
});
expect(useEdgeStore.getState().isOutputConnected("node-a", "other")).toBe(
false,
);
});
});
describe("getBackendLinks", () => {
it("converts edges to Link format", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
source: "n1",
target: "n2",
sourceHandle: "out",
targetHandle: "in",
data: { isStatic: true },
}),
],
});
const links = useEdgeStore.getState().getBackendLinks();
expect(links).toHaveLength(1);
expect(links[0]).toEqual({
id: "e1",
source_id: "n1",
sink_id: "n2",
source_name: "out",
sink_name: "in",
is_static: true,
});
});
});
describe("addLinks", () => {
it("converts Links to edges and adds them", () => {
const links: Link[] = [
{
id: "link-1",
source_id: "n1",
sink_id: "n2",
source_name: "out",
sink_name: "in",
is_static: false,
},
];
useEdgeStore.getState().addLinks(links);
const edges = useEdgeStore.getState().edges;
expect(edges).toHaveLength(1);
expect(edges[0].source).toBe("n1");
expect(edges[0].target).toBe("n2");
expect(edges[0].sourceHandle).toBe("out");
expect(edges[0].targetHandle).toBe("in");
expect(edges[0].data?.isStatic).toBe(false);
});
it("adds multiple links", () => {
const links: Link[] = [
{
id: "link-1",
source_id: "n1",
sink_id: "n2",
source_name: "out",
sink_name: "in",
},
{
id: "link-2",
source_id: "n3",
sink_id: "n4",
source_name: "result",
sink_name: "value",
},
];
useEdgeStore.getState().addLinks(links);
expect(useEdgeStore.getState().edges).toHaveLength(2);
});
});
describe("getAllHandleIdsOfANode", () => {
it("returns targetHandle values for edges targeting the node", () => {
useEdgeStore.setState({
edges: [
makeEdge({ id: "e1", target: "node-b", targetHandle: "input_a" }),
makeEdge({ id: "e2", target: "node-b", targetHandle: "input_b" }),
makeEdge({ id: "e3", target: "node-c", targetHandle: "input_c" }),
],
});
const handles = useEdgeStore.getState().getAllHandleIdsOfANode("node-b");
expect(handles).toEqual(["input_a", "input_b"]);
});
it("returns empty array when no edges target the node", () => {
useEdgeStore.setState({
edges: [makeEdge({ id: "e1", source: "node-b", target: "node-c" })],
});
expect(useEdgeStore.getState().getAllHandleIdsOfANode("node-b")).toEqual(
[],
);
});
it("returns empty string for edges with no targetHandle", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
target: "node-b",
targetHandle: undefined,
}),
],
});
expect(useEdgeStore.getState().getAllHandleIdsOfANode("node-b")).toEqual([
"",
]);
});
});
describe("updateEdgeBeads", () => {
it("updates bead counts for edges targeting the node", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
target: "node-b",
targetHandle: "input",
data: { beadUp: 0, beadDown: 0, beadData: new Map() },
}),
],
});
useEdgeStore.getState().updateEdgeBeads(
"node-b",
makeExecutionResult({
node_exec_id: "exec-1",
status: "COMPLETED",
input_data: { input: "some-value" },
}),
);
const edge = useEdgeStore.getState().edges[0];
expect(edge.data?.beadUp).toBe(1);
expect(edge.data?.beadDown).toBe(1);
});
it("counts INCOMPLETE status in beadUp but not beadDown", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
target: "node-b",
targetHandle: "input",
data: { beadUp: 0, beadDown: 0, beadData: new Map() },
}),
],
});
useEdgeStore.getState().updateEdgeBeads(
"node-b",
makeExecutionResult({
node_exec_id: "exec-1",
status: "INCOMPLETE",
input_data: { input: "data" },
}),
);
const edge = useEdgeStore.getState().edges[0];
expect(edge.data?.beadUp).toBe(1);
expect(edge.data?.beadDown).toBe(0);
});
it("does not modify edges not targeting the node", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
target: "node-c",
targetHandle: "input",
data: { beadUp: 0, beadDown: 0, beadData: new Map() },
}),
],
});
useEdgeStore.getState().updateEdgeBeads(
"node-b",
makeExecutionResult({
node_exec_id: "exec-1",
status: "COMPLETED",
input_data: { input: "data" },
}),
);
const edge = useEdgeStore.getState().edges[0];
expect(edge.data?.beadUp).toBe(0);
expect(edge.data?.beadDown).toBe(0);
});
it("does not update edge when input_data has no matching handle", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
target: "node-b",
targetHandle: "input",
data: { beadUp: 0, beadDown: 0, beadData: new Map() },
}),
],
});
useEdgeStore.getState().updateEdgeBeads(
"node-b",
makeExecutionResult({
node_exec_id: "exec-1",
status: "COMPLETED",
input_data: { other_handle: "data" },
}),
);
const edge = useEdgeStore.getState().edges[0];
expect(edge.data?.beadUp).toBe(0);
expect(edge.data?.beadDown).toBe(0);
});
it("accumulates beads across multiple executions", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
target: "node-b",
targetHandle: "input",
data: { beadUp: 0, beadDown: 0, beadData: new Map() },
}),
],
});
useEdgeStore.getState().updateEdgeBeads(
"node-b",
makeExecutionResult({
node_exec_id: "exec-1",
status: "COMPLETED",
input_data: { input: "data1" },
}),
);
useEdgeStore.getState().updateEdgeBeads(
"node-b",
makeExecutionResult({
node_exec_id: "exec-2",
status: "INCOMPLETE",
input_data: { input: "data2" },
}),
);
const edge = useEdgeStore.getState().edges[0];
expect(edge.data?.beadUp).toBe(2);
expect(edge.data?.beadDown).toBe(1);
});
it("handles static edges by setting beadUp to beadDown + 1", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
target: "node-b",
targetHandle: "input",
data: {
isStatic: true,
beadUp: 0,
beadDown: 0,
beadData: new Map(),
},
}),
],
});
useEdgeStore.getState().updateEdgeBeads(
"node-b",
makeExecutionResult({
node_exec_id: "exec-1",
status: "COMPLETED",
input_data: { input: "data" },
}),
);
const edge = useEdgeStore.getState().edges[0];
expect(edge.data?.beadUp).toBe(2);
expect(edge.data?.beadDown).toBe(1);
});
});
describe("resetEdgeBeads", () => {
it("resets all bead data on all edges", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
data: {
beadUp: 5,
beadDown: 3,
beadData: new Map([["exec-1", "COMPLETED"]]),
},
}),
makeEdge({
id: "e2",
data: {
beadUp: 2,
beadDown: 1,
beadData: new Map([["exec-2", "INCOMPLETE"]]),
},
}),
],
});
useEdgeStore.getState().resetEdgeBeads();
const edges = useEdgeStore.getState().edges;
for (const edge of edges) {
expect(edge.data?.beadUp).toBe(0);
expect(edge.data?.beadDown).toBe(0);
expect(edge.data?.beadData?.size).toBe(0);
}
});
it("preserves other edge data when resetting beads", () => {
useEdgeStore.setState({
edges: [
makeEdge({
id: "e1",
data: {
isStatic: true,
edgeColorClass: "text-red-500",
beadUp: 3,
beadDown: 2,
beadData: new Map(),
},
}),
],
});
useEdgeStore.getState().resetEdgeBeads();
const edge = useEdgeStore.getState().edges[0];
expect(edge.data?.isStatic).toBe(true);
expect(edge.data?.edgeColorClass).toBe("text-red-500");
expect(edge.data?.beadUp).toBe(0);
});
});
});

View File

@@ -0,0 +1,347 @@
import { describe, it, expect, beforeEach } from "vitest";
import { useGraphStore } from "../stores/graphStore";
import { AgentExecutionStatus } from "@/app/api/__generated__/models/agentExecutionStatus";
import { GraphMeta } from "@/app/api/__generated__/models/graphMeta";
function createTestGraphMeta(
overrides: Partial<GraphMeta> & { id: string; name: string },
): GraphMeta {
return {
version: 1,
description: "",
is_active: true,
user_id: "test-user",
created_at: new Date("2024-01-01T00:00:00Z"),
...overrides,
};
}
function resetStore() {
useGraphStore.setState({
graphExecutionStatus: undefined,
isGraphRunning: false,
inputSchema: null,
credentialsInputSchema: null,
outputSchema: null,
availableSubGraphs: [],
});
}
beforeEach(() => {
resetStore();
});
describe("graphStore", () => {
describe("execution status transitions", () => {
it("handles QUEUED -> RUNNING -> COMPLETED transition", () => {
const { setGraphExecutionStatus } = useGraphStore.getState();
setGraphExecutionStatus(AgentExecutionStatus.QUEUED);
expect(useGraphStore.getState().graphExecutionStatus).toBe(
AgentExecutionStatus.QUEUED,
);
expect(useGraphStore.getState().isGraphRunning).toBe(true);
setGraphExecutionStatus(AgentExecutionStatus.RUNNING);
expect(useGraphStore.getState().graphExecutionStatus).toBe(
AgentExecutionStatus.RUNNING,
);
expect(useGraphStore.getState().isGraphRunning).toBe(true);
setGraphExecutionStatus(AgentExecutionStatus.COMPLETED);
expect(useGraphStore.getState().graphExecutionStatus).toBe(
AgentExecutionStatus.COMPLETED,
);
expect(useGraphStore.getState().isGraphRunning).toBe(false);
});
it("handles QUEUED -> RUNNING -> FAILED transition", () => {
const { setGraphExecutionStatus } = useGraphStore.getState();
setGraphExecutionStatus(AgentExecutionStatus.QUEUED);
expect(useGraphStore.getState().isGraphRunning).toBe(true);
setGraphExecutionStatus(AgentExecutionStatus.RUNNING);
expect(useGraphStore.getState().isGraphRunning).toBe(true);
setGraphExecutionStatus(AgentExecutionStatus.FAILED);
expect(useGraphStore.getState().graphExecutionStatus).toBe(
AgentExecutionStatus.FAILED,
);
expect(useGraphStore.getState().isGraphRunning).toBe(false);
});
});
describe("setGraphExecutionStatus auto-sets isGraphRunning", () => {
it("sets isGraphRunning to true for RUNNING", () => {
useGraphStore
.getState()
.setGraphExecutionStatus(AgentExecutionStatus.RUNNING);
expect(useGraphStore.getState().isGraphRunning).toBe(true);
});
it("sets isGraphRunning to true for QUEUED", () => {
useGraphStore
.getState()
.setGraphExecutionStatus(AgentExecutionStatus.QUEUED);
expect(useGraphStore.getState().isGraphRunning).toBe(true);
});
it("sets isGraphRunning to false for COMPLETED", () => {
useGraphStore
.getState()
.setGraphExecutionStatus(AgentExecutionStatus.RUNNING);
expect(useGraphStore.getState().isGraphRunning).toBe(true);
useGraphStore
.getState()
.setGraphExecutionStatus(AgentExecutionStatus.COMPLETED);
expect(useGraphStore.getState().isGraphRunning).toBe(false);
});
it("sets isGraphRunning to false for FAILED", () => {
useGraphStore
.getState()
.setGraphExecutionStatus(AgentExecutionStatus.RUNNING);
useGraphStore
.getState()
.setGraphExecutionStatus(AgentExecutionStatus.FAILED);
expect(useGraphStore.getState().isGraphRunning).toBe(false);
});
it("sets isGraphRunning to false for TERMINATED", () => {
useGraphStore
.getState()
.setGraphExecutionStatus(AgentExecutionStatus.RUNNING);
useGraphStore
.getState()
.setGraphExecutionStatus(AgentExecutionStatus.TERMINATED);
expect(useGraphStore.getState().isGraphRunning).toBe(false);
});
it("sets isGraphRunning to false for INCOMPLETE", () => {
useGraphStore
.getState()
.setGraphExecutionStatus(AgentExecutionStatus.RUNNING);
useGraphStore
.getState()
.setGraphExecutionStatus(AgentExecutionStatus.INCOMPLETE);
expect(useGraphStore.getState().isGraphRunning).toBe(false);
});
it("sets isGraphRunning to false for undefined", () => {
useGraphStore
.getState()
.setGraphExecutionStatus(AgentExecutionStatus.RUNNING);
expect(useGraphStore.getState().isGraphRunning).toBe(true);
useGraphStore.getState().setGraphExecutionStatus(undefined);
expect(useGraphStore.getState().graphExecutionStatus).toBeUndefined();
expect(useGraphStore.getState().isGraphRunning).toBe(false);
});
});
describe("setIsGraphRunning", () => {
it("sets isGraphRunning independently of status", () => {
useGraphStore.getState().setIsGraphRunning(true);
expect(useGraphStore.getState().isGraphRunning).toBe(true);
useGraphStore.getState().setIsGraphRunning(false);
expect(useGraphStore.getState().isGraphRunning).toBe(false);
});
});
describe("schema management", () => {
it("sets all three schemas via setGraphSchemas", () => {
const input = { properties: { prompt: { type: "string" } } };
const credentials = { properties: { apiKey: { type: "string" } } };
const output = { properties: { result: { type: "string" } } };
useGraphStore.getState().setGraphSchemas(input, credentials, output);
const state = useGraphStore.getState();
expect(state.inputSchema).toEqual(input);
expect(state.credentialsInputSchema).toEqual(credentials);
expect(state.outputSchema).toEqual(output);
});
it("sets schemas to null", () => {
const input = { properties: { prompt: { type: "string" } } };
useGraphStore.getState().setGraphSchemas(input, null, null);
const state = useGraphStore.getState();
expect(state.inputSchema).toEqual(input);
expect(state.credentialsInputSchema).toBeNull();
expect(state.outputSchema).toBeNull();
});
it("overwrites previous schemas", () => {
const first = { properties: { a: { type: "string" } } };
const second = { properties: { b: { type: "number" } } };
useGraphStore.getState().setGraphSchemas(first, first, first);
useGraphStore.getState().setGraphSchemas(second, null, second);
const state = useGraphStore.getState();
expect(state.inputSchema).toEqual(second);
expect(state.credentialsInputSchema).toBeNull();
expect(state.outputSchema).toEqual(second);
});
});
describe("hasInputs", () => {
it("returns false when inputSchema is null", () => {
expect(useGraphStore.getState().hasInputs()).toBe(false);
});
it("returns false when inputSchema has no properties", () => {
useGraphStore.getState().setGraphSchemas({}, null, null);
expect(useGraphStore.getState().hasInputs()).toBe(false);
});
it("returns false when inputSchema has empty properties", () => {
useGraphStore.getState().setGraphSchemas({ properties: {} }, null, null);
expect(useGraphStore.getState().hasInputs()).toBe(false);
});
it("returns true when inputSchema has properties", () => {
useGraphStore
.getState()
.setGraphSchemas(
{ properties: { prompt: { type: "string" } } },
null,
null,
);
expect(useGraphStore.getState().hasInputs()).toBe(true);
});
});
describe("hasCredentials", () => {
it("returns false when credentialsInputSchema is null", () => {
expect(useGraphStore.getState().hasCredentials()).toBe(false);
});
it("returns false when credentialsInputSchema has empty properties", () => {
useGraphStore.getState().setGraphSchemas(null, { properties: {} }, null);
expect(useGraphStore.getState().hasCredentials()).toBe(false);
});
it("returns true when credentialsInputSchema has properties", () => {
useGraphStore
.getState()
.setGraphSchemas(
null,
{ properties: { apiKey: { type: "string" } } },
null,
);
expect(useGraphStore.getState().hasCredentials()).toBe(true);
});
});
describe("hasOutputs", () => {
it("returns false when outputSchema is null", () => {
expect(useGraphStore.getState().hasOutputs()).toBe(false);
});
it("returns false when outputSchema has empty properties", () => {
useGraphStore.getState().setGraphSchemas(null, null, { properties: {} });
expect(useGraphStore.getState().hasOutputs()).toBe(false);
});
it("returns true when outputSchema has properties", () => {
useGraphStore.getState().setGraphSchemas(null, null, {
properties: { result: { type: "string" } },
});
expect(useGraphStore.getState().hasOutputs()).toBe(true);
});
});
describe("reset", () => {
it("clears execution status and schemas but preserves outputSchema and availableSubGraphs", () => {
const subGraphs: GraphMeta[] = [
createTestGraphMeta({
id: "sub-1",
name: "Sub Graph",
description: "A sub graph",
}),
];
useGraphStore
.getState()
.setGraphExecutionStatus(AgentExecutionStatus.RUNNING);
useGraphStore
.getState()
.setGraphSchemas(
{ properties: { a: {} } },
{ properties: { b: {} } },
{ properties: { c: {} } },
);
useGraphStore.getState().setAvailableSubGraphs(subGraphs);
useGraphStore.getState().reset();
const state = useGraphStore.getState();
expect(state.graphExecutionStatus).toBeUndefined();
expect(state.isGraphRunning).toBe(false);
expect(state.inputSchema).toBeNull();
expect(state.credentialsInputSchema).toBeNull();
// reset does not clear outputSchema or availableSubGraphs
expect(state.outputSchema).toEqual({ properties: { c: {} } });
expect(state.availableSubGraphs).toEqual(subGraphs);
});
it("is idempotent on fresh state", () => {
useGraphStore.getState().reset();
const state = useGraphStore.getState();
expect(state.graphExecutionStatus).toBeUndefined();
expect(state.isGraphRunning).toBe(false);
expect(state.inputSchema).toBeNull();
expect(state.credentialsInputSchema).toBeNull();
});
});
describe("setAvailableSubGraphs", () => {
it("sets sub-graphs list", () => {
const graphs: GraphMeta[] = [
createTestGraphMeta({
id: "graph-1",
name: "Graph One",
description: "First graph",
}),
createTestGraphMeta({
id: "graph-2",
version: 2,
name: "Graph Two",
description: "Second graph",
}),
];
useGraphStore.getState().setAvailableSubGraphs(graphs);
expect(useGraphStore.getState().availableSubGraphs).toEqual(graphs);
});
it("replaces previous sub-graphs", () => {
const first: GraphMeta[] = [createTestGraphMeta({ id: "a", name: "A" })];
const second: GraphMeta[] = [
createTestGraphMeta({ id: "b", name: "B" }),
createTestGraphMeta({ id: "c", name: "C" }),
];
useGraphStore.getState().setAvailableSubGraphs(first);
expect(useGraphStore.getState().availableSubGraphs).toHaveLength(1);
useGraphStore.getState().setAvailableSubGraphs(second);
expect(useGraphStore.getState().availableSubGraphs).toHaveLength(2);
expect(useGraphStore.getState().availableSubGraphs).toEqual(second);
});
it("can set empty sub-graphs list", () => {
useGraphStore
.getState()
.setAvailableSubGraphs([createTestGraphMeta({ id: "x", name: "X" })]);
useGraphStore.getState().setAvailableSubGraphs([]);
expect(useGraphStore.getState().availableSubGraphs).toEqual([]);
});
});
});

View File

@@ -0,0 +1,407 @@
import { describe, it, expect, beforeEach } from "vitest";
import { useHistoryStore } from "../stores/historyStore";
import { useNodeStore } from "../stores/nodeStore";
import { useEdgeStore } from "../stores/edgeStore";
import { CustomNode } from "../components/FlowEditor/nodes/CustomNode/CustomNode";
import { CustomEdge } from "../components/FlowEditor/edges/CustomEdge";
function createTestNode(
id: string,
overrides: Partial<CustomNode> = {},
): CustomNode {
return {
id,
type: "custom" as const,
position: { x: 0, y: 0 },
data: {
hardcodedValues: {},
title: `Node ${id}`,
description: "",
inputSchema: {},
outputSchema: {},
uiType: "STANDARD" as never,
block_id: `block-${id}`,
costs: [],
categories: [],
},
...overrides,
} as CustomNode;
}
function createTestEdge(
id: string,
source: string,
target: string,
): CustomEdge {
return {
id,
source,
target,
type: "custom" as const,
} as CustomEdge;
}
async function flushMicrotasks() {
await new Promise<void>((resolve) => queueMicrotask(resolve));
}
beforeEach(() => {
useHistoryStore.getState().clear();
useNodeStore.setState({ nodes: [] });
useEdgeStore.setState({ edges: [] });
});
describe("historyStore", () => {
describe("undo/redo single action", () => {
it("undoes a single pushed state", async () => {
const node = createTestNode("1");
// Initialize history with node present as baseline
useNodeStore.setState({ nodes: [node] });
useHistoryStore.getState().initializeHistory();
// Simulate a change: clear nodes
useNodeStore.setState({ nodes: [] });
// Undo should restore to [node]
useHistoryStore.getState().undo();
expect(useNodeStore.getState().nodes).toEqual([node]);
expect(useHistoryStore.getState().future).toHaveLength(1);
expect(useHistoryStore.getState().future[0].nodes).toEqual([]);
});
it("redoes after undo", async () => {
const node = createTestNode("1");
useNodeStore.setState({ nodes: [node] });
useHistoryStore.getState().initializeHistory();
// Change: clear nodes
useNodeStore.setState({ nodes: [] });
// Undo → back to [node]
useHistoryStore.getState().undo();
expect(useNodeStore.getState().nodes).toEqual([node]);
// Redo → back to []
useHistoryStore.getState().redo();
expect(useNodeStore.getState().nodes).toEqual([]);
});
});
describe("undo/redo multiple actions", () => {
it("undoes through multiple states in order", async () => {
const node1 = createTestNode("1");
const node2 = createTestNode("2");
const node3 = createTestNode("3");
// Initialize with [node1] as baseline
useNodeStore.setState({ nodes: [node1] });
useHistoryStore.getState().initializeHistory();
// Second change: add node2, push pre-change state
useNodeStore.setState({ nodes: [node1, node2] });
useHistoryStore.getState().pushState({ nodes: [node1], edges: [] });
await flushMicrotasks();
// Third change: add node3, push pre-change state
useNodeStore.setState({ nodes: [node1, node2, node3] });
useHistoryStore
.getState()
.pushState({ nodes: [node1, node2], edges: [] });
await flushMicrotasks();
// Undo 1: back to [node1, node2]
useHistoryStore.getState().undo();
expect(useNodeStore.getState().nodes).toEqual([node1, node2]);
// Undo 2: back to [node1]
useHistoryStore.getState().undo();
expect(useNodeStore.getState().nodes).toEqual([node1]);
});
});
describe("undo past empty history", () => {
it("does nothing when there is no history to undo", () => {
useHistoryStore.getState().undo();
expect(useNodeStore.getState().nodes).toEqual([]);
expect(useEdgeStore.getState().edges).toEqual([]);
expect(useHistoryStore.getState().past).toHaveLength(1);
});
it("does nothing when current state equals last past entry", () => {
expect(useHistoryStore.getState().canUndo()).toBe(false);
useHistoryStore.getState().undo();
expect(useHistoryStore.getState().past).toHaveLength(1);
expect(useHistoryStore.getState().future).toHaveLength(0);
});
});
describe("state consistency: undo after node add restores previous, redo restores added", () => {
it("undo removes added node, redo restores it", async () => {
const node = createTestNode("added");
useNodeStore.setState({ nodes: [node] });
useHistoryStore.getState().pushState({ nodes: [], edges: [] });
await flushMicrotasks();
useHistoryStore.getState().undo();
expect(useNodeStore.getState().nodes).toEqual([]);
useHistoryStore.getState().redo();
expect(useNodeStore.getState().nodes).toEqual([node]);
});
});
describe("history limits", () => {
it("does not grow past MAX_HISTORY (50)", async () => {
for (let i = 0; i < 60; i++) {
const node = createTestNode(`node-${i}`);
useNodeStore.setState({ nodes: [node] });
useHistoryStore.getState().pushState({
nodes: [createTestNode(`node-${i - 1}`)],
edges: [],
});
await flushMicrotasks();
}
expect(useHistoryStore.getState().past.length).toBeLessThanOrEqual(50);
});
});
describe("edge cases", () => {
it("redo does nothing when future is empty", () => {
const nodesBefore = useNodeStore.getState().nodes;
const edgesBefore = useEdgeStore.getState().edges;
useHistoryStore.getState().redo();
expect(useNodeStore.getState().nodes).toEqual(nodesBefore);
expect(useEdgeStore.getState().edges).toEqual(edgesBefore);
});
it("interleaved undo/redo sequence", async () => {
const node1 = createTestNode("1");
const node2 = createTestNode("2");
const node3 = createTestNode("3");
useNodeStore.setState({ nodes: [node1] });
useHistoryStore.getState().pushState({ nodes: [], edges: [] });
await flushMicrotasks();
useNodeStore.setState({ nodes: [node1, node2] });
useHistoryStore.getState().pushState({ nodes: [node1], edges: [] });
await flushMicrotasks();
useNodeStore.setState({ nodes: [node1, node2, node3] });
useHistoryStore.getState().pushState({
nodes: [node1, node2],
edges: [],
});
await flushMicrotasks();
useHistoryStore.getState().undo();
expect(useNodeStore.getState().nodes).toEqual([node1, node2]);
useHistoryStore.getState().undo();
expect(useNodeStore.getState().nodes).toEqual([node1]);
useHistoryStore.getState().redo();
expect(useNodeStore.getState().nodes).toEqual([node1, node2]);
useHistoryStore.getState().undo();
expect(useNodeStore.getState().nodes).toEqual([node1]);
useHistoryStore.getState().redo();
useHistoryStore.getState().redo();
expect(useNodeStore.getState().nodes).toEqual([node1, node2, node3]);
});
});
describe("canUndo / canRedo", () => {
it("canUndo is false on fresh store", () => {
expect(useHistoryStore.getState().canUndo()).toBe(false);
});
it("canUndo is true when current state differs from last past entry", async () => {
const node = createTestNode("1");
useNodeStore.setState({ nodes: [node] });
useHistoryStore.getState().pushState({ nodes: [], edges: [] });
await flushMicrotasks();
expect(useHistoryStore.getState().canUndo()).toBe(true);
});
it("canRedo is false on fresh store", () => {
expect(useHistoryStore.getState().canRedo()).toBe(false);
});
it("canRedo is true after undo", async () => {
const node = createTestNode("1");
useNodeStore.setState({ nodes: [node] });
useHistoryStore.getState().pushState({ nodes: [], edges: [] });
await flushMicrotasks();
useHistoryStore.getState().undo();
expect(useHistoryStore.getState().canRedo()).toBe(true);
});
it("canRedo becomes false after redo exhausts future", async () => {
const node = createTestNode("1");
useNodeStore.setState({ nodes: [node] });
useHistoryStore.getState().pushState({ nodes: [], edges: [] });
await flushMicrotasks();
useHistoryStore.getState().undo();
useHistoryStore.getState().redo();
expect(useHistoryStore.getState().canRedo()).toBe(false);
});
});
describe("pushState deduplication", () => {
it("does not push a state identical to the last past entry", async () => {
useHistoryStore.getState().pushState({ nodes: [], edges: [] });
await flushMicrotasks();
expect(useHistoryStore.getState().past).toHaveLength(1);
});
it("does not push if state matches current node/edge store state", async () => {
const node = createTestNode("1");
useNodeStore.setState({ nodes: [node] });
useEdgeStore.setState({ edges: [] });
useHistoryStore.getState().pushState({ nodes: [node], edges: [] });
await flushMicrotasks();
expect(useHistoryStore.getState().past).toHaveLength(1);
});
});
describe("initializeHistory", () => {
it("resets history with current node/edge store state", async () => {
const node = createTestNode("1");
const edge = createTestEdge("e1", "1", "2");
useNodeStore.setState({ nodes: [node] });
useEdgeStore.setState({ edges: [edge] });
useNodeStore.setState({ nodes: [node, createTestNode("2")] });
useHistoryStore.getState().pushState({ nodes: [node], edges: [edge] });
await flushMicrotasks();
useHistoryStore.getState().initializeHistory();
const { past, future } = useHistoryStore.getState();
expect(past).toHaveLength(1);
expect(past[0].nodes).toEqual(useNodeStore.getState().nodes);
expect(past[0].edges).toEqual(useEdgeStore.getState().edges);
expect(future).toHaveLength(0);
});
});
describe("clear", () => {
it("resets to empty initial state", async () => {
const node = createTestNode("1");
useNodeStore.setState({ nodes: [node] });
useHistoryStore.getState().pushState({ nodes: [], edges: [] });
await flushMicrotasks();
useHistoryStore.getState().clear();
const { past, future } = useHistoryStore.getState();
expect(past).toEqual([{ nodes: [], edges: [] }]);
expect(future).toEqual([]);
});
});
describe("microtask batching", () => {
it("only commits the first state when multiple pushState calls happen in the same tick", async () => {
const node1 = createTestNode("1");
const node2 = createTestNode("2");
const node3 = createTestNode("3");
useNodeStore.setState({ nodes: [node1, node2, node3] });
useHistoryStore.getState().pushState({ nodes: [node1], edges: [] });
useHistoryStore.getState().pushState({ nodes: [node2], edges: [] });
useHistoryStore
.getState()
.pushState({ nodes: [node1, node2], edges: [] });
await flushMicrotasks();
const { past } = useHistoryStore.getState();
expect(past).toHaveLength(2);
expect(past[1].nodes).toEqual([node1]);
});
it("commits separately when pushState calls are in different ticks", async () => {
const node1 = createTestNode("1");
const node2 = createTestNode("2");
useNodeStore.setState({ nodes: [node1, node2] });
useHistoryStore.getState().pushState({ nodes: [node1], edges: [] });
await flushMicrotasks();
useHistoryStore.getState().pushState({ nodes: [node2], edges: [] });
await flushMicrotasks();
const { past } = useHistoryStore.getState();
expect(past).toHaveLength(3);
expect(past[1].nodes).toEqual([node1]);
expect(past[2].nodes).toEqual([node2]);
});
});
describe("edges in undo/redo", () => {
it("restores edges on undo and redo", async () => {
const edge = createTestEdge("e1", "1", "2");
useEdgeStore.setState({ edges: [edge] });
useHistoryStore.getState().pushState({ nodes: [], edges: [] });
await flushMicrotasks();
useHistoryStore.getState().undo();
expect(useEdgeStore.getState().edges).toEqual([]);
useHistoryStore.getState().redo();
expect(useEdgeStore.getState().edges).toEqual([edge]);
});
});
describe("pushState clears future", () => {
it("clears future when a new state is pushed after undo", async () => {
const node1 = createTestNode("1");
const node2 = createTestNode("2");
const node3 = createTestNode("3");
// Initialize empty
useHistoryStore.getState().initializeHistory();
// First change: set [node1]
useNodeStore.setState({ nodes: [node1] });
// Second change: set [node1, node2], push pre-change [node1]
useNodeStore.setState({ nodes: [node1, node2] });
useHistoryStore.getState().pushState({ nodes: [node1], edges: [] });
await flushMicrotasks();
// Undo: back to [node1]
useHistoryStore.getState().undo();
expect(useHistoryStore.getState().future).toHaveLength(1);
// New diverging change: add node3 instead of node2
useNodeStore.setState({ nodes: [node1, node3] });
useHistoryStore.getState().pushState({ nodes: [node1], edges: [] });
await flushMicrotasks();
expect(useHistoryStore.getState().future).toHaveLength(0);
});
});
});

View File

@@ -0,0 +1,791 @@
import { describe, it, expect, beforeEach, vi } from "vitest";
import { useNodeStore } from "../stores/nodeStore";
import { useHistoryStore } from "../stores/historyStore";
import { useEdgeStore } from "../stores/edgeStore";
import { BlockUIType } from "../components/types";
import type { CustomNode } from "../components/FlowEditor/nodes/CustomNode/CustomNode";
import type { CustomNodeData } from "../components/FlowEditor/nodes/CustomNode/CustomNode";
import type { NodeExecutionResult } from "@/app/api/__generated__/models/nodeExecutionResult";
function createTestNode(overrides: {
id: string;
position?: { x: number; y: number };
data?: Partial<CustomNodeData>;
}): CustomNode {
const defaults: CustomNodeData = {
hardcodedValues: {},
title: "Test Block",
description: "A test block",
inputSchema: {},
outputSchema: {},
uiType: BlockUIType.STANDARD,
block_id: "test-block-id",
costs: [],
categories: [],
};
return {
id: overrides.id,
type: "custom",
position: overrides.position ?? { x: 0, y: 0 },
data: { ...defaults, ...overrides.data },
};
}
function createExecutionResult(
overrides: Partial<NodeExecutionResult> = {},
): NodeExecutionResult {
return {
node_exec_id: overrides.node_exec_id ?? "exec-1",
node_id: overrides.node_id ?? "1",
graph_exec_id: overrides.graph_exec_id ?? "graph-exec-1",
graph_id: overrides.graph_id ?? "graph-1",
graph_version: overrides.graph_version ?? 1,
user_id: overrides.user_id ?? "test-user",
block_id: overrides.block_id ?? "block-1",
status: overrides.status ?? "COMPLETED",
input_data: overrides.input_data ?? { input_key: "input_value" },
output_data: overrides.output_data ?? { output_key: ["output_value"] },
add_time: overrides.add_time ?? new Date("2024-01-01T00:00:00Z"),
queue_time: overrides.queue_time ?? new Date("2024-01-01T00:00:00Z"),
start_time: overrides.start_time ?? new Date("2024-01-01T00:00:01Z"),
end_time: overrides.end_time ?? new Date("2024-01-01T00:00:02Z"),
};
}
function resetStores() {
useNodeStore.setState({
nodes: [],
nodeCounter: 0,
nodeAdvancedStates: {},
latestNodeInputData: {},
latestNodeOutputData: {},
accumulatedNodeInputData: {},
accumulatedNodeOutputData: {},
nodesInResolutionMode: new Set(),
brokenEdgeIDs: new Map(),
nodeResolutionData: new Map(),
});
useEdgeStore.setState({ edges: [] });
useHistoryStore.setState({ past: [], future: [] });
}
describe("nodeStore", () => {
beforeEach(() => {
resetStores();
vi.restoreAllMocks();
});
describe("node lifecycle", () => {
it("starts with empty nodes", () => {
const { nodes } = useNodeStore.getState();
expect(nodes).toEqual([]);
});
it("adds a single node with addNode", () => {
const node = createTestNode({ id: "1" });
useNodeStore.getState().addNode(node);
const { nodes } = useNodeStore.getState();
expect(nodes).toHaveLength(1);
expect(nodes[0].id).toBe("1");
});
it("sets nodes with setNodes, replacing existing ones", () => {
const node1 = createTestNode({ id: "1" });
const node2 = createTestNode({ id: "2" });
useNodeStore.getState().addNode(node1);
useNodeStore.getState().setNodes([node2]);
const { nodes } = useNodeStore.getState();
expect(nodes).toHaveLength(1);
expect(nodes[0].id).toBe("2");
});
it("removes nodes via onNodesChange", () => {
const node = createTestNode({ id: "1" });
useNodeStore.getState().setNodes([node]);
useNodeStore.getState().onNodesChange([{ type: "remove", id: "1" }]);
expect(useNodeStore.getState().nodes).toHaveLength(0);
});
it("updates node data with updateNodeData", () => {
const node = createTestNode({ id: "1" });
useNodeStore.getState().addNode(node);
useNodeStore.getState().updateNodeData("1", { title: "Updated Title" });
const updated = useNodeStore.getState().nodes[0];
expect(updated.data.title).toBe("Updated Title");
expect(updated.data.block_id).toBe("test-block-id");
});
it("updateNodeData does not affect other nodes", () => {
const node1 = createTestNode({ id: "1" });
const node2 = createTestNode({
id: "2",
data: { title: "Node 2" },
});
useNodeStore.getState().setNodes([node1, node2]);
useNodeStore.getState().updateNodeData("1", { title: "Changed" });
expect(useNodeStore.getState().nodes[1].data.title).toBe("Node 2");
});
});
describe("bulk operations", () => {
it("adds multiple nodes with addNodes", () => {
const nodes = [
createTestNode({ id: "1" }),
createTestNode({ id: "2" }),
createTestNode({ id: "3" }),
];
useNodeStore.getState().addNodes(nodes);
expect(useNodeStore.getState().nodes).toHaveLength(3);
});
it("removes multiple nodes via onNodesChange", () => {
const nodes = [
createTestNode({ id: "1" }),
createTestNode({ id: "2" }),
createTestNode({ id: "3" }),
];
useNodeStore.getState().setNodes(nodes);
useNodeStore.getState().onNodesChange([
{ type: "remove", id: "1" },
{ type: "remove", id: "3" },
]);
const remaining = useNodeStore.getState().nodes;
expect(remaining).toHaveLength(1);
expect(remaining[0].id).toBe("2");
});
});
describe("nodeCounter", () => {
it("starts at zero", () => {
expect(useNodeStore.getState().nodeCounter).toBe(0);
});
it("increments the counter", () => {
useNodeStore.getState().incrementNodeCounter();
expect(useNodeStore.getState().nodeCounter).toBe(1);
useNodeStore.getState().incrementNodeCounter();
expect(useNodeStore.getState().nodeCounter).toBe(2);
});
it("sets the counter to a specific value", () => {
useNodeStore.getState().setNodeCounter(42);
expect(useNodeStore.getState().nodeCounter).toBe(42);
});
});
describe("advanced states", () => {
it("defaults to false for unknown node IDs", () => {
expect(useNodeStore.getState().getShowAdvanced("unknown")).toBe(false);
});
it("toggles advanced state", () => {
useNodeStore.getState().toggleAdvanced("node-1");
expect(useNodeStore.getState().getShowAdvanced("node-1")).toBe(true);
useNodeStore.getState().toggleAdvanced("node-1");
expect(useNodeStore.getState().getShowAdvanced("node-1")).toBe(false);
});
it("sets advanced state explicitly", () => {
useNodeStore.getState().setShowAdvanced("node-1", true);
expect(useNodeStore.getState().getShowAdvanced("node-1")).toBe(true);
useNodeStore.getState().setShowAdvanced("node-1", false);
expect(useNodeStore.getState().getShowAdvanced("node-1")).toBe(false);
});
});
describe("convertCustomNodeToBackendNode", () => {
it("converts a node with minimal data", () => {
const node = createTestNode({
id: "42",
position: { x: 100, y: 200 },
});
const backend = useNodeStore
.getState()
.convertCustomNodeToBackendNode(node);
expect(backend.id).toBe("42");
expect(backend.block_id).toBe("test-block-id");
expect(backend.input_default).toEqual({});
expect(backend.metadata).toEqual({ position: { x: 100, y: 200 } });
});
it("includes customized_name when present in metadata", () => {
const node = createTestNode({
id: "1",
data: {
metadata: { customized_name: "My Custom Name" },
},
});
const backend = useNodeStore
.getState()
.convertCustomNodeToBackendNode(node);
expect(backend.metadata).toHaveProperty(
"customized_name",
"My Custom Name",
);
});
it("includes credentials_optional when present in metadata", () => {
const node = createTestNode({
id: "1",
data: {
metadata: { credentials_optional: true },
},
});
const backend = useNodeStore
.getState()
.convertCustomNodeToBackendNode(node);
expect(backend.metadata).toHaveProperty("credentials_optional", true);
});
it("prunes empty values from hardcodedValues", () => {
const node = createTestNode({
id: "1",
data: {
hardcodedValues: { filled: "value", empty: "" },
},
});
const backend = useNodeStore
.getState()
.convertCustomNodeToBackendNode(node);
expect(backend.input_default).toEqual({ filled: "value" });
expect(backend.input_default).not.toHaveProperty("empty");
});
});
describe("getBackendNodes", () => {
it("converts all nodes to backend format", () => {
useNodeStore
.getState()
.setNodes([
createTestNode({ id: "1", position: { x: 0, y: 0 } }),
createTestNode({ id: "2", position: { x: 100, y: 100 } }),
]);
const backendNodes = useNodeStore.getState().getBackendNodes();
expect(backendNodes).toHaveLength(2);
expect(backendNodes[0].id).toBe("1");
expect(backendNodes[1].id).toBe("2");
});
});
describe("node status", () => {
it("returns undefined for a node with no status", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
expect(useNodeStore.getState().getNodeStatus("1")).toBeUndefined();
});
it("updates node status", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
useNodeStore.getState().updateNodeStatus("1", "RUNNING");
expect(useNodeStore.getState().getNodeStatus("1")).toBe("RUNNING");
useNodeStore.getState().updateNodeStatus("1", "COMPLETED");
expect(useNodeStore.getState().getNodeStatus("1")).toBe("COMPLETED");
});
it("cleans all node statuses", () => {
useNodeStore
.getState()
.setNodes([createTestNode({ id: "1" }), createTestNode({ id: "2" })]);
useNodeStore.getState().updateNodeStatus("1", "RUNNING");
useNodeStore.getState().updateNodeStatus("2", "COMPLETED");
useNodeStore.getState().cleanNodesStatuses();
expect(useNodeStore.getState().getNodeStatus("1")).toBeUndefined();
expect(useNodeStore.getState().getNodeStatus("2")).toBeUndefined();
});
it("updating status for non-existent node does not crash", () => {
useNodeStore.getState().updateNodeStatus("nonexistent", "RUNNING");
expect(
useNodeStore.getState().getNodeStatus("nonexistent"),
).toBeUndefined();
});
});
describe("execution result tracking", () => {
it("returns empty array for node with no results", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
expect(useNodeStore.getState().getNodeExecutionResults("1")).toEqual([]);
});
it("tracks a single execution result", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
const result = createExecutionResult({ node_id: "1" });
useNodeStore.getState().updateNodeExecutionResult("1", result);
const results = useNodeStore.getState().getNodeExecutionResults("1");
expect(results).toHaveLength(1);
expect(results[0].node_exec_id).toBe("exec-1");
});
it("accumulates multiple execution results", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
useNodeStore.getState().updateNodeExecutionResult(
"1",
createExecutionResult({
node_exec_id: "exec-1",
input_data: { key: "val1" },
output_data: { key: ["out1"] },
}),
);
useNodeStore.getState().updateNodeExecutionResult(
"1",
createExecutionResult({
node_exec_id: "exec-2",
input_data: { key: "val2" },
output_data: { key: ["out2"] },
}),
);
expect(useNodeStore.getState().getNodeExecutionResults("1")).toHaveLength(
2,
);
});
it("updates latest input/output data", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
useNodeStore.getState().updateNodeExecutionResult(
"1",
createExecutionResult({
node_exec_id: "exec-1",
input_data: { key: "first" },
output_data: { key: ["first_out"] },
}),
);
useNodeStore.getState().updateNodeExecutionResult(
"1",
createExecutionResult({
node_exec_id: "exec-2",
input_data: { key: "second" },
output_data: { key: ["second_out"] },
}),
);
expect(useNodeStore.getState().getLatestNodeInputData("1")).toEqual({
key: "second",
});
expect(useNodeStore.getState().getLatestNodeOutputData("1")).toEqual({
key: ["second_out"],
});
});
it("accumulates input/output data across results", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
useNodeStore.getState().updateNodeExecutionResult(
"1",
createExecutionResult({
node_exec_id: "exec-1",
input_data: { key: "val1" },
output_data: { key: ["out1"] },
}),
);
useNodeStore.getState().updateNodeExecutionResult(
"1",
createExecutionResult({
node_exec_id: "exec-2",
input_data: { key: "val2" },
output_data: { key: ["out2"] },
}),
);
const accInput = useNodeStore.getState().getAccumulatedNodeInputData("1");
expect(accInput.key).toEqual(["val1", "val2"]);
const accOutput = useNodeStore
.getState()
.getAccumulatedNodeOutputData("1");
expect(accOutput.key).toEqual(["out1", "out2"]);
});
it("deduplicates execution results by node_exec_id", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
useNodeStore.getState().updateNodeExecutionResult(
"1",
createExecutionResult({
node_exec_id: "exec-1",
input_data: { key: "original" },
output_data: { key: ["original_out"] },
}),
);
useNodeStore.getState().updateNodeExecutionResult(
"1",
createExecutionResult({
node_exec_id: "exec-1",
input_data: { key: "updated" },
output_data: { key: ["updated_out"] },
}),
);
const results = useNodeStore.getState().getNodeExecutionResults("1");
expect(results).toHaveLength(1);
expect(results[0].input_data).toEqual({ key: "updated" });
});
it("returns the latest execution result", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
useNodeStore
.getState()
.updateNodeExecutionResult(
"1",
createExecutionResult({ node_exec_id: "exec-1" }),
);
useNodeStore
.getState()
.updateNodeExecutionResult(
"1",
createExecutionResult({ node_exec_id: "exec-2" }),
);
const latest = useNodeStore.getState().getLatestNodeExecutionResult("1");
expect(latest?.node_exec_id).toBe("exec-2");
});
it("returns undefined for latest result on unknown node", () => {
expect(
useNodeStore.getState().getLatestNodeExecutionResult("unknown"),
).toBeUndefined();
});
it("clears all execution results", () => {
useNodeStore
.getState()
.setNodes([createTestNode({ id: "1" }), createTestNode({ id: "2" })]);
useNodeStore
.getState()
.updateNodeExecutionResult(
"1",
createExecutionResult({ node_exec_id: "exec-1" }),
);
useNodeStore
.getState()
.updateNodeExecutionResult(
"2",
createExecutionResult({ node_exec_id: "exec-2" }),
);
useNodeStore.getState().clearAllNodeExecutionResults();
expect(useNodeStore.getState().getNodeExecutionResults("1")).toEqual([]);
expect(useNodeStore.getState().getNodeExecutionResults("2")).toEqual([]);
expect(
useNodeStore.getState().getLatestNodeInputData("1"),
).toBeUndefined();
expect(
useNodeStore.getState().getLatestNodeOutputData("1"),
).toBeUndefined();
expect(useNodeStore.getState().getAccumulatedNodeInputData("1")).toEqual(
{},
);
expect(useNodeStore.getState().getAccumulatedNodeOutputData("1")).toEqual(
{},
);
});
it("returns empty object for accumulated data on unknown node", () => {
expect(
useNodeStore.getState().getAccumulatedNodeInputData("unknown"),
).toEqual({});
expect(
useNodeStore.getState().getAccumulatedNodeOutputData("unknown"),
).toEqual({});
});
});
describe("getNodeBlockUIType", () => {
it("returns the node UI type", () => {
useNodeStore.getState().addNode(
createTestNode({
id: "1",
data: {
uiType: BlockUIType.INPUT,
},
}),
);
expect(useNodeStore.getState().getNodeBlockUIType("1")).toBe(
BlockUIType.INPUT,
);
});
it("defaults to STANDARD for unknown node IDs", () => {
expect(useNodeStore.getState().getNodeBlockUIType("unknown")).toBe(
BlockUIType.STANDARD,
);
});
});
describe("hasWebhookNodes", () => {
it("returns false when there are no webhook nodes", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
expect(useNodeStore.getState().hasWebhookNodes()).toBe(false);
});
it("returns true when a WEBHOOK node exists", () => {
useNodeStore.getState().addNode(
createTestNode({
id: "1",
data: {
uiType: BlockUIType.WEBHOOK,
},
}),
);
expect(useNodeStore.getState().hasWebhookNodes()).toBe(true);
});
it("returns true when a WEBHOOK_MANUAL node exists", () => {
useNodeStore.getState().addNode(
createTestNode({
id: "1",
data: {
uiType: BlockUIType.WEBHOOK_MANUAL,
},
}),
);
expect(useNodeStore.getState().hasWebhookNodes()).toBe(true);
});
});
describe("node errors", () => {
it("returns undefined for a node with no errors", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
expect(useNodeStore.getState().getNodeErrors("1")).toBeUndefined();
});
it("sets and retrieves node errors", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
const errors = { field1: "required", field2: "invalid" };
useNodeStore.getState().updateNodeErrors("1", errors);
expect(useNodeStore.getState().getNodeErrors("1")).toEqual(errors);
});
it("clears errors for a specific node", () => {
useNodeStore
.getState()
.setNodes([createTestNode({ id: "1" }), createTestNode({ id: "2" })]);
useNodeStore.getState().updateNodeErrors("1", { f: "err" });
useNodeStore.getState().updateNodeErrors("2", { g: "err2" });
useNodeStore.getState().clearNodeErrors("1");
expect(useNodeStore.getState().getNodeErrors("1")).toBeUndefined();
expect(useNodeStore.getState().getNodeErrors("2")).toEqual({ g: "err2" });
});
it("clears all node errors", () => {
useNodeStore
.getState()
.setNodes([createTestNode({ id: "1" }), createTestNode({ id: "2" })]);
useNodeStore.getState().updateNodeErrors("1", { a: "err1" });
useNodeStore.getState().updateNodeErrors("2", { b: "err2" });
useNodeStore.getState().clearAllNodeErrors();
expect(useNodeStore.getState().getNodeErrors("1")).toBeUndefined();
expect(useNodeStore.getState().getNodeErrors("2")).toBeUndefined();
});
it("sets errors by backend ID matching node id", () => {
useNodeStore.getState().addNode(createTestNode({ id: "backend-1" }));
useNodeStore
.getState()
.setNodeErrorsForBackendId("backend-1", { x: "error" });
expect(useNodeStore.getState().getNodeErrors("backend-1")).toEqual({
x: "error",
});
});
});
describe("getHardCodedValues", () => {
it("returns hardcoded values for a node", () => {
useNodeStore.getState().addNode(
createTestNode({
id: "1",
data: {
hardcodedValues: { key: "value" },
},
}),
);
expect(useNodeStore.getState().getHardCodedValues("1")).toEqual({
key: "value",
});
});
it("returns empty object for unknown node", () => {
expect(useNodeStore.getState().getHardCodedValues("unknown")).toEqual({});
});
});
describe("credentials optional", () => {
it("sets credentials_optional in node metadata", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
useNodeStore.getState().setCredentialsOptional("1", true);
const node = useNodeStore.getState().nodes[0];
expect(node.data.metadata?.credentials_optional).toBe(true);
});
});
describe("resolution mode", () => {
it("defaults to not in resolution mode", () => {
expect(useNodeStore.getState().isNodeInResolutionMode("1")).toBe(false);
});
it("enters and exits resolution mode", () => {
useNodeStore.getState().setNodeResolutionMode("1", true);
expect(useNodeStore.getState().isNodeInResolutionMode("1")).toBe(true);
useNodeStore.getState().setNodeResolutionMode("1", false);
expect(useNodeStore.getState().isNodeInResolutionMode("1")).toBe(false);
});
it("tracks broken edge IDs", () => {
useNodeStore.getState().setBrokenEdgeIDs("node-1", ["edge-1", "edge-2"]);
expect(useNodeStore.getState().isEdgeBroken("edge-1")).toBe(true);
expect(useNodeStore.getState().isEdgeBroken("edge-2")).toBe(true);
expect(useNodeStore.getState().isEdgeBroken("edge-3")).toBe(false);
});
it("removes individual broken edge IDs", () => {
useNodeStore.getState().setBrokenEdgeIDs("node-1", ["edge-1", "edge-2"]);
useNodeStore.getState().removeBrokenEdgeID("node-1", "edge-1");
expect(useNodeStore.getState().isEdgeBroken("edge-1")).toBe(false);
expect(useNodeStore.getState().isEdgeBroken("edge-2")).toBe(true);
});
it("clears all resolution state", () => {
useNodeStore.getState().setNodeResolutionMode("1", true);
useNodeStore.getState().setBrokenEdgeIDs("1", ["edge-1"]);
useNodeStore.getState().clearResolutionState();
expect(useNodeStore.getState().isNodeInResolutionMode("1")).toBe(false);
expect(useNodeStore.getState().isEdgeBroken("edge-1")).toBe(false);
});
it("cleans up broken edges when exiting resolution mode", () => {
useNodeStore.getState().setNodeResolutionMode("1", true);
useNodeStore.getState().setBrokenEdgeIDs("1", ["edge-1"]);
useNodeStore.getState().setNodeResolutionMode("1", false);
expect(useNodeStore.getState().isEdgeBroken("edge-1")).toBe(false);
});
});
describe("edge cases", () => {
it("handles updating data on a non-existent node gracefully", () => {
useNodeStore
.getState()
.updateNodeData("nonexistent", { title: "New Title" });
expect(useNodeStore.getState().nodes).toHaveLength(0);
});
it("handles removing a non-existent node gracefully", () => {
useNodeStore.getState().addNode(createTestNode({ id: "1" }));
useNodeStore
.getState()
.onNodesChange([{ type: "remove", id: "nonexistent" }]);
expect(useNodeStore.getState().nodes).toHaveLength(1);
});
it("handles duplicate node IDs in addNodes", () => {
useNodeStore.getState().addNodes([
createTestNode({
id: "1",
data: { title: "First" },
}),
createTestNode({
id: "1",
data: { title: "Second" },
}),
]);
const { nodes } = useNodeStore.getState();
expect(nodes).toHaveLength(2);
expect(nodes[0].data.title).toBe("First");
expect(nodes[1].data.title).toBe("Second");
});
it("updating node status mid-execution preserves other data", () => {
useNodeStore.getState().addNode(
createTestNode({
id: "1",
data: {
title: "My Node",
hardcodedValues: { key: "val" },
},
}),
);
useNodeStore.getState().updateNodeStatus("1", "RUNNING");
const node = useNodeStore.getState().nodes[0];
expect(node.data.status).toBe("RUNNING");
expect(node.data.title).toBe("My Node");
expect(node.data.hardcodedValues).toEqual({ key: "val" });
});
it("execution result for non-existent node does not add it", () => {
useNodeStore
.getState()
.updateNodeExecutionResult(
"nonexistent",
createExecutionResult({ node_exec_id: "exec-1" }),
);
expect(useNodeStore.getState().nodes).toHaveLength(0);
expect(
useNodeStore.getState().getNodeExecutionResults("nonexistent"),
).toEqual([]);
});
it("getBackendNodes returns empty array when no nodes exist", () => {
expect(useNodeStore.getState().getBackendNodes()).toEqual([]);
});
});
});

View File

@@ -0,0 +1,567 @@
import { describe, it, expect, beforeEach, vi } from "vitest";
import { renderHook, act } from "@testing-library/react";
import { CustomNode } from "../components/FlowEditor/nodes/CustomNode/CustomNode";
import { BlockUIType } from "../components/types";
// ---- Mocks ----
const mockGetViewport = vi.fn(() => ({ x: 0, y: 0, zoom: 1 }));
vi.mock("@xyflow/react", async () => {
const actual = await vi.importActual("@xyflow/react");
return {
...actual,
useReactFlow: vi.fn(() => ({
getViewport: mockGetViewport,
})),
};
});
const mockToast = vi.fn();
vi.mock("@/components/molecules/Toast/use-toast", () => ({
useToast: vi.fn(() => ({ toast: mockToast })),
}));
let uuidCounter = 0;
vi.mock("uuid", () => ({
v4: vi.fn(() => `new-uuid-${++uuidCounter}`),
}));
// Mock navigator.clipboard
const mockWriteText = vi.fn(() => Promise.resolve());
const mockReadText = vi.fn(() => Promise.resolve(""));
Object.defineProperty(navigator, "clipboard", {
value: {
writeText: mockWriteText,
readText: mockReadText,
},
writable: true,
configurable: true,
});
// Mock window.innerWidth / innerHeight for viewport centering calculations
Object.defineProperty(window, "innerWidth", { value: 1000, writable: true });
Object.defineProperty(window, "innerHeight", { value: 800, writable: true });
import { useCopyPaste } from "../components/FlowEditor/Flow/useCopyPaste";
import { useNodeStore } from "../stores/nodeStore";
import { useEdgeStore } from "../stores/edgeStore";
import { useHistoryStore } from "../stores/historyStore";
import { CustomEdge } from "../components/FlowEditor/edges/CustomEdge";
const CLIPBOARD_PREFIX = "autogpt-flow-data:";
function createTestNode(
id: string,
overrides: Partial<CustomNode> = {},
): CustomNode {
return {
id,
type: "custom",
position: overrides.position ?? { x: 100, y: 200 },
selected: overrides.selected,
data: {
hardcodedValues: {},
title: `Node ${id}`,
description: "test node",
inputSchema: {},
outputSchema: {},
uiType: BlockUIType.STANDARD,
block_id: `block-${id}`,
costs: [],
categories: [],
...overrides.data,
},
} as CustomNode;
}
function createTestEdge(
id: string,
source: string,
target: string,
sourceHandle = "out",
targetHandle = "in",
): CustomEdge {
return {
id,
source,
target,
sourceHandle,
targetHandle,
} as CustomEdge;
}
function makeCopyEvent(): KeyboardEvent {
return new KeyboardEvent("keydown", {
key: "c",
ctrlKey: true,
bubbles: true,
});
}
function makePasteEvent(): KeyboardEvent {
return new KeyboardEvent("keydown", {
key: "v",
ctrlKey: true,
bubbles: true,
});
}
function clipboardPayload(nodes: CustomNode[], edges: CustomEdge[]): string {
return `${CLIPBOARD_PREFIX}${JSON.stringify({ nodes, edges })}`;
}
describe("useCopyPaste", () => {
beforeEach(() => {
useNodeStore.setState({ nodes: [], nodeCounter: 0 });
useEdgeStore.setState({ edges: [] });
useHistoryStore.getState().clear();
mockWriteText.mockClear();
mockReadText.mockClear();
mockToast.mockClear();
mockGetViewport.mockReturnValue({ x: 0, y: 0, zoom: 1 });
uuidCounter = 0;
// Ensure no input element is focused
if (document.activeElement && document.activeElement !== document.body) {
(document.activeElement as HTMLElement).blur();
}
});
describe("copy (Ctrl+C)", () => {
it("copies a single selected node to clipboard with prefix", async () => {
const node = createTestNode("1", { selected: true });
useNodeStore.setState({ nodes: [node] });
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makeCopyEvent());
});
await vi.waitFor(() => {
expect(mockWriteText).toHaveBeenCalledTimes(1);
});
const written = (mockWriteText.mock.calls as string[][])[0][0];
expect(written.startsWith(CLIPBOARD_PREFIX)).toBe(true);
const parsed = JSON.parse(written.slice(CLIPBOARD_PREFIX.length));
expect(parsed.nodes).toHaveLength(1);
expect(parsed.nodes[0].id).toBe("1");
expect(parsed.edges).toHaveLength(0);
});
it("shows a success toast after copying", async () => {
const node = createTestNode("1", { selected: true });
useNodeStore.setState({ nodes: [node] });
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makeCopyEvent());
});
await vi.waitFor(() => {
expect(mockToast).toHaveBeenCalledWith(
expect.objectContaining({
title: "Copied successfully",
}),
);
});
});
it("copies multiple connected nodes and preserves internal edges", async () => {
const nodeA = createTestNode("a", { selected: true });
const nodeB = createTestNode("b", { selected: true });
const nodeC = createTestNode("c", { selected: false });
useNodeStore.setState({ nodes: [nodeA, nodeB, nodeC] });
useEdgeStore.setState({
edges: [
createTestEdge("e-ab", "a", "b"),
createTestEdge("e-bc", "b", "c"),
],
});
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makeCopyEvent());
});
await vi.waitFor(() => {
expect(mockWriteText).toHaveBeenCalledTimes(1);
});
const parsed = JSON.parse(
(mockWriteText.mock.calls as string[][])[0][0].slice(
CLIPBOARD_PREFIX.length,
),
);
expect(parsed.nodes).toHaveLength(2);
expect(parsed.edges).toHaveLength(1);
expect(parsed.edges[0].id).toBe("e-ab");
});
it("drops external edges where one endpoint is not selected", async () => {
const nodeA = createTestNode("a", { selected: true });
const nodeB = createTestNode("b", { selected: false });
useNodeStore.setState({ nodes: [nodeA, nodeB] });
useEdgeStore.setState({
edges: [createTestEdge("e-ab", "a", "b")],
});
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makeCopyEvent());
});
await vi.waitFor(() => {
expect(mockWriteText).toHaveBeenCalledTimes(1);
});
const parsed = JSON.parse(
(mockWriteText.mock.calls as string[][])[0][0].slice(
CLIPBOARD_PREFIX.length,
),
);
expect(parsed.nodes).toHaveLength(1);
expect(parsed.edges).toHaveLength(0);
});
it("copies nothing when no nodes are selected", async () => {
const node = createTestNode("1", { selected: false });
useNodeStore.setState({ nodes: [node] });
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makeCopyEvent());
});
await vi.waitFor(() => {
expect(mockWriteText).toHaveBeenCalledTimes(1);
});
const parsed = JSON.parse(
(mockWriteText.mock.calls as string[][])[0][0].slice(
CLIPBOARD_PREFIX.length,
),
);
expect(parsed.nodes).toHaveLength(0);
expect(parsed.edges).toHaveLength(0);
});
});
describe("paste (Ctrl+V)", () => {
it("creates new nodes with new UUIDs", async () => {
const node = createTestNode("orig", {
selected: true,
position: { x: 100, y: 200 },
});
mockReadText.mockResolvedValue(clipboardPayload([node], []));
useNodeStore.setState({ nodes: [], nodeCounter: 0 });
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makePasteEvent());
});
await vi.waitFor(() => {
const { nodes } = useNodeStore.getState();
expect(nodes).toHaveLength(1);
});
const { nodes } = useNodeStore.getState();
expect(nodes[0].id).toBe("new-uuid-1");
expect(nodes[0].id).not.toBe("orig");
});
it("centers pasted nodes in the current viewport", async () => {
// Viewport at origin, zoom 1 => center = (500, 400)
mockGetViewport.mockReturnValue({ x: 0, y: 0, zoom: 1 });
const node = createTestNode("orig", {
selected: true,
position: { x: 100, y: 100 },
});
mockReadText.mockResolvedValue(clipboardPayload([node], []));
useNodeStore.setState({ nodes: [], nodeCounter: 0 });
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makePasteEvent());
});
await vi.waitFor(() => {
const { nodes } = useNodeStore.getState();
expect(nodes).toHaveLength(1);
});
const { nodes } = useNodeStore.getState();
// Single node: center of bounds = (100, 100)
// Viewport center = (500, 400)
// Offset = (400, 300)
// New position = (100 + 400, 100 + 300) = (500, 400)
expect(nodes[0].position).toEqual({ x: 500, y: 400 });
});
it("deselects existing nodes and selects pasted nodes", async () => {
const existingNode = createTestNode("existing", {
selected: true,
position: { x: 0, y: 0 },
});
useNodeStore.setState({ nodes: [existingNode], nodeCounter: 0 });
const nodeToPaste = createTestNode("paste-me", {
selected: false,
position: { x: 100, y: 100 },
});
mockReadText.mockResolvedValue(clipboardPayload([nodeToPaste], []));
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makePasteEvent());
});
await vi.waitFor(() => {
const { nodes } = useNodeStore.getState();
expect(nodes).toHaveLength(2);
});
const { nodes } = useNodeStore.getState();
const originalNode = nodes.find((n) => n.id === "existing");
const pastedNode = nodes.find((n) => n.id !== "existing");
expect(originalNode!.selected).toBe(false);
expect(pastedNode!.selected).toBe(true);
});
it("remaps edge source/target IDs to newly created node IDs", async () => {
const nodeA = createTestNode("a", {
selected: true,
position: { x: 0, y: 0 },
});
const nodeB = createTestNode("b", {
selected: true,
position: { x: 200, y: 0 },
});
const edge = createTestEdge("e-ab", "a", "b", "output", "input");
mockReadText.mockResolvedValue(clipboardPayload([nodeA, nodeB], [edge]));
useNodeStore.setState({ nodes: [], nodeCounter: 0 });
useEdgeStore.setState({ edges: [] });
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makePasteEvent());
});
await vi.waitFor(() => {
const { nodes } = useNodeStore.getState();
expect(nodes).toHaveLength(2);
});
// Wait for edges to be added too
await vi.waitFor(() => {
const { edges } = useEdgeStore.getState();
expect(edges).toHaveLength(1);
});
const { edges } = useEdgeStore.getState();
const newEdge = edges[0];
// Edge source/target should be remapped to new UUIDs, not "a"/"b"
expect(newEdge.source).not.toBe("a");
expect(newEdge.target).not.toBe("b");
expect(newEdge.source).toBe("new-uuid-1");
expect(newEdge.target).toBe("new-uuid-2");
expect(newEdge.sourceHandle).toBe("output");
expect(newEdge.targetHandle).toBe("input");
});
it("does nothing when clipboard does not have the expected prefix", async () => {
mockReadText.mockResolvedValue("some random text");
const existingNode = createTestNode("1", { position: { x: 0, y: 0 } });
useNodeStore.setState({ nodes: [existingNode], nodeCounter: 0 });
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makePasteEvent());
});
// Give async operations time to settle
await vi.waitFor(() => {
expect(mockReadText).toHaveBeenCalled();
});
// Ensure no state changes happen after clipboard read
await vi.waitFor(() => {
const { nodes } = useNodeStore.getState();
expect(nodes).toHaveLength(1);
expect(nodes[0].id).toBe("1");
});
});
it("does nothing when clipboard is empty", async () => {
mockReadText.mockResolvedValue("");
const existingNode = createTestNode("1", { position: { x: 0, y: 0 } });
useNodeStore.setState({ nodes: [existingNode], nodeCounter: 0 });
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makePasteEvent());
});
await vi.waitFor(() => {
expect(mockReadText).toHaveBeenCalled();
});
// Ensure no state changes happen after clipboard read
await vi.waitFor(() => {
const { nodes } = useNodeStore.getState();
expect(nodes).toHaveLength(1);
expect(nodes[0].id).toBe("1");
});
});
});
describe("input field focus guard", () => {
it("ignores Ctrl+C when an input element is focused", async () => {
const node = createTestNode("1", { selected: true });
useNodeStore.setState({ nodes: [node] });
const input = document.createElement("input");
document.body.appendChild(input);
input.focus();
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makeCopyEvent());
});
// Clipboard write should NOT be called
expect(mockWriteText).not.toHaveBeenCalled();
document.body.removeChild(input);
});
it("ignores Ctrl+V when a textarea element is focused", async () => {
mockReadText.mockResolvedValue(
clipboardPayload(
[createTestNode("a", { position: { x: 0, y: 0 } })],
[],
),
);
useNodeStore.setState({ nodes: [], nodeCounter: 0 });
const textarea = document.createElement("textarea");
document.body.appendChild(textarea);
textarea.focus();
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makePasteEvent());
});
expect(mockReadText).not.toHaveBeenCalled();
const { nodes } = useNodeStore.getState();
expect(nodes).toHaveLength(0);
document.body.removeChild(textarea);
});
it("ignores keypresses when a contenteditable element is focused", async () => {
const node = createTestNode("1", { selected: true });
useNodeStore.setState({ nodes: [node] });
const div = document.createElement("div");
div.setAttribute("contenteditable", "true");
document.body.appendChild(div);
div.focus();
const { result } = renderHook(() => useCopyPaste());
act(() => {
result.current(makeCopyEvent());
});
expect(mockWriteText).not.toHaveBeenCalled();
document.body.removeChild(div);
});
});
describe("meta key support (macOS)", () => {
it("handles Cmd+C (metaKey) the same as Ctrl+C", async () => {
const node = createTestNode("1", { selected: true });
useNodeStore.setState({ nodes: [node] });
const { result } = renderHook(() => useCopyPaste());
const metaCopyEvent = new KeyboardEvent("keydown", {
key: "c",
metaKey: true,
bubbles: true,
});
act(() => {
result.current(metaCopyEvent);
});
await vi.waitFor(() => {
expect(mockWriteText).toHaveBeenCalledTimes(1);
});
});
it("handles Cmd+V (metaKey) the same as Ctrl+V", async () => {
const node = createTestNode("orig", {
selected: true,
position: { x: 0, y: 0 },
});
mockReadText.mockResolvedValue(clipboardPayload([node], []));
useNodeStore.setState({ nodes: [], nodeCounter: 0 });
const { result } = renderHook(() => useCopyPaste());
const metaPasteEvent = new KeyboardEvent("keydown", {
key: "v",
metaKey: true,
bubbles: true,
});
act(() => {
result.current(metaPasteEvent);
});
await vi.waitFor(() => {
const { nodes } = useNodeStore.getState();
expect(nodes).toHaveLength(1);
});
});
});
});

View File

@@ -0,0 +1,134 @@
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { renderHook, act } from "@testing-library/react";
const mockScreenToFlowPosition = vi.fn((pos: { x: number; y: number }) => pos);
const mockFitView = vi.fn();
vi.mock("@xyflow/react", async () => {
const actual = await vi.importActual("@xyflow/react");
return {
...actual,
useReactFlow: () => ({
screenToFlowPosition: mockScreenToFlowPosition,
fitView: mockFitView,
}),
};
});
const mockSetQueryStates = vi.fn();
let mockQueryStateValues: {
flowID: string | null;
flowVersion: number | null;
flowExecutionID: string | null;
} = {
flowID: null,
flowVersion: null,
flowExecutionID: null,
};
vi.mock("nuqs", () => ({
parseAsString: {},
parseAsInteger: {},
useQueryStates: vi.fn(() => [mockQueryStateValues, mockSetQueryStates]),
}));
let mockGraphLoading = false;
let mockBlocksLoading = false;
vi.mock("@/app/api/__generated__/endpoints/graphs/graphs", () => ({
useGetV1GetSpecificGraph: vi.fn(() => ({
data: undefined,
isLoading: mockGraphLoading,
})),
useGetV1GetExecutionDetails: vi.fn(() => ({
data: undefined,
})),
useGetV1ListUserGraphs: vi.fn(() => ({
data: undefined,
})),
}));
vi.mock("@/app/api/__generated__/endpoints/default/default", () => ({
useGetV2GetSpecificBlocks: vi.fn(() => ({
data: undefined,
isLoading: mockBlocksLoading,
})),
}));
vi.mock("@/app/api/helpers", () => ({
okData: (res: { data: unknown }) => res?.data,
}));
vi.mock("../components/helper", () => ({
convertNodesPlusBlockInfoIntoCustomNodes: vi.fn(),
}));
describe("useFlow", () => {
beforeEach(() => {
vi.clearAllMocks();
vi.useFakeTimers({ shouldAdvanceTime: true });
mockGraphLoading = false;
mockBlocksLoading = false;
mockQueryStateValues = {
flowID: null,
flowVersion: null,
flowExecutionID: null,
};
});
afterEach(() => {
vi.useRealTimers();
});
describe("loading states", () => {
it("returns isFlowContentLoading true when graph is loading", async () => {
mockGraphLoading = true;
mockQueryStateValues = {
flowID: "test-flow",
flowVersion: 1,
flowExecutionID: null,
};
const { useFlow } = await import("../components/FlowEditor/Flow/useFlow");
const { result } = renderHook(() => useFlow());
expect(result.current.isFlowContentLoading).toBe(true);
});
it("returns isFlowContentLoading true when blocks are loading", async () => {
mockBlocksLoading = true;
mockQueryStateValues = {
flowID: "test-flow",
flowVersion: 1,
flowExecutionID: null,
};
const { useFlow } = await import("../components/FlowEditor/Flow/useFlow");
const { result } = renderHook(() => useFlow());
expect(result.current.isFlowContentLoading).toBe(true);
});
it("returns isFlowContentLoading false when neither is loading", async () => {
const { useFlow } = await import("../components/FlowEditor/Flow/useFlow");
const { result } = renderHook(() => useFlow());
expect(result.current.isFlowContentLoading).toBe(false);
});
});
describe("initial load completion", () => {
it("marks initial load complete for new flows without flowID", async () => {
const { useFlow } = await import("../components/FlowEditor/Flow/useFlow");
const { result } = renderHook(() => useFlow());
expect(result.current.isInitialLoadComplete).toBe(false);
await act(async () => {
vi.advanceTimersByTime(300);
});
expect(result.current.isInitialLoadComplete).toBe(true);
});
});
});

View File

@@ -1,8 +1,14 @@
"use client";
import {
DropdownMenu,
DropdownMenuContent,
DropdownMenuItem,
DropdownMenuTrigger,
} from "@/components/molecules/DropdownMenu/DropdownMenu";
import { SidebarProvider } from "@/components/ui/sidebar";
import { cn } from "@/lib/utils";
import { UploadSimple } from "@phosphor-icons/react";
import { DotsThree, UploadSimple } from "@phosphor-icons/react";
import { useCallback, useRef, useState } from "react";
import { ChatContainer } from "./components/ChatContainer/ChatContainer";
import { ChatSidebar } from "./components/ChatSidebar/ChatSidebar";
@@ -86,6 +92,7 @@ export function CopilotPage() {
// Delete functionality
sessionToDelete,
isDeleting,
handleDeleteClick,
handleConfirmDelete,
handleCancelDelete,
} = useCopilotPage();
@@ -141,6 +148,38 @@ export function CopilotPage() {
isUploadingFiles={isUploadingFiles}
droppedFiles={droppedFiles}
onDroppedFilesConsumed={handleDroppedFilesConsumed}
headerSlot={
isMobile && sessionId ? (
<div className="flex justify-end">
<DropdownMenu>
<DropdownMenuTrigger asChild>
<button
className="rounded p-1.5 hover:bg-neutral-100"
aria-label="More actions"
>
<DotsThree className="h-5 w-5 text-neutral-600" />
</button>
</DropdownMenuTrigger>
<DropdownMenuContent align="end">
<DropdownMenuItem
onClick={() => {
const session = sessions.find(
(s) => s.id === sessionId,
);
if (session) {
handleDeleteClick(session.id, session.title);
}
}}
disabled={isDeleting}
className="text-red-600 focus:bg-red-50 focus:text-red-600"
>
Delete chat
</DropdownMenuItem>
</DropdownMenuContent>
</DropdownMenu>
</div>
) : undefined
}
/>
</div>
</div>

View File

@@ -2,6 +2,7 @@
import { ChatInput } from "@/app/(platform)/copilot/components/ChatInput/ChatInput";
import { UIDataTypes, UIMessage, UITools } from "ai";
import { LayoutGroup, motion } from "framer-motion";
import { ReactNode } from "react";
import { ChatMessagesContainer } from "../ChatMessagesContainer/ChatMessagesContainer";
import { CopilotChatActionsProvider } from "../CopilotChatActionsProvider/CopilotChatActionsProvider";
import { EmptySession } from "../EmptySession/EmptySession";
@@ -20,6 +21,7 @@ export interface ChatContainerProps {
onSend: (message: string, files?: File[]) => void | Promise<void>;
onStop: () => void;
isUploadingFiles?: boolean;
headerSlot?: ReactNode;
/** Files dropped onto the chat window. */
droppedFiles?: File[];
/** Called after droppedFiles have been consumed by ChatInput. */
@@ -38,6 +40,7 @@ export const ChatContainer = ({
onSend,
onStop,
isUploadingFiles,
headerSlot,
droppedFiles,
onDroppedFilesConsumed,
}: ChatContainerProps) => {
@@ -60,6 +63,7 @@ export const ChatContainer = ({
status={status}
error={error}
isLoading={isLoadingSession}
headerSlot={headerSlot}
sessionID={sessionId}
/>
<motion.div

View File

@@ -1,3 +1,4 @@
import { useCopilotUIStore } from "@/app/(platform)/copilot/store";
import { ChangeEvent, FormEvent, useEffect, useState } from "react";
interface Args {
@@ -16,6 +17,16 @@ export function useChatInput({
}: Args) {
const [value, setValue] = useState("");
const [isSending, setIsSending] = useState(false);
const { initialPrompt, setInitialPrompt } = useCopilotUIStore();
useEffect(
function consumeInitialPrompt() {
if (!initialPrompt) return;
setValue((prev) => (prev.length === 0 ? initialPrompt : prev));
setInitialPrompt(null);
},
[initialPrompt, setInitialPrompt],
);
useEffect(
function focusOnMount() {

View File

@@ -30,6 +30,7 @@ interface Props {
status: string;
error: Error | undefined;
isLoading: boolean;
headerSlot?: React.ReactNode;
sessionID?: string | null;
}
@@ -101,6 +102,7 @@ export function ChatMessagesContainer({
status,
error,
isLoading,
headerSlot,
sessionID,
}: Props) {
const lastMessage = messages[messages.length - 1];
@@ -133,6 +135,7 @@ export function ChatMessagesContainer({
return (
<Conversation className="min-h-0 flex-1">
<ConversationContent className="flex flex-1 flex-col gap-6 px-3 py-6">
{headerSlot}
{isLoading && messages.length === 0 && (
<div
className="flex flex-1 items-center justify-center"

View File

@@ -37,7 +37,6 @@ import { useCopilotUIStore } from "../../store";
import { NotificationToggle } from "./components/NotificationToggle/NotificationToggle";
import { DeleteChatDialog } from "../DeleteChatDialog/DeleteChatDialog";
import { PulseLoader } from "../PulseLoader/PulseLoader";
import { UsageLimits } from "../UsageLimits/UsageLimits";
export function ChatSidebar() {
const { state } = useSidebar();
@@ -257,10 +256,11 @@ export function ChatSidebar() {
<Text variant="h3" size="body-medium">
Your chats
</Text>
<div className="flex items-center">
<UsageLimits />
<div className="relative left-5 flex items-center gap-1">
<NotificationToggle />
<SidebarTrigger />
<div className="relative left-1">
<SidebarTrigger />
</div>
</div>
</div>
{sessionId ? (

View File

@@ -7,7 +7,6 @@ import {
PopoverTrigger,
} from "@/components/molecules/Popover/Popover";
import { toast } from "@/components/molecules/Toast/use-toast";
import { Button } from "@/components/ui/button";
import { cn } from "@/lib/utils";
import { Bell, BellRinging, BellSlash } from "@phosphor-icons/react";
import { useCopilotUIStore } from "../../../../store";
@@ -49,7 +48,10 @@ export function NotificationToggle() {
return (
<Popover>
<PopoverTrigger asChild>
<Button variant="ghost" size="icon" aria-label="Notification settings">
<button
className="rounded p-1 text-black transition-colors hover:bg-zinc-50"
aria-label="Notification settings"
>
{!isNotificationsEnabled ? (
<BellSlash className="!size-5" />
) : isSoundEnabled ? (
@@ -57,7 +59,7 @@ export function NotificationToggle() {
) : (
<Bell className="!size-5" />
)}
</Button>
</button>
</PopoverTrigger>
<PopoverContent align="start" className="w-56 p-3">
<div className="flex flex-col gap-3">

View File

@@ -5,7 +5,7 @@ const TOOL_TO_CATEGORY: Record<string, string> = {
find_agent: "search",
find_library_agent: "search",
run_agent: "agent run",
run_block: "block run",
run_block: "action",
create_agent: "agent created",
edit_agent: "agent edited",
schedule_agent: "agent scheduled",

View File

@@ -1,146 +0,0 @@
import type { CoPilotUsageStatus } from "@/app/api/__generated__/models/coPilotUsageStatus";
import {
Popover,
PopoverContent,
PopoverTrigger,
} from "@/components/molecules/Popover/Popover";
import { Button } from "@/components/ui/button";
import { ChartBar } from "@phosphor-icons/react";
import { useUsageLimits } from "./useUsageLimits";
const MS_PER_MINUTE = 60_000;
const MS_PER_HOUR = 3_600_000;
function formatResetTime(resetsAt: Date | string): string {
const resetDate =
typeof resetsAt === "string" ? new Date(resetsAt) : resetsAt;
const now = new Date();
const diffMs = resetDate.getTime() - now.getTime();
if (diffMs <= 0) return "now";
const hours = Math.floor(diffMs / MS_PER_HOUR);
// Under 24h: show relative time ("in 4h 23m")
if (hours < 24) {
const minutes = Math.floor((diffMs % MS_PER_HOUR) / MS_PER_MINUTE);
if (hours > 0) return `in ${hours}h ${minutes}m`;
return `in ${minutes}m`;
}
// Over 24h: show day and time in local timezone ("Mon 12:00 AM PST")
return resetDate.toLocaleString(undefined, {
weekday: "short",
hour: "numeric",
minute: "2-digit",
timeZoneName: "short",
});
}
function UsageBar({
label,
used,
limit,
resetsAt,
}: {
label: string;
used: number;
limit: number;
resetsAt: Date | string;
}) {
if (limit <= 0) return null;
const rawPercent = (used / limit) * 100;
const percent = Math.min(100, Math.round(rawPercent));
const isHigh = percent >= 80;
const percentLabel =
used > 0 && percent === 0 ? "<1% used" : `${percent}% used`;
return (
<div className="flex flex-col gap-1">
<div className="flex items-baseline justify-between">
<span className="text-xs font-medium text-neutral-700">{label}</span>
<span className="text-[11px] tabular-nums text-neutral-500">
{percentLabel}
</span>
</div>
<div className="text-[10px] text-neutral-400">
Resets {formatResetTime(resetsAt)}
</div>
<div className="h-2 w-full overflow-hidden rounded-full bg-neutral-200">
<div
className={`h-full rounded-full transition-[width] duration-300 ease-out ${
isHigh ? "bg-orange-500" : "bg-blue-500"
}`}
style={{ width: `${Math.max(used > 0 ? 1 : 0, percent)}%` }}
/>
</div>
</div>
);
}
export function UsagePanelContent({
usage,
showBillingLink = true,
}: {
usage: CoPilotUsageStatus;
showBillingLink?: boolean;
}) {
const hasDailyLimit = usage.daily.limit > 0;
const hasWeeklyLimit = usage.weekly.limit > 0;
if (!hasDailyLimit && !hasWeeklyLimit) {
return (
<div className="text-xs text-neutral-500">No usage limits configured</div>
);
}
return (
<div className="flex flex-col gap-3">
<div className="text-xs font-semibold text-neutral-800">Usage limits</div>
{hasDailyLimit && (
<UsageBar
label="Today"
used={usage.daily.used}
limit={usage.daily.limit}
resetsAt={usage.daily.resets_at}
/>
)}
{hasWeeklyLimit && (
<UsageBar
label="This week"
used={usage.weekly.used}
limit={usage.weekly.limit}
resetsAt={usage.weekly.resets_at}
/>
)}
{showBillingLink && (
<a
href="/profile/credits"
className="text-[11px] text-blue-600 hover:underline"
>
Learn more about usage limits
</a>
)}
</div>
);
}
export function UsageLimits() {
const { data: usage, isLoading } = useUsageLimits();
if (isLoading || !usage) return null;
if (usage.daily.limit <= 0 && usage.weekly.limit <= 0) return null;
return (
<Popover>
<PopoverTrigger asChild>
<Button variant="ghost" size="icon" aria-label="Usage limits">
<ChartBar className="!size-5" weight="light" />
</Button>
</PopoverTrigger>
<PopoverContent align="start" className="w-64 p-3">
<UsagePanelContent usage={usage} />
</PopoverContent>
</Popover>
);
}

View File

@@ -1,121 +0,0 @@
import { render, screen, cleanup } from "@/tests/integrations/test-utils";
import { afterEach, describe, expect, it, vi } from "vitest";
import { UsageLimits } from "../UsageLimits";
// Mock the useUsageLimits hook
const mockUseUsageLimits = vi.fn();
vi.mock("../useUsageLimits", () => ({
useUsageLimits: () => mockUseUsageLimits(),
}));
// Mock Popover to render children directly (Radix portals don't work in happy-dom)
vi.mock("@/components/molecules/Popover/Popover", () => ({
Popover: ({ children }: { children: React.ReactNode }) => (
<div>{children}</div>
),
PopoverTrigger: ({ children }: { children: React.ReactNode }) => (
<div>{children}</div>
),
PopoverContent: ({ children }: { children: React.ReactNode }) => (
<div>{children}</div>
),
}));
afterEach(() => {
cleanup();
mockUseUsageLimits.mockReset();
});
function makeUsage({
dailyUsed = 500,
dailyLimit = 10000,
weeklyUsed = 2000,
weeklyLimit = 50000,
}: {
dailyUsed?: number;
dailyLimit?: number;
weeklyUsed?: number;
weeklyLimit?: number;
} = {}) {
const future = new Date(Date.now() + 3600 * 1000); // 1h from now
return {
daily: { used: dailyUsed, limit: dailyLimit, resets_at: future },
weekly: { used: weeklyUsed, limit: weeklyLimit, resets_at: future },
};
}
describe("UsageLimits", () => {
it("renders nothing while loading", () => {
mockUseUsageLimits.mockReturnValue({ data: undefined, isLoading: true });
const { container } = render(<UsageLimits />);
expect(container.innerHTML).toBe("");
});
it("renders nothing when no limits are configured", () => {
mockUseUsageLimits.mockReturnValue({
data: makeUsage({ dailyLimit: 0, weeklyLimit: 0 }),
isLoading: false,
});
const { container } = render(<UsageLimits />);
expect(container.innerHTML).toBe("");
});
it("renders the usage button when limits exist", () => {
mockUseUsageLimits.mockReturnValue({
data: makeUsage(),
isLoading: false,
});
render(<UsageLimits />);
expect(screen.getByRole("button", { name: /usage limits/i })).toBeDefined();
});
it("displays daily and weekly usage percentages", () => {
mockUseUsageLimits.mockReturnValue({
data: makeUsage({ dailyUsed: 5000, dailyLimit: 10000 }),
isLoading: false,
});
render(<UsageLimits />);
expect(screen.getByText("50% used")).toBeDefined();
expect(screen.getByText("Today")).toBeDefined();
expect(screen.getByText("This week")).toBeDefined();
expect(screen.getByText("Usage limits")).toBeDefined();
});
it("shows only weekly bar when daily limit is 0", () => {
mockUseUsageLimits.mockReturnValue({
data: makeUsage({
dailyLimit: 0,
weeklyUsed: 25000,
weeklyLimit: 50000,
}),
isLoading: false,
});
render(<UsageLimits />);
expect(screen.getByText("This week")).toBeDefined();
expect(screen.queryByText("Today")).toBeNull();
});
it("caps percentage at 100% when over limit", () => {
mockUseUsageLimits.mockReturnValue({
data: makeUsage({ dailyUsed: 15000, dailyLimit: 10000 }),
isLoading: false,
});
render(<UsageLimits />);
expect(screen.getByText("100% used")).toBeDefined();
});
it("shows learn more link to credits page", () => {
mockUseUsageLimits.mockReturnValue({
data: makeUsage(),
isLoading: false,
});
render(<UsageLimits />);
const link = screen.getByText("Learn more about usage limits");
expect(link).toBeDefined();
expect(link.closest("a")?.getAttribute("href")).toBe("/profile/credits");
});
});

View File

@@ -1,12 +0,0 @@
import type { CoPilotUsageStatus } from "@/app/api/__generated__/models/coPilotUsageStatus";
import { useGetV2GetCopilotUsage } from "@/app/api/__generated__/endpoints/chat/chat";
export function useUsageLimits() {
return useGetV2GetCopilotUsage({
query: {
select: (res) => res.data as CoPilotUsageStatus,
refetchInterval: 30000,
staleTime: 10000,
},
});
}

View File

@@ -7,6 +7,10 @@ export interface DeleteTarget {
}
interface CopilotUIState {
/** Prompt extracted from URL hash (e.g. /copilot#prompt=...) for input prefill. */
initialPrompt: string | null;
setInitialPrompt: (prompt: string | null) => void;
sessionToDelete: DeleteTarget | null;
setSessionToDelete: (target: DeleteTarget | null) => void;
@@ -31,6 +35,9 @@ interface CopilotUIState {
}
export const useCopilotUIStore = create<CopilotUIState>((set) => ({
initialPrompt: null,
setInitialPrompt: (prompt) => set({ initialPrompt: prompt }),
sessionToDelete: null,
setSessionToDelete: (target) => set({ sessionToDelete: target }),

View File

@@ -706,8 +706,8 @@ export default function StyleguidePage() {
input: { block_id: "weather-block-123" },
output: {
type: ResponseType.error,
message: "Failed to run the block.",
error: "Block execution timed out after 30 seconds.",
message: "Something went wrong while running this step.",
error: "Execution timed out after 30 seconds.",
details: {
block_id: "weather-block-123",
timeout_ms: 30000,

View File

@@ -61,7 +61,7 @@ export function FindBlocksTool({ part }: Props) {
const query = (part.input as FindBlockInput | undefined)?.query?.trim();
const accordionDescription = parsed
? `Found ${parsed.count} block${parsed.count === 1 ? "" : "s"}${query ? ` for "${query}"` : ""}`
? `Found ${parsed.count} action${parsed.count === 1 ? "" : "s"}${query ? ` for "${query}"` : ""}`
: undefined;
return (
@@ -77,7 +77,7 @@ export function FindBlocksTool({ part }: Props) {
{hasBlocks && parsed && (
<ToolAccordion
icon={<AccordionIcon />}
title="Block results"
title="Results"
description={accordionDescription}
>
<HorizontalScroll dependencyList={[parsed.blocks.length]}>

View File

@@ -30,21 +30,21 @@ export function getAnimationText(part: FindBlockToolPart): string {
switch (part.state) {
case "input-streaming":
case "input-available":
return `Searching for blocks${queryText}`;
return `Searching for actions${queryText}`;
case "output-available": {
const parsed = parseOutput(part.output);
if (parsed) {
return `Found ${parsed.count} block${parsed.count === 1 ? "" : "s"}${queryText}`;
return `Found ${parsed.count} action${parsed.count === 1 ? "" : "s"}${queryText}`;
}
return `Searching for blocks${queryText}`;
return `Searching for actions${queryText}`;
}
case "output-error":
return `Error finding blocks${queryText}`;
return `Search failed${query ? ` for "${query}"` : ""}`;
default:
return "Searching for blocks";
return "Searching for actions";
}
}

View File

@@ -144,6 +144,23 @@ export function truncate(text: string, maxLen: number): string {
return text.slice(0, maxLen).trimEnd() + "\u2026";
}
const STRIPPABLE_EXTENSIONS =
/\.(md|csv|json|txt|yaml|yml|xml|html|js|ts|py|sh|toml|cfg|ini|log|pdf|png|jpg|jpeg|gif|svg|mp4|mp3|wav|zip|tar|gz)$/i;
export function humanizeFileName(filePath: string): string {
const fileName = filePath.split("/").pop() ?? filePath;
const stem = fileName.replace(STRIPPABLE_EXTENSIONS, "");
const words = stem
.replace(/[_-]/g, " ")
.split(/\s+/)
.filter(Boolean)
.map((w) => {
if (w === w.toUpperCase()) return w;
return w.charAt(0).toUpperCase() + w.slice(1).toLowerCase();
});
return `"${words.join(" ")}"`;
}
/* ------------------------------------------------------------------ */
/* Exit code helper */
/* ------------------------------------------------------------------ */
@@ -191,16 +208,16 @@ export function getAnimationText(
? `Browsing ${shortSummary}`
: "Interacting with browser\u2026";
case "file-read":
return shortSummary
? `Reading ${shortSummary}`
return summary
? `Reading ${humanizeFileName(summary)}`
: "Reading file\u2026";
case "file-write":
return shortSummary
? `Writing ${shortSummary}`
return summary
? `Writing ${humanizeFileName(summary)}`
: "Writing file\u2026";
case "file-delete":
return shortSummary
? `Deleting ${shortSummary}`
return summary
? `Deleting ${humanizeFileName(summary)}`
: "Deleting file\u2026";
case "file-list":
return shortSummary
@@ -211,8 +228,8 @@ export function getAnimationText(
? `Searching for "${shortSummary}"`
: "Searching\u2026";
case "edit":
return shortSummary
? `Editing ${shortSummary}`
return summary
? `Editing ${humanizeFileName(summary)}`
: "Editing file\u2026";
case "todo":
return shortSummary ? `${shortSummary}` : "Updating task list\u2026";
@@ -246,11 +263,17 @@ export function getAnimationText(
? `Browsed ${shortSummary}`
: "Browser action completed";
case "file-read":
return shortSummary ? `Read ${shortSummary}` : "File read completed";
return summary
? `Read ${humanizeFileName(summary)}`
: "File read completed";
case "file-write":
return shortSummary ? `Wrote ${shortSummary}` : "File written";
return summary
? `Wrote ${humanizeFileName(summary)}`
: "File written";
case "file-delete":
return shortSummary ? `Deleted ${shortSummary}` : "File deleted";
return summary
? `Deleted ${humanizeFileName(summary)}`
: "File deleted";
case "file-list":
return "Listed files";
case "search":
@@ -258,7 +281,9 @@ export function getAnimationText(
? `Searched for "${shortSummary}"`
: "Search completed";
case "edit":
return shortSummary ? `Edited ${shortSummary}` : "Edit completed";
return summary
? `Edited ${humanizeFileName(summary)}`
: "Edit completed";
case "todo":
return "Updated task list";
case "compaction":

View File

@@ -149,10 +149,10 @@ export function getAnimationText(part: {
}
if (isRunAgentNeedLoginOutput(output))
return "Sign in required to run agent";
return "Error running agent";
return "Something went wrong";
}
case "output-error":
return "Error running agent";
return "Something went wrong";
default:
return actionPhrase;
}

View File

@@ -18,10 +18,10 @@ import {
interface Props {
output: SetupRequirementsResponse;
/** Override the message sent to the chat when the user clicks Proceed after connecting credentials.
* Defaults to "Please re-run the block now." */
* Defaults to "Please re-run this step now." */
retryInstruction?: string;
/** Override the label shown above the credentials section.
* Defaults to "Block credentials". */
* Defaults to "Credentials". */
credentialsLabel?: string;
}
@@ -87,11 +87,9 @@ export function SetupRequirementsCard({
([, v]) => v !== undefined && v !== null && v !== "",
),
);
parts.push(
`Run the block with these inputs: ${JSON.stringify(nonEmpty, null, 2)}`,
);
parts.push(`Run with these inputs: ${JSON.stringify(nonEmpty, null, 2)}`);
} else {
parts.push(retryInstruction ?? "Please re-run the block now.");
parts.push(retryInstruction ?? "Please re-run this step now.");
}
onSend(parts.join(" "));
@@ -105,7 +103,7 @@ export function SetupRequirementsCard({
{needsCredentials && (
<div className="rounded-2xl border bg-background p-3">
<Text variant="small" className="w-fit border-b text-zinc-500">
{credentialsLabel ?? "Block credentials"}
{credentialsLabel ?? "Credentials"}
</Text>
<div className="mt-6">
<CredentialsGroupedView
@@ -122,7 +120,7 @@ export function SetupRequirementsCard({
{inputSchema && (
<div className="rounded-2xl border bg-background p-3 pt-4">
<Text variant="small" className="w-fit border-b text-zinc-500">
Block inputs
Inputs
</Text>
<FormRenderer
jsonSchema={inputSchema}

View File

@@ -165,12 +165,12 @@ export function getAnimationText(part: {
if (isRunBlockReviewRequiredOutput(output)) {
return `Review needed for "${output.block_name}"`;
}
return "Error running block";
return "Action failed";
}
case "output-error":
return "Error running block";
return "Action failed";
default:
return "Running the block";
return "Running";
}
}

View File

@@ -19,6 +19,42 @@ import { useCopilotStream } from "./useCopilotStream";
const TITLE_POLL_INTERVAL_MS = 2_000;
const TITLE_POLL_MAX_ATTEMPTS = 5;
/**
* Extract a prompt from the URL hash fragment.
* Supports: /copilot#prompt=URL-encoded-text
* Optionally auto-submits if ?autosubmit=true is in the query string.
* Returns null if no prompt is present.
*/
function extractPromptFromUrl(): {
prompt: string;
autosubmit: boolean;
} | null {
if (typeof window === "undefined") return null;
const hash = window.location.hash;
if (!hash) return null;
const hashParams = new URLSearchParams(hash.slice(1));
const prompt = hashParams.get("prompt");
if (!prompt || !prompt.trim()) return null;
const searchParams = new URLSearchParams(window.location.search);
const autosubmit = searchParams.get("autosubmit") === "true";
// Clean up hash + autosubmit param only (preserve other query params)
const cleanURL = new URL(window.location.href);
cleanURL.hash = "";
cleanURL.searchParams.delete("autosubmit");
window.history.replaceState(
null,
"",
`${cleanURL.pathname}${cleanURL.search}`,
);
return { prompt: prompt.trim(), autosubmit };
}
interface UploadedFile {
file_id: string;
name: string;
@@ -127,6 +163,28 @@ export function useCopilotPage() {
}
}, [sessionId, pendingMessage, sendMessage]);
// --- Extract prompt from URL hash on mount (e.g. /copilot#prompt=Hello) ---
const { setInitialPrompt } = useCopilotUIStore();
const hasProcessedUrlPrompt = useRef(false);
useEffect(() => {
if (hasProcessedUrlPrompt.current) return;
const urlPrompt = extractPromptFromUrl();
if (!urlPrompt) return;
hasProcessedUrlPrompt.current = true;
if (urlPrompt.autosubmit) {
setPendingMessage(urlPrompt.prompt);
void createSession().catch(() => {
setPendingMessage(null);
setInitialPrompt(urlPrompt.prompt);
});
} else {
setInitialPrompt(urlPrompt.prompt);
}
}, [createSession, setInitialPrompt]);
async function uploadFiles(
files: File[],
sid: string,

Some files were not shown because too many files have changed in this diff Show More