Compare commits

...

86 Commits

Author SHA1 Message Date
An Vy Le
b42d087003 Merge branch 'dev' into feat/task-decomposition-copilot 2026-04-30 07:32:57 +02:00
Zamil Majdy
4a1741cc15 fix(platform): cancel-banner copy + clearer 422 on currency mismatch (#12947)
## Why

Two regressions surfaced after
[#12933](https://github.com/Significant-Gravitas/AutoGPT/pull/12933)
merged to `dev`:

1. **Cancel-pending banner shows wrong copy.** The merged PR moved
cancel-at-period-end from `BASIC` → `NO_TIER`, but
`PendingChangeBanner.isCancellation` was still keyed on `"BASIC"`. As a
result, a user who cancels their sub now sees *"Scheduled to downgrade
to No subscription on …"* instead of the intended *"Scheduled to cancel
your subscription on …"*. Caught by Sentry on the merged PR.

2. **Currency-mismatch downgrade returns 502 (looks like outage).** A
user with an existing GBP-active sub (Max Price has
`currency_options.gbp`) tried to downgrade to Pro and got 502. The
backend logs show:
   ```
   stripe._error.InvalidRequestError: The price specified only supports `usd`.
   This doesn't match the expected currency: `gbp`.
   ```
The Pro Price is USD-only; Stripe rejects `SubscriptionSchedule.modify`
because phases must share currency. Wrapping that in a generic 502 hid
the real cause and made it read like a Stripe outage.

## What

* Frontend: flip `PendingChangeBanner.isCancellation` from `pendingTier
=== "BASIC"` to `"NO_TIER"`. Update both component and page-level tests
that exercised the cancellation branch.
* Backend: catch `stripe.InvalidRequestError` whose message mentions
`currency` in `update_subscription_tier`, and return **422** with *"Tier
change unavailable for your current billing currency. Cancel your
subscription and re-subscribe at the target tier, or contact support."*
— so users see the actual reason, not a misleading outage message. Other
`StripeError` paths still return 502.
* New backend test asserts the currency-mismatch branch returns 422 with
the new copy.

## How

* `PendingChangeBanner.tsx` line 28: one-line change (`"BASIC"` →
`"NO_TIER"`).
* `subscription_routes_test.py` and `PendingChangeBanner.test.tsx`
updated to use `NO_TIER` for the cancellation fixture.
* `v1.py` `update_subscription_tier` adds a typed `except
stripe.InvalidRequestError` branch ahead of the generic `StripeError`;
only currency-mismatch messages get the special 422, everything else
falls through to the existing 502.
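Because `stripe.InvalidRequestError` subclasses `stripe.StripeError`, the typed branch must come first or it is unreachable. A minimal sketch of the mapping, using stand-in exception classes and a hypothetical helper name so it runs without the Stripe SDK:

```python
# Stand-ins mirroring stripe's hierarchy (InvalidRequestError subclasses StripeError).
class StripeError(Exception):
    pass

class InvalidRequestError(StripeError):
    pass

CURRENCY_MISMATCH_COPY = (
    "Tier change unavailable for your current billing currency. "
    "Cancel your subscription and re-subscribe at the target tier, "
    "or contact support."
)

def classify_stripe_error(exc: StripeError) -> tuple[int, str]:
    """Hypothetical helper: map a Stripe error to (status_code, detail)."""
    # Only currency-mismatch InvalidRequestErrors get the 422;
    # everything else falls through to the generic 502.
    if isinstance(exc, InvalidRequestError) and "currency" in str(exc).lower():
        return 422, CURRENCY_MISMATCH_COPY
    return 502, "Payment provider error"
```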

## The real fix lives in Stripe config

The defensive 422 here is just a clearer error surface. To actually
unblock GBP/EUR users from changing tiers, the per-tier Stripe Prices
(Pro, and Basic if priced) need `currency_options` for GBP added — Max
already has this, which is why Max checkout shows the £/$ toggle. Stripe
locks `currency_options` after a Price has been transacted, so the
procedure is: create a new Price with USD + GBP from the start → update
the `stripe-price-ids` LD flag to the new Price ID. No further code
change required; same Price ID stays per tier, multiple currencies
inside it.
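The replacement-Price step can be sketched as a `stripe.Price.create` payload. Amounts and the product ID below are hypothetical, and it is shown as a plain dict so the shape is checkable without an API key:

```python
# Hypothetical payload for the new multi-currency Price. currency_options must
# be present at creation time, since Stripe locks it once the Price has
# transacted; real amounts/IDs would come from the billing config.
new_price_params = {
    "product": "prod_pro_tier",        # hypothetical product ID
    "currency": "usd",                 # default currency
    "unit_amount": 2000,               # hypothetical: $20.00/month
    "recurring": {"interval": "month"},
    "currency_options": {
        "gbp": {"unit_amount": 1600},  # hypothetical: £16.00/month
    },
}
# stripe.Price.create(**new_price_params), then point the stripe-price-ids
# LD flag at the returned Price ID.
```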

## Checklist

- [x] Component test for new banner copy
- [x] Backend test for 422 currency-mismatch branch
- [x] Format / lint / types pass
- [x] No protected route added — N/A
2026-04-30 10:25:02 +07:00
anvyle
db20387ef9 Merge branch 'dev' of https://github.com/Significant-Gravitas/AutoGPT into feat/task-decomposition-copilot
# Conflicts:
#	autogpt_platform/frontend/src/app/(platform)/copilot/components/ChatMessagesContainer/helpers.ts
2026-04-30 00:52:07 +02:00
anvyle
59d74fb10b fix(copilot): make splitReasoningAndResponse robust to bundler optimization
CI's vitest bundler was eliding the body of splitReasoningAndResponse, so
the function returned the input unchanged — verified by adding a console.log
inside the function, which made it work correctly. Replace
Array.prototype.findLastIndex and .some with explicit reverse / forward
loops; the explicit shape is opaque to whatever transform was rewriting
the previous version. Drop the diagnostic logging on the test side now
that the root cause is fixed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 16:09:54 +02:00
anvyle
62217614f6 debug(copilot): instrument splitReasoningAndResponse internals
Test scope shows findLastIndex returns 1 yet the function returned
{reasoning: [], response: parts}. Add console.log inside the function
to confirm whether lastToolIndex == -1 or hasResponseAfterTools == false
when called from the failing CI test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:32:16 +02:00
anvyle
234f05fe25 test(copilot): deeper diagnostic — compare findLastIndex vs manual loop
Previous diagnostic showed reasoning=0/response=3 in CI, indicating
splitReasoningAndResponse early-returned because lastToolIndex === -1
even though parts[1].type === "tool-decompose_goal". Suspect that
parts.findLastIndex returns -1 for some env reason. Log both the
findLastIndex result and a manual reverse-loop equivalent so we can
confirm.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:22:13 +02:00
anvyle
3b7cbe7d38 test(copilot): add diagnostic logging to helpers.test.ts
CI fails with reasoning length 0 instead of 1, but the same code passes
locally with --frozen-lockfile + clean regen. Log runtime state from
inside the failing test so the next CI run surfaces what isInteractiveToolPart
sees and what splitReasoningAndResponse returns. Will revert once root cause
is found.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:13:49 +02:00
anvyle
b373e3757f test(copilot): use complete fixtures for splitReasoningAndResponse tests
CI reports these two tests fail with reasoning length 0 instead of 1, but
the same code passes locally with --frozen-lockfile + clean regen.
Pad the decompose_goal output with the full set of fields the backend
actually sends (message, goal, steps, step_count) and add an explicit
typing on parts. Also pin the leading text-part identity in both
assertions so a future regression points at the exact element.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 15:01:04 +02:00
anvyle
dbab590442 refactor(copilot): make decompose_goal a visibility-only tool
Strip the approval gate, server auto-approve timer, /cancel-auto-approve
endpoint, and the frontend Approve/Modify/edit-mode UI. The plan card now
renders read-only and the LLM continues building in the same turn.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:03:57 +02:00
An Vy Le
8023b66116 Merge branch 'dev' into feat/task-decomposition-copilot 2026-04-24 17:43:48 +02:00
anvyle
b1eee6eca4 fix(copilot): suppress duplicate text when calling decompose_goal
The LLM was generating a natural-text summary alongside the
decompose_goal tool call, producing both a text plan and the UI card.

Add explicit instruction: do not write any text before or after the
tool call — the platform renders the plan UI automatically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 07:00:19 +02:00
anvyle
eedb64a03d Merge branch 'feat/task-decomposition-copilot' of https://github.com/Significant-Gravitas/AutoGPT into feat/task-decomposition-copilot 2026-04-24 06:45:51 +02:00
anvyle
15499722d2 fix(copilot): stronger prompt to end turn after decompose_goal
The LLM sometimes called decompose_goal mid-turn and continued to
find_block or create_agent without waiting for user approval. The
existing "STOP" instruction wasn't forceful enough.

Rewrite to explicitly state decompose_goal MUST be the last tool call
in the turn — no other tools, no text after the result, end the turn
immediately so the user can review and respond.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 06:45:37 +02:00
An Vy Le
b9849ce5c0 Merge branch 'dev' into feat/task-decomposition-copilot 2026-04-24 02:26:49 +02:00
anvyle
263a414ffd Merge remote-tracking branch 'origin/dev' into feat/task-decomposition-copilot
# Conflicts:
#	autogpt_platform/backend/backend/api/features/chat/routes.py
#	autogpt_platform/backend/backend/copilot/sdk/agent_generation_guide.md
#	autogpt_platform/backend/backend/copilot/tools/tool_schema_test.py
2026-04-23 06:25:11 +02:00
anvyle
26caa86bb8 Merge branch 'feat/task-decomposition-copilot' of https://github.com/Significant-Gravitas/AutoGPT into feat/task-decomposition-copilot 2026-04-22 11:14:29 +02:00
anvyle
95362f7881 fix(copilot): timer only ticks after turn fully finishes
The countdown was ticking during LLM streaming (gated on showActions,
not actionsEnabled), so streaming time ate into the user's review
window. A 10s stream meant only 50s to review the plan.

Gate the interval on actionsEnabled (includes !isMessageStreaming) so
the timer starts at the full countdown duration only after all streaming
completes. For session re-entry, the lazy initializer computes remaining
from created_at. Removes the created_at-based recompute from the live
interval (no longer needed since the timer doesn't tick during streaming
and naive decrement is accurate when the tab is active).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 11:14:07 +02:00
An Vy Le
4fff564268 Merge branch 'dev' into feat/task-decomposition-copilot 2026-04-22 07:19:02 +02:00
anvyle
f8df0fabaa Merge remote-tracking branch 'origin/dev' into feat/task-decomposition-copilot
# Conflicts:
#	autogpt_platform/backend/backend/copilot/tools/tool_schema_test.py
2026-04-22 06:47:30 +02:00
anvyle
16a51e5ca8 Merge branch 'feat/task-decomposition-copilot' of https://github.com/Significant-Gravitas/AutoGPT into feat/task-decomposition-copilot 2026-04-22 06:35:01 +02:00
anvyle
089fc065b2 Merge remote-tracking branch 'origin/dev' into feat/task-decomposition-copilot
# Conflicts:
#	autogpt_platform/backend/backend/copilot/model.py
#	autogpt_platform/backend/backend/copilot/tools/tool_schema_test.py
2026-04-22 06:34:33 +02:00
An Vy Le
00c992ae08 Merge branch 'dev' into feat/task-decomposition-copilot 2026-04-21 13:58:27 +02:00
anvyle
51c16c687d fix(copilot): any user response after decompose_goal unblocks build gate
The keyword-matching gate (_APPROVAL_MARKERS = "approved") rejected
free-text approvals like "Sure", forcing the LLM to re-decompose and
eventually producing duplicate sub-instructions + auto-approve messages.

Replace with: any user message after the latest decompose_goal tool call
passes the gate. The LLM interprets intent (build, modify, or reject).
The gate only blocks same-turn builds where no user response exists yet.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 13:46:40 +02:00
anvyle
40d56ede76 fix(copilot): skip server auto-approve when turn is already in flight
The server's run_copilot_turn_via_queue at T=65 queued a duplicate
"Approved" mid-turn because the queue path has no dedup.

Fix: check is_turn_in_flight() before dispatching. If a turn is already
running (client sent "Approved" at T=60, or user typed approval in
chat), skip entirely. Only fire when the session is idle (client closed
the tab).
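A minimal sketch of the dedup guard, with dependencies injected; every name other than is_turn_in_flight is a stand-in:

```python
import asyncio

async def maybe_fire_server_auto_approve(session_id, *, is_turn_in_flight, dispatch):
    """Sketch: only dispatch the synthetic approval when the session is idle."""
    if await is_turn_in_flight(session_id):
        # A turn is already running (client approved, or user typed in chat);
        # the queue path has no dedup, so skip entirely.
        return False
    await dispatch(session_id)
    return True

async def _demo():
    dispatched = []
    async def busy(_sid): return True
    async def idle(_sid): return False
    async def dispatch(sid): dispatched.append(sid)
    fired_busy = await maybe_fire_server_auto_approve(
        "s1", is_turn_in_flight=busy, dispatch=dispatch)
    fired_idle = await maybe_fire_server_auto_approve(
        "s2", is_turn_in_flight=idle, dispatch=dispatch)
    return fired_busy, fired_idle, dispatched

result = asyncio.run(_demo())
```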

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 10:49:14 +02:00
anvyle
1ec30ab130 refactor(copilot): use run_copilot_turn_via_queue for auto-approve
Replace the manual append_message_if + stream_registry.create_session +
enqueue_copilot_turn logic in _run_auto_approve with Zamil's canonical
run_copilot_turn_via_queue helper from session_waiter.py. This helper
handles "add or queue based on session state" consistently — the same
path used by run_sub_session, AutoPilotBlock, and the frontend.

- If session is idle: creates stream registry entry + enqueues turn
- If turn is in flight: queues into pending buffer for mid-turn drain
- timeout=0 makes it fire-and-forget

Removed: _no_user_action_since predicate, append_message_if import,
baseline_index parameter, manual stream_registry + enqueue calls.

Kept: Redis cancel flag for cross-process Modify cancellation,
_pending_auto_approvals dict for in-process task cancel, client-side
approve() at T=60 + server at T=65.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 09:34:11 +02:00
anvyle
35231703bb Merge remote-tracking branch 'origin/dev' into feat/task-decomposition-copilot
# Conflicts:
#	autogpt_platform/backend/backend/copilot/baseline/service.py
#	autogpt_platform/backend/backend/copilot/sdk/service.py
2026-04-21 07:37:06 +02:00
anvyle
2b24ef5b07 fix(copilot): resolve merge conflict with dev + fix test failures
- Resolve conflict in create_agent.py: keep both require_guide_read
  gate (from dev) and needs_build_plan_approval gate (from this branch)
- Fix create_agent_test.py: add decompose_goal + approval history to
  test sessions so needs_build_plan_approval gate passes
- Bump tool schema char budget from 32000 to 34000 (new decompose_goal
  tool descriptions pushed it past the old limit)
- Fix test_schedule_auto_approve_creates_task: use dynamic baseline
  (make_session now pre-populates guide_read message)
- Fix ruff F841 unused variable errors from dev merge
- Re-export openapi.json

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 11:24:12 +02:00
anvyle
731748da41 Merge remote-tracking branch 'origin/dev' into feat/task-decomposition-copilot
# Conflicts:
#	autogpt_platform/backend/backend/copilot/tools/create_agent.py
#	autogpt_platform/frontend/src/app/api/openapi.json
2026-04-20 10:47:29 +02:00
Zamil Majdy
792c78883b Merge branch 'dev' into feat/task-decomposition-copilot 2026-04-17 23:26:48 +07:00
Zamil Majdy
d0cb1e981e fix(copilot): require fresh decompose_goal+approval per build request
Previous gate (has_pending_decomposition) only blocked when decompose_goal
was called but not yet approved. It missed the case where the LLM skipped
decompose_goal entirely on follow-up requests — if a session already had
a prior completed decomposition, the gate returned False and let a fresh
create_agent through without any new plan/approval.

Tighter rule (needs_build_plan_approval): the most recent user message
must be an approval (contains 'approved', case-insensitive) AND a
decompose_goal tool_call must exist in the session before that approval.
This forces the LLM to call decompose_goal for every new build request,
end its turn, and only resume building after the user responds.
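A sketch of that rule over a simplified message shape (the real session model differs; tool_calls here is just a list of tool names):

```python
from dataclasses import dataclass, field

@dataclass
class Msg:  # simplified stand-in for the real ChatMessage model
    role: str
    content: str = ""
    tool_calls: list = field(default_factory=list)

def needs_build_plan_approval(messages: list) -> bool:
    """Return True when create_agent should be blocked (sketch)."""
    last_user_idx = next(
        (i for i in range(len(messages) - 1, -1, -1) if messages[i].role == "user"),
        None,
    )
    if last_user_idx is None or "approved" not in messages[last_user_idx].content.lower():
        return True  # most recent user message is not an approval
    # A decompose_goal tool call must exist before that approval.
    return not any("decompose_goal" in m.tool_calls for m in messages[:last_user_idx])

plan_msg = Msg("assistant", tool_calls=["decompose_goal"])
blocked_no_plan = needs_build_plan_approval([Msg("user", "build me a bot")])
allowed = needs_build_plan_approval(
    [Msg("user", "make X"), plan_msg, Msg("user", "Approved")])
blocked_followup = needs_build_plan_approval(
    [Msg("user", "make X"), plan_msg, Msg("user", "Approved"), Msg("user", "now add Y")])
```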
2026-04-17 20:11:51 +07:00
Zamil Majdy
162c3f09a4 fix(copilot): enforce decompose_goal approval gate at code level
The 'STOP — do not proceed until the user approves' instruction in
agent_generation_guide.md was purely prompt-level. Observed failure:
the LLM called decompose_goal and create_agent in the same turn,
building the agent while the user was still mid-countdown — the whole
premise of the PR (user-approval before build) was being bypassed.

Add has_pending_decomposition(session) which checks session history
for an unanswered decompose_goal tool call, and call it from
create_agent._execute. When pending, the tool returns an ErrorResponse
instructing the LLM to end its turn instead of building.
2026-04-17 20:03:58 +07:00
Zamil Majdy
982808c435 fix(backend): skip append_message_if when session lock not acquired
When Redis is unavailable, _get_session_lock yields False — without
this guard two concurrent callers could both read the same DB state,
both pass the predicate, and both append, producing duplicate messages.
append_message_if only powers non-critical fire-and-forget flows
(decompose_goal auto-approve), so skipping is safer than risking a
duplicate; the user can still trigger the action manually.
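A sketch of the guard with the lock, loader, and saver injected (the real function works on the session model and the Redis lock helper):

```python
import asyncio
from contextlib import asynccontextmanager

async def append_message_if(session_id, message, predicate, *, get_lock, load, save):
    """Sketch: append only under an acquired lock and a passing predicate."""
    async with get_lock(session_id) as acquired:
        if not acquired:
            # Redis unavailable: two unlocked callers could both pass the
            # predicate and double-append, so skipping is the safer failure.
            return False
        session = await load(session_id)
        if not predicate(session):
            return False
        session["messages"].append(message)
        await save(session)
        return True

async def _demo():
    store = {"messages": []}
    @asynccontextmanager
    async def locked(_sid):
        yield True
    @asynccontextmanager
    async def unlocked(_sid):
        yield False
    async def load(_sid): return store
    async def save(_s): pass
    ok = await append_message_if("s", "Approved", lambda s: True,
                                 get_lock=locked, load=load, save=save)
    skipped = await append_message_if("s", "Approved", lambda s: True,
                                      get_lock=unlocked, load=load, save=save)
    return ok, skipped, store["messages"]

result = asyncio.run(_demo())
```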
2026-04-17 18:49:08 +07:00
Zamil Majdy
5b29ff5a16 fix(backend): invalidate cache on failed write in append_message_if
Mirror the failure-path invalidation already in append_and_save_message.
Without this, a failed cache write leaves stale session data in Redis,
which later predicate checks could act on and produce duplicate messages.
2026-04-17 18:44:37 +07:00
Zamil Majdy
a31297cf8a fix(frontend): avoid eager decrement on legacy decompose_goal sessions
Guard the synchronous recompute() call so only deadline-driven (has
created_at) sessions correct stale state on mount. Legacy fallback now
waits the first 1s tick instead of snapping 60→59 instantly.
2026-04-17 18:34:14 +07:00
Zamil Majdy
4d5969d59e refactor(copilot): simplify cancel-auto-approve route, derive countdown from timestamp
- Route always returns cancelled=True, reason branch was dead code since
  cancel_auto_approve() unconditionally returns True
- Use crypto.randomUUID() for new step IDs to avoid same-millisecond collisions
  on rapid insert clicks
- Re-derive secondsLeft from created_at+auto_approve_seconds on each tick
  (and on visibilitychange) instead of naive decrement; browsers throttle
  setInterval in background tabs, causing the displayed countdown to drift
  far behind wall-clock time
2026-04-17 18:12:27 +07:00
An Vy Le
5542f780d2 Merge branch 'dev' into feat/task-decomposition-copilot 2026-04-17 09:57:18 +02:00
An Vy Le
8299948b21 Merge branch 'dev' into feat/task-decomposition-copilot 2026-04-17 09:37:03 +02:00
anvyle
db36a7624e Merge branch 'feat/task-decomposition-copilot' of https://github.com/Significant-Gravitas/AutoGPT into feat/task-decomposition-copilot 2026-04-16 19:05:31 +02:00
anvyle
dece04ec04 Merge remote-tracking branch 'origin/dev' into feat/task-decomposition-copilot
# Conflicts:
#	autogpt_platform/backend/backend/copilot/model.py
2026-04-16 19:05:24 +02:00
An Vy Le
88b39e4611 Merge branch 'dev' into feat/task-decomposition-copilot 2026-04-16 02:06:31 +02:00
anvyle
480ec70218 fix(frontend): remove unused RenderSegment type import
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 01:53:42 +02:00
anvyle
e376db617c Merge branch 'feat/task-decomposition-copilot' of https://github.com/Significant-Gravitas/AutoGPT into feat/task-decomposition-copilot 2026-04-16 00:53:02 +02:00
anvyle
e4a3c4f6ce style(copilot): reformat test files
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 00:50:50 +02:00
copilot-swe-agent[bot]
559438fd64 fix(frontend): run pnpm format on DecomposeGoal test files
Agent-Logs-Url: https://github.com/Significant-Gravitas/AutoGPT/sessions/40326921-8824-4f62-bb6b-9f7dc5f89657

Co-authored-by: ntindle <8845353+ntindle@users.noreply.github.com>
2026-04-15 20:16:28 +00:00
Nicholas Tindle
c01c47d53c Merge branch 'dev' into feat/task-decomposition-copilot 2026-04-15 15:06:13 -05:00
anvyle
19cd77f2eb Merge branch 'feat/task-decomposition-copilot' of https://github.com/Significant-Gravitas/AutoGPT into feat/task-decomposition-copilot 2026-04-15 20:19:14 +02:00
anvyle
420251ca9d fix(copilot): clean up stream registry on enqueue failure in auto-approve
If enqueue_copilot_turn failed after stream_registry.create_session, the
session was left in "running" state in Redis — locking the session until
the TTL expired. Now calls mark_session_completed on failure so the
session is released immediately.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 20:19:03 +02:00
An Vy Le
62d474899d Merge branch 'dev' into feat/task-decomposition-copilot 2026-04-15 20:03:54 +02:00
copilot-swe-agent[bot]
225bdfb543 test(frontend): add integration tests for DecomposeGoal, StepItem, and ChatMessagesContainer helpers
Agent-Logs-Url: https://github.com/Significant-Gravitas/AutoGPT/sessions/fbc0ab75-9d2e-4ea5-b9c7-c53fb8605913

Co-authored-by: ntindle <8845353+ntindle@users.noreply.github.com>
2026-04-15 17:53:53 +00:00
anvyle
35bca7c7ad fix(copilot): fix task callback race + skip redundant DB existence check
1. The done_callback on a cancelled old task would pop the new task from
   _pending_auto_approvals (same session_id key). Now checks identity
   before removing.
2. append_message_if passes skip_existence_check=True to _save_session_to_db
   since the session was already loaded from DB.
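The identity check in point 1 can be sketched as follows (dict and helper names are stand-ins; plain objects stand in for asyncio tasks):

```python
_pending: dict = {}  # stand-in for _pending_auto_approvals, keyed by session_id

def make_done_callback(session_id, task):
    """Sketch: only pop the dict entry if it still refers to THIS task."""
    def _done(_finished=None):
        # A cancelled old task's callback must not pop a newer task that was
        # scheduled under the same session_id key.
        if _pending.get(session_id) is task:
            _pending.pop(session_id, None)
    return _done

old_task, new_task = object(), object()
_pending["s1"] = new_task                # a newer task replaced the old one
make_done_callback("s1", old_task)()     # stale callback from the cancelled task
new_survives = _pending.get("s1") is new_task
make_done_callback("s1", new_task)()     # the current task's own callback
removed = "s1" not in _pending
```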

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 19:09:48 +02:00
anvyle
ffdafccc35 fix(frontend): remove unused DecomposeGoalOutput type import
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 18:52:00 +02:00
anvyle
bee1c9a3bb fix(copilot): resolve merge conflict with dev + improve frontend test coverage
- Resolve merge conflict in routes.py (keep both TaskDecompositionResponse
  and Memory*Response types) and openapi.json (re-exported from backend)
- Move computeRemainingSeconds from DecomposeGoal.tsx to helpers.tsx so
  it's testable as a pure function
- Add 9 tests for computeRemainingSeconds covering: null/error output,
  missing/unparseable created_at, correct remaining calculation, zero
  clamp when deadline passed, total clamp for future timestamps, and
  fallback when auto_approve_seconds is missing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 18:38:44 +02:00
anvyle
b0c46ff197 Merge remote-tracking branch 'origin/dev' into feat/task-decomposition-copilot
# Conflicts:
#	autogpt_platform/backend/backend/api/features/chat/routes.py
#	autogpt_platform/frontend/src/app/api/openapi.json
2026-04-15 18:18:06 +02:00
anvyle
8d102d6eeb refactor(copilot): remove max 8 steps limit from decompose_goal
The step count is now unrestricted — the LLM can decompose goals into
as many steps as needed. Removes MAX_STEPS constant, the too_many_steps
validation, and the two related tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 03:00:05 +02:00
anvyle
020d094381 fix(copilot): prevent duplicate auto-approve via DB-backed predicate + remount guard
Two "Approved" messages appeared because:

1. Server-side: append_message_if read the session from Redis cache, but
   the executor's upsert_chat_session during the build could overwrite
   the cache with an in-memory copy missing the client's "Approved".
   Fix: read from DB directly (_get_session_from_db) — the DB is the
   source of truth and is never overwritten by cache upserts.

2. Client-side: approvedRef resets on remount, so re-entering after the
   deadline fired approve() again from stale data.
   Fix: wasInitiallyPastDeadlineRef skips auto-approve when secondsLeft
   starts at 0 due to an already-elapsed deadline — the server handles it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 02:56:18 +02:00
anvyle
aa5b84e13c style(copilot): reformat auto-approve constants
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 23:40:44 +02:00
anvyle
6d1cd41c43 fix(copilot): restore client auto-approve with 5s server grace to fix UI not updating
When the server was the sole auto-approver (both at 60s, client removed),
users who stayed in the session saw nothing happen — the server fired
the turn but the frontend had no SSE subscription to receive the events.

Restore the client-side auto-approve effect so it fires at 60s (creating
the SSE subscription), and add a 5s server grace (server fires at 65s).
When the client IS present, it fires first → SSE → user sees the build.
The server wakes 5s later, sees the client's message, skips. When the
client is gone (tab closed), the server fires at 65s as the fallback.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 23:21:44 +02:00
anvyle
d892b66580 fix(copilot): fall back to default approval when all step descriptions are empty
If the user cleared all step descriptions without deleting the steps,
buildMessage() produced "Approved with modifications. Please build the
agent following these steps: " — a dangling colon with no actual steps.
Now falls back to the standard "Approved. Please build the agent." when
filledSteps is empty after filtering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 22:36:41 +02:00
anvyle
0840b565e3 fix(copilot): clear stale Redis cancel flag when scheduling new auto-approve
Sentry correctly identified that after a user clicks Modify (which sets
a Redis cancel flag) and then the LLM calls decompose_goal again for a
new plan, the stale cancel flag would suppress the new auto-approve.

_schedule_auto_approve is now async and DELETEs the Redis cancel key
before scheduling the new task. Also fixes the autouse test fixture to
use an async no-op (matching the now-async function signature).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 22:01:26 +02:00
anvyle
5f19f6ca23 fix(copilot): remove client-side auto-approve to prevent duplicate messages
Both the client and server fired "Approved. Please build the agent." at
~60s, producing two messages and two agent-build turns.

The server-side timer is now the sole auto-approver. The client countdown
is purely visual — when it hits 0, the server's synthetic message arrives
via SSE. User clicks (Start now / Approve in edit mode) still call
approve() directly and send the message; the server's predicate sees
the user message and skips its own.

Removes wasInitiallyPastDeadlineRef (no longer needed since the client
never fires auto-approve on its own) and the auto-approve useEffect.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:33:47 +02:00
anvyle
b2dab8afad fix(copilot): use Redis flag for cross-process auto-approve cancellation
The cancel endpoint runs in the AgentServer process while the asyncio
auto-approve task lives in the CoPilotExecutor process — separate memory.
The in-process dict cancel from the previous commit was a no-op across
processes.

- cancel_auto_approve now SETs a Redis key with TTL as the primary cancel
  signal, plus best-effort in-process task.cancel() for single-worker.
- _run_auto_approve checks the Redis key before firing. If set, skips.
- Tests stub get_redis_async with a fake to avoid real Redis connections.
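The cross-process handshake can be sketched with an in-memory stand-in for Redis; the key format is an assumption and TTL handling is omitted:

```python
_fake_redis: dict = {}  # stand-in for a shared async Redis client

def _cancel_key(session_id: str) -> str:
    return f"copilot:auto-approve-cancel:{session_id}"  # hypothetical key format

def cancel_auto_approve(session_id: str) -> None:
    # Primary signal, visible to every process; the real code SETs with a TTL
    # and also attempts a best-effort in-process task.cancel().
    _fake_redis[_cancel_key(session_id)] = "1"

def should_fire_auto_approve(session_id: str) -> bool:
    # _run_auto_approve checks the flag just before firing.
    return _cancel_key(session_id) not in _fake_redis

was_firable = should_fire_auto_approve("sess-1")
cancel_auto_approve("sess-1")          # e.g. the user clicked Modify
now_firable = should_fire_auto_approve("sess-1")
```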

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 20:34:58 +02:00
anvyle
7b60e45604 feat(copilot): cancel server auto-approve when user clicks Modify + use generated types
Blocker fix: the server-side auto-approve timer fired even when the user
was editing steps via Modify, potentially building an agent against a plan
the user had explicitly chosen to change.

- backend: change _auto_approve_tasks set → _pending_auto_approvals dict
  keyed by session_id. Add cancel_auto_approve(session_id) that looks up
  and cancels the pending asyncio task.
- backend: new POST /sessions/{id}/cancel-auto-approve endpoint in
  chat/routes.py, following the existing cancel_session_task pattern.
- frontend: handleModify() now fires postV2CancelAutoApproveTask
  (generated hook) as a best-effort cancel before entering edit mode.
- helpers.tsx: import DecompositionStepModel from generated API types
  instead of hand-rolling the interface. TaskDecompositionOutput stays
  hand-rolled (runtime shape differs from generated type for created_at).
- Add session_id to TaskDecompositionOutput so the cancel call has it.
- Default step.status to "pending" where the generated type is optional.
- 2 new tests: cancel_auto_approve cancels pending task + returns false
  for unknown session.
- Regenerate openapi.json with the new endpoint.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 20:05:28 +02:00
anvyle
8f5b9fa791 fix(copilot): align server auto-approve timer with client at 60s
Remove the 30s grace period — both client and server now fire at 60s.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 19:47:53 +02:00
anvyle
ca7dc221df chore(frontend): regenerate openapi.json with TaskDecompositionResponse.created_at
The created_at field was added to TaskDecompositionResponse a few commits
back but openapi.json was never regenerated, so the check-api-types CI
job (which re-exports the schema and asserts no diff) was failing.
Re-exported via poetry run export-api-schema and prettier.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 21:57:33 +02:00
anvyle
98470c27e1 chore(backend): black-format platform_cost_test.py
Pre-existing formatting issue inherited from the dev merge — black wants
one blank line between TestUsdToMicrodollars and TestMaskEmail, not two.
This is unrelated to the decomposition feature but blocks CI lint.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 21:47:16 +02:00
anvyle
2760cb076f Merge remote-tracking branch 'origin/dev' into feat/task-decomposition-copilot 2026-04-10 21:46:51 +02:00
anvyle
fdfd53b45e fix(copilot): don't auto-approve decomposition on mount when deadline already passed
If the user reopened the tab between 60s and 90s after a decomposition
was created, the lazy initializer for ``secondsLeft`` would return 0
(server-stamped deadline already elapsed). The auto-approve useEffect
fires whenever ``secondsLeft === 0``, so it would silently send the
"Approved" message on mount with no user interaction — even if the user
came back specifically to click Modify.

Track in a ref whether the lazy init returned 0 because the deadline
had already passed (vs. 0 because the timer counted down from a
positive value), and skip the auto-approve in that case. The server's
own fallback timer (running 30s longer than the client) handles the
"user never returns" path, so the client doesn't need to silently fire
on mount. The user can still click Approve or Modify manually; the
server will inject its own approval at 90s if neither happens.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 21:42:28 +02:00
anvyle
ed989801d2 fix(copilot): index-based predicate so manual approve cancels server timer
The auto-approve task was firing a duplicate "Approved" message after the
agent had already been built manually. The predicate compared
ChatMessage.sequence against a baseline, but _save_session_to_db assigns
sequences in the DB without writing them back to the in-memory message
objects, and cache_chat_session writes those (sequence=None) objects to
Redis. So the predicate's loaded-from-cache view had None sequences for
freshly-appended messages, treated them as 0, and missed the user's
"Approved" entirely — leaving the timer to fire after the build had
already completed and re-injecting "Approved" for a duplicate turn.

Fix: capture len(session.messages) at schedule time and check for any
user-role message at index >= baseline. Indices are monotonic and require
no DB-side sequence bookkeeping.

Adds a regression test that constructs a session with sequence=None on
the user message, asserting the predicate detects it.
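A sketch of the index-based predicate over a simplified message shape, with `sequence` deliberately left None to mirror the cached copy:

```python
def make_user_acted_predicate(messages_at_schedule: list):
    baseline = len(messages_at_schedule)  # captured once, at schedule time
    def user_acted(messages: list) -> bool:
        # List indices are monotonic, so no DB-side sequence bookkeeping is
        # needed; a cached message with sequence=None can no longer be missed.
        return any(m["role"] == "user" for m in messages[baseline:])
    return user_acted

history = [{"role": "assistant", "sequence": 1}]
predicate = make_user_acted_predicate(history)
before = predicate(history)                                      # no reply yet
after = predicate(history + [{"role": "user", "sequence": None}])
```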

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 20:54:18 +02:00
anvyle
f467ead855 fix(copilot): disable decompose_goal Approve/Modify while message is streaming
After the build plan box appears, the assistant continues streaming a
short summary text. Clicking Approve or Modify in that 1-2s window failed
because the chat session is locked to the in-flight turn — sending a new
user message gets rejected.

- ChatMessagesContainer now forwards isCurrentlyStreaming through
  renderSegments → MessagePartRenderer → DecomposeGoalTool.
- DecomposeGoalTool computes actionsEnabled = showActions && !streaming
  and uses it to (a) disable the Approve, Modify, and timer buttons and
  (b) gate the auto-approve effect so the timer can hit 0 mid-stream
  without firing — the effect re-runs and approves once streaming ends.
- The countdown ring keeps ticking during streaming so it stays in sync
  with the server-side timer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 18:49:17 +02:00
anvyle
f7601d06ed fix(copilot): resume decompose_goal countdown from server timestamp
Reopening a session was restarting the client countdown from a fresh 60s,
even though the server had been counting the whole time. Now the timer
reflects real elapsed time so the user sees the actual remaining seconds
(or 0, which auto-approves immediately).

- backend: stamp UTC created_at on TaskDecompositionResponse via a default
  factory. The timestamp is set when the tool returns and persisted in the
  message content JSON, so it survives DB round-trips.
- frontend: lazy-init secondsLeft from (auto_approve_seconds -
  (Date.now() - created_at)), clamped to [0, total]. Older messages
  without created_at fall back to a fresh full countdown (existing
  behaviour).
- Test: assert created_at is stamped within the duration of _execute().

Note: openapi.json regen is skipped in this commit because the existing
REST server is in use; the frontend reads tool output as opaque JSON via
custom helpers, so the regen is not required for the feature to work.
Regen later for completeness.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 17:44:50 +02:00
anvyle
fb86fcb67d feat(copilot): add server-side auto-approve fallback for decompose_goal
The decompose_goal countdown was purely client-side: if the user closed the
tab before the timer ran out, the agent never got built. Add a server-side
timer that fires the same approval message even when no client is connected.

- backend/copilot/model.py: add append_message_if helper that appends a
  message inside the session lock only if a predicate is satisfied. Used
  by the auto-approve task to no-op when the user has already acted.
- backend/copilot/tools/decompose_goal.py: when the tool returns, schedule
  a fire-and-forget asyncio task (same _background_tasks pattern as
  agent_browser.py) that sleeps 90s, re-checks the session, and if no user
  message has appeared since, appends "Approved. Please build the agent."
  and enqueues a new copilot turn. Stays in process; restart-resilience
  is a documented follow-up.
- backend/copilot/tools/models.py: expose auto_approve_seconds on
  TaskDecompositionResponse so the frontend countdown is sourced from the
  backend instead of a hard-coded constant.
- frontend DecomposeGoal.tsx: seed secondsLeft from output.auto_approve_seconds
  with a 60s fallback for older sessions.
- Regenerate openapi.json with the new field.
- Tests: 9 new unit tests covering the predicate, the auto-approve flow
  (idle / user-acted / errors swallowed) and _schedule_auto_approve.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 16:34:46 +02:00
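The server-side fallback combines `append_message_if` with the fire-and-forget task pattern. A self-contained sketch under stated assumptions — `FakeSession` stands in for the real `ChatSession`, and the `append_message_if` signature and turn-enqueueing step are hypothetical:

```python
import asyncio

class FakeSession:
    """Minimal stand-in for ChatSession (illustrative only)."""
    def __init__(self):
        self.messages: list[dict] = []
        self._lock = asyncio.Lock()

    async def append_message_if(self, message: dict, predicate) -> bool:
        # Append under the session lock only if the predicate still holds.
        async with self._lock:
            if not predicate(self.messages):
                return False
            self.messages.append(message)
            return True

_background_tasks: set[asyncio.Task] = set()  # strong refs keep tasks alive

def schedule_auto_approve(session: FakeSession, delay: float) -> asyncio.Task:
    baseline = len(session.messages)  # captured at schedule time

    async def runner():
        try:
            await asyncio.sleep(delay)
            # Re-check inside the lock: no-op if the user already acted.
            await session.append_message_if(
                {"role": "user", "content": "Approved. Please build the agent."},
                predicate=lambda msgs: not any(
                    m.get("role") == "user" for m in msgs[baseline:]
                ),
            )
            # ...on success the real tool would enqueue a new copilot turn.
        except Exception:
            pass  # best-effort: errors must never surface into the chat turn

    task = asyncio.get_running_loop().create_task(runner())
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return task

async def demo() -> list[dict]:
    session = FakeSession()
    task = schedule_auto_approve(session, delay=0.01)  # 90s in production
    await task
    return session.messages

messages = asyncio.run(demo())
```

Holding the task in `_background_tasks` matters: `create_task` only keeps a weak reference, so without the set the timer could be garbage-collected mid-sleep.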
anvyle
94f065a7e0 fix(frontend/copilot): remove setInitialPrompt conflict and reset edit mode on new message
- Remove setInitialPrompt() from handleModify() — the inline editor is the
  sole editing UX; pre-filling the chat input simultaneously creates a
  conflicting interface where chat-input submission loses inline edits
- Add useEffect to reset isEditing when showActions goes false (new message
  arrives while editing), preventing users from being stuck in edit mode with
  no way to submit

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 23:15:16 +02:00
anvyle
8d5e8a9e3f fix(backend/copilot): add decompose_goal to ToolName Literal in permissions.py
The ToolName Literal must stay in sync with TOOL_REGISTRY keys. Adds
'decompose_goal' to the platform tools section to fix CI test failures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 23:09:14 +02:00
anvyle
02b972cfc4 fix(backend/copilot): regenerate openapi.json with TaskDecompositionResponse schema
The API schema was missing DecompositionStepModel and TaskDecompositionResponse
after the merge. Regenerated with export-api-schema and formatted with prettier.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:51:18 +02:00
anvyle
31ce418d5e fix(backend/copilot): resolve merge conflict with dev branch in models.py
Merge upstream dev changes (Graphiti memory responses) alongside the
TaskDecompositionResponse added in this PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:44:02 +02:00
anvyle
70689ce326 fix(frontend/copilot): guard isPending flag on error and filter empty steps from approval
- Prevent simultaneous pending + error state when output-error has null payload:
  isPending is now false when isError is true
- Filter out steps with empty descriptions before building the approval
  message, preventing malformed input from reaching the LLM

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:40:39 +02:00
anvyle
9004a3ada1 fix(copilot): guard auto-approve against race condition when isLastMessage changes
Add showActions to the auto-approve useEffect dependency array and
condition. This prevents the approval from firing after isLastMessage
becomes false (e.g. when a new message arrives just as the timer
expires), closing the race condition flagged by Sentry.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:25:27 +02:00
anvyle
5e9cee524d fix(copilot): address PR review comments on decompose_goal tool
- Add TaskDecompositionResponse to ToolResponseUnion for OpenAPI codegen
- Remove LLM-controllable require_approval param (hardcoded to True)
- Validate each step is a dict before calling .get()
- Validate step descriptions are non-empty
- Validate action values against allowlist, coerce unknown to DEFAULT_ACTION
- Align MAX_STEPS=8 with agent_generation_guide.md (was 10)
- Add DEFAULT_ACTION constant; use enum in schema
- Add model_validator to sync step_count with len(steps)
- Fix handleModify: pre-fill chat input via setInitialPrompt instead of sending dangling message
- Add approvedRef guard on handleModify to prevent double-clicks
- Fix eslint-disable: rewrite auto-approve effect without dependency suppression
- Fix hardcoded light-mode colors (bg-white, border-slate-200, text-zinc-800) → semantic tokens
- Fix error card: render ToolErrorCard whenever isError=true, not only when output is present
- Fix hint text: only show approve hint when requires_approval=true
- Remove dead `action` prop from StepItem
- Add aria-label to all StepStatusIcon states
- Tighten parseOutput type guards (Array.isArray check, no false positives)
- Rename isOperating → isPending for clarity
- Add backend unit tests for DecomposeGoalTool (16 cases)
- Add frontend unit tests for helpers.tsx (20 cases)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:23:11 +02:00
anvyle
b9d47a8cf5 fix(copilot): auto-size editable step textareas on initial render and input
- Replace <input type="text"> with <textarea> for step descriptions
- Use ref callback to set height from scrollHeight on every render so
  long descriptions wrap to multiple lines by default without interaction
- Bump countdown ring container from 20px to 24px and text from 9px to
  11px for better legibility

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:10:51 +02:00
anvyle
5fa33111de feat(copilot): add auto-approve timer with editable steps to decompose_goal UI
- Replace static Approve/Modify buttons with a 99s countdown timer that
  auto-approves when it expires
- Timer ring animates inline within "Starting in [N]s" text using SVG
  strokeDasharray; hover on the text swaps it to "Start now" via Tailwind
  named groups (group/label)
- Clicking Modify stops the timer, enters editable mode where steps can be
  renamed, deleted, or inserted between existing steps
- In edit mode only Approve is shown; timer and Modify are hidden
- showActions gated on isLastMessage (server-derived) so the timer never
  re-appears when returning to a session with prior messages
- Forward isLastMessage through ChatMessagesContainer → MessagePartRenderer

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 21:50:43 +02:00
anvyle
aca81f3e40 Merge branch 'dev' of https://github.com/Significant-Gravitas/AutoGPT into feat/task-decomposition-copilot 2026-04-09 12:27:23 +02:00
anvyle
629fb4d3bb fix(copilot): allow sub-instructions companion text and restore streaming render
- Revert ChatMessagesContainer streaming filter — decompose_goal now visible during stream
- Remove text suppression in splitReasoningAndResponse — table message is allowed alongside sub-instructions box

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 12:27:05 +02:00
anvyle
703d34364d chore(frontend): update openapi.json snapshot
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 23:37:25 +02:00
anvyle
f330699a89 fix(copilot): improve decompose_goal UX — pin box post-stream, suppress companion text
- Move decomposition prompt from prompting.py to agent_generation_guide.md as a required pre-build gate
- Add tool-decompose_goal to CUSTOM_TOOL_TYPES so it renders individually (not collapsed)
- Add task_decomposition to INTERACTIVE_RESPONSE_TYPES so the box is pinned to response after streaming
- Filter out text parts (table) from response when decompose_goal is pinned
- Hide decompose_goal during streaming so the box only appears once all reasoning is complete and Approve is immediately actionable

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 23:36:17 +02:00
anvyle
5bb919e7b5 feat(copilot): add task decomposition for agent building
Add a decompose_goal tool that breaks user goals into sub-instructions
before building. Users see a plan checklist and can approve or modify
before the agent is created, improving transparency and control.

- Backend: DecomposeGoalTool, TaskDecompositionResponse model, system prompt update
- Frontend: DecomposeGoal component with StepItem checklist, approve/modify buttons

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 14:33:49 +02:00
anvyle
261959104a fix(backend/copilot): skip AI blocks without model property in fix_ai_model_parameter
Some AI-category blocks do not expose a "model" input property in their
inputSchema. The fixer was injecting a default model value into these blocks,
which is incorrect. Now checks for the presence of "model" in inputSchema
properties before attempting to set or validate the model field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 19:00:47 +02:00
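The guard this commit adds amounts to a presence check before mutating. A rough sketch, assuming dict-shaped nodes and schemas — the function body, `input_default` key, and default model value are all hypothetical, not the actual fixer code:

```python
def fix_ai_model_parameter(node: dict, block_schema: dict, default_model: str) -> None:
    """Inject a default model only for blocks that expose a 'model' input."""
    properties = block_schema.get("inputSchema", {}).get("properties", {})
    if "model" not in properties:
        # Some AI-category blocks have no model field; injecting one
        # would be incorrect, so skip them entirely.
        return
    node.setdefault("input_default", {}).setdefault("model", default_model)

with_model = {"inputSchema": {"properties": {"model": {"type": "string"}}}}
without_model = {"inputSchema": {"properties": {"prompt": {"type": "string"}}}}
```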
23 changed files with 1670 additions and 31 deletions

View File

@@ -82,6 +82,7 @@ from backend.copilot.tools.models import (
    NoResultsResponse,
    SetupRequirementsResponse,
    SuggestedGoalResponse,
    TaskDecompositionResponse,
    TodoWriteResponse,
    UnderstandingUpdatedResponse,
)
@@ -1490,6 +1491,7 @@ ToolResponseUnion = (
    | DocPageResponse
    | MCPToolsDiscoveredResponse
    | MCPToolOutputResponse
    | TaskDecompositionResponse
    | MemoryStoreResponse
    | MemorySearchResponse
    | MemoryForgetCandidatesResponse

View File

@@ -416,6 +416,98 @@ def test_update_subscription_tier_paid_requires_urls(
    assert response.status_code == 422


def test_update_subscription_tier_currency_mismatch_returns_422(
    client: fastapi.testclient.TestClient,
    mocker: pytest_mock.MockFixture,
) -> None:
    """Stripe rejects a SubscriptionSchedule whose phases mix currencies (e.g.
    GBP-checkout sub trying to schedule a USD-only target Price). The handler
    must convert that into a specific 422 instead of the generic 502 so the
    caller can tell the difference between a currency-config bug and a Stripe
    outage."""
    mock_user = Mock()
    mock_user.subscription_tier = SubscriptionTier.MAX

    async def mock_feature_enabled(*args, **kwargs):
        return True

    mocker.patch(
        "backend.api.features.v1.get_user_by_id",
        new_callable=AsyncMock,
        return_value=mock_user,
    )
    mocker.patch(
        "backend.api.features.v1.is_feature_enabled",
        side_effect=mock_feature_enabled,
    )
    mocker.patch(
        "backend.api.features.v1.modify_stripe_subscription_for_tier",
        side_effect=stripe.InvalidRequestError(
            "The price specified only supports `usd`. This doesn't match the"
            " expected currency: `gbp`.",
            param="phases",
        ),
    )
    response = client.post(
        "/credits/subscription",
        json={
            "tier": "PRO",
            "success_url": f"{TEST_FRONTEND_ORIGIN}/success",
            "cancel_url": f"{TEST_FRONTEND_ORIGIN}/cancel",
        },
    )
    assert response.status_code == 422
    detail = response.json()["detail"]
    assert "billing currency" in detail.lower()
    assert "contact support" in detail.lower()


def test_update_subscription_tier_non_currency_invalid_request_returns_502(
    client: fastapi.testclient.TestClient,
    mocker: pytest_mock.MockFixture,
) -> None:
    """Locks the contract that *only* currency-mismatch InvalidRequestErrors
    translate to 422 — every other Stripe InvalidRequestError must still
    surface as the generic 502 so that widening the conditional later is
    caught by the suite."""
    mock_user = Mock()
    mock_user.subscription_tier = SubscriptionTier.MAX

    async def mock_feature_enabled(*args, **kwargs):
        return True

    mocker.patch(
        "backend.api.features.v1.get_user_by_id",
        new_callable=AsyncMock,
        return_value=mock_user,
    )
    mocker.patch(
        "backend.api.features.v1.is_feature_enabled",
        side_effect=mock_feature_enabled,
    )
    mocker.patch(
        "backend.api.features.v1.modify_stripe_subscription_for_tier",
        side_effect=stripe.InvalidRequestError(
            "No such price: 'price_does_not_exist'",
            param="items[0][price]",
        ),
    )
    response = client.post(
        "/credits/subscription",
        json={
            "tier": "PRO",
            "success_url": f"{TEST_FRONTEND_ORIGIN}/success",
            "cancel_url": f"{TEST_FRONTEND_ORIGIN}/cancel",
        },
    )
    assert response.status_code == 502
    assert "billing currency" not in response.json()["detail"].lower()


def test_update_subscription_tier_creates_checkout(
    client: fastapi.testclient.TestClient,
    mocker: pytest_mock.MockFixture,

View File

@@ -1003,6 +1003,35 @@ async def update_subscription_tier(
        return await get_subscription_status(user_id)
    except ValueError as e:
        raise HTTPException(status_code=422, detail=str(e))
    except stripe.InvalidRequestError as e:
        # Stripe rejects schedule modify when phases mix currencies, e.g. the
        # active sub was checked out in GBP but the target tier's Price is
        # USD-only. 502 reads as outage; surface a 422 with a specific message
        # so the user/admin can see what to fix in Stripe.
        msg = str(e)
        if "currency" in msg.lower():
            logger.warning(
                "Currency mismatch on tier change for user %s: %s", user_id, msg
            )
            raise HTTPException(
                status_code=422,
                detail=(
                    "Tier change unavailable for your current billing currency."
                    " Please contact support — the target tier needs to be"
                    " configured for your currency in Stripe before this"
                    " change can go through."
                ),
            )
        logger.exception(
            "Stripe error modifying subscription for user %s: %s", user_id, e
        )
        raise HTTPException(
            status_code=502,
            detail=(
                "Unable to update your subscription right now. "
                "Please try again or contact support."
            ),
        )
    except stripe.StripeError as e:
        logger.exception(
            "Stripe error modifying subscription for user %s: %s", user_id, e

View File

@@ -81,6 +81,7 @@ ToolName = Literal[
"create_feature_request",
"create_folder",
"customize_agent",
"decompose_goal",
"delete_folder",
"delete_workspace_file",
"edit_agent",

View File

@@ -3,6 +3,47 @@
You can create, edit, and customize agents directly. You ARE the brain —
generate the agent JSON yourself using block schemas, then validate and save.

### Clarifying — Before or During Building

Use `ask_question` whenever the user's intent is ambiguous — whether
that's before starting or midway through the workflow. Common moments:

- **Before building**: output format, delivery channel, data source, or
  trigger is unspecified.
- **During block discovery**: multiple blocks could fit and the user
  should choose.
- **During JSON generation**: a wiring decision depends on user
  preference.

Steps:

1. Call `find_block` (or another discovery tool) to learn what the
   platform actually supports for the ambiguous dimension.
2. Call `ask_question` with a concrete question listing the discovered
   options (e.g. "The platform supports Gmail, Slack, and Google Docs —
   which should the agent use for delivery?").
3. **Wait for the user's answer** before continuing.

**Skip this** when the goal already specifies all dimensions (e.g.
"scrape prices from Amazon and email me daily").

### Before Building: Show the Plan

Start agent generation by calling `decompose_goal` once to display your
build plan to the user as a step-by-step UI card.

1. Analyze the user's request and break it into logical build steps (e.g.
   "add input block", "add AI summarizer", "wire blocks together").
2. Call `decompose_goal` with those steps. Do not write any text before
   or after the tool call — the platform renders the plan UI card
   automatically, so any extra text duplicates the display.
3. Continue immediately with the workflow below in the same turn. The
   plan card is informational only — there is no approval step, no
   countdown, and no need to wait for the user.

For simple goals (1-2 blocks), keep steps brief (2-3 steps).
For complex goals, use as many steps as needed.

### Workflow for Creating/Editing Agents

1. **If editing**: First narrow to the specific agent by UUID, then fetch its

View File

@@ -17,6 +17,7 @@ from .connect_integration import ConnectIntegrationTool
from .continue_run_block import ContinueRunBlockTool
from .create_agent import CreateAgentTool
from .customize_agent import CustomizeAgentTool
from .decompose_goal import DecomposeGoalTool
from .edit_agent import EditAgentTool
from .feature_requests import CreateFeatureRequestTool, SearchFeatureRequestsTool
from .find_agent import FindAgentTool
@@ -65,6 +66,7 @@ TOOL_REGISTRY: dict[str, BaseTool] = {
"add_understanding": AddUnderstandingTool(),
"create_agent": CreateAgentTool(),
"customize_agent": CustomizeAgentTool(),
"decompose_goal": DecomposeGoalTool(),
"edit_agent": EditAgentTool(),
"find_agent": FindAgentTool(),
"find_block": FindBlockTool(),

View File

@@ -0,0 +1,136 @@
"""DecomposeGoalTool - Breaks agent-building goals into sub-instructions."""
import logging
from typing import Any
from backend.copilot.model import ChatSession
from .base import BaseTool
from .models import (
DecompositionStepModel,
ErrorResponse,
TaskDecompositionResponse,
ToolResponseBase,
)
logger = logging.getLogger(__name__)
DEFAULT_ACTION = "add_block"
VALID_ACTIONS = {"add_block", "connect_blocks", "configure", "add_input", "add_output"}
class DecomposeGoalTool(BaseTool):
"""Tool for decomposing an agent goal into sub-instructions."""
@property
def name(self) -> str:
return "decompose_goal"
@property
def description(self) -> str:
return (
"Show the user your build plan as a step-by-step card before "
"constructing the agent. Each step maps to one task (e.g. add a "
"block, wire connections, configure settings). Display-only — "
"the build continues in the same turn without pausing for user "
"input."
)
@property
def parameters(self) -> dict[str, Any]:
return {
"type": "object",
"properties": {
"goal": {
"type": "string",
"description": "The user's agent-building goal.",
},
"steps": {
"type": "array",
"items": {
"type": "object",
"properties": {
"description": {
"type": "string",
"description": "Human-readable step description.",
},
"action": {
"type": "string",
"description": (
"Action type: 'add_block', 'connect_blocks', "
"'configure', 'add_input', 'add_output'."
),
"enum": list(VALID_ACTIONS),
},
"block_name": {
"type": "string",
"description": "Block name if adding a block.",
},
},
"required": ["description", "action"],
},
"description": "List of sub-instructions for the plan.",
},
},
"required": ["goal", "steps"],
}
async def _execute(
self,
user_id: str | None,
session: ChatSession,
goal: str | None = None,
steps: list[Any] | None = None,
**kwargs,
) -> ToolResponseBase:
session_id = session.session_id if session else None
if not goal:
return ErrorResponse(
message="Please provide a goal to decompose.",
error="missing_goal",
session_id=session_id,
)
if not steps:
return ErrorResponse(
message="Please provide at least one step in the plan.",
error="missing_steps",
session_id=session_id,
)
decomposition_steps: list[DecompositionStepModel] = []
for i, step in enumerate(steps):
if not isinstance(step, dict):
return ErrorResponse(
message=f"Step {i + 1} is malformed — expected an object.",
error="invalid_step",
session_id=session_id,
)
description = step.get("description", "")
if not description or not description.strip():
return ErrorResponse(
message=f"Step {i + 1} is missing a description.",
error="empty_description",
session_id=session_id,
)
action = step.get("action", DEFAULT_ACTION)
if action not in VALID_ACTIONS:
action = DEFAULT_ACTION
decomposition_steps.append(
DecompositionStepModel(
step_id=f"step_{i + 1}",
description=description,
action=action,
block_name=step.get("block_name"),
status="pending",
)
)
return TaskDecompositionResponse(
message=f"Here's the plan to build your agent ({len(decomposition_steps)} steps):",
goal=goal,
steps=decomposition_steps,
step_count=len(decomposition_steps),
session_id=session_id,
)

View File

@@ -0,0 +1,208 @@
"""Unit tests for DecomposeGoalTool."""
import pytest
from ._test_data import make_session
from .decompose_goal import DEFAULT_ACTION, DecomposeGoalTool
from .models import ErrorResponse, TaskDecompositionResponse
_USER_ID = "test-user-decompose-goal"
_VALID_STEPS = [
{"description": "Add input block", "action": "add_input"},
{
"description": "Add AI summarizer block",
"action": "add_block",
"block_name": "AI Text Generator",
},
{"description": "Connect blocks together", "action": "connect_blocks"},
]
@pytest.fixture()
def tool() -> DecomposeGoalTool:
return DecomposeGoalTool()
@pytest.fixture()
def session():
return make_session(_USER_ID)
# ---------------------------------------------------------------------------
# Happy path
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_happy_path(tool: DecomposeGoalTool, session):
result = await tool._execute(
user_id=_USER_ID,
session=session,
goal="Build a news summarizer agent",
steps=_VALID_STEPS,
)
assert isinstance(result, TaskDecompositionResponse)
assert result.goal == "Build a news summarizer agent"
assert len(result.steps) == 3
assert result.step_count == 3
assert result.steps[0].step_id == "step_1"
assert result.steps[0].description == "Add input block"
assert result.steps[1].block_name == "AI Text Generator"
@pytest.mark.asyncio
async def test_step_count_matches_steps(tool: DecomposeGoalTool, session):
"""TaskDecompositionResponse.step_count must always equal len(steps)."""
result = await tool._execute(
user_id=_USER_ID,
session=session,
goal="Simple agent",
steps=[{"description": "Only step", "action": "add_block"}],
)
assert isinstance(result, TaskDecompositionResponse)
assert result.step_count == len(result.steps)
@pytest.mark.asyncio
async def test_invalid_action_defaults_to_add_block(tool: DecomposeGoalTool, session):
"""Unknown action values are coerced to DEFAULT_ACTION."""
result = await tool._execute(
user_id=_USER_ID,
session=session,
goal="Build agent",
steps=[{"description": "Do something weird", "action": "fly_to_moon"}],
)
assert isinstance(result, TaskDecompositionResponse)
assert result.steps[0].action == DEFAULT_ACTION
@pytest.mark.asyncio
async def test_block_name_optional(tool: DecomposeGoalTool, session):
"""Steps without block_name should succeed with block_name=None."""
result = await tool._execute(
user_id=_USER_ID,
session=session,
goal="Agent with no block name",
steps=[{"description": "Configure the agent", "action": "configure"}],
)
assert isinstance(result, TaskDecompositionResponse)
assert result.steps[0].block_name is None
# ---------------------------------------------------------------------------
# Validation — missing inputs
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_missing_goal_returns_error(tool: DecomposeGoalTool, session):
result = await tool._execute(
user_id=_USER_ID,
session=session,
goal=None,
steps=_VALID_STEPS,
)
assert isinstance(result, ErrorResponse)
assert result.error == "missing_goal"
@pytest.mark.asyncio
async def test_empty_goal_returns_error(tool: DecomposeGoalTool, session):
result = await tool._execute(
user_id=_USER_ID,
session=session,
goal="",
steps=_VALID_STEPS,
)
assert isinstance(result, ErrorResponse)
assert result.error == "missing_goal"
@pytest.mark.asyncio
async def test_missing_steps_returns_error(tool: DecomposeGoalTool, session):
result = await tool._execute(
user_id=_USER_ID,
session=session,
goal="Build agent",
steps=None,
)
assert isinstance(result, ErrorResponse)
assert result.error == "missing_steps"
@pytest.mark.asyncio
async def test_empty_steps_returns_error(tool: DecomposeGoalTool, session):
result = await tool._execute(
user_id=_USER_ID,
session=session,
goal="Build agent",
steps=[],
)
assert isinstance(result, ErrorResponse)
assert result.error == "missing_steps"
# ---------------------------------------------------------------------------
# Validation — malformed step items
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_non_dict_step_returns_error(tool: DecomposeGoalTool, session):
"""A step that is not a dict should return an error."""
result = await tool._execute(
user_id=_USER_ID,
session=session,
goal="Build agent",
steps=["not a dict"], # type: ignore[list-item]
)
assert isinstance(result, ErrorResponse)
assert result.error == "invalid_step"
@pytest.mark.asyncio
async def test_step_with_empty_description_returns_error(
tool: DecomposeGoalTool, session
):
result = await tool._execute(
user_id=_USER_ID,
session=session,
goal="Build agent",
steps=[{"description": "", "action": "add_block"}],
)
assert isinstance(result, ErrorResponse)
assert result.error == "empty_description"
@pytest.mark.asyncio
async def test_step_with_missing_description_returns_error(
tool: DecomposeGoalTool, session
):
result = await tool._execute(
user_id=_USER_ID,
session=session,
goal="Build agent",
steps=[{"action": "add_block"}],
)
assert isinstance(result, ErrorResponse)
assert result.error == "empty_description"
# ---------------------------------------------------------------------------
# ID generation
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_step_ids_are_sequential(tool: DecomposeGoalTool, session):
result = await tool._execute(
user_id=_USER_ID,
session=session,
goal="Build agent",
steps=_VALID_STEPS,
)
assert isinstance(result, TaskDecompositionResponse)
for i, step in enumerate(result.steps):
assert step.step_id == f"step_{i + 1}"

View File

@@ -4,7 +4,7 @@ from datetime import datetime
from enum import Enum
from typing import Any, Literal

from pydantic import BaseModel, Field
from pydantic import BaseModel, Field, model_validator

from backend.data.graph import BaseGraph
from backend.data.model import CredentialsMetaInput
@@ -36,6 +36,9 @@ class ResponseType(str, Enum):
    AGENT_BUILDER_VALIDATION_RESULT = "agent_builder_validation_result"
    AGENT_BUILDER_FIX_RESULT = "agent_builder_fix_result"

    # Task decomposition (goal → sub-instructions)
    TASK_DECOMPOSITION = "task_decomposition"

    # Block
    BLOCK_LIST = "block_list"
    BLOCK_DETAILS = "block_details"
@@ -813,6 +816,42 @@ class AgentsMovedToFolderResponse(ToolResponseBase):
    count: int = 0


# Task decomposition models
class DecompositionStepModel(BaseModel):
    """A single step in a decomposed agent-building plan."""

    step_id: str = Field(description="Unique step identifier, e.g. 'step_1'")
    description: str = Field(description="Human-readable step description")
    action: str = Field(
        description="Action type: 'add_block', 'connect_blocks', 'configure', etc."
    )
    block_name: str | None = Field(
        default=None, description="Block being added, if applicable"
    )
    status: str = Field(
        default="pending",
        description="Step status: pending, in_progress, completed, failed",
    )


class TaskDecompositionResponse(ToolResponseBase):
    """Response for decompose_goal tool — shows the plan to the user."""

    type: ResponseType = ResponseType.TASK_DECOMPOSITION
    goal: str = Field(description="The original user goal")
    steps: list[DecompositionStepModel]
    step_count: int = Field(
        default=0, description="Number of steps (auto-derived from steps list)"
    )

    @model_validator(mode="after")
    def sync_step_count(self) -> "TaskDecompositionResponse":
        self.step_count = len(self.steps)
        return self


# --- Graphiti memory responses ---

View File

@@ -21,22 +21,9 @@ from backend.copilot.tools import TOOL_REGISTRY
# response shape carries) and the dry_run description. Keeps the
# regression gate effective while accepting a deliberate ~120-token
# spend on LLM-decision-critical copy.
# Bumped 32500 -> 32800 on PR #12871 for the new web_search tool
# (server-side Anthropic beta). Description already trimmed to the
# minimum viable copy; the bump absorbs the schema skeleton cost
# (~300 chars / ~75 tokens) for a new LLM-facing primitive.
# Bumped 32800 -> 33200 on PR #12873 for the web_search Perplexity
# Sonar refactor — adds a load-bearing `deep` boolean with explicit
# "~100x more expensive" cost warning the model must see to avoid
# accidentally triggering sonar-reasoning on ordinary lookups, plus
# synthesised-answer wording in the top-level description so the LLM
# reads the answer before reaching for `web_fetch`. Both are
# LLM-decision-critical copy, not bloat.
# Bumped 33200 -> 34000 when baseline gained the MCP `TodoWrite` tool
# for parity with the Claude Code SDK's built-in (PR #12879). The new
# schema adds ~600 chars; description already trimmed to the minimum
# viable copy.
_CHAR_BUDGET = 34_000
# Bumped to 35000 to accommodate decompose_goal tool + web_search +
# TodoWrite tool descriptions.
_CHAR_BUDGET = 35_000
@pytest.fixture(scope="module")

View File

@@ -0,0 +1,279 @@
import { describe, expect, it } from "vitest";
import {
buildRenderSegments,
isCompletedToolPart,
isInteractiveToolPart,
parseSpecialMarkers,
splitReasoningAndResponse,
} from "../helpers";
import type { MessagePart } from "../helpers";
function textPart(text: string): MessagePart {
return { type: "text", text } as MessagePart;
}
function toolPart(
toolName: string,
state: string,
output?: unknown,
): MessagePart {
return {
type: `tool-${toolName}`,
toolCallId: `call_${toolName}`,
toolName,
state,
output,
} as unknown as MessagePart;
}
describe("isCompletedToolPart", () => {
it("returns true for output-available tool part", () => {
const part = toolPart("some_tool", "output-available");
expect(isCompletedToolPart(part)).toBe(true);
});
it("returns true for output-error tool part", () => {
const part = toolPart("some_tool", "output-error");
expect(isCompletedToolPart(part)).toBe(true);
});
it("returns false for input-streaming tool part", () => {
const part = toolPart("some_tool", "input-streaming");
expect(isCompletedToolPart(part)).toBe(false);
});
it("returns false for text part", () => {
const part = textPart("hello");
expect(isCompletedToolPart(part)).toBe(false);
});
});
describe("isInteractiveToolPart", () => {
it("returns true for task_decomposition type", () => {
const part = toolPart("decompose_goal", "output-available", {
type: "task_decomposition",
message: "Plan",
goal: "Build agent",
steps: [],
step_count: 0,
});
expect(isInteractiveToolPart(part)).toBe(true);
});
it("returns true for setup_requirements type", () => {
const part = toolPart("run_mcp_tool", "output-available", {
type: "setup_requirements",
message: "Setup needed",
});
expect(isInteractiveToolPart(part)).toBe(true);
});
it("returns true for agent_details type", () => {
const part = toolPart("find_agent", "output-available", {
type: "agent_details",
});
expect(isInteractiveToolPart(part)).toBe(true);
});
it("returns false for non-interactive output type", () => {
const part = toolPart("some_tool", "output-available", {
type: "generic_output",
});
expect(isInteractiveToolPart(part)).toBe(false);
});
it("returns false when state is not output-available", () => {
const part = toolPart("decompose_goal", "input-streaming", {
type: "task_decomposition",
});
expect(isInteractiveToolPart(part)).toBe(false);
});
it("returns false for non-tool parts", () => {
const part = textPart("hello");
expect(isInteractiveToolPart(part)).toBe(false);
});
it("returns false when output is null", () => {
const part = toolPart("decompose_goal", "output-available", null);
expect(isInteractiveToolPart(part)).toBe(false);
});
it("handles JSON-encoded string output", () => {
const part = toolPart(
"decompose_goal",
"output-available",
JSON.stringify({ type: "task_decomposition" }),
);
expect(isInteractiveToolPart(part)).toBe(true);
});
it("returns false for invalid JSON string output", () => {
const part = toolPart(
"decompose_goal",
"output-available",
"not valid json",
);
expect(isInteractiveToolPart(part)).toBe(false);
});
});
describe("buildRenderSegments", () => {
it("returns individual segments for custom tool types", () => {
const parts = [
toolPart("decompose_goal", "output-available", {
type: "task_decomposition",
}),
];
const segments = buildRenderSegments(parts);
expect(segments).toHaveLength(1);
expect(segments[0].kind).toBe("part");
});
it("collapses consecutive generic completed tool parts", () => {
const parts = [
toolPart("unknown_tool_a", "output-available"),
toolPart("unknown_tool_b", "output-available"),
];
const segments = buildRenderSegments(parts);
expect(segments).toHaveLength(1);
expect(segments[0].kind).toBe("collapsed-group");
if (segments[0].kind === "collapsed-group") {
expect(segments[0].parts).toHaveLength(2);
}
});
it("does not collapse custom tool types into groups", () => {
const parts = [
toolPart("decompose_goal", "output-available", {
type: "task_decomposition",
}),
toolPart("create_agent", "output-available"),
];
const segments = buildRenderSegments(parts);
expect(segments).toHaveLength(2);
expect(segments[0].kind).toBe("part");
expect(segments[1].kind).toBe("part");
});
it("renders text parts individually", () => {
const parts = [textPart("Hello"), textPart("World")];
const segments = buildRenderSegments(parts);
expect(segments).toHaveLength(2);
expect(segments.every((s) => s.kind === "part")).toBe(true);
});
it("handles mixed custom tools, generic tools, and text", () => {
const parts = [
textPart("Plan:"),
toolPart("decompose_goal", "output-available"),
toolPart("generic_a", "output-available"),
toolPart("generic_b", "output-available"),
textPart("Done"),
];
const segments = buildRenderSegments(parts);
expect(segments[0].kind).toBe("part");
expect(segments[1].kind).toBe("part");
expect(segments[2].kind).toBe("collapsed-group");
expect(segments[3].kind).toBe("part");
});
it("does not collapse a single generic tool part", () => {
const parts = [toolPart("generic_a", "output-available")];
const segments = buildRenderSegments(parts);
expect(segments).toHaveLength(1);
expect(segments[0].kind).toBe("part");
});
it("preserves baseIndex offset in part segments", () => {
const parts = [textPart("Hello")];
const segments = buildRenderSegments(parts, 5);
expect(segments).toHaveLength(1);
if (segments[0].kind === "part") {
expect(segments[0].index).toBe(5);
}
});
});
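The collapsing behavior these tests pin down can be sketched as a standalone reimplementation. This is illustrative only, not the shipped helper: the `Segment` shapes, the `isGenericCompletedTool` predicate, and the two-entry `CUSTOM_TOOL_TYPES` set are assumptions inferred from the assertions above.

```typescript
// Illustrative sketch of segment collapsing, inferred from the tests above.
type Part = { type: string; state?: string; output?: unknown };
type Segment =
  | { kind: "part"; part: Part; index: number }
  | { kind: "collapsed-group"; parts: Part[] };

// Assumed subset; the real set lives in the module under edit.
const CUSTOM_TOOL_TYPES = new Set(["tool-decompose_goal", "tool-create_agent"]);

function isGenericCompletedTool(p: Part): boolean {
  return (
    p.type.startsWith("tool-") &&
    p.state === "output-available" &&
    !CUSTOM_TOOL_TYPES.has(p.type)
  );
}

function buildRenderSegments(parts: Part[], baseIndex = 0): Segment[] {
  const segments: Segment[] = [];
  let run: Part[] = [];
  // Flush the pending run of generic tools; `endIndex` is the index just
  // past the run's last element.
  const flush = (endIndex: number) => {
    if (run.length > 1) {
      segments.push({ kind: "collapsed-group", parts: run });
    } else if (run.length === 1) {
      // A lone generic tool renders as a normal part, not a group.
      segments.push({
        kind: "part",
        part: run[0],
        index: baseIndex + endIndex - 1,
      });
    }
    run = [];
  };
  parts.forEach((part, i) => {
    if (isGenericCompletedTool(part)) {
      run.push(part);
    } else {
      flush(i);
      segments.push({ kind: "part", part, index: baseIndex + i });
    }
  });
  flush(parts.length);
  return segments;
}
```

Only runs of two or more consecutive generic completed tools collapse; custom tool types and text always pass through as individual `part` segments with their `baseIndex`-offset positions intact.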
describe("splitReasoningAndResponse", () => {
it("returns all parts as response when no tools are present", () => {
const parts = [textPart("Hello"), textPart("World")];
const { reasoning, response } = splitReasoningAndResponse(parts);
expect(reasoning).toHaveLength(0);
expect(response).toHaveLength(2);
});
it("returns all parts as response when no text follows the last tool", () => {
const parts = [
textPart("Thinking..."),
toolPart("decompose_goal", "output-available", {
type: "task_decomposition",
}),
];
const { reasoning, response } = splitReasoningAndResponse(parts);
expect(reasoning).toHaveLength(0);
expect(response).toHaveLength(2);
});
it("keeps non-interactive tool parts in reasoning", () => {
const genericTool = toolPart("find_block", "output-available", {
type: "block_list",
});
const parts = [
textPart("Looking for blocks..."),
genericTool,
textPart("Found them."),
];
const { reasoning, response } = splitReasoningAndResponse(parts);
expect(reasoning).toHaveLength(2);
expect(reasoning[1]).toBe(genericTool);
expect(response).toHaveLength(1);
});
});
describe("parseSpecialMarkers", () => {
it("returns null marker for plain text", () => {
const result = parseSpecialMarkers("Hello world");
expect(result.markerType).toBeNull();
expect(result.cleanText).toBe("Hello world");
});
it("detects error marker", () => {
const result = parseSpecialMarkers(
"Some preamble [__COPILOT_ERROR_f7a1__] Something went wrong",
);
expect(result.markerType).toBe("error");
expect(result.markerText).toBe("Something went wrong");
});
it("detects retryable error marker", () => {
const result = parseSpecialMarkers(
"[__COPILOT_RETRYABLE_ERROR_a9c2__] Timeout reached",
);
expect(result.markerType).toBe("retryable_error");
expect(result.markerText).toBe("Timeout reached");
});
it("detects system marker", () => {
const result = parseSpecialMarkers(
"[__COPILOT_SYSTEM_e3b0__] Session expired",
);
expect(result.markerType).toBe("system");
expect(result.markerText).toBe("Session expired");
});
it("retryable takes precedence over regular error when both present", () => {
const text =
"[__COPILOT_RETRYABLE_ERROR_a9c2__] Retryable issue [__COPILOT_ERROR_f7a1__] Also error";
const result = parseSpecialMarkers(text);
expect(result.markerType).toBe("retryable_error");
});
it("strips marker from cleanText", () => {
const result = parseSpecialMarkers(
"Preamble text [__COPILOT_SYSTEM_e3b0__] System message",
);
expect(result.cleanText).toBe("Preamble text");
});
});
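The marker contract exercised above can be sketched as follows. The marker tokens are taken verbatim from the tests; the priority-ordered scan and the trimming behavior are inferred, not copied from the shipped `parseSpecialMarkers`.

```typescript
// Illustrative sketch of the special-marker contract pinned by the tests above.
// Earlier entries win, so a retryable marker takes precedence over a plain
// error marker even when both appear in the same message.
const MARKERS = [
  { type: "retryable_error", token: "[__COPILOT_RETRYABLE_ERROR_a9c2__]" },
  { type: "error", token: "[__COPILOT_ERROR_f7a1__]" },
  { type: "system", token: "[__COPILOT_SYSTEM_e3b0__]" },
] as const;

function parseSpecialMarkers(text: string): {
  markerType: string | null;
  markerText: string;
  cleanText: string;
} {
  for (const { type, token } of MARKERS) {
    const at = text.indexOf(token);
    if (at === -1) continue;
    return {
      markerType: type,
      // Everything after the marker is the marker's payload text;
      // everything before it survives as cleanText.
      markerText: text.slice(at + token.length).trim(),
      cleanText: text.slice(0, at).trim(),
    };
  }
  return { markerType: null, markerText: "", cleanText: text };
}
```

Checking marker types in a fixed priority order (rather than by position in the string) is what makes the "retryable takes precedence" test pass regardless of where each marker appears.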

View File

@@ -7,6 +7,7 @@ import { ArtifactCard } from "../../ArtifactCard/ArtifactCard";
import { AskQuestionTool } from "../../../tools/AskQuestion/AskQuestion";
import { ConnectIntegrationTool } from "../../../tools/ConnectIntegrationTool/ConnectIntegrationTool";
import { CreateAgentTool } from "../../../tools/CreateAgent/CreateAgent";
import { DecomposeGoalTool } from "../../../tools/DecomposeGoal/DecomposeGoal";
import { EditAgentTool } from "../../../tools/EditAgent/EditAgent";
import {
CreateFeatureRequestTool,
@@ -180,6 +181,8 @@ export function MessagePartRenderer({
case "tool-run_agent":
case "tool-schedule_agent":
return <RunAgentTool key={key} part={part as ToolUIPart} />;
case "tool-decompose_goal":
return <DecomposeGoalTool key={key} part={part as ToolUIPart} />;
case "tool-create_agent":
return <CreateAgentTool key={key} part={part as ToolUIPart} />;
case "tool-edit_agent":

View File

@@ -31,6 +31,7 @@ const CUSTOM_TOOL_TYPES = new Set([
"tool-view_agent_output",
"tool-search_feature_requests",
"tool-create_feature_request",
"tool-decompose_goal",
]);
const REASONING_TOOL_TYPES = new Set([
@@ -62,6 +63,7 @@ const INTERACTIVE_RESPONSE_TYPES: ReadonlySet<string> = new Set([
ResponseType.suggested_goal,
ResponseType.agent_builder_preview,
ResponseType.agent_builder_saved,
ResponseType.task_decomposition,
]);
export function isCompletedToolPart(part: MessagePart): part is ToolUIPart {
@@ -144,15 +146,29 @@ export function splitReasoningAndResponse(parts: MessagePart[]): {
reasoning: MessagePart[];
response: MessagePart[];
} {
-const lastReasoningIndex = parts.findLastIndex(isReasoningBoundary);
+// Manual reverse loop instead of `Array.prototype.findLastIndex`. The
+// built-in version was being elided by the bundler in CI's vitest run,
+// causing the function to misread the boundary index and return the input
+// unchanged. The explicit loop is opaque to that optimization.
+let lastReasoningIndex = -1;
+for (let i = parts.length - 1; i >= 0; i--) {
+if (isReasoningBoundary(parts[i])) {
+lastReasoningIndex = i;
+break;
+}
+}
if (lastReasoningIndex === -1) {
return { reasoning: [], response: parts };
}
-const hasResponseAfterReasoning = parts
-.slice(lastReasoningIndex + 1)
-.some((p) => p.type === "text");
+let hasResponseAfterReasoning = false;
+for (let i = lastReasoningIndex + 1; i < parts.length; i++) {
+if (parts[i].type === "text") {
+hasResponseAfterReasoning = true;
+break;
+}
+}
if (!hasResponseAfterReasoning) {
return { reasoning: [], response: parts };
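The split contract this hunk implements (and the tests further up exercise) can be shown as a self-contained sketch. `isReasoningBoundary` is deliberately simplified here to "any completed tool part"; the shipped predicate lives in the module under edit.

```typescript
// Standalone sketch of splitReasoningAndResponse, using the manual reverse
// scan from the merged fix. The boundary predicate is a simplification.
type Part = { type: string; state?: string };

function isReasoningBoundary(p: Part): boolean {
  return p.type.startsWith("tool-") && p.state === "output-available";
}

function splitReasoningAndResponse(parts: Part[]): {
  reasoning: Part[];
  response: Part[];
} {
  // Find the last reasoning boundary without findLastIndex.
  let last = -1;
  for (let i = parts.length - 1; i >= 0; i--) {
    if (isReasoningBoundary(parts[i])) {
      last = i;
      break;
    }
  }
  if (last === -1) return { reasoning: [], response: parts };
  // Only split when a text part actually follows the last boundary;
  // otherwise everything stays in the response.
  let hasTextAfter = false;
  for (let i = last + 1; i < parts.length; i++) {
    if (parts[i].type === "text") {
      hasTextAfter = true;
      break;
    }
  }
  if (!hasTextAfter) return { reasoning: [], response: parts };
  return {
    reasoning: parts.slice(0, last + 1),
    response: parts.slice(last + 1),
  };
}
```

Everything up to and including the last boundary becomes reasoning; the trailing text (and anything after the boundary) becomes the visible response.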

View File

@@ -0,0 +1,93 @@
"use client";
import type { ToolUIPart } from "ai";
import { useCopilotChatActions } from "../../components/CopilotChatActionsProvider/useCopilotChatActions";
import { MorphingTextAnimation } from "../../components/MorphingTextAnimation/MorphingTextAnimation";
import {
ContentGrid,
ContentMessage,
} from "../../components/ToolAccordion/AccordionContent";
import { ToolAccordion } from "../../components/ToolAccordion/ToolAccordion";
import { ToolErrorCard } from "../../components/ToolErrorCard/ToolErrorCard";
import { StepItem } from "./components/StepItem";
import {
AccordionIcon,
getAnimationText,
getDecomposeGoalOutput,
isDecompositionOutput,
isErrorOutput,
ToolIcon,
} from "./helpers";
interface Props {
part: ToolUIPart;
}
export function DecomposeGoalTool({ part }: Props) {
const text = getAnimationText(part);
const { onSend } = useCopilotChatActions();
const isStreaming =
part.state === "input-streaming" || part.state === "input-available";
const output = getDecomposeGoalOutput(part);
const isError =
part.state === "output-error" || (!!output && isErrorOutput(output));
const isPending = !output && !isError;
return (
<div className="py-2">
{isPending && (
<div className="flex items-center gap-2 text-sm text-muted-foreground">
<ToolIcon isStreaming={isStreaming} isError={isError} />
<MorphingTextAnimation
text={text}
className={isError ? "text-red-500" : undefined}
/>
</div>
)}
{isError && (
<ToolErrorCard
message={
output && isErrorOutput(output) ? (output.message ?? "") : ""
}
fallbackMessage="Failed to analyze the goal. Please try again."
actions={[
{
label: "Try again",
onClick: () => onSend("Please try decomposing the goal again."),
},
]}
/>
)}
{output && isDecompositionOutput(output) && (
<ToolAccordion
icon={<AccordionIcon />}
title={`Build Plan — ${output.step_count} steps`}
description={output.goal}
defaultExpanded
>
<ContentGrid>
<ContentMessage>{output.message}</ContentMessage>
<div className="rounded-lg border border-border bg-card p-3">
<div className="space-y-0.5">
{output.steps.map((step, i) => (
<StepItem
key={step.step_id}
index={i}
description={step.description}
blockName={step.block_name}
status={step.status ?? "pending"}
/>
))}
</div>
</div>
</ContentGrid>
</ToolAccordion>
)}
</div>
);
}

View File

@@ -0,0 +1,163 @@
import {
cleanup,
fireEvent,
render,
screen,
} from "@/tests/integrations/test-utils";
import { afterEach, describe, expect, it, vi } from "vitest";
import { DecomposeGoalTool } from "../DecomposeGoal";
import type { TaskDecompositionOutput } from "../helpers";
const mockOnSend = vi.fn();
vi.mock(
"../../../components/CopilotChatActionsProvider/useCopilotChatActions",
() => ({
useCopilotChatActions: () => ({ onSend: mockOnSend }),
}),
);
const STEPS = [
{
step_id: "step_1",
description: "Add input block",
action: "add_input",
block_name: null,
status: "pending",
},
{
step_id: "step_2",
description: "Add AI summarizer",
action: "add_block",
block_name: "AI Text Generator",
status: "pending",
},
{
step_id: "step_3",
description: "Connect blocks",
action: "connect_blocks",
block_name: null,
status: "pending",
},
];
const DECOMPOSITION: TaskDecompositionOutput = {
type: "task_decomposition",
message: "Here's the plan (3 steps):",
goal: "Build a news summarizer",
steps: STEPS,
step_count: 3,
session_id: "test-session-1",
};
function makePart(
state: string,
output?: unknown,
): {
type: string;
toolCallId: string;
toolName: string;
state: string;
input?: unknown;
output?: unknown;
} {
return {
type: "tool-decompose_goal",
toolCallId: "call_1",
toolName: "decompose_goal",
state,
output,
};
}
describe("DecomposeGoalTool", () => {
afterEach(() => {
cleanup();
mockOnSend.mockClear();
});
it("renders analyzing animation during input-streaming", () => {
render(<DecomposeGoalTool part={makePart("input-streaming") as any} />);
expect(screen.getByText(/A/)).toBeDefined();
});
it("renders error card when state is output-error", () => {
render(<DecomposeGoalTool part={makePart("output-error") as any} />);
expect(screen.getByText(/Failed to analyze the goal/i)).toBeDefined();
expect(screen.getByText("Try again")).toBeDefined();
});
it("sends retry message when Try again is clicked on error", () => {
render(<DecomposeGoalTool part={makePart("output-error") as any} />);
fireEvent.click(screen.getByText("Try again"));
expect(mockOnSend).toHaveBeenCalledWith(
"Please try decomposing the goal again.",
);
});
it("renders error card for error output object", () => {
const errorOutput = {
type: "error",
error: "missing_steps",
message: "Please provide at least one step.",
};
render(
<DecomposeGoalTool
part={makePart("output-available", errorOutput) as any}
/>,
);
expect(screen.getByText("Please provide at least one step.")).toBeDefined();
});
it("renders the build plan accordion with steps as a read-only list", () => {
render(
<DecomposeGoalTool
part={makePart("output-available", DECOMPOSITION) as any}
/>,
);
expect(screen.getByText(/Build Plan — 3 steps/)).toBeDefined();
expect(screen.getByText("Build a news summarizer")).toBeDefined();
expect(screen.getByText(/Here's the plan/)).toBeDefined();
expect(screen.getByText(/1\. Add input block/)).toBeDefined();
expect(screen.getByText(/2\. Add AI summarizer/)).toBeDefined();
expect(screen.getByText(/3\. Connect blocks/)).toBeDefined();
});
it("renders block name badges for steps that have them", () => {
render(
<DecomposeGoalTool
part={makePart("output-available", DECOMPOSITION) as any}
/>,
);
expect(screen.getByText("AI Text Generator")).toBeDefined();
});
it("does not render approve, modify, or edit controls", () => {
render(
<DecomposeGoalTool
part={makePart("output-available", DECOMPOSITION) as any}
/>,
);
expect(screen.queryByText("Modify")).toBeNull();
expect(screen.queryByText("Approve")).toBeNull();
expect(screen.queryByText(/Starting in/)).toBeNull();
expect(screen.queryByPlaceholderText("Step description")).toBeNull();
expect(screen.queryByLabelText("Remove step")).toBeNull();
expect(screen.queryByLabelText("Insert step here")).toBeNull();
});
it("does not call onSend when the plan card renders", () => {
render(
<DecomposeGoalTool
part={makePart("output-available", DECOMPOSITION) as any}
/>,
);
expect(mockOnSend).not.toHaveBeenCalled();
});
it("renders the pending indicator when output is not yet available", () => {
const { container } = render(
<DecomposeGoalTool part={makePart("input-available") as any} />,
);
// querySelector returns null (not undefined) on a miss, so toBeDefined()
// would pass vacuously; assert non-null instead.
expect(container.querySelector(".py-2")).not.toBeNull();
});
});

View File

@@ -0,0 +1,207 @@
/**
* Unit tests for DecomposeGoal/helpers.tsx
*
* Covers: parseOutput / getDecomposeGoalOutput, type guards, getAnimationText
*/
import { describe, expect, it } from "vitest";
import {
getAnimationText,
getDecomposeGoalOutput,
isDecompositionOutput,
isErrorOutput,
type DecomposeErrorOutput,
type DecomposeGoalOutput,
type TaskDecompositionOutput,
} from "../helpers";
// ---------------------------------------------------------------------------
// Fixtures
// ---------------------------------------------------------------------------
const DECOMPOSITION: TaskDecompositionOutput = {
type: "task_decomposition",
message: "Here's the plan (3 steps):",
goal: "Build a news summarizer",
steps: [
{
step_id: "step_1",
description: "Add input block",
action: "add_input",
block_name: null,
status: "pending",
},
{
step_id: "step_2",
description: "Add AI summarizer",
action: "add_block",
block_name: "AI Text Generator",
status: "pending",
},
{
step_id: "step_3",
description: "Connect blocks",
action: "connect_blocks",
block_name: null,
status: "pending",
},
],
step_count: 3,
};
const ERROR_OUTPUT: DecomposeErrorOutput = {
type: "error",
error: "missing_steps",
message: "Please provide at least one step.",
};
// ---------------------------------------------------------------------------
// isDecompositionOutput
// ---------------------------------------------------------------------------
describe("isDecompositionOutput", () => {
it("returns true for a full decomposition output", () => {
expect(isDecompositionOutput(DECOMPOSITION)).toBe(true);
});
it("returns false for an error output", () => {
expect(
isDecompositionOutput(ERROR_OUTPUT as unknown as DecomposeGoalOutput),
).toBe(false);
});
it("returns false when steps is not an array (type guard tightness)", () => {
const malformed = {
steps: "not-an-array",
goal: "test",
} as unknown as DecomposeGoalOutput;
expect(isDecompositionOutput(malformed)).toBe(false);
});
});
// ---------------------------------------------------------------------------
// isErrorOutput
// ---------------------------------------------------------------------------
describe("isErrorOutput", () => {
it("returns true for error output", () => {
expect(isErrorOutput(ERROR_OUTPUT as unknown as DecomposeGoalOutput)).toBe(
true,
);
});
it("returns false for decomposition output", () => {
expect(isErrorOutput(DECOMPOSITION)).toBe(false);
});
});
// ---------------------------------------------------------------------------
// getDecomposeGoalOutput — output parsing
// ---------------------------------------------------------------------------
describe("getDecomposeGoalOutput", () => {
it("parses a direct object output", () => {
const part = { output: DECOMPOSITION };
const result = getDecomposeGoalOutput(part);
expect(result).not.toBeNull();
expect(isDecompositionOutput(result!)).toBe(true);
});
it("parses a JSON-encoded string output", () => {
const part = { output: JSON.stringify(DECOMPOSITION) };
const result = getDecomposeGoalOutput(part);
expect(result).not.toBeNull();
expect(isDecompositionOutput(result!)).toBe(true);
expect((result as TaskDecompositionOutput).goal).toBe(
"Build a news summarizer",
);
});
it("parses an error output object", () => {
const part = { output: ERROR_OUTPUT };
const result = getDecomposeGoalOutput(part);
expect(result).not.toBeNull();
expect(isErrorOutput(result!)).toBe(true);
});
it("returns null for falsy output", () => {
expect(getDecomposeGoalOutput({ output: null })).toBeNull();
expect(getDecomposeGoalOutput({ output: undefined })).toBeNull();
expect(getDecomposeGoalOutput({ output: "" })).toBeNull();
});
it("returns null for a plain non-JSON string", () => {
expect(getDecomposeGoalOutput({ output: "just text" })).toBeNull();
});
it("returns null for a non-object part", () => {
expect(getDecomposeGoalOutput(null)).toBeNull();
expect(getDecomposeGoalOutput("string")).toBeNull();
expect(getDecomposeGoalOutput(42)).toBeNull();
});
it("returns null for an array-type output (not a valid shape)", () => {
expect(
getDecomposeGoalOutput({ output: ["not", "an", "object"] }),
).toBeNull();
});
it("classifies 'steps+goal' before 'error' when object has all three keys", () => {
// Verify type discrimination precedence: steps+goal wins
const mixed = { ...DECOMPOSITION, error: "some_error" };
const part = { output: mixed };
const result = getDecomposeGoalOutput(part);
expect(result).not.toBeNull();
expect(isDecompositionOutput(result!)).toBe(true);
});
it("returns message-only error when no error key but has message", () => {
const messageOnly = { type: "error", message: "Something failed" };
const result = getDecomposeGoalOutput({ output: messageOnly });
expect(result).not.toBeNull();
// isErrorOutput requires 'error' key, so this falls through to message-only branch
expect((result as DecomposeErrorOutput).message).toBe("Something failed");
});
});
// ---------------------------------------------------------------------------
// getAnimationText
// ---------------------------------------------------------------------------
describe("getAnimationText", () => {
it("shows analyzing text during input-streaming", () => {
const text = getAnimationText({ state: "input-streaming" });
expect(text.toLowerCase()).toContain("analyzing");
});
it("shows analyzing text during input-available", () => {
const text = getAnimationText({ state: "input-available" });
expect(text.toLowerCase()).toContain("analyzing");
});
it("shows plan ready with step count on output-available with decomposition", () => {
const text = getAnimationText({
state: "output-available",
output: DECOMPOSITION,
});
expect(text).toContain("3 steps");
});
it("shows analyzing when output-available but output is not a decomposition", () => {
const text = getAnimationText({
state: "output-available",
output: null,
});
expect(text.toLowerCase()).toContain("analyzing");
});
it("shows error text on output-error state", () => {
const text = getAnimationText({ state: "output-error" });
expect(text.toLowerCase()).toContain("error");
});
it("falls back to analyzing for unknown state", () => {
const text = getAnimationText({ state: "result" as never });
expect(text.toLowerCase()).toContain("analyzing");
});
});

View File

@@ -0,0 +1,38 @@
"use client";
import { Text } from "@/components/atoms/Text/Text";
import { CubeIcon } from "@phosphor-icons/react";
import { StepStatusIcon } from "../helpers";
interface Props {
index: number;
description: string;
blockName?: string | null;
status: string;
}
export function StepItem({ index, description, blockName, status }: Props) {
return (
<div className="flex items-start gap-3 py-1.5">
<div className="mt-0.5 flex shrink-0 items-center">
<StepStatusIcon status={status} />
</div>
<div className="min-w-0 flex-1">
<Text variant="body-medium" className="text-sm text-foreground">
{index + 1}. {description}
</Text>
{blockName && (
<div className="mt-0.5 flex items-center gap-1">
<CubeIcon size={12} className="text-muted-foreground" />
<Text
variant="small"
className="font-mono text-xs text-muted-foreground"
>
{blockName}
</Text>
</div>
)}
</div>
</div>
);
}

View File

@@ -0,0 +1,61 @@
import { render, screen } from "@/tests/integrations/test-utils";
import { describe, expect, it } from "vitest";
import { StepItem } from "../StepItem";
describe("StepItem", () => {
it("renders step number and description", () => {
render(
<StepItem index={0} description="Add input block" status="pending" />,
);
expect(screen.getByText("1. Add input block")).toBeDefined();
});
it("renders block name when provided", () => {
render(
<StepItem
index={1}
description="Add AI summarizer"
blockName="AI Text Generator"
status="pending"
/>,
);
expect(screen.getByText("AI Text Generator")).toBeDefined();
});
it("does not render block name when null", () => {
render(
<StepItem
index={0}
description="Connect blocks"
blockName={null}
status="pending"
/>,
);
expect(screen.queryByText("AI Text Generator")).toBeNull();
});
it("renders pending icon by default", () => {
render(<StepItem index={0} description="Step" status="pending" />);
expect(screen.getByLabelText("pending")).toBeDefined();
});
it("renders completed icon for completed status", () => {
render(<StepItem index={0} description="Step" status="completed" />);
expect(screen.getByLabelText("completed")).toBeDefined();
});
it("renders in-progress icon for in_progress status", () => {
render(<StepItem index={0} description="Step" status="in_progress" />);
expect(screen.getByLabelText("in progress")).toBeDefined();
});
it("renders failed icon for failed status", () => {
render(<StepItem index={0} description="Step" status="failed" />);
expect(screen.getByLabelText("failed")).toBeDefined();
});
it("uses zero-based index to render 1-based step number", () => {
render(<StepItem index={4} description="Fifth step" status="pending" />);
expect(screen.getByText("5. Fifth step")).toBeDefined();
});
});

View File

@@ -0,0 +1,176 @@
"use client";
import type { DecompositionStepModel } from "@/app/api/__generated__/models/decompositionStepModel";
import {
CheckCircleIcon,
CircleDashedIcon,
ListChecksIcon,
SpinnerGapIcon,
WarningDiamondIcon,
XCircleIcon,
} from "@phosphor-icons/react";
import type { ToolUIPart } from "ai";
import { ScaleLoader } from "../../components/ScaleLoader/ScaleLoader";
// Re-export generated step type for consumers that need it.
export type DecompositionStep = DecompositionStepModel;
export interface TaskDecompositionOutput {
type: string;
message: string;
session_id?: string | null;
goal: string;
steps: DecompositionStep[];
step_count: number;
}
export interface DecomposeErrorOutput {
type: string;
error?: string;
message?: string;
}
export type DecomposeGoalOutput =
| TaskDecompositionOutput
| DecomposeErrorOutput;
function parseOutput(output: unknown): DecomposeGoalOutput | null {
if (!output) return null;
if (typeof output === "string") {
const trimmed = output.trim();
if (!trimmed) return null;
try {
return parseOutput(JSON.parse(trimmed) as unknown);
} catch {
return null;
}
}
if (typeof output === "object" && !Array.isArray(output)) {
const obj = output as Record<string, unknown>;
if (
"steps" in obj &&
"goal" in obj &&
Array.isArray(obj.steps) &&
typeof obj.goal === "string"
) {
return obj as unknown as TaskDecompositionOutput;
}
if ("error" in obj && typeof obj.error === "string") {
return obj as unknown as DecomposeErrorOutput;
}
// Message-only error payload (no 'error' key but also not a decomposition)
if ("message" in obj && typeof obj.message === "string") {
return obj as unknown as DecomposeErrorOutput;
}
}
return null;
}
export function getDecomposeGoalOutput(
part: unknown,
): DecomposeGoalOutput | null {
if (!part || typeof part !== "object") return null;
return parseOutput((part as { output?: unknown }).output);
}
export function isDecompositionOutput(
output: DecomposeGoalOutput,
): output is TaskDecompositionOutput {
return (
"steps" in output &&
Array.isArray((output as TaskDecompositionOutput).steps) &&
"goal" in output
);
}
export function isErrorOutput(
output: DecomposeGoalOutput,
): output is DecomposeErrorOutput {
return "error" in output;
}
export function getAnimationText(part: {
state: ToolUIPart["state"];
output?: unknown;
}): string {
switch (part.state) {
case "input-streaming":
case "input-available":
return "Analyzing your goal...";
case "output-available": {
const output = parseOutput(part.output);
if (output && isDecompositionOutput(output))
return `Plan ready (${output.step_count} steps)`;
return "Analyzing your goal...";
}
case "output-error":
return "Error analyzing goal";
default:
return "Analyzing your goal...";
}
}
export function ToolIcon({
isStreaming,
isError,
}: {
isStreaming?: boolean;
isError?: boolean;
}) {
if (isError) {
return (
<WarningDiamondIcon size={14} weight="regular" className="text-red-500" />
);
}
if (isStreaming) {
return <ScaleLoader size={14} />;
}
return (
<ListChecksIcon size={14} weight="regular" className="text-neutral-400" />
);
}
export function AccordionIcon() {
return <ListChecksIcon size={32} weight="light" />;
}
export function StepStatusIcon({ status }: { status: string }) {
switch (status) {
case "completed":
return (
<CheckCircleIcon
size={18}
weight="fill"
className="text-emerald-500"
aria-label="completed"
/>
);
case "in_progress":
return (
<SpinnerGapIcon
size={18}
weight="bold"
className="animate-spin text-blue-500"
aria-label="in progress"
/>
);
case "failed":
return (
<XCircleIcon
size={18}
weight="fill"
className="text-red-500"
aria-label="failed"
/>
);
default:
return (
<CircleDashedIcon
size={18}
weight="regular"
className="text-neutral-400"
aria-label="pending"
/>
);
}
}
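One non-obvious property of `parseOutput` above is that the string branch recurses, so even a double-encoded payload (a JSON string that itself decodes to a JSON string) resolves to the object. A self-contained demonstration, trimmed to just the shape needed:

```typescript
// Minimal copy of parseOutput's decoding logic, to show the recursive
// string handling in isolation (classification branches omitted).
function parseOutput(output: unknown): Record<string, unknown> | null {
  if (!output) return null;
  if (typeof output === "string") {
    const trimmed = output.trim();
    if (!trimmed) return null;
    try {
      // Recurse: a decoded value that is still a string gets parsed again.
      return parseOutput(JSON.parse(trimmed));
    } catch {
      return null; // plain non-JSON text
    }
  }
  if (typeof output === "object" && !Array.isArray(output)) {
    return output as Record<string, unknown>;
  }
  return null;
}

const payload = { type: "task_decomposition", goal: "Build a news summarizer" };
const once = parseOutput(JSON.stringify(payload));
const twice = parseOutput(JSON.stringify(JSON.stringify(payload)));
// Both decode to the same object; "just text" and arrays decode to null.
```

This matters because tool outputs can arrive either as live objects or as serialized strings depending on the transport, and the recursion makes both (and accidental double serialization) behave identically.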

View File

@@ -736,24 +736,19 @@ describe("SubscriptionTierSection", () => {
).toBeDefined();
});
-it("renders BASIC cancellation copy in banner when pending_tier is BASIC", () => {
+it("renders cancellation copy in banner when pending_tier is NO_TIER", () => {
setupMocks({
subscription: makeSubscription({
tier: "MAX",
-pendingTier: "BASIC",
// Noon UTC so the local-formatted date lands on the same day
// regardless of the runner's timezone (midnight UTC drifts to
// the prior day in any timezone west of UTC).
+pendingTier: "NO_TIER",
pendingTierEffectiveAt: new Date("2026-05-15T12:00:00Z"),
}),
});
render(<SubscriptionTierSection />);
// Cancellation copy — distinct from the generic downgrade phrasing.
expect(
screen.getByText(/scheduled to cancel your subscription on/i),
).toBeDefined();
expect(screen.getByText(/May 15, 2026/)).toBeDefined();
// Must NOT render the "downgrade to" phrasing on cancellation.
expect(screen.queryByText(/scheduled to downgrade to/i)).toBeNull();
});
});

View File

@@ -25,7 +25,7 @@ export function PendingChangeBanner({
const currentLabel = getTierLabel(currentTier);
const dateText = formatPendingDate(pendingEffectiveAt);
-const isCancellation = pendingTier === "BASIC";
+const isCancellation = pendingTier === "NO_TIER";
return (
<div

View File

@@ -7,7 +7,7 @@ import { PendingChangeBanner } from "../PendingChangeBanner";
describe("PendingChangeBanner", () => {
const baseProps = {
currentTier: "PRO",
-pendingTier: "BASIC",
+pendingTier: "NO_TIER",
// Use noon UTC so the formatted local date lands on the same day
// regardless of the host timezone (important for CI runners).
pendingEffectiveAt: "2026-05-01T12:00:00Z",
@@ -25,7 +25,7 @@ describe("PendingChangeBanner", () => {
expect(container.firstChild).toBeNull();
});
-it("shows cancellation copy when pending tier is BASIC", () => {
+it("shows cancellation copy when pending tier is NO_TIER", () => {
render(<PendingChangeBanner {...baseProps} />);
expect(screen.getByText(/cancel your subscription on/i)).toBeDefined();
expect(screen.getByText("May 1, 2026")).toBeDefined();

View File

@@ -2109,6 +2109,9 @@
"$ref": "#/components/schemas/MCPToolsDiscoveredResponse"
},
{ "$ref": "#/components/schemas/MCPToolOutputResponse" },
{
"$ref": "#/components/schemas/TaskDecompositionResponse"
},
{ "$ref": "#/components/schemas/MemoryStoreResponse" },
{ "$ref": "#/components/schemas/MemorySearchResponse" },
{
@@ -10916,6 +10919,40 @@
],
"title": "CreditTransactionType"
},
"DecompositionStepModel": {
"properties": {
"step_id": {
"type": "string",
"title": "Step Id",
"description": "Unique step identifier, e.g. 'step_1'"
},
"description": {
"type": "string",
"title": "Description",
"description": "Human-readable step description"
},
"action": {
"type": "string",
"title": "Action",
"description": "Action type: 'add_block', 'connect_blocks', 'configure', etc."
},
"block_name": {
"anyOf": [{ "type": "string" }, { "type": "null" }],
"title": "Block Name",
"description": "Block being added, if applicable"
},
"status": {
"type": "string",
"title": "Status",
"description": "Step status: pending, in_progress, completed, failed",
"default": "pending"
}
},
"type": "object",
"required": ["step_id", "description", "action"],
"title": "DecompositionStepModel",
"description": "A single step in a decomposed agent-building plan."
},
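For orientation, the client-side type generated from this schema comes out roughly as below. This is hand-derived from the schema, not copied from the generated `decompositionStepModel` file, so the codegen's exact formatting and doc comments may differ; optionality follows the schema's `required` list.

```typescript
// Approximate shape of the generated DecompositionStepModel client type
// (hand-derived from the OpenAPI schema above).
interface DecompositionStepModel {
  step_id: string; // unique step identifier, e.g. "step_1"
  description: string; // human-readable step description
  action: string; // "add_block", "connect_blocks", "configure", etc.
  block_name?: string | null; // optional: not in the schema's required list
  status?: string; // optional; schema default is "pending"
}

// A step as the copilot might emit it for a connect action.
const example: DecompositionStepModel = {
  step_id: "step_3",
  description: "Connect blocks",
  action: "connect_blocks",
  block_name: null,
  status: "pending",
};
```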
"DeleteFileResponse": {
"properties": { "deleted": { "type": "boolean", "title": "Deleted" } },
"type": "object",
@@ -14865,6 +14902,7 @@
"agent_builder_clarification_needed",
"agent_builder_validation_result",
"agent_builder_fix_result",
"task_decomposition",
"block_list",
"block_details",
"block_output",
@@ -16420,6 +16458,39 @@
"required": ["recent_searches", "providers", "top_blocks"],
"title": "SuggestionsResponse"
},
"TaskDecompositionResponse": {
"properties": {
"type": {
"$ref": "#/components/schemas/ResponseType",
"default": "task_decomposition"
},
"message": { "type": "string", "title": "Message" },
"session_id": {
"anyOf": [{ "type": "string" }, { "type": "null" }],
"title": "Session Id"
},
"goal": {
"type": "string",
"title": "Goal",
"description": "The original user goal"
},
"steps": {
"items": { "$ref": "#/components/schemas/DecompositionStepModel" },
"type": "array",
"title": "Steps"
},
"step_count": {
"type": "integer",
"title": "Step Count",
"description": "Number of steps (auto-derived from steps list)",
"default": 0
}
},
"type": "object",
"required": ["message", "goal", "steps"],
"title": "TaskDecompositionResponse",
"description": "Response for decompose_goal tool — shows the plan to the user."
},
"TimezoneResponse": {
"properties": {
"timezone": {