Commit Graph

8407 Commits

Author SHA1 Message Date
majdyz
64c3ef45df chore: apply Prettier formatting to BuilderChatPanel files
Three files were flagged by the CI lint/format check — apply prettier
--write to bring them into compliance.
2026-04-13 04:15:37 +00:00
majdyz
77ed619613 fix(frontend/builder): add flowID to tool-call effect deps for correct navigation guard 2026-04-13 04:09:05 +00:00
majdyz
3b7e678b97 fix(frontend/builder): address round-5 review comments on BuilderChatPanel
- Add type="button" and focus-visible ring to Stop/Send buttons in PanelInput
- Add type="button" to Retry button in MessageList and Apply button in ActionList
- Fix MessageList to render plain text directly and only pass dynamic-tool parts
  to MessagePartRenderer (text parts were being misrouted through a tool renderer)
- Replace clearGraphSessionCacheForTesting export with _graphSessionCache for
  tests — avoids leaking test scaffolding into the production bundle
- Add toast notification in undo restore when target node was deleted between
  apply and undo (prevents silent no-op)
- Fix misleading test: remove red-herring mockNodes.push from 'no auto-send' test
  since the guard is isGraphLoaded===false, not the node array
- Add truncation-path coverage to helpers.test.ts (MAX_NODES/MAX_EDGES branches)
- Add deleted-node undo test to actionApplicators.test.ts
2026-04-13 04:01:42 +00:00
majdyz
ed65756d58 fix(frontend): reset isCreatingSession in retrySession and cap parsedActions
Add setIsCreatingSession(false) at the start of retrySession so the
spinner is cleared when retrying during an in-flight session creation.

Add MAX_PARSED_ACTIONS=100 cap to trim the oldest entries from the
accumulated action list, preventing unbounded growth in long conversations.
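
Illustrative sketch only (not part of the commit): the cap described above can be pictured as a tail slice. The actual change lives in the TypeScript hook; `cap_parsed_actions` is a hypothetical name used here for illustration.

```python
MAX_PARSED_ACTIONS = 100  # cap named in the commit message


def cap_parsed_actions(actions: list, limit: int = MAX_PARSED_ACTIONS) -> list:
    """Return at most `limit` entries, keeping the most recent ones."""
    if limit <= 0:
        # actions[-0:] would return the whole list, so handle this explicitly.
        return []
    # A negative slice keeps the newest entries and drops the oldest.
    return actions[-limit:]
```

Lists shorter than the cap pass through unchanged, so the helper can be applied after every append.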
2026-04-12 23:22:08 +00:00
majdyz
44aa676fa5 fix(frontend): update stale assertion in applyConnectNodes duplicate-edge test
After the double-call fix removed the direct setAppliedActionKeys call
from the alreadyExists branch, the test still expected it to be called
once. Updated to .not.toHaveBeenCalled() since the caller
(handleApplyAction) now handles marking applied keys.
2026-04-12 12:09:06 +00:00
majdyz
6bad358d78 fix(frontend/builder): address review comments on useBuilderChatPanel
- Clear graphSessionCache on user change to prevent session leaks across sign-outs
- Reset all per-session state in retrySession including skipNextParseRef/skipNextToolScanRef
- Trim whitespace in sendRawMessage before empty guard
- Remove duplicate setAppliedActionKeys call from alreadyExists branch in applyConnectNodes
2026-04-12 11:01:34 +00:00
majdyz
47852cfdf5 fix(frontend/builder): add navigation race guard to tool-call detection effect
Mirrors the existing `skipNextParseRef` guard on the parse-actions effect.
When `flowID` changes, the reset effect clears `processedToolCallsRef` and
`lastScannedToolCallIndexRef` and queues `setMessages([])`, but the cleared
messages are not yet committed when the tool-call detection effect runs in
the same effect cycle. Without the skip, the effect would re-scan the
previous graph's messages from index 0 and re-fire `onGraphEdited` /
`setQueryStates(flowExecutionID)` for tool calls belonging to the old
graph — triggering a stray `refetchGraph()` on the new graph or
auto-following a stale execution.

Uses a separate `skipNextToolScanRef` so each effect consumes its own
flag independently; a shared ref would let whichever effect ran first
clear the guard before the other could skip.
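
Illustrative sketch only (not part of the commit): a small Python simulation of why each effect needs its own one-shot flag. `OneShotFlag` and `run_effect_cycle` are hypothetical names; the real guards are React refs in the TypeScript hook.

```python
class OneShotFlag:
    """A flag that reads as set exactly once after being armed."""

    def __init__(self) -> None:
        self._armed = False

    def arm(self) -> None:
        self._armed = True

    def consume(self) -> bool:
        """Return whether the flag was armed, and clear it."""
        was_armed, self._armed = self._armed, False
        return was_armed


def run_effect_cycle(parse_flag: OneShotFlag, scan_flag: OneShotFlag) -> list[str]:
    """Run both effects in order; each skips one pass if its flag is armed."""
    ran = []
    if not parse_flag.consume():
        ran.append("parse-actions")
    if not scan_flag.consume():
        ran.append("tool-call-scan")
    return ran
```

If the same flag object is passed for both arguments, the first effect consumes it and the second one runs against stale messages. With two flags, both effects skip the navigation cycle and resume on the next one.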
2026-04-11 10:08:08 +00:00
majdyz
83fc444a3d fix(frontend/builder): re-add navigation race guard for parsedActions
When the user navigates between graphs, the flowID-reset effect resets
`lastParsedMessageIndexRef` and the parsed-actions cache, then queues
`setMessages([])`. The parse-actions effect runs in the same effect
cycle — *before* the queued state updates are committed — so its
`messages` closure still belongs to the previous graph. With the index
reset to -1 and the cache empty, it would re-scan those stale messages
from index 0 and briefly flash the previous graph's actions in the new
panel.

A previous guard (`277c19642`) was lost when commit `1935137c1` (the
DefaultChatTransport memoization fix) accidentally dropped the
`if (currentFlowIDRef.current !== flowID) return;` line. That guard
was actually a no-op because `currentFlowIDRef` is updated by an earlier
effect in the same cycle, so the check never fired — the bug was masked
in practice but came back into view when Sentry re-flagged it.

Replace the removed line with a one-shot `skipNextParseRef` flag that
the cleanup effect sets only on *actual* navigation (not initial mount,
detected via `prevFlowIDRef`). The parse-actions effect skips one pass
when the flag is set, then clears it. This correctly handles:

  - Initial mount: no skip (flag stays false), first run parses normally.
  - Navigation: skip one pass; next render arrives with fresh messages
    from useChat's re-key and parses them correctly.
  - Same-flowID re-render: cleanup doesn't fire, no skip, normal parse.

New regression test reproduces the navigation race in the parsed-actions
integration suite.

Sentry bug prediction: PRRT_kwDOJKSTjM56RVeU (severity HIGH).
2026-04-11 05:18:25 +00:00
majdyz
1935137c10 fix(frontend): memoize DefaultChatTransport to prevent mid-stream resets
Wraps the DefaultChatTransport instantiation in useMemo([sessionId]) so
the same transport object is reused across renders. Without memoization,
each streaming chunk (which triggers a re-render) created a new transport
instance, resetting useChat's internal Chat state mid-stream. Matches the
pattern already used in useCopilotStream.ts.
2026-04-11 09:25:37 +07:00
Zamil Majdy
e79214f3dd refactor(frontend/builder): remove useMemo violations + add incremental tool-call scanning
Per AGENTS.md conventions, useMemo/useCallback should not be used unless
asked to optimise. Remove useMemo from ActionList (nodeMap), MessageList
(visibleMessages filter), and useBuilderChatPanel (transport).

Also add lastScannedToolCallIndexRef to make tool-call detection O(new
messages) matching the action parser's incremental approach.
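
Illustrative sketch only (not part of the commit): incremental scanning with a remembered index, written in Python. The message shape (`tool_call_ids`) and the class name are assumptions made for this example; the real hook may also re-check the last message while it is still streaming.

```python
class IncrementalToolCallScanner:
    """Scan only messages that arrived since the previous scan."""

    def __init__(self) -> None:
        self.last_scanned_index = -1
        self.processed_ids: set[str] = set()

    def scan(self, messages: list[dict]) -> list[str]:
        """Return tool-call ids not seen before, visiting only new messages."""
        new_ids = []
        for message in messages[self.last_scanned_index + 1:]:
            for call_id in message.get("tool_call_ids", []):
                if call_id not in self.processed_ids:
                    self.processed_ids.add(call_id)
                    new_ids.append(call_id)
        self.last_scanned_index = len(messages) - 1
        return new_ids

    def reset(self) -> None:
        """Forget everything, e.g. when navigating to another graph."""
        self.last_scanned_index = -1
        self.processed_ids.clear()
```

Each call does work proportional to the number of messages appended since the last call, not to the full conversation length.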
2026-04-11 09:08:15 +07:00
majdyz
958344562b merge(frontend/builder): resolve conflicts from PR #12726 dev merge
Resolve merge conflicts between builder-chat-panel feature and the
per-model cost breakdown PR (#12726):

- Flow.tsx: keep ErrorBoundary wrapper from our branch
- BuilderChatPanel.tsx + useBuilderChatPanel.ts: keep our latest refactor
- platform_cost_test.py: use Prisma ORM style for export test (theirs)
- useBuilderChatPanel.test.ts + BuilderChatPanel.test.tsx: keep latest tests
2026-04-11 08:54:55 +07:00
majdyz
bb071a9c88 refactor(frontend/builder): address review comments in BuilderChatPanel
- Remove useCallback from handleApplyAction (violates AGENTS.md)
- Import TEXTAREA_MAX_LENGTH from PanelInput instead of duplicating constant
- Remove dead @tanstack/react-query mock and associated invalidateQueries test
2026-04-11 00:03:10 +07:00
Zamil Majdy
b319c26cab feat(platform/admin): per-model cost breakdown, cache token tracking, OrchestratorBlock cost fix (#12726)
## Why

The platform cost tracking system had several gaps that made the admin
dashboard less accurate and harder to reason about:

**Q: Do we have per-model granularity on the provider page?**
The `model` column was stored in `PlatformCostLog` but the SQL
aggregation grouped only by `(provider, tracking_type)`, so all models
for a given provider collapsed into one row. Now grouped by `(provider,
tracking_type, model)` — each model gets its own row.

**Q: Why does Anthropic show `per_run` for OrchestratorBlock?**
Bug: `OrchestratorBlock._call_llm()` was building `NodeExecutionStats`
with only `input_token_count` and `output_token_count` — it dropped
`resp.provider_cost` entirely. For OpenRouter calls this silently
discarded the `cost_usd`. For the SDK (autopilot) path,
`ResultMessage.total_cost_usd` was never read. When `provider_cost` is
None and token counts are 0 (e.g. SDK error path), `resolve_tracking`
falls through to `per_run`. Fixed by propagating all cost/cache fields.

**Q: Why can't we get `cost_usd` for Anthropic direct API calls?**
The Anthropic Messages API does not return a dollar amount — only token
counts. OpenRouter returns cost via response headers, so it uses
`cost_usd` directly. The Claude Agent SDK *does* compute
`total_cost_usd` internally, so SDK-mode OrchestratorBlock runs now get
`cost_usd` tracking. For direct Anthropic LLM blocks the estimate uses
per-token rates (see cache section below).

**Q: What about labeling by source (autopilot vs block)?**
Already tracked: `block_name` stores `copilot:SDK`, `copilot:Baseline`,
or the actual block name. Visible in the raw logs table. Not added to
the provider group-by (would explode row count); use the logs table
filter instead.

**Q: Is there double-counting between `tokens`, `per_run`, and
`cost_usd`?**
No. `resolve_tracking()` uses a strict preference hierarchy — exactly
one tracking type per execution: `cost_usd` > `tokens` > provider
heuristics > `per_run`. A single execution produces exactly one
`PlatformCostLog` row.
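
The preference order can be sketched as a chain of early returns. This is a hedged illustration, not the real `resolve_tracking()`: the signature, the `heuristic_type` parameter, and treating a zero cost as a valid `cost_usd` value are all assumptions.

```python
from typing import Optional


def resolve_tracking_sketch(
    provider_cost: Optional[float],
    input_tokens: int,
    output_tokens: int,
    heuristic_type: Optional[str] = None,  # hypothetical stand-in for provider heuristics
) -> str:
    """Pick exactly one tracking type: cost_usd > tokens > heuristics > per_run."""
    if provider_cost is not None:
        return "cost_usd"
    if input_tokens or output_tokens:
        return "tokens"
    if heuristic_type is not None:
        return heuristic_type
    return "per_run"
```

Because every branch returns, one execution can never be counted under two tracking types.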

**Q: Should we track Anthropic prompt cache tokens (PR #12725)?**
Yes — PR #12725 adds `cache_control` markers to Anthropic API calls,
which causes the API to return `cache_read_input_tokens` and
`cache_creation_input_tokens` alongside regular `input_tokens`. These
have different billing rates:
- Cache reads: **10%** of base input rate (much cheaper)
- Cache writes: **125%** of base input rate (slightly more expensive,
one-time)
- Uncached input: **100%** of base rate

Without tracking them separately, a flat-rate estimate on
`total_input_tokens` would be wrong in both directions.
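
A rough sketch of the weighted estimate, assuming the three token counts are reported separately and the base rate is expressed in USD per million input tokens. The function name and parameter names are hypothetical; only the 10% / 125% / 100% multipliers come from the text above.

```python
CACHE_READ_MULTIPLIER = 0.10   # cache reads: 10% of base input rate
CACHE_WRITE_MULTIPLIER = 1.25  # cache writes: 125% of base input rate


def estimate_input_cost_usd(
    uncached_tokens: int,
    cache_read_tokens: int,
    cache_creation_tokens: int,
    base_rate_per_million_usd: float,
) -> float:
    """Estimate input cost using a separate rate for each token type."""
    weighted_tokens = (
        uncached_tokens
        + cache_read_tokens * CACHE_READ_MULTIPLIER
        + cache_creation_tokens * CACHE_WRITE_MULTIPLIER
    )
    return weighted_tokens * base_rate_per_million_usd / 1_000_000
```

With 1,000 uncached, 10,000 cache-read, and 2,000 cache-write tokens at a base rate of $3 per million, the weighted estimate is about $0.0135, while a flat rate on all 13,000 tokens gives about $0.039. A write-heavy request errs the other way, since writes cost more than the base rate.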

## What

- **Per-model provider table**: SQL now groups by `(provider,
tracking_type, model)`. `ProviderCostSummary` and the frontend
`ProviderTable` show a model column.
- **Cache token columns**: New `cacheReadTokens` and
`cacheCreationTokens` columns in `PlatformCostLog` with matching
migration.
- **LLM block cache tracking**: `LLMResponse` captures
`cache_read_input_tokens` / `cache_creation_input_tokens` from Anthropic
responses. `NodeExecutionStats` gains `cache_read_token_count` /
`cache_creation_token_count`. Both propagate to `PlatformCostEntry` and
the DB.
- **Copilot path**: `token_tracking.persist_and_record_usage` now writes
cache tokens as dedicated `PlatformCostEntry` fields (was
metadata-only).
- **OrchestratorBlock bug fix**: `_call_llm()` now includes
`resp.provider_cost`, `resp.cache_read_tokens`,
`resp.cache_creation_tokens` in the stats merge. SDK path captures
`ResultMessage.total_cost_usd` as `provider_cost`.
- **Accurate cost estimation**: `estimateCostForRow` uses
token-type-specific rates for `tokens` rows (uncached=100%, reads=10%,
writes=125% of configured base rate).

## How

`resolve_tracking` priority is unchanged. For Anthropic LLM blocks the
tracking type remains `tokens` (Anthropic API returns no dollar amount).
For OrchestratorBlock in SDK/autopilot mode it now correctly uses
`cost_usd` because the Claude Agent SDK computes and returns
`total_cost_usd`. For OpenRouter through OrchestratorBlock it now
correctly uses `cost_usd` (was silently dropped before).

## Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] `ProviderCostSummary` SQL updated
  - [x] Cache token fields present in `PlatformCostEntry` and
    `PlatformCostLogCreateInput`
  - [x] Prisma client regenerated — all type checks pass
  - [x] Frontend `helpers.test.ts` updated for new `rateKey` format
  - [x] Pre-commit hooks pass (Black, Ruff, isort, tsc, Prisma generate)
2026-04-10 23:14:43 +07:00
Zamil Majdy
85921f227a Merge branch 'dev' of github.com:Significant-Gravitas/AutoGPT into preview/all-active-prs 2026-04-10 22:59:30 +07:00
Zamil Majdy
5844b13fb1 feat(backend/copilot): support multiple questions in ask_question tool (#12732)
### Why / What / How

**Why:** The `ask_question` copilot tool previously only accepted a
single question per invocation. When the LLM needs to ask multiple
clarifying questions simultaneously, it either crams them into one text
field (requiring users to format numbered answers manually) or makes
multiple sequential tool calls (slow and disruptive UX).

**What:** Replace the single `question`/`options`/`keyword` parameters
with a `questions` array parameter so the LLM can ask multiple questions
in one tool call, each rendered as its own input box.

**How:** Simplified the tool to accept only `questions` (array of
question objects). Each item has `question` (required), `options`, and
`keyword`. The frontend `ClarificationQuestionsCard` already supports
rendering multiple questions — no frontend changes needed.

### Changes 🏗️

- `backend/copilot/tools/ask_question.py`: Replaced dual
question/questions schema with single `questions` array. Extracted
parsing into module-level `_parse_questions` and `_parse_one` helpers.
Follows backend code style: early returns, list comprehensions, top-down
ordering, functions under 40 lines.
- `backend/copilot/tools/ask_question_test.py`: Rewritten with 18
focused tests covering happy paths, keyword handling, options filtering,
and invalid input handling.

### Checklist 📋

#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
  - [ ] Run `poetry run pytest backend/copilot/tools/ask_question_test.py`
    and confirm all tests pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 21:54:53 +07:00
majdyz
7804e03e7a style(builder-chat): apply prettier formatting to actionApplicators
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 13:20:46 +00:00
majdyz
277c196426 fix(builder-chat): guard parsed actions against flowID navigation race
When flowID changes, the flow-reset effect clears messages and
parsedActions but those state updates aren't committed until the next
render. The parsedActions effect could run on a render where the
currentFlowIDRef still holds the previous flowID, briefly re-populating
parsedActions from stale messages and flashing old action buttons in
the new chat panel.

Fix: skip parsing when currentFlowIDRef.current !== flowID, and add
flowID to the effect deps so the effect re-runs once the ref catches up.

Addresses sentry finding 13127725.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 13:15:32 +00:00
majdyz
c7885e32ec fix(builder-chat): differential undo for applied graph actions
Undo for chat-applied graph edits now reverts only the specific field
or edge that was changed, rather than restoring a full snapshot of the
nodes/edges arrays. Restoring a whole-array snapshot discarded any
subsequent manual edits the user made after clicking Apply, which was
flagged as a high-severity regression.

- applyUpdateNodeInput: snapshot only the previous value of the single
  field that is about to change. The undo closure re-reads live nodes
  at undo time and only rewrites action.key on the target node. If the
  field did not exist pre-apply, undo deletes it from hardcodedValues.
- applyConnectNodes: drop the pre-clone of edges entirely. The undo
  closure re-reads live edges at undo time and filters out the single
  edge matching source/target/handles, preserving other edges that
  were added afterwards.
- Tests updated to assert differential behavior (later unrelated edits
  are preserved through undo for both nodes and edges).
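
Illustrative sketch only (not part of the commit): differential undo for a single node field, in Python. The node store shape and the function name are assumptions, and the sketch mutates dictionaries in place, whereas the real TypeScript code works with React state.

```python
from typing import Any, Callable

_MISSING = object()  # sentinel: the field did not exist before apply


def apply_update_node_input(
    nodes: dict[str, dict], node_id: str, key: str, value: Any
) -> Callable[[], bool]:
    """Set one field and return an undo closure that reverts only that field."""
    values = nodes[node_id]["hardcodedValues"]
    previous = values.get(key, _MISSING)  # snapshot just this field
    values[key] = value

    def undo() -> bool:
        node = nodes.get(node_id)  # re-read live state at undo time
        if node is None:
            return False  # node was deleted; caller can notify the user
        live_values = node["hardcodedValues"]
        if previous is _MISSING:
            live_values.pop(key, None)
        else:
            live_values[key] = previous
        return True

    return undo
```

Because the closure touches only `key` on the target node, manual edits made to other fields after Apply survive the undo.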

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 13:02:40 +00:00
majdyz
41f4064555 fix(frontend/builder): address round-3 review feedback
- Pure render: move parsedActions incremental parsing from useMemo into a
  useEffect (mutating refs inside a memo breaks React Strict Mode). Also
  move currentFlowIDRef sync into an effect so the render body stays pure.
- actionApplicators: extract DEFAULT_EDGE_MARKER_COLOR constant, add a
  shared safeCloneArray<T> helper, and export cloneNodes for unit tests.
- Add dedicated actionApplicators.test.ts (21 tests) covering validation,
  undo snapshot isolation, dangerous-key blocking, idempotent connect,
  and the structuredClone fallback path.
- Add MessageList.test.ts covering normalizePartForRenderer with a runtime
  type guard (isDynamicToolPart) so the unsafe double cast has a real
  regression test.
- Add useBuilderChatPanel tests for sendRawMessage length clamp (empty,
  canSend guard, under-cap passthrough, 4000-char truncation).
- Add BUILDER_CHAT_PANEL default tests to envFlagOverride test suite.
- Memoize visibleMessages filter and nodeMap map construction so they do
  not rebuild on every streaming re-render.
- Accessibility: add focus-visible ring to the toggle button; mark the
  Applied badge with role=status + aria-live=polite for screen readers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 12:32:25 +00:00
majdyz
86f6695a33 refactor(frontend/builder): split chat panel into smaller files + address latest review
Addresses the "Should Fix" and "Nice to Have" items from the latest automated review.

- Extracts PanelHeader, MessageList, ActionList, PanelInput, TypingIndicator
  into a local components/ folder so BuilderChatPanel.tsx drops from 452 to
  ~128 lines (under the AGENTS.md 200-line guideline).
- Extracts handleApplyAction branches into applyUpdateNodeInput /
  applyConnectNodes helpers in actionApplicators.ts, and shares a pushUndoEntry
  helper to DRY the MAX_UNDO trimming. useBuilderChatPanel.ts drops from 616
  to ~510 lines.
- Uses structuredClone() for undo snapshots so restore callbacks are isolated
  from in-place mutations (falls back to a shallow copy on unsupported envs).
- Incremental action parsing: lastParsedMessageIndexRef + parsedActionsCacheRef
  avoid the O(all_messages) re-scan per turn.
- Adds a simple LRU cap (MAX_SESSION_CACHE = 50) to graphSessionCache so the
  module-scope Map cannot grow unbounded across navigations.
- sendRawMessage now clamps to MAX_RAW_MESSAGE_LENGTH so programmatic callers
  cannot bypass the textarea length cap.
- TypingIndicator gains role="status"/aria-label for screen readers.
- PanelInput shows a character counter once >=80% of maxLength and highlights
  red at the limit.
- Panel container uses max-h-[70vh] (with min-h-[320px] and sm:max-h-[75vh])
  so it gracefully shrinks on small screens instead of overlapping the
  builder toolbar.
- normalizePartForRenderer extracted from the inline dynamic-tool transform.
- Adds BuilderChatPanel.test.tsx coverage for connect_nodes action label
  rendering (with customized_name + fallback to raw node id).
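
Illustrative sketch only (not part of the commit): one way to bound a module-scope cache such as `graphSessionCache`, written in Python with `collections.OrderedDict`. The class name and the choice to refresh recency on reads are assumptions; the real TypeScript cap may simply evict the oldest insertion.

```python
from collections import OrderedDict
from typing import Optional

MAX_SESSION_CACHE = 50  # cap named in the commit message


class SessionCache:
    """Bounded graph-id to session-id map that evicts the least recently used."""

    def __init__(self, max_size: int = MAX_SESSION_CACHE) -> None:
        self._max_size = max_size
        self._entries: OrderedDict[str, str] = OrderedDict()

    def get(self, graph_id: str) -> Optional[str]:
        if graph_id not in self._entries:
            return None
        self._entries.move_to_end(graph_id)  # mark as most recently used
        return self._entries[graph_id]

    def set(self, graph_id: str, session_id: str) -> None:
        self._entries[graph_id] = session_id
        self._entries.move_to_end(graph_id)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # drop least recently used

    def __len__(self) -> int:
        return len(self._entries)
```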

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:52:27 +00:00
majdyz
1921795e42 fix(frontend/builder): wrap BuilderChatPanel in ErrorBoundary
Prevents a runtime error in action parsing or message rendering from
crashing the entire build page. Uses a null fallback so the rest of the
editor remains usable if the chat panel fails.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:01:31 +00:00
majdyz
503e5e1f38 fix(frontend/builder): gate seed message on isOpen to prevent empty-graph context poison
When navigating between graphs with the panel closed, the cached session
could trigger the seed message effect with isOpen=false, causing the nodes
selector to return EMPTY_NODES. This would send an empty-graph summary to
the AI, poisoning the context before the panel was even opened.

Fix: add `isOpen` guard at the top of the seed effect and include it in
the dependency array so the seed fires only when the panel is visible and
nodes reflect the actual graph state.

Add regression test: verifies seed is NOT sent when panel is closed even
when sessionId is cached from a prior navigation and isGraphLoaded is true.
2026-04-10 09:25:16 +07:00
majdyz
6b6ce9db27 test(frontend/builder): add run_agent tool-call detection tests and capture setQueryStates
Adds four tests covering run_agent tool-call handling in useBuilderChatPanel:
- sets flowExecutionID via setQueryStates when execution_id is valid
- skips setQueryStates when output has no execution_id
- rejects path-traversal execution_id (security: /^[\w-]+$/i validation)
- deduplicates run_agent via processedToolCallsRef
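
A Python rendering of the allow-list check in the list above, for illustration. `is_safe_execution_id` is a hypothetical name; `re.ASCII` and `fullmatch` are used because Python's `\w` and `$` are more permissive by default than their JavaScript counterparts.

```python
import re

# Word characters and hyphens only; anything else (slashes, dots) is rejected.
_EXECUTION_ID = re.compile(r"[\w-]+", re.ASCII)


def is_safe_execution_id(value: object) -> bool:
    """Accept only non-empty strings made of [A-Za-z0-9_-]."""
    return isinstance(value, str) and _EXECUTION_ID.fullmatch(value) is not None
```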

Captures mockSetQueryStates from nuqs mock so run_agent assertions can verify
the correct query-state mutation rather than just the absence of errors.
2026-04-10 09:18:14 +07:00
Zamil Majdy
c014e1aa35 merge(preview): merge all active PRs into preview/all-active-prs from fresh dev 2026-04-10 08:40:23 +07:00
Zamil Majdy
e59f576622 Merge remote-tracking branch 'origin/spare/13' into preview/all-active-prs 2026-04-10 08:39:34 +07:00
Zamil Majdy
c99fa32ae3 Merge remote-tracking branch 'origin/spare/3' into preview/all-active-prs 2026-04-10 08:39:34 +07:00
Zamil Majdy
b71789da50 Merge remote-tracking branch 'origin/feat/subscription-tier-billing' into preview/all-active-prs 2026-04-10 08:39:34 +07:00
Zamil Majdy
5661326e7e fix(platform): fetch real Stripe prices in subscription status endpoint
- Import get_subscription_price_id in v1.py
- get_subscription_status now calls stripe.Price.retrieve for PRO/BUSINESS
  tiers to return actual unit_amount instead of hardcoded zeros
- UI will now show correct monthly costs when LD price IDs are configured
- Fix Button import from __legacy__ to design system in SubscriptionTierSection
- Update subscription status tests to mock the new Stripe price lookup
2026-04-10 08:37:40 +07:00
Zamil Majdy
df3fe926f2 style(backend/copilot): apply Black formatting to ask_question
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:56:42 +00:00
Zamil Majdy
505af7e673 refactor(backend/copilot): simplify ask_question to questions-only API
Drop the dual question/questions schema in favor of a single
`questions` array parameter. This removes ~175 lines of complexity
(the _execute_single path, duplicate params, precedence logic).

Restructured per backend code style rules:
- Top-down ordering: public _execute first, helpers below
- Early return with guard clauses, no deep nesting
- List comprehensions via walrus operator in _parse_questions
- Helpers extracted as module-level functions (not methods)
- Functions under 40 lines each
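
A hedged sketch of the comprehension-plus-walrus style mentioned in the list above. These are not the real `_parse_questions` / `_parse_one` helpers: the names, the return type, and returning an empty list for non-list input are assumptions.

```python
from typing import Any, Optional


def _clean_question(item: Any) -> Optional[str]:
    """Return the trimmed question text, or None for an invalid item."""
    if not isinstance(item, dict):
        return None
    text = str(item.get("question") or "").strip()
    return text or None


def parse_questions_sketch(raw: Any) -> list[str]:
    """Keep valid questions; the walrus binds each parsed value exactly once."""
    if not isinstance(raw, list):
        return []
    return [text for item in raw if (text := _clean_question(item)) is not None]
```

The walrus operator lets the comprehension both test and reuse the helper's result without calling it twice per item.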

The frontend ClarificationQuestionsCard already renders arrays of
any length — no UI changes needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:54:11 +00:00
Zamil Majdy
d896a1f9fa fix(backend/copilot): add missing isinstance assertion in test
Add isinstance narrowing in test_execute_multiple_questions_ignores_single_params
to fix Pyright type-check CI failure (reportAttributeAccessIssue).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:48:02 +00:00
Zamil Majdy
6aa5a808e0 fix(backend/copilot): add isinstance assertions to fix type-check CI
Tests that access `result.questions` without first narrowing the type
from `ToolResponseBase` to `ClarificationNeededResponse` cause Pyright
type-check failures. Added `assert isinstance(result,
ClarificationNeededResponse)` before accessing `.questions` in 4 tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:40:08 +00:00
majdyz
d49f2518a2 fix(frontend/builder): always clear messages on flowID change to keep action state consistent
When navigating back to a cached session, appliedActionKeys was reset to empty
but messages were preserved. This caused previously applied actions to reappear
as unapplied in the UI, allowing them to be re-applied and creating duplicate
undo entries. Clearing messages unconditionally on navigation ensures the
displayed action buttons always reflect the actual applied state.
2026-04-10 02:03:56 +07:00
Zamil Majdy
18c88b4da0 fix(frontend/builder): always clear messages on flowID change to keep action state consistent
When navigating back to a cached session, appliedActionKeys was reset to empty
but messages were preserved. This caused previously applied actions to reappear
as unapplied in the UI, allowing them to be re-applied and creating duplicate
undo entries. Clearing messages unconditionally on navigation ensures the
displayed action buttons always reflect the actual applied state.
2026-04-10 02:03:56 +07:00
Zamil Majdy
3a5ce570e0 fix(backend/copilot): address PR review round 4
- Restore top-level `required: ["question"]` in schema for LLM tool-
  calling compatibility; validation handles the questions-only path
- Fix keyword null bug: `item.get("keyword")` returning None now
  correctly falls back to `question-{idx}` instead of producing "None"
- Filter empty-string options in _build_question (`str(o).strip()`)
  to avoid artifacts like "Email, , Slack"
- Revert session type hint to `ChatSession` to match base class contract
- Add tests for null keyword and empty-string options filtering
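
The keyword and options fixes above can be illustrated as follows. `dict.get` with a default only applies the default when the key is absent, so an explicit `None` value slips through and stringifies to "None". The helper names are hypothetical, and trimming the keyword is an assumption.

```python
from typing import Any


def keyword_for(item: dict, idx: int) -> str:
    """Fall back to question-{idx} when the keyword is missing, None, or blank."""
    keyword = item.get("keyword")
    if keyword is None or not str(keyword).strip():
        return f"question-{idx}"
    return str(keyword).strip()


def clean_options(options: Any) -> list[str]:
    """Drop None and blank options but keep falsy-looking values such as 0."""
    if not isinstance(options, list):
        return []
    return [text for o in options if o is not None and (text := str(o).strip())]
```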

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 18:56:37 +00:00
Zamil Majdy
5a3739e54d fix(backend/copilot): address PR review round 2
- Remove top-level `required: ["question"]` from schema so the
  `questions`-only calling convention is valid for schema-compliant LLMs
- Move logger assignment below all imports (PEP 8 / isort)
- Remove duplicated option filtering in `_execute_single`; let
  `_build_question` own that responsibility
- Fix `session` type hint to `ChatSession | None` to match the guard
- Add test for `questions` as non-list type (falls back to single path)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 18:43:11 +00:00
majdyz
31cb6a2f58 fix(frontend/builder): guard msg.parts with nullish coalescing to prevent runtime error 2026-04-10 01:41:15 +07:00
Zamil Majdy
72bc8a92df fix(frontend/builder): guard msg.parts with nullish coalescing to prevent runtime error 2026-04-10 01:41:15 +07:00
Zamil Majdy
cc29cf5e20 fix(backend/copilot): address PR review round 1
- Fix falsy option filtering: use `if o is not None` instead of `if o`
  so valid values like "0" are preserved
- Improve multi-question `message` field: join all questions with ";"
  instead of only using the first question's text
- Add logging warnings for skipped invalid items in multi-question path
  instead of silently dropping them
- Simplify schema: use `"required": ["question"]` instead of empty
  required + anyOf (more LLM-friendly)
- Add missing test cases: session=None, single-item questions array,
  duplicate keywords, falsy option values

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 18:39:55 +00:00
Zamil Majdy
a0efbbba90 feat(backend/copilot): support multiple questions in ask_question tool
The ask_question tool previously only accepted a single question per
invocation, forcing the LLM to cram multiple queries into one text box
or make multiple sequential tool calls. This adds a `questions` parameter
(list of question objects) so multiple input fields render at once.

Backward-compatible: the existing `question`/`options`/`keyword` params
still work. When `questions` (plural) is provided, they take precedence.
The frontend ClarificationQuestionsCard already supports rendering
multiple questions — no frontend changes needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 18:21:35 +00:00
majdyz
a721fc689b fix(frontend/builder): clear stale messages in retrySession so new session starts clean 2026-04-10 00:56:31 +07:00
Zamil Majdy
8ed959433a fix(frontend/builder): clear stale messages in retrySession so new session starts clean 2026-04-10 00:56:31 +07:00
majdyz
dc0631809c fix(frontend/builder): reset hasSentSeedMessageRef in retrySession so seed is sent to new session 2026-04-10 00:39:10 +07:00
Zamil Majdy
98f3e09580 fix(frontend/builder): reset hasSentSeedMessageRef in retrySession so seed is sent to new session 2026-04-10 00:39:10 +07:00
Zamil Majdy
9ec44dd109 test(backend): add route-level tests for subscription API endpoints
Tests for GET/POST /credits/subscription covering:
- GET returns current tier (PRO, FREE default when None)
- POST FREE skips Stripe when payment disabled
- POST PRO sets tier directly for beta users (payment disabled)
- POST paid tier rejects missing success_url/cancel_url with 422
- POST paid tier creates Stripe Checkout Session and returns URL
- POST FREE with payment enabled cancels active Stripe subscription
2026-04-10 00:19:06 +07:00
Zamil Majdy
bfb82b6246 fix(platform): address reviewer feedback on subscription endpoint
- Remove useCallback from changeTier (not needed per project guidelines)
- Block self-service tier changes for ENTERPRISE users (admin-managed)
- Preserve current tier on unrecognized Stripe price_id instead of
  defaulting to FREE (prevents accidental downgrades during price migration)
2026-04-10 00:08:54 +07:00
Zamil Majdy
63210770ce test(backend): add tests for get_subscription_price_id to improve coverage 2026-04-09 23:54:02 +07:00
Zamil Majdy
f2b8f81bb1 test(backend/copilot): add unit tests for update_message_content_by_sequence
Cover success, not-found (returns False + warning), and DB-error (returns
False + error log) paths to push patch coverage above the 80% threshold.
2026-04-09 23:52:39 +07:00
Zamil Majdy
68b51ae2d3 test(backend): add coverage for sync_subscription_from_stripe edge cases
Tests for:
- Unknown/mismatched Stripe price_id defaults to FREE (not early return)
- None from LaunchDarkly price flags defaults to FREE
- BUSINESS tier mapping
- StripeError during cancel_stripe_subscription is logged, not raised
2026-04-09 23:52:16 +07:00
Zamil Majdy
63ff214563 fix(backend): default to FREE tier on unknown Stripe price ID in webhook sync
When sync_subscription_from_stripe encounters an unrecognized price_id
(e.g. LD flags unconfigured or price changed), it no longer returns early
leaving the user on a stale tier. Instead it defaults to FREE and logs a
warning, keeping the DB state consistent with Stripe's subscription status.

Also guard against None pro_price/biz_price from LaunchDarkly before
comparison to avoid silent mismatches.
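
Illustrative sketch only (not part of the commit): the mapping described above, with explicit `None` guards. The function name and string tier values are assumptions; the real code lives in `sync_subscription_from_stripe`.

```python
from typing import Optional


def tier_for_price_sketch(
    price_id: Optional[str],
    pro_price: Optional[str],
    biz_price: Optional[str],
) -> str:
    """Map a Stripe price id to a tier, defaulting to FREE when unrecognized."""
    # Without the None guards, an unset price flag compared against a missing
    # price_id would evaluate None == None and wrongly select a paid tier.
    if pro_price is not None and price_id == pro_price:
        return "PRO"
    if biz_price is not None and price_id == biz_price:
        return "BUSINESS"
    return "FREE"
```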
2026-04-09 23:41:51 +07:00