- Replace broad `except Exception` with `except (json.JSONDecodeError,
ValidationError, TypeError, ValueError)` in drain_pending_messages so
unexpected non-data errors propagate instead of being silently swallowed
- Introduce `PendingMessageContext` Pydantic model to replace the raw
`dict[str, str]` for the context field, making the url/content contract
explicit and enabling typed attribute access instead of .get() calls
- Update routes.py to construct PendingMessageContext from the validated
request dict before passing to PendingMessage
- Update tests to use PendingMessageContext directly
Addresses coderabbitai review comments.
Replace broad `url in content` assertion with exact `[Page URL: url]`
substring check so CodeQL does not flag it as Incomplete URL Substring
Sanitization.
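A sketch of the tightened check as a predicate (the exact marker text is taken from the commit message and may differ in the real test):

```python
def page_url_recorded(content: str, url: str) -> bool:
    # A bare `url in content` check also matches lookalike hosts such as
    # "https://example.com.evil.net", which is what CodeQL flags as
    # incomplete URL substring sanitization. Match the full marker instead.
    return f"[Page URL: {url}]" in content  # assumed marker format
```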
- Use _pre_drain_msg_count for transcript load gate (len > 1 check)
to avoid spurious transcript load on first turn with pending messages
- Use _pre_drain_msg_count for Graphiti warm context gate to prevent
warm context skip when pending messages are drained at first turn
- Add context.url/content length validators to QueuePendingMessageRequest
to prevent LLM context-window stuffing (2K url, 32K content caps)
- Rename underscore-prefixed variables that are actually used (_pm, _content,
_pt) to conventional names (pm, content, pt): the underscore prefix
conventionally marks unused values in Python
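The context caps might look roughly like this; the real code expresses them as Pydantic validators on QueuePendingMessageRequest, and the constant names here are assumptions:

```python
MAX_CONTEXT_URL_LEN = 2_000       # the "2K url" cap
MAX_CONTEXT_CONTENT_LEN = 32_000  # the "32K content" cap


def validate_context_sizes(url: str, content: str) -> None:
    """Reject oversized context fields so a queued message can't stuff
    the LLM context window."""
    if len(url) > MAX_CONTEXT_URL_LEN:
        raise ValueError(f"context.url exceeds {MAX_CONTEXT_URL_LEN} chars")
    if len(content) > MAX_CONTEXT_CONTENT_LEN:
        raise ValueError(f"context.content exceeds {MAX_CONTEXT_CONTENT_LEN} chars")
```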
1. SDK Pyright: the inner ``_fetch_transcript`` closure captured
``session``, which Pyright couldn't narrow to non-None (the outer
scope casts it, but the narrowing doesn't propagate into the
nested async function). Added an explicit ``assert session is not
None`` at the top of the closure.
2. Lint: re-formatted ``platform_cost_test.py`` — some pre-existing
whitespace drift from an upstream merge was tripping Black on CI.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. SDK retry tests failing with "Event loop is closed" — the
drain-at-start call in stream_chat_completion_sdk was reaching the
real ``drain_pending_messages`` (which hits Redis) instead of being
mocked. Added a ``drain_pending_messages`` stub returning ``[]`` to
the shared ``_make_sdk_patches`` helper so all retry-integration
tests skip the drain path.
2. API types check failing — the new
``POST /sessions/{id}/messages/pending`` endpoint wasn't reflected
in the frontend's ``openapi.json``. Regenerated via
``poetry run export-api-schema --output ../frontend/src/app/api/openapi.json``
and ``pnpm prettier --write``.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace maybe_append_user_message with direct session.messages.append
for pending drain in both baseline mid-loop and SDK drain-at-start:
pending messages are atomically popped from Redis and can never be
stale-cache duplicates, so the dedup can wrongly skip them and make
openai_messages/transcript diverge from the DB record
- Add immediate upsert_chat_session after SDK drain-at-start so a
crash between drain and finally doesn't lose messages already removed
from Redis
- Capture _pre_drain_msg_count before the baseline drain-at-start:
use it for is_first_turn (prevents pending messages from flipping the
flag to False on an actual first turn) and for _load_prior_transcript
(prevents the stale-transcript check from firing on every turn that
drains pending messages, which would block transcript upload forever)
- Remove redundant if user_id: guards in queue_pending_message — user_id
is guaranteed non-empty by Security(auth.get_user_id); the guards made
the rate-limit check silently optional
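The pre-drain bookkeeping can be sketched like this (names and the exact gate thresholds are assumptions based on the bullets above):

```python
def start_turn_flags(session_messages: list[str],
                     drained: list[str]) -> tuple[bool, bool]:
    """Capture the count BEFORE appending drained messages, so drained
    pending messages can't flip first-turn or transcript-load decisions."""
    pre_drain_msg_count = len(session_messages)
    session_messages.extend(drained)  # direct append, no dedup
    is_first_turn = pre_drain_msg_count == 0
    load_prior_transcript = pre_drain_msg_count > 1  # the len > 1 gate
    return is_first_turn, load_prior_transcript
```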
Round 4 review nits:
- ``_PUSH_LUA`` block comment mentioned "returns 0 from our earlier
LLEN" which was a leftover from an earlier design that had a
separate LLEN check. The atomicity guarantee doesn't depend on it.
Reworded to describe Redis EVAL serialisation instead.
- ``clear_pending_messages`` docstring said "called at the end of a
turn" but the finally-block call sites were removed in round 2
when the atomic drain-at-start became the primary consumer. The
function is now only an operator/debug escape hatch. Docstring
updated to match.
No behavioural change.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Round 3 follow-up: the drain-at-start in ``stream_chat_completion_baseline``
persisted pending messages to ``session.messages`` but never called
``transcript_builder.append_user`` for them. A mid-turn transcript
upload would be missing the drained text, which could produce a
malformed assistant-after-assistant structure on the next turn.
The drain block runs BEFORE ``transcript_builder`` is instantiated
(which happens after prompt/transcript async setup), so we can't call
append_user in the drain block itself. Instead, we remember the
drained list and mirror it into the transcript right after the
single-message ``transcript_builder.append_user(content=message)``
call near the prompt-build site.
Also cleaned up the stray adjacent-string concatenation in the log
line (``"...turn start " "for session %s"`` → single string).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Critical: SDK path was double-injecting. The endpoint persisted the
message to ``session.messages`` AND the executor drained it from Redis
and concatenated into ``current_message`` — the LLM saw each queued
message twice (once via the compacted history / gap context that
``_build_query_message`` pulls from ``session.messages``, once via
the new query). Baseline avoided this via ``maybe_append_user_message``
dedup but SDK had no equivalent guard.
### Fix: Redis is the single source of truth
- Endpoint no longer persists to ``session.messages``. It only
pushes to Redis and returns.
- Baseline drain-at-start calls ``maybe_append_user_message`` (dedup
is a safety net, not the primary guard).
- SDK drain-at-start calls ``maybe_append_user_message`` too, so the
durable transcript records the queued messages. The concatenation
into ``current_message`` stays so the SDK CLI sees the content in
the first user message of the new turn.
### Baseline max-iterations silent-loss — Fixed
``tool_call_loop`` yields ``finished_naturally=False`` when
``iteration == max_iterations`` then returns. Previously the drain
only skipped ``finished_naturally=True``, so messages drained on the
max-iterations final yield were appended to ``openai_messages`` and
silently lost (the loop was already exiting). Now the drain also
skips when ``loop_result.iterations >= _MAX_TOOL_ROUNDS``.
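The two skip conditions combine into a single drain gate, sketched here (function and parameter names assumed):

```python
def should_drain_mid_loop(finished_naturally: bool, iterations: int,
                          max_tool_rounds: int) -> bool:
    """Drain only when another LLM call will actually consume the result."""
    if finished_naturally:
        return False  # loop is exiting; a drained message would be lost
    if iterations >= max_tool_rounds:
        return False  # max-iterations exit: same silent-loss hazard
    return True
```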
### API response cleanup
- ``QueuePendingMessageResponse``: dropped ``queued`` (always True) and
``detail`` (human-readable, clients shouldn't parse). Kept
``buffer_length``, ``max_buffer_length``, and ``turn_in_flight``.
### Tests
- Removed dead ``_FakePipeline`` class (the code switched to Lua EVAL
in round 1 so the pipeline fake was unused).
- Added ``test_drain_decodes_bytes_payloads`` so the ``bytes → str``
decode branch in ``drain_pending_messages`` is actually exercised
(real redis-py returns bytes when ``decode_responses=False``).
- Updated ``_FakeRedis.lists`` type hint to ``list[str | bytes]``.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Critical fix — the SDK mid-stream injection was structurally broken.
``ClaudeSDKClient.receive_response()`` explicitly returns after the
first ``ResultMessage``, so re-issuing ``client.query()`` and setting
``acc.stream_completed = False`` could never restart the iteration —
the next ``__anext__`` raised ``StopAsyncIteration`` and the injected
turn's response was never consumed. Replaced the broken mid-stream
path with a turn-start drain that works for both baseline and SDK.
### Changes
**Atomic push via Lua EVAL** (``pending_messages.py``)
- Replace the ``RPUSH`` + ``LTRIM`` + ``EXPIRE`` + ``LLEN`` pipeline
(which was ``transaction=False`` and racy against concurrent
``LPOP``) with a single Lua script so the push is atomic.
- Drop the unused ``enqueued_at`` field.
- Add 16k ``max_length`` cap on ``PendingMessage.content``.
**Baseline path** (``baseline/service.py``)
- Drain at turn start (atomic ``LPOP``): any message queued while the
session was idle or between turns is picked up before the first
LLM call.
- Mid-loop drain now skips the final ``tool_call_loop`` yield
(``finished_naturally=True``) — draining there would append a user
message the loop is about to exit past, silently losing it.
- Inject via ``format_pending_as_user_message`` so file IDs + context
are preserved in both ``openai_messages`` and the persisted session
transcript (previously the DB copy lost file/context metadata).
- Remove the ``finally`` ``clear_pending_messages`` — atomic drain at
turn start means any late push belongs to the next turn; clearing
here would racily clobber it.
**SDK path** (``sdk/service.py``)
- Remove the broken mid-stream injection block entirely.
- Drain at turn start (same atomic ``LPOP``) and merge the drained
messages into ``current_message`` before ``_build_query_message``,
so the SDK CLI sees them as part of the initial user message.
- Remove the ``finally`` ``clear_pending_messages``.
- Delete the unused ``_combine_pending_messages`` helper.
**Endpoint** (``api/features/chat/routes.py``)
- Enforce ``check_rate_limit`` / ``get_global_rate_limits`` — was
bypassing per-user daily/weekly token limits that ``/stream``
enforces.
- ``QueuePendingMessageRequest`` gets ``extra="forbid"`` and
``message: max_length=16_000``.
- Push-first, persist-second: if the Redis push fails we raise 5xx;
previously the session DB got an orphan user message with no
corresponding queued entry and a retry would duplicate it.
- Log a warning when sanitised file IDs drop unknown entries.
- Persisted message content now uses ``format_pending_as_user_message``
so the session copy matches what the model actually sees on drain.
- Response returns ``buffer_length``, ``max_buffer_length``, and
``turn_in_flight`` so the frontend can show accurate feedback about
whether the message will hit the current turn or the next one.
**Tests** (``pending_messages_test.py``)
- ``_FakeRedis.eval`` emulates the Lua push script so the existing
push/drain/cap tests keep working under the new atomic path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a user sends a follow-up message while a copilot turn is still
streaming, we now queue it into a per-session Redis buffer and let the
executor currently processing the turn drain it between tool-call
rounds — the model sees the new message before its next LLM call.
Previously such messages were blocked at the RabbitMQ/cluster-lock
layer and only processed after the current turn completed.
### New module
`backend/copilot/pending_messages.py`
- Redis list buffer keyed by ``copilot:pending:{session_id}``
- Pub/sub notify channel as a wake-up hint for future blocking-wait use
- Cap of ``MAX_PENDING_MESSAGES=10`` — trims oldest on overflow
- 1h TTL matches ``stream_ttl`` default
- Helpers: ``push_pending_message``, ``drain_pending_messages``,
``peek_pending_count``, ``clear_pending_messages``,
``format_pending_as_user_message``
### New endpoint
`POST /sessions/{session_id}/messages/pending`
- Returns 202 + current buffer length
- Persists the message to the DB so it's in the transcript immediately
- Sanitises file IDs against the caller's workspace
- Does NOT start a new turn (unlike ``stream``)
### Baseline path (simple — in-process injection)
`backend/copilot/baseline/service.py`
- Between iterations of ``tool_call_loop``, drain pending and append to
the shared ``openai_messages`` list so the loop picks them up on the
next LLM call
- Persist session via ``upsert_chat_session`` after injection
- Finally-block safety net clears the buffer on early exit
### SDK path (in-process injection via live client.query)
`backend/copilot/sdk/service.py`
- When the SDK loop detects ``acc.stream_completed``, before breaking,
drain pending and send them via the existing open ``client.query()``
as a new user message; reset ``stream_completed`` to ``False`` and
``continue`` the async-for loop so we keep consuming CLI messages
- Combines multiple drained messages into a single ``query()`` call via
``_combine_pending_messages`` to preserve ordering
- Finally-block safety net clears the buffer on early exit
- This works because the Claude Agent SDK's ``ClaudeSDKClient`` is a
long-lived connection: ``query()`` writes a new user message to the
CLI's stdin and the same ``receive_response()`` stream picks up the
next turn's events, so we keep session continuity without releasing
the cluster lock or restarting the subprocess
### Tests
`backend/copilot/pending_messages_test.py`
- FakeRedis + FakePipeline so tests don't need a live Redis
- Covers push/drain, ordering, buffer cap (MAX_PENDING_MESSAGES),
clear, publish hook, malformed-payload handling, and the format
helper (plain / with context / with file_ids)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
### Why / What / How
**Why:** The `ask_question` copilot tool previously only accepted a
single question per invocation. When the LLM needs to ask multiple
clarifying questions simultaneously, it either crams them into one text
field (requiring users to format numbered answers manually) or makes
multiple sequential tool calls (slow and disruptive UX).
**What:** Replace the single `question`/`options`/`keyword` parameters
with a `questions` array parameter so the LLM can ask multiple questions
in one tool call, each rendered as its own input box.
**How:** Simplified the tool to accept only `questions` (array of
question objects). Each item has `question` (required), `options`, and
`keyword`. The frontend `ClarificationQuestionsCard` already supports
rendering multiple questions — no frontend changes needed.
### Changes 🏗️
- `backend/copilot/tools/ask_question.py`: Replaced dual
question/questions schema with single `questions` array. Extracted
parsing into module-level `_parse_questions` and `_parse_one` helpers.
Follows backend code style: early returns, list comprehensions, top-down
ordering, functions under 40 lines.
- `backend/copilot/tools/ask_question_test.py`: Rewritten with 18
focused tests covering happy paths, keyword handling, options filtering,
and invalid input handling.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
- [ ] Run `poetry run pytest backend/copilot/tools/ask_question_test.py`
— all tests pass
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Import get_subscription_price_id in v1.py
- get_subscription_status now calls stripe.Price.retrieve for PRO/BUSINESS
tiers to return actual unit_amount instead of hardcoded zeros
- UI will now show correct monthly costs when LD price IDs are configured
- Fix Button import from __legacy__ to design system in SubscriptionTierSection
- Update subscription status tests to mock the new Stripe price lookup
Drop the dual question/questions schema in favor of a single
`questions` array parameter. This removes ~175 lines of complexity
(the _execute_single path, duplicate params, precedence logic).
Restructured per backend code style rules:
- Top-down ordering: public _execute first, helpers below
- Early return with guard clauses, no deep nesting
- List comprehensions via walrus operator in _parse_questions
- Helpers extracted as module-level functions (not methods)
- Functions under 40 lines each
The frontend ClarificationQuestionsCard already renders arrays of
any length — no UI changes needed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add isinstance narrowing in test_execute_multiple_questions_ignores_single_params
to fix Pyright type-check CI failure (reportAttributeAccessIssue).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests that access `result.questions` without first narrowing the type
from `ToolResponseBase` to `ClarificationNeededResponse` cause Pyright
type-check failures. Added `assert isinstance(result,
ClarificationNeededResponse)` before accessing `.questions` in 4 tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When navigating back to a cached session, appliedActionKeys was reset to empty
but messages were preserved. This caused previously applied actions to reappear
as unapplied in the UI, allowing them to be re-applied and creating duplicate
undo entries. Clearing messages unconditionally on navigation ensures the
displayed action buttons always reflect the actual applied state.
- Restore top-level `required: ["question"]` in schema for LLM tool-
calling compatibility; validation handles the questions-only path
- Fix keyword null bug: `item.get("keyword")` returning None now
correctly falls back to `question-{idx}` instead of producing "None"
- Filter empty-string options in _build_question (`str(o).strip()`)
to avoid artifacts like "Email, , Slack"
- Revert session type hint to `ChatSession` to match base class contract
- Add tests for null keyword and empty-string options filtering
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove top-level `required: ["question"]` from schema so the
`questions`-only calling convention is valid for schema-compliant LLMs
- Move logger assignment below all imports (PEP 8 / isort)
- Remove duplicated option filtering in `_execute_single`; let
`_build_question` own that responsibility
- Fix `session` type hint to `ChatSession | None` to match the guard
- Add test for `questions` as non-list type (falls back to single path)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix falsy option filtering: use `if o is not None` instead of `if o`
so valid values like "0" are preserved
- Improve multi-question `message` field: join all questions with ";"
instead of only using the first question's text
- Add logging warnings for skipped invalid items in multi-question path
instead of silently dropping them
- Simplify schema: use `"required": ["question"]` instead of empty
required + anyOf (more LLM-friendly)
- Add missing test cases: session=None, single-item questions array,
duplicate keywords, falsy option values
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The ask_question tool previously only accepted a single question per
invocation, forcing the LLM to cram multiple queries into one text box
or make multiple sequential tool calls. This adds a `questions` parameter
(list of question objects) so multiple input fields render at once.
Backward-compatible: the existing `question`/`options`/`keyword` params
still work. When `questions` (plural) is provided, they take precedence.
The frontend ClarificationQuestionsCard already supports rendering
multiple questions — no frontend changes needed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests for GET/POST /credits/subscription covering:
- GET returns current tier (PRO, FREE default when None)
- POST FREE skips Stripe when payment disabled
- POST PRO sets tier directly for beta users (payment disabled)
- POST paid tier rejects missing success_url/cancel_url with 422
- POST paid tier creates Stripe Checkout Session and returns URL
- POST FREE with payment enabled cancels active Stripe subscription
- Remove useCallback from changeTier (not needed per project guidelines)
- Block self-service tier changes for ENTERPRISE users (admin-managed)
- Preserve current tier on unrecognized Stripe price_id instead of
defaulting to FREE (prevents accidental downgrades during price migration)
Tests for:
- Unknown/mismatched Stripe price_id defaults to FREE (not early return)
- None from LaunchDarkly price flags defaults to FREE
- BUSINESS tier mapping
- StripeError during cancel_stripe_subscription is logged, not raised
When sync_subscription_from_stripe encounters an unrecognized price_id
(e.g. LD flags unconfigured or price changed), it no longer returns early
leaving the user on a stale tier. Instead it defaults to FREE and logs a
warning, keeping the DB state consistent with Stripe's subscription status.
Also guard against None pro_price/biz_price from LaunchDarkly before
comparison to avoid silent mismatches.
EditAgentTool and RunAgentTool call useCopilotChatActions() which throws
if no provider is in the tree. Wrap the panel content with
CopilotChatActionsProvider wired to sendRawMessage so tool components
can send retry prompts without crashing.
The user message was saved to DB before the <user_context> prefix was added
to session.messages. Subsequent upsert_chat_session calls only append new
messages (slicing by existing_message_count), so the prefixed content was
never written to the DB. On page reload or --resume, the unprefixed version
was loaded, losing personalisation.
Fix: add update_message_content_by_sequence to db.py and call it after
injecting the prefix in both sdk/service.py and baseline/service.py.
Add customer.subscription.created to the sync handler so the user's tier is
upgraded immediately when the subscription is first created, not just on
subsequent updates/deletions.
The dev branch already creates the SubscriptionTier enum and subscriptionTier
column in 20260326200000_add_rate_limit_tier. Remove the duplicate DDL from our
migration and only add SUBSCRIPTION to CreditTransactionType, using an IF NOT
EXISTS guard.
Add cancel_stripe_subscription() which lists and cancels all active Stripe
subscriptions for the customer, preventing continued billing after downgrade.
Call it from update_subscription_tier() when tier == FREE and payment is
enabled. Add two unit tests covering active and empty subscription scenarios.
Add tests for the /logs/export endpoint (success, truncated, filters, auth) and
fix missing import of get_platform_cost_logs_for_export in platform_cost_test.py.
Add upfront 422 validation when upgrading to a paid tier without providing
redirect URLs. Also catch stripe.StripeError alongside ValueError to return
a proper 422 instead of a 500 on Stripe API errors.
Pressing Escape while drafting a message was silently discarding the
user's text. Guard the handler so it only closes the panel when focus is
outside an editable element.