A transient LaunchDarkly failure returned None from get_subscription_price_id,
which was cached for the full 60-second TTL, blocking subscription upgrades
until expiry. Adding cache_none=False ensures None is never stored in the cache
so the next call retries LD immediately.
Adds a regression test verifying that two consecutive calls where the first
returns None (LD transient error) and the second returns the real price ID
both hit LD, confirming the None sentinel is not cached.
Flagged by sentry[bot] (credit.py:1352, Severity: MEDIUM).
Since get_subscription_price_id is now @cached, in-memory cache state can
persist between tests in the same process and cause false cache hits. Call
cache_clear() before and after tests that call the function directly to
ensure each test exercises a fresh LD flag lookup.
- Cache `get_subscription_price_id` with 60s TTL via @cached decorator;
LD flag values change only at deploy time — caching avoids hitting the
LD SDK on every webhook delivery and GET /credits/subscription page load
- Add webhook identity cross-check in `sync_subscription_from_stripe`:
verify metadata.user_id (set during Checkout Session creation) matches
the user found via stripeCustomerId; log + bail on mismatch to prevent
silently updating the wrong user's subscription tier
- Move `handleTierChange` business logic from SubscriptionTierSection
component into `useSubscriptionTierSection` hook per project convention;
dialog state (confirmDowngradeTo) stays in component as it's UI state
- Add three new backend tests for metadata identity cross-check:
matching user_id accepted, mismatching user_id blocked, absent
metadata skips check (backward compat with non-Checkout subs)
- Strip ?subscription=cancelled from address bar in useSubscriptionTierSection
alongside the existing ?subscription=success cleanup so Stripe cancel
redirects don't leave stale params in the URL
- Parallelize the two sequential stripe.Subscription.list calls on the
cancel webhook path using asyncio.gather to reduce handler latency
- Add a test for ?subscription=cancelled being a no-op (no toast, URL cleaned)
- Add test_sync_subscription_from_stripe_missing_customer_key_returns_early to
verify the .get("customer") fix: a payload with no 'customer' key returns
early without querying the DB or writing a tier (no KeyError→500)
- Update SubscriptionTierSection loading test to match skeleton-card output
(no longer expects empty container; now asserts tier card text is absent)
- Replace null return on isLoading with three Skeleton card placeholders
matching the expected height of the tier grid to prevent layout shift
- Add eslint-disable-next-line comment on useEffect dependency array
explaining why refetch/toast are included despite being new refs each
render (stable in practice; effect is guarded by subscriptionStatus check)
- Use .get("customer") with an early return + warning log instead of direct
key access; prevents KeyError→500 on malformed webhook payloads that pass
HMAC verification but omit the customer field
- Document the paid-to-paid upgrade race window (PRO→BUSINESS) in a comment
so the known limitation is visible without changing semantics
- Add mock_set_tier.assert_not_called() to the multi-partial-failure test to
explicitly assert the DB tier is never updated when a Stripe cancel raises
Replace toastShownRef guard with router.replace(pathname) so the success
toast is not re-shown on page refresh and correctly re-fires on a second
checkout in the same SPA session. Adds test coverage for the behaviour.
_cleanup_stale_subscriptions was called before the idempotency guard
(current_tier == tier -> return), so webhook replays for an already-
applied event would fire another cleanup round and could inadvertently
cancel a new subscription the user signed up for between the original
event and its replay.
Move the cleanup call to after the idempotency check so it only runs
when we are actually going to apply a tier change. Add status in
("active", "trialing") and new_sub_id guard to ensure cleanup is
only triggered for paid-sub activation events, not cancellations.
- openapi.json: SubscriptionStatusResponse.tier was missing enum constraint — generated TS type was string instead of literal union. Added enum:[FREE,PRO,BUSINESS,ENTERPRISE] to match the Literal on the Python model.
- credit_subscription_test.py: set has_more=False on all MagicMock subscription list objects so _cancel_customer_subscriptions does not log spurious 'more than 10 subs' errors in tests. Also added clarifying comment on multi_partial_failure assertion.
- cache.py: replaced _MISSING: Any = object() with a dedicated _MissingType singleton class so mypy correctly narrows type after 'result is _MISSING' comparisons.
- SubscriptionStatusResponse.tier: str -> Literal["FREE","PRO","BUSINESS","ENTERPRISE"] so OpenAPI schema emits an enum and the generated TS client is narrowly typed
- Add test_update_subscription_tier_same_tier_is_noop: asserts the double-billing guard at line 868 returns 200/empty URL and never calls create_subscription_checkout
- Reset toastShownRef.current to false when subscriptionStatus != "success" so the success toast fires again after a second checkout on the same SPA mount
React onClick handlers don't await async functions, so passing an
async function directly creates a floating promise. Wrap in void to
make the intent explicit and prevent unhandled rejections.
The pre-parse rejection of @ was overly broad — it rejected valid URLs
with @ in query strings or fragments (e.g. ?ref=user@company.com).
The user:pass@host authority attack only applies to the netloc component.
Move the @ check to run against parsed.netloc after urlparse.
- Use set difference instead of any() to correctly detect other active
subs (any(sub["id"] != new_sub_id ...) returns True if ANY sub has a
different ID, which is always true when >1 sub exists regardless of
whether the cancelled sub is in the list).
- Add has_more check with logger.error in _cancel_customer_subscriptions
so we surface when a customer has >10 subs and some were silently
skipped.
Revert the tentative update_many conditional guard (prisma where-clause
null semantics are fiddly and the test suite mocks get_stripe_customer_id
end-to-end, so a real prisma error wouldn't be caught locally). The
idempotency_key on Customer.create is sufficient: Stripe collapses
concurrent + retried calls to the same Customer object for 24h, which
comfortably covers every realistic in-flight retry window.
Also invalidate the get_user_by_id cache after the DB write so the
freshly-persisted stripeCustomerId is visible on the next read.
Address review findings on the subscription tier billing PR:
1. get_stripe_customer_id race: two concurrent calls (double-click,
retried request) could each create a Stripe Customer for the same
user, leaving an orphaned billable customer. Pass an idempotency_key
so Stripe collapses concurrent + retried calls server-side, and use
a conditional update_many so the loser of a longer-window race
re-reads the persisted ID instead of overwriting.
2. update_subscription_tier no-op short-circuit: if the user is already
on the requested paid tier, return without creating a Checkout
Session. Without this guard, a duplicate request creates a second
subscription for the same price; the user would be charged for both
until _cleanup_stale_subscriptions runs from the resulting webhook —
which only fires after the second charge has cleared.
3. stripe_webhook payload defensive extraction: a malformed payload
(missing/non-dict data.object, missing id) would raise KeyError /
TypeError after signature verification, which Stripe interprets as
a delivery failure and retries forever. Validate shape, log a
warning, and ack with 200 so Stripe stops retrying.
4. _cleanup_stale_subscriptions: bump the swallowed-error log from
warning to exception so Sentry surfaces it as an error, include
the customer/sub IDs needed for manual reconciliation, and add a
TODO referencing the missing periodic reconcile job that the
docstring already promises as the backstop.
The @cached decorator could not differentiate "no entry" from "entry is
None" — both `_get_from_memory` and `_get_from_redis` returned `None`
for misses, and the wrappers checked `result is not None` to decide
whether to recompute. Functions that returned `None` as a valid value
were therefore re-executed on every call, defeating the cache and (for
shared_cache=False) potentially causing per-pod thundering herd against
upstream APIs.
Fix:
- Use a module-level `_MISSING = object()` sentinel for "no entry".
- Wrappers now check `result is not _MISSING` so cached `None` is
returned correctly.
- Add a `cache_none: bool = True` parameter so callers that *want* the
retry-on-None behavior (e.g. external API calls returning `None` to
signal a transient error) can explicitly opt out via `cache_none=False`.
- `_get_stripe_price_amount` opts out: returning None on a Stripe error
must not poison the 5-minute cache window. Updated its docstring to
describe the actual behavior.
New tests cover both default (None is cached) and `cache_none=False`
(None is not stored, next call retries) for sync, async, and shared
cache paths.
Sentry bug prediction: PRRT_kwDOJKSTjM56RTEu (severity HIGH).
- Dialog controlled set callback: use explicit if-block to avoid
returning 'false | void' (TS2322)
- Test redirect test: use vi.stubGlobal to replace window.location with
a plain object (Proxy on jsdom Location breaks private-field access)
Return None on StripeError instead of 0 so the @cached decorator
(which skips caching None) does not persist the error state for 5 min.
Added test to verify the None→0 fallback path in get_subscription_status.
- Replace __legacy__ Dialog import with molecules/Dialog in SubscriptionTierSection
- Update test mock to match new Dialog API (controlled pattern)
- Guard still_has_active_sub against empty new_sub_id in sync_subscription_from_stripe
- Move urlparse import from inside _validate_checkout_redirect_url to module level
Backend:
- Cache stripe.Price.retrieve with 5-min TTL via _get_stripe_price_amount
to avoid 200-600ms Stripe round-trip on every GET /credits/subscription
- Use SubscriptionTier enum .value for FREE/ENTERPRISE in tier_costs dict
for consistency (instead of hardcoded strings)
- Rename misleading test names: "defaults_to_FREE" → "preserves_current_tier"
to reflect actual behaviour (unknown price IDs preserve tier, not reset)
- Update subscription_routes_test to mock _get_stripe_price_amount instead
of stripe.Price.retrieve directly, avoiding cached-result interference
Frontend:
- Handle ?subscription=success return from Stripe Checkout: refetch + toast
- Add downgrade confirmation Dialog before cancelling paid subscription
- Handle ENTERPRISE tier: render dedicated admin-managed plan card, not the
FREE/PRO/BUSINESS tier cards (which would show no "Current" badge)
- Track pendingTier (via variables) so only the clicked button shows "Updating..."
- Show "Pricing available soon" for paid tiers with cost=0 (unconfigured LD flags)
instead of misleading "Free"
- Move tierError state into the hook, set via changeTier internally
- Move TIER_ORDER constant to module scope (was magic array inside render body)
- Add aria-current="true" to active tier card for screen reader accessibility
- Add role="alert" to all error paragraph elements
- Improve tier descriptions with concrete capacity values
Black detected double blank lines between class definitions in
platform_cost_test.py (pulled from dev base). Normalise to a single
blank line so the CI merge-commit lint check passes.
An empty STRIPE_WEBHOOK_SECRET (the default) allows an attacker to
compute a valid HMAC-SHA256 signature over the same key and forge any
webhook event (customer.subscription.created, etc.), escalating any
user to an arbitrary subscription tier without paying.
Fix: return 503 immediately when stripe_webhook_secret is unset rather
than proceeding to signature verification. Also add run_in_threadpool
to get_stripe_customer_id and remove the duplicate trialing-sub test.
Merges origin/feat/subscription-tier-billing which had the open-redirect
guard, blocking-IO fix, and idempotency/ENTERPRISE guard.
Test added: test_stripe_webhook_unconfigured_secret_returns_503
- Guard stripe_webhook: return 503 when STRIPE_WEBHOOK_SECRET is empty.
An empty secret allows HMAC forgery (attacker computes a valid sig over
the same key), so we reject all webhook calls when unconfigured.
- Suppress raw Stripe error from 502 cancel response; log server-side instead.
- Wrap all blocking Stripe SDK calls in run_in_threadpool: Customer.create,
Subscription.list, Subscription.cancel, checkout.Session.create.
- cancel_stripe_subscription now also cancels 'trialing' subscriptions
(previously only 'active'), preventing billing after a FREE downgrade.
- session.url None now raises ValueError instead of returning empty string.
- Add tests: webhook 503 on missing secret, trialing-sub cancellation.
## Why
The platform cost tracking system had several gaps that made the admin
dashboard less accurate and harder to reason about:
**Q: Do we have per-model granularity on the provider page?**
The `model` column was stored in `PlatformCostLog` but the SQL
aggregation grouped only by `(provider, tracking_type)`, so all models
for a given provider collapsed into one row. Now grouped by `(provider,
tracking_type, model)` — each model gets its own row.
**Q: Why does Anthropic show `per_run` for OrchestratorBlock?**
Bug: `OrchestratorBlock._call_llm()` was building `NodeExecutionStats`
with only `input_token_count` and `output_token_count` — it dropped
`resp.provider_cost` entirely. For OpenRouter calls this silently
discarded the `cost_usd`. For the SDK (autopilot) path,
`ResultMessage.total_cost_usd` was never read. When `provider_cost` is
None and token counts are 0 (e.g. SDK error path), `resolve_tracking`
falls through to `per_run`. Fixed by propagating all cost/cache fields.
**Q: Why can't we get `cost_usd` for Anthropic direct API calls?**
The Anthropic Messages API does not return a dollar amount — only token
counts. OpenRouter returns cost via response headers, so it uses
`cost_usd` directly. The Claude Agent SDK *does* compute
`total_cost_usd` internally, so SDK-mode OrchestratorBlock runs now get
`cost_usd` tracking. For direct Anthropic LLM blocks the estimate uses
per-token rates (see cache section below).
**Q: What about labeling by source (autopilot vs block)?**
Already tracked: `block_name` stores `copilot:SDK`, `copilot:Baseline`,
or the actual block name. Visible in the raw logs table. Not added to
the provider group-by (would explode row count); use the logs table
filter instead.
**Q: Is there double-counting between `tokens`, `per_run`, and
`cost_usd`?**
No. `resolve_tracking()` uses a strict preference hierarchy — exactly
one tracking type per execution: `cost_usd` > `tokens` > provider
heuristics > `per_run`. A single execution produces exactly one
`PlatformCostLog` row.
**Q: Should we track Anthropic prompt cache tokens (PR #12725)?**
Yes — PR #12725 adds `cache_control` markers to Anthropic API calls,
which causes the API to return `cache_read_input_tokens` and
`cache_creation_input_tokens` alongside regular `input_tokens`. These
have different billing rates:
- Cache reads: **10%** of base input rate (much cheaper)
- Cache writes: **125%** of base input rate (slightly more expensive,
one-time)
- Uncached input: **100%** of base rate
Without tracking them separately, a flat-rate estimate on
`total_input_tokens` would be wrong in both directions.
## What
- **Per-model provider table**: SQL now groups by `(provider,
tracking_type, model)`. `ProviderCostSummary` and the frontend
`ProviderTable` show a model column.
- **Cache token columns**: New `cacheReadTokens` and
`cacheCreationTokens` columns in `PlatformCostLog` with matching
migration.
- **LLM block cache tracking**: `LLMResponse` captures
`cache_read_input_tokens` / `cache_creation_input_tokens` from Anthropic
responses. `NodeExecutionStats` gains `cache_read_token_count` /
`cache_creation_token_count`. Both propagate to `PlatformCostEntry` and
the DB.
- **Copilot path**: `token_tracking.persist_and_record_usage` now writes
cache tokens as dedicated `PlatformCostEntry` fields (was
metadata-only).
- **OrchestratorBlock bug fix**: `_call_llm()` now includes
`resp.provider_cost`, `resp.cache_read_tokens`,
`resp.cache_creation_tokens` in the stats merge. SDK path captures
`ResultMessage.total_cost_usd` as `provider_cost`.
- **Accurate cost estimation**: `estimateCostForRow` uses
token-type-specific rates for `tokens` rows (uncached=100%, reads=10%,
writes=125% of configured base rate).
## How
`resolve_tracking` priority is unchanged. For Anthropic LLM blocks the
tracking type remains `tokens` (Anthropic API returns no dollar amount).
For OrchestratorBlock in SDK/autopilot mode it now correctly uses
`cost_usd` because the Claude Agent SDK computes and returns
`total_cost_usd`. For OpenRouter through OrchestratorBlock it now
correctly uses `cost_usd` (was silently dropped before).
## Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
- [x] `ProviderCostSummary` SQL updated
- [x] Cache token fields present in `PlatformCostEntry` and
`PlatformCostLogCreateInput`
- [x] Prisma client regenerated — all type checks pass
- [x] Frontend `helpers.test.ts` updated for new `rateKey` format
- [x] Pre-commit hooks pass (Black, Ruff, isort, tsc, Prisma generate)
### Why / What / How
**Why:** The `ask_question` copilot tool previously only accepted a
single question per invocation. When the LLM needs to ask multiple
clarifying questions simultaneously, it either crams them into one text
field (requiring users to format numbered answers manually) or makes
multiple sequential tool calls (slow and disruptive UX).
**What:** Replace the single `question`/`options`/`keyword` parameters
with a `questions` array parameter so the LLM can ask multiple questions
in one tool call, each rendered as its own input box.
**How:** Simplified the tool to accept only `questions` (array of
question objects). Each item has `question` (required), `options`, and
`keyword`. The frontend `ClarificationQuestionsCard` already supports
rendering multiple questions — no frontend changes needed.
### Changes 🏗️
- `backend/copilot/tools/ask_question.py`: Replaced dual
question/questions schema with single `questions` array. Extracted
parsing into module-level `_parse_questions` and `_parse_one` helpers.
Follows backend code style: early returns, list comprehensions, top-down
ordering, functions under 40 lines.
- `backend/copilot/tools/ask_question_test.py`: Rewritten with 18
focused tests covering happy paths, keyword handling, options filtering,
and invalid input handling.
### Checklist 📋
#### For code changes:
- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [ ] I have tested my changes according to the test plan:
- [ ] Run `poetry run pytest backend/copilot/tools/ask_question_test.py`
— all tests pass
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_cancel_customer_subscriptions previously only queried status="active",
leaving trialing subscriptions in place. A user on a trial who downgrades
to FREE, or upgrades to a different paid tier, would continue to be billed
once the trial ended. Query both "active" and "trialing" statuses and
dedupe by sub id to ensure every billable sub is cleaned up.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When Stripe processes a subscription upgrade, the old subscription's
customer.subscription.deleted event may arrive after the new subscription's
customer.subscription.created has already been handled. Unconditionally
setting the user to FREE in the cancel branch would immediately undo the
upgrade.
sync_subscription_from_stripe now checks Stripe for other active/trialing
subscriptions on the same customer before downgrading. If at least one
different active sub exists, the handler preserves the current tier and
returns without writing. Added a regression test that mocks Stripe
returning sub_new as active and asserts set_subscription_tier is never
awaited.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Security and quality fixes for PR #12727 subscription tier billing review:
- Open-redirect protection: validate success_url/cancel_url against
settings.config.frontend_base_url before passing to Stripe Checkout.
- Blocking I/O: wrap every synchronous Stripe SDK call (Subscription.list,
Subscription.cancel, checkout.Session.create) with run_in_threadpool via
a shared _cancel_customer_subscriptions helper.
- Info leakage: log raw Stripe errors server-side but return a generic
502 detail to the client ("Please try again or contact support.").
- Webhook idempotency: skip DB writes in sync_subscription_from_stripe
when the tier is already current, avoiding redundant writes on retry.
- ENTERPRISE guard in webhook: refuse to overwrite ENTERPRISE tier from
Stripe events (admin-managed, not self-service).
- create_subscription_checkout raises ValueError on empty session.url
instead of silently returning "".
- Tests: fixture-based client (no leaky try/finally), open-redirect test,
ENTERPRISE 403 test, webhook dispatch test, trialing status test,
multi-sub partial-cancel-failure test, idempotency test, renamed
misleading "defaults to FREE" tests to "preserves_current_tier".
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a PRO user upgrades to BUSINESS via a fresh Checkout Session, Stripe
creates a new subscription without touching the existing one, leaving the
customer double-billed. Cleaning up in sync_subscription_from_stripe
rather than the API handler ensures an abandoned Checkout does not leave
the user without a subscription: we only cancel the old sub once the new
sub has actually become active.
Errors listing or cancelling stale subs are logged but not propagated —
the new subscription tier still gets persisted, and Stripe will retry
the webhook later if listing fails.
Addresses sentry[bot] comment 3061713750 on PR #12727.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After a tier change, the rate-limit cache (get_user_tier, 5-minute TTL)
was not cleared, so CoPilot rate limits would continue enforcing the old tier
until the TTL expired. Call get_user_tier.cache_delete(user_id) via a local
import to avoid circular import issues.
Addresses sentry[bot] comment 3061725912 on PR #12727.
customer_update only accepts {address, name, shipping} per Stripe's TypedDict.
The payment_method key does not exist in CreateParamsCustomerUpdate, so pyright
was failing the type-check CI. Remove the invalid parameter — for Stripe
subscriptions the payment method used for the first invoice is automatically
saved to the customer by Stripe.
getSubscription() and setSubscriptionTier() in client.ts were replaced by
generated hooks (useGetSubscriptionStatus, useUpdateSubscriptionTier) and
are no longer called anywhere in the codebase. Remove them to avoid adding
further surface area to the deprecated BackendAPI.
- stripe.Subscription.list() is now wrapped in try-except; StripeError
is logged and re-raised so callers know the listing failed.
- stripe.Subscription.cancel() StripeError is now re-raised (was swallowed),
preventing set_subscription_tier from marking the user FREE when Stripe
cancellation failed.
- update_subscription_tier catches StripeError from cancel and returns HTTP 502
so DB tier is only updated if Stripe succeeds.
- Fix test patch path: use backend.data.credit.stripe.checkout.Session.create
instead of bare stripe.checkout.Session.create for import-refactor safety.
- Add tests for raise-on-list-failure, raise-on-cancel-failure, and
502 route response on cancel failure.
Addresses sentry[bot] comments 3061585490, 3061654688 on PR #12727.
Adds customer_update={payment_method: auto} so the payment method used
for subscription is set as the Stripe customer's default. Makes it show
pre-selected in future Checkout sessions (manual top-ups).
- Import get_subscription_price_id in v1.py
- get_subscription_status now calls stripe.Price.retrieve for PRO/BUSINESS
tiers to return actual unit_amount instead of hardcoded zeros
- UI will now show correct monthly costs when LD price IDs are configured
- Fix Button import from __legacy__ to design system in SubscriptionTierSection
- Update subscription status tests to mock the new Stripe price lookup