mirror of
https://github.com/Significant-Gravitas/AutoGPT.git
synced 2026-04-30 03:00:41 -04:00
## Why

The platform cost tracking system had several gaps that made the admin dashboard less accurate and harder to reason about:

**Q: Do we have per-model granularity on the provider page?**
The `model` column was stored in `PlatformCostLog` but the SQL aggregation grouped only by `(provider, tracking_type)`, so all models for a given provider collapsed into one row. Now grouped by `(provider, tracking_type, model)`, so each model gets its own row.

**Q: Why does Anthropic show `per_run` for OrchestratorBlock?**
Bug: `OrchestratorBlock._call_llm()` was building `NodeExecutionStats` with only `input_token_count` and `output_token_count`; it dropped `resp.provider_cost` entirely. For OpenRouter calls this silently discarded the `cost_usd`. For the SDK (autopilot) path, `ResultMessage.total_cost_usd` was never read. When `provider_cost` is None and token counts are 0 (e.g. SDK error path), `resolve_tracking` falls through to `per_run`. Fixed by propagating all cost/cache fields.

**Q: Why can't we get `cost_usd` for Anthropic direct API calls?**
The Anthropic Messages API does not return a dollar amount, only token counts. OpenRouter returns cost via response headers, so it uses `cost_usd` directly. The Claude Agent SDK *does* compute `total_cost_usd` internally, so SDK-mode OrchestratorBlock runs now get `cost_usd` tracking. For direct Anthropic LLM blocks the estimate uses per-token rates (see cache section below).

**Q: What about labeling by source (autopilot vs block)?**
Already tracked: `block_name` stores `copilot:SDK`, `copilot:Baseline`, or the actual block name. Visible in the raw logs table. Not added to the provider group-by (would explode row count); use the logs table filter instead.

**Q: Is there double-counting between `tokens`, `per_run`, and `cost_usd`?**
No. `resolve_tracking()` uses a strict preference hierarchy, with exactly one tracking type per execution: `cost_usd` > `tokens` > provider heuristics > `per_run`.
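
The preference order can be sketched in a few lines of Python. This is an illustration only, not the actual `resolve_tracking()` implementation: the `Stats` dataclass, the `PROVIDER_HEURISTICS` table, and the `(tracking_type, amount)` return shape are hypothetical stand-ins, and the real provider heuristics are not described in this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stats:
    """Hypothetical stand-in for the cost-related fields of NodeExecutionStats."""
    provider_cost: Optional[float] = None
    input_token_count: int = 0
    output_token_count: int = 0


# Hypothetical placeholder: provider name -> tracking type.
# The real heuristics are not shown in this PR description.
PROVIDER_HEURISTICS: dict[str, str] = {}


def resolve_tracking_sketch(provider: str, stats: Stats) -> tuple[str, float]:
    """Return exactly one (tracking_type, amount) pair per execution."""
    # 1. A provider-reported dollar cost always wins.
    if stats.provider_cost is not None:
        return "cost_usd", stats.provider_cost
    # 2. Otherwise fall back to token counts, if any were recorded.
    total_tokens = stats.input_token_count + stats.output_token_count
    if total_tokens > 0:
        return "tokens", float(total_tokens)
    # 3. Then any provider-specific heuristic (assumed lookup table).
    if provider in PROVIDER_HEURISTICS:
        return PROVIDER_HEURISTICS[provider], 1.0
    # 4. Last resort: count the run itself.
    return "per_run", 1.0
```

Under these assumptions, a stats object with neither `provider_cost` nor tokens (the old OrchestratorBlock behavior on the SDK error path) lands on `per_run`, while the same call with `provider_cost` populated resolves to `cost_usd`.
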
A single execution produces exactly one `PlatformCostLog` row.

**Q: Should we track Anthropic prompt cache tokens (PR #12725)?**
Yes. PR #12725 adds `cache_control` markers to Anthropic API calls, which causes the API to return `cache_read_input_tokens` and `cache_creation_input_tokens` alongside regular `input_tokens`. These have different billing rates:

- Cache reads: **10%** of base input rate (much cheaper)
- Cache writes: **125%** of base input rate (slightly more expensive, one-time)
- Uncached input: **100%** of base rate

Without tracking them separately, a flat-rate estimate on `total_input_tokens` would be wrong in both directions.

## What

- **Per-model provider table**: SQL now groups by `(provider, tracking_type, model)`. `ProviderCostSummary` and the frontend `ProviderTable` show a model column.
- **Cache token columns**: New `cacheReadTokens` and `cacheCreationTokens` columns in `PlatformCostLog` with matching migration.
- **LLM block cache tracking**: `LLMResponse` captures `cache_read_input_tokens` / `cache_creation_input_tokens` from Anthropic responses. `NodeExecutionStats` gains `cache_read_token_count` / `cache_creation_token_count`. Both propagate to `PlatformCostEntry` and the DB.
- **Copilot path**: `token_tracking.persist_and_record_usage` now writes cache tokens as dedicated `PlatformCostEntry` fields (was metadata-only).
- **OrchestratorBlock bug fix**: `_call_llm()` now includes `resp.provider_cost`, `resp.cache_read_tokens`, `resp.cache_creation_tokens` in the stats merge. SDK path captures `ResultMessage.total_cost_usd` as `provider_cost`.
- **Accurate cost estimation**: `estimateCostForRow` uses token-type-specific rates for `tokens` rows (uncached=100%, reads=10%, writes=125% of configured base rate).

## How

`resolve_tracking` priority is unchanged. For Anthropic LLM blocks the tracking type remains `tokens` (Anthropic API returns no dollar amount).
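
Because `tokens` rows carry no dollar amount, their cost is an estimate. The sketch below shows only the input-side arithmetic with the 100% / 10% / 125% multipliers; it is a Python illustration, not the frontend `estimateCostForRow` code. The function name, the per-million-token rate unit, and the assumption that the uncached count excludes cache tokens are all hypothetical.

```python
# Multipliers relative to the configured base input rate (from this PR description).
CACHE_READ_MULTIPLIER = 0.10
CACHE_WRITE_MULTIPLIER = 1.25


def estimate_input_cost(
    uncached_tokens: int,
    cache_read_tokens: int,
    cache_creation_tokens: int,
    base_rate_per_million: float,
) -> float:
    """Estimate input cost in USD, weighting each token type by its rate.

    Assumes `uncached_tokens` does not already include the cache tokens and
    that the base rate is expressed in USD per one million tokens.
    """
    weighted_tokens = (
        uncached_tokens
        + cache_read_tokens * CACHE_READ_MULTIPLIER
        + cache_creation_tokens * CACHE_WRITE_MULTIPLIER
    )
    return weighted_tokens * base_rate_per_million / 1_000_000


def flat_rate_cost(total_input_tokens: int, base_rate_per_million: float) -> float:
    """The naive estimate that treats every input token as uncached."""
    return total_input_tokens * base_rate_per_million / 1_000_000
```

For example, with 1,000 uncached, 8,000 cache-read, and 1,000 cache-write tokens at an assumed base rate of $3 per million, the weighted estimate is about $0.00915 against $0.03 for the flat rate; a write-heavy call would err in the opposite direction.
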
For OrchestratorBlock in SDK/autopilot mode it now correctly uses `cost_usd` because the Claude Agent SDK computes and returns `total_cost_usd`. For OpenRouter through OrchestratorBlock it now correctly uses `cost_usd` (was silently dropped before).

## Checklist 📋

#### For code changes:

- [x] I have clearly listed my changes in the PR description
- [x] I have made a test plan
- [x] I have tested my changes according to the test plan:
  - [x] `ProviderCostSummary` SQL updated
  - [x] Cache token fields present in `PlatformCostEntry` and `PlatformCostLogCreateInput`
  - [x] Prisma client regenerated; all type checks pass
  - [x] Frontend `helpers.test.ts` updated for new `rateKey` format
  - [x] Pre-commit hooks pass (Black, Ruff, isort, tsc, Prisma generate)