fix(copilot): don't inflate completion tokens to 1 for tool-only responses

When streaming providers don't report usage, the fallback estimator
used max(..., 1) for completion tokens. For tool-only responses with
no text output, this artificially charged 1 phantom token.
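The behavior change can be sketched as follows. This is a minimal illustration, not the repo's actual estimator: `estimate_tokens` is a hypothetical whitespace-split stand-in for `estimate_token_count_str`.

```python
def estimate_tokens(text: str) -> int:
    # Hypothetical stand-in for estimate_token_count_str: whitespace split.
    return len(text.split())

def completion_tokens_old(assistant_text: str) -> int:
    # Old behavior: floor at 1, so a tool-only response with empty
    # assistant text was still billed 1 phantom completion token.
    return max(estimate_tokens(assistant_text), 1)

def completion_tokens_new(assistant_text: str) -> int:
    # New behavior: no floor; empty assistant text costs 0 tokens.
    return estimate_tokens(assistant_text)

assert completion_tokens_old("") == 1  # phantom token
assert completion_tokens_new("") == 0  # fixed
```

Note the prompt-side estimate keeps its `max(..., 1)` floor, since a request always carries at least some prompt content; only the completion side drops it.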
Zamil Majdy
2026-03-17 01:37:50 +07:00
parent 1476a580b2
commit fd75e14eb8


@@ -454,8 +454,8 @@ async def stream_chat_completion_baseline(
     turn_prompt_tokens = max(
         estimate_token_count(openai_messages, model=config.model), 1
     )
-    turn_completion_tokens = max(
-        estimate_token_count_str(assistant_text, model=config.model), 1
-    )
+    turn_completion_tokens = estimate_token_count_str(
+        assistant_text, model=config.model
+    )
     logger.info(
         "[Baseline] No streaming usage reported; estimated tokens: "