mirror of
https://github.com/Significant-Gravitas/AutoGPT.git
synced 2026-04-30 03:00:41 -04:00
## Summary - Fixes SmartDecisionMakerBlock conversation management to work with OpenAI's Responses API, which was introduced in #12099 (commit1240f38) - The migration to `responses.create` updated the outbound LLM call but missed the conversation history serialization — the `raw_response` is now the entire `Response` object (not a `ChatCompletionMessage`), and tool calls/results use `function_call` / `function_call_output` types instead of role-based messages - This caused a 400 error on the second LLM call in agent mode: `"Invalid value: ''. Supported values are: 'assistant', 'system', 'developer', and 'user'."` ### Changes **`smart_decision_maker.py`** — 6 functions updated: | Function | Fix | |---|---| | `_convert_raw_response_to_dict` | Detects Responses API `Response` objects, extracts output items as a list | | `_get_tool_requests` | Recognizes `type: "function_call"` items | | `_get_tool_responses` | Recognizes `type: "function_call_output"` items | | `_create_tool_response` | New `responses_api` kwarg produces `function_call_output` format | | `_update_conversation` | Handles list return from `_convert_raw_response_to_dict` | | Non-agent mode path | Same list handling for traditional execution | **`test_smart_decision_maker_responses_api.py`** — 61 tests covering: - Every branch of all 6 affected helper functions - Chat Completions, Anthropic, and Responses API formats - End-to-end agent mode and traditional mode conversation validity ## Test plan - [x] 61 new unit tests all pass - [x] 11 existing SmartDecisionMakerBlock tests still pass (no regressions) - [x] All pre-commit hooks pass (ruff, black, isort, pyright) - [ ] CI integration tests 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- CURSOR_SUMMARY --> --- > [!NOTE] > **Medium Risk** > Updates core LLM invocation and agent conversation/tool-call bookkeeping to match OpenAI’s Responses API, which can affect tool execution loops and prompt serialization across providers. Risk is mitigated by extensive new unit tests, but regressions could surface in production agent-mode flows or token/usage accounting. > > **Overview** > **Migrates OpenAI calls from Chat Completions to the Responses API end-to-end**, including tool schema conversion, output parsing, reasoning/text extraction, and updated token usage fields in `LLMResponse`. > > **Fixes SmartDecisionMakerBlock conversation/tool handling for Responses API** by treating `raw_response` as a Response object (splitting it into `output` items for replay), recognizing `function_call`/`function_call_output` entries, and emitting tool outputs in the correct Responses format to prevent invalid follow-up prompts. > > Also adjusts prompt compaction/token estimation to understand Responses API tool items, changes `get_execution_outputs_by_node_exec_id` to return list-valued `CompletedBlockOutput`, removes `gpt-3.5-turbo` from model/cost/docs lists, and adds focused unit tests plus a lightweight `conftest.py` to run these tests without the full server stack. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commitff292efd3d. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Otto <otto@agpt.co> Co-authored-by: Krzysztof Czerwinski <kpczerwinski@gmail.com>