Adding LLM Based Editing capability (#8677)

Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
2026-01-08 22:38:05 -05:00 · 2025-06-09 09:57:20 -04:00
parent 4eef22e04e
commit d84befe28f
7 changed files with 119 additions and 34 deletions
--- a/evaluation/README.md
+++ b/evaluation/README.md
@@ -74,6 +74,24 @@ If no condenser configuration is specified, the 'noop' condenser will be used by

 For other configurations specific to evaluation, such as `save_trajectory_path`, these are typically set in the `get_config` function of the respective `run_infer.py` file for each benchmark.

+### Enabling LLM-Based Editor Tools
+
+To enable the LLM-based editor tool (currently supported only for SWE-Bench), set the following environment variable:
+```bash
+export ENABLE_LLM_EDITOR=true
+```
+
+Configure the editor LLM in an `[llm.draft_editor]` section of your TOML config file:
+```toml
+[llm.draft_editor]
+base_url = "http://localhost:9002/v1"
+model = "hosted_vllm/lite_coder_qwen_editor_3B"
+api_key = ""
+temperature = 0.7
+max_input_tokens = 10500
+max_output_tokens = 10500
+```
+
 ## Supported Benchmarks

 The OpenHands evaluation harness supports a wide variety of benchmarks across [software engineering](#software-engineering), [web browsing](#web-browsing), [miscellaneous assistance](#misc-assistance), and [real-world](#real-world) tasks.
--- a/evaluation/benchmarks/swe_bench/run_infer.py
+++ b/evaluation/benchmarks/swe_bench/run_infer.py
@@ -42,7 +42,7 @@ from openhands.core.config import (
    AgentConfig,
    OpenHandsConfig,
    get_llm_config_arg,
-    get_parser,
+    get_parser
 )
 from openhands.core.config.condenser_config import NoOpCondenserConfig
 from openhands.core.config.utils import get_condenser_config_arg
@@ -62,6 +62,7 @@ from openhands.utils.shutdown_listener import sleep_if_should_continue

 USE_HINT_TEXT = os.environ.get('USE_HINT_TEXT', 'false').lower() == 'true'
 RUN_WITH_BROWSING = os.environ.get('RUN_WITH_BROWSING', 'false').lower() == 'true'
+ENABLE_LLM_EDITOR = os.environ.get('ENABLE_LLM_EDITOR', 'false').lower() == 'true'
 BenchMode = Literal['swe', 'swt', 'swt-ci']


@@ -254,15 +255,19 @@ def get_config(
        workspace_base=None,
        workspace_mount_path=None,
    )
+
    config.set_llm_config(
        update_llm_config_for_completions_logging(
            metadata.llm_config, metadata.eval_output_dir, instance['instance_id']
        )
    )
+    # Load the 'draft_editor' LLM config, if one is defined
+    config.set_llm_config(get_llm_config_arg('draft_editor'), 'draft_editor')
+
    agent_config = AgentConfig(
        enable_jupyter=False,
        enable_browsing=RUN_WITH_BROWSING,
-        enable_llm_editor=False,
+        enable_llm_editor=ENABLE_LLM_EDITOR,
        enable_mcp=False,
        condenser=metadata.condenser_config,
        enable_prompt_extensions=False,
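Note on the `ENABLE_LLM_EDITOR` flag in the hunks above: it follows the same pattern as `USE_HINT_TEXT` and `RUN_WITH_BROWSING`, where only the string `true` (in any letter case) turns the flag on. A minimal standalone sketch of that pattern follows; `env_flag` is a hypothetical helper written for this illustration and does not appear in the commit.

```python
import os
from collections.abc import Mapping


def env_flag(name: str, env: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical helper mirroring the diff's pattern:
    os.environ.get(NAME, 'false').lower() == 'true'
    """
    return env.get(name, "false").lower() == "true"


# Plain dicts stand in for the process environment in these examples.
enabled = env_flag("ENABLE_LLM_EDITOR", {"ENABLE_LLM_EDITOR": "True"})
unset = env_flag("ENABLE_LLM_EDITOR", {})
numeric = env_flag("ENABLE_LLM_EDITOR", {"ENABLE_LLM_EDITOR": "1"})
```

Under this comparison, values such as `1` or `yes` leave the flag disabled, so `export ENABLE_LLM_EDITOR=true` is the form to use.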