Fix frontend tests for GitHub token documentation changes

chore(deps): bump the version-all group with 4 updates (#7308)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: openhands <openhands@all-hands.dev>
2026-04-29 03:00:45 -04:00 · 2025-03-17 19:22:35 +00:00 · 2025-03-17 18:57:56 +00:00 · 2025-03-17 18:37:35 +00:00
34 changed files with 310 additions and 753 deletions
--- a/.github/workflows/deploy-docs.yml
+++ b/.github/workflows/deploy-docs.yml
@@ -11,7 +11,6 @@ on:
    paths:
      - 'docs/**'
      - '.github/workflows/deploy-docs.yml'
-      - 'pydoc-markdown.yml'
    branches:
      - main

@@ -40,10 +39,7 @@ jobs:
        with:
          python-version: '3.12'
      - name: Generate Python Docs
-        run: |
-          rm -rf docs/modules/python
-          pip install pydoc-markdown
-          pydoc-markdown
+        run: rm -rf docs/modules/python && pip install pydoc-markdown && pydoc-markdown
      - name: Install dependencies
        run: cd docs && npm ci
      - name: Build website
--- a/docs/modules/usage/configuration-options.md
+++ b/docs/modules/usage/configuration-options.md
@@ -308,11 +308,6 @@ The agent configuration options are defined in the `[agent]` and `[agent.<agent_
  - Default: `false`
  - Description: Whether Jupyter is enabled in the action space

-- `enable_search_engine`
-  - Type: `bool`
-  - Default: `false`
-  - Description: Whether the search engine tool is enabled in the action space. See [Search Configuration](./search/search-configuration.md) for details.
-
 - `enable_history_truncation`
  - Type: `bool`
  - Default: `true`
--- a/docs/modules/usage/search/search-configuration.md
+++ b/docs/modules/usage/search/search-configuration.md
@@ -1,113 +0,0 @@
-# Search Configuration
-
-OpenHands provides a search engine capability that allows agents to perform web searches using the Brave Search API. This guide explains how to configure and use the search feature.
-
-## Overview
-
-The search engine feature enables agents to:
-- Execute web search queries programmatically
-- Get structured results including web pages, news, videos, and FAQs
-- Avoid CAPTCHA challenges that often occur when using browser-based search
-
-## Configuration
-
-### Enabling Search
-
-To enable the search engine feature, set the following in your `config.toml`:
-
-```toml
-[agent]
-enable_search_engine = true
-```
-
-Or when using Docker, set the environment variable:
-```bash
--e AGENT_ENABLE_SEARCH_ENGINE=true
-```
-
-### API Key Setup
-
-The search feature requires a Brave Search API key. You can obtain one from the [Brave Search API Dashboard](https://api.search.brave.com/app/keys).
-
-Set the API key in your `config.toml`:
-```toml
-[search]
-enabled = true
-api_key = "your-api-key-here"
-```
-
-Or when using Docker:
-```bash
--e SEARCH_ENABLED=true
--e SEARCH_API_KEY="your-api-key-here"
-```
-
-## Search Results
-
-When a search is performed, the results are returned in a structured format that includes:
-
-- Web search results
-- News articles
-- Video content
-- FAQ entries
-- Discussion threads
-- Infoboxes (when available)
-- Location information (when relevant)
-
-Each result type includes:
-- Title
-- URL (when applicable)
-- Description or snippet
-- Additional metadata specific to the result type
-
-## Usage Example
-
-When the search feature is enabled, agents can use the `search_engine` tool to perform searches. For example:
-
-```python
-# The agent can make a tool call like this:
-{
-    "name": "search_engine",
-    "arguments": {
-        "query": "latest developments in AI"
-    }
-}
-```
-
-The search results will be returned in a markdown-formatted structure that's easy for the agent to parse and understand.
-
-## Best Practices
-
-1. **Query Formulation**
-   - Keep queries focused and specific
-   - Include relevant keywords
-   - Avoid overly complex or compound queries
-
-2. **Rate Limiting**
-   - Be mindful of API rate limits
-   - Cache results when appropriate
-   - Implement retries with exponential backoff for failed requests
-
-3. **Error Handling**
-   - Handle API errors gracefully
-   - Provide meaningful feedback when searches fail
-   - Have fallback strategies when search is unavailable
-
-## Troubleshooting
-
-Common issues and solutions:
-
-1. **Search Not Working**
-   - Verify `enable_search_engine` is set to `true`
-   - Confirm the Brave API key is correctly set
-   - Check API key permissions and quotas
-
-2. **No Results**
-   - Verify the query is not empty
-   - Try reformulating the search query
-   - Check for any API response errors
-
-3. **Rate Limiting**
-   - Monitor API usage
-   - Implement caching if needed
-   - Consider upgrading API tier if limits are consistently hit
--- a/evaluation/benchmarks/swe_bench/README.md
+++ b/evaluation/benchmarks/swe_bench/README.md
@@ -18,6 +18,20 @@ Please follow instruction [here](../../README.md#setup) to setup your local deve

 ## Run Inference (Rollout) on SWE-Bench Instances: Generate Patch from Problem Statement

+> [!NOTE]
+> **Iterative Evaluation Protocol**
+>
+> We have an iterative approach for more stable and reproducible results:
+> - For each instance, we attempt to generate a solution up to 3 times
+> - Each attempt continues until either:
+>   1. The agent successfully produces a patch with `AgentFinishAction`, or
+>   2. The attempt reaches the maximum iteration limit
+> - If an attempt fails, we retry with a fresh attempt (up to the 3-attempt maximum)
+> - If your LLM config has temperature=0, we will automatically use temperature=0.1 for the 2nd and 3rd attempts
+>
+> To enable this iterative protocol, set `export ITERATIVE_EVAL_MODE=true`
+
+
 ### Running Locally with Docker

 Make sure your Docker daemon is running, and you have ample disk space (at least 200-500GB, depends on the SWE-Bench set you are running on) for the instance-level docker image.
@@ -45,7 +59,7 @@ to `CodeActAgent`.
 default, the script evaluates the entire SWE-bench_Lite test set (300 issues). Note:
 in order to use `eval_limit`, you must also set `agent`.
 - `max_iter`, e.g. `20`, is the maximum number of iterations for the agent to run. By
-default, it is set to 30.
+default, it is set to 60.
 - `num_workers`, e.g. `3`, is the number of parallel workers to run the evaluation. By
 default, it is set to 1.
 - `dataset`, a huggingface dataset name. e.g. `princeton-nlp/SWE-bench`, `princeton-nlp/SWE-bench_Lite`, or `princeton-nlp/SWE-bench_Verified`, specifies which dataset to evaluate on.
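For orientation, the iterative protocol described in the README note above can be sketched as a small retry loop. This is a minimal sketch, not the code from `run_infer.py`: `run_attempt` and `finished` are hypothetical stand-ins for the rollout and the critic check, and only the 3-attempt limit and the switch from temperature 0 to 0.1 on retries are taken from the note.

```python
from typing import Callable


def iterative_eval(
    instance_ids: list[str],
    run_attempt: Callable[[str, float], dict],  # hypothetical rollout function
    finished: Callable[[dict], bool],  # hypothetical critic check
    temperature: float = 0.0,
    max_attempts: int = 3,
) -> dict[str, dict]:
    """Retry unfinished instances, keeping the most recent result for each."""
    results: dict[str, dict] = {}
    pending = list(instance_ids)
    for attempt in range(1, max_attempts + 1):
        # Retries of a deterministic config use a small non-zero temperature.
        if attempt > 1 and temperature == 0:
            temperature = 0.1
        for instance_id in pending:
            results[instance_id] = run_attempt(instance_id, temperature)
        # Only instances that did not finish are carried into the next attempt.
        pending = [i for i in pending if not finished(results[i])]
        if not pending:
            break
    return results


# Toy rollout: instance "b" only finishes once the temperature is non-zero.
def fake_rollout(instance_id: str, temperature: float) -> dict:
    done = instance_id == "a" or temperature > 0
    return {"instance_id": instance_id, "finished": done, "temperature": temperature}


outcome = iterative_eval(["a", "b"], fake_rollout, lambda r: r["finished"])
```

In this toy run, instance `a` is kept from the first attempt and only `b` is retried, which mirrors the note's rule that a fresh attempt is made only for instances that failed.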
--- a/evaluation/benchmarks/swe_bench/run_infer.py
+++ b/evaluation/benchmarks/swe_bench/run_infer.py
@@ -37,9 +37,10 @@ from openhands.core.config import (
 )
 from openhands.core.logger import openhands_logger as logger
 from openhands.core.main import create_runtime, run_controller
+from openhands.critic import AgentFinishedCritic
 from openhands.events.action import CmdRunAction, MessageAction
 from openhands.events.observation import CmdOutputObservation, ErrorObservation
-from openhands.events.serialization.event import event_to_dict
+from openhands.events.serialization.event import event_from_dict, event_to_dict
 from openhands.runtime.base import Runtime
 from openhands.utils.async_utils import call_async_from_sync
 from openhands.utils.shutdown_listener import sleep_if_should_continue
@@ -122,7 +123,9 @@ You SHOULD NEVER attempt to browse the web.


 # TODO: migrate all swe-bench docker to ghcr.io/openhands
-DEFAULT_DOCKER_IMAGE_PREFIX = os.environ.get('EVAL_DOCKER_IMAGE_PREFIX', 'docker.io/xingyaoww/')
+DEFAULT_DOCKER_IMAGE_PREFIX = os.environ.get(
+    'EVAL_DOCKER_IMAGE_PREFIX', 'docker.io/xingyaoww/'
+)
 logger.info(f'Default docker image prefix: {DEFAULT_DOCKER_IMAGE_PREFIX}')


@@ -637,20 +640,132 @@ if __name__ == '__main__':

    output_file = os.path.join(metadata.eval_output_dir, 'output.jsonl')
    print(f'### OUTPUT FILE: {output_file} ###')
-    instances = prepare_dataset(swe_bench_tests, output_file, args.eval_n_limit)

-    if len(instances) > 0 and not isinstance(
-        instances['PASS_TO_PASS'][instances['PASS_TO_PASS'].index[0]], str
-    ):
-        for col in ['PASS_TO_PASS', 'FAIL_TO_PASS']:
-            instances[col] = instances[col].apply(lambda x: str(x))
-
-    run_evaluation(
-        instances,
-        metadata,
-        output_file,
-        args.eval_num_workers,
-        process_instance,
-        timeout_seconds=8 * 60 * 60,  # 8 hour PER instance should be more than enough
-        max_retries=5,
+    # Run evaluation in iterative mode:
+    # If a rollout fails to output AgentFinishAction, we will try again until it succeeds OR total 3 attempts have been made.
+    ITERATIVE_EVAL_MODE = (
+        os.environ.get('ITERATIVE_EVAL_MODE', 'false').lower() == 'true'
    )
+    ITERATIVE_EVAL_MODE_MAX_ATTEMPTS = int(
+        os.environ.get('ITERATIVE_EVAL_MODE_MAX_ATTEMPTS', '3')
+    )
+
+    if not ITERATIVE_EVAL_MODE:
+        # load the dataset
+        instances = prepare_dataset(swe_bench_tests, output_file, args.eval_n_limit)
+        if len(instances) > 0 and not isinstance(
+            instances['PASS_TO_PASS'][instances['PASS_TO_PASS'].index[0]], str
+        ):
+            for col in ['PASS_TO_PASS', 'FAIL_TO_PASS']:
+                instances[col] = instances[col].apply(lambda x: str(x))
+
+        run_evaluation(
+            instances,
+            metadata,
+            output_file,
+            args.eval_num_workers,
+            process_instance,
+            timeout_seconds=8
+            * 60
+            * 60,  # 8 hour PER instance should be more than enough
+            max_retries=5,
+        )
+    else:
+        critic = AgentFinishedCritic()
+
+        def get_cur_output_file_path(attempt: int) -> str:
+            return (
+                f'{output_file.removesuffix(".jsonl")}.critic_attempt_{attempt}.jsonl'
+            )
+
+        eval_ids = None
+        for attempt in range(1, ITERATIVE_EVAL_MODE_MAX_ATTEMPTS + 1):
+            cur_output_file = get_cur_output_file_path(attempt)
+            logger.info(
+                f'Running evaluation with critic {critic.__class__.__name__} for attempt {attempt} of {ITERATIVE_EVAL_MODE_MAX_ATTEMPTS}.'
+            )
+
+            # For deterministic eval, we set temperature to 0.1 for (>1) attempt
+            # so hopefully we get slightly different results
+            if attempt > 1 and metadata.llm_config.temperature == 0:
+                logger.info(
+                    f'Detected temperature is 0 for (>1) attempt {attempt}. Setting temperature to 0.1...'
+                )
+                metadata.llm_config.temperature = 0.1
+
+            # Load instances - at first attempt, we evaluate all instances
+            # On subsequent attempts, we only evaluate the instances that failed the previous attempt determined by critic
+            instances = prepare_dataset(
+                swe_bench_tests, cur_output_file, args.eval_n_limit, eval_ids=eval_ids
+            )
+            if len(instances) > 0 and not isinstance(
+                instances['PASS_TO_PASS'][instances['PASS_TO_PASS'].index[0]], str
+            ):
+                for col in ['PASS_TO_PASS', 'FAIL_TO_PASS']:
+                    instances[col] = instances[col].apply(lambda x: str(x))
+
+            # Run evaluation - but save them to cur_output_file
+            logger.info(
+                f'Evaluating {len(instances)} instances for attempt {attempt}...'
+            )
+            run_evaluation(
+                instances,
+                metadata,
+                cur_output_file,
+                args.eval_num_workers,
+                process_instance,
+                timeout_seconds=8
+                * 60
+                * 60,  # 8 hour PER instance should be more than enough
+                max_retries=5,
+            )
+
+            # When eval is done, we update eval_ids to the instances that failed the current attempt
+            instances_failed = []
+            logger.info(
+                f'Use critic {critic.__class__.__name__} to check {len(instances)} instances for attempt {attempt}...'
+            )
+            with open(cur_output_file, 'r') as f:
+                for line in f:
+                    instance = json.loads(line)
+                    history = [event_from_dict(event) for event in instance['history']]
+                    critic_result = critic.evaluate(history)
+                    if not critic_result.success:
+                        instances_failed.append(instance['instance_id'])
+            logger.info(
+                f'{len(instances_failed)} instances failed the current attempt {attempt}: {instances_failed}'
+            )
+            eval_ids = instances_failed
+
+            # If no instances failed, we break
+            if len(instances_failed) == 0:
+                break
+
+        # Then we should aggregate the results from all attempts into the original output file
+        # and remove the intermediate files
+        logger.info(
+            'Aggregating results from all attempts into the original output file...'
+        )
+        fout = open(output_file, 'w')
+        added_instance_ids = set()
+        for attempt in reversed(range(1, ITERATIVE_EVAL_MODE_MAX_ATTEMPTS + 1)):
+            cur_output_file = get_cur_output_file_path(attempt)
+            if not os.path.exists(cur_output_file):
+                logger.warning(
+                    f'Intermediate output file {cur_output_file} does not exist. Skipping...'
+                )
+                continue
+
+            with open(cur_output_file, 'r') as f:
+                for line in f:
+                    instance = json.loads(line)
+                    if instance['instance_id'] not in added_instance_ids:
+                        fout.write(line)
+                        added_instance_ids.add(instance['instance_id'])
+            logger.info(
+                f'Aggregated instances from {cur_output_file}. Total instances added so far: {len(added_instance_ids)}'
+            )
+        fout.close()
+        logger.info(
+            f'Done! Total {len(added_instance_ids)} instances added to {output_file}'
+        )
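The aggregation step at the end of the hunk above walks the attempts from last to first and keeps the first record seen per `instance_id`, so the newest attempt wins. A minimal standalone sketch of that idea follows; the function name `aggregate_attempts` and the file names are hypothetical, and using `with` for the output file is a choice of this sketch rather than something the diff does.

```python
import json
import os
import tempfile


def aggregate_attempts(attempt_files: list[str], output_file: str) -> int:
    """Merge per-attempt JSONL files, preferring later attempts per instance."""
    seen: set[str] = set()
    with open(output_file, "w") as fout:
        # Walk attempts from last to first so the newest record wins.
        for path in reversed(attempt_files):
            if not os.path.exists(path):
                continue  # an early exit from the retry loop leaves later files unwritten
            with open(path) as fin:
                for line in fin:
                    instance_id = json.loads(line)["instance_id"]
                    if instance_id not in seen:
                        fout.write(line)
                        seen.add(instance_id)
    return len(seen)


with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "attempt_1.jsonl")
    second = os.path.join(tmp, "attempt_2.jsonl")
    merged_path = os.path.join(tmp, "output.jsonl")
    with open(first, "w") as f:
        f.write(json.dumps({"instance_id": "a", "attempt": 1}) + "\n")
        f.write(json.dumps({"instance_id": "b", "attempt": 1}) + "\n")
    with open(second, "w") as f:
        f.write(json.dumps({"instance_id": "b", "attempt": 2}) + "\n")

    total = aggregate_attempts([first, second], merged_path)
    with open(merged_path) as f:
        merged = {r["instance_id"]: r["attempt"] for r in map(json.loads, f)}
```

Here `b` is taken from the second attempt and `a` from the first, since `a` was never retried.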
--- a/evaluation/benchmarks/swe_bench/scripts/run_infer.sh
+++ b/evaluation/benchmarks/swe_bench/scripts/run_infer.sh
@@ -25,8 +25,8 @@ if [ -z "$AGENT" ]; then
 fi

 if [ -z "$MAX_ITER" ]; then
-  echo "MAX_ITER not specified, use default 100"
-  MAX_ITER=100
+  echo "MAX_ITER not specified, use default 60"
+  MAX_ITER=60
 fi

 if [ -z "$RUN_WITH_BROWSING" ]; then
--- a/frontend/tests/routes/settings.test.tsx
+++ b/frontend/tests/routes/settings.test.tsx
@@ -95,7 +95,10 @@ describe("Settings Screen", () => {

      await waitFor(() => {
        screen.getByTestId("github-token-input");
-        screen.getByTestId("github-token-help-anchor");
+        // Check for GitHub link instead of the help anchor
+        screen.getByRole("link", { name: "GitHub" });
+        // Check for documentation link
+        screen.getByRole("link", { name: "documentation" });
        screen.getByTestId("language-input");
        screen.getByTestId("enable-analytics-switch");
      });
@@ -237,10 +240,12 @@ describe("Settings Screen", () => {

      await waitFor(() => {
        const input = screen.queryByTestId("github-token-input");
-        const helpAnchor = screen.queryByTestId("github-token-help-anchor");
+        const githubLink = screen.queryByText("GitHub");
+        const documentationLink = screen.queryByText("documentation");

        expect(input).not.toBeInTheDocument();
-        expect(helpAnchor).not.toBeInTheDocument();
+        expect(githubLink).not.toBeInTheDocument();
+        expect(documentationLink).not.toBeInTheDocument();
      });
    });

--- a/frontend/src/routes/account-settings.tsx
+++ b/frontend/src/routes/account-settings.tsx
@@ -410,12 +410,31 @@ function AccountSettings() {
                  placeholder={isGitHubTokenSet ? "**********" : ""}
                />

-                <HelpLink
-                  testId="github-token-help-anchor"
-                  text="Get your token"
-                  linkText="here"
-                  href="https://github.com/settings/tokens/new?description=openhands-app&scopes=repo,user,workflow"
-                />
+                <p className="text-xs">
+                  Generate a token on{" "}
+                  <b>
+                    <a
+                      href="https://github.com/settings/tokens/new?description=openhands-app&scopes=repo,user,workflow"
+                      target="_blank"
+                      className="underline underline-offset-2"
+                      rel="noopener noreferrer"
+                    >
+                      GitHub
+                    </a>{" "}
+                  </b>
+                  or see the{" "}
+                  <b>
+                    <a
+                      href="https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token"
+                      target="_blank"
+                      className="underline underline-offset-2"
+                      rel="noopener noreferrer"
+                    >
+                      documentation
+                    </a>
+                  </b>
+                  .
+                </p>
              </>
            )}

--- a/openhands/agenthub/codeact_agent/codeact_agent.py
+++ b/openhands/agenthub/codeact_agent/codeact_agent.py
@@ -70,7 +70,6 @@ class CodeActAgent(Agent):
            codeact_enable_browsing=self.config.codeact_enable_browsing,
            codeact_enable_jupyter=self.config.codeact_enable_jupyter,
            codeact_enable_llm_editor=self.config.codeact_enable_llm_editor,
-            codeact_enable_search_engine=self.config.enable_search_engine,
            llm=self.llm,
        )
        logger.debug(
--- a/openhands/agenthub/codeact_agent/function_calling.py
+++ b/openhands/agenthub/codeact_agent/function_calling.py
@@ -15,7 +15,6 @@ from openhands.agenthub.codeact_agent.tools import (
    FinishTool,
    IPythonTool,
    LLMBasedFileEditTool,
-    SearchEngineTool,
    ThinkTool,
    WebReadTool,
    create_cmd_run_tool,
@@ -37,7 +36,6 @@ from openhands.events.action import (
    FileReadAction,
    IPythonRunCellAction,
    MessageAction,
-    SearchAction,
 )
 from openhands.events.event import FileEditSource, FileReadSource
 from openhands.events.tool import ToolCallMetadata
@@ -193,15 +191,6 @@ def response_to_actions(response: ModelResponse) -> list[Action]:
                        f'Missing required argument "url" in tool call {tool_call.function.name}'
                    )
                action = BrowseURLAction(url=arguments['url'])
-            # ================================================
-            # SearchEngineTool (search the web using text queries)
-            # ================================================
-            elif tool_call.function.name == SearchEngineTool['function']['name']:
-                if 'query' not in arguments:
-                    raise FunctionCallNotExistsError(
-                        f'Missing required argument "query" in tool call {tool_call.function.name}'
-                    )
-                action = SearchAction(query=arguments['query'])
            else:
                raise FunctionCallNotExistsError(
                    f'Tool {tool_call.function.name} is not registered. (arguments: {arguments}). Please check the tool name and retry with an existing tool.'
@@ -234,7 +223,6 @@ def get_tools(
    codeact_enable_browsing: bool = False,
    codeact_enable_llm_editor: bool = False,
    codeact_enable_jupyter: bool = False,
-    codeact_enable_search_engine: bool = False,
    llm: LLM | None = None,
 ) -> list[ChatCompletionToolParam]:
    SIMPLIFIED_TOOL_DESCRIPTION_LLM_SUBSTRS = ['gpt-', 'o3', 'o1']
@@ -251,8 +239,6 @@ def get_tools(
        ThinkTool,
        FinishTool,
    ]
-    if codeact_enable_search_engine:
-        tools.append(SearchEngineTool)
    if codeact_enable_browsing:
        tools.append(WebReadTool)
        tools.append(BrowserTool)
--- a/openhands/agenthub/codeact_agent/tools/__init__.py
+++ b/openhands/agenthub/codeact_agent/tools/__init__.py
@@ -3,7 +3,6 @@ from .browser import BrowserTool
 from .finish import FinishTool
 from .ipython import IPythonTool
 from .llm_based_edit import LLMBasedFileEditTool
-from .search_engine import SearchEngineTool
 from .str_replace_editor import create_str_replace_editor_tool
 from .think import ThinkTool
 from .web_read import WebReadTool
@@ -14,7 +13,6 @@ __all__ = [
    'FinishTool',
    'IPythonTool',
    'LLMBasedFileEditTool',
-    'SearchEngineTool',
    'create_str_replace_editor_tool',
    'WebReadTool',
    'ThinkTool',
--- a/openhands/agenthub/codeact_agent/tools/search_engine.py
+++ b/openhands/agenthub/codeact_agent/tools/search_engine.py
@@ -1,24 +0,0 @@
-from litellm import ChatCompletionToolParam, ChatCompletionToolParamFunctionChunk
-
-_SEARCH_ENGINE_DESCRIPTION = """Execute a web search query (similar to Google search).
-
-NOTE: When you need to search for information online, please use the `search_engine` tool rather than the `browser` or `web_read` tools. The `search_engine` tool connects directly to a search engine, which will help avoid CAPTCHA challenges that would otherwise block your access.
-"""
-
-SearchEngineTool = ChatCompletionToolParam(
-    type='function',
-    function=ChatCompletionToolParamFunctionChunk(
-        name='search_engine',
-        description=_SEARCH_ENGINE_DESCRIPTION,
-        parameters={
-            'type': 'object',
-            'properties': {
-                'query': {
-                    'type': 'string',
-                    'description': 'The web search query (must be a non-empty string).',
-                },
-            },
-            'required': ['query'],
-        },
-    ),
-)
--- a/openhands/core/config/__init__.py
+++ b/openhands/core/config/__init__.py
@@ -8,7 +8,6 @@ from openhands.core.config.config_utils import (
 from openhands.core.config.extended_config import ExtendedConfig
 from openhands.core.config.llm_config import LLMConfig
 from openhands.core.config.sandbox_config import SandboxConfig
-from openhands.core.config.search_config import SearchConfig
 from openhands.core.config.security_config import SecurityConfig
 from openhands.core.config.utils import (
    finalize_config,
@@ -29,7 +28,6 @@ __all__ = [
    'AppConfig',
    'LLMConfig',
    'SandboxConfig',
-    'SearchConfig',
    'SecurityConfig',
    'ExtendedConfig',
    'load_app_config',
--- a/openhands/core/config/agent_config.py
+++ b/openhands/core/config/agent_config.py
@@ -2,10 +2,7 @@ from __future__ import annotations

 from pydantic import BaseModel, Field, ValidationError

-from openhands.core.config.condenser_config import (
-    CondenserConfig,
-    NoOpCondenserConfig,
-)
+from openhands.core.config.condenser_config import CondenserConfig, NoOpCondenserConfig
 from openhands.core.logger import openhands_logger as logger


@@ -33,7 +30,6 @@ class AgentConfig(BaseModel):
    disabled_microagents: list[str] = Field(default_factory=list)
    enable_history_truncation: bool = Field(default=True)
    enable_som_visual_browsing: bool = Field(default=False)
-    enable_search_engine: bool = Field(default=False)
    condenser: CondenserConfig = Field(default_factory=NoOpCondenserConfig)

    model_config = {'extra': 'forbid'}
--- a/openhands/core/config/app_config.py
+++ b/openhands/core/config/app_config.py
@@ -12,7 +12,6 @@ from openhands.core.config.config_utils import (
 from openhands.core.config.extended_config import ExtendedConfig
 from openhands.core.config.llm_config import LLMConfig
 from openhands.core.config.sandbox_config import SandboxConfig
-from openhands.core.config.search_config import SearchConfig
 from openhands.core.config.security_config import SecurityConfig


@@ -54,7 +53,6 @@ class AppConfig(BaseModel):
    default_agent: str = Field(default=OH_DEFAULT_AGENT)
    sandbox: SandboxConfig = Field(default_factory=SandboxConfig)
    security: SecurityConfig = Field(default_factory=SecurityConfig)
-    search: SearchConfig = Field(default_factory=SearchConfig)
    extended: ExtendedConfig = Field(default_factory=lambda: ExtendedConfig({}))
    runtime: str = Field(default='docker')
    file_store: str = Field(default='local')
--- a/openhands/core/config/search_config.py
+++ b/openhands/core/config/search_config.py
@@ -1,35 +0,0 @@
-"""Configuration for search engine functionality."""
-
-import os
-from typing import Any
-
-from pydantic import BaseModel, Field, SecretStr
-
-
-class SearchConfig(BaseModel):
-    """Configuration for search engine functionality.
-
-    Attributes:
-        enabled: Whether search engine functionality is enabled.
-        api_key: The API key for the search engine.
-        api_url: The base URL for the search API.
-    """
-
-    enabled: bool = Field(default=False)
-    api_key: SecretStr | None = Field(default=None)
-    api_url: str = Field(default="https://api.search.brave.com/res/v1/web/search")
-
-    model_config = {"extra": "forbid"}
-
-    def model_post_init(self, __context: Any) -> None:
-        """Post-initialization hook to assign search-related variables to environment variables.
-
-        This ensures that these values are accessible to the search engine at runtime.
-        """
-        super().model_post_init(__context)
-
-        # Set environment variables for search engine
-        if self.api_key:
-            os.environ["BRAVE_API_KEY"] = self.api_key.get_secret_value()
-        if self.api_url:
-            os.environ["BRAVE_API_URL"] = self.api_url
--- a/openhands/core/schema/action.py
+++ b/openhands/core/schema/action.py
@@ -82,9 +82,6 @@ class ActionTypeSchema(BaseModel):
    SEND_PR: str = Field(default='send_pr')
    """Send a PR to github."""

-    SEARCH: str = Field(default='search')
-    """Queries a search engine."""
-
    RECALL: str = Field(default='recall')
    """Retrieves content from a user workspace, microagent, or other source."""

--- a/openhands/core/schema/observation.py
+++ b/openhands/core/schema/observation.py
@@ -49,9 +49,6 @@ class ObservationTypeSchema(BaseModel):
    CONDENSE: str = Field(default='condense')
    """Result of a condensation operation."""

-    SEARCH: str = Field(default='search')
-    """Result of querying a search engine."""
-
    RECALL: str = Field(default='recall')
    """Result of a recall operation. This can be the workspace context, a microagent, or other types of information."""

--- a/openhands/critic/__init__.py
+++ b/openhands/critic/__init__.py
@@ -0,0 +1,4 @@
+from .base import BaseCritic, CriticResult
+from .finish_critic import AgentFinishedCritic
+
+__all__ = ['CriticResult', 'BaseCritic', 'AgentFinishedCritic']
--- a/openhands/critic/base.py
+++ b/openhands/critic/base.py
@@ -0,0 +1,31 @@
+import abc
+
+from pydantic import BaseModel
+
+from openhands.events import Event
+
+
+class CriticResult(BaseModel):
+    """
+    A critic result is a score and a message.
+    """
+
+    score: float
+    message: str
+
+    @property
+    def success(self) -> bool:
+        """
+        Whether the agent is successful.
+        """
+        return self.score >= 0.5
+
+
+class BaseCritic(abc.ABC):
+    """
+    A critic is a function that takes in a list of events and returns a score about the quality of those events.
+    """
+
+    @abc.abstractmethod
+    def evaluate(self, events: list[Event]) -> CriticResult:
+        pass
--- a/openhands/critic/finish_critic.py
+++ b/openhands/critic/finish_critic.py
@@ -0,0 +1,21 @@
+from openhands.critic.base import BaseCritic, CriticResult
+from openhands.events import Event
+from openhands.events.action import Action, AgentFinishAction
+
+
+class AgentFinishedCritic(BaseCritic):
+    """This is a simple rule-based critic that checks if the last event is an AgentFinishAction.
+
+    If not, it will return a score of 0 and a message indicating that the agent did not finish.
+    """
+
+    def __init__(self):
+        pass
+
+    def evaluate(self, events: list[Event]) -> CriticResult:
+        last_action = next((h for h in reversed(events) if isinstance(h, Action)), None)
+
+        if isinstance(last_action, AgentFinishAction):
+            return CriticResult(score=1, message='Agent finished.')
+        else:
+            return CriticResult(score=0, message='Agent did not finish.')
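One detail of `AgentFinishedCritic.evaluate` worth illustrating: it scans backwards for the most recent `Action`, so non-action events that follow the finish action do not change the verdict. The sketch below reproduces that rule with stand-in classes (`Event`, `Action`, `Observation`, `FinishAction`, `RunAction`, and `Result` are all defined here for illustration and are not the OpenHands types), so it runs without the package installed.

```python
from dataclasses import dataclass


# Stand-in event types for illustration only; not the openhands.events classes.
class Event: ...
class Action(Event): ...
class Observation(Event): ...
class FinishAction(Action): ...
class RunAction(Action): ...


@dataclass
class Result:
    score: float
    message: str

    @property
    def success(self) -> bool:
        # Same threshold as CriticResult.success in the diff.
        return self.score >= 0.5


def evaluate(events: list[Event]) -> Result:
    # Scan backwards for the most recent Action, skipping other event types.
    last_action = next((e for e in reversed(events) if isinstance(e, Action)), None)
    if isinstance(last_action, FinishAction):
        return Result(score=1, message="Agent finished.")
    return Result(score=0, message="Agent did not finish.")


finished = evaluate([RunAction(), FinishAction(), Observation()])
unfinished = evaluate([FinishAction(), RunAction()])
empty = evaluate([])
```

A trailing `Observation` still counts as finished, a later non-finish action does not, and an empty history is treated as not finished.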
--- a/openhands/events/action/__init__.py
+++ b/openhands/events/action/__init__.py
@@ -17,7 +17,6 @@ from openhands.events.action.files import (
    FileWriteAction,
 )
 from openhands.events.action.message import MessageAction
-from openhands.events.action.search_engine import SearchAction

 __all__ = [
    'Action',
@@ -37,6 +36,5 @@ __all__ = [
    'MessageAction',
    'ActionConfirmationStatus',
    'AgentThinkAction',
-    'SearchAction',
    'RecallAction',
 ]
--- a/openhands/events/action/search_engine.py
+++ b/openhands/events/action/search_engine.py
@@ -1,24 +0,0 @@
-from dataclasses import dataclass
-from typing import ClassVar
-
-from openhands.core.schema import ActionType
-from openhands.events.action.action import Action
-
-
-@dataclass
-class SearchAction(Action):
-    query: str
-    thought: str = ''
-    action: str = ActionType.SEARCH
-    runnable: ClassVar[bool] = True
-
-    @property
-    def message(self) -> str:
-        return f'I am querying the search engine to search for {self.query}'
-
-    def __str__(self) -> str:
-        ret = '**SearchAction**\n'
-        if self.thought:
-            ret += f'THOUGHT: {self.thought}\n'
-        ret += f'QUERY: {self.query}'
-        return ret
--- a/openhands/events/observation/__init__.py
+++ b/openhands/events/observation/__init__.py
@@ -5,7 +5,6 @@ from openhands.events.observation.agent import (
    AgentThinkObservation,
    RecallObservation,
 )
-from openhands.events.observation.search_engine import SearchEngineObservation
 from openhands.events.observation.browse import BrowserOutputObservation
 from openhands.events.observation.commands import (
    CmdOutputMetadata,
@@ -43,7 +42,6 @@ __all__ = [
    'SuccessObservation',
    'UserRejectObservation',
    'AgentCondensationObservation',
-    'SearchEngineObservation',
    'RecallObservation',
    'RecallType',
 ]
--- a/openhands/events/observation/search_engine.py
+++ b/openhands/events/observation/search_engine.py
@@ -1,22 +0,0 @@
-from dataclasses import dataclass
-
-from openhands.core.schema import ObservationType
-from openhands.events.observation.observation import Observation
-
-
-@dataclass
-class SearchEngineObservation(Observation):
-    query: str
-    observation: str = ObservationType.SEARCH
-
-    @property
-    def message(self) -> str:
-        return f'Searched for: {self.query}'
-
-    def __str__(self) -> str:
-        ret = (
-            '**SearchEngineObservation**\n'
-            f'Query: {self.query}\n'
-            f'Search Results: {self.content}\n'
-        )
-        return ret
--- a/openhands/events/serialization/action.py
+++ b/openhands/events/serialization/action.py
@@ -22,7 +22,6 @@ from openhands.events.action.files import (
    FileWriteAction,
 )
 from openhands.events.action.message import MessageAction
-from openhands.events.action.search_engine import SearchAction

 actions = (
    NullAction,
@@ -40,7 +39,6 @@ actions = (
    RecallAction,
    ChangeAgentStateAction,
    MessageAction,
-    SearchAction,
 )

 ACTION_TYPE_TO_CLASS = {action_class.action: action_class for action_class in actions}  # type: ignore[attr-defined]
--- a/openhands/memory/conversation_memory.py
+++ b/openhands/memory/conversation_memory.py
@@ -27,7 +27,6 @@ from openhands.events.observation import (
    FileEditObservation,
    FileReadObservation,
    IPythonRunCellObservation,
-    SearchEngineObservation,
    UserRejectObservation,
 )
 from openhands.events.observation.agent import (
@@ -386,9 +385,6 @@ class ConversationMemory:
        elif isinstance(obs, AgentCondensationObservation):
            text = truncate_content(obs.content, max_message_chars)
            message = Message(role='user', content=[TextContent(text=text)])
-        elif isinstance(obs, SearchEngineObservation):
-            text = truncate_content(obs.content, max_message_chars)
-            message = Message(role='user', content=[TextContent(text=text)])
        elif (
            isinstance(obs, RecallObservation)
            and self.agent_config.enable_prompt_extensions
--- a/openhands/runtime/action_execution_server.py
+++ b/openhands/runtime/action_execution_server.py
@@ -41,7 +41,6 @@ from openhands.events.action import (
    FileReadAction,
    FileWriteAction,
    IPythonRunCellAction,
-    SearchAction,
 )
 from openhands.events.event import FileEditSource, FileReadSource
 from openhands.events.observation import (
@@ -57,7 +56,6 @@ from openhands.events.serialization import event_from_dict, event_to_dict
 from openhands.runtime.browser import browse
 from openhands.runtime.browser.browser_env import BrowserEnv
 from openhands.runtime.plugins import ALL_PLUGINS, JupyterPlugin, Plugin, VSCodePlugin
-from openhands.runtime.search_engine.brave_search import search
 from openhands.runtime.utils.bash import BashSession
 from openhands.runtime.utils.files import insert_lines, read_lines
 from openhands.runtime.utils.memory_monitor import MemoryMonitor
@@ -165,6 +163,7 @@ class ActionExecutor:
        self.start_time = time.time()
        self.last_execution_time = self.start_time
        self._initialized = False
+
        self.max_memory_gb: int | None = None
        if _override_max_memory_gb := os.environ.get('RUNTIME_MAX_MEMORY_GB', None):
            self.max_memory_gb = int(_override_max_memory_gb)
@@ -465,10 +464,6 @@ class ActionExecutor:
    async def browse_interactive(self, action: BrowseInteractiveAction) -> Observation:
        return await browse(action, self.browser)

-    async def search(self, action: SearchAction) -> Observation:
-        obs = await call_sync_from_async(search, action)
-        return obs
-
    def close(self):
        self.memory_monitor.stop_monitoring()
        if self.bash_session is not None:
--- a/openhands/runtime/impl/action_execution/action_execution_client.py
+++ b/openhands/runtime/impl/action_execution/action_execution_client.py
@@ -24,7 +24,6 @@ from openhands.events.action import (
    FileReadAction,
    FileWriteAction,
    IPythonRunCellAction,
-    SearchAction,
 )
 from openhands.events.action.action import Action
 from openhands.events.action.files import FileEditSource
@@ -298,9 +297,6 @@ class ActionExecutionClient(Runtime):
    def browse_interactive(self, action: BrowseInteractiveAction) -> Observation:
        return self.send_action_for_execution(action)

-    def search(self, action: SearchAction) -> Observation:
-        return self.send_action_for_execution(action)
-
    def close(self) -> None:
        # Make sure we don't close the session multiple times
        # Can happen in evaluation
--- a/openhands/runtime/search_engine/__init__.py
+++ b/openhands/runtime/search_engine/__init__.py
@@ -1,3 +0,0 @@
-from openhands.runtime.search_engine.brave_search import search
-
-__all__ = ['search']
--- a/openhands/runtime/search_engine/brave_search.py
+++ b/openhands/runtime/search_engine/brave_search.py
@@ -1,239 +0,0 @@
-import os
-import re
-
-import requests
-import tenacity
-
-from openhands.core.config import AppConfig
-from openhands.events.action import SearchAction
-from openhands.events.observation.error import ErrorObservation
-from openhands.events.observation.search_engine import SearchEngineObservation
-from openhands.utils.tenacity_stop import stop_if_should_exit
-
-
-def get_title(result):
-    return f"### Title: {result['title']}\n" if 'title' in result else ''
-
-
-def get_url(result):
-    return f"### URL: {result['url']}\n" if 'url' in result else ''
-
-
-def get_description(result):
-    return (
-        f"### Description: {result['description']}\n" if 'description' in result else ''
-    )
-
-
-def get_question(result):
-    return f"### Question: {result['question']}\n" if 'question' in result else ''
-
-
-def get_answer(result):
-    return f"### Answer: {result['answer']}\n" if 'answer' in result else ''
-
-
-def get_cluster(result):
-    if 'cluster' in result:
-        output = ''
-        for i, result_obj in enumerate(result['cluster']):
-            title = get_title(result_obj)
-            url = get_url(result_obj)
-            description = get_description(result_obj)
-            discussion_output = (
-                f'### Related webpage\n#{title}#{url}#{description}\n'
-                if url != ''
-                else ''
-            )
-            output += discussion_output
-        return output
-    else:
-        return ''
-
-
-def response_to_markdown(results, query):
-    all_results = {}
-
-    # discussions
-    discussion_results = []
-    if 'discussions' in results and 'results' in results['discussions']['results']:
-        for result in results['discussions']['results']:
-            title = get_title(result)
-            url = get_url(result)
-            description = get_description(result)
-            cluster = get_cluster(result)
-            discussion_output = f'## Discussion\n{title}{url}{description}{cluster}\n'
-            discussion_results.append(discussion_output)
-    all_results['discussions'] = discussion_results
-
-    # FAQs
-    faq_results = []
-    if 'faq' in results and 'results' in results['faq']:
-        for result in results['faq']['results']:
-            title = get_title(result)
-            url = get_url(result)
-            question = get_question(result)
-            answer = get_answer(result)
-            faq_output = f'## FAQ\n{title}{url}{question}{answer}\n'
-            faq_results.append(faq_output)
-    all_results['faq'] = faq_results
-
-    # News
-    news_results = []
-    if 'news' in results and 'results' in results['news']:
-        for result in results['news']['results']:
-            title = get_title(result)
-            url = get_url(result)
-            description = get_description(result)
-            news_output = f'## News\n{title}{url}{description}\n'
-            news_results.append(news_output)
-    all_results['news'] = news_results
-
-    # Videos
-    video_results = []
-    if 'videos' in results and 'results' in results['videos']:
-        for result in results['videos']['results']:
-            title = get_title(result)
-            url = get_url(result)
-            description = get_description(result)
-            video_output = f'## Video\n{title}{url}{description}\n'
-            video_results.append(video_output)
-    all_results['videos'] = video_results
-
-    # Web Search Results
-    websearch_results = []
-    if 'web' in results and 'results' in results['web']:
-        for result in results['web']['results']:
-            title = get_title(result)
-            url = get_url(result)
-            description = get_description(result)
-            cluster = get_cluster(result)
-            if cluster:
-                websearch_output = f'## Webpage\n{title}{url}{description}\n{cluster}\n'
-            else:
-                websearch_output = f'## Webpage\n{title}{url}{description}\n'
-            websearch_results.append(websearch_output)
-    all_results['web'] = websearch_results
-
-    # infobox
-    infobox_results = []
-    if 'infobox' in results and 'results' in results['infobox']:
-        for result in results['infobox']['results']:
-            title = get_title(result)
-            url = get_url(result)
-            description = get_description(result)
-            infobox_output = f'## Infobox\n{title}{url}{description}\n'
-            infobox_results.append(infobox_output)
-    all_results['infobox'] = infobox_results
-
-    # locations
-    location_results = []
-    if 'locations' in results and 'results' in results['location']:
-        for result in results['locations']['results']:
-            title = get_title(result)
-            url = get_url(result)
-            description = get_description(result)
-            location_output = f'## Location\n{title}{url}{description}\n'
-            location_results.append(location_output)
-    all_results['locations'] = location_results
-
-    markdown = '# Search Results\n\n'
-    markdown += f'**Searched query**: {query}\n\n'
-
-    # ranked results if available
-    if 'mixed' in results:
-        for rank_type in ['main', 'top', 'side']:
-            if rank_type not in results['mixed']:
-                continue
-            for ranked_result in results['mixed'][rank_type]:
-                result_type = ranked_result['type']
-                if result_type in all_results:
-                    include_all = ranked_result['all']
-                    idx = ranked_result.get('index', None)
-                    if include_all:
-                        markdown += ''.join(all_results[result_type])
-                    elif idx is not None and idx < len(all_results[result_type]):
-                        markdown += all_results[result_type][idx]
-        for result_list in all_results.values():
-            for result in result_list:
-                if result in markdown:
-                    continue
-                else:
-                    markdown += result
-    else:
-        markdown += ''.join(
-            websearch_results
-            + video_results
-            + news_results
-            + infobox_results
-            + faq_results
-            + discussion_results
-            + location_results
-        )
-    return markdown
-
-
-def return_error(retry_state: tenacity.RetryCallState):
-    return ErrorObservation('Failed to query Brave Search API.')
-
-
-@tenacity.retry(
-    wait=tenacity.wait_exponential(min=2, max=10),
-    stop=tenacity.stop_after_attempt(5) | stop_if_should_exit(),
-    retry_error_callback=return_error,
-)
-def query_api(query: str, API_KEY, BRAVE_SEARCH_URL):
-    headers = {'Accept': 'application/json', 'X-Subscription-Token': API_KEY}
-
-    params: list[tuple[str, str | int | bool]] = [
-        ('q', query),
-        ('count', 20),  # Number of results to return, max allowed = 20
-        ('extra_snippets', False),  # TODO: Should we keep it as true?
-    ]
-
-    response = requests.get(
-        BRAVE_SEARCH_URL,
-        headers=headers,
-        params=params,  # type: ignore
-        timeout=10,
-    )
-    response.raise_for_status()  # Raise exception for 4XX/5XX responses
-    results = response.json()
-    markdown_content = response_to_markdown(results, query)
-    # TODO: Handle other types of HTML tags? I couldn't find any other tags in brave search responses for the queries I tried.
-    markdown_content = re.sub(r'</?strong>', '', markdown_content)
-    return SearchEngineObservation(query=query, content=markdown_content)
-
-
-def search(action: SearchAction, config: AppConfig):
-    """Execute a search query using the Brave Search API.
-
-    Args:
-        action: The search action containing the query.
-        config: The application configuration.
-
-    Returns:
-        SearchEngineObservation: The search results in markdown format.
-        ErrorObservation: If the query is empty or search is not enabled.
-    """
-    if not config.search.enabled:
-        return ErrorObservation(
-            content='Search engine functionality is not enabled. Enable it by setting search.enabled=true in config.'
-        )
-
-    query = action.query
-    if query is None or len(query.strip()) == 0:
-        return ErrorObservation(
-            content='The query string for search_engine tool must be a non-empty string.'
-        )
-
-    if config.search.api_key is None:
-        return ErrorObservation(
-            content='Search API key not configured. Set search.api_key in config.'
-        )
-
-    return query_api(
-        query=query,
-        API_KEY=config.search.api_key.get_secret_value(),
-        BRAVE_SEARCH_URL=config.search.api_url
-    )
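
Note for anyone reusing the removed `response_to_markdown` elsewhere: two of its section guards differ from the rest. The discussions guard tests `'results' in results['discussions']['results']` rather than `results['discussions']`, and the locations guard reads `results['location']`, which raises `KeyError` when only `locations` is present. Below is a minimal sketch of a single shared guard, assuming the response shape `{section: {'results': [...]}}` implied by the removed code. `section_results` is a hypothetical helper name, not part of OpenHands or the Brave Search API.

```python
from typing import Any


def section_results(results: dict[str, Any], section: str) -> list[Any]:
    """Return results[section]['results'] when it is a list, else an empty list.

    Hypothetical helper: assumes each section of the response is a dict
    holding a 'results' list, as the removed code implies.
    """
    block = results.get(section)
    if not isinstance(block, dict):
        # Section missing or not shaped like {'results': [...]}
        return []
    items = block.get('results')
    return items if isinstance(items, list) else []


if __name__ == '__main__':
    sample = {'web': {'results': [{'title': 'a'}]}, 'locations': {}}
    for name in ('web', 'locations', 'faq'):
        print(name, len(section_results(sample, name)))
```

With a helper like this, each per-section loop could iterate over `section_results(results, 'locations')` and so on, keeping the key name in one place per section.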
--- a/poetry.lock
+++ b/poetry.lock
@@ -496,18 +496,18 @@ files = [

 [[package]]
 name = "boto3"
-version = "1.37.12"
+version = "1.37.13"
 description = "The AWS SDK for Python"
 optional = false
 python-versions = ">=3.8"
 groups = ["main"]
 files = [
-    {file = "boto3-1.37.12-py3-none-any.whl", hash = "sha256:516feaa0d2afaeda1515216fd09291368a1215754bbccb0f28414c0a91a830a2"},
-    {file = "boto3-1.37.12.tar.gz", hash = "sha256:9412d404f103ad6d14f033eb29cd5e0cdca2b9b08cbfa9d4dabd1d7be2de2625"},
+    {file = "boto3-1.37.13-py3-none-any.whl", hash = "sha256:90fa5a91d7d7456219f0b7c4a93b38335dc5cf4613d885da4d4c1d099e04c6b7"},
+    {file = "boto3-1.37.13.tar.gz", hash = "sha256:295648f887464ab74c5c301a44982df76f9ba39ebfc16be5b8f071ad1a81fe95"},
 ]

 [package.dependencies]
-botocore = ">=1.37.12,<1.38.0"
+botocore = ">=1.37.13,<1.38.0"
 jmespath = ">=0.7.1,<2.0.0"
 s3transfer = ">=0.11.0,<0.12.0"

@@ -516,14 +516,14 @@ crt = ["botocore[crt] (>=1.21.0,<2.0a0)"]

 [[package]]
 name = "botocore"
-version = "1.37.12"
+version = "1.37.13"
 description = "Low-level, data-driven core of boto 3."
 optional = false
 python-versions = ">=3.8"
 groups = ["main"]
 files = [
-    {file = "botocore-1.37.12-py3-none-any.whl", hash = "sha256:ba1948c883bbabe20d95ff62c3e36954c9269686f7db9361857835677ca3e676"},
-    {file = "botocore-1.37.12.tar.gz", hash = "sha256:ae2d5328ce6ad02eb615270507235a6e90fd3eeed615a6c0732b5a68b12f2017"},
+    {file = "botocore-1.37.13-py3-none-any.whl", hash = "sha256:aa417bac0f4d79533080e6e17c0509e149353aec83cfe7879597a7942f7f08d0"},
+    {file = "botocore-1.37.13.tar.gz", hash = "sha256:60dfb831c54eb466db9b91891a6c8a0c223626caa049969d5d42858ad1e7f8c7"},
 ]

 [package.dependencies]
@@ -3808,14 +3808,14 @@ types-tqdm = "*"

 [[package]]
 name = "litellm"
-version = "1.63.8"
+version = "1.63.11"
 description = "Library to easily interface with LLM API providers"
 optional = false
 python-versions = "!=2.7.*,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,!=3.6.*,!=3.7.*,>=3.8"
 groups = ["main"]
 files = [
-    {file = "litellm-1.63.8-py3-none-any.whl", hash = "sha256:12615acf16d34b444e13cb9faab89466f63a22330e72e30c7d35e12ebd526188"},
-    {file = "litellm-1.63.8.tar.gz", hash = "sha256:ae7324fb93a0da2dfd05f8fa301c3ac20dfce05d4651bdb005aeb64c88a76672"},
+    {file = "litellm-1.63.11-py3-none-any.whl", hash = "sha256:f3915dc35309b164ef2419ad05e5241ddd97f3f47aa036df28365bf889d8ea23"},
+    {file = "litellm-1.63.11.tar.gz", hash = "sha256:89930895121d0cbf5553e560ed886c45be480ceec0eca3c53ae441473d5d46a4"},
 ]

 [package.dependencies]
@@ -4251,14 +4251,14 @@ files = [

 [[package]]
 name = "modal"
-version = "0.73.102"
+version = "0.73.110"
 description = "Python client library for Modal"
 optional = false
 python-versions = ">=3.9"
 groups = ["main", "evaluation"]
 files = [
-    {file = "modal-0.73.102-py3-none-any.whl", hash = "sha256:26151ef6164e0b93b0d1961f73d5a715deb72f23e2641215f5410cf58bf403d3"},
-    {file = "modal-0.73.102.tar.gz", hash = "sha256:198876cf94ff13633283e251d8b37cc1f1bb5e27a7aa547e02072def1f29b66e"},
+    {file = "modal-0.73.110-py3-none-any.whl", hash = "sha256:5ccdf9ce6e5fbf953738670819a63f02059b65333e270a3fd19a9230b8a6d505"},
+    {file = "modal-0.73.110.tar.gz", hash = "sha256:d4110c223c975ddd4adbe9e2b9040c4cdbf6dd20625343d1e839b3f1881b33a8"},
 ]

 [package.dependencies]
@@ -4712,67 +4712,67 @@ test = ["pytest", "pytest-console-scripts", "pytest-jupyter", "pytest-tornasync"

 [[package]]
 name = "numpy"
-version = "2.2.3"
+version = "2.2.4"
 description = "Fundamental package for array computing in Python"
 optional = false
 python-versions = ">=3.10"
 groups = ["main", "evaluation", "test"]
 files = [
-    {file = "numpy-2.2.3-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:cbc6472e01952d3d1b2772b720428f8b90e2deea8344e854df22b0618e9cce71"},
-    {file = "numpy-2.2.3-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:cdfe0c22692a30cd830c0755746473ae66c4a8f2e7bd508b35fb3b6a0813d787"},
-    {file = "numpy-2.2.3-cp310-cp310-macosx_14_0_arm64.whl", hash = "sha256:e37242f5324ffd9f7ba5acf96d774f9276aa62a966c0bad8dae692deebec7716"},
-    {file = "numpy-2.2.3-cp310-cp310-macosx_14_0_x86_64.whl", hash = "sha256:95172a21038c9b423e68be78fd0be6e1b97674cde269b76fe269a5dfa6fadf0b"},
-    {file = "numpy-2.2.3-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d5b47c440210c5d1d67e1cf434124e0b5c395eee1f5806fdd89b553ed1acd0a3"},
-    {file = "numpy-2.2.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0391ea3622f5c51a2e29708877d56e3d276827ac5447d7f45e9bc4ade8923c52"},
-    {file = "numpy-2.2.3-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:f6b3dfc7661f8842babd8ea07e9897fe3d9b69a1d7e5fbb743e4160f9387833b"},
-    {file = "numpy-2.2.3-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:1ad78ce7f18ce4e7df1b2ea4019b5817a2f6a8a16e34ff2775f646adce0a5027"},
-    {file = "numpy-2.2.3-cp310-cp310-win32.whl", hash = "sha256:5ebeb7ef54a7be11044c33a17b2624abe4307a75893c001a4800857956b41094"},
-    {file = "numpy-2.2.3-cp310-cp310-win_amd64.whl", hash = "sha256:596140185c7fa113563c67c2e894eabe0daea18cf8e33851738c19f70ce86aeb"},
-    {file = "numpy-2.2.3-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:16372619ee728ed67a2a606a614f56d3eabc5b86f8b615c79d01957062826ca8"},
-    {file = "numpy-2.2.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:5521a06a3148686d9269c53b09f7d399a5725c47bbb5b35747e1cb76326b714b"},
-    {file = "numpy-2.2.3-cp311-cp311-macosx_14_0_arm64.whl", hash = "sha256:7c8dde0ca2f77828815fd1aedfdf52e59071a5bae30dac3b4da2a335c672149a"},
-    {file = "numpy-2.2.3-cp311-cp311-macosx_14_0_x86_64.whl", hash = "sha256:77974aba6c1bc26e3c205c2214f0d5b4305bdc719268b93e768ddb17e3fdd636"},
-    {file = "numpy-2.2.3-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d42f9c36d06440e34226e8bd65ff065ca0963aeecada587b937011efa02cdc9d"},
-    {file = "numpy-2.2.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f2712c5179f40af9ddc8f6727f2bd910ea0eb50206daea75f58ddd9fa3f715bb"},
-    {file = "numpy-2.2.3-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:c8b0451d2ec95010d1db8ca733afc41f659f425b7f608af569711097fd6014e2"},
-    {file = "numpy-2.2.3-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:d9b4a8148c57ecac25a16b0e11798cbe88edf5237b0df99973687dd866f05e1b"},
-    {file = "numpy-2.2.3-cp311-cp311-win32.whl", hash = "sha256:1f45315b2dc58d8a3e7754fe4e38b6fce132dab284a92851e41b2b344f6441c5"},
-    {file = "numpy-2.2.3-cp311-cp311-win_amd64.whl", hash = "sha256:9f48ba6f6c13e5e49f3d3efb1b51c8193215c42ac82610a04624906a9270be6f"},
-    {file = "numpy-2.2.3-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:12c045f43b1d2915eca6b880a7f4a256f59d62df4f044788c8ba67709412128d"},
-    {file = "numpy-2.2.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:87eed225fd415bbae787f93a457af7f5990b92a334e346f72070bf569b9c9c95"},
-    {file = "numpy-2.2.3-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:712a64103d97c404e87d4d7c47fb0c7ff9acccc625ca2002848e0d53288b90ea"},
-    {file = "numpy-2.2.3-cp312-cp312-macosx_14_0_x86_64.whl", hash = "sha256:a5ae282abe60a2db0fd407072aff4599c279bcd6e9a2475500fc35b00a57c532"},
-    {file = "numpy-2.2.3-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5266de33d4c3420973cf9ae3b98b54a2a6d53a559310e3236c4b2b06b9c07d4e"},
-    {file = "numpy-2.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3b787adbf04b0db1967798dba8da1af07e387908ed1553a0d6e74c084d1ceafe"},
-    {file = "numpy-2.2.3-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:34c1b7e83f94f3b564b35f480f5652a47007dd91f7c839f404d03279cc8dd021"},
-    {file = "numpy-2.2.3-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:4d8335b5f1b6e2bce120d55fb17064b0262ff29b459e8493d1785c18ae2553b8"},
-    {file = "numpy-2.2.3-cp312-cp312-win32.whl", hash = "sha256:4d9828d25fb246bedd31e04c9e75714a4087211ac348cb39c8c5f99dbb6683fe"},
-    {file = "numpy-2.2.3-cp312-cp312-win_amd64.whl", hash = "sha256:83807d445817326b4bcdaaaf8e8e9f1753da04341eceec705c001ff342002e5d"},
-    {file = "numpy-2.2.3-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:7bfdb06b395385ea9b91bf55c1adf1b297c9fdb531552845ff1d3ea6e40d5aba"},
-    {file = "numpy-2.2.3-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:23c9f4edbf4c065fddb10a4f6e8b6a244342d95966a48820c614891e5059bb50"},
-    {file = "numpy-2.2.3-cp313-cp313-macosx_14_0_arm64.whl", hash = "sha256:a0c03b6be48aaf92525cccf393265e02773be8fd9551a2f9adbe7db1fa2b60f1"},
-    {file = "numpy-2.2.3-cp313-cp313-macosx_14_0_x86_64.whl", hash = "sha256:2376e317111daa0a6739e50f7ee2a6353f768489102308b0d98fcf4a04f7f3b5"},
-    {file = "numpy-2.2.3-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8fb62fe3d206d72fe1cfe31c4a1106ad2b136fcc1606093aeab314f02930fdf2"},
-    {file = "numpy-2.2.3-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:52659ad2534427dffcc36aac76bebdd02b67e3b7a619ac67543bc9bfe6b7cdb1"},
-    {file = "numpy-2.2.3-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:1b416af7d0ed3271cad0f0a0d0bee0911ed7eba23e66f8424d9f3dfcdcae1304"},
-    {file = "numpy-2.2.3-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:1402da8e0f435991983d0a9708b779f95a8c98c6b18a171b9f1be09005e64d9d"},
-    {file = "numpy-2.2.3-cp313-cp313-win32.whl", hash = "sha256:136553f123ee2951bfcfbc264acd34a2fc2f29d7cdf610ce7daf672b6fbaa693"},
-    {file = "numpy-2.2.3-cp313-cp313-win_amd64.whl", hash = "sha256:5b732c8beef1d7bc2d9e476dbba20aaff6167bf205ad9aa8d30913859e82884b"},
-    {file = "numpy-2.2.3-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:435e7a933b9fda8126130b046975a968cc2d833b505475e588339e09f7672890"},
-    {file = "numpy-2.2.3-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:7678556eeb0152cbd1522b684dcd215250885993dd00adb93679ec3c0e6e091c"},
-    {file = "numpy-2.2.3-cp313-cp313t-macosx_14_0_arm64.whl", hash = "sha256:2e8da03bd561504d9b20e7a12340870dfc206c64ea59b4cfee9fceb95070ee94"},
-    {file = "numpy-2.2.3-cp313-cp313t-macosx_14_0_x86_64.whl", hash = "sha256:c9aa4496fd0e17e3843399f533d62857cef5900facf93e735ef65aa4bbc90ef0"},
-    {file = "numpy-2.2.3-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f4ca91d61a4bf61b0f2228f24bbfa6a9facd5f8af03759fe2a655c50ae2c6610"},
-    {file = "numpy-2.2.3-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:deaa09cd492e24fd9b15296844c0ad1b3c976da7907e1c1ed3a0ad21dded6f76"},
-    {file = "numpy-2.2.3-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:246535e2f7496b7ac85deffe932896a3577be7af8fb7eebe7146444680297e9a"},
-    {file = "numpy-2.2.3-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:daf43a3d1ea699402c5a850e5313680ac355b4adc9770cd5cfc2940e7861f1bf"},
-    {file = "numpy-2.2.3-cp313-cp313t-win32.whl", hash = "sha256:cf802eef1f0134afb81fef94020351be4fe1d6681aadf9c5e862af6602af64ef"},
-    {file = "numpy-2.2.3-cp313-cp313t-win_amd64.whl", hash = "sha256:aee2512827ceb6d7f517c8b85aa5d3923afe8fc7a57d028cffcd522f1c6fd082"},
-    {file = "numpy-2.2.3-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:3c2ec8a0f51d60f1e9c0c5ab116b7fc104b165ada3f6c58abf881cb2eb16044d"},
-    {file = "numpy-2.2.3-pp310-pypy310_pp73-macosx_14_0_x86_64.whl", hash = "sha256:ed2cf9ed4e8ebc3b754d398cba12f24359f018b416c380f577bbae112ca52fc9"},
-    {file = "numpy-2.2.3-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:39261798d208c3095ae4f7bc8eaeb3481ea8c6e03dc48028057d3cbdbdb8937e"},
-    {file = "numpy-2.2.3-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:783145835458e60fa97afac25d511d00a1eca94d4a8f3ace9fe2043003c678e4"},
-    {file = "numpy-2.2.3.tar.gz", hash = "sha256:dbdc15f0c81611925f382dfa97b3bd0bc2c1ce19d4fe50482cb0ddc12ba30020"},
+    {file = "numpy-2.2.4-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:8146f3550d627252269ac42ae660281d673eb6f8b32f113538e0cc2a9aed42b9"},
+    {file = "numpy-2.2.4-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:e642d86b8f956098b564a45e6f6ce68a22c2c97a04f5acd3f221f57b8cb850ae"},
+    {file = "numpy-2.2.4-cp310-cp310-macosx_14_0_arm64.whl", hash = "sha256:a84eda42bd12edc36eb5b53bbcc9b406820d3353f1994b6cfe453a33ff101775"},
+    {file = "numpy-2.2.4-cp310-cp310-macosx_14_0_x86_64.whl", hash = "sha256:4ba5054787e89c59c593a4169830ab362ac2bee8a969249dc56e5d7d20ff8df9"},
+    {file = "numpy-2.2.4-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:7716e4a9b7af82c06a2543c53ca476fa0b57e4d760481273e09da04b74ee6ee2"},
+    {file = "numpy-2.2.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:adf8c1d66f432ce577d0197dceaac2ac00c0759f573f28516246351c58a85020"},
+    {file = "numpy-2.2.4-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:218f061d2faa73621fa23d6359442b0fc658d5b9a70801373625d958259eaca3"},
+    {file = "numpy-2.2.4-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:df2f57871a96bbc1b69733cd4c51dc33bea66146b8c63cacbfed73eec0883017"},
+    {file = "numpy-2.2.4-cp310-cp310-win32.whl", hash = "sha256:a0258ad1f44f138b791327961caedffbf9612bfa504ab9597157806faa95194a"},
+    {file = "numpy-2.2.4-cp310-cp310-win_amd64.whl", hash = "sha256:0d54974f9cf14acf49c60f0f7f4084b6579d24d439453d5fc5805d46a165b542"},
+    {file = "numpy-2.2.4-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:e9e0a277bb2eb5d8a7407e14688b85fd8ad628ee4e0c7930415687b6564207a4"},
+    {file = "numpy-2.2.4-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:9eeea959168ea555e556b8188da5fa7831e21d91ce031e95ce23747b7609f8a4"},
+    {file = "numpy-2.2.4-cp311-cp311-macosx_14_0_arm64.whl", hash = "sha256:bd3ad3b0a40e713fc68f99ecfd07124195333f1e689387c180813f0e94309d6f"},
+    {file = "numpy-2.2.4-cp311-cp311-macosx_14_0_x86_64.whl", hash = "sha256:cf28633d64294969c019c6df4ff37f5698e8326db68cc2b66576a51fad634880"},
+    {file = "numpy-2.2.4-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:2fa8fa7697ad1646b5c93de1719965844e004fcad23c91228aca1cf0800044a1"},
+    {file = "numpy-2.2.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f4162988a360a29af158aeb4a2f4f09ffed6a969c9776f8f3bdee9b06a8ab7e5"},
+    {file = "numpy-2.2.4-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:892c10d6a73e0f14935c31229e03325a7b3093fafd6ce0af704be7f894d95687"},
+    {file = "numpy-2.2.4-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:db1f1c22173ac1c58db249ae48aa7ead29f534b9a948bc56828337aa84a32ed6"},
+    {file = "numpy-2.2.4-cp311-cp311-win32.whl", hash = "sha256:ea2bb7e2ae9e37d96835b3576a4fa4b3a97592fbea8ef7c3587078b0068b8f09"},
+    {file = "numpy-2.2.4-cp311-cp311-win_amd64.whl", hash = "sha256:f7de08cbe5551911886d1ab60de58448c6df0f67d9feb7d1fb21e9875ef95e91"},
+    {file = "numpy-2.2.4-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:a7b9084668aa0f64e64bd00d27ba5146ef1c3a8835f3bd912e7a9e01326804c4"},
+    {file = "numpy-2.2.4-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:dbe512c511956b893d2dacd007d955a3f03d555ae05cfa3ff1c1ff6df8851854"},
+    {file = "numpy-2.2.4-cp312-cp312-macosx_14_0_arm64.whl", hash = "sha256:bb649f8b207ab07caebba230d851b579a3c8711a851d29efe15008e31bb4de24"},
+    {file = "numpy-2.2.4-cp312-cp312-macosx_14_0_x86_64.whl", hash = "sha256:f34dc300df798742b3d06515aa2a0aee20941c13579d7a2f2e10af01ae4901ee"},
+    {file = "numpy-2.2.4-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c3f7ac96b16955634e223b579a3e5798df59007ca43e8d451a0e6a50f6bfdfba"},
+    {file = "numpy-2.2.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4f92084defa704deadd4e0a5ab1dc52d8ac9e8a8ef617f3fbb853e79b0ea3592"},
+    {file = "numpy-2.2.4-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:7a4e84a6283b36632e2a5b56e121961f6542ab886bc9e12f8f9818b3c266bfbb"},
+    {file = "numpy-2.2.4-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:11c43995255eb4127115956495f43e9343736edb7fcdb0d973defd9de14cd84f"},
+    {file = "numpy-2.2.4-cp312-cp312-win32.whl", hash = "sha256:65ef3468b53269eb5fdb3a5c09508c032b793da03251d5f8722b1194f1790c00"},
+    {file = "numpy-2.2.4-cp312-cp312-win_amd64.whl", hash = "sha256:2aad3c17ed2ff455b8eaafe06bcdae0062a1db77cb99f4b9cbb5f4ecb13c5146"},
+    {file = "numpy-2.2.4-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:1cf4e5c6a278d620dee9ddeb487dc6a860f9b199eadeecc567f777daace1e9e7"},
+    {file = "numpy-2.2.4-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:1974afec0b479e50438fc3648974268f972e2d908ddb6d7fb634598cdb8260a0"},
+    {file = "numpy-2.2.4-cp313-cp313-macosx_14_0_arm64.whl", hash = "sha256:79bd5f0a02aa16808fcbc79a9a376a147cc1045f7dfe44c6e7d53fa8b8a79392"},
+    {file = "numpy-2.2.4-cp313-cp313-macosx_14_0_x86_64.whl", hash = "sha256:3387dd7232804b341165cedcb90694565a6015433ee076c6754775e85d86f1fc"},
+    {file = "numpy-2.2.4-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:6f527d8fdb0286fd2fd97a2a96c6be17ba4232da346931d967a0630050dfd298"},
+    {file = "numpy-2.2.4-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bce43e386c16898b91e162e5baaad90c4b06f9dcbe36282490032cec98dc8ae7"},
+    {file = "numpy-2.2.4-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:31504f970f563d99f71a3512d0c01a645b692b12a63630d6aafa0939e52361e6"},
+    {file = "numpy-2.2.4-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:81413336ef121a6ba746892fad881a83351ee3e1e4011f52e97fba79233611fd"},
+    {file = "numpy-2.2.4-cp313-cp313-win32.whl", hash = "sha256:f486038e44caa08dbd97275a9a35a283a8f1d2f0ee60ac260a1790e76660833c"},
+    {file = "numpy-2.2.4-cp313-cp313-win_amd64.whl", hash = "sha256:207a2b8441cc8b6a2a78c9ddc64d00d20c303d79fba08c577752f080c4007ee3"},
+    {file = "numpy-2.2.4-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:8120575cb4882318c791f839a4fd66161a6fa46f3f0a5e613071aae35b5dd8f8"},
+    {file = "numpy-2.2.4-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:a761ba0fa886a7bb33c6c8f6f20213735cb19642c580a931c625ee377ee8bd39"},
+    {file = "numpy-2.2.4-cp313-cp313t-macosx_14_0_arm64.whl", hash = "sha256:ac0280f1ba4a4bfff363a99a6aceed4f8e123f8a9b234c89140f5e894e452ecd"},
+    {file = "numpy-2.2.4-cp313-cp313t-macosx_14_0_x86_64.whl", hash = "sha256:879cf3a9a2b53a4672a168c21375166171bc3932b7e21f622201811c43cdd3b0"},
+    {file = "numpy-2.2.4-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f05d4198c1bacc9124018109c5fba2f3201dbe7ab6e92ff100494f236209c960"},
+    {file = "numpy-2.2.4-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e2f085ce2e813a50dfd0e01fbfc0c12bbe5d2063d99f8b29da30e544fb6483b8"},
+    {file = "numpy-2.2.4-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:92bda934a791c01d6d9d8e038363c50918ef7c40601552a58ac84c9613a665bc"},
+    {file = "numpy-2.2.4-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:ee4d528022f4c5ff67332469e10efe06a267e32f4067dc76bb7e2cddf3cd25ff"},
+    {file = "numpy-2.2.4-cp313-cp313t-win32.whl", hash = "sha256:05c076d531e9998e7e694c36e8b349969c56eadd2cdcd07242958489d79a7286"},
+    {file = "numpy-2.2.4-cp313-cp313t-win_amd64.whl", hash = "sha256:188dcbca89834cc2e14eb2f106c96d6d46f200fe0200310fc29089657379c58d"},
+    {file = "numpy-2.2.4-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:7051ee569db5fbac144335e0f3b9c2337e0c8d5c9fee015f259a5bd70772b7e8"},
+    {file = "numpy-2.2.4-pp310-pypy310_pp73-macosx_14_0_x86_64.whl", hash = "sha256:ab2939cd5bec30a7430cbdb2287b63151b77cf9624de0532d629c9a1c59b1d5c"},
+    {file = "numpy-2.2.4-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d0f35b19894a9e08639fd60a1ec1978cb7f5f7f1eace62f38dd36be8aecdef4d"},
+    {file = "numpy-2.2.4-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:b4adfbbc64014976d2f91084915ca4e626fbf2057fb81af209c1a6d776d23e3d"},
+    {file = "numpy-2.2.4.tar.gz", hash = "sha256:9ba03692a45d3eef66559efe1d1096c4b9b75c0986b5dff5530c378fb8331d4f"},
 ]

 [[package]]
--- a/tests/unit/test_brave_search.py
+++ b/tests/unit/test_brave_search.py
@@ -1,83 +0,0 @@
-"""Tests for the Brave Search functionality."""
-
-from unittest.mock import Mock, patch
-
-import pytest
-
-from openhands.core.config import AppConfig, SearchConfig
-from openhands.events.action import SearchAction
-from openhands.events.observation.error import ErrorObservation
-from openhands.events.observation.search_engine import SearchEngineObservation
-from openhands.runtime.search_engine.brave_search import search
-
-
-@pytest.fixture
-def mock_config():
-    """Create a mock config with search enabled."""
-    config = AppConfig()
-    config.search = SearchConfig(
-        enabled=True,
-        api_key="test_key",
-        api_url="https://test.url"
-    )
-    return config
-
-
-@pytest.fixture
-def mock_query_api():
-    """Create a mock query_api function."""
-    with patch("openhands.runtime.search_engine.brave_search.query_api") as mock:
-        mock.return_value = SearchEngineObservation(
-            query="test query",
-            content="test content"
-        )
-        yield mock
-
-
-def test_search_disabled(mock_query_api):
-    """Test that search returns error when disabled."""
-    config = AppConfig()
-    config.search = SearchConfig(enabled=False)
-    action = SearchAction(query="test query")
-
-    result = search(action, config)
-    assert isinstance(result, ErrorObservation)
-    assert "not enabled" in result.content
-    mock_query_api.assert_not_called()
-
-
-def test_search_no_api_key(mock_query_api):
-    """Test that search returns error when API key is not set."""
-    config = AppConfig()
-    config.search = SearchConfig(enabled=True)
-    action = SearchAction(query="test query")
-
-    result = search(action, config)
-    assert isinstance(result, ErrorObservation)
-    assert "API key not configured" in result.content
-    mock_query_api.assert_not_called()
-
-
-def test_search_empty_query(mock_query_api, mock_config):
-    """Test that search returns error when query is empty."""
-    action = SearchAction(query="")
-
-    result = search(action, mock_config)
-    assert isinstance(result, ErrorObservation)
-    assert "must be a non-empty string" in result.content
-    mock_query_api.assert_not_called()
-
-
-def test_search_success(mock_query_api, mock_config):
-    """Test that search returns results when everything is configured correctly."""
-    action = SearchAction(query="test query")
-
-    result = search(action, mock_config)
-    assert isinstance(result, SearchEngineObservation)
-    assert result.query == "test query"
-    assert result.content == "test content"
-    mock_query_api.assert_called_once_with(
-        query="test query",
-        API_KEY="test_key",
-        BRAVE_SEARCH_URL="https://test.url"
-    )
--- a/tests/unit/test_codeact_agent.py
+++ b/tests/unit/test_codeact_agent.py
@@ -25,7 +25,6 @@ from openhands.core.message import ImageContent, Message, TextContent
 from openhands.events.action import (
    CmdRunAction,
    MessageAction,
-    SearchAction,
 )
 from openhands.events.event import EventSource
 from openhands.events.observation.commands import (
@@ -101,26 +100,22 @@ def test_get_tools_with_options():
        codeact_enable_browsing=True,
        codeact_enable_jupyter=True,
        codeact_enable_llm_editor=True,
-        codeact_enable_search_engine=True,
    )
    tool_names = [tool['function']['name'] for tool in tools]
    assert 'browser' in tool_names
    assert 'execute_ipython_cell' in tool_names
    assert 'edit_file' in tool_names
-    assert 'search_engine' in tool_names

    # Test with all options disabled
    tools = get_tools(
        codeact_enable_browsing=False,
        codeact_enable_jupyter=False,
        codeact_enable_llm_editor=False,
-        codeact_enable_search_engine=False,
    )
    tool_names = [tool['function']['name'] for tool in tools]
    assert 'browser' not in tool_names
    assert 'execute_ipython_cell' not in tool_names
    assert 'edit_file' not in tool_names
-    assert 'search_engine' not in tool_names


 def test_cmd_run_tool():
@@ -181,15 +176,6 @@ def test_web_read_tool():
    assert WebReadTool['function']['parameters']['required'] == ['url']


-def test_search_engine_tool():
-    from openhands.agenthub.codeact_agent.tools import SearchEngineTool
-
-    assert SearchEngineTool['type'] == 'function'
-    assert SearchEngineTool['function']['name'] == 'search_engine'
-    assert 'query' in SearchEngineTool['function']['parameters']['properties']
-    assert SearchEngineTool['function']['parameters']['required'] == ['query']
-
-
 def test_browser_tool():
    assert BrowserTool['type'] == 'function'
    assert BrowserTool['function']['name'] == 'browser'
@@ -226,42 +212,6 @@ def test_browser_tool():
    assert 'description' in BrowserTool['function']['parameters']['properties']['code']


-def test_response_to_actions_search_engine():
-    # Test response with search engine tool call
-    from litellm import ChatCompletionMessageToolCall, Choices, Message, ModelResponse
-
-    mock_response = ModelResponse(
-        id='mock_id',
-        choices=[
-            Choices(
-                message=Message(
-                    content='Let me search for that',
-                    tool_calls=[
-                        ChatCompletionMessageToolCall(
-                            id='tool_call_10',
-                            function={
-                                'name': 'search_engine',
-                                'arguments': '{"query": "test query"}',
-                            },
-                            type='function',
-                        )
-                    ],
-                    role='assistant',
-                ),
-                index=0,
-                finish_reason='tool_calls',
-            )
-        ],
-        model='mock_model',
-        usage={'total_tokens': 100},
-    )
-
-    actions = response_to_actions(mock_response)
-    assert len(actions) == 1
-    assert isinstance(actions[0], SearchAction)
-    assert actions[0].query == 'test query'
-
-
 def test_response_to_actions_invalid_tool():
    # Test response with invalid tool call
    mock_response = Mock()
Author	SHA1	Message	Date
openhands	6310c070b3	Fix frontend tests for GitHub token documentation changes	2025-03-17 19:22:35 +00:00
dependabot[bot]	41c8c9230b	chore(deps): bump the version-all group with 4 updates (#7308) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: openhands <openhands@all-hands.dev>	2025-03-17 18:57:56 +00:00
Xingyao Wang	9b9e728cf6	Iterative evaluation with rule-based critic (#7293)	2025-03-17 18:37:35 +00:00