Fix microagent test filenames to match expected names

- Change test filenames from 'test.md' to match expected microagent names
- Use 'default.md' for tests expecting 'default' name
- Use 'custom_name.md' for test expecting 'custom_name' name
- Use 'test_agent.md' for test expecting 'test_agent' name
- This properly tests the filename-based naming behavior
Fix microagent tests and remove debug prints
2026-04-29 03:00:45 -04:00 · 2025-06-24 14:20:34 +00:00 · 2025-06-24 14:16:20 +00:00 · 2025-06-24 14:01:12 +00:00 · 2025-06-23 17:49:02 -04:00 · 2025-06-23 16:09:32 -04:00
98 changed files with 2096 additions and 1102 deletions
@@ -45,6 +45,13 @@ body:
      description: What version of OpenHands are you using?
      placeholder: ex. 0.9.8, main, etc.

+  - type: input
+    id: model-name
+    attributes:
+      label: Model Name
+      description: What model are you using?
+      placeholder: ex. gpt-4o, claude-3-5-sonnet, openrouter/deepseek-r1, etc.
+
  - type: dropdown
    id: os
    attributes:
@@ -48,7 +48,7 @@ Learn more at [docs.all-hands.dev](https://docs.all-hands.dev), or [sign up for

 ## ☁️ OpenHands Cloud
 The easiest way to get started with OpenHands is on [OpenHands Cloud](https://app.all-hands.dev),
-which comes with $50 in free credits for new users.
+which comes with $20 in free credits for new users.

 ## 💻 Running OpenHands Locally

@@ -1,4 +1,4 @@
-FROM ubuntu:22.04
+FROM ubuntu:24.04

 # install basic packages
 RUN apt-get update && apt-get install -y \
@@ -26,7 +26,7 @@
          "usage/installation",
          "usage/getting-started",
          "usage/key-features",
-          "usage/faq",
+          "usage/faqs",
          {
            "group": "OpenHands Cloud",
            "pages": [
@@ -44,7 +44,7 @@
            ]
          },
          {
-            "group": "Running OpenHands on Your Own",
+            "group": "Run OpenHands on Your Own",
            "pages": [
              "usage/local-setup",
              "usage/how-to/gui-mode",
@@ -104,8 +104,9 @@
            ]
          },
          {
-            "group": "Customization",
+            "group": "Customizations & Settings",
            "pages": [
+              "usage/common-settings",
              "usage/prompting/repository",
              {
                "group": "Microagents",
@@ -150,6 +151,12 @@
          }
        ]
      },
+      {
+        "tab": "Success Stories",
+        "pages": [
+          "success-stories/index"
+        ]
+      },
      {
          "tab": "API Reference",
          "openapi": "/openapi.json"
@@ -0,0 +1,217 @@
+---
+title: "Success Stories"
+description: "Real-world examples of what you can achieve with OpenHands"
+---
+
+Discover how developers and teams are using OpenHands to automate their software development workflows. From quick fixes to complex projects, see what's possible with AI-powered development assistance.
+
+Check out the [#success-stories](https://www.linen.dev/s/openhands/c/success-stories) channel on our Slack for more!
+
+<Update label="2025-06-13 OpenHands helps frontline support" description="@Joe Pelletier">
+
+## One of the cool things about OpenHands, and especially the Slack Integration, is the ability to empower folks who are on the ‘front lines’ with customers.
+
+For example, often times Support and Customer Success teams will field bug reports, doc questions, and other ‘nits’ from customers. They tend to have few options to deal with this, other than file a feedback ticket with product teams and hope it gets prioritized in an upcoming sprint.
+
+Instead, with tools like OpenHands and the Slack integration, they can request OpenHands to make fixes proactively and then have someone on the engineering team (like a lead engineer, a merge engineer, or even technical product manager) review the PR and approve it — thus reducing the cycle time for ‘quick wins’ from weeks to just a few hours.
+
+Here's how we do that with the OpenHands project:
+
+<iframe
+  width="560"
+  height="560"
+  src="https://www.linen.dev/s/openhands/t/29118545/seems-mcp-config-from-config-toml-is-being-overwritten-hence#629f8e2b-cde8-427e-920c-390557a06cc9"
+  frameborder="0"
+  allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
+  allowfullscreen
+></iframe>
+
+[Original Slack thread](https://www.linen.dev/s/openhands/t/29124350/one-of-the-cool-things-about-openhands-and-especially-the-sl#25029f37-7b0d-4535-9187-83b3e06a4011)
+
+</Update>
+
+
+<Update label="2025-06-13 Ask OpenHands to show me some love" description="@Graham Neubig">
+
+## Asked openhands to “show me some love” and...
+
+Asked openhands to “show me some love” and it coded up this app for me, actually kinda genuinely feel loved
+
+<video
+  controls
+  autoplay
+  className="w-full aspect-video"
+  src="/success-stories/stories/2025-06-13-show-love/v1.mp4"
+></video>
+
+[Original Slack thread](https://www.linen.dev/s/openhands/t/29100731/asked-openhands-to-show-me-some-love-and-it-coded-up-this-ap#1e08af6b-b7d5-4167-8a53-17e6806555e0)
+
+</Update>
+
+<Update label="2025-06-11 OpenHands does 100% of my infra IAM research for me" description="@Xingyao Wang">
+
+## Now, OpenHands does 100% of my infra IAM research for me
+
+Got an IAM error on GCP? Send a screenshot to OH... and it just works!!!
+Can't imagine going back to the early days without OH: I'd spend an entire afternoon figuring how to get IAM right
+
+[Original Slack thread](https://www.linen.dev/s/openhands/t/29100732/now-openhands-does-100-of-my-infra-iam-research-for-me-sweat#20482a73-4e2e-4edd-b6d1-c9e8442fccd1)
+
+![](/success-stories/stories/2025-06-11-infra-iam/s1.png)
+![](/success-stories/stories/2025-06-11-infra-iam/s2.png)
+
+</Update>
+
+<Update label="2025-06-08 OpenHands builds an interactive map for me" description="@Rodrigo Argenton Freire (ODLab)">
+
+## Very simple example, but baby steps....
+
+I am a professor of architecture and urban design. We built, me and some students, an interactive map prototype to help visitors and new students to find important places in the campus. Considering that we lack a lot of knowledge in programming, that was really nice to build and a smooth process.
+We first created the main components with all-hands and then adjusted some details locally. Definitely, saved us a lot of time and money.
+That's a prototype but we will have all the info by tuesday.
+https://buriti-emau.github.io/Mapa-UFU/
+
+[Original Slack thread](https://www.linen.dev/s/openhands/t/29100736/very-simple-example-but-baby-steps-i-am-a-professor-of-archi#8f2e3f3f-44e6-44ea-b9a8-d53487470179)
+
+![](/success-stories/stories/2025-06-08-map/s1.png)
+
+</Update>
+
+
+<Update label="2025-06-06 Web Search Saves the Day" description="@Ian Walker">
+
+## Tavily adapter helps solve persistent debugging issue
+
+Big congratulations to the new [Tavily adapter](https://www.all-hands.dev/blog/building-a-provably-versatile-agent)... OpenHands and I have been beavering away at a Lightstreamer client library for most of this week but were getting a persistent (and unhelpful) "unexpected error" from the server.
+
+Coming back to the problem today, after trying several unsuccessful fixes prompted by me, OH decided all by itself to search the web, and found the cause of the problem (of course it was simply CRLF line endings...). I was on the verge of giving up - good thing OH has more stamina than me!
+
+This demonstrates how OpenHands' web search capabilities can help solve debugging issues that would otherwise require extensive manual research.
+
+<iframe
+  width="560"
+  height="560"
+  src="https://www.linen.dev/s/openhands/t/29100737/big-congratulations-to-the-new-tavily-adapter-openhands-and-#87b027e5-188b-425e-8aa9-719dcb4929f4"
+  title="Slack thread on linen.dev"
+  frameborder="0"
+  allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
+  allowfullscreen
+></iframe>
+
+
+[Original Slack thread](https://www.linen.dev/s/openhands/t/29100737/big-congratulations-to-the-new-tavily-adapter-openhands-and-#76f1fb26-6ef7-4709-b9ea-fb99105e47e4)
+
+</Update>
+
+<Update label="2025-06-05 OpenHands updates my personal website for a new paper" description="@Xingyao Wang">
+
+## I asked OpenHands to update my personal website for the "OpenHands Versa" paper.
+
+It is an extremely trivial task: You just need to browse to arxiv, copy the author names, format them for BibTeX, and then modify the papers.bib file. But now I'm getting way too lazy to even open my IDE and actually do this one-file change!
+
+[Original Tweet/X thread](https://x.com/xingyaow_/status/1930796287919542410)
+
+[Original Slack thread](https://www.linen.dev/s/openhands/t/29100738/i-asked-openhands-to-update-my-personal-website-for-the-open#f0324022-b12b-4d34-b12b-bdbc43823f69)
+
+</Update>
+
+<Update label="2025-06-02 OpenHands makes an animated gif of swe-bench verified scores over time" description="@Graham Neubig">
+
+## I asked OpenHands to make an animated gif of swe-bench verified scores over time.
+
+It took a bit of prompting but ended up looking pretty nice I think
+
+<video width="560" height="315" autoPlay loop muted src="/success-stories/stories/2025-06-02-swebench-score/s1.mp4"></video>
+
+[Original Slack thread](https://www.linen.dev/s/openhands/t/29100744/i-asked-openhands-to-make-an-animated-gif-of-swe-bench-verif#fb3b82c9-6222-4311-b97b-b2ac1cfe6dff)
+
+</Update>
+
+<Update label="2025-05-30 AWS Troubleshooting" description="@Graham Neubig">
+
+## Quick AWS security group fix
+
+I really don't like trying to fix issues with AWS, especially security groups and other finicky things like this. But I started up an instance and wasn't able to ssh in. So I asked OpenHands:
+
+> Currently, the following ssh command is timing out:
+>
+> $ ssh -i gneubig.pem ubuntu@XXX.us-east-2.compute.amazonaws.com
+> ssh: connect to host XXX.us-east-2.compute.amazonaws.com port 22: Operation timed out
+>
+> Use the provided AWS credentials to take a look at i-XXX and examine why
+
+And 2 minutes later I was able to SSH in!
+
+This shows how OpenHands can quickly diagnose and fix AWS infrastructure issues that would normally require manual investigation.
+
+[Original Slack thread](https://www.linen.dev/s/openhands/t/29100747/i-really-don-t-like-trying-to-fix-issues-with-aws-especially#d92a66d2-3bc1-4467-9d09-dc983004d083)
+
+</Update>
+
+
+<Update label="2025-05-04 Chrome Extension Development" description="@Xingyao Wang">
+
+## OpenHands builds Chrome extension for GitHub integration
+
+I asked OpenHands to write a Chrome extension based on our [OpenHands Cloud API](https://docs.all-hands.dev/modules/usage/cloud/cloud-api). Once installed, you can now easily launch an OpenHands cloud session from your GitHub webpage/PR!
+
+This demonstrates OpenHands' ability to create browser extensions and integrate with external APIs, enabling seamless workflows between GitHub and OpenHands Cloud.
+
+![Chrome extension](/success-stories/stories/2025-05-04-chrome-extension/s1.png)
+![Chrome extension](/success-stories/stories/2025-05-04-chrome-extension/s2.png)
+
+[GitHub Repository](https://github.com/xingyaoww/openhands-chrome-extension)
+
+[Original Slack thread](https://www.linen.dev/s/openhands/t/29100755/i-asked-openhands-to-write-a-chrome-extension-based-on-our-h#88f14b7f-f8ff-40a6-83c2-bd64e95924c5)
+
+</Update>
+
+
+<Update label="2025-04-11 Visual UI Testing" description="@Xingyao Wang">
+
+## OpenHands tests UI automatically with visual browsing
+
+Thanks to visual browsing -- OpenHands can actually test some simple UI by serving the website, clicking the button in the browser and looking at screenshots now!
+
+Prompt is just:
+```
+I want to create a Hello World app in Javascript that:
+* Displays Hello World in the middle.
+* Has a button that when clicked, changes the greeting with a bouncing animation to fun versions of Hello.
+* Has a counter for how many times the button has been clicked.
+* Has another button that changes the app's background color.
+```
+
+Eager-to-work Sonnet 3.7 will test stuff for you without you asking!
+
+This showcases OpenHands' visual browsing capabilities, enabling it to create, serve, and automatically test web applications through actual browser interactions and screenshot analysis.
+
+![Visual UI testing](/success-stories/stories/2025-04-11-visual-ui/s1.png)
+
+[Original Slack thread](https://www.linen.dev/s/openhands/t/29100764/thanks-to-u07k0p3bdb9-s-visual-browsing-openhands-can-actual#21beb9bc-1a04-4272-87e9-4d3e3b9925e7)
+
+</Update>
+
+<Update label="2025-03-07 Proactive Error Handling" description="@Graham Neubig">
+
+## OpenHands fixes crashes before you notice them
+
+Interesting story, I asked OpenHands to start an app on port 12000, it showed up on the app pane. I started using the app, and then it crashed... But because it crashed in OpenHands, OpenHands immediately saw the error message and started fixing the problem without me having to do anything. It was already fixing the problem before I even realized what was going wrong.
+
+This demonstrates OpenHands' proactive monitoring capabilities - it doesn't just execute commands, but actively watches for errors and begins remediation automatically, often faster than human reaction time.
+
+</Update>
+
+<Update label="2024-12-03 Creative Design Acceleration" description="@Rohit Malhotra">
+
+## Pair programming for interactive design projects
+
+Used OpenHands as a pair programmer to do heavy lifting for a creative/interactive design project in p5js.
+
+I usually take around 2 days for high fidelity interactions (planning strategy + writing code + circling back with designer), did this in around 5hrs instead with the designer watching curiously the entire time.
+
+This showcases how OpenHands can accelerate creative and interactive design workflows, cutting development time from around 2 days to around 5 hours while maintaining high-quality output.
+
+[Original Tweet](https://x.com/rohit_malh5/status/1863995531657425225)
+
+</Update>
@@ -1,7 +1,7 @@
 ---
 title: Cloud UI
-description: The Cloud UI provides a web interface for interacting with OpenHands. This page explains how to use the
- OpenHands Cloud UI.
+description: The Cloud UI provides a web interface for interacting with OpenHands. This page provides references on
+ how to use the OpenHands Cloud UI.
 ---

 ## Landing Page
@@ -19,10 +19,12 @@ The landing page is where you can:
 The Settings page allows you to:

 - [Configure GitHub repository access](/usage/cloud/github-installation#modifying-repository-access) for OpenHands.
+- [Install the OpenHands Slack app](/usage/cloud/slack-installation).
 - Set application settings like your preferred language, notifications and other preferences.
 - Add credits to your account.
- Generate custom secrets.
- Create API keys to work with OpenHands programmatically.
+- [Generate custom secrets](/usage/common-settings#secrets-management).
+- [Create API keys to work with OpenHands programmatically](/usage/cloud/cloud-api).
+- Change your email address.

 ## Key Features

@@ -5,7 +5,7 @@ description: This guide walks you through installing the OpenHands Slack app.

 ## Prerequisites

- Access to OpenHands Cloud
+- Access to OpenHands Cloud.

 ## Installation Steps

@@ -23,11 +23,11 @@ description: This guide walks you through installing the OpenHands Slack app.

 <Accordion title="Authorize Slack App (for all Slack workspace members)">

-  **Make sure your Slack workspace admin/owner has installed OpenHands Slack App first**
+  **Make sure your Slack workspace admin/owner has installed OpenHands Slack App first.**

-  Every user in the Slack workspace (including admins/owners) must link their Cloud OpenHands account to the OpenHands Slack App. To do this:
+  Every user in the Slack workspace (including admins/owners) must link their OpenHands Cloud account to the OpenHands Slack App. To do this:
  1. Visit [integrations settings](https://app.all-hands.dev/settings/integrations) in OpenHands Cloud.
-  2. Click the button "Install Slack App".
+  2. Click `Install OpenHands Slack App`.
  3. In the top right corner, select the workspace to install the OpenHands Slack app.
  4. Review permissions and click allow.

@@ -0,0 +1,52 @@
+---
+title: OpenHands Settings
+description: Overview of some of the settings available in OpenHands.
+---
+
+## OpenHands Cloud vs Running on Your Own
+
+There are some differences between the settings available in OpenHands Cloud and those available when running OpenHands
+on your own:
+* [OpenHands Cloud settings](/usage/cloud/cloud-ui#settings)
+* [Settings available when running on your own](/usage/how-to/gui-mode#settings)
+
+Refer to these pages for more detailed information.
+
+## Secrets Management
+
+OpenHands provides a secrets manager that allows you to securely store and manage sensitive information that can be
+accessed by the agent during runtime, such as API keys. These secrets are automatically exported as environment
+variables in the agent's runtime environment.
+
+### Accessing the Secrets Manager
+
+In the Settings page, navigate to the `Secrets` tab. Here, you'll see a list of all your existing custom secrets.
+
+### Adding a New Secret
+1. Click `Add a new secret`.
+2. Fill in the following fields:
+   - **Name**: A unique identifier for your secret (e.g., `AWS_ACCESS_KEY`). This will be the environment variable name.
+   - **Value**: The sensitive information you want to store.
+   - **Description** (optional): A brief description of what the secret is used for, which is also provided to the agent.
+3. Click `Add secret` to save.
+
+### Editing a Secret
+
+1. Click the `Edit` button next to the secret you want to modify.
+2. You can update the name and description of the secret.
+<Note>
+  For security reasons, you cannot view or edit the value of an existing secret. If you need to change the
+  value, delete the secret and create a new one.
+</Note>
+
+### Deleting a Secret
+
+1. Click the `Delete` button next to the secret you want to remove.
+2. Select `Confirm` to delete the secret.
+
+### Using Secrets in the Agent
+ - All custom secrets are automatically exported as environment variables in the agent's runtime environment.
+ - You can access them in your code using standard environment variable access methods
+   (e.g., `os.environ['SECRET_NAME']` in Python).
+ - Example: If you create a secret named `OPENAI_API_KEY`, you can access it in your code as
+   `process.env.OPENAI_API_KEY` in JavaScript or `os.environ['OPENAI_API_KEY']` in Python.
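The "Using Secrets in the Agent" section above can be illustrated with a minimal Python sketch. It assumes only what the docs state: a custom secret is exported as an environment variable whose name matches the secret's name. The helper `read_secret` and the variable name `EXAMPLE_SECRET` are hypothetical and are not part of OpenHands.

```python
import os
from typing import Optional


def read_secret(name: str, default: Optional[str] = None) -> str:
    """Read a secret that is expected to be exported as an environment variable.

    `read_secret` is a hypothetical helper for illustration; it is not an OpenHands API.
    """
    value = os.environ.get(name, default)
    if value is None:
        # Fail with a clear message instead of a bare KeyError.
        raise RuntimeError(f"Secret {name!r} is not set in the environment")
    return value


if __name__ == "__main__":
    # EXAMPLE_SECRET is a placeholder name; use the name given in the Secrets tab.
    token = read_secret("EXAMPLE_SECRET", default="")
    print("secret is set" if token else "secret is empty or missing")
```

Wrapping the lookup this way keeps the failure message explicit when a secret has been renamed or deleted, which plain `os.environ['SECRET_NAME']` would report only as a `KeyError`.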
@@ -8,26 +8,33 @@ icon: question

 ### I'm new to OpenHands. Where should I start?

-1. **Quick Start**: Use [OpenHands Cloud](/usage/cloud/openhands-cloud) to get started quickly with [GitHub](/usage/cloud/github-installation), [GitLab](/usage/cloud/gitlab-installation), and [Slack](/usage/cloud/slack-installation) integration
-2. **Local Setup**: If you prefer to run it on your own hardware, follow our [Getting Started guide](/usage/local-setup)
-3. **First steps**: Complete the [start building tutorial](/usage/getting-started) to learn the basics
-
+1. **Quick start**: Use [OpenHands Cloud](/usage/cloud/openhands-cloud) to get started quickly with
+  [GitHub](/usage/cloud/github-installation), [GitLab](/usage/cloud/gitlab-installation),
+  and [Slack](/usage/cloud/slack-installation) integrations.
+2. **Run on your own**: If you prefer to run it on your own hardware, follow our [Getting Started guide](/usage/local-setup).
+3. **First steps**: Complete the [start building tutorial](/usage/getting-started) to learn the basics.

 ### Can I use OpenHands for production workloads?

-OpenHands is meant to be run by a single user on their local workstation. It is not appropriate for multi-tenant deployments where multiple users share the same instance. There is no built-in authentication, isolation, or scalability.
+OpenHands is meant to be run by a single user on their local workstation. It is not appropriate for multi-tenant
+deployments where multiple users share the same instance. There is no built-in authentication, isolation, or scalability.

-If you're interested in running OpenHands in a multi-tenant environment, check out the source-available, commercially-licensed [OpenHands Cloud Helm Chart](https://github.com/all-Hands-AI/OpenHands-cloud).
+If you're interested in running OpenHands in a multi-tenant environment, check out the source-available,
+commercially-licensed [OpenHands Cloud Helm Chart](https://github.com/all-Hands-AI/OpenHands-cloud).

 <Info>
-Using OpenHands for work? We'd love to chat! Fill out [this short form](https://docs.google.com/forms/d/e/1FAIpQLSet3VbGaz8z32gW9Wm-Grl4jpt5WgMXPgJ4EDPVmCETCBpJtQ/viewform) to join our Design Partner program, where you'll get early access to commercial features and the opportunity to provide input on our product roadmap.
+Using OpenHands for work? We'd love to chat! Fill out
+[this short form](https://docs.google.com/forms/d/e/1FAIpQLSet3VbGaz8z32gW9Wm-Grl4jpt5WgMXPgJ4EDPVmCETCBpJtQ/viewform)
+to join our Design Partner program, where you'll get early access to commercial features and the opportunity to provide
+input on our product roadmap.
 </Info>

 ## Safety and Security

 ### It's doing stuff without asking, is that safe?

-**Generally yes, but with important considerations.** OpenHands runs all code in a secure, isolated Docker container (called a "sandbox") that is separate from your host system. However, the safety depends on your configuration:
+**Generally yes, but with important considerations.** OpenHands runs all code in a secure, isolated Docker container
+(called a "sandbox") that is separate from your host system. However, the safety depends on your configuration:

 **What's protected:**
 - Your host system files and programs (unless you mount them using [this feature](/usage/runtimes/docker#connecting-to-your-filesystem))
@@ -35,12 +42,14 @@ Using OpenHands for work? We'd love to chat! Fill out [this short form](https://
 - Other containers and processes

 **Potential risks to consider:**
- The agent can access the internet from within the container
- If you provide credentials (API keys, tokens), the agent can use them
- Mounted files and directories can be modified or deleted
- Network requests can be made to external services
+- The agent can access the internet from within the container.
+- If you provide credentials (API keys, tokens), the agent can use them.
+- Mounted files and directories can be modified or deleted.
+- Network requests can be made to external services.

-For detailed security information, see our [Runtime Architecture](/usage/architecture/runtime), [Security Configuration](/usage/configuration-options#security-configuration), and [Hardened Docker Installation](/usage/runtimes/docker#hardened-docker-installation) documentation.
+For detailed security information, see our [Runtime Architecture](/usage/architecture/runtime),
+[Security Configuration](/usage/configuration-options#security-configuration),
+and [Hardened Docker Installation](/usage/runtimes/docker#hardened-docker-installation) documentation.

 ## File Storage and Access

@@ -49,20 +58,19 @@ For detailed security information, see our [Runtime Architecture](/usage/archite
 Your files are stored in different locations depending on how you've configured OpenHands:

 **Default behavior (no file mounting):**
- Files created by the agent are stored inside the Docker container
- These files are temporary and will be lost when the container is removed
- The agent works in the `/workspace` directory inside the container
+- Files created by the agent are stored inside the runtime Docker container.
+- These files are temporary and will be lost when the container is removed.
+- The agent works in the `/workspace` directory inside the runtime container.

 **When you mount your local filesystem (following [this](/usage/runtimes/docker#connecting-to-your-filesystem)):**
- Your local files are mounted into the container's `/workspace` directory
- Changes made by the agent are reflected in your local filesystem
- Files persist after the container is stopped
+- Your local files are mounted into the container's `/workspace` directory.
+- Changes made by the agent are reflected in your local filesystem.
+- Files persist after the container is stopped.

 <Warning>
 Be careful when mounting your filesystem - the agent can modify or delete any files in the mounted directory.
 </Warning>

-
 ## Development Tools and Environment

 ### How do I get the dev tools I need?
@@ -71,15 +79,18 @@ OpenHands comes with a basic runtime environment that includes Python and Node.j
 It also has the ability to install any tools it needs, so usually it's sufficient to ask it to set up its environment.

 If you would like to set things up more systematically, you can:
- **Use setup.sh**: Add a [setup.sh file](https://docs.all-hands.dev/usage/prompting/repository#setup-script) file to your repository, which will be run every time the agent starts
- **Use a custom sandbox**: Use a [custom docker image](/usage/how-to/custom-sandbox-guide) to initialize the sandbox
-
+- **Use setup.sh**: Add a [setup.sh file](/usage/prompting/repository#setup-script) to
+  your repository, which will be run every time the agent starts.
+- **Use a custom sandbox**: Use a [custom docker image](/usage/how-to/custom-sandbox-guide) to initialize the sandbox.

 ### Something's not working. Where can I get help?

-1. **Check our troubleshooting guide**: Common issues and solutions are documented in [Troubleshooting](/usage/troubleshooting/troubleshooting)
-2. **Search existing issues**: Check our [GitHub issues](https://github.com/All-Hands-AI/OpenHands/issues) to see if others have encountered the same problem
-3. **Join our community**: Get help from other users and developers:
+1. **Search existing issues**: Check our [GitHub issues](https://github.com/All-Hands-AI/OpenHands/issues) to see if
+  others have encountered the same problem.
+2. **Join our community**: Get help from other users and developers:
   - [Slack community](https://join.slack.com/t/openhands-ai/shared_invite/zt-34zm4j0gj-Qz5kRHoca8DFCbqXPS~f_A)
   - [Discord server](https://discord.gg/ESHStjSjD4)
-4. **Report bugs**: If you've found a bug, please [create an issue](https://github.com/All-Hands-AI/OpenHands/issues/new) with details about your setup and the problem
+3. **Check our troubleshooting guide**: Common issues and solutions are documented in
+  [Troubleshooting](/usage/troubleshooting/troubleshooting).
+4. **Report bugs**: If you've found a bug, please [create an issue](https://github.com/All-Hands-AI/OpenHands/issues/new)
+  and fill in as much detail as possible.
@@ -1,6 +1,6 @@
 ---
 title: Start Building
-description: So you've [run OpenHands](./installation) and have [set up your LLM](./installation#setup). Now what?
+description: So you've [run OpenHands](/usage/installation). Now what?
 icon: code
 ---

@@ -25,9 +25,9 @@ You can use the Settings page at any time to:
 - Set up the LLM provider and model for OpenHands.
 - [Set up the search engine](/usage/search-engine-setup).
 - [Configure MCP servers](/usage/mcp).
- [Connect to GitHub](/usage/how-to/gui-mode#github-setup) and [connect to GitLab](/usage/how-to/gui-mode#gitlab-setup)
+- [Connect to GitHub](/usage/how-to/gui-mode#github-setup) and [connect to GitLab](/usage/how-to/gui-mode#gitlab-setup).
 - Set application settings like your preferred language, notifications and other preferences.
- [Manage custom secrets](/usage/how-to/gui-mode#secrets-management).
+- [Manage custom secrets](/usage/common-settings#secrets-management).

 #### GitHub Setup

@@ -157,37 +157,6 @@ OpenHands automatically exports a `GITLAB_TOKEN` to the shell environment if pro

 </AccordionGroup>

-
-#### Secrets Management
-
-OpenHands provides a secrets manager that allows you to securely store and manage sensitive information that can be accessed by the agent during runtime, such as API keys. These secrets are automatically exported as environment variables in the agent's runtime environment.
-
-1. **Accessing the Secrets Manager**:
-   - In the Settings page, navigate to the `Secrets` tab.
-   - You'll see a list of all your existing custom secrets (if any).
-
-2. **Adding a New Secret**:
-   - Click the `Add New Secret` button.
-   - Fill in the following fields:
-     - **Name**: A unique identifier for your secret (e.g., `AWS_ACCESS_KEY`). This will be the environment variable name.
-     - **Value**: The sensitive information you want to store.
-     - **Description** (optional): A brief description of what the secret is used for, which is also provided to the agent.
-   - Click `Add Secret` to save.
-
-3. **Editing a Secret**:
-   - Click the `Edit` button next to the secret you want to modify.
-   - You can update the name and description of the secret.
-   - Note: For security reasons, you cannot view or edit the value of an existing secret. If you need to change the value, delete the secret and create a new one.
-
-4. **Deleting a Secret**:
-   - Click the `Delete` button next to the secret you want to remove.
-   - Confirm the deletion when prompted.
-
-5. **Using Secrets in the Agent**:
-   - All custom secrets are automatically exported as environment variables in the agent's runtime environment.
-   - You can access them in your code using standard environment variable access methods (e.g., `os.environ['SECRET_NAME']` in Python).
-   - Example: If you create a secret named `OPENAI_API_KEY`, you can access it in your code as `process.env.OPENAI_API_KEY` in JavaScript or `os.environ['OPENAI_API_KEY']` in Python.
-
 #### Advanced Settings

 The `Advanced` settings allow configuration of additional LLM settings. Inside the Settings page, under the `LLM` tab,
@@ -208,11 +177,11 @@ section of the documentation.
 The status indicator located in the bottom left of the screen will cycle through a number of states as a new conversation
 is loaded. Typically these include:

-* `Disconnected` : The frontend is not connected to any conversation
+* `Disconnected` : The frontend is not connected to any conversation.
 * `Connecting` : The frontend is connecting a websocket to a conversation.
 * `Building Runtime...` : The server is building a runtime. This is typically in development mode only while building a docker image.
 * `Starting Runtime...` : The server is starting a new runtime instance - probably a new docker container or remote runtime.
-* `Initializing Agent...` : The server is starting the agent loop. (This step does not appear at present with Nested runtimes)
+* `Initializing Agent...` : The server is starting the agent loop (this step does not currently appear with nested runtimes).
 * `Setting up workspace...` : Usually this means a `git clone ...` operation.
 * `Setting up git hooks` : Setting up the git pre commit hooks for the workspace.
 * `Agent is awaiting user input...` : Ready to go!
@@ -1,12 +1,12 @@
 ---
 title: Quick Start
-description: Running OpenHands Cloud or running on your local system.
+description: Running OpenHands Cloud or running on your own.
 icon: rocket
 ---

 ## OpenHands Cloud

-The easiest way to get started with OpenHands is on OpenHands Cloud, which comes with $50 in free credits for new users.
+The easiest way to get started with OpenHands is on OpenHands Cloud, which comes with $20 in free credits for new users.

 To get started with OpenHands Cloud, visit [app.all-hands.dev](https://app.all-hands.dev).

@@ -6,58 +6,70 @@ description: When using a Local LLM, OpenHands may have limited functionality. I
 ## News

 - 2025/05/21: We collaborated with Mistral AI and released [Devstral Small](https://mistral.ai/news/devstral) that achieves [46.8% on SWE-Bench Verified](https://github.com/SWE-bench/experiments/pull/228)!
- 2025/03/31: We released an open model OpenHands LM v0.1 32B that achieves 37.1% on SWE-Bench Verified
+- 2025/03/31: We released an open model OpenHands LM 32B v0.1 that achieves 37.1% on SWE-Bench Verified
 ([blog](https://www.all-hands.dev/blog/introducing-openhands-lm-32b----a-strong-open-coding-agent-model), [model](https://huggingface.co/all-hands/openhands-lm-32b-v0.1)).

+## Quickstart: Running OpenHands with a Local LLM using LM Studio

-## Quickstart: Running OpenHands on Your Macbook
+This guide explains how to serve a local Devstral LLM using [LM Studio](https://lmstudio.ai/) and have OpenHands connect to it.

-### Serve the model on your Macbook
+We recommend:
+- **LM Studio** as the local model server, which handles metadata downloads automatically and offers a simple, user-friendly interface for configuration.
+- **Devstral Small 2505** as the LLM for software development, trained on real GitHub issues and optimized for agent-style workflows like OpenHands.

-We recommend using [LMStudio](https://lmstudio.ai/) for serving these models locally.
+### Hardware Requirements

-1. Download [LM Studio](https://lmstudio.ai/) and install it
+Running Devstral requires a recent GPU with at least 16GB of VRAM, or a Mac with Apple Silicon (M1, M2, etc.) with at least 32GB of RAM.

-2. Download the model:
-   - Option 1: Directly download the LLM from [this link](https://lmstudio.ai/model/devstral-small-2505-mlx) or by searching for the name `Devstral-Small-2505` in LM Studio
-   - Option 2: Download a LLM in GGUF format. For example, to download [Devstral Small 2505 GGUF](https://huggingface.co/mistralai/Devstral-Small-2505_gguf), using `huggingface-cli download mistralai/Devstral-Small-2505_gguf --local-dir mistralai/Devstral-Small-2505_gguf`. Then in bash terminal, run `lms import {model_name}` in the directory where you've downloaded the model checkpoint (e.g. run `lms import devstralQ4_K_M.gguf` in `mistralai/Devstral-Small-2505_gguf`)
+### 1. Install LM Studio

-3. Open LM Studio application, you should first switch to `power user` mode, and then open the developer tab:
+Download and install the LM Studio desktop app from [lmstudio.ai](https://lmstudio.ai/).

-![image](./screenshots/1_select_power_user.png)
+### 2. Download Devstral Small

-4. Then click `Select a model to load` on top of the application:
+1. Set the User Interface Complexity Level to "Power User" by clicking the appropriate label at the bottom of the window.
+2. Click the "Discover" button (Magnifying Glass icon) on the left navigation bar to open the Models download page.

-![image](./screenshots/2_select_model.png)
+![image](./screenshots/01_lm_studio_open_model_hub.png)

-5. And choose the model you want to use, holding `option` on mac to enable advanced loading options:
+3. Search for the "Devstral Small 2505" model, confirm it's the official Mistral AI (mistralai) model, then proceed to download.

-![image](./screenshots/3_select_devstral.png)
+![image](./screenshots/02_lm_studio_download_devstral.png)

-6. You should then pick an appropriate context window for OpenHands based on your hardware configuration (larger than 32768 is recommended for using OpenHands, but too large may cause you to run out of memory); Flash attention is also recommended if it works on your machine.
+4. Wait for the download to finish.

-![image](./screenshots/4_set_context_window.png)
+### 3. Load the Model

-7. And you should start the server (if it is not already in `Running` status), un-toggle `Serve on Local Network` and remember the port number of the LMStudio URL (`1234` is the port number for `http://127.0.0.1:1234` in this example):
+1. Click the "Developer" button (Console icon) on the left navigation bar to open the Developer Console.
+2. Click the "Select a model to load" dropdown at the top of the application window.

-![image](./screenshots/5_copy_url.png)
+![image](./screenshots/03_lm_studio_open_load_model.png)

-8. Finally, you can click the `copy` button near model name to copy the model name (`imported-models/uncategorized/devstralq4_k_m.gguf` in this example):
+3. Enable the "Manually choose model load parameters" switch.
+4. Select 'Devstral Small 2505' from the model list.

-![image](./screenshots/6_copy_to_get_model_name.png)
+![image](./screenshots/04_lm_studio_setup_devstral_part_1.png)

-### Start OpenHands with locally served model
+5. Enable the "Show advanced settings" switch at the bottom of the Model settings flyout to show all the available settings.
+6. Set "Context Length" to at least 32768 and enable Flash Attention.
+7. Click "Load Model" to start loading the model.

-Check [the installation guide](/usage/local-setup) to make sure you have all the prerequisites for running OpenHands.
+![image](./screenshots/05_lm_studio_setup_devstral_part_2.png)
+
+### 4. Start the LLM server
+
+1. Enable the switch next to "Status" at the top-left of the Window.
+2. Take note of the Model API Identifier shown on the sidebar on the right.
+
+![image](./screenshots/06_lm_studio_start_server.png)
+
+### 5. Start OpenHands
+
+1. Check [the installation guide](/usage/local-setup) and ensure all prerequisites are met before running OpenHands, then run:

 ```bash
-export LMSTUDIO_MODEL_NAME="imported-models/uncategorized/devstralq4_k_m.gguf" # <- Replace this with the model name you copied from LMStudio
-export LMSTUDIO_URL="http://host.docker.internal:1234"  # <- Replace this with the port from LMStudio
-
 docker pull docker.all-hands.dev/all-hands-ai/runtime:0.45-nikolaik

-mkdir -p ~/.openhands && echo '{"language":"en","agent":"CodeActAgent","max_iterations":null,"security_analyzer":null,"confirmation_mode":false,"llm_model":"lm_studio/'$LMSTUDIO_MODEL_NAME'","llm_api_key":"dummy","llm_base_url":"'$LMSTUDIO_URL/v1'","remote_runtime_resource_factor":null,"github_token":null,"enable_default_condenser":true,"user_consents_to_analytics":true}' > ~/.openhands/settings.json
-
 docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.45-nikolaik \
    -e LOG_ALL_EVENTS=true \
@@ -69,9 +81,7 @@ docker run -it --rm --pull=always \
    docker.all-hands.dev/all-hands-ai/openhands:0.45
 ```

-> **Note**: If you used OpenHands before version 0.44, you may want to run `mv ~/.openhands-state ~/.openhands` to migrate your conversation history to the new location.
-
-Once your server is running -- you can visit `http://localhost:3000` in your browser to use OpenHands with local Devstral model:
+2. Wait until the server is running (see log below):
 ```
 Digest: sha256:e72f9baecb458aedb9afc2cd5bc935118d1868719e55d50da73190d3a85c674f
 Status: Image is up to date for docker.all-hands.dev/all-hands-ai/openhands:0.45
@@ -84,65 +94,88 @@ INFO:     Application startup complete.
 INFO:     Uvicorn running on http://0.0.0.0:3000 (Press CTRL+C to quit)
 ```

+3. Visit `http://localhost:3000` in your browser.

-## Advanced: Serving LLM on GPUs
+### 6. Configure OpenHands to use the LLM server

-### Download model checkpoints
+Once you open OpenHands in your browser, you'll need to configure it to use the local LLM server you just started.

-<Note>
-The model checkpoints downloaded here should NOT be in GGUF format.
-</Note>
+When started for the first time, OpenHands will prompt you to set up the LLM provider.

-For example, to download [OpenHands LM 32B v0.1](https://huggingface.co/all-hands/openhands-lm-32b-v0.1):
+1. Click "see advanced settings" to open the LLM Settings page.
+
+![image](./screenshots/07_openhands_open_advanced_settings.png)
+
+2. Enable the "Advanced" switch at the top of the page to show all the available settings.
+
+3. Set the following values:
+    - **Custom Model**: `openai/mistralai/devstral-small-2505` (the Model API identifier from LM Studio, prefixed with "openai/")
+    - **Base URL**: `http://host.docker.internal:1234/v1`
+    - **API Key**: `local-llm`
+
+4. Click "Save Settings" to save the configuration.
+
+![image](./screenshots/08_openhands_configure_local_llm_parameters.png)
+
+That's it! You can now start using OpenHands with the local LLM server.
+
+If you encounter any issues, let us know on [Slack](https://join.slack.com/t/openhands-ai/shared_invite/zt-34zm4j0gj-Qz5kRHoca8DFCbqXPS~f_A) or [Discord](https://discord.gg/ESHStjSjD4).
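
Before opening OpenHands, it can help to confirm that the LM Studio server answers and that the Model API Identifier from step 4 is listed. The following is a minimal sketch, not part of OpenHands: it assumes the server exposes the OpenAI-style `GET /v1/models` endpoint, and the function names `list_model_ids` and `parse_model_ids` are hypothetical.

```python
import json
import urllib.request


def parse_model_ids(payload: dict) -> list:
    """Extract ids from an OpenAI-style `{"data": [{"id": ...}]}` body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_model_ids(base_url: str, api_key: str = "local-llm") -> list:
    """Ask an OpenAI-compatible server which models it serves.

    Assumption: the server implements `GET <base_url>/models`.
    """
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_model_ids(json.load(response))


if __name__ == "__main__":
    # From the host machine, use localhost rather than host.docker.internal.
    print(list_model_ids("http://localhost:1234/v1"))
```

If the identifier you noted does not appear in the printed list, the model is probably not loaded or the server is not running.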
+
+## Advanced: Alternative LLM Backends
+
+This section describes how to run local LLMs with OpenHands using alternative backends like Ollama, SGLang, or vLLM — without relying on LM Studio.
+
+### Create an OpenAI-Compatible Endpoint with Ollama
+
+- Install Ollama following [the official documentation](https://ollama.com/download).
+- Example launch command for Devstral Small 2505:

 ```bash
-huggingface-cli download all-hands/openhands-lm-32b-v0.1 --local-dir all-hands/openhands-lm-32b-v0.1
+# ⚠️ WARNING: OpenHands requires a large context size to work properly.
+# When using Ollama, set OLLAMA_CONTEXT_LENGTH to at least 32768.
+# The default (4096) is way too small — not even the system prompt will fit, and the agent will not behave correctly.
+OLLAMA_CONTEXT_LENGTH=32768 OLLAMA_HOST=0.0.0.0:11434 OLLAMA_KEEP_ALIVE=-1 nohup ollama serve &
+ollama pull devstral:latest
 ```

-### Create an OpenAI-Compatible Endpoint With SGLang
+### Create an OpenAI-Compatible Endpoint with vLLM or SGLang
+
+First, download the model checkpoints. For [Devstral Small 2505](https://huggingface.co/mistralai/Devstral-Small-2505):
+
+```bash
+huggingface-cli download mistralai/Devstral-Small-2505 --local-dir mistralai/Devstral-Small-2505
+```
+
+#### Serving the model using SGLang

 - Install SGLang following [the official documentation](https://docs.sglang.ai/start/install.html).
- Example launch command for OpenHands LM 32B (with at least 2 GPUs):
+- Example launch command for Devstral Small 2505 (with at least 2 GPUs):

 ```bash
 SGLANG_ALLOW_OVERWRITE_LONGER_CONTEXT_LEN=1 python3 -m sglang.launch_server \
-    --model all-hands/openhands-lm-32b-v0.1 \
-    --served-model-name openhands-lm-32b-v0.1 \
+    --model mistralai/Devstral-Small-2505 \
+    --served-model-name Devstral-Small-2505 \
    --port 8000 \
    --tp 2 --dp 1 \
    --host 0.0.0.0 \
    --api-key mykey --context-length 131072
 ```

-### Create an OpenAI-Compatible Endpoint with vLLM
+#### Serving the model using vLLM

 - Install vLLM following [the official documentation](https://docs.vllm.ai/en/latest/getting_started/installation.html).
- Example launch command for OpenHands LM 32B (with at least 2 GPUs):
+- Example launch command for Devstral Small 2505 (with at least 2 GPUs):

 ```bash
-vllm serve all-hands/openhands-lm-32b-v0.1 \
+vllm serve mistralai/Devstral-Small-2505 \
    --host 0.0.0.0 --port 8000 \
    --api-key mykey \
    --tensor-parallel-size 2 \
-    --served-model-name openhands-lm-32b-v0.1
+    --served-model-name Devstral-Small-2505 \
    --enable-prefix-caching
 ```

-### Create an OpenAI-Compatible Endpoint with Ollama
-
- Install Ollama following [the official documentation](https://ollama.com/download).
- For Ollama configuration, use `ollama/<modelname>` as custom model in web. Api key also can be set to `ollama`.
- Example launch command for Devstral LM 24B:
-
-```bash
-OLLAMA_CONTEXT_LENGTH=32768 OLLAMA_HOST=0.0.0.0:11434 OLLAMA_KEEP_ALIVE=-1 nohup ollama serve&
-#The minimum context size is ~8196, even the system prompt won't fit smaller
-ollama pull devstral:latest
-```
-
-## Advanced: Run and Configure OpenHands
-
-### Run OpenHands
+### Run OpenHands (Alternative Backends)

 #### Using Docker

@@ -151,24 +184,20 @@ Run OpenHands using [the official docker run command](../installation#start-the-
 #### Using Development Mode

 Use the instructions in [Development.md](https://github.com/All-Hands-AI/OpenHands/blob/main/Development.md) to build OpenHands.
-Ensure `config.toml` exists by running `make setup-config` which will create one for you. In the `config.toml`, enter the following:
-
-```
-[core]
-workspace_base="/path/to/your/workspace"
-
-[llm]
-model="openhands-lm-32b-v0.1"
-ollama_base_url="http://localhost:8000"
-```

 Start OpenHands using `make run`.

-### Configure OpenHands
+### Configure OpenHands (Alternative Backends)

-Once OpenHands is running, you'll need to set the following in the OpenHands UI through the Settings under the `LLM` tab:
-1. Enable `Advanced` options.
-2. Set the following:
- `Custom Model` to `openai/<served-model-name>` (e.g. `openai/openhands-lm-32b-v0.1`)
- `Base URL` to `http://host.docker.internal:8000`
- `API key` to the same string you set when serving the model (e.g. `mykey`)
+Once OpenHands is running, open the Settings page in the UI and go to the `LLM` tab.
+
+1. Click **"see advanced settings"** to access the full configuration panel.
+2. Enable the **Advanced** toggle at the top of the page.
+3. Set the following parameters, if you followed the examples above:
+   - **Custom Model**: `openai/<served-model-name>`
+     e.g. `openai/devstral` if you're using Ollama, or `openai/Devstral-Small-2505` for SGLang or vLLM.
+   - **Base URL**: `http://host.docker.internal:<port>/v1`
+     Use port `11434` for Ollama, or `8000` for SGLang and vLLM.
+   - **API Key**:
+     - For **Ollama**: any placeholder value (e.g. `dummy`, `local-llm`)
+     - For **SGLang** or **vLLM**: use the same key provided when starting the server (e.g. `mykey`)
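
The settings above can be summarized as a small lookup. This is a minimal sketch rather than OpenHands code: the helper name `llm_settings` and the dictionary layout are hypothetical, while the ports, model names, and key conventions are the ones used in the examples in this section.

```python
# Hypothetical helper: maps each backend from the examples above to the
# values you would enter on the OpenHands LLM settings page.
BACKENDS = {
    # backend: (port, served model name)
    "ollama": (11434, "devstral"),
    "sglang": (8000, "Devstral-Small-2505"),
    "vllm": (8000, "Devstral-Small-2505"),
}


def llm_settings(backend: str, api_key: str = "mykey") -> dict:
    """Return Custom Model, Base URL, and API Key for a backend."""
    port, served_name = BACKENDS[backend]
    return {
        "custom_model": f"openai/{served_name}",
        "base_url": f"http://host.docker.internal:{port}/v1",
        # Ollama accepts any placeholder; SGLang and vLLM expect the key
        # that was passed with --api-key when the server was started.
        "api_key": "dummy" if backend == "ollama" else api_key,
    }


if __name__ == "__main__":
    for name in BACKENDS:
        print(name, llm_settings(name))
```

If you changed the port, served model name, or API key when launching the server, adjust the values accordingly.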
@@ -153,8 +153,6 @@ To enable search functionality in OpenHands:

 For more details, see the [Search Engine Setup](/usage/search-engine-setup) guide.

-Now you're ready to [get started with OpenHands](/usage/getting-started).
-
 ### Versions

 The [docker command above](/usage/local-setup#start-the-app) pulls the most recent stable release of OpenHands. You have other options as well:
@@ -5,26 +5,111 @@ description: Keyword-triggered microagents provide OpenHands with specific instr

 ## Usage

-These microagents are only loaded when a prompt includes one of the trigger words.
+Keyword-triggered microagents are only loaded when a prompt includes one of the trigger words. There are two types of keyword-triggered microagents:
+
+1. **Standard Keyword Microagents**: Triggered by keywords embedded in text
+2. **Command-Style Microagents**: Triggered by command-style inputs (e.g., `/fix_test`) that can prompt for user input
+
+Additionally, there's a special type of microagent that's always active:
+
+3. **Repository Microagents**: Always active for a specific repository, providing repository-specific context and tools

 ## Frontmatter Syntax

 Frontmatter is required for keyword-triggered microagents. It must be placed at the top of the file,
-above the guidelines.
+above the guidelines. Enclose the frontmatter in triple dashes (---).

-Enclose the frontmatter in triple dashes (---) and include the following fields:
+### Standard Keyword Microagents
+
+For standard keyword microagents, include the following fields:

 | Field      | Description                                      | Required | Default          |
 |------------|--------------------------------------------------|----------|------------------|
+| `name`     | The name of the microagent                       | No       | Filename         |
+| `type`     | The type of microagent (`knowledge`)             | No       | Inferred         |
 | `triggers` | A list of keywords that activate the microagent. | Yes      | None             |
-| `agent`    | The agent this microagent applies to.            | No       | 'CodeActAgent'   |

+### Command-Style Microagents

-## Example
+For command-style microagents that require user input, include the following fields:

-Keyword-triggered microagent file example located at `.openhands/microagents/yummy.md`:
-```
+| Field      | Description                                                | Required | Default          |
+|------------|------------------------------------------------------------|----------|------------------|
+| `name`     | The name of the microagent                                 | No       | Filename         |
+| `type`     | The type of microagent (`task`)                            | No       | Inferred         |
+| `triggers` | A list of command triggers (e.g., `/fix_test`)             | No       | `/[name]`        |
+| `inputs`   | A list of input variables the microagent requires          | Yes      | None             |
+
+### Repository Microagents
+
+Repository microagents are always active for a specific repository. They provide repository-specific context and tools.
+
+| Field      | Description                                                | Required | Default          |
+|------------|------------------------------------------------------------|----------|------------------|
+| `name`     | The name of the microagent                                 | No       | Filename         |
+| `type`     | The type of microagent (`repo`)                            | No       | Inferred         |
+
+#### Repository Microagent Example
+
+Here's an example of a repository microagent:
+
+```yaml
 ---
+# The type field is optional and will be inferred as 'repo' when no triggers are present
+---
+
+# Repository Guidelines
+
+This repository follows these coding standards:
+1. Use PEP 8 for Python code
+2. Use ESLint for JavaScript code
+3. Write unit tests for all new features
+```
+
+This microagent is always active when working with the repository and provides repository-specific guidelines.
+
+### MCP Tools Support
+
+Microagents can also provide additional MCP (Model Context Protocol) tools to the agent. This is useful for extending the agent's capabilities with custom tools.
+
+| Field        | Description                                                | Required | Default          |
+|--------------|------------------------------------------------------------|----------|------------------|
+| `mcp_tools`  | Configuration for additional MCP tools                     | No       | None             |
+
+#### MCP Tools Example
+
+Here's an example of a microagent that provides an additional MCP tool (the `fetch` tool for accessing web content):
+
+```yaml
+---
+# The type field is optional and will be inferred as 'repo' when no triggers are present
+mcp_tools:
+  stdio_servers:
+    - name: "fetch"
+      command: uvx
+      args:
+        - mcp-server-fetch
+---
+```
+
+This microagent is a repository microagent (always active) that adds the `fetch` tool to the agent's capabilities.
+
+Each input in the `inputs` list requires:
+
+| Field         | Description                                      | Required |
+|---------------|--------------------------------------------------|----------|
+| `name`        | The name of the input variable                   | Yes      |
+| `description` | A description of what the input should contain   | Yes      |
+
+
+## Examples
+
+### Standard Keyword Microagent Example
+
+Standard keyword microagent file example located at `.openhands/microagents/yummy.md`:
+```yaml
+---
+# The type field is optional and will be inferred as 'knowledge' when triggers are present
 triggers:
 - yummyhappy
 - happyyummy
@@ -33,4 +118,58 @@ triggers:
 The user has said the magic word. Respond with "That was delicious!"
 ```

-[See examples of microagents triggered by keywords in the official OpenHands repository](https://github.com/All-Hands-AI/OpenHands/tree/main/microagents)
+### Command-Style Microagent Example
+
+Command-style microagent file example located at `.openhands/microagents/fix_test.md`:
+```yaml
+---
+# The type field is optional and will be inferred as 'task' when inputs are present
+triggers:
+- /fix_test
+inputs:
+  - name: BRANCH_NAME
+    description: "Branch for the agent to work on"
+  - name: TEST_COMMAND_TO_RUN
+    description: "The test command you want the agent to work on. For example, `pytest tests/unit/test_bash_parsing.py`"
+  - name: FUNCTION_TO_FIX
+    description: "The name of the function to fix"
+  - name: FILE_FOR_FUNCTION
+    description: "The path of the file that contains the function"
+---
+
+Can you check out branch "{{ BRANCH_NAME }}", and run {{ TEST_COMMAND_TO_RUN }}.
+
+Help me fix these tests to pass by fixing the {{ FUNCTION_TO_FIX }} function in file {{ FILE_FOR_FUNCTION }}.
+
+PLEASE DO NOT modify the tests by yourself -- Let me know if you think some of the tests are incorrect.
+```
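
The defaults described in the frontmatter tables (a `name` taken from the filename, a `type` inferred from the fields present) can be sketched as follows. This is an illustration under stated assumptions, not the OpenHands loader: the function `describe_microagent` is hypothetical, and checking `inputs` before `triggers` is an assumption based on the `fix_test` example, which has both and is described as a `task`.

```python
from pathlib import Path


def describe_microagent(path: str, frontmatter: dict) -> dict:
    """Fill in the documented defaults for `name` and `type`."""
    # `name` defaults to the filename without its extension.
    name = frontmatter.get("name", Path(path).stem)

    # `type` is inferred from which fields are present.
    if "type" in frontmatter:
        kind = frontmatter["type"]
    elif frontmatter.get("inputs"):
        kind = "task"       # command-style microagent
    elif frontmatter.get("triggers"):
        kind = "knowledge"  # standard keyword microagent
    else:
        kind = "repo"       # always-active repository microagent
    return {"name": name, "type": kind}


if __name__ == "__main__":
    print(describe_microagent(
        ".openhands/microagents/yummy.md",
        {"triggers": ["yummyhappy", "happyyummy"]},
    ))
```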
+
+## Using Command-Style Microagents
+
+Command-style microagents are designed to streamline common development tasks by providing structured templates for specific operations. They are triggered using a command-style format and will prompt the user for any required inputs.
+
+### How to Use
+
+1. Type `/` in the chat input to see available command-style microagents
+2. Select a microagent from the dropdown or type its name (e.g., `/fix_test`)
+3. The agent will prompt you for any required inputs
+4. Provide the requested information
+5. The agent will execute the task with your inputs
+
+### Template Variables
+
+In the body of a command-style microagent, you can reference input variables using the `{{ VARIABLE_NAME }}` syntax. These will be replaced with the user-provided values when the microagent is triggered.
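
The substitution can be pictured with a short regex-based sketch. This is a stand-in for illustration only: OpenHands may use a different mechanism, and the function name `render` and the choice to leave unknown variables untouched are assumptions.

```python
import re

# Matches `{{ NAME }}` with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(body: str, values: dict) -> str:
    """Replace each `{{ NAME }}` with its value; leave unknown names as-is."""
    return _PLACEHOLDER.sub(
        lambda match: str(values.get(match.group(1), match.group(0))),
        body,
    )


if __name__ == "__main__":
    template = 'Can you check out branch "{{ BRANCH_NAME }}", and run {{ TEST_COMMAND_TO_RUN }}.'
    print(render(template, {
        "BRANCH_NAME": "main",
        "TEST_COMMAND_TO_RUN": "pytest tests/unit/test_bash_parsing.py",
    }))
```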
+
+### Available Command-Style Microagents
+
+OpenHands includes several built-in command-style microagents:
+
+| Command                  | Description                                        |
+|--------------------------|----------------------------------------------------|
+| `/fix_test`              | Fix failing tests by modifying a specific function |
+| `/update_test`           | Update tests for a new implementation              |
+| `/update_pr_description` | Update a pull request description                  |
+| `/address_pr_comments`   | Address comments on a pull request                 |
+| `/add_repo_inst`         | Add instructions to the repository microagent      |
+
+[See examples of microagents in the official OpenHands repository](https://github.com/All-Hands-AI/OpenHands/tree/main/microagents)
@@ -8,7 +8,7 @@ description: Microagents are specialized prompts that enhance OpenHands with dom
 Currently OpenHands supports the following types of microagents:

 - [General Microagents](./microagents-repo): General guidelines for OpenHands about the repository.
- [Keyword-Triggered Microagents](./microagents-keyword): Guidelines activated by specific keywords in prompts.
+- [Keyword-Triggered Microagents](./microagents-keyword): Guidelines activated by specific keywords in prompts, including command-style microagents that prompt for user inputs.

 To customize OpenHands' behavior, create a .openhands/microagents/ directory in the root of your repository and
 add `<microagent_name>.md` files inside. For repository-specific guidelines, you can ask OpenHands to analyze your repository and create a comprehensive `repo.md` file (see [General Microagents](./microagents-repo) for details).
@@ -34,7 +34,7 @@ some-repository/
 Each microagent file may include frontmatter that provides additional information. In some cases, this frontmatter
 is required:

-| Microagent Type                 | Required |
-|---------------------------------|----------|
-| `General Microagents`           | No       |
-| `Keyword-Triggered Microagents` | Yes      |
+| Microagent Type                                | Required |
+|------------------------------------------------|----------|
+| `General Microagents`                          | No       |
+| `Keyword-Triggered Microagents (all types)`    | Yes      |
@@ -128,3 +128,7 @@ docker network create openhands-network
 docker run # ... \
    --network openhands-network \
 ```
+
+<Note>
+**Docker Desktop Required**: Network isolation features, including custom networks and `host.docker.internal` routing, require Docker Desktop. Docker Engine alone does not support these features on localhost across custom networks. If you're using Docker Engine without Docker Desktop, network isolation may not work as expected.
+</Note>
@@ -133,13 +133,66 @@ This guide provides step-by-step instructions for running OpenHands on a Windows

   > **Note**: If you're running the frontend in development mode (using `npm run dev`), use port 3001 instead: `http://localhost:3001`

+## Installing and Running the CLI
+
+To install and run the OpenHands CLI on Windows without WSL, follow these steps:
+
+### 1. Install uv (Python Package Manager)
+
+Open PowerShell as Administrator and run:
+
+```powershell
+powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
+```
+
+### 2. Install .NET SDK (Required)
+
+The OpenHands CLI **requires** the .NET Core runtime for PowerShell integration. Without it, the CLI will fail to start with a `coreclr` error. Install the .NET SDK which includes the runtime:
+
+```powershell
+winget install Microsoft.DotNet.SDK.8
+```
+
+Alternatively, you can download and install the .NET SDK from the [official Microsoft website](https://dotnet.microsoft.com/download).
+
+After installation, restart your PowerShell session to ensure the environment variables are updated.
+
+### 3. Install and Run OpenHands
+
+After installing the prerequisites, you can install and run OpenHands with:
+
+```powershell
+uvx --python 3.12 --from openhands-ai openhands
+```
+
+### Troubleshooting CLI Issues
+
+#### CoreCLR Error
+
+If you encounter an error like `Failed to load CoreCLR` or `pythonnet.load('coreclr')` when running OpenHands CLI, this indicates that the .NET Core runtime is missing or not properly configured. To fix this:
+
+1. Install the .NET SDK as described in step 2 above
+2. Verify that your system PATH includes the .NET SDK directories
+3. Restart your PowerShell session completely after installing the .NET SDK
+4. Make sure you're using PowerShell 7 (pwsh) rather than Windows PowerShell
+
+To verify your .NET installation, run:
+
+```powershell
+dotnet --info
+```
+
+This should display information about your installed .NET SDKs and runtimes. If this command fails, the .NET SDK is not properly installed or not in your PATH.
+
+If the issue persists after installing the .NET SDK, try installing .NET Runtime 6.0 or later from the [.NET download page](https://dotnet.microsoft.com/download).
+
 ## Limitations on Windows

 When running OpenHands on Windows without WSL or Docker, be aware of the following limitations:

 1. **Browser Tool Not Supported**: The browser tool is not currently supported on Windows.

-2. **.NET Core Requirement**: The PowerShell integration requires .NET Core Runtime to be installed. If .NET Core is not available, OpenHands will automatically fall back to a more limited PowerShell implementation with reduced functionality.
+2. **.NET Core Requirement**: The PowerShell integration requires .NET Core Runtime to be installed. The CLI implementation attempts to load the CoreCLR at startup with `pythonnet.load('coreclr')` and will fail with an error if .NET Core is not properly installed.

 3. **Interactive Shell Commands**: Some interactive shell commands may not work as expected. The PowerShell session implementation has limitations compared to the bash session used on Linux/macOS.

@@ -7,7 +7,7 @@
    "node": ">=20.0.0"
  },
  "dependencies": {
-    "@heroui/react": "^2.8.0-beta.7",
+    "@heroui/react": "^2.8.0-beta.9",
    "@microlink/react-json-view": "^1.26.2",
    "@monaco-editor/react": "^4.7.0-rc.0",
    "@react-router/node": "^7.6.2",
@@ -18,7 +18,7 @@
    "@stripe/stripe-js": "^7.3.1",
    "@tailwindcss/postcss": "^4.1.10",
    "@tailwindcss/vite": "^4.1.10",
-    "@tanstack/react-query": "^5.80.7",
+    "@tanstack/react-query": "^5.80.10",
    "@vitejs/plugin-react": "^4.5.2",
    "@xterm/addon-fit": "^0.10.0",
    "@xterm/xterm": "^5.4.0",
@@ -31,7 +31,7 @@
    "i18next-http-backend": "^3.0.2",
    "isbot": "^5.1.28",
    "jose": "^6.0.11",
-    "lucide-react": "^0.517.0",
+    "lucide-react": "^0.519.0",
    "monaco-editor": "^0.52.2",
    "posthog-js": "^1.255.0",
    "react": "^19.1.0",
@@ -84,7 +84,7 @@
    "@babel/traverse": "^7.27.1",
    "@babel/types": "^7.27.0",
    "@mswjs/socket.io-binding": "^0.2.0",
-    "@playwright/test": "^1.53.0",
+    "@playwright/test": "^1.53.1",
    "@react-router/dev": "^7.6.2",
    "@tailwindcss/typography": "^0.5.16",
    "@tanstack/eslint-plugin-query": "^5.78.0",
@@ -1,20 +1,16 @@
 ---
-name: add_agent
-type: knowledge
-version: 1.0.0
-agent: CodeActAgent
 triggers:
-  - new agent
-  - new microagent
-  - create agent
-  - create an agent
-  - create microagent
-  - create a microagent
-  - add agent
-  - add an agent
-  - add microagent
-  - add a microagent
-  - microagent template
+- new agent
+- new microagent
+- create agent
+- create an agent
+- create microagent
+- create a microagent
+- add agent
+- add an agent
+- add microagent
+- add a microagent
+- microagent template
 ---

 This agent helps create new microagents in the `.openhands/microagents` directory by providing guidance and templates.
@@ -1,13 +1,9 @@
 ---
-name: add_repo_inst
-version: 1.0.0
-author: openhands
-agent: CodeActAgent
+inputs:
+- description: Name of the repository folder under /workspace
+  name: REPO_FOLDER_NAME
 triggers:
 - /add_repo_inst
-inputs:
-  - name: REPO_FOLDER_NAME
-    description: "Branch for the agent to work on"
 ---

 Please browse the current repository under /workspace/{{ REPO_FOLDER_NAME }}, look at the documentation and relevant code, and understand the purpose of this repository.
@@ -18,7 +14,6 @@ Here's an example:
 ```markdown
 ---
 name: repo
-type: repo
 agent: CodeActAgent
 ---

@@ -1,15 +1,11 @@
 ---
-name: address_pr_comments
-version: 1.0.0
-author: openhands
-agent: CodeActAgent
+inputs:
+- description: URL of the pull request
+  name: PR_URL
+- description: Branch name corresponding to the pull request
+  name: BRANCH_NAME
 triggers:
 - /address_pr_comments
-inputs:
-  - name: PR_URL
-    description: "URL of the pull request"
-  - name: BRANCH_NAME
-    description: "Branch name corresponds to the pull request"
 ---

 First, check the branch {{ BRANCH_NAME }} and read the diff against the main branch to understand the purpose.
@@ -1,8 +1,4 @@
 ---
-name: agent_memory
-type: knowledge
-version: 1.0.0
-agent: CodeActAgent
 triggers:
 - /remember
 ---
@@ -1,15 +1,8 @@
 ---
-# This is a repo microagent that is always activated
-# to include necessary default tools implemented with MCP
-name: default-tools
-type: repo
-version: 1.0.0
-agent: CodeActAgent
 mcp_tools:
  stdio_servers:
-    - name: "fetch"
-      command: "uvx"
-      args: ["mcp-server-fetch"]
-# We leave the body empty because MCP tools will automatically add the
-# tool description for LLMs in tool calls, so there's no need to add extra descriptions.
+  - args:
+    - mcp-server-fetch
+    command: uvx
+    name: fetch
 ---
@@ -1,8 +1,4 @@
 ---
-name: docker
-type: knowledge
-version: 1.0.0
-agent: CodeActAgent
 triggers:
 - docker
 - container
@@ -1,19 +1,16 @@
 ---
-name: fix_test
-version: 1.0.0
-author: openhands
-agent: CodeActAgent
+inputs:
+- description: Branch for the agent to work on
+  name: BRANCH_NAME
+- description: The test command you want the agent to work on. For example, `pytest
+    tests/unit/test_bash_parsing.py`
+  name: TEST_COMMAND_TO_RUN
+- description: The name of function to fix
+  name: FUNCTION_TO_FIX
+- description: The path of the file that contains the function
+  name: FILE_FOR_FUNCTION
 triggers:
 - /fix_test
-inputs:
-  - name: BRANCH_NAME
-    description: "Branch for the agent to work on"
-  - name: TEST_COMMAND_TO_RUN
-    description: "The test command you want the agent to work on. For example, `pytest tests/unit/test_bash_parsing.py`"
-  - name: FUNCTION_TO_FIX
-    description: "The name of function to fix"
-  - name: FILE_FOR_FUNCTION
-    description: "The path of the file that contains the function"
 ---

 Can you check out branch "{{ BRANCH_NAME }}", and run {{ TEST_COMMAND_TO_RUN }}.
@@ -1,8 +1,4 @@
 ---
-name: flarglebargle
-type: knowledge
-version: 1.0.0
-agent: CodeActAgent
 triggers:
 - flarglebargle
 ---
@@ -1,8 +1,4 @@
 ---
-name: github
-type: knowledge
-version: 1.0.0
-agent: CodeActAgent
 triggers:
 - github
 - git
@@ -1,8 +1,4 @@
 ---
-name: gitlab
-type: knowledge
-version: 1.0.0
-agent: CodeActAgent
 triggers:
 - gitlab
 - git
@@ -1,8 +1,4 @@
 ---
-name: kubernetes
-type: knowledge
-version: 1.0.0
-agent: CodeActAgent
 triggers:
 - kubernetes
 - k8s
@@ -1,8 +1,4 @@
 ---
-name: npm
-type: knowledge
-version: 1.0.0
-agent: CodeActAgent
 triggers:
 - npm
 ---
@@ -1,8 +1,4 @@
 ---
-name: pdflatex
-type: knowledge
-version: 1.0.0
-agent: CodeActAgent
 triggers:
 - pdflatex
 ---
@@ -1,15 +1,12 @@
 ---
-name: security
-type: knowledge
-version: 1.0.0
-agent: CodeActAgent
 triggers:
-  - security
-  - vulnerability
-  - authentication
-  - authorization
-  - permissions
+- security
+- vulnerability
+- authentication
+- authorization
+- permissions
 ---
+
 This document provides guidance on security best practices

 You should always be considering security implications when developing.
@@ -1,16 +1,12 @@
 ---
-name: SSH Microagent
-type: knowledge
-version: 1.0.0
-agent: CodeActAgent
 triggers:
-  - ssh
-  - remote server
-  - remote machine
-  - remote host
-  - remote connection
-  - secure shell
-  - ssh keys
+- ssh
+- remote server
+- remote machine
+- remote host
+- remote connection
+- secure shell
+- ssh keys
 ---

 # SSH Microagent
@@ -1,12 +1,8 @@
 ---
- name: swift-linux
- type: knowledge
- agent: CodeActAgent
- version: 1.0.0
- triggers:
- - swift-linux
- - swift-debian
- - swift-installation
+triggers:
+- swift-linux
+- swift-debian
+- swift-installation
 ---

 # Swift Installation Guide for Debian Linux
@@ -1,19 +1,15 @@
 ---
-name: update_pr_description
-version: 1.0.0
-author: openhands
-agent: CodeActAgent
+inputs:
+- description: URL of the pull request
+  name: PR_URL
+  type: string
+  validation:
+    pattern: ^https://github.com/.+/.+/pull/[0-9]+$
+- description: Branch name corresponding to the pull request
+  name: BRANCH_NAME
+  type: string
 triggers:
 - /update_pr_description
-inputs:
-  - name: PR_URL
-    description: "URL of the pull request"
-    type: string
-    validation:
-      pattern: "^https://github.com/.+/.+/pull/[0-9]+$"
-  - name: BRANCH_NAME
-    description: "Branch name corresponds to the pull request"
-    type: string
 ---

 Please check the branch "{{ BRANCH_NAME }}" and look at the diff against the main branch. This branch belongs to this PR "{{ PR_URL }}".
@@ -1,15 +1,12 @@
 ---
-name: update_test
-version: 1.0.0
-author: openhands
-agent: CodeActAgent
+inputs:
+- description: Branch for the agent to work on
+  name: BRANCH_NAME
+- description: The test command you want the agent to work on. For example, `pytest
+    tests/unit/test_bash_parsing.py`
+  name: TEST_COMMAND_TO_RUN
 triggers:
 - /update_test
-inputs:
-  - name: BRANCH_NAME
-    description: "Branch for the agent to work on"
-  - name: TEST_COMMAND_TO_RUN
-    description: "The test command you want the agent to work on. For example, `pytest tests/unit/test_bash_parsing.py`"
 ---

 Can you check out branch "{{ BRANCH_NAME }}", and run {{ TEST_COMMAND_TO_RUN }}.
@@ -95,6 +95,7 @@ class CodeActAgent(Agent):
        if self._prompt_manager is None:
            self._prompt_manager = PromptManager(
                prompt_dir=os.path.join(os.path.dirname(__file__), 'prompts'),
+                system_prompt_filename=self.config.system_prompt_filename,
            )

        return self._prompt_manager
@@ -6,17 +6,20 @@ from openhands.llm.tool_names import EXECUTE_BASH_TOOL_NAME

 _DETAILED_BASH_DESCRIPTION = """Execute a bash command in the terminal within a persistent shell session.

+
 ### Command Execution
 * One command at a time: You can only execute one bash command at a time. If you need to run multiple commands sequentially, use `&&` or `;` to chain them together.
 * Persistent session: Commands execute in a persistent shell session where environment variables, virtual environments, and working directory persist between commands.
-* Timeout: Commands have a soft timeout of 10 seconds, once that's reached, you have the option to continue or interrupt the command (see section below for details)
+* Soft timeout: Commands have a soft timeout of 10 seconds. Once that's reached, you have the option to continue or interrupt the command (see section below for details)

-### Running and Interacting with Processes
-* Long running commands: For commands that may run indefinitely, run them in the background and redirect output to a file, e.g. `python3 app.py > server.log 2>&1 &`. For commands that need to run for a specific duration, like "sleep", you can set the "timeout" argument to specify a hard timeout in seconds.
-* Interact with running process: If a bash command returns exit code `-1`, this means the process is not yet finished. By setting `is_input` to `true`, you can:
+### Long-running Commands
+* For commands that may run indefinitely, run them in the background and redirect output to a file, e.g. `python3 app.py > server.log 2>&1 &`.
+* For commands that may run for a long time (e.g. installation or testing commands), or commands that run for a fixed amount of time (e.g. sleep), you should set the "timeout" parameter of your function call to an appropriate value.
+* If a bash command returns exit code `-1`, this means the process hit the soft timeout and is not yet finished. By setting `is_input` to `true`, you can:
  - Send empty `command` to retrieve additional logs
  - Send text (set `command` to the text) to STDIN of the running process
  - Send control commands like `C-c` (Ctrl+C), `C-d` (Ctrl+D), or `C-z` (Ctrl+Z) to interrupt the process
+  - If you send C-c, you can restart the process with a longer "timeout" parameter to let it run to completion

 ### Best Practices
 * Directory verification: Before creating new directories or files, first verify the parent directory exists and is the correct location.
@@ -52,6 +52,7 @@ async def handle_commands(

    if command == '/exit':
        close_repl = handle_exit_command(
+            config,
            event_stream,
            usage_metrics,
            sid,
@@ -66,7 +67,7 @@ async def handle_commands(
        handle_status_command(usage_metrics, sid)
    elif command == '/new':
        close_repl, new_session_requested = handle_new_command(
-            event_stream, usage_metrics, sid
+            config, event_stream, usage_metrics, sid
        )
    elif command == '/settings':
        await handle_settings_command(config, settings_store)
@@ -81,12 +82,16 @@ async def handle_commands(


 def handle_exit_command(
-    event_stream: EventStream, usage_metrics: UsageMetrics, sid: str
+    config: OpenHandsConfig,
+    event_stream: EventStream,
+    usage_metrics: UsageMetrics,
+    sid: str,
 ) -> bool:
    close_repl = False

    confirm_exit = (
-        cli_confirm('\nTerminate session?', ['Yes, proceed', 'No, dismiss']) == 0
+        cli_confirm(config, '\nTerminate session?', ['Yes, proceed', 'No, dismiss'])
+        == 0
    )

    if confirm_exit:
@@ -119,7 +124,7 @@ async def handle_init_command(
    reload_microagents = False

    if config.runtime == 'local':
-        init_repo = await init_repository(current_dir)
+        init_repo = await init_repository(config, current_dir)
        if init_repo:
            event_stream.add_event(
                MessageAction(content=REPO_MD_CREATE_PROMPT),
@@ -140,13 +145,17 @@ def handle_status_command(usage_metrics: UsageMetrics, sid: str) -> None:


 def handle_new_command(
-    event_stream: EventStream, usage_metrics: UsageMetrics, sid: str
+    config: OpenHandsConfig,
+    event_stream: EventStream,
+    usage_metrics: UsageMetrics,
+    sid: str,
 ) -> tuple[bool, bool]:
    close_repl = False
    new_session_requested = False

    new_session_requested = (
        cli_confirm(
+            config,
            '\nCurrent session will be terminated and you will lose the conversation history.\n\nContinue?',
            ['Yes, proceed', 'No, dismiss'],
        )
@@ -171,6 +180,7 @@ async def handle_settings_command(
 ) -> None:
    display_settings(config)
    modify_settings = cli_confirm(
+        config,
        '\nWhich settings would you like to modify?',
        [
            'Basic',
@@ -207,7 +217,7 @@ async def handle_resume_command(
    return close_repl, new_session_requested


-async def init_repository(current_dir: str) -> bool:
+async def init_repository(config: OpenHandsConfig, current_dir: str) -> bool:
    repo_file_path = Path(current_dir) / '.openhands' / 'microagents' / 'repo.md'
    init_repo = False

@@ -237,6 +247,7 @@ async def init_repository(current_dir: str) -> bool:

            init_repo = (
                cli_confirm(
+                    config,
                    'Do you want to re-initialize?',
                    ['Yes, re-initialize', 'No, dismiss'],
                )
@@ -255,6 +266,7 @@ async def init_repository(current_dir: str) -> bool:

        init_repo = (
            cli_confirm(
+                config,
                'Do you want to proceed?',
                ['Yes, create', 'No, dismiss'],
            )
@@ -297,7 +309,10 @@ def check_folder_security_agreement(config: OpenHandsConfig, current_dir: str) -
        print_formatted_text('')

        confirm = (
-            cli_confirm('Do you wish to continue?', ['Yes, proceed', 'No, exit']) == 0
+            cli_confirm(
+                config, 'Do you wish to continue?', ['Yes, proceed', 'No, exit']
+            )
+            == 0
        )

        if confirm:
@@ -155,7 +155,7 @@ async def run_session(
        nonlocal reload_microagents, new_session_requested
        while True:
            next_message = await read_prompt_input(
-                agent_state, multiline=config.cli_multiline_input
+                config, agent_state, multiline=config.cli_multiline_input
            )

            if not next_message.strip():
@@ -214,7 +214,7 @@ async def run_session(
                    )
                    return

-                confirmation_status = await read_confirmation_input()
+                confirmation_status = await read_confirmation_input(config)
                if confirmation_status == 'yes' or confirmation_status == 'always':
                    event_stream.add_event(
                        ChangeAgentStateAction(AgentState.USER_CONFIRMED),
@@ -135,9 +135,10 @@ async def get_validated_input(
    return value


-def save_settings_confirmation() -> bool:
+def save_settings_confirmation(config: OpenHandsConfig) -> bool:
    return (
        cli_confirm(
+            config,
            '\nSave new settings? (They will take effect after restart)',
            ['Yes, save', 'No, discard'],
        )
@@ -173,6 +174,7 @@ async def modify_llm_settings_basic(
        # Show verified providers plus "Select another provider" option
        provider_choices = verified_providers + ['Select another provider']
        provider_choice = cli_confirm(
+            config,
            '(Step 1/3) Select LLM Provider:',
            provider_choices,
        )
@@ -255,6 +257,7 @@ async def modify_llm_settings_basic(
        )
        change_model = (
            cli_confirm(
+                config,
                'Do you want to use a different model?',
                [f'Use {default_model}', 'Select another model'],
            )
@@ -264,23 +267,28 @@ async def modify_llm_settings_basic(
        if change_model:
            model_completer = FuzzyWordCompleter(provider_models)

-            # Define a validator function that prints an error message
+            # Define a validator function that allows custom models but shows a warning
            def model_validator(x):
-                is_valid = x in provider_models
-                if not is_valid:
+                # Allow any non-empty model name
+                if not x.strip():
+                    return False
+
+                # Show a warning for models not in the predefined list, but still allow them
+                if x not in provider_models:
                    print_formatted_text(
                        HTML(
-                            f'<grey>Invalid model selected for provider {provider}: {x}</grey>'
+                            f'<yellow>Warning: {x} is not in the predefined list for provider {provider}. '
+                            f'Make sure this model name is correct.</yellow>'
                        )
                    )
-                return is_valid
+                return True

            model = await get_validated_input(
                session,
                '(Step 2/3) Select LLM Model (TAB for options, CTRL-c to cancel): ',
                completer=model_completer,
                validator=model_validator,
-                error_message=f'Invalid model selected for provider {provider}',
+                error_message='Model name cannot be empty',
            )
        else:
            # Use the default model
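The relaxed validation in this hunk can be summarized as a pure function. This is a minimal sketch, assuming a hypothetical helper `check_model_name` that is not part of the change itself; it only mirrors the rule that blank names are rejected while unknown names are accepted with a warning.

```python
from typing import Optional


def check_model_name(
    name: str, known_models: list[str]
) -> tuple[bool, Optional[str]]:
    """Return (accepted, warning). Hypothetical helper for illustration only."""
    # Blank or whitespace-only input is the only hard failure.
    if not name.strip():
        return False, None
    # Unknown models are still accepted, but the caller gets a warning to show.
    if name not in known_models:
        return True, f'Warning: {name} is not in the predefined list.'
    return True, None


print(check_model_name('my-custom-model', ['gpt-4o']))
```

Returning the warning instead of printing it keeps the rule easy to unit test; the real validator prints through `print_formatted_text`.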
@@ -302,7 +310,7 @@ async def modify_llm_settings_basic(
    # The try-except block above ensures we either have valid inputs or we've already returned
    # No need to check for None values here

-    save_settings = save_settings_confirmation()
+    save_settings = save_settings_confirmation(config)

    if not save_settings:
        return
@@ -377,6 +385,7 @@ async def modify_llm_settings_advanced(

        enable_confirmation_mode = (
            cli_confirm(
+                config,
                question='(Step 5/6) Confirmation Mode (CTRL-c to cancel):',
                choices=['Enable', 'Disable'],
            )
@@ -385,6 +394,7 @@ async def modify_llm_settings_advanced(

        enable_memory_condensation = (
            cli_confirm(
+                config,
                question='(Step 6/6) Memory Condensation (CTRL-c to cancel):',
                choices=['Enable', 'Disable'],
            )
@@ -401,7 +411,7 @@ async def modify_llm_settings_advanced(
    # The try-except block above ensures we either have valid inputs or we've already returned
    # No need to check for None values here

-    save_settings = save_settings_confirmation()
+    save_settings = save_settings_confirmation(config)

    if not save_settings:
        return
@@ -191,19 +191,24 @@ def display_event(event: Event, config: OpenHandsConfig) -> None:
        if isinstance(event, MessageAction):
            if event.source == EventSource.AGENT:
                display_message(event.content)
+
        if isinstance(event, CmdRunAction):
-            display_command(event)
+            # Only display the command if it's not already confirmed
+            # Commands are always shown when AWAITING_CONFIRMATION, so we don't need to show them again when CONFIRMED
+            if event.confirmation_state != ActionConfirmationStatus.CONFIRMED:
+                display_command(event)
+
            if event.confirmation_state == ActionConfirmationStatus.CONFIRMED:
                initialize_streaming_output()
-        if isinstance(event, CmdOutputObservation):
+        elif isinstance(event, CmdOutputObservation):
            display_command_output(event.content)
-        if isinstance(event, FileEditObservation):
+        elif isinstance(event, FileEditObservation):
            display_file_edit(event)
-        if isinstance(event, FileReadObservation):
+        elif isinstance(event, FileReadObservation):
            display_file_read(event)
-        if isinstance(event, AgentStateChangedObservation):
+        elif isinstance(event, AgentStateChangedObservation):
            display_agent_state_change_message(event.agent_state)
-        if isinstance(event, ErrorObservation):
+        elif isinstance(event, ErrorObservation):
            display_error(event.content)


@@ -515,13 +520,16 @@ class CommandCompleter(Completer):
                    )


-def create_prompt_session() -> PromptSession[str]:
-    return PromptSession(style=DEFAULT_STYLE)
+def create_prompt_session(config: OpenHandsConfig) -> PromptSession[str]:
+    """Creates a prompt session with VI mode enabled if specified in the config."""
+    return PromptSession(style=DEFAULT_STYLE, vi_mode=config.cli.vi_mode)


-async def read_prompt_input(agent_state: str, multiline: bool = False) -> str:
+async def read_prompt_input(
+    config: OpenHandsConfig, agent_state: str, multiline: bool = False
+) -> str:
    try:
-        prompt_session = create_prompt_session()
+        prompt_session = create_prompt_session(config)
        prompt_session.completer = (
            CommandCompleter(agent_state) if not multiline else None
        )
@@ -553,9 +561,9 @@ async def read_prompt_input(agent_state: str, multiline: bool = False) -> str:
        return '/exit'


-async def read_confirmation_input() -> str:
+async def read_confirmation_input(config: OpenHandsConfig) -> str:
    try:
-        prompt_session = create_prompt_session()
+        prompt_session = create_prompt_session(config)

        with patch_stdout():
            print_formatted_text('')
@@ -601,7 +609,9 @@ async def process_agent_pause(done: asyncio.Event, event_stream: EventStream) ->


 def cli_confirm(
-    question: str = 'Are you sure?', choices: list[str] | None = None
+    config: OpenHandsConfig,
+    question: str = 'Are you sure?',
+    choices: list[str] | None = None,
 ) -> int:
    """Display a confirmation prompt with the given question and choices.

@@ -625,15 +635,27 @@ def cli_confirm(
    kb = KeyBindings()

    @kb.add('up')
-    def _(event: KeyPressEvent) -> None:
+    def _handle_up(event: KeyPressEvent) -> None:
        selected[0] = (selected[0] - 1) % len(choices)

+    if config.cli.vi_mode:
+
+        @kb.add('k')
+        def _handle_k(event: KeyPressEvent) -> None:
+            selected[0] = (selected[0] - 1) % len(choices)
+
    @kb.add('down')
-    def _(event: KeyPressEvent) -> None:
+    def _handle_down(event: KeyPressEvent) -> None:
        selected[0] = (selected[0] + 1) % len(choices)

+    if config.cli.vi_mode:
+
+        @kb.add('j')
+        def _handle_j(event: KeyPressEvent) -> None:
+            selected[0] = (selected[0] + 1) % len(choices)
+
    @kb.add('enter')
-    def _(event: KeyPressEvent) -> None:
+    def _handle_enter(event: KeyPressEvent) -> None:
        event.app.exit(result=selected[0])

    style = Style.from_dict({'selected': COLOR_GOLD, 'unselected': ''})
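The selection handlers in this hunk all share the same wrap-around arithmetic. The sketch below isolates it without prompt_toolkit; `move_selection` and `key_deltas` are hypothetical names used only for illustration, and the key table is an assumption about how the bindings could be expressed as data.

```python
def move_selection(selected: int, delta: int, num_choices: int) -> int:
    """Move the highlighted index by delta, wrapping around at both ends."""
    return (selected + delta) % num_choices


def key_deltas(vi_mode: bool) -> dict[str, int]:
    """Map key names to index deltas (hypothetical table, not the real bindings)."""
    deltas = {'up': -1, 'down': 1}
    if vi_mode:
        # j/k are only bound when vi mode is enabled in the config.
        deltas.update({'k': -1, 'j': 1})
    return deltas


print(move_selection(0, key_deltas(vi_mode=True)['k'], 3))
```

Python's `%` always returns a non-negative result for a positive modulus, which is why moving up from the first choice lands on the last one.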
@@ -5,12 +5,12 @@ from typing import TYPE_CHECKING

 if TYPE_CHECKING:
    from openhands.controller.state.state import State
-    from openhands.core.config import AgentConfig
    from openhands.events.action import Action
    from openhands.events.action.message import SystemMessageAction
    from openhands.utils.prompt import PromptManager
 from litellm import ChatCompletionToolParam

+from openhands.core.config import AgentConfig
 from openhands.core.exceptions import (
    AgentAlreadyRegisteredError,
    AgentNotRegisteredError,
@@ -33,10 +33,13 @@ class Agent(ABC):
    _registry: dict[str, type['Agent']] = {}
    sandbox_plugins: list[PluginRequirement] = []

+    config_model: type[AgentConfig] = AgentConfig
+    """Class field that specifies the config model to use for the agent. Subclasses may override with a derived config model if needed."""
+
    def __init__(
        self,
        llm: LLM,
-        config: 'AgentConfig',
+        config: AgentConfig,
    ):
        self.llm = llm
        self.config = config
@@ -821,6 +821,11 @@ class AgentController:
                    or 'input length and `max_tokens` exceed context limit' in error_str
                    or 'please reduce the length of either one'
                    in error_str  # For OpenRouter context window errors
+                    or (
+                        'sambanovaexception' in error_str
+                        and 'maximum context length' in error_str
+                    )
+                    # For SambaNova context window errors - only match when both patterns are present
                    or isinstance(e, ContextWindowExceededError)
                ):
                    if self.agent.config.enable_history_truncation:
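The SambaNova condition added here only matches when both substrings are present. A minimal sketch of that check follows, assuming the error text is compared in lowercase; `is_sambanova_context_error` is a hypothetical helper name, not a function in the codebase.

```python
def is_sambanova_context_error(error_text: str) -> bool:
    """Return True only when both SambaNova markers appear in the error text."""
    lowered = error_text.lower()
    return (
        'sambanovaexception' in lowered
        and 'maximum context length' in lowered
    )


print(is_sambanova_context_error(
    'SambaNovaException: exceeds the maximum context length'
))
```

Requiring both markers avoids treating unrelated SambaNova errors, or generic messages that merely mention context length, as context-window failures.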
@@ -1,4 +1,5 @@
 from openhands.core.config.agent_config import AgentConfig
+from openhands.core.config.cli_config import CLIConfig
 from openhands.core.config.config_utils import (
    OH_DEFAULT_AGENT,
    OH_MAX_ITERATIONS,
@@ -26,6 +27,7 @@ __all__ = [
    'OH_DEFAULT_AGENT',
    'OH_MAX_ITERATIONS',
    'AgentConfig',
+    'CLIConfig',
    'OpenHandsConfig',
    'MCPConfig',
    'LLMConfig',
@@ -5,6 +5,7 @@ from pydantic import BaseModel, Field, ValidationError
 from openhands.core.config.condenser_config import CondenserConfig, NoOpCondenserConfig
 from openhands.core.config.extended_config import ExtendedConfig
 from openhands.core.logger import openhands_logger as logger
+from openhands.utils.import_utils import get_impl


 class AgentConfig(BaseModel):
@@ -12,6 +13,8 @@ class AgentConfig(BaseModel):
    """The name of the llm config to use. If specified, this will override global llm config."""
    classpath: str | None = Field(default=None)
    """The classpath of the agent to use. To be used for custom agents that are not defined in the openhands.agenthub package."""
+    system_prompt_filename: str = Field(default='system_prompt.j2')
+    """Filename of the system prompt template file within the agent's prompt directory. Defaults to 'system_prompt.j2'."""
    enable_browsing: bool = Field(default=True)
    """Whether to enable browsing tool.
    Note: If using CLIRuntime, browsing is not implemented and should be disabled."""
@@ -96,7 +99,27 @@ class AgentConfig(BaseModel):
            try:
                # Merge base config with overrides
                merged = {**base_config.model_dump(), **overrides}
-                custom_config = cls.model_validate(merged)
+                # Imported before branching because both branches below use Agent
+                from openhands.controller.agent import Agent
+                if merged.get('classpath'):
+                    # if an explicit classpath is given, try to load it and look up its config model class
+                    try:
+                        agent_cls = get_impl(Agent, merged.get('classpath'))
+                        custom_config = agent_cls.config_model.model_validate(merged)
+                    except Exception as e:
+                        logger.warning(
+                            f'Failed to load custom agent class [{merged.get("classpath")}]: {e}. Using default config model.'
+                        )
+                        custom_config = cls.model_validate(merged)
+                else:
+                    # otherwise, try to look up the agent class by name (i.e. if it's a built-in)
+                    # if that fails, just use the default AgentConfig class.
+                    try:
+                        agent_cls = Agent.get_cls(name)
+                        custom_config = agent_cls.config_model.model_validate(merged)
+                    except Exception:
+                        # otherwise, just fall back to the default config model
+                        custom_config = cls.model_validate(merged)
                agent_mapping[name] = custom_config
            except ValidationError as e:
                logger.warning(
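The lookup in this hunk relies on each agent class exposing a `config_model` attribute. The sketch below shows the pattern with standard-library dataclasses instead of pydantic; every name in it (`BaseConfig`, `CustomConfig`, `BaseAgent`, `CustomAgent`, `REGISTRY`, `config_for`) is a hypothetical stand-in rather than the project's API.

```python
from dataclasses import dataclass


@dataclass
class BaseConfig:
    system_prompt_filename: str = 'system_prompt.j2'


@dataclass
class CustomConfig(BaseConfig):
    extra_option: bool = False


class BaseAgent:
    # Subclasses may point this at a derived config class.
    config_model: type[BaseConfig] = BaseConfig


class CustomAgent(BaseAgent):
    config_model = CustomConfig


# Hypothetical registry standing in for the agent lookup by name.
REGISTRY: dict[str, type[BaseAgent]] = {'CustomAgent': CustomAgent}


def config_for(name: str, values: dict) -> BaseConfig:
    """Build the config with the agent's own model, else the base model."""
    agent_cls = REGISTRY.get(name)
    model = agent_cls.config_model if agent_cls is not None else BaseConfig
    return model(**values)


print(config_for('CustomAgent', {'extra_option': True}))
```

Keeping the config class on the agent lets a custom agent add fields without the config loader knowing about them in advance.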
@@ -0,0 +1,9 @@
+from pydantic import BaseModel, Field
+
+
+class CLIConfig(BaseModel):
+    """Configuration for CLI-specific settings."""
+
+    vi_mode: bool = Field(default=False)
+
+    model_config = {'extra': 'forbid'}
@@ -158,6 +158,7 @@ class MCPConfig(BaseModel):
            mcp_mapping['mcp'] = cls(
                sse_servers=mcp_config.sse_servers,
                stdio_servers=mcp_config.stdio_servers,
+                shttp_servers=mcp_config.shttp_servers,
            )
        except ValidationError as e:
            raise ValueError(f'Invalid MCP configuration: {e}')
@@ -5,6 +5,7 @@ from pydantic import BaseModel, Field, SecretStr

 from openhands.core import logger
 from openhands.core.config.agent_config import AgentConfig
+from openhands.core.config.cli_config import CLIConfig
 from openhands.core.config.config_utils import (
    OH_DEFAULT_AGENT,
    OH_MAX_ITERATIONS,
@@ -109,6 +110,7 @@ class OpenHandsConfig(BaseModel):
    mcp_host: str = Field(default=f'localhost:{os.getenv("port", 3000)}')
    mcp: MCPConfig = Field(default_factory=MCPConfig)
    kubernetes: KubernetesConfig = Field(default_factory=KubernetesConfig)
+    cli: CLIConfig = Field(default_factory=CLIConfig)

    defaults_dict: ClassVar[dict] = {}

@@ -794,17 +794,19 @@ def convert_non_fncall_messages_to_fncall_messages(
                )

            if tool_result_match:
-                if not (
-                    isinstance(content, str)
-                    or (
-                        isinstance(content, list)
-                        and len(content) == 1
-                        and content[0].get('type') == 'text'
-                    )
-                ):
+                if isinstance(content, list):
+                    text_content_items = [
+                        item for item in content if item.get('type') == 'text'
+                    ]
+                    if not text_content_items:
+                        raise FunctionCallConversionError(
+                            f'Could not find text content in message with tool result. Content: {content}'
+                        )
+                elif not isinstance(content, str):
                    raise FunctionCallConversionError(
-                        f'Expected str or list with one text item when tool result is present in the message. Content: {content}'
+                        f'Unexpected content type {type(content)}. Expected str or list. Content: {content}'
                    )
+
                tool_name = tool_result_match.group(1)
                tool_result = tool_result_match.group(2).strip()
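The new content check accepts a string, or a list containing at least one text item, and rejects everything else. A minimal sketch of that rule follows; `ConversionError` is a hypothetical stand-in for `FunctionCallConversionError`, and `validate_tool_result_content` is an illustrative helper name.

```python
from typing import Any


class ConversionError(Exception):
    """Stand-in for FunctionCallConversionError (hypothetical)."""


def validate_tool_result_content(content: Any) -> None:
    """Raise ConversionError unless content is a str or a list with text items."""
    if isinstance(content, list):
        # Non-text items (for example images) are fine as long as text exists.
        text_items = [item for item in content if item.get('type') == 'text']
        if not text_items:
            raise ConversionError(f'No text content found. Content: {content}')
    elif not isinstance(content, str):
        raise ConversionError(
            f'Unexpected content type {type(content)}. Expected str or list.'
        )


validate_tool_result_content(
    [{'type': 'image_url'}, {'type': 'text', 'text': 'ok'}]
)
```

Compared with the previous rule, a list no longer needs to hold exactly one text item, so mixed text and image results pass validation.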

@@ -163,6 +163,7 @@ class LLM(RetryMixin, DebugMixin):
            'temperature': self.config.temperature,
            'max_completion_tokens': self.config.max_output_tokens,
        }
+
        if self.config.top_k is not None:
            # openai doesn't expose top_k
            # litellm will handle it a bit differently than the openai-compatible params
@@ -486,26 +487,6 @@ class LLM(RetryMixin, DebugMixin):
                # Safe fallback for any potentially viable model
                self.config.max_input_tokens = 4096

-        if self.config.max_output_tokens is None:
-            # Safe default for any potentially viable model
-            self.config.max_output_tokens = 4096
-            if self.model_info is not None:
-                # max_output_tokens has precedence over max_tokens, if either exists.
-                # litellm has models with both, one or none of these 2 parameters!
-                if 'max_output_tokens' in self.model_info and isinstance(
-                    self.model_info['max_output_tokens'], int
-                ):
-                    self.config.max_output_tokens = self.model_info['max_output_tokens']
-                elif 'max_tokens' in self.model_info and isinstance(
-                    self.model_info['max_tokens'], int
-                ):
-                    self.config.max_output_tokens = self.model_info['max_tokens']
-            if any(
-                model in self.config.model
-                for model in ['claude-3-7-sonnet', 'claude-3.7-sonnet']
-            ):
-                self.config.max_output_tokens = 64000  # litellm set max to 128k, but that requires a header to be set
-
        # Initialize function calling capability
        # Check if model name is in our supported list
        model_name_supported = (
@@ -40,6 +40,11 @@ class BaseMicroagent(BaseModel):
        derived_name = None
        if microagent_dir is not None:
            derived_name = str(path.relative_to(microagent_dir).with_suffix(''))
+        else:
+            derived_name = path.with_suffix('').name
+            logger.warning(
+                f'No microagent_dir provided. Microagent name will be the file name: {derived_name}'
+            )

        # Only load directly from path if file_content is not provided
        if file_content is None:
@@ -95,6 +100,16 @@ class BaseMicroagent(BaseModel):
            MicroagentType.TASK: TaskMicroagent,
        }

+        # We will always use derived_name if available
+        assert derived_name is not None
+        agent_name = derived_name
+        if metadata.name is not None:
+            logger.warning(
+                f'Detected `name:` field in frontmatter for microagent {metadata.name}. '
+                "This is deprecated. Microagent's name will use the file name "
+                f'({derived_name}) instead.'
+            )
+
        # Infer the agent type:
        # 1. If inputs exist -> TASK
        # 2. If triggers exist -> KNOWLEDGE
@@ -102,8 +117,7 @@ class BaseMicroagent(BaseModel):
        inferred_type: MicroagentType
        if metadata.inputs:
            inferred_type = MicroagentType.TASK
-            # Add a trigger for the agent name if not already present
-            trigger = f'/{metadata.name}'
+            trigger = f'/{agent_name}'
            if not metadata.triggers or trigger not in metadata.triggers:
                if not metadata.triggers:
                    metadata.triggers = [trigger]
@@ -120,9 +134,6 @@ class BaseMicroagent(BaseModel):
            # This should theoretically not happen with the logic above
            raise ValueError(f'Could not determine microagent type for: {path}')

-        # Use derived_name if available (from relative path), otherwise fallback to metadata.name
-        agent_name = derived_name if derived_name is not None else metadata.name
-
        agent_class = subclass_map[inferred_type]
        return agent_class(
            name=agent_name,
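The naming rule introduced above can be tried in isolation with `pathlib`. This is a minimal sketch, assuming a hypothetical helper `derive_microagent_name`; the real logic lives inline in `BaseMicroagent.load` and also logs a warning when no directory is given.

```python
from pathlib import Path
from typing import Optional


def derive_microagent_name(
    path: Path, microagent_dir: Optional[Path] = None
) -> str:
    """Return the name implied by the file location (hypothetical helper)."""
    if microagent_dir is not None:
        # Path relative to the microagent directory, without the extension.
        return str(path.relative_to(microagent_dir).with_suffix(''))
    # No directory given: fall back to the bare file name without extension.
    return path.with_suffix('').name


print(derive_microagent_name(Path('microagents/default.md'), Path('microagents')))
print(derive_microagent_name(Path('custom_name.md')))
```

Because the name now always comes from the file location, a test that expects the name `test_agent` has to load a file called `test_agent.md`, which is what the renamed test fixtures in this change do.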
@@ -25,10 +25,12 @@ class InputMetadata(BaseModel):
 class MicroagentMetadata(BaseModel):
    """Metadata for all microagents."""

-    name: str = 'default'
+    name: str = Field(default='default', exclude=True)
    type: MicroagentType = Field(default=MicroagentType.REPO_KNOWLEDGE)
-    version: str = Field(default='1.0.0')
-    agent: str = Field(default='CodeActAgent')
+    # Keep these fields for backward compatibility but they're not used
+    version: str = Field(default='1.0.0', exclude=True)
+    agent: str = Field(default='CodeActAgent', exclude=True)
+    author: str = Field(default='', exclude=True)
    triggers: list[str] = []  # optional, only exists for knowledge microagents
    inputs: list[InputMetadata] = []  # optional, only exists for task microagents
    mcp_tools: MCPConfig | None = (
@@ -9,11 +9,12 @@ import select
 import shutil
 import signal
 import subprocess
+import sys
 import tempfile
 import time
 import zipfile
 from pathlib import Path
-from typing import Any, Callable
+from typing import TYPE_CHECKING, Any, Callable

 from binaryornot.check import is_binary
 from openhands_aci.editor.editor import OHEditor
@@ -51,6 +52,41 @@ from openhands.runtime.base import Runtime
 from openhands.runtime.plugins import PluginRequirement
 from openhands.runtime.runtime_status import RuntimeStatus

+if TYPE_CHECKING:
+    from openhands.runtime.utils.windows_bash import WindowsPowershellSession
+
+# Import Windows PowerShell support if on Windows
+if sys.platform == 'win32':
+    try:
+        from openhands.runtime.utils.windows_exceptions import DotNetMissingError
+        from openhands.runtime.utils.windows_bash import WindowsPowershellSession
+    except (ImportError, DotNetMissingError) as err:
+        # Print a user-friendly error message without stack trace
+        friendly_message = """
+ERROR: PowerShell and .NET SDK are required but not properly configured
+
+The .NET SDK and PowerShell are required for OpenHands CLI on Windows.
+PowerShell integration cannot function without .NET Core.
+
+Please install the .NET SDK by following the instructions at:
+https://docs.all-hands.dev/usage/windows-without-wsl
+
+After installing .NET SDK, restart your terminal and try again.
+"""
+        print(friendly_message, file=sys.stderr)
+        logger.error(
+            f'Windows runtime initialization failed: {type(err).__name__}: {str(err)}'
+        )
+        if (
+            isinstance(err, DotNetMissingError)
+            and hasattr(err, 'details')
+            and err.details
+        ):
+            logger.debug(f'Details: {err.details}')
+
+        # Exit the program with an error code
+        sys.exit(1)
+

 class CLIRuntime(Runtime):
    """
@@ -119,6 +155,10 @@ class CLIRuntime(Runtime):
        self.file_editor = OHEditor(workspace_root=self._workspace_path)
        self._shell_stream_callback: Callable[[str], None] | None = None

+        # Initialize PowerShell session on Windows
+        self._is_windows = sys.platform == 'win32'
+        self._powershell_session: WindowsPowershellSession | None = None
+
        logger.warning(
            'Initializing CLIRuntime. WARNING: NO SANDBOX IS USED. '
            'This runtime executes commands directly on the local system. '
@@ -135,6 +175,15 @@ class CLIRuntime(Runtime):
        # Change to the workspace directory
        os.chdir(self._workspace_path)

+        # Initialize PowerShell session if on Windows
+        if self._is_windows:
+            self._powershell_session = WindowsPowershellSession(
+                work_dir=self._workspace_path,
+                username=None,  # Use current user
+                no_change_timeout_seconds=30,
+                max_memory_mb=None,
+            )
+
        if not self.attach_to_existing:
            await asyncio.to_thread(self.setup_initial_env)

@@ -241,6 +290,40 @@ class CLIRuntime(Runtime):
        except Exception as e:
            logger.error(f'Error: {e}')

+    def _execute_powershell_command(
+        self, command: str, timeout: float
+    ) -> CmdOutputObservation | ErrorObservation:
+        """
+        Execute a command using PowerShell session on Windows.
+        Args:
+            command: The command to execute
+            timeout: Timeout in seconds for the command
+        Returns:
+            CmdOutputObservation with the output and exit code, or ErrorObservation on failure
+        """
+        if self._powershell_session is None:
+            return ErrorObservation(
+                content='PowerShell session is not available.',
+                error_id='POWERSHELL_SESSION_ERROR',
+            )
+
+        try:
+            # Create a CmdRunAction for the PowerShell session
+            from openhands.events.action import CmdRunAction
+
+            ps_action = CmdRunAction(command=command)
+            ps_action.set_hard_timeout(timeout)
+
+            # Execute the command using the PowerShell session
+            return self._powershell_session.execute(ps_action)
+
+        except Exception as e:
+            logger.error(f'Error executing PowerShell command "{command}": {e}')
+            return ErrorObservation(
+                content=f'Error executing PowerShell command "{command}": {str(e)}',
+                error_id='POWERSHELL_EXECUTION_ERROR',
+            )
+
    def _execute_shell_command(
        self, command: str, timeout: float
    ) -> CmdOutputObservation:
@@ -378,9 +461,16 @@ class CLIRuntime(Runtime):
            logger.debug(
                f'Running command in CLIRuntime: "{action.command}" with effective timeout: {effective_timeout}s'
            )
-            return self._execute_shell_command(
-                action.command, timeout=effective_timeout
-            )
+
+            # Use PowerShell on Windows if available, otherwise use subprocess
+            if self._is_windows and self._powershell_session is not None:
+                return self._execute_powershell_command(
+                    action.command, timeout=effective_timeout
+                )
+            else:
+                return self._execute_shell_command(
+                    action.command, timeout=effective_timeout
+                )
        except Exception as e:
            logger.error(
                f'Error in CLIRuntime.run for command "{action.command}": {str(e)}'
@@ -737,6 +827,16 @@ class CLIRuntime(Runtime):
            raise RuntimeError(f'Error creating zip file: {str(e)}')

    def close(self) -> None:
+        # Clean up PowerShell session if it exists
+        if self._powershell_session is not None:
+            try:
+                self._powershell_session.close()
+                logger.debug('PowerShell session closed successfully.')
+            except Exception as e:
+                logger.warning(f'Error closing PowerShell session: {e}')
+            finally:
+                self._powershell_session = None
+
        self._runtime_initialized = False
        super().close()
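Review note: the dispatch-and-cleanup behaviour added to `CLIRuntime` in the hunks above can be exercised in isolation. The following is a minimal sketch, not the project's implementation; `FakeSession` and `MiniRuntime` are hypothetical names standing in for `WindowsPowershellSession` and `CLIRuntime`, and the string results stand in for observations.

```python
class FakeSession:
    """Hypothetical stand-in for a PowerShell session (not the real class)."""

    def __init__(self) -> None:
        self.closed = False

    def execute(self, command: str) -> str:
        return f'ps:{command}'

    def close(self) -> None:
        self.closed = True


class MiniRuntime:
    """Hypothetical runtime that prefers the session and falls back otherwise."""

    def __init__(self, session=None) -> None:
        self._session = session

    def run(self, command: str) -> str:
        # Use the session when one exists, otherwise the fallback path.
        if self._session is not None:
            return self._session.execute(command)
        return f'sh:{command}'

    def close(self) -> None:
        # Always drop the reference, even if closing the session raises.
        if self._session is not None:
            try:
                self._session.close()
            except Exception as e:
                print(f'Error closing session: {e}')
            finally:
                self._session = None
```

Resetting the reference in `finally` keeps `close()` safe to call more than once, and any `run()` call after closing takes the fallback path.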

@@ -21,21 +21,31 @@ from openhands.events.observation.commands import (
    CmdOutputObservation,
 )
 from openhands.runtime.utils.bash_constants import TIMEOUT_MESSAGE_TEMPLATE
+from openhands.runtime.utils.windows_exceptions import DotNetMissingError
 from openhands.utils.shutdown_listener import should_continue

-pythonnet.load('coreclr')
-logger.info("Successfully called pythonnet.load('coreclr')")
-
-# Now that pythonnet is initialized, import clr and System
 try:
-    import clr
+    pythonnet.load('coreclr')
+    logger.info("Successfully called pythonnet.load('coreclr')")

-    logger.debug(f'Imported clr module from: {clr.__file__}')
-    # Load System assembly *after* pythonnet is initialized
-    clr.AddReference('System')
-    import System
-except Exception as clr_sys_ex:
-    raise RuntimeError(f'FATAL: Failed to import clr or System. Error: {clr_sys_ex}')
+    # Now that pythonnet is initialized, import clr and System
+    try:
+        import clr
+
+        logger.debug(f'Imported clr module from: {clr.__file__}')
+        # Load System assembly *after* pythonnet is initialized
+        clr.AddReference('System')
+        import System
+    except Exception as clr_sys_ex:
+        error_msg = 'Failed to import .NET components.'
+        details = str(clr_sys_ex)
+        logger.error(f'{error_msg} Details: {details}')
+        raise DotNetMissingError(error_msg, details)
+except DotNetMissingError:
+    raise
+except Exception as coreclr_ex:
+    logger.error(f'Failed to load CoreCLR. Details: {coreclr_ex}')
+    raise DotNetMissingError('Failed to load CoreCLR.', str(coreclr_ex))

 # Attempt to load the PowerShell SDK assembly only if clr and System loaded
 ps_sdk_path = None
@@ -78,9 +88,10 @@ try:
        RunspaceState,
    )
 except Exception as e:
-    raise RuntimeError(
-        f'FATAL: Failed to load PowerShell SDK components. Error: {e}. Check pythonnet installation and .NET Runtime compatibility. Path searched: {ps_sdk_path}'
-    )
+    error_msg = 'Failed to load PowerShell SDK components.'
+    details = f'{str(e)} (Path searched: {ps_sdk_path})'
+    logger.error(f'{error_msg} Details: {details}')
+    raise DotNetMissingError(error_msg, details)


 class WindowsPowershellSession:
@@ -115,9 +126,11 @@ class WindowsPowershellSession:

        if PowerShell is None:  # Check if SDK loading failed during module import
            # Logged critical error during import, just raise here to prevent instantiation
-            raise RuntimeError(
-                'PowerShell SDK (System.Management.Automation.dll) could not be loaded. Cannot initialize WindowsPowershellSession.'
+            error_msg = (
+                'PowerShell SDK (System.Management.Automation.dll) could not be loaded.'
            )
+            logger.error(error_msg)
+            raise DotNetMissingError(error_msg)

        self.work_dir = os.path.abspath(work_dir)
        self.username = username
@@ -0,0 +1,15 @@
+"""
+Custom exceptions for Windows-specific runtime issues.
+"""
+
+
+class DotNetMissingError(Exception):
+    """
+    Exception raised when .NET SDK or CoreCLR is missing or cannot be loaded.
+    This is used to provide a cleaner error message to users without a full stack trace.
+    """
+
+    def __init__(self, message: str, details: str | None = None):
+        self.message = message
+        self.details = details
+        super().__init__(message)
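Review note: a minimal sketch of how an exception shaped like `DotNetMissingError` can carry a short message and optional details. The class body mirrors the one in this diff; `load_dotnet` and `describe` are hypothetical helpers introduced only for illustration and are not part of the codebase.

```python
from __future__ import annotations


class DotNetMissingError(Exception):
    """Local copy of the exception defined in the diff, for a runnable example."""

    def __init__(self, message: str, details: str | None = None):
        self.message = message
        self.details = details
        super().__init__(message)


def load_dotnet(loader):
    """Hypothetical wrapper: turn any loader failure into DotNetMissingError."""
    try:
        return loader()
    except DotNetMissingError:
        # Already classified by an inner step, so pass it through unchanged.
        raise
    except Exception as ex:
        raise DotNetMissingError('Failed to load CoreCLR.', str(ex)) from ex


def describe(err: DotNetMissingError) -> str:
    """Build a one-line summary, appending details only when present."""
    if err.details:
        return f'{err.message} Details: {err.details}'
    return err.message
```

Letting an already-classified `DotNetMissingError` pass through keeps the most specific message, and `from ex` preserves the original cause for debug logging.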
@@ -107,6 +107,10 @@ class ConversationManager(ABC):
    async def send_to_event_stream(self, connection_id: str, data: dict):
        """Send data to an event stream."""

+    @abstractmethod
+    async def send_event_to_conversation(self, sid: str, data: dict):
+        """Send an event to a conversation."""
+
    @abstractmethod
    async def disconnect_from_session(self, connection_id: str):
        """Disconnect from a session."""
@@ -275,6 +275,18 @@ class DockerNestedConversationManager(ConversationManager):
        # Not supported - clients should connect directly to the nested server!
        raise ValueError('unsupported_operation')

+    async def send_event_to_conversation(self, sid: str, data: dict):
+        async with httpx.AsyncClient(
+            headers={
+                'X-Session-API-Key': self._get_session_api_key_for_conversation(sid)
+            }
+        ) as client:
+            nested_url = self._get_nested_url(sid)
+            response = await client.post(
+                f'{nested_url}/api/conversations/{sid}/events', json=data
+            )
+            response.raise_for_status()
+
    async def disconnect_from_session(self, connection_id: str):
        # Not supported - clients should connect directly to the nested server!
        raise ValueError('unsupported_operation')
@@ -331,13 +331,13 @@ class StandaloneConversationManager(ConversationManager):
        sid = self._local_connection_id_to_session_id.get(connection_id)
        if not sid:
            raise RuntimeError(f'no_connected_session:{connection_id}')
+        await self.send_event_to_conversation(sid, data)

+    async def send_event_to_conversation(self, sid: str, data: dict):
        session = self._local_agent_loops_by_sid.get(sid)
-        if session:
-            await session.dispatch(data)
-            return
-
-        raise RuntimeError(f'no_connected_session:{connection_id}:{sid}')
+        if not session:
+            raise RuntimeError(f'no_conversation:{sid}')
+        await session.dispatch(data)

    async def disconnect_from_session(self, connection_id: str):
        sid = self._local_connection_id_to_session_id.pop(connection_id, None)
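Review note: the refactor above splits lookup-by-connection from dispatch-by-conversation. A minimal sketch of that delegation follows; `MiniManager` and `RecordingSession` are hypothetical stand-ins, and only the method names and error strings are taken from the diff.

```python
import asyncio


class RecordingSession:
    """Hypothetical session that records every dispatched payload."""

    def __init__(self) -> None:
        self.events: list[dict] = []

    async def dispatch(self, data: dict) -> None:
        self.events.append(data)


class MiniManager:
    """Hypothetical manager showing the two entry points and their delegation."""

    def __init__(self) -> None:
        self._connection_to_sid: dict[str, str] = {}
        self._sessions: dict[str, RecordingSession] = {}

    async def send_to_event_stream(self, connection_id: str, data: dict) -> None:
        # Resolve the connection to a conversation id, then delegate.
        sid = self._connection_to_sid.get(connection_id)
        if not sid:
            raise RuntimeError(f'no_connected_session:{connection_id}')
        await self.send_event_to_conversation(sid, data)

    async def send_event_to_conversation(self, sid: str, data: dict) -> None:
        # Callers that already know the conversation id can skip the lookup.
        session = self._sessions.get(sid)
        if not session:
            raise RuntimeError(f'no_conversation:{sid}')
        await session.dispatch(data)
```

With this shape, a caller that has only a conversation id (for example a server-side route) can reach the same dispatch path as a connected client.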
@@ -72,6 +72,7 @@ async def load_settings(
        )
        settings_with_token_data.llm_api_key = None
        settings_with_token_data.search_api_key = None
+        settings_with_token_data.sandbox_api_key = None
        return settings_with_token_data
    except Exception as e:
        logger.warning(f'Invalid token: {e}')
@@ -133,6 +133,8 @@ class Session:
        default_llm_config.api_key = settings.llm_api_key
        default_llm_config.base_url = settings.llm_base_url
        self.config.search_api_key = settings.search_api_key
+        if settings.sandbox_api_key:
+            self.config.sandbox.api_key = settings.sandbox_api_key.get_secret_value()

        # NOTE: this need to happen AFTER the config is updated with the search_api_key
        self.config.mcp = settings.mcp_config or MCPConfig(
@@ -40,6 +40,7 @@ class Settings(BaseModel):
    sandbox_runtime_container_image: str | None = None
    mcp_config: MCPConfig | None = None
    search_api_key: SecretStr | None = None
+    sandbox_api_key: SecretStr | None = None
    max_budget_per_task: float | None = None
    email: str | None = None
    email_verified: bool | None = None
@@ -52,13 +52,33 @@ class PromptManager:
    def __init__(
        self,
        prompt_dir: str,
+        system_prompt_filename: str = 'system_prompt.j2',
    ):
        self.prompt_dir: str = prompt_dir
-        self.system_template: Template = self._load_template('system_prompt')
+        self.system_template: Template = self._load_system_template(
+            system_prompt_filename
+        )
        self.user_template: Template = self._load_template('user_prompt')
        self.additional_info_template: Template = self._load_template('additional_info')
        self.microagent_info_template: Template = self._load_template('microagent_info')

+    def _load_system_template(self, system_prompt_filename: str) -> Template:
+        """Load the system prompt template using the specified filename."""
+        # Remove .j2 extension if present to use with _load_template
+        template_name = system_prompt_filename
+        if template_name.endswith('.j2'):
+            template_name = template_name[:-3]
+
+        try:
+            return self._load_template(template_name)
+        except FileNotFoundError:
+            # Provide a more specific error message for system prompt files
+            template_path = os.path.join(self.prompt_dir, f'{template_name}.j2')
+            raise FileNotFoundError(
+                f'System prompt file "{system_prompt_filename}" not found at {template_path}. '
+                f'Please ensure the file exists in the prompt directory: {self.prompt_dir}'
+            )
+
    def _load_template(self, template_name: str) -> Template:
        if self.prompt_dir is None:
            raise ValueError('Prompt directory is not set')
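Review note: the suffix handling in `_load_system_template` can be checked on its own. This is a minimal sketch; `template_name_from_filename` is a hypothetical helper name, and it assumes (as the diff does) that `_load_template` appends `.j2` itself.

```python
def template_name_from_filename(filename: str) -> str:
    """Strip one trailing '.j2' so the name can be passed to a loader that re-adds it."""
    if filename.endswith('.j2'):
        return filename[:-3]
    return filename
```

On Python 3.9 and later, `filename.removesuffix('.j2')` is an equivalent one-liner. Only a single trailing suffix is removed, so a name such as `custom.j2.j2` becomes `custom.j2`.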
@@ -1,4 +1,4 @@
-# This file is automatically @generated by Poetry 2.1.1 and should not be changed by hand.
+# This file is automatically @generated by Poetry 2.1.3 and should not be changed by hand.

 [[package]]
 name = "aioboto3"
@@ -462,7 +462,7 @@ description = "LTS Port of Python audioop"
 optional = false
 python-versions = ">=3.13"
 groups = ["main"]
-markers = "python_version >= \"3.13\""
+markers = "python_version == \"3.13\""
 files = [
    {file = "audioop_lts-0.2.1-cp313-abi3-macosx_10_13_universal2.whl", hash = "sha256:fd1345ae99e17e6910f47ce7d52673c6a1a70820d78b67de1b7abb3af29c426a"},
    {file = "audioop_lts-0.2.1-cp313-abi3-macosx_10_13_x86_64.whl", hash = "sha256:e175350da05d2087e12cea8e72a70a1a8b14a17e92ed2022952a4419689ede5e"},
@@ -663,14 +663,14 @@ crt = ["botocore[crt] (>=1.21.0,<2.0a0)"]

 [[package]]
 name = "boto3-stubs"
-version = "1.38.39"
-description = "Type annotations for boto3 1.38.39 generated with mypy-boto3-builder 8.11.0"
+version = "1.38.40"
+description = "Type annotations for boto3 1.38.40 generated with mypy-boto3-builder 8.11.0"
 optional = false
 python-versions = ">=3.8"
 groups = ["evaluation"]
 files = [
-    {file = "boto3_stubs-1.38.39-py3-none-any.whl", hash = "sha256:6cf3965964a9a22e895d75b4b1d8f21582de4bd114b9865c23da9d5b1cc7853c"},
-    {file = "boto3_stubs-1.38.39.tar.gz", hash = "sha256:a6066470f97da5810afeaa6d30028c4b198c9364c94180dc764597224321ae10"},
+    {file = "boto3_stubs-1.38.40-py3-none-any.whl", hash = "sha256:401c51aa07c96df4b5452f37a2af3936c21b97cea0e68cddda9977a9f864d110"},
+    {file = "boto3_stubs-1.38.40.tar.gz", hash = "sha256:cf8cf0f67aa6aac0caa6cdcd78609c182cdc12727a67bd811dd8a95d03fa6c8f"},
 ]

 [package.dependencies]
@@ -727,7 +727,7 @@ bedrock-data-automation-runtime = ["mypy-boto3-bedrock-data-automation-runtime (
 bedrock-runtime = ["mypy-boto3-bedrock-runtime (>=1.38.0,<1.39.0)"]
 billing = ["mypy-boto3-billing (>=1.38.0,<1.39.0)"]
 billingconductor = ["mypy-boto3-billingconductor (>=1.38.0,<1.39.0)"]
-boto3 = ["boto3 (==1.38.39)"]
+boto3 = ["boto3 (==1.38.40)"]
 braket = ["mypy-boto3-braket (>=1.38.0,<1.39.0)"]
 budgets = ["mypy-boto3-budgets (>=1.38.0,<1.39.0)"]
 ce = ["mypy-boto3-ce (>=1.38.0,<1.39.0)"]
@@ -1644,7 +1644,7 @@ files = [
    {file = "colorama-0.4.6-py2.py3-none-any.whl", hash = "sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6"},
    {file = "colorama-0.4.6.tar.gz", hash = "sha256:08695f5cb7ed6e0531a20572697297273c47b8cae5a63ffc6d6ed5c201be6e44"},
 ]
-markers = {main = "platform_system == \"Windows\" or sys_platform == \"win32\" or os_name == \"nt\"", dev = "os_name == \"nt\" or sys_platform == \"win32\"", runtime = "sys_platform == \"win32\"", test = "platform_system == \"Windows\" or sys_platform == \"win32\""}
+markers = {main = "platform_system == \"Windows\" or os_name == \"nt\" or sys_platform == \"win32\"", dev = "os_name == \"nt\" or sys_platform == \"win32\"", runtime = "sys_platform == \"win32\"", test = "platform_system == \"Windows\" or sys_platform == \"win32\""}

 [[package]]
 name = "comm"
@@ -3053,8 +3053,8 @@ files = [
 google-api-core = {version = ">=1.34.1,<2.0.dev0 || >=2.11.dev0,<3.0.0dev", extras = ["grpc"]}
 google-auth = ">=2.14.1,<2.24.0 || >2.24.0,<2.25.0 || >2.25.0,<3.0.0dev"
 proto-plus = [
-    {version = ">=1.22.3,<2.0.0dev"},
    {version = ">=1.25.0,<2.0.0dev", markers = "python_version >= \"3.13\""},
+    {version = ">=1.22.3,<2.0.0dev"},
 ]
 protobuf = ">=3.20.2,<4.21.0 || >4.21.0,<4.21.1 || >4.21.1,<4.21.2 || >4.21.2,<4.21.3 || >4.21.3,<4.21.4 || >4.21.4,<4.21.5 || >4.21.5,<6.0.0dev"

@@ -3076,8 +3076,8 @@ googleapis-common-protos = ">=1.56.2,<2.0.0"
 grpcio = {version = ">=1.49.1,<2.0.0", optional = true, markers = "python_version >= \"3.11\" and extra == \"grpc\""}
 grpcio-status = {version = ">=1.49.1,<2.0.0", optional = true, markers = "python_version >= \"3.11\" and extra == \"grpc\""}
 proto-plus = [
-    {version = ">=1.22.3,<2.0.0"},
    {version = ">=1.25.0,<2.0.0", markers = "python_version >= \"3.13\""},
+    {version = ">=1.22.3,<2.0.0"},
 ]
 protobuf = ">=3.19.5,<3.20.0 || >3.20.0,<3.20.1 || >3.20.1,<4.21.0 || >4.21.0,<4.21.1 || >4.21.1,<4.21.2 || >4.21.2,<4.21.3 || >4.21.3,<4.21.4 || >4.21.4,<4.21.5 || >4.21.5,<7.0.0"
 requests = ">=2.18.0,<3.0.0"
@@ -3090,14 +3090,14 @@ grpcio-gcp = ["grpcio-gcp (>=0.2.2,<1.0.0)"]

 [[package]]
 name = "google-api-python-client"
-version = "2.172.0"
+version = "2.173.0"
 description = "Google API Client Library for Python"
 optional = false
 python-versions = ">=3.7"
 groups = ["main"]
 files = [
-    {file = "google_api_python_client-2.172.0-py3-none-any.whl", hash = "sha256:9f1b9a268d5dc1228207d246c673d3a09ee211b41a11521d38d9212aeaa43af7"},
-    {file = "google_api_python_client-2.172.0.tar.gz", hash = "sha256:dcb3b7e067154b2aa41f1776cf86584a5739c0ac74e6ff46fc665790dca0e6a6"},
+    {file = "google_api_python_client-2.173.0-py3-none-any.whl", hash = "sha256:16a8e81c772dd116f5c4ee47d83643149e1367dc8fb4f47cb471fbcb5c7d7ac7"},
+    {file = "google_api_python_client-2.173.0.tar.gz", hash = "sha256:b537bc689758f4be3e6f40d59a6c0cd305abafdea91af4bc66ec31d40c08c804"},
 ]

 [package.dependencies]
@@ -3295,8 +3295,8 @@ google-api-core = {version = ">=1.34.1,<2.0.dev0 || >=2.11.dev0,<3.0.0", extras
 google-auth = ">=2.14.1,<2.24.0 || >2.24.0,<2.25.0 || >2.25.0,<3.0.0"
 grpc-google-iam-v1 = ">=0.14.0,<1.0.0"
 proto-plus = [
-    {version = ">=1.22.3,<2.0.0"},
    {version = ">=1.25.0,<2.0.0", markers = "python_version >= \"3.13\""},
+    {version = ">=1.22.3,<2.0.0"},
 ]
 protobuf = ">=3.20.2,<4.21.0 || >4.21.0,<4.21.1 || >4.21.1,<4.21.2 || >4.21.2,<4.21.3 || >4.21.3,<4.21.4 || >4.21.4,<4.21.5 || >4.21.5,<7.0.0"

@@ -5073,18 +5073,18 @@ types-tqdm = "*"

 [[package]]
 name = "litellm"
-version = "1.72.6.post1"
+version = "1.72.7"
 description = "Library to easily interface with LLM API providers"
 optional = false
 python-versions = "!=2.7.*,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,!=3.6.*,!=3.7.*,>=3.8"
 groups = ["main"]
 files = [
-    {file = "litellm-1.72.6.post1-py3-none-any.whl", hash = "sha256:94abba480b3e1c9f05ed75f908e8fb0d5fe1bd5304868bd435f99fbdfcdb4c88"},
-    {file = "litellm-1.72.6.post1.tar.gz", hash = "sha256:cc03455bfae17c4226e70bc0e8ac7c889fc577b4832fbb6bc5bd8bc5a2ab02f9"},
+    {file = "litellm-1.72.7-py3-none-any.whl", hash = "sha256:704317ca71b00ca7fb164e8367e6559712605ed7f96f6d7fdb8b1276904a95e9"},
+    {file = "litellm-1.72.7.tar.gz", hash = "sha256:9bfdc156ebd8cb2fc869e0a388931b64468c4fd534df6a2f6e27f9cba2dafcb9"},
 ]

 [package.dependencies]
-aiohttp = "*"
+aiohttp = ">=3.10"
 click = "*"
 httpx = ">=0.23.0"
 importlib-metadata = ">=6.8.0"
@@ -5099,7 +5099,7 @@ tokenizers = "*"
 [package.extras]
 caching = ["diskcache (>=5.6.1,<6.0.0)"]
 extra-proxy = ["azure-identity (>=1.15.0,<2.0.0)", "azure-keyvault-secrets (>=4.8.0,<5.0.0)", "google-cloud-kms (>=2.21.3,<3.0.0)", "prisma (==0.11.0)", "redisvl (>=0.4.1,<0.5.0) ; python_version >= \"3.9\" and python_version < \"3.14\"", "resend (>=0.8.0,<0.9.0)"]
-proxy = ["PyJWT (>=2.8.0,<3.0.0)", "apscheduler (>=3.10.4,<4.0.0)", "backoff", "boto3 (==1.34.34)", "cryptography (>=43.0.1,<44.0.0)", "fastapi (>=0.115.5,<0.116.0)", "fastapi-sso (>=0.16.0,<0.17.0)", "gunicorn (>=23.0.0,<24.0.0)", "litellm-enterprise (==0.1.7)", "litellm-proxy-extras (==0.2.3)", "mcp (==1.9.3) ; python_version >= \"3.10\"", "orjson (>=3.9.7,<4.0.0)", "pynacl (>=1.5.0,<2.0.0)", "python-multipart (>=0.0.18,<0.0.19)", "pyyaml (>=6.0.1,<7.0.0)", "rich (==13.7.1)", "rq", "uvicorn (>=0.29.0,<0.30.0)", "uvloop (>=0.21.0,<0.22.0) ; sys_platform != \"win32\"", "websockets (>=13.1.0,<14.0.0)"]
+proxy = ["PyJWT (>=2.8.0,<3.0.0)", "apscheduler (>=3.10.4,<4.0.0)", "backoff", "boto3 (==1.34.34)", "cryptography (>=43.0.1,<44.0.0)", "fastapi (>=0.115.5,<0.116.0)", "fastapi-sso (>=0.16.0,<0.17.0)", "gunicorn (>=23.0.0,<24.0.0)", "litellm-enterprise (==0.1.7)", "litellm-proxy-extras (==0.2.5)", "mcp (==1.9.3) ; python_version >= \"3.10\"", "orjson (>=3.9.7,<4.0.0)", "pynacl (>=1.5.0,<2.0.0)", "python-multipart (>=0.0.18,<0.0.19)", "pyyaml (>=6.0.1,<7.0.0)", "rich (==13.7.1)", "rq", "uvicorn (>=0.29.0,<0.30.0)", "uvloop (>=0.21.0,<0.22.0) ; sys_platform != \"win32\"", "websockets (>=13.1.0,<14.0.0)"]
 utils = ["numpydoc"]

 [[package]]
@@ -6586,8 +6586,8 @@ files = [
 [package.dependencies]
 googleapis-common-protos = ">=1.52,<2.0"
 grpcio = [
-    {version = ">=1.63.2,<2.0.0", markers = "python_version < \"3.13\""},
    {version = ">=1.66.2,<2.0.0", markers = "python_version >= \"3.13\""},
+    {version = ">=1.63.2,<2.0.0", markers = "python_version < \"3.13\""},
 ]
 opentelemetry-api = ">=1.15,<2.0"
 opentelemetry-exporter-otlp-proto-common = "1.34.1"
@@ -9350,7 +9350,6 @@ files = [
    {file = "setuptools-80.9.0-py3-none-any.whl", hash = "sha256:062d34222ad13e0cc312a4c02d73f059e86a4acbfbdea8f8f76b28c99f306922"},
    {file = "setuptools-80.9.0.tar.gz", hash = "sha256:f36b47402ecde768dbfafc46e8e4207b4360c654f1f3bb84475f0a28628fb19c"},
 ]
-markers = {evaluation = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}

 [package.extras]
 check = ["pytest-checkdocs (>=2.4)", "pytest-ruff (>=0.2.1) ; sys_platform != \"cygwin\"", "ruff (>=0.8.0) ; sys_platform != \"cygwin\""]
@@ -9593,7 +9592,7 @@ description = "Standard library aifc redistribution. \"dead battery\"."
 optional = false
 python-versions = "*"
 groups = ["main"]
-markers = "python_version >= \"3.13\""
+markers = "python_version == \"3.13\""
 files = [
    {file = "standard_aifc-3.13.0-py3-none-any.whl", hash = "sha256:f7ae09cc57de1224a0dd8e3eb8f73830be7c3d0bc485de4c1f82b4a7f645ac66"},
    {file = "standard_aifc-3.13.0.tar.gz", hash = "sha256:64e249c7cb4b3daf2fdba4e95721f811bde8bdfc43ad9f936589b7bb2fae2e43"},
@@ -9610,7 +9609,7 @@ description = "Standard library chunk redistribution. \"dead battery\"."
 optional = false
 python-versions = "*"
 groups = ["main"]
-markers = "python_version >= \"3.13\""
+markers = "python_version == \"3.13\""
 files = [
    {file = "standard_chunk-3.13.0-py3-none-any.whl", hash = "sha256:17880a26c285189c644bd5bd8f8ed2bdb795d216e3293e6dbe55bbd848e2982c"},
    {file = "standard_chunk-3.13.0.tar.gz", hash = "sha256:4ac345d37d7e686d2755e01836b8d98eda0d1a3ee90375e597ae43aaf064d654"},
@@ -1,7 +1,6 @@
 """Tests for microagent loading in runtime."""

 import os
-import tempfile
 from pathlib import Path
 from unittest.mock import AsyncMock, MagicMock, patch

@@ -20,7 +19,7 @@ from openhands.microagent.microagent import (
    RepoMicroagent,
    TaskMicroagent,
 )
-from openhands.microagent.types import MicroagentType
+from openhands.microagent.types import InputMetadata, MicroagentType


 def _create_test_microagents(test_dir: str):
@@ -32,10 +31,6 @@ def _create_test_microagents(test_dir: str):
    knowledge_dir = microagents_dir / 'knowledge'
    knowledge_dir.mkdir(exist_ok=True)
    knowledge_agent = """---
-name: test_knowledge_agent
-type: knowledge
-version: 1.0.0
-agent: CodeActAgent
 triggers:
  - test
  - pytest
@@ -45,17 +40,10 @@ triggers:

 Testing best practices and guidelines.
 """
-    (knowledge_dir / 'knowledge.md').write_text(knowledge_agent)
+    (knowledge_dir / 'test_knowledge_agent.md').write_text(knowledge_agent)

    # Create test repo agent
-    repo_agent = """---
-name: test_repo_agent
-type: repo
-version: 1.0.0
-agent: CodeActAgent
----
-
-# Test Repository Agent
+    repo_agent = """# Test Repository Agent

 Repository-specific test instructions.
 """
@@ -89,7 +77,7 @@ def test_load_microagents_with_trailing_slashes(
        # Check knowledge agents
        assert len(knowledge_agents) == 1
        agent = knowledge_agents[0]
-        assert agent.name == 'knowledge/knowledge'
+        assert agent.name == 'knowledge/test_knowledge_agent'
        assert 'test' in agent.triggers
        assert 'pytest' in agent.triggers

@@ -126,7 +114,7 @@ def test_load_microagents_with_selected_repo(temp_dir, runtime_cls, run_as_openh
        # Check knowledge agents
        assert len(knowledge_agents) == 1
        agent = knowledge_agents[0]
-        assert agent.name == 'knowledge/knowledge'
+        assert agent.name == 'knowledge/test_knowledge_agent'
        assert 'test' in agent.triggers
        assert 'pytest' in agent.triggers

@@ -180,7 +168,7 @@ Repository-specific test instructions.
        _close_test_runtime(runtime)


-def test_task_microagent_creation():
+def test_task_microagent_creation(temp_dir):
    """Test that a TaskMicroagent is created correctly."""
    content = """---
 name: test_task
@@ -196,21 +184,43 @@ inputs:

 This is a test task microagent with a variable: ${test_var}.
 """
+    with open(os.path.join(temp_dir, 'test_task.md'), 'w') as f:
+        f.write(content)

-    with tempfile.NamedTemporaryFile(suffix='.md') as f:
-        f.write(content.encode())
-        f.flush()
+    agent = BaseMicroagent.load(os.path.join(temp_dir, 'test_task.md'))

-        agent = BaseMicroagent.load(f.name)
+    assert isinstance(agent, TaskMicroagent)
+    assert agent.type == MicroagentType.TASK
+    assert agent.name == 'test_task'
+    assert '/test_task' in agent.triggers
+    assert "If the user didn't provide any of these variables" in agent.content
+    assert agent.inputs == [InputMetadata(name='TEST_VAR', description='Test variable')]
+    simplified_content = """---
+triggers:
+- /test_task
+inputs:
+- name: TEST_VAR
+  description: "Test variable"
+---

-        assert isinstance(agent, TaskMicroagent)
-        assert agent.type == MicroagentType.TASK
-        assert agent.name == 'test_task'
-        assert '/test_task' in agent.triggers
-        assert "If the user didn't provide any of these variables" in agent.content
+This is a test task microagent with a variable: ${test_var}.
+"""
+
+    with open(os.path.join(temp_dir, 'test_task.md'), 'w') as f:
+        f.write(simplified_content)
+
+    simplified_agent = BaseMicroagent.load(os.path.join(temp_dir, 'test_task.md'))
+
+    assert isinstance(simplified_agent, TaskMicroagent)
+    assert simplified_agent.type == MicroagentType.TASK
+    assert simplified_agent.name == 'test_task'
+    assert '/test_task' in simplified_agent.triggers
+    assert (
+        "If the user didn't provide any of these variables" in simplified_agent.content
+    )


-def test_task_microagent_variable_extraction():
+def test_task_microagent_variable_extraction(temp_dir):
    """Test that variables are correctly extracted from the content."""
    content = """---
 name: test_task
@@ -227,19 +237,18 @@ inputs:
 This is a test with variables: ${var1}, ${var2}, and ${var3}.
 """

-    with tempfile.NamedTemporaryFile(suffix='.md') as f:
-        f.write(content.encode())
-        f.flush()
+    with open(os.path.join(temp_dir, 'test_task.md'), 'w') as f:
+        f.write(content)

-        agent = BaseMicroagent.load(f.name)
+    agent = BaseMicroagent.load(os.path.join(temp_dir, 'test_task.md'))

-        assert isinstance(agent, TaskMicroagent)
-        variables = agent.extract_variables(agent.content)
-        assert set(variables) == {'var1', 'var2', 'var3'}
-        assert agent.requires_user_input()
+    assert isinstance(agent, TaskMicroagent)
+    variables = agent.extract_variables(agent.content)
+    assert set(variables) == {'var1', 'var2', 'var3'}
+    assert agent.requires_user_input()


-def test_knowledge_microagent_no_prompt():
+def test_knowledge_microagent_no_prompt(temp_dir):
    """Test that a regular KnowledgeMicroagent doesn't get the prompt."""
    content = """---
 name: test_knowledge
@@ -252,19 +261,17 @@ triggers:

 This is a test knowledge microagent.
 """
+    with open(os.path.join(temp_dir, 'test_knowledge.md'), 'w') as f:
+        f.write(content)

-    with tempfile.NamedTemporaryFile(suffix='.md') as f:
-        f.write(content.encode())
-        f.flush()
+    agent = BaseMicroagent.load(os.path.join(temp_dir, 'test_knowledge.md'))

-        agent = BaseMicroagent.load(f.name)
-
-        assert isinstance(agent, KnowledgeMicroagent)
-        assert agent.type == MicroagentType.KNOWLEDGE
-        assert "If the user didn't provide any of these variables" not in agent.content
+    assert isinstance(agent, KnowledgeMicroagent)
+    assert agent.type == MicroagentType.KNOWLEDGE
+    assert "If the user didn't provide any of these variables" not in agent.content


-def test_task_microagent_trigger_addition():
+def test_task_microagent_trigger_addition(temp_dir):
    """Test that a trigger is added if not present."""
    content = """---
 name: test_task
@@ -278,18 +285,16 @@ inputs:

 This is a test task microagent.
 """
+    with open(os.path.join(temp_dir, 'test_task.md'), 'w') as f:
+        f.write(content)

-    with tempfile.NamedTemporaryFile(suffix='.md') as f:
-        f.write(content.encode())
-        f.flush()
+    agent = BaseMicroagent.load(os.path.join(temp_dir, 'test_task.md'))

-        agent = BaseMicroagent.load(f.name)
-
-        assert isinstance(agent, TaskMicroagent)
-        assert '/test_task' in agent.triggers
+    assert isinstance(agent, TaskMicroagent)
+    assert '/test_task' in agent.triggers


-def test_task_microagent_no_duplicate_trigger():
+def test_task_microagent_no_duplicate_trigger(temp_dir):
    """Test that a trigger is not duplicated if already present."""
    content = """---
 name: test_task
@@ -306,21 +311,19 @@ inputs:

 This is a test task microagent.
 """
+    with open(os.path.join(temp_dir, 'test_task.md'), 'w') as f:
+        f.write(content)

-    with tempfile.NamedTemporaryFile(suffix='.md') as f:
-        f.write(content.encode())
-        f.flush()
+    agent = BaseMicroagent.load(os.path.join(temp_dir, 'test_task.md'))

-        agent = BaseMicroagent.load(f.name)
-
-        assert isinstance(agent, TaskMicroagent)
-        assert agent.triggers.count('/test_task') == 1  # No duplicates
-        assert len(agent.triggers) == 2
-        assert 'another_trigger' in agent.triggers
-        assert '/test_task' in agent.triggers
+    assert isinstance(agent, TaskMicroagent)
+    assert agent.triggers.count('/test_task') == 1  # No duplicates
+    assert len(agent.triggers) == 2
+    assert 'another_trigger' in agent.triggers
+    assert '/test_task' in agent.triggers


-def test_task_microagent_match_trigger():
+def test_task_microagent_match_trigger(temp_dir):
    """Test that a task microagent matches its trigger correctly."""
    content = """---
 name: test_task
@@ -337,17 +340,16 @@ inputs:
 This is a test task microagent.
 """

-    with tempfile.NamedTemporaryFile(suffix='.md') as f:
-        f.write(content.encode())
-        f.flush()
+    with open(os.path.join(temp_dir, 'test_task.md'), 'w') as f:
+        f.write(content)

-        agent = BaseMicroagent.load(f.name)
+    agent = BaseMicroagent.load(os.path.join(temp_dir, 'test_task.md'))

-        assert isinstance(agent, TaskMicroagent)
-        assert agent.match_trigger('/test_task') == '/test_task'
-        assert agent.match_trigger('  /test_task  ') == '/test_task'
-        assert agent.match_trigger('This contains /test_task') == '/test_task'
-        assert agent.match_trigger('/other_task') is None
+    assert isinstance(agent, TaskMicroagent)
+    assert agent.match_trigger('/test_task') == '/test_task'
+    assert agent.match_trigger('  /test_task  ') == '/test_task'
+    assert agent.match_trigger('This contains /test_task') == '/test_task'
+    assert agent.match_trigger('/other_task') is None


 def test_default_tools_microagent_exists():
@@ -369,15 +371,12 @@ def test_default_tools_microagent_exists():
    with open(default_tools_path, 'r') as f:
        content = f.read()

-    # Verify it's a repo microagent (always activated)
-    assert 'type: repo' in content, 'default-tools.md should be a repo microagent'
+    assert 'command: uvx' in content, 'default-tools.md should use uvx command'
+    assert 'mcp-server-fetch' in content, 'default-tools.md should use mcp-server-fetch'

-    # Verify it has the fetch tool configured
-    assert 'name: "fetch"' in content, 'default-tools.md should have a fetch tool'
-    assert 'command: "uvx"' in content, 'default-tools.md should use uvx command'
-    assert 'args: ["mcp-server-fetch"]' in content, (
-        'default-tools.md should use mcp-server-fetch'
-    )
+    agent = BaseMicroagent.load(default_tools_path)
+
+    assert isinstance(agent, RepoMicroagent)


@pytest.mark.asyncio
@@ -1682,3 +1682,138 @@ async def test_openrouter_context_window_exceeded_error(
    )

    await controller.close()
+
+
+@pytest.mark.asyncio
+async def test_sambanova_context_window_exceeded_error(
+    mock_agent, test_event_stream, mock_status_callback
+):
+    """Test that SambaNova context window exceeded errors are properly detected and handled."""
+    max_iterations = 5
+    error_after = 2
+
+    class StepState:
+        def __init__(self):
+            self.has_errored = False
+            self.index = 0
+            self.views = []
+
+        def step(self, state: State):
+            # Store the view for later inspection
+            self.views.append(state.view)
+            # only throw it once.
+            if self.index < error_after or self.has_errored:
+                self.index += 1
+                return MessageAction(content=f'Test message {self.index}')
+
+            # Create a BadRequestError with the SambaNova context window exceeded message pattern
+            error = BadRequestError(
+                message='litellm.BadRequestError: SambanovaException - The maximum context length of DeepSeek-V3-0324 is 32768. However, answering your request will take 39732 tokens. Please reduce the length of the messages or the specified max_completion_tokens value.',
+                model='sambanova/deepseek-v3-0324',
+                llm_provider='sambanova',
+            )
+            self.has_errored = True
+            raise error
+
+    step_state = StepState()
+    mock_agent.step = step_state.step
+    mock_agent.config = AgentConfig(enable_history_truncation=True)
+
+    controller = AgentController(
+        agent=mock_agent,
+        event_stream=test_event_stream,
+        iteration_delta=max_iterations,
+        sid='test',
+        confirmation_mode=False,
+        headless_mode=True,
+        status_callback=mock_status_callback,
+    )
+
+    # Set the agent state to RUNNING
+    controller.state.agent_state = AgentState.RUNNING
+
+    # Run the controller until it hits the error
+    for _ in range(error_after + 2):  # +2 to ensure we go past the error
+        await controller._step()
+        if step_state.has_errored:
+            break
+
+    # Verify that the error was handled as a context window exceeded error
+    # by checking that _handle_long_context_error was called (which adds a CondensationAction)
+    events = list(test_event_stream.get_events())
+    condensation_actions = [e for e in events if isinstance(e, CondensationAction)]
+
+    # There should be at least one CondensationAction if the error was handled correctly
+    assert len(condensation_actions) > 0, (
+        'SambaNova context window exceeded error was not handled correctly'
+    )
+
+    await controller.close()
+
+
+@pytest.mark.asyncio
+async def test_sambanova_generic_exception_not_handled_as_context_error(
+    mock_agent, test_event_stream, mock_status_callback
+):
+    """Test that generic SambaNova exceptions (without context length pattern) are NOT handled as context window errors."""
+    max_iterations = 5
+    error_after = 2
+
+    class StepState:
+        def __init__(self):
+            self.has_errored = False
+            self.index = 0
+            self.views = []
+
+        def step(self, state: State):
+            # Store the view for later inspection
+            self.views.append(state.view)
+            # only throw it once.
+            if self.index < error_after or self.has_errored:
+                self.index += 1
+                return MessageAction(content=f'Test message {self.index}')
+
+            # Create a BadRequestError with a generic SambaNova error (no context length pattern)
+            error = BadRequestError(
+                message='litellm.BadRequestError: SambanovaException - Some other error occurred',
+                model='sambanova/deepseek-v3-0324',
+                llm_provider='sambanova',
+            )
+            self.has_errored = True
+            raise error
+
+    step_state = StepState()
+    mock_agent.step = step_state.step
+    mock_agent.config = AgentConfig(enable_history_truncation=True)
+
+    controller = AgentController(
+        agent=mock_agent,
+        event_stream=test_event_stream,
+        iteration_delta=max_iterations,
+        sid='test',
+        confirmation_mode=False,
+        headless_mode=True,
+        status_callback=mock_status_callback,
+    )
+
+    # Set the agent state to RUNNING
+    controller.state.agent_state = AgentState.RUNNING
+
+    # Run the controller until it hits the error
+    with pytest.raises(BadRequestError):
+        for _ in range(error_after + 2):  # +2 to ensure we go past the error
+            await controller._step()
+            if step_state.has_errored:
+                break
+
+    # Verify that the error was NOT handled as a context window exceeded error
+    # by checking that _handle_long_context_error was NOT called (no CondensationAction should be added)
+    events = list(test_event_stream.get_events())
+    condensation_actions = [e for e in events if isinstance(e, CondensationAction)]
+
+    # There should be NO CondensationAction if the error was correctly NOT handled as context window error
+    assert len(condensation_actions) == 0, (
+        'Generic SambaNova exception was incorrectly handled as context window error'
+    )
+
+    await controller.close()
@@ -50,6 +50,7 @@ class TestHandleCommands:
        )

        mock_handle_exit.assert_called_once_with(
+            mock_dependencies['config'],
            mock_dependencies['event_stream'],
            mock_dependencies['usage_metrics'],
            mock_dependencies['sid'],
@@ -116,6 +117,7 @@ class TestHandleCommands:
        )

        mock_handle_new.assert_called_once_with(
+            mock_dependencies['config'],
            mock_dependencies['event_stream'],
            mock_dependencies['usage_metrics'],
            mock_dependencies['sid'],
@@ -166,6 +168,7 @@ class TestHandleExitCommand:
    @patch('openhands.cli.commands.cli_confirm')
    @patch('openhands.cli.commands.display_shutdown_message')
    def test_exit_with_confirmation(self, mock_display_shutdown, mock_cli_confirm):
+        config = MagicMock(spec=OpenHandsConfig)
        event_stream = MagicMock(spec=EventStream)
        usage_metrics = MagicMock(spec=UsageMetrics)
        sid = 'test-session-id'
@@ -174,7 +177,7 @@ class TestHandleExitCommand:
        mock_cli_confirm.return_value = 0  # First option, which is "Yes, proceed"

        # Call the function under test
-        result = handle_exit_command(event_stream, usage_metrics, sid)
+        result = handle_exit_command(config, event_stream, usage_metrics, sid)

        # Verify correct behavior
        mock_cli_confirm.assert_called_once()
@@ -191,6 +194,7 @@ class TestHandleExitCommand:
    @patch('openhands.cli.commands.cli_confirm')
    @patch('openhands.cli.commands.display_shutdown_message')
    def test_exit_without_confirmation(self, mock_display_shutdown, mock_cli_confirm):
+        config = MagicMock(spec=OpenHandsConfig)
        event_stream = MagicMock(spec=EventStream)
        usage_metrics = MagicMock(spec=UsageMetrics)
        sid = 'test-session-id'
@@ -199,7 +203,7 @@ class TestHandleExitCommand:
        mock_cli_confirm.return_value = 1  # Second option, which is "No, dismiss"

        # Call the function under test
-        result = handle_exit_command(event_stream, usage_metrics, sid)
+        result = handle_exit_command(config, event_stream, usage_metrics, sid)

        # Verify correct behavior
        mock_cli_confirm.assert_called_once()
@@ -230,6 +234,7 @@ class TestHandleNewCommand:
    @patch('openhands.cli.commands.cli_confirm')
    @patch('openhands.cli.commands.display_shutdown_message')
    def test_new_with_confirmation(self, mock_display_shutdown, mock_cli_confirm):
+        config = MagicMock(spec=OpenHandsConfig)
        event_stream = MagicMock(spec=EventStream)
        usage_metrics = MagicMock(spec=UsageMetrics)
        sid = 'test-session-id'
@@ -238,7 +243,9 @@ class TestHandleNewCommand:
        mock_cli_confirm.return_value = 0  # First option, which is "Yes, proceed"

        # Call the function under test
-        close_repl, new_session = handle_new_command(event_stream, usage_metrics, sid)
+        close_repl, new_session = handle_new_command(
+            config, event_stream, usage_metrics, sid
+        )

        # Verify correct behavior
        mock_cli_confirm.assert_called_once()
@@ -256,6 +263,7 @@ class TestHandleNewCommand:
    @patch('openhands.cli.commands.cli_confirm')
    @patch('openhands.cli.commands.display_shutdown_message')
    def test_new_without_confirmation(self, mock_display_shutdown, mock_cli_confirm):
+        config = MagicMock(spec=OpenHandsConfig)
        event_stream = MagicMock(spec=EventStream)
        usage_metrics = MagicMock(spec=UsageMetrics)
        sid = 'test-session-id'
@@ -264,7 +272,9 @@ class TestHandleNewCommand:
        mock_cli_confirm.return_value = 1  # Second option, which is "No, dismiss"

        # Call the function under test
-        close_repl, new_session = handle_new_command(event_stream, usage_metrics, sid)
+        close_repl, new_session = handle_new_command(
+            config, event_stream, usage_metrics, sid
+        )

        # Verify correct behavior
        mock_cli_confirm.assert_called_once()
@@ -292,7 +302,7 @@ class TestHandleInitCommand:
        )

        # Verify correct behavior
-        mock_init_repository.assert_called_once_with(current_dir)
+        mock_init_repository.assert_called_once_with(config, current_dir)
        event_stream.add_event.assert_called_once()
        # Check event is the right type
        args, kwargs = event_stream.add_event.call_args
@@ -320,7 +330,7 @@ class TestHandleInitCommand:
        )

        # Verify correct behavior
-        mock_init_repository.assert_called_once_with(current_dir)
+        mock_init_repository.assert_called_once_with(config, current_dir)
        event_stream.add_event.assert_not_called()

        assert close_repl is False
@@ -171,16 +171,17 @@ class TestModifyLLMSettingsBasic:
        session_instance = MagicMock()
        session_instance.prompt_async = AsyncMock(
            side_effect=[
-                'openai',  # Provider
                'gpt-4',  # Model
                'new-api-key',  # API Key
            ]
        )
        mock_session.return_value = session_instance

-        # Mock cli_confirm to select the second option (change provider/model) for the first two calls
-        # and then select the first option (save settings) for the last call
-        mock_confirm.side_effect = [1, 1, 0]
+        # Mock cli_confirm to:
+        # 1. Select the first provider (openai) from the list
+        # 2. Select "Select another model" option
+        # 3. Select "Yes, save" option
+        mock_confirm.side_effect = [0, 1, 0]

        # Call the function
        await modify_llm_settings_basic(app_config, settings_store)
@@ -189,8 +190,8 @@ class TestModifyLLMSettingsBasic:
        app_config.set_llm_config.assert_called_once()
        args, kwargs = app_config.set_llm_config.call_args
        # The model name might be different based on the default model in the list
-        # Just check that it starts with 'openai/'
-        assert args[0].model.startswith('openai/')
+        # Just check that it contains 'gpt-4' instead of checking for prefix
+        assert 'gpt-4' in args[0].model
        assert args[0].api_key.get_secret_value() == 'new-api-key'
        assert args[0].base_url is None

@@ -199,8 +200,8 @@ class TestModifyLLMSettingsBasic:
        args, kwargs = settings_store.store.call_args
        settings = args[0]
        # The model name might be different based on the default model in the list
-        # Just check that it starts with openai/
-        assert settings.llm_model.startswith('openai/')
+        # Just check that it contains 'gpt-4' instead of checking for prefix
+        assert 'gpt-4' in settings.llm_model
        assert settings.llm_api_key.get_secret_value() == 'new-api-key'
        assert settings.llm_base_url is None

@@ -249,7 +250,7 @@ class TestModifyLLMSettingsBasic:
        'openhands.cli.settings.LLMSummarizingCondenserConfig',
        MockLLMSummarizingCondenserConfig,
    )
-    async def test_modify_llm_settings_basic_invalid_input(
+    async def test_modify_llm_settings_basic_invalid_provider_input(
        self,
        mock_print,
        mock_confirm,
@@ -270,8 +271,7 @@ class TestModifyLLMSettingsBasic:
            side_effect=[
                'invalid-provider',  # First invalid provider
                'openai',  # Valid provider
-                'invalid-model',  # Invalid model
-                'gpt-4',  # Valid model
+                'custom-model',  # Custom model (now allowed with warning)
                'new-api-key',  # API key
            ]
        )
@@ -284,34 +284,32 @@ class TestModifyLLMSettingsBasic:
        # Call the function
        await modify_llm_settings_basic(app_config, settings_store)

-        # Verify error messages were shown for invalid inputs
-        assert (
-            mock_print.call_count >= 2
-        )  # At least two error messages should be printed
+        # Verify error message was shown for invalid provider and warning for custom model
+        assert mock_print.call_count >= 2  # At least two messages should be printed

-        # Check for invalid provider error
+        # Check for invalid provider error and custom model warning
        provider_error_found = False
-        model_error_found = False
+        model_warning_found = False

        for call in mock_print.call_args_list:
            args, _ = call
            if args and isinstance(args[0], HTML):
                if 'Invalid provider selected' in args[0].value:
                    provider_error_found = True
-                if 'Invalid model selected' in args[0].value:
-                    model_error_found = True
+                if 'Warning:' in args[0].value and 'custom-model' in args[0].value:
+                    model_warning_found = True

        assert provider_error_found, 'No error message for invalid provider'
-        assert model_error_found, 'No error message for invalid model'
+        assert model_warning_found, 'No warning message for custom model'

-        # Verify LLM config was updated with correct values
+        # Verify LLM config was updated with the custom model
        app_config.set_llm_config.assert_called_once()

-        # Verify settings were saved
+        # Verify settings were saved with the custom model
        settings_store.store.assert_called_once()
        args, kwargs = settings_store.store.call_args
        settings = args[0]
-        assert settings.llm_model == 'openai/gpt-4'
+        assert 'custom-model' in settings.llm_model
        assert settings.llm_api_key.get_secret_value() == 'new-api-key'
        assert settings.llm_base_url is None

@@ -85,12 +85,31 @@ class TestDisplayFunctions:
    @patch('openhands.cli.tui.display_command')
    def test_display_event_cmd_action(self, mock_display_command):
        config = MagicMock(spec=OpenHandsConfig)
+        # Test that commands awaiting confirmation are displayed
        cmd_action = CmdRunAction(command='echo test')
+        cmd_action.confirmation_state = ActionConfirmationStatus.AWAITING_CONFIRMATION

        display_event(cmd_action, config)

        mock_display_command.assert_called_once_with(cmd_action)

+    @patch('openhands.cli.tui.display_command')
+    @patch('openhands.cli.tui.initialize_streaming_output')
+    def test_display_event_cmd_action_confirmed(
+        self, mock_init_streaming, mock_display_command
+    ):
+        config = MagicMock(spec=OpenHandsConfig)
+        # Test that confirmed commands don't display the command but do initialize streaming
+        cmd_action = CmdRunAction(command='echo test')
+        cmd_action.confirmation_state = ActionConfirmationStatus.CONFIRMED
+
+        display_event(cmd_action, config)
+
+        # Command should not be displayed (since it was already shown when awaiting confirmation)
+        mock_display_command.assert_not_called()
+        # But streaming should be initialized
+        mock_init_streaming.assert_called_once()
+
    @patch('openhands.cli.tui.display_command_output')
    def test_display_event_cmd_output(self, mock_display_output):
        config = MagicMock(spec=OpenHandsConfig)
@@ -262,7 +281,7 @@ class TestReadConfirmationInput:
        mock_session.prompt_async.return_value = 'y'
        mock_create_session.return_value = mock_session

-        result = await read_confirmation_input()
+        result = await read_confirmation_input(config=MagicMock(spec=OpenHandsConfig))
        assert result == 'yes'

    @pytest.mark.asyncio
@@ -272,7 +291,7 @@ class TestReadConfirmationInput:
        mock_session.prompt_async.return_value = 'yes'
        mock_create_session.return_value = mock_session

-        result = await read_confirmation_input()
+        result = await read_confirmation_input(config=MagicMock(spec=OpenHandsConfig))
        assert result == 'yes'

    @pytest.mark.asyncio
@@ -282,7 +301,7 @@ class TestReadConfirmationInput:
        mock_session.prompt_async.return_value = 'n'
        mock_create_session.return_value = mock_session

-        result = await read_confirmation_input()
+        result = await read_confirmation_input(config=MagicMock(spec=OpenHandsConfig))
        assert result == 'no'

    @pytest.mark.asyncio
@@ -292,7 +311,7 @@ class TestReadConfirmationInput:
        mock_session.prompt_async.return_value = 'no'
        mock_create_session.return_value = mock_session

-        result = await read_confirmation_input()
+        result = await read_confirmation_input(config=MagicMock(spec=OpenHandsConfig))
        assert result == 'no'

    @pytest.mark.asyncio
@@ -302,7 +321,7 @@ class TestReadConfirmationInput:
        mock_session.prompt_async.return_value = 'a'
        mock_create_session.return_value = mock_session

-        result = await read_confirmation_input()
+        result = await read_confirmation_input(config=MagicMock(spec=OpenHandsConfig))
        assert result == 'always'

    @pytest.mark.asyncio
@@ -312,7 +331,7 @@ class TestReadConfirmationInput:
        mock_session.prompt_async.return_value = 'always'
        mock_create_session.return_value = mock_session

-        result = await read_confirmation_input()
+        result = await read_confirmation_input(config=MagicMock(spec=OpenHandsConfig))
        assert result == 'always'

    @pytest.mark.asyncio
@@ -322,7 +341,7 @@ class TestReadConfirmationInput:
        mock_session.prompt_async.return_value = 'invalid'
        mock_create_session.return_value = mock_session

-        result = await read_confirmation_input()
+        result = await read_confirmation_input(config=MagicMock(spec=OpenHandsConfig))
        assert result == 'no'

    @pytest.mark.asyncio
@@ -332,7 +351,7 @@ class TestReadConfirmationInput:
        mock_session.prompt_async.return_value = ''
        mock_create_session.return_value = mock_session

-        result = await read_confirmation_input()
+        result = await read_confirmation_input(config=MagicMock(spec=OpenHandsConfig))
        assert result == 'no'

    @pytest.mark.asyncio
@@ -342,7 +361,7 @@ class TestReadConfirmationInput:
        mock_session.prompt_async.return_value = None
        mock_create_session.return_value = mock_session

-        result = await read_confirmation_input()
+        result = await read_confirmation_input(config=MagicMock(spec=OpenHandsConfig))
        assert result == 'no'

    @pytest.mark.asyncio
@@ -354,7 +373,7 @@ class TestReadConfirmationInput:
        mock_session.prompt_async.side_effect = KeyboardInterrupt
        mock_create_session.return_value = mock_session

-        result = await read_confirmation_input()
+        result = await read_confirmation_input(config=MagicMock(spec=OpenHandsConfig))
        assert result == 'no'

    @pytest.mark.asyncio
@@ -364,5 +383,5 @@ class TestReadConfirmationInput:
        mock_session.prompt_async.side_effect = EOFError
        mock_create_session.return_value = mock_session

-        result = await read_confirmation_input()
+        result = await read_confirmation_input(config=MagicMock(spec=OpenHandsConfig))
        assert result == 'no'
@@ -0,0 +1,89 @@
+import os
+from unittest.mock import ANY, MagicMock, patch
+
+from openhands.core.config import CLIConfig, OpenHandsConfig
+
+
+class TestCliViMode:
+    """Test the VI mode feature."""
+
+    @patch('openhands.cli.tui.PromptSession')
+    def test_create_prompt_session_vi_mode_enabled(self, mock_prompt_session):
+        """Test that vi_mode can be enabled."""
+        from openhands.cli.tui import create_prompt_session
+
+        config = OpenHandsConfig(cli=CLIConfig(vi_mode=True))
+        create_prompt_session(config)
+        mock_prompt_session.assert_called_with(
+            style=ANY,
+            vi_mode=True,
+        )
+
+    @patch('openhands.cli.tui.PromptSession')
+    def test_create_prompt_session_vi_mode_disabled(self, mock_prompt_session):
+        """Test that vi_mode can be explicitly disabled."""
+        from openhands.cli.tui import create_prompt_session
+
+        config = OpenHandsConfig(cli=CLIConfig(vi_mode=False))
+        create_prompt_session(config)
+        mock_prompt_session.assert_called_with(
+            style=ANY,
+            vi_mode=False,
+        )
+
+    @patch('openhands.cli.tui.Application')
+    def test_cli_confirm_vi_keybindings_are_added(self, mock_app_class):
+        """Test that vi keybindings are added to the KeyBindings object."""
+        from openhands.cli.tui import cli_confirm
+
+        config = OpenHandsConfig(cli=CLIConfig(vi_mode=True))
+        with patch('openhands.cli.tui.KeyBindings', MagicMock()) as mock_key_bindings:
+            cli_confirm(
+                config, 'Test question', choices=['Choice 1', 'Choice 2', 'Choice 3']
+            )
+            # here we are checking if the key bindings are being created
+            assert mock_key_bindings.call_count == 1
+
+            # then we check that the key bindings are being added
+            mock_kb_instance = mock_key_bindings.return_value
+            assert mock_kb_instance.add.call_count > 0
+
+    @patch('openhands.cli.tui.Application')
+    def test_cli_confirm_vi_keybindings_are_not_added(self, mock_app_class):
+        """Test that vi keybindings are not added when vi_mode is False."""
+        from openhands.cli.tui import cli_confirm
+
+        config = OpenHandsConfig(cli=CLIConfig(vi_mode=False))
+        with patch('openhands.cli.tui.KeyBindings', MagicMock()) as mock_key_bindings:
+            cli_confirm(
+                config, 'Test question', choices=['Choice 1', 'Choice 2', 'Choice 3']
+            )
+            # here we are checking if the key bindings are being created
+            assert mock_key_bindings.call_count == 1
+
+            # then we grab the KeyBindings instance to inspect what was registered
+            mock_kb_instance = mock_key_bindings.return_value
+
+            # and here we check that the vi key bindings are not being added
+            for call in mock_kb_instance.add.call_args_list:
+                assert call[0][0] not in ('j', 'k')
+
+    @patch.dict(os.environ, {}, clear=True)
+    def test_vi_mode_disabled_by_default(self):
+        """Test that vi_mode is disabled by default when no env var is set."""
+        from openhands.core.config.utils import load_from_env
+
+        config = OpenHandsConfig()
+        load_from_env(config, os.environ)
+        assert config.cli.vi_mode is False, 'vi_mode should be False by default'
+
+    @patch.dict(os.environ, {'CLI_VI_MODE': 'True'})
+    def test_vi_mode_enabled_from_env(self):
+        """Test that vi_mode can be enabled from an environment variable."""
+        from openhands.core.config.utils import load_from_env
+
+        config = OpenHandsConfig()
+        load_from_env(config, os.environ)
+        assert config.cli.vi_mode is True, (
+            'vi_mode should be True when CLI_VI_MODE is set'
+        )
@@ -1211,3 +1211,39 @@ def test_agent_config_from_toml_section_with_invalid_base():
    assert 'CustomAgent' in result
    assert result['CustomAgent'].enable_browsing is False
    assert result['CustomAgent'].enable_jupyter is True
+
+
+def test_agent_config_system_prompt_filename_default():
+    """Test that AgentConfig defaults to 'system_prompt.j2' for system_prompt_filename."""
+    config = AgentConfig()
+    assert config.system_prompt_filename == 'system_prompt.j2'
+
+
+def test_agent_config_system_prompt_filename_toml_integration(
+    default_config, temp_toml_file
+):
+    """Test that system_prompt_filename is correctly loaded from TOML configuration."""
+    with open(temp_toml_file, 'w', encoding='utf-8') as toml_file:
+        toml_file.write(
+            """
+[agent]
+enable_browsing = true
+system_prompt_filename = "custom_prompt.j2"
+
+[agent.CodeReviewAgent]
+system_prompt_filename = "code_review_prompt.j2"
+enable_browsing = false
+"""
+        )
+
+    load_from_toml(default_config, temp_toml_file)
+
+    # Check default agent config
+    default_agent_config = default_config.get_agent_config()
+    assert default_agent_config.system_prompt_filename == 'custom_prompt.j2'
+    assert default_agent_config.enable_browsing is True
+
+    # Check custom agent config
+    custom_agent_config = default_config.get_agent_config('CodeReviewAgent')
+    assert custom_agent_config.system_prompt_filename == 'code_review_prompt.j2'
+    assert custom_agent_config.enable_browsing is False
@@ -132,7 +132,7 @@ def test_llm_init_with_model_info(mock_get_model_info, default_config):
    llm = LLM(default_config)
    llm.init_model_info()
    assert llm.config.max_input_tokens == 8000
-    assert llm.config.max_output_tokens == 2000
+    assert llm.config.max_output_tokens is None


@patch('openhands.llm.llm.litellm.get_model_info')
@@ -141,7 +141,7 @@ def test_llm_init_without_model_info(mock_get_model_info, default_config):
    llm = LLM(default_config)
    llm.init_model_info()
    assert llm.config.max_input_tokens == 4096
-    assert llm.config.max_output_tokens == 4096
+    assert llm.config.max_output_tokens is None


 def test_llm_init_with_custom_config():
@@ -260,7 +260,7 @@ def test_llm_init_with_openrouter_model(mock_get_model_info, default_config):
    llm = LLM(default_config)
    llm.init_model_info()
    assert llm.config.max_input_tokens == 7000
-    assert llm.config.max_output_tokens == 1500
+    assert llm.config.max_output_tokens is None
    mock_get_model_info.assert_called_once_with('openrouter:gpt-4o-mini')


@@ -7,7 +7,7 @@ from openhands.microagent.types import MicroagentType
 def test_load_markdown_without_frontmatter():
    """Test loading a markdown file without frontmatter."""
    content = '# Test Content\nThis is a test markdown file without frontmatter.'
-    path = Path('test.md')
+    path = Path('default.md')

    # Load the agent from content using keyword argument
    agent = BaseMicroagent.load(path=path, file_content=content)
@@ -26,7 +26,7 @@ def test_load_markdown_with_empty_frontmatter():
    content = (
        '---\n---\n# Test Content\nThis is a test markdown file with empty frontmatter.'
    )
-    path = Path('test.md')
+    path = Path('default.md')

    # Load the agent from content using keyword argument
    agent = BaseMicroagent.load(path=path, file_content=content)
@@ -50,12 +50,12 @@ name: custom_name
 ---
 # Test Content
 This is a test markdown file with partial frontmatter."""
-    path = Path('test.md')
+    path = Path('custom_name.md')

    # Load the agent from content using keyword argument
    agent = BaseMicroagent.load(path=path, file_content=content)

-    # Verify it uses provided name but default values for other fields
+    # Verify it uses filename instead of provided name (filename takes precedence)
    assert isinstance(agent, RepoMicroagent)
    assert agent.name == 'custom_name'
    assert (
@@ -77,12 +77,12 @@ version: 2.0.0
 ---
 # Test Content
 This is a test markdown file with full frontmatter."""
-    path = Path('test.md')
+    path = Path('test_agent.md')

    # Load the agent from content using keyword argument
    agent = BaseMicroagent.load(path=path, file_content=content)

-    # Verify all provided values are used
+    # Verify filename is used for name but other metadata values are preserved
    assert isinstance(agent, RepoMicroagent)
    assert agent.name == 'test_agent'
    assert (
@@ -269,3 +269,39 @@ def test_prompt_manager_initialization_error():
    """Test that PromptManager raises an error if the prompt directory is not set."""
    with pytest.raises(ValueError, match='Prompt directory is not set'):
        PromptManager(None)
+
+
+def test_prompt_manager_custom_system_prompt_filename(prompt_dir):
+    """Test that PromptManager can use a custom system prompt filename."""
+    # Create a custom system prompt file
+    with open(os.path.join(prompt_dir, 'custom_system.j2'), 'w') as f:
+        f.write('Custom system prompt: {{ custom_var }}')
+
+    # Create default system prompt
+    with open(os.path.join(prompt_dir, 'system_prompt.j2'), 'w') as f:
+        f.write('Default system prompt')
+
+    # Test with custom system prompt filename
+    manager = PromptManager(
+        prompt_dir=prompt_dir, system_prompt_filename='custom_system.j2'
+    )
+    system_msg = manager.get_system_message()
+    assert 'Custom system prompt:' in system_msg
+
+    # Test without custom system prompt filename (should use default)
+    manager_default = PromptManager(prompt_dir=prompt_dir)
+    default_msg = manager_default.get_system_message()
+    assert 'Default system prompt' in default_msg
+
+    # Clean up
+    os.remove(os.path.join(prompt_dir, 'custom_system.j2'))
+    os.remove(os.path.join(prompt_dir, 'system_prompt.j2'))
+
+
+def test_prompt_manager_custom_system_prompt_filename_not_found(prompt_dir):
+    """Test that PromptManager raises an error if custom system prompt file is not found."""
+    with pytest.raises(
+        FileNotFoundError,
+        match=r'System prompt file "non_existent\.j2" not found at .*/non_existent\.j2\. Please ensure the file exists in the prompt directory:',
+    ):
+        PromptManager(prompt_dir=prompt_dir, system_prompt_filename='non_existent.j2')
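
The behavior these two tests pin down can be sketched without the real class. What follows is a minimal, stdlib-only sketch under stated assumptions: `SketchPromptLoader` is a hypothetical stand-in, not OpenHands' `PromptManager`; it reads the file as plain text and skips Jinja2 rendering; and the error text only mirrors the pattern matched in the second test (the `/` in that pattern assumes a POSIX path separator).

```python
import os
import tempfile


class SketchPromptLoader:
    """Hypothetical stand-in for PromptManager (assumption, not the real class)."""

    # Default filename taken from the test above; everything else is illustrative.
    DEFAULT_FILENAME = 'system_prompt.j2'

    def __init__(self, prompt_dir, system_prompt_filename=None):
        if prompt_dir is None:
            raise ValueError('Prompt directory is not set')
        self.prompt_dir = prompt_dir
        # Fall back to the default filename when no custom one is given.
        self.filename = system_prompt_filename or self.DEFAULT_FILENAME
        self.path = os.path.join(prompt_dir, self.filename)
        # Fail at construction time, as the second test expects.
        if not os.path.isfile(self.path):
            raise FileNotFoundError(
                f'System prompt file "{self.filename}" not found at {self.path}. '
                f'Please ensure the file exists in the prompt directory: {prompt_dir}'
            )

    def get_system_message(self):
        # Plain read; the real manager would render the template instead.
        with open(self.path, encoding='utf-8') as f:
            return f.read().strip()


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as prompt_dir:
        with open(os.path.join(prompt_dir, 'custom_system.j2'), 'w') as f:
            f.write('Custom system prompt: {{ custom_var }}')
        with open(os.path.join(prompt_dir, 'system_prompt.j2'), 'w') as f:
            f.write('Default system prompt')

        custom = SketchPromptLoader(prompt_dir, 'custom_system.j2')
        default = SketchPromptLoader(prompt_dir)
        print(custom.get_system_message())
        print(default.get_system_message())
```

Using `tempfile.TemporaryDirectory()` in the usage part also removes the need for the manual `os.remove` cleanup seen in the test, since the directory is deleted when the `with` block exits.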
Author	SHA1	Message	Date
openhands	93287ef9ac	Fix microagent test filenames to match expected names - Change test filenames from 'test.md' to match expected microagent names - Use 'default.md' for tests expecting 'default' name - Use 'custom_name.md' for test expecting 'custom_name' name - Use 'test_agent.md' for test expecting 'test_agent' name - This properly tests the filename-based naming behavior	2025-06-24 14:20:34 +00:00
openhands	e70595f46f	Fix microagent tests and remove debug prints - Update test assertions to expect filename as microagent name instead of 'default' - Remove debug print statements from microagent.py - Revert pytest-asyncio dependency addition as requested - All tests now pass with the new filename-based naming behavior	2025-06-24 14:16:20 +00:00
openhands	1d3ff66987	Fix failing tests: add missing newlines and pytest-asyncio dependency - Add missing newlines at end of microagent files (fixed by pre-commit) - Add pytest-asyncio dependency to fix async test execution - All non-Docker tests now pass	2025-06-24 14:01:12 +00:00
Xingyao Wang	1a95f86802	fix all remaining issue'	2025-06-23 17:49:02 -04:00
Xingyao Wang	eee12bfd94	fix test	2025-06-23 16:09:32 -04:00
Xingyao Wang	8c2d4dbe8b	Merge branch 'main' into update-microagent-docs	2025-06-23 14:22:56 -04:00
மனோஜ்குமார் பழனிச்சாமி	f5ae1759b6	Add model name (#8718)	2025-06-23 14:21:47 -04:00
Ikuo Matsumura	9ec94737ed	feat(cli): Add vi mode support (#9287) Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>	2025-06-23 17:39:38 +00:00
llamantino	63c7815823	docs: rewrite local LLMs page (#9307)	2025-06-24 01:20:03 +08:00
baii	95ae47307c	Fix the issue where the shttp_services configuration from config.toml fails to load correctly. (#9175)	2025-06-23 13:02:56 -04:00
Graham Neubig	035050252b	Better timeout prompt (#9140) Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>	2025-06-23 16:42:15 +00:00
Tommaso Bendinelli	5b48aee0c9	Fix openhands.core.exceptions.FunctionCallConversionError `fn_call_converter` for GPT-o4-mini when the agent generates images (#9152) Co-authored-by: tommaso <tommaso@t7144.csem.local>	2025-06-23 16:01:36 +00:00
Xingyao Wang	1a89dbb738	docs: Add Success Stories tab to documentation (#9120) Co-authored-by: openhands <openhands@all-hands.dev>	2025-06-23 23:39:39 +08:00
Rohit Malhotra	bba62c26fd	Make sandbox api key configurable via user settings (#8803) Co-authored-by: openhands <openhands@all-hands.dev>	2025-06-23 11:25:10 -04:00
Graham Neubig	9b4ad4e6e3	Fix SambaNova context length exception handling (#9252) Co-authored-by: openhands <openhands@all-hands.dev>	2025-06-23 07:06:31 -04:00
Graham Neubig	1e33624951	Simplify max_output_tokens handling in LLM classes (#9296) Co-authored-by: openhands <openhands@all-hands.dev>	2025-06-23 06:48:45 -04:00
Graham Neubig	8b90d610c6	Fix CLI model selection to allow custom model names (#9205) Co-authored-by: openhands <openhands@all-hands.dev>	2025-06-23 04:03:00 +00:00
mamoodi	834abc0eee	More doc updates (#9289)	2025-06-22 22:46:47 -04:00
Tim O'Farrell	c9bb0fc168	Conversation Manager small refactor (#9286) Co-authored-by: openhands <openhands@all-hands.dev>	2025-06-22 19:27:03 -06:00
Graham Neubig	5d69e606eb	feat: Add Windows PowerShell support to CLI runtime (#9211) Co-authored-by: openhands <openhands@all-hands.dev>	2025-06-22 20:17:40 -04:00
Engel Nyst	081880248c	Fix lint (#9290)	2025-06-22 13:40:14 -04:00
Chase	4ee269c3f7	Add ability to customize configuration model on per-agent basis (#8576)	2025-06-22 14:43:17 +02:00
Xingyao Wang	711315c3b9	docs: Update documentation based on llamantino feedback (#9119) Co-authored-by: openhands <openhands@all-hands.dev>	2025-06-21 21:57:14 -04:00
mamoodi	c2e6244b86	Small doc updates. Fix FAQs (#9270)	2025-06-21 15:52:29 -07:00
Xingyao Wang	a1479adfd3	feat(agent): Add configurable system_prompt_filename to AgentConfig (#9265) Co-authored-by: openhands <openhands@all-hands.dev>	2025-06-22 06:21:52 +08:00
dependabot[bot]	99fd3f7bb2	chore(deps): bump ubuntu from 22.04 to 24.04 in /containers/e2b-sandbox (#9042) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-06-21 12:23:42 -07:00
dependabot[bot]	c617881b3c	chore(deps): bump the version-all group in /frontend with 4 updates (#9234) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-06-21 12:22:43 -07:00
dependabot[bot]	7ca3607dcd	chore(deps): bump the version-all group with 3 updates (#9256) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-06-21 12:21:57 -07:00
mamoodi	89999a8e09	Update free credits lines (#9269)	2025-06-21 15:35:04 +00:00
Ray Myers	3d9761df7e	Release branch for 0.45.0 (#9264)	2025-06-20 21:14:23 +00:00
Xingyao Wang	ea3c4f9366	Fix(CLI): duplicated Command Action display in CLI (#9260) Co-authored-by: openhands <openhands@all-hands.dev>	2025-06-21 04:24:16 +08:00
Xingyao Wang	0ca3188afa	Merge branch 'main' into update-microagent-docs	2025-06-18 14:23:58 -04:00
openhands	283f503870	Exclude name field in MicroagentMetadata as it's deprecated	2025-06-08 22:07:33 +00:00
openhands	0691e5c0d0	Remove type: field from all microagent markdown files	2025-06-08 19:48:01 +00:00
openhands	fc16da8fd2	Update microagent documentation to clarify that type field is optional	2025-06-08 19:39:17 +00:00
openhands	bd3ff43c67	Remove name field from microagent files	2025-06-08 19:35:06 +00:00
openhands	0fe5b808af	Update microagent code to use filename as name when not specified	2025-06-08 19:34:59 +00:00
openhands	6c49686ff0	Add MCP tools documentation and update microagent field requirements	2025-06-08 19:30:21 +00:00
openhands	17212bb2f2	Remove unused fields from microagent code and update all microagent files	2025-06-08 19:26:56 +00:00
openhands	9d9f931e95	Remove unused fields from microagent documentation and example	2025-06-08 19:23:47 +00:00
openhands	6fe9680474	Consolidate task microagent documentation into keyword-triggered microagents	2025-06-08 19:19:44 +00:00
Xingyao Wang	53c80d1c92	Merge branch 'main' into update-microagent-docs	2025-06-08 15:17:37 -04:00
openhands	401262f353	Update documentation for task microagents with user input support	2025-06-08 19:15:31 +00:00
Xingyao Wang	58845b01a3	rename more files	2025-06-08 14:30:37 -04:00
Xingyao Wang	469d184157	address engel comment	2025-06-08 14:28:22 -04:00
Xingyao Wang	4837c4dc74	Update microagents/get_test_to_pass.md Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2025-06-09 02:24:23 +08:00
Xingyao Wang	6763f21cc3	Merge branch 'main' into add-back-microagents	2025-06-07 16:47:00 -04:00
Xingyao Wang	32e610ac1d	revert unnecessary change	2025-06-07 16:30:55 -04:00
Xingyao Wang	85c65391ca	revert changes	2025-06-03 13:53:27 -04:00
Xingyao Wang	c444dbfbbf	remove fe changes	2025-06-03 12:04:37 -04:00
Xingyao Wang	dd988d0f14	revert fe	2025-06-03 12:03:00 -04:00
Xingyao Wang	6f1a74e286	merge main	2025-06-03 11:37:51 -04:00
Xingyao Wang	7b956b6103	revert docs to look like main	2025-06-03 11:35:57 -04:00
openhands	34b097115d	Fix linting issues in frontend and Python code	2025-05-19 01:39:48 +00:00
openhands	3e4ab4f379	Fix docstring formatting in KnowledgeMicroagent class	2025-05-19 01:29:23 +00:00
openhands	54cd9f7e44	Fix unlocalized strings in microagent-dropdown.tsx	2025-05-19 01:26:33 +00:00
openhands	802b765f98	Add microagent button and dropdown to trajectory actions	2025-05-17 12:05:13 +00:00
openhands	18c88f99ff	Merge from main to resolve conflicts	2025-05-17 06:56:11 +00:00
openhands	f3934be07b	Fix microagent suggestions using tippy.js for better popup handling	2025-05-12 12:55:00 +00:00
openhands	6ce9f49d1e	Fix linting issues in TipTap editor component	2025-05-12 11:06:15 +00:00
openhands	fc07622b20	Implement microagent suggestions using TipTap	2025-05-12 11:00:08 +00:00
Xingyao Wang	da935f9d8f	Merge branch 'main' into add-back-microagents	2025-05-03 00:04:17 +08:00
openhands	642cc52a1a	Fix linting issues in handlers.ts	2025-05-02 13:06:21 +00:00
openhands	4c361ab9e5	Add mock handler for microagents endpoint	2025-05-02 09:23:25 +00:00
openhands	5dfa1bb6eb	Fix microagent suggestions UI and TypeScript errors	2025-05-02 09:21:15 +00:00
Xingyao Wang	a07cf972a5	Merge commit '6032d2620d6ec252d3c80695a6de1fc88da9c87a' into add-back-microagents	2025-05-02 09:03:17 +00:00
openhands	f2e3bc3254	Fix microagent suggestions feature	2025-05-02 08:52:19 +00:00
openhands	3790ec7d60	Add tests for microagent suggestions component	2025-05-02 03:31:41 +00:00
openhands	3c0719309e	Add microagent suggestions feature to chat input	2025-05-02 02:57:57 +00:00
Xingyao Wang	0236e0943e	fix test	2025-05-02 02:09:27 +00:00
Xingyao Wang	cd464c0022	rename files	2025-05-01 10:38:04 +08:00
Xingyao Wang	4519a7f4f3	fix test	2025-05-01 02:29:52 +00:00
Xingyao Wang	fdc591330b	add remain	2025-05-01 02:25:38 +00:00
Xingyao Wang	98e454e82c	fix lint and missing imports	2025-05-01 02:25:24 +00:00
Xingyao Wang	e088d2d24a	simplify microagent	2025-05-01 02:13:46 +00:00
Xingyao Wang	58c574af1e	revert changes	2025-05-01 02:13:00 +00:00
Xingyao Wang	405f0069f8	revert some changes	2025-05-01 02:03:06 +00:00
Xingyao Wang	f26d770d03	remove hardcoded last line	2025-05-01 02:01:51 +00:00
Xingyao Wang	bf2c3de219	cleanup tests	2025-04-30 11:11:23 +08:00
Xingyao Wang	7c35ce16e5	Merge branch 'main' into add-back-microagents	2025-04-30 11:07:17 +08:00
Xingyao Wang	f4024ccd94	Update microagents/update_pr_description.md Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2025-04-30 10:43:37 +08:00
Xingyao Wang	b55bfed831	Update microagents/address_pr_comments.md Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>	2025-04-30 10:38:22 +08:00
OpenHands Bot	cb0994027f	🤖 Auto-fix Python linting issues	2025-04-29 16:02:04 +00:00
openhands	bcc9bd0b9a	Move task microagent tests to test_microagent_task.py	2025-04-29 02:12:14 +00:00
openhands	6c144e6b5a	Add back microagent files with special handling for user inputs	2025-04-29 02:06:42 +00:00
openhands	e90b841b0d	Update microagent files to match original ones with added triggers and variable prompts	2025-04-29 01:48:10 +00:00
openhands	a1e6ed4dff	Add special handling for microagents that require user input	2025-04-29 01:47:18 +00:00
openhands	ad6311d3cd	Add back microagent files and add special handling for user input variables	2025-04-29 01:33:23 +00:00