Update openhands/runtime/client/client.py

revert to old client.py
2026-04-29 03:00:45 -04:00 · 2024-09-19 15:26:57 -04:00 · 2024-09-19 14:31:09 -04:00
787 changed files with 48823 additions and 48008 deletions
@@ -2,5 +2,7 @@
 sudo apt update
 sudo apt install -y netcat
 sudo add-apt-repository -y ppa:deadsnakes/ppa
-sudo apt install -y python3.12
-curl -sSL https://install.python-poetry.org | python3.12 -
+sudo apt install -y python3.11
+curl -sSL https://install.python-poetry.org | python3.11 -
+# chromadb requires SQLite > 3.35 but SQLite in Python3.11.9 comes with 3.31.1
+sudo cp /opt/conda/lib/libsqlite3.so.0 /lib/x86_64-linux-gnu/libsqlite3.so.0
@@ -5,55 +5,71 @@ labels: ['bug']
 body:
  - type: markdown
    attributes:
-      value: Thank you for taking the time to fill out this bug report. Please provide as much information as possible to help us understand and address the issue effectively.
+      value: Thank you for taking the time to fill out this bug report. We greatly appreciate your effort to complete this template fully. Please provide as much information as possible to help us understand and address the issue effectively.

  - type: checkboxes
    attributes:
      label: Is there an existing issue for the same bug?
      description: Please check if an issue already exists for the bug you encountered.
      options:
+      - label: I have checked the troubleshooting document at https://docs.all-hands.dev/modules/usage/troubleshooting
+        required: true
      - label: I have checked the existing issues.
        required: true

  - type: textarea
    id: bug-description
    attributes:
-      label: Describe the bug and reproduction steps
-      description: Provide a description of the issue along with any reproduction steps.
+      label: Describe the bug
+      description: Provide a short description of the problem.
    validations:
      required: true

-  - type: dropdown
-    id: installation
+  - type: textarea
+    id: current-version
    attributes:
-      label: OpenHands Installation
-      description: How are you running OpenHands?
-      options:
-        - Docker command in README
-        - Development workflow
-      default: 0
+      label: Current OpenHands version
+      description: What version of OpenHands are you using? If you're running in docker, tell us the tag you're using (e.g. ghcr.io/all-hands-ai/openhands:0.3.1).
+      render: bash
+    validations:
+      required: true

-  - type: input
-    id: openhands-version
+  - type: textarea
+    id: config
    attributes:
-      label: OpenHands Version
-      description: What version of OpenHands are you using?
-      placeholder: ex. 0.9.8, main, etc.
+      label: Installation and Configuration
+      description: Please provide any commands you ran and any configuration (redacting API keys)
+      render: bash
+    validations:
+      required: true

-  - type: dropdown
-    id: os
+  - type: textarea
+    id: model-agent
+    attributes:
+      label: Model and Agent
+      description: What model and agent are you using? You can see these settings in the UI by clicking the settings wheel.
+      placeholder: |
+        - Model:
+        - Agent:
+
+  - type: textarea
+    id: os-version
    attributes:
      label: Operating System
-      options:
-        - MacOS
-        - Linux
-        - WSL on Windows
+      description: What Operating System are you using? Linux, Mac OS, WSL on Windows
+
+  - type: textarea
+    id: repro-steps
+    attributes:
+      label: Reproduction Steps
+      description: Please list the steps to reproduce the issue.
+      placeholder: |
+        1.
+        2.
+        3.

  - type: textarea
    id: additional-context
    attributes:
      label: Logs, Errors, Screenshots, and Additional Context
-      description: Please provide any additional information you think might help. If you want to share the chat history
-        you can click the thumbs-down (👎) button above the input field and you will get a shareable link
-        (you can also click thumbs up when things are going well of course!). LLM logs will be stored in the
-        `logs/llm/default` folder. Please add any additional context about the problem here.
+      description: If you want to share the chat history you can click the thumbs-down (👎) button above the input field and you will get a shareable link (you can also click thumbs up when things are going well of course!). LLM logs will be stored in the `logs/llm/default` folder. Please add any additional context about the problem here.
@@ -1,35 +1,21 @@
+# To get started with Dependabot version updates, you'll need to specify which
+# package ecosystems to update and where the package manifests are located.
+# Please see the documentation for all configuration options:
+# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
+
 version: 2
 updates:
  - package-ecosystem: "pip"
    directory: "/"
    schedule:
      interval: "daily"
-    open-pull-requests-limit: 1
-    groups:
-      # put packages in their own group if they have a history of breaking the build or needing to be reverted
-      pre-commit:
-        patterns:
-          - "pre-commit"
-      llama:
-        patterns:
-          - "llama*"
-      chromadb:
-        patterns:
-          - "chromadb"
-      security-all:
-        applies-to: "security-updates"
-        patterns:
-          - "*"
-      version-all:
-        applies-to: "version-updates"
-        patterns:
-          - "*"
+    open-pull-requests-limit: 20

  - package-ecosystem: "npm"
    directory: "/frontend"
    schedule:
      interval: "daily"
-    open-pull-requests-limit: 1
+    open-pull-requests-limit: 20
    groups:
      docusaurus:
        patterns:
@@ -37,21 +23,12 @@ updates:
      eslint:
        patterns:
          - "*eslint*"
-      security-all:
-        applies-to: "security-updates"
-        patterns:
-          - "*"
-      version-all:
-        applies-to: "version-updates"
-        patterns:
-          - "*"

  - package-ecosystem: "npm"
    directory: "/docs"
    schedule:
-      interval: "weekly"
-      day: "wednesday"
-    open-pull-requests-limit: 1
+      interval: "daily"
+    open-pull-requests-limit: 20
    groups:
      docusaurus:
        patterns:
@@ -59,11 +36,3 @@ updates:
      eslint:
        patterns:
          - "*eslint*"
-      security-all:
-        applies-to: "security-updates"
-        patterns:
-          - "*"
-      version-all:
-        applies-to: "version-updates"
-        patterns:
-          - "*"
@@ -1,6 +1,6 @@
-**End-user friendly description of the problem this fixes or functionality that this introduces**
+**Short description of the problem this fixes or functionality that this introduces. This may be used for the CHANGELOG**
+

- [ ] Include this change in the Release Notes. If checked, you must provide an **end-user friendly** description for your change below

 ---
 **Give a summary of what the PR does, explaining any non-trivial design decisions**
@@ -14,11 +14,6 @@ on:
    branches:
      - main

-# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
-concurrency:
-  group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
-  cancel-in-progress: true
-
 jobs:
  # Build the documentation website
  build:
@@ -37,7 +32,7 @@ jobs:
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
-          python-version: '3.12'
+          python-version: '3.11'
      - name: Generate Python Docs
        run: rm -rf docs/modules/python && pip install pydoc-markdown && pydoc-markdown
      - name: Install dependencies
@@ -9,48 +9,25 @@ on:
    - main
  pull_request:

-# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
-concurrency:
-  group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
-  cancel-in-progress: true
-
 jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
-      - name: Free Disk Space (Ubuntu)
-        uses: jlumbroso/free-disk-space@main
-        with:
-          # this might remove tools that are actually needed,
-          # if set to "true" but frees about 6 GB
-          tool-cache: true
-          # all of these default to true, but feel free to set to
-          # "false" if necessary for your workflow
-          android: true
-          dotnet: true
-          haskell: true
-          large-packages: true
-          docker-images: false
-          swap-storage: true
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@v3
-      - name: Install poetry via pipx
-        run: pipx install poetry
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
-          python-version: '3.12'
-          cache: 'poetry'
-      - name: Install Python dependencies using Poetry
-        run: poetry install --without evaluation,llama-index
-      - name: Build Environment
-        run: make build
+          python-version: '3.11'
+      - name: Set up environment
+        run: |
+          curl -sSL https://install.python-poetry.org | python3 -
+          poetry install --without evaluation,llama-index
+          poetry run playwright install --with-deps chromium
+          wget https://huggingface.co/BAAI/bge-small-en-v1.5/raw/main/1_Pooling/config.json -P /tmp/llama_index/models--BAAI--bge-small-en-v1.5/snapshots/5c38ec7c405ec4b44b94cc5a9bb96e735b38267a/1_Pooling/
      - name: Run tests
        run: |
          set -e
-          SANDBOX_FORCE_REBUILD_RUNTIME=True poetry run python3 openhands/core/main.py -t "do a flip" -d ./workspace/ -c DummyAgent
+          poetry run python openhands/core/main.py -t "do a flip" -d ./workspace/ -c DummyAgent
      - name: Check exit code
        run: |
          if [ $? -ne 0 ]; then
@@ -12,11 +12,6 @@ on:
      - 'frontend/**'
      -  '.github/workflows/fe-unit-tests.yml'

-# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
-concurrency:
-  group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
-  cancel-in-progress: true
-
 jobs:
  # Run frontend unit tests
  fe-test:
@@ -0,0 +1,65 @@
+# Workflow that builds, tests and then pushes the app docker images to the ghcr.io repository
+name: Build and Publish App Image
+
+# Always run on "main"
+# Always run on tags
+# Always run on PRs
+# Can also be triggered manually
+on:
+  push:
+    branches:
+      - main
+    tags:
+      - '*'
+  pull_request:
+  workflow_dispatch:
+    inputs:
+      reason:
+        description: 'Reason for manual trigger'
+        required: true
+        default: ''
+
+jobs:
+  # Builds the OpenHands Docker images
+  ghcr_build:
+    name: Build App Image
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+      - name: Free Disk Space (Ubuntu)
+        uses: jlumbroso/free-disk-space@main
+        with:
+          # this might remove tools that are actually needed,
+          # if set to "true" but frees about 6 GB
+          tool-cache: true
+          # all of these default to true, but feel free to set to
+          # "false" if necessary for your workflow
+          android: true
+          dotnet: true
+          haskell: true
+          large-packages: true
+          docker-images: false
+          swap-storage: true
+      - name: Set up QEMU
+        uses: docker/setup-qemu-action@v3
+      - name: Login to GHCR
+        uses: docker/login-action@v3
+        with:
+          registry: ghcr.io
+          username: ${{ github.repository_owner }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+      - name: Set up Docker Buildx
+        id: buildx
+        uses: docker/setup-buildx-action@v3
+      - name: Build and push app image
+        if: "!github.event.pull_request.head.repo.fork"
+        run: |
+          ./containers/build.sh openhands ${{ github.repository_owner }} --push
+      - name: Build app image
+        if: "github.event.pull_request.head.repo.fork"
+        run: |
+          ./containers/build.sh openhands image ${{ github.repository_owner }}
@@ -1,6 +1,12 @@
-# Workflow that builds, tests and then pushes the OpenHands and runtime docker images to the ghcr.io repository
+# Workflow that builds, tests and then pushes the runtime docker images to the ghcr.io repository
 name: Build, Test and Publish RT Image

+# Only run one workflow of the same group at a time.
+# There can be at most one running and one pending job in a concurrency group at any time.
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
+
 # Always run on "main"
 # Always run on tags
 # Always run on PRs
@@ -19,76 +25,7 @@ on:
        required: true
        default: ''

-# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
-concurrency:
-  group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
-  cancel-in-progress: true
-
-env:
-  BASE_IMAGE_FOR_HASH_EQUIVALENCE_TEST: nikolaik/python-nodejs:python3.12-nodejs22
-  RELEVANT_SHA: ${{ github.event.pull_request.head.sha || github.sha }}
-
 jobs:
-  # Builds the OpenHands Docker images
-  ghcr_build_app:
-    name: Build App Image
-    runs-on: ubuntu-latest
-    permissions:
-      contents: read
-      packages: write
-    outputs:
-      hash_from_app_image: ${{ steps.get_hash_in_app_image.outputs.hash_from_app_image }}
-    steps:
-      - name: Checkout
-        uses: actions/checkout@v4
-      - name: Free Disk Space (Ubuntu)
-        uses: jlumbroso/free-disk-space@main
-        with:
-          # this might remove tools that are actually needed,
-          # if set to "true" but frees about 6 GB
-          tool-cache: true
-          # all of these default to true, but feel free to set to
-          # "false" if necessary for your workflow
-          android: true
-          dotnet: true
-          haskell: true
-          large-packages: true
-          docker-images: false
-          swap-storage: true
-      - name: Set up QEMU
-        uses: docker/setup-qemu-action@v3.0.0
-        with:
-          image: tonistiigi/binfmt:latest
-      - name: Login to GHCR
-        uses: docker/login-action@v3
-        with:
-          registry: ghcr.io
-          username: ${{ github.repository_owner }}
-          password: ${{ secrets.GITHUB_TOKEN }}
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@v3
-      - name: Build and push app image
-        if: "!github.event.pull_request.head.repo.fork"
-        run: |
-          ./containers/build.sh -i openhands -o ${{ github.repository_owner }} --push
-      - name: Build app image
-        if: "github.event.pull_request.head.repo.fork"
-        run: |
-          ./containers/build.sh -i openhands -o ${{ github.repository_owner }} --load
-      - name: Get hash in App Image
-        id: get_hash_in_app_image
-        run: |
-          # Lowercase the repository owner
-          export REPO_OWNER=${{ github.repository_owner }}
-          REPO_OWNER=$(echo $REPO_OWNER | tr '[:upper:]' '[:lower:]')
-          # Run the build script in the app image
-          docker run -e SANDBOX_USER_ID=0 -v /var/run/docker.sock:/var/run/docker.sock ghcr.io/${REPO_OWNER}/openhands:${{ env.RELEVANT_SHA }} /bin/bash -c "mkdir -p containers/runtime; python3 openhands/runtime/utils/runtime_build.py --base_image ${{ env.BASE_IMAGE_FOR_HASH_EQUIVALENCE_TEST }} --build_folder containers/runtime --force_rebuild" 2>&1 | tee docker-outputs.txt
-          # Get the hash from the build script
-          hash_from_app_image=$(cat docker-outputs.txt | grep "Hash for docker build directory" | awk -F "): " '{print $2}' | uniq | head -n1)
-          echo "hash_from_app_image=$hash_from_app_image" >> $GITHUB_OUTPUT
-          echo "Hash from app image: $hash_from_app_image"
-
  # Builds the runtime Docker images
  ghcr_build_runtime:
    name: Build Image
@@ -99,7 +36,7 @@ jobs:
    strategy:
      matrix:
        base_image:
-          - image: 'nikolaik/python-nodejs:python3.12-nodejs22'
+          - image: 'nikolaik/python-nodejs:python3.11-nodejs22'
            tag: nikolaik
    steps:
      - name: Checkout
@@ -119,9 +56,7 @@ jobs:
          docker-images: false
          swap-storage: true
      - name: Set up QEMU
-        uses: docker/setup-qemu-action@v3.0.0
-        with:
-          image: tonistiigi/binfmt:latest
+        uses: docker/setup-qemu-action@v3
      - name: Login to GHCR
        uses: docker/login-action@v3
        with:
@@ -134,7 +69,7 @@ jobs:
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
-          python-version: '3.12'
+          python-version: '3.11'
      - name: Cache Poetry dependencies
        uses: actions/cache@v4
        with:
@@ -153,13 +88,13 @@ jobs:
      - name: Build and push runtime image ${{ matrix.base_image.image }}
        if: github.event.pull_request.head.repo.fork != true
        run: |
-          ./containers/build.sh -i runtime -o ${{ github.repository_owner }} --push -t ${{ matrix.base_image.tag }}
+          ./containers/build.sh runtime ${{ github.repository_owner }} --push ${{ matrix.base_image.tag }}
      # Forked repos can't push to GHCR, so we need to upload the image as an artifact
      - name: Build runtime image ${{ matrix.base_image.image }} for fork
        if: github.event.pull_request.head.repo.fork
        uses: docker/build-push-action@v6
        with:
-          tags: ghcr.io/all-hands-ai/runtime:${{ env.RELEVANT_SHA }}-${{ matrix.base_image.tag }}
+          tags: ghcr.io/all-hands-ai/runtime:${{ github.sha }}-${{ matrix.base_image.tag }}
          outputs: type=docker,dest=/tmp/runtime-${{ matrix.base_image.tag }}.tar
          context: containers/runtime
      - name: Upload runtime image for fork
@@ -169,56 +104,6 @@ jobs:
          name: runtime-${{ matrix.base_image.tag }}
          path: /tmp/runtime-${{ matrix.base_image.tag }}.tar

-  verify_hash_equivalence_in_runtime_and_app:
-    name: Verify Hash Equivalence in Runtime and Docker images
-    runs-on: ubuntu-latest
-    needs: [ghcr_build_runtime, ghcr_build_app]
-    strategy:
-      fail-fast: false
-      matrix:
-        base_image: ['nikolaik']
-    steps:
-      - uses: actions/checkout@v4
-      - name: Cache Poetry dependencies
-        uses: actions/cache@v4
-        with:
-          path: |
-            ~/.cache/pypoetry
-            ~/.virtualenvs
-          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
-          restore-keys: |
-            ${{ runner.os }}-poetry-
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: '3.12'
-      - name: Install poetry via pipx
-        run: pipx install poetry
-      - name: Install Python dependencies using Poetry
-        run: make install-python-dependencies
-      - name: Get hash in App Image
-        run: |
-          echo "Hash from app image: ${{ needs.ghcr_build_app.outputs.hash_from_app_image }}"
-          echo "hash_from_app_image=${{ needs.ghcr_build_app.outputs.hash_from_app_image }}" >> $GITHUB_ENV
-
-      - name: Get hash using code (development mode)
-        run: |
-          mkdir -p containers/runtime
-          poetry run python3 openhands/runtime/utils/runtime_build.py --base_image ${{ env.BASE_IMAGE_FOR_HASH_EQUIVALENCE_TEST }} --build_folder containers/runtime --force_rebuild > output.txt 2>&1
-          hash_from_code=$(cat output.txt | grep "Hash for docker build directory" | awk -F "): " '{print $2}' | uniq | head -n1)
-          echo "hash_from_code=$hash_from_code" >> $GITHUB_ENV
-
-      - name: Compare hashes
-        run: |
-          echo "Hash from App Image: ${{ env.hash_from_app_image }}"
-          echo "Hash from Code: ${{ env.hash_from_code }}"
-          if [ "${{ env.hash_from_app_image }}" = "${{ env.hash_from_code }}" ]; then
-            echo "Hashes match!"
-          else
-            echo "Hashes do not match!"
-            exit 1
-          fi
-
  # Run unit tests with the EventStream runtime Docker images as root
  test_runtime_root:
    name: RT Unit Tests (Root)
@@ -230,23 +115,6 @@ jobs:
        base_image: ['nikolaik']
    steps:
      - uses: actions/checkout@v4
-      - name: Free Disk Space (Ubuntu)
-        uses: jlumbroso/free-disk-space@main
-        with:
-          # this might remove tools that are actually needed,
-          # if set to "true" but frees about 6 GB
-          tool-cache: true
-          # all of these default to true, but feel free to set to
-          # "false" if necessary for your workflow
-          android: true
-          dotnet: true
-          haskell: true
-          large-packages: true
-          docker-images: false
-          swap-storage: true
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@v3
      # Forked repos can't push to GHCR, so we need to download the image as an artifact
      - name: Download runtime image for fork
        if: github.event.pull_request.head.repo.fork
@@ -270,29 +138,30 @@ jobs:
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
-          python-version: '3.12'
+          python-version: '3.11'
      - name: Install poetry via pipx
        run: pipx install poetry
      - name: Install Python dependencies using Poetry
        run: make install-python-dependencies
      - name: Run runtime tests
        run: |
-          # We install pytest-xdist in order to run tests across CPUs
+          # We install pytest-xdist in order to run tests across CPUs. However, tests start to fail when we run
+          # them across more than 2 CPUs for some reason
          poetry run pip install pytest-xdist

          # Install to be able to retry on failures for flaky tests
          poetry run pip install pytest-rerunfailures

-          image_name=ghcr.io/${{ github.repository_owner }}/runtime:${{ env.RELEVANT_SHA }}-${{ matrix.base_image }}
+          image_name=ghcr.io/${{ github.repository_owner }}/runtime:${{ github.sha }}-${{ matrix.base_image }}
          image_name=$(echo $image_name | tr '[:upper:]' '[:lower:]')

          SKIP_CONTAINER_LOGS=true \
          TEST_RUNTIME=eventstream \
          SANDBOX_USER_ID=$(id -u) \
-          SANDBOX_RUNTIME_CONTAINER_IMAGE=$image_name \
+          SANDBOX_BASE_CONTAINER_IMAGE=$image_name \
          TEST_IN_CI=true \
          RUN_AS_OPENHANDS=false \
-          poetry run pytest -n 3 -raRs --reruns 2 --reruns-delay 5 --cov=openhands --cov-report=xml -s ./tests/runtime
+          poetry run pytest -n 3 --reruns 1 --reruns-delay 3 --cov=agenthub --cov=openhands --cov-report=xml -s ./tests/runtime
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4
        env:
@@ -308,23 +177,6 @@ jobs:
        base_image: ['nikolaik']
    steps:
      - uses: actions/checkout@v4
-      - name: Free Disk Space (Ubuntu)
-        uses: jlumbroso/free-disk-space@main
-        with:
-          # this might remove tools that are actually needed,
-          # if set to "true" but frees about 6 GB
-          tool-cache: true
-          # all of these default to true, but feel free to set to
-          # "false" if necessary for your workflow
-          android: true
-          dotnet: true
-          haskell: true
-          large-packages: true
-          docker-images: false
-          swap-storage: true
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@v3
      # Forked repos can't push to GHCR, so we need to download the image as an artifact
      - name: Download runtime image for fork
        if: github.event.pull_request.head.repo.fork
@@ -348,29 +200,85 @@ jobs:
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
-          python-version: '3.12'
+          python-version: '3.11'
      - name: Install poetry via pipx
        run: pipx install poetry
      - name: Install Python dependencies using Poetry
        run: make install-python-dependencies
      - name: Run runtime tests
        run: |
-          # We install pytest-xdist in order to run tests across CPUs
+          # We install pytest-xdist in order to run tests across CPUs. However, tests start to fail when we run
+          # them across more than 2 CPUs for some reason
          poetry run pip install pytest-xdist

          # Install to be able to retry on failures for flaky tests
          poetry run pip install pytest-rerunfailures

-          image_name=ghcr.io/${{ github.repository_owner }}/runtime:${{ env.RELEVANT_SHA }}-${{ matrix.base_image }}
+          image_name=ghcr.io/${{ github.repository_owner }}/runtime:${{ github.sha }}-${{ matrix.base_image }}
          image_name=$(echo $image_name | tr '[:upper:]' '[:lower:]')

          SKIP_CONTAINER_LOGS=true \
          TEST_RUNTIME=eventstream \
          SANDBOX_USER_ID=$(id -u) \
-          SANDBOX_RUNTIME_CONTAINER_IMAGE=$image_name \
+          SANDBOX_BASE_CONTAINER_IMAGE=$image_name \
          TEST_IN_CI=true \
          RUN_AS_OPENHANDS=true \
-          poetry run pytest -n 3 -raRs --reruns 2 --reruns-delay 5 --cov=openhands --cov-report=xml -s ./tests/runtime
+          poetry run pytest -n 3 --reruns 1 --reruns-delay 3 --cov=agenthub --cov=openhands --cov-report=xml -s ./tests/runtime
+      - name: Upload coverage to Codecov
+        uses: codecov/codecov-action@v4
+        env:
+          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
+
+  # Run integration tests with the eventstream runtime Docker image
+  runtime_integration_tests_on_linux:
+    name: RT Integration Tests (Linux)
+    runs-on: ubuntu-latest
+    needs: [ghcr_build_runtime]
+    strategy:
+      fail-fast: false
+      matrix:
+        base_image: ['nikolaik']
+    steps:
+      - uses: actions/checkout@v4
+      # Forked repos can't push to GHCR, so we need to download the image as an artifact
+      - name: Download runtime image for fork
+        if: github.event.pull_request.head.repo.fork
+        uses: actions/download-artifact@v4
+        with:
+          name: runtime-${{ matrix.base_image }}
+          path: /tmp
+      - name: Load runtime image for fork
+        if: github.event.pull_request.head.repo.fork
+        run: |
+          docker load --input /tmp/runtime-${{ matrix.base_image }}.tar
+      - name: Cache Poetry dependencies
+        uses: actions/cache@v4
+        with:
+          path: |
+            ~/.cache/pypoetry
+            ~/.virtualenvs
+          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
+          restore-keys: |
+            ${{ runner.os }}-poetry-
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+      - name: Install poetry via pipx
+        run: pipx install poetry
+      - name: Install Python dependencies using Poetry
+        run: make install-python-dependencies
+      - name: Run integration tests
+        run: |
+          image_name=ghcr.io/${{ github.repository_owner }}/runtime:${{ github.sha }}-${{ matrix.base_image }}
+          image_name=$(echo $image_name | tr '[:upper:]' '[:lower:]')
+
+          TEST_RUNTIME=eventstream \
+          SANDBOX_USER_ID=$(id -u) \
+          SANDBOX_BASE_CONTAINER_IMAGE=$image_name \
+          TEST_IN_CI=true \
+          TEST_ONLY=true \
+          ./tests/integration/regenerate.sh
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4
        env:
@@ -384,7 +292,7 @@ jobs:
    name: All Runtime Tests Passed
    if: ${{ !cancelled() && !contains(needs.*.result, 'failure') && !contains(needs.*.result, 'cancelled') }}
    runs-on: ubuntu-latest
-    needs: [test_runtime_root, test_runtime_oh, verify_hash_equivalence_in_runtime_and_app]
+    needs: [test_runtime_root, test_runtime_oh, runtime_integration_tests_on_linux]
    steps:
      - name: All tests passed
        run: echo "All runtime tests have passed successfully!"
@@ -393,7 +301,7 @@ jobs:
    name: All Runtime Tests Passed
    if: ${{ cancelled() || contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled') }}
    runs-on: ubuntu-latest
-    needs: [test_runtime_root, test_runtime_oh, verify_hash_equivalence_in_runtime_and_app]
+    needs: [test_runtime_root, test_runtime_oh, runtime_integration_tests_on_linux]
    steps:
      - name: Some tests failed
        run: |
@@ -10,11 +10,6 @@ on:
    - main
  pull_request:

-# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
-concurrency:
-  group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
-  cancel-in-progress: true
-
 jobs:
  # Run lint on the frontend code
  lint-frontend:
@@ -46,9 +41,9 @@ jobs:
      - name: Set up python
        uses: actions/setup-python@v5
        with:
-          python-version: 3.12
+          python-version: 3.11
          cache: 'pip'
      - name: Install pre-commit
        run: pip install pre-commit==3.7.0
      - name: Run pre-commit hooks
-        run: pre-commit run --files openhands/**/* evaluation/**/* tests/**/* --show-diff-on-failure --config ./dev_config/python/.pre-commit-config.yaml
+        run: pre-commit run --files openhands/**/* agenthub/**/* evaluation/**/* tests/**/* --show-diff-on-failure --config ./dev_config/python/.pre-commit-config.yaml
@@ -1,13 +0,0 @@
-name: Resolve Issues with OpenHands
-
-on:
-  issues:
-    types: [labeled]
-
-jobs:
-  call-openhands-resolver:
-    uses: All-Hands-AI/openhands-resolver/.github/workflows/openhands-resolver.yml@main
-    if: github.event.label.name == 'fix-me'
-    with:
-      issue_number: ${{ github.event.issue.number }}
-    secrets: inherit
@@ -1,96 +0,0 @@
-# Workflow that runs python unit tests on mac
-name: Run Python Unit Tests Mac
-
-# This job is flaky so only run it nightly
-on:
-  schedule:
-    - cron: '0 0 * * *'
-
-jobs:
-  # Run python unit tests on macOS
-  test-on-macos:
-    name: Python Unit Tests on macOS
-    runs-on: macos-14
-    env:
-      INSTALL_DOCKER: '1' # Set to '0' to skip Docker installation
-    strategy:
-      matrix:
-        python-version: ['3.12']
-    steps:
-      - uses: actions/checkout@v4
-      - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v5
-        with:
-          python-version: ${{ matrix.python-version }}
-      - name: Cache Poetry dependencies
-        uses: actions/cache@v4
-        with:
-          path: |
-            ~/.cache/pypoetry
-            ~/.virtualenvs
-          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
-          restore-keys: |
-            ${{ runner.os }}-poetry-
-      - name: Install poetry via pipx
-        run: pipx install poetry
-      - name: Install Python dependencies using Poetry
-        run: poetry install --without evaluation,llama-index
-      - name: Install & Start Docker
-        if: env.INSTALL_DOCKER == '1'
-        run: |
-          INSTANCE_NAME="colima-${GITHUB_RUN_ID}"
-
-          # Uninstall colima to upgrade to the latest version
-          if brew list colima &>/dev/null; then
-            brew uninstall colima
-            # unlinking colima dependency: go
-            brew uninstall go@1.21
-          fi
-          rm -rf ~/.colima ~/.lima
-          brew install --HEAD colima
-          brew install docker
-
-          start_colima() {
-            # Find a free port in the range 10000-20000
-            RANDOM_PORT=$((RANDOM % 10001 + 10000))
-
-            # Original line:
-            if ! colima start --network-address --arch x86_64 --cpu=1 --memory=1 --verbose --ssh-port $RANDOM_PORT; then
-              echo "Failed to start Colima."
-              return 1
-            fi
-            return 0
-          }
-
-          # Attempt to start Colima for 5 total attempts:
-          ATTEMPT_LIMIT=5
-          for ((i=1; i<=ATTEMPT_LIMIT; i++)); do
-
-            if start_colima; then
-              echo "Colima started successfully."
-              break
-            else
-              colima stop -f
-              sleep 10
-              colima delete -f
-              if [ $i -eq $ATTEMPT_LIMIT ]; then
-                exit 1
-              fi
-              sleep 10
-            fi
-          done
-
-          # For testcontainers to find the Colima socket
-          # https://github.com/abiosoft/colima/blob/main/docs/FAQ.md#cannot-connect-to-the-docker-daemon-at-unixvarrundockersock-is-the-docker-daemon-running
-          sudo ln -sf $HOME/.colima/default/docker.sock /var/run/docker.sock
-      - name: Build Environment
-        run: make build
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@v3
-      - name: Run Tests
-        run: poetry run pytest --forked --cov=openhands --cov-report=xml ./tests/unit --ignore=tests/unit/test_memory.py
-      - name: Upload coverage to Codecov
-        uses: codecov/codecov-action@v4
-        env:
-          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
@@ -10,12 +10,92 @@ on:
      - main
  pull_request:

-# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
-concurrency:
-  group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
-  cancel-in-progress: true
-
 jobs:
+  # Run python unit tests on macOS
+  test-on-macos:
+    name: Python Unit Tests on macOS
+    runs-on: macos-12
+    env:
+      INSTALL_DOCKER: '1' # Set to '0' to skip Docker installation
+    strategy:
+      matrix:
+        python-version: ['3.11']
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Cache Poetry dependencies
+        uses: actions/cache@v4
+        with:
+          path: |
+            ~/.cache/pypoetry
+            ~/.virtualenvs
+          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
+          restore-keys: |
+            ${{ runner.os }}-poetry-
+      - name: Install poetry via pipx
+        run: pipx install poetry
+      - name: Install Python dependencies using Poetry
+        run: poetry install --without evaluation,llama-index
+      - name: Install & Start Docker
+        if: env.INSTALL_DOCKER == '1'
+        run: |
+          INSTANCE_NAME="colima-${GITHUB_RUN_ID}"
+
+          # Uninstall colima to upgrade to the latest version
+          if brew list colima &>/dev/null; then
+            brew uninstall colima
+            # unlinking colima dependency: go
+            brew uninstall go@1.21
+          fi
+          rm -rf ~/.colima ~/.lima
+          brew install --HEAD colima
+          brew install docker
+
+          start_colima() {
+            # Pick a random port in the range 10000-20000 (availability is not checked)
+            RANDOM_PORT=$((RANDOM % 10001 + 10000))
+
+            # Original line:
+            if ! colima start --network-address --arch x86_64 --cpu=1 --memory=1 --verbose --ssh-port $RANDOM_PORT; then
+              echo "Failed to start Colima."
+              return 1
+            fi
+            return 0
+          }
+
+          # Attempt to start Colima for 5 total attempts:
+          ATTEMPT_LIMIT=5
+          for ((i=1; i<=ATTEMPT_LIMIT; i++)); do
+
+            if start_colima; then
+              echo "Colima started successfully."
+              break
+            else
+              colima stop -f
+              sleep 10
+              colima delete -f
+              if [ $i -eq $ATTEMPT_LIMIT ]; then
+                exit 1
+              fi
+              sleep 10
+            fi
+          done
+
+          # For testcontainers to find the Colima socket
+          # https://github.com/abiosoft/colima/blob/main/docs/FAQ.md#cannot-connect-to-the-docker-daemon-at-unixvarrundockersock-is-the-docker-daemon-running
+          sudo ln -sf $HOME/.colima/default/docker.sock /var/run/docker.sock
+      - name: Build Environment
+        run: make build
+      - name: Run Tests
+        run: poetry run pytest --forked --cov=agenthub --cov=openhands --cov-report=xml ./tests/unit
+      - name: Upload coverage to Codecov
+        uses: codecov/codecov-action@v4
+        env:
+          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
+
  # Run python unit tests on Linux
  test-on-linux:
    name: Python Unit Tests on Linux
@@ -24,12 +104,9 @@ jobs:
      INSTALL_DOCKER: '0' # Set to '0' to skip Docker installation
    strategy:
      matrix:
-        python-version: ['3.12']
+        python-version: ['3.11']
    steps:
      - uses: actions/checkout@v4
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@v3
      - name: Install poetry via pipx
        run: pipx install poetry
      - name: Set up Python
@@ -42,7 +119,7 @@ jobs:
      - name: Build Environment
        run: make build
      - name: Run Tests
-        run: poetry run pytest --forked --cov=openhands --cov-report=xml -svv ./tests/unit --ignore=tests/unit/test_memory.py
+        run: poetry run pytest --forked --cov=agenthub --cov=openhands --cov-report=xml -svv ./tests/unit
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v4
        env:
@@ -17,7 +17,7 @@ jobs:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
-          python-version: 3.12
+          python-version: 3.11
      - name: Install Poetry
        uses: snok/install-poetry@v1.4.1
        with:
@@ -26,6 +26,6 @@ jobs:
      - name: Install Poetry Dependencies
        run: poetry install --no-interaction --no-root
      - name: Build poetry project
-        run: ./build.sh
+        run: poetry build -v
      - name: publish
        run: poetry publish -u __token__ -p ${{ secrets.PYPI_TOKEN }}
@@ -0,0 +1,70 @@
+
+name: Regenerate Integration Tests
+
+on:
+  workflow_dispatch:
+    inputs:
+      debug:
+        description: 'Enable debug mode'
+        type: boolean
+        default: true
+      log_to_file:
+        description: 'Enable logging to file'
+        type: boolean
+        default: true
+      force_regenerate_tests:
+        description: 'Force regeneration of tests'
+        type: boolean
+        default: false
+      force_use_llm:
+        description: 'Force use of LLM'
+        type: boolean
+        default: false
+
+jobs:
+  regenerate_integration_tests:
+    if: github.ref != 'refs/heads/main'
+    runs-on: ubuntu-latest
+
+    steps:
+    - name: Checkout repository
+      uses: actions/checkout@v4
+    - name: Set up Python
+      uses: actions/setup-python@v5
+      with:
+        python-version: "3.11"
+    - name: Cache Poetry dependencies
+      uses: actions/cache@v4
+      with:
+        path: |
+          ~/.cache/pypoetry
+          ~/.virtualenvs
+        key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
+        restore-keys: |
+          ${{ runner.os }}-poetry-
+    - name: Install poetry via pipx
+      run: pipx install poetry
+    - name: Install Python dependencies using Poetry
+      run: make install-python-dependencies
+    - name: Build Environment
+      run: make build
+    - name: Regenerate integration tests
+      run: |
+        DEBUG=${{ inputs.debug }} \
+        LOG_TO_FILE=${{ inputs.log_to_file }} \
+        FORCE_REGENERATE_TESTS=${{ inputs.force_regenerate_tests }} \
+        FORCE_USE_LLM=${{ inputs.force_use_llm }} \
+        ./tests/integration/regenerate.sh
+    - name: Commit changes
+      run: |
+        if git diff --quiet --exit-code; then
+          echo "No changes to commit"
+          exit 0
+        fi
+
+        git config --global user.name 'github-actions[bot]'
+        git config --global user.email 'github-actions[bot]@users.noreply.github.com'
+        git add .
+        # run it twice in case pre-commit makes changes
+        git commit -am "Regenerate integration tests" || git commit -am "Regenerate integration tests"
+        git push
@@ -15,13 +15,10 @@ jobs:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
-    - name: Set up Docker Buildx
-      id: buildx
-      uses: docker/setup-buildx-action@v3
    - name: Set up Python
      uses: actions/setup-python@v5
      with:
-        python-version: '3.12'
+        python-version: '3.11'
    - name: install git, github cli
      run: |
        sudo apt-get install -y git gh
@@ -0,0 +1,113 @@
+# Workflow that uses OpenHands to resolve a GitHub issue. Issue must be labeled 'solve-this'
+name: Use OpenHands to Resolve GitHub Issue
+
+on:
+  issues:
+    types: [labeled]
+
+permissions:
+  contents: write
+  pull-requests: write
+  issues: write
+
+jobs:
+  dogfood:
+    if: github.event.label.name == 'solve-this'
+    runs-on: ubuntu-latest
+    container:
+      image: ghcr.io/all-hands-ai/openhands
+      volumes:
+        - /var/run/docker.sock:/var/run/docker.sock
+    steps:
+    - name: install git, github cli
+      run: apt-get install -y git gh
+    - name: Checkout Repository
+      uses: actions/checkout@v4
+    - name: Write Task File
+      env:
+        ISSUE_TITLE: ${{ github.event.issue.title }}
+        ISSUE_BODY: ${{ github.event.issue.body }}
+      run: |
+        echo "TITLE:" > task.txt
+        echo "${ISSUE_TITLE}" >> task.txt
+        echo "" >> task.txt
+        echo "BODY:" >> task.txt
+        echo "${ISSUE_BODY}" >> task.txt
+    - name: Set up environment
+      run: |
+        curl -sSL https://install.python-poetry.org | python3 -
+        export PATH="/github/home/.local/bin:$PATH"
+        poetry install --without evaluation,llama-index
+        poetry run playwright install --with-deps chromium
+    - name: Run OpenHands
+      env:
+        ISSUE_TITLE: ${{ github.event.issue.title }}
+        ISSUE_BODY: ${{ github.event.issue.body }}
+        LLM_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+        OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+      run: |
+        # Prepend Poetry's install directory to PATH so the `poetry` command can be found
+        export PATH="/github/home/.local/bin:$PATH"
+        # Prepend the current directory to PYTHONPATH so the package imports correctly (must be run from the repository root)
+        export PYTHONPATH=$(pwd):$PYTHONPATH
+        WORKSPACE_MOUNT_PATH=$GITHUB_WORKSPACE poetry run python ./openhands/core/main.py -i 50 -f task.txt -d $GITHUB_WORKSPACE
+        rm task.txt
+    - name: Setup Git, Create Branch, and Commit Changes
+      run: |
+        # Setup Git configuration
+        git config --global --add safe.directory $PWD
+        git config --global user.name 'OpenHands'
+        git config --global user.email 'OpenHands@users.noreply.github.com'
+
+        # Create a unique branch name with a timestamp
+        BRANCH_NAME="fix/${{ github.event.issue.number }}-$(date +%Y%m%d%H%M%S)"
+
+        # Checkout new branch
+        git checkout -b $BRANCH_NAME
+
+        # Add all changes to staging, except task.txt
+        git add --all -- ':!task.txt'
+
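+        # Commit the changes, if any. The commit is tested directly because the step
+        # shell runs with -e, so a separate `$?` check would never be reached on failure.
+        if ! git commit -m "OpenHands: Resolve Issue #${{ github.event.issue.number }}"; then
+          echo "No changes to commit."
+          exit 0
+        fi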
+
+        # Push changes
+        git push --set-upstream origin $BRANCH_NAME
+    - name: Fetch Default Branch
+      env:
+        GH_TOKEN: ${{ github.token }}
+      run: |
+        # Fetch the default branch using gh cli
+        DEFAULT_BRANCH=$(gh repo view --json defaultBranchRef --jq .defaultBranchRef.name)
+        echo "Default branch is $DEFAULT_BRANCH"
+        echo "DEFAULT_BRANCH=$DEFAULT_BRANCH" >> $GITHUB_ENV
+    - name: Generate PR
+      env:
+        GH_TOKEN: ${{ github.token }}
+      run: |
+        # Create PR and capture URL
+        PR_URL=$(gh pr create \
+          --title "OpenHands: Resolve Issue #${{ github.event.issue.number }}" \
+          --body "This PR was generated by OpenHands to resolve issue #${{ github.event.issue.number }}" \
+          --repo "${{ github.repository }}" \
+          --head "${{ github.head_ref }}" \
+          --base "${{ env.DEFAULT_BRANCH }}" \
+          | grep -o 'https://github.com/[^ ]*')
+
+        # Extract PR number from URL
+        PR_NUMBER=$(echo "$PR_URL" | grep -o '[0-9]\+$')
+
+        # Set environment vars
+        echo "PR_URL=$PR_URL" >> $GITHUB_ENV
+        echo "PR_NUMBER=$PR_NUMBER" >> $GITHUB_ENV
+
+    - name: Post Comment
+      env:
+        GH_TOKEN: ${{ github.token }}
+      run: |
+        gh issue comment ${{ github.event.issue.number }} \
+          -b "OpenHands raised [PR #${{ env.PR_NUMBER }}](${{ env.PR_URL }}) to resolve this issue."
@@ -121,7 +121,6 @@ celerybeat.pid

 # Environments
 .env
-frontend/.env
 .venv
 env/
 venv/
@@ -178,6 +177,7 @@ evaluation/toolqa/data
 # frontend

 # dependencies
+frontend/node_modules
 frontend/.pnp
 frontend/bun.lockb
 frontend/yarn.lock
@@ -217,6 +217,8 @@ config.toml
 config.toml_
 config.toml.bak

+containers/agnostic_sandbox
+
 # swe-bench-eval
 image_build_logs
 run_instance_logs
@@ -226,5 +228,3 @@ runtime_*.tar
 # docker build
 containers/runtime/Dockerfile
 containers/runtime/project.tar.gz
-containers/runtime/code
-**/node_modules/
@@ -1,28 +0,0 @@
-OpenHands is an automated AI software engineer. It is a repo with a Python backend
-(in the `openhands` directory) and TypeScript frontend (in the `frontend` directory).
-
-General Setup:
- To set up the entire repo, including frontend and backend, run `make build`
- To run linting and type-checking before finishing the job, run `poetry run pre-commit run --all-files --config ./dev_config/python/.pre-commit-config.yaml`
-
-Backend:
- Located in the `openhands` directory
- Testing:
-  - All tests are in `tests/unit/test_*.py`
-  - To test new code, run `poetry run pytest tests/unit/test_xxx.py` where `xxx` is the appropriate file for the current functionality
-  - Write all tests with pytest
-
-Frontend:
- Located in the `frontend` directory
- Prerequisites: A recent version of NodeJS / NPM
- Setup: Run `npm install` in the frontend directory
- Testing:
-  - Run tests: `npm run test`
-  - To run specific tests: `npm run test -- -t "TestName"`
- Building:
-  - Build for production: `npm run build`
- Environment Variables:
-  - Set in `frontend/.env` or as environment variables
-  - Available variables: VITE_BACKEND_HOST, VITE_USE_TLS, VITE_INSECURE_SKIP_VERIFY, VITE_FRONTEND_PORT
- Internationalization:
-  - Generate i18n declaration file: `npm run make-i18n`
@@ -2,70 +2,95 @@

 Thanks for your interest in contributing to OpenHands! We welcome and appreciate contributions.

-## Understanding OpenHands's CodeBase
-
-To understand the codebase, please refer to the README in each module:
- [frontend](./frontend/README.md)
- [evaluation](./evaluation/README.md)
- [openhands](./openhands/README.md)
-   - [agenthub](./openhands/agenthub/README.md)
-   - [server](./openhands/server/README.md)
-
-## Setting up your development environment
-
-We have a separate doc [Development.md](https://github.com/All-Hands-AI/OpenHands/blob/main/Development.md) that tells you how to set up a development workflow.
-
-## How can I contribute?
+## How Can I Contribute?

 There are many ways that you can contribute:

 1. **Download and use** OpenHands, and send [issues](https://github.com/All-Hands-AI/OpenHands/issues) when you encounter something that isn't working or a feature that you'd like to see.
 2. **Send feedback** after each session by [clicking the thumbs-up thumbs-down buttons](https://docs.all-hands.dev/modules/usage/feedback), so we can see where things are working and failing, and also build an open dataset for training code agents.
-3. **Improve the Codebase** by sending PRs (see details below). In particular, we have some [good first issues](https://github.com/All-Hands-AI/OpenHands/labels/good%20first%20issue) that may be ones to start on.
+3. **Improve the Codebase** by sending PRs (see details below). In particular, we have some [good first issue](https://github.com/All-Hands-AI/OpenHands/labels/good%20first%20issue) issues that may be ones to start on.

-## What can I build?
-Here are a few ways you can help improve the codebase.
+## Understanding OpenHands's CodeBase

-#### UI/UX
-We're always looking to improve the look and feel of the application. If you've got a small fix
-for something that's bugging you, feel free to open up a PR that changes the `./frontend` directory.
+To understand the codebase, please refer to the README in each module:
+- [frontend](./frontend/README.md)
+- [agenthub](./agenthub/README.md)
+- [evaluation](./evaluation/README.md)
+- [openhands](./openhands/README.md)
+    - [server](./openhands/server/README.md)

-If you're looking to make a bigger change, add a new UI element, or significantly alter the style
-of the application, please open an issue first, or better, join the #frontend channel in our Slack
-to gather consensus from our design team first.
-
-#### Improving the agent
-Our main agent is the CodeAct agent. You can [see its prompts here](https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/agenthub/codeact_agent)
-
-Changes to these prompts, and to the underlying behavior in Python, can have a huge impact on user experience.
-You can try modifying the prompts to see how they change the behavior of the agent as you use the app
-locally, but we will need to do an end-to-end evaluation of any changes here to ensure that the agent
-is getting better over time.
-
-We use the [SWE-bench](https://www.swebench.com/) benchmark to test our agent. You can join the #evaluation
-channel in Slack to learn more.
-
-#### Adding a new agent
-You may want to experiment with building new types of agents. You can add an agent to `openhands/agenthub`
-to help expand the capabilities of OpenHands.
-
-#### Adding a new runtime
-The agent needs a place to run code and commands. When you run OpenHands on your laptop, it uses a Docker container
-to do this by default. But there are other ways of creating a sandbox for the agent.
-
-If you work for a company that provides a cloud-based runtime, you could help us add support for that runtime
-by implementing the [interface specified here](https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/runtime/runtime.py).
-
-#### Testing
 When you write code, it is also good to write tests. Please navigate to the `tests` folder to see existing test suites.
 At the moment, we have two kinds of tests: `unit` and `integration`. Please refer to the README for each test suite. These tests also run on GitHub's continuous integration to ensure quality of the project.

 ## Sending Pull Requests to OpenHands

-You'll need to fork our repository to send us a Pull Request. You can learn more
-about how to fork a GitHub repo and open a PR with your changes in [this article](https://medium.com/swlh/forks-and-pull-requests-how-to-contribute-to-github-repos-8843fac34ce8)
+### 1. Fork the Official Repository
+Fork the [OpenHands repository](https://github.com/All-Hands-AI/OpenHands) into your own account.
+Clone your own forked repository into your local environment:

-### Pull Request title
+```shell
+git clone git@github.com:<YOUR-USERNAME>/OpenHands.git
+```
+
+### 2. Configure Git
+
+Set the official repository as your [upstream](https://www.atlassian.com/git/tutorials/git-forks-and-upstreams) to synchronize with the latest update in the official repository.
+Add the original repository as upstream:
+
+```shell
+cd OpenHands
+git remote add upstream git@github.com:All-Hands-AI/OpenHands.git
+```
+
+Verify that the remote is set:
+
+```shell
+git remote -v
+```
+
+You should see both `origin` and `upstream` in the output.
+
+### 3. Synchronize with Official Repository
+Synchronize with the latest commits from the official repository before coding:
+
+```shell
+git fetch upstream
+git checkout main
+git merge upstream/main
+git push origin main
+```
+
+### 4. Set up the Development Environment
+
+We have a separate doc [Development.md](https://github.com/All-Hands-AI/OpenHands/blob/main/Development.md) that tells you how to set up a development workflow.
+
+### 5. Write Code and Commit It
+
+Once you have done this, you can write code, test it, and commit it to a branch (replace `my_branch` with an appropriate name):
+
+```shell
+git checkout -b my_branch
+git add .
+git commit
+git push origin my_branch
+```
+
+### 6. Open a Pull Request
+
+* On GitHub, go to the page of your forked repository, and create a Pull Request:
+   - Click on `Branches`
+   - Click on the `...` beside your branch and click on `New pull request`
+   - Set `base repository` to `All-Hands-AI/OpenHands`
+   - Set `base` to `main`
+   - Click `Create pull request`
+
+The PR should appear in [OpenHands PRs](https://github.com/All-Hands-AI/OpenHands/pulls).
+
+Then the OpenHands team will review your code.
+
+## PR Rules
+
+### 1. Pull Request title
 As described [here](https://github.com/commitizen/conventional-commit-types/blob/master/index.json), a valid PR title should begin with one of the following prefixes:

 - `feat`: A new feature
@@ -86,9 +111,6 @@ For example, a PR title could be:

 You may also check out previous PRs in the [PR list](https://github.com/All-Hands-AI/OpenHands/pulls).

-### Pull Request description
+### 2. Pull Request description
 - If your PR is small (such as a typo fix), you can go brief.
 - If it contains a lot of changes, it's better to write more details.
-
-If your changes are user-facing (e.g. a new feature in the UI, a change in behavior, or a bugfix)
-please include a short message that we can add to our changelog.
@@ -2,7 +2,7 @@

 ## Contributors

-We would like to thank all the [contributors](https://github.com/All-Hands-AI/OpenHands/graphs/contributors) who have helped make OpenHands possible. We greatly appreciate your dedication and hard work.
+We would like to thank all the [contributors](https://github.com/All-Hands-AI/OpenHands/graphs/contributors) who have helped make OpenHands possible. Your dedication and hard work are greatly appreciated.

 ## Open Source Projects

@@ -10,7 +10,7 @@ OpenHands includes and adapts the following open source projects. We are gratefu

 #### [SWE Agent](https://github.com/princeton-nlp/swe-agent)
   - License: MIT License
-   - Description: Adapted for use in OpenHands's agent hub
+   - Description: Adapted for use in OpenHands's agenthub

 #### [Aider](https://github.com/paul-gauthier/aider)
   - License: Apache License 2.0
@@ -5,14 +5,12 @@ Otherwise, you can clone the OpenHands project directly.

 ## Start the server for development
 ### 1. Requirements
-* Linux, Mac OS, or [WSL on Windows](https://learn.microsoft.com/en-us/windows/wsl/install)  [Ubuntu <= 22.04]
+* Linux, Mac OS, or [WSL on Windows](https://learn.microsoft.com/en-us/windows/wsl/install)  [ Ubuntu <= 22.04]
 * [Docker](https://docs.docker.com/engine/install/) (For those on MacOS, make sure to allow the default Docker socket to be used from advanced settings!)
-* [Python](https://www.python.org/downloads/) = 3.12
+* [Python](https://www.python.org/downloads/) = 3.11
 * [NodeJS](https://nodejs.org/en/download/package-manager) >= 18.17.1
 * [Poetry](https://python-poetry.org/docs/#installing-with-the-official-installer) >= 1.8
-* OS-specific dependencies:
-  - Ubuntu: build-essential => `sudo apt-get install build-essential`
-  - WSL: netcat => `sudo apt-get install netcat`
+* netcat => `sudo apt-get install netcat`

 Make sure you have all these dependencies installed before moving on to `make build`.

@@ -24,8 +22,8 @@ If you want to develop without system admin/sudo access to upgrade/install `Pyth
 curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
 bash Miniforge3-$(uname)-$(uname -m).sh

-# Install Python 3.12, nodejs, and poetry
-mamba install python=3.12
+# Install Python 3.11, nodejs, and poetry
+mamba install python=3.11
 mamba install conda-forge::nodejs
 mamba install conda-forge::poetry
 ```
@@ -93,15 +91,13 @@ To run tests, refer to the following:
 poetry run pytest ./tests/unit/test_*.py
 ```

+#### Integration tests
+Please refer to [this README](./tests/integration/README.md) for details.
+
 ### 9. Add or update dependency
 1. Add your dependency in `pyproject.toml` or use `poetry add xxx`
 2. Update the poetry.lock file via `poetry lock --no-update`

-### 9. Use existing Docker image
-To reduce build time (e.g., if no changes were made to the client-runtime component), you can use an existing Docker container image. Follow these steps:
-1. Set the SANDBOX_RUNTIME_CONTAINER_IMAGE environment variable to the desired Docker image.
-2. Example: export SANDBOX_RUNTIME_CONTAINER_IMAGE=ghcr.io/all-hands-ai/runtime:0.9-nikolaik
-
 ## Develop inside Docker container

 TL;DR
@@ -10,7 +10,7 @@ DEFAULT_WORKSPACE_DIR = "./workspace"
 DEFAULT_MODEL = "gpt-4o"
 CONFIG_FILE = config.toml
 PRE_COMMIT_CONFIG_PATH = "./dev_config/python/.pre-commit-config.yaml"
-PYTHON_VERSION = 3.12
+PYTHON_VERSION = 3.11

 # ANSI color codes
 GREEN=$(shell tput -Txterm setaf 2)
@@ -190,12 +190,12 @@ build-frontend:
 # Start backend
 start-backend:
 	@echo "$(YELLOW)Starting backend...$(RESET)"
-	@poetry run uvicorn openhands.server.listen:app --host $(BACKEND_HOST) --port $(BACKEND_PORT) --reload --reload-exclude "$(shell pwd)/workspace"
+	@poetry run uvicorn openhands.server.listen:app --host $(BACKEND_HOST) --port $(BACKEND_PORT) --reload --reload-exclude "workspace/*"

 # Start frontend
 start-frontend:
 	@echo "$(YELLOW)Starting frontend...$(RESET)"
-	@cd frontend && VITE_BACKEND_HOST=$(BACKEND_HOST_PORT) VITE_FRONTEND_PORT=$(FRONTEND_PORT) npm run dev -- --port $(FRONTEND_PORT) --host $(BACKEND_HOST)
+	@cd frontend && VITE_BACKEND_HOST=$(BACKEND_HOST_PORT) VITE_FRONTEND_PORT=$(FRONTEND_PORT) npm run start

 # Common setup for running the app (non-callable)
 _run_setup:
@@ -214,7 +214,7 @@ _run_setup:
 run:
 	@echo "$(YELLOW)Running the app...$(RESET)"
 	@$(MAKE) -s _run_setup
-	@$(MAKE) -s start-frontend
+	@cd frontend && echo "$(BLUE)Starting frontend with npm...$(RESET)" && npm run start -- --port $(FRONTEND_PORT)
 	@echo "$(GREEN)Application started successfully.$(RESET)"

 # Run the app (in docker)
@@ -36,16 +36,14 @@ Learn more at [docs.all-hands.dev](https://docs.all-hands.dev), or jump to the [
 The easiest way to run OpenHands is in Docker. You can change `WORKSPACE_BASE` below to
 point OpenHands to existing code that you'd like to modify.

-See the [Installation](https://docs.all-hands.dev/modules/usage/installation) guide for
+See the [Getting Started](https://docs.all-hands.dev/modules/usage/getting-started) guide for
 system requirements and more information.

 ```bash
 export WORKSPACE_BASE=$(pwd)/workspace

-docker pull ghcr.io/all-hands-ai/runtime:0.11-nikolaik
-
 docker run -it --pull=always \
-    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=ghcr.io/all-hands-ai/runtime:0.11-nikolaik \
+    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=ghcr.io/all-hands-ai/runtime:0.9-nikolaik \
    -e SANDBOX_USER_ID=$(id -u) \
    -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
    -v $WORKSPACE_BASE:/opt/workspace_base \
@@ -53,19 +51,15 @@ docker run -it --pull=always \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app-$(date +%Y%m%d%H%M%S) \
-    ghcr.io/all-hands-ai/openhands:0.11
+    ghcr.io/all-hands-ai/openhands:0.9
 ```

 You'll find OpenHands running at [http://localhost:3000](http://localhost:3000)!

-You'll need a model provider and API key. One option that works well: [Claude 3.5 Sonnet](https://www.anthropic.com/api), but you have [many options](https://docs.all-hands.dev/modules/usage/llms).
-
---
-
 You can also run OpenHands in a scriptable [headless mode](https://docs.all-hands.dev/modules/usage/how-to/headless-mode),
 or as an [interactive CLI](https://docs.all-hands.dev/modules/usage/how-to/cli-mode).

-Visit [Installation](https://docs.all-hands.dev/modules/usage/installation) for more information and setup instructions.
+Visit [Getting Started](https://docs.all-hands.dev/modules/usage/getting-started) for more information and setup instructions.

 If you want to modify the OpenHands source code, check out [Development.md](https://github.com/All-Hands-AI/OpenHands/blob/main/Development.md).

@@ -120,8 +114,8 @@ For a list of open source projects and licenses used in OpenHands, please see ou
 ## 📚 Cite

 ```
-@misc{openhands,
-      title={{OpenHands: An Open Platform for AI Software Developers as Generalist Agents}},
+@misc{opendevin,
+      title={{OpenDevin: An Open Platform for AI Software Developers as Generalist Agents}},
      author={Xingyao Wang and Boxuan Li and Yufan Song and Frank F. Xu and Xiangru Tang and Mingchen Zhuge and Jiayi Pan and Yueqi Song and Bowen Li and Jaskirat Singh and Hoang H. Tran and Fuqiang Li and Ren Ma and Mingzhang Zheng and Bill Qian and Yanjun Shao and Niklas Muennighoff and Yizhe Zhang and Binyuan Hui and Junyang Lin and Robert Brennan and Hao Peng and Heng Ji and Graham Neubig},
      year={2024},
      eprint={2407.16741},
@@ -2,7 +2,7 @@

 In this folder, there may exist multiple implementations of `Agent` that will be used by the framework.

-For example, `openhands/agenthub/codeact_agent`, etc.
+For example, `agenthub/codeact_agent`, etc.
 Contributors from different backgrounds and interests can choose to contribute to any (or all!) of these directions.

 ## Constructing an Agent
@@ -1,13 +1,13 @@
 from dotenv import load_dotenv

-from openhands.agenthub.micro.agent import MicroAgent
-from openhands.agenthub.micro.registry import all_microagents
+from agenthub.micro.agent import MicroAgent
+from agenthub.micro.registry import all_microagents
 from openhands.controller.agent import Agent

 load_dotenv()


-from openhands.agenthub import (  # noqa: E402
+from agenthub import (  # noqa: E402
    browsing_agent,
    codeact_agent,
    codeact_swe_agent,
@@ -1,4 +1,4 @@
-from openhands.agenthub.browsing_agent.browsing_agent import BrowsingAgent
+from agenthub.browsing_agent.browsing_agent import BrowsingAgent
 from openhands.controller.agent import Agent

 Agent.register('BrowsingAgent', BrowsingAgent)
@@ -3,7 +3,7 @@ import os
 from browsergym.core.action.highlevel import HighLevelActionSet
 from browsergym.utils.obs import flatten_axtree_to_str

-from openhands.agenthub.browsing_agent.response_parser import BrowsingResponseParser
+from agenthub.browsing_agent.response_parser import BrowsingResponseParser
 from openhands.controller.agent import Agent
 from openhands.controller.state.state import State
 from openhands.core.config import AgentConfig
@@ -218,6 +218,7 @@ class BrowsingAgent(Agent):

        response = self.llm.completion(
            messages=self.llm.format_messages_for_llm(messages),
+            temperature=0.0,
            stop=[')```', ')\n```'],
        )
        return self.response_parser.parse(response)
@@ -12,7 +12,7 @@ from browsergym.core.action.base import AbstractActionSet
 from browsergym.core.action.highlevel import HighLevelActionSet
 from browsergym.core.action.python import PythonActionSet

-from openhands.agenthub.browsing_agent.utils import (
+from agenthub.browsing_agent.utils import (
    ParseError,
    parse_html_tags_raise,
 )
@@ -0,0 +1,88 @@
+import ast
+
+from openhands.controller.action_parser import ActionParser, ResponseParser
+from openhands.core.logger import openhands_logger as logger
+from openhands.events.action import (
+    Action,
+    BrowseInteractiveAction,
+)
+
+
+class BrowsingResponseParser(ResponseParser):
+    def __init__(self):
+        # Need to pay attention to the item order in self.action_parsers
+        super().__init__()
+        self.action_parsers = [BrowsingActionParserMessage()]
+        self.default_parser = BrowsingActionParserBrowseInteractive()
+
+    def parse(self, response: str) -> Action:
+        action_str = self.parse_response(response)
+        return self.parse_action(action_str)
+
+    def parse_response(self, response) -> str:
+        action_str = response['choices'][0]['message']['content']
+        if action_str is None:
+            return ''
+        action_str = action_str.strip()
+        if action_str and not action_str.endswith('```'):
+            action_str = action_str + ')```'
+        logger.debug(action_str)
+        return action_str
+
+    def parse_action(self, action_str: str) -> Action:
+        for action_parser in self.action_parsers:
+            if action_parser.check_condition(action_str):
+                return action_parser.parse(action_str)
+        return self.default_parser.parse(action_str)
+
+
+class BrowsingActionParserMessage(ActionParser):
+    """Parser action:
+    - BrowseInteractiveAction(browser_actions) - unexpected response format, message back to user
+    """
+
+    def __init__(
+        self,
+    ):
+        pass
+
+    def check_condition(self, action_str: str) -> bool:
+        return '```' not in action_str
+
+    def parse(self, action_str: str) -> Action:
+        msg = f'send_msg_to_user("""{action_str}""")'
+        return BrowseInteractiveAction(
+            browser_actions=msg,
+            thought=action_str,
+            browsergym_send_msg_to_user=action_str,
+        )
+
+
+class BrowsingActionParserBrowseInteractive(ActionParser):
+    """Parser action:
+    - BrowseInteractiveAction(browser_actions) - handle send message to user function call in BrowserGym
+    """
+
+    def __init__(
+        self,
+    ):
+        pass
+
+    def check_condition(self, action_str: str) -> bool:
+        return True
+
+    def parse(self, action_str: str) -> Action:
+        thought = action_str.split('```')[0].strip()
+        action_str = action_str.split('```')[1].strip()
+        msg_content = ''
+        for sub_action in action_str.split('\n'):
+            if 'send_msg_to_user(' in sub_action:
+                tree = ast.parse(sub_action)
+                args = tree.body[0].value.args  # type: ignore
+                msg_content = args[0].value
+
+        return BrowseInteractiveAction(
+            browser_actions=action_str,
+            thought=thought,
+            browsergym_send_msg_to_user=msg_content,
+        )
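Reviewer note: the default parser in this new file splits the reply on the triple-backtick fence and, if a `send_msg_to_user(...)` call is present, pulls its first argument out with `ast`. A minimal standalone sketch of that step follows, assuming the reply has already had its `)` and closing fence restored by `parse_response`. The function name `split_browsing_response` is hypothetical and not part of the diff, and the sketch returns a plain tuple instead of a `BrowseInteractiveAction`.

```python
import ast


def split_browsing_response(action_str: str) -> tuple[str, str, str]:
    """Return (thought, browser_actions, message_to_user) from a raw reply.

    Hypothetical helper mirroring the parse() logic in the diff.
    """
    parts = action_str.split('```')
    thought = parts[0].strip()
    browser_actions = parts[1].strip()
    msg_content = ''
    for sub_action in browser_actions.split('\n'):
        if 'send_msg_to_user(' in sub_action:
            # Parse the single call expression and read its first argument.
            call = ast.parse(sub_action).body[0].value
            msg_content = call.args[0].value
    return thought, browser_actions, msg_content


reply = 'I found the answer.\n```send_msg_to_user("42")```'
print(split_browsing_response(reply))
```

As in the diff, a reply without any fence would raise an `IndexError` here; the real parser avoids this because `BrowsingActionParserMessage` handles fence-free replies first.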
@@ -10,3 +10,20 @@ The conceptual idea is illustrated below. At each turn, the agent can:
   - Execute any valid `Python` code with [an interactive Python interpreter](https://ipython.org/). This is simulated through a `bash` command; see the plugin system below for more details.

 ![image](https://github.com/All-Hands-AI/OpenHands/assets/38853559/92b622e3-72ad-4a61-8f41-8c040b6d5fb3)
+
+## Plugin System
+
+To make the CodeAct agent more capable while keeping its action space limited to `bash`, the agent relies on OpenHands's plugin system:
+- [Jupyter plugin](https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/runtime/plugins/jupyter): runs IPython code through a bash command
+- [Agent Skills plugin](https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/runtime/plugins/agent_skills): bash command-line tools for software development tasks, introduced by [swe-agent](https://github.com/princeton-nlp/swe-agent)
+
+## Demo
+
+https://github.com/All-Hands-AI/OpenHands/assets/38853559/f592a192-e86c-4f48-ad31-d69282d5f6ac
+
+*Example of CodeActAgent with `gpt-4-turbo-2024-04-09` performing a data science task (linear regression)*
+
+## Work in Progress & Next Steps
+
+- [ ] Support web browsing
+- [ ] Complete the workflow for the CodeAct agent to submit GitHub PRs
@@ -1,4 +1,4 @@
-from openhands.agenthub.codeact_agent.codeact_agent import CodeActAgent
+from agenthub.codeact_agent.codeact_agent import CodeActAgent
 from openhands.controller.agent import Agent

 Agent.register('CodeActAgent', CodeActAgent)
@@ -1,17 +1,11 @@
 import re

-from openhands.controller.action_parser import (
-    ActionParser,
-    ResponseParser,
-)
-from openhands.core.exceptions import LLMMalformedActionError
-from openhands.core.logger import openhands_logger as logger
+from openhands.controller.action_parser import ActionParser, ResponseParser
 from openhands.events.action import (
    Action,
    AgentDelegateAction,
    AgentFinishAction,
    CmdRunAction,
-    FileEditAction,
    IPythonRunCellAction,
    MessageAction,
 )
@@ -20,7 +14,6 @@ from openhands.events.action import (
 class CodeActResponseParser(ResponseParser):
    """Parser action:
    - CmdRunAction(command) - bash command to run
-    - FileEditAction(path, content) - edit a file
    - IPythonRunCellAction(code) - IPython code to run
    - AgentDelegateAction(agent, inputs) - delegate action for (sub)task
    - MessageAction(content) - Message action to run (e.g. ask for clarification)
@@ -32,7 +25,6 @@ class CodeActResponseParser(ResponseParser):
        super().__init__()
        self.action_parsers = [
            CodeActActionParserFinish(),
-            CodeActActionParserFileEdit(),
            CodeActActionParserCmdRun(),
            CodeActActionParserIPythonRunCell(),
            CodeActActionParserAgentDelegate(),
@@ -48,14 +40,8 @@ class CodeActResponseParser(ResponseParser):
        if action is None:
            return ''
        for lang in ['bash', 'ipython', 'browse']:
-            # special handling for DeepSeek: it has stop-word bug and returns </execute_ipython instead of </execute_ipython>
-            if f'</execute_{lang}' in action and f'</execute_{lang}>' not in action:
-                action = action.replace(f'</execute_{lang}', f'</execute_{lang}>')
-
            if f'<execute_{lang}>' in action and f'</execute_{lang}>' not in action:
                action += f'</execute_{lang}>'
-        if '<file_edit' in action and '</file_edit>' not in action:
-            action += '</file_edit>'
        return action

    def parse_action(self, action_str: str) -> Action:
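Reviewer note: after this revert, the only repair applied to a truncated reply is re-appending the closing `</execute_*>` tag that the stop sequence cut off. The DeepSeek and `<file_edit>` special cases are gone. A minimal sketch of the remaining behavior follows; the function name `close_open_tags` is hypothetical and the logic is copied from the loop in the hunk above.

```python
def close_open_tags(action: str) -> str:
    """Append a missing closing tag for each execute block (sketch)."""
    for lang in ['bash', 'ipython', 'browse']:
        if f'<execute_{lang}>' in action and f'</execute_{lang}>' not in action:
            action += f'</execute_{lang}>'
    return action


print(close_open_tags('Let me list the files:\n<execute_bash>\nls\n'))
```

A reply that already contains its closing tag is returned unchanged.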
@@ -168,15 +154,8 @@ class CodeActActionParserAgentDelegate(ActionParser):
        ), 'self.agent_delegate should not be None when parse is called'
        thought = action_str.replace(self.agent_delegate.group(0), '').strip()
        browse_actions = self.agent_delegate.group(1).strip()
-        thought = (
-            f'{thought}\nI should start with: {browse_actions}'
-            if thought
-            else f'I should start with: {browse_actions}'
-        )
-
-        return AgentDelegateAction(
-            agent='BrowsingAgent', thought=thought, inputs={'task': browse_actions}
-        )
+        task = f'{thought}. I should start with: {browse_actions}'
+        return AgentDelegateAction(agent='BrowsingAgent', inputs={'task': task})


 class CodeActActionParserMessage(ActionParser):
@@ -196,87 +175,3 @@ class CodeActActionParserMessage(ActionParser):

    def parse(self, action_str: str) -> Action:
        return MessageAction(content=action_str, wait_for_response=True)
-
-
-class CodeActActionParserFileEdit(ActionParser):
-    """Parser action:
-    - FileEditAction(path, content) - edit a file
-    """
-
-    def __init__(self):
-        self.file_edit_match: re.Match | None = None
-
-    def check_condition(self, action_str: str) -> bool:
-        if '<file_edit' not in action_str:
-            return False
-
-        # Updated regex to make start and end optional
-        self.file_edit_match = re.search(
-            r'<file_edit\s+path=(["\']?)(.*?)\1(?:\s+start=(["\']?)(.*?)\3)?(?:\s+end=(["\']?)(.*?)\5)?\s*>(.*?)</file_edit>',
-            action_str,
-            re.DOTALL,
-        )
-
-        if self.file_edit_match is None:
-            logger.error(
-                f'FileEditAction detected but the format is incorrect. Unable to match for <file_edit> in:\n{"-" * 80}\n{action_str}\n{"-" * 80}'
-            )
-            raise LLMMalformedActionError(
-                'FileEditAction detected but the format is incorrect. Usage:\n'
-                '<file_edit path="[path]" start=[start_line] end=[end_line]>\n'
-                '[content_to_edit]\n'
-                '</file_edit>\n'
-            )
-
-        path = self.file_edit_match.group(2)
-        start = self.file_edit_match.group(4)
-        end = self.file_edit_match.group(6)
-
-        if not path:
-            raise LLMMalformedActionError(
-                'FileEditAction detected but no `path` specified. You should specify the path of the file to edit.'
-            )
-
-        if start:
-            try:
-                int(start)
-            except ValueError:
-                raise LLMMalformedActionError(
-                    f'FileEditAction detected but `start` is not a valid integer: {start}'
-                )
-
-        if end:
-            try:
-                int(end)
-            except ValueError:
-                raise LLMMalformedActionError(
-                    f'FileEditAction detected but `end` is not a valid integer: {end}'
-                )
-
-        return True
-
-    def parse(self, action_str: str) -> Action:
-        assert (
-            self.file_edit_match is not None
-        ), 'self.file_edit_match should not be None when parse is called'
-
-        file_path = self.file_edit_match.group(2).strip()
-        start_line = (
-            int(self.file_edit_match.group(4))
-            if self.file_edit_match.group(4)
-            else None
-        )
-        end_line = (
-            int(self.file_edit_match.group(6))
-            if self.file_edit_match.group(6)
-            else None
-        )
-        content = self.file_edit_match.group(7)
-        thought = action_str.replace(self.file_edit_match.group(0), '').strip()
-
-        action = FileEditAction(path=file_path, content=content, thought=thought)
-        if start_line is not None:
-            action.start = start_line
-        if end_line is not None:
-            action.end = end_line
-        return action
@@ -1,24 +1,23 @@
 import os
 from itertools import islice

-from openhands.agenthub.codeact_agent.action_parser import CodeActResponseParser
+from agenthub.codeact_agent.action_parser import CodeActResponseParser
 from openhands.controller.agent import Agent
 from openhands.controller.state.state import State
 from openhands.core.config import AgentConfig
+from openhands.core.logger import openhands_logger as logger
 from openhands.core.message import ImageContent, Message, TextContent
 from openhands.events.action import (
    Action,
    AgentDelegateAction,
    AgentFinishAction,
    CmdRunAction,
-    FileEditAction,
    IPythonRunCellAction,
    MessageAction,
 )
 from openhands.events.observation import (
    AgentDelegateObservation,
    CmdOutputObservation,
-    FileEditObservation,
    IPythonRunCellObservation,
    UserRejectObservation,
 )
@@ -36,7 +35,7 @@ from openhands.utils.prompt import PromptManager


 class CodeActAgent(Agent):
-    VERSION = '2.0'
+    VERSION = '1.9'
    """
    The Code Act Agent is a minimalist agent.
    The agent works by passing the model a list of action-observation pairs and prompting the model to take the next step.
@@ -104,8 +103,6 @@ class CodeActAgent(Agent):
            return f'{action.thought}\n<execute_ipython>\n{action.code}\n</execute_ipython>'
        elif isinstance(action, AgentDelegateAction):
            return f'{action.thought}\n<execute_browse>\n{action.inputs["task"]}\n</execute_browse>'
-        elif isinstance(action, FileEditAction):
-            return f'{action.thought}\n<file_edit path={action.path}>\n{action.content}\n</file_edit>'
        elif isinstance(action, MessageAction):
            return action.content
        elif isinstance(action, AgentFinishAction) and action.source == 'agent':
@@ -113,16 +110,13 @@ class CodeActAgent(Agent):
        return ''

    def get_action_message(self, action: Action) -> Message | None:
-        if isinstance(
-            action,
-            (
-                AgentDelegateAction,
-                CmdRunAction,
-                IPythonRunCellAction,
-                MessageAction,
-                FileEditAction,
-            ),
-        ) or (isinstance(action, AgentFinishAction) and action.source == 'agent'):
+        if (
+            isinstance(action, AgentDelegateAction)
+            or isinstance(action, CmdRunAction)
+            or isinstance(action, IPythonRunCellAction)
+            or isinstance(action, MessageAction)
+            or (isinstance(action, AgentFinishAction) and action.source == 'agent')
+        ):
            content = [TextContent(text=self.action_to_str(action))]

            if (
@@ -141,9 +135,7 @@ class CodeActAgent(Agent):
        max_message_chars = self.llm.config.max_message_chars
        obs_prefix = 'OBSERVATION:\n'
        if isinstance(obs, CmdOutputObservation):
-            text = obs_prefix + truncate_content(
-                obs.content + obs.interpreter_details, max_message_chars
-            )
+            text = obs_prefix + truncate_content(obs.content, max_message_chars)
            text += (
                f'\n[Command {obs.command_id} finished with exit code {obs.exit_code}]'
            )
@@ -160,14 +152,8 @@ class CodeActAgent(Agent):
            text = '\n'.join(splitted)
            text = truncate_content(text, max_message_chars)
            return Message(role='user', content=[TextContent(text=text)])
-        elif isinstance(obs, FileEditObservation):
-            text = obs_prefix + truncate_content(str(obs), max_message_chars)
-            return Message(role='user', content=[TextContent(text=text)])
        elif isinstance(obs, AgentDelegateObservation):
-            text = obs_prefix + truncate_content(
-                obs.outputs['content'] if 'content' in obs.outputs else '',
-                max_message_chars,
-            )
+            text = obs_prefix + truncate_content(str(obs.outputs), max_message_chars)
            return Message(role='user', content=[TextContent(text=text)])
        elif isinstance(obs, ErrorObservation):
            text = obs_prefix + truncate_content(obs.content, max_message_chars)
@@ -213,11 +199,23 @@ class CodeActAgent(Agent):
                '</execute_ipython>',
                '</execute_bash>',
                '</execute_browse>',
-                '</file_edit>',
            ],
+            'temperature': 0.0,
        }

-        response = self.llm.completion(**params)
+        if self.llm.is_caching_prompt_active():
+            params['extra_headers'] = {
+                'anthropic-beta': 'prompt-caching-2024-07-31',
+            }
+
+        try:
+            response = self.llm.completion(**params)
+        except Exception as e:
+            logger.error(f'{e}')
+            error_message = '{}: {}'.format(type(e).__name__, str(e).split('\n')[0])
+            return AgentFinishAction(
+                thought=f'Agent encountered an error while processing the last action.\nError: {error_message}\nPlease try again.'
+            )

        return self.action_parser.parse(response)
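Reviewer note: the restored `try`/`except` turns any completion failure into an `AgentFinishAction` whose thought carries a one-line summary of the exception. A minimal sketch of how that summary is built follows; the helper name `summarize_error` is hypothetical and the format string is taken from the hunk above.

```python
def summarize_error(e: Exception) -> str:
    """Return 'ExceptionType: first line of the message' (sketch)."""
    return '{}: {}'.format(type(e).__name__, str(e).split('\n')[0])


try:
    raise ValueError('bad input\nfull traceback details')
except Exception as e:
    print(summarize_error(e))
```

Only the first line of the exception message is kept, so multi-line provider errors do not end up in the agent's final thought.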

@@ -0,0 +1,52 @@
+{% set MINIMAL_SYSTEM_PREFIX %}
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed answers to the user's questions.
+The assistant can use a Python environment with <execute_ipython>, e.g.:
+<execute_ipython>
+print("Hello World!")
+</execute_ipython>
+The assistant can execute bash commands wrapped with <execute_bash>, e.g. <execute_bash> ls </execute_bash>.
+If a bash command returns exit code `-1`, this means the process is not yet finished.
+The assistant must then send a second <execute_bash>. The second <execute_bash> can be empty
+(which will retrieve any additional logs), or it can contain text to be sent to STDIN of the running process,
+or it can contain the text `ctrl+c` to interrupt the process.
+
+For commands that may run indefinitely, the output should be redirected to a file and the command run
+in the background, e.g. <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
+If a command execution result says "Command timed out. Sending SIGINT to the process",
+the assistant should retry running the command in the background.
+{% endset %}
+{% set BROWSING_PREFIX %}
+The assistant can browse the Internet with <execute_browse> and </execute_browse>.
+For example, <execute_browse> Tell me the USA's president using Google search </execute_browse>.
+Or <execute_browse> Tell me what is in http://example.com </execute_browse>.
+{% endset %}
+{% set PIP_INSTALL_PREFIX %}
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: <execute_ipython> %pip install [package needed] </execute_ipython> and should always import packages and define variables before starting to use them.
+{% endset %}
+{% set SYSTEM_PREFIX = MINIMAL_SYSTEM_PREFIX + BROWSING_PREFIX + PIP_INSTALL_PREFIX %}
+{% set COMMAND_DOCS %}
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in <execute_ipython> environment:
+{{ agent_skills_docs }}
+IMPORTANT:
+- `open_file` only returns the first 100 lines of the file by default! The assistant MUST use `scroll_down` repeatedly to read the full file BEFORE making edits!
+- The assistant shall adhere to THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRING PROPER INDENTATION. If the assistant would like to add the line '        print(x)', it must fully write the line out, with all leading spaces before the code!
+- Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+- Any code issued should be less than 50 lines to avoid context being cut off!
+- After EVERY `create_file` the method `append_file` shall be used to write the FIRST content!
+- For `edit_file_by_replace` NEVER provide empty parameters!
+- For `edit_file_by_replace` the file must be read fully before any replacements!
+{% endset %}
+{% set SYSTEM_SUFFIX %}
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task, it MUST include <finish></finish> in its response.
+IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
+The assistant should utilize full file paths and the `pwd` command to prevent path-related errors.
+The assistant must avoid apologies and thanks in its responses.
+
+{% endset %}
+{# Combine all parts without newlines between them #}
+{{ SYSTEM_PREFIX -}}
+{{- COMMAND_DOCS -}}
+{{- SYSTEM_SUFFIX }}
@@ -4,19 +4,23 @@
 USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.

 ASSISTANT:
-Sure! Let me first check the current directory:
-<execute_bash>
-ls
-</execute_bash>
+Sure! Let me create a Python file `app.py`:
+<execute_ipython>
+create_file('app.py')
+</execute_ipython>

 USER:
 OBSERVATION:
-openhands@runtime:~/workspace$
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]

 ASSISTANT:
-There is no `app.py` file in the current directory. Let me create a Python file `app.py`:
-<file_edit path="/workspace/app.py" start=-1 end=-1>
-from flask import Flask
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+<execute_ipython>
+EDITED_CODE="""from flask import Flask
 app = Flask(__name__)

@app.route('/')
@@ -25,27 +29,30 @@ def index():
    return str(numbers)

 if __name__ == '__main__':
-    app.run(port=5000)
-</file_edit>
+    app.run(port=5000)"""
+
+insert_content_at_line(
+  'app.py',
+  1,
+  EDITED_CODE,
+)
+</execute_ipython>

 USER:
 OBSERVATION:
-[New file /workspace/app.py is created.]
-(begin of changes)
--- /workspace/app.py
-+++ /workspace/app.py
-@@ -1 +1,10 @@
-+from flask import Flask
-+app = Flask(__name__)
-
-+@app.route('/')
-+def index():
-+    numbers = list(range(1, 11))
-+    return str(numbers)
-+
-+if __name__ == '__main__':
-+    app.run(port=5000)
-(end of changes)
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6|    numbers = list(range(1, 11))
+7|    return str(numbers)
+8|
+9|if __name__ == '__main__':
+10|    app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]

 ASSISTANT:
 I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
@@ -170,17 +177,16 @@ USER:

 ASSISTANT:
 I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
-<file_edit path="/workspace/app.py" start=4 end=7>
-@app.route('/')
-def index():
-    numbers = list(range(1, 11))
-    ret = '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
-    return ret
-</file_edit>
+<execute_ipython>
+edit_file_by_replace(
+  'app.py',
+  to_replace="    return str(numbers)",
+  new_content="    return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'",
+)
+</execute_ipython>

 USER:
 Observation:
-[Edited existing file /workspace/app.py]
 [File: /workspace/app.py (10 lines total after edit)]
 (this is the beginning of the file)
 1|from flask import Flask
@@ -189,11 +195,10 @@ Observation:
 4|@app.route('/')
 5|def index():
 6|    numbers = list(range(1, 11))
-7|    ret = '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
-8|    return ret
-9|
-10|if __name__ == '__main__':
-11|    app.run(port=5000)
+7|    return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
+8|
+9|if __name__ == '__main__':
+10|    app.run(port=5000)
 (this is the end of the file)
 [File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]

@@ -1,4 +1,4 @@
-from openhands.agenthub.codeact_swe_agent.codeact_swe_agent import CodeActSWEAgent
+from agenthub.codeact_swe_agent.codeact_swe_agent import CodeActSWEAgent
 from openhands.controller.agent import Agent

 Agent.register('CodeActSWEAgent', CodeActSWEAgent)
@@ -1,12 +1,10 @@
-from openhands.agenthub.codeact_swe_agent.prompt import (
+from agenthub.codeact_swe_agent.prompt import (
    COMMAND_DOCS,
    SWE_EXAMPLE,
    SYSTEM_PREFIX,
    SYSTEM_SUFFIX,
 )
-from openhands.agenthub.codeact_swe_agent.response_parser import (
-    CodeActSWEResponseParser,
-)
+from agenthub.codeact_swe_agent.response_parser import CodeActSWEResponseParser
 from openhands.controller.agent import Agent
 from openhands.controller.state.state import State
 from openhands.core.config import AgentConfig
@@ -89,7 +87,11 @@ class CodeActSWEAgent(Agent):
        return ''

    def get_action_message(self, action: Action) -> Message | None:
-        if isinstance(action, (CmdRunAction, IPythonRunCellAction, MessageAction)):
+        if (
+            isinstance(action, CmdRunAction)
+            or isinstance(action, IPythonRunCellAction)
+            or isinstance(action, MessageAction)
+        ):
            content = [TextContent(text=self.action_to_str(action))]

            if (
@@ -108,9 +110,7 @@ class CodeActSWEAgent(Agent):
    def get_observation_message(self, obs: Observation) -> Message | None:
        max_message_chars = self.llm.config.max_message_chars
        if isinstance(obs, CmdOutputObservation):
-            text = 'OBSERVATION:\n' + truncate_content(
-                obs.content + obs.interpreter_details, max_message_chars
-            )
+            text = 'OBSERVATION:\n' + truncate_content(obs.content, max_message_chars)
            text += (
                f'\n[Command {obs.command_id} finished with exit code {obs.exit_code}]'
            )
@@ -166,6 +166,7 @@ class CodeActSWEAgent(Agent):
                '</execute_ipython>',
                '</execute_bash>',
            ],
+            temperature=0.0,
        )

        return self.response_parser.parse(response)
@@ -1,4 +1,4 @@
-from openhands.agenthub.codeact_swe_agent.action_parser import (
+from agenthub.codeact_swe_agent.action_parser import (
    CodeActSWEActionParserCmdRun,
    CodeActSWEActionParserFinish,
    CodeActSWEActionParserIPythonRunCell,
@@ -1,4 +1,4 @@
-from openhands.agenthub.delegator_agent.agent import DelegatorAgent
+from agenthub.delegator_agent.agent import DelegatorAgent
 from openhands.controller.agent import Agent

 Agent.register('DelegatorAgent', DelegatorAgent)
@@ -1,4 +1,4 @@
-from openhands.agenthub.dummy_agent.agent import DummyAgent
+from agenthub.dummy_agent.agent import DummyAgent
 from openhands.controller.agent import Agent

 Agent.register('DummyAgent', DummyAgent)
@@ -1,7 +1,7 @@
 from jinja2 import BaseLoader, Environment

-from openhands.agenthub.micro.instructions import instructions
-from openhands.agenthub.micro.registry import all_microagents
+from agenthub.micro.instructions import instructions
+from agenthub.micro.registry import all_microagents
 from openhands.controller.agent import Agent
 from openhands.controller.state.state import State
 from openhands.core.config import AgentConfig
@@ -78,6 +78,7 @@ class MicroAgent(Agent):
        message = Message(role='user', content=content)
        resp = self.llm.completion(
            messages=self.llm.format_messages_for_llm(message),
+            temperature=0.0,
        )
        action_resp = resp['choices'][0]['message']['content']
        action = parse_response(action_resp)
@@ -1,7 +1,6 @@
 # Task
 You are in charge of accomplishing the following task:
-{% set goal = latest_user_message if latest_user_message is not none else state.inputs.task %}
-{{ goal }}
+{{ latest_user_message }}

 In order to accomplish this goal, you must delegate tasks to one or more agents, who
 can do the actual work. A description of each agent is provided below. You MUST
@@ -1,4 +1,4 @@
-from openhands.agenthub.planner_agent.agent import PlannerAgent
+from agenthub.planner_agent.agent import PlannerAgent
 from openhands.controller.agent import Agent

 Agent.register('PlannerAgent', PlannerAgent)
@@ -1,5 +1,5 @@
-from openhands.agenthub.planner_agent.prompt import get_prompt_and_images
-from openhands.agenthub.planner_agent.response_parser import PlannerResponseParser
+from agenthub.planner_agent.prompt import get_prompt_and_images
+from agenthub.planner_agent.response_parser import PlannerResponseParser
 from openhands.controller.agent import Agent
 from openhands.controller.state.state import State
 from openhands.core.config import AgentConfig
@@ -1,5 +0,0 @@
-#!/bin/bash
-set -e
-
-cp pyproject.toml poetry.lock openhands
-poetry build -v
@@ -13,10 +13,6 @@
 # API key for E2B
 #e2b_api_key = ""

-# API key for Modal
-#modal_api_token_id = ""
-#modal_api_token_secret = ""
-
 # Base path for the workspace
 workspace_base = "./workspace"

@@ -32,9 +28,6 @@ workspace_base = "./workspace"
 # Enable saving and restoring the session when run from CLI
 #enable_cli_session = false

-# Path to store trajectories
-#trajectories_path="./trajectories"
-
 # File store path
 #file_store_path = "/tmp/file_store"

@@ -119,7 +112,7 @@ api_key = "your-api-key"
 #embedding_deployment_name = ""

 # Embedding model to use
-embedding_model = "local"
+embedding_model = ""

 # Maximum number of characters in an observation's content
 #max_message_chars = 10000
@@ -153,8 +146,8 @@ model = "gpt-4o"
 # Drop any unmapped (unsupported) params without causing an exception
 #drop_params = false

-# Using the prompt caching feature if provided by the LLM and supported
-#caching_prompt = true
+# Using the prompt caching feature provided by the LLM
+#caching_prompt = false

 # Base URL for the OLLAMA API
 #ollama_base_url = ""
@@ -166,15 +159,17 @@ model = "gpt-4o"
 #timeout = 0

 # Top p for the API
-#top_p = 1.0
+#top_p = 0.5

 # If the model is vision capable, this option disables image processing (useful for cost reduction).
 #disable_vision = true

 [llm.gpt4o-mini]
+# API key to use
 api_key = "your-api-key"
-model = "gpt-4o"

+# Model to use
+model = "gpt-4o-mini"

 #################################### Agent ###################################
 # Configuration for agents (group name starts with 'agent')
@@ -190,10 +185,10 @@ model = "gpt-4o"
 #memory_enabled = false

 # Memory maximum threads
-#memory_max_threads = 3
+#memory_max_threads = 2

 # LLM config group to use
-#llm_config = 'your-llm-config-group'
+#llm_config = 'llm'

 [agent.RepoExplorerAgent]
 # Example: use a cheaper model for RepoExplorerAgent to reduce cost, especially
@@ -211,7 +206,7 @@ llm_config = 'gpt3'
 #user_id = 1000

 # Container image to use for the sandbox
-#base_container_image = "nikolaik/python-nodejs:python3.12-nodejs22"
+#base_container_image = "nikolaik/python-nodejs:python3.11-nodejs22"

 # Use host network
 #use_host_network = false
@@ -237,7 +232,7 @@ llm_config = 'gpt3'
 [security]

 # Enable confirmation mode
-#confirmation_mode = false
+#confirmation_mode = true

 # The security analyzer to use
 #security_analyzer = ""
@@ -8,7 +8,7 @@ RUN npm install -g npm@10.5.1
 RUN npm ci

 COPY ./frontend ./
-RUN npm run build
+RUN npm run make-i18n && npm run build

 FROM python:3.12.3-slim AS backend-builder

@@ -28,7 +28,7 @@ COPY ./pyproject.toml ./poetry.lock ./
 RUN touch README.md
 RUN export POETRY_CACHE_DIR && poetry install --without evaluation,llama-index --no-root && rm -rf $POETRY_CACHE_DIR

-FROM python:3.12.3-slim AS openhands-app
+FROM python:3.12.3-slim AS runtime

 WORKDIR /app

@@ -37,7 +37,7 @@ ARG OPENHANDS_BUILD_VERSION #re-declare for this section
 ENV RUN_AS_OPENHANDS=true
 # A random number--we need this to be different from the user's UID on the host machine
 ENV OPENHANDS_USER_ID=42420
-ENV SANDBOX_LOCAL_RUNTIME_URL=http://host.docker.internal
+ENV SANDBOX_API_HOSTNAME=host.docker.internal
 ENV USE_HOST_NETWORK=false
 ENV WORKSPACE_BASE=/opt/workspace_base
 ENV OPENHANDS_BUILD_VERSION=$OPENHANDS_BUILD_VERSION
@@ -69,12 +69,11 @@ RUN playwright install --with-deps chromium

 COPY --chown=openhands:app --chmod=770 ./openhands ./openhands
 COPY --chown=openhands:app --chmod=777 ./openhands/runtime/plugins ./openhands/runtime/plugins
-COPY --chown=openhands:app --chmod=770 ./openhands/agenthub ./openhands/agenthub
-COPY --chown=openhands:app ./pyproject.toml ./pyproject.toml
-COPY --chown=openhands:app ./poetry.lock ./poetry.lock
-COPY --chown=openhands:app ./README.md ./README.md
-COPY --chown=openhands:app ./MANIFEST.in ./MANIFEST.in
-COPY --chown=openhands:app ./LICENSE ./LICENSE
+COPY --chown=openhands:app --chmod=770 ./agenthub ./agenthub
+COPY --chown=openhands:app --chmod=770 ./pyproject.toml ./pyproject.toml
+COPY --chown=openhands:app --chmod=770 ./poetry.lock ./poetry.lock
+COPY --chown=openhands:app --chmod=770 ./README.md ./README.md
+COPY --chown=openhands:app --chmod=770 ./MANIFEST.in ./MANIFEST.in

 # This is run as "openhands" user, and will create __pycache__ with openhands:openhands ownership
 RUN python openhands/core/download.py # No-op to download assets
@@ -82,7 +81,7 @@ RUN python openhands/core/download.py # No-op to download assets
 # openhands:openhands -> openhands:app
 RUN find /app \! -group app -exec chgrp app {} +

-COPY --chown=openhands:app --chmod=770 --from=frontend-builder /app/build ./frontend/build
+COPY --chown=openhands:app --chmod=770 --from=frontend-builder /app/dist ./frontend/dist
 COPY --chown=openhands:app --chmod=770 ./containers/app/entrypoint.sh /app/entrypoint.sh

 USER root
@@ -1,40 +1,13 @@
 #!/bin/bash
 set -eo pipefail

-# Initialize variables with default values
-image_name=""
-org_name=""
+image_name=$1
+org_name=$2
 push=0
-load=0
-tag_suffix=""
-
-# Function to display usage information
-usage() {
-    echo "Usage: $0 -i <image_name> [-o <org_name>] [--push] [--load] [-t <tag_suffix>]"
-    echo "  -i: Image name (required)"
-    echo "  -o: Organization name"
-    echo "  --push: Push the image"
-    echo "  --load: Load the image"
-    echo "  -t: Tag suffix"
-    exit 1
-}
-
-# Parse command-line options
-while [[ $# -gt 0 ]]; do
-    case $1 in
-        -i) image_name="$2"; shift 2 ;;
-        -o) org_name="$2"; shift 2 ;;
-        --push) push=1; shift ;;
-        --load) load=1; shift ;;
-        -t) tag_suffix="$2"; shift 2 ;;
-        *) usage ;;
-    esac
-done
-# Check if required arguments are provided
-if [[ -z "$image_name" ]]; then
-    echo "Error: Image name is required."
-    usage
+if [[ $3 == "--push" ]]; then
+  push=1
 fi
+tag_suffix=$4

 echo "Building: $image_name"
 tags=()
@@ -44,10 +17,10 @@ OPENHANDS_BUILD_VERSION="dev"
 cache_tag_base="buildcache"
 cache_tag="$cache_tag_base"

-if [[ -n $RELEVANT_SHA ]]; then
-  git_hash=$(git rev-parse --short "$RELEVANT_SHA")
+if [[ -n $GITHUB_SHA ]]; then
+  git_hash=$(git rev-parse --short "$GITHUB_SHA")
  tags+=("$git_hash")
-  tags+=("$RELEVANT_SHA")
+  tags+=("$GITHUB_SHA")
 fi

 if [[ -n $GITHUB_REF_NAME ]]; then
@@ -122,35 +95,14 @@ if [[ $push -eq 1 ]]; then
  args+=" --cache-to=type=registry,ref=$DOCKER_REPOSITORY:$cache_tag,mode=max"
 fi

-if [[ $load -eq 1 ]]; then
-  args+=" --load"
-fi
-
 echo "Args: $args"

-# Modify the platform selection based on --load flag
-if [[ $load -eq 1 ]]; then
-  # When loading, build only for the current platform
-  platform=$(docker version -f '{{.Server.Os}}/{{.Server.Arch}}')
-else
-  # For push or without load, build for multiple platforms
-  platform="linux/amd64,linux/arm64"
-fi
-
-echo "Building for platform(s): $platform"
-
 docker buildx build \
  $args \
  --build-arg OPENHANDS_BUILD_VERSION="$OPENHANDS_BUILD_VERSION" \
  --cache-from=type=registry,ref=$DOCKER_REPOSITORY:$cache_tag \
  --cache-from=type=registry,ref=$DOCKER_REPOSITORY:$cache_tag_base-main \
-  --platform $platform \
+  --platform linux/amd64,linux/arm64 \
  --provenance=false \
  -f "$dir/Dockerfile" \
  "$DOCKER_BASE_DIR"
-
-# If load was requested, print the loaded images
-if [[ $load -eq 1 ]]; then
-  echo "Local images built:"
-  docker images "$DOCKER_REPOSITORY" --format "{{.Repository}}:{{.Tag}}"
-fi
@@ -55,18 +55,18 @@ RUN curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | d
  && apt-get clean \
  && apt-get autoremove -y

-# Python 3.12
+# Python 3.11
 RUN add-apt-repository ppa:deadsnakes/ppa \
    && apt-get update \
-    && apt-get install -y python3.12 python3.12-venv python3.12-dev python3-pip \
-    && ln -s /usr/bin/python3.12 /usr/bin/python
+    && apt-get install -y python3.11 python3.11-venv python3.11-dev python3-pip \
+    && ln -s /usr/bin/python3.11 /usr/bin/python

 # NodeJS >= 18.17.1
 RUN curl -fsSL https://deb.nodesource.com/setup_18.x | bash - \
    && apt-get install -y nodejs

 # Poetry >= 1.8
-RUN curl -fsSL https://install.python-poetry.org | python3.12 - \
+RUN curl -fsSL https://install.python-poetry.org | python3.11 - \
    && ln -s ~/.local/bin/poetry /usr/local/bin/poetry

 #
@@ -1,12 +1,11 @@
-# Dynamically constructed Dockerfile
+# Dynamic constructed Dockerfile

-This folder builds a runtime image (sandbox), which will use a dynamically generated `Dockerfile`
-that depends on the `base_image` **AND** a [Python source distribution](https://docs.python.org/3.10/distutils/sourcedist.html) that is based on the current commit of `openhands`.
+This folder builds a runtime image (sandbox) using a dynamically generated `Dockerfile` that depends on the `base_image` AND a [Python source distribution](https://docs.python.org/3.10/distutils/sourcedist.html) based on the current commit of `openhands`.

-The following command will generate a `Dockerfile` file for `nikolaik/python-nodejs:python3.12-nodejs22` (the default base image), an updated `config.sh` and the runtime source distribution files/folders into `containers/runtime`:
+The following command will generate a `Dockerfile` for `ubuntu:22.04` and the source distribution `.tar` into `containers/runtime`.

 ```bash
 poetry run python3 openhands/runtime/utils/runtime_build.py \
-    --base_image nikolaik/python-nodejs:python3.12-nodejs22 \
+    --base_image ubuntu:22.04 \
    --build_folder containers/runtime
 ```
@@ -0,0 +1,44 @@
+FROM ubuntu:22.04
+
+# install basic packages
+RUN apt-get update && apt-get install -y \
+    curl \
+    wget \
+    git \
+    vim \
+    nano \
+    unzip \
+    zip \
+    python3 \
+    python3-pip \
+    python3-venv \
+    python3-dev \
+    build-essential \
+    openssh-server \
+    sudo \
+    gcc \
+    jq \
+    g++ \
+    make \
+    iproute2 \
+    && rm -rf /var/lib/apt/lists/*
+
+RUN mkdir -p -m0755 /var/run/sshd
+
+# symlink python3 to python
+RUN ln -s /usr/bin/python3 /usr/bin/python
+
+# ==== OpenHands Runtime Client ====
+RUN mkdir -p /openhands && mkdir -p /openhands/logs && chmod 777 /openhands/logs
+RUN wget --progress=bar:force -O Miniforge3.sh "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
+RUN bash Miniforge3.sh -b -p /openhands/miniforge3
+RUN chmod -R g+w /openhands/miniforge3
+RUN bash -c ". /openhands/miniforge3/etc/profile.d/conda.sh && conda config --set changeps1 False && conda config --append channels conda-forge"
+RUN echo "" > /openhands/bash.bashrc
+RUN rm -f Miniforge3.sh
+
+# - agentskills dependencies
+RUN /openhands/miniforge3/bin/pip install --upgrade pip
+RUN /openhands/miniforge3/bin/pip install jupyterlab notebook jupyter_kernel_gateway flake8
+RUN /openhands/miniforge3/bin/pip install python-docx PyPDF2 python-pptx pylatexenc openai
+RUN /openhands/miniforge3/bin/pip install python-dotenv toml termcolor pydantic python-docx pyyaml docker pexpect tenacity e2b browsergym minio
@@ -0,0 +1,4 @@
+DOCKER_REGISTRY=ghcr.io
+DOCKER_ORG=all-hands-ai
+DOCKER_IMAGE=sandbox
+DOCKER_BASE_DIR="."
@@ -38,6 +38,6 @@ repos:
      - id: mypy
        additional_dependencies:
          [types-requests, types-setuptools, types-pyyaml, types-toml]
-        entry: mypy --config-file dev_config/python/mypy.ini openhands/
+        entry: mypy --config-file dev_config/python/mypy.ini openhands/ agenthub/
        always_run: true
        pass_filenames: false
@@ -1,3 +1,7 @@
+---
+sidebar_position: 8
+---
+
 # 📚 Misc

 ## ⭐️ Research Strategy
@@ -1,3 +1,7 @@
+---
+sidebar_position: 3
+---
+
 # 🧠 Main Agent and Capabilities

 ## CodeActAgent
@@ -1,3 +1,7 @@
+---
+sidebar_position: 7
+---
+
 # 🏛️ System Architecture

 <div style={{ textAlign: 'center' }}>
@@ -21,7 +21,7 @@ The OpenHands Runtime system uses a client-server architecture implemented with
 graph TD
    A[User-provided Custom Docker Image] --> B[OpenHands Backend]
    B -->|Builds| C[OH Runtime Image]
-    C -->|Launches| D[Action Executor]
+    C -->|Launches| D[Runtime Client]
    D -->|Initializes| E[Browser]
    D -->|Initializes| F[Bash Shell]
    D -->|Initializes| G[Plugins]
@@ -49,10 +49,10 @@ graph TD
 1. User Input: The user provides a custom base Docker image
 2. Image Building: OpenHands builds a new Docker image (the "OH runtime image") based on the user-provided image. This new image includes OpenHands-specific code, primarily the "runtime client"
 3. Container Launch: When OpenHands starts, it launches a Docker container using the OH runtime image
-4. Action Execution Server Initialization: The action execution server initializes an `ActionExecutor` inside the container, setting up necessary components like a bash shell and loading any specified plugins
-5. Communication: The OpenHands backend (`openhands/runtime/impl/eventstream/eventstream_runtime.py`) communicates with the action execution server over RESTful API, sending actions and receiving observations
+4. Client Initialization: The runtime client initializes inside the container, setting up necessary components like a bash shell and loading any specified plugins
+5. Communication: The OpenHands backend (`runtime.py`) communicates with the runtime client over RESTful API, sending actions and receiving observations
 6. Action Execution: The runtime client receives actions from the backend, executes them in the sandboxed environment, and sends back observations
-7. Observation Return: The action execution server sends execution results back to the OpenHands backend as observations
+7. Observation Return: The client sends execution results back to the OpenHands backend as observations


 The role of the client:
@@ -70,46 +70,74 @@ Check out the [relevant code](https://github.com/All-Hands-AI/OpenHands/blob/mai

 ### Image Tagging System

-OpenHands uses a dual-tagging system for its runtime images to balance reproducibility with flexibility.
-Tags may be in one of 2 formats:
+OpenHands uses a dual-tagging system for its runtime images to balance reproducibility with flexibility:

- **Generic**: `oh_v{openhands_version}_{16_digit_lock_hash}` (e.g.: `oh_v0.9.9_1234567890abcdef`)
- **Specific**: `oh_v{openhands_version}_{16_digit_lock_hash}_{16_digit_source_hash}`
-  (e.g.: `oh_v0.9.9_1234567890abcdef_1234567890abcdef`)
+1. Hash-based tag: `{target_image_repo}:{target_image_hash_tag}`.
+   Example: `runtime:abc123def456`

-#### Lock Hash
+   - This tag is based on the MD5 hash of the Docker build folder, which includes the source code (of runtime client and related dependencies) and Dockerfile
+   - Identical hash tags guarantee that the images were built with exactly the same source code and Dockerfile
+   - This ensures reproducibility; the same hash always means the same image contents

-This hash is built from the first 16 digits of the MD5 of:
- The name of the base image upon which the image was built (e.g.: `nikolaik/python-nodejs:python3.12-nodejs22`)
- The content of the `pyproject.toml` included in the image.
- The content of the `poetry.lock` included in the image.
+2. Generic tag: `{target_image_repo}:{target_image_tag}`.
+   Example: `runtime:oh_v0.9.3_ubuntu_tag_22.04`

-This effectively gives a hash for the dependencies of Openhands independent of the source code.
+   - This tag follows the format: `runtime:oh_v{OH_VERSION}_{BASE_IMAGE_NAME}_tag_{BASE_IMAGE_TAG}`
+   - It represents the latest build for a particular base image and OpenHands version combination
+   - This tag is updated whenever a new image is built from the same base image, even if the source code changes

-#### Source Hash
+The hash-based tag ensures reproducibility, while the generic tag provides a stable reference to the latest version of a particular configuration. This dual-tagging approach allows OpenHands to efficiently manage both development and production environments.

-This is the first 16 digits of the MD5 of the directory hash for the source directory. This gives a hash
-for only the openhands source
+### Build Process

-#### Build Process
+1. Image Naming Convention:
+   - Hash-based tag: `{target_image_repo}:{target_image_hash_tag}`.
+     Example: `runtime:abc123def456`
+   - Generic tag: `{target_image_repo}:{target_image_tag}`.
+     Example: `runtime:oh_v0.9.3_ubuntu_tag_22.04`

-When generating an image...
+2. Build Process:
+   - a. Convert the base image name to an OH runtime image name
+      Example: `ubuntu:22.04` -> `runtime:oh_v0.9.3_ubuntu_tag_22.04`
+   - b. Generate a build context (Dockerfile and OpenHands source code) and calculate its hash
+   - c. Check for an existing image with the calculated hash
+   - d. If not found, check for a recent compatible image to use as a base
+   - e. If no compatible image exists, build from scratch using the original base image
+   - f. Tag the new image with both hash-based and generic tags

- OpenHands first checks whether an image with the same **Specific** tag exists. If there is such an image,
-  no build is performed - the existing image is used.
- OpenHands next checks whether an image with the **Generic** tag exists. If there is such an image,
-  OpenHands builds a new image based upon it, bypassing all installation steps (like `poetry install` and
-  `apt-get`) except a final operation to copy the current source code. The new image is tagged with a
-  **Specific** tag only.
- If neither a **Specific** nor **Generic** tag exists, a brand new image is built based upon the base
-  image (Which is a slower operation). This new image is tagged with both the **Generic** and **Specific**
-  tags.
+3. Image Reuse and Rebuilding Logic:
+   The system follows these steps to determine whether to build a new image or use an existing one from a user-provided (base) image (e.g., `ubuntu:22.04`):
+   - a. If an image exists with the same hash (e.g., `runtime:abc123def456`), it will be reused as is
+   - b. If the exact hash is not found, the system will try to rebuild using the latest generic image (e.g., `runtime:oh_v0.9.3_ubuntu_tag_22.04`) as a base. This saves time by leveraging existing dependencies
+   - c. If neither the hash-tagged nor the generic-tagged image is found, the system will build the image completely from scratch

-This dual-tagging approach allows OpenHands to efficiently manage both development and production environments.
+4. Caching and Efficiency:
+   - The system attempts to reuse existing images when possible to save build time
+   - If an exact match (by hash) is found, it's used without rebuilding
+   - If a compatible image is found, it's used as a base for rebuilding, saving time on dependency installation
+
+Here's a flowchart illustrating the build process:
+
+```mermaid
+flowchart TD
+    A[Start] --> B{Convert base image name}
+    B --> |ubuntu:22.04 -> runtime:oh_v0.9.3_ubuntu_tag_22.04| C[Generate build context and hash]
+    C --> D{Check for existing image with hash}
+    D -->|Found runtime:abc123def456| E[Use existing image]
+    D -->|Not found| F{Check for runtime:oh_v0.9.3_ubuntu_tag_22.04}
+    F -->|Found| G[Rebuild based on recent image]
+    F -->|Not found| H[Build from scratch]
+    G --> I[Tag with hash and generic tags]
+    H --> I
+    E --> J[End]
+    I --> J
+```
+
+This approach ensures that:

 1. Identical source code and Dockerfile always produce the same image (via hash-based tags)
 2. The system can quickly rebuild images when minor changes occur (by leveraging recent compatible images)
-3. The generic tag (e.g., `runtime:oh_v0.9.3_1234567890abcdef`) always points to the latest build for a particular base image and OpenHands version combination
+3. The generic tag (e.g., `runtime:oh_v0.9.3_ubuntu_tag_22.04`) always points to the latest build for a particular base image and OpenHands version combination
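
Reviewer note: the tagging and reuse rules restored by this hunk can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in `runtime_build.py`: the function names, the in-memory `build_context` dict, the `image_exists` callback, and the `/` to `_s_` substitution are all hypothetical. Only the tag formats and the three-way decision (reuse, rebuild on the generic image, build from scratch) come from the documentation above.

```python
import hashlib
from typing import Callable, Dict, Tuple


def generic_tag(base_image: str, oh_version: str) -> str:
    """Build the generic tag: runtime:oh_v{OH_VERSION}_{BASE_IMAGE_NAME}_tag_{BASE_IMAGE_TAG}."""
    repo, sep, tag = base_image.rpartition(":")
    if not sep or "/" in tag:
        # No explicit tag (or the colon belonged to a registry port).
        repo, tag = base_image, "latest"
    # Assumption: slashes are not valid inside a tag, so substitute them.
    repo = repo.replace("/", "_s_")
    return f"runtime:oh_v{oh_version}_{repo}_tag_{tag}"


def hash_tag(build_context: Dict[str, bytes]) -> str:
    """Build the hash-based tag from an MD5 over the build folder contents.

    Assumption: the build folder is modelled as {relative_path: file_bytes}.
    """
    md5 = hashlib.md5()
    for path in sorted(build_context):  # sorted for a stable digest
        md5.update(path.encode("utf-8"))
        md5.update(build_context[path])
    return f"runtime:{md5.hexdigest()}"


def choose_build_plan(
    hash_tagged: str,
    generic_tagged: str,
    base_image: str,
    image_exists: Callable[[str], bool],
) -> Tuple[str, str]:
    """Return (action, image) following the documented reuse order."""
    if image_exists(hash_tagged):
        # Exact match: same source code and Dockerfile, reuse as is.
        return ("reuse", hash_tagged)
    if image_exists(generic_tagged):
        # Compatible image: rebuild on top of it to skip dependency installs.
        return ("rebuild", generic_tagged)
    # Nothing usable: build from the user-provided base image.
    return ("scratch", base_image)
```

In this sketch, `image_exists` stands in for whatever lookup the real build performs (for example a query against the local Docker daemon), which keeps the decision logic testable without Docker.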

 ## Runtime Plugin System

Author	SHA1	Message	Date
Robert Brennan	9aaecfbc68	Update openhands/runtime/client/client.py	2024-09-19 15:26:57 -04:00
Robert Brennan	b24066fb64	revert to old client.py	2024-09-19 14:31:09 -04:00