Add GPTSwarm (Graph-based Workflow) (#2460)

* update
* revert the main branch lock file
* regenerate the poetry.lock
* move poetry into another dependency group
* fix infer.sh and gpt code

---------

Co-authored-by: yufansong <yufan@risingwave-labs.com>
2026-04-29 03:00:45 -04:00 · 2024-08-10 23:02:44 -07:00
1226 changed files with 62689 additions and 89987 deletions
--- a/.devcontainer/README.MD
+++ b/.devcontainer/README.MD
@@ -1 +0,0 @@
-The files in this directory configure a development container for GitHub Codespaces.
--- a/.devcontainer/devcontainer.json
+++ b/.devcontainer/devcontainer.json
@@ -1,15 +0,0 @@
-{
-	"name": "OpenHands Codespaces",
-	"image": "mcr.microsoft.com/devcontainers/universal",
-	"customizations":{
-        "vscode":{
-            "extensions": [
-                "ms-python.python"
-            ]
-        }
-    },
-	"onCreateCommand": "sh ./.devcontainer/on_create.sh",
-	"postCreateCommand": "make build",
-	"postStartCommand": "USE_HOST_NETWORK=True nohup bash -c 'make run &'"
-
-}
--- a/.devcontainer/on_create.sh
+++ b/.devcontainer/on_create.sh
@@ -1,6 +0,0 @@
-#!/usr/bin/env bash
-sudo apt update
-sudo apt install -y netcat
-sudo add-apt-repository -y ppa:deadsnakes/ppa
-sudo apt install -y python3.12
-curl -sSL https://install.python-poetry.org | python3.12 -
--- a/.github/ISSUE_TEMPLATE/bug_template.yml
+++ b/.github/ISSUE_TEMPLATE/bug_template.yml
@@ -1,61 +1,75 @@
 name: Bug
-description: Report a problem with OpenHands
+description: Report a problem with OpenDevin
 title: '[Bug]: '
 labels: ['bug']
 body:
  - type: markdown
    attributes:
-      value: Thank you for taking the time to fill out this bug report. Please provide as much information as possible to help us understand and address the issue effectively.
+      value: Thank you for taking the time to fill out this bug report. We greatly appreciate your effort to complete this template fully. Please provide as much information as possible to help us understand and address the issue effectively.

  - type: checkboxes
    attributes:
      label: Is there an existing issue for the same bug?
      description: Please check if an issue already exists for the bug you encountered.
      options:
+      - label: I have checked the troubleshooting document at https://opendevin.github.io/OpenDevin/modules/usage/troubleshooting
+        required: true
      - label: I have checked the existing issues.
        required: true

  - type: textarea
    id: bug-description
    attributes:
-      label: Describe the bug and reproduction steps
-      description: Provide a description of the issue along with any reproduction steps.
+      label: Describe the bug
+      description: Provide a short description of the problem.
    validations:
      required: true

-  - type: dropdown
-    id: installation
+  - type: textarea
+    id: current-version
    attributes:
-      label: OpenHands Installation
-      description: How are you running OpenHands?
-      options:
-        - Docker command in README
-        - Development workflow
-        - app.all-hands.dev
-        - Other
-      default: 0
+      label: Current OpenDevin version
+      description: What version of OpenDevin are you using? If you're running in docker, tell us the tag you're using (e.g. ghcr.io/opendevin/opendevin:0.3.1).
+      render: bash
+    validations:
+      required: true

-  - type: input
-    id: openhands-version
+  - type: textarea
+    id: config
    attributes:
-      label: OpenHands Version
-      description: What version of OpenHands are you using?
-      placeholder: ex. 0.9.8, main, etc.
+      label: Installation and Configuration
+      description: Please provide any commands you ran and any configuration (redacting API keys)
+      render: bash
+    validations:
+      required: true

-  - type: dropdown
-    id: os
+  - type: textarea
+    id: model-agent
+    attributes:
+      label: Model and Agent
+      description: What model and agent are you using? You can see these settings in the UI by clicking the settings wheel.
+      placeholder: |
+        - Model:
+        - Agent:
+
+  - type: textarea
+    id: os-version
    attributes:
      label: Operating System
-      options:
-        - MacOS
-        - Linux
-        - WSL on Windows
+      description: What Operating System are you using? Linux, Mac OS, WSL on Windows
+
+  - type: textarea
+    id: repro-steps
+    attributes:
+      label: Reproduction Steps
+      description: Please list the steps to reproduce the issue.
+      placeholder: |
+        1.
+        2.
+        3.

  - type: textarea
    id: additional-context
    attributes:
      label: Logs, Errors, Screenshots, and Additional Context
-      description: Please provide any additional information you think might help. If you want to share the chat history
-        you can click the thumbs-down (👎) button above the input field and you will get a shareable link
-        (you can also click thumbs up when things are going well of course!). LLM logs will be stored in the
-        `logs/llm/default` folder. Please add any additional context about the problem here.
+      description: If you want to share the chat history you can click the thumbs-down (👎) button above the input field and you will get a shareable link (you can also click thumbs up when things are going well of course!). LLM logs will be stored in the `logs/llm/default` folder. Please add any additional context about the problem here.
--- a/.github/ISSUE_TEMPLATE/feature_request.md
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -1,6 +1,6 @@
 ---
 name: Feature Request
-about: Suggest an idea for OpenHands features
+about: Suggest an idea for OpenDevin features
 title: ''
 labels: 'enhancement'
 assignees: ''
--- a/.github/dependabot.yml
+++ b/.github/dependabot.yml
@@ -1,69 +1,22 @@
+# To get started with Dependabot version updates, you'll need to specify which
+# package ecosystems to update and where the package manifests are located.
+# Please see the documentation for all configuration options:
+# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
+
 version: 2
 updates:
-  - package-ecosystem: "pip"
-    directory: "/"
+  - package-ecosystem: "pip" # See documentation for possible values
+    directory: "/" # Location of package manifests
    schedule:
      interval: "daily"
-    open-pull-requests-limit: 1
-    groups:
-      # put packages in their own group if they have a history of breaking the build or needing to be reverted
-      pre-commit:
-        patterns:
-          - "pre-commit"
-      llama:
-        patterns:
-          - "llama*"
-      chromadb:
-        patterns:
-          - "chromadb"
-      security-all:
-        applies-to: "security-updates"
-        patterns:
-          - "*"
-      version-all:
-        applies-to: "version-updates"
-        patterns:
-          - "*"
-
-  - package-ecosystem: "npm"
-    directory: "/frontend"
+    open-pull-requests-limit: 20
+  - package-ecosystem: "npm" # See documentation for possible values
+    directory: "/frontend" # Location of package manifests
    schedule:
      interval: "daily"
-    open-pull-requests-limit: 1
-    groups:
-      docusaurus:
-        patterns:
-          - "*docusaurus*"
-      eslint:
-        patterns:
-          - "*eslint*"
-      security-all:
-        applies-to: "security-updates"
-        patterns:
-          - "*"
-      version-all:
-        applies-to: "version-updates"
-        patterns:
-          - "*"
-
-  - package-ecosystem: "npm"
-    directory: "/docs"
+    open-pull-requests-limit: 20
+  - package-ecosystem: "npm" # See documentation for possible values
+    directory: "/docs" # Location of package manifests
    schedule:
-      interval: "weekly"
-      day: "wednesday"
-    open-pull-requests-limit: 1
-    groups:
-      docusaurus:
-        patterns:
-          - "*docusaurus*"
-      eslint:
-        patterns:
-          - "*eslint*"
-      security-all:
-        applies-to: "security-updates"
-        patterns:
-          - "*"
-      version-all:
-        applies-to: "version-updates"
-        patterns:
-          - "*"
+      interval: "daily"
+    open-pull-requests-limit: 20
--- a/.github/pull_request_template.md
+++ b/.github/pull_request_template.md
@@ -1,11 +1,5 @@
-**End-user friendly description of the problem this fixes or functionality that this introduces**
+**What is the problem that this fixes or functionality that this introduces? Does it fix any open issues?**

- [ ] Include this change in the Release Notes. If checked, you must provide an **end-user friendly** description for your change below
+**Give a brief summary of what the PR does, explaining any non-trivial design decisions**

---
-**Give a summary of what the PR does, explaining any non-trivial design decisions**
-
-
-
---
-**Link of any specific issues this addresses**
+**Other references**
--- a/.github/workflows/clean-up.yml
+++ b/.github/workflows/clean-up.yml
@@ -1,69 +0,0 @@
-# Workflow that cleans up outdated and old workflows to prevent out of disk issues
-name: Delete old workflow runs
-
-# This workflow is currently only triggered manually
-on:
-  workflow_dispatch:
-    inputs:
-      days:
-        description: 'Days-worth of runs to keep for each workflow'
-        required: true
-        default: '30'
-      minimum_runs:
-        description: 'Minimum runs to keep for each workflow'
-        required: true
-        default: '10'
-      delete_workflow_pattern:
-        description: 'Name or filename of the workflow (if not set, all workflows are targeted)'
-        required: false
-      delete_workflow_by_state_pattern:
-        description: 'Filter workflows by state: active, deleted, disabled_fork, disabled_inactivity, disabled_manually'
-        required: true
-        default: "ALL"
-        type: choice
-        options:
-          - "ALL"
-          - active
-          - deleted
-          - disabled_inactivity
-          - disabled_manually
-      delete_run_by_conclusion_pattern:
-        description: 'Remove runs based on conclusion: action_required, cancelled, failure, skipped, success'
-        required: true
-        default: 'ALL'
-        type: choice
-        options:
-          - 'ALL'
-          - 'Unsuccessful: action_required,cancelled,failure,skipped'
-          - action_required
-          - cancelled
-          - failure
-          - skipped
-          - success
-      dry_run:
-        description: 'Logs simulated changes, no deletions are performed'
-        required: false
-
-jobs:
-  del_runs:
-    runs-on: ubuntu-latest
-    permissions:
-      actions: write
-      contents: read
-    steps:
-      - name: Delete workflow runs
-        uses: Mattraks/delete-workflow-runs@v2
-        with:
-          token: ${{ github.token }}
-          repository: ${{ github.repository }}
-          retain_days: ${{ github.event.inputs.days }}
-          keep_minimum_runs: ${{ github.event.inputs.minimum_runs }}
-          delete_workflow_pattern: ${{ github.event.inputs.delete_workflow_pattern }}
-          delete_workflow_by_state_pattern: ${{ github.event.inputs.delete_workflow_by_state_pattern }}
-          delete_run_by_conclusion_pattern: >-
-            ${{
-              startsWith(github.event.inputs.delete_run_by_conclusion_pattern, 'Unsuccessful:')
-              && 'action_required,cancelled,failure,skipped'
-              || github.event.inputs.delete_run_by_conclusion_pattern
-            }}
-          dry_run: ${{ github.event.inputs.dry_run }}
--- a/.github/workflows/deploy-docs.yml
+++ b/.github/workflows/deploy-docs.yml
@@ -1,30 +1,18 @@
-# Workflow that builds and deploys the documentation website
 name: Deploy Docs to GitHub Pages

-# * Always run on "main"
-# * Run on PRs that target the "main" branch and have changes in the "docs" folder or this workflow
 on:
  push:
    branches:
      - main
  pull_request:
-    paths:
-      - 'docs/**'
-      - '.github/workflows/deploy-docs.yml'
    branches:
      - main

-# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
-concurrency:
-  group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
-  cancel-in-progress: true
-
 jobs:
-  # Build the documentation website
  build:
-    if: github.repository == 'All-Hands-AI/OpenHands'
    name: Build Docusaurus
    runs-on: ubuntu-latest
+    if: github.repository == 'OpenDevin/OpenDevin'
    steps:
      - uses: actions/checkout@v4
        with:
@@ -37,29 +25,25 @@ jobs:
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
-          python-version: '3.12'
+          python-version: "3.11"
+
      - name: Generate Python Docs
        run: rm -rf docs/modules/python && pip install pydoc-markdown && pydoc-markdown
      - name: Install dependencies
        run: cd docs && npm ci
      - name: Build website
        run: cd docs && npm run build
+
      - name: Upload Build Artifact
        if: github.ref == 'refs/heads/main'
        uses: actions/upload-pages-artifact@v3
        with:
          path: docs/build

-  # Deploy the documentation website
  deploy:
-    if: github.ref == 'refs/heads/main' && github.repository == 'All-Hands-AI/OpenHands'
    name: Deploy to GitHub Pages
-    runs-on: ubuntu-latest
-    # This job only runs on "main" so only run one of these jobs at a time
-    # otherwise it will fail if one is already running
-    concurrency:
-      group: ${{ github.workflow }}-${{ github.ref }}
    needs: build
+    if: github.ref == 'refs/heads/main' && github.repository == 'OpenDevin/OpenDevin'
    # Grant GITHUB_TOKEN the permissions required to make a Pages deployment
    permissions:
      pages: write # to deploy to Pages
@@ -68,6 +52,7 @@ jobs:
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
+    runs-on: ubuntu-latest
    steps:
      - name: Deploy to GitHub Pages
        id: deployment
--- a/.github/workflows/dummy-agent-test.yml
+++ b/.github/workflows/dummy-agent-test.yml
@@ -1,56 +1,37 @@
-# Workflow that uses the DummyAgent to run a simple task
 name: Run E2E test with dummy agent

-# Always run on "main"
-# Always run on PRs
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
+
 on:
  push:
    branches:
    - main
  pull_request:

-# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
-concurrency:
-  group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
-  cancel-in-progress: true
+env:
+  PERSIST_SANDBOX : "false"

 jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
-      - name: Free Disk Space (Ubuntu)
-        uses: jlumbroso/free-disk-space@main
-        with:
-          # this might remove tools that are actually needed,
-          # if set to "true" but frees about 6 GB
-          tool-cache: true
-          # all of these default to true, but feel free to set to
-          # "false" if necessary for your workflow
-          android: true
-          dotnet: true
-          haskell: true
-          large-packages: true
-          docker-images: false
-          swap-storage: true
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@v3
-      - name: Install poetry via pipx
-        run: pipx install poetry
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
-          python-version: '3.12'
-          cache: 'poetry'
-      - name: Install Python dependencies using Poetry
-        run: poetry install --without evaluation,llama-index
-      - name: Build Environment
-        run: make build
+          python-version: '3.11'
+      - name: Set up environment
+        run: |
+          curl -sSL https://install.python-poetry.org | python3 -
+          poetry install --without evaluation
+          poetry run playwright install --with-deps chromium
+          wget https://huggingface.co/BAAI/bge-small-en-v1.5/raw/main/1_Pooling/config.json -P /tmp/llama_index/models--BAAI--bge-small-en-v1.5/snapshots/5c38ec7c405ec4b44b94cc5a9bb96e735b38267a/1_Pooling/
      - name: Run tests
        run: |
          set -e
-          SANDBOX_FORCE_REBUILD_RUNTIME=True poetry run python3 openhands/core/main.py -t "do a flip" -d ./workspace/ -c DummyAgent
+          poetry run python opendevin/core/main.py -t "do a flip" -d ./workspace/ -c DummyAgent
      - name: Check exit code
        run: |
          if [ $? -ne 0 ]; then
--- a/.github/workflows/eval-runner.yml
+++ b/.github/workflows/eval-runner.yml
@@ -1,160 +0,0 @@
-name: Run Evaluation
-
-on:
-  pull_request:
-    types: [labeled]
-  schedule:
-    - cron: "0 1 * * *" # Run daily at 1 AM UTC
-  workflow_dispatch:
-    inputs:
-      reason:
-        description: "Reason for manual trigger"
-        required: true
-        default: ""
-
-env:
-  N_PROCESSES: 32 # Global configuration for number of parallel processes for evaluation
-
-jobs:
-  run-evaluation:
-    if: github.event.label.name == 'eval-this' || github.event_name != 'pull_request'
-    runs-on: ubuntu-latest
-    permissions:
-      contents: "read"
-      id-token: "write"
-      pull-requests: "write"
-      issues: "write"
-    strategy:
-      matrix:
-        python-version: ["3.12"]
-    steps:
-      - name: Checkout repository
-        uses: actions/checkout@v4
-
-      - name: Install poetry via pipx
-        run: pipx install poetry
-
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: ${{ matrix.python-version }}
-          cache: "poetry"
-
-      - name: Comment on PR if 'eval-this' label is present
-        if: github.event_name == 'pull_request' && github.event.label.name == 'eval-this'
-        uses: KeisukeYamashita/create-comment@v1
-        with:
-          unique: false
-          comment: |
-            Hi! I started running the evaluation on your PR. You will receive a comment with the results shortly.
-
-      - name: Install Python dependencies using Poetry
-        run: poetry install
-
-      - name: Configure config.toml for evaluation
-        env:
-          DEEPSEEK_API_KEY: ${{ secrets.DEEPSEEK_LLM_API_KEY }}
-        run: |
-          echo "[llm.eval]" > config.toml
-          echo "model = \"deepseek/deepseek-chat\"" >> config.toml
-          echo "api_key = \"$DEEPSEEK_API_KEY\"" >> config.toml
-          echo "temperature = 0.0" >> config.toml
-
-      - name: Run integration test evaluation
-        env:
-          ALLHANDS_API_KEY: ${{ secrets.ALLHANDS_EVAL_RUNTIME_API_KEY }}
-          RUNTIME: remote
-          SANDBOX_REMOTE_RUNTIME_API_URL: https://runtime.eval.all-hands.dev
-          EVAL_DOCKER_IMAGE_PREFIX: us-central1-docker.pkg.dev/evaluation-092424/swe-bench-images
-
-        run: |
-          poetry run ./evaluation/integration_tests/scripts/run_infer.sh llm.eval HEAD CodeActAgent '' $N_PROCESSES
-
-          # get evaluation report
-          REPORT_FILE=$(find evaluation/evaluation_outputs/outputs/integration_tests/CodeActAgent/deepseek-chat_maxiter_10_N* -name "report.md" -type f | head -n 1)
-          echo "REPORT_FILE: $REPORT_FILE"
-          echo "INTEGRATION_TEST_REPORT<<EOF" >> $GITHUB_ENV
-          cat $REPORT_FILE >> $GITHUB_ENV
-          echo >> $GITHUB_ENV
-          echo "EOF" >> $GITHUB_ENV
-
-      - name: Run SWE-Bench evaluation
-        env:
-          ALLHANDS_API_KEY: ${{ secrets.ALLHANDS_EVAL_RUNTIME_API_KEY }}
-          RUNTIME: remote
-          SANDBOX_REMOTE_RUNTIME_API_URL: https://runtime.eval.all-hands.dev
-          EVAL_DOCKER_IMAGE_PREFIX: us-central1-docker.pkg.dev/evaluation-092424/swe-bench-images
-
-        run: |
-          poetry run ./evaluation/swe_bench/scripts/run_infer.sh llm.eval HEAD CodeActAgent 300 30 $N_PROCESSES "princeton-nlp/SWE-bench_Lite" test
-          OUTPUT_FOLDER=$(find evaluation/evaluation_outputs/outputs/princeton-nlp__SWE-bench_Lite-test/CodeActAgent -name "deepseek-chat_maxiter_50_N_*-no-hint-run_1" -type d | head -n 1)
-          echo "OUTPUT_FOLDER for SWE-bench evaluation: $OUTPUT_FOLDER"
-          poetry run ./evaluation/swe_bench/scripts/eval_infer_remote.sh $OUTPUT_FOLDER/output.jsonl $N_PROCESSES "princeton-nlp/SWE-bench_Lite" test
-
-          poetry run ./evaluation/swe_bench/scripts/eval/summarize_outputs.py $OUTPUT_FOLDER/output.jsonl > summarize_outputs.log 2>&1
-          echo "SWEBENCH_REPORT<<EOF" >> $GITHUB_ENV
-          cat summarize_outputs.log >> $GITHUB_ENV
-          echo "EOF" >> $GITHUB_ENV
-
-      - name: Create tar.gz of evaluation outputs
-        run: |
-          TIMESTAMP=$(date +'%y-%m-%d-%H-%M')
-          tar -czvf evaluation_outputs_${TIMESTAMP}.tar.gz evaluation/evaluation_outputs/outputs
-
-      - name: Upload evaluation results as artifact
-        uses: actions/upload-artifact@v4
-        id: upload_results_artifact
-        with:
-          name: evaluation-outputs
-          path: evaluation_outputs_*.tar.gz
-
-      - name: Get artifact URL
-        run: echo "ARTIFACT_URL=${{ steps.upload_results_artifact.outputs.artifact-url }}" >> $GITHUB_ENV
-
-      - name: Authenticate to Google Cloud
-        uses: 'google-github-actions/auth@v2'
-        with:
-          credentials_json: ${{ secrets.GCP_RESEARCH_OBJECT_CREATOR_SA_KEY }}
-
-      - name: Set timestamp and trigger reason
-        run: |
-          echo "TIMESTAMP=$(date +'%Y-%m-%d-%H-%M')" >> $GITHUB_ENV
-          if [[ "${{ github.event_name }}" == "pull_request" ]]; then
-            echo "TRIGGER_REASON=pr-${{ github.event.pull_request.number }}" >> $GITHUB_ENV
-          elif [[ "${{ github.event_name }}" == "schedule" ]]; then
-            echo "TRIGGER_REASON=schedule" >> $GITHUB_ENV
-          else
-            echo "TRIGGER_REASON=manual-${{ github.event.inputs.reason }}" >> $GITHUB_ENV
-          fi
-
-      - name: Upload evaluation results to Google Cloud Storage
-        uses: 'google-github-actions/upload-cloud-storage@v2'
-        with:
-          path: 'evaluation/evaluation_outputs/outputs'
-          destination: 'openhands-oss-eval-results/${{ env.TIMESTAMP }}-${{ env.TRIGGER_REASON }}'
-
-      - name: Comment with evaluation results and artifact link
-        id: create_comment
-        uses: KeisukeYamashita/create-comment@v1
-        with:
-          number: ${{ github.event_name == 'pull_request' && github.event.pull_request.number || 4504 }}
-          unique: false
-          comment: |
-              Trigger by: ${{ github.event_name == 'pull_request' && format('Pull Request (eval-this label on PR #{0})', github.event.pull_request.number) || github.event_name == 'schedule' && 'Daily Schedule' || format('Manual Trigger: {0}', github.event.inputs.reason) }}
-              Commit: ${{ github.sha }}
-              **SWE-Bench Evaluation Report**
-              ${{ env.SWEBENCH_REPORT }}
-              ---
-              **Integration Tests Evaluation Report**
-              ${{ env.INTEGRATION_TEST_REPORT }}
-              ---
-              You can download the full evaluation outputs [here](${{ env.ARTIFACT_URL }}).
-
-      - name: Post to a Slack channel
-        id: slack
-        uses: slackapi/slack-github-action@v1.27.0
-        with:
-          channel-id: 'C07SVQSCR6F'
-          slack-message: "*Evaluation Trigger:* ${{ github.event_name == 'pull_request' && format('Pull Request (eval-this label on PR #{0})', github.event.pull_request.number) || github.event_name == 'schedule' && 'Daily Schedule' || format('Manual Trigger: {0}', github.event.inputs.reason) }}\n\nLink to summary: [here](https://github.com/${{ github.repository }}/issues/${{ github.event_name == 'pull_request' && github.event.pull_request.number || 4504 }}#issuecomment-${{ steps.create_comment.outputs.comment-id }})"
-        env:
-          SLACK_BOT_TOKEN: ${{ secrets.EVAL_NOTIF_SLACK_BOT_TOKEN }}
--- a/.github/workflows/fe-unit-tests.yml
+++ b/.github/workflows/fe-unit-tests.yml
@@ -1,44 +0,0 @@
-# Workflow that runs frontend unit tests
-name: Run Frontend Unit Tests
-
-# * Always run on "main"
-# * Run on PRs that have changes in the "frontend" folder or this workflow
-on:
-  push:
-    branches:
-      - main
-  pull_request:
-    paths:
-      - 'frontend/**'
-      -  '.github/workflows/fe-unit-tests.yml'
-
-# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
-concurrency:
-  group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
-  cancel-in-progress: true
-
-jobs:
-  # Run frontend unit tests
-  fe-test:
-    name: FE Unit Tests
-    runs-on: ubuntu-latest
-    strategy:
-      matrix:
-        node-version: [20]
-    steps:
-      - name: Checkout
-        uses: actions/checkout@v4
-      - name: Set up Node.js
-        uses: actions/setup-node@v4
-        with:
-          node-version: ${{ matrix.node-version }}
-      - name: Install dependencies
-        working-directory: ./frontend
-        run: npm ci
-      - name: Run tests and collect coverage
-        working-directory: ./frontend
-        run: npm run test:coverage
-      - name: Upload coverage to Codecov
-        uses: codecov/codecov-action@v4
-        env:
-          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
--- a/.github/workflows/ghcr-build.yml
+++ b/.github/workflows/ghcr-build.yml
@@ -1,447 +0,0 @@
-# Workflow that builds, tests and then pushes the OpenHands and runtime docker images to the ghcr.io repository
-name: Docker
-
-# Always run on "main"
-# Always run on tags
-# Always run on PRs
-# Can also be triggered manually
-on:
-  push:
-    branches:
-      - main
-    tags:
-      - '*'
-  pull_request:
-  workflow_dispatch:
-    inputs:
-      reason:
-        description: 'Reason for manual trigger'
-        required: true
-        default: ''
-
-# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
-concurrency:
-  group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
-  cancel-in-progress: true
-
-env:
-  BASE_IMAGE_FOR_HASH_EQUIVALENCE_TEST: nikolaik/python-nodejs:python3.12-nodejs22
-  RELEVANT_SHA: ${{ github.event.pull_request.head.sha || github.sha }}
-
-jobs:
-  # Builds the OpenHands Docker images
-  ghcr_build_app:
-    name: Build App Image
-    runs-on: ubuntu-latest
-    permissions:
-      contents: read
-      packages: write
-    outputs:
-      hash_from_app_image: ${{ steps.get_hash_in_app_image.outputs.hash_from_app_image }}
-    steps:
-      - name: Checkout
-        uses: actions/checkout@v4
-      - name: Free Disk Space (Ubuntu)
-        uses: jlumbroso/free-disk-space@main
-        with:
-          # this might remove tools that are actually needed,
-          # if set to "true" but frees about 6 GB
-          tool-cache: true
-          # all of these default to true, but feel free to set to
-          # "false" if necessary for your workflow
-          android: true
-          dotnet: true
-          haskell: true
-          large-packages: true
-          docker-images: false
-          swap-storage: true
-      - name: Set up QEMU
-        uses: docker/setup-qemu-action@v3.0.0
-        with:
-          image: tonistiigi/binfmt:latest
-      - name: Login to GHCR
-        uses: docker/login-action@v3
-        with:
-          registry: ghcr.io
-          username: ${{ github.repository_owner }}
-          password: ${{ secrets.GITHUB_TOKEN }}
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@v3
-      - name: Build and push app image
-        if: "!github.event.pull_request.head.repo.fork"
-        run: |
-          ./containers/build.sh -i openhands -o ${{ github.repository_owner }} --push
-      - name: Build app image
-        if: "github.event.pull_request.head.repo.fork"
-        run: |
-          ./containers/build.sh -i openhands -o ${{ github.repository_owner }} --load
-      - name: Get hash in App Image
-        id: get_hash_in_app_image
-        run: |
-          # Lowercase the repository owner
-          export REPO_OWNER=${{ github.repository_owner }}
-          REPO_OWNER=$(echo $REPO_OWNER | tr '[:upper:]' '[:lower:]')
-          # Run the build script in the app image
-          docker run -e SANDBOX_USER_ID=0 -v /var/run/docker.sock:/var/run/docker.sock ghcr.io/${REPO_OWNER}/openhands:${{ env.RELEVANT_SHA }} /bin/bash -c "mkdir -p containers/runtime; python3 openhands/runtime/utils/runtime_build.py --base_image ${{ env.BASE_IMAGE_FOR_HASH_EQUIVALENCE_TEST }} --build_folder containers/runtime --force_rebuild" 2>&1 | tee docker-outputs.txt
-          # Get the hash from the build script
-          hash_from_app_image=$(cat docker-outputs.txt | grep "Hash for docker build directory" | awk -F "): " '{print $2}' | uniq | head -n1)
-          echo "hash_from_app_image=$hash_from_app_image" >> $GITHUB_OUTPUT
-          echo "Hash from app image: $hash_from_app_image"
-
-  # Builds the runtime Docker images
-  ghcr_build_runtime:
-    name: Build Image
-    runs-on: ubuntu-latest
-    permissions:
-      contents: read
-      packages: write
-    strategy:
-      matrix:
-        base_image:
-          - image: 'nikolaik/python-nodejs:python3.12-nodejs22'
-            tag: nikolaik
-    steps:
-      - name: Checkout
-        uses: actions/checkout@v4
-      - name: Free Disk Space (Ubuntu)
-        uses: jlumbroso/free-disk-space@main
-        with:
-          # this might remove tools that are actually needed,
-          # if set to "true" but frees about 6 GB
-          tool-cache: true
-          # all of these default to true, but feel free to set to
-          # "false" if necessary for your workflow
-          android: true
-          dotnet: true
-          haskell: true
-          large-packages: true
-          docker-images: false
-          swap-storage: true
-      - name: Set up QEMU
-        uses: docker/setup-qemu-action@v3.0.0
-        with:
-          image: tonistiigi/binfmt:latest
-      - name: Login to GHCR
-        uses: docker/login-action@v3
-        with:
-          registry: ghcr.io
-          username: ${{ github.repository_owner }}
-          password: ${{ secrets.GITHUB_TOKEN }}
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@v3
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: '3.12'
-      - name: Cache Poetry dependencies
-        uses: actions/cache@v4
-        with:
-          path: |
-            ~/.cache/pypoetry
-            ~/.virtualenvs
-          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
-          restore-keys: |
-            ${{ runner.os }}-poetry-
-      - name: Install poetry via pipx
-        run: pipx install poetry
-      - name: Install Python dependencies using Poetry
-        run: make install-python-dependencies
-      - name: Create source distribution and Dockerfile
-        run: poetry run python3 openhands/runtime/utils/runtime_build.py --base_image ${{ matrix.base_image.image }} --build_folder containers/runtime --force_rebuild
-      - name: Build and push runtime image ${{ matrix.base_image.image }}
-        if: github.event.pull_request.head.repo.fork != true
-        run: |
-          ./containers/build.sh -i runtime -o ${{ github.repository_owner }} --push -t ${{ matrix.base_image.tag }}
-      # Forked repos can't push to GHCR, so we need to upload the image as an artifact
-      - name: Build runtime image ${{ matrix.base_image.image }} for fork
-        if: github.event.pull_request.head.repo.fork
-        uses: docker/build-push-action@v6
-        with:
-          tags: ghcr.io/all-hands-ai/runtime:${{ env.RELEVANT_SHA }}-${{ matrix.base_image.tag }}
-          outputs: type=docker,dest=/tmp/runtime-${{ matrix.base_image.tag }}.tar
-          context: containers/runtime
-      - name: Upload runtime image for fork
-        if: github.event.pull_request.head.repo.fork
-        uses: actions/upload-artifact@v4
-        with:
-          name: runtime-${{ matrix.base_image.tag }}
-          path: /tmp/runtime-${{ matrix.base_image.tag }}.tar
-
-  verify_hash_equivalence_in_runtime_and_app:
-    name: Verify Hash Equivalence in Runtime and Docker images
-    runs-on: ubuntu-latest
-    needs: [ghcr_build_runtime, ghcr_build_app]
-    strategy:
-      fail-fast: false
-      matrix:
-        base_image: ['nikolaik']
-    steps:
-      - uses: actions/checkout@v4
-      - name: Cache Poetry dependencies
-        uses: actions/cache@v4
-        with:
-          path: |
-            ~/.cache/pypoetry
-            ~/.virtualenvs
-          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
-          restore-keys: |
-            ${{ runner.os }}-poetry-
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: '3.12'
-      - name: Install poetry via pipx
-        run: pipx install poetry
-      - name: Install Python dependencies using Poetry
-        run: make install-python-dependencies
-      - name: Get hash in App Image
-        run: |
-          echo "Hash from app image: ${{ needs.ghcr_build_app.outputs.hash_from_app_image }}"
-          echo "hash_from_app_image=${{ needs.ghcr_build_app.outputs.hash_from_app_image }}" >> $GITHUB_ENV
-
-      - name: Get hash using code (development mode)
-        run: |
-          mkdir -p containers/runtime
-          poetry run python3 openhands/runtime/utils/runtime_build.py --base_image ${{ env.BASE_IMAGE_FOR_HASH_EQUIVALENCE_TEST }} --build_folder containers/runtime --force_rebuild > output.txt 2>&1
-          hash_from_code=$(cat output.txt | grep "Hash for docker build directory" | awk -F "): " '{print $2}' | uniq | head -n1)
-          echo "hash_from_code=$hash_from_code" >> $GITHUB_ENV
-
-      - name: Compare hashes
-        run: |
-          echo "Hash from App Image: ${{ env.hash_from_app_image }}"
-          echo "Hash from Code: ${{ env.hash_from_code }}"
-          if [ "${{ env.hash_from_app_image }}" = "${{ env.hash_from_code }}" ]; then
-            echo "Hashes match!"
-          else
-            echo "Hashes do not match!"
-            exit 1
-          fi
-
-  # Run unit tests with the EventStream runtime Docker images as root
-  test_runtime_root:
-    name: RT Unit Tests (Root)
-    needs: [ghcr_build_runtime]
-    runs-on: ubuntu-latest
-    strategy:
-      fail-fast: false
-      matrix:
-        base_image: ['nikolaik']
-    steps:
-      - uses: actions/checkout@v4
-      - name: Free Disk Space (Ubuntu)
-        uses: jlumbroso/free-disk-space@main
-        with:
-          # this might remove tools that are actually needed,
-          # if set to "true" but frees about 6 GB
-          tool-cache: true
-          # all of these default to true, but feel free to set to
-          # "false" if necessary for your workflow
-          android: true
-          dotnet: true
-          haskell: true
-          large-packages: true
-          docker-images: false
-          swap-storage: true
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@v3
-      # Forked repos can't push to GHCR, so we need to download the image as an artifact
-      - name: Download runtime image for fork
-        if: github.event.pull_request.head.repo.fork
-        uses: actions/download-artifact@v4
-        with:
-          name: runtime-${{ matrix.base_image }}
-          path: /tmp
-      - name: Load runtime image for fork
-        if: github.event.pull_request.head.repo.fork
-        run: |
-          docker load --input /tmp/runtime-${{ matrix.base_image }}.tar
-      - name: Cache Poetry dependencies
-        uses: actions/cache@v4
-        with:
-          path: |
-            ~/.cache/pypoetry
-            ~/.virtualenvs
-          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
-          restore-keys: |
-            ${{ runner.os }}-poetry-
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: '3.12'
-      - name: Install poetry via pipx
-        run: pipx install poetry
-      - name: Install Python dependencies using Poetry
-        run: make install-python-dependencies
-      - name: Run runtime tests
-        run: |
-          # We install pytest-xdist in order to run tests across CPUs
-          poetry run pip install pytest-xdist
-
-          # Install to be able to retry on failures for flaky tests
-          poetry run pip install pytest-rerunfailures
-
-          image_name=ghcr.io/${{ github.repository_owner }}/runtime:${{ env.RELEVANT_SHA }}-${{ matrix.base_image }}
-          image_name=$(echo $image_name | tr '[:upper:]' '[:lower:]')
-
-          SKIP_CONTAINER_LOGS=true \
-          TEST_RUNTIME=eventstream \
-          SANDBOX_USER_ID=$(id -u) \
-          SANDBOX_RUNTIME_CONTAINER_IMAGE=$image_name \
-          TEST_IN_CI=true \
-          RUN_AS_OPENHANDS=false \
-          poetry run pytest -n 3 -raRs --reruns 2 --reruns-delay 5 --cov=openhands --cov-report=xml -s ./tests/runtime
-      - name: Upload coverage to Codecov
-        uses: codecov/codecov-action@v4
-        env:
-          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
-
-  # Run unit tests with the EventStream runtime Docker images as openhands user
-  test_runtime_oh:
-    name: RT Unit Tests (openhands)
-    runs-on: ubuntu-latest
-    needs: [ghcr_build_runtime]
-    strategy:
-      matrix:
-        base_image: ['nikolaik']
-    steps:
-      - uses: actions/checkout@v4
-      - name: Free Disk Space (Ubuntu)
-        uses: jlumbroso/free-disk-space@main
-        with:
-          # this might remove tools that are actually needed,
-          # if set to "true" but frees about 6 GB
-          tool-cache: true
-          # all of these default to true, but feel free to set to
-          # "false" if necessary for your workflow
-          android: true
-          dotnet: true
-          haskell: true
-          large-packages: true
-          docker-images: false
-          swap-storage: true
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@v3
-      # Forked repos can't push to GHCR, so we need to download the image as an artifact
-      - name: Download runtime image for fork
-        if: github.event.pull_request.head.repo.fork
-        uses: actions/download-artifact@v4
-        with:
-          name: runtime-${{ matrix.base_image }}
-          path: /tmp
-      - name: Load runtime image for fork
-        if: github.event.pull_request.head.repo.fork
-        run: |
-          docker load --input /tmp/runtime-${{ matrix.base_image }}.tar
-      - name: Cache Poetry dependencies
-        uses: actions/cache@v4
-        with:
-          path: |
-            ~/.cache/pypoetry
-            ~/.virtualenvs
-          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
-          restore-keys: |
-            ${{ runner.os }}-poetry-
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: '3.12'
-      - name: Install poetry via pipx
-        run: pipx install poetry
-      - name: Install Python dependencies using Poetry
-        run: make install-python-dependencies
-      - name: Run runtime tests
-        run: |
-          # We install pytest-xdist in order to run tests across CPUs
-          poetry run pip install pytest-xdist
-
-          # Install to be able to retry on failures for flaky tests
-          poetry run pip install pytest-rerunfailures
-
-          image_name=ghcr.io/${{ github.repository_owner }}/runtime:${{ env.RELEVANT_SHA }}-${{ matrix.base_image }}
-          image_name=$(echo $image_name | tr '[:upper:]' '[:lower:]')
-
-          SKIP_CONTAINER_LOGS=true \
-          TEST_RUNTIME=eventstream \
-          SANDBOX_USER_ID=$(id -u) \
-          SANDBOX_RUNTIME_CONTAINER_IMAGE=$image_name \
-          TEST_IN_CI=true \
-          RUN_AS_OPENHANDS=true \
-          poetry run pytest -n 3 -raRs --reruns 2 --reruns-delay 5 --cov=openhands --cov-report=xml -s ./tests/runtime
-      - name: Upload coverage to Codecov
-        uses: codecov/codecov-action@v4
-        env:
-          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
-
-  # The two following jobs (named identically) are to check whether all the runtime tests have passed as the
-  # "All Runtime Tests Passed" is a required job for PRs to merge
-  # Due to this bug: https://github.com/actions/runner/issues/2566, we want to create a job that runs when the
-  # prerequisites have been cancelled or failed so merging is disallowed, otherwise Github considers "skipped" as "success"
-  runtime_tests_check_success:
-    name: All Runtime Tests Passed
-    if: ${{ !cancelled() && !contains(needs.*.result, 'failure') && !contains(needs.*.result, 'cancelled') }}
-    runs-on: ubuntu-latest
-    needs: [test_runtime_root, test_runtime_oh, verify_hash_equivalence_in_runtime_and_app]
-    steps:
-      - name: All tests passed
-        run: echo "All runtime tests have passed successfully!"
-
-  runtime_tests_check_fail:
-    name: All Runtime Tests Passed
-    if: ${{ cancelled() || contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled') }}
-    runs-on: ubuntu-latest
-    needs: [test_runtime_root, test_runtime_oh, verify_hash_equivalence_in_runtime_and_app]
-    steps:
-      - name: Some tests failed
-        run: |
-          echo "Some runtime tests failed or were cancelled"
-          exit 1
-  update_pr_description:
-    name: Update PR Description
-    if: github.event_name == 'pull_request' && !github.event.pull_request.head.repo.fork && github.actor != 'dependabot[bot]'
-    needs: [ghcr_build_runtime]
-    runs-on: ubuntu-latest
-    steps:
-      - name: Checkout
-        uses: actions/checkout@v4
-
-      - name: Get short SHA
-        id: short_sha
-        run: echo "SHORT_SHA=$(echo ${{ github.event.pull_request.head.sha }} | cut -c1-7)" >> $GITHUB_OUTPUT
-
-      - name: Update PR Description
-        env:
-          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-          PR_NUMBER: ${{ github.event.pull_request.number }}
-          REPO: ${{ github.repository }}
-          SHORT_SHA: ${{ steps.short_sha.outputs.SHORT_SHA }}
-        run: |
-          echo "updating PR description"
-          DOCKER_RUN_COMMAND="docker run -it --rm \
-            -p 3000:3000 \
-            -v /var/run/docker.sock:/var/run/docker.sock \
-            --add-host host.docker.internal:host-gateway \
-            -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:$SHORT_SHA-nikolaik \
-            --name openhands-app-$SHORT_SHA \
-            docker.all-hands.dev/all-hands-ai/openhands:$SHORT_SHA"
-
-          PR_BODY=$(gh pr view $PR_NUMBER --json body --jq .body)
-
-          if echo "$PR_BODY" | grep -q "To run this PR locally, use the following command:"; then
-            UPDATED_PR_BODY=$(echo "${PR_BODY}" | sed -E "s|docker run -it --rm.*|$DOCKER_RUN_COMMAND|")
-          else
-            UPDATED_PR_BODY="${PR_BODY}
-
-          ---
-
-          To run this PR locally, use the following command:
-          \`\`\`
-          $DOCKER_RUN_COMMAND
-          \`\`\`"
-          fi
-
-          echo "updated body: $UPDATED_PR_BODY"
-          gh pr edit $PR_NUMBER --body "$UPDATED_PR_BODY"
--- a/.github/workflows/ghcr.yml
+++ b/.github/workflows/ghcr.yml
@@ -0,0 +1,263 @@
+name: Build Publish and Test Docker Image
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
+
+on:
+  push:
+    branches:
+      - main
+    tags:
+      - '*'
+  pull_request:
+  workflow_dispatch:
+    inputs:
+      reason:
+        description: 'Reason for manual trigger'
+        required: true
+        default: ''
+
+jobs:
+  ghcr_build:
+    runs-on: ubuntu-latest
+
+    outputs:
+      tags: ${{ steps.capture-tags.outputs.tags }}
+
+    permissions:
+      contents: read
+      packages: write
+
+    strategy:
+      matrix:
+        image: ["sandbox", "opendevin"]
+        platform: ["amd64", "arm64"]
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Free Disk Space (Ubuntu)
+        uses: jlumbroso/free-disk-space@main
+        with:
+          # this might remove tools that are actually needed,
+          # if set to "true" but frees about 6 GB
+          tool-cache: true
+          # all of these default to true, but feel free to set to
+          # "false" if necessary for your workflow
+          android: true
+          dotnet: true
+          haskell: true
+          large-packages: true
+          docker-images: false
+          swap-storage: true
+
+      - name: Set up QEMU
+        uses: docker/setup-qemu-action@v3
+
+      - name: Set up Docker Buildx
+        id: buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Build and export image
+        id: build
+        run: ./containers/build.sh ${{ matrix.image }} ${{ github.repository_owner }} ${{ matrix.platform }}
+
+      - name: Capture tags
+        id: capture-tags
+        run: |
+          tags=$(cat tags.txt)
+          echo "tags=$tags"
+          echo "tags=$tags" >> $GITHUB_OUTPUT
+
+      - name: Upload Docker image as artifact
+        uses: actions/upload-artifact@v4
+        with:
+          name: ${{ matrix.image }}-docker-image-${{ matrix.platform }}
+          path: /tmp/${{ matrix.image }}_image_${{ matrix.platform }}.tar
+
+  test-for-sandbox:
+    name: Test for Sandbox
+    runs-on: ubuntu-latest
+    needs: ghcr_build
+    env:
+      PERSIST_SANDBOX: "false"
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install poetry via pipx
+        run: pipx install poetry
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+          cache: "poetry"
+
+      - name: Install Python dependencies using Poetry
+        run: make install-python-dependencies
+
+      - name: Download sandbox Docker image
+        uses: actions/download-artifact@v4
+        with:
+          name: sandbox-docker-image-amd64
+          path: /tmp/
+
+      - name: Load sandbox image and run sandbox tests
+        run: |
+          # Load the Docker image and capture the output
+          output=$(docker load -i /tmp/sandbox_image_amd64.tar)
+
+          # Extract the first image name from the output
+          image_name=$(echo "$output" | grep -oP 'Loaded image: \K.*' | head -n 1)
+
+          # Print the full name of the image
+          echo "Loaded Docker image: $image_name"
+
+          SANDBOX_CONTAINER_IMAGE=$image_name TEST_IN_CI=true poetry run pytest --cov=agenthub --cov=opendevin --cov-report=xml -s ./tests/unit/test_sandbox.py
+
+      - name: Upload coverage to Codecov
+        uses: codecov/codecov-action@v4
+        env:
+          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
+
+  integration-tests-on-linux:
+    name: Integration Tests on Linux
+    runs-on: ubuntu-latest
+    needs: ghcr_build
+    env:
+      PERSIST_SANDBOX: "false"
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.11"]
+        sandbox: ["ssh", "local"]
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install poetry via pipx
+        run: pipx install poetry
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+          cache: 'poetry'
+
+      - name: Install Python dependencies using Poetry
+        run: make install-python-dependencies
+
+      - name: Download sandbox Docker image
+        uses: actions/download-artifact@v4
+        with:
+          name: sandbox-docker-image-amd64
+          path: /tmp/
+
+      - name: Load sandbox image and run integration tests
+        env:
+          SANDBOX_BOX_TYPE: ${{ matrix.sandbox }}
+        run: |
+          # Load the Docker image and capture the output
+          output=$(docker load -i /tmp/sandbox_image_amd64.tar)
+
+          # Extract the first image name from the output
+          image_name=$(echo "$output" | grep -oP 'Loaded image: \K.*' | head -n 1)
+
+          # Print the full name of the image
+          echo "Loaded Docker image: $image_name"
+
+          SANDBOX_CONTAINER_IMAGE=$image_name TEST_IN_CI=true TEST_ONLY=true ./tests/integration/regenerate.sh
+
+      - name: Upload coverage to Codecov
+        uses: codecov/codecov-action@v4
+        env:
+          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
+
+  ghcr_push:
+    runs-on: ubuntu-latest
+    # don't push if integration tests or sandbox tests fail
+    needs: [ghcr_build, integration-tests-on-linux, test-for-sandbox]
+    if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/')
+
+    env:
+      tags: ${{ needs.ghcr_build.outputs.tags }}
+
+    permissions:
+      contents: read
+      packages: write
+
+    strategy:
+      matrix:
+        image: ["sandbox", "opendevin"]
+        platform: ["amd64", "arm64"]
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Login to GHCR
+        uses: docker/login-action@v2
+        with:
+          registry: ghcr.io
+          username: ${{ github.repository_owner }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Download Docker images
+        uses: actions/download-artifact@v4
+        with:
+          name: ${{ matrix.image }}-docker-image-${{ matrix.platform }}
+          path: /tmp/${{ matrix.platform }}
+
+      - name: Load images and push to registry
+        run: |
+          mv /tmp/${{ matrix.platform }}/${{ matrix.image }}_image_${{ matrix.platform }}.tar .
+          loaded_image=$(docker load -i ${{ matrix.image }}_image_${{ matrix.platform }}.tar | grep "Loaded image:" | head -n 1 | awk '{print $3}')
+          echo "loaded image = $loaded_image"
+          tags=$(echo ${tags} | tr ' ' '\n')
+          image_name=$(echo "ghcr.io/${{ github.repository_owner }}/${{ matrix.image }}" | tr '[:upper:]' '[:lower:]')
+          echo "image name = $image_name"
+          for tag in $tags; do
+            echo "tag = $tag"
+            docker tag $loaded_image $image_name:${tag}_${{ matrix.platform }}
+            docker push $image_name:${tag}_${{ matrix.platform }}
+          done
+
+  create_manifest:
+    runs-on: ubuntu-latest
+    needs: [ghcr_build, ghcr_push]
+    if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/')
+
+    env:
+      tags: ${{ needs.ghcr_build.outputs.tags }}
+
+    strategy:
+      matrix:
+        image: ["sandbox", "opendevin"]
+
+    permissions:
+      contents: read
+      packages: write
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Login to GHCR
+        uses: docker/login-action@v2
+        with:
+          registry: ghcr.io
+          username: ${{ github.repository_owner }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Create and push multi-platform manifest
+        run: |
+          image_name=$(echo "ghcr.io/${{ github.repository_owner }}/${{ matrix.image }}" | tr '[:upper:]' '[:lower:]')
+          echo "image name = $image_name"
+          tags=$(echo ${tags} | tr ' ' '\n')
+          for tag in $tags; do
+            echo "tag = $tag"
+            docker buildx imagetools create --tag $image_name:$tag \
+              $image_name:${tag}_amd64 \
+              $image_name:${tag}_arm64
+          done
--- a/.github/workflows/lint.yml
+++ b/.github/workflows/lint.yml
@@ -1,41 +1,37 @@
-# Workflow that runs lint on the frontend and python code
 name: Lint

-# The jobs in this workflow are required, so they must run at all times
-# Always run on "main"
-# Always run on PRs
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
+
 on:
  push:
    branches:
    - main
  pull_request:

-# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
-concurrency:
-  group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
-  cancel-in-progress: true
-
 jobs:
-  # Run lint on the frontend code
  lint-frontend:
    name: Lint frontend
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
+
      - name: Install Node.js 20
        uses: actions/setup-node@v4
        with:
          node-version: 20
+
      - name: Install dependencies
        run: |
          cd frontend
          npm install --frozen-lockfile
+
      - name: Lint
        run: |
          cd frontend
          npm run lint

-  # Run lint on the python code
  lint-python:
    name: Lint python
    runs-on: ubuntu-latest
@@ -46,9 +42,9 @@ jobs:
      - name: Set up python
        uses: actions/setup-python@v5
        with:
-          python-version: 3.12
+          python-version: 3.11
          cache: 'pip'
      - name: Install pre-commit
        run: pip install pre-commit==3.7.0
      - name: Run pre-commit hooks
-        run: pre-commit run --files openhands/**/* evaluation/**/* tests/**/* --show-diff-on-failure --config ./dev_config/python/.pre-commit-config.yaml
+        run: pre-commit run --files opendevin/**/* agenthub/**/* evaluation/**/* tests/**/* --show-diff-on-failure --config ./dev_config/python/.pre-commit-config.yaml
--- a/.github/workflows/openhands-resolver.yml
+++ b/.github/workflows/openhands-resolver.yml
@@ -1,15 +0,0 @@
-name: Resolve Issues with OpenHands
-
-on:
-  issues:
-    types: [labeled]
-  pull_request:
-    types: [labeled]
-
-jobs:
-  call-openhands-resolver:
-    uses: All-Hands-AI/openhands-resolver/.github/workflows/openhands-resolver.yml@main
-    if: github.event.label.name == 'fix-me'
-    with:
-      max_iterations: 50
-    secrets: inherit
--- a/.github/workflows/py-unit-tests-mac.yml
+++ b/.github/workflows/py-unit-tests-mac.yml
@@ -1,96 +0,0 @@
-# Workflow that runs python unit tests on mac
-name: Run Python Unit Tests Mac
-
-# This job is flaky so only run it nightly
-on:
-  schedule:
-    - cron: '0 0 * * *'
-
-jobs:
-  # Run python unit tests on macOS
-  test-on-macos:
-    name: Python Unit Tests on macOS
-    runs-on: macos-14
-    env:
-      INSTALL_DOCKER: '1' # Set to '0' to skip Docker installation
-    strategy:
-      matrix:
-        python-version: ['3.12']
-    steps:
-      - uses: actions/checkout@v4
-      - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v5
-        with:
-          python-version: ${{ matrix.python-version }}
-      - name: Cache Poetry dependencies
-        uses: actions/cache@v4
-        with:
-          path: |
-            ~/.cache/pypoetry
-            ~/.virtualenvs
-          key: ${{ runner.os }}-poetry-${{ hashFiles('**/poetry.lock') }}
-          restore-keys: |
-            ${{ runner.os }}-poetry-
-      - name: Install poetry via pipx
-        run: pipx install poetry
-      - name: Install Python dependencies using Poetry
-        run: poetry install --without evaluation,llama-index
-      - name: Install & Start Docker
-        if: env.INSTALL_DOCKER == '1'
-        run: |
-          INSTANCE_NAME="colima-${GITHUB_RUN_ID}"
-
-          # Uninstall colima to upgrade to the latest version
-          if brew list colima &>/dev/null; then
-            brew uninstall colima
-            # unlinking colima dependency: go
-            brew uninstall go@1.21
-          fi
-          rm -rf ~/.colima ~/.lima
-          brew install --HEAD colima
-          brew install docker
-
-          start_colima() {
-            # Find a free port in the range 10000-20000
-            RANDOM_PORT=$((RANDOM % 10001 + 10000))
-
-            # Original line:
-            if ! colima start --network-address --arch x86_64 --cpu=1 --memory=1 --verbose --ssh-port $RANDOM_PORT; then
-              echo "Failed to start Colima."
-              return 1
-            fi
-            return 0
-          }
-
-          # Attempt to start Colima for 5 total attempts:
-          ATTEMPT_LIMIT=5
-          for ((i=1; i<=ATTEMPT_LIMIT; i++)); do
-
-            if start_colima; then
-              echo "Colima started successfully."
-              break
-            else
-              colima stop -f
-              sleep 10
-              colima delete -f
-              if [ $i -eq $ATTEMPT_LIMIT ]; then
-                exit 1
-              fi
-              sleep 10
-            fi
-          done
-
-          # For testcontainers to find the Colima socket
-          # https://github.com/abiosoft/colima/blob/main/docs/FAQ.md#cannot-connect-to-the-docker-daemon-at-unixvarrundockersock-is-the-docker-daemon-running
-          sudo ln -sf $HOME/.colima/default/docker.sock /var/run/docker.sock
-      - name: Build Environment
-        run: make build
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@v3
-      - name: Run Tests
-        run: poetry run pytest --forked --cov=openhands --cov-report=xml ./tests/unit --ignore=tests/unit/test_memory.py
-      - name: Upload coverage to Codecov
-        uses: codecov/codecov-action@v4
-        env:
-          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
--- a/.github/workflows/py-unit-tests.yml
+++ b/.github/workflows/py-unit-tests.yml
@@ -1,49 +0,0 @@
-# Workflow that runs python unit tests
-name: Run Python Unit Tests
-
-# The jobs in this workflow are required, so they must run at all times
-# * Always run on "main"
-# * Always run on PRs
-on:
-  push:
-    branches:
-      - main
-  pull_request:
-
-# If triggered by a PR, it will be in the same group. However, each commit on main will be in its own unique group
-concurrency:
-  group: ${{ github.workflow }}-${{ (github.head_ref && github.ref) || github.run_id }}
-  cancel-in-progress: true
-
-jobs:
-  # Run python unit tests on Linux
-  test-on-linux:
-    name: Python Unit Tests on Linux
-    runs-on: ubuntu-latest
-    env:
-      INSTALL_DOCKER: '0' # Set to '0' to skip Docker installation
-    strategy:
-      matrix:
-        python-version: ['3.12']
-    steps:
-      - uses: actions/checkout@v4
-      - name: Set up Docker Buildx
-        id: buildx
-        uses: docker/setup-buildx-action@v3
-      - name: Install poetry via pipx
-        run: pipx install poetry
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: ${{ matrix.python-version }}
-          cache: 'poetry'
-      - name: Install Python dependencies using Poetry
-        run: poetry install --without evaluation,llama-index
-      - name: Build Environment
-        run: make build
-      - name: Run Tests
-        run: poetry run pytest --forked --cov=openhands --cov-report=xml -svv ./tests/unit --ignore=tests/unit/test_memory.py
-      - name: Upload coverage to Codecov
-        uses: codecov/codecov-action@v4
-        env:
-          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
--- a/.github/workflows/pypi-release.yml
+++ b/.github/workflows/pypi-release.yml
@@ -1,31 +0,0 @@
-# Publishes the OpenHands PyPi package
-name: Publish PyPi Package
-
-# Triggered manually
-on:
-  workflow_dispatch:
-    inputs:
-      reason:
-        description: 'Reason for manual trigger'
-        required: true
-        default: ''
-
-jobs:
-  release:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - uses: actions/setup-python@v5
-        with:
-          python-version: 3.12
-      - name: Install Poetry
-        uses: snok/install-poetry@v1.4.1
-        with:
-          virtualenvs-in-project: true
-          virtualenvs-path: ~/.virtualenvs
-      - name: Install Poetry Dependencies
-        run: poetry install --no-interaction --no-root
-      - name: Build poetry project
-        run: ./build.sh
-      - name: publish
-        run: poetry publish -u __token__ -p ${{ secrets.PYPI_TOKEN }}
--- a/.github/workflows/review-pr.yml
+++ b/.github/workflows/review-pr.yml
@@ -1,5 +1,4 @@
-# Workflow that uses OpenHands to review a pull request. PR must be labeled 'review-this'
-name: Use OpenHands to Review Pull Request
+name: Use OpenDevin to Review Pull Request

 on:
  pull_request:
@@ -13,31 +12,29 @@ jobs:
  dogfood:
    if: contains(github.event.pull_request.labels.*.name, 'review-this')
    runs-on: ubuntu-latest
+    container:
+      image: ghcr.io/opendevin/opendevin
+      volumes:
+        - /var/run/docker.sock:/var/run/docker.sock
+
    steps:
-    - uses: actions/checkout@v4
-    - name: Set up Docker Buildx
-      id: buildx
-      uses: docker/setup-buildx-action@v3
-    - name: Set up Python
-      uses: actions/setup-python@v5
-      with:
-        python-version: '3.12'
    - name: install git, github cli
      run: |
-        sudo apt-get install -y git gh
+        apt-get install -y git gh
        git config --global --add safe.directory $PWD
+
    - name: Checkout Repository
      uses: actions/checkout@v4
      with:
        ref: ${{ github.event.pull_request.base.ref }} # check out the target branch
+
    - name: Download Diff
      run: |
        curl -O "${{ github.event.pull_request.diff_url }}" -L
+
    - name: Write Task File
      run: |
-        echo "Your coworker wants to apply a pull request to this project." > task.txt
-        echo "Read and review ${{ github.event.pull_request.number }}.diff file. Create a review-${{ github.event.pull_request.number }}.txt and write your concise comments and suggestions there." >> task.txt
-        echo "Do not ask me for confirmation at any point." >> task.txt
+        echo "Your coworker wants to apply a pull request to this project. Read and review ${{ github.event.pull_request.number }}.diff file. Create a review-${{ github.event.pull_request.number }}.txt and write your concise comments and suggestions there." > task.txt
        echo "" >> task.txt
        echo "Title" >> task.txt
        echo "${{ github.event.pull_request.title }}" >> task.txt
@@ -46,25 +43,27 @@ jobs:
        echo "${{ github.event.pull_request.body }}" >> task.txt
        echo "" >> task.txt
        echo "Diff file is: ${{ github.event.pull_request.number }}.diff" >> task.txt
+
    - name: Set up environment
      run: |
        curl -sSL https://install.python-poetry.org | python3 -
        export PATH="/github/home/.local/bin:$PATH"
-        poetry install --without evaluation,llama-index
+        poetry install --without evaluation
        poetry run playwright install --with-deps chromium
-    - name: Run OpenHands
+
+    - name: Run OpenDevin
      env:
-        LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
-        LLM_MODEL: ${{ vars.LLM_MODEL }}
+        LLM_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+        OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+        SANDBOX_BOX_TYPE: ssh
      run: |
        # Append path to launch poetry
        export PATH="/github/home/.local/bin:$PATH"
        # Append path to correctly import package, note: must set pwd at first
        export PYTHONPATH=$(pwd):$PYTHONPATH
-        export WORKSPACE_MOUNT_PATH=$GITHUB_WORKSPACE
-        export WORKSPACE_BASE=$GITHUB_WORKSPACE
-        echo -e "/exit\n" | poetry run python openhands/core/main.py -i 50 -f task.txt
+        WORKSPACE_MOUNT_PATH=$GITHUB_WORKSPACE poetry run python ./opendevin/core/main.py -i 50 -f task.txt -d $GITHUB_WORKSPACE
        rm task.txt
+
    - name: Check if review file is non-empty
      id: check_file
      run: |
@@ -73,6 +72,7 @@ jobs:
          echo "non_empty=true" >> $GITHUB_OUTPUT
        fi
      shell: bash
+
    - name: Create PR review if file is non-empty
      env:
        GH_TOKEN: ${{ github.token }}
--- a/.github/workflows/run-unit-tests.yml
+++ b/.github/workflows/run-unit-tests.yml
@@ -0,0 +1,138 @@
+name: Run Unit Tests
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
+
+on:
+  push:
+    branches:
+      - main
+    paths-ignore:
+      - '**/*.md'
+      - 'frontend/**'
+      - 'docs/**'
+      - 'evaluation/**'
+  pull_request:
+
+env:
+  PERSIST_SANDBOX : "false"
+
+jobs:
+  fe-test:
+    runs-on: ubuntu-latest
+
+    strategy:
+      matrix:
+        node-version: [20]
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: ${{ matrix.node-version }}
+
+      - name: Install dependencies
+        working-directory: ./frontend
+        run: npm ci
+
+      - name: Run tests and collect coverage
+        working-directory: ./frontend
+        run: npm run test:coverage
+
+      - name: Upload coverage to Codecov
+        uses: codecov/codecov-action@v4
+        env:
+          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
+
+  test-on-macos:
+    name: Test on macOS
+    runs-on: macos-12
+    env:
+      INSTALL_DOCKER: "1" # Set to '0' to skip Docker installation
+    strategy:
+      matrix:
+        python-version: ["3.11"]
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install poetry via pipx
+        run: pipx install poetry
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+          cache: "poetry"
+
+      - name: Install Python dependencies using Poetry
+        run: poetry install
+
+      - name: Install & Start Docker
+        if: env.INSTALL_DOCKER == '1'
+        run: |
+          # Uninstall colima to upgrade to the latest version
+          if brew list colima &>/dev/null; then
+              brew uninstall colima
+              # unlinking colima dependency: go
+              brew uninstall go@1.21
+          fi
+          rm -rf ~/.colima ~/.lima
+          brew install --HEAD colima
+          brew services start colima
+          brew install docker
+          colima delete
+          colima start  --network-address --arch x86_64 --cpu=1 --memory=1
+
+          # For testcontainers to find the Colima socket
+          # https://github.com/abiosoft/colima/blob/main/docs/FAQ.md#cannot-connect-to-the-docker-daemon-at-unixvarrundockersock-is-the-docker-daemon-running
+          sudo ln -sf $HOME/.colima/default/docker.sock /var/run/docker.sock
+
+      - name: Build Environment
+        run: make build
+
+      - name: Run Tests
+        run: poetry run pytest --forked --cov=agenthub --cov=opendevin --cov-report=xml ./tests/unit -k "not test_sandbox"
+
+      - name: Upload coverage to Codecov
+        uses: codecov/codecov-action@v4
+        env:
+          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
+  test-on-linux:
+    name: Test on Linux
+    runs-on: ubuntu-latest
+    env:
+      INSTALL_DOCKER: "0" # Set to '0' to skip Docker installation
+    strategy:
+      matrix:
+        python-version: ["3.11"]
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install poetry via pipx
+        run: pipx install poetry
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+          cache: "poetry"
+
+      - name: Install Python dependencies using Poetry
+        run: poetry install --without evaluation
+
+      - name: Build Environment
+        run: make build
+
+      - name: Run Tests
+        run: poetry run pytest --forked --cov=agenthub --cov=opendevin --cov-report=xml ./tests/unit -k "not test_sandbox"
+
+      - name: Upload coverage to Codecov
+        uses: codecov/codecov-action@v4
+        env:
+          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
--- a/.github/workflows/solve-issue.yml
+++ b/.github/workflows/solve-issue.yml
@@ -0,0 +1,122 @@
+name: Use OpenDevin to Resolve GitHub Issue
+
+on:
+  issues:
+    types: [labeled]
+
+permissions:
+  contents: write
+  pull-requests: write
+  issues: write
+
+jobs:
+  dogfood:
+    if: github.event.label.name == 'solve-this'
+    runs-on: ubuntu-latest
+    container:
+      image: ghcr.io/opendevin/opendevin
+      volumes:
+        - /var/run/docker.sock:/var/run/docker.sock
+
+    steps:
+    - name: install git, github cli
+      run: apt-get install -y git gh
+
+    - name: Checkout Repository
+      uses: actions/checkout@v4
+
+    - name: Write Task File
+      env:
+        ISSUE_TITLE: ${{ github.event.issue.title }}
+        ISSUE_BODY: ${{ github.event.issue.body }}
+      run: |
+        echo "TITLE:" > task.txt
+        echo "${ISSUE_TITLE}" >> task.txt
+        echo "" >> task.txt
+        echo "BODY:" >> task.txt
+        echo "${ISSUE_BODY}" >> task.txt
+
+    - name: Set up environment
+      run: |
+        curl -sSL https://install.python-poetry.org | python3 -
+        export PATH="/github/home/.local/bin:$PATH"
+        poetry install --without evaluation
+        poetry run playwright install --with-deps chromium
+
+
+    - name: Run OpenDevin
+      env:
+        ISSUE_TITLE: ${{ github.event.issue.title }}
+        ISSUE_BODY: ${{ github.event.issue.body }}
+        LLM_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+        OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+        SANDBOX_BOX_TYPE: ssh
+      run: |
+        # Prepend Poetry's install directory to PATH so the poetry command is found
+        export PATH="/github/home/.local/bin:$PATH"
+        # Add the repository root to PYTHONPATH so the package imports correctly (run from the repository root)
+        export PYTHONPATH=$(pwd):$PYTHONPATH
+        WORKSPACE_MOUNT_PATH=$GITHUB_WORKSPACE poetry run python ./opendevin/core/main.py -i 50 -f task.txt -d $GITHUB_WORKSPACE
+        rm task.txt
+
+    - name: Setup Git, Create Branch, and Commit Changes
+      run: |
+        # Setup Git configuration
+        git config --global --add safe.directory $PWD
+        git config --global user.name 'OpenDevin'
+        git config --global user.email 'OpenDevin@users.noreply.github.com'
+
+        # Create a unique branch name with a timestamp
+        BRANCH_NAME="fix/${{ github.event.issue.number }}-$(date +%Y%m%d%H%M%S)"
+
+        # Checkout new branch
+        git checkout -b $BRANCH_NAME
+
+        # Add all changes to staging, except task.txt
+        git add --all -- ':!task.txt'
+
+        # Commit the changes, if any. The step shell exits on the first
+        # failing command, so test the commit directly rather than via $?.
+        if ! git commit -m "OpenDevin: Resolve Issue #${{ github.event.issue.number }}"; then
+          echo "No changes to commit."
+          exit 0
+        fi
+
+        # Push changes
+        git push --set-upstream origin $BRANCH_NAME
+
+    - name: Fetch Default Branch
+      env:
+        GH_TOKEN: ${{ github.token }}
+      run: |
+        # Fetch the default branch using gh cli
+        DEFAULT_BRANCH=$(gh repo view --json defaultBranchRef --jq .defaultBranchRef.name)
+        echo "Default branch is $DEFAULT_BRANCH"
+        echo "DEFAULT_BRANCH=$DEFAULT_BRANCH" >> $GITHUB_ENV
+
+    - name: Generate PR
+      env:
+        GH_TOKEN: ${{ github.token }}
+      run: |
+        # Create PR and capture URL
+        PR_URL=$(gh pr create \
+          --title "OpenDevin: Resolve Issue #${{ github.event.issue.number }}" \
+          --body "This PR was generated by OpenDevin to resolve issue #${{ github.event.issue.number }}" \
+          --repo "${{ github.repository }}" \
+          --head "${{ github.head_ref }}" \
+          --base "${{ env.DEFAULT_BRANCH }}" \
+          | grep -o 'https://github.com/[^ ]*')
+
+        # Extract PR number from URL
+        PR_NUMBER=$(echo "$PR_URL" | grep -o '[0-9]\+$')
+
+        # Set environment vars
+        echo "PR_URL=$PR_URL" >> $GITHUB_ENV
+        echo "PR_NUMBER=$PR_NUMBER" >> $GITHUB_ENV
+
+    - name: Post Comment
+      env:
+        GH_TOKEN: ${{ github.token }}
+      run: |
+        gh issue comment ${{ github.event.issue.number }} \
+          -b "OpenDevin raised [PR #${{ env.PR_NUMBER }}](${{ env.PR_URL }}) to resolve this issue."
--- a/.github/workflows/stale.yml
+++ b/.github/workflows/stale.yml
@@ -1,7 +1,4 @@
-# Workflow that marks issues and PRs with no activity for 30 days with "Stale" and closes them after 7 more days of no activity
 name: 'Close stale issues'
-
-# Runs every day at 01:30
 on:
  schedule:
    - cron: '30 1 * * *'
@@ -12,10 +9,21 @@ jobs:
    steps:
      - uses: actions/stale@v9
        with:
+          # Aggressively close issues that have been explicitly labeled `age-out`
+          any-of-labels: age-out
+          stale-issue-message: 'This issue is stale because it has been open for 7 days with no activity. Remove stale label or comment or this will be closed in 1 day.'
+          close-issue-message: 'This issue was closed because it has been stalled for over 7 days with no activity.'
+          stale-pr-message: 'This PR is stale because it has been open for 7 days with no activity. Remove stale label or comment or this will be closed in 1 day.'
+          close-pr-message: 'This PR was closed because it has been stalled for over 7 days with no activity.'
+          days-before-stale: 7
+          days-before-close: 1
+
+      - uses: actions/stale@v9
+        with:
+          # Be more lenient with other issues
          stale-issue-message: 'This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.'
-          stale-pr-message: 'This PR is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.'
-          days-before-stale: 30
-          exempt-issue-labels: 'tracked'
          close-issue-message: 'This issue was closed because it has been stalled for over 30 days with no activity.'
+          stale-pr-message: 'This PR is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.'
          close-pr-message: 'This PR was closed because it has been stalled for over 30 days with no activity.'
+          days-before-stale: 30
          days-before-close: 7
--- a/.github/workflows/update-pyproject-version.yml
+++ b/.github/workflows/update-pyproject-version.yml
@@ -0,0 +1,48 @@
+name: Update pyproject.toml Version and Tags
+
+on:
+  release:
+    types:
+      - published
+
+jobs:
+  update-pyproject-and-tags:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0  # Fetch all history for all branches and tags
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install toml
+
+      - name: Get release tag
+        id: get_release_tag
+        run: echo "RELEASE_TAG=${GITHUB_REF#refs/tags/}" >> $GITHUB_ENV
+
+      - name: Update pyproject.toml with release tag
+        run: |
+          python -c "
+          import toml
+          with open('pyproject.toml', 'r') as f:
+              data = toml.load(f)
+          data['tool']['poetry']['version'] = '${{ env.RELEASE_TAG }}'
+          with open('pyproject.toml', 'w') as f:
+              toml.dump(data, f)
+          "
+
+      - name: Commit and push pyproject.toml changes
+        uses: stefanzweifel/git-auto-commit-action@v4
+        with:
+          commit_message: "Update pyproject.toml version to ${{ env.RELEASE_TAG }}"
+          branch: main
+          file_pattern: pyproject.toml
--- a/.gitignore
+++ b/.gitignore
@@ -121,7 +121,6 @@ celerybeat.pid

 # Environments
 .env
-frontend/.env
 .venv
 env/
 venv/
@@ -170,15 +169,11 @@ evaluation/outputs
 evaluation/swe_bench/eval_workspace*
 evaluation/SWE-bench/data
 evaluation/webarena/scripts/webarena_env.sh
-evaluation/bird/data
-evaluation/gaia/data
-evaluation/gorilla/data
-evaluation/toolqa/data
-evaluation/scienceagentbench/benchmark

 # frontend

 # dependencies
+frontend/node_modules
 frontend/.pnp
 frontend/bun.lockb
 frontend/yarn.lock
@@ -215,17 +210,10 @@ cache

 # configuration
 config.toml
-config.toml_
 config.toml.bak

+containers/agnostic_sandbox
+
 # swe-bench-eval
 image_build_logs
 run_instance_logs
-
-runtime_*.tar
-
-# docker build
-containers/runtime/Dockerfile
-containers/runtime/project.tar.gz
-containers/runtime/code
-**/node_modules/
--- a/.openhands_instructions
+++ b/.openhands_instructions
@@ -1,28 +0,0 @@
-OpenHands is an automated AI software engineer. It is a repo with a Python backend
-(in the `openhands` directory) and TypeScript frontend (in the `frontend` directory).
-
-General Setup:
- To set up the entire repo, including frontend and backend, run `make build`
- To run linting and type-checking before finishing the job, run `poetry run pre-commit run --all-files --config ./dev_config/python/.pre-commit-config.yaml`
-
-Backend:
- Located in the `openhands` directory
- Testing:
-  - All tests are in `tests/unit/test_*.py`
-  - To test new code, run `poetry run pytest tests/unit/test_xxx.py` where `xxx` is the appropriate file for the current functionality
-  - Write all tests with pytest
-
-Frontend:
- Located in the `frontend` directory
- Prerequisites: A recent version of NodeJS / NPM
- Setup: Run `npm install` in the frontend directory
- Testing:
-  - Run tests: `npm run test`
-  - To run specific tests: `npm run test -- -t "TestName"`
- Building:
-  - Build for production: `npm run build`
- Environment Variables:
-  - Set in `frontend/.env` or as environment variables
-  - Available variables: VITE_BACKEND_HOST, VITE_USE_TLS, VITE_INSECURE_SKIP_VERIFY, VITE_FRONTEND_PORT
- Internationalization:
-  - Generate i18n declaration file: `npm run make-i18n`
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,71 +1,96 @@
 # Contributing

-Thanks for your interest in contributing to OpenHands! We welcome and appreciate contributions.
+Thanks for your interest in contributing to OpenDevin! We welcome and appreciate contributions.

-## Understanding OpenHands's CodeBase
-
-To understand the codebase, please refer to the README in each module:
- [frontend](./frontend/README.md)
- [evaluation](./evaluation/README.md)
- [openhands](./openhands/README.md)
-   - [agenthub](./openhands/agenthub/README.md)
-   - [server](./openhands/server/README.md)
-
-## Setting up your development environment
-
-We have a separate doc [Development.md](https://github.com/All-Hands-AI/OpenHands/blob/main/Development.md) that tells you how to set up a development workflow.
-
-## How can I contribute?
+## How Can I Contribute?

 There are many ways that you can contribute:

-1. **Download and use** OpenHands, and send [issues](https://github.com/All-Hands-AI/OpenHands/issues) when you encounter something that isn't working or a feature that you'd like to see.
-2. **Send feedback** after each session by [clicking the thumbs-up thumbs-down buttons](https://docs.all-hands.dev/modules/usage/feedback), so we can see where things are working and failing, and also build an open dataset for training code agents.
-3. **Improve the Codebase** by sending PRs (see details below). In particular, we have some [good first issues](https://github.com/All-Hands-AI/OpenHands/labels/good%20first%20issue) that may be ones to start on.
+1. **Download and use** OpenDevin, and send [issues](https://github.com/OpenDevin/OpenDevin/issues) when you encounter something that isn't working or a feature that you'd like to see.
+2. **Send feedback** after each session by [clicking the thumbs-up thumbs-down buttons](https://opendevin.github.io/OpenDevin/modules/usage/feedback), so we can see where things are working and failing, and also build an open dataset for training code agents.
+3. **Improve the Codebase** by sending PRs (see details below). In particular, we have some [good first issues](https://github.com/OpenDevin/OpenDevin/labels/good%20first%20issue) that may be good ones to start on.

-## What can I build?
-Here are a few ways you can help improve the codebase.
+## Understanding OpenDevin's CodeBase

-#### UI/UX
-We're always looking to improve the look and feel of the application. If you've got a small fix
-for something that's bugging you, feel free to open up a PR that changes the `./frontend` directory.
+To understand the codebase, please refer to the README in each module:
+- [frontend](./frontend/README.md)
+- [agenthub](./agenthub/README.md)
+- [evaluation](./evaluation/README.md)
+- [opendevin](./opendevin/README.md)
+    - [server](./opendevin/server/README.md)

-If you're looking to make a bigger change, add a new UI element, or significantly alter the style
-of the application, please open an issue first, or better, join the #frontend channel in our Slack
-to gather consensus from our design team first.
-
-#### Improving the agent
-Our main agent is the CodeAct agent. You can [see its prompts here](https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/agenthub/codeact_agent)
-
-Changes to these prompts, and to the underlying behavior in Python, can have a huge impact on user experience.
-You can try modifying the prompts to see how they change the behavior of the agent as you use the app
-locally, but we will need to do an end-to-end evaluation of any changes here to ensure that the agent
-is getting better over time.
-
-We use the [SWE-bench](https://www.swebench.com/) benchmark to test our agent. You can join the #evaluation
-channel in Slack to learn more.
-
-#### Adding a new agent
-You may want to experiment with building new types of agents. You can add an agent to `openhands/agenthub`
-to help expand the capabilities of OpenHands.
-
-#### Adding a new runtime
-The agent needs a place to run code and commands. When you run OpenHands on your laptop, it uses a Docker container
-to do this by default. But there are other ways of creating a sandbox for the agent.
-
-If you work for a company that provides a cloud-based runtime, you could help us add support for that runtime
-by implementing the [interface specified here](https://github.com/All-Hands-AI/OpenHands/blob/main/openhands/runtime/runtime.py).
-
-#### Testing
 When you write code, it is also good to write tests. Please navigate to the `tests` folder to see existing test suites.
 At the moment, we have two kinds of tests: `unit` and `integration`. Please refer to the README for each test suite. These tests also run on GitHub's continuous integration to ensure quality of the project.

-## Sending Pull Requests to OpenHands
+## Sending Pull Requests to OpenDevin

-You'll need to fork our repository to send us a Pull Request. You can learn more
-about how to fork a GitHub repo and open a PR with your changes in [this article](https://medium.com/swlh/forks-and-pull-requests-how-to-contribute-to-github-repos-8843fac34ce8)
+### 1. Fork the Official Repository
+Fork the [OpenDevin repository](https://github.com/OpenDevin/OpenDevin) into your own account.
+Clone your own forked repository into your local environment:

-### Pull Request title
+```shell
+git clone git@github.com:<YOUR-USERNAME>/OpenDevin.git
+```
+
+### 2. Configure Git
+
+Set the official repository as your [upstream](https://www.atlassian.com/git/tutorials/git-forks-and-upstreams) so that you can stay synchronized with its latest updates.
+Add the official repository as `upstream`:
+
+```shell
+cd OpenDevin
+git remote add upstream git@github.com:OpenDevin/OpenDevin.git
+```
+
+Verify that the remote is set:
+
+```shell
+git remote -v
+```
+
+You should see both `origin` and `upstream` in the output.
+
+### 3. Synchronize with Official Repository
+Synchronize with the latest commits in the official repository before coding:
+
+```shell
+git fetch upstream
+git checkout main
+git merge upstream/main
+git push origin main
+```
+
+### 4. Set up the Development Environment
+
+We have a separate doc [Development.md](https://github.com/OpenDevin/OpenDevin/blob/main/Development.md) that tells you how to set up a development workflow.
+
+### 5. Write Code and Commit It
+
+Once you have done this, you can write code, test it, and commit it to a branch (replace `my_branch` with an appropriate name):
+
+```shell
+git checkout -b my_branch
+git add .
+git commit
+git push origin my_branch
+```
+
+### 6. Open a Pull Request
+
+* On GitHub, go to the page of your forked repository, and create a Pull Request:
+   - Click on `Branches`
+   - Click on the `...` beside your branch and click on `New pull request`
+   - Set `base repository` to `OpenDevin/OpenDevin`
+   - Set `base` to `main`
+   - Click `Create pull request`
+  
+The PR should appear in [OpenDevin PRs](https://github.com/OpenDevin/OpenDevin/pulls).
+
+Then the OpenDevin team will review your code.
+
+## PR Rules
+
+### 1. Pull Request title
 As described [here](https://github.com/commitizen/conventional-commit-types/blob/master/index.json), a valid PR title should begin with one of the following prefixes:

 - `feat`: A new feature
@@ -84,11 +109,9 @@ For example, a PR title could be:
 - `refactor: modify package path`
 - `feat(frontend): xxxx`, where `(frontend)` means that this PR mainly focuses on the frontend component.

-You may also check out previous PRs in the [PR list](https://github.com/All-Hands-AI/OpenHands/pulls).
+You may also check out previous PRs in the [PR list](https://github.com/OpenDevin/OpenDevin/pulls).

-### Pull Request description
+### 2. Pull Request description
 - If your PR is small (such as a typo fix), you can go brief.
 - If it contains a lot of changes, it's better to write more details.

-If your changes are user-facing (e.g. a new feature in the UI, a change in behavior, or a bugfix)
-please include a short message that we can add to our changelog.
--- a/CREDITS.md
+++ b/CREDITS.md
@@ -1,312 +0,0 @@
-# Credits
-
-## Contributors
-
-We would like to thank all the [contributors](https://github.com/All-Hands-AI/OpenHands/graphs/contributors) who have helped make OpenHands possible. We greatly appreciate your dedication and hard work.
-
-## Open Source Projects
-
-OpenHands includes and adapts the following open source projects. We are grateful for their contributions to the open source community:
-
-#### [SWE Agent](https://github.com/princeton-nlp/swe-agent)
-   - License: MIT License
-   - Description: Adapted for use in OpenHands's agent hub
-
-#### [Aider](https://github.com/paul-gauthier/aider)
-   - License: Apache License 2.0
-   - Description: AI pair programming tool. OpenHands has adapted and integrated its linter module for code-related tasks in [`agentskills utilities`](https://github.com/All-Hands-AI/OpenHands/tree/main/openhands/runtime/plugins/agent_skills/utils/aider)
-
-#### [BrowserGym](https://github.com/ServiceNow/BrowserGym)
-   - License: Apache License 2.0
-   - Description: Adapted in implementing the browsing agent
-
-
-### Reference Implementations for Evaluation Benchmarks
-OpenHands integrates code of the reference implementations for the following agent evaluation benchmarks:
-
-#### [HumanEval](https://github.com/openai/human-eval)
-   - License: MIT License
-
-#### [DSP](https://github.com/microsoft/DataScienceProblems)
-   - License: MIT License
-
-#### [HumanEvalPack](https://github.com/bigcode-project/bigcode-evaluation-harness)
-   - License: Apache License 2.0
-
-#### [AgentBench](https://github.com/THUDM/AgentBench)
-   - License: Apache License 2.0
-
-#### [SWE-Bench](https://github.com/princeton-nlp/SWE-bench)
-   - License: MIT License
-
-#### [BIRD](https://bird-bench.github.io/)
-   - License: MIT License
-   - Dataset: CC-BY-SA 4.0
-
-#### [Gorilla APIBench](https://github.com/ShishirPatil/gorilla)
-   - License: Apache License 2.0
-
-#### [GPQA](https://github.com/idavidrein/gpqa)
-   - License: MIT License
-
-#### [ProntoQA](https://github.com/asaparov/prontoqa)
-   - License: Apache License 2.0
-
-
-## Open Source licenses
-
-### MIT License
-
-Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
-
-### BSD 3-Clause License
-
-Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
-
-1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
-
-2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
-
-3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
-
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-### Apache License 2.0
-
-
-                                 Apache License
-                           Version 2.0, January 2004
-                        http://www.apache.org/licenses/
-
-   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
-   1. Definitions.
-
-      "License" shall mean the terms and conditions for use, reproduction,
-      and distribution as defined by Sections 1 through 9 of this document.
-
-      "Licensor" shall mean the copyright owner or entity authorized by
-      the copyright owner that is granting the License.
-
-      "Legal Entity" shall mean the union of the acting entity and all
-      other entities that control, are controlled by, or are under common
-      control with that entity. For the purposes of this definition,
-      "control" means (i) the power, direct or indirect, to cause the
-      direction or management of such entity, whether by contract or
-      otherwise, or (ii) ownership of fifty percent (50%) or more of the
-      outstanding shares, or (iii) beneficial ownership of such entity.
-
-      "You" (or "Your") shall mean an individual or Legal Entity
-      exercising permissions granted by this License.
-
-      "Source" form shall mean the preferred form for making modifications,
-      including but not limited to software source code, documentation
-      source, and configuration files.
-
-      "Object" form shall mean any form resulting from mechanical
-      transformation or translation of a Source form, including but
-      not limited to compiled object code, generated documentation,
-      and conversions to other media types.
-
-      "Work" shall mean the work of authorship, whether in Source or
-      Object form, made available under the License, as indicated by a
-      copyright notice that is included in or attached to the work
-      (an example is provided in the Appendix below).
-
-      "Derivative Works" shall mean any work, whether in Source or Object
-      form, that is based on (or derived from) the Work and for which the
-      editorial revisions, annotations, elaborations, or other modifications
-      represent, as a whole, an original work of authorship. For the purposes
-      of this License, Derivative Works shall not include works that remain
-      separable from, or merely link (or bind by name) to the interfaces of,
-      the Work and Derivative Works thereof.
-
-      "Contribution" shall mean any work of authorship, including
-      the original version of the Work and any modifications or additions
-      to that Work or Derivative Works thereof, that is intentionally
-      submitted to Licensor for inclusion in the Work by the copyright owner
-      or by an individual or Legal Entity authorized to submit on behalf of
-      the copyright owner. For the purposes of this definition, "submitted"
-      means any form of electronic, verbal, or written communication sent
-      to the Licensor or its representatives, including but not limited to
-      communication on electronic mailing lists, source code control systems,
-      and issue tracking systems that are managed by, or on behalf of, the
-      Licensor for the purpose of discussing and improving the Work, but
-      excluding communication that is conspicuously marked or otherwise
-      designated in writing by the copyright owner as "Not a Contribution."
-
-      "Contributor" shall mean Licensor and any individual or Legal Entity
-      on behalf of whom a Contribution has been received by Licensor and
-      subsequently incorporated within the Work.
-
-   2. Grant of Copyright License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      copyright license to reproduce, prepare Derivative Works of,
-      publicly display, publicly perform, sublicense, and distribute the
-      Work and such Derivative Works in Source or Object form.
-
-   3. Grant of Patent License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      (except as stated in this section) patent license to make, have made,
-      use, offer to sell, sell, import, and otherwise transfer the Work,
-      where such license applies only to those patent claims licensable
-      by such Contributor that are necessarily infringed by their
-      Contribution(s) alone or by combination of their Contribution(s)
-      with the Work to which such Contribution(s) was submitted. If You
-      institute patent litigation against any entity (including a
-      cross-claim or counterclaim in a lawsuit) alleging that the Work
-      or a Contribution incorporated within the Work constitutes direct
-      or contributory patent infringement, then any patent licenses
-      granted to You under this License for that Work shall terminate
-      as of the date such litigation is filed.
-
-   4. Redistribution. You may reproduce and distribute copies of the
-      Work or Derivative Works thereof in any medium, with or without
-      modifications, and in Source or Object form, provided that You
-      meet the following conditions:
-
-      (a) You must give any other recipients of the Work or
-          Derivative Works a copy of this License; and
-
-      (b) You must cause any modified files to carry prominent notices
-          stating that You changed the files; and
-
-      (c) You must retain, in the Source form of any Derivative Works
-          that You distribute, all copyright, patent, trademark, and
-          attribution notices from the Source form of the Work,
-          excluding those notices that do not pertain to any part of
-          the Derivative Works; and
-
-      (d) If the Work includes a "NOTICE" text file as part of its
-          distribution, then any Derivative Works that You distribute must
-          include a readable copy of the attribution notices contained
-          within such NOTICE file, excluding those notices that do not
-          pertain to any part of the Derivative Works, in at least one
-          of the following places: within a NOTICE text file distributed
-          as part of the Derivative Works; within the Source form or
-          documentation, if provided along with the Derivative Works; or,
-          within a display generated by the Derivative Works, if and
-          wherever such third-party notices normally appear. The contents
-          of the NOTICE file are for informational purposes only and
-          do not modify the License. You may add Your own attribution
-          notices within Derivative Works that You distribute, alongside
-          or as an addendum to the NOTICE text from the Work, provided
-          that such additional attribution notices cannot be construed
-          as modifying the License.
-
-      You may add Your own copyright statement to Your modifications and
-      may provide additional or different license terms and conditions
-      for use, reproduction, or distribution of Your modifications, or
-      for any such Derivative Works as a whole, provided Your use,
-      reproduction, and distribution of the Work otherwise complies with
-      the conditions stated in this License.
-
-   5. Submission of Contributions. Unless You explicitly state otherwise,
-      any Contribution intentionally submitted for inclusion in the Work
-      by You to the Licensor shall be under the terms and conditions of
-      this License, without any additional terms or conditions.
-      Notwithstanding the above, nothing herein shall supersede or modify
-      the terms of any separate license agreement you may have executed
-      with Licensor regarding such Contributions.
-
-   6. Trademarks. This License does not grant permission to use the trade
-      names, trademarks, service marks, or product names of the Licensor,
-      except as required for reasonable and customary use in describing the
-      origin of the Work and reproducing the content of the NOTICE file.
-
-   7. Disclaimer of Warranty. Unless required by applicable law or
-      agreed to in writing, Licensor provides the Work (and each
-      Contributor provides its Contributions) on an "AS IS" BASIS,
-      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
-      implied, including, without limitation, any warranties or conditions
-      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
-      PARTICULAR PURPOSE. You are solely responsible for determining the
-      appropriateness of using or redistributing the Work and assume any
-      risks associated with Your exercise of permissions under this License.
-
-   8. Limitation of Liability. In no event and under no legal theory,
-      whether in tort (including negligence), contract, or otherwise,
-      unless required by applicable law (such as deliberate and grossly
-      negligent acts) or agreed to in writing, shall any Contributor be
-      liable to You for damages, including any direct, indirect, special,
-      incidental, or consequential damages of any character arising as a
-      result of this License or out of the use or inability to use the
-      Work (including but not limited to damages for loss of goodwill,
-      work stoppage, computer failure or malfunction, or any and all
-      other commercial damages or losses), even if such Contributor
-      has been advised of the possibility of such damages.
-
-   9. Accepting Warranty or Additional Liability. While redistributing
-      the Work or Derivative Works thereof, You may choose to offer,
-      and charge a fee for, acceptance of support, warranty, indemnity,
-      or other liability obligations and/or rights consistent with this
-      License. However, in accepting such obligations, You may act only
-      on Your own behalf and on Your sole responsibility, not on behalf
-      of any other Contributor, and only if You agree to indemnify,
-      defend, and hold each Contributor harmless for any liability
-      incurred by, or claims asserted against, such Contributor by reason
-      of your accepting any such warranty or additional liability.
-
-   END OF TERMS AND CONDITIONS
-
-   APPENDIX: How to apply the Apache License to your work.
-
-      To apply the Apache License to your work, attach the following
-      boilerplate notice, with the fields enclosed by brackets "[]"
-      replaced with your own identifying information. (Don't include
-      the brackets!)  The text should be enclosed in the appropriate
-      comment syntax for the file format. We also recommend that a
-      file or class name and description of purpose be included on the
-      same "printed page" as the copyright notice for easier
-      identification within third-party archives.
-
-   Copyright [yyyy] [name of copyright owner]
-
-
-
-### Non-Open Source Reference Implementations:
-
-#### [MultiPL-E](https://github.com/nuprl/MultiPL-E)
-   - License: BSD 3-Clause License with Machine Learning Restriction
-
-BSD 3-Clause License with Machine Learning Restriction
-
-Copyright (c) 2022, Northeastern University, Oberlin College, Roblox Inc,
-Stevens Institute of Technology, University of Massachusetts Amherst, and
-Wellesley College.
-
-All rights reserved.
-
-Redistribution and use in source and binary forms, with or without
-modification, are permitted provided that the following conditions are met:
-
-1. Redistributions of source code must retain the above copyright notice, this
-   list of conditions and the following disclaimer.
-
-2. Redistributions in binary form must reproduce the above copyright notice,
-   this list of conditions and the following disclaimer in the documentation
-   and/or other materials provided with the distribution.
-
-3. Neither the name of the copyright holder nor the names of its
-   contributors may be used to endorse or promote products derived from
-   this software without specific prior written permission.
-
-4.  The contents of this repository may not be used as training data for any
-    machine learning model, including but not limited to neural networks.
-
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
-AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
-IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
-DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
-FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
-DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
-SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
-CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
-OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--- a/Development.md
+++ b/Development.md
@@ -1,18 +1,16 @@
 # Development Guide
-This guide is for people working on OpenHands and editing the source code.
-If you wish to contribute your changes, check out the [CONTRIBUTING.md](https://github.com/All-Hands-AI/OpenHands/blob/main/CONTRIBUTING.md) on how to clone and setup the project initially before moving on.
-Otherwise, you can clone the OpenHands project directly.
+This guide is for people working on OpenDevin and editing the source code.
+If you wish to contribute your changes, check out the [CONTRIBUTING.md](https://github.com/OpenDevin/OpenDevin/blob/main/CONTRIBUTING.md) on how to clone and setup the project initially before moving on.
+Otherwise, you can clone the OpenDevin project directly.

 ## Start the server for development
 ### 1. Requirements
-* Linux, Mac OS, or [WSL on Windows](https://learn.microsoft.com/en-us/windows/wsl/install)  [Ubuntu <= 22.04]
+* Linux, Mac OS, or [WSL on Windows](https://learn.microsoft.com/en-us/windows/wsl/install)  [ Ubuntu <= 22.04]
 * [Docker](https://docs.docker.com/engine/install/) (For those on MacOS, make sure to allow the default Docker socket to be used from advanced settings!)
-* [Python](https://www.python.org/downloads/) = 3.12
+* [Python](https://www.python.org/downloads/) = 3.11
 * [NodeJS](https://nodejs.org/en/download/package-manager) >= 18.17.1
 * [Poetry](https://python-poetry.org/docs/#installing-with-the-official-installer) >= 1.8
-* OS-specific dependencies:
-  - Ubuntu: build-essential => `sudo apt-get install build-essential`
-  - WSL: netcat => `sudo apt-get install netcat`
+* netcat => sudo apt-get install netcat

 Make sure you have all these dependencies installed before moving on to `make build`.

@@ -24,42 +22,42 @@ If you want to develop without system admin/sudo access to upgrade/install `Pyth
 curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
 bash Miniforge3-$(uname)-$(uname -m).sh

-# Install Python 3.12, nodejs, and poetry
-mamba install python=3.12
+# Install Python 3.11, nodejs, and poetry
+mamba install python=3.11
 mamba install conda-forge::nodejs
 mamba install conda-forge::poetry
 ```

 ### 2. Build and Setup The Environment
-Begin by building the project which includes setting up the environment and installing dependencies. This step ensures that OpenHands is ready to run on your system:
+Begin by building the project which includes setting up the environment and installing dependencies. This step ensures that OpenDevin is ready to run on your system:

 ```bash
 make build
 ```

 ### 3. Configuring the Language Model
-OpenHands supports a diverse array of Language Models (LMs) through the powerful [litellm](https://docs.litellm.ai) library. By default, we've chosen the mighty GPT-4 from OpenAI as our go-to model, but the world is your oyster! You can unleash the potential of Anthropic's suave Claude, the enigmatic Llama, or any other LM that piques your interest.
+OpenDevin supports a diverse array of Language Models (LMs) through the powerful [litellm](https://docs.litellm.ai) library. By default, we've chosen the mighty GPT-4 from OpenAI as our go-to model, but the world is your oyster! You can unleash the potential of Anthropic's suave Claude, the enigmatic Llama, or any other LM that piques your interest.

 To configure the LM of your choice, run:
-
+       
   ```bash
   make setup-config
   ```
-
-   This command will prompt you to enter the LLM API key, model name, and other variables ensuring that OpenHands is tailored to your specific needs. Note that the model name will apply only when you run headless. If you use the UI, please set the model in the UI.
-
-   Note: If you have previously run OpenHands using the docker command, you may have already set some environmental variables in your terminal. The final configurations are set from highest to lowest priority:
+   
+   This command will prompt you to enter the LLM API key, model name, and other variables ensuring that OpenDevin is tailored to your specific needs. Note that the model name will apply only when you run headless. If you use the UI, please set the model in the UI.
+   
+   Note: If you have previously run OpenDevin using the docker command, you may have already set some environmental variables in your terminal. The final configurations are set from highest to lowest priority:
   Environment variables > config.toml variables > default variables

 **Note on Alternative Models:**
-Some alternative models may prove more challenging to tame than others. Fear not, brave adventurer! We shall soon unveil LLM-specific documentation to guide you on your quest.
-And if you've already mastered the art of wielding a model other than OpenAI's GPT, we encourage you to share your setup instructions with us by creating instructions and adding it [to our documentation](https://github.com/All-Hands-AI/OpenHands/tree/main/docs/modules/usage/llms).
+Some alternative models may prove more challenging to tame than others. Fear not, brave adventurer! We shall soon unveil LLM-specific documentation to guide you on your quest. 
+And if you've already mastered the art of wielding a model other than OpenAI's GPT, we encourage you to share your setup instructions with us by creating instructions and adding it [to our documentation](https://github.com/OpenDevin/OpenDevin/tree/main/docs/modules/usage/llms).

 For a full list of the LM providers and models available, please consult the [litellm documentation](https://docs.litellm.ai/docs/providers).

 ### 4. Running the application
 #### Option A: Run the Full Application
-Once the setup is complete, launching OpenHands is as simple as running a single command. This command starts both the backend and frontend servers seamlessly, allowing you to interact with OpenHands:
+Once the setup is complete, launching OpenDevin is as simple as running a single command. This command starts both the backend and frontend servers seamlessly, allowing you to interact with OpenDevin:
 ```bash
 make run
 ```
@@ -77,52 +75,24 @@ make run

 ### 6. LLM Debugging
 If you encounter any issues with the Language Model (LM) or you're simply curious, you can inspect the actual LLM prompts and responses. To do so, export DEBUG=1 in the environment and restart the backend.
-OpenHands will then log the prompts and responses in the logs/llm/CURRENT_DATE directory, allowing you to identify the causes.
+OpenDevin will then log the prompts and responses in the logs/llm/CURRENT_DATE directory, allowing you to identify the causes.

 ### 7. Help
-Need assistance or information on available targets and commands? The help command provides all the necessary guidance to ensure a smooth experience with OpenHands.
+Need assistance or information on available targets and commands? The help command provides all the necessary guidance to ensure a smooth experience with OpenDevin.
 ```bash
 make help
 ```

 ### 8. Testing
-To run tests, refer to the following:
 #### Unit tests

 ```bash
-poetry run pytest ./tests/unit/test_*.py
+poetry run pytest ./tests/unit/test_sandbox.py
 ```

+#### Integration tests
+Please refer to [this README](./tests/integration/README.md) for details.
+
 ### 9. Add or update dependency
 1. Add your dependency in `pyproject.toml` or use `poetry add xxx`
 2. Update the poetry.lock file via `poetry lock --no-update`
-
-### 9. Use existing Docker image
-To reduce build time (e.g., if no changes were made to the client-runtime component), you can use an existing Docker container image. Follow these steps:
-1. Set the SANDBOX_RUNTIME_CONTAINER_IMAGE environment variable to the desired Docker image.
-2. Example: export SANDBOX_RUNTIME_CONTAINER_IMAGE=ghcr.io/all-hands-ai/runtime:0.13-nikolaik
-
-## Develop inside Docker container
-
-TL;DR
-
-```bash
-make docker-dev
-```
-
-See more details [here](./containers/dev/README.md)
-
-If you are just interested in running `OpenHands` without installing all the required tools on your host.
-
-```bash
-make docker-run
-```
-
-If you do not have `make` on your host, run:
-
-```bash
-cd ./containers/dev
-./dev.sh
-```
-
-You do need [Docker](https://docs.docker.com/engine/install/) installed on your host though.
--- a/ISSUE_TRIAGE.md
+++ b/ISSUE_TRIAGE.md
@@ -1,25 +0,0 @@
-# Issue Triage
-These are the procedures and guidelines on how issues are triaged in this repo by the maintainers.
-
-## General
-* Most issues must be tagged with **enhancement** or **bug**
-* Issues may be tagged with what it relates to (**backend**, **frontend**, **agent quality**, etc.)
-
-## Severity
-* **Low**: Minor issues, single user report
-* **Medium**: Affecting multiple users
-* **Critical**: Affecting all users or potential security issues
-
-## Effort
-* Issues may be estimated with effort required (**small effort**, **medium effort**, **large effort**)
-
-## Difficulty
-* Issues with low implementation difficulty may be tagged with **good first issue**
-
-## Not Enough Information
-* User is asked to provide more information (logs, how to reproduce, etc.) when the issue is not clear
-* If an issue is unclear and the author does not provide more information or respond to a request, the issue may be closed as **not planned** (Usually after a week)
-
-## Multiple Requests/Fixes in One Issue
-* These issues will be narrowed down to one request/fix so the issue is more easily tracked and fixed
-* Issues may be broken down into multiple issues if required
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -1,5 +0,0 @@
-# Exclude all Python bytecode files
-global-exclude *.pyc
-
-# Exclude Python cache directories
-global-exclude __pycache__
--- a/Makefile
+++ b/Makefile
@@ -1,16 +1,16 @@
 SHELL=/bin/bash
-# Makefile for OpenHands project
+# Makefile for OpenDevin project

 # Variables
-BACKEND_HOST ?= "127.0.0.1"
+DOCKER_IMAGE = ghcr.io/opendevin/sandbox:main
 BACKEND_PORT = 3000
-BACKEND_HOST_PORT = "$(BACKEND_HOST):$(BACKEND_PORT)"
+BACKEND_HOST = "127.0.0.1:$(BACKEND_PORT)"
 FRONTEND_PORT = 3001
 DEFAULT_WORKSPACE_DIR = "./workspace"
 DEFAULT_MODEL = "gpt-4o"
 CONFIG_FILE = config.toml
 PRE_COMMIT_CONFIG_PATH = "./dev_config/python/.pre-commit-config.yaml"
-PYTHON_VERSION = 3.12
+PYTHON_VERSION = 3.11

 # ANSI color codes
 GREEN=$(shell tput -Txterm setaf 2)
@@ -23,6 +23,9 @@ RESET=$(shell tput -Txterm sgr0)
 build:
 	@echo "$(GREEN)Building project...$(RESET)"
 	@$(MAKE) -s check-dependencies
+ifeq ($(INSTALL_DOCKER),)
+	@$(MAKE) -s pull-docker-image
+endif
 	@$(MAKE) -s install-python-dependencies
 	@$(MAKE) -s install-frontend-dependencies
 	@$(MAKE) -s install-pre-commit-hooks
@@ -121,6 +124,11 @@ check-poetry:
 		exit 1; \
 	fi

+pull-docker-image:
+	@echo "$(YELLOW)Pulling Docker image...$(RESET)"
+	@docker pull $(DOCKER_IMAGE)
+	@echo "$(GREEN)Docker image pulled successfully.$(RESET)"
+
 install-python-dependencies:
 	@echo "$(GREEN)Installing Python dependencies...$(RESET)"
 	@if [ -z "${TZ}" ]; then \
@@ -133,7 +141,7 @@ install-python-dependencies:
 		export HNSWLIB_NO_NATIVE=1; \
 		poetry run pip install chroma-hnswlib; \
 	fi
-	@poetry install --without llama-index
+	@poetry install
 	@if [ -f "/etc/manjaro-release" ]; then \
 		echo "$(BLUE)Detected Manjaro Linux. Installing Playwright dependencies...$(RESET)"; \
 		poetry run pip install playwright; \
@@ -154,8 +162,11 @@ install-frontend-dependencies:
 	@echo "$(YELLOW)Setting up frontend environment...$(RESET)"
 	@echo "$(YELLOW)Detect Node.js version...$(RESET)"
 	@cd frontend && node ./scripts/detect-node-version.js
-	echo "$(BLUE)Installing frontend dependencies with npm...$(RESET)"
-	@cd frontend && npm install
+	@cd frontend && \
+		echo "$(BLUE)Installing frontend dependencies with npm...$(RESET)" && \
+		npm install && \
+		echo "$(BLUE)Running make-i18n with npm...$(RESET)" && \
+		npm run make-i18n
 	@echo "$(GREEN)Frontend dependencies installed successfully.$(RESET)"

 install-pre-commit-hooks:
@@ -166,7 +177,7 @@ install-pre-commit-hooks:

 lint-backend:
 	@echo "$(YELLOW)Running linters...$(RESET)"
-	@poetry run pre-commit run --files openhands/**/* agenthub/**/* evaluation/**/* --show-diff-on-failure --config $(PRE_COMMIT_CONFIG_PATH)
+	@poetry run pre-commit run --files opendevin/**/* agenthub/**/* evaluation/**/* --show-diff-on-failure --config $(PRE_COMMIT_CONFIG_PATH)

 lint-frontend:
 	@echo "$(YELLOW)Running linters for frontend...$(RESET)"
@@ -190,12 +201,12 @@ build-frontend:
 # Start backend
 start-backend:
 	@echo "$(YELLOW)Starting backend...$(RESET)"
-	@poetry run uvicorn openhands.server.listen:app --host $(BACKEND_HOST) --port $(BACKEND_PORT) --reload --reload-exclude "$(shell pwd)/workspace"
+	@poetry run uvicorn opendevin.server.listen:app --port $(BACKEND_PORT) --reload --reload-exclude "workspace/*"

 # Start frontend
 start-frontend:
 	@echo "$(YELLOW)Starting frontend...$(RESET)"
-	@cd frontend && VITE_BACKEND_HOST=$(BACKEND_HOST_PORT) VITE_FRONTEND_PORT=$(FRONTEND_PORT) npm run dev -- --port $(FRONTEND_PORT) --host $(BACKEND_HOST)
+	@cd frontend && VITE_BACKEND_HOST=$(BACKEND_HOST) VITE_FRONTEND_PORT=$(FRONTEND_PORT) npm run start

 # Common setup for running the app (non-callable)
 _run_setup:
@@ -205,7 +216,7 @@ _run_setup:
 	fi
 	@mkdir -p logs
 	@echo "$(YELLOW)Starting backend server...$(RESET)"
-	@poetry run uvicorn openhands.server.listen:app --host $(BACKEND_HOST) --port $(BACKEND_PORT) &
+	@poetry run uvicorn opendevin.server.listen:app --port $(BACKEND_PORT) &
 	@echo "$(YELLOW)Waiting for the backend to start...$(RESET)"
 	@until nc -z localhost $(BACKEND_PORT); do sleep 0.1; done
 	@echo "$(GREEN)Backend started successfully.$(RESET)"
@@ -214,23 +225,9 @@ _run_setup:
 run:
 	@echo "$(YELLOW)Running the app...$(RESET)"
 	@$(MAKE) -s _run_setup
-	@$(MAKE) -s start-frontend
+	@cd frontend && echo "$(BLUE)Starting frontend with npm...$(RESET)" && npm run start -- --port $(FRONTEND_PORT)
 	@echo "$(GREEN)Application started successfully.$(RESET)"

-# Run the app (in docker)
-docker-run: WORKSPACE_BASE ?= $(PWD)/workspace
-docker-run:
-	@if [ -f /.dockerenv ]; then \
-		echo "Running inside a Docker container. Exiting..."; \
-		exit 0; \
-	else \
-		echo "$(YELLOW)Running the app in Docker $(OPTIONS)...$(RESET)"; \
-		export WORKSPACE_BASE=${WORKSPACE_BASE}; \
-		export SANDBOX_USER_ID=$(shell id -u); \
-		export DATE=$(shell date +%Y%m%d%H%M%S); \
-		docker compose up $(OPTIONS); \
-	fi
-
 # Run the app (WSL mode)
 run-wsl:
 	@echo "$(YELLOW)Running the app in WSL mode...$(RESET)"
@@ -252,6 +249,16 @@ setup-config-prompts:
 	 workspace_dir=$${workspace_dir:-$(DEFAULT_WORKSPACE_DIR)}; \
 	 echo "workspace_base=\"$$workspace_dir\"" >> $(CONFIG_FILE).tmp

+	@read -p "Do you want to persist the sandbox container? [true/false] [default: false]: " persist_sandbox; \
+	 persist_sandbox=$${persist_sandbox:-false}; \
+	 if [ "$$persist_sandbox" = "true" ]; then \
+		 read -p "Enter a password for the sandbox container: " ssh_password; \
+		 echo "ssh_password=\"$$ssh_password\"" >> $(CONFIG_FILE).tmp; \
+		 echo "persist_sandbox=$$persist_sandbox" >> $(CONFIG_FILE).tmp; \
+	 else \
+		echo "persist_sandbox=$$persist_sandbox" >> $(CONFIG_FILE).tmp; \
+	 fi
+
 	@echo "" >> $(CONFIG_FILE).tmp

 	@echo "[llm]" >> $(CONFIG_FILE).tmp
@@ -275,10 +282,6 @@ setup-config-prompts:
 		echo "    - nomic-embed-text"; \
 		echo "    - all-minilm"; \
 		echo "    - stable-code"; \
-		echo "    - bge-m3"; \
-		echo "    - bge-large"; \
-		echo "    - paraphrase-multilingual"; \
-		echo "    - snowflake-arctic-embed"; \
 		echo "  - Leave blank to default to 'BAAI/bge-small-en-v1.5' via huggingface"; \
 		read -p "> " llm_embedding_model; \
 		echo "embedding_model=\"$$llm_embedding_model\"" >> $(CONFIG_FILE).tmp; \
@@ -295,20 +298,10 @@ setup-config-prompts:
 		fi


-# Develop in container
-docker-dev:
-	@if [ -f /.dockerenv ]; then \
-		echo "Running inside a Docker container. Exiting..."; \
-		exit 0; \
-	else \
-		echo "$(YELLOW)Build and run in Docker $(OPTIONS)...$(RESET)"; \
-		./containers/dev/dev.sh $(OPTIONS); \
-	fi
-
 # Clean up all caches
 clean:
 	@echo "$(YELLOW)Cleaning up caches...$(RESET)"
-	@rm -rf openhands/.cache
+	@rm -rf opendevin/.cache
 	@echo "$(GREEN)Caches cleaned up successfully.$(RESET)"

 # Help
@@ -317,16 +310,13 @@ help:
 	@echo "Targets:"
 	@echo "  $(GREEN)build$(RESET)               - Build project, including environment setup and dependencies."
 	@echo "  $(GREEN)lint$(RESET)                - Run linters on the project."
-	@echo "  $(GREEN)setup-config$(RESET)        - Setup the configuration for OpenHands by providing LLM API key,"
+	@echo "  $(GREEN)setup-config$(RESET)        - Setup the configuration for OpenDevin by providing LLM API key,"
 	@echo "                        LLM Model name, and workspace directory."
-	@echo "  $(GREEN)start-backend$(RESET)       - Start the backend server for the OpenHands project."
-	@echo "  $(GREEN)start-frontend$(RESET)      - Start the frontend server for the OpenHands project."
-	@echo "  $(GREEN)run$(RESET)                 - Run the OpenHands application, starting both backend and frontend servers."
+	@echo "  $(GREEN)start-backend$(RESET)       - Start the backend server for the OpenDevin project."
+	@echo "  $(GREEN)start-frontend$(RESET)      - Start the frontend server for the OpenDevin project."
+	@echo "  $(GREEN)run$(RESET)                 - Run the OpenDevin application, starting both backend and frontend servers."
 	@echo "                        Backend Log file will be stored in the 'logs' directory."
-	@echo "  $(GREEN)docker-dev$(RESET)          - Build and run the OpenHands application in Docker."
-	@echo "  $(GREEN)docker-run$(RESET)          - Run the OpenHands application, starting both backend and frontend servers in Docker."
 	@echo "  $(GREEN)help$(RESET)                - Display this help message, providing information on available targets."

 # Phony targets
-.PHONY: build check-dependencies check-python check-npm check-docker check-poetry install-python-dependencies install-frontend-dependencies install-pre-commit-hooks lint start-backend start-frontend run run-wsl setup-config setup-config-prompts help
-.PHONY: docker-dev docker-run
+.PHONY: build check-dependencies check-python check-npm check-docker check-poetry pull-docker-image install-python-dependencies install-frontend-dependencies install-pre-commit-hooks lint start-backend start-frontend run run-wsl setup-config setup-config-prompts help
--- a/README.md
+++ b/README.md
@@ -1,107 +1,122 @@
 <a name="readme-top"></a>

-<div align="center">
-  <img src="./docs/static/img/logo.png" alt="Logo" width="200">
-  <h1 align="center">OpenHands: Code Less, Make More</h1>
-</div>
+<!--
+*** Thanks for checking out the Best-README-Template. If you have a suggestion
+*** that would make this better, please fork the repo and create a pull request
+*** or simply open an issue with the tag "enhancement".
+*** Don't forget to give the project a star!
+*** Thanks again! Now go create something AMAZING! :D
+-->

+<!-- PROJECT SHIELDS -->
+<!--
+*** I'm using markdown "reference style" links for readability.
+*** Reference links are enclosed in brackets [ ] instead of parentheses ( ).
+*** See the bottom of this document for the declaration of the reference variables
+*** for contributors-url, forks-url, etc. This is an optional, concise syntax you may use.
+*** https://www.markdownguide.org/basic-syntax/#reference-style-links
+-->

 <div align="center">
-  <a href="https://github.com/All-Hands-AI/OpenHands/graphs/contributors"><img src="https://img.shields.io/github/contributors/All-Hands-AI/OpenHands?style=for-the-badge&color=blue" alt="Contributors"></a>
-  <a href="https://github.com/All-Hands-AI/OpenHands/stargazers"><img src="https://img.shields.io/github/stars/All-Hands-AI/OpenHands?style=for-the-badge&color=blue" alt="Stargazers"></a>
-  <a href="https://codecov.io/github/All-Hands-AI/OpenHands?branch=main"><img alt="CodeCov" src="https://img.shields.io/codecov/c/github/All-Hands-AI/OpenHands?style=for-the-badge&color=blue"></a>
-  <a href="https://github.com/All-Hands-AI/OpenHands/blob/main/LICENSE"><img src="https://img.shields.io/github/license/All-Hands-AI/OpenHands?style=for-the-badge&color=blue" alt="MIT License"></a>
+  <a href="https://github.com/OpenDevin/OpenDevin/graphs/contributors"><img src="https://img.shields.io/github/contributors/opendevin/opendevin?style=for-the-badge&color=blue" alt="Contributors"></a>
+  <a href="https://github.com/OpenDevin/OpenDevin/network/members"><img src="https://img.shields.io/github/forks/opendevin/opendevin?style=for-the-badge&color=blue" alt="Forks"></a>
+  <a href="https://github.com/OpenDevin/OpenDevin/stargazers"><img src="https://img.shields.io/github/stars/opendevin/opendevin?style=for-the-badge&color=blue" alt="Stargazers"></a>
+  <a href="https://github.com/OpenDevin/OpenDevin/issues"><img src="https://img.shields.io/github/issues/opendevin/opendevin?style=for-the-badge&color=blue" alt="Issues"></a>
+  <a href="https://github.com/OpenDevin/OpenDevin/blob/main/LICENSE"><img src="https://img.shields.io/github/license/opendevin/opendevin?style=for-the-badge&color=blue" alt="MIT License"></a>
  <br/>
-  <a href="https://join.slack.com/t/openhands-ai/shared_invite/zt-2tom0er4l-JeNUGHt_AxpEfIBstbLPiw"><img src="https://img.shields.io/badge/Slack-Join%20Us-red?logo=slack&logoColor=white&style=for-the-badge" alt="Join our Slack community"></a>
+  <a href="https://join.slack.com/t/opendevin/shared_invite/zt-2i1iqdag6-bVmvamiPA9EZUu7oCO6KhA"><img src="https://img.shields.io/badge/Slack-Join%20Us-red?logo=slack&logoColor=white&style=for-the-badge" alt="Join our Slack community"></a>
  <a href="https://discord.gg/ESHStjSjD4"><img src="https://img.shields.io/badge/Discord-Join%20Us-purple?logo=discord&logoColor=white&style=for-the-badge" alt="Join our Discord community"></a>
-  <a href="https://github.com/All-Hands-AI/OpenHands/blob/main/CREDITS.md"><img src="https://img.shields.io/badge/Project-Credits-blue?style=for-the-badge&color=FFE165&logo=github&logoColor=white" alt="Credits"></a>
-  <br/>
-  <a href="https://docs.all-hands.dev/modules/usage/getting-started"><img src="https://img.shields.io/badge/Documentation-000?logo=googledocs&logoColor=FFE165&style=for-the-badge" alt="Check out the documentation"></a>
-  <a href="https://arxiv.org/abs/2407.16741"><img src="https://img.shields.io/badge/Paper%20on%20Arxiv-000?logoColor=FFE165&logo=arxiv&style=for-the-badge" alt="Paper on Arxiv"></a>
-  <a href="https://huggingface.co/spaces/OpenHands/evaluation"><img src="https://img.shields.io/badge/Benchmark%20score-000?logoColor=FFE165&logo=huggingface&style=for-the-badge" alt="Evaluation Benchmark Score"></a>
-  <hr>
+  <a href="https://codecov.io/github/opendevin/opendevin?branch=main"><img alt="CodeCov" src="https://img.shields.io/codecov/c/github/opendevin/opendevin?style=for-the-badge"></a>
 </div>

-Welcome to OpenHands (formerly OpenDevin), a platform for software development agents powered by AI.
+<!-- PROJECT LOGO -->
+<div align="center">
+  <img src="./docs/static/img/logo.png" alt="Logo" width="200" height="200">
+  <h1 align="center">OpenDevin: Code Less, Make More</h1>
+  <a href="https://opendevin.github.io/OpenDevin/modules/usage/intro"><img src="https://img.shields.io/badge/Documentation-OpenDevin-blue?logo=googledocs&logoColor=white&style=for-the-badge" alt="Check out the documentation"></a>
+  <a href="https://huggingface.co/spaces/OpenDevin/evaluation"><img src="https://img.shields.io/badge/Evaluation-Benchmark%20on%20HF%20Space-green?style=for-the-badge" alt="Evaluation Benchmark"></a>
+</div>
+<hr>

-OpenHands agents can do anything a human developer can: modify code, run commands, browse the web,
-call APIs, and yes—even copy code snippets from StackOverflow.
+Welcome to OpenDevin, a platform for autonomous software engineers, powered by AI and LLMs.

-Learn more at [docs.all-hands.dev](https://docs.all-hands.dev), or jump to the [Quick Start](#-quick-start).
+OpenDevin agents collaborate with human developers to write code, fix bugs, and ship features.

 ![App screenshot](./docs/static/img/screenshot.png)

-## ⚡ Quick Start
+## ⚡ Getting Started
+OpenDevin works best with the most recent version of Docker, `26.0.0`.
+You must be using Linux, Mac OS, or WSL on Windows.

-The easiest way to run OpenHands is in Docker.
-See the [Installation](https://docs.all-hands.dev/modules/usage/installation) guide for
-system requirements and more information.
+To start OpenDevin in a Docker container, run the following commands in your terminal:
+
+> [!WARNING]
+> When you run the following command, files in `./workspace` may be modified or deleted.

 ```bash
-docker pull docker.all-hands.dev/all-hands-ai/runtime:0.13-nikolaik
-
-docker run -it --pull=always \
-    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.13-nikolaik \
+WORKSPACE_BASE=$(pwd)/workspace
+docker run -it \
+    --pull=always \
+    -e SANDBOX_USER_ID=$(id -u) \
+    -e WORKSPACE_MOUNT_PATH=$WORKSPACE_BASE \
+    -v $WORKSPACE_BASE:/opt/workspace_base \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -p 3000:3000 \
-    -e LOG_ALL_EVENTS=true \
    --add-host host.docker.internal:host-gateway \
-    --name openhands-app \
-    docker.all-hands.dev/all-hands-ai/openhands:0.13
+    --name opendevin-app-$(date +%Y%m%d%H%M%S) \
+    ghcr.io/opendevin/opendevin
 ```

-You'll find OpenHands running at [http://localhost:3000](http://localhost:3000)!
+> [!NOTE]
+> By default, this command pulls the `latest` tag, which represents the most recent release of OpenDevin. You have other options as well:
+> - For a specific release version, use `ghcr.io/opendevin/opendevin:<OpenDevin_version>` (replace `<OpenDevin_version>` with the desired version number).
+> - For the most up-to-date development version, use `ghcr.io/opendevin/opendevin:main`. This version may be **unstable** and is recommended for testing or development purposes only.
+> 
+> Choose the tag that best suits your needs based on stability requirements and desired features.

-Finally, you'll need a model provider and API key.
-[Anthropic's Claude 3.5 Sonnet](https://www.anthropic.com/api) (`anthropic/claude-3-5-sonnet-20241022`)
-works best, but you have [many options](https://docs.all-hands.dev/modules/usage/llms).
+You'll find OpenDevin running at [http://localhost:3000](http://localhost:3000) with access to `./workspace`. To have OpenDevin operate on your code, place it in `./workspace`.
+OpenDevin will only have access to this workspace folder. The rest of your system will not be affected, because OpenDevin runs in a secured Docker sandbox.

---
+Upon opening OpenDevin, you must select the appropriate `Model` and enter the `API Key` within the settings that should pop up automatically. These can be set at any time by selecting
+the `Settings` button (gear icon) in the UI. If the required `Model` does not exist in the list, you can manually enter it in the text box.

-You can also [connect OpenHands to your local filesystem](https://docs.all-hands.dev/modules/usage/runtimes),
-run OpenHands in a scriptable [headless mode](https://docs.all-hands.dev/modules/usage/how-to/headless-mode),
-interact with it via a [friendly CLI](https://docs.all-hands.dev/modules/usage/how-to/cli-mode),
-or run it on tagged issues with [a github action](https://github.com/All-Hands-AI/OpenHands-resolver).
+For the development workflow, see [Development.md](https://github.com/OpenDevin/OpenDevin/blob/main/Development.md).

-Visit [Installation](https://docs.all-hands.dev/modules/usage/installation) for more information and setup instructions.
+Are you having trouble? Check out our [Troubleshooting Guide](https://opendevin.github.io/OpenDevin/modules/usage/troubleshooting).

-If you want to modify the OpenHands source code, check out [Development.md](https://github.com/All-Hands-AI/OpenHands/blob/main/Development.md).
+## 🚀 Documentation

-Having issues? The [Troubleshooting Guide](https://docs.all-hands.dev/modules/usage/troubleshooting) can help.
+To learn more about the project, and for tips on using OpenDevin,
+**check out our [documentation](https://opendevin.github.io/OpenDevin/modules/usage/intro)**.

-## 📖 Documentation
-
-To learn more about the project, and for tips on using OpenHands,
-**check out our [documentation](https://docs.all-hands.dev/modules/usage/getting-started)**.
-
-There you'll find resources on how to use different LLM providers,
+There you'll find resources on how to use different LLM providers (like ollama and Anthropic's Claude),
 troubleshooting resources, and advanced configuration options.

 ## 🤝 How to Contribute

-OpenHands is a community-driven project, and we welcome contributions from everyone.
+OpenDevin is a community-driven project, and we welcome contributions from everyone.
 Whether you're a developer, a researcher, or simply enthusiastic about advancing the field of
 software engineering with AI, there are many ways to get involved:

 - **Code Contributions:** Help us develop new agents, core functionality, the frontend and other interfaces, or sandboxing solutions.
 - **Research and Evaluation:** Contribute to our understanding of LLMs in software engineering, participate in evaluating the models, or suggest improvements.
- **Feedback and Testing:** Use the OpenHands toolset, report bugs, suggest features, or provide feedback on usability.
+- **Feedback and Testing:** Use the OpenDevin toolset, report bugs, suggest features, or provide feedback on usability.

 For details, please check [CONTRIBUTING.md](./CONTRIBUTING.md).

 ## 🤖 Join Our Community

-Whether you're a developer, a researcher, or simply enthusiastic about OpenHands, we'd love to have you in our community.
+Whether you're a developer, a researcher, or simply enthusiastic about OpenDevin, we'd love to have you in our community.
 Let's make software engineering better together!

- [Slack workspace](https://join.slack.com/t/openhands-ai/shared_invite/zt-2tom0er4l-JeNUGHt_AxpEfIBstbLPiw) - Here we talk about research, architecture, and future development.
+- [Slack workspace](https://join.slack.com/t/opendevin/shared_invite/zt-2jsrl32uf-fTeeFjNyNYxqSZt5NPY3fA) - Here we talk about research, architecture, and future development.
 - [Discord server](https://discord.gg/ESHStjSjD4) - This is a community-run server for general discussion, questions, and feedback.

 ## 📈 Progress

 <p align="center">
-  <a href="https://star-history.com/#All-Hands-AI/OpenHands&Date">
-    <img src="https://api.star-history.com/svg?repos=All-Hands-AI/OpenHands&type=Date" width="500" alt="Star History Chart">
+  <a href="https://star-history.com/#OpenDevin/OpenDevin&Date">
+    <img src="https://api.star-history.com/svg?repos=OpenDevin/OpenDevin&type=Date" width="500" alt="Star History Chart">
  </a>
 </p>

@@ -109,22 +124,26 @@ Let's make software engineering better together!

 Distributed under the MIT License. See [`LICENSE`](./LICENSE) for more information.

-## 🙏 Acknowledgements
-
-OpenHands is built by a large number of contributors, and every contribution is greatly appreciated! We also build upon other open source projects, and we are deeply thankful for their work.
-
-For a list of open source projects and licenses used in OpenHands, please see our [CREDITS.md](./CREDITS.md) file.
+[contributors-shield]: https://img.shields.io/github/contributors/opendevin/opendevin?style=for-the-badge
+[contributors-url]: https://github.com/OpenDevin/OpenDevin/graphs/contributors
+[forks-shield]: https://img.shields.io/github/forks/opendevin/opendevin?style=for-the-badge
+[forks-url]: https://github.com/OpenDevin/OpenDevin/network/members
+[stars-shield]: https://img.shields.io/github/stars/opendevin/opendevin?style=for-the-badge
+[stars-url]: https://github.com/OpenDevin/OpenDevin/stargazers
+[issues-shield]: https://img.shields.io/github/issues/opendevin/opendevin?style=for-the-badge
+[issues-url]: https://github.com/OpenDevin/OpenDevin/issues
+[license-shield]: https://img.shields.io/github/license/opendevin/opendevin?style=for-the-badge
+[license-url]: https://github.com/OpenDevin/OpenDevin/blob/main/LICENSE

 ## 📚 Cite

 ```
-@misc{openhands,
-      title={{OpenHands: An Open Platform for AI Software Developers as Generalist Agents}},
-      author={Xingyao Wang and Boxuan Li and Yufan Song and Frank F. Xu and Xiangru Tang and Mingchen Zhuge and Jiayi Pan and Yueqi Song and Bowen Li and Jaskirat Singh and Hoang H. Tran and Fuqiang Li and Ren Ma and Mingzhang Zheng and Bill Qian and Yanjun Shao and Niklas Muennighoff and Yizhe Zhang and Binyuan Hui and Junyang Lin and Robert Brennan and Hao Peng and Heng Ji and Graham Neubig},
-      year={2024},
-      eprint={2407.16741},
-      archivePrefix={arXiv},
-      primaryClass={cs.SE},
-      url={https://arxiv.org/abs/2407.16741},
+@misc{opendevin2024,
+  author       = {{OpenDevin Team}},
+  title        = {{OpenDevin: An Open Platform for AI Software Developers as Generalist Agents}},
+  year         = {2024},
+  version      = {v1.0},
+  howpublished = {\url{https://github.com/OpenDevin/OpenDevin}},
+  note         = {Accessed: ENTER THE DATE YOU ACCESSED THE PROJECT}
 }
 ```
--- a/agenthub/README.md
+++ b/agenthub/README.md
@@ -0,0 +1,72 @@
+# Agent Framework Research
+
+In this folder, there may exist multiple implementations of `Agent` that will be used by the framework.
+
+For example, `agenthub/codeact_agent`, etc.
+Contributors from different backgrounds and interests can choose to contribute to any (or all!) of these directions.
+
+## Constructing an Agent
+
+The abstraction for an agent can be found [here](../opendevin/controller/agent.py).
+
+Agents are run inside of a loop. At each iteration, `agent.step()` is called with a
+[State](../opendevin/controller/state/state.py) input, and the agent must output an [Action](../opendevin/events/action).
+
+Every agent also has a `self.llm` which it can use to interact with the LLM configured by the user.
+See the [LiteLLM docs for `self.llm.completion`](https://docs.litellm.ai/docs/completion).
+
+## State
+
+The `state` contains:
+
+- A history of actions taken by the agent, as well as any observations (e.g. file content, command output) from those actions
+- A list of actions/observations that have happened since the most recent step
+- A [`root_task`](https://github.com/OpenDevin/OpenDevin/blob/main/opendevin/controller/state/task.py), which contains a plan of action
+  - The agent can add and modify subtasks through the `AddTaskAction` and `ModifyTaskAction`
+
+## Actions
+
+Here is a list of available Actions, which can be returned by `agent.step()`:
+
+- [`CmdRunAction`](../opendevin/events/action/commands.py) - Runs a command inside a sandboxed terminal
+- [`IPythonRunCellAction`](../opendevin/events/action/commands.py) - Executes a block of Python code interactively (in a Jupyter notebook) and receives a `CmdOutputObservation`. Requires the `jupyter` [plugin](../opendevin/runtime/plugins).
+- [`FileReadAction`](../opendevin/events/action/files.py) - Reads the content of a file
+- [`FileWriteAction`](../opendevin/events/action/files.py) - Writes new content to a file
+- [`BrowseURLAction`](../opendevin/events/action/browse.py) - Gets the content of a URL
+- [`AddTaskAction`](../opendevin/events/action/tasks.py) - Adds a subtask to the plan
+- [`ModifyTaskAction`](../opendevin/events/action/tasks.py) - Changes the state of a subtask.
+- [`AgentFinishAction`](../opendevin/events/action/agent.py) - Stops the control loop, allowing the user/delegator agent to enter a new task
+- [`AgentRejectAction`](../opendevin/events/action/agent.py) - Stops the control loop, allowing the user/delegator agent to enter a new task
+- [`AgentFinishAction`](../opendevin/events/action/agent.py) - Stops the control loop, allowing the user to enter a new task
+- [`MessageAction`](../opendevin/events/action/message.py) - Represents a message from an agent or the user
+
+You can use `action.to_dict()` and `action_from_dict` to serialize and deserialize actions.
+
+## Observations
+
+There are also several types of Observations. These are typically available in the step following the corresponding Action.
+But they may also appear as a result of asynchronous events (e.g. a message from the user).
+
+Here is a list of available Observations:
+
+- [`CmdOutputObservation`](../opendevin/events/observation/commands.py)
+- [`BrowserOutputObservation`](../opendevin/events/observation/browse.py)
+- [`FileReadObservation`](../opendevin/events/observation/files.py)
+- [`FileWriteObservation`](../opendevin/events/observation/files.py)
+- [`ErrorObservation`](../opendevin/events/observation/error.py)
+- [`SuccessObservation`](../opendevin/events/observation/success.py)
+
+You can use `observation.to_dict()` and `observation_from_dict` to serialize and deserialize observations.
+
+## Interface
+
+Every agent must implement the following methods:
+
+### `step`
+
+```
+def step(self, state: "State") -> "Action"
+```
+
+`step` moves the agent forward one step towards its goal. This probably means
+sending a prompt to the LLM, then parsing the response into an `Action`.
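
The agent loop and the `step` interface described in this README can be illustrated with a minimal, self-contained sketch. The `State`, `Action`, `MessageAction`, and `AgentFinishAction` classes below are simplified, hypothetical stand-ins for the real ones under `opendevin/`, and `EchoAgent` and `run_loop` are illustrative names that are not part of the codebase. A real agent would call `self.llm.completion(...)` inside `step` and parse the reply into an `Action`.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for the real State and Action classes,
# defined here only so the sketch runs on its own.
@dataclass
class Action:
    thought: str = ''


@dataclass
class MessageAction(Action):
    content: str = ''


@dataclass
class AgentFinishAction(Action):
    pass


@dataclass
class State:
    history: list = field(default_factory=list)


class EchoAgent:
    """Toy agent: sends one message, then finishes."""

    def step(self, state: State) -> Action:
        # A real agent would prompt the LLM here and parse the response.
        if any(isinstance(event, MessageAction) for event in state.history):
            return AgentFinishAction(thought='Nothing left to do.')
        return MessageAction(content='Hello from EchoAgent')


def run_loop(agent: EchoAgent, state: State, max_iterations: int = 10) -> State:
    """Call agent.step() until it returns AgentFinishAction or the budget runs out."""
    for _ in range(max_iterations):
        action = agent.step(state)
        state.history.append(action)
        if isinstance(action, AgentFinishAction):
            break
    return state
```

Calling `run_loop(EchoAgent(), State())` leaves two entries in the history: the message, followed by the finish action that stops the loop.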
--- a/openhands/agenthub/__init__.py
+++ b/openhands/agenthub/__init__.py
@@ -1,23 +1,28 @@
 from dotenv import load_dotenv

-from openhands.agenthub.micro.agent import MicroAgent
-from openhands.agenthub.micro.registry import all_microagents
-from openhands.controller.agent import Agent
+from opendevin.controller.agent import Agent
+
+from .micro.agent import MicroAgent
+from .micro.registry import all_microagents

 load_dotenv()


-from openhands.agenthub import (  # noqa: E402
+from . import (  # noqa: E402
    browsing_agent,
    codeact_agent,
    codeact_swe_agent,
    delegator_agent,
    dummy_agent,
+    gptswarm_agent,
+    monologue_agent,
    planner_agent,
 )

 __all__ = [
+    'monologue_agent',
    'codeact_agent',
+    'gptswarm_agent',
    'codeact_swe_agent',
    'planner_agent',
    'delegator_agent',
--- a/openhands/agenthub/browsing_agent/README.md
+++ b/openhands/agenthub/browsing_agent/README.md
@@ -8,9 +8,9 @@ This folder implements the basic BrowserGym [demo agent](https://github.com/Serv
 Note that for browsing tasks, GPT-4 is usually a requirement to get reasonable results, due to the complexity of the web page structures.

 ```
-poetry run python ./openhands/core/main.py \
+poetry run python ./opendevin/core/main.py \
           -i 10 \
           -t "tell me the usa's president using google search" \
           -c BrowsingAgent \
-           -m claude-3-5-sonnet-20241022
+           -m gpt-4o-2024-05-13
 ```
--- a/agenthub/browsing_agent/__init__.py
+++ b/agenthub/browsing_agent/__init__.py
@@ -0,0 +1,5 @@
+from opendevin.controller.agent import Agent
+
+from .browsing_agent import BrowsingAgent
+
+Agent.register('BrowsingAgent', BrowsingAgent)
--- a/openhands/agenthub/browsing_agent/browsing_agent.py
+++ b/openhands/agenthub/browsing_agent/browsing_agent.py
@@ -3,25 +3,24 @@ import os
 from browsergym.core.action.highlevel import HighLevelActionSet
 from browsergym.utils.obs import flatten_axtree_to_str

-from openhands.agenthub.browsing_agent.response_parser import BrowsingResponseParser
-from openhands.controller.agent import Agent
-from openhands.controller.state.state import State
-from openhands.core.config import AgentConfig
-from openhands.core.logger import openhands_logger as logger
-from openhands.core.message import Message, TextContent
-from openhands.events.action import (
+from agenthub.browsing_agent.response_parser import BrowsingResponseParser
+from opendevin.controller.agent import Agent
+from opendevin.controller.state.state import State
+from opendevin.core.logger import opendevin_logger as logger
+from opendevin.events.action import (
    Action,
    AgentFinishAction,
    BrowseInteractiveAction,
    MessageAction,
 )
-from openhands.events.event import EventSource
-from openhands.events.observation import BrowserOutputObservation
-from openhands.events.observation.observation import Observation
-from openhands.llm.llm import LLM
-from openhands.runtime.plugins import (
+from opendevin.events.event import EventSource
+from opendevin.events.observation import BrowserOutputObservation
+from opendevin.events.observation.observation import Observation
+from opendevin.llm.llm import LLM
+from opendevin.runtime.plugins import (
    PluginRequirement,
 )
+from opendevin.runtime.tools import RuntimeTool

 USE_NAV = (
    os.environ.get('USE_NAV', 'true') == 'true'
@@ -65,15 +64,10 @@ In order to accomplish my goal I need to send the information asked back to the
 """


-def get_prompt(
-    error_prefix: str, cur_url: str, cur_axtree_txt: str, prev_action_str: str
-) -> str:
+def get_prompt(error_prefix: str, cur_axtree_txt: str, prev_action_str: str) -> str:
    prompt = f"""\
 {error_prefix}

-# Current Page URL:
-{cur_url}
-
 # Current Accessibility Tree:
 {cur_axtree_txt}

@@ -98,19 +92,20 @@ class BrowsingAgent(Agent):
    """

    sandbox_plugins: list[PluginRequirement] = []
+    runtime_tools: list[RuntimeTool] = [RuntimeTool.BROWSER]
    response_parser = BrowsingResponseParser()

    def __init__(
        self,
        llm: LLM,
-        config: AgentConfig,
    ) -> None:
-        """Initializes a new instance of the BrowsingAgent class.
+        """
+        Initializes a new instance of the BrowsingAgent class.

        Parameters:
        - llm (LLM): The llm to be used by this agent
        """
-        super().__init__(llm, config)
+        super().__init__(llm)
        # define a configurable action space, with chat functionality, web navigation, and webpage grounding using accessibility tree and HTML.
        # see https://github.com/ServiceNow/BrowserGym/blob/main/core/src/browsergym/core/action/highlevel.py for more details
        action_subsets = ['chat', 'bid']
@@ -125,13 +120,16 @@ class BrowsingAgent(Agent):
        self.reset()

    def reset(self) -> None:
-        """Resets the Browsing Agent."""
+        """
+        Resets the Browsing Agent.
+        """
        super().reset()
        self.cost_accumulator = 0
        self.error_accumulator = 0

    def step(self, state: State) -> Action:
-        """Performs one step using the Browsing Agent.
+        """
+        Performs one step using the Browsing Agent.
        This includes gathering information on previous steps and prompting the model to make a browsing command to execute.

        Parameters:
@@ -142,21 +140,20 @@ class BrowsingAgent(Agent):
        - MessageAction(content) - Message action to run (e.g. ask for clarification)
        - AgentFinishAction() - end the interaction
        """
-        messages: list[Message] = []
+        messages = []
        prev_actions = []
-        cur_url = ''
        cur_axtree_txt = ''
        error_prefix = ''
        last_obs = None
        last_action = None

-        if EVAL_MODE and len(state.history) == 1:
+        if EVAL_MODE and len(state.history.get_events_as_list()) == 1:
            # for webarena and miniwob++ eval, we need to retrieve the initial observation already in browser env
            # initialize and retrieve the first observation by issuing a no-op action
            # For non-benchmark browsing, the browser env starts with a blank page, and the agent is expected to first navigate to desired websites
            return BrowseInteractiveAction(browser_actions='noop()')

-        for event in state.history:
+        for event in state.history.get_events():
            if isinstance(event, BrowseInteractiveAction):
                prev_actions.append(event.browser_actions)
                last_action = event
@@ -171,7 +168,7 @@ class BrowsingAgent(Agent):

        prev_action_str = '\n'.join(prev_actions)
        # if the final BrowserInteractiveAction exec BrowserGym's send_msg_to_user,
-        # we should also send a message back to the user in OpenHands and call it a day
+        # we should also send a message back to the user in OpenDevin and call it a day
        if (
            isinstance(last_action, BrowseInteractiveAction)
            and last_action.browsergym_send_msg_to_user
@@ -185,9 +182,6 @@ class BrowsingAgent(Agent):
                self.error_accumulator += 1
                if self.error_accumulator > 5:
                    return MessageAction('Too many errors encountered. Task failed.')
-
-            cur_url = last_obs.url
-
            try:
                cur_axtree_txt = flatten_axtree_to_str(
                    last_obs.axtree_object,
@@ -201,23 +195,21 @@ class BrowsingAgent(Agent):
                )
                return MessageAction('Error encountered when browsing.')

-        goal, _ = state.get_current_user_intent()
-
-        if goal is None:
+        if (goal := state.get_current_user_intent()) is None:
            goal = state.inputs['task']
-
        system_msg = get_system_message(
            goal,
            self.action_space.describe(with_long_description=False, with_examples=True),
        )

-        messages.append(Message(role='system', content=[TextContent(text=system_msg)]))
-
-        prompt = get_prompt(error_prefix, cur_url, cur_axtree_txt, prev_action_str)
-        messages.append(Message(role='user', content=[TextContent(text=prompt)]))
+        messages.append({'role': 'system', 'content': system_msg})

+        prompt = get_prompt(error_prefix, cur_axtree_txt, prev_action_str)
+        messages.append({'role': 'user', 'content': prompt})
+        logger.debug(prompt)
        response = self.llm.completion(
-            messages=self.llm.format_messages_for_llm(messages),
+            messages=messages,
+            temperature=0.0,
            stop=[')```', ')\n```'],
        )
        return self.response_parser.parse(response)
--- a/openhands/agenthub/browsing_agent/prompt.py
+++ b/openhands/agenthub/browsing_agent/prompt.py
@@ -12,11 +12,12 @@ from browsergym.core.action.base import AbstractActionSet
 from browsergym.core.action.highlevel import HighLevelActionSet
 from browsergym.core.action.python import PythonActionSet

-from openhands.agenthub.browsing_agent.utils import (
+from opendevin.runtime.browser.browser_env import BrowserEnv
+
+from .utils import (
    ParseError,
    parse_html_tags_raise,
 )
-from openhands.runtime.browser.browser_env import BrowserEnv


@dataclass
@@ -57,7 +58,7 @@ class Flags:

    @classmethod
    def from_dict(self, flags_dict):
-        """Helper for JSON serializable requirement."""
+        """Helper for JSON serializble requirement."""
        if isinstance(flags_dict, Flags):
            return flags_dict

@@ -74,8 +75,7 @@ class PromptElement:
    Prompt elements are used to build the prompt. Use flags to control which
    prompt elements are visible. We use class attributes as a convenient way
    to implement static prompts, but feel free to override them with instance
-    attributes or @property decorator.
-    """
+    attributes or @property decorator."""

    _prompt = ''
    _abstract_ex = ''
@@ -200,10 +200,11 @@ def fit_tokens(
    model_name : str, optional
        The name of the model used when tokenizing.

-    Returns:
+    Returns
    -------
    str : the prompt after shrinking.
    """
+
    if max_prompt_chars is None:
        return shrinkable.prompt

@@ -354,7 +355,7 @@ and executed by a program, make sure to follow the formatting instructions.
        self._prompt += '\n'.join(
            [
                f"""\
- - [{msg['role']}], {msg['message']}"""
+ - [{msg['role']}] {msg['message']}"""
                for msg in chat_messages
            ]
        )
@@ -578,8 +579,8 @@ the form is not visible yet or some fields are disabled. I need to replan.
 def diff(previous, new):
    """Return a string showing the difference between original and new.

-    If the difference is above diff_threshold, return the diff string.
-    """
+    If the difference is above diff_threshold, return the diff string."""
+
    if previous == new:
        return 'Identical', []

--- a/agenthub/browsing_agent/response_parser.py
+++ b/agenthub/browsing_agent/response_parser.py
@@ -0,0 +1,90 @@
+import ast
+
+from opendevin.controller.action_parser import ActionParser, ResponseParser
+from opendevin.core.logger import opendevin_logger as logger
+from opendevin.events.action import (
+    Action,
+    BrowseInteractiveAction,
+)
+
+
+class BrowsingResponseParser(ResponseParser):
+    def __init__(self):
+        # Need to pay attention to the item order in self.action_parsers
+        super().__init__()
+        self.action_parsers = [BrowsingActionParserMessage()]
+        self.default_parser = BrowsingActionParserBrowseInteractive()
+
+    def parse(self, response: str) -> Action:
+        action_str = self.parse_response(response)
+        return self.parse_action(action_str)
+
+    def parse_response(self, response) -> str:
+        action_str = response['choices'][0]['message']['content']
+        if action_str is None:
+            return ''
+        action_str = action_str.strip()
+        if not action_str.endswith('```'):
+            action_str = action_str + ')```'
+        logger.info(action_str)
+        return action_str
+
+    def parse_action(self, action_str: str) -> Action:
+        for action_parser in self.action_parsers:
+            if action_parser.check_condition(action_str):
+                return action_parser.parse(action_str)
+        return self.default_parser.parse(action_str)
+
+
+class BrowsingActionParserMessage(ActionParser):
+    """
+    Parser action:
+        - BrowseInteractiveAction(browser_actions) - unexpected response format, message back to user
+    """
+
+    def __init__(
+        self,
+    ):
+        pass
+
+    def check_condition(self, action_str: str) -> bool:
+        return '```' not in action_str
+
+    def parse(self, action_str: str) -> Action:
+        msg = f'send_msg_to_user("""{action_str}""")'
+        return BrowseInteractiveAction(
+            browser_actions=msg,
+            thought=action_str,
+            browsergym_send_msg_to_user=action_str,
+        )
+
+
+class BrowsingActionParserBrowseInteractive(ActionParser):
+    """
+    Parser action:
+        - BrowseInteractiveAction(browser_actions) - handle send message to user function call in BrowserGym
+    """
+
+    def __init__(
+        self,
+    ):
+        pass
+
+    def check_condition(self, action_str: str) -> bool:
+        return True
+
+    def parse(self, action_str: str) -> Action:
+        thought = action_str.split('```')[0].strip()
+        action_str = action_str.split('```')[1].strip()
+        msg_content = ''
+        for sub_action in action_str.split('\n'):
+            if 'send_msg_to_user(' in sub_action:
+                tree = ast.parse(sub_action)
+                args = tree.body[0].value.args  # type: ignore
+                msg_content = args[0].value
+
+        return BrowseInteractiveAction(
+            browser_actions=action_str,
+            thought=thought,
+            browsergym_send_msg_to_user=msg_content,
+        )
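
The parsing done by `BrowsingActionParserBrowseInteractive.parse` above can be tried in isolation. The sketch below reproduces its two steps with only the standard library: splitting the response on the code fence, then using `ast` to pull the first argument out of a `send_msg_to_user(...)` call. The helper names `split_response` and `extract_user_message` are hypothetical and do not exist in the codebase.

```python
import ast

FENCE = '`' * 3  # the triple-backtick delimiter the agent is prompted to use


def split_response(response_text: str) -> tuple[str, str]:
    """Split an LLM reply into (thought, action block) around the first fence."""
    parts = response_text.split(FENCE)
    return parts[0].strip(), parts[1].strip()


def extract_user_message(action_block: str) -> str:
    """Return the first argument of the last send_msg_to_user(...) line, or ''."""
    msg_content = ''
    for sub_action in action_block.split('\n'):
        if 'send_msg_to_user(' in sub_action:
            # Parse the line as Python and read the call's first positional argument.
            tree = ast.parse(sub_action.strip())
            call = tree.body[0].value
            msg_content = call.args[0].value
    return msg_content
```

`split_response` assumes the reply contains at least one fence. In the parser above, replies without a fence are handled earlier by `BrowsingActionParserMessage`, whose `check_condition` matches exactly that case.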
--- a/openhands/agenthub/browsing_agent/utils.py
+++ b/openhands/agenthub/browsing_agent/utils.py
@@ -7,6 +7,7 @@ import yaml

 def yaml_parser(message):
    """Parse a yaml message for the retry function."""
+
    # saves gpt-3.5 from some yaml parsing errors
    message = re.sub(r':\s*\n(?=\S|\n)', ': ', message)

@@ -46,6 +47,7 @@ def _compress_chunks(text, identifier, skip_list, split_regex='\n\n+'):

 def compress_string(text):
    """Compress a string by replacing redundant paragraphs and lines with identifiers."""
+
    # Perform paragraph-level compression
    def_dict, compressed_text = _compress_chunks(
        text, identifier='§', skip_list=[], split_regex='\n\n+'
@@ -77,12 +79,12 @@ def extract_html_tags(text, keys):
    keys : list of str
        The HTML tags to extract the content from.

-    Returns:
+    Returns
    -------
    dict
        A dictionary mapping each key to a list of subset in `text` that match the key.

-    Notes:
+    Notes
    -----
    All text and keys will be converted to lowercase before matching.

@@ -124,7 +126,7 @@ def parse_html_tags(text, keys=(), optional_keys=(), merge_multiple=False):
    optional_keys : list of str
        The HTML tags to extract the content from, but are optional.

-    Returns:
+    Returns
    -------
    dict
        A dictionary mapping each key to subset of `text` that match the key.
--- a/agenthub/codeact_agent/README.md
+++ b/agenthub/codeact_agent/README.md
@@ -0,0 +1,29 @@
+# CodeAct Agent Framework
+
+This folder implements the CodeAct idea ([paper](https://arxiv.org/abs/2402.01030), [tweet](https://twitter.com/xingyaow_/status/1754556835703751087)) that consolidates LLM agents’ **act**ions into a unified **code** action space for both *simplicity* and *performance* (see paper for more details).
+
+The conceptual idea is illustrated below. At each turn, the agent can:
+
+1. **Converse**: Communicate with humans in natural language to ask for clarification, confirmation, etc.
+2. **CodeAct**: Choose to perform the task by executing code
+   - Execute any valid Linux `bash` command
+   - Execute any valid `Python` code with [an interactive Python interpreter](https://ipython.org/). This is simulated through a `bash` command; see the plugin system below for more details.
+
+![image](https://github.com/OpenDevin/OpenDevin/assets/38853559/92b622e3-72ad-4a61-8f41-8c040b6d5fb3)
+
+## Plugin System
+
+To make the CodeAct agent more powerful while giving it access only to the `bash` action space, the agent leverages OpenDevin's plugin system:
+- [Jupyter plugin](https://github.com/OpenDevin/OpenDevin/tree/main/opendevin/runtime/plugins/jupyter): for IPython execution via bash command
+- [SWE-agent tool plugin](https://github.com/OpenDevin/OpenDevin/tree/main/opendevin/runtime/plugins/swe_agent_commands): Powerful bash command line tools for software development tasks introduced by [swe-agent](https://github.com/princeton-nlp/swe-agent).
+
+## Demo
+
+https://github.com/OpenDevin/OpenDevin/assets/38853559/f592a192-e86c-4f48-ad31-d69282d5f6ac
+
+*Example of CodeActAgent with `gpt-4-turbo-2024-04-09` performing a data science task (linear regression)*
+
+## Work-in-progress & Next step
+
+- [ ] Support web-browsing
+- [ ] Complete the workflow for CodeAct agent to submit GitHub PRs
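
The bash branch of the CodeAct action space described in this README is recognized by tag matching in `agenthub/codeact_agent/action_parser.py`. The sketch below is a rough, standalone approximation of that logic: it closes an unterminated `<execute_...>` tag, then extracts the command with the same regular expression. The function names are hypothetical, and returning the stripped first capture group as the command is an assumption made for illustration.

```python
import re


def close_open_tag(action: str) -> str:
    """Append a closing tag when the LLM output stopped before emitting one."""
    for lang in ['bash', 'ipython', 'browse']:
        if f'<execute_{lang}>' in action and f'</execute_{lang}>' not in action:
            action += f'</execute_{lang}>'
    return action


def extract_bash(action: str):
    """Return (thought, command) for a bash action, or None if there is none."""
    match = re.search(r'<execute_bash>(.*?)</execute_bash>', action, re.DOTALL)
    if match is None:
        return None
    # Everything outside the tag is treated as the agent's thought.
    thought = action.replace(match.group(0), '').strip()
    # Assumption: the command is the tag body with surrounding whitespace removed.
    command = match.group(1).strip()
    return thought, command
```

`re.DOTALL` lets `.` match newlines, so multi-line commands are captured, and the non-greedy `.*?` stops at the first closing tag.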
--- a/agenthub/codeact_agent/__init__.py
+++ b/agenthub/codeact_agent/__init__.py
@@ -0,0 +1,5 @@
+from opendevin.controller.agent import Agent
+
+from .codeact_agent import CodeActAgent
+
+Agent.register('CodeActAgent', CodeActAgent)
--- a/agenthub/codeact_agent/action_parser.py
+++ b/agenthub/codeact_agent/action_parser.py
@@ -0,0 +1,183 @@
+import re
+
+from opendevin.controller.action_parser import ActionParser, ResponseParser
+from opendevin.events.action import (
+    Action,
+    AgentDelegateAction,
+    AgentFinishAction,
+    CmdRunAction,
+    IPythonRunCellAction,
+    MessageAction,
+)
+
+
+class CodeActResponseParser(ResponseParser):
+    """
+    Parser action:
+        - CmdRunAction(command) - bash command to run
+        - IPythonRunCellAction(code) - IPython code to run
+        - AgentDelegateAction(agent, inputs) - delegate action for (sub)task
+        - MessageAction(content) - Message action to run (e.g. ask for clarification)
+        - AgentFinishAction() - end the interaction
+    """
+
+    def __init__(self):
+        # Need to pay attention to the item order in self.action_parsers
+        super().__init__()
+        self.action_parsers = [
+            CodeActActionParserFinish(),
+            CodeActActionParserCmdRun(),
+            CodeActActionParserIPythonRunCell(),
+            CodeActActionParserAgentDelegate(),
+        ]
+        self.default_parser = CodeActActionParserMessage()
+
+    def parse(self, response) -> Action:
+        action_str = self.parse_response(response)
+        return self.parse_action(action_str)
+
+    def parse_response(self, response) -> str:
+        action = response.choices[0].message.content
+        if action is None:
+            return ''
+        for lang in ['bash', 'ipython', 'browse']:
+            if f'<execute_{lang}>' in action and f'</execute_{lang}>' not in action:
+                action += f'</execute_{lang}>'
+        return action
+
+    def parse_action(self, action_str: str) -> Action:
+        for action_parser in self.action_parsers:
+            if action_parser.check_condition(action_str):
+                return action_parser.parse(action_str)
+        return self.default_parser.parse(action_str)
+
+
+class CodeActActionParserFinish(ActionParser):
+    """
+    Parser action:
+        - AgentFinishAction() - end the interaction
+    """
+
+    def __init__(
+        self,
+    ):
+        self.finish_command = None
+
+    def check_condition(self, action_str: str) -> bool:
+        self.finish_command = re.search(r'<finish>.*</finish>', action_str, re.DOTALL)
+        return self.finish_command is not None
+
+    def parse(self, action_str: str) -> Action:
+        assert (
+            self.finish_command is not None
+        ), 'self.finish_command should not be None when parse is called'
+        thought = action_str.replace(self.finish_command.group(0), '').strip()
+        return AgentFinishAction(thought=thought)
+
+
+class CodeActActionParserCmdRun(ActionParser):
+    """
+    Parser action:
+        - CmdRunAction(command) - bash command to run
+        - AgentFinishAction() - end the interaction
+    """
+
+    def __init__(
+        self,
+    ):
+        self.bash_command = None
+
+    def check_condition(self, action_str: str) -> bool:
+        self.bash_command = re.search(
+            r'<execute_bash>(.*?)</execute_bash>', action_str, re.DOTALL
+        )
+        return self.bash_command is not None
+
+    def parse(self, action_str: str) -> Action:
+        assert (
+            self.bash_command is not None
+        ), 'self.bash_command should not be None when parse is called'
+        thought = action_str.replace(self.bash_command.group(0), '').strip()
+        # a command was found
+        command_group = self.bash_command.group(1).strip()
+        if command_group.strip() == 'exit':
+            return AgentFinishAction()
+        return CmdRunAction(command=command_group, thought=thought)
+
+
+class CodeActActionParserIPythonRunCell(ActionParser):
+    """
+    Parser action:
+        - IPythonRunCellAction(code) - IPython code to run
+    """
+
+    def __init__(
+        self,
+    ):
+        self.python_code = None
+        self.jupyter_kernel_init_code: str = 'from agentskills import *'
+
+    def check_condition(self, action_str: str) -> bool:
+        self.python_code = re.search(
+            r'<execute_ipython>(.*?)</execute_ipython>', action_str, re.DOTALL
+        )
+        return self.python_code is not None
+
+    def parse(self, action_str: str) -> Action:
+        assert (
+            self.python_code is not None
+        ), 'self.python_code should not be None when parse is called'
+        code_group = self.python_code.group(1).strip()
+        thought = action_str.replace(self.python_code.group(0), '').strip()
+        return IPythonRunCellAction(
+            code=code_group,
+            thought=thought,
+            kernel_init_code=self.jupyter_kernel_init_code,
+        )
+
+
+class CodeActActionParserAgentDelegate(ActionParser):
+    """
+    Parser action:
+        - AgentDelegateAction(agent, inputs) - delegate action for (sub)task
+    """
+
+    def __init__(
+        self,
+    ):
+        self.agent_delegate = None
+
+    def check_condition(self, action_str: str) -> bool:
+        self.agent_delegate = re.search(
+            r'<execute_browse>(.*)</execute_browse>', action_str, re.DOTALL
+        )
+        return self.agent_delegate is not None
+
+    def parse(self, action_str: str) -> Action:
+        assert (
+            self.agent_delegate is not None
+        ), 'self.agent_delegate should not be None when parse is called'
+        thought = action_str.replace(self.agent_delegate.group(0), '').strip()
+        browse_actions = self.agent_delegate.group(1).strip()
+        task = f'{thought}. I should start with: {browse_actions}'
+        return AgentDelegateAction(agent='BrowsingAgent', inputs={'task': task})
+
+
+class CodeActActionParserMessage(ActionParser):
+    """
+    Parser action:
+        - MessageAction(content) - Message action to run (e.g. ask for clarification)
+    """
+
+    def __init__(
+        self,
+    ):
+        pass
+
+    def check_condition(self, action_str: str) -> bool:
+        # We assume the LLM is GOOD enough that when it returns pure natural language
+        # it wants to talk to the user
+        return True
+
+    def parse(self, action_str: str) -> Action:
+        return MessageAction(content=action_str, wait_for_response=True)
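Reviewer sketch (not part of the patch): the parser chain above can be tried without any `opendevin` imports. The snippet below is a minimal stand-in under stated assumptions. `restore_closing_tag` and `classify` are hypothetical names, the returned tuples replace the real `Action` classes, and the browse pattern is non-greedy here for uniformity whereas the patch uses a greedy `(.*)`. It illustrates two points: the closing tag is re-appended when it is missing (presumably because the `stop` sequences passed to the LLM cut it off), and the checks run in the same order as `self.action_parsers`.

```python
import re

# Hypothetical, dependency-free stand-ins for the parser chain in the diff above.
TAGS = ('bash', 'ipython', 'browse')


def restore_closing_tag(text: str) -> str:
    """Re-append a closing tag that is missing from the model output."""
    for lang in TAGS:
        if f'<execute_{lang}>' in text and f'</execute_{lang}>' not in text:
            text += f'</execute_{lang}>'
    return text


def classify(text: str) -> tuple[str, str]:
    """Return (kind, payload); checks run in the same order as action_parsers."""
    if re.search(r'<finish>.*</finish>', text, re.DOTALL):
        return ('finish', '')
    for lang in TAGS:
        match = re.search(rf'<execute_{lang}>(.*?)</execute_{lang}>', text, re.DOTALL)
        if match:
            return (lang, match.group(1).strip())
    # Nothing matched: treat the whole response as a message to the user.
    return ('message', text)
```

Because the finish check comes first, a response that contains both `<finish></finish>` and an execute block is treated as a finish, which is why the comment in `__init__` warns about item order.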
--- a/agenthub/codeact_agent/codeact_agent.py
+++ b/agenthub/codeact_agent/codeact_agent.py
@@ -0,0 +1,241 @@
+from agenthub.codeact_agent.action_parser import CodeActResponseParser
+from agenthub.codeact_agent.prompt import (
+    COMMAND_DOCS,
+    EXAMPLES,
+    GITHUB_MESSAGE,
+    SYSTEM_PREFIX,
+    SYSTEM_SUFFIX,
+)
+from opendevin.controller.agent import Agent
+from opendevin.controller.state.state import State
+from opendevin.core.config import config
+from opendevin.events.action import (
+    Action,
+    AgentDelegateAction,
+    AgentFinishAction,
+    CmdRunAction,
+    IPythonRunCellAction,
+    MessageAction,
+)
+from opendevin.events.observation import (
+    AgentDelegateObservation,
+    CmdOutputObservation,
+    IPythonRunCellObservation,
+)
+from opendevin.events.serialization.event import truncate_content
+from opendevin.llm.llm import LLM
+from opendevin.runtime.plugins import (
+    AgentSkillsRequirement,
+    JupyterRequirement,
+    PluginRequirement,
+)
+from opendevin.runtime.tools import RuntimeTool
+
+ENABLE_GITHUB = True
+
+
+def action_to_str(action: Action) -> str:
+    if isinstance(action, CmdRunAction):
+        return f'{action.thought}\n<execute_bash>\n{action.command}\n</execute_bash>'
+    elif isinstance(action, IPythonRunCellAction):
+        return f'{action.thought}\n<execute_ipython>\n{action.code}\n</execute_ipython>'
+    elif isinstance(action, AgentDelegateAction):
+        return f'{action.thought}\n<execute_browse>\n{action.inputs["task"]}\n</execute_browse>'
+    elif isinstance(action, MessageAction):
+        return action.content
+    return ''
+
+
+def get_action_message(action: Action) -> dict[str, str] | None:
+    if (
+        isinstance(action, AgentDelegateAction)
+        or isinstance(action, CmdRunAction)
+        or isinstance(action, IPythonRunCellAction)
+        or isinstance(action, MessageAction)
+    ):
+        return {
+            'role': 'user' if action.source == 'user' else 'assistant',
+            'content': action_to_str(action),
+        }
+    return None
+
+
+def get_observation_message(obs) -> dict[str, str] | None:
+    max_message_chars = config.get_llm_config_from_agent(
+        'CodeActAgent'
+    ).max_message_chars
+    if isinstance(obs, CmdOutputObservation):
+        content = 'OBSERVATION:\n' + truncate_content(obs.content, max_message_chars)
+        content += (
+            f'\n[Command {obs.command_id} finished with exit code {obs.exit_code}]'
+        )
+        return {'role': 'user', 'content': content}
+    elif isinstance(obs, IPythonRunCellObservation):
+        content = 'OBSERVATION:\n' + obs.content
+        # replace base64 images with a placeholder
+        splitted = content.split('\n')
+        for i, line in enumerate(splitted):
+            if '![image](data:image/png;base64,' in line:
+                splitted[i] = (
+                    '![image](data:image/png;base64, ...) already displayed to user'
+                )
+        content = '\n'.join(splitted)
+        content = truncate_content(content, max_message_chars)
+        return {'role': 'user', 'content': content}
+    elif isinstance(obs, AgentDelegateObservation):
+        content = 'OBSERVATION:\n' + truncate_content(
+            str(obs.outputs), max_message_chars
+        )
+        return {'role': 'user', 'content': content}
+    return None
+
+
+# FIXME: We can tweak these two settings to create MicroAgents specialized toward different areas
+def get_system_message() -> str:
+    if ENABLE_GITHUB:
+        return f'{SYSTEM_PREFIX}\n{GITHUB_MESSAGE}\n\n{COMMAND_DOCS}\n\n{SYSTEM_SUFFIX}'
+    else:
+        return f'{SYSTEM_PREFIX}\n\n{COMMAND_DOCS}\n\n{SYSTEM_SUFFIX}'
+
+
+def get_in_context_example() -> str:
+    return EXAMPLES
+
+
+class CodeActAgent(Agent):
+    VERSION = '1.8'
+    """
+    The Code Act Agent is a minimalist agent.
+    The agent works by passing the model a list of action-observation pairs and prompting the model to take the next step.
+
+    ### Overview
+
+    This agent implements the CodeAct idea ([paper](https://arxiv.org/abs/2402.13463), [tweet](https://twitter.com/xingyaow_/status/1754556835703751087)) that consolidates LLM agents’ **act**ions into a unified **code** action space for both *simplicity* and *performance* (see paper for more details).
+
+    The conceptual idea is illustrated below. At each turn, the agent can:
+
+    1. **Converse**: Communicate with humans in natural language to ask for clarification, confirmation, etc.
+    2. **CodeAct**: Choose to perform the task by executing code
+    - Execute any valid Linux `bash` command
+    - Execute any valid `Python` code with [an interactive Python interpreter](https://ipython.org/). This is simulated through a `bash` command; see the plugin system below for more details.
+
+    ![image](https://github.com/OpenDevin/OpenDevin/assets/38853559/92b622e3-72ad-4a61-8f41-8c040b6d5fb3)
+
+    ### Plugin System
+
+    To make the CodeAct agent more powerful while it only has access to the `bash` action space, it leverages OpenDevin's plugin system:
+    - [Jupyter plugin](https://github.com/OpenDevin/OpenDevin/tree/main/opendevin/runtime/plugins/jupyter): for IPython execution via bash command
+    - [SWE-agent tool plugin](https://github.com/OpenDevin/OpenDevin/tree/main/opendevin/runtime/plugins/swe_agent_commands): Powerful bash command line tools for software development tasks introduced by [swe-agent](https://github.com/princeton-nlp/swe-agent).
+
+    ### Demo
+
+    https://github.com/OpenDevin/OpenDevin/assets/38853559/f592a192-e86c-4f48-ad31-d69282d5f6ac
+
+    *Example of CodeActAgent with `gpt-4-turbo-2024-04-09` performing a data science task (linear regression)*
+
+    ### Work-in-progress & Next step
+
+    - [ ] Support web-browsing
+    - [ ] Complete the workflow for the CodeAct agent to submit GitHub PRs
+
+    """
+
+    sandbox_plugins: list[PluginRequirement] = [
+        # NOTE: AgentSkillsRequirement needs to go before JupyterRequirement, since
+        # AgentSkillsRequirement provides a lot of Python functions,
+        # and it needs to be initialized before Jupyter for Jupyter to use those functions.
+        AgentSkillsRequirement(),
+        JupyterRequirement(),
+    ]
+    runtime_tools: list[RuntimeTool] = [RuntimeTool.BROWSER]
+
+    system_message: str = get_system_message()
+    in_context_example: str = f"Here is an example of how you can interact with the environment for task solving:\n{get_in_context_example()}\n\nNOW, LET'S START!"
+
+    action_parser = CodeActResponseParser()
+
+    def __init__(
+        self,
+        llm: LLM,
+    ) -> None:
+        """
+        Initializes a new instance of the CodeActAgent class.
+
+        Parameters:
+        - llm (LLM): The llm to be used by this agent
+        """
+        super().__init__(llm)
+        self.reset()
+
+    def reset(self) -> None:
+        """
+        Resets the CodeAct Agent.
+        """
+        super().reset()
+
+    def step(self, state: State) -> Action:
+        """
+        Performs one step using the CodeAct Agent.
+        This includes gathering info on previous steps and prompting the model to make a command to execute.
+
+        Parameters:
+        - state (State): used to get updated info
+
+        Returns:
+        - CmdRunAction(command) - bash command to run
+        - IPythonRunCellAction(code) - IPython code to run
+        - AgentDelegateAction(agent, inputs) - delegate action for (sub)task
+        - MessageAction(content) - Message action to run (e.g. ask for clarification)
+        - AgentFinishAction() - end the interaction
+        """
+
+        # if we're done, go back
+        latest_user_message = state.history.get_last_user_message()
+        if latest_user_message and latest_user_message.strip() == '/exit':
+            return AgentFinishAction()
+
+        # prepare what we want to send to the LLM
+        messages: list[dict[str, str]] = self._get_messages(state)
+
+        response = self.llm.completion(
+            messages=messages,
+            stop=[
+                '</execute_ipython>',
+                '</execute_bash>',
+                '</execute_browse>',
+            ],
+            temperature=0.0,
+        )
+        return self.action_parser.parse(response)
+
+    def _get_messages(self, state: State) -> list[dict[str, str]]:
+        messages = [
+            {'role': 'system', 'content': self.system_message},
+            {'role': 'user', 'content': self.in_context_example},
+        ]
+
+        for event in state.history.get_events():
+            # create a regular message from an event
+            message = (
+                get_action_message(event)
+                if isinstance(event, Action)
+                else get_observation_message(event)
+            )
+
+            # add regular message
+            if message:
+                messages.append(message)
+
+        # the latest user message is important:
+        # we want to remind the agent of the environment constraints
+        latest_user_message = next(
+            (m for m in reversed(messages) if m['role'] == 'user'), None
+        )
+
+        # add a reminder to the prompt
+        if latest_user_message:
+            latest_user_message['content'] += (
+                f'\n\nENVIRONMENT REMINDER: You have {state.max_iterations - state.iteration} turns left to complete the task. When finished reply with <finish></finish>'
+            )
+
+        return messages
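Reviewer sketch (not part of the patch): two of the message-shaping steps in `codeact_agent.py` can be exercised in isolation. The snippet below is a minimal illustration under stated assumptions. `strip_inline_images` and `add_turn_reminder` are hypothetical helper names, `turns_left` stands in for `state.max_iterations - state.iteration`, and the reminder text is shortened. The calls to `truncate_content` are left out because that function's behavior is not shown in this diff.

```python
IMAGE_PREFIX = '![image](data:image/png;base64,'
PLACEHOLDER = '![image](data:image/png;base64, ...) already displayed to user'


def strip_inline_images(content: str) -> str:
    """Replace every line carrying an inline base64 PNG with a short placeholder."""
    return '\n'.join(
        PLACEHOLDER if IMAGE_PREFIX in line else line
        for line in content.split('\n')
    )


def add_turn_reminder(messages: list[dict[str, str]], turns_left: int) -> None:
    """Append the reminder to the most recent user message, in place."""
    latest = next((m for m in reversed(messages) if m['role'] == 'user'), None)
    if latest is not None:
        latest['content'] += (
            f'\n\nENVIRONMENT REMINDER: You have {turns_left} turns left to complete the task.'
        )
```

Since observations are also sent with the `user` role, the reminder in `_get_messages` usually lands on the latest observation. When the history is empty it lands on the in-context example message, which is always present.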
--- a/agenthub/codeact_agent/prompt.py
+++ b/agenthub/codeact_agent/prompt.py
@@ -0,0 +1,275 @@
+from opendevin.runtime.plugins import AgentSkillsRequirement
+
+_AGENT_SKILLS_DOCS = AgentSkillsRequirement.documentation
+
+COMMAND_DOCS = (
+    '\nApart from the standard Python library, the assistant can also use the following functions (already imported) in <execute_ipython> environment:\n'
+    f'{_AGENT_SKILLS_DOCS}'
+    "Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line '        print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run."
+)
+
+# ======= SYSTEM MESSAGE =======
+MINIMAL_SYSTEM_PREFIX = """A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with <execute_ipython>.
+<execute_ipython>
+print("Hello World!")
+</execute_ipython>
+The assistant can execute bash commands on behalf of the user by wrapping them with <execute_bash> and </execute_bash>.
+
+For example, you can list the files in the current directory by <execute_bash> ls </execute_bash>.
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: <execute_bash> python3 app.py > server.log 2>&1 & </execute_bash>
+Also, if a command execution result says something like: Command: "npm start" timed out. Sending SIGINT to the process, you should retry by running the command in the background.
+"""
+
+BROWSING_PREFIX = """The assistant can browse the Internet with <execute_browse> and </execute_browse>.
+For example, <execute_browse> Tell me who the president of the USA is using Google search </execute_browse>.
+Or <execute_browse> Tell me what is in http://example.com </execute_browse>.
+"""
+PIP_INSTALL_PREFIX = """The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: <execute_ipython> %pip install [package needed] </execute_ipython> and should always import packages and define variables before starting to use them."""
+
+SYSTEM_PREFIX = MINIMAL_SYSTEM_PREFIX + BROWSING_PREFIX + PIP_INSTALL_PREFIX
+
+GITHUB_MESSAGE = """To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+<execute_bash> git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch </execute_bash>
+If $GITHUB_TOKEN is not set, ask the user to set it."""
+
+SYSTEM_SUFFIX = """Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include <finish></finish> in your response.
+IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
+"""
+
+
+# ======= EXAMPLE MESSAGE =======
+EXAMPLES = """
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+<execute_ipython>
+create_file('app.py')
+</execute_ipython>
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+<execute_ipython>
+EDITED_CODE=\"\"\"from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+    numbers = list(range(1, 11))
+    return str(numbers)
+
+if __name__ == '__main__':
+    app.run(port=5000)\"\"\"
+
+insert_content_at_line(
+  'app.py',
+  1,
+  EDITED_CODE,
+)
+</execute_ipython>
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6|    numbers = list(range(1, 11))
+7|    return str(numbers)
+8|
+9|if __name__ == '__main__':
+10|    app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+<execute_bash>
+python3 app.py > server.log 2>&1 &
+</execute_bash>
+
+USER:
+OBSERVATION:
+[1] 121[1]+  Exit 1                  python3 app.py > server.log 2>&1
+
+ASSISTANT: It looks like the server started with PID 121 and then crashed. Let me check the server log:
+<execute_bash>
+cat server.log
+</execute_bash>
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+  File "/workspace/app.py", line 1, in <module>
+    from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+<execute_bash>
+pip install flask
+</execute_bash>
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+  Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+  Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+  Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+  Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+  Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+<execute_bash>
+python3 app.py > server.log 2>&1 &
+</execute_bash>
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+<execute_bash>
+cat server.log
+</execute_bash>
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+<execute_browse>
+Get the content on "http://127.0.0.1:5000"
+</execute_browse>
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+<execute_bash>
+kill 124
+</execute_bash>
+
+USER:
+OBSERVATION:
+[1]+  Terminated              python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+<execute_ipython>
+open_file('app.py')
+</execute_ipython>
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6|    numbers = list(range(1, 11))
+7|    return str(numbers)
+8|
+9|if __name__ == '__main__':
+10|    app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+<execute_ipython>
+edit_file_by_replace(
+  'app.py',
+  to_replace="    return str(numbers)",
+  new_content="    return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'",
+)
+</execute_ipython>
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (10 lines total after edit)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6|    numbers = list(range(1, 11))
+7|    return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
+8|
+9|if __name__ == '__main__':
+10|    app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+Running the updated file:
+<execute_bash>
+python3 app.py > server.log 2>&1 &
+</execute_bash>
+
+USER:
+OBSERVATION:
+[1] 126
+
+ASSISTANT:
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
+<finish></finish>
+
+--- END OF EXAMPLE ---
+"""
+
+INVALID_INPUT_MESSAGE = (
+    "I don't understand your input. \n"
+    'For bash commands, use <execute_bash> YOUR_COMMAND </execute_bash>.\n'
+    'For Python code, use <execute_ipython> YOUR_CODE </execute_ipython>.\n'
+    'For browsing, use <execute_browse> YOUR_COMMAND </execute_browse>.\n'
+)
--- a/openhands/agenthub/codeact_swe_agent/README.md
+++ b/openhands/agenthub/codeact_swe_agent/README.md
@@ -1,6 +1,6 @@
 # CodeAct (SWE Edit Specialized)

-This agent is an adaptation of the original [SWE Agent](https://swe-agent.com/) based on CodeAct using the `agentskills` library of OpenHands.
+This agent is an adaptation of the original [SWE Agent](https://swe-agent.com/) based on CodeAct using the `agentskills` library of OpenDevin.

 Its intended use is **solving GitHub issues**.

--- a/agenthub/codeact_swe_agent/__init__.py
+++ b/agenthub/codeact_swe_agent/__init__.py
@@ -0,0 +1,5 @@
+from opendevin.controller.agent import Agent
+
+from .codeact_swe_agent import CodeActSWEAgent
+
+Agent.register('CodeActSWEAgent', CodeActSWEAgent)
--- a/openhands/agenthub/codeact_swe_agent/action_parser.py
+++ b/openhands/agenthub/codeact_swe_agent/action_parser.py
@@ -1,7 +1,7 @@
 import re

-from openhands.controller.action_parser import ActionParser
-from openhands.events.action import (
+from opendevin.controller.action_parser import ActionParser
+from opendevin.events.action import (
    Action,
    AgentFinishAction,
    CmdRunAction,
@@ -11,8 +11,9 @@ from openhands.events.action import (


 class CodeActSWEActionParserFinish(ActionParser):
-    """Parser action:
-    - AgentFinishAction() - end the interaction
+    """
+    Parser action:
+        - AgentFinishAction() - end the interaction
    """

    def __init__(
@@ -33,9 +34,10 @@ class CodeActSWEActionParserFinish(ActionParser):


 class CodeActSWEActionParserCmdRun(ActionParser):
-    """Parser action:
-    - CmdRunAction(command) - bash command to run
-    - AgentFinishAction() - end the interaction
+    """
+    Parser action:
+        - CmdRunAction(command) - bash command to run
+        - AgentFinishAction() - end the interaction
    """

    def __init__(
@@ -62,8 +64,9 @@ class CodeActSWEActionParserCmdRun(ActionParser):


 class CodeActSWEActionParserIPythonRunCell(ActionParser):
-    """Parser action:
-    - IPythonRunCellAction(code) - IPython code to run
+    """
+    Parser action:
+        - IPythonRunCellAction(code) - IPython code to run
    """

    def __init__(
@@ -92,8 +95,9 @@ class CodeActSWEActionParserIPythonRunCell(ActionParser):


 class CodeActSWEActionParserMessage(ActionParser):
-    """Parser action:
-    - MessageAction(content) - Message action to run (e.g. ask for clarification)
+    """
+    Parser action:
+        - MessageAction(content) - Message action to run (e.g. ask for clarification)
    """

    def __init__(
--- a/agenthub/codeact_swe_agent/codeact_swe_agent.py
+++ b/agenthub/codeact_swe_agent/codeact_swe_agent.py
@@ -0,0 +1,195 @@
+from agenthub.codeact_swe_agent.prompt import (
+    COMMAND_DOCS,
+    SWE_EXAMPLE,
+    SYSTEM_PREFIX,
+    SYSTEM_SUFFIX,
+)
+from agenthub.codeact_swe_agent.response_parser import CodeActSWEResponseParser
+from opendevin.controller.agent import Agent
+from opendevin.controller.state.state import State
+from opendevin.core.config import config
+from opendevin.events.action import (
+    Action,
+    AgentFinishAction,
+    CmdRunAction,
+    IPythonRunCellAction,
+    MessageAction,
+)
+from opendevin.events.observation import (
+    CmdOutputObservation,
+    IPythonRunCellObservation,
+)
+from opendevin.events.serialization.event import truncate_content
+from opendevin.llm.llm import LLM
+from opendevin.runtime.plugins import (
+    AgentSkillsRequirement,
+    JupyterRequirement,
+    PluginRequirement,
+)
+from opendevin.runtime.tools import RuntimeTool
+
+
+def action_to_str(action: Action) -> str:
+    if isinstance(action, CmdRunAction):
+        return f'{action.thought}\n<execute_bash>\n{action.command}\n</execute_bash>'
+    elif isinstance(action, IPythonRunCellAction):
+        return f'{action.thought}\n<execute_ipython>\n{action.code}\n</execute_ipython>'
+    elif isinstance(action, MessageAction):
+        return action.content
+    return ''
+
+
+def get_action_message(action: Action) -> dict[str, str] | None:
+    if (
+        isinstance(action, CmdRunAction)
+        or isinstance(action, IPythonRunCellAction)
+        or isinstance(action, MessageAction)
+    ):
+        return {
+            'role': 'user' if action.source == 'user' else 'assistant',
+            'content': action_to_str(action),
+        }
+    return None
+
+
+def get_observation_message(obs) -> dict[str, str] | None:
+    max_message_chars = config.get_llm_config_from_agent(
+        'CodeActSWEAgent'
+    ).max_message_chars
+    if isinstance(obs, CmdOutputObservation):
+        content = 'OBSERVATION:\n' + truncate_content(obs.content, max_message_chars)
+        content += (
+            f'\n[Command {obs.command_id} finished with exit code {obs.exit_code}]'
+        )
+        return {'role': 'user', 'content': content}
+    elif isinstance(obs, IPythonRunCellObservation):
+        content = 'OBSERVATION:\n' + obs.content
+        # replace base64 images with a placeholder
+        splitted = content.split('\n')
+        for i, line in enumerate(splitted):
+            if '![image](data:image/png;base64,' in line:
+                splitted[i] = (
+                    '![image](data:image/png;base64, ...) already displayed to user'
+                )
+        content = '\n'.join(splitted)
+        content = truncate_content(content, max_message_chars)
+        return {'role': 'user', 'content': content}
+    return None
+
+
+def get_system_message() -> str:
+    return f'{SYSTEM_PREFIX}\n\n{COMMAND_DOCS}\n\n{SYSTEM_SUFFIX}'
+
+
+def get_in_context_example() -> str:
+    return SWE_EXAMPLE
+
+
+class CodeActSWEAgent(Agent):
+    VERSION = '1.6'
+    """
+    This agent is an adaptation of the original [SWE Agent](https://swe-agent.com/) based on CodeAct 1.5 using the `agentskills` library of OpenDevin.
+
+    Its intended use is **solving GitHub issues**.
+
+    It removes the web-browsing and GitHub capabilities of the original CodeAct agent to avoid confusing the agent.
+    """
+
+    sandbox_plugins: list[PluginRequirement] = [
+        # NOTE: AgentSkillsRequirement needs to go before JupyterRequirement, since
+        # AgentSkillsRequirement provides a lot of Python functions,
+        # and it needs to be initialized before Jupyter for Jupyter to use those functions.
+        AgentSkillsRequirement(),
+        JupyterRequirement(),
+    ]
+    runtime_tools: list[RuntimeTool] = []
+
+    system_message: str = get_system_message()
+    in_context_example: str = f"Here is an example of how you can interact with the environment for task solving:\n{get_in_context_example()}\n\nNOW, LET'S START!"
+
+    response_parser = CodeActSWEResponseParser()
+
+    def __init__(
+        self,
+        llm: LLM,
+    ) -> None:
+        """
+        Initializes a new instance of the CodeActSWEAgent class.
+
+        Parameters:
+        - llm (LLM): The llm to be used by this agent
+        """
+        super().__init__(llm)
+        self.reset()
+
+    def reset(self) -> None:
+        """
+        Resets the CodeActSWE Agent.
+        """
+        super().reset()
+
+    def step(self, state: State) -> Action:
+        """
+        Performs one step using the CodeActSWE Agent.
+        This includes gathering info on previous steps and prompting the model to make a command to execute.
+
+        Parameters:
+        - state (State): used to get updated info and background commands
+
+        Returns:
+        - CmdRunAction(command) - bash command to run
+        - IPythonRunCellAction(code) - IPython code to run
+        - MessageAction(content) - Message action to run (e.g. ask for clarification)
+        - AgentFinishAction() - end the interaction
+        """
+
+        # if we're done, go back
+        latest_user_message = state.history.get_last_user_message()
+        if latest_user_message and latest_user_message.strip() == '/exit':
+            return AgentFinishAction()
+
+        # prepare what we want to send to the LLM
+        messages: list[dict[str, str]] = self._get_messages(state)
+
+        response = self.llm.completion(
+            messages=messages,
+            stop=[
+                '</execute_ipython>',
+                '</execute_bash>',
+            ],
+            temperature=0.0,
+        )
+
+        return self.response_parser.parse(response)
+
+    def _get_messages(self, state: State) -> list[dict[str, str]]:
+        messages = [
+            {'role': 'system', 'content': self.system_message},
+            {'role': 'user', 'content': self.in_context_example},
+        ]
+
+        for event in state.history.get_events():
+            # create a regular message from an event
+            message = (
+                get_action_message(event)
+                if isinstance(event, Action)
+                else get_observation_message(event)
+            )
+
+            # add regular message
+            if message:
+                messages.append(message)
+
+        # the latest user message is important:
+        # we want to remind the agent of the environment constraints
+        latest_user_message = next(
+            (m for m in reversed(messages) if m['role'] == 'user'), None
+        )
+
+        # add a reminder to the prompt
+        if latest_user_message:
+            latest_user_message['content'] += (
+                f'\n\nENVIRONMENT REMINDER: You have {state.max_iterations - state.iteration} turns left to complete the task.'
+            )
+
+        return messages
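Reviewer note on `_get_messages`: the method finds the most recent user message and appends the remaining turn budget to it in place. The sketch below reproduces only that step with plain dicts instead of OpenDevin events. The helper name `add_turn_reminder` and the sample history are hypothetical and are not part of this patch.

```python
def add_turn_reminder(messages, max_iterations, iteration):
    """Append the remaining-turn reminder to the most recent user message, in place."""
    # Walk the list backwards so the first match is the latest user message.
    latest_user = next((m for m in reversed(messages) if m['role'] == 'user'), None)
    if latest_user is not None:
        remaining = max_iterations - iteration
        latest_user['content'] += (
            f'\n\nENVIRONMENT REMINDER: You have {remaining} turns left to complete the task.'
        )
    return messages


# Hypothetical history: only the last user message should be modified.
history = [
    {'role': 'system', 'content': 'system prompt'},
    {'role': 'user', 'content': 'Fix the bug.'},
    {'role': 'assistant', 'content': 'Looking into it.'},
]
add_turn_reminder(history, max_iterations=30, iteration=5)
```

Because the dict is mutated in place, the reminder lands on whichever user-role message is last in the list. In the patch, that can be the in-context example when the history contains no later user-role message.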
--- a/openhands/agenthub/codeact_swe_agent/prompt.py
+++ b/openhands/agenthub/codeact_swe_agent/prompt.py
@@ -1,4 +1,4 @@
-from openhands.runtime.plugins import AgentSkillsRequirement
+from opendevin.runtime.plugins import AgentSkillsRequirement

 _AGENT_SKILLS_DOCS = AgentSkillsRequirement.documentation

--- a/openhands/agenthub/codeact_swe_agent/response_parser.py
+++ b/openhands/agenthub/codeact_swe_agent/response_parser.py
@@ -1,19 +1,20 @@
-from openhands.agenthub.codeact_swe_agent.action_parser import (
+from agenthub.codeact_swe_agent.action_parser import (
    CodeActSWEActionParserCmdRun,
    CodeActSWEActionParserFinish,
    CodeActSWEActionParserIPythonRunCell,
    CodeActSWEActionParserMessage,
 )
-from openhands.controller.action_parser import ResponseParser
-from openhands.events.action import Action
+from opendevin.controller.action_parser import ResponseParser
+from opendevin.events.action import Action


 class CodeActSWEResponseParser(ResponseParser):
-    """Parser action:
-    - CmdRunAction(command) - bash command to run
-    - IPythonRunCellAction(code) - IPython code to run
-    - MessageAction(content) - Message action to run (e.g. ask for clarification)
-    - AgentFinishAction() - end the interaction
+    """
+    Parser action:
+        - CmdRunAction(command) - bash command to run
+        - IPythonRunCellAction(code) - IPython code to run
+        - MessageAction(content) - Message action to run (e.g. ask for clarification)
+        - AgentFinishAction() - end the interaction
    """

    def __init__(self):
--- a/agenthub/delegator_agent/__init__.py
+++ b/agenthub/delegator_agent/__init__.py
@@ -0,0 +1,5 @@
+from opendevin.controller.agent import Agent
+
+from .agent import DelegatorAgent
+
+Agent.register('DelegatorAgent', DelegatorAgent)
--- a/openhands/agenthub/delegator_agent/agent.py
+++ b/openhands/agenthub/delegator_agent/agent.py
@@ -1,9 +1,8 @@
-from openhands.controller.agent import Agent
-from openhands.controller.state.state import State
-from openhands.core.config import AgentConfig
-from openhands.events.action import Action, AgentDelegateAction, AgentFinishAction
-from openhands.events.observation import AgentDelegateObservation, Observation
-from openhands.llm.llm import LLM
+from opendevin.controller.agent import Agent
+from opendevin.controller.state.state import State
+from opendevin.events.action import Action, AgentDelegateAction, AgentFinishAction
+from opendevin.events.observation import AgentDelegateObservation
+from opendevin.llm.llm import LLM


 class DelegatorAgent(Agent):
@@ -14,16 +13,18 @@ class DelegatorAgent(Agent):

    current_delegate: str = ''

-    def __init__(self, llm: LLM, config: AgentConfig):
-        """Initialize the Delegator Agent with an LLM
+    def __init__(self, llm: LLM):
+        """
+        Initialize the Delegator Agent with an LLM

        Parameters:
        - llm (LLM): The llm to be used by this agent
        """
-        super().__init__(llm, config)
+        super().__init__(llm)

    def step(self, state: State) -> Action:
-        """Checks to see if current step is completed, returns AgentFinishAction if True.
+        """
+        Checks to see if current step is completed, returns AgentFinishAction if True.
        Otherwise, delegates the task to the next agent in the pipeline.

        Parameters:
@@ -35,22 +36,18 @@ class DelegatorAgent(Agent):
        """
        if self.current_delegate == '':
            self.current_delegate = 'study'
-            task, _ = state.get_current_user_intent()
+            task = state.get_current_user_intent()
            return AgentDelegateAction(
                agent='StudyRepoForTaskAgent', inputs={'task': task}
            )

        # last observation in history should be from the delegate
-        last_observation = None
-        for event in reversed(state.history):
-            if isinstance(event, Observation):
-                last_observation = event
-                break
+        last_observation = state.history.get_last_observation()

        if not isinstance(last_observation, AgentDelegateObservation):
            raise Exception('Last observation is not an AgentDelegateObservation')

-        goal, _ = state.get_current_user_intent()
+        goal = state.get_current_user_intent()
        if self.current_delegate == 'study':
            self.current_delegate = 'coder'
            return AgentDelegateAction(
--- a/agenthub/dummy_agent/__init__.py
+++ b/agenthub/dummy_agent/__init__.py
@@ -0,0 +1,5 @@
+from opendevin.controller.agent import Agent
+
+from .agent import DummyAgent
+
+Agent.register('DummyAgent', DummyAgent)
--- a/agenthub/dummy_agent/agent.py
+++ b/agenthub/dummy_agent/agent.py
@@ -0,0 +1,146 @@
+import time
+from typing import TypedDict
+
+from opendevin.controller.agent import Agent
+from opendevin.controller.state.state import State
+from opendevin.events.action import (
+    Action,
+    AddTaskAction,
+    AgentFinishAction,
+    AgentRejectAction,
+    BrowseInteractiveAction,
+    BrowseURLAction,
+    CmdRunAction,
+    FileReadAction,
+    FileWriteAction,
+    MessageAction,
+    ModifyTaskAction,
+)
+from opendevin.events.observation import (
+    CmdOutputObservation,
+    FileReadObservation,
+    FileWriteObservation,
+    NullObservation,
+    Observation,
+)
+from opendevin.events.serialization.event import event_to_dict
+from opendevin.llm.llm import LLM
+
+"""
+FIXME: There are a few problems this surfaced
+* FileWrites seem to add an unintended newline at the end of the file
+* Browser not working
+"""
+
+ActionObs = TypedDict(
+    'ActionObs', {'action': Action, 'observations': list[Observation]}
+)
+
+
+class DummyAgent(Agent):
+    """
+    The DummyAgent is used for e2e testing. It just sends the same set of actions deterministically,
+    without making any LLM calls.
+    """
+    VERSION = '1.0'
+
+    def __init__(self, llm: LLM):
+        super().__init__(llm)
+        self.steps: list[ActionObs] = [
+            {
+                'action': AddTaskAction(parent='0', goal='check the current directory'),
+                'observations': [NullObservation('')],
+            },
+            {
+                'action': AddTaskAction(parent='0.0', goal='run ls'),
+                'observations': [NullObservation('')],
+            },
+            {
+                'action': ModifyTaskAction(task_id='0.0', state='in_progress'),
+                'observations': [NullObservation('')],
+            },
+            {
+                'action': MessageAction('Time to get started!'),
+                'observations': [NullObservation('')],
+            },
+            {
+                'action': CmdRunAction(command='echo "foo"'),
+                'observations': [
+                    CmdOutputObservation('foo', command_id=-1, command='echo "foo"')
+                ],
+            },
+            {
+                'action': FileWriteAction(
+                    content='echo "Hello, World!"', path='hello.sh'
+                ),
+                'observations': [FileWriteObservation('', path='hello.sh')],
+            },
+            {
+                'action': FileReadAction(path='hello.sh'),
+                'observations': [
+                    FileReadObservation('echo "Hello, World!"\n', path='hello.sh')
+                ],
+            },
+            {
+                'action': CmdRunAction(command='bash hello.sh'),
+                'observations': [
+                    CmdOutputObservation(
+                        'Hello, World!', command_id=-1, command='bash hello.sh'
+                    )
+                ],
+            },
+            {
+                'action': BrowseURLAction(url='https://google.com'),
+                'observations': [
+                    # BrowserOutputObservation('<html></html>', url='https://google.com', screenshot=""),
+                ],
+            },
+            {
+                'action': BrowseInteractiveAction(
+                    browser_actions='goto("https://google.com")'
+                ),
+                'observations': [
+                    # BrowserOutputObservation('<html></html>', url='https://google.com', screenshot=""),
+                ],
+            },
+            {
+                'action': AgentFinishAction(),
+                'observations': [],
+            },
+            {
+                'action': AgentRejectAction(),
+                'observations': [],
+            },
+        ]
+
+    def step(self, state: State) -> Action:
+        time.sleep(0.1)
+        if state.iteration > 0:
+            prev_step = self.steps[state.iteration - 1]
+
+            # a step is (action, observations list)
+            if 'observations' in prev_step:
+                # one obs, at most
+                expected_observations = prev_step['observations']
+
+                # check if the history matches the expected observations
+                hist_events = state.history.get_last_events(len(expected_observations))
+                for i in range(len(expected_observations)):
+                    hist_obs = event_to_dict(hist_events[i])
+                    expected_obs = event_to_dict(expected_observations[i])
+                    if (
+                        'command_id' in hist_obs['extras']
+                        and hist_obs['extras']['command_id'] != -1
+                    ):
+                        del hist_obs['extras']['command_id']
+                        hist_obs['content'] = ''
+                    if (
+                        'command_id' in expected_obs['extras']
+                        and expected_obs['extras']['command_id'] != -1
+                    ):
+                        del expected_obs['extras']['command_id']
+                        expected_obs['content'] = ''
+                    assert (
+                        hist_obs == expected_obs
+                    ), f'Expected observation {expected_obs}, got {hist_obs}'
+        return self.steps[state.iteration]['action']
--- a/agenthub/gptswarm_agent/README.md
+++ b/agenthub/gptswarm_agent/README.md
@@ -0,0 +1,16 @@
+# GPTSwarm Framework
+
+## Introduction
+
+This folder implements GPTSwarm ([paper](https://arxiv.org/abs/2402.16823), [original repo](https://github.com/metauto-ai/GPTSwarm)). For more details, please see the paper.
+
+
+## Reference
+```
+@article{zhuge2024language,
+  title={Language Agents as Optimizable Graphs},
+  author={Zhuge, Mingchen and Wang, Wenyi and Kirsch, Louis and Faccio, Francesco and Khizbullin, Dmitrii and Schmidhuber, Jurgen},
+  journal={arXiv preprint arXiv:2402.16823},
+  year={2024}
+}
+```
--- a/agenthub/gptswarm_agent/__init__.py
+++ b/agenthub/gptswarm_agent/__init__.py
@@ -0,0 +1,5 @@
+from opendevin.controller.agent import Agent
+
+from .gptswarm_agent import GPTSwarm
+
+Agent.register('GPTSwarmAgent', GPTSwarm)
--- a/agenthub/gptswarm_agent/gptswarm_agent.py
+++ b/agenthub/gptswarm_agent/gptswarm_agent.py
@@ -0,0 +1,196 @@
+import asyncio
+import dataclasses
+from copy import deepcopy
+from typing import Any, Dict, List, Literal
+
+from agenthub.gptswarm_agent.gptswarm_graph import AssistantGraph
+from agenthub.gptswarm_agent.prompt import GPTSwarmPromptSet
+from opendevin.controller.agent import Agent
+from opendevin.controller.state.state import State
+from opendevin.core.logger import opendevin_logger as logger
+from opendevin.events.action import Action
+from opendevin.llm.llm import LLM
+
+ENABLE_GITHUB = True
+OPENAI_API_KEY = 'sk-proj-****'  # TODO: get from environment or config
+
+
+MessageRole = Literal['system', 'user', 'assistant']
+
+
+@dataclasses.dataclass()
+class Message:
+    role: MessageRole
+    content: str
+
+
+class GPTSwarm(Agent):
+    """
+    This is a simple revision of GPTSwarm that serves as an assistant agent.
+
+    GPTSwarm Paper: https://arxiv.org/abs/2402.16823 (ICML 2024, Oral Presentation)
+    GPTSwarm Code: https://github.com/metauto-ai/GPTSwarm
+    """
+    VERSION = '1.0'
+
+    def __init__(
+        self,
+        llm: LLM,
+        model_name: str,
+    ) -> None:
+        """
+        Initializes a new instance of the GPTSwarm class.
+
+        Parameters:
+        - llm (LLM): The llm to be used by this agent
+        """
+        super().__init__(llm)
+        self.api_key = OPENAI_API_KEY
+        self.llm = LLM(model=model_name, api_key=self.api_key)
+        self.graph = AssistantGraph(domain='gaia', model_name=model_name)
+        self.prompt_set = GPTSwarmPromptSet()
+
+    def reset(self) -> None:
+        """
+        Resets the GPTSwarm Agent.
+        """
+        super().reset()
+
+    def step(self, state: State) -> Action:
+        """
+        # TODO: It is stateless now. Find a way to make it stateful.
+        # NOTE: For the AI assistant, state-based design may introduce more uncertainties.
+        """
+        raise NotImplementedError
+
+    async def swarm_run(self, inputs: List[Dict[str, Any]], num_agents=3) -> List[str]:
+        """
+        Run the `run` method of this agent concurrently for `num_agents` times.
+        # NOTE: This is just a simple self-consistency.
+        # TODO: should follow original GPTSwarm's graph design to revise.
+        """
+
+        async def run_single_agent(index):
+            try:
+                result = await asyncio.wait_for(self.run(inputs=inputs), timeout=200)
+                print('-----------------------------------')
+                print(f'No. {index} Agent completed the task.')
+                logger.info(result[0])
+                print('-----------------------------------')
+                return result[0]
+            except asyncio.TimeoutError:
+                print(f'No. {index} Agent timed out.')
+                return None
+            except Exception as e:
+                print(f'No. {index} Agent resulted in an error: {e}')
+                return None
+
+        # Create a list of tasks to run concurrently
+        tasks = [run_single_agent(i) for i in range(num_agents)]
+
+        # Run all tasks concurrently and gather the results
+        agent_answers = await asyncio.gather(*tasks)
+
+        # Filter out None results (from timeouts or errors)
+        agent_answers = [answer for answer in agent_answers if answer is not None]
+
+        task = inputs[0]['task']
+        prompt = self.prompt_set.get_self_consistency(
+            question=task,
+            answers=agent_answers,
+            constraint=self.prompt_set.get_constraint(),
+        )
+        messages = [
+            Message(role='system', content=f'You are a {self.prompt_set.get_role()}.'),
+            Message(role='user', content=prompt),
+        ]
+
+        swarm_ans = self.llm.completion(
+            messages=[{'role': msg.role, 'content': msg.content} for msg in messages]
+        )
+        swarm_ans = swarm_ans.choices[0].message.content
+        return [swarm_ans]
+
+    async def run(
+        self,
+        inputs: List[Dict[str, Any]],
+        max_tries: int = 3,
+        max_time: int = 600,
+        return_all_outputs: bool = False,
+    ) -> List[Any]:
+        def is_node_useful(node):
+            if node in self.graph.output_nodes:
+                return True
+
+            for successor in node.successors:
+                if is_node_useful(successor):
+                    return True
+            return False
+
+        useful_node_ids = [
+            node_id
+            for node_id, node in self.graph.nodes.items()
+            if is_node_useful(node)
+        ]
+        in_degree = {
+            node_id: len(self.graph.nodes[node_id].predecessors)
+            for node_id in useful_node_ids
+        }
+        zero_in_degree_queue = [
+            node_id
+            for node_id, deg in in_degree.items()
+            if deg == 0 and node_id in useful_node_ids
+        ]
+
+        for i, input_node in enumerate(self.graph.input_nodes):
+            node_input = deepcopy(inputs)
+            input_node.inputs = [node_input]
+
+        while zero_in_degree_queue:
+            current_node_id = zero_in_degree_queue.pop(0)
+            current_node = self.graph.nodes[current_node_id]
+            tries = 0
+            while tries < max_tries:
+                try:
+                    await asyncio.wait_for(
+                        self.graph.nodes[current_node_id].execute(), timeout=max_time
+                    )
+                    # TODO: make GPTSwarm stateful in OpenDevin.
+                    # State.inputs = self.graph.nodes[current_node_id].inputs
+                    # State.outputs = self.graph.nodes[current_node_id].outputs
+                    # self.step(State)
+                    break  # success: stop retrying, otherwise the node runs max_tries times
+                except asyncio.TimeoutError:
+                    print(
+                        f'Node {current_node_id} execution timed out, retrying {tries + 1} out of {max_tries}...'
+                    )
+                except Exception as e:
+                    print(f'Error during execution of node {current_node_id}: {e}')
+                    break
+                tries += 1
+
+            for successor in current_node.successors:
+                if successor.id in useful_node_ids:
+                    in_degree[successor.id] -= 1
+                    if in_degree[successor.id] == 0:
+                        zero_in_degree_queue.append(successor.id)
+
+        final_answers = []
+
+        for output_node in self.graph.output_nodes:
+            output_messages = output_node.outputs
+
+            if len(output_messages) > 0 and not return_all_outputs:
+                final_answer = output_messages[-1].get('output', output_messages[-1])
+                final_answers.append(final_answer)
+            else:
+                for output_message in output_messages:
+                    final_answer = output_message.get('output', output_message)
+                    final_answers.append(final_answer)
+
+        if len(final_answers) == 0:
+            final_answers.append('No answer since there are no inputs provided')
+        return final_answers
+
+    def search_memory(self, query: str) -> list[str]:
+        raise NotImplementedError('Implement this abstract method')
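Reviewer note on `GPTSwarm.run`: the scheduling loop is a Kahn-style topological traversal with a per-node timeout and a bounded number of retries. The sketch below isolates that pattern using plain async callables. The function `run_in_topological_order`, the node names, and the edges are hypothetical and do not reflect the real `AssistantGraph` layout. It also assumes that a node which succeeds should not be executed again.

```python
import asyncio


async def run_in_topological_order(nodes, successors, max_tries=3, max_time=1.0):
    """nodes: id -> async callable; successors: id -> list of downstream ids."""
    # Count incoming edges for every node.
    in_degree = {node_id: 0 for node_id in nodes}
    for targets in successors.values():
        for target in targets:
            in_degree[target] += 1

    # Start with the nodes that have no unmet dependencies.
    queue = [node_id for node_id, degree in in_degree.items() if degree == 0]
    executed = []

    while queue:
        current = queue.pop(0)
        for attempt in range(1, max_tries + 1):
            try:
                await asyncio.wait_for(nodes[current](), timeout=max_time)
                executed.append(current)
                break  # success: do not run the node again
            except asyncio.TimeoutError:
                print(f'Node {current} timed out, attempt {attempt} of {max_tries}')
            except Exception as e:
                print(f'Error during execution of node {current}: {e}')
                break  # non-timeout errors are not retried

        # Release successors whose dependencies have all been visited.
        for nxt in successors.get(current, []):
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)

    return executed


async def noop():
    await asyncio.sleep(0)


# Hypothetical four-node graph: one source, two parallel branches, one sink.
demo_nodes = {'query': noop, 'file': noop, 'web': noop, 'combine': noop}
demo_edges = {'query': ['file', 'web'], 'file': ['combine'], 'web': ['combine']}
order = asyncio.run(run_in_topological_order(demo_nodes, demo_edges))
```

As in the patch, successors are released even when a node fails or times out, so a downstream node may run with missing inputs. Whether that is desirable is a design decision for the graph author.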
--- a/agenthub/gptswarm_agent/gptswarm_graph.py
+++ b/agenthub/gptswarm_agent/gptswarm_graph.py
@@ -0,0 +1,520 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+import ast
+import asyncio
+import dataclasses
+import os
+import re
+from collections import defaultdict
+from pathlib import Path
+from typing import Any, List, Literal, Optional
+
+import requests
+from pytube import YouTube
+from swarm.graph import Graph, Node
+
+from agenthub.gptswarm_agent.prompt import GPTSwarmPromptSet
+from opendevin.core.logger import opendevin_logger as logger
+from opendevin.llm.llm import LLM
+from opendevin.runtime.plugins.agent_skills.agentskills import (
+    parse_audio,
+    parse_docx,
+    parse_image,
+    parse_latex,
+    parse_pdf,
+    parse_pptx,
+    parse_txt,
+    parse_video,
+)
+
+OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')  # read from the environment instead of hard-coding
+SEARCHAPI_API_KEY = os.getenv('SEARCHAPI_API_KEY')  # read from the environment instead of hard-coding
+
+MessageRole = Literal['system', 'user', 'assistant']
+
+
+@dataclasses.dataclass()
+class Message:
+    role: MessageRole
+    content: str
+
+
+READER_MAP = {
+    '.png': parse_image,
+    '.jpg': parse_image,
+    '.jpeg': parse_image,
+    '.gif': parse_image,
+    '.bmp': parse_image,
+    '.tiff': parse_image,
+    '.tif': parse_image,
+    '.webp': parse_image,
+    '.mp3': parse_audio,
+    '.m4a': parse_audio,
+    '.wav': parse_audio,
+    '.MOV': parse_video,
+    '.mp4': parse_video,
+    '.mov': parse_video,
+    '.avi': parse_video,
+    '.mpg': parse_video,
+    '.mpeg': parse_video,
+    '.wmv': parse_video,
+    '.flv': parse_video,
+    '.webm': parse_video,
+    '.pptx': parse_pptx,
+    '.pdf': parse_pdf,
+    '.docx': parse_docx,
+    '.tex': parse_latex,
+    '.txt': parse_txt,
+}
+
+
+class FileReader:
+    def __init__(self):
+        self.reader = None  # Initial type is None
+
+    def set_reader(self, suffix: str):
+        reader = READER_MAP.get(suffix)
+        if reader is not None:
+            self.reader = reader
+            logger.info(f'Setting Reader to {self.reader.__name__}')
+        else:
+            logger.error(f'No reader found for suffix {suffix}')
+            self.reader = None
+
+    def read_file(self, file_path: Path, task: str = 'describe the file') -> str:
+        suffix = file_path.suffix
+        self.set_reader(suffix)
+        if not self.reader:
+            raise ValueError(f'No reader set for suffix {suffix}')
+        if self.reader in [parse_image, parse_video]:
+            file_content = self.reader(file_path, task)
+        else:
+            file_content = self.reader(file_path)
+        logger.info(f'Reading file {file_path} using {self.reader.__name__}')
+        return file_content
+
+
+class GenerateQuery(Node):
+    def __init__(
+        self,
+        domain: str = 'gaia',
+        model_name: Optional[str] = 'gpt-4o-2024-05-13',
+        operation_description: str = 'Given a question, return what information is needed to answer the question.',
+        id=None,
+    ):
+        super().__init__(operation_description, id, True)
+        self.domain = domain
+        self.api_key = OPENAI_API_KEY
+        self.llm = LLM(model=model_name, api_key=self.api_key)
+        self.prompt_set = GPTSwarmPromptSet()
+
+    @property
+    def node_name(self) -> str:
+        return self.__class__.__name__
+
+    def extract_urls(self, text: str) -> List[str]:
+        url_pattern = r'https?://[^\s]+'
+        urls = re.findall(url_pattern, text)
+        return urls
+
+    def is_youtube_url(self, url: str) -> bool:
+        youtube_regex = (
+            r'(https?://)?(www\.)?'
+            r'(youtube|youtu|youtube-nocookie)\.(com|be)/'
+            r'(watch\?v=|embed/|v/|.+\?v=)?([^&=%\?]{11})'
+        )
+        return bool(re.match(youtube_regex, url))
+
+    def _youtube_download(self, url: str) -> str:
+        try:
+            video_id = url.split('v=')[-1].split('&')[0]
+            video_id = video_id.strip()
+            youtube = YouTube(url)
+            video_stream = (
+                youtube.streams.filter(progressive=True, file_extension='mp4')
+                .order_by('resolution')
+                .desc()
+                .first()
+            )
+            if not video_stream:
+                raise ValueError('No suitable video stream found.')
+
+            output_dir = 'workspace/tmp'
+            os.makedirs(output_dir, exist_ok=True)
+            output_path = f'{output_dir}/{video_id}.mp4'
+            video_stream.download(output_path=output_dir, filename=f'{video_id}.mp4')
+            return output_path
+
+        except Exception as e:
+            logger.error(
+                f'Error downloading video from {url}: {e}'
+            )  # Use logger for error messages
+            return ''
+
+    async def _execute(
+        self, inputs: Optional[List[dict]] = None, **kwargs
+    ) -> List[dict]:
+        if inputs is None:
+            inputs = []
+        node_inputs = inputs
+        outputs = []
+
+        for input in node_inputs:
+            urls = self.extract_urls(input['task'])
+
+            download_paths = []
+
+            for url in urls:
+                if self.is_youtube_url(url):
+                    download_path = self._youtube_download(url)
+                    if download_path:
+                        download_paths.append(download_path)
+
+            if urls:
+                logger.info(urls)
+            if download_paths:
+                logger.info(download_paths)
+
+            files = input.get('files', [])
+            if not isinstance(files, list):
+                files = []
+            files.extend(download_paths)
+
+            role = self.prompt_set.get_role()
+            # constraint = self.prompt_set.get_constraint()
+            prompt = self.prompt_set.get_query_prompt(question=input['task'])
+
+            messages = [
+                Message(role='system', content=f'You are a {role}.'),
+                Message(role='user', content=prompt),
+            ]
+
+            response = self.llm.completion(
+                messages=[
+                    {'role': msg.role, 'content': msg.content} for msg in messages
+                ]
+            )
+            response = response.choices[0].message.content
+
+            executions = {
+                'operation': self.node_name,
+                'task': input['task'],
+                'files': files,
+                'input': input.get('task', None),
+                'subtask': prompt,
+                'output': response,
+                'format': 'natural language',
+            }
+            outputs.append(executions)
+
+        return outputs
+
+
+class FileAnalyse(Node):
+    def __init__(
+        self,
+        domain: str = 'gaia',
+        model_name: Optional[str] = 'gpt-4o-2024-05-13',
+        operation_description: str = 'Given a question, extract information from a file.',
+        id=None,
+    ):
+        super().__init__(operation_description, id, True)
+        self.domain = domain
+        self.api_key = OPENAI_API_KEY
+        self.llm = LLM(model=model_name, api_key=self.api_key)
+        self.prompt_set = GPTSwarmPromptSet()
+        self.reader = FileReader()
+
+    @property
+    def node_name(self) -> str:
+        return self.__class__.__name__
+
+    async def _execute(
+        self, inputs: Optional[List[dict]] = None, **kwargs
+    ) -> List[dict]:
+        if inputs is None:
+            inputs = []
+        node_inputs = inputs
+        outputs = []
+        for input in node_inputs:
+            query = input.get('output', 'Please organize the information of this file.')
+            files = input.get('files', [])
+            response = await self.file_analyse(query, files, self.llm)
+
+            executions = {
+                'operation': self.node_name,
+                'task': input['task'],
+                'files': files,
+                'input': query,
+                'subtask': f'Read the content of ###{files}, use query ###{query}',
+                'output': response,
+                'format': 'natural language',
+            }
+
+            outputs.append(executions)
+
+        return outputs
+
+    async def file_analyse(self, query: str, files: List[str], llm: LLM) -> str:
+        answer = ''
+        for file in files:
+            file_path = Path(file)
+            if self.reader not in [parse_image, parse_video]:
+                file_content = self.reader.read_file(file_path)
+                prompt = self.prompt_set.get_file_analysis_prompt(
+                    query=query, file=file_content
+                )
+                messages = [
+                    Message(
+                        role='system',
+                        content=f'You are a {self.prompt_set.get_role()}.',
+                    ),
+                    Message(role='user', content=prompt),
+                ]
+                response = llm.completion(
+                    messages=[
+                        {'role': msg.role, 'content': msg.content} for msg in messages
+                    ]
+                )
+                answer += response.choices[0].message.content + '\n'
+        return answer
+
+
+class WebSearch(Node):
+    def __init__(
+        self,
+        domain: str = 'gaia',
+        model_name: Optional[str] = 'gpt-4o-2024-05-13',
+        operation_description: str = 'Given a question, search the web for information.',
+        id=None,
+    ):
+        super().__init__(operation_description, id, True)
+        self.domain = domain
+        self.api_key = OPENAI_API_KEY
+        self.llm = LLM(model=model_name, api_key=self.api_key)
+        self.prompt_set = GPTSwarmPromptSet()
+
+    @property
+    def node_name(self) -> str:
+        return self.__class__.__name__
+
+    async def _execute(
+        self, inputs: Optional[List[dict]] = None, max_keywords: int = 4, **kwargs
+    ) -> List[dict]:
+        if inputs is None:
+            inputs = []
+        node_inputs = inputs
+        outputs = []
+        for input in node_inputs:
+            task = input['task']
+            query = input['output']
+            prompt = self.prompt_set.get_websearch_prompt(question=task, query=query)
+            messages = [
+                Message(
+                    role='system', content=f'You are a {self.prompt_set.get_role()}.'
+                ),
+                Message(role='user', content=prompt),
+            ]
+            generated_queries = self.llm.completion(
+                messages=[
+                    {'role': msg.role, 'content': msg.content} for msg in messages
+                ]
+            )
+
+            generated_queries = generated_queries.choices[0].message.content
+            generated_queries = generated_queries.split(',')[:max_keywords]
+            logger.info(f'The search keywords include: {generated_queries}')
+            search_results = [self.web_search(keyword) for keyword in generated_queries]
+            logger.info(f'The search results: {str(search_results)[:100]}...')
+
+            distill_prompt = self.prompt_set.get_distill_websearch_prompt(
+                question=input['task'], query=query, results='.\n'.join(search_results)
+            )
+
+            messages = [
+                Message(
+                    role='system', content=f'You are a {self.prompt_set.get_role()}.'
+                ),
+                Message(role='user', content=distill_prompt),
+            ]
+            response = self.llm.completion(
+                messages=[
+                    {'role': msg.role, 'content': msg.content} for msg in messages
+                ]
+            )
+            response = response.choices[0].message.content
+
+            executions = {
+                'operation': self.node_name,
+                'task': task,
+                'files': input.get('files', []),
+                'input': query,
+                'subtask': distill_prompt,
+                'output': response,
+                'format': 'natural language',
+            }
+            outputs.append(executions)
+
+        return outputs
+
+    def web_search(self, query: str, item_num: int = 3) -> str:
+        url = 'https://www.searchapi.io/api/v1/search'
+        params = {
+            'engine': 'google',
+            'q': query,
+            'api_key': SEARCHAPI_API_KEY,  # os.getenv("SEARCHAPI_API_KEY")
+        }
+
+        response = requests.get(url, params=params, timeout=30).json()
+
+        if (
+            'knowledge_graph' in response.keys()
+            and 'description' in response['knowledge_graph'].keys()
+        ):
+            return response['knowledge_graph']['description']
+
+        if (
+            'organic_results' in response.keys()
+            and len(response['organic_results']) > 0
+        ):
+            snippets = []
+            for res in response['organic_results'][:item_num]:
+                if 'snippet' in res:
+                    snippets.append(res['snippet'])
+            return '\n'.join(snippets)
+
+        return ' '
+
+
+class CombineAnswer(Node):
+    def __init__(
+        self,
+        domain: str = 'gaia',
+        model_name: Optional[str] = 'gpt-4o-2024-05-13',
+        operation_description: str = 'Combine multiple inputs into one.',
+        max_token: int = 500,
+        id=None,
+    ):
+        super().__init__(operation_description, id, True)
+        self.domain = domain
+        self.max_token = max_token
+        self.api_key = OPENAI_API_KEY
+        self.llm = LLM(model=model_name, api_key=self.api_key)
+        self.prompt_set = GPTSwarmPromptSet()
+        self.materials: defaultdict[str, str] = defaultdict(str)
+
+    @property
+    def node_name(self) -> str:
+        return self.__class__.__name__
+
+    async def _execute(
+        self, inputs: Optional[List[Any]] = None, **kwargs
+    ) -> List[dict]:
+        if inputs is None:
+            inputs = []
+        node_inputs = inputs
+
+        role = self.prompt_set.get_role()
+        constraint = self.prompt_set.get_constraint()
+
+        self.materials = defaultdict(str)
+        for input in node_inputs:
+            operation = input.get('operation')
+            if operation:
+                self.materials[operation] += f'{input.get("output", "")}\n'
+            self.materials['task'] = input.get('task')
+
+        question = self.prompt_set.get_combine_materials(self.materials)
+        prompt = self.prompt_set.get_answer_prompt(question=question)
+
+        messages = [
+            Message(role='system', content=f'You are a {role}. {constraint}'),
+            Message(role='user', content=prompt),
+        ]
+
+        response = self.llm.completion(
+            messages=[{'role': msg.role, 'content': msg.content} for msg in messages]
+        )
+
+        response = response.choices[0].message.content
+
+        executions = {
+            'operation': self.node_name,
+            'task': self.materials['task'],
+            'files': self.materials['files']
+            if isinstance(self.materials['files'], str)
+            else ', '.join(self.materials['files']),
+            'input': node_inputs,
+            'subtask': prompt,
+            'output': response,
+            'format': 'natural language',
+        }
+
+        return [executions]
+
+
+class AssistantGraph(Graph):
+    def build_graph(self):
+        query = GenerateQuery(self.domain, self.model_name)
+
+        file_analysis = FileAnalyse(self.domain, self.model_name)
+        web_search = WebSearch(self.domain, self.model_name)
+
+        query.add_successor(file_analysis)
+        query.add_successor(web_search)
+
+        combine = CombineAnswer(self.domain, self.model_name)
+        file_analysis.add_successor(combine)
+        web_search.add_successor(combine)
+
+        self.input_nodes = [query]
+        self.output_nodes = [combine]
+
+        self.add_node(query)
+        self.add_node(file_analysis)
+        self.add_node(web_search)
+        self.add_node(combine)
+
+
+if __name__ == '__main__':
+    # # test node
+    # task = 'What is the text representation of the last digit of twelve squared?'
+    # inputs = [{'task': task}]
+    # query_instance = GenerateQuery()
+    # query = asyncio.run(query_instance._execute(inputs))
+    # print(query)
+
+    # task = 'What is the text representation of the last digit of twelve squared?'
+    # inputs = [
+    #     {
+    #         'task': 'How can researchers ensure AGI development is both safe and ethical while avoiding societal biases and inequalities?',
+    #         'files': ['agi.txt'],
+    #     }
+    # ]
+    # file_instance = FileAnalyse()
+    # file_info = asyncio.run(file_instance._execute(inputs))
+    # print(file_info)
+
+    # task = 'What is the text representation of the last digit of twelve squared?'
+    # inputs = [
+    #     {
+    #         'task': 'How can researchers ensure AGI development is both safe and ethical while avoiding societal biases and inequalities?'
+    #     }
+    # ]
+    # search_instance = WebSearch()
+    # search_info = asyncio.run(search_instance._execute(inputs))
+    # print(search_info)
+
+    assistant_graph = AssistantGraph(domain='gaia', model_name='gpt-4o-2024-05-13')
+
+    # test graph
+    assistant_graph.build_graph()
+    inputs = [
+        {
+            'task': 'How can researchers ensure AGI development is both safe and ethical while avoiding societal biases and inequalities?',
+            'files': ['agi.txt'],
+        }
+    ]
+    outputs = asyncio.run(assistant_graph.run(inputs))
+    print(outputs)
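Review note on `WebSearch.web_search`: the method prefers the knowledge-graph description and otherwise falls back to the snippets of the top organic results. The sketch below restates that fallback order as a pure function so it can be exercised without network access. `pick_search_text` and the sample payload are hypothetical and not part of this change. The payload is decoded with `json.loads` here because JSON literals such as `true` and `null` are not valid Python literals.

```python
import json


def pick_search_text(payload: str, item_num: int = 3) -> str:
    """Hypothetical helper mirroring the fallback order used in web_search."""
    response = json.loads(payload)

    # 1. Prefer the knowledge-graph description when the key is present.
    knowledge = response.get('knowledge_graph', {})
    if 'description' in knowledge:
        return knowledge['description']

    # 2. Otherwise join the snippets of the first `item_num` organic results.
    organic = response.get('organic_results', [])
    if organic:
        return '\n'.join(
            res['snippet'] for res in organic[:item_num] if 'snippet' in res
        )

    # 3. Same single-space sentinel as the original when nothing is found.
    return ' '


# Assumed sample payload; note the JSON `true`, which json.loads accepts.
sample = (
    '{"organic_results": [{"snippet": "first"}, {"title": "no snippet"}, '
    '{"snippet": "third"}], "has_more": true}'
)
print(pick_search_text(sample))
```

Keeping the selection logic separate from the HTTP call would also make it straightforward to unit test with recorded responses.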
--- a/agenthub/gptswarm_agent/prompt.py
+++ b/agenthub/gptswarm_agent/prompt.py
@@ -0,0 +1,129 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+from typing import Any, Dict
+
+
+class GPTSwarmPromptSet:
+    """
+    GPTSwarmPromptSet provides a collection of static methods to generate prompts
+    for a general AI assistant. These prompts cover various tasks like answering questions,
+    performing web searches, analyzing files, and reflecting on tasks.
+    """
+
+    @staticmethod
+    def get_role():
+        return 'a general AI assistant'
+
+    @staticmethod
+    def get_constraint():
+        return (
+            'I will ask you a question. Report your thoughts, and finish your answer with the following template: FINAL ANSWER: [YOUR FINAL ANSWER]. '
+            'YOUR FINAL ANSWER should be a number OR as few words as possible OR a comma separated list of numbers and/or strings. '
+            "If you are asked for a number, don't use comma to write your number neither use units such as $ or percent sign unless specified otherwise. "
+            "If you are asked for a string, don't use articles, neither abbreviations (e.g. for cities), and write the digits in plain text unless specified otherwise. "
+            'If you are asked for a comma separated list, apply the above rules depending of whether the element to be put in the list is a number or a string. '
+        )
+
+    @staticmethod
+    def get_format():
+        return 'natural language'
+
+    @staticmethod
+    def get_answer_prompt(question):
+        return f'{question}'
+
+    @staticmethod
+    def get_query_prompt(question):
+        return (
+            '# Information Gathering for Question Resolution\n\n'
+            'Evaluate if additional information is needed to answer the question. '
+            'If a web search or file analysis is necessary, outline specific clues or details to be searched for.\n\n'
+            f'## ❓ Target Question:\n{question}\n\n'
+            '## 🔍 Clues for Investigation:\n'
+            'Identify critical clues and concepts within the question that are essential for finding the answer.\n'
+        )
+
+    @staticmethod
+    def get_file_analysis_prompt(query, file):
+        return (
+            '# File Analysis Task\n\n'
+            f'## 🔍 Information Extraction Objective:\n---\n{query}\n---\n\n'
+            f'## 📄 File Under Analysis:\n---\n{file}\n---\n\n'
+            '## 📝 Instructions:\n'
+            '1. Identify the key sections in the file relevant to the query.\n'
+            '2. Extract and summarize the necessary information from these sections.\n'
+            '3. Ensure the response is focused and directly addresses the query.\n'
+            "Example: 'Identify the main theme in the text.'"
+        )
+
+    @staticmethod
+    def get_websearch_prompt(question, query):
+        return (
+            '# Web Search Task\n\n'
+            f'## Original Question: \n---\n{question}\n---\n\n'
+            f'## 🔍 Targeted Search Objective:\n---\n{query}\n---\n\n'
+            '## 🌐 Simplified Search Instructions:\n'
+            'Generate three specific search queries directly related to the original question. Each query should focus on key terms from the question. Format the output as a comma-separated list.\n'
+            "For example, if the question is 'Who will be the next US president?', your queries could be: 'US presidential candidates, current US president, next US president'.\n"
+            "Remember to format the queries as 'query1, query2, query3'."
+        )
+
+    @staticmethod
+    def get_distill_websearch_prompt(question, query, results):
+        return (
+            '# Summarization of Search Results\n\n'
+            f'## Original question: \n---\n{question}\n---\n\n'
+            f'## 🔍 Required Information for Summary:\n---\n{query}\n---\n\n'
+            f'## 🌐 Analyzed Search Results:\n---\n{results}\n---\n\n'
+            '## 📝 Instructions for Summarization:\n'
+            '1. Review the provided search results and identify the most relevant information related to the question and query.\n'
+            '2. Extract and highlight the key findings, facts, or data points from these results.\n'
+            '3. Organize the summarized information in a coherent and logical manner.\n'
+            '4. Ensure the summary is concise and directly addresses the query, avoiding extraneous details.\n'
+            '5. If the information from web search is useless, directly answer: "No useful information from WebSearch".\n'
+        )
+
+    @staticmethod
+    def get_combine_materials(materials: Dict[str, Any], avoid_vague=True) -> str:
+        question = materials.get('task', 'No problem provided')
+
+        for key, value in materials.items():
+            if 'No useful information from WebSearch' in value:
+                continue
+            value = value.strip()
+            if key != 'task' and value:
+                question += (
+                    f'\n\nReference information for {key}:'
+                    + '\n----------------------------------------------\n'
+                    + f'{value}'
+                    + '\n----------------------------------------------\n\n'
+                )
+
+        if avoid_vague:
+            question += (
+                '\nProvide a specific answer. For questions with known answers, ensure to provide accurate and factual responses. '
+                + "Avoid vague responses or statements like 'unable to...' that don't contribute to a definitive answer. "
+                + "For example: if a question asks 'who will be the president of America', and the answer is currently unknown, you could suggest possibilities like 'Donald Trump', or 'Biden'. However, if the answer is known, provide the correct information."
+            )
+
+        return question
+
+    @staticmethod
+    def get_self_consistency(question: str, answers: list, constraint: str) -> str:
+        formatted_answers = '\n'.join(
+            [f'Answer {index + 1}: {answer}' for index, answer in enumerate(answers)]
+        )
+        return (
+            '# Self-Consistency Evaluation Task\n\n'
+            f'## 🤔 Question for Review:\n---\n{question}\n---\n\n'
+            f'## 💡 Reviewable Answers:\n---\n{formatted_answers}\n---\n\n'
+            '## 📋 Instructions for Selection:\n'
+            '1. Read each answer and assess how it addresses the question.\n'
+            "2. Compare the answers for their adherence to the given question's criteria and logical coherence.\n"
+            "3. Identify the answer that best aligns with the question's requirements and is the most logically consistent.\n"
+            "4. Ignore the candidate answers if they do not give a direct answer, for example, using 'unable to ...', 'as an AI ...'.\n"
+            '5. Copy the most suitable answer as it is, without modification, to maintain its original form.\n'
+            f'6. Adhere to the constraints: {constraint}.\n'
+            'Note: If no answer fully meets the criteria, choose and copy the one that is closest to the requirements.'
+        )
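Review note on `get_constraint`: the prompt asks the model to end with `FINAL ANSWER: [YOUR FINAL ANSWER]`, but nothing in this file parses that line back out. A minimal sketch of how a caller might do so is shown below. `extract_final_answer` is a hypothetical helper, and stripping literal square brackets is an assumption about how a model might copy the template.

```python
from typing import Optional


def extract_final_answer(response: str) -> Optional[str]:
    """Hypothetical helper: return the text after the last 'FINAL ANSWER:'."""
    marker = 'FINAL ANSWER:'
    index = response.rfind(marker)
    if index == -1:
        # The model did not follow the template.
        return None

    answer = response[index + len(marker):].strip()

    # Assumption: drop brackets if the template was copied literally.
    if answer.startswith('[') and answer.endswith(']'):
        answer = answer[1:-1].strip()
    return answer


print(extract_final_answer('Twelve squared is 144.\nFINAL ANSWER: four'))
```

Using the last occurrence of the marker avoids picking up the template if the model quotes the instructions before answering.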
--- a/openhands/agenthub/micro/README.md
+++ b/openhands/agenthub/micro/README.md
--- a/openhands/agenthub/micro/_instructions/actions/browse.md
+++ b/openhands/agenthub/micro/_instructions/actions/browse.md
--- a/openhands/agenthub/micro/_instructions/actions/delegate.md
+++ b/openhands/agenthub/micro/_instructions/actions/delegate.md
--- a/openhands/agenthub/micro/_instructions/actions/finish.md
+++ b/openhands/agenthub/micro/_instructions/actions/finish.md
--- a/openhands/agenthub/micro/_instructions/actions/kill.md
+++ b/openhands/agenthub/micro/_instructions/actions/kill.md
--- a/openhands/agenthub/micro/_instructions/actions/message.md
+++ b/openhands/agenthub/micro/_instructions/actions/message.md
--- a/openhands/agenthub/micro/_instructions/actions/read.md
+++ b/openhands/agenthub/micro/_instructions/actions/read.md
--- a/openhands/agenthub/micro/_instructions/actions/reject.md
+++ b/openhands/agenthub/micro/_instructions/actions/reject.md
--- a/openhands/agenthub/micro/_instructions/actions/run.md
+++ b/openhands/agenthub/micro/_instructions/actions/run.md
--- a/openhands/agenthub/micro/_instructions/actions/write.md
+++ b/openhands/agenthub/micro/_instructions/actions/write.md
--- a/openhands/agenthub/micro/_instructions/format/action.md
+++ b/openhands/agenthub/micro/_instructions/format/action.md
--- a/openhands/agenthub/micro/_instructions/history_truncated.md
+++ b/openhands/agenthub/micro/_instructions/history_truncated.md
--- a/agenthub/micro/agent.py
+++ b/agenthub/micro/agent.py
@@ -0,0 +1,81 @@
+from jinja2 import BaseLoader, Environment
+
+from opendevin.controller.agent import Agent
+from opendevin.controller.state.state import State
+from opendevin.core.config import config
+from opendevin.core.utils import json
+from opendevin.events.action import Action
+from opendevin.events.serialization.action import action_from_dict
+from opendevin.events.serialization.event import event_to_memory
+from opendevin.llm.llm import LLM
+from opendevin.memory.history import ShortTermHistory
+
+from .instructions import instructions
+from .registry import all_microagents
+
+
+def parse_response(orig_response: str) -> Action:
+    # attempt to load the JSON dict from the response
+    action_dict = json.loads(orig_response)
+
+    # load the action from the dict
+    return action_from_dict(action_dict)
+
+
+def to_json(obj, **kwargs):
+    """
+    Serialize an object to str format
+    """
+    return json.dumps(obj, **kwargs)
+
+
+def history_to_json(history: ShortTermHistory, max_events=20, **kwargs):
+    """
+    Serialize and simplify history to str format
+    """
+    # TODO: get agent specific llm config
+    llm_config = config.get_llm_config()
+    max_message_chars = llm_config.max_message_chars
+
+    processed_history = []
+    event_count = 0
+
+    for event in history.get_events(reverse=True):
+        if event_count >= max_events:
+            break
+        processed_history.append(event_to_memory(event, max_message_chars))
+        event_count += 1
+
+    # history is in reverse order, let's fix it
+    processed_history.reverse()
+
+    return json.dumps(processed_history, **kwargs)
+
+
+class MicroAgent(Agent):
+    VERSION = '1.0'
+    prompt = ''
+    agent_definition: dict = {}
+
+    def __init__(self, llm: LLM):
+        super().__init__(llm)
+        if 'name' not in self.agent_definition:
+            raise ValueError('Agent definition must contain a name')
+        self.prompt_template = Environment(loader=BaseLoader).from_string(self.prompt)
+        self.delegates = all_microagents.copy()
+        del self.delegates[self.agent_definition['name']]
+
+    def step(self, state: State) -> Action:
+        prompt = self.prompt_template.render(
+            state=state,
+            instructions=instructions,
+            to_json=to_json,
+            history_to_json=history_to_json,
+            delegates=self.delegates,
+            latest_user_message=state.get_current_user_intent(),
+        )
+        messages = [{'content': prompt, 'role': 'user'}]
+        resp = self.llm.completion(messages=messages)
+        action_resp = resp['choices'][0]['message']['content']
+        action = parse_response(action_resp)
+        return action
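Review note on `history_to_json`: it walks the history newest-first, stops after `max_events`, and then reverses the result so the prompt sees the most recent events in chronological order. The same idea can be sketched on a plain list, as below. `last_events_in_order` and the string events are hypothetical stand-ins for `ShortTermHistory` and real events.

```python
from itertools import islice
from typing import Iterable, List


def last_events_in_order(
    events_newest_first: Iterable[str], max_events: int = 20
) -> List[str]:
    """Hypothetical helper: keep the newest `max_events`, oldest first."""
    # Take at most `max_events` items from the newest-first stream.
    recent = list(islice(events_newest_first, max_events))
    # The slice is newest-first, so flip it back to chronological order.
    recent.reverse()
    return recent


history = ['open file', 'edit file', 'run tests', 'commit']
print(last_events_in_order(reversed(history), max_events=2))
```

Iterating in reverse means only the retained events are visited, which matters when the full history is long.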
--- a/openhands/agenthub/micro/coder/agent.yaml
+++ b/openhands/agenthub/micro/coder/agent.yaml
--- a/openhands/agenthub/micro/coder/prompt.md
+++ b/openhands/agenthub/micro/coder/prompt.md
--- a/openhands/agenthub/micro/commit_writer/README.md
+++ b/openhands/agenthub/micro/commit_writer/README.md
@@ -3,8 +3,8 @@
 CommitWriterAgent can help write git commit messages. Example:

 ```bash
-WORKSPACE_MOUNT_PATH="`pwd`" \
-  poetry run python openhands/core/main.py -t "dummy task" -c CommitWriterAgent -d ./
+WORKSPACE_MOUNT_PATH="`pwd`" SANDBOX_BOX_TYPE="ssh" \
+  poetry run python opendevin/core/main.py -t "dummy task" -c CommitWriterAgent -d ./
 ```

 This agent is special in the sense that it doesn't need a task. Once called,
--- a/openhands/agenthub/micro/commit_writer/agent.yaml
+++ b/openhands/agenthub/micro/commit_writer/agent.yaml
--- a/openhands/agenthub/micro/commit_writer/prompt.md
+++ b/openhands/agenthub/micro/commit_writer/prompt.md
--- a/openhands/agenthub/micro/instructions.py
+++ b/openhands/agenthub/micro/instructions.py
--- a/openhands/agenthub/micro/manager/agent.yaml
+++ b/openhands/agenthub/micro/manager/agent.yaml
--- a/openhands/agenthub/micro/manager/prompt.md
+++ b/openhands/agenthub/micro/manager/prompt.md
@@ -1,7 +1,6 @@
 # Task
 You are in charge of accomplishing the following task:
-{% set goal = latest_user_message if latest_user_message is not none else state.inputs.task %}
-{{ goal }}
+{{ latest_user_message }}

 In order to accomplish this goal, you must delegate tasks to one or more agents, who
 can do the actual work. A description of each agent is provided below. You MUST
--- a/openhands/agenthub/micro/math_agent/agent.yaml
+++ b/openhands/agenthub/micro/math_agent/agent.yaml
--- a/openhands/agenthub/micro/math_agent/prompt.md
+++ b/openhands/agenthub/micro/math_agent/prompt.md
--- a/openhands/agenthub/micro/postgres_agent/agent.yaml
+++ b/openhands/agenthub/micro/postgres_agent/agent.yaml
--- a/openhands/agenthub/micro/postgres_agent/prompt.md
+++ b/openhands/agenthub/micro/postgres_agent/prompt.md
--- a/openhands/agenthub/micro/registry.py
+++ b/openhands/agenthub/micro/registry.py
--- a/openhands/agenthub/micro/repo_explorer/agent.yaml
+++ b/openhands/agenthub/micro/repo_explorer/agent.yaml
--- a/openhands/agenthub/micro/repo_explorer/prompt.md
+++ b/openhands/agenthub/micro/repo_explorer/prompt.md
--- a/openhands/agenthub/micro/study_repo_for_task/agent.yaml
+++ b/openhands/agenthub/micro/study_repo_for_task/agent.yaml
--- a/openhands/agenthub/micro/study_repo_for_task/prompt.md
+++ b/openhands/agenthub/micro/study_repo_for_task/prompt.md
--- a/openhands/agenthub/micro/typo_fixer_agent/agent.yaml
+++ b/openhands/agenthub/micro/typo_fixer_agent/agent.yaml
--- a/openhands/agenthub/micro/typo_fixer_agent/prompt.md
+++ b/openhands/agenthub/micro/typo_fixer_agent/prompt.md
--- a/openhands/agenthub/micro/verifier/agent.yaml
+++ b/openhands/agenthub/micro/verifier/agent.yaml
--- a/openhands/agenthub/micro/verifier/prompt.md
+++ b/openhands/agenthub/micro/verifier/prompt.md
--- a/evaluation/discoverybench/eval_utils/__init__.py
+++ b/evaluation/discoverybench/eval_utils/__init__.py
--- a/agenthub/monologue_agent/.dockerignore
+++ b/agenthub/monologue_agent/.dockerignore
@@ -0,0 +1,2 @@
+.envrc
+workspace
--- a/agenthub/monologue_agent/README.md
+++ b/agenthub/monologue_agent/README.md
@@ -0,0 +1,8 @@
+# LLM control loop
+This is currently a standalone utility. It will need to be integrated into OpenDevin's backend.
+
+## Usage
+```bash
+# Run this in project root
+./agenthub/monologue_agent/build-and-run.sh "write a bash script that prints 'hello world'"
+```
--- a/agenthub/monologue_agent/TODO.md
+++ b/agenthub/monologue_agent/TODO.md
@@ -0,0 +1,8 @@
+# TODO
+There's a lot of low-hanging fruit for this agent:
+
+* Strip `<script>`, `<style>`, and other non-text tags from the HTML before sending it to the LLM
+* Keep track of the working directory when the agent uses `cd`
+* Improve memory condensing--condense earlier memories more aggressively
+* Limit the time that `run` can wait (in case agent runs an interactive command and it's hanging)
+* Figure out how to run background processes, e.g. `node server.js` to start a server
--- a/agenthub/monologue_agent/__init__.py
+++ b/agenthub/monologue_agent/__init__.py
@@ -0,0 +1,5 @@
+from opendevin.controller.agent import Agent
+
+from .agent import MonologueAgent
+
+Agent.register('MonologueAgent', MonologueAgent)
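Review note on `Agent.register('MonologueAgent', MonologueAgent)`: registering at import time lets the controller look agents up by name. The internals of `Agent.register` are not shown in this diff, so the sketch below only illustrates the general name-to-class registry pattern. `MiniAgent`, `get_cls`, and `EchoAgent` are hypothetical names, not the project's API.

```python
from typing import Dict, Type


class MiniAgent:
    """Hypothetical stand-in for an agent base class with a name registry."""

    _registry: Dict[str, Type['MiniAgent']] = {}

    @classmethod
    def register(cls, name: str, agent_cls: Type['MiniAgent']) -> None:
        # Refuse silent overwrites so name clashes surface early.
        if name in cls._registry:
            raise ValueError(f'Agent {name!r} is already registered')
        cls._registry[name] = agent_cls

    @classmethod
    def get_cls(cls, name: str) -> Type['MiniAgent']:
        if name not in cls._registry:
            raise KeyError(f'No agent registered under {name!r}')
        return cls._registry[name]


class EchoAgent(MiniAgent):
    pass


# Mirrors the module-level registration call in the package __init__.
MiniAgent.register('EchoAgent', EchoAgent)
print(MiniAgent.get_cls('EchoAgent').__name__)
```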