- Server route already returns a list[VSCodeInstanceInfo]
- VsCodeRuntime now expects a list and validates shape
- Updated tests to mock list responses consistently
Co-authored-by: OpenHands-GPT-5 <openhands@all-hands.dev>
- IterationControlFlag.reached_limit now compares current_value >= max_value
so tests expecting limit detection and extensions pass
- VsCodeRuntime._get_available_vscode_instances accepts both list and
{"instances": [...]} responses from server for backward/forward compatibility
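A minimal sketch of how both response shapes could be accepted, assuming each instance arrives as a JSON object. The function name `normalize_instances` and the `id` field used in the usage note are illustrative only and are not taken from the code:

```python
from typing import Any


def normalize_instances(payload: Any) -> list[dict[str, Any]]:
    """Return the instance list from either a bare list or {"instances": [...]}."""
    if isinstance(payload, dict):
        # Wrapped shape: unwrap the "instances" key.
        payload = payload.get("instances")
    if not isinstance(payload, list):
        raise ValueError("expected a list of VSCode instances")
    if any(not isinstance(item, dict) for item in payload):
        raise ValueError("each VSCode instance must be a JSON object")
    return payload
```

With this shape check, `normalize_instances([{"id": "a"}])` and `normalize_instances({"instances": [{"id": "a"}]})` yield the same list, and anything else raises `ValueError`.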
Co-authored-by: OpenHands-GPT-5 <openhands@all-hands.dev>
- Include VSCode API routes only when AppMode is OSS, aligning with app-mode gating
alongside Git routes.
- Conflicts reconciled with main: kept OSS-gated inclusion to match current server
composition and PR intent.
Co-authored-by: OpenHands-GPT-5 <openhands@all-hands.dev>
Fixes mypy error in VsCodeRuntime by aligning status_callback signature with Runtime and importing RuntimeStatus.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Merge VSCode extension ignore and test-results entries in .gitignore.
- In openhands/server/app.py import server_config and AppMode and conditionally include git routes for OSS mode; also include vscode routes.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
Resolved conflicts by taking vscode-runtime versions for VSCode-related files:
- package.json: Kept runtime features (testConnection command, serverUrl config)
- extension.ts: Kept runtime services and connection logic
- README.md: Kept unified launcher + runtime documentation
- test/suite/index.ts: Kept modern async/await glob usage
Took main version for:
- local_runtime.py: Use sys.executable instead of poetry for Jupyter check
The file was causing an import collision on Windows,
where imports would resolve to this
file instead of the installed library. This was causing
the server process to crash and the tests to fail.
This commit renames the file to avoid this
name collision and updates all references to the old filename.
Co-authored-by: Gemini
The previous implementation of the jupyter dependency check in the
LocalRuntime ran its subprocess call through the shell. This was
causing the server process to die on Windows, leading to test failures.
This change refactors the subprocess call to avoid using the shell,
making it more robust and secure, especially on Windows. This resolves
the CI failures for LocalRuntime tests on the Windows platform.
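A rough sketch of a shell-free dependency check, assuming the goal is only to learn whether a module imports under the current interpreter. The helper name `module_importable` is hypothetical and this is not the LocalRuntime code itself:

```python
import subprocess
import sys


def module_importable(module_name: str) -> bool:
    """Return True if `module_name` imports under the current interpreter."""
    # Passing an argument list (no shell=True) avoids shell quoting and
    # platform-specific shell behavior, including on Windows.
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        check=False,
    )
    return result.returncode == 0
```

Using `sys.executable` keeps the check inside the same environment as the running process instead of relying on whatever is first on PATH.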
Co-authored-by: Gemini
- Fix trailing whitespace in test_coverage_analysis.md
- Fix end of file issues
- Apply ruff fixes to ensure all Python code passes linting
All pre-commit hooks now pass successfully.
✅ ALL TESTS NOW PASSING (31/31)
FIXES APPLIED:
🔧 Updated subprocess call count expectations (0→1 for --list-extensions)
🔧 Fixed Windsurf command detection (windsurf→surf)
🔧 Updated error message expectations (attempt→success flag)
🔧 Fixed flag creation behavior (no flag on failure = retry logic)
🔧 Updated bundled installation test patterns (1→2 subprocess calls)
BEHAVIORAL CHANGES VALIDATED:
✅ Extension detection via --list-extensions (always called first)
✅ Success-only flag creation (no flag on failure allows retry)
✅ Proper error handling and user messaging
✅ Windsurf vs VS Code command detection
✅ GitHub + bundled installation fallback patterns
COVERAGE STATUS:
📊 67% coverage (42 lines missing)
🎯 All critical new functionality fully tested
🧪 31 comprehensive tests covering all scenarios
The test suite now accurately reflects the new user-friendly
retry logic and success-based flagging behavior.
MAJOR UX IMPROVEMENT:
- Only create flag file on SUCCESS, not on failure
- Check if extension is already installed before attempting installation
- Allow automatic retry if previous installation failed
- No more manual flag file deletion needed
NEW BEHAVIOR:
- ✅ Extension already installed → detect and mark as successful
- ✅ Installation succeeds → create flag, don't retry
- ✅ Installation fails → no flag, will retry next time
- ✅ User installs VS Code later → automatic retry works
- ✅ User fixes PATH/permissions → automatic retry works
TECHNICAL CHANGES:
- Add _is_extension_installed() to check via --list-extensions
- Add _mark_installation_successful() helper
- Change flag file name from _install_attempted to _installed
- Update tests for new subprocess call patterns
- Add test for extension already installed detection
This makes the installation much more user-friendly and follows
standard practices used by package managers and IDE extensions.
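The success-only flagging described above can be sketched as follows. This is an illustration under assumptions: `ensure_installed` and its callable parameters are hypothetical stand-ins for the real helpers, which check via --list-extensions and run the actual installers:

```python
from pathlib import Path
from typing import Callable


def ensure_installed(
    flag_file: Path,
    is_installed: Callable[[], bool],
    install: Callable[[], bool],
) -> bool:
    """Write the flag only on success so that failures are retried next time."""
    if flag_file.exists():
        return True  # a previous run succeeded; nothing to do
    # Detect an existing installation first, then fall back to installing.
    if is_installed() or install():
        flag_file.parent.mkdir(parents=True, exist_ok=True)
        flag_file.touch()
        return True
    return False  # no flag written, so the next run tries again
```

Because a failed attempt leaves no file behind, installing VS Code later or fixing PATH is enough for the next run to succeed without manual cleanup.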
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Extract GitHub installation logic into _attempt_github_install()
- Extract bundled VSIX installation logic into _attempt_bundled_install()
- Improve code readability and maintainability
- Each method now has clear responsibility and return values
- Main function is now much cleaner and easier to follow
- All existing functionality preserved, all tests still pass
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Fixed quote consistency (double to single quotes)
- Applied line wrapping for long argument lists
- Improved code formatting per ruff standards
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Updated all tests to expect no marketplace installation attempts
- Simplified error message expectations to match new behavior
- All 24 tests now pass with marketplace installation disabled
- Applied linter formatting fixes
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Fixed control flow bug where return statement prevented finally block execution
- Ensured temporary GitHub VSIX files are always cleaned up after installation
- Updated test to properly mock os.path.exists for cleanup verification
The issue was that when GitHub installation succeeded, the function would return
immediately before the finally block could execute to clean up the downloaded
temporary file. Now we use a success flag and return after cleanup.
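A small sketch of the success-flag pattern, assuming the download step returns a temporary file path. `install_downloaded_vsix` and its two callables are hypothetical names used only for illustration:

```python
import os
from typing import Callable, Optional


def install_downloaded_vsix(
    download: Callable[[], str],
    install: Callable[[str], bool],
) -> bool:
    """Install from a downloaded file and always remove the download afterwards."""
    path: Optional[str] = None
    success = False
    try:
        path = download()
        success = install(path)
    finally:
        # Cleanup runs whether installation succeeded, failed, or raised.
        if path is not None and os.path.exists(path):
            os.remove(path)
    return success
```

Keeping a single return after the cleanup makes the exit path easy to verify in a test that mocks `os.path.exists`.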
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Convert single quotes to double quotes for consistency
- Clean up if-else structure formatting
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
Use npx vsce instead of global vsce command to ensure the tool is available
from node_modules/.bin on Windows CI environments where global packages
may not be properly configured.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Remove unnecessary try-catch around fs.existsSync() which doesn't throw exceptions
- Fix Windows virtual environment activation to use PowerShell syntax with Activate.ps1
- Improve cross-platform path handling using path.join() instead of string concatenation
- Reorganize code for better separation of platform-specific logic
- Add detailed comments explaining Windows activation approach and limitations
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
Implement contextual messaging for saved files in 'Start Conversation with File Context' command:
- Saved files now use contextual task messages instead of --file flag
- Message format: 'The user has tagged a file [path]. Please read and understand...'
- Maintains original natural language for untitled files: 'User opened an untitled file...'
- Updated tests to verify new contextual messaging behavior
- Follows same pattern as selection context for consistent user experience
This addresses reviewer feedback to provide contextual messaging for file operations
similar to the Python CLI implementation.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Implemented contextual messaging with createFileContextMessage() and createSelectionContextMessage() helpers
- Added Shell Integration support for better command tracking when available
- Conservative terminal reuse approach - only reuses terminals known to be idle to avoid interrupting user processes
- Idle terminal tracking through Shell Integration execution events
- Proper fallback to sendText when Shell Integration unavailable
- Fixed TypeScript compilation errors in Shell Integration tests with proper mock object properties
- Updated test setup for Mocha compatibility (setup() instead of beforeEach())
- All 16 tests now passing including contextual messaging and Shell Integration functionality
- Verified line number conversion (+1) is correct per VSCode API documentation (0-based to 1-based for human readability)
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
Updated Node.js requirement from >=16 to >=18 to match the frontend's
actual usage (18.20.1 via Volta), ensuring consistency across the project.
Changes:
- package.json: Added Node.js >=18.0.0 engine requirement
- build.py: Updated version check to require Node.js >=18
- README.md: Updated documentation to reflect >=18 requirement
- Error messages: Updated to show correct version requirement
This aligns with the frontend's practical Node.js version while
maintaining the optional build fallback for older versions.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
The previous logic incorrectly skipped building if a .vsix file existed,
which prevented rebuilding during development. Now the logic is:
1. Always try to build if Node.js >= 16 is available
2. Only use pre-built .vsix as fallback when Node.js < 16 or missing
3. Only skip building when SKIP_VSCODE_BUILD is explicitly set
This ensures:
- Developers can rebuild extensions during development
- Users with old Node.js get the pre-built fallback
- The build process works correctly for fresh installs
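The three rules above can be sketched as a small decision function. This is a hedged illustration: the function names and return labels are hypothetical, and how build.py actually reads SKIP_VSCODE_BUILD is assumed to reduce to a boolean:

```python
import re
from typing import Optional


def node_major(version_output: Optional[str]) -> Optional[int]:
    """Parse the major version from `node --version` output such as 'v12.22.9'."""
    if not version_output:
        return None
    match = re.match(r"v?(\d+)\.", version_output.strip())
    return int(match.group(1)) if match else None


def choose_build_action(
    version_output: Optional[str],
    skip_build: bool,
    min_major: int = 16,
) -> str:
    """Return 'skip', 'build', or 'prebuilt' following the rules above."""
    if skip_build:
        return "skip"  # explicit opt-out always wins
    major = node_major(version_output)
    if major is not None and major >= min_major:
        return "build"  # always rebuild when Node.js is new enough
    return "prebuilt"  # Node.js missing or too old: use the bundled .vsix
```

Under these assumptions, `choose_build_action("v12.22.9", False)` falls back to the pre-built file, while a recent Node.js version always triggers a build even if a .vsix already exists.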
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
Based on the reviewer's error output showing many VSCode extension
dependencies require Node.js >= 16 or >= 18, update the version check
from >= 14 to >= 16 for more accurate compatibility.
This addresses the specific error with Node.js v12.22.9 that was failing
due to dependencies requiring newer Node.js versions.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Add Node.js version check (requires >= 14)
- Use pre-built .vsix file when Node.js is too old
- Add SKIP_VSCODE_BUILD environment variable option
- Gracefully handle build failures
- Update documentation with build options
This fixes installation issues on systems with Node.js < 14 by falling back
to the pre-built extension instead of failing the entire installation.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Correct Task 2 status from completed to in-progress
- Maintain focus on VSCode Runtime refinement rather than moving to Task 3
- Update next steps to show rebase/integration completed but runtime work ongoing
- Accurately reflect that we're working on making VSCode Runtime robust and reliable
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Fix async Promise executor pattern in test suite
- Fix class method reference issues (RuntimeActionHandler -> VSCodeRuntimeActionHandler)
- Fix explicit any types to unknown in error handlers
- Remove unused mockSocket variable in tests
- Fix trailing whitespace in markdown file
All TypeScript compilation errors resolved, extension now compiles successfully.
Co-authored-by: OpenHands-Claude <openhands-claude@all-hands.dev>
Since we now use openhands-types directly from GitHub repository
via git dependency, the local packages/types copy is no longer needed.
Co-authored-by: OpenHands-Claude <openhands-claude@all-hands.dev>
- Add openhands-types as git dependency from GitHub repository
- Install openhands-types package with full TypeScript declarations
- Fix test/suite/index.ts to use modern glob API with named import
- Verify all type imports work correctly (OpenHandsParsedEvent, isOpenHandsAction)
- Confirm extension compiles and packages successfully
- Add comprehensive analysis document for openhands-types integration
This resolves the missing openhands-types dependency that was blocking
VSCode Runtime (Task 2) development. The extension can now properly
validate and handle OpenHands events and actions.
Co-authored-by: OpenHands-Claude <openhands-claude@all-hands.dev>
BREAKTHROUGH: Solved the @openhands/types package issue that was blocking VSCode extension testing!
## Problem Solved:
- Module resolution failure: 'Cannot find module packages/types/dist/core/base'
- File-based package linking failed in VSCode test environment
- Module format mismatch between ES modules and CommonJS
## Solution Implemented:
1. **Package Renamed**: @openhands/types → openhands-types (npm compatible)
2. **Dual-Format Package**: Support both CommonJS (.cjs) and ES modules (.js)
3. **npm link**: Established proper symlink between packages/types and extension
4. **Import Path Fixes**: Fixed CommonJS require statements to use .cjs extensions
5. **Build Automation**: Scripts handle dual builds and file renaming
## Technical Changes:
- packages/types/package.json: Dual exports with proper file extensions
- packages/types/tsconfig.cjs.json: CommonJS build configuration
- packages/types/fix-cjs-imports.js: Script to fix import paths
- VSCode extension: Updated dependency to 'openhands-types': '^0.1.0'
- Import statements: Updated in socket-service.ts and runtime-action-handler.ts
## Verification:
✅ Extension compiles successfully without errors
✅ Tests run properly (20 tests passing)
✅ Module resolution working in both dev and test environments
✅ npm link functioning with proper symlink
## Status:
- Module resolution issue: COMPLETELY SOLVED
- Extension testing: UNBLOCKED
- Remaining test failures: Unrelated network/mocking issues
This resolves the core TypeScript types package issue that was preventing
VSCode extension testing and development.
Co-authored-by: openhands <openhands@all-hands.dev>
Auto-fixed formatting and style issues in VSCode extension source files
using eslint and prettier.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Add socket-service.test.ts with 3 passing tests for basic functionality, VSCode API access, and fetch mocking
- Add runtime-action-handler.test.ts with 3 passing tests for basic functionality, workspace API, and workspace mocking
- Establish TypeScript test framework for VSCode extension services
- Implement proper mocking patterns for VSCode APIs
- Create test infrastructure ready for future service testing expansion
- All new tests compile and run successfully (7/7 passing)
- Update task.md to mark Phase 4.3 as completed
Technical achievements:
- Successfully created TypeScript test framework avoiding complex import issues
- Validated VSCode API mocking capabilities for future comprehensive testing
- Established foundation for testing SocketService and RuntimeActionHandler classes
OpenHands-Claude
✅ PHASE 3 COMPLETED: VsCodeRuntime Discovery & Error Handling
Dynamic Discovery System:
- Removed constructor dependencies for sio_server/socket_connection_id
- Added _get_available_vscode_instances() to query /api/vscode/instances
- Added _validate_vscode_connection() for health checking
- Added _discover_and_connect() for automatic VSCode instance discovery
- Gets sio_server from shared.py automatically (no injection needed)
Smart Connection Management:
- Lazy connection: only connects when actions need to be sent
- Connection validation before every action
- Automatic reconnection if VSCode instance becomes inactive
- Failover to alternative VSCode instances when available
- Comprehensive error handling with user-friendly messages
Enhanced Runtime Features:
- Works with standard AgentSession parameters (no special constructor args)
- Logs workspace path and capabilities on connection
- Continuous health monitoring of connections
- Graceful handling of disconnections and network issues
- Clear error messages when no VSCode instances available
Architecture Achievement:
- Complete end-to-end lazy connection pattern implementation
- VSCode Extension registers → Server tracks → Runtime discovers → Actions flow
- Eliminated timing issues between extension connection and runtime creation
- Robust connection lifecycle management with automatic recovery
- Foundation ready for Phase 4 integration testing
Technical Details:
- Fixed mypy type errors for None checks and union types
- Added proper validation for socket_connection_id before use
- Enhanced error handling for sio_server None cases
- Maintained backward compatibility with existing test injection patterns
Next: Phase 4 - Integration testing and final validation of complete system
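As a rough illustration of discovery with failover, the sketch below picks the first instance that passes a health check. `pick_instance`, the `is_healthy` callable, and the dictionary fields in the usage note are assumptions, not the actual VsCodeRuntime API:

```python
from typing import Callable, Iterable, Optional


def pick_instance(
    instances: Iterable[dict],
    is_healthy: Callable[[dict], bool],
) -> Optional[dict]:
    """Return the first instance that validates, or None if none are usable."""
    for instance in instances:
        if is_healthy(instance):
            return instance  # later entries act as failover candidates
    return None
```

A caller could run this lazily before each action, for example with `is_healthy=lambda i: i.get("status") == "active"`, and report a clear error when the result is `None`.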
Co-authored-by: enyst <enyst@users.noreply.github.com>
Co-authored-by: OpenHands-Claude <openhands-claude@all-hands.dev>
✅ COMPLETED: Extension Lazy Connection Implementation
- Remove immediate initializeRuntime() call from activate()
- Add ConnectionStatus enum for tracking connection state
- Implement ensureConnected() function with lazy connection logic
- Modify all user commands to trigger connection on-demand
- Add openhands.testConnection command for manual testing
- Replace eager connection with user-triggered connection flow
🔧 TECHNICAL CHANGES:
- Extension now activates without connecting to server
- Connection only happens when user runs OpenHands commands
- Comprehensive error handling with user-friendly messages
- Retry and configuration options in error dialogs
- Connection status tracking prevents duplicate attempts
🎯 BENEFITS:
- Eliminates timing dependency (server doesn't need to be running on VSCode start)
- Matches user mental model (connect when using OpenHands)
- Better error handling and user feedback
- Resource efficient (no background connections)
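The lazy connection idea can be sketched in Python as follows. The real extension is written in TypeScript, so this is only an analogue: the enum members, the `LazyConnection` class, and the injected `connect` callable are all assumptions:

```python
from enum import Enum
from typing import Callable


class ConnectionStatus(Enum):
    DISCONNECTED = "disconnected"
    CONNECTING = "connecting"
    CONNECTED = "connected"


class LazyConnection:
    """Connect on first use instead of at activation time."""

    def __init__(self, connect: Callable[[], None]) -> None:
        self._connect = connect
        self.status = ConnectionStatus.DISCONNECTED

    def ensure_connected(self) -> bool:
        if self.status is ConnectionStatus.CONNECTED:
            return True
        if self.status is ConnectionStatus.CONNECTING:
            return False  # an attempt is already in flight; do not duplicate it
        self.status = ConnectionStatus.CONNECTING
        try:
            self._connect()
        except OSError:
            # Server not reachable yet: stay disconnected so a later command retries.
            self.status = ConnectionStatus.DISCONNECTED
            return False
        self.status = ConnectionStatus.CONNECTED
        return True
```

Each user command would call `ensure_connected()` first, so activation itself never depends on the server already running.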
📋 NEXT: Phase 2 - Server Registration System
Co-authored-by: OpenHands-Claude <openhands-claude@all-hands.dev>
BREAKTHROUGH: Identified fundamental timing issue with immediate connection:
- VSCode Extension activates when VSCode starts
- But OpenHands server might not be running yet!
- Extension fails to connect and becomes unusable
NEW APPROACH: Lazy Connection Pattern
- Extension activates but doesn't connect immediately
- Only connects when user runs OpenHands commands
- Matches user mental model and eliminates timing dependencies
- Simpler, more resource-efficient implementation
Next: Implement lazy connection in extension activation
Co-authored-by: OpenHands-Claude <openhands-claude@all-hands.dev>
Removed migration section since code consolidation is done.
Now focused on implementing the Runtime Registration Pattern:
- VSCode registration API endpoint
- Extension registration after Socket.IO connection
- VsCodeRuntime connection discovery
- End-to-end coordination testing
- Architecture breakthrough: Socket.IO approach is brilliant, not hallucinated
- Identified real problems: connection coordination, not fundamental architecture
- Proposed solution: Runtime Registration Pattern for connection discovery
- Migration plan: consolidate extension code from old scaffolding to main extension
- Next steps: migrate files, implement registration API, test coordination
Key findings:
- Socket.IO architecture is actually brilliant and correct
- VSCode Extension acts like another frontend client (like web UI)
- Main issue: VsCodeRuntime needs socket_connection_id but has no way to get it
- AgentSession only passes standard runtime params, missing VSCode-specific ones
Proposed solution: Runtime Registration Pattern
- VSCode Extension registers itself with OpenHands server after connecting
- Server maintains registry: socket_connection_id → VSCode instance info
- VsCodeRuntime queries registry to find available connections
- Clean separation: Extension handles connection, Runtime handles execution
This solves the coordination problem without changing core architecture!
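A minimal in-memory sketch of the proposed registry, assuming one entry per Socket.IO connection. The class names and the `workspace_path` field are hypothetical and chosen only for illustration:

```python
from dataclasses import dataclass


@dataclass
class VSCodeInstance:
    connection_id: str
    workspace_path: str


class VSCodeRegistry:
    """Maps socket_connection_id to VSCode instance info."""

    def __init__(self) -> None:
        self._instances: dict[str, VSCodeInstance] = {}

    def register(self, instance: VSCodeInstance) -> None:
        # Called after the extension connects; re-registering replaces the entry.
        self._instances[instance.connection_id] = instance

    def unregister(self, connection_id: str) -> None:
        # Safe to call on disconnect even if the id was never registered.
        self._instances.pop(connection_id, None)

    def list_instances(self) -> list[VSCodeInstance]:
        # The runtime queries this to find a connection to route events through.
        return list(self._instances.values())
```

The extension side only registers and unregisters, while the runtime side only reads, which matches the separation between connection handling and execution described above.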
- VSCode Extension acts like another frontend client (like web UI)
- Main Socket.IO server acts as message broker
- VsCodeRuntime routes events via socket_connection_id
- Architecture reuses existing OpenHands infrastructure elegantly
- Real issues are connection timing and coordination, not architecture
Document the successful completion of the VSCode runtime migration:
- All 4 phases completed successfully
- Extension functionality unified (launcher + runtime)
- Old extension cleanly removed
- Documentation updated
- Ready for testing and deployment
This summary provides a comprehensive overview of what was accomplished
during the migration process.
Major integration milestone:
- Add imports for SocketService and VSCodeRuntimeActionHandler
- Add runtime initialization function with server URL configuration
- Integrate runtime startup in activate() function
- Add proper cleanup in deactivate() function
- Successfully compile and package unified extension
The extension now combines:
1. Launcher functionality (context menu commands)
2. Runtime functionality (backend communication and action execution)
Testing results:
- ✅ TypeScript compilation successful (npm run compile)
- ✅ Extension packaging successful (npm run package-vsix)
- 🔄 Manual testing in VSCode pending
Next: Phase 4 cleanup of old runtime extension files.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Add ~/.openhands/microagents/ as a microagent source directory
- User microagents are loaded after global ones, allowing overrides
- Automatically create user microagents directory if it doesn't exist
- Add comprehensive unit tests for user microagent functionality
- Handle errors gracefully when loading user microagents
This allows users to store personal/local microagents in their user
directory instead of keeping uncommitted files in repository working
directories, preventing accidental loss during git operations.
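The load order above can be sketched as a merge where later sources win. This is an illustration under assumptions: microagents are taken to be `*.md` files keyed by file name, and `load_microagents` is a hypothetical helper, not the loader in the repository:

```python
from pathlib import Path


def load_microagents(global_dir: Path, user_dir: Path) -> dict[str, str]:
    """Load global microagents first, then user ones, so user entries override."""
    # Create the user directory on first use, as described above.
    user_dir.mkdir(parents=True, exist_ok=True)
    agents: dict[str, str] = {}
    for directory in (global_dir, user_dir):
        if not directory.is_dir():
            continue
        for path in sorted(directory.glob("*.md")):
            try:
                agents[path.stem] = path.read_text(encoding="utf-8")
            except OSError:
                continue  # skip unreadable files rather than failing the whole load
    return agents
```

In practice `user_dir` would be something like `Path.home() / ".openhands" / "microagents"`.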
Co-authored-by: openhands <openhands@all-hands.dev>
- Fix VsCodeRuntime constructor to match standard runtime interface
- Add missing abstract methods with correct signatures: connect, copy_from, copy_to, get_mcp_config, list_files
- Add VSCode runtime to test framework in conftest.py
- Add VSCode runtime tests to CI workflow
- Create comprehensive task analysis in vscode_runtime_task.md
- Update vscode.md with current implementation status
The VSCode runtime now properly integrates with the existing test infrastructure
and returns appropriate errors when no VSCode extension is connected.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
✅ Added event serialization support:
- Import event_to_dict and event_from_dict from openhands.events.serialization
- Replace manual event payload creation with proper event_to_dict()
- Replace manual observation construction with event_from_dict()
✅ Benefits:
- Ensures consistent JSON serialization format across all runtimes
- Handles all action/observation types automatically
- Proper handling of complex fields (timestamps, enums, metadata)
- Maintains compatibility with existing event stream format
- Reduces code duplication and potential serialization bugs
✅ Socket.IO communication now uses:
- Outgoing: event_to_dict(action) → JSON → VSCode extension
- Incoming: JSON → event_from_dict(observation_event) → Observation
This makes the VSCode runtime fully compatible with OpenHands event
serialization standards and ready for production use.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
Major fixes applied:
✅ Removed hallucinated actions:
- Deleted mkdir(), rmdir(), rm() methods - these action types don't exist
- Directory operations should use CmdRunAction or FileEditAction
✅ Added missing required abstract methods:
- edit() for FileEditAction
- browse_interactive() for BrowseInteractiveAction
- call_tool_mcp() for MCPAction
✅ Fixed method signatures:
- All methods now match Runtime base class exactly
- Added _run_async_action() helper for async operations in sync context
✅ Removed non-standard methods:
- Deleted recall(), finish(), send_message() - these are agent-level actions
✅ Fixed imports and observations:
- Added missing Action import and all required action/observation types
- Added support for FileEditObservation, BrowserOutputObservation, etc.
- Fixed observation constructors with correct parameters
✅ Fixed event payload and logging:
- Use action.__class__.__name__ and action.__dict__
- Fixed logger.warn() to logger.warning()
- Fixed mypy type errors with proper type assertions
The runtime now correctly implements all required abstract methods with only
actual OpenHands actions. Socket.IO architecture remains sound. Ready for
integration testing with VSCode extension.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Identified hallucinated actions: mkdir, rmdir, rm don't exist in OpenHands
- Directory operations should use CmdRunAction or FileEditAction
- Missing required abstract methods: edit, browse_interactive, call_tool_mcp
- Wrong method signatures: some async methods should be sync
- Scope issues: implementing agent-level actions instead of execution actions
- Socket.IO architecture is correct, but action handling needs fixes
- Documented actual OpenHands actions vs hallucinated ones
The runtime needs to implement only the actions that actually exist in openhands.events.
- Corrected analysis to recognize existing Socket.IO infrastructure
- Removed incorrect assumptions about missing infrastructure
- Updated architecture documentation to show proper event flow
- Changed assessment from 'fundamental issues' to 'implementation details'
- Documented proper integration with existing OpenHands Socket.IO server
The VSCode runtime approach is architecturally sound and leverages existing infrastructure correctly.
- Add OpenHands submenu to context menu for cleaner organization
- Group 'Start with File Content' and 'Start with Selected Text' commands
- Use shorter titles in context menu while preserving full descriptive names in Command Palette
- Leverage category field to automatically prefix commands with 'OpenHands:' in Ctrl+Shift+P
Co-authored-by: openhands <openhands@all-hands.dev>
- Change from 'OpenHands 14:32:45' to 'OpenHands 14:32'
- More human-friendly and cleaner terminal tab names
- Minute precision is sufficient for terminal identification
- VSCode handles duplicate names gracefully if needed
Co-authored-by: openhands <openhands@all-hands.dev>
- build.py is essential: runs npm install and npm run package-vsix
- Creates the .vsix file during Poetry build process
- Without it, there would be no .vsix file to include in package
- This is a necessary part of VSCode extension integration
Co-authored-by: openhands <openhands@all-hands.dev>
- Keep only essential change: include .vsix file in package
- Revert unnecessary changes to packages structure and dependencies
- Remove pytest from main dependencies (belongs in dev.dependencies)
- Remove custom build script (not needed for this PR)
- Cleaner, focused changes for VSCode extension integration
Co-authored-by: openhands <openhands@all-hands.dev>
- Moved development planning document to ~/.openhands/microagents/plan-vscode-integration.md
- PLAN.md was useful during development but doesn't belong in production extension
- Keeps repository clean for end users while preserving development history
Co-authored-by: openhands <openhands@all-hands.dev>
- Remove error messages for missing editor/file/selection contexts
- All commands now gracefully fall back to starting OpenHands without a task
- Better user experience: clicking any command always starts OpenHands
- Commands behavior:
* startConversation: no task (unchanged)
* startConversationWithFileContext: file content as task, or no task if no file/empty
* startConversationWithSelectionContext: selected text as task, or no task if no selection
Co-authored-by: openhands <openhands@all-hands.dev>
- Replace last DEBUG showErrorMessage with output channel logging
- Keep legitimate user-facing error messages as popups
- All debug info now goes to 'OpenHands Debug' output channel
Co-authored-by: openhands <openhands@all-hands.dev>
- Remove vscode.window.showErrorMessage() calls for debug information
- Add dedicated 'OpenHands Debug' output channel for development logging
- Debug messages now appear in Output panel instead of popup notifications
- Users won't be bothered by debug messages, but developers can still access them
- Follows VSCode extension best practices for logging
Co-authored-by: openhands <openhands@all-hands.dev>
This development-time analysis file has been moved to user microagents
directory (~/.openhands/microagents/) as it's not needed by other developers.
The analysis was useful during development but doesn't belong in the PR.
- Uncomment .vscode-test/ in .gitignore to prevent accidental commits
- These files are generated during extension testing and shouldn't be in version control
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Add location info for public microagents in glossary
- Add comprehensive Microagents section to repo.md with:
- Types (public vs repository microagents)
- Loading behavior (frontmatter triggers vs always-loaded)
- Structure example with YAML frontmatter
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Add comprehensive VSCode API documentation references as comments
- Include Shell Integration requirements and compatibility notes
- Preserve important development references in the codebase for future maintainers
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Move TERMINAL_REUSE_ANALYSIS.md to .openhands/microagents/vscode-terminal-reuse-analysis.md
- Update README.md with essential user-facing terminal management info
- Remove detailed development analysis from PR, keeping it for future reference
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Updated package.json engines.vscode from ^1.80.0 to ^1.98.2
- Updated @types/vscode dependency to ^1.98.2
- Updated README.md requirements section
- Updated PLAN.md documentation
- Regenerated package-lock.json automatically via npm install
This aligns our main VSCode extension with the runtime extensions
which already require VSCode 1.98.2+, ensuring consistency across
all VSCode integrations in the project.
Co-authored-by: openhands <openhands@all-hands.dev>
- Add VSCode extension linting command to pre-push checklist
- Document VSCode extension structure, setup, and commands
- Include linting, building, and testing commands for the extension
Co-authored-by: OpenHands <openhands@all-hands.dev>
- Add comprehensive linting setup adapted from frontend configuration
- Configure ESLint with airbnb-base rules for Node.js/VSCode extensions
- Add Prettier configuration matching frontend standards
- Include linting scripts in package.json (lint, lint:fix, typecheck)
- Add development dependencies for linting tools
- Update documentation with linting workflow and development guidelines
- Apply automatic formatting to all source files
- Configure special rules for test files and VSCode extension patterns
This ensures code quality consistency with the main OpenHands codebase.
Co-authored-by: OpenHands <openhands@all-hands.dev>
The previous implementation used probing to detect terminal status, which
could interrupt running CLI processes. This fix implements safe state
tracking that only reuses terminals where OpenHands commands have completed.
Key changes:
- Remove intrusive terminal probing that interrupted running processes
- Add safe state tracking using Set to track idle terminals
- Only reuse terminals that we know are safe (completed our commands)
- Use Shell Integration API for monitoring command completion
- Create new terminals when terminal state is unknown (safe fallback)
- Clean up terminal state tracking when terminals are closed
This ensures that running CLIs and other processes in terminals are never
interrupted when sending new tasks to OpenHands.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Add Shell Integration API support for smart terminal detection
- Implement terminal probing to check if terminals are idle
- Add graceful fallback to new terminal creation when Shell Integration unavailable
- Refactor code into modular functions for better maintainability
- Add comprehensive tests for new terminal reuse functionality
- Update README with new features and requirements
- Support cross-shell compatibility (bash, zsh, PowerShell, fish)
This implements the advanced terminal handling described in TERMINAL_REUSE_ANALYSIS.md,
providing intelligent terminal reuse while maintaining backward compatibility.
Co-authored-by: OpenHands-Gemini <openhands@all-hands.dev>
- Add comprehensive analysis of VSCode's Shell Integration capabilities
- Document intelligent terminal probing with execution.read() and executeCommand()
- Update recommendations to use Shell Integration with graceful fallback
- Replace outdated API limitations with current 2024/2025 capabilities
- Add implementation strategy with phases and code examples
- Include proper references to VSCode API documentation
Co-authored-by: Claude 3.5 Sonnet <claude-3-5-sonnet@anthropic.com>
The build script was trying to copy the VSIX file to the same location,
causing a SameFileError. Since the VSIX is already built in the correct
location (openhands/integrations/vscode/) and pyproject.toml includes
it from there, no copying is needed.
Changes:
- Remove unnecessary copy operation from build_vscode_extension()
- Remove unused shutil import and RESOURCES_DIR variable
- Simplify to just build and verify the VSIX exists
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
- Move VS Code extension from root-level openhands-vscode/ to openhands/integrations/vscode/
- Update pyproject.toml to include VSIX from new location: openhands/integrations/vscode/*.vsix
- Update CLI code to load VSIX from new path: integrations/vscode/
- Update build.py to build extension in new location
- Preserve file history using git mv operations
- Maintain VSIX bundling in PyPI package for CLI auto-installation
This reorganization improves architectural consistency by placing the VS Code
integration alongside other integrations rather than at the root level.
The VSIX file is excluded as it's a build artifact generated by build.py.
Co-authored-by: OpenHands-Claude <openhands@all-hands.dev>
stale-issue-message:'This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.'
stale-pr-message:'This PR is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.'
days-before-stale:30
stale-issue-message:'This issue is stale because it has been open for 40 days with no activity. Remove the stale label or leave a comment, otherwise it will be closed in 10 days.'
stale-pr-message:'This PR is stale because it has been open for 40 days with no activity. Remove the stale label or leave a comment, otherwise it will be closed in 10 days.'
days-before-stale:40
exempt-issue-labels:'roadmap'
close-issue-message:'This issue was closed because it has been stalled for over 30 days with no activity.'
close-pr-message:'This PR was closed because it has been stalled for over 30 days with no activity.'
days-before-close:7
close-issue-message:'This issue was automatically closed due to 50 days of inactivity. We do this to help keep the issues somewhat manageable and focus on active issues.'
close-pr-message:'This PR was closed because it had no activity for 50 days. If you feel this was closed in error, and you would like to continue the PR, please resubmit or let us know.'
@@ -15,8 +15,6 @@ make build && make run FRONTEND_PORT=12000 FRONTEND_HOST=0.0.0.0 BACKEND_HOST=0.
IMPORTANT: Before making any changes to the codebase, ALWAYS run `make install-pre-commit-hooks` to ensure pre-commit hooks are properly installed.
Before pushing any changes, you MUST ensure that any lint errors or simple test errors have been fixed.
* If you've made changes to the backend, you should run `pre-commit run --config ./dev_config/python/.pre-commit-config.yaml` (this will run on staged files).
@@ -32,6 +30,12 @@ then re-run the command to ensure it passes. Common issues include:
- Trailing whitespace
- Missing newlines at end of files
## Git Best Practices
- Prefer specific `git add <filename>` instead of `git add .` to avoid accidentally staging unintended files
- Be especially careful with `git reset --hard` after staging files, as it discards all uncommitted changes to tracked files, including everything staged, not only the files staged by accident
- When remote has new changes, use `git fetch upstream && git rebase upstream/<branch>` on the same branch
@@ -52,37 +52,63 @@ which comes with $20 in free credits for new users.
## 💻 Running OpenHands Locally
OpenHands can also run on your local system using Docker.
See the [Running OpenHands](https://docs.all-hands.dev/usage/installation) guide for
system requirements and more information.
### Option 1: CLI Launcher (Recommended)
> [!WARNING]
> On a public network? See our [Hardened Docker Installation Guide](https://docs.all-hands.dev/usage/runtimes/docker#hardened-docker-installation)
> to secure your deployment by restricting network binding and implementing additional security measures.
The easiest way to run OpenHands locally is using the CLI launcher with [uv](https://docs.astral.sh/uv/). This provides better isolation from your current project's virtual environment and is required for OpenHands' default MCP servers.
**Install uv** (if you haven't already):
See the [uv installation guide](https://docs.astral.sh/uv/getting-started/installation/) for the latest installation instructions for your platform.
> **Note**: If you used OpenHands before version 0.44, you may want to run `mv ~/.openhands-state ~/.openhands` to migrate your conversation history to the new location.
You'll find OpenHands running at [http://localhost:3000](http://localhost:3000)!
> [!WARNING]
> On a public network? See our [Hardened Docker Installation Guide](https://docs.all-hands.dev/usage/runtimes/docker#hardened-docker-installation)
> to secure your deployment by restricting network binding and implementing additional security measures.
### Getting Started
When you open the application, you'll be asked to choose an LLM provider and add an API key.
[Anthropic's Claude Sonnet 4](https://www.anthropic.com/api) (`anthropic/claude-sonnet-4-20250514`)
works best, but you have [many options](https://docs.all-hands.dev/usage/llms).
See the [Running OpenHands](https://docs.all-hands.dev/usage/installation) guide for
system requirements and more information.
## 💡 Other ways to run OpenHands
> [!WARNING]
@@ -93,8 +119,8 @@ works best, but you have [many options](https://docs.all-hands.dev/usage/llms).
#- SANDBOX_USER_ID=${SANDBOX_USER_ID:-1234} # enable this only if you want a specific non-root sandbox user but you will have to manually adjust permissions of ~/.openhands for this user
@@ -8,6 +8,29 @@ description: This guide walks you through the process of installing OpenHands Cl
- Signed in to [OpenHands Cloud](https://app.all-hands.dev) with [a Bitbucket account](/usage/cloud/openhands-cloud).
## IP Whitelisting
If your Bitbucket Cloud instance has IP restrictions, you'll need to whitelist the following IP addresses to allow OpenHands to access your repositories:
### Core App IP
```
34.68.58.200
```
### Runtime IPs
```
34.10.175.217
34.136.162.246
34.45.0.142
34.28.69.126
35.224.240.213
34.70.174.52
34.42.4.87
35.222.133.153
34.29.175.97
34.60.55.59
```
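If you maintain the allowlist in a script or want to check an address seen in your access logs, the lists above can be captured as data. The following is a minimal sketch using Python's standard `ipaddress` module; the names `CORE_APP_IPS`, `RUNTIME_IPS`, and `is_openhands_ip` are illustrative and not part of any OpenHands API.

```python
import ipaddress

# Addresses copied from the lists above; the variable names are illustrative.
CORE_APP_IPS = ["34.68.58.200"]
RUNTIME_IPS = [
    "34.10.175.217", "34.136.162.246", "34.45.0.142", "34.28.69.126",
    "35.224.240.213", "34.70.174.52", "34.42.4.87", "35.222.133.153",
    "34.29.175.97", "34.60.55.59",
]

# Parse once so malformed entries fail loudly at import time.
ALLOWED = {ipaddress.ip_address(ip) for ip in CORE_APP_IPS + RUNTIME_IPS}


def is_openhands_ip(candidate: str) -> bool:
    """Return True if candidate is one of the documented OpenHands addresses."""
    try:
        return ipaddress.ip_address(candidate) in ALLOWED
    except ValueError:
        # Not a syntactically valid IP address at all.
        return False
```

The addresses may change over time, so treat this page as the source of truth and regenerate the lists from it rather than from the sketch.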
## Adding Bitbucket Repository Access
Upon signing into OpenHands Cloud with a Bitbucket account, OpenHands will have access to your repositories.
description: Complete guide for setting up Jira Data Center integration with OpenHands Cloud, including service account creation, personal access token generation, webhook configuration, and workspace integration setup.
---
# Jira Data Center Integration
## Platform Configuration
### Step 1: Create Service Account
1. **Access User Management**
- Log in to Jira Data Center as administrator
- Go to **Administration** > **User Management**
2. **Create User**
- Click **Create User**
- Username: `openhands-agent`
- Full Name: `OpenHands Agent`
- Email: `openhands@yourcompany.com` (replace with your preferred service account email)
- Password: Set a secure password
- Click **Create**
3. **Assign Permissions**
- Add user to appropriate groups
- Ensure access to relevant projects
- Grant necessary project permissions
### Step 2: Generate API Token
1. **Personal Access Tokens**
- Log in as the service account
- Go to **Profile** > **Personal Access Tokens**
- Click **Create token**
- Name: `OpenHands Cloud Integration`
- Expiry: Set an appropriate expiration (1 year recommended)
- Click **Create**
- **Important**: Copy and store the token securely
### Step 3: Configure Webhook
1. **Create Webhook**
- Go to **Administration** > **System** > **WebHooks**
- **JQL Filter**: Leave empty (or customize as needed)
- Click **Create**
- **Important**: Copy and store the webhook secret securely (you'll need this for workspace integration)
---
## Workspace Integration
### Step 1: Log in to OpenHands Cloud
1. **Navigate and Authenticate**
- Go to [OpenHands Cloud](https://app.all-hands.dev/)
- Sign in with your Git provider (GitHub, GitLab, or Bitbucket)
- **Important:** Make sure you're signing in with the same Git provider account that contains the repositories you want the OpenHands agent to work on.
### Step 2: Configure Jira Data Center Integration
1. **Access Integration Settings**
- Navigate to **Settings** > **Integrations**
- Locate **Jira Data Center** section
2. **Configure Workspace**
- Click **Configure** button
- Enter your workspace name and click **Connect**
- If no integration exists, you'll be prompted to enter additional credentials required for the workspace integration:
- **Webhook Secret**: The webhook secret from Step 3 above
- **Service Account Email**: The service account email from Step 1 above
- **Service Account API Key**: The personal access token from Step 2 above
- Ensure **Active** toggle is enabled
<Note>
Workspace name is the host name of your Jira Data Center instance.
Here the workspace name is **jira.all-hands.dev**.
</Note>
3. **Complete OAuth Flow**
- You'll be redirected to Jira Data Center to complete OAuth verification
- Grant the necessary permissions to verify your workspace access. If you have access to multiple workspaces, select the correct one that you initially provided
- If successful, you will be redirected back to the **Integrations** settings in the OpenHands Cloud UI
### Managing Your Integration
**Edit Configuration:**
- Click the **Edit** button next to your configured platform
- Update any necessary credentials or settings
- Click **Update** to apply changes
- You will need to repeat the OAuth flow as before
- **Important:** Only the original user who created the integration can see the edit view
**Unlink Workspace:**
- In the edit view, click **Unlink** next to the workspace name
- This will deactivate your workspace link
- **Important:** If the original user who configured the integration chooses to unlink their integration, any users currently linked to that integration will also be unlinked, and the workspace integration will be deactivated. The integration can only be reactivated by the original user.
description: Complete guide for setting up Jira Cloud integration with OpenHands Cloud, including service account creation, API token generation, webhook configuration, and workspace integration setup.
---
# Jira Cloud Integration
## Platform Configuration
### Step 1: Create Service Account
1. **Navigate to User Management**
- Go to [Atlassian Admin](https://admin.atlassian.com/)
- Select your organization
- Go to **Directory** > **Users**
2. **Create OpenHands Service Account**
- Click **Service accounts**
- Click **Create a service account**
- Name: `OpenHands Agent`
- Click **Next**
- Select **User** role for Jira app
- Click **Create**
### Step 2: Generate API Token
1. **Access Service Account Configuration**
- Locate the service account created in the previous step and click on it
- Click **Create API token**
- Set the expiry to 365 days (maximum allowed value)
- Click **Next**
- On the **Select token scopes** screen, filter by the following values:
- App: Jira
- Scope type: Classic
- Scope actions: Write, Read
- Select `read:jira-work` and `write:jira-work` scopes
- Click **Next**
- Review and create API token
- **Important**: Copy and securely store the token immediately
### Step 3: Configure Webhook
1. **Navigate to Webhook Settings**
- Go to **Jira Settings** > **System** > **WebHooks**
- **JQL Filter**: Leave empty (or customize as needed)
- Click **Create**
- **Important**: Copy and store the webhook secret securely (you'll need this for workspace integration)
---
## Workspace Integration
### Step 1: Log in to OpenHands Cloud
1. **Navigate and Authenticate**
- Go to [OpenHands Cloud](https://app.all-hands.dev/)
- Sign in with your Git provider (GitHub, GitLab, or Bitbucket)
- **Important:** Make sure you're signing in with the same Git provider account that contains the repositories you want the OpenHands agent to work on.
### Step 2: Configure Jira Integration
1. **Access Integration Settings**
- Navigate to **Settings** > **Integrations**
- Locate **Jira Cloud** section
2. **Configure Workspace**
- Click **Configure** button
- Enter your workspace name and click **Connect**
- **Important:** Make sure you enter the full workspace name, e.g. **yourcompany.atlassian.net**
- If no integration exists, you'll be prompted to enter additional credentials required for the workspace integration:
- **Webhook Secret**: The webhook secret from Step 3 above
- **Service Account Email**: The service account email from Step 1 above
- **Service Account API Key**: The API token from Step 2 above
- Ensure **Active** toggle is enabled
<Note>
Workspace name is the host name when accessing a resource in Jira Cloud.
Eg: https://all-hands.atlassian.net/browse/OH-55
Here the workspace name is **all-hands**.
</Note>
3. **Complete OAuth Flow**
- You'll be redirected to Jira Cloud to complete OAuth verification
- Grant the necessary permissions to verify your workspace access.
- If successful, you will be redirected back to the **Integrations** settings in the OpenHands Cloud UI
### Managing Your Integration
**Edit Configuration:**
- Click the **Edit** button next to your configured platform
- Update any necessary credentials or settings
- Click **Update** to apply changes
- You will need to repeat the OAuth flow as before
- **Important:** Only the original user who created the integration can see the edit view
**Unlink Workspace:**
- In the edit view, click **Unlink** next to the workspace name
- This will deactivate your workspace link
- **Important:** If the original user who configured the integration chooses to unlink their integration, any users currently linked to that workspace integration will also be unlinked, and the workspace integration will be deactivated. The integration can only be reactivated by the original user.
description: Complete guide for setting up Linear integration with OpenHands Cloud, including service account creation, API key generation, webhook configuration, and workspace integration setup.
---
# Linear Integration
## Platform Configuration
### Step 1: Create Service Account
1. **Access Team Settings**
- Log in to Linear as a team admin
- Go to **Settings** > **Members**
2. **Invite Service Account**
- Click **Invite members**
- Email: `openhands@yourcompany.com` (replace with your preferred service account email)
- Role: **Member** (with appropriate team access)
- Send invitation
3. **Complete Setup**
- Accept invitation from the service account email
- Complete profile setup
- Ensure access to relevant teams/workspaces
### Step 2: Generate API Key
1. **Access API Settings**
- Log in as the service account
- Go to **Settings** > **Security & access**
2. **Create Personal API Key**
- Click **Create new key**
- Name: `OpenHands Cloud Integration`
- Scopes: Select the following:
- `Read` - Read access to issues and comments
- `Create comments` - Ability to create or update comments
- Select the teams you want to provide access to, or allow access for all teams you have permissions for
- Click **Create**
- **Important**: Copy and store the API key securely
- Select the teams you want to provide access to, or allow access for all public teams
- Click **Create webhook**
- **Important**: Copy and store the webhook secret securely (you'll need this for workspace integration)
---
## Workspace Integration
### Step 1: Log in to OpenHands Cloud
1. **Navigate and Authenticate**
- Go to [OpenHands Cloud](https://app.all-hands.dev/)
- Sign in with your Git provider (GitHub, GitLab, or Bitbucket)
- **Important:** Make sure you're signing in with the same Git provider account that contains the repositories you want the OpenHands agent to work on.
### Step 2: Configure Linear Integration
1. **Access Integration Settings**
- Navigate to **Settings** > **Integrations**
- Locate **Linear** section
2. **Configure Workspace**
- Click **Configure** button
- Enter your workspace name and click **Connect**
- If no integration exists, you'll be prompted to enter additional credentials required for the workspace integration:
- **Webhook Secret**: The webhook secret from Step 3 above
- **Service Account Email**: The service account email from Step 1 above
- **Service Account API Key**: The API key from Step 2 above
- Ensure **Active** toggle is enabled
<Note>
Workspace name is the identifier after the host name when accessing a resource in Linear.
Eg: https://linear.app/allhands/issue/OH-37
Here the workspace name is **allhands**.
</Note>
3. **Complete OAuth Flow**
- You'll be redirected to Linear to complete OAuth verification
- Grant the necessary permissions to verify your workspace access. If you have access to multiple workspaces, select the correct one that you initially provided
- If successful, you will be redirected back to the **Integrations** settings in the OpenHands Cloud UI
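The workspace-name rule from the note above (the identifier after the host name) can be checked against any Linear issue URL. This is a minimal sketch using only the standard library; `linear_workspace_name` is a hypothetical helper, not something OpenHands or Linear provides.

```python
from urllib.parse import urlparse


def linear_workspace_name(issue_url: str) -> str:
    """Hypothetical helper: return the first path segment of a Linear URL."""
    # Drop empty segments produced by the leading (or a trailing) slash.
    segments = [s for s in urlparse(issue_url).path.split("/") if s]
    if not segments:
        raise ValueError(f"No workspace segment in URL: {issue_url!r}")
    return segments[0]


print(linear_workspace_name("https://linear.app/allhands/issue/OH-37"))  # allhands
```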
### Managing Your Integration
**Edit Configuration:**
- Click the **Edit** button next to your configured platform
- Update any necessary credentials or settings
- Click **Update** to apply changes
- You will need to repeat the OAuth flow as before
- **Important:** Only the original user who created the integration can see the edit view
**Unlink Workspace:**
- In the edit view, click **Unlink** next to the workspace name
- This will deactivate your workspace link
- **Important:** If the original user who configured the integration chooses to unlink their integration, any users currently linked to that integration will also be unlinked, and the workspace integration will be deactivated. The integration can only be reactivated by the original user.
description: Overview of OpenHands Cloud integrations with project management platforms including Jira Cloud, Jira Data Center, and Linear. Learn about setup requirements, usage methods, and troubleshooting.
---
# Project Management Tool Integrations
## Overview
OpenHands Cloud integrates with project management platforms (Jira Cloud, Jira Data Center, and Linear) to enable AI-powered task delegation. Users can invoke the OpenHands agent by:
- Adding `@openhands` in ticket comments
- Adding the `openhands` label to tickets
## Prerequisites
Integration requires two levels of setup:
1. **Platform Configuration** - Administrative setup of service accounts and webhooks on your project management platform (see individual platform documentation below)
2. **Workspace Integration** - Self-service configuration through the OpenHands Cloud UI to link your OpenHands account to the target workspace
### Platform-Specific Setup Guides:
- [Jira Cloud Integration](./jira-integration.md)
- [Jira Data Center Integration](./jira-dc-integration.md)
- [Linear Integration](./linear-integration.md)
## Usage
Once both the platform configuration and workspace integration are completed, users can trigger the OpenHands agent within their project management platforms using two methods:
### Method 1: Comment Mention
Add a comment to any issue with `@openhands` followed by your task description:
```
@openhands Please implement the user authentication feature described in this ticket
```
### Method 2: Label-based Delegation
Add the label `openhands` to any issue. The OpenHands agent will automatically process the issue based on its description and requirements.
### Git Repository Detection
The OpenHands agent needs to identify which Git repository to work with when processing your issues. Here's how to ensure proper repository detection:
#### Specifying the Target Repository
**Required:** Include the target Git repository in your issue description or comment to ensure the agent works with the correct codebase.
**Supported Repository Formats:**
- Full HTTPS URL: `https://github.com/owner/repository.git`
- GitHub URL without .git: `https://github.com/owner/repository`
- Owner/repository format: `owner/repository`
#### Platform-Specific Behavior
**Linear Integration:** When GitHub integration is enabled for your Linear workspace with issue sync activated, the target repository is automatically detected from the linked GitHub issue. Manual specification is not required in this configuration.
**Jira Integrations:** Always include the repository information in your issue description or `@openhands` comment to ensure proper repository detection.
## Troubleshooting
### Platform Configuration Issues
- **Webhook not triggering**: Verify the webhook URL is correct and the proper event types are selected (Comment, Issue updated)
- **API authentication failing**: Check API key/token validity and ensure required scopes are granted. If your current API token is expired, make sure to update it in the respective integration settings
- **Permission errors**: Ensure the service account has access to relevant projects/teams and appropriate permissions
### Workspace Integration Issues
- **Workspace linking requests credentials**: If there are no active workspace integrations for the workspace you specified, you need to configure it first. Contact the administrator of the platform you want to integrate with (e.g., Jira or Linear)
- **Integration not found**: Verify the workspace name matches exactly and that platform configuration was completed first
- **OAuth flow fails**: Make sure that you're authorizing with the correct account with proper workspace access
### General Issues
- **Agent not responding**: Check webhook logs in your platform settings and verify service account status
- **Agent fails to identify git repo**: Ensure you're signing in with the same Git provider account that contains the repositories you want OpenHands to work on
- **Partial functionality**: Ensure both platform configuration and workspace integration are properly completed
### Getting Help
For additional support, contact OpenHands Cloud support with:
- Your integration platform (Linear, Jira Cloud, or Jira Data Center)
- Workspace name
- Error logs from webhook/integration attempts
- Screenshots of configuration settings (without sensitive credentials)
@@ -12,6 +12,10 @@ description: This guide walks you through installing the OpenHands Slack app.
allowFullScreen>
</iframe>
<Info>
OpenHands utilizes a large language model (LLM), which may generate responses that are inaccurate or incomplete. While we strive for accuracy, OpenHands' outputs are not guaranteed to be correct, and we encourage users to validate critical information independently.
</Info>
## Prerequisites
- Access to OpenHands Cloud.
@@ -24,7 +28,7 @@ description: This guide walks you through installing the OpenHands Slack app.
**This step is for Slack admins/owners**
1. Make sure you have permissions to install Apps to your workspace.
2. Click the button below to install OpenHands Slack App <a target="_blank" href="https://slack.com/oauth/v2/authorize?client_id=7477886716822.8729519890534&scope=app_mentions:read,chat:write,users:read,channels:history,groups:history,mpim:history,im:history&user_scope=channels:history,groups:history,im:history,mpim:history"><img alt="Add to Slack" height="40" width="139" src="https://platform.slack-edge.com/img/add_to_slack.png" srcSet="https://platform.slack-edge.com/img/add_to_slack.png 1x, https://platform.slack-edge.com/img/add_to_slack@2x.png 2x" /></a>
2. Click the button below to install OpenHands Slack App <a target="_blank" href="https://slack.com/oauth/v2/authorize?client_id=7477886716822.8729519890534&scope=app_mentions:read,channels:history,chat:write,groups:history,im:history,mpim:history,users:read&user_scope="><img alt="Add to Slack" height="40" width="139" src="https://platform.slack-edge.com/img/add_to_slack.png" srcSet="https://platform.slack-edge.com/img/add_to_slack.png 1x, https://platform.slack-edge.com/img/add_to_slack@2x.png 2x" /></a>
3. In the top right corner, select the workspace to install the OpenHands Slack app.
**Note** - OpenHands requires Python version 3.12 or higher (Python 3.14 is not currently supported)
**Note** - OpenHands requires Python version 3.12 or higher (Python 3.14 is not currently supported) and `uv` for the default `fetch` MCP server (more details below).
1. Install OpenHands using pip:
```bash
pip install openhands-ai
```
#### Recommended: Using uv
Or if you prefer not to manage your own Python environment, you can use `uvx`:
We recommend using [uv](https://docs.astral.sh/uv/) for the best OpenHands experience. uv provides better isolation from your current project's virtual environment and is required for OpenHands' default MCP servers.
1. **Install uv** (if you haven't already):
See the [uv installation guide](https://docs.astral.sh/uv/getting-started/installation/) for the latest installation instructions for your platform.
2. **Launch OpenHands CLI**:
```bash
uvx --python 3.12 --from openhands-ai openhands
```
<AccordionGroup>
<Accordion title="Alternative: Traditional pip installation">
If you prefer to use pip:
```bash
# Install OpenHands
pip install openhands-ai
```
Note that you'll still need `uv` installed for the default MCP servers to work properly.
</Accordion>
<Accordion title="Create shell aliases for easy access across environments">
Add the following to your shell configuration file (`.bashrc`, `.zshrc`, etc.):
```bash
# Add OpenHands aliases
# Add OpenHands aliases (recommended)
alias openhands="uvx --python 3.12 --from openhands-ai openhands"
alias oh="uvx --python 3.12 --from openhands-ai openhands"
```
@@ -72,18 +87,19 @@ source ~/.bashrc # or source ~/.zshrc
</AccordionGroup>
2. Launch an interactive OpenHands conversation from the command line:
3. Launch an interactive OpenHands conversation from the command line:
```bash
openhands
# If using uvx (recommended)
uvx --python 3.12 --from openhands-ai openhands
```
<Note>
If you have cloned the repository, you can also run the CLI directly using Poetry:
poetry run python -m openhands.cli.main
poetry run openhands
</Note>
3. Set your model, API key, and other preferences using the UI (or alternatively environment variables, below).
4. Set your model, API key, and other preferences using the UI (or alternatively environment variables, below).
This command opens an interactive prompt where you can type tasks or commands and get responses from OpenHands.
The first time you run the CLI, it will take you through configuring the required LLM
@@ -103,7 +119,7 @@ The conversation history will be saved in `~/.openhands/sessions`.
@@ -153,6 +169,7 @@ You can use the following commands whenever the prompt (`>`) is displayed:
| `/new` | Start a new conversation |
| `/settings` | View and modify current LLM/agent settings |
| `/resume` | Resume the agent if paused |
| `/mcp` | Manage MCP server configuration and view connection errors |
#### Settings and Configuration
@@ -162,7 +179,7 @@ follow the prompts:
- **Basic settings**: Choose a model/provider and enter your API key.
- **Advanced settings**: Set custom endpoints, enable or disable confirmation mode, and configure memory condensation.
Settings can also be managed via the `config.toml` file.
Settings can also be managed via the `config.toml` file in the current directory or `~/.openhands/config.toml`.
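As a rough illustration of the two locations mentioned above, the following sketch reports which of them currently hold a `config.toml`. The helper names are hypothetical and are not part of OpenHands; the order in which OpenHands itself consults these files is not stated here, so the sketch only lists what exists.

```python
from pathlib import Path


def candidate_config_paths(cwd=None, home=None):
    """Return the two documented config.toml locations (hypothetical helper)."""
    cwd = Path(cwd) if cwd else Path.cwd()
    home = Path(home) if home else Path.home()
    return [cwd / "config.toml", home / ".openhands" / "config.toml"]


def existing_config_paths(cwd=None, home=None):
    """Return only the candidate paths that exist as regular files."""
    return [p for p in candidate_config_paths(cwd, home) if p.is_file()]


if __name__ == "__main__":
    for path in existing_config_paths():
        print(f"found config: {path}")
```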
#### Repository Initialization
@@ -174,6 +191,41 @@ project details and structure. Use this when onboarding the agent to a new codeb
You can pause the agent while it is running by pressing `Ctrl-P`. To continue the conversation after pausing, simply
type `/resume` at the prompt.
#### MCP Server Management
To configure Model Context Protocol (MCP) servers, refer to the documentation on [MCP servers](../mcp) and use the `/mcp` command in the CLI. This command provides an interactive interface for managing them:
- **List configured servers**: View all currently configured MCP servers (SSE, Stdio, and SHTTP)
- **Add new server**: Interactively add a new MCP server with guided prompts
- **Remove server**: Remove an existing MCP server from your configuration
- **View errors**: Display any connection errors that occurred during MCP server startup
This command modifies your `~/.openhands/config.toml` file and will prompt you to restart OpenHands for changes to take effect.
By default, the [Fetch MCP server](https://github.com/modelcontextprotocol/servers/tree/main/src/fetch) will be automatically configured for OpenHands. You can also [enable search engine](../search-engine-setup) via the [Tavily MCP server](https://github.com/tavily-ai/tavily-mcp) by setting the `search_api_key` under the `[core]` section in the `~/.openhands/config.toml` file.
##### Example of the `config.toml` file with MCP server configuration:
@@ -7,6 +7,67 @@ description: High level overview of the Graphical User Interface (GUI) in OpenHa
- [OpenHands is running](/usage/local-setup)
## Launching the GUI Server
### Using the CLI Command
You can launch the OpenHands GUI server directly from the command line using the `serve` command:
<Callout type="info">
**Prerequisites**: You need to have the [OpenHands CLI installed](/usage/how-to/cli-mode) first, OR have `uv` installed and run `uvx --python 3.12 --from openhands-ai openhands serve`. Otherwise, you'll need to use Docker directly (see the [Docker section](#using-docker-directly) below).
</Callout>
```bash
openhands serve
```
This command will:
- Check that Docker is installed and running
- Pull the required Docker images
- Launch the OpenHands GUI server at http://localhost:3000
- Use the same configuration directory (`~/.openhands`) as the CLI mode
#### Mounting Your Current Directory
To mount your current working directory into the GUI server container, use the `--mount-cwd` flag:
```bash
openhands serve --mount-cwd
```
This is useful when you want to work on files in your current directory through the GUI. The directory will be mounted at `/workspace` inside the container.
#### Using GPU Support
If you have NVIDIA GPUs and want to make them available to the OpenHands container, use the `--gpu` flag:
```bash
openhands serve --gpu
```
This will enable GPU support via nvidia-docker, mounting all available GPUs into the container. You can combine this with other flags:
```bash
openhands serve --gpu --mount-cwd
```
**Prerequisites for GPU support:**
- NVIDIA GPU drivers must be installed on your host system
- [NVIDIA Container Toolkit (nvidia-docker2)](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) must be installed and configured
#### Requirements
Before using the `openhands serve` command, ensure that:
- Docker is installed and running on your system
- You have internet access to pull the required Docker images
- Port 3000 is available on your system
The CLI will automatically check these requirements and provide helpful error messages if anything is missing.
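If you want to run similar checks yourself before launching, the sketch below tests two of the requirements listed above: that a `docker` executable is on `PATH` and that port 3000 can be bound. This is an illustrative approximation, not the CLI's actual implementation; the function names are hypothetical, and finding the executable does not prove the Docker daemon is running.

```python
import shutil
import socket


def docker_cli_found() -> bool:
    """True if a `docker` executable is on PATH (the daemon may still be stopped)."""
    return shutil.which("docker") is not None


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """True if host:port can be bound right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def preflight(port: int = 3000) -> list[str]:
    """Collect human-readable problems; an empty list means both checks passed."""
    problems = []
    if not docker_cli_found():
        problems.append("Docker CLI not found on PATH")
    if not port_is_free(port):
        problems.append(f"Port {port} is already in use")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```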
### Using Docker Directly
Alternatively, you can run the GUI server using Docker directly. See the [local setup guide](/usage/local-setup) for detailed Docker instructions.
- [mistralai/devstral-small](https://www.all-hands.dev/blog/devstral-a-new-state-of-the-art-open-model-for-coding-agents) (20 May 2025) -- also available through [OpenRouter](https://openrouter.ai/mistralai/devstral-small:free)
- [all-hands/openhands-lm-32b-v0.1](https://www.all-hands.dev/blog/introducing-openhands-lm-32b----a-strong-open-coding-agent-model) (31 March 2025) -- also available through [OpenRouter](https://openrouter.ai/all-hands/openhands-lm-32b-v0.1)
### Known Issues
<Warning>
As of July 2025, there are known issues with Gemini 2.5 Pro conversations taking longer than normal with OpenHands. We are continuing to investigate.
</Warning>
<Note>
Most current local and open source models are not as powerful. When using such models, you may see long
wait times between messages, poor responses, or errors about malformed JSON. OpenHands can only be as powerful as the
@@ -30,5 +30,6 @@ When running OpenHands, you'll need to set the following in the OpenHands UI thr
## Pricing
Pricing follows official API provider rates.
[You can view model prices here.](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json)
Pricing follows official API provider rates. [You can view model prices here.](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json)
For `qwen3-coder-480b`, we charge the cheapest FP8 rate available on OpenRouter: \$0.4 per million input tokens and \$1.6 per million output tokens.
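To make the `qwen3-coder-480b` rates concrete, here is a small worked calculation. The two rates come from the sentence above; the function name and the token counts in the example are made up for illustration.

```python
# Rates quoted above for qwen3-coder-480b, in USD per million tokens.
INPUT_USD_PER_MTOK = 0.4
OUTPUT_USD_PER_MTOK = 1.6


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 250,000 input tokens and 20,000 output tokens.
    print(f"${estimate_cost_usd(250_000, 20_000):.3f}")  # prints $0.132
```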
@@ -66,20 +66,64 @@ A system with a modern processor and a minimum of **4GB RAM** is recommended to
### Start the App
#### Option 1: Using the CLI Launcher with uv (Recommended)
We recommend using [uv](https://docs.astral.sh/uv/) for the best OpenHands experience. uv provides better isolation from your current project's virtual environment and is required for OpenHands' default MCP servers (like the [fetch MCP server](https://github.com/modelcontextprotocol/servers/tree/main/src/fetch)).
**Install uv** (if you haven't already):
See the [uv installation guide](https://docs.astral.sh/uv/getting-started/installation/) for the latest installation instructions for your platform.
This will automatically handle Docker requirements checking, image pulling, and launching the GUI server. The `--gpu` flag enables GPU support via nvidia-docker, and `--mount-cwd` mounts your current directory into the container.
<Accordion title="Alternative: Traditional pip installation">
If you prefer to use pip and have Python 3.12+ installed:
```bash
# Install OpenHands
pip install openhands-ai
# Launch the GUI server
openhands serve
```
Note that you'll still need `uv` installed for the default MCP servers to work properly.
</Accordion>
#### Option 2: Using Docker Directly
<Accordion title="Docker Command (Click to expand)">
> **Note**: If you used OpenHands before version 0.44, you may want to run `mv ~/.openhands-state ~/.openhands` to migrate your conversation history to the new location.
You'll find OpenHands running at http://localhost:3000!
@@ -100,6 +144,16 @@ OpenHands requires an API key to access most language models. Here's how to get
<AccordionGroup>
<Accordion title="OpenHands (Recommended)">
1. [Log in to OpenHands Cloud](https://app.all-hands.dev).
2. Go to the Settings page and navigate to the `API Keys` tab.
3. Copy your `LLM API Key`.
OpenHands provides access to state-of-the-art agentic coding models with competitive pricing. [Learn more about OpenHands LLM provider](/usage/llms/openhands-llms).
</Accordion>
<Accordion title="Anthropic (Claude)">
1. [Create an Anthropic account](https://console.anthropic.com/).
@@ -10,47 +10,83 @@ Model Context Protocol (MCP) is a mechanism that allows OpenHands to communicate
servers can provide additional functionality to the agent, such as specialized data processing, external API access,
or custom tools. MCP is based on the open standard defined at [modelcontextprotocol.io](https://modelcontextprotocol.io).
<Note>
MCP is currently not available on OpenHands Cloud. This feature is only available when running OpenHands locally.
</Note>
### How MCP Works
When OpenHands starts, it:
1. Reads the MCP configuration.
2. Connects to any configured SSE and SHTTP servers.
3. Starts any configured stdio servers.
4. Registers the tools provided by these servers with the agent.
The agent can then use these tools just like any built-in tool. When the agent calls an MCP tool:
1. OpenHands routes the call to the appropriate MCP server.
2. The server processes the request and returns a response.
3. OpenHands converts the response to an observation and presents it to the agent.
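The call flow above can be sketched as a small registry that maps tool names to handlers and wraps each response in an observation. This is a conceptual illustration only: `ToolRouter`, `Observation`, and the stub handler are hypothetical names, and real MCP servers are reached over SSE, SHTTP, or stdio rather than through in-process functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Observation:
    """What the agent sees after a tool call (hypothetical shape)."""
    tool: str
    content: str


class ToolRouter:
    """Maps tool names to the handler that stands in for an MCP server."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], str]] = {}

    def register(self, tool_name: str, handler: Callable[[dict], str]) -> None:
        # Registration step: make a server-provided tool known to the agent.
        self._handlers[tool_name] = handler

    def call(self, tool_name: str, arguments: dict) -> Observation:
        # Route the call, then convert the raw response to an observation.
        handler = self._handlers.get(tool_name)
        if handler is None:
            return Observation(tool_name, f"error: unknown tool {tool_name!r}")
        return Observation(tool_name, handler(arguments))


if __name__ == "__main__":
    router = ToolRouter()
    # Stub standing in for a remote server's tool.
    router.register("fetch", lambda args: "fetched " + args["url"])
    print(router.call("fetch", {"url": "https://example.com"}))
```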
## Configuration
MCP configuration can be defined in:
* The OpenHands UI through the Settings under the `MCP` tab.
* The `config.toml` file under the `[mcp]` section if not using the UI.
### Configuration Example via config.toml
### Configuration Examples
#### Recommended: Using Proxy Servers (SSE/HTTP)
For stdio-based MCP servers, we recommend using MCP proxy tools like [`supergateway`](https://github.com/supercorp-ai/supergateway) instead of direct stdio connections.
[SuperGateway](https://github.com/supercorp-ai/supergateway) is a popular MCP proxy that converts stdio MCP servers to HTTP/SSE endpoints:
This folder contains the evaluation harness that we built on top of the original [SWE-Bench benchmark](https://www.swebench.com/) ([paper](https://arxiv.org/abs/2310.06770)).
**UPDATE (8/12/2025): We now support running SWE-rebench evaluation (see the paper [here](https://arxiv.org/abs/2505.20411))! For how to run it, check out [this README](./SWE-rebench.md).**
**UPDATE (6/15/2025): We now support running SWE-bench-Live evaluation (see the paper [here](https://arxiv.org/abs/2505.23419))! For how to run it, check out [this README](./SWE-bench-Live.md).**
**UPDATE (5/26/2025): We now support running interactive SWE-Bench evaluation (see the paper [here](https://arxiv.org/abs/2502.13069))! For how to run it, check out [this README](./SWE-Interact.md).**
@@ -183,24 +185,7 @@ The final results will be saved to `evaluation/evaluation_outputs/outputs/swe_be
- `report.json`: a JSON file that contains keys like `"resolved_ids"` pointing to instance IDs that are resolved by the agent.
- `logs/`: a directory of test logs
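A quick way to inspect the report is to load it and read the `"resolved_ids"` key. The sketch below assumes only that key, which is the one named above; the helper name is hypothetical and the path you pass is whatever your own run produced.

```python
import json
import sys


def load_resolved_ids(report_path):
    """Return the sorted instance IDs listed under "resolved_ids" in report.json."""
    with open(report_path, encoding="utf-8") as f:
        report = json.load(f)
    # Fall back to an empty list if the key is absent.
    return sorted(report.get("resolved_ids", []))


if __name__ == "__main__" and len(sys.argv) > 1:
    ids = load_resolved_ids(sys.argv[1])
    print(f"{len(ids)} instances resolved")
```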
### Run evaluation with `RemoteRuntime`
OpenHands Remote Runtime is currently in beta (read [here](https://runtime.all-hands.dev/) for more details). It lets you run rollouts in parallel in the cloud, so you don't need a powerful machine to run the evaluation.
Fill out [this form](https://docs.google.com/forms/d/e/1FAIpQLSckVz_JFwg2_mOxNZjCtr7aoBFI2Mwdan3f75J_TrdMS1JV2g/viewform) to apply if you want to try this out!
# Example - This evaluates patches generated by CodeActAgent on Llama-3.1-70B-Instruct-Turbo on the test set of "princeton-nlp/SWE-bench_Lite", with 16 workers running in parallel
SWE-rebench is a large-scale dataset for verifiable software engineering tasks.
It comes in **two datasets**:
* **[`nebius/SWE-rebench-leaderboard`](https://huggingface.co/datasets/nebius/SWE-rebench-leaderboard)** – updatable benchmark used for [leaderboard evaluation](https://swe-rebench.com/leaderboard).
* **[`nebius/SWE-rebench`](https://huggingface.co/datasets/nebius/SWE-rebench)** – full dataset with **21,302 tasks**, suitable for training or large-scale offline evaluation.
This document explains how to run OpenHands on SWE-rebench, using the leaderboard split as the main example.
To run on the full dataset, simply replace the dataset name.
## Setting Up
Set up your development environment and configure your LLM provider by following the [SWE-bench README](README.md) in this directory.
## Running Inference
Use the existing SWE-bench inference script, changing the dataset to `nebius/SWE-rebench-leaderboard` and selecting the split (`test` for leaderboard submission):
2. Clone the SWE-bench-fork repo (https://github.com/SWE-rebench/SWE-bench-fork) and follow its README to install dependencies.
3. Run the evaluation using the fork:
```bash
python -m swebench.harness.run_evaluation \
--dataset_name nebius/SWE-rebench-leaderboard \
--split test \
--predictions_path preds.jsonl \
--max_workers 10 \
--run_id openhands
```
## Citation
```bibtex
@article{badertdinov2025swerebench,
title={SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents},
author={Badertdinov, Ibragim and Golubev, Alexander and Nekrashevich, Maksim and Shevtsov, Anton and Karasik, Simon and Andriushchenko, Andrei and Trofimova, Maria and Litvintseva, Daria and Yangel, Boris},
I've uploaded a python code repository in the directory {{ workspace_dir_name }}. Consider the following issue description:
<issue_description>
{{ instance.problem_statement }}
</issue_description>
Can you help me implement the necessary changes to the repository so that the requirements specified in the <issue_description> are met?
I've already taken care of all changes to any of the test files described in the <issue_description>. This means you DON'T have to modify the testing logic or any of the tests in any way!
Also the development Python environment is already set up for you (i.e., all dependencies already installed), so you don't need to install other packages.
Your task is to make the minimal changes to non-test files in the /workspace/{{ workspace_dir_name }} directory to ensure the <issue_description> is satisfied.
Follow these phases to resolve the issue:
Phase 1. READING: read the problem and reword it in clearer terms
1.1 If there are code or config snippets, express in words any best practices or conventions in them.
1.5 Highlight any best practices to take into account when testing and fixing the issue
Phase 2. RUNNING: install and run the tests on the repository
2.1 Follow the readme
2.2 Install the environment and anything needed
2.3 Iterate and figure out how to run the tests
Phase 3. EXPLORATION: find the files that are related to the problem and possible solutions
3.1 Use `grep` to search for relevant methods, classes, keywords and error messages.
3.2 Identify all files related to the problem statement.
3.3 Propose the methods and files to fix the issue and explain why.
3.4 From the possible file locations, select the most likely location to fix the issue.
Phase 4. TEST CREATION: before implementing any fix, create a script to reproduce and verify the issue.
4.1 Look at existing test files in the repository to understand the test format/structure.
4.2 Create a minimal reproduction script that reproduces the located issue.
4.3 Run the reproduction script to confirm you are reproducing the issue.
4.4 Adjust the reproduction script as necessary.
Phase 5. FIX ANALYSIS: state clearly the problem and how to fix it
5.1 State clearly what the problem is.
5.2 State clearly where the problem is located.
5.3 State clearly how the test reproduces the issue.
5.4 State clearly the best practices to take into account in the fix.
5.5 State clearly how to fix the problem.
Phase 6. FIX IMPLEMENTATION: Edit the source code to implement your chosen solution.
6.1 Make minimal, focused changes to fix the issue.
Phase 7. VERIFICATION: Test your implementation thoroughly.
7.1 Run your reproduction script to verify the fix works.
7.2 Add edge cases to your test script to ensure comprehensive coverage.
7.3 Run existing tests related to the modified code to ensure you haven't broken anything.
Phase 8. FINAL REVIEW: Carefully re-read the problem description and compare your changes with the base commit {{ instance.base_commit }}.
8.1 Ensure you've fully addressed all requirements.
8.2 Run any tests in the repository related to:
8.2.1 The issue you are fixing
8.2.2 The files you modified
8.2.3 The functions you changed
8.3 If any tests fail, revise your implementation until all tests pass
Be thorough in your exploration, testing, and reasoning. It's fine if your thinking process is lengthy - quality and completeness are more important than brevity.
You are provided with a Python code repository that contains an issue requiring your attention. The repository is located in a sandboxed environment, and you have access to the codebase to implement the necessary changes.
The code repository is located at: `/workspace/{{ workspace_dir_name }}`
(This path is provided for context; use file system tools to confirm paths before access).
## Goal
Your goal is to fix the issue described in the **Issue Description** section below. Implement the necessary changes to **non-test files only** within the repository, ensuring that **all relevant tests pass** after your changes.
## Key Requirements & Constraints
1. **Understand the problem** very well: it is a bug report, and you know humans don't always write good descriptions. Explore the codebase to understand the related code and the problem in depth. It is possible that the solution needs to be a bit more extensive than just the stated text. Don't exaggerate though: don't do unrelated refactoring, but also don't interpret the description too strictly.
2. **Focus on the issues:** Implement the fix focusing on non-test files related to the issue.
2. **Environment Ready:** The Python environment is pre-configured with all dependencies. Do not install packages.
3. **Mandatory Testing Procedure:**
* **Create Test to Reproduce the Issue:** *Before* implementing any fix, you MUST create a *new test* (separate from existing tests) that specifically reproduces the issue.
* Take existing tests as example to understand the testing format/structure.
* Enhance this test with edge cases.
* Run this test to confirm reproduction.
* **Verify Fix:** After implementing the fix, run your test again to verify the issue is resolved.
* **Identify ALL Relevant Tests:** You MUST perform a **dedicated search and analysis** to identify **all** existing unit tests potentially affected by your changes. This includes:
* Tests in the same module/directory as the changed files (e.g., `tests/` subdirectories).
* Tests explicitly importing or using the modified code/classes/functions.
* Tests mentioned in the issue description or related documentation.
* Tests covering functionalities that *depend on* the modified code (analyze callers/dependencies if necessary).
**If you cannot confidently identify a specific subset, you MUST identify and plan to run the entire test suite for the modified application or module(s). State your identified test scope clearly.**
* **Run Identified Relevant Tests:** You MUST execute the **complete set** of relevant existing unit tests you identified in the previous step. Ensure you are running the *correct and comprehensive set* of tests. You MUST NOT modify these existing tests.
* **Final Check & Verification:** Before finishing, ensure **all** identified relevant existing tests pass. **Explicitly confirm that you have considered potential omissions in your test selection and believe the executed tests comprehensively cover the impact of your changes.** Failing to identify and run the *complete* relevant set constitutes a failure. If any identified tests fail, revise your fix. Passing all relevant tests is the primary measure of success.
4. **Defensive Programming:** Actively practice defensive programming: anticipate and handle potential edge cases, unexpected inputs, and different ways the affected code might be called **to ensure the fix works reliably and allows relevant tests to pass.** Analyze the potential impact on other parts of the codebase.
5. **Final Review:** Compare your solution against the original issue and the base commit ({{ instance.base_commit }}) to ensure completeness and test passage.
## General Workflow Guidance
* Prioritize understanding the problem, exploring the code, planning your fix, implementing it carefully using the required diff format, and **thoroughly testing** according to the **Mandatory Testing Procedure**.
* Consider trade-offs between different solutions. The goal is a **robust change that makes the relevant tests pass.** Quality, correctness, and reliability are key.
* Actively practice defensive programming: anticipate and handle potential edge cases, unexpected inputs, and different ways the affected code might be called **to ensure the fix works reliably and allows relevant tests to pass.** Analyze the potential impact on other parts of the codebase.
* IMPORTANT: Your solution will be tested by additional hidden tests, so do not assume the task is complete just because visible tests pass! Refine the solution until you are confident that it is robust and comprehensive according to the **Defensive Programming** requirement.
## Final Note
Be thorough in your exploration, testing, and reasoning. It's fine if your thinking process is lengthy - quality and completeness are more important than brevity.
poetry run ./evaluation/benchmarks/testgeneval/scripts/eval_infer_remote.sh evaluation/evaluation_outputs/outputs/kjain14__testgeneval-test/CodeActAgent/gpt-4o_maxiter_25_N_v0.20.0-no-hint-run_1/output.jsonl 10 kjain14/testgeneval test true
Command run (our approach):
./evaluation/benchmarks/testgeneval/scripts/run_infer.sh llm.eval_gpt HEAD CodeActAgent -1 25 10 kjain14/testgeneval test 1 ../TestGenEval/results/testgeneval/preds/gpt-4o-2024-08-06__testgeneval__0.2__test.jsonl
Some files were not shown because too many files have changed in this diff